1 Introduction

The distribution of factorization patterns on univariate polynomials over a finite field \(\mathbb {F}_{q}\) is a classical subject of combinatorics. Let \({\varvec{\lambda }}:=1^{\lambda _1}2^{\lambda _2}\ldots r^{\lambda _r}\) be a factorization pattern for polynomials of degree r, namely \(\lambda _1,\ldots ,\lambda _r\in \mathbb {Z}_{\ge 0}\) satisfy \(\lambda _1+2\lambda _2+\cdots +r\lambda _r=r\). A seminal article of Cohen [10] shows that the proportion of elements of \(\mathbb {F}_{q}[T]\) of degree r with factorization pattern \({\varvec{\lambda }}\) is roughly the proportion \(\mathcal {T}({\varvec{\lambda }})\) of permutations with cycle pattern \({\varvec{\lambda }}\) in the rth symmetric group \(\mathbb {S}_r\). (An element of \(\mathbb {S}_r\) has cycle pattern \({\varvec{\lambda }}\) if it has exactly \(\lambda _i\) cycles of length i for \(1\le i\le r\).)
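To make the quantity \(\mathcal {T}({\varvec{\lambda }})\) concrete, the following Python sketch computes it in two ways for small r: by listing all permutations of \(\mathbb {S}_r\), and through the classical closed form \(\mathcal {T}({\varvec{\lambda }})=1/\prod _i i^{\lambda _i}\lambda _i!\) (Cauchy's formula, a standard fact that is not stated in the text above). The function names are ours and serve only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_pattern(perm):
    """Return (lambda_1, ..., lambda_r), where lambda_i counts the cycles of length i."""
    r = len(perm)
    seen = [False] * r
    lam = [0] * r
    for start in range(r):
        length, j = 0, start
        while not seen[j]:          # walk the cycle through `start`
            seen[j] = True
            j = perm[j]
            length += 1
        if length:                  # length == 0 means `start` was already visited
            lam[length - 1] += 1
    return tuple(lam)


def proportion_closed_form(lam):
    """T(lambda) = 1 / prod_i (i**lambda_i * lambda_i!)."""
    denom = 1
    for i, li in enumerate(lam, start=1):
        denom *= i ** li * factorial(li)
    return Fraction(1, denom)


def proportions_by_enumeration(r):
    """Proportion of each cycle pattern, obtained by listing all r! permutations."""
    counts = Counter(cycle_pattern(p) for p in permutations(range(r)))
    return {lam: Fraction(c, factorial(r)) for lam, c in counts.items()}


by_enum = proportions_by_enumeration(4)
agree = all(proportion_closed_form(lam) == t for lam, t in by_enum.items())
```

Exact rational arithmetic (`Fraction`) is used so that the two computations can be compared with `==` rather than up to rounding.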

In particular, the number of irreducible polynomials, or more generally the distribution of factorization patterns, of polynomials of “given forms” has been considered in a number of recent articles (see, e.g., [1, 8, 31, 46]). In [11], a subset of the set of polynomials of degree r is called uniformly distributed if the proportion of elements with factorization pattern \({\varvec{\lambda }}\) is roughly \(\mathcal {T}({\varvec{\lambda }})\) for every \({\varvec{\lambda }}\). The main result of that paper [11, Theorem 3] provides a criterion for a linear family of polynomials of \(\mathbb {F}_{q}[T]\) of given degree to be uniformly distributed in the sense above. Bank et al. [1], Cesaratto et al. [8] and Ha [31] provide explicit estimates on the number of elements with factorization pattern \({\varvec{\lambda }}\) on certain linear families of \(\mathbb {F}_{q}[T]\), such as the set of polynomials with some prescribed coefficients.

In [23, Problem 2.2], the authors ask for estimates on the number of polynomials of a given degree with a given factorization pattern lying in nonlinear families of polynomials with coefficients parameterized by an affine variety defined over \(\mathbb {F}_{q}\). Except for general results (see, e.g., [9, 20]), very little is known about such a number. In this article, we address this question, providing a general criterion for a nonlinear family \(\mathcal {A}\subset \mathbb {F}_{q}[T]\) to be uniformly distributed in the sense of Cohen and explicit estimates on the number of elements of \(\mathcal {A}\) with a given factorization pattern.

Then, we apply our results to analyze the behavior of the classical factorization algorithm restricted to such families \(\mathcal {A}\). The classical factorization algorithm (see, e.g., [50]) is not the fastest one. Nevertheless, it is worth analyzing it, since it is implemented in several software packages for symbolic computation, and a number of scientific problems rely heavily on polynomial factorization over finite fields.

A precise worst-case analysis is given in [50]. On the other hand, an average-case analysis for the set of elements of \(\mathbb {F}_{q}[T]\) of a given degree is provided in [18]. This analysis relies on methods of analytic combinatorics which cannot be extended to deal with the nonlinear families considered in this article. For this reason, we provide an analysis of its average-case complexity when restricted to any nonlinear family \(\mathcal {A}\) satisfying our general criterion.

We now describe our results precisely. Let \(\overline{\mathbb {F}}_{q}\) be the algebraic closure of \(\mathbb {F}_{q}\). Let m and r be positive integers with \(m<r\) and \(A_{r-1}{,\ldots ,}A_{0}\) indeterminates over \(\overline{\mathbb {F}}_{q}\). For a fixed k with \(0\le k\le r-1\), we denote \(\mathbb {F}_{q}[\varvec{A}_k]:= \mathbb {F}_{q}[A_{r-1},\ldots ,A_{k+1},A_{k-1},\ldots ,A_0]\). Let \(G_1{,\ldots ,}G_m\in \mathbb {F}_{q}[\varvec{A}_k]\) and let \(W:=\{G_1=0,\ldots ,G_m=0\}\) be the set of common zeros in \(\overline{\mathbb {F}}_{q}{}^r\) of \(G_1,\ldots ,G_m\). Denoting by \(\mathbb {F}_{q}[T]_r\) the set of monic polynomials of degree r with coefficients in \(\mathbb {F}_{q}\), we consider the following family of polynomials:

$$\begin{aligned} \mathcal {A}&:=\{T^r+a_{r-1}T^{r-1}+\cdots +a_0\in \mathbb {F}_{q}[T]_r: G_i(a_{r-1},\ldots ,a_{k+1},a_{k-1},\ldots ,a_{0})\nonumber \\&=0\,\,(1\le i \le m)\}. \end{aligned}$$
(1.1)

Consider the weight \({{{\mathsf {wt}}}}:\mathbb {F}_{q}[\varvec{A}_k]\rightarrow \mathbb {N}_0\) defined by setting \({{{\mathsf {wt}}}}(A_j):=r-j\) for \(0\le j\le r-1\), \(j\not =k\), and denote by \(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}}\) the components of highest weight of \(G_1{,\ldots ,}G_m\). Let \((\partial \varvec{G}/\partial \varvec{A}_k)\) be the Jacobian matrix of \(G_1{,\ldots ,}G_m\) with respect to \(\varvec{A}_k\). We shall assume that \(G_1 {,\ldots ,}G_m\) satisfy the following conditions:

  • \(({{\mathbf {\mathsf{{H}}}}}_\mathbf{1})\) \(G_1,\ldots ,G_m\) form a regular sequence (see Sect. 2.2) of \(\mathbb {F}_{q}[\varvec{A}_k]\).

  • \(({{{\mathbf {\mathsf{{H}}}}}}_\mathbf{2})\) \((\partial \varvec{G}/ \partial \varvec{A}_k)\) has full rank at every point of W.

  • \(({{{\mathbf {\mathsf{{H}}}}}}_\mathbf{3})\) \(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}}\) satisfy \(({{\mathsf {H}}}_1)\) and \(({{\mathsf {H}}}_2)\).

In what follows, we identify the set \(\overline{\mathbb {F}}_{q}[T]_r\) of monic polynomials of \(\overline{\mathbb {F}}_{q}[T]\) of degree r with \(\overline{\mathbb {F}}_{q}{}^r\) by mapping each \(f_{{\varvec{a}}_0}:= T^ r + a_{r-1}T^ {r-1}+\cdots +a_0 \in \overline{\mathbb {F}}_{q}[T]_r\) to \({\varvec{a}}_0:=(a_{r-1},\ldots ,a_0)\in \overline{\mathbb {F}}_{q}{}^r\). For \(\mathcal {B}\subset \overline{\mathbb {F}}_{q}[T]_r\), the set of elements of \(\mathcal {B}\) which are not square-free is called the discriminant locus \(\mathcal {D}(\mathcal {B})\) of \(\mathcal {B}\) (see [21, 40] for the study of discriminant loci). For \(f_{{\varvec{a}}_0}\in \mathcal {B}\), let \({{\mathrm {Disc}}}(f_{{\varvec{a}}_0}):={\mathrm {Res}}(f_{{\varvec{a}}_0},f'_{{\varvec{a}}_0})\) denote the discriminant of \(f_{{\varvec{a}}_0}\), that is, the resultant of \(f_{{\varvec{a}}_0}\) and its derivative \(f'_{{\varvec{a}}_0}\). Since \(f_{{\varvec{a}}_0}\) has degree r, by basic properties of resultants we have

$$\begin{aligned} {{\mathrm {Disc}}}(f_{{\varvec{a}}_0})= {\mathrm {Disc}}(F({\varvec{A}}_0, T))|_{{\varvec{A}}_0={\varvec{a}}_0} := {\mathrm {Res}}(F({\varvec{A}}_0, T), F'({\varvec{A}}_0, T), T)|_{{\varvec{A}}_0={\varvec{a}}_0}, \end{aligned}$$

where \(F({\varvec{A}}_0, T):=T^r+A_{r-1}T^{r-1}+\cdots +A_0\) is the generic monic polynomial of degree r, with \({\varvec{A}}_0:=(A_{r-1},\ldots ,A_0)\), and \({\mathrm {Res}}\) in the right-hand side denotes the resultant with respect to T. It follows that \(\mathcal {D}(\mathcal {B})=\{{\varvec{a}}_0 \in \mathcal {B}: {\mathrm {Disc}}(F({\varvec{A}}_0, T))|_{{\varvec{A}}_0={\varvec{a}}_0}= 0\}\). We shall further need to consider first subdiscriminant loci. The first subdiscriminant locus \(\mathcal {S}_1(\mathcal {B})\) of \(\mathcal {B}\subset \overline{\mathbb {F}}_{q}[T]_r\) is the set of \({\varvec{a}}_0\in \mathcal {D}(\mathcal {B})\) for which the first subdiscriminant \({\mathrm {Subdisc}}(f_{{\varvec{a}}_0}):={\mathrm {Subres}}(f_{{\varvec{a}}_0},f'_{{\varvec{a}}_0})\) vanishes, where \({\mathrm {Subres}}(f_{{\varvec{a}}_0},f'_{{\varvec{a}}_0})\) denotes the first subresultant of \(f_{{\varvec{a}}_0}\) and \(f'_{{\varvec{a}}_0}\). Since \(f_{{\varvec{a}}_0}\) has degree r, basic properties of subresultants imply

$$\begin{aligned} {\mathrm {Subdisc}}(f_{{\varvec{a}}_0} )&= {\mathrm {Subdisc}}(F({\varvec{A}}_0, T))|_{{\varvec{A}}_0={\varvec{a}}_0} \\&:= {\mathrm {Subres}}(F({\varvec{A}}_0, T),F'({\varvec{A}}_0, T), T)|_{{\varvec{A}}_0={\varvec{a}}_0}, \end{aligned}$$

where \({\mathrm {Subres}}\) in the right-hand side denotes the first subresultant with respect to T. We have \(\mathcal {S}_1(\mathcal {B})=\{{\varvec{a}}_0\in \mathcal {D}(\mathcal {B}): {\mathrm {Subdisc}}(F({\varvec{A}}_0, T))|_{{\varvec{A}}_0={\varvec{a}}_0}=0\}\). Our next conditions require that the discriminant and the first subdiscriminant loci intersect W well:

  • \(({{{\mathbf {\mathsf{{H}}}}}}_\mathbf{4})\) \(\mathcal {D}(W)\) has codimension at least one in W.

  • \(({{{\mathbf {\mathsf{{H}}}}}}_\mathbf{5})\) \( (A_0\cdot \mathcal {S}_1)(W):=\{{\varvec{a}}_0\in W: a_0=0\}\cup \mathcal {S}_1(W)\) has codimension at least one in \(\mathcal {D}(W)\).

  • \(({{{\mathbf {\mathsf{{H}}}}}}_\mathbf{6})\) \(\mathcal {D}(V(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}}))\) has codimension at least one in \(V(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}})\subset \overline{\mathbb {F}}_{q}{}^r\).
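The discriminant locus that enters these conditions can be explored numerically on \(\mathbb {F}_{p}\)-rational points. The Python sketch below, written under the assumption that q = p is prime, evaluates \({\mathrm {Res}}(f,f')\) as the determinant of the Sylvester matrix, taking \(f'\) with formal degree \(r-1\), and counts the monic polynomials of degree r with nonzero discriminant. The comparison value \(p^r-p^{r-1}\) for the number of square-free monic polynomials of degree \(r\ge 2\) is a classical fact, not a result of this paper; all function names are ours.

```python
from itertools import product


def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, high to low degree),
    with formal degrees len(f) - 1 and len(g) - 1."""
    n, m = len(f) - 1, len(g) - 1
    rows = [[0] * i + f + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * i + g + [0] * (n - 1 - i) for i in range(n)]
    return rows


def det_mod_p(mat, p):
    """Determinant over F_p (p prime) by Gaussian elimination."""
    a = [[x % p for x in row] for row in mat]
    n, det = len(a), 1
    for c in range(n):
        piv = next((i for i in range(c, n) if a[i][c]), None)
        if piv is None:
            return 0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det
        det = det * a[c][c] % p
        inv = pow(a[c][c], p - 2, p)        # inverse by Fermat's little theorem
        for i in range(c + 1, n):
            t = a[i][c] * inv % p
            for j in range(c, n):
                a[i][j] = (a[i][j] - t * a[c][j]) % p
    return det % p


def disc_mod_p(f, p):
    """Res(f, f') mod p for a monic f given high to low, e.g. [1, a2, a1, a0]."""
    r = len(f) - 1
    df = [(r - i) * c % p for i, c in enumerate(f[:-1])]   # formal degree r - 1
    return det_mod_p(sylvester(f, df), p)


p, r = 5, 3
square_free = sum(
    1 for a in product(range(p), repeat=r) if disc_mod_p([1, *a], p) != 0
)
```

Here `square_free` counts the complement of the discriminant locus of the whole set \(\mathbb {F}_{p}[T]_r\); restricting the loop to the tuples satisfying given equations \(G_i=0\) would count the complement of \(\mathcal {D}(W)\) among the \(\mathbb {F}_{p}\)-rational points of W.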

We briefly discuss hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\). Hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_2)\) merely state that W has the expected dimension \(r-m\) and is smooth. These conditions are satisfied for any sequence \(G_1,\ldots ,G_m\in \mathbb {F}_{q}[{\varvec{A}}_k]\) as above with general coefficients (see, e.g., [2] or [51]). Hypothesis \(({{\mathsf {H}}}_3)\) requires that \(G_1,\ldots ,G_m\) behave properly “at infinity,” which is also the case for general \(G_1,\ldots ,G_m\). Hypotheses \(({{\mathsf {H}}}_4)\)–\(({{\mathsf {H}}}_5)\) require that “most” of the polynomials of \(\mathcal {A}\) are square-free, and among those which are not, only “few” of them have roots with high multiplicity or several multiple roots. As we are looking for criteria for uniform distribution, namely families which behave as the whole set \(\mathbb {F}_{q}[T]_r\), it is clear that such a behavior is to be expected. Further, it is required that “few” polynomials in the family under consideration have 0 as a multiple root, which is a common requirement for uniformly distributed families (see, e.g., [11]). Finally, hypothesis \(({{\mathsf {H}}}_6)\) requires that the discriminant locus at infinity is not too large. We provide significant examples of families of polynomials satisfying hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\), which include in particular the classical case of polynomials with prescribed coefficients.

Our main result shows that any family \(\mathcal {A}\) satisfying hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\) is uniformly distributed in the sense of Cohen, and provides explicit estimates on the number \(|\mathcal {A}_{{\varvec{\lambda }}}|\) of elements of \(\mathcal {A}\) with factorization pattern \({\varvec{\lambda }}\). In fact, we have the following result (see Theorem 4.6 for a more precise statement).

Theorem 1.1

For \(m<r\) and \({\varvec{\lambda }}\) a factorization pattern, we have

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |\le q^{r-m-1}\big (\mathcal {T}({\varvec{\lambda }}) \big (D\delta \, q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big )+r^2\delta \big ), \end{aligned}$$

where \(\delta :=\prod _{i=1}^m {{{\mathsf {wt}}}}(G_i)\) and \(D:=\sum _{i=1}^m({{{\mathsf {wt}}}}(G_i)-1)\).
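The quantities in Theorem 1.1 can be made concrete on a toy example. The Python sketch below assumes q = p = 11 prime, r = 3, m = 1, k = 0 and the hypothetical constraint \(G:=A_2^2-A_1\), for which \({{{\mathsf {wt}}}}(G)=2\), so that \(\delta =2\) and \(D=1\). It counts \(|\mathcal {A}_{{\varvec{\lambda }}}|\) by brute force and prints it next to the main term \(\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\) and the right-hand side of the estimate. Hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\) are not checked for this G, and for such a small q the bound is far from sharp, so this is only an illustration of the notation. Patterns are encoded as sorted tuples of factor degrees, and \(\mathcal {T}\) is computed with the classical closed form \(1/\prod _i i^{\lambda _i}\lambda _i!\), which is not stated in the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial


def poly_mul(f, g, p):
    """Product of two polynomials over F_p (coefficient tuples, low to high)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return tuple(out)


def factorization_patterns(r, p):
    """Map each monic polynomial of degree r over F_p to the sorted tuple of
    degrees of its irreducible factors, counted with multiplicity."""
    by_degree = {0: {(1,): ()}}
    irreducibles = []
    for d in range(1, r + 1):
        found = {}
        for g in irreducibles:                      # irreducibles of degree < d
            dg = len(g) - 1
            for h, pat in by_degree[d - dg].items():
                found[poly_mul(g, h, p)] = tuple(sorted(pat + (dg,)))
        for low in product(range(p), repeat=d):
            f = low + (1,)
            if f not in found:                      # no proper factor: irreducible
                irreducibles.append(f)
                found[f] = (d,)
        by_degree[d] = found
    return by_degree[r]


def T(pattern):
    """Proportion of permutations whose cycle lengths are `pattern`."""
    denom = 1
    for length, mult in Counter(pattern).items():
        denom *= length ** mult * factorial(mult)
    return Fraction(1, denom)


p, r, m = 11, 3, 1
delta, D = 2, 1                                     # wt(G) = 2 for the toy G below
patterns = factorization_patterns(r, p)
# Toy constraint G = A_2^2 - A_1 with k = 0; f = (a_0, a_1, a_2, 1).
family = [f for f in patterns if (f[2] * f[2] - f[1]) % p == 0]
counts = Counter(patterns[f] for f in family)
for pat in sorted(counts):
    t = float(T(pat))
    main = t * p ** (r - m)
    bound = p ** (r - m - 1) * (
        t * (D * delta * p ** 0.5 + 14 * D ** 2 * delta ** 2 + r ** 2 * delta)
        + r ** 2 * delta
    )
    print(pat, counts[pat], round(main, 2), round(bound, 2))
```

Replacing the list comprehension that defines `family` by another polynomial condition on `(f[0], f[1], f[2])` gives the analogous table for other toy families; the values of `delta` and `D` must then be recomputed from the weights of the new constraints.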

Our methodology differs significantly from that of [10, 11], as we express \(|\mathcal {A}_{{\varvec{\lambda }}}|\) in terms of the set of common \(\mathbb {F}_{q}\)-rational zeros of certain symmetric multivariate polynomials defined over \(\mathbb {F}_{q}\). This allows us to establish several facts concerning the geometry of the set of zeros of such polynomials over \(\overline{\mathbb {F}}_{q}\). Combining these results with estimates on the number of common \(\mathbb {F}_{q}\)-rational zeros of such polynomials (see, e.g., [3] or [6]), we obtain our main results.

Then, we consider the average-case complexity of the classical factorization algorithm restricted to \(\mathcal {A}\). This algorithm works in four main steps. First, it performs an “elimination of repeated factors.” Then, it computes a (partial) factorization of the result of the first step by splitting its irreducible factors according to their degree (this is called the distinct-degree factorization). The third step factorizes each of the factors computed in the second step (the equal-degree factorization). Finally, the fourth step consists of the factorization of the repeated factors left aside in the first step (factorization of repeated factors). The following result summarizes our estimates on the average-case complexity of each of these steps (see Theorems 6.2, 6.4, 6.8 and 6.9 for more precise statements).

Theorem 1.2

Let \(\delta _{{\varvec{G}}}:=\deg G_1\cdots \deg G_m\). Denote by \(E[\mathcal {X} _1] \), \(E[\mathcal {X} _2]\), \(E[\mathcal {X} _3]\) and \(E[\mathcal {X} _4]\) the average cost on \(\mathcal {A}\) of the steps of elimination of repeated factors, distinct-degree factorization, equal-degree factorization and factorization of repeated factors.

For \(q > 15\delta _{{\varvec{G}}}^{13/3}\), assuming that fast multiplication is used, we have

$$\begin{aligned}&E [\mathcal {X} _1] \le c \, \mathcal {U} (r) +o(1),\quad \\&E[\mathcal {X}_2] \le \xi \,(2\, \tau _1 \lambda (q)+\tau _1+\tau _2\log r) \, M(r)\,(r+1)\big (1+o(1)\big ),\\&E[\mathcal {X}_3] \le \tau \, M(r) \log q\,(1+o(1)),\quad E[\mathcal {X}_4] \le \tau _1 M(r)(1+o(1)), \end{aligned}$$

where \(M(r):=r\log r\log \log r\) is the fast-multiplication time function, \(\mathcal {U}(r):=M(r)\log r\) is the \(\gcd \) time function, \(\lambda (q)\) is the number of multiplications required to compute qth powers using repeated squaring, \(\xi \approx 0.62432998\ldots \) is the Golomb–Dickman constant, and c, \(\tau _1\), \(\tau _2\) and \(\tau \) are constants independent of q and r.

Here, the o(1) terms go to zero as q tends to infinity, for fixed r and \(\deg G_1,\ldots ,\deg G_m\). See Theorems 6.2, 6.4, 6.8 and 6.9 for explicit expressions of these terms.

This result significantly strengthens the conclusions of the average-case analysis of [18], in that it shows that such conclusions are not only applicable to the whole set \(\mathbb {F}_{q}[T]_r\) of monic polynomials of degree r, but to any family \(\mathcal {A}\subset \mathbb {F}_{q}[T]_r\) satisfying hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\).
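For readers who want to see the second step of the classical algorithm in action, here is a minimal Python sketch of distinct-degree factorization for a monic square-free polynomial over \(\mathbb {F}_{p}\) with p prime. It is not the implementation analyzed in [18] or [50]: it uses schoolbook arithmetic instead of fast multiplication, polynomials are plain coefficient lists from low to high degree, and all function names are ours. The method is the standard one: for d = 1, 2, ..., take the gcd of the current polynomial with \(T^{p^d}-T\).

```python
def trim(f):
    """Drop zero leading coefficients in place (lists are low to high)."""
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_sub(a, b, p):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])


def poly_divmod(a, b, p):
    """Quotient and remainder of a by a nonzero b over F_p."""
    a = a[:]
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv = pow(b[-1], p - 2, p)
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        q[shift] = c
        for i, bi in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bi) % p
        trim(a)                              # the leading term has been cancelled
    return trim(q), a


def poly_gcd(a, b, p):
    """Monic gcd by the Euclidean algorithm (a and b not both zero)."""
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    inv = pow(a[-1], p - 2, p)
    return [c * inv % p for c in a]


def poly_mulmod(a, b, f, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return poly_divmod(trim(out), f, p)[1]


def poly_powmod(h, e, f, p):
    """h**e mod f by repeated squaring."""
    result, base = poly_divmod([1], f, p)[1], h
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result


def distinct_degree(f, p):
    """For monic square-free f, return [(g_d, d)], where g_d is the product of
    the irreducible factors of f of degree d."""
    x = [0, 1]
    h = poly_divmod(x, f, p)[1]
    out, d = [], 0
    while len(f) - 1 >= 2 * (d + 1):
        d += 1
        h = poly_powmod(h, p, f, p)          # h = T^(p^d) mod f
        g = poly_gcd(f, poly_sub(h, x, p), p)
        if len(g) > 1:
            out.append((g, d))
            f = poly_divmod(f, g, p)[0]
            h = poly_divmod(h, f, p)[1]
    if len(f) > 1:                           # what remains is irreducible
        out.append((f, len(f) - 1))
    return out
```

For instance, over \(\mathbb {F}_{3}\) the polynomial \(T^4+2=(T+1)(T+2)(T^2+1)\) is split into its degree-1 part \(T^2+2\) and its degree-2 part \(T^2+1\); the degree-1 part would then be passed to the equal-degree step.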

The paper is organized as follows. In Sect. 2, we collect the notions of algebraic geometry we use. In Sect. 3, we obtain a lower bound on the number of elements of the family \(\mathcal {A}\) under consideration. Section 4 is devoted to describing our algebraic-geometry approach to the distribution of factorization patterns and to proving Theorem 1.1. In Sect. 5, we exhibit examples of linear and nonlinear families of polynomials satisfying hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\). Finally, in Sect. 6 we perform the average-case analysis of the classical polynomial factorization restricted to \(\mathcal {A}\), showing Theorem 1.2.

2 Basic notions of algebraic geometry

In this section, we collect the basic definitions and facts of algebraic geometry that we need in the sequel. We use standard notions and notations which can be found in, e.g., [36, 47].

Let \(\mathbb {K}\) be any of the fields \(\mathbb {F}_{q}\) or \(\overline{\mathbb {F}}_{q}\). We denote by \(\mathbb {A}^r\) the affine r-dimensional space \(\overline{\mathbb {F}}_{q}{}^{r}\) and by \(\mathbb {P}^r\) the projective r-dimensional space over \(\overline{\mathbb {F}}_{q}{}^{r+1}\). Both spaces are endowed with their respective Zariski topologies over \(\mathbb {K}\), for which a closed set is the zero locus of a set of polynomials of \(\mathbb {K}[X_1,\ldots , X_r]\), or of a set of homogeneous polynomials of \(\mathbb {K}[X_0,\ldots , X_r]\).

A subset \(V\subset \mathbb {P}^r\) is a projective variety defined over \(\mathbb {K}\) (or a projective \(\mathbb {K}\)-variety for short) if it is the set of common zeros in \(\mathbb {P}^r\) of homogeneous polynomials \(F_1,\ldots , F_m \in \mathbb {K}[X_0,\ldots , X_r]\). Correspondingly, an affine variety of \(\mathbb {A}^r\) defined over \(\mathbb {K}\) (or an affine \(\mathbb {K}\)-variety) is the set of common zeros in \(\mathbb {A}^r\) of polynomials \(F_1,\ldots , F_{m} \in \mathbb {K}[X_1,\ldots , X_r]\). We think of a projective or affine \(\mathbb {K}\)-variety as equipped with the induced Zariski topology. We shall denote by \(\{F_1=0,\ldots , F_m=0\}\) or \(V(F_1,\ldots ,F_m)\) the affine or projective \(\mathbb {K}\)-variety consisting of the common zeros of \(F_1,\ldots , F_m\).

In the remaining part of this section, unless otherwise stated, all results referring to varieties in general should be understood as valid for both projective and affine varieties.

A \(\mathbb {K}\)-variety V is irreducible if it cannot be expressed as a finite union of proper \(\mathbb {K}\)-subvarieties of V. Further, V is absolutely irreducible if it is \(\overline{\mathbb {F}}_{q}\)-irreducible as a \(\overline{\mathbb {F}}_{q}\)-variety. Any \(\mathbb {K}\)-variety V can be expressed as an irredundant union \(V=\mathcal {C}_1\cup \cdots \cup \mathcal {C}_s\) of irreducible (absolutely irreducible) \(\mathbb {K}\)-varieties, unique up to reordering, called the irreducible (absolutely irreducible) \(\mathbb {K}\)-components of V.

For a \(\mathbb {K}\)-variety V contained in \(\mathbb {P}^r\) or \(\mathbb {A}^r\), its defining ideal I(V) is the set of polynomials of \(\mathbb {K}[X_0,\ldots , X_r]\), or of \(\mathbb {K}[X_1,\ldots , X_r]\), vanishing on V. The coordinate ring \(\mathbb {K}[V]\) of V is the quotient ring \(\mathbb {K}[X_0,\ldots ,X_r]/I(V)\) or \(\mathbb {K}[X_1,\ldots ,X_r]/I(V)\). The dimension \(\dim V\) of V is the length n of a longest chain \(V_0\varsubsetneq V_1 \varsubsetneq \cdots \varsubsetneq V_n\) of nonempty irreducible \(\mathbb {K}\)-varieties contained in V. We say that V has pure dimension n if every irreducible \(\mathbb {K}\)-component of V has dimension n. A \(\mathbb {K}\)-variety of \(\mathbb {P}^r\) or \(\mathbb {A}^r\) of pure dimension \(r-1\) is called a \(\mathbb {K}\)-hypersurface. A \(\mathbb {K}\)-hypersurface of \(\mathbb {P}^r\) (or \(\mathbb {A}^r\)) can also be described as the set of zeros of a single nonzero polynomial of \(\mathbb {K}[X_0,\ldots , X_r]\) (or of \(\mathbb {K}[X_1,\ldots , X_r]\)).

The degree \(\deg V\) of an irreducible \(\mathbb {K}\)-variety V is the maximum of \(|V\cap L|\), considering all the linear spaces L of codimension \(\dim V\) such that \(|V\cap L|<\infty \). More generally, following [33] (see also [22]), if \(V=\mathcal {C}_1\cup \cdots \cup \mathcal {C}_s\) is the decomposition of V into irreducible \(\mathbb {K}\)-components, we define the degree of V as

$$\begin{aligned} \deg V:=\sum _{i=1}^s\deg \mathcal {C}_i. \end{aligned}$$

The degree of a \(\mathbb {K}\)-hypersurface V is the degree of a polynomial of minimal degree defining V. We shall use the following Bézout inequality (see [22, 33, 52]): if V and W are \(\mathbb {K}\)-varieties of the same ambient space, then

$$\begin{aligned} \deg (V\cap W)\le \deg V \cdot \deg W. \end{aligned}$$
(2.1)

Let \(V\subset \mathbb {A}^r\) be a \(\mathbb {K}\)-variety, \(I(V)\subset \mathbb {K}[X_1,\ldots , X_r]\) its defining ideal and x a point of V. The dimension \(\dim _xV\) of V at x is the maximum of the dimensions of the irreducible \(\mathbb {K}\)-components of V containing x. If \(I(V)=(F_1,\ldots , F_m)\), the tangent space \(\mathcal {T}_xV\) to V at x is the kernel of the Jacobian matrix \((\partial F_i/\partial X_j)_{1\le i\le m,1\le j\le r}(x)\) of \(F_1,\ldots , F_m\) with respect to \(X_1,\ldots , X_r\) at x. We have \(\dim \mathcal {T}_xV\ge \dim _xV\) (see, e.g., [47, p. 94]). The point x is regular if \(\dim \mathcal {T}_xV=\dim _xV\); otherwise, x is called singular. The set of singular points of V is the singular locus \({\mathrm {Sing}}(V)\) of V; it is a closed \(\mathbb {K}\)-subvariety of V. A variety is called nonsingular if its singular locus is empty. For projective varieties, the concepts of tangent space, regular and singular point can be defined by considering an affine neighborhood of the point under consideration.

Let V and W be irreducible affine \(\mathbb {K}\)-varieties of the same dimension and \(f:V\rightarrow W\) a regular map with \(\overline{f(V)}=W\), where \(\overline{f(V)}\) denotes the closure of f(V) with respect to the Zariski topology of W. Such a map is called dominant. Then, f induces a ring extension \(\mathbb {K}[W]\hookrightarrow \mathbb {K}[V]\) by composition with f. We say that the dominant map f is finite if this extension is integral, namely each element \(\eta \in \mathbb {K}[V]\) satisfies a monic equation with coefficients in \(\mathbb {K}[W]\). A dominant finite morphism is necessarily closed. Another fact we shall use is that the preimage \(f^{-1}(S)\) of an irreducible closed subset \(S\subset W\) under a dominant finite morphism f is of pure dimension \(\dim S\) (see, e.g., [14, §4.2, Proposition]).

2.1 Rational points

Let \(\mathbb {P}^r(\mathbb {F}_{q})\) be the r-dimensional projective space over \(\mathbb {F}_{q}\) and \(\mathbb {A}^r(\mathbb {F}_{q})\) the r-dimensional \(\mathbb {F}_{q}\)-vector space \(\mathbb {F}_{q}^r\). For a projective variety \(V\subset \mathbb {P}^r\) or an affine variety \(V\subset \mathbb {A}^r\), we denote by \(V(\mathbb {F}_{q})\) the set of \(\mathbb {F}_{q}\)-rational points of V, namely \(V(\mathbb {F}_{q}):=V\cap \mathbb {P}^r(\mathbb {F}_{q})\) in the projective case and \(V(\mathbb {F}_{q}):=V\cap \mathbb {A}^r(\mathbb {F}_{q})\) in the affine case. For an affine variety V of dimension n and degree \(\delta \), we have the following bound (see, e.g., [3, Lemma 2.1]):

$$\begin{aligned} |V(\mathbb {F}_{q})|\le \delta \, q^n. \end{aligned}$$
(2.2)
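As a quick sanity check of (2.2), the following Python sketch counts by brute force the \(\mathbb {F}_{p}\)-rational points of an affine variety given by explicit equations, assuming q = p prime. The example curve \(X^2+Y^2=1\) in \(\mathbb {A}^2\) (dimension n = 1, degree \(\delta =2\)) and the helper name `count_points` are our own choices for illustration.

```python
from itertools import product


def count_points(polys, r, p):
    """Number of points of F_p^r at which all the given functions vanish mod p."""
    return sum(
        all(F(*x) % p == 0 for F in polys)
        for x in product(range(p), repeat=r)
    )


p = 7
circle = lambda x, y: x * x + y * y - 1     # plane curve: n = 1, delta = 2
n_points = count_points([circle], 2, p)
upper_bound = 2 * p ** 1                    # delta * q^n from (2.2)
```

The enumeration costs \(p^r\) evaluations, so it is only usable for very small p and r, but it is handy for testing counting code against the general bound.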

On the other hand, if V is a projective variety of dimension n and degree \(\delta \), then we have the following bound (see [25, Proposition 12.1] or [4, Proposition 3.1]; see [38] for more precise upper bounds):

$$\begin{aligned} |V(\mathbb {F}_{q})|\le \delta \, p_n, \end{aligned}$$
(2.3)

where \(p_n:=q^n+q^{n-1}+\cdots +q+1=|\mathbb {P}^n(\mathbb {F}_{q})|\).

2.2 Complete intersections

Elements \(F_1,\ldots , F_m\) in \(\mathbb {K}[X_1,\ldots ,X_r]\) or \(\mathbb {K}[X_0,\ldots ,X_r]\) form a regular sequence if \(F_1\) is nonzero and no \(F_i\) is zero or a zero divisor in the quotient ring \(\mathbb {K}[X_1,\ldots ,X_r]/ (F_1,\ldots ,F_{i-1})\) or \(\mathbb {K}[X_0,\ldots ,X_r]/ (F_1,\ldots ,F_{i-1})\) for \(2\le i \le m\). In that case, the (affine or projective) \(\mathbb {K}\)-variety \(V:=V(F_1,\ldots ,F_m)\) is called a set-theoretic complete intersection. We remark that V is necessarily of pure dimension \(r-m\). Further, V is called an (ideal-theoretic) complete intersection if its ideal I(V) over \(\mathbb {K}\) can be generated by m polynomials. We shall frequently use the following criterion to prove that a variety is a complete intersection (see, e.g., [15, Theorem 18.15]).

Theorem 2.1

Let \(F_1,\ldots ,F_m\in \mathbb {K}[X_1,\ldots ,X_r]\) be polynomials which form a regular sequence and let \(V:=V(F_1,\ldots ,F_m)\subset \mathbb {A}^r\). Denote by \((\partial \varvec{F}/\partial \varvec{X})\) the Jacobian matrix of \(F_1,\ldots ,F_m\) with respect to \(X_1,\ldots ,X_r\). If the subvariety of V defined by the set of common zeros of the maximal minors of \((\partial \varvec{F}/\partial \varvec{X})\) has codimension at least one in V, then \(F_1,\ldots ,F_m\) define a radical ideal. In particular, V is a complete intersection.

If \(V\subset \mathbb {P}^r\) is a complete intersection defined over \(\mathbb {K}\) of dimension \(r-m\), and \(F_1,\ldots , F_m\) is a system of homogeneous generators of I(V), the degrees \(d_1,\ldots , d_m\) depend only on V and not on the system of generators. Arranging the \(d_i\) in such a way that \(d_1\ge d_2 \ge \cdots \ge d_m\), we call \((d_1,\ldots , d_m)\) the multidegree of V. In this case, a stronger version of (2.1) holds, called the Bézout theorem (see, e.g., [32, Theorem 18.3]):

$$\begin{aligned} \deg V=d_1\cdots d_m. \end{aligned}$$
(2.4)

A complete intersection V is called normal if it is regular in codimension 1, that is, the singular locus \({\mathrm {Sing}}(V)\) of V has codimension at least 2 in V, namely \(\dim V-\dim {\mathrm {Sing}}(V)\ge 2\). (Actually, normality is a general notion that agrees on complete intersections with the one we define here.) A fundamental result for projective complete intersections is the Hartshorne connectedness theorem (see, e.g., [36, Theorem VI.4.2]): If \(V\subset \mathbb {P}^r\) is a complete intersection defined over \(\mathbb {K}\) and \(W\subset V\) is any \(\mathbb {K}\)-subvariety of codimension at least 2, then \(V{\setminus } W\) is connected in the Zariski topology of \(\mathbb {P}^r\) over \(\mathbb {K}\). Applying the Hartshorne connectedness theorem with \(W:={\mathrm {Sing}}(V)\), one deduces the following result.

Theorem 2.2

If \(V\subset \mathbb {P}^r\) is a normal complete intersection, then V is absolutely irreducible.

3 Estimates on the number of elements of \(\mathcal {A}\)

Let \(X_1 {,\ldots ,}X_r\) be indeterminates over \(\overline{\mathbb {F}}_{q}\). Denote by \(\Pi _1 {,\ldots ,}\Pi _{r}\) the elementary symmetric polynomials of \(\mathbb {F}_{q}[X_1 {,\ldots ,}X_r]\). Observe that \(f:=T^r+ a_{r-1}T^{r-1}+\cdots +a_0 \in \mathcal {A}\) if and only if there exists \({\varvec{x}}\in \mathbb {A}^r\) such that \(a_j=(-1)^{r-j} \Pi _{r-j}({\varvec{x}})\) for \(0 \le j \le r-1\) and

$$\begin{aligned}&R_i:=G_i(-\Pi _1({\varvec{x}}) {,\ldots ,}{(-1)^{r-k-1}\Pi _{r-k-1}({\varvec{x}})},\\&\quad {(-1)^{r-k+1}\Pi _{r-k+1}({\varvec{x}})} {,\ldots ,}(-1)^{r}\Pi _{r}({\varvec{x}}))=0 \end{aligned}$$

for \(1\le i \le m\). Thus, we associate with \(\mathcal {A}\) the polynomials \(R_1{,\ldots ,}R_m\in \mathbb {F}_{q}[X_1 {,\ldots ,}X_r]\) and the variety \(V \subset \mathbb {A}^r\) defined by \(R_1 {,\ldots ,}R_m\).
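The passage from roots to coefficients used here is just Vieta's formulas. The Python sketch below checks, over the integers, that \(a_j=(-1)^{r-j}\Pi _{r-j}({\varvec{x}})\) reproduces the coefficients of \(\prod _i (T-x_i)\), and then evaluates a hypothetical constraint \(G:=A_2^2-A_1\) (with r = 3 and k = 0, our choice for illustration) at those coefficients, which is how the polynomials \(R_i\) are obtained from the \(G_i\).

```python
from itertools import combinations
from math import prod


def elementary_symmetric(xs, i):
    """Pi_i evaluated at xs (Pi_0 = 1)."""
    return sum(prod(c) for c in combinations(xs, i))


def coefficients_from_roots(xs):
    """(a_{r-1}, ..., a_0) with a_j = (-1)^(r-j) * Pi_{r-j}(xs)."""
    r = len(xs)
    return [(-1) ** (r - j) * elementary_symmetric(xs, r - j)
            for j in range(r - 1, -1, -1)]


def expand_monic(xs):
    """(a_{r-1}, ..., a_0) of prod (T - x), by multiplying the factors out."""
    coeffs = [1]                                    # high to low degree
    for x in xs:
        coeffs = [a - x * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs[1:]


def R(xs):
    """Hypothetical example: G = A_2^2 - A_1 composed with the map x -> (a_2, a_1, a_0)."""
    a2, a1, _a0 = coefficients_from_roots(xs)
    return a2 * a2 - a1


roots = [1, 2, 3]
coeffs = coefficients_from_roots(roots)             # T^3 - 6 T^2 + 11 T - 6
```

Since `R` depends on the roots only through the elementary symmetric polynomials, it is a symmetric function of `xs`, which is the feature exploited in the sequel.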

Our estimates on the distribution of factorization patterns in \(\mathcal {A}\) require asymptotically tight estimates on the number of \(\mathbb {F}_{q}\)-rational points of V, and for the average-case analysis of the classical factorization algorithm restricted to \(\mathcal {A}\) we need asymptotically tight lower bounds on the number of elements of \(\mathcal {A}\). For this purpose, we shall prove several facts concerning the geometry of the affine varieties V and W.

Hypothesis \(({{\mathsf {H}}}_1)\) implies that W is a set-theoretic complete intersection of dimension \(r-m\). Furthermore, by \(({{\mathsf {H}}}_2)\) it follows that the subvariety of W defined by the set of common zeros of the maximal minors of \((\partial \varvec{G}/\partial \varvec{A}_k)\) has codimension at least one in W. Applying Theorem 2.1, we deduce the following result.

Lemma 3.1

\(W\subset \mathbb {A}^r\) is a complete intersection of dimension \(r-m\).

Consider the following surjective morphism of affine \(\mathbb {F}_{q}\)-varieties:

$$\begin{aligned} {{\varvec{\Pi }}^r}: \mathbb {A}^r&\rightarrow \mathbb {A}^{r} \nonumber \\ \varvec{x}&\mapsto (-\Pi _1(\varvec{x}),\ldots ,(-1)^{r}\Pi _{r}(\varvec{x})). \end{aligned}$$
(3.1)

It is easy to see that \({\varvec{\Pi }}^r\) is a dominant finite morphism with \({\varvec{\Pi }}^r(V)=W\). By hypothesis \(({{\mathsf {H}}}_1)\), the variety \(W^j:=V(G_1,\ldots ,G_j)\subset \mathbb {A}^r\) has pure dimension \(r-j\) for \(1\le j \le m\). This implies that \(V^j:=({\varvec{\Pi }}^r)^{-1}(W^j)=V(R_1{,\ldots ,}R_j)\) has pure dimension \(r-j\) for \(1\le j \le m\). We conclude that \(R_1{,\ldots ,}R_{m}\) form a regular sequence of \(\mathbb {F}_{q}[X_1{,\ldots ,}X_r]\), namely we have the following result.

Lemma 3.2

V is a set-theoretic complete intersection of dimension \(r-m\).

Next we study the singular locus of V. For this purpose, we make some remarks concerning the Jacobian matrix \((\partial {\varvec{\Pi }}^r/\partial {{\varvec{X}}})\) of \({\varvec{\Pi }}^r\) with respect to \(X_1,\ldots ,X_r\). Denote by \(A_r\) the \((r\times r)\)-Vandermonde matrix

$$\begin{aligned} A_r:=(X_j^{i-1})_{1\le i,j\le r}. \end{aligned}$$

Taking into account the following well-known identities (see, e.g., [37]):

$$\begin{aligned} \frac{\partial \Pi _i}{\partial X_{j}}= \Pi _{i-1}-X_{j} \Pi _{i-2} + X_{j}^2 \Pi _{i-3} +\cdots + (-1)^{i-1} X_{j}^{i-1}\quad (1\le i,j \le r), \end{aligned}$$

we conclude that \((\partial {\varvec{\Pi }}^r/\partial {{\varvec{X}}})\) can be factored as

$$\begin{aligned} \left( \frac{\partial {{\varvec{\Pi }}^r}}{\partial {\varvec{X}}}\right) :=B_r\cdot A_r := \left( \begin{array}{ccccc} -\,1 &{} \quad 0 &{} \quad 0 &{} \ldots &{} \quad \; 0 \\ \qquad \Pi _1 &{} -\,1 &{} \quad 0 &{} \quad &{} \\ \quad -\,\Pi _2 &{} \quad \Pi _1 &{} -\,1 &{} \ddots &{} \quad \;\vdots \\ \quad \vdots &{} \quad \vdots &{} \quad \vdots \;&{} \quad \ddots &{} \quad \; 0 \\ (-1)^{r}\Pi _{r-1}\quad &{} \quad (-1)^{r-1}\Pi _{r-2} \quad &{}\quad (-1)^{r-2}\Pi _{r-3} \quad &{}\quad \ldots \quad &{} \quad -\,1\,\end{array}\right) \cdot A_r. \end{aligned}$$
(3.2)

Since \(\det B_r=(-1)^r\), we see that

$$\begin{aligned} \det \left( \frac{\partial {{\varvec{\Pi }}^r}}{\partial {\varvec{X}}}\right) =(-1)^{r} \prod _{1\le i < j\le r}(X_j-X_i). \end{aligned}$$
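The determinant formula above, and the identity for \(\partial \Pi _i/\partial X_j\) behind it, can be checked numerically at an integer point. The Python sketch below builds the Jacobian from the elementary fact that \(\partial \Pi _i/\partial X_j\) equals \(\Pi _{i-1}\) evaluated at the variables other than \(X_j\) (a standard identity we assume here), and compares with the alternating-sum expression and with \((-1)^r\prod _{i<j}(X_j-X_i)\). The helper names are ours, and the Leibniz expansion is only meant for small r.

```python
from itertools import combinations, permutations
from math import prod


def e(xs, i):
    """Elementary symmetric polynomial Pi_i evaluated at xs."""
    return sum(prod(c) for c in combinations(xs, i))


def jacobian(xs):
    """Jacobian of Pi^r at xs: row i holds (-1)^i * dPi_i/dX_j,
    with dPi_i/dX_j = Pi_{i-1}(xs without x_j)."""
    r = len(xs)
    return [[(-1) ** i * e(xs[:j] + xs[j + 1:], i - 1) for j in range(r)]
            for i in range(1, r + 1)]


def derivative_via_identity(xs, i, j):
    """Alternating sum Pi_{i-1} - x_j Pi_{i-2} + ... + (-1)^(i-1) x_j^(i-1)."""
    return sum((-1) ** s * xs[j] ** s * e(xs, i - 1 - s) for s in range(i))


def det(mat):
    """Integer determinant by Leibniz expansion (fine for small sizes)."""
    n, total = len(mat), 0
    for perm in permutations(range(n)):
        inversions = sum(perm[a] > perm[b] for a in range(n) for b in range(a + 1, n))
        total += (-1) ** inversions * prod(mat[a][perm[a]] for a in range(n))
    return total


def vandermonde_product(xs):
    return prod(xs[j] - xs[i] for i in range(len(xs)) for j in range(i + 1, len(xs)))


xs = [2, 3, 5, 7]
lhs = det(jacobian(xs))
rhs = (-1) ** len(xs) * vandermonde_product(xs)
```

A numerical check at one point is of course no proof, but it catches sign and indexing slips when transcribing the matrices \(B_r\) and \(A_r\).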

A critical point in the study of the singular locus of V is the analysis of the zero locus of the \((r-1)\times (r-1)\) minors of \(\left( {\partial {{\varvec{\Pi }}^r}}/{\partial {\varvec{X}}}\right) \). For this purpose, we have the following result.

Proposition 3.3

For k with \( 0 \le k \le r-1\) as in the introduction and l with \( 1 \le l \le r\), denote by \(M_{r-k,l}\) the \((r-1)\times (r-1)\)-matrix obtained by deleting the row \(r-k\) and the column l of \((\partial {{\varvec{\Pi }}^r}/\partial {{\varvec{X}}})\). Then,

$$\begin{aligned} \det M_{r-k,l}=(-1)^{r-k-1}\Delta _l\cdot X_l^k, \end{aligned}$$
(3.3)

where \(\Delta _l:=\prod _{1\le i < j\le r, \, \, i,j \ne l}(X_j-X_i).\)
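Identity (3.3) can also be tested numerically before reading its proof. The Python sketch below evaluates both sides at an integer point for every admissible k and l; indices l are 1-based as in the statement, and the Jacobian is built from the standard identity \(\partial \Pi _i/\partial X_j=\Pi _{i-1}\) of the remaining variables, which we assume. All names are ours.

```python
from itertools import combinations, permutations
from math import prod


def e(xs, i):
    """Elementary symmetric polynomial Pi_i evaluated at xs."""
    return sum(prod(c) for c in combinations(xs, i))


def jacobian(xs):
    """Row i (1-based) holds (-1)^i * dPi_i/dX_j for j = 1, ..., r."""
    r = len(xs)
    return [[(-1) ** i * e(xs[:j] + xs[j + 1:], i - 1) for j in range(r)]
            for i in range(1, r + 1)]


def det(mat):
    """Integer determinant by Leibniz expansion (fine for small sizes)."""
    n, total = len(mat), 0
    for perm in permutations(range(n)):
        inversions = sum(perm[a] > perm[b] for a in range(n) for b in range(a + 1, n))
        total += (-1) ** inversions * prod(mat[a][perm[a]] for a in range(n))
    return total


def minor_direct(xs, k, l):
    """det of the Jacobian with row r - k and column l removed."""
    r = len(xs)
    rows = [row for i, row in enumerate(jacobian(xs), start=1) if i != r - k]
    return det([row[:l - 1] + row[l:] for row in rows])


def minor_formula(xs, k, l):
    """Right-hand side of (3.3): (-1)^(r-k-1) * Delta_l * x_l^k."""
    r = len(xs)
    rest = xs[:l - 1] + xs[l:]
    delta_l = prod(rest[j] - rest[i]
                   for i in range(r - 1) for j in range(i + 1, r - 1))
    return (-1) ** (r - k - 1) * delta_l * xs[l - 1] ** k


xs = [2, 3, 5, 7]
all_match = all(
    minor_direct(xs, k, l) == minor_formula(xs, k, l)
    for k in range(len(xs)) for l in range(1, len(xs) + 1)
)
```

Note that the formula concerns the plain determinant of the submatrix, without the cofactor sign \((-1)^{(r-k)+l}\).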

Proof

According to the factorization (3.2), we have

$$\begin{aligned} M_{r-k,l}=B_{r}^{r-k} \cdot A_{r}^l, \end{aligned}$$

where \(B_{r}^{r-k}\) is the \((r-1)\times r\)-submatrix of \(B_r\) obtained by deleting its \((r-k)\)th row and \(A_{r}^l\) is the \(r \times (r-1)\)-submatrix of \(A_{r}\) obtained by deleting its lth column. By the Cauchy–Binet formula, it follows that

$$\begin{aligned} \det M_{r-k,l}=\sum _{j=1}^r \det B_r^{r-k,j}\cdot \det A_r^{j,l}, \end{aligned}$$

where \(B_{r}^{r-k,j}\) is the \((r-1)\times (r-1)\)-matrix obtained by removing the jth column of \(B_r^{r-k}\) and \(A_r^{j,l}\) is the \((r-1)\times (r-1)\)-matrix obtained by removing the jth row of \(A_r^l\).

From [16, Lemma 2.1], we deduce that

$$\begin{aligned} \det A_r^{j,l}= \Delta _l \cdot \Pi _{r-j}^*, \end{aligned}$$
(3.4)

where \(\Pi _{r-j}^*=\Pi _{r-j}(X_1 {,\ldots ,}X_{l-1},X_{l+1} {,\ldots ,}X_r).\)

Next we obtain an explicit expression of \(\det B_{r}^{r-k,j}\) for \(1\le j \le r\). Observe that \(B_r^{r-k}\) has a block structure:

$$\begin{aligned} B_{r}^{r-k}:=\left( \begin{array}{ll} B_{r-k-1} &{}\quad \mathbf {0}\\ * &{}\quad \mathcal {T}_k^*\\ \end{array} \right) , \end{aligned}$$
(3.5)

where \(B_{r-k-1}\) is the \((r-k-1)\times (r-k-1)\) principal submatrix of \(B_r\) consisting of its first \(r-k-1\) rows and columns and \(\mathcal {T}_k^*\) is the \(k\times (k+1)\)-matrix

$$\begin{aligned} \mathcal {T}_k^*:=\left( \begin{array}{ccccccc} \quad \Pi _1 &{}\quad -\,1 &{}\quad 0 &{}\quad \ldots &{}\quad 0 &{}\quad 0 \\ -\,\Pi _2 &{}\quad \ddots &{} \quad \ddots &{} \quad &{}\quad \vdots &{} \quad \vdots \\ \vdots &{}\quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad 0 &{} \quad 0 \\ \vdots &{} \quad &{} \quad \ddots &{} \quad \ddots &{}\quad -\,1 &{}\quad 0 \\ (-1)^{k+1}\Pi _k &{}\quad \ldots &{} \quad \ldots &{} \quad -\,\Pi _2&{} \qquad \Pi _1 &{}\quad -\,1 \end{array} \right) . \end{aligned}$$

From (3.5), we readily deduce that

$$\begin{aligned} \det B_r^{r-k,j}=\left\{ \begin{array}{ll} 0 &{}\quad {\text {for }}1\le j \le r-k-1, \\ (-1)^{r-1} &{}\quad {\text {for }}j=r-k, \\ (-1)^{r-i-1} \det \mathcal {T}_i &{}\quad {\text {for }}j=r-k+i,\,1\le i\le k, \end{array}\right. \end{aligned}$$
(3.6)

where \(\mathcal {T}_i\) is the following \(i \times i\) Toeplitz–Hessenberg matrix:

$$\begin{aligned} \mathcal {T}_i:=\left( \begin{array}{ccccc} \quad \Pi _1 &{}\quad -\,1 &{}\quad 0 &{} \quad \ldots &{} \quad 0 \\ -\,\Pi _2\ \ &{} \quad \ddots &{} \quad \ddots &{} \quad &{} \quad \vdots \\ \vdots &{}\quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad 0 \\ \vdots &{}\quad &{} \quad \ddots &{} \quad \ddots &{}\quad -\,1\, \\ (-1)^{i+1}\Pi _{i} &{}\quad \ldots &{}\quad \ldots &{} \quad -\,\Pi _2&{}\qquad \Pi _1 \end{array} \right) . \end{aligned}$$

By the Trudi formula (see [43, Ch. VII]; see also [42, Theorem 1]), we deduce the following identity (see [42, Section 4]):

$$\begin{aligned} \det \mathcal {T}_i=H_i, \end{aligned}$$

where \(H_i:=H_i(X_1 {,\ldots ,}X_r)\) is the ith complete homogeneous symmetric function. Therefore, combining (3.4) and (3.6) we conclude that

$$\begin{aligned} \det M_{r-k,l} = \Delta _l\sum _{j=r-k}^r \det B_r^{r-k,j} \cdot \Pi _{r-j}^*&=\Delta _l \sum _{i=0}^k \det B_r^{r-k,\,i+r-k} \cdot \Pi _{k-i}^* \\&=\Delta _l \sum _{i=0}^k (-1)^{r-i-1} H_i \cdot \Pi ^*_{k-i}. \end{aligned}$$
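The identity \(\det \mathcal {T}_i=H_i\) used above can be checked numerically on small instances. The following is a minimal sketch, assuming integer values in place of the indeterminates \(X_1,\ldots ,X_r\); the helper names (`elementary`, `complete`, `toeplitz_hessenberg`, `det`) are ours and purely illustrative.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def elementary(xs, d):
    """Elementary symmetric polynomial Pi_d evaluated at the integers xs."""
    return sum(prod(c) for c in combinations(xs, d))


def complete(xs, d):
    """Complete homogeneous symmetric polynomial H_d evaluated at xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, d))


def toeplitz_hessenberg(xs, i):
    """The i x i matrix T_i (0-based indices a, b):
    (-1)^(a-b) * Pi_(a-b+1) on and below the diagonal,
    -1 on the superdiagonal, 0 elsewhere."""
    T = [[0] * i for _ in range(i)]
    for a in range(i):
        for b in range(i):
            if b <= a:
                T[a][b] = (-1) ** (a - b) * elementary(xs, a - b + 1)
            elif b == a + 1:
                T[a][b] = -1
    return T


def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum(
        (-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
        for c in range(len(M))
    )


# Sample point (an arbitrary choice) with r = 4.
SAMPLE = [2, 3, 5, 7]
TRUDI_HOLDS = all(
    det(toeplitz_hessenberg(SAMPLE, i)) == complete(SAMPLE, i) for i in range(1, 6)
)
```

A check of this kind at a few integer points is of course no proof; it only guards against sign slips in the entries of \(\mathcal {T}_i\).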

We claim that

$$\begin{aligned} S(k):=\sum _{i=0}^k (-1)^{r-i-1} H_i \cdot \Pi ^*_{k-i}=(-1)^{r-k-1}X_l^k, \quad k=0 {,\ldots ,}r-1. \end{aligned}$$
(3.7)

We prove the claim arguing by induction on k. Since \(H_0= \Pi _0^*=1\), the case \(k = 0\) follows immediately. Assume now that (3.7) holds for \(k-1\) with \(k>0\), namely

$$\begin{aligned} (-1)^{r-1}\sum _{i=0}^{k-1}(-1)^i H_i\cdot \Pi _{k-1-i}^*=(-1)^{r-k}X_l^{k-1}. \end{aligned}$$
(3.8)

It is well known that (see, e.g., [12, 7.§1, Exercise 10])

$$\begin{aligned} \sum _{i=0}^k (-1)^i H_i\cdot \Pi _{k-i}=0. \end{aligned}$$

Since \(\Pi _{k-i}^*=\Pi _{k-i}(X_1 {,\ldots ,}X_{l-1},X_{l+1} {,\ldots ,}X_r)\), we deduce that \(\Pi _{k-i}= X_l \cdot \Pi _{k-i-1}^*+ \Pi _{k-i}^*\). As a consequence, it follows that

$$\begin{aligned} \sum _{i=0}^k(-1)^i H_i \cdot \Pi _{k-i}^*=X_l \sum _{i=0}^{k-1} (-1)^{i-1} H_i \cdot \Pi _{k-i-1}^*. \end{aligned}$$

Combining this identity and the inductive hypothesis (3.8), we conclude that

$$\begin{aligned} S(k)=-X_l \sum _{i=0}^{k-1} (-1)^{r-i-1} H_i \cdot \Pi _{k-i-1}^* =- X_l\, (-1)^{r-k} X_l^{k-1} =(-1)^{r-k-1}X_l^k. \end{aligned}$$

This concludes the proof of the proposition. \(\square \)
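As a sanity check on the signs in (3.7), the identity can be tested at integer points. The sketch below assumes integer values for \(X_1,\ldots ,X_r\) and a 0-based index `l` for the distinguished variable \(X_l\); the function names are illustrative and not part of the argument.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def elementary(xs, d):
    """Elementary symmetric polynomial Pi_d at xs (0 when d > len(xs))."""
    return sum(prod(c) for c in combinations(xs, d))


def complete(xs, d):
    """Complete homogeneous symmetric polynomial H_d at xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, d))


def S(xs, l, k):
    """Left-hand side of (3.7): H_i uses all of xs, Pi^* omits xs[l]."""
    r = len(xs)
    rest = xs[:l] + xs[l + 1:]
    return sum(
        (-1) ** (r - i - 1) * complete(xs, i) * elementary(rest, k - i)
        for i in range(k + 1)
    )


def claim_holds(xs):
    """Check S(k) = (-1)^(r-k-1) * x_l^k for every l and 0 <= k <= r-1."""
    r = len(xs)
    return all(
        S(xs, l, k) == (-1) ** (r - k - 1) * xs[l] ** k
        for l in range(r)
        for k in range(r)
    )
```

For instance, `claim_holds([2, -3, 5, 7])` evaluates the identity for all sixteen pairs \((l,k)\) with \(r=4\).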

Denote by \((\partial \varvec{R}/\partial \varvec{X}):=(\partial R_i/\partial X_j)_{1\le i\le m,1\le j \le r}\) the Jacobian matrix of \(R_{1}{,\ldots ,}R_{m}\) with respect to \(X_1{,\ldots ,}X_r\).

Theorem 3.4

The set of \(\varvec{x}\in V\) for which \((\partial \varvec{R}/\partial \varvec{X})(\varvec{x})\) does not have full rank has codimension at least 2. In particular, the singular locus \(\Sigma \) of V has codimension at least 2.

Proof

By the chain rule, we have the equality

$$\begin{aligned} \left( \frac{\partial \varvec{R}}{\partial \varvec{X}}\right) =\left( \frac{\partial \varvec{G}}{\partial \varvec{A}}\circ {\varvec{\Pi }}\right) \cdot \left( \frac{\partial {\varvec{\Pi }}}{\partial \varvec{X}}\right) , \end{aligned}$$

where \({\varvec{\Pi }}:=(-\Pi _1,\ldots , (-1)^{r-k-1}\Pi _{r-k-1},(-1)^{r-k+1}\Pi _{r-k+1},\ldots ,(-1)^{r}\Pi _{r})\). Fix a point \(\varvec{x}:=(x_1,\ldots ,x_r)\in V\) such that \((\partial \varvec{R}/\partial \varvec{X})(\varvec{x})\) does not have full rank, and let \(\varvec{v}\in \mathbb {A}^{m}\) be a nonzero element in the left kernel of \((\partial \varvec{R}/\partial \varvec{X})(\varvec{x})\). We have

$$\begin{aligned} \varvec{0}= \varvec{v}\cdot \left( \frac{\partial \varvec{R}}{\partial \varvec{X}}\right) (\varvec{x})=\varvec{v}\cdot \left( \frac{\partial \varvec{G}}{\partial \varvec{A}}\right) \big ({\varvec{\Pi }}(\varvec{x})\big )\cdot \left( \frac{\partial {\varvec{\Pi }}}{\partial \varvec{X}}\right) (\varvec{x}). \end{aligned}$$

Since by hypothesis (\( {{\mathsf {H}}}_2\)) the Jacobian matrix \((\partial \varvec{G}/\partial \varvec{A})\big ({\varvec{\Pi }}(\varvec{x})\big )\) has full rank, we see that \(\varvec{w}:=\varvec{v}\cdot \left( {\partial \varvec{G}}/{\partial \varvec{A}}\right) \big ({\varvec{\Pi }}(\varvec{x})\big )\in \mathbb {A}^{r-1}\) is nonzero. As \(\varvec{w}\cdot \left( {\partial {\varvec{\Pi }}}/{\partial \varvec{X}}\right) (\varvec{x})= \varvec{0}\), all the maximal minors of \(\left( {\partial {\varvec{\Pi }}}/{\partial \varvec{X}}\right) (\varvec{x})\) must be zero. These minors are the determinants \(\det M_{r-k,l}(\varvec{x})\), where \(M_{r-k,l}\) are the matrices of Proposition 3.3.

Since \(\det M_{r-k,l}({\varvec{x}})=0\) for \(1\le l\le r\), Proposition 3.3 implies

$$\begin{aligned} x_i^k\Delta _i({\varvec{x}})=x_j^{k}\Delta _{j}({\varvec{x}})=0\quad (1\le i <j\le r). \end{aligned}$$

It follows that \({\varvec{x}}\) cannot have its r coordinates pairwise distinct. As a consequence, either \({\varvec{x}}\) has \(r-1\) pairwise-distinct coordinates, one of them being equal to zero, or \({\varvec{x}}\) has at most \(r-2\) pairwise-distinct coordinates. Let

$$\begin{aligned} g:=(T-x_1) \ldots (T-x_r)=T^r -\Pi _1({\varvec{x}})T^{r-1}+ \cdots +(-1)^{r} \Pi _{r}({\varvec{x}}). \end{aligned}$$

Observe that \({\varvec{\Pi }}^r({\varvec{x}}) \in W\). If there is a coordinate \(x_i=0\), then the constant coefficient of g is zero. On the other hand, if \({\varvec{x}}\) has at most \(r-2\) pairwise-distinct coordinates, then there exist \(i,j,l,h \in \{1 {,\ldots ,}r\}\) with \(i<j, l<h\) and \(\{i,j\} \cap \{l,h\}= \emptyset \) such that \(x_i=x_j\) and \(x_h=x_l\). If \(x_i\not = x_h\), then g has two distinct multiple roots, while in the case \(x_i=x_h\), g has a root of multiplicity at least 4. In both cases, g and \(g'\) have a common factor of degree at least 2, which implies that

$$\begin{aligned} {\mathrm {Disc}}(g)=0, \, \, {\mathrm {Subdisc}}(g)=0, \end{aligned}$$

namely \(g\in \mathcal {S}_1(W)\). In either case, \({\varvec{\Pi }}^r({\varvec{x}}) \in (A_0\cdot \mathcal {S}_1)(W)\). According to (\({{\mathsf {H}}}_4\)) and (\({{\mathsf {H}}}_5\)), \((A_0\cdot \mathcal {S}_1)(W)\) has codimension at least 2 in W. Since \({\varvec{\Pi }}^r\) is a finite morphism, we have that \(({\varvec{\Pi }} ^r)^{-1}\big ((A_0\cdot \mathcal {S}_1)(W)\big )\) has codimension at least 2 in V. In particular, the set of points \(\varvec{x}\in V\) with \({\mathrm {rank}}(\partial \varvec{R}/\partial \varvec{X})(\varvec{x})< m\) is contained in a subvariety of codimension 2 of V.

Now let \({\varvec{x}}\) be an arbitrary point of \(\Sigma \). By Lemma 3.2, we have \(\dim T_{{\varvec{x}}}V >r-m\). It follows that \({\mathrm {rank}}(\partial \varvec{R}/\partial \varvec{X})(\varvec{x})<m\), for otherwise we would have \(\dim T_{{\varvec{x}}}V \le r-m\), contradicting the hypothesis that \({\varvec{x}}\) is a singular point of V. Therefore, from the first assertion the theorem follows. \(\square \)

From Lemma 3.2 and Theorem 3.4, we obtain further consequences concerning the polynomials \(R_i\) and the variety V. Theorem 3.4 shows in particular that the set of points \(\varvec{x}\in V\) for which \((\partial \varvec{R}/\partial \varvec{X})(\varvec{x})\) does not have full rank has codimension at least one in V. Since \(R_{1}{,\ldots ,}R_{m}\) form a regular sequence, by Theorem 2.1 we conclude that \(R_{1}{,\ldots ,}R_{m}\) define a radical ideal of \(\mathbb {F}_{q}[X_1{,\ldots ,}X_r]\), and thus V is a complete intersection. In other words, we have the following result.

Corollary 3.5

\(R_{1}{,\ldots ,}R_{m}\) define a radical ideal and V is a complete intersection.

3.1 The geometry of the projective closure

Consider the embedding of \(\mathbb {A}^r\) into the projective space \(\mathbb {P}^r\) defined by the mapping \((x_1,\ldots , x_r)\mapsto (1:x_1:\ldots :x_r)\). The closure \({\mathrm {pcl}}(V)\subset \mathbb {P}^r\) of the image of V under this embedding in the Zariski topology of \(\mathbb {P}^r\) is called the projective closure of V. The points of \({\mathrm {pcl}}(V)\) lying in the hyperplane \(\{X_0=0\}\) are called the points of \({\mathrm {pcl}}(V)\) at infinity.

Denote by \(F^h\in \mathbb {F}_{q}[X_0,\ldots ,X_r]\) the homogenization of each \(F\in \mathbb {F}_{q}[X_1,\ldots ,X_r]\), and let \((R_{1}{,\ldots ,}R_{m})^h\) be the ideal generated by all the polynomials \(F^h\) with \(F\in (R_{1}{,\ldots ,}R_{m})\). We have that \((R_{1}{,\ldots ,}R_{m})^h\) is radical because \((R_{1}{,\ldots ,}R_{m})\) is a radical ideal (see, e.g., [36, §I.5, Exercise 6]). It is well known that \({\mathrm {pcl}}(V)\) is the \(\mathbb {F}_{q}\)-variety of \(\mathbb {P}^r\) defined by \((R_{1}{,\ldots ,}R_{m})^h\) (see, e.g., [36, §I.5, Exercise 6]). Furthermore, \({\mathrm {pcl}} (V)\) has pure dimension \(r-m\) (see, e.g., [36, Propositions I.5.17 and II.4.1]) and degree equal to \(\deg V\) (see, e.g., [7, Proposition 1.11]).

Next we discuss the behavior of \({\mathrm {pcl}} (V)\) at infinity. Consider the decomposition of each \(R_i\) into its homogeneous components, namely

$$\begin{aligned} R_i=R_i^{d_i}+R_i^{d_i-1}+\cdots +R_i^{0}, \end{aligned}$$

where each \(R_i^j\in \mathbb {F}_{q}[X_1{,\ldots ,}X_r]\) is homogeneous of degree j or zero, \(R_i^{d_i}\) being nonzero for \(1\le i\le m\). The homogenization of each \(R_i\) is the polynomial

$$\begin{aligned} R_i^h=R_i^{d_i}+R_i^{d_i-1}X_0+\cdots +R_i^{0}X_0^{d_i}. \end{aligned}$$
(3.9)

It follows that \(R_i^h(0,X_1{,\ldots ,}X_r)=R_i^{d_i}\) for \(1\le i \le m\). To express each \(R_i^{d_i}\) in terms of the component \(G_i^{{{{\mathsf {wt}}}}}\) of highest weight of \(G_i\), let \(A_0^{i_0}\cdots A_{k-1}^{i_{k-1}}A_{k+1}^{i_{k+1}}\cdots A_{r-1}^{i_{r-1}}\) be a monomial arising with nonzero coefficient in the dense representation of \(G_i\). Then, its weight

$$\begin{aligned} {{{\mathsf {wt}}}}(A_0^{i_0}\cdots A_{k-1}^{i_{k-1}}A_{k+1}^{i_{k+1}}\cdots A_{r-1}^{i_{r-1}})=\mathop {\sum _{j=0}^{r-1}}_ {j\not =k} (r-j) i_j \end{aligned}$$

equals the degree of the corresponding monomial \(\Pi _r^{i_0}\cdots \Pi _{r-k+1}^{i_{k-1}}\Pi _{r-k-1}^{i_{k+1}}\cdots \Pi _{1}^{i_{r-1}}\) of \(R_i\). We deduce the following result.

Lemma 3.6

\(R_i^{d_i}=G_i^{{{{\mathsf {wt}}}}}(-\Pi _1 {,\ldots ,}(-1)^{r-k-1}\Pi _{r-k-1},(-1)^{r-k+1}\Pi _{r-k+1} {,\ldots ,} (-1)^{r}\Pi _{r})\) for \(1\le i \le m\). In particular, \(\deg R_i={{{\mathsf {wt}}}}(G_i)\) for \(1\le i \le m\).
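The relation between the weight of a monomial in the \(A_j\) and the degree of its image in the \(X_j\) can be illustrated numerically. In the following sketch a monomial is encoded, by our own convention, as a dictionary `{j: i_j}` of exponents of \(A_j\), and homogeneity of the image is tested by rescaling an integer point; none of these names come from the text.

```python
from itertools import combinations
from math import prod


def elementary(xs, d):
    """Elementary symmetric polynomial Pi_d evaluated at the integers xs."""
    return sum(prod(c) for c in combinations(xs, d))


def weight(exponents, r):
    """wt of the monomial prod A_j^{i_j}: each A_j counts with weight r - j."""
    return sum((r - j) * e for j, e in exponents.items())


def image_at(exponents, xs):
    """Value at xs of the monomial after substituting A_j -> (-1)^(r-j) Pi_(r-j)."""
    r = len(xs)
    return prod(
        ((-1) ** (r - j) * elementary(xs, r - j)) ** e for j, e in exponents.items()
    )


def is_homogeneous_of_weight(exponents, xs, t=2):
    """Scaling xs by t must scale the image by t^wt if its degree equals wt."""
    scaled = [t * x for x in xs]
    w = weight(exponents, len(xs))
    return image_at(exponents, scaled) == t ** w * image_at(exponents, xs)
```

With \(r=4\), the monomial \(A_0A_3^2\) is encoded as `{0: 1, 3: 2}` and has weight \(4\cdot 1+1\cdot 2=6\), which is the degree of \(\Pi _4\Pi _1^2\).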

Denote by \((\partial \varvec{R}^{\varvec{d}}/\partial \varvec{X}):=(\partial R_i^{d_i}/\partial X_j)_{1\le i\le m,1\le j \le r}\) the Jacobian matrix of \(R_{1}^{d_1}{,\ldots ,}R_{m}^{d_m}\) with respect to \(X_1{,\ldots ,}X_r\). Let \(\Sigma ^{\infty }\subset \mathbb {P}^r\) be the singular locus of \({\mathrm {pcl}}(V)\) at infinity, namely the set of singular points of \({\mathrm {pcl}}(V)\) lying in the hyperplane \(\{X_0=0\}\). We have the following result.

Lemma 3.7

The set of points \(\varvec{x} \in V(R_1^{d_1},\ldots ,R_m^{d_m})\subset \mathbb {P}^{r-1}\) for which \((\partial \varvec{R}^{\varvec{d}}/\partial \varvec{X})(\varvec{x})\) does not have full rank has codimension at least 1 in \(V(R_1^{d_1}{,\ldots ,}R_m^{d_m})\). In particular, the singular locus \(\Sigma ^{\infty }\subset \mathbb {P}^r\) at infinity has dimension at most \(r-m-2\).

Proof

Consider the affine variety \(V_{\mathrm {aff}}(R_1^{d_1}{,\ldots ,}R_m^{d_m})\subset \mathbb {A}^r\) defined by \(R_1^{d_1}{,\ldots ,}R_m^{d_m}\). Hypothesis (\({{\mathsf {H}}}_3\)) asserts that \(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}}\) satisfy hypotheses (\({{\mathsf {H}}}_1\)) and (\({{\mathsf {H}}}_2\)). Therefore, Lemma 3.2 proves that \(V_{\mathrm {aff}}(R_1^{d_1}{,\ldots ,}R_m^{d_m})\) is a set-theoretic complete intersection of dimension \(r-m\). Denote by \(\Sigma _{\mathrm {aff}}^\infty \) the set of points \({\varvec{x}}\in V_{\mathrm {aff}}(R_1^{d_1}{,\ldots ,}R_m^{d_m})\) as in the statement of the lemma. Arguing as in the proof of Theorem 3.4 we conclude that any \({\varvec{x}}\in \Sigma _{\mathrm {aff}}^\infty \) cannot have its r coordinates pairwise distinct. This implies that \({\varvec{\Pi }}^r(\Sigma _{\mathrm {aff}}^\infty )\) is contained in the discriminant locus \(\mathcal {D}(V(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}}))\). By hypothesis \(({{\mathsf {H}}}_6)\), we have that \(\mathcal {D}(V(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}}))\) has codimension at least 1 in \(V(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{{\mathsf {wt}}}}})={\varvec{\Pi }}^r(V_{\mathrm {aff}}(R_1^{d_1}{,\ldots ,}R_m^{d_m}))\). Since \({\varvec{\Pi }}^r\) is a finite morphism, we deduce that \(\Sigma _{\mathrm {aff}}^\infty \) has codimension at least 1 in \(V_{\mathrm {aff}}(R_1^{d_1}{,\ldots ,}R_m^{d_m})\). The first assertion of the lemma follows.

Now let \(\varvec{x}:=(0:x_1:\ldots : x_r)\) be an arbitrary point of \(\Sigma ^{\infty }\). Since each \(R_i^{h}\) vanishes identically on \({\mathrm {pcl}}(V)\), we have \(R_i^{h}(\varvec{x})=R_i^{d_i}(x_1{,\ldots ,}x_r)=0\) for \(1\le i\le m\). Further, \((\partial \varvec{R}^{\varvec{d}}/\partial \varvec{X})(\varvec{x})\) does not have full rank, since otherwise we would have \(\dim \mathcal {T}_{\varvec{x}}({\mathrm {pcl}}(V))\le r-m\), which would imply that \(\varvec{x}\) is a nonsingular point of \({\mathrm {pcl}}(V)\), thus contradicting the hypothesis on \(\varvec{x}\). It follows that \(\Sigma ^{\infty }\) has codimension at least 1 in \(V(R_1^{d_1},\ldots ,R_m^{d_m})\), and thus dimension at most \(r-m-2\). \(\square \)

Our next result concerns the projective variety \(V(R_1^{d_1},\ldots ,R_m^{d_m})\subset \mathbb {P}^{r-1}\).

Lemma 3.8

\(V(R_1^{d_1},\ldots ,R_m^{d_m})\subset \mathbb {P}^{r-1}\) is a complete intersection of dimension \(r-m-1\), degree \(\prod _{i=1}^{m}d_i\) and singular locus of dimension at most \(r-m-2\).

Proof

Since \(G_1^{{{{\mathsf {wt}}}}}{,\ldots ,}G_m^{{{\mathsf {wt}}}}\) satisfy hypothesis (\({{\mathsf {H}}}_1\)), Lemma 3.2 shows that \(V(R_1^{d_1},\ldots ,R_m^{d_m})\) is a set-theoretic complete intersection of dimension \(r-m-1\). Furthermore, Lemma 3.7 shows that the set of \(\varvec{x} \in V(R_1^{d_1},\ldots ,R_m^{d_m})\) for which \((\partial \varvec{R}^{\varvec{d}}/\partial \varvec{X})(\varvec{x})\) does not have full rank has codimension at least 1 in \(V(R_1^{d_1}{,\ldots ,}R_m^{d_m})\). Then, Theorem 2.1 proves that \(R_1^{d_1},\ldots ,R_m^{d_m}\) define a radical ideal, and therefore \(V(R_1^{d_1}{,\ldots ,}R_m^{d_m})\) is a complete intersection.

In particular, the singular locus of \(V(R_1^{d_1} {,\ldots ,}R_m^{d_m})\) is the set of points \(\varvec{x} \in V(R_1^{d_1},\ldots ,R_m^{d_m})\) for which \((\partial \varvec{R}^{\varvec{d}}/\partial \varvec{X})(\varvec{x})\) does not have full rank, and hence it has dimension at most \(r-m-2\). Finally, the Bézout theorem (2.4) proves the assertion on the degree. \(\square \)

Now we prove our main result concerning \({\mathrm {pcl}}(V)\).

Theorem 3.9

The identity \({\mathrm {pcl}}(V)=V(R_1^{h},\ldots ,R_m^{h})\) holds and \({\mathrm {pcl}}(V)\) is a normal complete intersection of dimension \(r-m\) and degree \(\prod _{i=1}^m d_i\).

Proof

Observe that the following inclusions hold:

$$\begin{aligned}&V(R_1^{h},\ldots ,R_m^{h})\cap \{X_0\not =0\} \subset V(R_1,\ldots ,R_m),\\&V(R_1^{h},\ldots ,R_m^{h})\cap \{X_0=0\}\subset V(R_{1}^{d_1},\ldots ,R_{m}^{d_m}). \end{aligned}$$

Lemma 3.8 proves that \(V(R_1^{d_1},\ldots ,R_m^{d_m})\subset \mathbb {P}^{r-1} \) is a complete intersection of dimension \(r-m-1\) and singular locus of codimension at least 1. On the other hand, Lemma 3.2 and Theorem 3.4 show that \(V(R_1,\ldots ,R_m)\subset \mathbb {A}^r\) is of pure dimension \(r-m\) and its singular locus has codimension at least 2. We conclude that the same holds for \(V(R_1^{h},\ldots ,R_m^{h})\subset \mathbb {P}^r\). Since it is defined by m polynomials, it is a set-theoretic complete intersection. Further, by Theorem 3.4 and Lemma 3.7 the set of points \({\varvec{x}}\in V(R_1^h,\ldots ,R_m^h)\) for which \((\partial \varvec{R}^{\varvec{h}}/\partial {\varvec{X}})({\varvec{x}})\) does not have full rank has codimension at least 2 in \(V(R_1^h,\ldots ,R_m^h)\). Then, Theorem 2.1 proves that \(R_1^{h},\ldots ,R_m^{h}\) define a radical ideal and therefore \(V(R_1^{h},\ldots ,R_m^{h})\) is a normal complete intersection. By Theorem 2.2, it follows that \(V(R_1^{h},\ldots ,R_m^{h})\) is absolutely irreducible.

It is clear that \({\mathrm {pcl}}(V)\subset V(R_1^{h},\ldots ,R_m^{h})\). Since both varieties are of pure dimension \(r-m\) and \(V(R_1^{h},\ldots ,R_m^{h})\) is absolutely irreducible, the identity of the statement of the theorem follows. Finally, since \(R_1^h,\ldots ,R_m^h\) define a radical ideal, the Bézout theorem (2.4) proves the assertion on the degree. \(\square \)

We end the section with the following result, which allows us to control the number of \(\mathbb {F}_{q}\)-rational points of \({\mathrm {pcl}}(V)\) at infinity.

Remark 3.10

\(V_{ \infty }:={\mathrm {pcl}}(V)\cap \{X_0=0\}\subset \mathbb {P}^{r-1}\) has dimension \(r-m-1\). Indeed, recall that \({\mathrm {pcl}}(V)\) has pure dimension \(r-m\). Hence, each irreducible component of \({\mathrm {pcl}}(V)\cap \{X_0=0\}\) has dimension at least \(r-m-1\). From (3.9), we deduce that \({\mathrm {pcl}}(V)\cap \{X_0=0\}\subset V(R_1^{d_1}{,\ldots ,}R_m^{d_m})\). By Lemma 3.8, we have that \(V(R_1^{d_1},\ldots ,R_m^{d_m})\) has dimension \(r-m-1\). It follows that \({\mathrm {pcl}}(V)\cap \{X_0=0\}\) has also dimension \(r-m-1\).

3.2 Estimates on the number of \(\mathbb {F}_{q}\)-rational points of W

The results on V allow us to estimate the number of \(\mathbb {F}_{q}\)-rational points of W. We start with the following result.

Corollary 3.11

\(W\subset \mathbb {A}^r\) is absolutely irreducible.

Proof

By Theorems 3.9 and 2.2, we have that \({\mathrm {pcl}}(V)\) is absolutely irreducible. As a consequence, V is absolutely irreducible. Since \({\varvec{\Pi }}^r(V)=W\), the assertion follows. \(\square \)

As \(|\mathcal {A}|=|W(\mathbb {F}_{q})|\), we obtain estimates on the number of elements of \(\mathcal {A}\). Combining Corollary 3.11 with [3, Theorem 7.1], for \(q > \delta _{{\varvec{G}}}:=\deg (G_1)\ldots \deg (G_m)\) we have the following estimate:

$$\begin{aligned} \big ||\mathcal {A}|-q^{r-m}\big | \le (\delta _{{\varvec{G}}}-1)(\delta _{{\varvec{G}}}-2)q^{r-m-{1}/{2}}+5 \delta _{{\varvec{G}}}^{13/3}q^{r-m-1}. \end{aligned}$$

On the other hand, according to [3, Corollary 7.2], if \(q >15\delta _{{\varvec{G}}}^{13/3}\), then

$$\begin{aligned} \big ||\mathcal {A}|-q^{r-m}\big | \le (\delta _{{\varvec{G}}}-1)(\delta _{{\varvec{G}}}-2)q^{r-m-{1}/{2}}+7\delta _{{\varvec{G}}}^2q^{r-m-1}. \end{aligned}$$

We easily deduce the following result.

Theorem 3.12

For \(q >15\delta _{{\varvec{G}}}^{13/3}\), we have

$$\begin{aligned} |\mathcal {A}|\ge q^{r-m}\bigg (1-\frac{3\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg ){\text { and }}|\mathcal {A}|^{-1}\le q^{m-r}\bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg ). \end{aligned}$$

Further,

$$\begin{aligned} |\mathcal {A}|\ge \frac{1}{2}q^{r-m}. \end{aligned}$$
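The passage from the estimate of [3, Corollary 7.2] to Theorem 3.12 is elementary arithmetic, which can be spot-checked numerically. The sketch below divides the error term by \(q^{r-m}\) and evaluates it at the smallest integer exceeding \(15\delta _{{\varvec{G}}}^{13/3}\); treating q as an arbitrary integer rather than a prime power, and the function names themselves, are assumptions made only for this illustration.

```python
from math import floor


def relative_error(delta, q):
    """Error term of [3, Corollary 7.2] divided by q^(r-m)."""
    return (delta - 1) * (delta - 2) / q ** 0.5 + 7 * delta ** 2 / q


def threshold(delta):
    """Smallest integer q with q > 15 * delta^(13/3)."""
    return floor(15 * delta ** (13 / 3)) + 1


def theorem_3_12_holds_at_threshold(delta):
    """Check the three inequalities of Theorem 3.12 at q = threshold(delta)."""
    q = threshold(delta)
    e = relative_error(delta, q)
    t = delta ** (13 / 6) / q ** 0.5
    lower_bound = e <= 3 * t            # |A| >= q^(r-m) (1 - 3 t)
    half_bound = e <= 0.5               # |A| >= q^(r-m) / 2
    inverse_bound = 1 / (1 - e) <= 1 + 15 * t
    return lower_bound and half_bound and inverse_bound
```

Since the relative error decreases as q grows, checking at the threshold gives some evidence for the whole range \(q>15\delta _{{\varvec{G}}}^{13/3}\) for the tested values of \(\delta _{{\varvec{G}}}\), up to floating-point rounding.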

4 The distribution of factorization patterns in \(\mathcal {A}\)

Let \(\lambda _1,\ldots ,\lambda _r\) be nonnegative integers such that \(\lambda _1+2\lambda _2+\cdots +r\lambda _r=r\). Denote by \({\mathcal {P}}_{{\varvec{\lambda }}}\) the set of \(f\in \mathbb {F}_{q}[T]_r\) with factorization pattern \({\varvec{\lambda }}:=1^{\lambda _1}2^{\lambda _2}\ldots r^{\lambda _r}\), namely having exactly \(\lambda _i\) monic irreducible factors over \(\mathbb {F}_{q}\) of degree i (counted with multiplicity) for \(1\le i\le r\). Further, for \(\mathcal {S}\subset \mathbb {F}_{q}[T]_r\) we denote \(\mathcal {S}_{{\varvec{\lambda }}}:=\mathcal {S}\cap \mathcal {P}_{{\varvec{\lambda }}}\). In this section, we estimate the number \(|\mathcal {A}_{{\varvec{\lambda }}}|\) of elements of \(\mathcal {A}\) with factorization pattern \({\varvec{\lambda }}\), where \(\mathcal {A}\subset \mathbb {F}_{q}[T]_r\) is the family of (1.1).
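To make the notion of factorization pattern concrete, one may tabulate the patterns of all monic polynomials of small degree over a small prime field and compare them with cycle patterns in \(\mathbb {S}_r\). The following brute-force sketch assumes q = p prime, encodes a pattern as the sorted tuple of factor degrees instead of the vector \((\lambda _1,\ldots ,\lambda _r)\), and uses trial division; all names are hypothetical and the method is only suitable for tiny parameters.

```python
from collections import Counter
from itertools import permutations, product


def poly_divmod(a, b, p):
    """Divide a by monic b over F_p (coefficient lists, highest degree first)."""
    a = list(a)
    quotient = []
    while len(a) >= len(b):
        c = a[0] % p
        quotient.append(c)
        for i in range(len(b)):
            a[i] = (a[i] - c * b[i]) % p
        a.pop(0)
    return quotient, a


def factor_degrees(f, p):
    """Sorted degrees of the monic irreducible factors of monic f, with multiplicity.
    A monic divisor of least degree is irreducible, so degrees are tried in order."""
    degrees, d = [], 1
    while len(f) > 1:
        if 2 * d > len(f) - 1:
            degrees.append(len(f) - 1)  # no factor of degree < d: f is irreducible
            break
        for tail in product(range(p), repeat=d):
            q, rem = poly_divmod(f, [1, *tail], p)
            if not any(rem):
                degrees.append(d)
                f = q
                break
        else:
            d += 1
    return tuple(sorted(degrees))


def cycle_type(perm):
    """Sorted cycle lengths of a permutation given as a tuple of images."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        n, t = 0, start
        while t not in seen:
            seen.add(t)
            t = perm[t]
            n += 1
        if n:
            lengths.append(n)
    return tuple(sorted(lengths))


def polynomial_patterns(p, r):
    return Counter(factor_degrees([1, *tail], p) for tail in product(range(p), repeat=r))


def permutation_patterns(r):
    return Counter(cycle_type(s) for s in permutations(range(r)))
```

For p = 5 and r = 3, comparing `polynomial_patterns(5, 3)` (out of 125 polynomials) with `permutation_patterns(3)` (out of 6 permutations) shows proportions that are close but not equal, in line with the word "roughly" in Cohen's result.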

4.1 Factorization patterns and roots

Following the approach of [8], we show that the set \(\mathcal {A}_{{\varvec{\lambda }}}\) can be expressed in terms of certain symmetric polynomials.

Let \(f\in \mathbb {F}_{q}[T]_r\) and \(m\in \mathbb {F}_{q}[T]\) a monic irreducible factor of f of degree i. Then, m is the minimal polynomial of a root \(\alpha \) of f with \(\mathbb {F}_{q}(\alpha )=\mathbb {F}_{q^i}\). Denote by \(\mathbb {G}_i\) the Galois group \(\text{ Gal }(\mathbb {F}_{q^i},\mathbb {F}_{q})\) of \(\mathbb {F}_{q^i}\) over \(\mathbb {F}_{q}\). We may express m in the following way:

$$\begin{aligned} m=\prod _{\sigma \in \mathbb {G}_i}(T-\sigma (\alpha )). \end{aligned}$$

Hence, each irreducible factor m of f is uniquely determined by a root \(\alpha \) of f (and its orbit under the action of the Galois group of \(\overline{\mathbb {F}}_{q}\) over \(\mathbb {F}_{q}\)), and this root belongs to a field extension of \(\mathbb {F}_{q}\) of degree \(\deg m\). Now, for \(f\in \mathcal {P}_{{\varvec{\lambda }}}\), there are \(\lambda _1\) roots of f in \(\mathbb {F}_{q}\), say \(\alpha _1,\ldots ,\alpha _{\lambda _1}\) (counted with multiplicity), which are associated with the irreducible factors of f in \(\mathbb {F}_{q}[T]\) of degree 1; we may choose \(\lambda _2\) roots of f in \(\mathbb {F}_{q^{2}}{\setminus }\mathbb {F}_{q}\) (counted with multiplicity), say \(\alpha _{\lambda _1+1},\ldots , \alpha _{\lambda _1+\lambda _2}\), which are associated with the \(\lambda _2\) irreducible factors of f of degree 2, and so on. From now on, we assume that a choice of \(\lambda _1{+\cdots +}\lambda _r\) roots \(\alpha _1{,\ldots ,}\alpha _{\lambda _1 {+\cdots +}\lambda _r}\) of f in \(\overline{\mathbb {F}}_{q}\) is made in such a way that each monic irreducible factor of f in \(\mathbb {F}_{q}[T]\) is associated with one and only one of these roots.

Our aim is to express the factorization of f into irreducible factors in \(\mathbb {F}_{q}[T]\) in terms of the coordinates of the chosen \(\lambda _1{+\cdots +}\lambda _r\) roots of f with respect to certain bases of the corresponding extensions \(\mathbb {F}_{q}\hookrightarrow \mathbb {F}_{q^i}\) as \(\mathbb {F}_{q}\)-vector spaces. To this end, we express the root associated with each irreducible factor of f of degree i in a normal basis \(\Theta _i\) of the field extension \(\mathbb {F}_{q}\hookrightarrow \mathbb {F}_{q^i}\).

Let \(\theta _i\in \mathbb {F}_{q^i}\) be a normal element and \(\Theta _i\) the normal basis of the extension \(\mathbb {F}_{q}\hookrightarrow \mathbb {F}_{q^i}\) generated by \(\theta _i\), i.e.,

$$\begin{aligned} \Theta _i=\left\{ \theta _i,\ldots , \theta _i^{q^{i-1}}\right\} . \end{aligned}$$

The Galois group \(\mathbb {G}_i\) is cyclic and the Frobenius map \(\sigma _i:\mathbb {F}_{q^i}\rightarrow \mathbb {F}_{q^i}\), \(\sigma _i(x):=x^q\) is a generator of \(\mathbb {G}_i\). Thus, the coordinates in the basis \(\Theta _i\) of all the elements in the orbit of a root \(\alpha _k\in \mathbb {F}_{q^i}\) of an irreducible factor of f of degree i are the cyclic permutations of the coordinates of \(\alpha _k\) in the basis \(\Theta _i\).
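The cyclic-shift behaviour of coordinates in a normal basis can be observed in a small example. The sketch below models \(\mathbb {F}_8\) as \(\mathbb {F}_2[t]/(t^3+t^2+1)\), with elements encoded as 3-bit integers, and takes \(\theta \) to be the class of t; this model, the encoding and the names are our assumptions for illustration.

```python
from itertools import product

MODULUS = 0b1101  # t^3 + t^2 + 1, irreducible over F_2 (no roots in F_2)


def gf8_mul(a, b):
    """Multiply two elements of F_8 encoded as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:  # reduce modulo t^3 + t^2 + 1
            a ^= MODULUS
    return result


def frobenius(x):
    """The Frobenius map x -> x^q with q = 2."""
    return gf8_mul(x, x)


THETA = 0b010  # the class of t
NORMAL_BASIS = [THETA, frobenius(THETA), frobenius(frobenius(THETA))]


def from_coords(coords):
    """Element c0*theta + c1*theta^2 + c2*theta^4 with coordinates in F_2."""
    value = 0
    for c, basis_element in zip(coords, NORMAL_BASIS):
        if c:
            value ^= basis_element
    return value


ALL_COORDS = list(product((0, 1), repeat=3))

# theta is a normal element: the 8 coordinate vectors give 8 distinct elements.
IS_BASIS = len({from_coords(c) for c in ALL_COORDS}) == 8

# Frobenius acts on coordinates as the cyclic shift (c0, c1, c2) -> (c2, c0, c1).
SHIFT_OK = all(
    frobenius(from_coords((c0, c1, c2))) == from_coords((c2, c0, c1))
    for c0, c1, c2 in ALL_COORDS
)
```

In this model the orbit of a root under \(\mathbb {G}_3\) corresponds to the three cyclic shifts of its coordinate vector, which is the observation used in the text.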

The vector that gathers the coordinates of all the roots \(\alpha _1{,\ldots ,}\alpha _{\lambda _1+\cdots +\lambda _r}\) we choose to represent the irreducible factors of f in the normal bases \(\Theta _1{,\ldots ,}\Theta _r\) is an element of \(\mathbb {F}_{q}^r\), which is denoted by \({{\varvec{x}}}:=(x_1,\ldots ,x_r)\). Set

$$\begin{aligned} \ell _{i,j}:=\sum _{k=1}^{i-1}k\lambda _k+(j-1)\,i \end{aligned}$$
(4.1)

for \(1\le j \le \lambda _i\) and \(1\le i \le r\). Observe that the vector of coordinates of a root \(\alpha _{\lambda _1{+\cdots +}\lambda _{i-1}+j}\in \mathbb {F}_{q^i}\) is the sub-array \((x_{\ell _{i,j}+1},\ldots ,x_{\ell _{i,j}+i})\) of \({\varvec{x}}\). With these notations, the \(\lambda _i\) irreducible factors of f of degree i are the polynomials

$$\begin{aligned} m_{i,j}=\prod _{\sigma \in \mathbb {G}_i} \Big (T-\big (x_{\ell _{i,j}+1}\sigma (\theta _i)+\cdots + x_{\ell _{i,j}+i}\sigma (\theta _i^{q^{i-1}})\big )\Big ) \end{aligned}$$
(4.2)

for \(1\le j \le \lambda _i\). In particular,

$$\begin{aligned} f=\prod _{i=1}^r\prod _{j=1}^{\lambda _i}m_{i,j}. \end{aligned}$$
(4.3)
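The indexing by \(\ell _{i,j}\) in (4.1) simply cuts \((x_1,\ldots ,x_r)\) into consecutive blocks, one block of length i per irreducible factor of degree i. A minimal sketch, assuming the pattern is passed as a tuple `lam` with `lam[i-1]` equal to \(\lambda _i\) (the function names are ours):

```python
def offsets(lam):
    """Map (i, j) -> l_{i,j} = sum_{k<i} k*lambda_k + (j-1)*i, as in (4.1)."""
    out = {}
    for i, lam_i in enumerate(lam, start=1):
        base = sum(k * lam[k - 1] for k in range(1, i))
        for j in range(1, lam_i + 1):
            out[(i, j)] = base + (j - 1) * i
    return out


def blocks(lam):
    """Map (i, j) -> the 1-based indices l_{i,j}+1, ..., l_{i,j}+i of the sub-array."""
    return {key: list(range(l + 1, l + i + 1)) for key, l in offsets(lam).items() for i in [key[0]]}


def blocks_partition(lam):
    """Check that the blocks are disjoint and cover {1, ..., r}."""
    r = sum(i * lam_i for i, lam_i in enumerate(lam, start=1))
    indices = [idx for block in blocks(lam).values() for idx in block]
    return sorted(indices) == list(range(1, r + 1))
```

For example, for \(r=4\) and \({\varvec{\lambda }}=1^2 2^1\), i.e. `lam = (2, 1, 0, 0)`, the blocks are \((x_1)\), \((x_2)\) and \((x_3,x_4)\).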

Let \(X_1{,\ldots ,}X_r\) be indeterminates over \(\overline{\mathbb {F}}_{q}\), set \({\varvec{X}}:=(X_1,\ldots ,X_r)\) and consider the polynomial \(M\in \mathbb {F}_{q}[\varvec{X},T]\) defined as

$$\begin{aligned} M:=\prod _{i=1}^r\prod _{j=1}^{\lambda _i}M_{i,j},\quad M_{i,j}:=\prod _{\sigma \in \mathbb {G}_i} \Big (T-\big (X_{\ell _{i,j}+1}\sigma (\theta _i)+ \cdots +X_{\ell _{i,j}+i}\sigma (\theta _i^{q^{i-1}})\big )\Big ), \end{aligned}$$
(4.4)

where the \(\ell _{i,j}\) are defined as in (4.1). Our previous arguments show that \(f\in \mathbb {F}_{q}[T]_r\) has factorization pattern \({{\varvec{\lambda }}}\) if and only if there exists \({\varvec{x}}\in \mathbb {F}_{q}^r\) with \(f=M({{\varvec{x}}},T)\).

To discuss how many elements \({\varvec{x}}\in \mathbb {F}_{q}^r\) yield an arbitrary polynomial \(f=M({\varvec{x}},T)\in \mathcal {P}_{{\varvec{\lambda }}}\), we introduce the notion of an array of type \({\varvec{\lambda }}\). For \(\ell _{i,j}\) \((1\le i\le r,\ 1\le j\le \lambda _i)\) as in (4.1), we say that \({{\varvec{x}}}:=(x_1,\ldots , x_r)\in \mathbb {F}_{q}^r\) is of type \({\varvec{\lambda }}\) if and only if each sub-array \({\varvec{x}}_{i,j}:=(x_{\ell _{i,j}+1},\ldots ,x_{\ell _{i,j}+i})\) is a cycle of length i. The following result relates the set \(\mathcal {P}_{\varvec{\lambda }}\) with the set of elements of \(\mathbb {F}_{q}^r\) of type \({\varvec{\lambda }}\) (see [8, Lemma 2.2]).

Lemma 4.1

For any \({{\varvec{x}}}:=(x_1,\ldots , x_r)\in \mathbb {F}_{q}^r\), the polynomial \(f:=M({{\varvec{x}}},T)\) has factorization pattern \({\varvec{\lambda }}\) if and only if \({{\varvec{x}}}\) is of type \({\varvec{\lambda }}\). Furthermore, for each square-free polynomial \(f\in \mathcal {P}_{{\varvec{\lambda }}}\) there are \(w({{\varvec{\lambda }}}):=\prod _{i=1}^r i^{\lambda _i}\lambda _i!\) different \({{\varvec{x}}}\in \mathbb {F}_{q}^r \) with \(f=M({{\varvec{x}}},T)\).
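The quantity \(w({\varvec{\lambda }})\) is the same number that governs cycle patterns: it is a standard fact that \(r!/w({\varvec{\lambda }})\) permutations of \(\mathbb {S}_r\) have cycle pattern \({\varvec{\lambda }}\), so that \(1/w({\varvec{\lambda }})\) is the proportion \(\mathcal {T}({\varvec{\lambda }})\). The sketch below computes \(w({\varvec{\lambda }})\) and checks that these proportions sum to 1 over all patterns of a given degree; the tuple encoding `lam[i-1]` for \(\lambda _i\) and the function names are assumptions of this illustration.

```python
from fractions import Fraction
from math import factorial, prod


def w(lam):
    """w(lambda) = prod_i i^{lambda_i} * lambda_i!, with lam[i-1] = lambda_i."""
    return prod(i ** l * factorial(l) for i, l in enumerate(lam, start=1))


def patterns(r):
    """All tuples (lambda_1, ..., lambda_r) with sum_i i*lambda_i = r."""
    def rec(i, remaining):
        if i > r:
            if remaining == 0:
                yield ()
            return
        for l in range(remaining // i + 1):
            for rest in rec(i + 1, remaining - i * l):
                yield (l, *rest)
    return list(rec(1, r))


def total_proportion(r):
    """Sum of 1/w(lambda) over all patterns of degree r, as an exact fraction."""
    return sum(Fraction(1, w(lam)) for lam in patterns(r))
```

For instance, `w((0, 0, 1))` equals 3: an irreducible cubic has three preimages \({\varvec{x}}\), namely the cyclic shifts of the coordinates of any one of its roots.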

Consider the polynomial M of (4.4) as an element of \(\mathbb {F}_{q}[{\varvec{X}}][T]\). We shall express the coefficients of M by means of the vector of linear forms \({\varvec{Y}}:=(Y_1{,\ldots ,}Y_r)\), with \(Y_i\in \overline{\mathbb {F}}_{q}[{\varvec{X}}]\) defined in the following way for \(1\le i\le r\):

$$\begin{aligned} (Y_{\ell _{i,j}+1},\ldots ,Y_{\ell _{i,j}+i})^{t}:=A_{i}\cdot (X_{\ell _{i,j}+1},\ldots , X_{\ell _{i,j}+i})^{t} \quad (1\le j\le \lambda _i,\ 1\le i\le r), \end{aligned}$$
(4.5)

where \(A_i\in \mathbb {F}_{q^i}^{i\times i}\) is the matrix

$$\begin{aligned} A_i:=\left( \sigma (\theta _i^{q^{h}})\right) _{\sigma \in {\mathbb {G}}_i,\, 0\le h\le i-1}. \end{aligned}$$

According to (4.4), we may express the polynomial M as

$$\begin{aligned} M=\prod _{i=1}^r\prod _{j=1}^{\lambda _i}\prod _{s=1}^i(T-Y_{\ell _{i,j}+s})= \prod _{i=1}^r(T-Y_i)=T^r+\sum _{i=1}^r(-1)^i\,(\Pi _i({\varvec{Y}}))\, T^{r-i}, \end{aligned}$$

where \(\Pi _1({\varvec{Y}}){,\ldots ,}\Pi _r({\varvec{Y}})\) are the elementary symmetric polynomials of \(\mathbb {F}_{q}[{\varvec{Y}}]\). By (4.4), we see that M belongs to \(\mathbb {F}_{q}[{{\varvec{X}}},T]\), which in particular implies that \(\Pi _i({\varvec{Y}})\) belongs to \(\mathbb {F}_{q}[{{\varvec{X}}}]\) for \(1\le i\le r\). Combining these arguments with Lemma 4.1 we obtain the following result.

Lemma 4.2

A polynomial \(f:=T^r+a_{r-1}T^{r-1}{+\cdots +}a_0\in \mathbb {F}_{q}[T]_r\) has factorization pattern \({\varvec{\lambda }}\) if and only if there exists \(\varvec{x}\in \mathbb {F}_{q}^r\) of type \({\varvec{\lambda }}\) such that

$$\begin{aligned} a_i= (-1)^{{r-i}}\,\Pi _{r-i}({\varvec{Y}}({\varvec{x}})) \quad (0\le i\le r-1). \end{aligned}$$
(4.6)

In particular, for f square-free, there are \(w({\varvec{\lambda }})\) elements \({\varvec{x}}\) for which (4.6) holds.
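The sign convention in (4.6) can be double-checked by expanding \(\prod _i(T-y_i)\) directly. In the sketch below integers stand in for the values \(Y_i({\varvec{x}})\), which is an assumption made only to keep the example self-contained; the same expansion is valid over any commutative ring.

```python
from itertools import combinations
from math import prod


def elementary(ys, d):
    """Elementary symmetric polynomial Pi_d evaluated at ys."""
    return sum(prod(c) for c in combinations(ys, d))


def expand_monic(ys):
    """Coefficients [a_0, ..., a_{r-1}, 1] of prod_i (T - y_i), lowest degree first."""
    coeffs = [1]
    for y in ys:
        shifted = [0] + coeffs                    # T * current polynomial
        scaled = [-y * c for c in coeffs] + [0]   # -y * current polynomial
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def matches_4_6(ys):
    """Check a_i = (-1)^(r-i) * Pi_(r-i)(ys) for 0 <= i <= r-1."""
    r = len(ys)
    a = expand_monic(ys)
    return all(a[i] == (-1) ** (r - i) * elementary(ys, r - i) for i in range(r))
```

For `ys = [1, 2, 3]` the expansion is \(T^3-6T^2+11T-6\), so that \(a_0=-\Pi _3\), \(a_1=\Pi _2\) and \(a_2=-\Pi _1\).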

Recall that the family \(\mathcal {A}\) of (1.1) is defined by polynomials \(G_1 {,\ldots ,}G_m\) in \(\overline{\mathbb {F}}_{q}[\varvec{A}_k]\), for a fixed k with \(0 \le k \le r-1\). As a consequence, we may express the condition that an element of \(\mathcal {A}\) has factorization pattern \({\varvec{\lambda }}\) in terms of the elementary symmetric polynomials \(\Pi _1 {,\ldots ,}\Pi _{r-k-1},\Pi _{r-k+1} {,\ldots ,}\Pi _{r}\) of \(\mathbb {F}_{q}[{\varvec{Y}}]\).

Corollary 4.3

A polynomial \(f:=T^r+a_{r-1}T^{r-1}{+\cdots +}a_0\in \mathbb {F}_{q}[T]_r\) belongs to \(\mathcal {A}_{{\varvec{\lambda }}}\) if and only if there exists \(\varvec{x}\in \mathbb {F}_{q}^r\) of type \({\varvec{\lambda }}\) satisfying (4.6) such that

$$\begin{aligned}&G_j\big (-\Pi _{1}{,\ldots ,}(-1)^{r-k-1}\Pi _{r-k-1},(-1)^{r-k+1}\Pi _{r-k+1}{,\ldots ,}(-1)^r \Pi _{r} \big )({\varvec{Y}}({\varvec{x}}))=0\nonumber \\&\quad (1\le j\le m), \end{aligned}$$
(4.7)

where \(G_1{,\ldots ,}G_m\) are the polynomials defining the family \(\mathcal {A}\). In particular, if \(f:=M({\varvec{x}},T)\in \mathcal {A}_{{\varvec{\lambda }}}\) is square-free, then there are \(w({\varvec{\lambda }})\) elements \({\varvec{x}}\) for which (4.7) holds.

4.2 The number of polynomials in \(\mathcal {A}_{{\varvec{\lambda }}}\)

Given a factorization pattern \(\varvec{\lambda }\), in this section we estimate the number of elements of \(\mathcal {A}_{{\varvec{\lambda }}}\). For this purpose, in Corollary 4.3 we associate with \(\mathcal {A}_{{\varvec{\lambda }}}\) the polynomials \(R_1{,\ldots ,}R_m\in \mathbb {F}_{q}[{\varvec{X}}]\) defined as follows:

$$\begin{aligned} R_j:= G_j\big (-\Pi _{1}{,\ldots ,}(-1)^{r-k-1}\Pi _{r-k-1},(-1)^{r-k+1}\Pi _{r-k+1}{,\ldots ,}(-1)^r \Pi _{r} \big )({\varvec{Y}}({\varvec{X}})). \end{aligned}$$
(4.8)

Let \(V:=V(R_1 {,\ldots ,}R_m) \subset \mathbb {A}^r\) be the variety defined by \(R_1{,\ldots ,}R_m\). Since \(G_1 {,\ldots ,}G_m\) satisfy hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\), by Lemma 3.2, Corollary 3.5, Theorem 3.9 and Remark 3.10 we obtain the following result.

Theorem 4.4

Let m, r be positive integers with \(m<r\).

  (1) \(V\subset \mathbb {A}^r\) is a complete intersection of dimension \(r-m\).

  (2) \({\mathrm {pcl}}(V)\subset \mathbb {P}^r\) is a normal complete intersection of dimension \(r-m\) and degree \(\prod _{i=1}^m d_i\), where \(d_i:=\deg (R_i)={{{\mathsf {wt}}}}(G_i)\) for \(1\le i\le m\).

  (3) \(V_{\infty } :={\mathrm {pcl}}(V)\cap \{X_0=0\}\subset \mathbb {P}^{r-1}\) has dimension \(r-m-1\).

Now we estimate the number of \(\mathbb {F}_{q}\)-rational points of V. According to Theorem 4.4, \({\mathrm {pcl}}(V)\subset \mathbb {P}^r\) is a normal complete intersection defined over \(\mathbb {F}_{q}\), of dimension \(r-m\) and multidegree \(\varvec{d}:=(d_1{,\ldots ,}d_m)\). Therefore, [6, Corollary 8.4] implies the following estimate (see [4, 25, 26, 41] for further explicit estimates):

$$\begin{aligned} \big ||{\mathrm {pcl}}(V)(\mathbb {F}_{q})|-p_{r-m}\big |\le (\delta (D-2)+2)q^{r-m-\frac{1}{2}}+14 D^2 \delta ^2q^{r-m-1}, \end{aligned}$$

where \(p_{r-m}:=q^{r-m}+\cdots + q+1=|\mathbb {P}^{r-m}(\mathbb {F}_{q})|\), \(\delta :=d_1\cdots d_m\) and \(D:=\sum _{i=1}^m(d_i-1)\).

On the other hand, the Bézout inequality (2.1) implies \(\deg V_{\infty }\le \delta \). Then, by Theorem 4.4 and (2.3) we have

$$\begin{aligned} \big |V_{\infty }(\mathbb {F}_{q})\big |\le \delta p_{r-m-1}. \end{aligned}$$

It follows that

$$\begin{aligned} \big ||V(\mathbb {F}_{q})|-q^{r-m}\big |&= \big ||{\mathrm {pcl}}(V)(\mathbb {F}_{q})|-|V_{\infty } (\mathbb {F}_{q})|- p_{r-m}+p_{r-m-1}\big |\nonumber \\&\le \big ||{\mathrm {pcl}}(V)(\mathbb {F}_{q})|-p_{r-m}\big |+ \big |V_{\infty }(\mathbb {F}_{q})\big |+ 2{q^{r-m-1}} \nonumber \\&\le \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14D^2 \delta ^2+2 \delta +2\big )q^{r-m-1}. \end{aligned}$$
(4.9)
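As a numerical illustration of (4.9), the following Python sketch evaluates \(p_n\) and the right-hand side of (4.9) for a given multidegree. The function names `proj_points` and `affine_error_bound` are ours and purely illustrative; the sketch only restates the formulas above and does not count points on any variety.

```python
from math import prod, sqrt

def proj_points(n, q):
    """p_n = |P^n(F_q)| = q^n + ... + q + 1."""
    return sum(q**k for k in range(n + 1))

def affine_error_bound(degrees, q, r):
    """Right-hand side of (4.9) for the multidegree (d_1, ..., d_m)."""
    m = len(degrees)
    delta = prod(degrees)                # delta = d_1 * ... * d_m
    D = sum(d - 1 for d in degrees)      # D = sum of (d_i - 1)
    factor = (delta * (D - 2) + 2) * sqrt(q) + 14 * D**2 * delta**2 + 2 * delta + 2
    return factor * q**(r - m - 1)

# Two elementary facts used in the derivation of (4.9):
# p_{r-m} - p_{r-m-1} = q^{r-m} and p_{r-m-1} <= 2 q^{r-m-1}.
q, n = 7, 3
print(proj_points(n, q) - proj_points(n - 1, q) == q**n)
print(proj_points(n - 1, q) <= 2 * q**(n - 1))
print(affine_error_bound([2, 3], q=7, r=6))
```

The bound is returned as a floating-point number because of the factor \(q^{1/2}\).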

Let \(V^{=}\) be the subvariety of V defined as

$$\begin{aligned} V^{ =}:=\mathop {\bigcup _{1\le i\le r}}_{ 1\le j_1<j_2\le \lambda _i,\, \, 1\le k_1 <k_2 \le i} V\cap \{Y_{\ell _{i,j_1}+k_1}=Y_{\ell _{i,j_2}+k_2}\}, \end{aligned}$$

where \(Y_{\ell _{i,j}+k}\) are the linear forms of (4.5). Let \(V^{ \ne }(\mathbb {F}_{q}):=V(\mathbb {F}_{q})\backslash V^{ =}(\mathbb {F}_{q})\). We claim that \(V\cap \{Y_{\ell _{i,j_1}+k_1}=Y_{\ell _{i,j_2}+k_2}\}\) has dimension at most \(r-m-1\) for every \(1\le i\le r\), \(1\le j_1 <j_2\le \lambda _i\) and \(1\le k_1 <k_2 \le i\). Indeed, let \({\varvec{x}}\in V\cap \{Y_{\ell _{i,j_1}+k_1}=Y_{\ell _{i,j_2}+k_2}\}\) for \(i,j_1,j_2,k_1,k_2\) as above. By (4.4), we conclude that \(M({\varvec{x}},T)\) is not square-free, and therefore \(\Pi ^r({\varvec{Y}}({\varvec{x}}))\in \mathcal {D}(W)\). Since \(G_1{,\ldots ,}G_m\) satisfy \(({{\mathsf {H}}}_4)\), it follows that \(\dim \mathcal {D}(W)\le r-m-1\), and the fact that \(\Pi ^r\) is a finite morphism implies that \(\dim ((\Pi ^r)^{-1}(\mathcal {D}(W)))\le r-m-1\). This proves our claim.

The claim implies \(\dim V^{ =}\le r-m-1\). By the Bézout inequality (2.1), we have

$$\begin{aligned} \deg V^{ =}\le \deg V\sum _{i=1}^r\frac{i^2\lambda _i^2}{4}\le \frac{r^2}{4}\delta . \end{aligned}$$

As a consequence, by (2.2) we see that

$$\begin{aligned} |V^{ =}(\mathbb {F}_{q})|\le \deg V^{ =}\,q^{r-m-1}\le \frac{r^2\delta }{4}\, q^{r-m-1}. \end{aligned}$$
(4.10)

Finally, combining (4.9) and (4.10) we obtain the following result.

Theorem 4.5

For \(m<r\), we have

$$\begin{aligned} \big ||V^{ \ne }(\mathbb {F}_{q})|-q^{r-m}\big |&\le q^{r-m-1} \Big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+2\delta +2+r^2\delta /4\Big ), \end{aligned}$$

where \(\delta :=\prod _{i=1}^m {{{\mathsf {wt}}}}(G_i)\) and \(D:=\sum _{i=1}^m ({{{\mathsf {wt}}}}(G_i)-1).\)

Proof

By (4.10), \(|V^{ =}(\mathbb {F}_{q})|\le r^2\delta \, q^{r-m-1}/4\). Then, from (4.9) we deduce that

$$\begin{aligned} \big ||V^{ \ne }(\mathbb {F}_{q})|-q^{r-m}\big |&\le \big ||V(\mathbb {F}_{q})|-q^{r-m}\big |+\big |V^{ =}(\mathbb {F}_{q})\big |\\&\le \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14D^2 \delta ^2+2 \delta +2\big )q^{r-m-1}\\&\quad +\, \frac{r^2\delta }{4} q^{r-m-1}. \end{aligned}$$

This shows the statement of the theorem. \(\square \)

Next we use Corollary 4.3 to relate \(|V(\mathbb {F}_{q})|\) to the quantity \(|\mathcal {A}_{\varvec{\lambda }}|\). More precisely, let \({\varvec{x}}:=({\varvec{x}}_{i,j}:1\le i\le r,1\le j\le \lambda _i)\in \mathbb {F}_{q}^r\) be an \(\mathbb {F}_{q}\)-rational zero of \(R_1{,\ldots ,}R_m\) of type \({\varvec{\lambda }}\). Then, \({\varvec{x}}\) is associated with \(f\in \mathcal {A}_{{\varvec{\lambda }}}\) having \(Y_{\ell _{i,j}+k}({\varvec{x}}_{i,j})\) as an \(\mathbb {F}_{q^i}\)-root for \(1\le i\le r\), \(1\le j\le \lambda _i\) and \(1\le k\le i\), where \(Y_{\ell _{i,j}+k}\) is the linear form of (4.5).

Let \(\mathcal {A}_{{\varvec{\lambda }}}^{sq}:=\{f\in \mathcal {A}_{{\varvec{\lambda }}}: f \text{ is } \text{ square-free }\}\) and \(\mathcal {A}_{{\varvec{\lambda }}}^{nsq}:=\mathcal {A}_{{\varvec{\lambda }}}{{\setminus }} \mathcal {A}_{{\varvec{\lambda }}}^{sq}\). Corollary 4.3 shows that any element \(f\in \mathcal {A}_{{\varvec{\lambda }}}^{sq}\) is associated with \(w({\varvec{\lambda }}):=\prod _{i=1}^r i^{\lambda _i}\lambda _i!\) common \(\mathbb {F}_{q}\)-rational zeros of \(R_1{,\ldots ,}R_m\) of type \({\varvec{\lambda }}\). Observe that \({\varvec{x}}\in \mathbb {F}_{q}^r\) is of type \({\varvec{\lambda }}\) if and only if \(Y_{\ell _{i,j}+k_1}({\varvec{x}}) \ne Y_{\ell _{i,j}+k_2}({\varvec{x}})\) for \(1\le i\le r\), \(1\le j\le \lambda _i\) and \(1\le k_1 <k_2 \le i\). Furthermore, an \({\varvec{x}}\in \mathbb {F}_{q}^r\) of type \({\varvec{\lambda }}\) is associated with \(f\in \mathcal {A}_{{\varvec{\lambda }}}^{sq}\) if and only if \(Y_{\ell _{i,j_1}+k_1}({\varvec{x}}) \ne Y_{\ell _{i,j_2}+k_2}({\varvec{x}})\) for \(1\le i\le r\), \(1\le j_1<j_2\le \lambda _i\) and \(1\le k_1 <k_2 \le i\). It follows that \(|\mathcal {A}_{{\varvec{\lambda }}}^{sq}| =\mathcal {T}({\varvec{\lambda }}) \big |V^{\ne }(\mathbb {F}_{q})\big |\), where \(\mathcal {T}({\varvec{\lambda }}):=1/w({\varvec{\lambda }})\). This implies

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big | = \mathcal {T}({\varvec{\lambda }})\,\big ||V^{ \ne }(\mathbb {F}_{q})|-q^{r-m}\big |. \end{aligned}$$
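The identity \(\mathcal {T}({\varvec{\lambda }})=1/w({\varvec{\lambda }})\) can be checked by brute force for small r. The Python sketch below computes \(w({\varvec{\lambda }})\) and compares \(1/w({\varvec{\lambda }})\) with the proportion of permutations of \(\mathbb {S}_r\) having cycle pattern \({\varvec{\lambda }}\). Representing \({\varvec{\lambda }}\) as a dictionary `{i: lambda_i}` and the function names are our own choices for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def w(lam):
    """w(lambda) = prod_i i^{lambda_i} * lambda_i!, with lam = {i: lambda_i}."""
    out = 1
    for i, count in lam.items():
        out *= i**count * factorial(count)
    return out

def cycle_pattern(perm):
    """Cycle pattern {length: number of cycles} of a permutation of range(r)."""
    seen, pattern = set(), {}
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:             # walk the cycle containing `start`
            seen.add(j)
            j = perm[j]
            length += 1
        pattern[length] = pattern.get(length, 0) + 1
    return pattern

def proportion_by_enumeration(lam, r):
    """Proportion of elements of S_r with cycle pattern lam (feasible for small r only)."""
    lam = {i: c for i, c in lam.items() if c > 0}
    hits = sum(1 for p in permutations(range(r)) if cycle_pattern(p) == lam)
    return Fraction(hits, factorial(r))

lam = {1: 2, 2: 1}                       # the pattern 1^2 2^1 for r = 4
print(Fraction(1, w(lam)), proportion_by_enumeration(lam, 4))
```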

From Theorem 4.5, we deduce that

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&\le \,\mathcal {T}({\varvec{\lambda }})q^{r-m-1}\big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2\\&\quad +\,2\delta +2+r^2\delta /4\big )\\&\le \,\mathcal {T}({\varvec{\lambda }})q^{r-m-1}\big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big ). \end{aligned}$$

Now we are able to estimate \(|\mathcal {A}_{{\varvec{\lambda }}}|\). We have

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&= \big ||\mathcal {A}_{{\varvec{\lambda }}}^{sq}|+ |\mathcal {A}_{{\varvec{\lambda }}}^{nsq}|-\mathcal {T}({\varvec{\lambda }})q^{r-m}\big |\nonumber \\&\le \mathcal {T}({\varvec{\lambda }})q^{r-m-1}\big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big )\nonumber \\&\quad +\, |\mathcal {A}_{{\varvec{\lambda }}}^{nsq}|. \end{aligned}$$
(4.11)

It remains to bound \(|\mathcal {A}_{{\varvec{\lambda }}}^{nsq}|\). To this end, we observe that \(f\in \mathcal {A}\) is not square-free if and only if its discriminant is equal to zero, namely it belongs to the discriminant locus \(\mathcal {D}(W)\). By hypothesis \(({{\mathsf {H}}}_4)\), the discriminant locus \(\mathcal {D}(W)\) has dimension at most \(r-m-1\). Further, by the Bézout inequality (2.1) we have

$$\begin{aligned} \deg \mathcal {D}(W)\le & {} \deg W\cdot \deg \{{\varvec{a}}_0 \in \mathbb {A}^r: {\mathrm {Disc}}(F({\varvec{A}}_0, T))|_{{\varvec{A}}_0={\varvec{a}}_0}= 0\}\\\le & {} \delta _{{\varvec{G}}}\, r(r-1)\le \delta \, r^2. \end{aligned}$$

Then, (2.2) implies

$$\begin{aligned} |\mathcal {A}_{{\varvec{\lambda }}}^{nsq}|\le |\mathcal {A}^{nsq}|\le \delta _{{\varvec{G}}}\, r(r-1) \,q^{r-m-1}\le \delta \, r^2q^{r-m-1}. \end{aligned}$$
(4.12)

Hence, combining (4.11) and (4.12) we conclude that

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&\le q^{r-m-1}\Big (\mathcal {T}({\varvec{\lambda }})\big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big )+r^2\delta \Big ). \end{aligned}$$

In other words, we have the following result.

Theorem 4.6

For \(m<r\), we have that

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&\le \mathcal {T}({\varvec{\lambda }})q^{r-m-1} \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+ r^2\delta \big ),\\ \big ||\mathcal {A}_{{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&\le q^{r-m-1}\Big (\mathcal {T}({\varvec{\lambda }}) \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big )+r^2\delta \Big ), \end{aligned}$$

where \(\delta :=\prod _{i=1}^m {{{\mathsf {wt}}}}(G_i)\) and \(D:=\sum _{i=1}^m({{{\mathsf {wt}}}}(G_i)-1)\).
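To get a feeling for the size of the error terms in Theorem 4.6, one may evaluate them numerically. The following Python sketch does so under the assumption that the weights \({{{\mathsf {wt}}}}(G_1),\ldots ,{{{\mathsf {wt}}}}(G_m)\) and the pattern \({\varvec{\lambda }}\) (as a dictionary `{i: lambda_i}`) are supplied by the user; the function name `theorem_4_6_bounds` is hypothetical.

```python
from math import factorial, prod, sqrt

def cycle_proportion(lam):
    """T(lambda) = 1 / prod_i (i^{lambda_i} * lambda_i!)."""
    return 1 / prod(i**c * factorial(c) for i, c in lam.items())

def theorem_4_6_bounds(weights, lam, q, r):
    """Return (main term, bound for the square-free count, bound for the full count)."""
    m = len(weights)
    delta = prod(weights)
    D = sum(wt - 1 for wt in weights)
    T = cycle_proportion(lam)
    inner = (delta * (D - 2) + 2) * sqrt(q) + 14 * D**2 * delta**2 + r**2 * delta
    main_term = T * q**(r - m)
    sq_bound = T * q**(r - m - 1) * inner
    full_bound = q**(r - m - 1) * (T * inner + r**2 * delta)
    return main_term, sq_bound, full_bound

print(theorem_4_6_bounds(weights=[1], lam={1: 3}, q=7, r=3))
```

Comparing the first entry with the other two shows for which q the main term \(\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\) dominates the error terms.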

As we show in Sect. 5.1, Theorem 4.6 extends [8, Theorem 4.2]. More precisely, Theorem 4.6 holds for families defined by linearly independent linear polynomials \(G_1,\ldots ,G_m\in \mathbb {F}_{q}[A_{r-1},\ldots , A_2]\) with \({\mathrm {char}}(\mathbb {F}_{q})\) not dividing \(r(r-1)\), and linearly independent linear polynomials \(G_1,\ldots ,G_m\in \mathbb {F}_{q}[A_{r-1},\ldots , A_3]\) with \({\mathrm {char}}(\mathbb {F}_{q})>2\). The latter is precisely [8, Theorem 4.2].

5 Examples of linear and nonlinear families

In this section, we exhibit examples of linear and nonlinear families of polynomials satisfying hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\). Therefore, the estimate of Theorem 4.6 is valid for these families.

5.1 The linear families of [8]

Suppose that \({\mathrm {char}}(\mathbb {F}_{q})>3\). Let r, m, n be positive integers with \(2\le n \le r-m\) and \(L_1,\ldots ,L_m\in \mathbb {F}_{q}[A_{r-1},\ldots ,A_n]\) linear forms which are linearly independent. In [8] the distribution of factorization patterns of the following linear family is considered:

$$\begin{aligned} \mathcal {A}:=\left\{ T^r+a_{r-1}T^{r-1}{+\cdots +}a_0\in \mathbb {F}_{q}[T]: L_j(a_{r-1}{,\ldots ,}a_n)=0\quad (1\le j\le m)\right\} . \end{aligned}$$
(5.1)

Assume without loss of generality that the Jacobian matrix \((\partial L_i/\partial A_j)_{1\le i\le m,\,n\le j\le r-1}\) is lower triangular in row echelon form and denote by \(1\le i_1<\cdots <i_m\le r-n\) the positions corresponding to the pivots. We have the following result.

Lemma 5.1

If either \(n=2\) and \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r(r-1)\) or \(n \ge 3\), then \(L_1 {,\ldots ,}L_m\) satisfy hypotheses \(({\mathsf {H}} _1)\)–\(({\mathsf {H}} _6)\).

Proof

It is clear that hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_2)\) hold. Further, since the component of highest weight of \(L_k\) is of the form \(L_k^{{{{\mathsf {wt}}}}}=b_{k,r-i_k}A_{r-i_k}\) for \(1\le k\le m\), we conclude that \(({{\mathsf {H}}}_3)\) holds.

Now we analyze the validity of \(({{\mathsf {H}}}_4)\). Denote \(W:=V(L_1,\ldots ,L_m)\subset \mathbb {A}^r\). It is clear that

$$\begin{aligned} \overline{\mathbb {F}}_{q}[W]:=\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]/(L_1 {,\ldots ,}L_m) \simeq \overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}] \end{aligned}$$

is a domain, where \(\mathcal {J}:=\{{r-1} {,\ldots ,}0\}{{\setminus }} \{r-i_1 {,\ldots ,}r-i_m\}\). Therefore, it suffices to prove that the coordinate class \(\mathcal {R}\) defined by \({\mathrm {Disc}}(F(\varvec{A}_0,T))\) in \(\overline{\mathbb {F}}_{q}[W]\) is a nonzero polynomial in \(\overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}]\), where \(F({\varvec{A}}_0,T):=T^r+A_{r-1}T^{r-1} +\cdots + A_0\) and \({\varvec{A}}_0:=(A_{r-1} {,\ldots ,}A_0)\). If \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r(r-1)\), then the nonzero monomial \(r^rA_0^{r-1}\) occurs in the dense representation of \(\mathcal {R}\). On the other hand, if \( {\mathrm {char}}(\mathbb {F}_{q})\) divides r, then the nonzero monomial \(A_1^r\) occurs in the dense representation of \(\mathcal {R}\). Finally, if \({\mathrm {char}}(\mathbb {F}_{q})\) divides \(r-1\), then we have the nonzero monomial \(A_0^{r-1}\) in the dense representation of \(\mathcal {R}\).
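The monomials mentioned above are those of the discriminant of the trinomial \(T^r+A_1T+A_0\). As a sanity check over the integers (not over \(\mathbb {F}_{q}\)), the Python sketch below computes the discriminant from the Sylvester matrix and compares it with the classical closed form \((-1)^{r(r-1)/2}\big (r^rA_0^{r-1}+(-1)^{r-1}(r-1)^{r-1}A_1^r\big )\). We assume the convention \({\mathrm {Disc}}(f)=(-1)^{r(r-1)/2}{\mathrm {Res}}(f,f')\) for monic f, which may differ from the normalization used in the text by a sign; all function names are ours.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:                 # a row swap flips the sign
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for j in range(col, n):
                a[i][j] -= factor * a[col][j]
    return sign * result

def discriminant(coeffs):
    """Disc of a monic polynomial of degree r >= 2; coefficients from highest degree down."""
    r = len(coeffs) - 1
    deriv = [c * (r - k) for k, c in enumerate(coeffs[:-1])]
    size = 2 * r - 1
    rows = [[0] * i + coeffs + [0] * (size - i - r - 1) for i in range(r - 1)]
    rows += [[0] * i + deriv + [0] * (size - i - r) for i in range(r)]
    return (-1) ** (r * (r - 1) // 2) * det(rows)

def trinomial_discriminant(r, a1, a0):
    """Closed form for Disc(T^r + a1*T + a0)."""
    sign = (-1) ** (r * (r - 1) // 2)
    return sign * (r**r * a0 ** (r - 1) + (-1) ** (r - 1) * (r - 1) ** (r - 1) * a1**r)

r, a1, a0 = 5, 2, 3
print(discriminant([1, 0, 0, 0, a1, a0]) == trinomial_discriminant(r, a1, a0))
```

Reducing the two coefficients \(r^r\) and \((r-1)^{r-1}\) modulo the characteristic shows which of the two monomials survives in each of the cases considered above.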

Next we show that \(({{\mathsf {H}}}_5)\) is fulfilled. For this purpose, we first prove that \(A_0, L_1 {,\ldots ,}L_m\), \({\mathrm {Disc}}(F(\varvec{A}_0,T))\) form a regular sequence of \(\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]\). We observe that

$$\begin{aligned} \overline{\mathbb {F}}_{q}[A_{r-1}{,\ldots ,}A_0]/(A_0, L_1 {,\ldots ,}L_m) \simeq \overline{\mathbb {F}}_{q}[A_k : k \in \mathcal {J}_1] \end{aligned}$$

is a domain, where \(\mathcal {J}_1:=\mathcal {J}{\setminus } \{0\}\). Hence, considering the class \(\mathcal {R}_1\) of \({\mathrm {Disc}}(F(\varvec{A}_0,T))\) as an element of \(\overline{\mathbb {F}}_{q}[A_k : k \in \mathcal {J}_1]\), it is enough to prove that it is nonzero. Indeed, if \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r(r-1)\), then the monomial \((-1)^{r-1}(r-1)^{r-1}A_1^r\) occurs in the dense representation of \(\mathcal {R}_1\), while for \({\mathrm {char}}(\mathbb {F}_{q})\) dividing r, the monomial \(A_1^r\) appears in \(\mathcal {R}_1\). Finally, for \(n \ge 3\) and \({\mathrm {char}}(\mathbb {F}_{q})\) dividing \(r-1\), we have the nonzero monomial \((-1)^{r+1}A_1^2A_2^{r-1}\) in the dense representation of \(\mathcal {R}_1\).

Finally, we prove that \(L_1 {,\ldots ,}L_m,{\mathrm {Disc}}(F(\varvec{A}_0,T)),{\mathrm {Subdisc}}(F(\varvec{A}_0,T))\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]\). Recall that \(\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]/(L_1 {,\ldots ,}L_m) \simeq \overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}]\) is a domain. Therefore, we may consider the classes \(\mathcal {R}\) and \(\mathcal {S}_1\) of \({\mathrm {Disc}}(F(\varvec{A}_0,T))\) and \({\mathrm {Subdisc}}(F(\varvec{A}_0,T))\) modulo \((L_1 {,\ldots ,}L_m)\) as elements of \(\overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}]\). We have already shown that \(\mathcal {R}\) is nonzero. On the other hand, if \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r(r-1)\), then the nonzero monomial \(r(r-1)^{r-2}A_1^{r-2}\) occurs in the dense representation of \(\mathcal {S}_1\), while for \({\mathrm {char}}(\mathbb {F}_{q})\) dividing \(r(r-1)\), we have the nonzero monomial \(2(-1)^r(r-2)^{r-2}A_2^{r-1}\) in the dense representation of \(\mathcal {S}_1\). We conclude that \(\mathcal {S}_1\) is nonzero.

Further, [40, Theorem A.3] or [45, Theorem 3.1.7] shows that \(\mathcal {R}\) is an irreducible element of \(\overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}]\) and hence \( \mathbb {B}:=\overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}]/(\mathcal {R})\) is a domain. Thus, it suffices to see that the class of \(\mathcal {S}_1\) in \(\mathbb {B}\) is nonzero. If not, then \(\mathcal {S}_1\) would be a nonzero multiple of \(\mathcal {R}\) in \(\overline{\mathbb {F}}_{q}[A_k: k \in \mathcal {J}]\), which is not possible because \({\text {max}}\{\deg _{A_1}\mathcal {R},\deg _{A_2}\mathcal {R}\}=r\) and \(\text {max}\{\deg _{A_1}\mathcal {S}_1,\deg _{A_2}\mathcal {S}_1\}= r-1\).

Finally, we prove that \(({{\mathsf {H}}}_6)\) holds. The components of highest weight of \(L_1{,\ldots ,}L_m\) being of the form \(L_k^{{{{\mathsf {wt}}}}}=b_{k,r-i_k}A_{r-i_k}\) for \(k=1 {,\ldots ,}m\), arguing as before we readily conclude that \(({{\mathsf {H}}}_6)\) holds. \(\square \)

From Lemma 5.1, it follows that the family \(\mathcal {A}\) of (5.1) satisfies the hypotheses of Theorem 4.6. Therefore, applying Theorem 4.6 we obtain the following result.

Theorem 5.2

Suppose that \({\mathrm {char}}(\mathbb {F}_{q})>3\). Let \(\mathcal {A}\) be the family of (5.1) and \(\varvec{\lambda }\) a factorization pattern. If either \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r(r-1)\) and \(L_k\in \mathbb {F}_{q}[A_{r-1},\ldots ,A_2]\) for \(1\le k\le m\), or \(L_k\in \mathbb {F}_{q}[A_{r-1},\ldots ,A_n]\) for \(1\le k\le m\) and \( 3 \le n \le r-m\), then

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&\le \mathcal {T}({\varvec{\lambda }})q^{r-m-1} \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+ r^2\delta \big ),\\ \big ||\mathcal {A}_{{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^{r-m}\big |&\le q^{r-m-1}\Big (\mathcal {T}({\varvec{\lambda }}) \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big )+r^2\delta \Big ), \end{aligned}$$

where \(\delta :=\prod _{j=1}^m i_j\) and \(D:=\sum _{j=1}^m(i_j-1)\).

5.2 A linear family from [23]

In [23], there are experimental results on the number of irreducible polynomials on certain families over \(\mathbb {F}_{q}\). Further, the distribution of factorization patterns on general families of polynomials of \(\mathbb {F}_{q}[T]\) of a given degree is stated as an open problem. In particular, the family of polynomials we now discuss is considered.

Suppose that \({\mathrm {char}}(\mathbb {F}_{q})>3\). For positive integers s and r with \(3\le s \le r-2\), let

$$\begin{aligned} \mathcal {A}:=\{T^r+g(T)T+1: \,\, g\in \mathbb {F}_{q}[T] \,\, \text {and}\, \deg g\le s-1\}. \end{aligned}$$
(5.2)

Observe that \(\mathcal {A}\) is isomorphic to the set of \(\mathbb {F}_{q}\)-rational points of the affine \(\mathbb {F}_{q}\)-subvariety of \(\mathbb {A}^r\) defined by the polynomials

$$\begin{aligned} G_1:=A_{0}-1,\ G_2:=A_{s+1},\ldots , G_{r-s}:=A_{r-1}. \end{aligned}$$

We show that hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\) are fulfilled. It is easy to see that \(({{\mathsf {H}}}_1)\) and \(({{\mathsf {H}}}_2)\) hold, since \(G_1{,\ldots ,}G_{r-s}\) are linearly independent polynomials of degree 1. Furthermore, taking into account that

$$\begin{aligned} G_1^{{{{\mathsf {wt}}}}}=A_{0},\ G_2^{{{{\mathsf {wt}}}}}=A_{s+1},\ldots , G_{r-s}^{{{{\mathsf {wt}}}}}=A_{r-1}, \end{aligned}$$

we immediately conclude that hypothesis \(({{\mathsf {H}}}_3)\) holds.

Now, we analyze the validity of hypotheses \(({{\mathsf {H}}}_4)\) and \(({{\mathsf {H}}}_5)\). Let \(W\subset \mathbb {A}^r\) be the \(\mathbb {F}_{q}\)-variety defined by the polynomials \(G_1,\ldots ,G_{r-s}\), and denote by \(\mathcal {D}(W)\subset \mathbb {A}^r\) and \(\mathcal {S}_1(W)\subset \mathbb {A}^r\) the discriminant locus and the first subdiscriminant locus of W, respectively.

We first prove that \(\mathcal {D}(W)\) has codimension one in W. It is clear that \(G_1,\ldots ,G_{r-s}\) form a regular sequence of \(\mathbb {F}_{q}[A_{r-1},\ldots ,A_0]\). Observe that

$$\begin{aligned} \overline{\mathbb {F}}_{q}[W]=\overline{\mathbb {F}}_{q}[A_{r-1},\ldots ,A_0]/(G_1,\ldots ,G_{r-s})\simeq \overline{\mathbb {F}}_{q}[A_s,\ldots ,A_1] \end{aligned}$$

is a domain. As a consequence, we may consider the coordinate function \(\mathcal {R}\) defined by \({\mathrm {Disc}}(F(\varvec{A}_0,T))\) as an element of \(\overline{\mathbb {F}}_{q}[A_s {,\ldots ,}A_1]\), where \(\varvec{A}_0:=(A_{r-1},\ldots ,A_0)\) and \(F(\varvec{A}_0,T):=T^r+A_{r-1}T^{r-1}+\cdots + A_0\). We observe that \(\mathcal {R}\not =0\) in \(\overline{\mathbb {F}}_{q}[A_s {,\ldots ,}A_1]\), because the class of \(F({\varvec{A}}_0,T)\) in \(\overline{\mathbb {F}}_{q}[W][T]\) is a separable polynomial, and therefore \(\mathcal {R}\) is not a zero divisor of \(\overline{\mathbb {F}}_{q}[W]\). It follows that \(\mathcal {D}(W)\) has codimension one in W, namely hypothesis \(({{\mathsf {H}}}_4)\) holds.

Next we show that \((A_0\cdot \mathcal {S}_1)(W)\) has codimension at least one in \(\mathcal {D}(W)\). Since \(G_1:=A_0-1\) vanishes on W, the coordinate function of \(\overline{\mathbb {F}}_{q}[W]\) defined by \(A_0\) is a unit, which proves that \((A_0\cdot \mathcal {S}_1)(W)=\mathcal {S}_1(W)\).

In what follows, we shall use the following elementary property.

Lemma 5.3

Let \(F_1 {,\ldots ,}F_m \in \overline{\mathbb {F}}_{q}[A_{0} {,\ldots ,}A_{r-1}]\). If \(F_1 {,\ldots ,}F_m\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_0 {,\ldots ,}A_i)[A_{i+1} {,\ldots ,}A_{r-1}]\), then \(F_1 {,\ldots ,}F_m\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_0 {,\ldots ,}A_{r-1}]\).

We shall also use the following property of regular sequences.

Lemma 5.4

Let \(F_1 {,\ldots ,}F_m \in \overline{\mathbb {F}}_{q}[A_0 {,\ldots ,}A_{r-1}]\). For an assignment of positive integer weights \({{{\mathsf {wt}}}}\) to the variables \(A_0 {,\ldots ,}A_{r-1}\), denote by \(F_1^{{{{\mathsf {wt}}}}} {,\ldots ,}F_m^{{{{\mathsf {wt}}}}}\) the components of highest weight of \(F_1 {,\ldots ,}F_m\). If \(F_1^{{{{\mathsf {wt}}}}} {,\ldots ,}F_m^{{{{\mathsf {wt}}}}}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_0 {,\ldots ,}A_{r-1}]\), then \(F_1 {,\ldots ,}F_m\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_0 {,\ldots ,}A_{r-1}]\).

Proof

Let \(V_j:=V(F_1{,\ldots ,}F_j)\subset \mathbb {A}^r\) for \(1\le j\le m\). It is enough to see that \(V_j\) has codimension j for \(1 \le j \le m\). By hypothesis, \(V_j^{{{{\mathsf {wt}}}}}:=V(F_1^{{{{\mathsf {wt}}}}} {,\ldots ,}F_j^{{{{\mathsf {wt}}}}})\subset \mathbb {A}^r\) has pure dimension \(r-j\). Therefore, there exist \(0\le {k_1}<\cdots < k_{r-j}\le r-1\) such that the variety \(V:=V(F_1^{{{{\mathsf {wt}}}}} {,\ldots ,}F_j^{{{{\mathsf {wt}}}}},A_{k_1} {,\ldots ,}A_{k_{r-j}} )\subset \mathbb {A}^r\) has dimension zero. Consider the following morphism of affine \(\mathbb {F}_{q}\)-varieties:

$$\begin{aligned} {\varvec{\phi }}: \mathbb {A}^r&\rightarrow \mathbb {A}^{r}\\ (a_0 {,\ldots ,}a_{r-1})&\mapsto (a_0^{{{{\mathsf {wt}}}}(0)}, a_1^{{{{\mathsf {wt}}}}(1)},\ldots ,a_{r-1}^{{{{\mathsf {wt}}}}(r-1)}), \end{aligned}$$

where \({{{\mathsf {wt}}}}(0){,\ldots ,}{{{\mathsf {wt}}}}(r-1)\) are the weights assigned to \(A_0{,\ldots ,}A_{r-1}\), respectively. It is clear that \(\varvec{\phi }\) is a finite, dominant morphism. Observe that if \(F\in \overline{\mathbb {F}}_{q}[A_0,\ldots ,A_{r-1}]\) is weighted homogeneous, then \(\varvec{\phi }(F)\) is homogeneous.
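The last observation can be illustrated on a small example: a monomial of weight w becomes, after the substitution \(A_k\mapsto A_k^{{{{\mathsf {wt}}}}(k)}\), a monomial of degree w. The Python sketch below represents a polynomial as a dictionary from exponent tuples to coefficients; this representation, the sample polynomial and the function names are our own assumptions for the sake of illustration.

```python
def weight(monomial, wt):
    """Weight of the monomial A_0^{e_0} ... A_{r-1}^{e_{r-1}} for the weights wt."""
    return sum(w * e for w, e in zip(wt, monomial))

def substitute(poly, wt):
    """Apply A_k -> A_k^{wt(k)} to a polynomial {exponent tuple: coefficient}."""
    return {tuple(w * e for w, e in zip(wt, mono)): c for mono, c in poly.items()}

def is_homogeneous(poly):
    """True if all monomials of poly have the same total degree."""
    return len({sum(mono) for mono in poly}) <= 1

# Weights wt(A_0) = 3, wt(A_1) = 2, wt(A_2) = 1.
wt = (3, 2, 1)
# A_0^2 + A_1^3 + A_0*A_1*A_2 is weighted homogeneous of weight 6, but not homogeneous.
poly = {(2, 0, 0): 1, (0, 3, 0): 1, (1, 1, 1): 1}

print({weight(mono, wt) for mono in poly})        # a single weight
print(is_homogeneous(poly))                       # not homogeneous before substituting
print(is_homogeneous(substitute(poly, wt)))       # homogeneous afterwards
```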

We have that \(\varvec{\phi }(V) \subset \mathbb {A}^r\) is a zero-dimensional affine cone. Since \(\varvec{\phi }(V)\) is defined by the homogeneous polynomials \(F_i^{{{{\mathsf {wt}}}}}(A_0^{{{{\mathsf {wt}}}}(0)},\ldots ,A_{r-1}^{{{{\mathsf {wt}}}}(r-1)})\), \(1\le i \le j\), and \(A_{k_i} ^{{{{\mathsf {wt}}}}(k_{i})}\), \(1\le i \le r-j\), it must be \(\varvec{\phi }(V)=\{0\}\). Therefore, by, e.g., [44, Proposition 18], the affine variety defined by the polynomials

$$\begin{aligned} F_1(A_0^{{{{\mathsf {wt}}}}(0)},\ldots ,A_{r-1}^{{{{\mathsf {wt}}}}(r-1)}),\ldots , F_j(A_0^{{{{\mathsf {wt}}}}(0)},\ldots ,A_{r-1}^{{{{\mathsf {wt}}}}(r-1)}),A_{k_1}^{{{{\mathsf {wt}}}}(k_1)} {,\ldots ,}A_{k_{r-j}}^{{{{\mathsf {wt}}}}(k_{r-j})} \end{aligned}$$

has dimension zero. Taking into account that \(\varvec{\phi }\) is a finite morphism, we conclude that the variety \({\hat{V}}_j\subset \mathbb {A}^r\) defined by \(F_1,\ldots ,F_j, A_{k_1},\ldots , A_{k_{r-j}}\) has also dimension zero.

Finally, observe that the dimension of \(V_j\) is at least \(r-j\). On the other hand, \(0=\dim {\hat{V}}_j\ge \dim V_j-(r-j)\). This finishes the proof of the lemma. \(\square \)

We have that \(G_2 {,\ldots ,}G_{r-s}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]\). Observe that \(\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]/(G_2 {,\ldots ,}G_{r-s}) \simeq \overline{\mathbb {F}}_{q}[A_{s}{,\ldots ,}A_0]\). Therefore, to conclude that \(({{\mathsf {H}}}_5)\) holds it suffices to prove that \(\mathcal {G}_1\), \(\mathcal {S}_1\) and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_{s} {,\ldots ,}A_0]\), where \(\mathcal {G}_1\), \(\mathcal {R}\) and \(\mathcal {S}_1\) are the coordinate functions of \(\overline{\mathbb {F}}_{q}[A_{r-1} {,\ldots ,}A_0]/(G_2 {,\ldots ,}G_{r-s})\) defined by \(G_1\), \({\mathrm {Disc}}(F({\varvec{A}}_0,T))\) and \({\mathrm {Subdisc}}(F({\varvec{A}}_0,T))\), respectively.

Lemma 5.5

\(\mathcal {G}_1\), \(\mathcal {S}_1\) and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_{s} {,\ldots ,}A_0]\).

Proof

We consider \(\mathcal {R}, \mathcal {S}_1, \mathcal {G}_1\) as elements of \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_{i+1})[A_i,\ldots ,A_0]\) for an appropriate \(i\in \{2,3\}\) and define a weight \({{{\mathsf {wt}}}}_i\) by setting

$$\begin{aligned} {{{\mathsf {wt}}}}_i(A_0):=r,\ {{{\mathsf {wt}}}}_i(A_1):=r-1,\ldots ,{{{\mathsf {wt}}}}_i(A_i):=r-i. \end{aligned}$$

Denote by \(\mathcal {G}_1^{{{{\mathsf {wt}}}}_i}\), \(\mathcal {R}^{{{{\mathsf {wt}}}}_i}\) and \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_i}\) the components of highest weight of \(\mathcal {G}_1\), \(\mathcal {R}\) and \(\mathcal {S}_1\), respectively. We have the following claim.

Claim

\(\mathcal {G}_1^{{{{\mathsf {wt}}}}_i}\), \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_i}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_i}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_{i+1})[A_i,\ldots ,A_0]\).

Proof of Claim

Observe that

$$\begin{aligned} \overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_{i+1})[A_i,\ldots ,A_0]/(\mathcal {G}_1^{{{{\mathsf {wt}}}}_i}) \simeq \overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_{i+1})[A_i,\ldots ,A_1] \end{aligned}$$

is a domain. As a consequence, it suffices to prove that the coordinate functions defined by \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_i}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_i}\) in this quotient ring form a regular sequence. With a slight abuse of notation, we shall also denote them by \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_i}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_i}\).

The proof will be split into four parts, according to whether \({\mathrm {char}}(\mathbb {F}_{q})\) divides r, \(r-1\), \(r-2\) or does not divide \(r(r-1)(r-2)\).

First case: \({\mathrm {char}}(\mathbb {F}_{q})\) divides \(r\). For \(i:=2\) we have that, in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]\),

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_2}=A_1^r+(-1)^{r+1}2^{r-2}A_2^{r-1}A_1 ^2&\quad \text {and} \quad \mathcal {S}_1^{{{{\mathsf {wt}}}}_2}=(2A_{2})^{r-1}. \end{aligned}$$
(5.3)

Observe that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) is a nonzero polynomial of \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]\), and

$$\begin{aligned} \overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]/(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}) \simeq \overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_1]. \end{aligned}$$

It follows that \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) is not a zero divisor in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]/(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2})\), which completes the proof of the claim in this case.

Second case: \({\mathrm {char}}(\mathbb {F}_{q})\) divides \(r-1\). For \(i:=3\), we prove that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_3}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3, A_2,A_1]\). Let \(F:=T^r+A_3T^3+A_2T^2+A_1T\). It is easy to see that \(\mathcal {R}^{{{{\mathsf {wt}}}}_3}={\mathrm {Disc}}(F)\) and \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}={\mathrm {Subdisc}}(F)\). Observe that \(F'=T^{r-1}+3A_3T^2+2A_2T+A_1\). By [24, Lemma 7.1], we deduce that

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_3}=(-1)^{{r}({r-1})} {\mathrm {Res}}(F',G)\ {\text { and }} \ \mathcal {S}_1^{{{{\mathsf {wt}}}}_3}=(-1)^{(r-1)(r-2)}{\mathrm {Subdisc}}(F',G), \end{aligned}$$

where \(G:=-2A_3T^3-A_2T^2\) is the remainder of the division of F by \(F'\). Therefore, by the Poisson formula it follows that

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_3}=(-1)^{r+1}A_1^2A_2^{r-1}+ 2^{r-1}A_1^2A_2^2A_3^{r-2}-2^{r-3}A_1^3A_3^{r-1}. \end{aligned}$$

On the other hand, by, e.g., [13, Theorem 2.5], we conclude that

$$\begin{aligned} \mathcal {S}_1^{{{{\mathsf {wt}}}}_3}&=2 A_2^{r-1}+(-1)^r2^{r-2}A_2^2A_3^{r-2}+2A_1A_2^{r-3}A_3+3 (-1)^{r+1}2^{r-2}A_1A_3^{r-1}\\&=2\big ( A_2^{r-1}+A_1A_2^{r-3}A_3\big )+(-2)^{r-2}\big (A_2^2A_3^{r-2}-3 A_1A_3^{r-1}\big ). \end{aligned}$$

In the second line, we express \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) as the sum of two homogeneous polynomials of degrees \(r-1\) and r without common factors. Then, [27, Lemma 3.15] proves that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) is an irreducible polynomial in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3,A_2,A_1]\). Next suppose that \(\mathcal {R}^{{{{\mathsf {wt}}}}_3}\) is a zero divisor in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3,A_2,A_1]/(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3})\). Since \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) is irreducible, we have that \(\mathcal {R}^{{{{\mathsf {wt}}}}_3} \in (\mathcal {S}_1^{{{{\mathsf {wt}}}}_3})\), which is easily shown to be not possible by a direct calculation.

Third case: \({\mathrm {char}}(\mathbb {F}_{q})\) divides \(r-2\). For \(i:=3\), we show that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_3}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3, A_2,A_1]\). As in the previous case, if \(F:=T^r+A_3T^3+A_2T^2+A_1T\), then \(\mathcal {R}^{{{{\mathsf {wt}}}}_3}={\mathrm {Disc}}(F)\) and \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}:={\mathrm {Subdisc}}(F)\). Since \(F'=2T^{r-1}+3A_3T^2+2A_2T+A_1\), from [24, Lemma 7.1] it follows that

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_3}=(-1)^{{r}({r-1})} 2^{r-3}{\mathrm {Res}}(F',G)\ {\text { and }} \ \mathcal {S}_1^{{{{\mathsf {wt}}}}_3}= (-1)^{(r-1)(r-2)}2^{r-3}{\mathrm {Subdisc}}(F',G), \end{aligned}$$

where \(G:=-\frac{1}{2}A_3T^3+\frac{1}{2}A_1T\) is the remainder of the division of F by \(F'\). By the Poisson formula, we obtain

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_3}=\left\{ \begin{array}{ll} 4 A_1^3A_3^{r-1}-A_1^r -2A_2A_1{}^{\frac{r+2}{2}}A_3{}^{\frac{r-2}{2}}-A_1^2A_2^2A_3^{r-2}&{}{\text { for }}r{\text { even}},\\ 4A_1^3A_3^{r-1}+A_1^r+4A_1^{\frac{r+3}{2}}A_3{}^{\frac{r-1}{2}}-A_1^2A_2^2A_3^{r-2}&{}{\text { for }}r{\text { odd}}. \end{array} \right. \end{aligned}$$

In the same vein, by, e.g., [13, Theorem 2.5], we have that

$$\begin{aligned} \mathcal {S}_1^{{{{\mathsf {wt}}}}_3}=\left\{ \begin{array}{ll} 4A_2(A_1A_3)^{\frac{r-2}{2}} +2A_2^2A_3^{r-2}+2A_1^{r-2}-6A_1A_3^{r-2}&{}{\text { for }}r{\text { even}},\\ 7(A_1A_3)^{\frac{r-1}{2}} -2A_2^2A_3^{r-2}+2A_1^{r-2}+6A_1A_3^{r-1}&{}{\text { for }}r{\text { odd}}. \end{array} \right. \end{aligned}$$

We observe that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) is an irreducible polynomial in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3,A_2,A_1]\). To prove this, it suffices to apply the Eisenstein criterion, considering \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) as an element of the polynomial ring \(\big (\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3,A_1]\big )[A_2]\) and the prime \((A_1)\). Next, suppose that \(\mathcal {R}^{{{{\mathsf {wt}}}}_3}\) is a zero divisor in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_4)[A_3,A_2,A_1]/(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3})\). Since \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_3}\) is irreducible, we have that \(\mathcal {R}^{{{{\mathsf {wt}}}}_3} \in (\mathcal {S}_1^{{{{\mathsf {wt}}}}_3})\), which can be shown to be not possible by a direct calculation.

Fourth case: \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r(r-1)(r-2)\). For \(i:=2\), we prove that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]\). Arguing as before, we obtain

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_2}&=(1-r)^{r-1}A_1^r-(r-2)r^{-1}A_1^2A_2^{r-1},\\ \mathcal {S}_1^{{{{\mathsf {wt}}}}_2}&=r(r-1)^{r-2}A_1^{r-2}+2(2-r)^{r-2}A_2^{r-1}. \end{aligned}$$

By the Stepanov criterion (see, e.g., [39, Lemma 6.54]), we deduce that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) is an irreducible polynomial in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]\). Suppose that \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) is a zero divisor in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2,A_1]/(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2})\). Since \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) is irreducible, we have that \(\mathcal {R}^{{{{\mathsf {wt}}}}_2} \in (\mathcal {S}_1^{{{{\mathsf {wt}}}}_2})\), which can be seen not to be the case by a direct calculation. Therefore, we deduce that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_3)[A_2, A_1]\). \(\square \)

By the claim and Lemma 5.4, it follows that \(\mathcal {G}_1\), \(\mathcal {S}_1\) and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_s {,\ldots ,}A_{i+1})[A_i,\ldots ,A_0]\), and Lemma 5.3 implies that \(\mathcal {G}_1\), \(\mathcal {S}_1\) and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_s {,\ldots ,}A_0]\). \(\square \)

By Lemma 5.5, we conclude that hypothesis \(({{\mathsf {H}}}_5)\) holds. Finally, we prove that hypothesis \(({{\mathsf {H}}}_6)\) holds. The components of highest weight of the polynomials \(G_1,\ldots ,G_{r-s}\) are \(G_i^{{{{\mathsf {wt}}}}}=A_{s+i-1}\) for \(2\le i \le r-s\) and \(G_1^{{{{\mathsf {wt}}}}}= A_0\). With the same arguments as above, we see that \(\mathcal {D}(W^{{{{\mathsf {wt}}}}})\) has codimension at least one in \(W^{{{{\mathsf {wt}}}}}\), where \(W^{{{{\mathsf {wt}}}}}:=V(G_1^{{{{\mathsf {wt}}}}},\ldots ,G_{r-s}^{{{{\mathsf {wt}}}}})\).

Since the family (5.2) satisfies hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\), from Theorem 4.6 we deduce the following result.

Theorem 5.6

Let \(\mathcal {A}\) be the family (5.2) and \({\varvec{\lambda }}\) a factorization pattern. We have

$$\begin{aligned} \big ||\mathcal {A}_{{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^s\big |&\le \mathcal {T}({\varvec{\lambda }})q^{s-1} \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+ r^2\delta \big ),\\ \big ||\mathcal {A}_{{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^s\big |&\le q^{s-1}\Big (\mathcal {T}({\varvec{\lambda }}) \big ((\delta (D-2)+2)q^{\frac{1}{2}}+14 D^2 \delta ^2+r^2\delta \big )+r^2\delta \Big ), \end{aligned}$$

where \(\mathcal {A}_{{\varvec{\lambda }}}\) is the set of elements of \(\mathcal {A}\) with factorization pattern \({\varvec{\lambda }}\), \(\mathcal {A}_{{\varvec{\lambda }}}^{sq}\) is the set of square-free elements of \(\mathcal {A}_{{\varvec{\lambda }}}\), \(\delta :=r\cdot (r-s-1)!\) and \(D:=r-1+{(r-s-2)(r-s-1)}/{2}\).

Proof

We apply Theorem 4.6 with \(m:=r-s\) to the polynomials

$$\begin{aligned} R_1:=(-1)^r\Pi _r-1,\ R_2:=(-1)^{r-s-1}\Pi _{r-s-1},\ldots , R_{r-s}:=-\Pi _1. \end{aligned}$$

Therefore, we have

$$\begin{aligned}&\delta :=\prod _{i=1}^{r-s} \deg R_i=r\cdot (r-s-1)!{\text { and }}\\&\quad D:=\sum _{i=1}^{r-s}(\deg R_i-1)=r-1+\frac{(r-s-2)(r-s-1)}{2}. \end{aligned}$$

This finishes the proof. \(\square \)
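The values of \(\delta \) and D in Theorem 5.6 can be checked numerically. The following is a minimal sketch, assuming that \(\deg \Pi _j=j\) (so that \(\deg R_1=r\) and \(\deg R_i=r-s+1-i\) for \(2\le i\le r-s\)); the function name `delta_and_D` is hypothetical and used only for illustration.

```python
from math import factorial, prod

def delta_and_D(r, s):
    """Return (delta, D) computed directly from the degrees of R_1, ..., R_{r-s}.

    Assumption: deg R_1 = r and deg R_i = r - s + 1 - i for 2 <= i <= r - s.
    """
    degrees = [r] + list(range(r - s - 1, 0, -1))
    delta = prod(degrees)                  # product of the degrees
    D = sum(d - 1 for d in degrees)        # sum of (degree - 1)
    return delta, D

# Compare with the closed forms stated in Theorem 5.6.
for r in range(3, 12):
    for s in range(1, r):
        closed = (r * factorial(r - s - 1),
                  r - 1 + (r - s - 2) * (r - s - 1) // 2)
        assert delta_and_D(r, s) == closed
```

For instance, for \(r=6\) and \(s=2\) the degrees are 6, 3, 2, 1, so that \(\delta =36\) and \(D=8\).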

5.3 A nonlinear family

Let \(r, t_1 {,\ldots ,}t_r\) be positive integers with r even. Suppose that \({\mathrm {char}}(\mathbb {F}_{q})>3\) does not divide \((r-1)(r+1)\big ((r-1)^{r-1}+r^r\big )\). Consider the polynomial \(G\in \mathbb {F}_{q}[A_1,\ldots ,A_r]\) defined in the following way:

$$\begin{aligned} G:=\sum _{t_1+2t_2 +\cdots +r t_r=r} (-1)^{\Delta (t_1,\ldots ,t_r)}\frac{(t_1 +\cdots + t_r)!}{t_1! \ldots t_r!} A_r^{t_1} \ldots A_1^{t_r}, \end{aligned}$$

where \(\Delta (t_1,t_2,\ldots ,t_r):= r-\sum _{i=1}^r t_i\). The polynomial G arises as the determinant of the \(r \times r\) generic Toeplitz–Hessenberg matrix, namely

$$\begin{aligned} G=\det \left( \begin{array}{lllll} A_r &{}\quad 1 &{}\quad 0 &{}\quad \ldots &{}\quad 0\;\; \\ \vdots &{}\quad \ddots &{} \quad \ddots &{} \quad \ddots &{} \quad \vdots \\ \vdots &{} \quad &{} \quad \ddots &{} \quad \ddots &{} \quad 0\;\; \\ A_1 &{} \quad \ldots &{} \quad \ldots &{} \quad A_r&{}\quad 1\;\; \end{array} \right) . \end{aligned}$$

This is the well-known Trudi formula (see [43, Ch. VII]; see also [42, Theorem 1]). We also remark that the polynomial \(H_r:=G(\Pi _r,\ldots ,\Pi _1)\) is critical in the study of deep holes of the standard Reed–Solomon codes (see [5, Proposition 2.2]).
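The agreement between the sum defining G and the determinant can be tested on numerical values. The sketch below is only an illustration under stated assumptions: the matrix is taken to be the \(r\times r\) matrix with \(A_r\) on the diagonal, ones on the superdiagonal and \(A_{r-k}\) on the kth subdiagonal, and the helper names (`trudi_sum`, `toeplitz_hessenberg`, `det`) are hypothetical.

```python
from itertools import product
from math import factorial

def trudi_sum(a):
    """Evaluate the sum defining G; a[k] stands for A_k (a[0] is unused)."""
    r = len(a) - 1
    total = 0
    # Enumerate (t_1, ..., t_r) with t_1 + 2 t_2 + ... + r t_r = r.
    for t in product(*(range(r // j + 1) for j in range(1, r + 1))):
        if sum(j * tj for j, tj in enumerate(t, start=1)) != r:
            continue
        coeff = factorial(sum(t))
        term = 1
        for j, tj in enumerate(t, start=1):
            coeff //= factorial(tj)        # builds the multinomial coefficient
            term *= a[r + 1 - j] ** tj     # factor A_{r+1-j}^{t_j}
        total += (-1) ** (r - sum(t)) * coeff * term
    return total

def toeplitz_hessenberg(a):
    """Assumed matrix: A_r on the diagonal, 1 above it, A_{r-k} on the k-th subdiagonal."""
    r = len(a) - 1
    return [[a[r - i + j] if j <= i else (1 if j == i + 1 else 0)
             for j in range(r)] for i in range(r)]

def det(m):
    """Determinant by Laplace expansion along the first row (small sizes only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)) if m[0][j])

values = [0, 2, -3, 5, 7]                  # A_1 = 2, A_2 = -3, A_3 = 5, A_4 = 7
assert trudi_sum(values) == det(toeplitz_hessenberg(values))
```

For \(r=2\) both sides reduce to \(A_2^2-A_1\), which shows in particular that G has degree 1 in \(A_1\).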

We consider the following family of polynomials:

$$\begin{aligned} \mathcal {A}_{\mathcal {N}}:=\{T^{r+1} +a_rT^r+ \cdots +a_0: G(a_r,\ldots , a_1)=0\}. \end{aligned}$$
(5.4)

Observe that \(\mathcal {A}_{\mathcal {N}}\) may be seen as the set of \(\mathbb {F}_{q}\)-rational points of the \(\mathbb {F}_{q}\)-variety \(W:=V(G) \subset \mathbb {A}^{r+1}\). Let \({{{\mathsf {wt}}}}\) be the weight defined by \({{{\mathsf {wt}}}}(A_i):=r+1-i\) for \(i=0 {,\ldots ,}r\). We shall prove that this family of polynomials satisfies hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\).

It is clear that \(({{\mathsf {H}}}_1)\) holds, because G is nonzero. Further, since G is a monic element of \(\mathbb {F}_{q}[A_r,\ldots ,A_2][A_1]\) of degree 1 in \(A_1\), we have that

$$\begin{aligned} \nabla G({\varvec{a}}_0)= \bigg (\frac{\partial {G}}{\partial {A_r}}({\varvec{a}}_0){,\ldots ,}\frac{\partial {G}}{\partial A_2}({\varvec{a}}_0),1\bigg ) \ne 0 \end{aligned}$$

for any \({\varvec{a}}_0 \in W\). We deduce that hypothesis \(({{\mathsf {H}}}_2)\) holds.

Next we consider hypothesis \(({{\mathsf {H}}}_3)\). Given an arbitrary nonzero monomial

$$\begin{aligned} m_G:=(-1)^{\Delta (t_1,\ldots ,t_r)}\frac{(t_1 +\cdots + t_r)!}{t_1! \ldots t_r!} A_r^{t_1} \ldots A_1^{t_r} \end{aligned}$$

arising in the dense representation of G, it is easy to see that \({{{\mathsf {wt}}}}(m_G)=r\). It follows that G is weighted homogeneous of weighted degree r. Then, \(G^{{{{\mathsf {wt}}}}}=G\), which readily implies that hypothesis \(({{\mathsf {H}}}_3)\) holds.
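The weight computation can be spelled out as follows. Since \({{{\mathsf {wt}}}}(A_{r+1-j})=j\) for \(1\le j\le r\), any exponent vector \((t_1,\ldots ,t_r)\) with \(t_1+2t_2+\cdots +rt_r=r\) satisfies

$$\begin{aligned} {{{\mathsf {wt}}}}\big (A_r^{t_1}\cdots A_1^{t_r}\big )=\sum _{j=1}^r t_j\,{{{\mathsf {wt}}}}(A_{r+1-j})=\sum _{j=1}^r j\,t_j=r. \end{aligned}$$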

Now we analyze the validity of hypothesis \(({{\mathsf {H}}}_4)\), namely that the discriminant locus \(\mathcal {D}(W) \subset \mathbb {A}^{r+1}\) of W has codimension at least 1 in W. For this purpose, it suffices to show that \(\{G, \mathcal {R}\}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_r {,\ldots ,}A_0]\), where \(\mathcal {R}:={\mathrm {Disc}}(F({\varvec{A}}_0, T))\), \(F({\varvec{A}}_0,T):=T^{r+1}+A_rT^r +\cdots + A_0\) and \({\varvec{A}}_0:=(A_r {,\ldots ,}A_0)\).

We consider G and \(\mathcal {R}\) as elements of the polynomial ring \(\overline{\mathbb {F}}_{q}(A_r,\ldots , A_2)[A_1, A_0]\) and define a weight \({{{\mathsf {wt}}}}_1\) on \(\overline{\mathbb {F}}_{q}(A_r,\ldots , A_2)[A_1, A_0]\) by setting

$$\begin{aligned} {{{\mathsf {wt}}}}_1(A_1):=r,\quad {{{\mathsf {wt}}}}_1(A_0):=r+1. \end{aligned}$$

We claim that \(G^{{{{\mathsf {wt}}}}_1}, \mathcal {R}^{{{{\mathsf {wt}}}}_1}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_2)[A_1,A_0]\). Observe that \(G^{{{{\mathsf {wt}}}}_1}=A_1\). Further, since \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_2)[A_1, A_0]/(G^{{{{\mathsf {wt}}}}_1}) \simeq \overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_2)[A_0]\) is a domain, to prove the claim it suffices to show that \({\mathcal {R}^{{{{\mathsf {wt}}}}_1}}\) is nonzero modulo \((A_1)\). A direct calculation shows that \({\mathcal {R}^{\mathsf {wt}_1}}=(r+1)^{r+1}A_0^{r}\) modulo \((A_1)\), which is nonzero because \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \(r+1\); this proves the claim. As a consequence of the claim and Lemma 5.4, we deduce that G and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_2)[A_1,A_0]\), and Lemma 5.3 implies that G and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_r {,\ldots ,}A_0]\). In other words, hypothesis \(({{\mathsf {H}}}_4)\) is satisfied.

Next we show that hypothesis \(({{\mathsf {H}}}_5)\) holds. To this end, we make the following claim.

Claim

\(A_0\), \(\mathcal {R}\) and G form a regular sequence of \(\overline{\mathbb {F}}_{q}[A_r,\ldots ,A_0]\).

Proof

Since \(\overline{\mathbb {F}}_{q}[A_r,\ldots ,A_0]/(A_0)\simeq \overline{\mathbb {F}}_{q}[A_r,\ldots ,A_1]\) and \(G\in \overline{\mathbb {F}}_{q}[A_r,\ldots ,A_1]\), we show that \(\mathcal {R}\) modulo \((A_0)\), and G, form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_r,\ldots ,A_1]\). We consider G and \(\mathcal {R}\) modulo \((A_0)\) as elements of \(\overline{\mathbb {F}}_{q}(A_{r-1},\ldots ,A_2)[A_r,A_1]\), with the weight \({{{\mathsf {wt}}}}_r\) defined by \({{{\mathsf {wt}}}}_r(A_r):=1\) and \({{{\mathsf {wt}}}}_r(A_1):=r\). We claim that \(G^{{{{\mathsf {wt}}}}_r}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_r}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_{r-1},\ldots ,A_2)[A_r,A_1]\). First, we observe that

$$\begin{aligned} G^{{{{\mathsf {wt}}}}_r}=A_1+A_r^r, \end{aligned}$$

and the Stepanov criterion (see, e.g., [39, Lemma 6.54]) proves that \(G^{{{{\mathsf {wt}}}}_r}\) is an irreducible polynomial of \(\overline{\mathbb {F}}_{q}(A_{r-1},\ldots ,A_2)[A_r,A_1]\). Thus, it is enough to prove that \(\mathcal {R}^{{{{\mathsf {wt}}}}_r}\) is a nonzero polynomial of \(\overline{\mathbb {F}}_{q}(A_{r-1},\ldots ,A_2)[A_r,A_1]/(G^{{{{\mathsf {wt}}}}_r})\). We have

$$\begin{aligned} \mathcal {R}^{{{{\mathsf {wt}}}}_r}&= - (r-1)^{r-1} A_r^rA_1^r+ r^rA_1^{r+1}\\&\equiv -\big ((r-1)^{r-1}+r^r\big )A_r^{r+r^2} {\text { modulo }}G^{{{{\mathsf {wt}}}}_r}. \end{aligned}$$
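To spell out the reduction: modulo \(G^{{{{\mathsf {wt}}}}_r}=A_1+A_r^r\) we have \(A_1\equiv -A_r^r\), and since r is even,

$$\begin{aligned} A_1^r&\equiv A_r^{r^2},\qquad A_1^{r+1}\equiv -A_r^{r^2+r},\\ -(r-1)^{r-1}A_r^rA_1^r+r^rA_1^{r+1}&\equiv -(r-1)^{r-1}A_r^{r+r^2}-r^rA_r^{r+r^2}. \end{aligned}$$

The resulting class is nonzero because \({\mathrm {char}}(\mathbb {F}_{q})\) does not divide \((r-1)^{r-1}+r^r\).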

We conclude that \(G^{{{{\mathsf {wt}}}}_r}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_r}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_{r-1},\ldots ,A_2)[A_r,A_1]\). Combining Lemmas 5.4 and 5.3 as before, we deduce that \(\mathcal {R}\) modulo \((A_0)\) and G form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_r {,\ldots ,}A_1]\), which implies that \(A_0\), \(\mathcal {R}\) and G form a regular sequence of \(\overline{\mathbb {F}}_{q}[A_r,\ldots ,A_0]\). \(\square \)

We also need the following claim.

Claim

G, \(\mathcal {R}\) and \(\mathcal {S}_1\) form a regular sequence of \(\overline{\mathbb {F}}_{q}[A_r {,\ldots ,}A_0]\).

Proof

Consider G, \(\mathcal {R}\) and \(\mathcal {S}_1\) as elements of \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_3)[A_2,A_1,A_0]\), and the weight \({{{\mathsf {wt}}}}_2\) defined by \({{{\mathsf {wt}}}}_2(A_2):=r-1\), \({{{\mathsf {wt}}}}_2(A_1):=r\), \({{{\mathsf {wt}}}}_2(A_0):=r+1\). We claim that \(G^{{{{\mathsf {wt}}}}_2}\), \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_3)[A_2,A_1,A_0]\). Since \(G^{{{{\mathsf {wt}}}}_2}=A_1\), we have that \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_3)[A_2,A_1,A_0]/(G^{{{{\mathsf {wt}}}}_2}) \simeq \overline{\mathbb {F}}_{q}(A_r {,\ldots ,} A_3)[A_2,A_0]\) is a domain. Therefore, it suffices to prove that \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) modulo \((A_1)\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) modulo \((A_1)\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_3)[A_2,A_0]\). Observe that

$$\begin{aligned} \mathcal {S}_1^{{{{\mathsf {wt}}}}_2}{\text { modulo }}(A_1)=-2(r-1)^{r-1}A_2^r. \end{aligned}$$

Further, we have \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}{\text { modulo }}(A_1,A_2)=(r+1)^{r+1}A_{0}^r\). As a consequence, \(G^{{{{\mathsf {wt}}}}_2}\), \(\mathcal {S}_1^{{{{\mathsf {wt}}}}_2}\) and \(\mathcal {R}^{{{{\mathsf {wt}}}}_2}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}(A_r {,\ldots ,}A_3)[A_2, A_1, A_0]\). From Lemmas 5.4 and 5.3, it follows that G, \(\mathcal {S}_1\) and \(\mathcal {R}\) form a regular sequence in \(\overline{\mathbb {F}}_{q}[A_r {,\ldots ,}A_0]\). \(\square \)

From the first claim, we conclude that \(\mathcal {D}(W)\cap \{A_0=0\}\) has codimension two in W, while the second claim shows that \(\mathcal {S}_1(W)\) has codimension two in W. As a consequence, \(\mathcal {D}(W)\cap (A_0\cdot \mathcal {S}_1)(W)\) has codimension two in W, that is, hypothesis \(({{\mathsf {H}}}_5)\) is satisfied.

Finally, since \(G^{{{{\mathsf {wt}}}}}=G\), we readily deduce that hypothesis \(({{\mathsf {H}}}_6)\) holds.

Since the family (5.4) satisfies hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\), we obtain the following result.

Theorem 5.7

Let \(\mathcal {A}_\mathcal {N}\) be the family (5.4) and \({\varvec{\lambda }}\) a factorization pattern. We have

$$\begin{aligned} \big ||\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^r\big |&\le \mathcal {T}({\varvec{\lambda }})q^{r-1} \big (r^2q^{\frac{1}{2}}+14r^4\big ),\\ \big ||\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^r\big |&\le q^{r-1}\big (\mathcal {T}({\varvec{\lambda }})(r^2q^{\frac{1}{2}}+14 r^4)+r^3\big ), \end{aligned}$$

where \(\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}\) is the set of elements of \(\mathcal {A}_\mathcal {N}\) with factorization pattern \({\varvec{\lambda }}\) and \(\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}^{sq}\) is the set of square-free elements of \(\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}\).

Proof

This is a simple consequence of Theorem 4.6 with \(m:=1\) and the polynomial

$$\begin{aligned} R_1:=G(-\Pi _1,\Pi _2,\ldots ,(-1)^r\Pi _r). \end{aligned}$$

As previously remarked, the weighted degree of G is r, which implies that \(\deg R_1=r\). Therefore, we have

$$\begin{aligned} \delta :=\deg R_1=r{\text { and }} D:=\deg R_1-1=r-1. \end{aligned}$$

As a consequence, Theorem 4.6 implies

$$\begin{aligned} \big ||\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}^{sq}| -\mathcal {T}({\varvec{\lambda }})\,q^r\big |&\le \mathcal {T}({\varvec{\lambda }})q^{r-1} \big ((r(r-3)+2)q^{\frac{1}{2}}+14 (r-1)^2 r^2+ r^3\big ),\\ \big ||\mathcal {A}_{\mathcal {N},{\varvec{\lambda }}}| -\mathcal {T}({\varvec{\lambda }})\,q^r\big |&\le q^{r-1}\Big (\mathcal {T}({\varvec{\lambda }}) \big ((r(r-3)+2)q^{\frac{1}{2}}+14 (r-1)^2 r^2 +r^3\big )+r^3\Big ). \end{aligned}$$

This immediately implies the statement of the theorem. \(\square \)
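The simplification used in the last step of the proof rests on two elementary inequalities, valid for every \(r\ge 1\):

$$\begin{aligned} r(r-3)+2&=r^2-3r+2\le r^2,\\ 14(r-1)^2r^2+r^3&=14r^4-r^2(27r-14)\le 14r^4. \end{aligned}$$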

6 Average-case analysis of polynomial factorization over \(\mathcal {A}\)

In this section, we analyze the average-case complexity of the classical factorization algorithm applied to any family \(\mathcal {A}\) as in (1.1) satisfying hypotheses \(({{\mathsf {H}}}_1)\)–\(({{\mathsf {H}}}_6)\).

Given \(f \in \mathbb {F}_{q}[T]\), the classical factorization algorithm finds the complete factorization \(f=f_1^{e_1} \ldots f_n^{e_n}\), where \(f_1, \ldots , f_n\) are pairwise-distinct monic irreducible polynomials in \(\mathbb {F}_{q}[T]\) and \(e_1, \ldots ,e_n\) are strictly positive integers. The algorithm contains three main routines:

  • elimination of repeated factors (ERF) replaces a polynomial by a square-free one that contains all the irreducible factors of the original one with exponent 1;

  • distinct-degree factorization (DDF) splits a square-free polynomial into a product of polynomials whose irreducible factors have all the same degree;

  • equal-degree factorization (EDF) splits completely a polynomial whose irreducible factors have all the same degree.

More precisely, the algorithm works as follows:

Classical factorization algorithm

figure a

In [18], the authors analyze the average-case complexity of the classical factorization algorithm applied to all the monic polynomials of degree r of \( \mathbb {F}_{q}[T] \). Unfortunately, the results of this analysis cannot be directly applied to the family \(\mathcal {A}\), because there is a small probability that a random monic polynomial of degree r of \( \mathbb {F}_{q}[T] \) belongs to \(\mathcal {A}\). For this reason, we shall perform an analysis of the behavior of this algorithm applied to elements of \(\mathcal {A}\), using the results on the distribution of factorization patterns of Sect. 4.

Considering the uniform probability on \(\mathcal {A}\), let \(\mathcal {X}: \mathcal {A} \rightarrow \mathbb {N}\) be the random variable that counts the number \( \mathcal {X} (f) \) of arithmetic operations in \( \mathbb {F}_{q}\) performed by the classical factorization algorithm to obtain the complete factorization in \(\mathbb {F}_{q}[T]\) of any \(f\in \mathcal {A}\). We may describe this algorithm as consisting of four stages, and thus the random variable \(\mathcal {X}\) may be decomposed as the sum of the random variables that count the cost of each step of the algorithm. More precisely, we consider the random variable \(\mathcal {X}_1: \mathcal {A} \rightarrow \mathbb {N}\) that counts the number of arithmetic operations in \(\mathbb {F}_{q}\) performed in the ERF step, namely

$$\begin{aligned} \mathcal {X}_1(f):={\mathrm {Cost}}({\mathrm {ERF}}(f)). \end{aligned}$$
(6.1)

Further, we introduce a random variable \(\mathcal {X}_2: \mathcal {A} \rightarrow \mathbb {N}\) that counts the number of arithmetic operations in \(\mathbb {F}_{q}\) performed during the DDF step, namely

$$\begin{aligned} \mathcal {X}_2(f):={\mathrm {Cost}}({\mathrm {DDF}}(a_f)), \end{aligned}$$
(6.2)

where \(a_f:={\mathrm {ERF}}(f)\) denotes the square-free polynomial obtained after performing the ERF step on input f. Denote by

$$\begin{aligned} \varvec{b}_f:={\mathrm {DDF}}(a_f)=(b_f(1),\ldots , b_f(s)) \end{aligned}$$

the vector of polynomials obtained by applying the DDF step to the monic square-free polynomial \(a_f:={\mathrm {ERF}}(f)\), where s is the degree of the largest irreducible factor of \(a_f\). Each \(b_f(k)\) consists of the product of all the monic irreducible polynomials in \( \mathbb {F}_{q}[T] \) of degree k that divide f. With this notation, let \(\mathcal {X}_3: \mathcal {A} \rightarrow \mathbb {N}\) be the random variable that counts the number of arithmetic operations in \(\mathbb {F}_{q}\) of the EDF step, namely

$$\begin{aligned} \mathcal {X}_3(f):=\sum _{k=1}^s \mathcal {X}_{3,k}(f),\quad \mathcal {X}_{3,k}(f):={\mathrm {Cost}}({\mathrm {EDF}}(b_f(k)))\quad (1\le k\le s). \end{aligned}$$
(6.3)

Finally, we introduce a random variable \(\mathcal {X}_4:\mathcal {A} \rightarrow \mathbb {N}\) that counts the number of operations in \(\mathbb {F}_{q}\) performed by the classical factorization algorithm applied to \({f}/{\mathrm {ERF}}(f)\). Our aim is to study the expected value of the random variable \(\mathcal {X}\), namely

$$\begin{aligned} E[\mathcal {X}]:=\frac{1}{|\mathcal {A}|}\sum _{f \in \mathcal {A}}\mathcal {X}(f)= \frac{1}{|\mathcal {A}|} \sum _{k=1}^4 \sum _{f \in \mathcal {A}}\mathcal {X}_k(f). \end{aligned}$$
(6.4)

We denote by M(r) a multiplication time, so that the product of two polynomials of degree at most r of \(\mathbb {F}_{q}[T]\) can be computed with at most \(\tau _1 M(r)\) arithmetic operations in \(\mathbb {F}_{q}\). Using fast arithmetic, we can take \(M(r):= r\log r\log \log r\) (see, e.g., [50]). For \(\tau _1\) suitably chosen, a division with remainder of two polynomials of degree at most r can also be computed with at most \(\tau _1 M(r)\) arithmetic operations in \(\mathbb {F}_{q}\). Further, the cost of computing the greatest common divisor of two polynomials in \(\mathbb {F}_{q}[T]\) of degree at most r is at most \(\tau _2\, \mathcal {U}(r)\) arithmetic operations in \(\mathbb {F}_{q}\), where \(\mathcal {U}(r):= M(r) \log r\) (see, e.g., [50]). Here, \(\tau _1\) and \(\tau _2\) are system- and implementation-dependent constants.

6.1 Elimination of repeated factors

We consider in detail the step of elimination of repeated factors (ERF). Let

$$\begin{aligned} f=f_1^{e_1}\cdots f_n^{e_n}=\prod _{p\mid e_i}f_i^{e_i}\prod _{p \not \mid e_i}f_i^{e_i} \end{aligned}$$

be the factorization of \(f \in \mathcal {A}\) into monic irreducible polynomials in \(\mathbb {F}_{q}[T]\), where \(f_1, \ldots ,f_n\) are pairwise distinct, \(e_1, \ldots ,e_n \in \mathbb {N}\) and \(p:={\mathrm {char}}(\mathbb {F}_{q})\). It is clear that f is square-free if and only if \(\gcd (f,f')=1\) (see, e.g., [50, Corollary 14.25]). Assume that f is not square-free. Hence, \(u:=\gcd (f,f') \ne 1\). It follows that \(v:=f/u=\prod _{p \not \mid e_i} f_i\) is the square-free part of the product \(\prod _{p \not \mid e_i}f_i^{e_i}\) (see, e.g., [49, Theorem 20.4]). Since each \(e_i \le r:=\deg f\), we deduce that \(\gcd (u,v^r)=\prod _{ p \not \mid e_i} f_i^{e_i-1}\). Therefore,

$$\begin{aligned} w:=\frac{u}{\gcd (u,v^r)}=\prod _{p \mid e_i} f_i^{e_i} \end{aligned}$$

is the part of f that is a pth power. These are the foundations of the following procedure.

ERF algorithm

figure b
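The quantities u, v and w introduced above can be turned into a short program. The following is a minimal sketch, not the authors' pseudocode, under the simplifying assumption that \(q=p\) is prime, so that the pth root of a polynomial in \(T^p\) is obtained by keeping every pth coefficient. Polynomials are lists of coefficients in \(\{0,\ldots ,p-1\}\), lowest degree first, and all helper names are hypothetical.

```python
def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a

def pdivmod(a, b, p):
    """Quotient and remainder of a by a nonzero b over F_p."""
    a = a[:]
    quo = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)
    for i in range(len(a) - len(b), -1, -1):
        c = a[i + len(b) - 1] * inv % p
        quo[i] = c
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - c * bj) % p
    return trim(quo), trim(a)

def pgcd(a, b, p):
    """Monic greatest common divisor over F_p (a must be nonzero)."""
    while b:
        a, b = b, pdivmod(a, b, p)[1]
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def pmul(a, b, p):
    """Product of two polynomials over F_p (schoolbook multiplication)."""
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def ppowmod(base, e, mod, p):
    """base^e modulo mod over F_p, by repeated squaring (e >= 1)."""
    result, base = [1], pdivmod(base, mod, p)[1]
    while e:
        if e & 1:
            result = pdivmod(pmul(result, base, p), mod, p)[1]
        base = pdivmod(pmul(base, base, p), mod, p)[1]
        e >>= 1
    return result

def erf(f, p):
    """Square-free part of a monic f over F_p, following u, v, w as in the text."""
    r = len(f) - 1
    if r <= 0:
        return [1]
    derivative = trim([i * c % p for i, c in enumerate(f)][1:])
    u = pgcd(f, derivative, p)               # u = gcd(f, f')
    v = pdivmod(f, u, p)[0]                  # v = f / u
    g = pgcd(u, ppowmod(v, r, u, p), p)      # gcd(u, v^r) = gcd(u, v^r mod u)
    w = pdivmod(u, g, p)[0]                  # w = u / gcd(u, v^r), a p-th power
    if len(w) == 1:
        return v
    return pmul(v, erf(w[::p], p), p)        # recurse on the p-th root of w
```

For example, over \(\mathbb {F}_3\) the call `erf([2, 2, 1, 1], 3)` is applied to \((T+1)^2(T+2)\) and returns the coefficients of \((T+1)(T+2)=T^2+2\).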

According to [50, Exercise 14.27], for \(f\in \mathbb {F}_{q}[T]\) of degree at most r, the number of arithmetic operations in \(\mathbb {F}_{q}\) performed by the ERF algorithm to obtain the square-free part of f is \(\mathcal {O}(M(r)\log r+ r \log (q/p))\). In this section, we analyze the average-case complexity of the ERF algorithm restricted to elements of the family \(\mathcal {A}\). More precisely, we analyze the expected value \(E[\mathcal {X}_1]\) of the random variable \(\mathcal {X}_1\) defined in (6.1), namely

$$\begin{aligned} E[\mathcal {X}_1]:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}} \mathcal {X}_1(f). \end{aligned}$$
(6.5)

Let \(\mathcal {A}^{sq}\) be the set of \(f \in \mathcal {A} \) that are square-free and \(\mathcal {A}^{nsq}:=\mathcal {A}{\setminus } \mathcal {A}^{sq}\). The probability that a random polynomial of \(\mathcal {A}\) is square-free is

$$\begin{aligned} P[\mathcal {A}^{sq}]=\frac{|\mathcal {A}^{sq}|}{|\mathcal {A}|}=1-\frac{|\mathcal {A}^{nsq}|}{|\mathcal {A}|}. \end{aligned}$$

According to (4.12), we have \(|\mathcal {A}^{nsq}|\le r(r-1)\delta _{{\varvec{G}}} q^{r-m-1}\). On the other hand, from Theorem 3.12 it follows that if \(q > 15\delta _{{\varvec{G}}}^{13/3}\), then \(|\mathcal {A}|\ge \frac{1}{2} q^{r-m}\), where \( {\varvec{G}}:= (G_1, \ldots , G_m) \) are the polynomials defining the family \( \mathcal {A} \) and \(\delta _{{\varvec{G}}}:=\deg G_1 \cdots \deg G_m\). As a consequence,

$$\begin{aligned} P[\mathcal {A}^{sq}] \ge 1-\frac{2\,r^2\delta _{{\varvec{G}}} \,q^{r-m-1}}{q^{r-m}} =1-\frac{2\,r^2\delta _{{\varvec{G}}}}{q}. \end{aligned}$$

In other words, we have the following result.

Lemma 6.1

For \(q > 15\delta _{{\varvec{G}}}^{13/3}\), the probability that a random polynomial of \(\mathcal {A}\) is square-free is \(P[\mathcal {A}^{sq}]\ge 1 - 2\,r^2\delta _{{\varvec{G}}}/q\). In particular, if \(q >\max \{ 15\delta _{{\varvec{G}}}^{13/3},4\,r^2\delta _{{\varvec{G}}}\}\), then \(P[\mathcal {A}^{sq}]>1/2\).
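The bound of Lemma 6.1 is easy to evaluate for concrete parameters. The sketch below is only an illustration; the function name and the sample values of r, \(\delta _{{\varvec{G}}}\) and q are assumptions chosen for the example.

```python
from fractions import Fraction

def squarefree_probability_lower_bound(r, delta_g, q):
    """Lower bound 1 - 2 r^2 delta_G / q of Lemma 6.1, as an exact fraction."""
    if q <= 15 * delta_g ** (13 / 3):
        raise ValueError("Lemma 6.1 requires q > 15 * delta_G^(13/3)")
    return 1 - Fraction(2 * r * r * delta_g, q)

# Sample values: r = 4, delta_G = 4, q = 2^13.
bound = squarefree_probability_lower_bound(4, 4, 8192)
assert bound == Fraction(63, 64)
```

With these sample values, at least 63 out of every 64 elements of the family are square-free.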

To estimate \(E[\mathcal {X}_1]\), we decompose the family \(\mathcal {A}\) into the sets \(\mathcal {A}^{sq}\) and \(\mathcal {A}^{nsq}\). We have

$$\begin{aligned} E[\mathcal {X}_1]=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_1(f) +\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_1(f)=:S_1^{sq}+S_1^{nsq}. \end{aligned}$$

First, we obtain an upper bound for \(S_1^{sq}\). On input \(f\in \mathcal {A}^{sq}\), the ERF algorithm performs the first three steps. Since \(u:=\gcd (f,f')=1\) and \(\gcd (u,v^r)=1\), its cost is dominated by the cost of calculating u, which is at most \(\tau _2\,\mathcal {U}(r)\) arithmetic operations in \(\mathbb {F}_{q}\), and the cost of calculating \(v^r\), which is at most \( \tau _1 \, \mathcal {U}(r)\) arithmetic operations in \(\mathbb {F}_{q}\). We conclude that if \(f \in \mathcal {A}^{sq}\), then \(\mathcal {X}_1(f) \le (\tau _1 +\tau _2)\, \mathcal {U}(r)\). Therefore,

$$\begin{aligned} S_1^{sq}:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_1(f)\le (\tau _1 +\tau _2)\, \mathcal {U}(r) \frac{|\mathcal {A}^{sq}|}{|\mathcal {A}|}. \end{aligned}$$
(6.6)

On the other hand, if \(f \in \mathcal {A}^{nsq}\), then [50, Exercise 14.27] shows that the number of arithmetic operations in \( \mathbb {F}_{q}\) performed by the ERF algorithm on input f is bounded by \( \mathcal {X}_1(f) \le c_1 \big ( \mathcal {U}(r)+ r \log \big (\frac{q}{p}\big )\big )\), where \(c_1\) is a constant independent of q and \(p:={\mathrm {char}}(\mathbb {F}_{q})\). Hence, we have

$$\begin{aligned} S_1^{nsq}:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_1(f)\le c_1 \bigg (\mathcal {U}(r)+ r \log \Big (\frac{q}{p}\Big )\bigg ) \frac{|\mathcal {A}^{nsq}|}{|\mathcal {A}|}. \end{aligned}$$
(6.7)

Combining (6.6) and (6.7) we conclude that

$$\begin{aligned} E[\mathcal {X}_1]&\le (\tau _1 +\tau _2) \, \mathcal {U}(r) \frac{|\mathcal {A}^{sq}|}{|\mathcal {A}|} + c_1 \, \mathcal {U}(r) \frac{|\mathcal {A}^{nsq}|}{|\mathcal {A}|}+ c_1 \,r \log \Big (\frac{q}{p}\Big )\frac{|\mathcal {A}^{nsq}|}{|\mathcal {A}|}\\&\le c_2 \, \mathcal {U}(r) + c_1\, r \log \Big (\frac{q}{p}\Big ) \frac{|\mathcal {A}^{nsq}|}{|\mathcal {A}|}, \end{aligned}$$

where \(c_2:=\max \{ \tau _1 +\tau _2,c_1\}\). Hence, if \(q>15\delta _{{\varvec{G}}}^{13/3}\), then Lemma 6.1 implies

$$\begin{aligned} E[\mathcal {X}_1] \le c_2\, \mathcal {U}(r) + 2\,c_1\, r^3 \delta _{{\varvec{G}}}\log \Big (\frac{q}{p}\Big )\frac{1}{q}. \end{aligned}$$

We obtain the following result.

Theorem 6.2

Let \(q > 15\delta _{{\varvec{G}}}^{13/3}\). The average cost \( E [\mathcal {X} _1] \) of the \({\mathrm {ERF}}\) algorithm applied to elements of \( \mathcal {A} \) is bounded as \( E [\mathcal {X} _1] \le c_2 \, \mathcal {U} (r) +c_3 \log \big (\frac{q}{p} \big ) \delta _{{\varvec{G}}}\frac{r ^ 3}{q} \), where \(c_2\) and \( c_3\) are constants independent of r and q.

We may paraphrase this result as saying that the average cost of the ERF algorithm applied to elements of \(\mathcal {A}\) is asymptotically of order \(\mathcal {U}(r)\), which corresponds to the cost of calculating the greatest common divisor \(u:=\gcd (f,f')\). This generalizes the results of [18, Section 2].

6.2 Distinct-degree factorization

Now we analyze the distinct-degree factorization (DDF) step. Recall that, given a square-free polynomial \(a_f:={\mathrm {ERF}}(f)\), the DDF routine outputs a list \((b(1),\ldots ,b(s))\), where b(k) is the product of all the irreducible factors of degree k of the complete factorization of \(a_f\) over \(\mathbb {F}_{q}\). The output \((b(1),\ldots ,b(s))\) is called the distinct-degree factorization of \(a_f\).

The DDF procedure is based on the following property (see, e.g., [39, Theorem 3.20]): for \(k \ge 1\), the polynomial \(T^{q^k}-T \in \mathbb {F}_{q}[T]\) is the product of all monic irreducible polynomials in \(\mathbb {F}_{q}[T]\) whose degree divides k. It follows that, for a square-free polynomial f, \(g_1:=\gcd (T^q-T,f)\) is the product of all the irreducible factors of f of degree 1. Then, for \( 2 \le k \le r \), the polynomial \(g_k:=\gcd \big (T^{q^k }-T, f/(g_1\cdots g_{k-1})\big ) \) is the product of all the irreducible factors of f of degree k. This proves the correctness of the following procedure.

DDF Algorithm

figure c
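The loop described above (raise h to the qth power modulo g, take \(\gcd (h-T,g)\), divide g by the result) can be sketched in a few lines. This is a minimal illustration rather than the authors' pseudocode, under the assumption that \(q=p\) is prime; polynomials are lists of coefficients in \(\{0,\ldots ,p-1\}\), lowest degree first, and all helper names are hypothetical.

```python
def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a

def pdivmod(a, b, p):
    """Quotient and remainder of a by a nonzero b over F_p."""
    a = a[:]
    quo = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)
    for i in range(len(a) - len(b), -1, -1):
        c = a[i + len(b) - 1] * inv % p
        quo[i] = c
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - c * bj) % p
    return trim(quo), trim(a)

def pgcd(a, b, p):
    """Monic greatest common divisor over F_p (a must be nonzero)."""
    while b:
        a, b = b, pdivmod(a, b, p)[1]
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def pmul(a, b, p):
    """Product of two polynomials over F_p (schoolbook multiplication)."""
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def ppowmod(base, e, mod, p):
    """base^e modulo mod over F_p, by repeated squaring (e >= 1)."""
    result, base = [1], pdivmod(base, mod, p)[1]
    while e:
        if e & 1:
            result = pdivmod(pmul(result, base, p), mod, p)[1]
        base = pdivmod(pmul(base, base, p), mod, p)[1]
        e >>= 1
    return result

def ddf(a, p):
    """Distinct-degree factorization [b(1), ..., b(s)] of a monic square-free a over F_p."""
    out, g, h = [], a[:], [0, 1]             # h starts as the polynomial T
    while len(g) > 1:
        h = ppowmod(h, p, g, p)              # h <- h^q mod g, so h = T^(q^k) mod g
        h_minus_t = trim([(c - (1 if i == 1 else 0)) % p
                          for i, c in enumerate(h + [0] * (2 - len(h)))])
        b = pgcd(g, h_minus_t, p)            # product of the degree-k factors
        out.append(b)
        g = pdivmod(g, b, p)[0]              # remove them from g
    return out
```

For example, over \(\mathbb {F}_3\) the polynomial \(T(T+1)(T^2+1)=T^4+T^3+T^2+T\) yields `ddf([0, 1, 1, 1, 1], 3) == [[0, 1, 1], [1, 0, 1]]`, that is, \(b(1)=T^2+T\) and \(b(2)=T^2+1\). The number of loop iterations equals the largest degree s of an irreducible factor, which is the quantity that drives the cost analysis.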

In [50, Theorem 14.4], it is shown that this algorithm performs \(\mathcal {O}(s M(r) \log (rq))\) arithmetic operations in \(\mathbb {F}_{q}\), where s is the maximum degree of the irreducible factors of the input polynomial a. In this section, we analyze the average-case complexity of the DDF routine restricted to polynomials of the family \(\mathcal {A}\). More precisely, we consider the expected value \(E[\mathcal {X}_2]\) of the random variable \(\mathcal {X}_2\) of (6.2), namely

$$\begin{aligned} E[\mathcal {X}_2]:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}} \mathcal {X}_2(f). \end{aligned}$$

We decompose as before the set of inputs \(\mathcal {A}\) into the disjoint subsets \(\mathcal {A}^{sq}\) (elements of \(\mathcal {A}\) which are square-free) and \(\mathcal {A} ^{nsq}:=\mathcal {A} {\setminus } \mathcal {A}^{sq}\). Hence, we have

$$\begin{aligned} E[\mathcal {X}_2]=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_2(f)+\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_2(f). \end{aligned}$$
(6.8)

First, we obtain an upper bound for the first sum \(S_2^{sq}\) in the right-hand side of (6.8). We express \(\mathcal {A}^{sq}\) as a disjoint union as follows:

$$\begin{aligned} \mathcal {A}^{sq}=\bigcup _{i=1}^r \mathcal {A}_{i}^{sq}, \end{aligned}$$

where \(\mathcal {A}_{i}^{sq}\) is the set of elements of \(\mathcal {A}^{sq}\) for which the maximum degree of the irreducible factors is i. Moreover, for \(1 \le i \le r\), we can express each \(\mathcal {A}_{i}^{sq}\) as the disjoint union

$$\begin{aligned} \mathcal {A}_{i}^{sq}=\bigcup _{{\varvec{\lambda }} \in \mathcal {P}_i} \mathcal {A}_{{\varvec{\lambda }}}^{sq}, \end{aligned}$$

where \(\mathcal {P}_i\) is the set of \({\varvec{\lambda }}:=(\lambda _1, \ldots , \lambda _i, 0,\ldots ,0)\in \mathbb {Z}_{\ge 0}^r\) such that \(\lambda _1+\cdots + i \, \lambda _i=r\) and \( \lambda _i>0\), and \(\mathcal {A}_{ {\varvec{\lambda }}}^{sq}\) is the set of elements of \(\mathcal {A}_{i}^{sq}\) with factorization pattern \({\varvec{\lambda }}\). Therefore,

$$\begin{aligned} S_2^{sq}=\frac{1}{|\mathcal {A}|} \sum _{i=1}^r \sum _{{\varvec{\lambda }}\in \mathcal {P}_i}\sum _{f \in \mathcal {A}_{{\varvec{\lambda }} }^{sq}} \mathcal {X}_2(f). \end{aligned}$$
(6.9)

Fix i with \( 1 \le i \le r\), let \({\varvec{\lambda }}\in \mathcal {P}_i\) and \(f \in \mathcal {A}_{{\varvec{\lambda }}}^{sq}\). To determine the cost \(\mathcal {X}_2(f)\), we observe that the procedure performs i iterations of the main loop. Fix l with \( 1\le l\le i\) and consider the lth iteration of the DDF algorithm. Denote by \(\lambda (q)\) the number of products modulo g needed to compute \(h^q \bmod g\). Using repeated squaring, and denoting by \(\nu (q)\) the number of ones in the binary representation of q, we have

$$\begin{aligned} \lambda (q):= \lfloor \log q \rfloor + \nu (q) -1. \end{aligned}$$
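The count \(\lambda (q)\) can be compared with an explicit tally of the products performed by binary exponentiation. The sketch below assumes that the logarithm is taken in base 2 and uses the left-to-right variant of repeated squaring; the function names are hypothetical.

```python
def lam(q):
    """floor(log2 q) + (number of ones in the binary expansion of q) - 1."""
    return (q.bit_length() - 1) + bin(q).count("1") - 1

def count_products(q):
    """Products used by left-to-right binary exponentiation to reach h^q from h."""
    count = 0
    for bit in bin(q)[3:]:       # binary digits of q after the leading one
        count += 1               # one squaring per remaining digit
        if bit == "1":
            count += 1           # one extra product by h when the digit is 1
    return count

# The closed formula agrees with the explicit tally on a range of exponents.
assert all(lam(q) == count_products(q) for q in range(2, 1025))
```

For instance, \(q=9\) has binary expansion 1001, so three squarings and one extra product are needed, in agreement with \(\lambda (9)=3+2-1=4\).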

Thus, the first step in the lth iteration of the DDF algorithm requires at most \(2\,\tau _1\, \lambda (q) M(r_l)\) arithmetic operations in \(\mathbb {F}_{q}\), where \(r_l:=\deg g\) (note that \(r_1=r\) and \(r_l \le r\) for any l). Then, the computation \(b(k):=\gcd (h-T,g)\) requires at most \(\tau _2 M(r_l) \log r_l\) arithmetic operations in \(\mathbb {F}_{q}\). Finally, the division g / b(k) requires at most \(\tau _1 M(r_l)\) arithmetic operations in \(\mathbb {F}_{q}\). As a consequence, we see that

$$\begin{aligned} \mathcal {X}_2(f) \le \sum _{l=1}^i (2\,\tau _1 \lambda (q) + \tau _2 \log r_l + \tau _1) \, M(r_l). \end{aligned}$$

Observe that if \(a \le b\), then \(M(a) \le M(b)\) (see, e.g., [50, §14.8]). It follows that

$$\begin{aligned} \mathcal {X}_2(f) \le i\,c_{r,q},\quad c_{r,q}:= M(r)\,\big (2\,\tau _1\lambda (q)+ \tau _1+\tau _2 \log r\big ). \end{aligned}$$
(6.10)

Thus, we obtain

$$\begin{aligned} S_2^{sq} \le \frac{c_{r,q}}{|\mathcal {A}|} \sum _{i=1}^r \sum _{{\varvec{\lambda }} \in \mathcal {P}_i}\sum _{f \in \mathcal {A}_{{\varvec{\lambda }} }^{sq}} i= \frac{c_{r,q}}{|\mathcal {A}|}\sum _{i=1}^r i \sum _{{\varvec{\lambda }} \in \mathcal {P}_i} |\mathcal {A}_{{\varvec{\lambda }}}^{sq}|. \end{aligned}$$

We have the following result.

Lemma 6.3

For \(q > 15\delta _{{\varvec{G}}}^{13/3}\), the sum \(S_2^{sq}\) is bounded in the following way:

$$\begin{aligned} S_2^{sq} \le c_{r,q}\bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\bigg (1+\frac{M_{r}}{q} \bigg )\xi (r+1)= c_{r,q}\,\xi (r+1)\big (1+o(1)\big ), \end{aligned}$$
(6.11)

where \(M_{r}:=D\delta q^{\frac{1}{2}} +14\,D^2 \delta ^2+ r^2\delta \), \(\delta :=\prod _{i=1}^m {{{\mathsf {wt}}}}(G_i)\), \(D:=\sum _{i=1}^m ({{{\mathsf {wt}}}}(G_i)-1)\) and \(\xi \approx 0.6243299\ldots \) is the Golomb–Dickman constant.

Proof

According to Theorem 4.6, we have

$$\begin{aligned} |\mathcal {A}_{{\varvec{\lambda }} }^{sq}| \le q^{r-m}\,\mathcal {T}({\varvec{\lambda }})\bigg (1 +\frac{M_{ r}}{q}\bigg ), \end{aligned}$$

where \(\mathcal {T}({\varvec{\lambda }})\) is the probability of the set of permutations with cycle pattern \({\varvec{\lambda }} \) in the symmetric group \(\mathbb {S}_r\) of r elements. Hence,

$$\begin{aligned} S_2^{sq}&\le \frac{c_{r,q}}{|\mathcal {A}|}\,q^{r-m}\bigg (1+\frac{M_{r}}{q} \bigg )\sum _{i=1}^r i \sum _{{\varvec{\lambda }} \in \mathcal {P}_i} \mathcal {T}({\varvec{\lambda }} ). \end{aligned}$$
(6.12)

Now we analyze the sum \(E_r:=\sum _{i=1}^r i\sum _{{\varvec{\lambda }} \in \mathcal {P}_i} \mathcal {T}({\varvec{\lambda }} )\). Observe that the sum \( \sum _ {{\varvec{\lambda }} \in \mathcal {P}_i} \mathcal {T} ({\varvec{\lambda }}) \) expresses the probability of the set of permutations whose longest cycle has length i. It follows that \( E_r \) is the expected length of the longest cycle of a random permutation in \( \mathbb {S} _r \). In [28], it is shown that

$$\begin{aligned} \frac{E_r}{r+1}\le \xi , \end{aligned}$$

where \(\xi \) is the Golomb–Dickman constant (see, e.g., [35]). Combining this upper bound, Theorem 3.12 and (6.12), we readily deduce the statement of the lemma. \(\square \)
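For small r, the quantity \(E_r\) can be computed exactly by enumerating cycle patterns. The following sketch is an illustration only; it relies on the classical formula \(\mathcal {T}({\varvec{\lambda }})=1/\prod _{i}i^{\lambda _i}\lambda _i!\) for the proportion of permutations with cycle pattern \({\varvec{\lambda }}\), and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def cycle_type_proportion(parts):
    """Proportion of permutations whose cycle lengths are given by parts."""
    denom = 1
    for i in set(parts):
        mult = parts.count(i)                 # lambda_i = number of cycles of length i
        denom *= i ** mult * factorial(mult)
    return Fraction(1, denom)

def expected_longest_cycle(r):
    """E_r: expected length of the longest cycle of a random permutation in S_r."""
    return sum(parts[0] * cycle_type_proportion(parts) for parts in partitions(r))

assert expected_longest_cycle(3) == Fraction(13, 6)
for r in range(1, 13):
    print(r, float(expected_longest_cycle(r) / (r + 1)))
```

The printed ratios \(E_r/(r+1)\) start at 0.5 for \(r=1\) and increase slowly while staying below \(\xi \), which is consistent with the bound quoted from [28].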

Next we obtain an upper bound for the second sum \(S_2^{nsq}\) of (6.8), namely

$$\begin{aligned} S_2^{nsq}:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_2(f). \end{aligned}$$

Given \(f \in \mathcal {A}^{nsq}\), we bound \(\mathcal {X}_2(f):={\mathrm {Cost}}({\mathrm {DDF}}(a_f))\), where \(a_f:={\mathrm {ERF}}(f)\) is the output square-free polynomial of the ERF procedure applied to f. By (6.10), we have

$$\begin{aligned} \mathcal {X}_2(f) \le c_{N,q}\cdot s_a, \end{aligned}$$

where \(c_{N,q}:=M(N)\,\big (2\,\tau _1\lambda (q)+ \tau _1+\tau _2 \log N\big )\), \(N:=\deg (a_f)\) and \(s_a\) is the highest degree of the irreducible factors of \( a_f \). Since \(f \in \mathcal {A}^{nsq}\), we have \(N\le r-1\) and \(s_a\le r -2\). Moreover, it is easy to see that these bounds are optimal. Therefore we obtain

$$\begin{aligned} \mathcal {X}_2(f) \le c_{r-1,q}\, (r-2). \end{aligned}$$

Combining this bound, Theorem 3.12 and (4.12), we deduce that if \(q> 15\delta _{{\varvec{G}}}^{13/3}\), then

$$\begin{aligned} S_{2}^{nsq} \le c_{r-1,q}\,(r-2) \frac{|\mathcal {A}^{nsq}|}{|\mathcal {A}|}&\le c_{r-1,q}\,(r-2)\left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \frac{r^2\delta _{{\varvec{G}}}\, q^{r-m-1}}{q^{r-m}}\nonumber \\&\le \,c_{r-1,q}\left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \,\frac{r^3\delta _{{\varvec{G}}}}{q}. \end{aligned}$$
(6.13)

From the upper bounds of Lemma 6.3 and (6.13), we conclude that

$$\begin{aligned} E[\mathcal {X}_2]&=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_2(f)+\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_2(f)\\&\le c_{r,q}\left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \left( 1+\frac{M_{r}}{q}\right) \xi (r+1)+\,c_{r-1,q}\left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \,\frac{r^3\delta _{{\varvec{G}}}}{q}. \end{aligned}$$

Since \(c_{j,q}:=M(j)\,\big (2\, \tau _1 \lambda (q)+\tau _1+\tau _2\log j\big )\), we have \(c_{r-1,q} \le c_{r,q}\). As a consequence, we obtain the following result.

Theorem 6.4

For \(q> 15\delta _{{\varvec{G}}}^{13/3}\), the average cost \(E[\mathcal {X}_2]\) of the \({\mathrm {DDF}}\) algorithm restricted to \(\mathcal {A} \) is bounded by

$$\begin{aligned} E[\mathcal {X}_2]&\le \xi \,(2\, \tau _1 \lambda (q)+\tau _1+\tau _2\log r) M(r)\,(r+1) \left( 1 +\frac{M_r+r^2\delta _{{\varvec{G}}}}{q}\right) \\&\quad \times \left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \\&=\xi \,(2\, \tau _1 \lambda (q)+\tau _1+\tau _2\log r) M(r)\,(r+1)\big (1+o(1)\big ), \end{aligned}$$

where \(M_{r}:=D\delta q^{\frac{1}{2}} +14\,D^2 \delta ^2+ r^2\delta \), \(\delta :=\prod _{i=1}^m {{{\mathsf {wt}}}}(G_i)\), \(D:=\sum _{i=1}^m ({{{\mathsf {wt}}}}(G_i)-1)\) and \(\xi \sim 0.62432998\ldots \) is the Golomb–Dickman constant.

In [18, Theorem 5], the authors prove that the average cost of the DDF algorithm applied to a random polynomial \( f \in \mathbb {F}_{q}[T] \) of degree at most r is of order \(0.26689\, (2\,\tau _1\,\lambda (q) +\tau _2)\,r^3\). We prove that, assuming that fast arithmetic is used, the average cost of this algorithm restricted to \(\mathcal {A}\) is of order \(\xi (2\,\tau _1\,\lambda (q)+\tau _1 +\tau _2\log r)\,(r+1)\,M(r)\) arithmetic operations in \(\mathbb {F}_{q}\).

The DDF algorithm does not completely factor any polynomial \(f \in \mathcal {A}\) having distinct irreducible factors of the same degree. More precisely, the classical factorization algorithm ends in this step if the input polynomial f has a factorization pattern \({\varvec{\lambda }}\in \{0,1\}^r\). We conclude this section with a result on the probability that the DDF algorithm outputs the complete factorization of the input polynomial of \(\mathcal {A}\).

In [19], it is shown that most factorizations are completed after the application of the DDF procedure. More precisely, it is proved that, when r is fixed and q tends to infinity, the probability that the DDF algorithm produces a complete factorization of a random polynomial of degree at most r in \( \mathbb {F}_{q}[T] \) is of order \(e^{-\gamma } \sim 0.5614\ldots \), where \(\gamma \sim 0.57721\ldots \) is Euler’s constant (see [18, Theorem 6]). We generalize this result to the family \(\mathcal {A}\).

Theorem 6.5

The probability that the \({\mathrm {DDF}}\) algorithm completes the factorization of a random polynomial of \(\mathcal {A}\) is bounded from above by \(\big (e^{-\gamma }+ e^{-{\gamma }}/{r}+O({\log r}/{r^2})\big )\big (1+o(1)\big )\), where \( \gamma \) is Euler’s constant.

Proof

Let \(\mathcal {A}_1\) be the set of elements of \( \mathcal {A} \) whose irreducible factors have pairwise distinct degrees. The probability that the DDF algorithm outputs the complete factorization of a random polynomial \(f\in \mathcal {A} \) coincides with the probability that a random \(f\in \mathcal {A}\) belongs to \(\mathcal {A}_1\). We may express \( \mathcal {A}_1 \) as the following disjoint union:

$$\begin{aligned} \mathcal {A}_1=\bigcup _{{\varvec{\lambda }} \in \mathcal {P}_{r}} \mathcal {A}_{1,{\varvec{\lambda }}}, \end{aligned}$$

where \( \mathcal {P}_r \) is the set of all vectors \( {\varvec{\lambda }}: = (\lambda _1, \ldots , \lambda _r) \in \{0,1 \}^r \) such that \( \lambda _1 + \cdots + r \, \lambda _r = r \) and \( \mathcal {A}_{1, {\varvec{\lambda }}} \) is the set of elements of \( \mathcal {A}_1 \) having factorization pattern \( {\varvec{\lambda }} \). Hence,

$$\begin{aligned} P[\mathcal {A}_1]=\sum _{{\varvec{\lambda }} \in \mathcal {P}_r} P [\mathcal {A}_{1,{\varvec{\lambda }}}]=\frac{1}{|\mathcal {A}|} \sum _{{\varvec{\lambda }} \in \mathcal {P}_r} |\mathcal {A}_{1,{\varvec{\lambda }}}|. \end{aligned}$$
(6.14)

Observe that if \(f \in \mathcal {A}_1\), then f is square-free. By Theorem 4.6, for \(m<r\) we have

$$\begin{aligned} |\mathcal {A}_{1,{\varvec{\lambda }}}| \le q^{r-m}\,\mathcal {T}({\varvec{\lambda }})\,\bigg (1+\frac{M_{r}}{q}\bigg ), \end{aligned}$$

where \(M_{r}:=D\delta q^{\frac{1}{2}} +14\,D^2 \delta ^2+ r^2\delta \), \(\delta :=\prod _{i=1}^m {{{\mathsf {wt}}}}(G_i)\) and \(D:=\sum _{i=1}^m ({{{\mathsf {wt}}}}(G_i)-1)\). Theorem 3.12 shows that if \(q > 15\delta _{{\varvec{G}}}^{13/3}\), then

$$\begin{aligned} P[ \mathcal {A}_1] \le \left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \bigg (1+\frac{M_{r}}{q}\bigg )\sum _{{\varvec{\lambda }} \in \mathcal {P}_r}\mathcal {T}({\varvec{\lambda }}). \end{aligned}$$

We observe that \(\sum _{{\varvec{\lambda }} \in \mathcal {P}_r} \mathcal {T}({\varvec{\lambda }})\) expresses the probability that a random permutation of \(\mathbb {S}_r\) has a decomposition into cycles of pairwise different lengths. By [30, (4.57)] (see also [17, Proposition 1]), it follows that

$$\begin{aligned} \sum _{{\varvec{\lambda }} \in \mathcal {P}_r}\mathcal {T}( {\varvec{\lambda }})= e^{-\gamma }+\frac{e^{-\gamma }}{r}+O\bigg (\frac{\log r}{r^2}\bigg ). \end{aligned}$$

We deduce that

$$\begin{aligned} P[\mathcal {A}_1]\le \left( 1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\right) \left( 1+\frac{M_{r}}{q}\right) \left( e^{-\gamma }+ \frac{e^{-\gamma }}{r}+O\left( \frac{\log r}{r^2}\right) \right) . \end{aligned}$$

This finishes the proof of the theorem. \(\square \)
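The sum \(\sum _{{\varvec{\lambda }} \in \mathcal {P}_r}\mathcal {T}({\varvec{\lambda }})\) appearing in the proof can be computed exactly for moderate r. The sketch below assumes the standard fact that a pattern \({\varvec{\lambda }}\in \{0,1\}^r\) has \(\mathcal {T}({\varvec{\lambda }})=\prod _{i:\lambda _i=1} 1/i\), so that the sum is the coefficient of \(z^r\) in \(\prod _{i=1}^r (1+z^i/i)\). The function name is illustrative.

```python
from fractions import Fraction
from math import exp

EULER_GAMMA = 0.5772156649  # numerical value of Euler's constant


def prob_distinct_cycle_lengths(r):
    """Coefficient of z^r in prod_{i=1}^{r} (1 + z^i / i), as an exact rational."""
    coeffs = [Fraction(1)] + [Fraction(0)] * r
    for i in range(1, r + 1):
        # Multiply by (1 + z^i / i); iterating downwards uses each factor once.
        for d in range(r, i - 1, -1):
            coeffs[d] += coeffs[d - i] / i
    return coeffs[r]


for r in (2, 3, 4, 20, 60):
    exact = float(prob_distinct_cycle_lengths(r))
    approx = exp(-EULER_GAMMA) * (1 + 1 / r)
    print(r, exact, approx)
```

For small r the exact value fluctuates noticeably around \(e^{-\gamma }(1+1/r)\), in agreement with the \(O(\log r/r^2)\) error term.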

6.3 Equal-degree factorization

After the first two steps of the classical factorization algorithm, the general problem of factorization is reduced to factorizing a collection of square-free polynomials \(b(k)\), each of whose irreducible factors has degree k. The procedure for equal-degree factorization (EDF) receives as input a vector \( \varvec{b}_f: = {\mathrm {DDF}} (a_f) = (b_f (1), \ldots , b_f (s)) \), where each \( b_f (k) \) is the product of the irreducible factors of degree k of the square-free part \( a_f: = {\mathrm {ERF}} (f) \) of f. Its output is the irreducible factorization \(b_f (k)=b_f (k, 1) \ldots b_f ( k, l)\) of each \(b_f (k)\) in \(\mathbb {F}_{q}[T]\). The probabilistic algorithm presented here is based on the Cantor–Zassenhaus algorithm [53], and works for \({\mathrm {char}}(\mathbb {F}_{q})\) odd.

EDF algorithm

[Figure d: listing of the EDF algorithm]

The EDF algorithm is based on the principle we now briefly explain. Assume that the irreducible factorization of the input polynomial c is \(c= f_1 \ldots f_j \), with each \(f_i\) of degree k. The Chinese Remainder Theorem implies that

$$\begin{aligned} \mathbb {F}_{q}[T] /(c) \cong \mathbb {F}_{q}[T] / (f_1) \times \cdots \times \mathbb {F}_{q}[T] / (f_j). \end{aligned}$$

Under this isomorphism, a random \(h\in \mathbb {F}_{q}[T]/(c)\) is associated with a j-tuple \((h_1,\ldots ,h_j)\), where each \(h_i\) is a random element of \(\mathbb {F}_{q}[T]/(f_i)\). Since each \(f_i\) is irreducible, the quotient ring \(\mathbb {F}_{q}[T]/(f_i)\) is a finite field, isomorphic to \(\mathbb {F}_{q^k}\). The multiplicative group \(\mathbb {F}_{q^k}^*\) being cyclic, it contains the same number \((q^k-1)/2\) of squares and of non-squares (see, e.g., [50, Lemma 14.7]). Recall that \(m \in \mathbb {F}_{q^k}^*\) is a square if and only if \(m^{(q^k-1)/2}=1\). Therefore, testing whether \(h_i^{(q^k-1)/2}=1\) discriminates the squares in \(\mathbb {F}_{q^k}^*\). Thus, if \(g:=h^{(q^k-1)/2}-1 \mod c\), then \(\gcd (g,c)\) is the product of all the \(f_i\) such that h is a nonzero square in \(\mathbb {F}_{q}[T]/(f_i)\). From the probabilistic standpoint, a random element \(h_i\) of \(\mathbb {F}_{q}[T]/(f_i)\) has probability \(\alpha := 1/2-1/(2q^k)\) of being a nonzero square and the complementary probability \(\beta :=1/2+1/(2q^k)\) of not being one.

Then, the EDF algorithm is applied recursively to the polynomials \( d = \gcd (g, c) \) and c / d. In this way, all the irreducible factors of \(c:=b(k)\) are extracted successively.
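The splitting principle just described can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are ours and not the paper's: the base field is a prime field \(\mathbb {F}_p\) with p odd (instead of a general \(\mathbb {F}_{q}\)), polynomials are coefficient lists with the constant term first, and schoolbook arithmetic replaces fast multiplication. All helper names are hypothetical.

```python
import random


def trim(a):
    """Remove zero coefficients of highest degree (lists are low-degree first; [] is 0)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_divmod(a, b, p):
    """Quotient and remainder of a by a nonzero b over F_p (schoolbook division)."""
    a = trim([x % p for x in a])
    quo = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        shift, coef = len(a) - len(b), a[-1] * inv % p
        quo[shift] = coef
        for i, bi in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * bi) % p
        trim(a)
    return trim(quo), a


def poly_mulmod(a, b, c, p):
    """Product a*b reduced modulo c over F_p."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    return poly_divmod(prod, c, p)[1]


def poly_powmod(h, e, c, p):
    """h^e modulo c by binary exponentiation."""
    result, base = poly_divmod([1], c, p)[1], poly_divmod(h, c, p)[1]
    while e:
        if e & 1:
            result = poly_mulmod(result, base, c, p)
        base = poly_mulmod(base, base, c, p)
        e >>= 1
    return result


def poly_gcd(a, b, p):
    """Monic greatest common divisor over F_p (a must be nonzero)."""
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    inv = pow(a[-1], -1, p)
    return [x * inv % p for x in a]


def edf(c, k, p):
    """Irreducible factors of a monic square-free c whose factors all have degree k."""
    if len(c) - 1 <= k:
        return [c] if len(c) - 1 == k else []
    e = (p**k - 1) // 2
    while True:
        h = trim([random.randrange(p) for _ in range(len(c) - 1)])
        g = poly_powmod(h, e, c, p)
        g = trim([((g[0] if g else 0) - 1) % p] + g[1:])  # g = h^e - 1 mod c
        d = poly_gcd(c, g, p)
        if 0 < len(d) - 1 < len(c) - 1:  # proper factor found: recurse on d and c/d
            return edf(d, k, p) + edf(poly_divmod(c, d, p)[0], k, p)


# (T-1)(T-2)(T-3) over F_7, stored as [1, 4, 1, 1] (constant term first).
print(sorted(edf([1, 4, 1, 1], 1, 7)))
```

The loop retries with a fresh random h whenever \(\gcd (g,c)\) is trivial, which mirrors the role of the probabilities \(\alpha \) and \(\beta \) in the analysis that follows.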

Following [18, Section 5], in this section we analyze the average-case complexity of the EDF algorithm applied to the family \( \mathcal {A} \), namely we consider the expected value \(E[\mathcal {X}_3]\) of the random variable \(\mathcal {X}_3\) of (6.3):

$$\begin{aligned} E[\mathcal {X}_3]:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}} \mathcal {X}_3(f). \end{aligned}$$

We decompose \(\mathcal {X} _3 \) as in (6.3) in the form

$$\begin{aligned} \mathcal {X}_3(f):=\sum _{k=1}^{\lceil r/2 \rceil } \mathcal {X}_{3,k}(f),\quad \mathcal {X}_{3,k}(f)\,:=\,{\mathrm {Cost}}({\mathrm {EDF}}(b_f(k)))\quad (1\le k\le {\lceil r/2 \rceil }), \end{aligned}$$

where \( b_f (k) \) is the kth coordinate of \(\varvec{b}_f: = {\mathrm {DDF}} (a_f) = (b_f (1), \ldots , b_f (s)) \). Hence, we have

$$\begin{aligned} E[\mathcal {X}_3]=\frac{1}{|\mathcal {A}|}\sum _{k=1}^{\lceil r/2 \rceil }\sum _{f\in \mathcal {A}}\mathcal {X}_{3,k}(f)=\sum _{k=1}^{\lceil r/2 \rceil } E[\mathcal {X}_{3,k}]. \end{aligned}$$

Fix k with \( 1\le k \le \lceil r/2 \rceil \) and write \(E[\mathcal {X}_{3,k}]\) as follows:

$$\begin{aligned} E[\mathcal {X}_{3,k}]=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_{3,k}(f)+\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_{3,k}(f)=:S_{3,k}^{sq}+S_{3,k}^{nsq}. \end{aligned}$$

We first bound \(S_{3,k}^{sq}\). For this purpose, we express \(\mathcal {A}^{sq}\) as the disjoint union

$$\begin{aligned} \mathcal {A}^{sq}=\bigcup _{j=0}^{\lfloor r/k \rfloor } \mathcal {A}_{j,k}^{sq}, \end{aligned}$$

where \(\mathcal {A}_{j,k}^{sq}\) is the set of all the elements \(f\in \mathcal {A}^{sq}\) having exactly j irreducible factors of degree k. Hence,

$$\begin{aligned} S_{3,k}^{sq}= \frac{1}{|\mathcal {A}|}\sum _{j=0}^{\lfloor r/k \rfloor } \sum _{f \in \mathcal {A}_{j,k}^{sq}} \mathcal {X}_{3,k}(f). \end{aligned}$$
(6.15)

We first bound the cost \(\mathcal {X}_{3,k}(f)\) of the \({\mathrm {EDF}}\) algorithm applied to any \(f \in \mathcal {A}_{j,k}^{sq}\).

Lemma 6.6

For any \(f\in \mathcal {A}_{j,k}^{sq}\), we have

$$\begin{aligned} \mathcal {X}_{3,k}(f) \le \frac{j(j-1)}{\alpha \beta }\big (\tau _1\,\mu _k {M(r)}+ \tau _3\,\mathcal {U}(r)\big )\, \frac{k}{r}, \end{aligned}$$

where \(\mu _k:=\lambda \big (\frac{q^{k} -1}{2} \big ): =\lfloor \log (\frac{q^k-1}{2})\rfloor +\nu (\frac{q^k-1}{2})-1\) and \(\tau _3:=\max \{\tau _1,\tau _2\}\).
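The quantity \(\lambda (e)=\lfloor \log e\rfloor +\nu (e)-1\) is the number of multiplications performed by left-to-right binary exponentiation with exponent e. The Python sketch below illustrates this under two assumptions of ours: \(\log \) is the base-2 logarithm and \(\nu (e)\) is the number of ones in the binary expansion of e. For simplicity it works with integers modulo a number rather than with polynomials modulo f, and the function names are illustrative.

```python
def power_with_count(h, e, mod):
    """Return (h^e mod `mod`, number of modular multiplications), for e >= 1."""
    bits = bin(e)[2:]          # binary digits of e, most significant first
    result, mults = h % mod, 0  # the leading bit costs no multiplication
    for bit in bits[1:]:
        result = result * result % mod  # one squaring per remaining bit
        mults += 1
        if bit == "1":
            result = result * h % mod   # one extra product per nonzero bit
            mults += 1
    return result, mults


def lam(e):
    """floor(log2 e) + (number of ones in the binary expansion of e) - 1."""
    return (e.bit_length() - 1) + bin(e).count("1") - 1


q, k = 7, 2
e = (q**k - 1) // 2
print(e, power_with_count(3, e, 1009), lam(e))
```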

Proof

If \(j=0\) or \(j=1\), then the EDF procedure does not perform any computation, and the result trivially follows. Therefore, we may assume that \(j\ge 2\).

The cost of a recursive call to the EDF procedure for \( f \in \mathcal {A}_{j, k} ^{sq}\) is determined by the cost of computing \(h^ {(q ^k-1) / 2} \mod f \), where h is a random element of \( \mathbb {F}_{q}[T] / (f) \), of computing a greatest common divisor of f with a polynomial of degree at most jk, and of dividing two polynomials of degree at most jk. Observe that \( \mu _k\) multiplications modulo f are required to compute \( h^{ (q ^ k-1) / 2} \mod f \) using binary exponentiation. We conclude that \( h ^ {(q ^ k-1) / 2} \mod f \) can be computed with at most \(2\, \tau _1\, \mu _k M (jk) \) arithmetic operations in \( \mathbb {F}_{q}\), while the remaining greatest common divisor and division are computed with at most \( \tau _2\, \mathcal {U} (jk) \) and \( \tau _1\, M(jk) \) arithmetic operations in \( \mathbb {F}_{q}\), respectively. The total cost of a recursive call is therefore at most

$$\begin{aligned} 2\, \tau _1\, \mu _k M (jk)+\tau _2\, \mathcal {U} (jk)+\tau _1\, M(jk)\le \Big (\tau _1\, \mu _k \frac{M (r)}{k\,r}+\tau _2\, \frac{\mathcal {U} (r)}{2\,k\,r}+\tau _1\, \frac{M(r)}{2\,k\,r}\Big )(jk)^2 \end{aligned}$$

arithmetic operations in \(\mathbb {F}_{q}\). Applying [18, Lemma 4] with \({\widetilde{\tau }}_1:= \frac{\tau _1 M(r)}{k\,r}\) and \({\widetilde{\tau }}_2:=\frac{\tau _3\, \mathcal {U}(r)}{k\,r}\), we see that

$$\begin{aligned} \mathcal {X}_{3,k}(f)\le & {} \bigg (\frac{j(j-1)}{2 \alpha \beta } + j\sum _{m=0}^{\infty }\sum _{l=0}^m \left( {\begin{array}{c}m\\ l\end{array}}\right) \alpha ^{m-l} \beta ^{l}\big (1-(1-\alpha ^{m-l} \beta ^l)^{j-1}\big )\bigg )\, \\&\times (\mu _k {\widetilde{\tau }}_1 + {\widetilde{\tau }}_2)\, k^2. \end{aligned}$$

Using the inequality \( 1- (1-u)^{j-1} \le (j-1) u \) for \( j \ge 2 \) and \( 0 \le u \le 1 \), we obtain

$$\begin{aligned}&\sum _{m=0}^{\infty }\sum _{l=0}^m \left( {\begin{array}{c}m\\ l\end{array}}\right) \alpha ^{m-l} \beta ^{l}\big (1-(1-\alpha ^{m-l} \beta ^l)^{j-1}\big )\le (j-1)\sum _{m=0}^{\infty }\sum _{l=0}^m \left( {\begin{array}{c}m\\ l\end{array}}\right) \alpha ^{2(m-l)} \beta ^{2l}\\&\quad \le (j-1) \sum _{m=0}^{\infty }(\alpha ^2+ \beta ^2)^m=\frac{j-1}{2\alpha \beta }. \end{aligned}$$

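Substituting this bound into the previous estimate and recalling the definitions of \({\widetilde{\tau }}_1\) and \({\widetilde{\tau }}_2\), we obtain

$$\begin{aligned} \mathcal {X}_{3,k}(f)\le \frac{j(j-1)}{\alpha \beta }\,(\mu _k {\widetilde{\tau }}_1 + {\widetilde{\tau }}_2)\, k^2=\frac{j(j-1)}{\alpha \beta }\big (\tau _1\,\mu _k {M(r)}+ \tau _3\,\mathcal {U}(r)\big )\, \frac{k}{r}, \end{aligned}$$

which proves the lemma. \(\square \)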
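The two elementary facts used at the end of the proof of Lemma 6.6, namely the identity \(1-(\alpha ^2+\beta ^2)=2\alpha \beta \) for \(\alpha +\beta =1\) and the convergence of the double sum to \(1/(2\alpha \beta )\), can be checked with exact rational arithmetic. The following Python sketch is only a sanity check; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def alpha_beta(q, k):
    """alpha = 1/2 - 1/(2 q^k) and beta = 1 - alpha, as exact rationals."""
    alpha = Fraction(1, 2) - Fraction(1, 2 * q**k)
    return alpha, 1 - alpha


def partial_double_sum(alpha, beta, terms):
    """Sum over m < terms of sum_l C(m, l) alpha^(2(m-l)) beta^(2l)."""
    return sum(
        comb(m, l) * alpha ** (2 * (m - l)) * beta ** (2 * l)
        for m in range(terms)
        for l in range(m + 1)
    )


alpha, beta = alpha_beta(3, 1)
limit = 1 / (2 * alpha * beta)  # closed form of the geometric series
print(limit, float(partial_double_sum(alpha, beta, 40)))
```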

As a consequence of Lemma 6.6, we have

$$\begin{aligned} S_{3,k}^{sq}:= \frac{1}{\vert \mathcal {A}\vert }\sum _{j=2}^{\lfloor r/k \rfloor } \sum _{f \in \mathcal {A}_{j,k}^{sq}} \mathcal {X}_{3,k}(f) \le \sum _{j=2}^{\lfloor r/k \rfloor } \frac{j(j-1)}{\alpha \beta }\, \big (\tau _1\,\mu _k {M(r)}+ \tau _3\,\mathcal {U}(r)\big )\, \frac{k}{r} \, \frac{|\mathcal {A}_{j,k}^{sq}|}{|\mathcal {A}|}. \end{aligned}$$
(6.16)

In the next result, we obtain an explicit upper bound for \(S_{3,k}^{sq}\).

Lemma 6.7

For \(q>15\delta _{{\varvec{G}}}^{13/3}\), we have

$$\begin{aligned} S_{3,k}^{sq}\le \frac{1}{\alpha \beta }\bigg (\tau _1\mu _k \frac{M(r)}{k\,r}+ \tau _3\frac{\mathcal {U}(r)}{k\,r}\bigg ) \bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg ) \bigg (1+\frac{M_{r}}{q}\bigg ), \end{aligned}$$

where \(\mu _k\) and \(\tau _3\) are as in Lemma 6.6 and \(M_{r}\) is defined as in Theorem 6.4.

Proof

In view of (6.16), we need to estimate the probability \(P[\mathcal {A}_{j, k}^ {sq} ]\) that a random \( f \in \mathcal { A}\) is square-free and has exactly j irreducible factors of degree k. In [34], it is shown that if q is sufficiently large, then the probability that a random \( f \in \mathbb {F}_{q}[T] \) of degree at most r has j distinct irreducible factors of degree k tends to \(e^{-1/k}\frac{k^{-j}}{j!}\).

We decompose the set \(\mathcal {A}_{j,k}^{sq}\) into the disjoint union

$$\begin{aligned} \mathcal {A}_{j,k}^{sq}=\bigcup _{{\varvec{\lambda }} \in \mathcal {P}_r^{j,k}} \mathcal {A}_{j,{\varvec{\lambda }}}^{sq}, \end{aligned}$$

where \(\mathcal {P}_{r}^{j,k}\) is the set of all r-tuples \( {\varvec{\lambda }}: = (\lambda _1, \ldots , \lambda _r) \in \mathbb {Z}_{\ge 0}^r \) with \(\lambda _1 + \cdots + r \, \lambda _r = r \) and \( \lambda _k = j \). Hence, we have

$$\begin{aligned} P[\mathcal {A}_{j,k}^{sq}]=\frac{1}{|\mathcal {A}|} \sum _{{\varvec{\lambda }} \in \mathcal {P}_{r}^{j,k}} |\mathcal {A}_{j,{\varvec{\lambda }}}^{sq}|. \end{aligned}$$

From Theorem 4.6, we deduce that

$$\begin{aligned} |\mathcal {A}_{j,{\varvec{\lambda }}}^{sq}| \le q^{r-m}\,\mathcal {T}({\varvec{\lambda }})\bigg (1+\frac{M_{r}}{q}\bigg ). \end{aligned}$$

From Theorem 3.12, it follows that, for \(q> 15\delta _{{\varvec{G}}}^{13/3}\),

$$\begin{aligned} P[\mathcal {A}_{j,k}^{sq}]=\frac{1}{|\mathcal {A}|} \sum _{{\varvec{\lambda }} \in \mathcal {P}_{r}^{j,k}} |\mathcal {A}_{j,{\varvec{\lambda }}}^{sq}| \le \bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\bigg (1+\frac{M_{r}}{q}\bigg ) \sum _{{\varvec{\lambda }} \in \mathcal {P}_{r}^{j,k}} \mathcal {T}({\varvec{\lambda }}). \end{aligned}$$

The sum on the right-hand side expresses the probability that a random permutation in \( \mathbb {S}_r \) has exactly j cycles of length k. In [48], it is shown that

$$\begin{aligned} \sum _{{\varvec{\lambda }} \in \mathcal {P}_{r}^{j,k}} \mathcal {T}({\varvec{\lambda }})=\frac{1}{j!k^j}\sum _{i=0}^{\lfloor r/k-j \rfloor }(-1)^i \frac{1}{i! k^i}. \end{aligned}$$

We observe that the sum of all probabilities is 1, that is,

$$\begin{aligned} \sum _{j=0}^{\lfloor r/k \rfloor } \frac{1}{j!k^j}\sum _{i=0}^{\lfloor r/k-j \rfloor }(-1)^i \frac{1}{i! k^i}=1. \end{aligned}$$

As a consequence, by (6.16) we deduce that

$$\begin{aligned} S_{3,k}^{sq}&\le \sum _{j=2}^{\lfloor r/k \rfloor } \frac{j(j-1)}{\alpha \beta }\bigg (\tau _1\mu _k \frac{M(r)}{k\,r}+ \tau _3\frac{\mathcal {U}(r)}{k\,r}\bigg ) k^2 \bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\\&\quad \times \bigg (1+\frac{M_{r}}{q}\bigg ) \frac{1}{j!k^j}\sum _{i=0}^{\lfloor r/k-j \rfloor } \frac{(-1)^i}{i! k^i}\\&\le \frac{1}{\alpha \beta }\bigg (\tau _1\mu _k \frac{M(r)}{k\,r}+ \tau _3\frac{\mathcal {U}(r)}{k\,r}\bigg ) \bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\\&\quad \times \bigg (1+\frac{M_{r}}{q}\bigg )\sum _{j=2}^{\lfloor r/k \rfloor } \frac{1}{(j-2)!k^{j-2}}\sum _{i=0}^{\lfloor r/k-j \rfloor } \frac{(-1)^i}{i! k^i}\\&\le \frac{1}{\alpha \beta }\bigg (\tau _1\mu _k \frac{M(r)}{k\,r}+ \tau _3\frac{\mathcal {U}(r)}{k\,r}\bigg ) \bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg ) \bigg (1+\frac{M_{r}}{q}\bigg ). \end{aligned}$$

This shows the lemma. \(\square \)
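The closed formula for the probability that a random permutation of \(\mathbb {S}_r\) has exactly j cycles of length k, and the fact that these probabilities sum to 1, can be verified by brute force for small r. The Python sketch below does so by enumerating \(\mathbb {S}_r\); the function names are illustrative, and the enumeration is only feasible for very small r.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def prob_formula(r, k, j):
    """(1 / (j! k^j)) * sum_{i=0}^{floor(r/k) - j} (-1)^i / (i! k^i)."""
    top = r // k - j
    if top < 0:
        return Fraction(0)
    inner = sum(
        (Fraction((-1) ** i, factorial(i) * k**i) for i in range(top + 1)),
        Fraction(0),
    )
    return inner / (factorial(j) * k**j)


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple (image of s is perm[s])."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = perm[x]
                length += 1
            lengths.append(length)
    return lengths


def prob_bruteforce(r, k, j):
    """Proportion of permutations of S_r with exactly j cycles of length k."""
    hits = sum(1 for perm in permutations(range(r)) if cycle_lengths(perm).count(k) == j)
    return Fraction(hits, factorial(r))


r, k = 5, 2
print([prob_formula(r, k, j) for j in range(r // k + 1)])
```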

Next we obtain an upper bound for

$$\begin{aligned} S_{3,k}^{nsq}:=\frac{1}{|\mathcal {A}|} \sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_{3,k}(f). \end{aligned}$$
(6.17)

Let \(f \in \mathcal {A}^{nsq}\) and \(\varvec{b}_f:={\mathrm {DDF}}(a_f)=(b_f(1),\ldots , b_f(s))\). Assume that \(\deg (b_f(k))=m_k\). We have the following bound (see, e.g., [50, Theorem 14.11]):

$$\begin{aligned} \mathcal {X}_{3,k}(f) \le c\, (k \log q+ \log m_k)M(m_k) \log \Big (\frac{m_k}{k}\Big ), \end{aligned}$$

where c is a constant independent of k and q. Taking into account the estimate of \(|\mathcal {A}^{nsq}|\) of (4.12) and Theorem 3.12, we conclude that if \(q > 15\delta _{{\varvec{G}}}^{13/3}\), then

$$\begin{aligned} S_{3,k}^{nsq}&\le c\, (k \log q+ \log m_k)M(m_k) \log \bigg (\frac{m_k}{k}\bigg ) \frac{2\,r^2\delta _{{\varvec{G}}}}{q}. \end{aligned}$$
(6.18)

Now we are able to bound the cost of the EDF procedure.

Theorem 6.8

For \(q > 15\delta _{{\varvec{G}}}^{13/3}\), the average cost \(E[\mathcal {X}_3]\) of the \({\mathrm {EDF}}\) algorithm restricted to \(\mathcal {A}\) is bounded as

$$\begin{aligned} E[\mathcal {X}_3]\le & {} \tau \, M(r) \log q\bigg (\bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\bigg (1+\frac{M_{r}}{q}\bigg )+\frac{r^3\delta _{{\varvec{G}}}}{q}\bigg )\\= & {} \tau \, M(r) \log q\, (1+o(1)), \end{aligned}$$

where \(\tau \) is a constant independent of q and r, and \(M_{r}\) is defined as in Theorem 6.4.

Proof

Recall that \(E[\mathcal {X}_3]=\sum _{k=1}^{\lceil r/2 \rceil }\big (S_{3,k}^{sq}+S_{3,k}^{nsq}\big )\). From Lemma 6.7 and (6.18), we have

$$\begin{aligned} \sum _{k=1}^{\lceil r/2 \rceil } S_{3,k}^{sq}&\le \bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\bigg (1+\frac{M_{r}}{q}\bigg ) \sum _{k=1}^{\lceil r/2 \rceil } \frac{1}{\alpha \beta }\bigg (\tau _1\mu _k \frac{M(r)}{k\,r}+ \tau _3\frac{\mathcal {U}(r)}{k\,r}\bigg ),\\ \sum _{k=1}^{\lceil r/2 \rceil } S_{3,k}^{nsq}&\le \frac{2\,c\,r^2\delta _{{\varvec{G}}}}{q} \sum _{k=1}^{\lceil r/2 \rceil }(k \log q+ \log m_k)M(m_k) \log \Big (\frac{m_k}{k}\Big ). \end{aligned}$$

We first estimate the sum

$$\begin{aligned} S_1:=\sum _{k=1}^{\lceil r/2 \rceil } \frac{1}{\alpha \beta }\bigg (\tau _1\mu _k \frac{M(r)}{k\,r}+ \tau _3\frac{\mathcal {U}(r)}{k\,r}\bigg ). \end{aligned}$$

Recall that \(\mu _k:=\lfloor \log (\frac{q^k-1}{2})\rfloor +\nu (\frac{q^k-1}{2})-1\), \(\alpha :={1}/{2}-{1}/(2 q^k)\) and \(\beta :={1}/{2}+{1}/(2 q^k)\). It is easy to see that

$$\begin{aligned} \frac{1}{\alpha \beta }\le \frac{4q^2}{q^2-1}\le \frac{16}{3},\quad \mu _k \le 2\, k \log q. \end{aligned}$$

As a consequence,

$$\begin{aligned} S_1&\le \frac{64 \tau _1}{3}\,\frac{M(r) {\lceil r/2 \rceil } \log q}{r} + \frac{32 \tau _3}{3} \frac{\mathcal {U}(r)}{r}\sum _{k=1}^{\lceil r/2 \rceil } \frac{1}{k}\\&\le M(r) \log q\bigg (\frac{64 \tau _1}{3}+\frac{32 \tau _3}{3} \frac{H(\lceil r/2 \rceil )\log r}{r} \bigg ), \end{aligned}$$

where \(H(\lceil r/2 \rceil )\) is the \(\lceil r/2 \rceil \)th harmonic number. Since \(H(N)\le 1+\ln N\) (see, e.g., [29, §6.3]), we deduce that if \(r \ge 2\), then \({H(\lceil r/2 \rceil )\log r}/{r} \le 1\). We conclude that

$$\begin{aligned} S_1\le M(r) \log q\bigg (\frac{64 \tau _1}{3}+\frac{32 \tau _3}{3}\bigg ). \end{aligned}$$
(6.19)
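The three elementary inequalities behind (6.19) are easy to test numerically. The Python sketch below checks \(1/(\alpha \beta )\le 16/3\), \(\mu _k\le 2\,k\log q\) and \(H(\lceil r/2\rceil )\log r/r\le 1\) on a range of parameters. It assumes, as an interpretation of ours, that \(\log \) denotes the base-2 logarithm and that \(\nu \) counts the ones in a binary expansion; the function names are illustrative.

```python
from fractions import Fraction
from math import log2


def inv_alpha_beta(q, k):
    """1 / (alpha * beta) = 4 q^(2k) / (q^(2k) - 1) for alpha, beta = 1/2 -+ 1/(2 q^k)."""
    return Fraction(4 * q ** (2 * k), q ** (2 * k) - 1)


def mu_k(q, k):
    """floor(log2 e) + popcount(e) - 1 with e = (q^k - 1) / 2, for q odd."""
    e = (q**k - 1) // 2
    return (e.bit_length() - 1) + bin(e).count("1") - 1


def harmonic_term(r):
    """H(ceil(r/2)) * log2(r) / r, where H is the harmonic number."""
    half = -(-r // 2)  # ceil(r / 2) in integer arithmetic
    h = sum(1.0 / i for i in range(1, half + 1))
    return h * log2(r) / r


for q in (3, 5, 7, 9, 11):
    for k in (1, 2, 3):
        print(q, k, float(inv_alpha_beta(q, k)), mu_k(q, k), 2 * k * log2(q))
print(max(harmonic_term(r) for r in range(2, 200)))
```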

We now estimate the sum

$$\begin{aligned} S_2:=\sum _{k=1}^{\lceil r/2 \rceil }(k \log q+ \log m_k)M(m_k) \log \Big (\frac{m_k}{k}\Big ). \end{aligned}$$

We have the following inequalities:

$$\begin{aligned}&\sum _{k=1}^{\lceil r/2 \rceil }k\, M(m_k)\log \Big (\frac{m_k}{k}\Big )\le M(r)\sum _{k=1}^{\lceil r/2 \rceil }m_k \frac{\log \big (\frac{m_k}{k}\big )}{\frac{m_k}{k}}\le M(r)\sum _{k=1}^{\lceil r/2 \rceil } m_k \le r\, M(r),\\&\sum _{k=1}^{\lceil r/2 \rceil } M(m_k)\log (m_k) \log \Big (\frac{m_k}{k}\Big ) \le M(r)\sum _{k=1}^{\lceil r/2 \rceil } \log ^2(m_k) \le M(r)\sum _{k=1}^{\lceil r/2 \rceil } m_k \le r M(r). \end{aligned}$$

Hence, we deduce that

$$\begin{aligned} S_2&\le 2\, r M(r) \log q. \end{aligned}$$
(6.20)

From (6.19) and (6.20), we obtain the following upper bound for \(E[\mathcal {X}_3]\):

$$\begin{aligned} E[\mathcal {X}_3] \le M(r)\log q \bigg (\bigg (1+\frac{15\delta _{{\varvec{G}}}^{{13}/{6}}}{q^{{1}/{2}}}\bigg )\bigg (1+\frac{M_{r}}{q}\bigg )\bigg (\frac{64 \tau _1}{3}+\frac{32 \tau _3}{3}\bigg ) +\frac{4\,c\,r^3\delta _{{\varvec{G}}}}{q}\bigg ). \end{aligned}$$

Defining \( \tau : =\max \{\frac{64 \tau _1}{3}+\frac{32 \tau _3}{3},4\,c\}\), the statement of the theorem follows. \(\square \)

We remark that, for fields of even characteristic, a similar analysis can be carried out, yielding a bound for \(E[\mathcal {X}_3]\) as in Theorem 6.8 (compare with [18, Section 5.4]).

In [18, Theorem 9], using the classical multiplication of polynomials, it is shown that the EDF algorithm requires on average \( \mathcal {O} (r ^ 2 \log q)\) arithmetic operations in \(\mathbb {F}_{q}\) on the set of elements of \(\mathbb {F}_{q}[T]\) of degree at most r. Theorem 6.8 proves that, using fast multiplication, the EDF algorithm performs on average \(r\, \log q \) arithmetic operations in \( \mathbb {F}_{q}\) on \(\mathcal {A}\), up to logarithmic terms and terms which tend to zero as q tends to infinity (for fixed \(\delta _{{\varvec{G}}}\) and r).

Our analysis improves the worst-case analysis of [50, Theorem 14.11], where it is proved that the EDF algorithm applied to a polynomial of degree at most r having j irreducible factors of degree k requires \( \mathcal {O} ((k \log q+\log r) M(r) \log j)\) arithmetic operations in \( \mathbb {F}_{q}\), that is, \(\mathcal {O}^\sim (k\,r \log q )\) arithmetic operations in \(\mathbb {F}_{q}\).

6.4 Average-case analysis of the classical algorithm

Now we are able to conclude the analysis of the average cost of the factorization algorithm applied to elements of \( \mathcal {A} \). For this purpose, it remains to analyze the behavior of the classical factorization algorithm when the first three steps fail to find the complete factorization of the input polynomial, namely the expected value \(E[\mathcal {X}_4]\) of the random variable \(\mathcal {X}_4\) which counts the number of arithmetic operations in \( \mathbb {F}_{q}\) that the algorithm performs to factorize \(f/{\mathrm {ERF}}(f) \), when f runs over all elements of \( \mathcal {A} \). We can rewrite \( E [\mathcal {X} _4] \) as follows:

$$\begin{aligned} E[\mathcal {X}_4]=\frac{1}{|\mathcal {A}|}\sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_4(f) + \frac{1}{|\mathcal {A}|}\sum _{f \in \mathcal {A}^{nsq}} \mathcal {X}_4(f)=:S_4^{sq}+S_4^{nsq}. \end{aligned}$$

We estimate the first sum \(S_4^{sq}\). If \(f \in \mathcal {A}^{sq}\), then \(f/{\mathrm {ERF}}(f)=1\) and the algorithm does not perform any further operation. Hence, the cost of this step is that of dividing two polynomials of degree at most r, namely \(\tau _1 M(r)\) arithmetic operations in \(\mathbb {F}_{q}\). Thus,

$$\begin{aligned} S_4^{sq} :=\frac{1}{|\mathcal {A}|}\sum _{f \in \mathcal {A}^{sq}} \mathcal {X}_4(f)\le \tau _1 M(r). \end{aligned}$$
(6.21)

Now we estimate the second sum \(S_4^{nsq}\). For this purpose, we decompose the set \(\mathcal {A}^{nsq}\) into the disjoint union of the set \(\mathcal {A}_{=2}^{nsq}\) of elements having all the irreducible factors of multiplicity at most 2, and \(\mathcal {A}_{\ge 2}^{nsq}:=\mathcal {A}^{nsq}{\setminus } \mathcal {A}_{=2}^{nsq}\). If \(f \in \mathcal {A}_{=2}^{nsq}\), then f is of the form \(f=\prod _i f_i \prod _j f_j^2\), and we have \(f/{\mathrm {ERF}}(f)=\prod _j f_j\). Consequently, in this case only the first three steps of the algorithm are executed, and the worst-case analysis of the classical algorithm of [50, Theorem 14.14] shows that \(\mathcal {X}_4(f) \le c_3\, r\, M(r)\log (rq)\), where \(c_3\) is a constant independent of q and r. On the other hand, if \(f \in \mathcal {A}_{\ge 2}^{nsq}\), then the four steps of the algorithm are executed. Observe that the last step is executed as many times as the highest multiplicity arising in the irreducible factors of \(f/{\mathrm {ERF}}(f)\). Thus, the worst-case analysis of [50, Theorem 14.14] implies that \(\mathcal {X}_4(f) \le c_4\, r^2 M(r) \log (rq)\), where \(c_4\) is a constant independent of q and r. It follows that

$$\begin{aligned} S_4^{nsq}&\le c_3\, r\, M(r) \log (rq) \frac{|\mathcal {A}_{=2}^{nsq}|}{|\mathcal {A}|}+ c_4\, r^2 M(r) \log (rq)\frac{|\mathcal {A}_{\ge 2}^{nsq}|}{|\mathcal {A}|} \end{aligned}$$
(6.22)

Since \(\mathcal {A}_{=2}^{nsq}\) is a subset of \(\mathcal {A}^{nsq}\), from (4.12) we have that

$$\begin{aligned} |\mathcal {A}_{=2}^{nsq}|\le r(r-1)\delta _{{\varvec{G}}}q^{r-m-1} \le r^2\delta _{{\varvec{G}}} q^{r-m-1}. \end{aligned}$$
(6.23)

On the other hand, if \(f \in \mathcal {A}_{\ge 2}^{nsq}\), then \(\deg (\gcd (f,f')) \ge 2\). We deduce that \({\mathrm {Res}}(f,f')={\mathrm {Subres}}(f,f')=0\). Hence, \(\mathcal {A}_{\ge 2}^{nsq}\) is a subset of \(\mathcal {S}_1(W)\), where \(W\subset \mathbb {A}^r\) is the affine variety defined by \(G_1, \ldots , G_m\) and \(\mathcal {S}_1(W)\) is the first subdiscriminant locus of W. We deduce that

$$\begin{aligned} |\mathcal {A}_{\ge 2}^{nsq}| \le r(r-1)^2(r-2)\delta _{{\varvec{G}}} q^{r-m-2}\le r^4\delta _{{\varvec{G}}} q^{r-m-2}. \end{aligned}$$
(6.24)

Further, if \(q >15\delta _{{\varvec{G}}}^{13/3}\), then Theorem 3.12 implies \(|\mathcal {A}|\ge \frac{1}{2} q^{r-m}\). Substituting (6.23) and (6.24) into (6.22), we obtain

$$\begin{aligned} S_4^{nsq}&\le 2\,c_3 M(r) \log (rq) \frac{r^3\delta _{{\varvec{G}}}}{q}+ 2\,c_4 M(r) \log (rq) \frac{r^6\delta _{{\varvec{G}}}}{q^2}. \end{aligned}$$
(6.25)

Combining (6.21) and (6.25), we obtain the following result.

Theorem 6.9

Let \(q > 15\delta _{{\varvec{G}}}^{13/3}\). The average cost \(E[\mathcal {X}_4]\) of the fourth step of the classical factorization algorithm on \(\mathcal {A}\) is bounded in the following way:

$$\begin{aligned} E[\mathcal {X}_4] \le \tau _1 M(r)+ \frac{c\, r^6\delta _{{\varvec{G}}} M(r) \log (rq)}{q}=\tau _1 M(r)(1+o(1)), \end{aligned}$$

where c is a constant independent of q and r.

Theorem 6.9 shows that the average cost of the last step of the classical factorization algorithm applied to elements of \(\mathcal {A} \) is \(\tau _1\, M(r)(1+o(1))\) arithmetic operations in \( \mathbb {F}_{q}\), which asymptotically coincides with the cost of dividing two polynomials of degree at most r.