In this section we formalize one of the important applications of the LLL algorithm: polynomial-time algorithms for polynomial factorization. In Sect. 9.1 we first describe the key idea of how the LLL algorithm helps to factor integer polynomials, following the textbook [29, Chapters 16.4–16.5]. Section 9.2 presents the formalization of some necessary results. In combination with our previous work [7], this is sufficient for obtaining a polynomial-time algorithm to factor arbitrary integer polynomials, whose formalization is presented in Sect. 9.3. When attempting to directly verify the factorization algorithm in the above-mentioned textbook (Algorithm 16.22 in [29]), it turned out that the original algorithm has a flaw that causes it to return incorrect results on certain inputs. The details and a corrected version are provided in Sect. 9.4.
Short Vectors for Polynomial Factorization
The common structure of a modern factorization algorithm for square-free primitive polynomials in \(\mathbb {Z}[x]\) is as follows:
1. A prime p and exponent l are chosen depending on the input polynomial f.
2. A factorization of f over \(\mathbb {Z}_p[x]\) is computed.
3. Hensel lifting is performed to lift the factorization to \(\mathbb {Z}_{p^l}[x]\).
4. The factorization \(f=\prod _i f_i \in \mathbb {Z}[x]\) is reconstructed where each \(f_i\) corresponds to the product of one or more factors of f in \(\mathbb {Z}_{p^l}[x]\).
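The steps above can be made concrete on a toy example. The following Python sketch (our own illustration, not the formalized algorithm) factors \(f = x^2 + x - 2\): it brute-forces the linear factors modulo \(p = 5\) in place of a real modular factorization algorithm, skips Hensel lifting since \(l = 1\) already suffices for the small coefficients involved, and reconstructs the integer factors from symmetric representatives.

```python
# Toy run of the four-step structure on f = x^2 + x - 2 = (x - 1)(x + 2).
# Polynomials are ascending coefficient lists: [c0, c1, ...].

def eval_mod(f, x, p):
    """Evaluate f at x modulo p using Horner's scheme."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % p
    return acc

def roots_mod(f, p):
    """Step 2, brute force: roots of f mod p, i.e. its linear factors x - r."""
    return [r for r in range(p) if eval_mod(f, r, p) == 0]

def synth_div(f, s):
    """Synthetic division of f by (x - s) over Z: (quotient, remainder)."""
    vs, v = [], 0
    for c in reversed(f):
        v = v * s + c
        vs.append(v)
    return vs[-2::-1], vs[-1]

f = [-2, 1, 1]                       # x^2 + x - 2
p = 5                                # step 1: example prime; l = 1 here
factors, g = [], list(f)
for r in roots_mod(f, p):            # step 2 (step 3, Hensel, is skipped)
    s = r if r <= p // 2 else r - p  # step 4: symmetric representative
    q, rem = synth_div(g, s)
    if rem == 0:                     # candidate is a true factor over Z
        factors.append([-s, 1])
        g = q
print(factors)  # -> [[-1, 1], [2, 1]], i.e. the factors x - 1 and x + 2
```

Note that in general step 4 must consider products of several modular factors; in this tiny example every modular factor already corresponds to an integer factor.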
In a previous work [7], we formalized the Berlekamp–Zassenhaus algorithm, which follows the structure presented above; there, step 4 runs in exponential time. The use of the LLL algorithm allows us to derive a polynomial-time algorithm for the reconstruction phase. In order to reconstruct the factors in \(\mathbb {Z}[x]\) of a polynomial f, by steps 1–3 we compute a modular factorization of f into several monic factors \(u_i\): \(f \equiv \mathrm {lc}(f) \cdot \prod _i u_i\) modulo m, where \(m = p^l\) is the prime power given in step 1.
The intuition for why lattices and short vectors can be used to factor polynomials is as follows. We want to determine a non-trivial factor h of f which shares a common modular factor u, i.e., both h and f are divisible by u modulo \(p^l\). This means that h belongs to a certain lattice. The condition that h is a factor of f over \(\mathbb {Z}\) means that the coefficients of h are relatively small. So, we must look for a small element (a short vector) in that lattice, which can be done by means of the LLL algorithm. This allows us to determine h.
More concretely, the key is the following lemma.
Lemma 1
([29, Lemma 16.20]) Let f, g, u be non-constant integer polynomials. Let u be monic. If u divides f modulo m, u divides g modulo m, and \(|\!|f|\!|^{\deg g} \cdot |\!|g|\!|^{\deg f} < m\), then \(\gcd (f,g)\) is non-constant.
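Although the formal proof proceeds via resultants, the statement of Lemma 1 can be sanity-checked numerically. In the sketch below (the helper names are ours), f and g share the monic factor \(u = x - 1\) over \(\mathbb {Z}\), hence also modulo any m; since both degrees are 2, the hypothesis evaluates exactly to \(|\!|f|\!|^{\deg g} \cdot |\!|g|\!|^{\deg f} = 6 \cdot 42 = 252 < m\) for \(m = 257\), and a Fraction-based Euclidean algorithm confirms that the gcd is non-constant.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a modulo b over Q; coefficient lists are ascending."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        if a[-1] == 0:
            a.pop()
            continue
        coef = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i in range(len(b)):
            a[shift + i] -= coef * b[i]
        a.pop()
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_gcd(a, b):
    """Euclidean algorithm over Q; the result is unique up to a scalar."""
    while b:
        a, b = b, poly_rem(a, b)
    return a

f = [-2, 1, 1]   # x^2 + x - 2 = (x - 1)(x + 2), squared norm 6
g = [-5, 4, 1]   # x^2 + 4x - 5 = (x - 1)(x + 5), squared norm 42
m = 257          # ||f||^(deg g) * ||g||^(deg f) = 6 * 42 = 252 < m
d = poly_gcd(f, g)   # proportional to x - 1, hence non-constant
```

Over \(\mathbb {Q}\) the gcd is only determined up to a scalar, which is irrelevant for the non-constancy claim of the lemma.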
Let f be a polynomial of degree n. Let u be any degree-d factor of f modulo m. Now assume that f is reducible, so that \(f = f_1 \cdot f_2\), where w.l.o.g. we may assume that u divides \(f_1\) modulo m and that \(0< \deg f_1 < n\). Let \(L_{u,k}\) be the lattice of all polynomials of degree below \(d+k\) which are divisible by u modulo m. As \(\deg f_1 < n\), clearly \(f_1 \in L_{u, n - d}\).
In order to instantiate Lemma 1, it now suffices to take g as the polynomial corresponding to any short vector in \(L_{u,n-d}\): u divides g modulo m by definition of \(L_{u,n-d}\), and moreover \(\deg g < n\). The short vector requirement provides an upper bound (5) to satisfy the assumption \(|\!|f_1|\!|^{\deg g} \cdot |\!|g|\!|^{\deg f_1} < m\) of Lemma 1.
$$\begin{aligned}&|\!|g|\!| \le 2^{(n - 1)/2} \cdot |\!|f_1|\!| \le 2^{(n - 1)/2} \cdot 2^{n-1} |\!|f|\!| = 2^{3(n-1)/2} |\!|f|\!| \end{aligned}$$
(5)
The first inequality in (5) is the short vector approximation (\(f_1 \in L_{u,n-d}\)). The second inequality in (5) is Mignotte’s factor bound (\(f_1\) is a factor of f). Mignotte’s factor bound and (5) are used in (6) as approximations of \(|\!|f_1|\!|\) and \(|\!|g|\!|\), respectively.
$$\begin{aligned}&|\!|f_1|\!|^{\deg g} \cdot |\!|g|\!|^{\deg f_1} \le \left( 2^{n-1} |\!|f|\!|\right) ^{n-1} \cdot \left( 2^{3(n-1)/2} |\!|f|\!|\right) ^{n-1} = |\!|f|\!|^{2(n-1)} \cdot 2^{5(n - 1)^2/2} \end{aligned}$$
(6)
Hence, if l is chosen such that \(m = p^l > |\!|f|\!|^{2(n-1)} \cdot 2^{5(n - 1)^2/2}\), then all preconditions of Lemma 1 are satisfied, and \(\gcd (f_1,g)\) is a non-constant factor of f. Since \(f_1\) divides f, also \(h = \gcd (f,g)\) is a non-constant factor of f. Moreover, the degree of h is strictly less than n, and so h is a proper factor of f.
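The choice of l can be sketched computationally. To remain in exact integer arithmetic, the comparison is made on squares, using that \(m > |\!|f|\!|^{2(n-1)} \cdot 2^{5(n-1)^2/2}\) holds iff \(m^2 > (|\!|f|\!|^2)^{2(n-1)} \cdot 2^{5(n-1)^2}\), where \(|\!|f|\!|^2\) is an integer; the function name below is ours.

```python
def choose_exponent(f, p):
    """Smallest l with p^l > ||f||^(2(n-1)) * 2^(5(n-1)^2 / 2),
    where f is given as an ascending coefficient list of degree n.
    Squares are compared so all arithmetic is exact over the integers."""
    n = len(f) - 1
    norm_sq = sum(c * c for c in f)                      # ||f||^2
    bound_sq = norm_sq ** (2 * (n - 1)) * 2 ** (5 * (n - 1) ** 2)
    l = 1
    while (p ** l) ** 2 <= bound_sq:
        l += 1
    return l

l = choose_exponent([-2, 1, 1], 5)   # f = x^2 + x - 2, so n = 2
```

For this f the squared bound is \(6^2 \cdot 2^5 = 1152\), so \(l = 3\) and \(m = 125\), comfortably above \(\sqrt{1152} \approx 34\).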
Formalization of the Key Results
Here we present the formalization of two items that are essential for relating lattices and factors of polynomials: Lemma 1 and the lattice \(L_{u,k}\).
To prove Lemma 1, we partially follow the textbook, although we do the final reasoning by means of some properties of resultants which were already proved in the previous development of algebraic numbers [16]. We also formalize Hadamard’s inequality, which states that for any square matrix A having rows \(v_i\) we have \(|{\det A}| \le \prod _i |\!|v_i|\!|\). Essentially, the proof of Lemma 1 consists of showing that the resultant of f and g is 0, and then deducing that \(\gcd (f,g)\) is non-constant. We omit the detailed proof; a formalized version can be found in the sources.
To define the lattice \(L_{u,k}\) for a degree-d polynomial u and integer k, we give a basis \(v_0,\dots ,v_{k+d-1}\) of the lattice \(L_{u,k}\) such that each \(v_i\) is the \((k+d)\)-dimensional vector corresponding to the polynomial \(u(x) \cdot x^{k-1-i}\) if \(i<k\), and to the monomial \(m\cdot x^{k+d-1-i}\) if \(k \le i < k+d\).
We define the basis in Isabelle/HOL as follows:
Here,
denotes the list of natural numbers descending from \(a-1\) to b (with \(a>b\)),
denotes the monomial \(ax^b\), and
is a function that transforms a polynomial p into a vector of dimension n with coefficients in reverse order, padding with zeroes if necessary. We use it to identify an integer polynomial f of degree \(< n\) with its coefficient vector in \({\mathbb {Z}}^{n}\). We also define its inverse operation, which transforms a vector into a polynomial, as
.
To visualize the definition, for \(u(x) = \sum _{i=0}^d u_i x^i\) we have
and
is precisely the basis \((f_0,f_1,f_2)\) of Example 1.
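The basis can also be written down directly. The Python sketch below is our own illustration, not the Isabelle definition: the name `lattice_basis` is ours, and the row indexing is chosen so that rows appear in order of decreasing degree, matching the upper triangular shape of the matrix S discussed next.

```python
def lattice_basis(u, k, m):
    """Rows v_0, ..., v_{k+d-1}: reversed coefficient vectors of
    u(x) * x^(k-1-i) for i < k and of m * x^(k+d-1-i) for k <= i < k+d.
    u is an ascending coefficient list of a monic degree-d polynomial."""
    d = len(u) - 1
    n = k + d
    rows = []
    for i in range(k):
        asc = [0] * (k - 1 - i) + list(u)        # u(x) * x^(k-1-i)
        rows.append((asc + [0] * (n - len(asc)))[::-1])
    for i in range(k, n):
        asc = [0] * (n - 1 - i) + [m]            # m * x^(k+d-1-i)
        rows.append((asc + [0] * (n - len(asc)))[::-1])
    return rows

B = lattice_basis([1, 3, 1], 2, 9)   # u = x^2 + 3x + 1, k = 2, m = 9
```

For this example the resulting matrix is upper triangular with diagonal \((1, 1, 9, 9)\), i.e. the leading coefficient of u repeated k times followed by m repeated d times.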
There are some important facts that we must prove about
.
-
ukm is a list of linearly independent vectors, as required for applying the LLL algorithm in order to find a short vector in \(L_{u,k}\).
-
\(L_{u,k}\) characterizes the polynomials which have u as a factor modulo m:
That is, any polynomial that satisfies the right-hand side can be transformed into a vector that can be expressed as an integer linear combination of the vectors of
. Similarly, any vector in the lattice \(L_{u,k}\) can be expressed as an integer linear combination of
and corresponds to a polynomial of degree less than \(k+d\) that is divisible by u modulo m.
The first property is a consequence of the obvious fact that the matrix S in (7) is upper triangular, and that its diagonal entries are non-zero if both u and m are non-zero. Thus, the vectors in
ukm are linearly independent.
Next, we look at the second property. For one direction, we see the matrix S as (a generalization of) the Sylvester matrix of the polynomial u and constant polynomial m. Then we generalize an existing formalization about Sylvester matrices as follows:
We instantiate
by the constant polynomial m. So for every \(c \in \mathbb {Z}^{k+d}\) we get
for some polynomials r and s. As every \(g \in L_{u,k}\) is represented as \(S^{{\mathsf {T}}} c\) for some integer coefficient vector \(c \in \mathbb {Z}^{k+d}\), we conclude that every \(g \in L_{u,k}\) is divisible by u modulo m. The other direction requires the use of division with remainder by the monic polynomial u. Although we closely follow the textbook, the actual formalization of this reasoning requires some more tedious work, namely the connection between the matrix-times-vector multiplication of Matrix.thy (denoted by
in the formalization) and linear combinations (
) of HOL-Algebra.
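The first direction of the characterization can also be observed computationally: reading any integer combination of the basis vectors back as a polynomial and dividing by the monic u (division with integer coefficients is possible precisely because u is monic) leaves a remainder all of whose coefficients are multiples of m. A small sketch with our own helper names:

```python
def rem_by_monic(g, u):
    """Remainder of g after division by the monic polynomial u over Z;
    coefficient lists are ascending."""
    g = list(g)
    d = len(u) - 1
    while len(g) > d:
        c = g[-1]                      # exact since u is monic
        if c != 0:
            shift = len(g) - len(u)
            for i in range(len(u)):
                g[shift + i] -= c * u[i]
        g.pop()
    return g

u, k, m = [1, 3, 1], 2, 9              # monic u = x^2 + 3x + 1, d = 2
# Basis polynomials of L_{u,k} as ascending lists of length k + d:
basis = [[0] * e + u + [0] * (k - 1 - e) for e in range(k)]           # u * x^e
basis += [[0] * e + [m] + [0] * (k + len(u) - 2 - e)
          for e in range(len(u) - 1)]                                 # m * x^e
c = [2, -1, 4, 5]                      # an arbitrary integer combination
g = [sum(ci * row[j] for ci, row in zip(c, basis))
     for j in range(k + len(u) - 1)]
r = rem_by_monic(g, u)                 # every coefficient divisible by m
```

Here the remainder is \([36, 45]\), and indeed both coefficients vanish modulo \(m = 9\), so u divides g modulo m.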
A Verified Factorization Algorithm
Once the key results, namely Lemma 1 and properties about the lattice \(L_{u,k}\), are proved, we implement an algorithm for the reconstruction of factors within a context that fixes
and
. The simplified definition looks as follows.
is a recursive function which receives two parameters: the polynomial
that has to be factored, and the list
of modular factors of the polynomial
.
computes a short vector (and transforms it into a polynomial) in the lattice generated by a basis for \(L_{u,k}\) and suitable k, that is,
. We collect the elements of
that divide
modulo
into the list
, and the rest into
.
returns the list of irreducible factors of
. Termination follows from the fact that the degree decreases, that is, in each step the degree of both
and
is strictly less than the degree of
.
In order to formally verify the correctness of the reconstruction algorithm for a polynomial
we use the following invariants for each invocation of
, where
is an intermediate non-constant factor of
. Here some properties are formulated solely via
, so they are trivially invariant, and then corresponding properties are derived locally for
by using that
is a factor of
.
1.
divides 
2.
is the unique modular factorization of
modulo 
3.
and
are coprime, and
is square-free in 
4.
is sufficiently large:
where 
Concerning complexity, it is easy to see that if a polynomial splits into i factors, then
invokes the short vector computation \(i + (i-1)\) times: \(i-1\) invocations are used to split the polynomial into the i irreducible factors, and for each of these factors one invocation is required to finally detect irreducibility.
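The count \(i + (i-1)\) can be checked on a toy model of the recursion (our own sketch): each call performs one short vector computation and either certifies irreducibility (the i leaf calls) or splits the factors into two non-empty parts (the \(i-1\) internal calls); how the splits are balanced does not affect the total.

```python
def invocations(i):
    """Short-vector computations needed to fully split a polynomial with
    i irreducible factors, in a model where each call performs one
    computation and either certifies irreducibility (i == 1) or splits
    the factor multiset into two non-empty parts."""
    if i == 1:
        return 1
    left = i // 2
    return 1 + invocations(left) + invocations(i - left)
```

By induction the total is \(1 + (2a-1) + (2b-1) = 2(a+b) - 1\) for any split \(i = a + b\), i.e. \(2i - 1 = i + (i-1)\).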
Finally, we combine the new reconstruction algorithm with existing results presented in the Berlekamp–Zassenhaus development to get a polynomial-time factorization algorithm for square-free and primitive polynomials.
We further combine this algorithm with a pre-processing algorithm also from our earlier work [7]. This pre-processing splits a polynomial f into \(c \cdot f_1^1 \cdot \ldots \cdot f_k^k\) where c is the content of f which is not further factored (see Sect. 2). Each \(f_i\) is primitive and square-free, and will then be passed to
. The combined algorithm factors an arbitrary univariate integer polynomial into its content and a list of irreducible polynomials.
The Berlekamp–Zassenhaus algorithm has worst-case exponential complexity, exhibited, e.g., on Swinnerton-Dyer polynomials. Still, it is a practical algorithm: it has polynomial average complexity [5], and this average complexity is smaller than the complexity of the LLL-based algorithm, cf. [29, Ch. 15 and 16]. Therefore, it is no surprise that our verified Berlekamp–Zassenhaus algorithm [7] significantly outperforms the verified LLL-based factorization algorithm on random polynomials: it factors, within one minute, polynomials that the LLL-based algorithm fails to factor within any reasonable amount of time.
The Factorization Algorithm in the Textbook Modern Computer Algebra
In the previous section we chose the lattice \(L_{u,k}\) for \(k = n-d\) in order to find a polynomial h that is a proper factor of f. This has the disadvantage that h is not necessarily irreducible. By contrast, Algorithm 16.22 from the textbook tries to directly find irreducible factors by iteratively searching for factors w.r.t. the lattices \(L_{u,k}\), increasing k from 1 up to \(n - d\).
The max-norm of a polynomial \(f(x) = \sum _{i=0}^n c_i x^i\) is defined to be \(|\!|f|\!|_\infty = \max \{|c_0|,\dots ,|c_n|\}\), the 1-norm is \(|\!|f|\!|_1=\sum _{i=0}^{n} |c_i|\) and
is the primitive part of f, i.e., the quotient of the polynomial f by its content.
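These notions are straightforward to compute; the following sketch (our own helpers; the content is taken as the gcd of the absolute coefficient values, ignoring sign conventions) illustrates them on \(f = 6x^2 + 3x + 9\):

```python
from math import gcd
from functools import reduce

def max_norm(f):
    """||f||_inf: largest absolute coefficient (f ascending, non-zero)."""
    return max(abs(c) for c in f)

def one_norm(f):
    """||f||_1: sum of absolute coefficients."""
    return sum(abs(c) for c in f)

def primitive_part(f):
    """Quotient of a non-zero f by its content, the gcd of its
    coefficients; the sign convention is ignored in this sketch."""
    c = reduce(gcd, (abs(x) for x in f))
    return [x // c for x in f]

f = [9, 3, 6]        # 6x^2 + 3x + 9, ascending coefficients
```

Here \(|\!|f|\!|_\infty = 9\), \(|\!|f|\!|_1 = 18\), the content is 3, and the primitive part is \(2x^2 + x + 3\).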
Let us note that Algorithm 16.22 also follows the common structure of a modern factorization algorithm; indeed, the reconstruction phase corresponds to steps 5–13. Once again, the idea behind this reconstruction phase is to find irreducible factors via Lemma 1 and short vectors in the lattice \(L_{u,k}\). However, this part of the algorithm (concretely, the inner loop at step 8) can return erroneous results, and some modifications are required to make it sound.
The textbook proposes the following invariants for the reconstruction phase:
-
\(f^{{\text {*}}} \equiv b \prod _{i\in T} u_i \pmod {p^l}\),
-
,
-
\(f=\pm f^{{\text {*}}} \prod _{g \in G} g\), and
-
each polynomial in G is irreducible.
While the arguments given in the textbook and the provided invariants all look reasonable, the attempt to formalize them in Isabelle runs into obstacles when one tries to prove that the content of the polynomial g in step 9 is not divisible by the chosen prime p. In fact, this is not necessarily true.
The first problem occurs if the content of g is divisible by p. Consider \(f_1 = x^{12}+x^{10}+x^8+x^5+x^4+1\) and \(f_2 = x\). When trying to factor \(f = f_1 \cdot f_2\), the prime \(p = 2\) is chosen, and in step 9 the short vector computation is invoked for a modular factor u of degree 9 where \(L_{u,4}\) contains \(f_1\). Since \(f_1\) itself is a shortest vector, \(g = p \cdot f_1\) is also a short vector: the approximation quality permits any vector of \(L_{u,4}\) of norm at most
. For this valid choice of g, the result of Algorithm 16.22 will be the non-factorization \(f = f_1 \cdot 1\).
The authors of the textbook agreed that this problem can occur. The flaw itself is easily fixed by modifying step 10 to
A potential second problem revealed by our formalization work is that if g is divisible not only by p but also by \(p^l\), Algorithm 16.22 will still return a wrong result (even with step 10 modified). Therefore, we modify the condition in step 12 of the factorization algorithm and additionally demand
, and then prove that the resulting algorithm is sound. Unlike the first problem, we did not establish whether or not this second problem can actually occur.
Regarding the implementation, apart from the modifications required to make Algorithm 16.22 sound, we also integrate some changes and optimizations:
-
We improve the bound B at step 1 with respect to the one used in the textbook.
-
We test a necessary criterion for whether a factor of degree \(d+k\) is possible, before performing any short vector computations in step 9. This is done by computing all possible degrees of products of the modular factors \(\prod _{i \in I}u_i\).
-
We dynamically adjust the modulus to compute short vectors in smaller lattices: Directly before step 9 we compute a new bound \(B'\) and a new exponent \(l'\) depending on the current polynomial \(f^{\text {*}}\) and the degree \(d+k\), instead of using the ones computed in steps 1–2, which depend on the input polynomial f and its degree n. This means that the new exponent \(l'\) can be smaller than l (otherwise, we follow the computations with l), and the short vector computation of step 9 will perform operations in a lattice with smaller values.
-
We check divisibility instead of norm-inequality in step 12. To be more precise, we test
instead of the condition in step 12. If this new condition holds, then \(h^{\text {*}}\) is not computed as in step 11, but directly as the result of dividing f by
.
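The degree pre-test in the second optimization above amounts to a subset-sum computation over the degrees of the modular factors: a short vector computation for a factor of degree \(d+k\) is worthwhile only if \(d+k\) occurs among the realizable degrees. A sketch (function name ours):

```python
def possible_degrees(degrees):
    """Degrees realizable as products of subsets of the modular factors,
    computed by a subset-sum dynamic program over the factor degrees."""
    sums = {0}
    for d in degrees:
        sums |= {s + d for s in sums}
    return sums
```

For instance, modular factors of degrees 2, 2 and 3 can only combine to factors of degrees 0, 2, 3, 4, 5 or 7, so a search for a degree-6 factor can be skipped entirely.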
The interested reader can explore the implementation and the soundness proof of the modified algorithm in the file Factorization_Algorithm_16_22.thy of our AFP entry [9]. The file Modern_Computer_Algebra_Problem.thy in the same entry shows some examples of erroneous outputs of the textbook algorithm. A pseudo-code version of the fixed algorithm is detailed in the appendix as Algorithm 4.