A deterministic algorithm for finding $r$-power divisors

Building on work of Boneh, Durfee and Howgrave-Graham, we present a deterministic algorithm that provably finds all integers $p$ such that $p^r \mathrel| N$ in time $O(N^{1/4r+\epsilon})$ for any $\epsilon>0$. For example, the algorithm can be used to test squarefreeness of $N$ in time $O(N^{1/8+\epsilon})$; previously, the best rigorous bound for this problem was $O(N^{1/6+\epsilon})$, achieved via the Pollard--Strassen method.

1. Introduction

1.1. Statement of main result. Let $r$ be a positive integer. In this paper we study the problem of finding all $r$-power divisors of a given positive integer $N$, i.e., all positive integers $p$ such that $p^r \mid N$. Throughout the paper we write $\lg x := \log_2 x$, and unless otherwise specified, the "running time" of an algorithm refers to the number of bit operations it performs, or more formally, the number of steps executed by a deterministic multitape Turing machine [Pap94]. We always assume the use of fast (quasilinear time) algorithms for basic integer arithmetic, i.e., for multiplication, division and GCD (see for example [vzGG13] or [BZ11]).
Our main result is the following theorem.
Theorem 1.1. There is an explicit deterministic algorithm with the following properties. It takes as input an integer $N \ge 2$ and a positive integer $r \le \lg N$. Its output is a list of all positive integers $p$ such that $p^r \mid N$. Its running time is $O(N^{1/4r+\epsilon})$.

Note that whenever we write $\epsilon$ in a complexity bound, we mean that the bound holds for all $\epsilon > 0$, where the implied big-O constant may depend on $\epsilon$.
The integers $p$ referred to in Theorem 1.1 need not be prime. Of course, if $p$ is a composite integer found by the algorithm, then the algorithm will incidentally determine the complete factorisation of $p$, as the prime divisors $\ell$ of $p$ must also satisfy $\ell^r \mid N$.
The hypothesis $r \le \lg N$ does not really limit the applicability of the theorem: if $r > \lg N$ then the problem is trivial, as the only possible $r$-power divisor is 1.
Theorem 1.1 is intended primarily as a theoretical result. For fixed $r$ the complexity is $O(N^{1/4r+\epsilon})$, which is fully exponential in $\lg N$, so the algorithm cannot compete asymptotically with subexponential factoring algorithms such as the elliptic curve method (ECM) or the number field sieve (NFS). Furthermore, experiments confirm that for small $r$ the algorithm is grossly impractical compared to general-purpose factoring routines implemented in modern computer algebra systems.

The first author was supported by the Australian Research Council (grant FT160100219). SBA Research (SBA-K1) is a COMET Centre within the framework of COMET (Competence Centers for Excellent Technologies Programme) and funded by BMK, BMDW, and the federal state of Vienna. The COMET Programme is managed by FFG.
1.2. Previous work. At the core of our algorithm is a generalisation of Coppersmith's method [Cop97] introduced by Boneh, Durfee and Howgrave-Graham [BDHG99]. We refer to the latter as the BDHG algorithm. Coppersmith's seminal work showed how to use lattice methods to quickly find all divisors of $N$ in certain surprisingly large intervals. To completely factor $N$, one simply applies the method to a sequence of intervals that covers all possible divisors up to $N^{1/2}$. Each interval is searched in polynomial time, so the overall complexity is governed by the number of such intervals, which turns out to be $O(N^{1/4+\epsilon})$. The BDHG algorithm adapts Coppersmith's method to the case of $r$-power divisors. The relationship between our algorithm and the BDHG algorithm is discussed in Section 1.3 below.
We emphasise that, unlike factoring algorithms such as ECM or NFS, whose favourable running time analyses depend on heuristic assumptions, the complexity bound in Theorem 1.1 is rigorously analysed and fully deterministic. Under these restrictions, for $r \ge 2$ it is asymptotically superior to all previously known complexity bounds for the problem of finding $r$-power divisors.
Its closest competitors are the algorithms of Strassen [Str77] and Pollard [Pol74]. These algorithms can be used to find all divisors of $N$ less than a given bound $B$ in time $O(B^{1/2+\epsilon})$. If $p^r \mid N$, say $N = p^r q$, then either $p \le N^{1/(r+1)}$ or $q \le N^{1/(r+1)}$, so the Pollard-Strassen method can be used to find $p$ or $q$, and hence both, in time $O(N^{1/2(r+1)+\epsilon})$. For example, taking $r = 2$, these algorithms can find all square divisors of $N$ in time $O(N^{1/6+\epsilon})$, whereas our algorithm finds all square divisors in time $O(N^{1/8+\epsilon})$.
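To make the size argument above concrete, here is a small Python sketch (the function name is hypothetical): if $N = p^r q$, then $\min(p, q) \le N^{1/(r+1)}$, so a search for divisors up to that bound uncovers either $p$ directly or $q$, and in the latter case $p^r = N/q$. Plain trial division stands in here for the Pollard-Strassen product-tree method, which would achieve the $O(B^{1/2+\epsilon})$ bound; floating-point root extraction restricts this sketch to modest $N$.

```python
# Sketch of the divisor-size observation: if N = p^r * q, then
# min(p, q) <= N^(1/(r+1)), so searching for divisors up to that
# bound finds p or q.  Trial division stands in for Pollard--Strassen.

def find_r_power_divisors_naive(N, r):
    """Return all p > 1 with p**r dividing N (hypothetical helper, small N only)."""
    results = []
    bound = int(round(N ** (1.0 / (r + 1)))) + 1
    for d in range(2, bound + 1):
        # Case p <= N^(1/(r+1)): d itself may be p.
        if N % d**r == 0:
            results.append(d)
        # Case q <= N^(1/(r+1)): d may be q, exposing p^r = N // d.
        if N % d == 0:
            m = N // d
            p = round(m ** (1.0 / r))
            for cand in (p - 1, p, p + 1):  # guard against float rounding
                if cand > 1 and cand**r == m and N % cand**r == 0:
                    results.append(cand)
    return sorted(set(results))
```

For example, $N = 180 = 2^2 \cdot 3^2 \cdot 5$ has square divisors $2^2$, $3^2$ and $6^2$.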
There is one special case in which the Pollard-Strassen approach still wins. If one knows in advance that $p$ is relatively small, say $p < N^c$ for some $c \in (0, 1/2r)$, then the Pollard-Strassen method has complexity $O(N^{c/2+\epsilon})$, which is better than the bound in Theorem 1.1. Our algorithm can also take advantage of the information that $p < N^c$, but unfortunately this yields only a constant-factor speedup.
Another point of difference is the space complexity. The space required by the algorithm in Theorem 1.1 is only polynomial in lg N (we will not give the details of this analysis), whereas for the Pollard-Strassen method the space complexity is the same as the time complexity, up to logarithmic factors.
In connection with the case $r = 2$, two other works are worth mentioning. Booker, Hiary and Keating [BHK15] describe a subexponential time algorithm that can sometimes prove that a given integer $N$ is squarefree, with little or no knowledge of its factorisation. This algorithm is not fully rigorous, as its analysis depends on (among other things) the Generalised Riemann Hypothesis. Peralta and Okamoto [PO96] present a speedup of the ECM method for integers of the form $N = p^2 q$. Again this result is not fully rigorous, because it depends on standard conjectures concerning the distribution of smooth numbers in short intervals, just as in Lenstra's original ECM algorithm.
The case $r = 1$ corresponds to the ordinary factoring problem, and in this case our algorithm is essentially equivalent to Coppersmith's method. As mentioned above, the complexity is $O(N^{1/4+\epsilon})$, which does not improve on known results; currently, the fastest known deterministic factoring method has complexity $O(N^{1/5+\epsilon})$ [HH21]. (It is interesting to ask whether the ideas behind [HH21] can be used to improve Theorem 1.1 when $r \ge 2$. Our inquiries in this direction have been so far unsuccessful.) In fact, when $r = 1$, Theorem 1.1 gives the more precise complexity bound $O(N^{1/4} (\lg N)^{10+\epsilon})$. It is apparently well known that Coppersmith's method has complexity $O(N^{1/4} (\lg N)^C)$ for some constant $C > 0$, but to the best of our knowledge, this is the first time in the literature that a particular value of $C$ has been specified. On the other hand, we have not tried particularly hard to optimise the value of $C$, and it is likely that it can be improved. (One possible improvement is outlined in Remark 3.6.)

1.3. Relationship to the BDHG algorithm. The authors of [BDHG99] were mainly interested in cryptographic applications, and this led them to focus on the case that $N = p^r q$ where $p$ and $q$ are roughly the same size. In this setting, they show that their algorithm is faster than ECM when $r \approx (\log p)^{1/2}$, and that it even runs in polynomial time when $r$ is as large as $\log p$.
In this paper we take a different point of view: our goal is to determine the worst-case complexity, without any assumptions on the size of p, q or r.
To illustrate what difference this makes, consider again the case $r = 2$. This case is mentioned briefly in Section 6 of [BDHG99]. The authors point out that if $N = p^2 q$, where $p$ and $q$ are known to be about the same size, i.e., both $p$ and $q$ are within a constant factor of $N^{1/3}$, then the running time of their method is $O(N^{1/9+\epsilon})$, i.e., the number of search intervals is $O(N^{1/9+\epsilon})$. However, in our more general setup, this is not the worst case. Rather, the worst case running time is $O(N^{1/8+\epsilon})$, which occurs when searching for $p \sim N^{1/4}$ and $q \sim N^{1/2}$. More generally, for $r \ge 1$ the worst case running time of $O(N^{1/4r+\epsilon})$ stated in Theorem 1.1 occurs when $p \sim N^{1/2r}$ and $q \sim N^{1/2}$. By contrast, in the "balanced" situation considered in [BDHG99], where $p, q \sim N^{1/(r+1)}$, one can show that the running time is only $O(N^{1/(r+1)^2+\epsilon})$ (see Remark 3.5, and take $\theta = r/(r+1)$).
Although the core of our algorithm is essentially the same as the BDHG algorithm, our more general perspective requires us to make a few changes to their presentation. For instance, we cannot take the lattice dimension to be $d \approx r^2$ (as is done in the main theorem of [BDHG99]), because this choice is suboptimal when $r$ is small and fixed. Additional analysis is required to deal with potentially small values of $p$ and $q$, and in general we must take more care than [BDHG99] in estimating certain quantities throughout the argument. For these reasons, we decided to give a self-contained presentation, not relying on the results in [BDHG99].
1.4. Root-finding. An important component of our algorithm, and of all algorithms pursuing Coppersmith's strategy, is a subroutine for finding all integer roots of a polynomial with integer coefficients. This problem has received extensive attention in the literature, but we were unable to locate a clear statement of a deterministic complexity bound suitable for our purposes. For completeness, in Appendix A we give a detailed proof of the following result. Note that this complexity bound is much stronger than what is needed for the application in this paper. However, it is still not quasilinear in the size of the input, which is O(nb). For further discussion, see Remarks A.8 and A.10.

2. Searching one interval
In this section we recall the strategy of [BDHG99] for finding all integers $p$ in a prescribed interval $P - H \le p \le P + H$ such that $p^r \mid N$, provided that $H$ is not too large. We will prove the following theorem.
Theorem 2.1. There is an explicit deterministic algorithm with the following properties. It takes as input positive integers $N$, $r$, $m$, $d$, $P$ and $H$ satisfying the hypotheses (2.1)-(2.4) below.

A key tool needed in the proof of Theorem 2.1 is the LLL algorithm, in the form of the following lemma (Lemma 2.2): given a basis of a lattice $L \subseteq \mathbb{Z}^d$, we may find a nonzero vector $w \in L$ such that $\|w\| \le 2^{(d-1)/4} (\det L)^{1/d}$. (Here $\|\cdot\|$ denotes the standard Euclidean norm on $\mathbb{R}^d$.)

Proof. We take $w$ to be the first vector in a reduced basis for $L$ computed by the LLL algorithm [LLL82].
Then the computation runs within the main time bound (2.5), where we have used the hypotheses (2.3) and (2.2). For the case $rm \le i < d$, a similar argument shows that every coefficient of $f_i(y) = (P + Hy)^i$ is bounded above by $2^d N^{d/r}$. Therefore every $v_i$ has coordinates bounded by $2^d N^{d/r+1}$, and we conclude that $\|v_i\| \le B$ for all $i$. Next we calculate the determinant of the lattice $L := \operatorname{span}_{\mathbb{Z}}(v_0, \ldots, v_{d-1})$, or equivalently, the determinant of the $d \times d$ integer matrix whose rows are given by the $v_i$. Since $\deg f_i(y) = i$, this is a lower triangular matrix whose diagonal entries are given by the leading coefficients of the $f_i(y)$; the determinant is the product of these leading coefficients. Invoking Lemma 2.2, we may compute a nonzero vector $w \in L$ of correspondingly small norm within the time bound (2.5). Note that this time bound certainly dominates the cost of computing the vectors $v_i$ themselves, as the $f_i(y)$ may be computed by starting with $f_0(y) = N^m$, and then successively multiplying by $P + Hy$ and occasionally dividing by $N$. The hypotheses (2.1) and (2.2) yield a further simplification of this bound. The vector $w$ corresponds to a nonzero polynomial $h(y) = h_0 + \cdots + h_{d-1} y^{d-1}$ in the $\mathbb{Z}$-span of the $f_i(y)$. Applying the Cauchy-Schwarz inequality to the vectors $w = (h_0, \ldots, h_{d-1})$ and $(1, \ldots, 1)$ yields $|h_0| + \cdots + |h_{d-1}| \le \sqrt{d}\,\|w\|$. Next we show that any $r$-power divisor that is sufficiently close to $P$ corresponds to a root of $h(x)$.
On the other hand, the assumption $-H \le x_0 \le H$ together with (2.6) and (2.4) implies that $|h(x_0)| < p^{rm}$. Since $h(x_0)$ is divisible by $p^{rm}$, this forces $h(x_0) = 0$.
We may now complete the proof of the main theorem of this section.
Proof of Theorem 2.1. We first invoke Proposition 2.3, with inputs N , r, m, d, P and H, to find a polynomial h(x) satisfying (2.6). According to Proposition 2.4, we may then construct a list of candidates for p by finding all integer roots of h(x), which we do via Theorem 1.2.
To estimate the complexity of the root-finding step, recall from the proof of Proposition 2.4 that $|h_0| + \cdots + |h_{d-1}| H^{d-1} < (P - H)^{rm}$, so certainly $|h_j| < (P - H)^{rm}$ for all $j$, and we obtain $|h_j| < N^{d/r}$. Therefore in Theorem 1.2 we may take $n := d$ and $b := \lceil \lg(N^{d/r}) \rceil = \lceil \frac{d}{r} \lg N \rceil$. Note that the hypothesis $b \ge n$ is satisfied due to (2.1). The root-finding complexity is thus negligible compared to the main bound (2.5). Finally, we must check each candidate for $p$ to ensure that $p^r \mid N$, which again requires negligible time.

3. Proof of the main theorem
We now consider the problem of searching for all integers $p$ such that $p^r \mid N$ in an interval, say $T \le p \le T'$, that is too large to be handled by a single application of Theorem 2.1. Given $N$, $r$, $T$ and $T'$, our strategy will be to choose parameters $d$, $m$ and $H$, and then apply Theorem 2.1 to a sequence of subintervals of the form $P - H \le p \le P + H$ that cover the target interval $T \le p \le T'$. The overall running time will depend mainly on the number of subintervals, so our goal is to make $H$ as large as possible. On the other hand, to ensure that the hypothesis (2.4) of Theorem 2.1 is satisfied, we also require that $H < \tilde H$, where $\tilde H$ is defined in (3.1). The key issue is therefore to choose $d$ and $m$ to maximise $\tilde H$. For large $d$ and $m$, the magnitude of $\tilde H$ depends more or less on the ratio $m/d$; in fact, one finds that $\tilde H$ is maximised when $m/d \approx \lg T / \lg N$. The following result gives a simple formula for $m$ (as a function of $d$) that is close to the optimal choice, and a corresponding explicit lower bound for $\tilde H$.
Lemma 3.1. Let $N$, $r$, $d$ and $T$ be positive integers such that $d \ge 2$ and $T \le N^{1/r}$. Let $m$ be defined as in (3.2), and let $\tilde H$ be defined as in (3.1). Then $\tilde H > \frac{1}{6} N^{\theta^2/r}$, where $\theta$ is defined as in (3.3).

Proof. It is easy to check that $d^{1/(d-1)} 2^{1/2} < 3$ for all $d \ge 2$, so we obtain a corresponding lower bound for $\tilde H$. Continuing to estimate the exponent in this inequality, we obtain the stated bound, where the last step follows from the inequalities $0 \le \delta < \frac{1}{d-1}$ and $0 \le \theta \le 1$.

We may now estimate the time required to search a given interval $T \le p \le T'$.
Proposition 3.2. There is an explicit deterministic algorithm with the following properties. It takes as input positive integers $N$, $r$, $T$ and $T'$ such that (2.1) holds (i.e., $r \le \lg N$) and such that $4 \le T \le T' \le N^{1/r}$. Its output is a list of all integers $p$ in the interval $T \le p \le T'$ such that $p^r \mid N$. Its running time is $O(N^{\theta(1-\theta)/r+\epsilon})$, where $\theta$ is defined as in (3.3).
Proof. Set $d := \lceil \lg N \rceil + 1$ and define $m$ as in (3.2). Equivalently, $m$ is the largest integer such that $N^m \le T^{d-1}$. Note that $d \ge 2$ (since $N \ge 2$) and $m \ge \lfloor \lg T \rfloor \ge 2$ (since $T \ge 4$). Since $\lg(T^{d-1}) \le d \lg T \ll (\lg N)^2$, we may clearly compute $d$ and $m$ in time $O((\lg N)^{2+\epsilon})$. Also, the assumption $T \le N^{1/r}$ implies that $m \le (d-1)/r \le d/r$, so (2.2) holds.
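As a sanity check on these parameter choices, here is a short Python sketch (the helper name is hypothetical; it assumes $N \ge 2$ and $T \ge 2$) computing $d = \lceil \lg N \rceil + 1$ and the largest $m$ with $N^m \le T^{d-1}$:

```python
# Parameter choices from the proof above: d = ceil(lg N) + 1, and m the
# largest integer with N^m <= T^(d-1) (the formula (3.2)).

def choose_params(N, T):
    """Hypothetical helper: return (d, m) for inputs N >= 2, T >= 2."""
    d = (N - 1).bit_length() + 1      # (N-1).bit_length() == ceil(lg N) for N >= 2
    m = 1                             # N^1 <= T^(d-1) holds since T >= 2
    while N ** (m + 1) <= T ** (d - 1):
        m += 1
    return d, m
```

For example, with $N = 2^{10}$ and $T = N^{1/2} = 32$ one gets $d = 11$ and $m = 5$, consistent with $m/d \approx \lg T / \lg N = 1/2$.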
Let $H$ be the largest integer less than $\tilde H$, i.e., $H := \lceil \tilde H \rceil - 1$. Then $2 \le H < \tilde H$, and moreover, since $\tilde H > 2$, we also have $H \ge \tilde H / 2 > \frac{1}{12} N^{\theta^2/r}$. We may compute $H$ by first approximating the $d(d-1)$-th root of the rational number $\tilde H^{d(d-1)}$.

Remark 3.3. A slightly better choice for $d$ is to take $d \approx \theta \lg N$, but this complicates the analysis and only improves the main result by a constant factor.
Finally we may prove the main theorem. Recall that we are given as input positive integers $N \ge 2$ and $r \le \lg N$, and we wish to find all positive integers $p$ such that $p^r \mid N$. Such divisors $p$ must clearly lie in $[1, N^{1/r}]$.
We first check all $p = 2, 3, \ldots, 2^k$ by brute force, i.e., testing directly whether $p^r \mid N$. Note that $k$ may certainly be computed in time $O((\lg N)^{1+\epsilon})$. To estimate the cost of checking up to $2^k$, observe that $k \le 2\sqrt{(\lg N)/r + 1} + 1$.
Let $C > 0$ be an absolute constant such that $2\sqrt{x+1} + 1 \le x/4 + C$ for all $x \ge 1$; it follows that $k \le (\lg N)/4r + C$, and hence that $2^k \ll N^{1/4r}$. The cost of checking up to $2^k$ is therefore $O(N^{1/4r} (\lg N)^{1+\epsilon})$, which is negligible compared to (1.1).
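Such a constant $C$ can be pinned down numerically. The gap $2\sqrt{x+1} + 1 - x/4$ has derivative $1/\sqrt{x+1} - 1/4$, which vanishes at $x = 15$, where the gap equals $5.25$; so $C = 5.25$ suffices. The following sketch (an illustration only, not part of the algorithm) checks this on a grid:

```python
# Numerical sanity check for the constant C: we need
# 2*sqrt(x + 1) + 1 <= x/4 + C for all x >= 1.  The gap is maximised at
# x = 15 (where 1/sqrt(x+1) = 1/4), giving 2*sqrt(16) + 1 - 15/4 = 5.25.
import math

def gap(x):
    return 2 * math.sqrt(x + 1) + 1 - x / 4

# grid on [1, 201); beyond that, the x/4 term dominates
worst = max(gap(1 + i / 1000) for i in range(200000))
assert worst <= 5.25 + 1e-9
```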
We now apply Proposition 3.2 to the intervals $[2^k, 2^{k+1}]$, $[2^{k+1}, 2^{k+2}]$, and so on until we reach $N^{1/r}$, taking the last interval to be $[2^j, \lfloor N^{1/r} \rfloor]$ for suitable $j$. By Proposition 3.2, the cost of searching each interval is at most $O(N^{1/4r+\epsilon})$. Finally, the number of intervals is at most $\lceil \lg(N^{1/r}) \rceil = O(\frac{1}{r} \lg N)$.

Remark 3.4. The use of dyadic intervals in the above proof was only for convenience; the same argument would work with intervals $[B^j, B^{j+1}]$ for any fixed $B > 1$.
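The covering strategy in the proof can be sketched in a few lines of Python. Here `search_subinterval` is a brute-force placeholder for the lattice step of Theorem 2.1, and the half-width $H$ is taken as a fixed parameter rather than recomputed per dyadic block via Lemma 3.1:

```python
# Sketch of the interval-covering loop: split the search range [T, Tp]
# into dyadic blocks [j, 2j - 1], and cover each block with subintervals
# [P - H, P + H], each searched by the (here brute-force) subroutine.

def search_subinterval(N, r, P, H):
    """Placeholder for Theorem 2.1: all p in [P-H, P+H] with p**r | N."""
    return [p for p in range(max(2, P - H), P + H + 1) if N % p**r == 0]

def covering_centers(lo, hi, H):
    """Centers P of consecutive subintervals [P-H, P+H] covering [lo, hi]."""
    P = lo + H
    while P - H <= hi:
        yield P
        P += 2 * H + 1

def search_range(N, r, T, Tp, H):
    found = set()
    j = T
    while j <= Tp:
        block_hi = min(2 * j - 1, Tp)          # dyadic block [j, 2j - 1]
        for P in covering_centers(j, block_hi, H):
            found.update(search_subinterval(N, r, P, H))
        j *= 2
    return sorted(found)
```

The overall cost is governed by the number of centers yielded, which is what the analysis above bounds.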
Remark 3.5. The expression $N^{\theta(1-\theta)/r}$ achieves its maximum value $N^{1/4r}$ at the point $\theta = 1/2$. This justifies the claim made in the introduction that the factors $p^r$ that are "hardest" to find are those for which $p \sim N^{1/2r}$.

Remark 3.6. A more careful analysis, taking into account the fact that $N^{\theta(1-\theta)/r}$ is much smaller than $N^{1/4r}$ for most values of $\theta \in [0, 1]$, shows that the bound (1.1) can be improved by a factor of $O((\frac{1}{r} \lg N)^{1/2})$. Let us briefly explain this calculation. The main contribution to the cost estimate in the above proof is the number of subintervals, i.e., the sum of the values of $N^{\theta(1-\theta)/r}$ over the various dyadic intervals. It can be shown that this sum is essentially a Riemann sum approximating the integral $\frac{\lg N}{r} \int_0^1 N^{\theta(1-\theta)/r} \, d\theta$. The argument in the proof of Theorem 1.1 amounted to estimating this integral via the trivial bound $\int_0^1 N^{\theta(1-\theta)/r} \, d\theta \le \int_0^1 N^{1/4r} \, d\theta = N^{1/4r}$. A better estimate is obtained by recognising the integrand as a truncated Gaussian function.

Appendix A. Root-finding

In this section we prove Theorem 1.2. Our root-finding procedure consists of two parts. In the first part, we discuss how to deterministically find all integer roots of a squarefree polynomial $f \in \mathbb{Z}[x]$. We mainly follow the approach of Loos [Loo83], but we obtain better complexity bounds by employing faster algorithms for the underlying arithmetic. In the second part, we explain how to reduce the general case to the squarefree case. The reduction depends on computing GCDs in $\mathbb{Z}[x]$; for this purpose we present a rigorous, deterministic variant of the "heuristic GCD" algorithm of Char, Geddes and Gonnet [CGG84].
A.1. Some preliminary estimates. For $f, g \in \mathbb{Z}[x]$, let $\operatorname{res}(f, g) \in \mathbb{Z}$ denote the resultant of $f$ and $g$.

Proof. Let $\vartheta(X) := \sum_{p \le X} \log p$ denote the usual Chebyshev weighted prime counting function. The claim is that $\vartheta(X)/X > \frac{1}{3} \log 2 \approx 0.231$ for all $X \ge 2$. For $X \ge 101$ this follows from [RS62, Thm. 10], which states that $\vartheta(X)/X > 0.84$ for $X \ge 101$. For $2 \le X < 101$ the claim may be checked directly, for example by inspecting the graph of $\vartheta(X)/X$ in the reader's favourite computer algebra system.
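The direct check for small $X$ can be automated rather than read off a graph. A minimal Python sketch (checking integer values of $X$; between integers, $\vartheta(X)/X$ only approaches the bound in the limit at the next prime):

```python
# Direct verification of theta(X)/X > (1/3) * log(2) at integer X in [2, 100].
import math

def primes_upto(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, ok in enumerate(sieve) if ok]

def theta(X):
    """Chebyshev function: sum of log p over primes p <= X."""
    return sum(math.log(p) for p in primes_upto(int(X)))

assert all(theta(X) / X > math.log(2) / 3 for X in range(2, 101))
```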
A.2. The squarefree case. The core idea of Loos' algorithm is the following well-known $p$-adic Hensel lifting strategy.
Then there exists a unique $v \in \{0, \ldots, p^k - 1\}$ such that $v \equiv u \pmod{p}$ and $f(v) \equiv 0 \pmod{p^k}$. Given $f$ and $u$ as input, we may compute $v$ in time $O(n \lg(p^k)^{1+\epsilon})$.

Proof. We argue by induction on $k$. If $k = 1$, we simply take $v = u$. Now assume that $k \ge 2$ and set $\ell := \lceil k/2 \rceil < k$. By induction there exists a unique $w \in \{0, \ldots, p^\ell - 1\}$ such that $w \equiv u \pmod{p}$ and $f(w) \equiv 0 \pmod{p^\ell}$.
We first establish uniqueness of $v$. Suppose that $v$ has the desired properties, i.e., $v \equiv u \pmod{p}$ and $f(v) \equiv 0 \pmod{p^k}$. By the uniqueness of $w$, we must have $v \equiv w \pmod{p^\ell}$, say $v = w + p^\ell t$ for some $t \in \{0, \ldots, p^{k-\ell} - 1\}$. Expanding $f$ around $w$, we find that $f(w + x) \equiv f(w) + x f'(w) + x^2 g(x) \pmod{p^k}$ for some $g \in (\mathbb{Z}/p^k\mathbb{Z})[x]$. Substituting $x = p^\ell t$, and using the fact that $p^{2\ell} \equiv 0 \pmod{p^k}$, we deduce that $0 \equiv f(w) + p^\ell t f'(w) \pmod{p^k}$. Since $f'(w) \equiv f'(u) \not\equiv 0 \pmod{p}$, we may solve for $t$ to obtain $t \equiv -\frac{f(w)}{p^\ell} f'(w)^{-1} \pmod{p^{k-\ell}}$. This establishes uniqueness of $t \pmod{p^{k-\ell}}$, and hence of $v \pmod{p^k}$. Moreover, the same calculation gives an explicit formula for $v$, proving existence. To prove the complexity bound, suppose that we have already computed $w$ and that we wish to lift to $v$. We first apply Horner's rule to compute $f(w)$ and $f'(w)$ using $O(n)$ arithmetic operations in $\mathbb{Z}/p^k\mathbb{Z}$. Each such operation requires time $O(\lg(p^k)^{1+\epsilon})$. Similarly, we may invert $f'(w)$, and hence compute $t$ and $v$, in time $O(\lg(p^k)^{1+\epsilon})$. Therefore, the time required to deduce $v$ from $w$ is $O(n \lg(p^k)^{1+\epsilon})$. The contributions from subsequent recursion levels form a geometric series, so the total cost of computing $v$ from $u$ is also $O(n \lg(p^k)^{1+\epsilon})$.
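The quadratic lifting step in the proof can be sketched directly in Python (polynomials as coefficient lists, low degree first; requires Python 3.8+ for `pow(x, -1, m)`):

```python
# Minimal sketch of the Hensel-lifting step of Proposition A.4: given a
# simple root u of f mod p (p does not divide f'(u)), lift it to the
# unique root v mod p^k with v = u (mod p), doubling the precision at
# each recursion level.

def horner(f, x, mod):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % mod
    return acc

def deriv(f):
    return [i * c for i, c in enumerate(f)][1:]

def hensel_lift(f, u, p, k):
    """Unique root v of f mod p^k with v = u (mod p); requires p not dividing f'(u)."""
    if k == 1:
        return u % p
    l = (k + 1) // 2                          # l = ceil(k/2)
    w = hensel_lift(f, u, p, l)               # root modulo p^l, by induction
    fw = horner(f, w, p**k)                   # divisible by p^l
    inv = pow(horner(deriv(f), w, p**(k - l)), -1, p**(k - l))
    t = (-(fw // p**l) * inv) % p**(k - l)    # t = -(f(w)/p^l) * f'(w)^(-1)
    return w + p**l * t
```

For example, lifting the root $3$ of $x^2 - 2$ modulo $7$ gives the square root of $2$ modulo $7^4 = 2401$.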
The next result shows how to find a reasonably small prime p for which the p-adic lifting strategy is guaranteed to succeed. Proof. Since f is squarefree, the resultant D := res(f, f ′ ) is nonzero. Our goal is to find a prime p such that p ∤ D.
First, by Lemma A.1 we have $|D| \le (n+1)^{(n-1)/2} n^{n/2} \|f\|_\infty^{n-1} \|f'\|_\infty^n$. One easily checks that $(n+1)^{n-1} \le n^n$ for all $n \ge 1$. Since $\|f'\|_\infty \le n \|f\|_\infty$, we obtain
(A.1) $\lg |D| \le 2nb + 2n \lg n$.
On the other hand, let $X := 6bn + 6n \lg n$. If $D$ is divisible by all primes $p \le X$, then $D$ is divisible by their product, so $\lg |D| \ge \sum_{p \le X} \lg p > X/3 = 2nb + 2n \lg n$ by Lemma A.3. This contradicts (A.1), so we conclude that there must exist a prime $p \le X$ such that $p \nmid D$. To actually find such a prime, we run the following algorithm.
Step 2 (reduce $f$ modulo primes). Let $f_p \in (\mathbb{Z}/p\mathbb{Z})[x]$ denote the reduction of $f$ modulo $p$. We compute $f_p$ for all $p \le Y$ by applying a fast simultaneous modular reduction algorithm [vzGG13, Thm. 10.24] (i.e., using a remainder tree) to each coefficient of $f$. The bit size of the product of the primes is $O(\vartheta(Y)) = O(Y) = O(nb)$, and the number of primes is certainly $O(nb)$, so the cost per coefficient is $O(n^{1+\epsilon} b^{1+\epsilon})$. The total cost over all coefficients is therefore $O(n^{2+\epsilon} b^{1+\epsilon})$.
Step 3 (compute GCDs). Finally, we return the least prime $p$ for which $f_p \ne 0$ and $\gcd(f_p, f_p') = 1$. As shown above, such a prime exists and satisfies $p \le X$.
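The prime-selection procedure can be sketched as follows in Python. For simplicity this version reduces modulo one prime at a time and uses a textbook Euclidean GCD in $(\mathbb{Z}/p\mathbb{Z})[x]$, rather than the remainder tree of Step 2:

```python
# Sketch of Proposition A.5: return the least prime p with f mod p
# nonzero and gcd(f_p, f_p') = 1.  Polynomials are coefficient lists,
# low degree first.

def trim(a, p):
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, b, p):
    """Remainder of a modulo nonzero b in (Z/pZ)[x]."""
    a, b = trim(a, p), trim(b, p)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        q = (a[-1] * inv) % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - q * c) % p
        a = trim(a, p)
        if not a:
            break
    return a

def is_squarefree_mod(f, p):
    """True if f mod p is nonzero and gcd(f, f') = 1 in (Z/pZ)[x]."""
    a = trim(f, p)
    if not a:
        return False
    b = trim([i * c for i, c in enumerate(f)][1:], p)
    while b:
        a, b = b, poly_rem(a, b, p)
    return len(a) == 1                 # gcd is a nonzero constant

def good_prime(f):
    p = 2
    while not is_squarefree_mod(f, p):
        p += 1                          # advance to next prime (trial division)
        while any(p % d == 0 for d in range(2, int(p**0.5) + 1)):
            p += 1
    return p
```

For example, $f = (x-1)(x-2)(x-3)$ has $\operatorname{res}(f, f') = \pm 4$, so $p = 2$ fails and $p = 3$ is returned.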
We now give a deterministic root-finding algorithm for the squarefree case.
Proposition A.6. Let $b \ge n \ge 1$ be integers, and let $f \in \mathbb{Z}[x]$ be a squarefree polynomial of degree $n$ such that $\|f\|_\infty \le 2^b$. Then we may find all integer roots of $f$ in time $O(n^{2+\epsilon} b^{1+\epsilon})$.
Proof. As above, let f p ∈ (Z/pZ)[x] denote the reduction of f modulo p. We first invoke Proposition A.5 to find a prime p = O(nb) such that f p is nonzero and squarefree. Then we perform the following steps.
Step 1 (find roots mod $p$). Compute the roots of $f_p$ in $\mathbb{Z}/p\mathbb{Z}$ by brute force, i.e., by evaluating $f_p(i)$ for $i = 0, \ldots, p-1$. Note that the integer roots of $f$ correspond to distinct roots of $f_p$, thanks to the squarefreeness of $f_p$. Each $f_p(i)$ may be evaluated in time $O(n^{1+\epsilon} b^\epsilon)$, so the cost of this step is $O(p n^{1+\epsilon} b^\epsilon) = O(n^{2+\epsilon} b^{1+\epsilon})$.
Step 2 (find roots mod $p^k$). Let $\bar f \in (\mathbb{Z}/p^k\mathbb{Z})[x]$ be the reduction of $f$ modulo $p^k$, where $k$ is chosen to be the smallest integer such that
(A.2) $p^k > (n+1)^{1/2} 2^{n+b+1}$.
Applying Proposition A.4 to $\bar f$, we lift each of the roots of $f_p$ found in Step 1 to a root of $\bar f$. The uniqueness claim in Proposition A.4 implies that the resulting set of lifted roots in $\mathbb{Z}/p^k\mathbb{Z}$ includes the reductions modulo $p^k$ of all of the actual integer roots of $f$. To estimate the complexity, observe that $p^k \le p (n+1)^{1/2} 2^{n+b+1}$, so $\lg(p^k) = O(\lg p + \lg(n+1) + n + b) = O(b)$. The cost of lifting each root is therefore $O(n \lg(p^k)^{1+\epsilon}) = O(n b^{1+\epsilon})$, and the total cost of this step is $O(n^2 b^{1+\epsilon})$.
Step 3 (check roots in $\mathbb{Z}$). For each root $\bar r \in \mathbb{Z}/p^k\mathbb{Z}$ of $\bar f$ found in Step 2, we determine whether it arises from a genuine integer root of $f$ as follows. We first lift $\bar r$ to a candidate root $r^* \in \mathbb{Z}$ satisfying $r^* \equiv \bar r \pmod{p^k}$ and $r^* \in [-\frac{1}{2} p^k, \frac{1}{2} p^k)$. We next divide $\bar f$ by $x - \bar r$ to obtain a polynomial $\bar g \in (\mathbb{Z}/p^k\mathbb{Z})[x]$ such that $\bar f(x) = (x - \bar r) \bar g(x)$, and we lift $\bar g$ to a polynomial $g^* \in \mathbb{Z}[x]$ satisfying $g^* \equiv \bar g \pmod{p^k}$ and whose coefficients also all lie in $[-\frac{1}{2} p^k, \frac{1}{2} p^k)$. We then multiply $x - r^*$ by $g^*(x)$ (in $\mathbb{Z}[x]$) and check whether we obtain $f$. If so, then $f(r^*) = 0$, so $r^*$ must be the integer root corresponding to $\bar r$. Otherwise, as we will see in the next paragraph, this $\bar r$ does not correspond to any integer root and we may ignore it. This procedure requires $O(n)$ operations on integers of $O(b)$ bits, i.e., $O(n b^{1+\epsilon})$ bit operations, so the total cost over all roots is $O(n^2 b^{1+\epsilon})$.
We now prove that the procedure described above does in fact find all integer roots. (The following argument is adapted from [vzGG13, §15.6].) Let $r \in \mathbb{Z}$ be a root of $f$. Then $|r| \le \|f\|_\infty \le 2^b$ (as $r$ divides the constant term of $f$), and $f$ factors as $f(x) = (x - r) g(x)$ for some $g \in \mathbb{Z}[x]$ satisfying $\|g\|_\infty \le (n+1)^{1/2} 2^{n+b}$ (by Lemma A.2). In particular, (A.2) ensures that $|r| < p^k/2$ and $\|g\|_\infty < p^k/2$. Let $\bar r \in \mathbb{Z}/p^k\mathbb{Z}$ be the root of $\bar f$ corresponding to $r$, and let $r^* \in \mathbb{Z}$ and $g^* \in \mathbb{Z}[x]$ be the quantities computed in Step 3 for this $\bar r$. Then $r^* \equiv \bar r \equiv r \pmod{p^k}$, so we must have $r^* = r$, since both sides lie in $[-\frac{1}{2} p^k, \frac{1}{2} p^k)$. Similarly, we have $g^*(x) \equiv \bar g(x) = \bar f(x)/(x - \bar r) \equiv f(x)/(x - r) = g(x) \pmod{p^k}$, so again we must have $g^* = g$ as the coefficients on both sides lie in $[-\frac{1}{2} p^k, \frac{1}{2} p^k)$. Therefore $(x - r^*) g^*(x) = f(x)$, and the procedure does indeed recover $r$.
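An end-to-end sketch of Proposition A.6 in Python, under a few simplifications: the prime is chosen by the weaker test "every root of $f$ mod $p$ is simple" (which suffices for lifting), lifting is done by repeated linear Hensel steps, and the exact-division check of Step 3 is replaced by direct evaluation of $f$ at each centered candidate, which is simpler but asymptotically slower. Requires Python 3.8+ for `pow(x, -1, p)`.

```python
def integer_roots_squarefree(f):
    """All integer roots of a squarefree f in Z[x] (coefficients low degree first)."""
    fprime = [i * c for i, c in enumerate(f)][1:]

    def ev(g, x, m=None):
        acc = 0
        for c in reversed(g):
            acc = acc * x + c
            if m:
                acc %= m
        return acc

    # choose a prime p modulo which f is nonzero and every root of f is simple
    p = 2
    while True:
        roots_p = [u for u in range(p) if ev(f, u, p) == 0]
        if any(c % p for c in f) and all(ev(fprime, u, p) for u in roots_p):
            break
        p += 1
        while any(p % d == 0 for d in range(2, int(p**0.5) + 1)):
            p += 1

    B = 1 + max(abs(c) for c in f)      # crude root bound: |r| <= B
    k = 1
    while p**k <= 2 * B:                # ensure unique centered representatives
        k += 1

    roots = []
    for u in roots_p:
        v, m = u, p
        for _ in range(k - 1):          # linear Hensel steps: mod p^j -> mod p^(j+1)
            m *= p
            s = ev(f, v, m) // (m // p)
            t = (-s * pow(ev(fprime, v, p), -1, p)) % p
            v += (m // p) * t
        r = v if v <= m // 2 else v - m  # centered representative mod p^k
        if ev(f, r) == 0:                # direct check replaces Step 3's exact division
            roots.append(r)
    return sorted(roots)
```

For example, $f = (x-3)(x+5)(x-11) = x^3 - 9x^2 - 37x + 165$ yields the roots $-5$, $3$, $11$.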
Remark A.7. Loos [Loo83] imposes the additional requirement that p should not divide the leading coefficient of f , to ensure that deg f p = deg f . This is because he is searching for rational roots, not just integral roots. Our algorithm may also be easily adapted to this case.
Remark A.8. An interesting question is whether the complexity bound in Proposition A.6 can be improved to quasilinear, i.e., to $O(n^{1+\epsilon} b^{1+\epsilon})$ bit operations. There are two main obstructions to this.
First, although $f_p \in (\mathbb{Z}/p\mathbb{Z})[x]$ is squarefree for almost all primes $p$, it is difficult to predict in advance for which $p$ this will occur. Consequently, in the proof of Proposition A.5 we were forced to test every prime up to $O(nb)$. If we allow probabilistic algorithms, then we can find a suitable prime with high probability by randomly selecting $p$ in the range $2 \le p \le X'$ for some $X' = O(nb)$. This allows us to find a suitable $p$ in expected quasilinear time. The complexity of Steps 1 and 2 in Proposition A.6, i.e., finding the roots modulo $p$ and lifting them to $\mathbb{Z}/p^k\mathbb{Z}$, can also be improved to (deterministic) quasilinear time by means of fast multipoint evaluation techniques. The resulting algorithm is quite similar to the root finding algorithm presented in [vzGG13, Thm. 15.21].
The second obstruction concerns Step 3 of Proposition A.6, namely, checking which of the candidate integer roots are in fact roots of $f$. We do not know how to carry out this step rigorously in quasilinear time, even allowing randomised algorithms. A similar issue occurs in [vzGG13, Thm. 15.21], where the last term of the given complexity bound corresponds in our notation to $O(n^{2+\epsilon} b^{1+\epsilon})$. In the discussion following that theorem, von zur Gathen and Gerhard suggest testing the candidate roots modulo a small prime (different to $p$) as a way to quickly rule out incorrect candidates. This idea can be turned into a "Monte Carlo" algorithm: one would randomly choose a small prime $q$, compute $f(r^*) \bmod q$ for all candidate roots $r^*$, and declare the ones for which $f(r^*) \equiv 0 \pmod{q}$ to be the true roots. We suspect that in this way one can obtain a quasilinear expected running time with an exponentially small probability of failure, but we have not checked the details.
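The Monte Carlo filtering idea can be sketched as follows; the auxiliary prime $q$ is passed in explicitly here, whereas the randomised algorithm would sample it:

```python
# Sketch of the Monte Carlo filter: keep the candidate roots r with
# f(r) = 0 (mod q).  A false candidate survives only if q divides the
# nonzero value f(r), which happens for few primes q.

def filter_candidates(f, candidates, q):
    """Keep candidates r with f(r) == 0 mod q (f is a low-first coefficient list)."""
    def ev_mod(x):
        acc = 0
        for c in reversed(f):
            acc = (acc * x + c) % q
        return acc
    return [r for r in candidates if ev_mod(r) == 0]

# Example: f(x) = (x - 3)(x - 7); the spurious candidate 11 has
# f(11) = 32, which q = 5 does not divide, so it is rejected.
f = [21, -10, 1]
print(filter_candidates(f, [3, 7, 11], 5))   # prints [3, 7]
```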
A.3. The general case. In order to prove Theorem 1.2, we must first discuss the computation of GCDs in Z[x].
Let f, g ∈ Z[x] and let h := gcd(f, g). The idea of the "heuristic GCD" algorithm [CGG84] is to use an integer GCD algorithm to compute gcd(f (N ), g(N )) for some choice of evaluation point N ∈ Z. If we are lucky, then gcd(f (N ), g(N )) will actually be equal to h(N ), and we may simply read off the coefficients of h(x) from h(N ), provided that N is not too small. However, it is possible for gcd(f (N ), g(N )) to contain extraneous factors unrelated to h(x). Usually these extraneous factors are small but in rare circumstances they can be very large. The algorithm can be made to tolerate extraneous factors up to a given size by taking larger values of N , at the expense of running more slowly. In practice, one usually takes a fairly small value of N , accepting a small chance of failure in order to get a fast algorithm. In the next result we work at the other extreme, taking N so large that the algorithm is guaranteed to work in all cases. We thereby obtain a GCD algorithm that is deterministic and completely rigorous (although unfortunately quite slow in practice).
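A minimal Python sketch of the heuristic, with a fixed large evaluation point $\xi$ (whereas Proposition A.9 chooses the evaluation point so large that failure is impossible): evaluate both polynomials, take an integer GCD, read off candidate coefficients as balanced base-$\xi$ digits, and verify by exact division:

```python
# Sketch of the heuristic GCD: gcd(f(xi), g(xi)) usually equals h(xi)
# up to a small extraneous factor; recover h from its balanced base-xi
# digits and verify that it divides f and g exactly.
from functools import reduce
from math import gcd

def ev(f, x):
    acc = 0
    for c in reversed(f):
        acc = acc * x + c
    return acc

def balanced_digits(m, xi):
    digits = []
    while m:
        d = m % xi
        if d > xi // 2:
            d -= xi
        digits.append(d)
        m = (m - d) // xi
    return digits or [0]

def exact_divides(h, f):
    """True if h divides f in Z[x] (coefficient lists, low degree first)."""
    f = list(f)
    while len(f) >= len(h) and any(f):
        if f[-1] % h[-1]:
            return False
        q = f[-1] // h[-1]
        for i, c in enumerate(h):
            f[len(f) - len(h) + i] -= q * c
        f.pop()
    return not any(f)

def heur_gcd(f, g, xi=2**64):
    h = balanced_digits(gcd(ev(f, xi), ev(g, xi)), xi)
    content = reduce(gcd, (abs(c) for c in h))
    h = [c // content for c in h]           # primitive part
    if h[-1] < 0:
        h = [-c for c in h]
    if exact_divides(h, f) and exact_divides(h, g):
        return h
    return None                             # extraneous factor: retry with larger xi
```

For example, with $f = (x+2)(x+1)$ and $g = (x+2)(x+3)$ the sketch recovers the primitive GCD $x + 2$.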
Proposition A.9. Let $b \ge n \ge 1$ be integers. Let $f, g \in \mathbb{Z}[x]$ be nonzero polynomials such that $\deg f, \deg g \le n$ and $\|f\|_\infty, \|g\|_\infty \le 2^b$. Assume that at least one of $f$ and $g$ is primitive. Define $N$ and $\delta$ as in (A.3). Writing $h(x) = h_0 + h_1 x + \cdots + h_n x^n$, this becomes
(A.4) $\gcd(f(N), g(N)) = \delta(N) h_0 + \delta(N) h_1 N + \cdots + \delta(N) h_n N^n$.