1 Introduction

1.1 Statement of main result

Let r be a positive integer. In this paper we study the problem of finding all r-power divisors of a given positive integer N, i.e., all positive integers p such that \({p^r \mathrel {|}N}\). Throughout the paper we write \(\lg x :=\log _2 x\), and unless otherwise specified, the “running time” of an algorithm refers to the number of bit operations it performs, or more formally, the number of steps executed by a deterministic multitape Turing machine [12]. We always assume the use of fast (quasilinear time) algorithms for basic integer arithmetic, i.e., for multiplication, division and GCD (see for example [19] or [4]).

Our main result is the following theorem.

Theorem 1.1

There is an explicit deterministic algorithm with the following properties. It takes as input an integer \(N \geqslant 2\) and a positive integer \(r \leqslant \lg N\). Its output is a list of all positive integers p such that \(p^r \mathrel {|}N\). Its running time is

$$\begin{aligned} O\left( N^{1/4r} \cdot \frac{(\lg N)^{10+\epsilon }}{r^3} \right) . \end{aligned}$$
(1.1)

Note that whenever we write \(\epsilon \) in a complexity bound, we mean that the bound holds for all \(\epsilon > 0\), where the implied big-O constant may depend on \(\epsilon \).

The integers p referred to in Theorem 1.1 need not be prime. Of course, if p is a composite integer found by the algorithm, then the algorithm will incidentally determine the complete factorisation of p, as the prime divisors \(\ell \) of p must also satisfy \(\ell ^r \mathrel {|}N\).

The hypothesis \(r \leqslant \lg N\) does not really limit the applicability of the theorem: if \(r > \lg N\) then the problem is trivial, as the only possible r-power divisor is 1 (any \(p \geqslant 2\) would give \(p^r \geqslant 2^r > N\)).

Theorem 1.1 is intended primarily as a theoretical result. For fixed r the complexity is \(O(N^{1/4r+\epsilon })\), which is fully exponential in \(\lg N\), so the algorithm cannot compete asymptotically with subexponential factoring algorithms such as the elliptic curve method (ECM) or the number field sieve (NFS). Furthermore, experiments confirm that for small r the algorithm is grossly impractical compared to general-purpose factoring routines implemented in modern computer algebra systems.

1.2 Previous work

At the core of our algorithm is a generalisation of Coppersmith’s method [6] introduced by Boneh, Durfee and Howgrave-Graham [1]. We refer to the latter as the BDHG algorithm. Coppersmith’s seminal work showed how to use lattice methods to quickly find all divisors of N in certain surprisingly large intervals. To completely factor N, one simply applies the method to a sequence of intervals that covers all possible divisors up to \(N^{1/2}\). Each interval is searched in polynomial time, so the overall complexity is governed by the number of such intervals, which turns out to be \(O(N^{1/4+\epsilon })\). The BDHG algorithm adapts Coppersmith’s method to the case of r-power divisors. The relationship between our algorithm and the BDHG algorithm is discussed in Sect. 1.3 below.

We emphasise that, unlike factoring algorithms such as ECM or NFS, whose favourable running time analyses depend on heuristic assumptions, the complexity bound in Theorem 1.1 is rigorously analysed and fully deterministic. Among algorithms subject to these restrictions, for \(r \geqslant 2\) it is asymptotically superior to all previously known complexity bounds for the problem of finding r-power divisors.

Its closest competitors are the algorithms of Strassen [17] and Pollard [14]. These algorithms can be used to find all divisors of N up to a given bound B in time \(O(B^{1/2+\epsilon })\). If \(p^r \mathrel {|}N\), say \(N = p^r q\), then either \(p \leqslant N^{1/(r+1)}\) or \(q \leqslant N^{1/(r+1)}\) (otherwise \(p^r q > N\)), so the Pollard–Strassen method can be used to find p or q, and hence both, in time \(O(N^{1/(2(r+1))+\epsilon })\). For example, taking \(r = 2\), these algorithms can find all square divisors of N in time \(O(N^{1/6+\epsilon })\), whereas our algorithm finds all square divisors in time \(O(N^{1/8+\epsilon })\).
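To illustrate the reduction concretely, here is a small Python sketch. The function names are ours, and plain trial division stands in for the Pollard–Strassen product-tree machinery, so the running time is \(O(B)\) rather than \(O(B^{1/2+\epsilon })\); the point is the case analysis, not the speed.

```python
def nth_root(n, k):
    """Largest integer x with x**k <= n (for n >= 0, k >= 1), by binary search."""
    lo, hi = 0, 1
    while hi ** k <= n:
        hi *= 2
    while lo < hi - 1:
        mid = (lo + hi) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid
    return lo

def r_power_divisors_small(N, r):
    """All p >= 2 with p**r | N, found via divisors up to B = floor(N**(1/(r+1))).

    If N = p**r * q, then p <= B or q <= B, so it suffices to examine
    candidates up to B.  Trial division stands in for Pollard--Strassen.
    """
    B = nth_root(N, r + 1)
    found = set()
    p = nth_root(N, r)              # case q = 1: N itself is an r-th power
    if p >= 2 and p ** r == N:
        found.add(p)
    for d in range(2, B + 1):
        if N % d ** r == 0:         # case p <= B: test p = d directly
            found.add(d)
        if N % d == 0:              # case q = d <= B: then N // d may be p**r
            p = nth_root(N // d, r)
            if p >= 2 and p ** r == N // d:
                found.add(p)
    return sorted(found)
```

For example, `r_power_divisors_small(2880, 2)` returns `[2, 3, 4, 6, 8, 12, 24]`, since \(2880 = 2^6 \cdot 3^2 \cdot 5\).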

There is one special case in which the Pollard–Strassen approach still wins. If one knows in advance that p is relatively small, say \(p < N^c\) for some \(c \in (0, 1/2r)\), then the Pollard–Strassen method has complexity \(O(N^{c/2+\epsilon })\), which is better than the bound in Theorem 1.1. Our algorithm can also take advantage of the information that \(p < N^c\), but unfortunately this yields only a constant-factor speedup.

Another point of difference is the space complexity. The space required by the algorithm in Theorem 1.1 is only polynomial in \(\lg N\) (we will not give the details of this analysis), whereas for the Pollard–Strassen method the space complexity is the same as the time complexity, up to logarithmic factors.

In connection with the case \(r = 2\), two other works are worth mentioning. Booker, Hiary and Keating [3] describe a subexponential time algorithm that can sometimes prove that a given integer N is squarefree, with little or no knowledge of its factorisation. This algorithm is not fully rigorous, as its analysis depends on (among other things) the Generalised Riemann Hypothesis. Peralta and Okamoto [13] present a speedup of the ECM method for integers of the form \(N = p^2 q\). Again this result is not fully rigorous, because it depends on standard conjectures concerning the distribution of smooth numbers in short intervals, just as in Lenstra’s original ECM algorithm.

The case \(r = 1\) corresponds to the ordinary factoring problem, and in this case our algorithm is essentially equivalent to Coppersmith’s method. As mentioned above, the complexity is \(O(N^{1/4+\epsilon })\), which does not improve on known results; currently, the fastest known deterministic factoring method has complexity \(O(N^{1/5+\epsilon })\) [9]. (It is interesting to ask whether the ideas behind [9] can be used to improve Theorem 1.1 when \(r \geqslant 2\). Our inquiries in this direction have so far been unsuccessful.)

In fact, when \(r = 1\), Theorem 1.1 gives the more precise complexity bound \(O(N^{1/4} (\lg N)^{10+\epsilon })\). It is apparently well known that Coppersmith’s method has complexity \(O(N^{1/4} (\lg N)^C)\) for some constant \(C > 0\), but to the best of our knowledge, this is the first time in the literature that a particular value of C has been specified. On the other hand, we have not tried particularly hard to optimise the value of C, and it is likely that it can be improved. (One possible improvement is outlined in Remark 3.6.)

1.3 Relationship to the BDHG algorithm

The authors of [1] were mainly interested in cryptographic applications, and this led them to focus on the case \(N = p^r q\) where p and q are roughly the same size. In this setting, they show that their algorithm is faster than ECM when \(r \approx (\log p)^{1/2}\), and that it even runs in polynomial time when r is as large as \(\log p\).

In this paper we take a different point of view: our goal is to determine the worst-case complexity, without any assumptions on the size of p, q or r.

To illustrate what difference this makes, consider again the case \(r = 2\). This case is mentioned briefly in Section 6 of [1]. The authors point out that if \(N = p^2 q\), where p and q are known to be about the same size, i.e., both within a constant factor of \(N^{1/3}\), then their method requires only \(O(N^{1/9+\epsilon })\) search intervals, and hence \(O(N^{1/9+\epsilon })\) running time. However, in our more general setup, this is not the worst case. Rather, the worst-case running time is \(O(N^{1/8+\epsilon })\), which occurs when searching for \(p \sim N^{1/4}\) and \(q \sim N^{1/2}\).

More generally, for \(r \geqslant 1\) the worst case running time of \(O(N^{1/4r+\epsilon })\) stated in Theorem 1.1 occurs when \(p \sim N^{1/2r}\) and \(q \sim N^{1/2}\). By contrast, in the “balanced” situation considered in [1], where \(p, q \sim N^{1/(r+1)}\), one can show that the running time is only \(O(N^{1/(r+1)^2+\epsilon })\) (see Remark 3.5, and take \(\theta = r/(r+1)\)).

Although the core of our algorithm is essentially the same as the BDHG algorithm, our more general perspective requires us to make a few changes to their presentation. For instance, we cannot take the lattice dimension to be \(d \approx r^2\) (as is done in the main theorem of [1]), because this choice is suboptimal when r is small and fixed. Additional analysis is required to deal with potentially small values of p and q, and in general we must take more care than [1] in estimating certain quantities throughout the argument. For these reasons, we decided to give a self-contained presentation, not relying on the results in [1].

Remark 1.2

After this paper was accepted for publication, Dan Bernstein mentioned to us (personal communication) that the proof of Theorem 5.2 of [2] likely includes much of the argument needed to obtain our Theorem 1.1.

1.4 Root-finding

An important component of our algorithm, and of all algorithms pursuing Coppersmith’s strategy, is a subroutine for finding all integer roots of a polynomial with integer coefficients. This problem has received extensive attention in the literature, but we were unable to locate a clear statement of a deterministic complexity bound suitable for our purposes. For completeness, in Appendix A we give a detailed proof of the following result. For a polynomial \(f \in \mathbb {Z}[x]\), we write \(\left\Vert f \right\Vert _{\infty }\) for the maximum of the absolute values of the coefficients of f.

Theorem 1.3

Let \(b \geqslant n \geqslant 1\) be integers. Given as input a polynomial \(f \in \mathbb {Z}[x]\) of degree n such that \(\left\Vert f \right\Vert _{\infty } \leqslant 2^b\), we may find all of the integer roots of f in time

$$\begin{aligned} O(n^{2+\epsilon } b^{1+\epsilon }). \end{aligned}$$

Note that this complexity bound is much stronger than what is needed for the application in this paper. However, it is still not quasilinear in the size of the input, which is O(nb). For further discussion, see Remarks A.8 and A.10.
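Purely for illustration, here is a naive Python stand-in for Theorem 1.3 (it is not the algorithm of Appendix A). It uses the fact that every nonzero integer root of f divides the trailing nonzero coefficient; enumerating those divisors by trial division makes it exponential in b, but it is adequate for experimenting with the sketches in later sections.

```python
def integer_roots(coeffs):
    """All integer roots of f(x) = sum(coeffs[i] * x**i), for nonzero f.

    Naive method: strip the power of x dividing f, then test both signs of
    each divisor of the trailing nonzero coefficient (trial division).
    """
    def f(x):
        y = 0
        for c in reversed(coeffs):   # Horner evaluation
            y = y * x + c
        return y

    roots = set()
    k = 0                            # f(x) = x**k * (c0 + c1*x + ...), c0 != 0
    while coeffs[k] == 0:
        roots.add(0)
        k += 1
    c0 = abs(coeffs[k])
    d, divs = 1, set()
    while d * d <= c0:               # divisors of c0 by trial division
        if c0 % d == 0:
            divs.update((d, c0 // d))
        d += 1
    for d in divs:
        for cand in (d, -d):
            if f(cand) == 0:
                roots.add(cand)
    return sorted(roots)
```

For instance, `integer_roots([6, -5, 1])` returns `[2, 3]`, the integer roots of \(x^2 - 5x + 6\).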

2 Searching one interval

In this section we recall the strategy of [1] for finding all integers p in a prescribed interval \(P - H \leqslant p \leqslant P + H\) such that \(p^r \mathrel {|}N\), provided that H is not too large. We will prove the following theorem.

Theorem 2.1

There is an explicit deterministic algorithm with the following properties. It takes as input positive integers N, r, m, d, P and H such that

$$\begin{aligned} r \leqslant \lg N, \end{aligned}$$
(2.1)
$$\begin{aligned} m \leqslant d/r, \end{aligned}$$
(2.2)
$$\begin{aligned} H < P \leqslant N^{1/r}, \end{aligned}$$
(2.3)

and

$$\begin{aligned} H^{(d-1)/2} < \frac{1}{d^{1/2} \, 2^{(d-1)/4}} \cdot \frac{(P-H)^{rm}}{N^{rm(m+1)/2d}}. \end{aligned}$$
(2.4)

Its output is a list of all integers p in the interval \(P - H \leqslant p \leqslant P + H\) such that \(p^r \mathrel {|}N\). Its running time is

$$\begin{aligned} O\big ( d^{7+\epsilon } (\tfrac{1}{r} \lg N)^{2+\epsilon } \big ). \end{aligned}$$

A key tool needed in the proof of Theorem 2.1 is the LLL algorithm:

Lemma 2.2

Let \(d \geqslant 1\) and \(B \geqslant 2\). Given as input linearly independent vectors \(v_0, \ldots , v_{d-1} \in \mathbb {Z}^d\) such that \(\left\Vert v_i \right\Vert \leqslant B\), in time

$$\begin{aligned} O\big ( d^{5+\epsilon } (\lg B)^{2+\epsilon } \big ) \end{aligned}$$

we may find a nonzero vector w in the lattice \(L :={{\,\textrm{span}\,}}_\mathbb {Z}(v_0, \ldots , v_{d-1})\) such that

$$\begin{aligned} \left\Vert w \right\Vert \leqslant 2^{(d-1)/4} (\det L)^{1/d}. \end{aligned}$$

(Here \(\left\Vert \,\cdot \, \right\Vert \) denotes the standard Euclidean norm on \(\mathbb {R}^d\).)

Proof

We take w to be the first vector in a reduced basis for L computed by the LLL algorithm [10, Prop. 1.26]. For the bound on \(\left\Vert w \right\Vert \), see [10, Prop. 1.6]. (For more recent developments on lattice reduction, see for example [8, Ch. 17].) \(\square \)
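In the sketches below we treat Lemma 2.2 as a black box. One readily available implementation is the fpylll package (our choice, purely for illustration); the wrapper below returns the first vector of an LLL-reduced basis, as in the proof above.

```python
# A sketch of Lemma 2.2 using fpylll (assumed installed, e.g. via
# "pip install fpylll"); any integer LLL implementation would do.
from fpylll import IntegerMatrix, LLL

def short_vector(rows):
    """Given d linearly independent integer vectors (lists of length d),
    return the first vector of an LLL-reduced basis of their Z-span."""
    A = IntegerMatrix.from_matrix(rows)
    LLL.reduction(A)   # in-place; the default delta exceeds 3/4, so the
                       # bound of Lemma 2.2 applies to the first row
    return [A[0][j] for j in range(A.ncols)]
```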

Let \(\mathbb {Z}[x]_d\) denote the space of polynomials in \(\mathbb {Z}[x]\) of degree less than d. The first step in the proof of Theorem 2.1 is the following proposition, which uses the LLL algorithm to construct a nonzero polynomial \(h \in \mathbb {Z}[x]_d\) with relatively small coefficients in a carefully chosen lattice.

Proposition 2.3

Let N, r, m, d, P and H be positive integers satisfying (2.1), (2.2) and (2.3). Define polynomials \(f_0, \ldots , f_{d-1} \in \mathbb {Z}[x]_d\) by

$$\begin{aligned} f_i(x) :={\left\{ \begin{array}{ll} N^{m - \lfloor i/r \rfloor } (P + x)^i, & 0 \leqslant i < rm, \\ (P + x)^i, & rm \leqslant i < d. \end{array}\right. } \end{aligned}$$

Then in time

$$\begin{aligned} O\big ( d^{7+\epsilon } (\tfrac{1}{r} \lg N)^{2+\epsilon } \big ) \end{aligned}$$
(2.5)

we may find a nonzero polynomial

$$\begin{aligned} h(x) = h_0 + \cdots + h_{d-1} x^{d-1} \in \mathbb {Z}[x]_d \end{aligned}$$

in the \(\mathbb {Z}\)-span of \(f_0, \ldots , f_{d-1}\) such that

$$\begin{aligned} |h_0| + |h_1| H + \cdots + |h_{d-1}| H^{d-1} < d^{1/2} \, 2^{(d-1)/4} H^{(d-1)/2} N^{rm(m+1)/2d}. \end{aligned}$$
(2.6)

Proof

Set \({\tilde{f}}_i(y) :=f_i(Hy) \in \mathbb {Z}[y]_d\), and let \(v_i \in \mathbb {Z}^d\) be the vector whose j-th entry (for \(j = 0, \ldots , d-1\)) is the coefficient of \(y^j\) in \({\tilde{f}}_i(y)\). We will apply Lemma 2.2 to the vectors \(v_0, \ldots , v_{d-1}\).

Let

$$\begin{aligned} B :=d^{1/2} \, 2^d N^{d/r+1}. \end{aligned}$$

We claim that \(\left\Vert v_i \right\Vert \leqslant B\) for all i. First consider the case \(0 \leqslant i < rm\). For any \(j = 0, \ldots , i\), the coefficient of \(y^j\) in \({\tilde{f}}_i(y) = N^{m-\lfloor i/r \rfloor } (P + Hy)^i\) is equal to

$$\begin{aligned} N^{m-\lfloor i/r \rfloor } \tbinom{i}{j} P^{i-j} H^j \leqslant N^{m - i/r + 1} 2^i P^i \leqslant 2^i N^{m+1} \leqslant 2^d N^{d/r+1}, \end{aligned}$$

where we have used the hypotheses (2.3) and (2.2). For the case \(rm \leqslant i < d\), a similar argument shows that every coefficient of \({\tilde{f}}_i(y) = (P + Hy)^i\) is bounded above by \(2^d N^{d/r}\). Therefore every \(v_i\) has coordinates bounded by \(2^d N^{d/r+1}\), and we conclude that \(\left\Vert v_i \right\Vert \leqslant B\) for all i.

Next we calculate the determinant of the lattice \(L :={{\,\textrm{span}\,}}_\mathbb {Z}(v_0, \ldots , v_{d-1})\), or equivalently, the determinant of the \(d \times d\) integer matrix whose rows are given by the \(v_i\). Since \(\deg {\tilde{f}}_i(y) = i\), this is a lower triangular matrix whose diagonal entries are given by the leading coefficients of the \({\tilde{f}}_i(y)\), namely

$$\begin{aligned} {\left\{ \begin{array}{ll} N^{m - \lfloor i/r \rfloor } H^i, & 0 \leqslant i < rm, \\ H^i, & rm \leqslant i < d. \end{array}\right. } \end{aligned}$$

The determinant is the product of these leading coefficients, i.e.,

$$\begin{aligned} \det L&= H^{1 + 2 + \cdots + (d-1)} \underbrace{(N^m \cdots N^m)}_{r \text { terms }} \underbrace{(N^{m-1} \cdots N^{m-1})}_{r \text { terms }} \cdots \underbrace{(N \cdots N)}_{r \text { terms }} \\&= H^{1 + 2 + \cdots + (d-1)} (N^{1 + 2 + \cdots + m})^r \\&= H^{d(d-1)/2} N^{rm(m+1)/2}. \end{aligned}$$

Invoking Lemma 2.2, we may compute a nonzero vector \(w \in L\) such that

$$\begin{aligned} \left\Vert w \right\Vert \leqslant 2^{(d-1)/4} H^{(d-1)/2} N^{rm(m+1)/2d} \end{aligned}$$

in time \(O(d^{5+\epsilon } (\lg B)^{2+\epsilon })\). Note that this time bound certainly dominates the cost of computing the vectors \(v_i\) themselves, as the \({\tilde{f}}_i(y)\) may be computed by starting with \({\tilde{f}}_0(y) = N^m\), and then successively multiplying by \(P + Hy\) and occasionally dividing by N. The hypotheses (2.1) and (2.2) imply that

$$\begin{aligned} \lg B \ll d + (\tfrac{d}{r} + 1) \lg N = \big (\tfrac{r}{\lg N} + 1 + \tfrac{r}{d} \big ) \tfrac{d}{r} \lg N \leqslant \big (2 + \tfrac{1}{m} \big ) \tfrac{d}{r} \lg N \ll \tfrac{d}{r} \lg N, \end{aligned}$$

so the cost estimate \(O(d^{5+\epsilon } (\lg B)^{2+\epsilon })\) simplifies to (2.5).

The vector w corresponds to a nonzero polynomial \({\tilde{h}}(y) = {\tilde{h}}_0 + \cdots + {\tilde{h}}_{d-1} y^{d-1}\) in the \(\mathbb {Z}\)-span of the \({\tilde{f}}_i(y)\). Applying the Cauchy–Schwarz inequality to the vectors \(w = ({\tilde{h}}_0, \ldots , {\tilde{h}}_{d-1})\) and \((1, \ldots , 1)\) yields

$$\begin{aligned} |{\tilde{h}}_0| + \cdots + |{\tilde{h}}_{d-1}| \leqslant d^{1/2} \left\Vert w \right\Vert < d^{1/2} \, 2^{(d-1)/4} H^{(d-1)/2} N^{rm(m+1)/2d}. \end{aligned}$$

Moreover, each \({\tilde{h}}_j\) is divisible by \(H^j\), so we obtain in turn a polynomial \(h(x) :={\tilde{h}}(x/H) \in \mathbb {Z}[x]_d\) in the \(\mathbb {Z}\)-span of the \(f_i(x)\). Since \(h(x) = h_0 + \cdots + h_{d-1} x^{d-1}\) with \(h_j = {\tilde{h}}_j / H^j\) for each j, the estimate (2.6) follows immediately. \(\square \)
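The proof translates directly into code. The following sketch (again relying on fpylll for Lemma 2.2) builds the scaled vectors \(v_i\), reduces the lattice, and divides out the powers of H to recover h.

```python
from fpylll import IntegerMatrix, LLL   # as in the sketch after Lemma 2.2

def build_h(N, r, m, d, P, H):
    """Proposition 2.3 as code: return the coefficients [h_0, ..., h_{d-1}]
    of a nonzero h in the Z-span of the f_i satisfying (2.6)."""
    rows, poly = [], [1]        # poly = coefficients of (P + H*y)**i
    for i in range(d):
        scale = N ** (m - i // r) if i < r * m else 1
        rows.append([scale * c for c in poly] + [0] * (d - len(poly)))
        # multiply poly by (P + H*y), ready for the next value of i
        poly = ([P * poly[0]]
                + [P * poly[j] + H * poly[j - 1] for j in range(1, len(poly))]
                + [H * poly[-1]])
    A = IntegerMatrix.from_matrix(rows)
    LLL.reduction(A)
    w = [A[0][j] for j in range(d)]             # coefficients of h~(y)
    return [w[j] // H ** j for j in range(d)]   # h_j = h~_j / H**j
```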

Next we show that any r-power divisor that is sufficiently close to P corresponds to a root of h(x).

Proposition 2.4

Let N, r, m, d, P and H be positive integers satisfying (2.1), (2.2) and (2.3), and let \(h(x) \in \mathbb {Z}[x]_d\) be as in Proposition 2.3. Suppose additionally that (2.4) holds, and that p is an integer in the interval \(P - H \leqslant p \leqslant P + H\) such that \(p^r \mathrel {|}N\). Then \(x_0 :=p - P\) is a root of h(x).

Proof

We claim that \(h(x_0)\) is divisible by \(p^{rm}\). Since h(x) is a \(\mathbb {Z}\)-linear combination of the \(f_i(x)\) (where \(f_i(x)\) is defined as in Proposition 2.3), it is enough to prove that \(p^{rm} \mathrel {|}f_i(x_0)\) for all i. For the case \(0 \leqslant i < rm\), we have \(f_i(x_0) = N^{m - \lfloor i/r \rfloor } p^i\). Since \(p^r \mathrel {|}N\), we have \(p^{r (m - \lfloor i/r \rfloor )} p^i \mathrel {|}f_i(x_0)\), and this implies that \(p^{rm} \mathrel {|}f_i(x_0)\) because \(r \lfloor i/r \rfloor \leqslant i\). For the case \(i \geqslant rm\) we have simply \(f_i(x_0) = p^i\), which is certainly divisible by \(p^{rm}\).

On the other hand, the assumption \(-H \leqslant x_0 \leqslant H\) together with (2.6) and (2.4) implies that

$$\begin{aligned} |h(x_0)| \leqslant |h_0| + \cdots + |h_{d-1}| H^{d-1} < (P-H)^{rm} \leqslant p^{rm}. \end{aligned}$$

Since \(h(x_0)\) is divisible by \(p^{rm}\), this forces \(h(x_0) = 0\). \(\square \)
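As a toy check of Propositions 2.3 and 2.4 (using the `build_h` sketch above, with parameters that one may verify satisfy (2.1)–(2.4)):

```python
# N = p**2 * q with p = 1009, q = 1013**2, so p**2 | N and p is near P = 1007.
p, q = 1009, 1013 ** 2
N, r, m, d, P, H = p ** 2 * q, 2, 2, 8, 1007, 3
h = build_h(N, r, m, d, P, H)
x0 = p - P                     # x0 = 2 lies in [-H, H]
assert sum(c * x0 ** i for i, c in enumerate(h)) == 0   # h(x0) = 0, as claimed
```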

We may now complete the proof of the main theorem of this section.

Proof of Theorem 2.1

We first invoke Proposition 2.3, with inputs N, r, m, d, P and H, to find a polynomial h(x) satisfying (2.6). According to Proposition 2.4, we may then construct a list of candidates for p by finding all integer roots of h(x), which we do via Theorem 1.3.

To estimate the complexity of the root-finding step, recall from the proof of Proposition 2.4 that \(|h_0| + \cdots + |h_{d-1}| H^{d-1} < (P-H)^{rm}\), so certainly \(|h_j| < (P-H)^{rm}\) for all j, and we obtain

$$\begin{aligned} \left\Vert h \right\Vert _{\infty }< (P-H)^{rm} < P^{rm} \leqslant N^m \leqslant N^{d/r}. \end{aligned}$$

Therefore in Theorem 1.3 we may take \(n :=d\) and \(b :=\lceil \lg (N^{d/r}) \rceil = \lceil \tfrac{d}{r} \lg N \rceil \). Note that the hypothesis \(b \geqslant n\) is satisfied due to (2.1). The root-finding complexity is thus

$$\begin{aligned} O(d^{2+\epsilon } (\tfrac{d}{r} \lg N)^{1+\epsilon }) = O(d^{3+\epsilon } (\tfrac{1}{r} \lg N)^{1+\epsilon }), \end{aligned}$$

which is negligible compared to the main bound (2.5). Finally, we must check each candidate for p to ensure that \(p^r \mathrel {|}N\), which again requires negligible time. \(\square \)
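Assembling the sketches of `build_h` (this section) and `integer_roots` (Sect. 1.4), the proof reads as follows in code; for the genuine complexity bound one must of course replace the naive root-finder by the algorithm of Theorem 1.3.

```python
def search_interval(N, r, m, d, P, H):
    """Theorem 2.1 sketch: all p in [P - H, P + H] with p**r | N,
    assuming hypotheses (2.1)-(2.4) hold for the given parameters."""
    h = build_h(N, r, m, d, P, H)
    found = []
    for x0 in integer_roots(h):
        p = P + x0
        # Proposition 2.4 only produces candidates; verify each directly.
        if p >= 2 and abs(x0) <= H and N % p ** r == 0:
            found.append(p)
    return sorted(found)
```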

3 Proof of the main theorem

We now consider the problem of searching for all integers p such that \(p^r \mathrel {|}N\) in an interval, say \(T \leqslant p \leqslant T'\), that is too large to be handled by a single application of Theorem 2.1. Given N, r, T and \(T'\), our strategy will be to choose parameters d, m and H, and then apply Theorem 2.1 to a sequence of subintervals of the form \(P - H \leqslant p \leqslant P + H\) that cover the target interval \(T \leqslant p \leqslant T'\). The overall running time will depend mainly on the number of subintervals, so our goal is to make H as large as possible. On the other hand, to ensure that the hypothesis (2.4) of Theorem 2.1 is satisfied, we also require that \(H < {\tilde{H}}\) where

$$\begin{aligned} {\tilde{H}} :=\frac{1}{d^{1/(d-1)} 2^{1/2}} \cdot \frac{T^{2rm/(d-1)}}{N^{rm(m+1)/d(d-1)}} > 0. \end{aligned}$$
(3.1)

The key issue is therefore to choose d and m to maximise \({\tilde{H}}\). For large d and m, the magnitude of \({\tilde{H}}\) depends essentially only on the ratio m/d; in fact, one finds that \({\tilde{H}}\) is maximised when \(m/d \approx \lg T / \lg N\). The following result gives a simple formula for m (as a function of d) that is close to the optimal choice, and a corresponding explicit lower bound for \({\tilde{H}}\).

Lemma 3.1

Let N, r, d and T be positive integers such that \(d \geqslant 2\) and \(T \leqslant N^{1/r}\). Let

$$\begin{aligned} m :=\left\lfloor \frac{(d-1) \lg T}{\lg N} \right\rfloor , \end{aligned}$$
(3.2)

and let \({\tilde{H}}\) be defined as in (3.1). Then

$$\begin{aligned} {\tilde{H}} > \tfrac{1}{3} N^{\theta ^2/r \, - \, 1/(d-1)}, \end{aligned}$$

where

$$\begin{aligned} \theta :=\frac{r \lg T}{\lg N} \in [0,1] \qquad (\text { so that }T = N^{\theta /r}). \end{aligned}$$
(3.3)

Proof

The definition of m implies that \((d-1)\tfrac{\theta }{r} - 1 < m \leqslant (d-1) \tfrac{\theta }{r}\), so we may write

$$\begin{aligned} \frac{m}{d-1} = \frac{\theta }{r} - \delta \qquad \text {for some } \delta \in [0, \tfrac{1}{d-1}). \end{aligned}$$

It is easy to check that \(d^{1/(d-1)} 2^{1/2} < 3\) for all \(d \geqslant 2\), so we find that

$$\begin{aligned} {\tilde{H}} > \tfrac{1}{3} N^{2\theta m/(d-1) - rm(m+1)/d(d-1)}. \end{aligned}$$

Continuing to estimate the exponent in this inequality, we obtain

$$\begin{aligned} \frac{2\theta m}{d-1} - \frac{rm(m+1)}{d(d-1)}&> \frac{2\theta m}{d-1} - \frac{rm(m+1)}{(d-1)^2} \\&= \frac{2\theta m}{d-1} - \frac{rm^2}{(d-1)^2} - \frac{rm}{(d-1)^2} \\&= 2\theta (\tfrac{\theta }{r} - \delta ) - r(\tfrac{\theta }{r} - \delta )^2 - \frac{r(\tfrac{\theta }{r} - \delta )}{d-1} \\&= \frac{\theta ^2}{r} + r\delta \left( \frac{1}{d-1} - \delta \right) - \frac{\theta }{d-1} \\&\geqslant \frac{\theta ^2}{r} - \frac{1}{d - 1}, \end{aligned}$$

where the last line follows from the inequalities \(0 \leqslant \delta < \tfrac{1}{d-1}\) and \(0 \leqslant \theta \leqslant 1\). \(\square \)
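The inequality of Lemma 3.1 is easy to spot-check numerically. The following sketch computes \(\lg {\tilde{H}}\) from (3.1), with m chosen as in (3.2), and compares it against the lower bound (floating point, for sanity checking only).

```python
import math

def check_lemma_3_1(N, r, d, T):
    """Return (lg of H~ from (3.1), lg of the Lemma 3.1 lower bound)."""
    lgN, lgT = math.log2(N), math.log2(T)
    m = int((d - 1) * lgT / lgN)                 # (3.2)
    theta = r * lgT / lgN                        # (3.3)
    lg_htilde = (-math.log2(d) / (d - 1) - 0.5   # lg of (3.1)
                 + 2 * r * m * lgT / (d - 1)
                 - r * m * (m + 1) * lgN / (d * (d - 1)))
    lg_bound = -math.log2(3) + (theta ** 2 / r - 1 / (d - 1)) * lgN
    return lg_htilde, lg_bound                   # expect lg_htilde > lg_bound
```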

We may now estimate the time required to search a given interval \(T \leqslant p \leqslant T'\).

Proposition 3.2

There is an explicit deterministic algorithm with the following properties. It takes as input positive integers N, r, T and \(T'\) such that (2.1) holds (i.e., \(r \leqslant \lg N\)) and such that

$$\begin{aligned} 4^{\sqrt{(\lg N)/r}} \leqslant T < T' \leqslant N^{1/r}. \end{aligned}$$
(3.4)

Its output is a list of all integers p in the interval \(T \leqslant p \leqslant T'\) such that \(p^r \mathrel {|}N\). Its running time is

$$\begin{aligned} O\left( \left( \frac{T'-T}{T} \cdot N^{\theta (1-\theta )/r} + 1\right) \frac{(\lg N)^{9+\epsilon }}{r^2} \right) , \end{aligned}$$

where \(\theta \) is defined as in (3.3).

Proof

Set

$$\begin{aligned} d :=\lceil \lg N \rceil + 1 \end{aligned}$$

and define m as in (3.2). Equivalently, m is the largest integer such that \(N^m \leqslant T^{d-1}\). Note that \(d \geqslant 2\) (since \(N \geqslant 2^r \geqslant 2\)) and \(m \geqslant \lfloor \lg T \rfloor \geqslant 2\) (by (2.1) we have \((\lg N)/r \geqslant 1\), so (3.4) gives \(T \geqslant 4\)). Since \(\lg (T^{d-1}) \leqslant d \lg T \ll (\lg N)^2\), we may clearly compute d and m in time \(O((\lg N)^{2+\epsilon })\). Also, the assumption \(T \leqslant N^{1/r}\) implies that \(m \leqslant (d-1)/r \leqslant d/r\), so (2.2) holds.

Let \({\tilde{H}}\) be defined as in (3.1). Since \(d \geqslant \lg N + 1\), Lemma 3.1 implies that

$$\begin{aligned} {\tilde{H}} > \tfrac{1}{3} N^{\theta ^2/r} N^{-1/\lg N} = \tfrac{1}{6} N^{\theta ^2/r}. \end{aligned}$$

Moreover, (3.4) implies that \(\theta \geqslant 2 \sqrt{r/\lg N}\), so we have \(N^{\theta ^2/r} \geqslant N^{4/\lg N} = 16\) and hence \({\tilde{H}}> 16/6 > 2\).

Let H be the largest integer less than \({\tilde{H}}\), i.e., \(H :=\big \lceil {\tilde{H}} \big \rceil - 1\). Then \(2 \leqslant H < {\tilde{H}}\), and moreover, since \({\tilde{H}} > 2\), we also have

$$\begin{aligned} H \geqslant {\tilde{H}} / 2 > \tfrac{1}{12} N^{\theta ^2/r}. \end{aligned}$$

We may compute H by first approximating the \(d(d-1)\)-th root of the rational number

$$\begin{aligned} {\tilde{H}}^{d(d-1)} = \frac{T^{2drm}}{d^d 2^{d(d-1)/2} N^{rm(m+1)}}, \end{aligned}$$

and then taking \(d(d-1)\)-th powers of nearby integers to find the correct value. The numerator has bit size at most \(O(drm \lg T) = O(d^2 \lg N) = O(\lg ^3 N)\), and the denominator also has bit size at most

$$\begin{aligned} O(d \lg d + d^2 + rm^2 \lg N) = O(d^2 + d m \lg N) = O(d^2 \lg N) = O(\lg ^3 N), \end{aligned}$$

so this can all be done in time \(O((\lg N)^{3+\epsilon })\).

We now apply Theorem 2.1 with the parameters N, r, d, m, H, and with successively \(P = T + H\), \(P = T + 3H\), and so on, stopping when the interval \([T,T']\) has been exhausted by the subintervals \([P-H, P+H]\). The hypotheses (2.1), (2.2) and (2.3) have already been checked above, and (2.4) follows from our choice of \(H < {\tilde{H}}\) because \(P-H \geqslant T\). The number of subintervals is at most

$$\begin{aligned} \left\lceil \frac{T' - T}{2H} \right\rceil \leqslant \frac{T' - T}{\frac{1}{6} N^{\theta ^2/r}} + 1 = \frac{6(T' - T)}{T} \cdot N^{\theta (1-\theta )/r} + 1. \end{aligned}$$

Finally, since \(d \ll \lg N\), the cost of each invocation of Theorem 2.1 is

$$\begin{aligned} O\big ( d^{7+\epsilon } (\tfrac{1}{r} \lg N)^{2+\epsilon } \big ) = O\left( \frac{(\lg N)^{9+\epsilon }}{r^2} \right) . \end{aligned}$$

\(\square \)
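The parameter choices in this proof are entirely explicit, as the following sketch shows (exact integer arithmetic throughout; `nth_root` is the helper from the Sect. 1.2 sketch and `search_interval` the one from Sect. 2, so this is practical only for modest N).

```python
def choose_parameters(N, r, T):
    """d, m and H as in the proof of Proposition 3.2 (a sketch)."""
    d = (N - 1).bit_length() + 1      # d = ceil(lg N) + 1, computed exactly
    m, Td = 0, T ** (d - 1)           # m = largest integer with N**m <= T**(d-1)
    while N ** (m + 1) <= Td:
        m += 1
    # H~**(d*(d-1)) = T**(2*d*r*m) / (d**d * 2**(d*(d-1)//2) * N**(r*m*(m+1)))
    num = T ** (2 * d * r * m)
    den = d ** d * 2 ** (d * (d - 1) // 2) * N ** (r * m * (m + 1))
    k = d * (d - 1)
    H = nth_root(num // den, k)       # approximately floor(H~)
    while (H + 1) ** k * den < num:   # correct any rounding down
        H += 1
    while H >= 1 and H ** k * den >= num:   # enforce H < H~ strictly
        H -= 1
    assert H >= 2                     # guaranteed by (3.4), as shown above
    return d, m, H

def search_range(N, r, T, Tprime):
    """Proposition 3.2 sketch: all p in [T, T'] with p**r | N, assuming (3.4)."""
    d, m, H = choose_parameters(N, r, T)
    found, P = set(), T + H
    while P - H <= Tprime:            # subintervals [P-H, P+H] cover [T, T']
        found.update(search_interval(N, r, m, d, P, H))
        P += 2 * H
    return sorted(p for p in found if T <= p <= Tprime)
```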

Remark 3.3

A slightly better choice for d is to take \(d \approx \theta \lg N\), but this complicates the analysis and only improves the main result by a constant factor.

Finally we may prove the main theorem. Recall that we are given as input positive integers \(N \geqslant 2\) and \(r \leqslant \lg N\), and we wish to find all positive integers p such that \(p^r \mathrel {|}N\). Such divisors p must clearly lie in \([1,N^{1/r}]\).

Proof of Theorem 1.1

Let

$$\begin{aligned} k :=\left\lceil 2 \sqrt{\lceil \lg N \rceil /r} \right\rceil . \end{aligned}$$

We first check all \(p = 2, 3, \ldots , 2^k\) by brute force, i.e., testing directly whether \(p^r \mathrel {|}N\). Note that k may certainly be computed in time \(O((\lg N)^{1+\epsilon })\). To estimate the cost of checking up to \(2^k\), observe that

$$\begin{aligned} k \leqslant 2 \sqrt{(\lg N)/r + 1} + 1. \end{aligned}$$

Let \(C > 0\) be an absolute constant such that \(2\sqrt{x+1} + 1 \leqslant x/4 + C\) for all \(x \geqslant 1\); it follows that \(k \leqslant (\lg N)/4r + C\), and hence that \(2^k \ll N^{1/4r}\). The cost of checking up to \(2^k\) is therefore \(O(N^{1/4r} (\lg N)^{1+\epsilon })\), which is negligible compared to (1.1).

We now apply Proposition 3.2 to the intervals \([2^k, 2^{k+1}]\), \([2^{k+1}, 2^{k+2}]\), and so on until we reach \(N^{1/r}\), taking the last interval to be \([2^j, \lfloor N^{1/r} \rfloor ]\) for suitable j. Since \(k \geqslant 2\sqrt{(\lg N)/r}\), the precondition (3.4) is satisfied. For each interval we have \((T'-T)/T = O(1)\), and since \(\theta \in [0,1]\) we have

$$\begin{aligned} \theta (1-\theta ) \leqslant \frac{1}{4}. \end{aligned}$$

Therefore the cost of searching each interval is

$$\begin{aligned} O\left( (N^{1/4r} + 1) \cdot \frac{(\lg N)^{9+\epsilon }}{r^2} \right) = O\left( N^{1/4r} \cdot \frac{(\lg N)^{9+\epsilon }}{r^2} \right) . \end{aligned}$$

Finally, the number of intervals is at most \(\lceil \lg (N^{1/r}) \rceil = O(\tfrac{1}{r} \lg N)\). \(\square \)
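In code, the whole algorithm now fits in a few lines (a sketch built on the helpers above, so again practical only for modest N):

```python
import math

def r_power_divisors(N, r):
    """Theorem 1.1 sketch: all positive p with p**r | N, for r <= lg N."""
    found = {1}
    k = math.ceil(2 * math.sqrt((N - 1).bit_length() / r))  # ceil(2*sqrt(ceil(lg N)/r))
    for p in range(2, 2 ** k + 1):        # brute force up to 2**k
        if N % p ** r == 0:
            found.add(p)
    T, top = 2 ** k, nth_root(N, r)       # then dyadic intervals up to N**(1/r)
    while T < top:
        Tp = min(2 * T, top)
        found.update(search_range(N, r, T, Tp))
        T = Tp
    return sorted(found)
```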

Remark 3.4

The use of dyadic intervals in the above proof was only for convenience; the same argument would work with intervals \([B^j, B^{j+1}]\) for any fixed \(B > 1\).

Remark 3.5

The expression \(N^{\theta (1-\theta )/r}\) achieves its maximum value \(N^{1/4r}\) at the point \(\theta = 1/2\). This justifies the claim made in the introduction that the factors \(p^r\) that are “hardest” to find are those for which \(p \sim N^{1/2r}\).

Remark 3.6

A more careful analysis, taking into account the fact that \(N^{\theta (1-\theta )/r}\) is much smaller than \(N^{1/4r}\) for most values of \(\theta \in [0,1]\), shows that the bound (1.1) can be improved by a factor of \(O((\tfrac{1}{r} \lg N)^{1/2})\). Let us briefly explain this calculation. The main contribution to the cost estimate in the above proof is the number of subintervals, i.e., the sum of the values of \(N^{\theta (1-\theta )/r}\) over the various dyadic intervals. It can be shown that this sum is essentially a Riemann sum approximating the integral

$$\begin{aligned} \frac{\log N}{r} \int _0^1 N^{\theta (1-\theta )/r} d\theta . \end{aligned}$$

The argument in the proof of Theorem 1.1 amounted to estimating this integral via the trivial bound \(\int _0^1 N^{\theta (1-\theta )/r} d\theta \leqslant \int _0^1 N^{1/4r} d\theta = N^{1/4r}\). A better estimate is obtained by recognising the integrand as a truncated Gaussian function, i.e.,

$$\begin{aligned} \int _0^1 N^{\theta (1-\theta )/r} d\theta&= \int _{-1/2}^{1/2} N^{(1/4-\alpha ^2)/r} d\alpha \\&\leqslant N^{1/4r} \int _{-\infty }^\infty N^{-\alpha ^2/r} d\alpha = \left( \frac{\pi r}{\log N}\right) ^{1/2} N^{1/4r}. \end{aligned}$$

An interesting question is whether this \((\lg N)^{1/2}\) factor is somehow equivalent to the \((\lg N)^{1/2}\) factor appearing in [2, §6]; we have not checked this in detail.
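This estimate is straightforward to check numerically; for instance, the following sketch compares a Riemann sum for the left-hand integral with the right-hand side (here \(\log \) denotes the natural logarithm, as above).

```python
import math

def gaussian_bound_check(N, r, steps=10000):
    """Compare the integral of N**(theta*(1-theta)/r) over [0,1] with the
    bound sqrt(pi*r/log N) * N**(1/(4*r)) from Remark 3.6 (floating point)."""
    lnN = math.log(N)
    integral = sum(math.exp((t / steps) * (1 - t / steps) * lnN / r)
                   for t in range(steps)) / steps
    bound = math.sqrt(math.pi * r / lnN) * math.exp(lnN / (4 * r))
    return integral, bound               # expect integral <= bound
```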