1 Introduction

The Hilbert class polynomial \(H_D[j]\) of the imaginary quadratic order \({\mathcal {O}}\) of discriminant D is the minimal polynomial of the j-invariant of an elliptic curve with endomorphism ring \({\mathcal {O}}\). It is a defining polynomial of the ring class field of \({\mathcal {O}}\) and can be used for constructing elliptic curves over a finite field with a given number of points. Its coefficients are however rather large, which limits its practical usefulness. Already in 1908, Weber [37] therefore introduced alternative class invariants to be used instead of j, which resulted in class polynomials with coefficients that have roughly 1/72 of the digits of the coefficients of the Hilbert class polynomial for certain discriminants.

There has been continued interest in alternative class invariants ever since (e.g. [2, 4, 8, 9, 10, 11, 12, 14, 17, 18, 29, 30]). None however matched, let alone surpassed, the factor 72 of Weber’s functions. Moreover, Bröker and Stevenhagen [4] showed that no class invariant will ever do better than a factor 100.83. Under Selberg’s eigenvalue conjecture [31, Conjecture 1], this bound reduces to 96.

We introduce generalized (multivariate) class polynomials, define an appropriate notion of their reduction factor, and show that this notion indeed gives a measure of their “size” compared to the Hilbert class polynomial (Sect. 3). Contrary to classical class polynomials, the reduction factors of generalized class polynomials are not limited by the Bröker–Stevenhagen bound.

We give a family of generalized class polynomials for which we prove that the reduction factor matches Weber’s 72 for a large range of values of D, including infinitely many values of D where no reduction of 36 or better was previously known (Sect. 4). We also give an example that possibly achieves the factor 120 (Remark 7.6).

Though the focus of this paper is on introducing the generalized class invariants and studying their height, we also give a preliminary analysis indicating that the height reduction leads to a speed-up in their computation (Sect. 6), and we show how to use them for constructing elliptic curves over finite fields (Sect. 5).

2 Generalized class polynomials

Definition 2.1

By a modular curve over \(\textbf{Q}\) we mean a smooth, projective, geometrically irreducible curve \(C\) over \(\textbf{Q}\) together with a map \(\psi : \textbf{H}\rightarrow C(\textbf{C})\) from the upper half plane \(\textbf{H}\subset \textbf{C}\) with the following property. There exists a positive integer N such that for every function \(f\in \textbf{Q}(C)\), the function \(f\circ \psi \) is a modular function for \(\Gamma (N)\) with all q-expansion coefficients in \(\textbf{Q}^{\textrm{ab}}\).

We identify f with \(f\circ \psi \) and we identify \(\psi \) with the induced morphism of curves \(X(N)\rightarrow C\).

For an order \(\mathcal {O}\) in an imaginary quadratic number field K, we denote by \(K_{\mathcal {O}}\) the associated ring class field. Let f be a modular function and \(\tau \in \textbf{H}\) imaginary quadratic, say a root of \(aX^2+bX+c\) for coprime integers a, b, c. The pair \((f,\tau )\) is called a class invariant for the imaginary quadratic order \(\mathcal {O}=\textbf{Z}[a\tau ]\) if \(f(\tau )\) lies in the ring class field \(K_{\mathcal {O}}\). The discriminant D of the class invariant is the discriminant of \(\mathcal {O}\). The Galois group G of \(K(f(\tau ))/K\) is isomorphic via the Artin map to a quotient of the Picard group \(\textrm{Cl}(\mathcal {O})\). Associated to a class invariant is its minimal polynomial over K, also known as the class polynomial,

$$\begin{aligned} H_\tau [f]:= \prod _{\sigma \in G}\big (X-\sigma (f(\tau ))\big )\quad \in K[X]. \end{aligned}$$

Under additional restrictions, class polynomials can sometimes be shown to have coefficients in \(\textbf{Q}\) (cf. [9, Thm. 4.4], [13, Thm. 5.4]); in that case we call the class polynomials real. Oftentimes, a modular function admits class invariants for an infinite family of discriminants, determined by a certain congruence condition ([30], [9, Thm. 4.3]). Sometimes the discriminant uniquely determines the class polynomial for a given modular function.

Example 2.2

The modular j-function admits a unique class polynomial for any discriminant \(D<0\), called the Hilbert class polynomial \(H_D[j]:=H_\tau [j]\). It can be seen as a function on \(\textbf{P}^1\) whose zeros are the j-invariants of elliptic curves with CM by the imaginary quadratic order of discriminant D and whose poles are restricted to the point at infinity.

We propose a generalization of class polynomials, seen as functions on modular curves of higher genus, for which the classical class polynomials can be viewed as the genus zero case. We will mostly restrict ourselves to the case of genus one, as this will make notation considerably less complicated. We discuss the arbitrary genus case in Sect. 7. Let C be a modular curve over \(\textbf{Q}\) with a smooth Weierstrass model \(y^2+a_1xy+a_3y=x^3+a_2x^2+a_4x+a_6\), and suppose that \((x,\tau ),(y,\tau )\) are class invariants for some imaginary quadratic \(\tau \in \textbf{H}\). Consider \(G = \textrm{Gal}(K(x(\tau ),y(\tau ))/K)\) and \(m=\#G\). If we denote by \(\mathcal {D}\) the divisor of the unique point at infinity of C, then \(\mathcal {L}(\infty \mathcal {D})\) has a basis \(b_0=1,b_1=x,b_2=y,b_3=x^2,b_4=xy,b_5=x^3,b_6=x^2y,\ldots \) (ordered by ascending degree). There exist \(a_i\in K\), not all zero, such that

$$\begin{aligned} \sum _{i=0}^m a_i b_i(\tau )=0. \end{aligned}$$
(2.3)

In fact, up to scaling by an element of \(K^{\times }\), there exists a unique function \(F_{\tau }[C]=\sum _{i=0}^m a_ib_i\in K(C)\) such that

$$\begin{aligned} \textrm{div}F_{\tau }[C] = \left[ \sum _{\sigma \in G} \left( \sigma (\psi (\tau ))\right) \right] +\left( -\sum _{\sigma \in G}\sigma (\psi (\tau ))\right) -(m+1)\mathcal {D}. \end{aligned}$$
(2.4)

Definition 2.5

We call \(F_{\tau }[C]\) as in (2.4) a generalized class function for \(\tau \). The associated generalized class polynomial is the unique \(H_{\tau }[C]\in K[X,Y]\) of degree \(\le 1\) in Y such that \(H_{\tau }[C](x,y) = F_{\tau }[C]\).

We note that the polynomial \(H_{\tau }[C]\) depends on the choice of x and y, but we leave this out of the notation. In Sect. 7 (and in particular Definition 7.3) we will allow more general divisors \(\mathcal {D}\) and bases \(\mathcal {B}\), leading to more general functions \(F_{\tau }[C,\mathcal {B}]\) and polynomials \(H_{\tau }[C,\mathcal {B}]\).

Definition 2.6

We call the point \(P = \sum _{\sigma \in G}\sigma (\psi (\tau ))\in C(K)\) the Heegner point of the class function F.

If the Heegner point P is the point at infinity, then \(a_m=0\). Otherwise, the point \(-P\) is a zero of F. In particular, if \(P=-(0,0)\), then \(a_0=0\).

For \(N\in \textbf{Z}_{>0}\), we denote by \(X^0(N)\) the smooth, projective, geometrically irreducible curve over \(\textbf{Q}\) with function field consisting of the modular functions for the modular group \(\Gamma ^0(N)=\left\{ \left( \begin{matrix} a & b\\ c & d \end{matrix}\right) \in {{\,\textrm{SL}\,}}_2(\textbf{Z})\mid b\equiv 0 \pmod {N}\right\} \) that have rational q-expansion. We denote by \(X^0_+(N)\) the quotient of \(X^0(N)\) by the Fricke-Atkin-Lehner involution \(z\mapsto -N/z\), and write \(\eta (z)\) for the Dedekind \(\eta \)-function

$$\begin{aligned} \eta (z) = q^{1/24}\prod _{n=1}^{\infty }(1-q^n), \quad \text{ where }\quad q = \exp (2\pi i z). \end{aligned}$$

Example 2.7

Consider the genus one modular curve \(C:=X^0_+(119)\). Its conductor as an elliptic curve is 17 (Cremona label 17a4; see Footnote 1). A Weierstrass model for C is given by (see Footnote 2)

$$\begin{aligned} y^2+3xy-y=x^3-3x^2+x, \end{aligned}$$
(2.8)

where \(x,y\in \textbf{Q}(C)\) have respective q-expansions

$$\begin{aligned} x= & {} q^{-2} + q^{-1} + 1 + q + 2q^2 + 2q^3 + 3q^4 + 3q^5 + 4q^6 + 5q^7 + \ldots ,\\ y= & {} q^{-3} + 1 + 2q + 2q^2 + 4q^3 + 4q^4 + 7q^5 + 9q^6 + 12q^7 +\ldots ,\\{} & {} \text{ where } \text{ this } \text{ time } q = \exp (2\pi i z/119)\text{. } \end{aligned}$$

The “double eta quotient” \({\mathfrak {w}}_{7,17}\) given by

$$\begin{aligned} {\mathfrak {w}}_{7,17}(z) =\frac{\eta (z/7)\eta (z/17)}{\eta (z)\eta (z/119)} \end{aligned}$$
(2.9)

is invariant under the action of \(\Gamma ^0(119)\) [26, Thm. 1] and the Fricke-Atkin-Lehner involution [11, Thm. 2], hence also forms an element of the (rational) function field of C. It is related to x and y by

$$\begin{aligned} {\mathfrak {w}}_{7,17}=-y+x^2-x. \end{aligned}$$
(2.10)

The curve \(X^0_+(119)\) has two cusps, and they are both rational. In the given Weierstrass model, these correspond to the point (0, 0) and the point at infinity. Numerical examples of generalized class polynomials specifically for \(X^0_+(119)\) are given in Sect. 4.5. We will treat this curve as our main test case in the rest of the paper.
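The q-expansions printed in this example allow a quick sanity check of (2.8) and (2.10). The following is a minimal sketch, not part of the original text: it uses only the ten printed coefficients of x (so everything is truncated to ten terms), the product formula for \(\eta \), and helper names (`mul`, `shift`, `eta_quotient`) that are purely illustrative.

```python
N_TERMS = 10  # precision is limited by the ten printed coefficients of x

# x = q^-2 * X(q) and y = q^-3 * Y(q), coefficients copied from Example 2.7,
# where q = exp(2*pi*i*z/119).
X = [1, 1, 1, 1, 2, 2, 3, 3, 4, 5]
Y = [1, 0, 0, 1, 2, 2, 4, 4, 7, 9, 12]

def mul(a, b, n=N_TERMS):
    """Product of two power series (coefficient lists), truncated to n terms."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[: n - i]):
            c[i + j] += ai * bj
    return c

def shift(a, k, n=N_TERMS):
    """Coefficients of q^k * a(q), truncated to n terms."""
    return ([0] * k + list(a) + [0] * n)[:n]

def eta_quotient(n=N_TERMS):
    """First n coefficients of q^4 * w_{7,17}, i.e. of
    prod (1-q^(7k))(1-q^(17k)) / ((1-q^k)(1-q^(119k)))."""
    s = [1] + [0] * (n - 1)
    for k in range(1, n):              # divide by (1 - q^k)
        for i in range(k, n):
            s[i] += s[i - k]
    for m in (7, 17):                  # multiply by (1 - q^(m*k))
        for k in range(m, n, m):
            for i in range(n - 1, k - 1, -1):
                s[i] -= s[i - k]
    for k in range(119, n, 119):       # divide by (1 - q^(119*k)); empty for n = 10
        for i in range(k, n):
            s[i] += s[i - k]
    return s

X2, XY, Y2 = mul(X, X), mul(X, Y), mul(Y, Y)
X3 = mul(X2, X)

# (2.10), multiplied by q^4:  q^4 * w = X^2 - q^2 * X - q * Y
w_from_eta = eta_quotient()
w_from_xy = [a - b - c for a, b, c in zip(X2, shift(X, 2), shift(Y, 1))]

# (2.8), multiplied by q^6:  Y^2 + 3qXY - q^3 Y - X^3 + 3q^2 X^2 - q^4 X = 0
terms = [
    Y2,
    [3 * t for t in shift(XY, 1)],
    [-t for t in shift(Y, 3)],
    [-t for t in X3],
    [3 * t for t in shift(X2, 2)],
    [-t for t in shift(X, 4)],
]
residual = [sum(col) for col in zip(*terms)]

print(w_from_eta == w_from_xy, residual)
```

Both identities hold to the available precision: the two expansions of \({\mathfrak {w}}_{7,17}\) agree from \(q^{-4}\) up to \(q^{5}\), and the Weierstrass residual vanishes from \(q^{-6}\) up to \(q^{3}\).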

3 Estimates and reduction factors

3.1 Reduction factors

We define the reduction factor of a modular curve \(C\) to be

$$\begin{aligned} r(C) = \frac{\deg (j : X(N)\rightarrow \textbf{P}^1)}{\deg (\psi : X(N) \rightarrow C)}. \end{aligned}$$
(3.1)

In the case \(C=\textbf{P}^1\), we denote this number also by \(r(\psi )\) and our notation and terminology coincide with that of [4]. The number \(r(\psi )^{-1}\) is denoted by \({\widehat{c}}(\psi )\) in [8] and by \(c(\psi )\) in [9]. Bröker and Stevenhagen [4, Theorem 4.1] (see Footnote 3) show \(r(\psi )\le 32768/325 \le 100.83\). Under Selberg’s eigenvalue conjecture, one can even prove \(r(\psi )\le 96\). The best known \(\psi \) achieves \(r(\psi ) = 72\). This result does not however apply directly to \(r(C)\). For example, we have

$$\begin{aligned} r(X^0(N)) = N\prod _{p\mid N} (1+\frac{1}{p}) \quad \text{ and }\quad r(X^0_+(N)) = \frac{1}{2} r(X^0(N)) \quad \text{ if }\quad N>1. \end{aligned}$$
(3.2)

Our main example \(C= X^0_+(119)\) therefore achieves \(r(C) = \frac{1}{2}(7+1)(17+1)=72\). For (hyper)elliptic modular curves \(C\) we get \(r(C) \le 201.65\) (or \(r(C) \le 192\) under Selberg’s eigenvalue conjecture), by applying the bounds to the x-function. Surprisingly, all elliptic curve quotients of \(X^0(N)\) we found so far have \(r\le 72\) (Sect. 4.7). In Sect. 7 we will discuss higher-genus curves, which allow for unbounded \(r(C)\).
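Formula (3.2) is easy to evaluate exactly. The sketch below is illustrative only (the function names are not from the paper); it computes \(r(X^0(N))\) and \(r(X^0_+(N))\) with rational arithmetic and compares them with the Bröker–Stevenhagen bound 32768/325 quoted above.

```python
from fractions import Fraction

def prime_divisors(n):
    """Distinct prime divisors of a positive integer n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def r_X0(N):
    """r(X^0(N)) = N * prod_{p | N} (1 + 1/p), as in (3.2)."""
    r = Fraction(N)
    for p in prime_divisors(N):
        r *= Fraction(p + 1, p)
    return r

def r_X0_plus(N):
    """r(X^0_+(N)) = r(X^0(N)) / 2, valid for N > 1 as in (3.2)."""
    if N <= 1:
        raise ValueError("formula (3.2) requires N > 1")
    return r_X0(N) / 2

BS_BOUND = Fraction(32768, 325)  # bound for the genus zero case, from [4]

print(r_X0(119), r_X0_plus(119), float(BS_BOUND))
```

For \(N=119\) this gives 144 and 72, and doubling the genus zero bound gives \(2\cdot 32768/325\approx 201.65\), the figure used for (hyper)elliptic curves.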

Remark 3.3

In the applications we have in mind, the reduction factor is the main source of improvement in computational efficiency. It is important to note, however, that this number \(r(C)\) does not tell the complete story, even in the “classical” setting (\(C\cong \textbf{P}^1\)), for example for the following reasons.

  (1) There are many challenges when computing class polynomials, and even more with generalized class polynomials. See Sect. 6.

  (2) In the CM method (Sect. 5), we will want to find a j-invariant in \(\textbf{F}_p\) from a point in \(C(\textbf{F}_p)\). This is done using the minimal polynomial of the j-function over \(\textbf{Q}(C)\), known as the modular polynomial (Lemma 5.1). This works best if the degree of j over \(\textbf{Q}(C)\) is small. For example, this degree is 1 for \(C = X^0(N)\), is 2 for \(C = X^0_+(N)\), and ranges from 1 to 20 in [9, Table 7.1], making \(X^0_+(119)\) a good choice in this respect.

  (3) If the (generalized) class polynomial is not real, then its coefficients lie in an imaginary quadratic extension of \(\textbf{Q}\); roughly doubling its bit size. This issue can be avoided by imposing additional restrictions on \(C\) or \(\tau \), see Sects. 4.2 and 4.3.

On the other hand, there are two important tricks that may be used in complementary directions, providing computational improvements beyond the reduction factor \(r(C)\):

  (1) Under some constraints, typically when all primes dividing the level of the modular curve ramify in the CM field, both the degree and height of the class polynomial are cut in half. This happens for example in the record-computation of [14] for the Atkin invariant \(A_{71}\) when 71 divides the discriminant, leading to class polynomials that are \(2^2\cdot 36 = 144\) times smaller than the Hilbert class polynomial (note that the reduction factor \(r(A_{71})\) is 36 in this case). The same trick also applies to generalized class polynomials, see Sect. 4.4, which in the case of \(X^0_+(119)\) leads to a factor \(2^2\cdot 72 = 288\) in size reduction.

  (2) When the class number is composite, one can decompose the ring class field into a tower of fields whose defining polynomials have smaller degrees, also leading to a significant speed-up in the CM method [34].

These last two tricks only work when the class number is composite. We expect both of them to work well for generalized class polynomials, so will mainly restrict to the case of prime class number in our examples, as this more clearly illustrates the role of the parameter \(r(C)\).

The goal of the rest of this section is to show under some hypotheses that the reduction factor \(r(C)\) is indeed an asymptotic reduction factor of the size of the polynomials involved. For that, we will first introduce the appropriate notions of “size”.

3.2 Measures of polynomials and heights of their roots

For a polynomial \(A\in \textbf{C}[X]\), let \(|A|_1\) (resp. \(|A|_\infty \)) be the sum (resp. maximum) of the absolute values of the coefficients of A. The Mahler measure of a polynomial \(A = a\prod _{i=1}^{n} (X-\alpha _i)\in \textbf{C}[X]\) is

$$\begin{aligned} {\mathcal {M}}(A) = |a| \prod _{i} \max \{1,|\alpha _i|\}. \end{aligned}$$

Lemma 3.4

We have

$$\begin{aligned} |A|_\infty \quad&\le \quad |A|_1 \quad \le \quad (n+1)|A|_\infty ,\\ {\mathcal {M}}(A)\quad&\le \quad |A|_1 \quad \le \quad 2^n {\mathcal {M}}(A),\\ \left| \log |A|_1 - \log |A|_\infty \right|&\quad \le \quad \log (n+1),\\ \left| \log |A|_\infty - \log ({\mathcal {M}}(A)) \right|&\quad \le \quad n\log (2). \end{aligned}$$

Proof

The first two inequalities are by definition and the third is Equation (6) of [23]. For its converse, observe that we have \(|AB|_1 \le |A|_1|B|_1\), and hence also \( |A|_1 \le |a| \prod _i \max \{2, 2|\alpha _i|\} \le 2^n{\mathcal {M}}(A).\) Then take logarithms. \(\square \)
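As an illustration of Lemma 3.4, the following sketch builds a polynomial from prescribed roots, so that its Mahler measure is known without any root finding, and checks the four inequalities. The example polynomial and the helper names are chosen here for illustration and do not come from the paper.

```python
from fractions import Fraction
from math import log, prod

def poly_from_roots(lead, roots):
    """Ascending coefficients of lead * prod (X - r) over the given roots."""
    coeffs = [Fraction(lead)]
    for r in roots:
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c      # contribution of c * X
            new[i] -= r * c      # contribution of c * (-r)
        coeffs = new
    return coeffs

lead, roots = 2, [Fraction(3), Fraction(-2), Fraction(1, 2)]
A = poly_from_roots(lead, roots)          # 2X^3 - 3X^2 - 11X + 6
n = len(A) - 1

norm_1 = sum(abs(c) for c in A)           # |A|_1
norm_inf = max(abs(c) for c in A)         # |A|_infinity
mahler = abs(lead) * prod(max(1, abs(r)) for r in roots)

# The four statements of Lemma 3.4 for this A.
checks = [
    norm_inf <= norm_1 <= (n + 1) * norm_inf,
    mahler <= norm_1 <= 2**n * mahler,
    abs(log(norm_1) - log(norm_inf)) <= log(n + 1),
    abs(log(norm_inf) - log(mahler)) <= n * log(2),
]
print(A, norm_1, norm_inf, mahler, checks)
```

Here \(|A|_1 = 22\), \(|A|_\infty = 11\) and \({\mathcal {M}}(A) = 12\), so all four inequalities hold with room to spare.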

For an element \(\alpha \) in a number field L of degree n, we define its (absolute logarithmic) height to be

$$\begin{aligned} h(\alpha ) = \frac{1}{n}\sum _{v} \max \{0,\log |\alpha |_v\}, \end{aligned}$$

where the sum ranges over the Archimedean and non-Archimedean absolute values, suitably normalized (that is, those denoted \(||\cdot ||_v\) in [19, §B.1]). If \(\alpha \) is a root of an irreducible \(A\in \textbf{Z}[X]\) of degree n, then we have

$$\begin{aligned} \log ({\mathcal {M}}(A)) = n h(\alpha ). \end{aligned}$$
(3.5)

Remark 3.6

Another measure for the complexity of A would be its total bit size, or the sum s of the logarithms of the absolute values of the nonzero coefficients. We will instead focus on \(|A|_\infty \) for the following reasons.

First of all, for computational purposes, it is more useful to look at \(p = \deg (A)\cdot \log |A|_\infty \), as the required precision (or number of primes with the CRT approach) is proportional to \(\log |A|_\infty \) and the number of computations to do with that precision is proportional to \(\deg (A)\).

Secondly, we get the impression from numerical computations that s is close to p. For example, the value of s/p is spread out over the interval (0.75, 0.9) for the larger discriminants in both Sect. 4.5 and Example 7.4.

Finally, it is hard to prove lower bounds on s other than \(s\ge \log |A|_\infty \), as it seems to already be hard to show that a sufficient proportion of coefficients is nonzero.

3.3 Proof of the height reduction

Theorem 3.7

Let C be a modular curve over \(\textbf{Q}\) and suppose that C is an elliptic curve of rank 0 with Weierstrass coordinates x and y. Suppose that \(\tau \in \textbf{H}\) ranges over a sequence of imaginary quadratic points for which C yields real generalized class polynomials \(H_\tau [C]\), and with

$$\begin{aligned} \frac{h(j(\tau ))}{\log (\log (\#\textrm{Cl}({\mathcal {O}})))}\rightarrow \infty . \end{aligned}$$
(3.8)

Scale each \(H_\tau [C]\) such that it has coprime coefficients in \(\textbf{Z}\). Then

$$\begin{aligned} d\cdot \frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [j]|_\infty }\rightarrow \frac{1}{r(C)}, \end{aligned}$$

where d is the degree of \(K_{\mathcal {O}}\) over \(K(\psi (\tau ))\).

Remark 3.9

We argue that the hypothesis (3.8) is very reasonable. Under GRH, we have

$$\begin{aligned} \#\textrm{Cl}(\mathcal {O}) = O(\sqrt{|D|}\log (\log |D|)), \end{aligned}$$
(3.10)

where D is the discriminant of \(\mathcal {O}\) (see [22, 9.Theorem 1 and 11. on page 371], suitably extended to arbitrary D).

Moreover, [8, §6.2] gives the approximation \(\log |H_\tau [j]|_\infty \approx \pi \sqrt{|D|} S(D)\), with \(S(D) = \sum _Q a^{-1}\), where the sum ranges over reduced primitive quadratic forms \(Q = ax^2+bxz+cz^2\) of discriminant D. We now give a heuristic lower bound of this sum on average over all \(|D|\le X\). We have \(\sum _D S(D) \approx \sum _{Q} a^{-1}\), where this time the sum is taken over all reduced quadratic forms of negative discriminant \(>-X\) (using the heuristic that imprimitive forms have a negligible contribution). As we are only computing a lower bound, we may restrict to \(a \le \sqrt{X/8}\). Then b ranges from \(-a\) to a, and c ranges from a or \(a+1\) to \(\lfloor (X+b^2)/(4a)\rfloor \); a range that contains at least \(\lfloor X/(8a)\rfloor \) integers. This yields at least roughly X/4 values of b and c for each a, hence \(\sum _D S(D)\) is roughly at least \((X/4)\sum _{a^2\le X/8} a^{-1} \ge \frac{1}{8} X\log (X)\).

It follows that the average S(D) is at least proportional to \(\log |D|\). Thus, for “average” S(D), we have that \(\log |H_\tau [j]|_\infty \) is at least proportional to \(\sqrt{|D|}\log |D|\). Combined with (3.10), (3.5), and Lemma 3.4, we find for such D that \(h(j(\tau ))/\log (\log (\#\textrm{Cl}(\mathcal {O})))\) is at least proportional to \(\log |D| / (\log (\log |D|))^2\). We thus see that (3.8) indeed holds for “average” S(D).
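The quantities in this remark can be computed directly for small discriminants. The sketch below is illustrative and not taken from the paper: it enumerates the reduced primitive forms of a given discriminant using the standard reduction conditions \(|b|\le a\le c\) (with \(b\ge 0\) when \(|b|=a\) or \(a=c\)), and evaluates \(S(D)\) together with the approximation \(\pi \sqrt{|D|}\,S(D)\) for \(\log |H_\tau [j]|_\infty \).

```python
from fractions import Fraction
from math import gcd, pi, sqrt

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # reduced forms have a <= sqrt(|D|/3)
        for b in range(-a + 1, a + 1):     # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:     # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms

def S(D):
    """S(D) = sum of 1/a over the reduced primitive forms of discriminant D."""
    return sum(Fraction(1, a) for a, _, _ in reduced_forms(D))

def log_height_estimate(D):
    """The approximation pi * sqrt(|D|) * S(D) quoted from [8, Sect. 6.2]."""
    return pi * sqrt(-D) * float(S(D))

for D in (-23, -47, -163):
    print(D, len(reduced_forms(D)), S(D), round(log_height_estimate(D), 2))
```

For \(D=-23\) there are three reduced forms and \(S(D)=2\), so the estimate is \(2\pi \sqrt{23}\approx 30.1\).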

Theorem 3.7 is the analogue of the following result.

Theorem 3.11

(cf. Enge-Morain [8]) Let f be a modular function and suppose that \(\tau \in \textbf{H}\) ranges over a sequence of imaginary quadratic points for which \((f,\tau )\) is a class invariant with \(h(j(\tau ))\rightarrow \infty \). Then \(d\cdot \frac{\log |H_\tau [f]|_\infty }{\log |H_\tau [j]|_\infty }\rightarrow \frac{1}{r(f)}\), where d is the degree of \(K_{\mathcal {O}}\) over \(K(f(\tau ))\).

The goal of the remainder of Sect. 3 is to prove Theorem 3.7. We start with a proof of Theorem 3.11.

Proof

Let m be the degree of \(K(f(\tau ))\) over K and let \(n = dm\) be the degree of \(K_{\mathcal {O}}\) over K. By Lemma 3.4 and (3.5), we get \(|\frac{1}{n}\log |H_\tau [j]|_\infty - h(j(\tau ))|\le \log (2)\) and \(|\frac{d}{n}\log |H_\tau [f]|_\infty - h(f(\tau ))| \le \log (2)\).

As \(h(j(\tau ))\rightarrow \infty \), we also get

$$\begin{aligned} \frac{h(f(\tau ))}{h(j(\tau ))} \rightarrow \frac{1}{r(f)} \end{aligned}$$
(3.12)

by [19, Proposition B.3.5(b)]. Altogether, this gives the result. \(\square \)

Proposition 3.13

Let C be a modular curve over \(\textbf{Q}\) and suppose that C is an elliptic curve of rank 0 with Weierstrass coordinates x and y. For every imaginary quadratic \(\tau \in \textbf{H}\) for which C yields a real generalized class polynomial \(H_\tau [C]\), let m be the degree of \(K(\psi (\tau ))\) over K and let \({d'}\in \{1,2\}\) be the degree of \(K(\psi (\tau ))/K(x(\tau ))\). Scale each \(H_\tau [C]\) such that it has coprime coefficients in \(\textbf{Z}\). Then we have

$$\begin{aligned} \left| \log |H_\tau [C]|_\infty - \frac{{d'}}{2}\log |H_\tau [x]|_\infty \right| < B\max \{1,m\log (\log (m))\}, \end{aligned}$$

for some constant B that only depends on C and the choice of Weierstrass model.

Proof

We first put the equation for C in a nice form. We have \(C : y^2 + g(x)y = f(x)\). Without loss of generality we have \(g=0\) and \(f\in \textbf{Z}[X]\) monic of odd degree such that \(f(z)\le -1\) for all real \(z\le 0\). Indeed, we obtain \(g=0\) by the substitution \(y' = y+\frac{1}{2}g(x)\), then do scalings \(x' = vx\) and \(y'=wy\) to make f integral and (thanks to its odd degree) monic, and then do a substitution \(x' = x+c\) to make \(f(z)\le -1\) for all \(z\le 0\). This affects \(H_\tau [C]=A+BY\) and \(H_\tau [x]\) as follows. The first substitution changes A into \(A+\frac{1}{2} g(X)B\), the second changes A into A(vX) and B into wB(vX), and the third changes A into \(A(X+c)\). Each of these substitutions changes \(\log (\max \{|A|_1,|B|_1\})\) by at most O(m), as does clearing the denominators afterwards.

Next, we relate a norm of \(H_\tau [C]\) to \(H_\tau [x]\). The extra elliptic curve point \((a/b^2,c/b^3) := -\sum _{\sigma \in G} \sigma (\psi (\tau ))\in C(\textbf{Q})\) from (2.4) (which is minus the Heegner point) is torsion by our assumption that \(C\) has rank 0. There are finitely many torsion points in \(C(\textbf{Q})\), hence finitely many possibilities for the polynomial \(T = b^2X-a\). Writing \(H_\tau [C]= A(X) + B(X)Y\), we get that \(N(H_\tau [C]) = A(X)^2 + (-f(X))B(X)^2\) has the same divisor as the primitive polynomial \(H_\tau [x]^{d'}\cdot T\), hence there is a constant \(s\in \textbf{Z}\setminus \{0\}\) with \(N(H_\tau [C]) = s H_\tau [x]^{d'}\cdot T\).

We claim that \(s = \pm 1\). If not, take a prime \(p\mid s\) and consider the highest-weight term of \((H_\tau [C]\bmod p)\), where X has weight 2 and Y has weight \(\deg (f)\). This gives rise to the highest-degree term of \((N(H_\tau [C])\bmod p)\), which is therefore nonzero, a contradiction.

Now we use interpolation to bound \(H_\tau [C]\) in terms of \(H_\tau [x]\). We will choose interpolation points \(z= g(i) \le 0\). Note that for \(z\le 0\) we have

$$\begin{aligned} A(z)^2, B(z)^2 \le A(z)^2 + (- f(z))B(z)^2 = N(H_\tau [C])(z) \le \max \{1,|z|\}^m |H_\tau [x]|_1^{d'} |T|_1, \end{aligned}$$

and since there are finitely many polynomials T, we get

$$\begin{aligned} \log |A(z)|, \log |B(z)| \le \frac{m}{2} \max \{0,\log |z|\} + \frac{{d'}}{2} \log |H_\tau [x]|_1 + O(1). \end{aligned}$$

Interpolation then gives, for \(P \in \{A,B\}\):

$$\begin{aligned} P(X) = \sum _{i=1}^{k} P(g(i)) \prod _{j\not =i} \frac{X-g(j)}{g(i)-g(j)}, \end{aligned}$$
(3.14)

where \(k = \deg (P)+1 = O(m)\).

Taking \(g(u) = -\log (eu)^2\), we find \(|g(i)-g(j)| \ge |i-j|\min _{u\in [1,k]} |g'(u)| = |i-j|\min _{u\in [1,k]} 2\frac{\log (eu)}{u} = 2|i-j|\frac{\log (ek)}{k}\), where the minimum is attained at \(u=k\) because \(\log (eu)/u\) is decreasing on \([1,k]\). So for each i there are at most \(k/\log (k)\) values of \(j\not =i\) with \(|g(i)-g(j)| < 1\) and each of them has \(|g(i)-g(j)| \ge 1/k\). We get

$$\begin{aligned} \log \prod _{j\not =i} \frac{1}{|g(i)-g(j)|} \le (k/\log (k)) \log (k) = k = O(m). \end{aligned}$$

For the other factors in (3.14), we have \(\log |X-g(j)|_1 \le \log (1 + \log (em)^2) = O(\log (\log (m)))\), so \(\log \prod _{j} |X-g(j)|_1 = O(m\log (\log (m)))\), as well as \(\log |P(g(i))| \le \frac{{d'}}{2}\log |H_\tau [x]|_1 + O(m\log (\log (m)))\). Taking the sum in (3.14) gives another \(+\log (k)\), so that the end result is \(\log |P(X)|_1 \le \frac{{d'}}{2}\log |H_\tau [x]|_1 + O(m\log (\log ( m)))\). By Lemma 3.4, this also holds with \(|\cdot |_\infty \), which proves the upper bound on \(\log |H_\tau [C]|_\infty \).

For the lower bound, note that \(H_\tau [x]^{d'}\) is a factor of \(Q =A^2-f(X)\cdot B^2\), and we have \(|Q|_1 \le |A|_1^2 + |f|_1 |B|_1^2 \le |f|_1(m+1)^2|H_\tau [C]|_\infty ^2\). Using the fact that \({\mathcal {M}}\) is multiplicative by definition and is related to \(|\cdot |_1\) and \(|\cdot |_\infty \) by Lemma 3.4, we get exactly what we need: \({d'}\log |H_\tau [x]|_\infty \le {d'}\log {\mathcal {M}}(H_\tau [x]) + O(m) \le \log {\mathcal {M}}(Q) + O(m) \le \log |Q|_1 + O(m) \le 2\log (|H_\tau [C]|_\infty ) + O(m)\). \(\square \)

Proof of Theorem 3.7

Denote again by \(n=\#\textrm{Cl}(\mathcal {O})\) the degree of \(K_{\mathcal {O}}\) over K. First we apply Theorem 3.11 to x and get \(d{d'}\frac{\log |H_\tau [x]|_\infty }{\log |H_\tau [j]|_\infty }\rightarrow \frac{2}{r(C)}\). Proposition 3.13, together with the hypothesis \(h(j(\tau ))/\log (\log (n))\rightarrow \infty \) from (3.8), gives \(\frac{1}{{d'}}\frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [x]|_\infty }\rightarrow \frac{1}{2}\) (as in the proof of Theorem 3.11). The product of these two limits gives the result. \(\square \)
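Spelled out, the final step of this proof is the following factorization of the quotient in Theorem 3.7; it uses only the two limits stated in the proof and introduces nothing new:

$$\begin{aligned} d\cdot \frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [j]|_\infty } = \left( d{d'}\,\frac{\log |H_\tau [x]|_\infty }{\log |H_\tau [j]|_\infty }\right) \cdot \left( \frac{1}{{d'}}\,\frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [x]|_\infty }\right) \rightarrow \frac{2}{r(C)}\cdot \frac{1}{2} = \frac{1}{r(C)}. \end{aligned}$$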

Remark 3.15

Theorem 3.7 states that asymptotically the effect of the choice of a model of the curve \(C\) is negligible, as is the effect of replacing f by 2f or \(f+1\) or any other element of \(\textbf{Q}(f)\) in Theorem 3.11.

However, in practice the error terms can be quite large and depend on these choices. For example, if f is integral over \(\textbf{Z}[j]\) then \(H_\tau [f]\) is monic, and if \(f^{-1}\) is integral over \(\textbf{Z}[j]\), then \(H_\tau [f]\) has constant coefficient \(\pm 1\). This can make a difference in practical examples as it forces the coefficients at the beginning and end to be small, though this improvement is negligible asymptotically by the theorems. See also Remark 3.6.

4 Class invariants for \(X^0(N)\) and \(X^0_+(N)\)

In this section we assume that C is a quotient over \(\textbf{Q}\) of \(X^0(N)\); in other words, C is a smooth, projective, geometrically irreducible curve over \(\textbf{Q}\) with function field consisting only of modular functions for \(\Gamma ^0(N)\) that have rational q-expansion. We will show how to obtain generalized class functions for every discriminant \(D<0\) that is square modulo 4N (Sect. 4.1).

In some cases we get further reductions from class invariants generating subfields of \(K_{\mathcal {O}}\) or from real class polynomials (Sects. 4.2–4.4).

In Sects. 4.5–4.6 we study what this means for \(X^0_+(119)\) and in Sect. 4.7 we look for more examples of elliptic curve quotients of \(X^0(N)\).

4.1 Class invariants for \(X^0(N)\)

The following result does not require C to be an elliptic curve, except that (unless C is an elliptic curve) one needs to read the definitions in Sect. 7 for the parts about generalized class polynomials.

Proposition 4.1

(based on Schertz [30]) Let \(C = (C, \psi )\) be a quotient over \(\textbf{Q}\) of \(X^0(N)\) and let \(D<0\) be a square modulo 4N.

There exist \(a,b,c\in \textbf{Z}\) with \(a,c>0\), \(b^2-4ac = D\), \(N\mid c\), and \(\gcd (a,N) = \gcd (a,b,c) = 1\). Choose such a, b, c, and let \(\tau \in \textbf{H}\) be a root of \(aX^2 + bX + c\), with order \(\mathcal {O} = \textbf{Z}[a\tau ]\), which has discriminant D. Then we have

$$\begin{aligned} \psi (\tau )\in C(K_{{\mathcal {O}}}), \end{aligned}$$

thus giving rise to a generalized class polynomial \(H_{\tau }[C]\).

The Galois orbit of \(\psi (\tau )\) can be computed as follows. There exists an N-system, that is, there exist \(\tau _1,\ldots , \tau _n\in \textbf{H}\) such that \((\tau _i\textbf{Z}+\textbf{Z})_i\) is a system of representatives of \(\textrm{Cl}(\mathcal {O})\) and such that \(\tau _i\) is a root of \(a_iX^2 + b_iX + c_i\) with \(\gcd (a_i,N) = \gcd (a_i,b_i,c_i)=1\) and \(b_i\equiv b\ \textrm{mod}\ 2N\). Moreover, for any such choice, we have

$$\begin{aligned} \textrm{Gal}(K_{\mathcal {O}}/K)\cdot \psi (\tau ) = \{\psi (\tau _i) : i=1,\ldots ,n\}. \end{aligned}$$

Proof

For the existence of a, b, c, take an arbitrary square root b of D modulo 4N, let \(a=1\), and \(c= (b^2-D)/4\). Then the existence of an N-system is [30, Proposition 3].

For any \(f\in \textbf{Q}(C)\), Theorem 4 of Schertz [30] states \(f(\tau ) \in K_{\mathcal {O}}\cup \{\infty \}\) and gives the \(\textrm{Gal}(K_{\mathcal {O}}/K)\)-orbit as \(\{f(\tau _i) : i\}\), under an additional condition on the function f(1/z). However, the condition on f(1/z) is not needed, as stated in Theorems 3.9 and 4.4 of [13]. This proves the result. \(\square \)
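The existence step in this proof is constructive. As a minimal sketch (the function name `find_abc` is illustrative, and the search is a plain brute force over residues of b modulo 2N, which determine \(b^2\) modulo 4N), one can find a triple with \(a=1\) as follows.

```python
from math import gcd

def find_abc(D, N):
    """Return (a, b, c) with a = 1, b^2 - 4ac = D and N | c, taking the least
    b >= 0 that is a square root of D modulo 4N; return None if D is not a
    square modulo 4N."""
    for b in range(2 * N):          # b^2 mod 4N only depends on b mod 2N
        if (b * b - D) % (4 * N) == 0:
            return 1, b, (b * b - D) // 4
    return None

# Example: D = -19 is a square modulo 4 * 119 = 476, while D = -3 is not.
triple = find_abc(-19, 119)
print(triple, find_abc(-3, 119))

if triple is not None:
    a, b, c = triple
    # The conditions required in Proposition 4.1.
    assert b * b - 4 * a * c == -19 and c % 119 == 0
    assert gcd(a, 119) == 1 and gcd(gcd(a, b), c) == 1
```

For \(D=-19\) and \(N=119\) this returns \((1, 95, 2261)\), with \(c = 119\cdot 19\). Constructing a full N-system as in [30, Proposition 3] is not attempted here.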

4.2 Real class polynomials from ramification

There are some situations in which we can actually get real class polynomials, cutting the total required bit size in half. The first such situation is when all primes dividing N ramify.

Proposition 4.2

(based on Enge-Morain [9]) Let \(C = (C, \psi )\) be a quotient over \(\textbf{Q}\) of \(X^0(N)\) and let \(D<0\) be a discriminant divisible by N if N is odd and by 4N if N is even.

There exist \(a,b,c\in \textbf{Z}\) with \(a,c>0\), \(N\mid b,c\), \(\gcd (a,N)=1\), and \(b^2-4ac = D\). Choose such a, b, c, and let \(\tau \in \textbf{H}\) be a root of \(aX^2 + bX + c\), with order \(\mathcal {O} = \textbf{Z}[a\tau ]\), which has discriminant D.

Then the \(\textrm{Gal}(K_{\mathcal {O}}/K)\)-orbit of \(\psi (\tau )\) is stable under complex conjugation, and hence we may take \(H_{\tau }[C]\in \textbf{Q}[X,Y]\).

Proof

If D is odd, take \(b=N\), and if D is even, take \(b=0\). If N is even, then we find \(4N\mid b^2-D\). If N is odd, then we find both \(4\mid b^2-D\) and \(N\mid b^2-D\), hence also \(4N\mid b^2-D\). Let \(a = 1\) and \(c = (b^2-D)/4\).

The complex conjugate of \(\psi (\tau )\) is \(\psi (-{\overline{\tau }})\) by the fact that the q-expansion coefficients are real. Here \(-{\overline{\tau }}\) is a root of \(aX^2 - bX + c\), and as \(N\mid b\), we can choose the N-system in Proposition 4.1 in such a way that \(-{\overline{\tau }} =\tau _i\) for some i. This proves the result. \(\square \)

4.3 Real class polynomials from \(X^0_+(N)\)

The second situation in which we get real class polynomials is when working with quotients of \(X^0_+(N)\).

Proposition 4.3

(based on Theorem 3.4 of Enge-Schertz [10]) In the situation of Proposition 4.1, suppose furthermore that C is a quotient of \(X^0_+(N)\), and that \(\gcd (c/N,N) = 1\).

Then the \(\textrm{Gal}(K_{\mathcal {O}}/K)\)-orbit of \(\psi (\tau )\) is stable under complex conjugation, and hence we may take \(H_{\tau }[C]\in \textbf{Q}[X,Y]\).

Proof

The complex conjugate of \(\psi (\tau )\) is \(\psi (-{\overline{\tau }})\) by the fact that the q-expansion coefficients are real. As \(\psi \) is invariant under the Fricke-Atkin-Lehner involution, this in turn is \(\psi (\tau ')\) with \(\tau ' = N/{\overline{\tau }}\), a root of \((c/N)X^2 + bX + Na\). As c/N is coprime to N, we can choose the N-system in Proposition 4.1 in such a way that \(\tau ' =\tau _i\) for some i. This proves the result. \(\square \)
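The quadratic equation for \(\tau '\) used in this proof can be checked in one line. Since a, b, c are real, \({\overline{\tau }}\) is also a root of \(aX^2+bX+c\); substituting \({\overline{\tau }} = N/\tau '\) and multiplying by \(\tau '^2/N\) gives

$$\begin{aligned} a\left( \frac{N}{\tau '}\right) ^2 + b\,\frac{N}{\tau '} + c = 0 \quad \Longrightarrow \quad \frac{c}{N}\,\tau '^2 + b\,\tau ' + Na = 0, \end{aligned}$$

and \(\tau ' = N\tau /|\tau |^2\) again lies in \(\textbf{H}\).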

To use this result, we will need \(\gcd (c/N,N)=1\), which can be achieved most of the time, as follows.

Lemma 4.4

If D is a square modulo 4N and \(D = F^2D_0\) for a negative fundamental discriminant \(D_0\) and a positive integer F coprime to N, then there exist a, b, c as in Proposition 4.1 with \(\gcd (c/N,N) = 1\).

More generally, let \(D<0\) be a square modulo 4N. Then there exist a, b, c as in Proposition 4.1 with \(\gcd (c/N,N) = 1\) if and only if none of the following holds.

  (1) there exists a prime \(p\mid N\) with \({{\,\textrm{ord}\,}}_p(N)\) odd and \({{\,\textrm{ord}\,}}_p(D) > {{\,\textrm{ord}\,}}_p(4N)\),

  (2) \(m:={{\,\textrm{ord}\,}}_2(N) > 0\) and D is of the form \(2^{m+1}d\) with \(d\equiv 1\ (\textrm{mod}\ 4)\),

  (3) \(m:={{\,\textrm{ord}\,}}_2(N) > 0\) and D is of the form \(2^{m}d\) with \(d\equiv 1\ (\textrm{mod}\ 8)\).

Proof

The triple (a, b, c) exists if and only if there exists \(b\in \textbf{Z}\) such that for all \(p\mid N\): \({{\,\textrm{ord}\,}}_p(b^2-D) = {{\,\textrm{ord}\,}}_p(4N)\).

Suppose that we are not in case (1), (2), or (3). By the Chinese remainder theorem, it suffices to find one \(b\in \textbf{Z}\) for each \(p\mid N\). So let \(p\mid N\) be prime and let \(k = {{\,\textrm{ord}\,}}_p(4N)\) and \(l = {{\,\textrm{ord}\,}}_p(D)\). If \(k < l\), then as we are not in case (1), we find that k is even, and we can take \(b = p^{(k/2)}\). If \(k=l\), then we can take \(b = p^e\) with \(e>k/2\). Now the case \(k>l\) remains. As D is a square modulo 4N, there exists \(b_0\in \textbf{Z}\) such that \(D\equiv b_0^2\ (\textrm{mod}\ 4N)\). If \({{\,\textrm{ord}\,}}_p(b_0^2-D) = {{\,\textrm{ord}\,}}_p(4N)\), then we are done, so suppose \({{\,\textrm{ord}\,}}_p(b_0^2-D)>k\).

Note that \(2{{\,\textrm{ord}\,}}_p(b_0) = l\), hence l is even. Let \(b = b_0 + p^{e}\) with e to be determined later. We get \(b^2-D = (b_0^2-D) + 2p^{e}b_0 + p^{2e}\), and the terms have valuation \(>k\), \(e+(l/2)+{{\,\textrm{ord}\,}}_p(2)\), 2e respectively.

If \(p\not =2\), then we choose \(e=k-(l/2)\), so \(2e = k+(k-l) > k\), hence \({{\,\textrm{ord}\,}}_p(b^2-D) = k\). If \(p=2\) and \(k > l+2\), then we choose \(e=k-(l/2)-1\), so \(2e = k+(k-l-2)>k\), hence \({{\,\textrm{ord}\,}}_p(b^2-D) = k\).

Now only the case \(p=2\) with \(k-l\in \{1,2\}\) remains. Write \(d = 2^{-l}D\) and \(b_1 = 2^{-(l/2)} b_0\), so \(b_1\) is odd and \(b_1^2-d\) is divisible by \(2^{k-l}\).

In the case \(k-l = 1\), we get \(b_1^2 - d \equiv 0\pmod {2}\), and we claim that this is nonzero modulo 4. Indeed, \(b_1^2\) is 1 modulo 4 and d is not (as we are not in case (2)). Therefore \({{\,\textrm{ord}\,}}_2(b_1^2-d) = 1\) and \({{\,\textrm{ord}\,}}_2(b_0^2-D) = 1+l=k\), so we take \(b = b_0\).

In the case \(k-l = 2\), we get \(b_1^2 - d\equiv 0\pmod {4}\), and we claim that this is nonzero modulo 8. Indeed, \(b_1^2\) is 1 modulo 8, and d is not (as we are not in case (3)). Therefore \({{\,\textrm{ord}\,}}_2(b_1^2-d) = 2\) and \({{\,\textrm{ord}\,}}_2(b_0^2-D) = 2 + l = k\), so we take \(b = b_0\).

Conversely, suppose that b exists.

In case (1), we have \({{\,\textrm{ord}\,}}_p(D)> {{\,\textrm{ord}\,}}_p(4N)\), hence \(2{{\,\textrm{ord}\,}}_p(b) ={{\,\textrm{ord}\,}}_p(4N)\) is odd, contradiction.

In case (2), we have \({{\,\textrm{ord}\,}}_2(b^2-2^{m+1}d) = m+2\), hence \(m+1 = 2{{\,\textrm{ord}\,}}_2(b)=: 2e\). Write \(b = 2^{e}b_1\) and note \({{\,\textrm{ord}\,}}_2(b_1^2-d) = 1\), but \(b_1^2-d\) is 0 modulo 4.

In case (3), we have \({{\,\textrm{ord}\,}}_2(b^2-2^{m}d) = m+2\), hence \(m = 2{{\,\textrm{ord}\,}}_2(b)=:2e\). Write \(b = 2^eb_1\) and note \({{\,\textrm{ord}\,}}_2(b_1^2-d) = 2\), but \(b_1^2-d\) is 0 modulo 8.

It remains only to prove the first statement, for which it suffices to show that the exceptions (1), (2), and (3) all imply \(\gcd (N,F)>1\). In case (1), we see that \(p^2\mid D\) and if \(p=2\), then \(p^4\mid D\), hence \(p\mid F\). In cases (2) and (3), write \(D = 2^v d\) with \(v\in \{m,m+1\}\). As D is a square modulo \(2^{m+2}\), we find that v is even, and hence \(D = (2^{v/2})^2 d\) for a discriminant d, so \(2\mid F\). \(\square \)
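For small N, the criterion of Lemma 4.4 can be checked against an exhaustive search. The following Python sketch is only an illustration: the function names are ours, and it assumes that one may take \(a=1\) and \(c=(b^2-D)/4\), so that the condition \(\gcd (c/N,N)=1\) becomes \(\gcd ((b^2-D)/(4N),N)=1\).

```python
from math import gcd

def prime_factors(n):
    """Set of primes dividing n (trial division; fine for small n)."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def ord_p(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def is_square_mod(D, M):
    """True if D is a square modulo M (exhaustive search)."""
    return any((b * b - D) % M == 0 for b in range(M))

def in_exception(D, N):
    """True if D satisfies exception (1), (2) or (3) of Lemma 4.4."""
    for p in prime_factors(N):
        if ord_p(N, p) % 2 == 1 and ord_p(D, p) > ord_p(4 * N, p):
            return True  # exception (1)
    m = ord_p(N, 2)
    if m > 0:
        # exception (2): D = 2^(m+1) d with d = 1 mod 4
        # exception (3): D = 2^m d     with d = 1 mod 8
        for shift, modulus in ((m + 1, 4), (m, 8)):
            if D % 2**shift == 0 and (D // 2**shift) % modulus == 1:
                return True
    return False

def good_b_exists(D, N):
    """Search for b with b^2 = D mod 4N and gcd((b^2 - D)/(4N), N) = 1."""
    # The condition depends only on b modulo a divisor of 4N^2.
    for b in range(4 * N * N):
        quotient, remainder = divmod(b * b - D, 4 * N)
        if remainder == 0 and gcd(quotient, N) == 1:
            return True
    return False
```

For a negative discriminant D, the lemma predicts that `good_b_exists(D, N)` agrees with `is_square_mod(D, 4 * N) and not in_exception(D, N)`.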

Lemma 4.5

Let N be the product of distinct odd primes \(p_1,\ldots , p_k\). The negative discriminants that are a square modulo 4N and not in one of the exceptions of Lemma 4.4 have density

$$\begin{aligned} \prod _{i=1}^k \frac{p_i^2 + p_i-2}{2p_i^2} \end{aligned}$$

in the set of all negative discriminants.

The negative fundamental discriminants that are a square modulo 4N (which are not in one of the exceptions of Lemma 4.4) have density

$$\begin{aligned} \prod _{i=1}^k \frac{p_i^2 + p_i-2}{2(p_i^2-1)} \end{aligned}$$

in the set of all fundamental negative discriminants.

Proof

Being a discriminant is the condition of being 0 or 1 modulo 4. It is equivalent to being a square modulo 4. This is independent of being a square modulo \(p_i\) that does not suffer from (1), which happens for the \((p_i-1)/2\) residue classes modulo \(p_i\) that are nonzero squares modulo \(p_i\), and the \(p_i-1\) nonzero residue classes modulo \(p_i^2\) that are zero modulo \(p_i\). As \(p_i(p_i-1)/2 + p_i-1 = (p_i^2+p_i-2)/2\), we get the first statement.

Being a fundamental discriminant means being nonzero modulo the squares of all odd primes and being 1, 5, 8, 9, 12, 13 modulo 16. This happens for \(\zeta (2)^{-1} (1-1/4)^{-1} \frac{6}{16}\) of all negative integers. In order to restrict this to products that satisfy the conditions of Lemma 4.4, we have to adjust the Euler product exactly by the given factor. \(\square \)

For example, if \(N = 119 = 7\cdot 17\), then the numbers in Lemma 4.5 are \(> 0.2898\) and \(19/64 > 0.2968\).
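These two numbers can be reproduced with exact rational arithmetic. The following Python sketch (function names are ours) simply evaluates the two products of Lemma 4.5.

```python
from fractions import Fraction

def density_all(primes):
    """First product of Lemma 4.5: density among all negative discriminants."""
    d = Fraction(1)
    for p in primes:
        d *= Fraction(p * p + p - 2, 2 * p * p)
    return d

def density_fundamental(primes):
    """Second product of Lemma 4.5: density among fundamental negative discriminants."""
    d = Fraction(1)
    for p in primes:
        d *= Fraction(p * p + p - 2, 2 * (p * p - 1))
    return d

print(density_all([7, 17]), float(density_all([7, 17])))
print(density_fundamental([7, 17]), float(density_fundamental([7, 17])))
```

For \(N = 7\cdot 17\) the first product is \(4104/14161\) and the second is \(19/64\).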

4.4 Lower-degree class polynomials from ramification

In the case where all primes dividing N ramify, we get an even greater size reduction. The point \(\psi (\tau )\) will then be defined over a subfield, cutting the degree of its minimal polynomial in half. This in turn also cuts the height of the coefficients of this polynomial in half, as we get \(d\ge 2\) in Theorem 3.7. The amount of work required for computing the class polynomial, as well as the bit size of the polynomial (Remark 3.6), is related to the degree times the logarithm of the largest coefficient, and this product is reduced by a factor \(\ge 2\times 2 \times r(C)= 4r(C)\).

Proposition 4.6

(based on Enge-Schertz [12]) Let \(C = (C, \psi )\) be a quotient over \(\textbf{Q}\) of \(X^0(N)\) and let \(D=F^2D_0<0\) be such that \(N\mid D\), \(\gcd (F,N)=1\), and \(D\not \in \{N,4N\}\).

There exist \(a,b,c\in \textbf{Z}\) with \(a>0\), \(N\mid b\), \(c=N\), \(b^2-4ac = D\), and \(\gcd (a,b,c) = 1\). Choose such \(a,b,c\) and let \(\tau \in \textbf{H}\) be a root of \(aX^2 + bX + c\), with order \(\mathcal {O} = \textbf{Z}[a\tau ]\), which has discriminant D.

Let \({\mathfrak {n}} = ((-b+\sqrt{D})/2, a)\), and let \(K_{\mathcal {O}}^{[{\mathfrak {n}}]}\) be the subfield of \(K_{\mathcal {O}}\) fixed by the image of \({\mathfrak {n}}\) under the Artin map. Then \([{\mathfrak {n}}]\) has order 2 in \(\textrm{Cl}(\mathcal {O})\) and \(\psi (\tau )\in C(K_{\mathcal {O}}^{[{\mathfrak {n}}]})\), where \(K_{\mathcal {O}}\) has degree 2 over \(K_{\mathcal {O}}^{[{\mathfrak {n}}]}\).

We get \(m \le \#\textrm{Cl}(\mathcal {O})/2\) in the definition of \(H_\tau [C]\), we get \(H_\tau [C]\in \textbf{Q}[X,Y]\), and we get \(d\ge 2\) in Theorem 3.7.

If \({\mathfrak {a}}_i\) are the ideals \(\tau _i\textbf{Z}+\textbf{Z}\) of an N-system, then \(\mathfrak {a}_i\) and \(\mathfrak {a}_i{\mathfrak {n}}\) yield the same point \(\psi (\tau _i)\), while \(\mathfrak {a}_i^{-1}\) and \(\mathfrak {a}_i^{-1}{\mathfrak {n}}\) yield \(\overline{\psi (\tau _i)}\).

Proof

This is exactly what we get when applying [12, Theorem 9] to the coordinate functions f of C. \(\square \)

4.5 Numerical results for \(X^0_+(119)\)

For the rest of this section we will return to our main Example 2.7, so set \(N=119=7\cdot 17\). For any \(\tau \) as in Proposition 4.3, we have \(H_{\tau }[C]\in \textbf{Q}[X,Y]\). By scaling, we may assume that the coefficients of \(H_{\tau }[C]\) are integral and coprime, and that the leading coefficient (i.e. the coefficient of the monomial of highest degree as a function on C) is positive, and this uniquely determines \(H_{\tau }[C]\in \textbf{Z}[X,Y]\).

For any discriminant \(D<0\) coprime to N such that D is a square modulo N, there are two generalized class polynomials (depending on the choice of \(\tau \)). We experimentally computed both of these for all fundamental discriminants of prime class number \(<100\). The main reason for restricting to prime class number is to exclude the two tricks of Remark 3.3; for these discriminants, the reduction factor thus provides a fair comparison with the Hilbert class polynomial. The method we employ numerically evaluates class invariants by their q-expansions, and finds a minimal polynomial relation (2.3) using lattice basis reduction (LLL). We leave faster methods for future research, but see Sect. 6 for the first ideas. Since the q-expansions can only be evaluated up to finite precision, this does not result in provably correct polynomials, although – based on heuristic estimates – they are highly unlikely to be incorrect.

A few examples of computed polynomials are listed in Table 1. Here, for the given discriminant D, we consistently chose \(\tau \) such that its primitive equation is \(X^2+bX+(b^2-D)/4\) with \(b\in \textbf{Z}_{>0}\) minimal satisfying \(b^2\equiv D\pmod {4N}\) and \(\gcd ((b^2-D)/(4N) ,N) = 1\).

Table 1 Some conjecturally correct generalized class functions for \(C=X^0_+(119)\)

Still assuming that \(H_{\tau }[C]\) is scaled such that it has coprime coefficients in \(\textbf{Z}\), we denote by

$$\begin{aligned} r_A(\tau ):=\frac{\log |H_{\tau }[j]|_{\infty }}{\log |H_{\tau }[C]|_{\infty }} \end{aligned}$$

the practical reduction factor of \(\tau \). Under the assumption \(h(j(\tau ))/\log (\log (n))\rightarrow \infty \) for \(n=\#\textrm{Cl}(\mathcal {O})\) (cf. Theorem 3.7) we have \(d^{-1}r_A(\tau )\rightarrow r(C)\). Experimentally obtained practical reduction factors, plotted against both the class number n and \(\log (|H_{\tau }[j]|_{\infty })/\log (\log (n))\), can be seen in Fig. 1. To visualize the role of the class number and the hypothesis \(h(j(\tau ))/\log (\log (n))\rightarrow \infty \), the points of higher class number are given a darker color in the second figure.

Fig. 1 Practical reduction factors for \(H_{\tau }[X^0_+(119)]\) for fundamental discriminants D with \(\gcd (D,N)=1\) and prime class number \(n<100\)

The values of the practical reduction factor \(r_A(\tau )\) seem to be around their expected asymptotic value \(r(C)=72\) (represented by the horizontal grey line), though the convergence is not apparent, especially when compared to, e.g., some classical class polynomials [8, Fig. 1]. However, in practical applications (see Sect. 5), the class numbers employed are typically several orders of magnitude higher (cf. e.g. [34]), so here we expect the speed of convergence not to cause major deviations in expected running times (cf. Sect. 6). For small class numbers, one can in practice even take advantage of this phenomenon by constructing generalized class polynomials with surprisingly good practical reduction factors by selecting a basis of \(\mathcal {L}(\infty \mathcal {D})\) different from \(1,x,y,x^2,xy,\ldots \) (see Example 7.4).

4.6 Comparison with existing class invariants

Real class invariants typically arise subject to congruence conditions on the discriminant. For example, Weber’s functions with reduction factor 72 are not known to give class invariants for discriminants \(\equiv 5\pmod {8}\). The reduction factors obtained by class invariants coming from the family of (double) eta quotients \({\mathfrak {w}}_n\) and \({\mathfrak {w}}_{p,q}\) (such as the Weber function \({\mathfrak {w}}_2\), as well as the function \({\mathfrak {w}}_{7,17}\) of Example 2.7) have been extensively studied; cf. most notably [9]. These modular functions are not known to yield class invariants if D is not a square modulo 4n or 4pq. Hence, to the best of our knowledge, they also are not applicable to discriminants \(\equiv 5\pmod {8}\) as soon as n, p or q is even. Excluding these cases, the (double) eta quotient with highest known reduction factor is \({\mathfrak {w}}_9\), with a reduction factor of 36 [9, Table 7.1].

A less-studied generalization are multiple eta quotients [12], which are quotients of products of \(2^k\) eta functions. As far as we know these do not yield reduction factors better than 36 for \(k>1\).

The only other known family of “good” class invariants (in the sense that they have large reduction factors) are the Atkin functions \(A_p\) for prime numbers p, defined to be the smallest-degree functions in \({\mathcal {L}}(\infty \mathcal {D})\), where \(\mathcal {D}\) is the unique cusp of \(X^0_+(p)\). The “best” known one here is \(A_{71}\), again with a reduction factor of 36, owing to the fact that \(X^0_+(71)\) has genus zero [14, §3].

The curve \(C=X^0_+(119)\) has a reduction factor \(r(C) = 72\) and yields real class invariants whenever D is a square modulo \(4\cdot 7\cdot 17\) and not divisible by \(7^2\) or \(17^2\). The set of such D has density \(> 28.98\%\) among the set of all negative discriminants (by Lemma 4.5). Out of these discriminants, one-fourth are \(\equiv 5\pmod {8}\). Hence, for at least \(28.98\%\cdot \frac{1}{4}> 7.24\%\) of imaginary quadratic discriminants, the reduction factor exceeds the previously best known reduction factors by a factor of at least two.

Remark 4.7

One should note that the above comparison does not take into account the discussion of Remark 3.3. Most importantly, the reduction factor is not synonymous with the true size reduction of the class polynomials. Indeed, as noted in that remark, the record-breaking CM construction [34] uses the Atkin invariant \(A_{71}\) of reduction factor 36, because the effective size reduction of class polynomials is by a factor of roughly \(2^2\cdot 36 = 144\) for certain discriminants. However, by Sect. 4.4, the same trick applies to generalized class polynomials, leading for \(X^0_+(119)\) to a size reduction of \(2^2\cdot 72 = 288\), again for a positive density subset of discriminants. In Fig. 2 we plot the practical reductions in bit size we found compared to the Hilbert class polynomial using this trick.

Fig. 2 Bit-length reduction for \(H_{\tau }[X^0_+(119)]\) for discriminants \(D\equiv 0\pmod {119}\) of class number \(n<100\).

Remark 4.8

Note that the “classical” class polynomial \(H_\tau [x]\), arising from the function x on \(X^0_+(119)\) by itself attains a reduction factor of 36 for the same \(28.98\%\) of discriminants. This beats all previously-known class invariants for a smaller subset (\(\approx 1.2\%\)) of discriminants: those that additionally are non-square modulo both 3 and 71. This x can be viewed as a generalisation of the Atkin functions to non-prime levels: it is the function of minimal degree in \({\mathcal {L}}(\infty \mathcal {D})\) for one of the cusps \(\mathcal {D}\) of \(X^0_+(119)\).

Similarly, the degree-two map of the hyperelliptic curve \(X^0_+(191)\) (not to be confused with 119) has reduction factor 48, as observed by David Kohel in the AGC\({}^2\)T 2021 Zulip group chat. This beats the reduction factor 32 of the Atkin function \(A_{191}\) of degree 3 on the same curve (see Example 7.2).

This shows that the search for generalized class invariants can even uncover new “classical” class invariants.

4.7 More modular curves of genus one

We searched for more elliptic curves that could be used, and the results are in Tables 2, 3, and 4. In our search, we used the fact that \(X_0(N)\) is well-studied and that there is an isomorphism \(X_0(N)\rightarrow X^0(N): z\mapsto Nz\). Surprisingly, we found lots of elliptic curves with reduction factor 72 and no elliptic curves with a greater reduction factor.

In Sect. 7, we will allow curves of higher genus, which do achieve arbitrarily high values of r(C). Moreover, our search is by no means exhaustive, as Tables 2 and 3 restrict to maps \(\phi : X\rightarrow C\) of degree \(\le 2\) and Table 4 only looks at one curve \(X = X^0(N)\) per isomorphism class of curves C. For example, the curve \(C = X^0_+(119)\) has \(r(C) = 72\). However, in the Cremona database, it is listed as 17a4, and comes with a modular parametrization \(\phi _{17} : X_0(17)\rightarrow C\) of degree 1, which has \(r(\phi _{17}) = 18\). This is why C does not appear in Table 4.

Finally, the tables are restricted to quotients of \(X^0(N)\). Letting go of \(X^0(N)\), we find that the genus-one modular curves \(7C^1\), \(8K^1\), \(9H^1\), \(12V^1\), \(15I^1 = X_1(15)\), \(16M^1\), \(24J^1\), \(27C^1\), \(32E^1\) in the Pauli-Cummins database [6] all achieve \(r(C) \in \{84, 96, 108\}\). We have not pursued these curves yet, as Proposition 4.1 does not apply to them.

Table 2 The curves \(X = X^0_+(N)\) for which there exists a map \(\phi : X \rightarrow C\) of degree \(\le 2\) with \(g(C) \le 1\) and \(r(C) \ge 48\)
Table 3 The curves \(X = X^0(N)\) for which there exists a map \(\phi : X \rightarrow C\) of degree \(\le 2\) with \(g(C) \le 1\) and \(r(C) \ge 48\) and N is not already in Table 2
Table 4 The elliptic curves \(E/\textbf{Q}\) of conductor \(< 500{,}000\) such that the modular parametrization \(\phi : X \rightarrow E\) according to the LMFDB [5, 35, 36] gives \(r(C) \ge 66\) or gives \(r(C) \ge 48\) and odd N

5 Application: the CM method

Class polynomials are used in the CM method for constructing elliptic curves over finite fields with a specified characteristic polynomial of Frobenius.

The input to the CM method is a monic quadratic polynomial \(P = x^2 - tx + q\in \textbf{Z}[x]\), where q is a prime power coprime to t, and the discriminant \(d=t^2-4q\) is negative. The output is an elliptic curve \(E/\textbf{F}_q\) with \(q+1-t\) rational points, which has P as its characteristic polynomial of Frobenius.

The algorithm of the classical CM method (without using class invariants for now) is as follows. Let \(K = \textbf{Q}(\sqrt{d})\).

  1. (1)

    Compute the Hilbert class polynomial \(H_K\) of \({\mathcal {O}}_K\).

  2. (2)

    Find a root \(j_0\in \textbf{F}_q\) of \(H_K\) (which is known to split into linear factors in \(\textbf{F}_q\)).

  3. (3)

    Construct an elliptic curve \(E/\textbf{F}_q\) with \(j(E)=j_0\). Compute all twists of E and return the one with \(q+1-t\) rational points.

In practice, one can discard the curves for which \((q+1-t)Q\not =O\) for some random point Q, although there are also more straightforward methods to select the correct twist [28].
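The three steps can be illustrated on a toy instance. The following Python sketch is not an efficient implementation: it assumes q is a prime larger than 3, takes the Hilbert class polynomial as a hard-coded input instead of computing it (Step 1), finds roots and counts points by exhaustive search, and skips \(j_0\in \{0,1728\}\), where more twists exist. All function names are ours.

```python
def roots_mod(coeffs, q):
    """Roots in F_q of sum(coeffs[i] * X^i), by exhaustive search (toy sizes only)."""
    return [x for x in range(q)
            if sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q == 0]

def count_points(a, b, q):
    """Number of points of y^2 = x^3 + a x + b over F_q, including the point at infinity."""
    n_sqrt = [0] * q                    # n_sqrt[v] = number of y with y^2 = v
    for y in range(q):
        n_sqrt[y * y % q] += 1
    return 1 + sum(n_sqrt[(x**3 + a * x + b) % q] for x in range(q))

def cm_curve(hilbert_coeffs, q, t):
    """Return (a, b) with #E(F_q) = q + 1 - t for E: y^2 = x^3 + a x + b, or None."""
    # Smallest quadratic non-residue, used to build the quadratic twist.
    nonres = next(d for d in range(2, q) if pow(d, (q - 1) // 2, q) == q - 1)
    for j0 in roots_mod(hilbert_coeffs, q):            # Step 2
        if j0 in (0, 1728 % q):
            continue                                    # extra twists: not handled here
        # Step 3: y^2 = x^3 + 3k x + 2k with k = j0/(1728 - j0) has j-invariant j0.
        k = j0 * pow((1728 - j0) % q, -1, q) % q
        a, b = 3 * k % q, 2 * k % q
        for u in (1, nonres):                           # the curve and its quadratic twist
            au, bu = a * u * u % q, b * u**3 % q
            if count_points(au, bu, q) == q + 1 - t:
                return au, bu
    return None

# Toy input: P = x^2 - 4x + 11, so d = 16 - 44 = -28 and K = Q(sqrt(-7)),
# whose Hilbert class polynomial is X + 3375.
print(cm_curve([3375, 1], 11, 4))
```

In this toy case the curve with 8 points over \(\textbf{F}_{11}\) is the quadratic twist \(y^2 = x^3 + 9x + 1\) of the first curve constructed, which itself has 16 points.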

As the degree and height of the Hilbert class polynomial grow quickly with the absolute value of the discriminant \(\Delta _K\) of K, the CM method is only feasible for small values of \(|\Delta _K|\). The record computation of [34] uses class invariants, specifically arising from the Atkin function \(A_{71}\). Combined with the tricks listed in Remark 3.3, this makes it possible to handle a case where \(|\Delta _K| > 10^{16}\).

We will now describe how to apply the CM method using generalized class polynomials. Hence let C be an elliptic modular curve. Since we are working with alternative class invariants instead of the usual j-invariant, we will relate the two using modular polynomials as follows.

Lemma 5.1

Let \(d_j:=[\textbf{Q}(C,j):\textbf{Q}(C)]\). Then there exists a polynomial \(\Psi _C=\sum _{i=0}^{d_j} f_iZ^i\in \textbf{Z}[X,Y][Z]\) of degree \(d_j\) in Z such that

  1. (i)

    \(\Psi _C(j)=0\);

  2. (ii)

    \(\deg _{Y}(f_i) \le 1\) for each i;

  3. (iii)

    the coefficients (in \(\textbf{Z}\)) of \(\Psi _C\) viewed as an element of \(\textbf{Z}[X,Y,Z]\) are coprime;

  4. (iv)

    viewed as elements of \(\textbf{Q}(C)\), the \(f_i\) have at most one common zero in \(C({\overline{\textbf{Q}}})\).

Furthermore, \(\Psi _C\) is unique up to sign.

Proof

Consider the minimal polynomial \(\Psi _C^0 = \sum _{i=0}^{d_j} g_iZ^i\in \textbf{Q}(C)[Z]\) of j over \(\textbf{Q}(C)\). Let

$$\begin{aligned} \mathcal {E}:=\sum _{P\in C\setminus \{O\}} \min _i({{\,\textrm{ord}\,}}_P(g_i))(P). \end{aligned}$$

Then \(\mathcal {E} - \left( \sum _{P\in C}{{\,\textrm{ord}\,}}_P(\mathcal {E})P\right) - (\deg (\mathcal {E})-1)(O)\) is a \(\textbf{Q}\)-rational principal divisor. There is a unique function g up to \(\textbf{Q}^\times \)-scaling such that \(\textrm{div}(g)=\mathcal {E}\). Dividing each \(g_i\) by g gives \(g_i\in {{\mathcal {L}}}(\infty (O)) = \textbf{Q}[x,y]\) satisfying (iv) and unique up to \(\textbf{Q}^\times \). Now take representatives \(f_i\) satisfying (ii) and scale to get (iii), which makes \(\Psi _C\) unique up to sign. \(\square \)

For each curve C with which we would like to apply the generalized CM method, the polynomial \(\Psi _C\in \textbf{Z}[X,Y,Z]\) can be precomputed and stored. Next we need a criterion for which discriminants D yield class invariants. For example, if \(C=X^0_+(N)\) then this is given by Proposition 4.1. Now, given a desired characteristic polynomial of Frobenius \(x^2-tx+q\) such that \(D=t^2-4q\) satisfies this criterion, we have the following algorithm for sufficiently large |D|.

  1. (1)

    Compute a generalized class function F of discriminant D as well as its Heegner point Q.

  2. (2a)

    Find a zero \(P=(x,y)\in C(\textbf{F}_q)\) of F that is neither \(-Q\) nor a common root of the polynomials \(f_1,\ldots , f_{d_j}\) of Lemma 5.1.

  3. (2b)

    Find all roots \(j_0\in \textbf{F}_q\) of the polynomial \(\Psi _C(x,y,Z)\in \textbf{F}_q[Z]\).

  4. (3)

    For each \(j_0\), construct an elliptic curve \(E/\textbf{F}_q\) with \(j(E)=j_0\) and all of its twists up to isomorphism over \(\textbf{F}_q\). Return one with \(q+1-t\) rational points.

The main advantage compared to the classical CM method, both in terms of memory and speed, is expected to be in the (dominant) first step (1) (see Sect. 6). Out of the computationally non-dominant steps, only (2a) is less straightforward. One way to proceed would be as follows.

  1. (i)

    Compute \(F_x:=N_{\textbf{F}_q(C)/\textbf{F}_q(x)}(F)\).

  2. (ii)

    Find a root \(x\in \textbf{F}_q\) of \(F_x\).

  3. (iii)

    Solve for the corresponding value of y using the linear polynomial \(H_\tau [C](x,Y)\), or continue with both solutions y coming from the Weierstrass equation.

Remark 5.2

The polynomial \(F_x\) is very close to the classical class polynomial \(H_{\tau }[x]\); indeed, it has the same roots, together with one additional root at the x-coordinate of the Heegner point of F. The norm computation in step (i) is however asymptotically dominated by the computation of F.

6 The computational benefits of our invariants

6.1 Space complexity of the functions

The advantage of using generalized class functions lies in their size. This already gives a serious advantage when storing one or more class polynomials for later use, e.g. for various values of q in the CM method. Additionally, one would expect the smaller size to make the generalized class polynomials less expensive to compute. Again for C a modular elliptic curve with a given Weierstrass model, we present a preliminary analysis of the cost of computing a generalized class polynomial \(H_{\tau }[C]\) when compared to the “classical” class polynomial \(H_{\tau }[x]\) (though recall that the latter already dominates all previously-known class invariants along a positive density subset of discriminants for \(C=X^0_+(119)\), cf. Sect. 4.6).

6.2 Speed of complex analytic computation

We now explain how to adapt the complex analytic approximation algorithm to generalized class polynomials.

To compute the classical class polynomial \(H_\tau [x]\) one first evaluates \(x(\tau )\) and all its conjugates, which are of the form \(x_i(\tau _i)\), where \(x_i\) and \(\tau _i\) can be obtained using Shimura’s reciprocity law [18] or N-systems [30]. Then one multiplies the linear polynomials \(X-x_i(\tau _i)\) together in a binary tree using fast multiplication algorithms.

As \(H_\tau [C]\) has roughly half the height, we only need half the precision at each step. This gives a great speed-up when evaluating \(x_i(\tau _i)\), but then we also need to compute \(y_i(\tau _i)\). Fortunately that should only take a fraction of the time required for computing \(x_i(\tau _i)\), as we can first compute it to low precision and then obtain as many digits as desired quickly using

$$\begin{aligned} y = \frac{-g(x) + \sqrt{g(x)^2+4f(x)}}{2} \end{aligned}$$

for \(C : y^2 + g(x) y = f(x)\).
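As a minimal sketch of this refinement step, the following Python code recovers y to a requested number of digits from high-precision values of \(g(x)\) and \(f(x)\) and a low-precision approximation of y, which is only used to select the sign of the square root. It relies on the standard-library `decimal` module and therefore assumes real inputs with \(g(x)^2+4f(x)\ge 0\); the values \(x_i(\tau _i)\) are complex in general, so an actual implementation would need complex multiprecision arithmetic. The function name and the numerical values are ours.

```python
from decimal import Decimal, getcontext

def refine_y(g_val, f_val, y_approx, digits):
    """Solve y^2 + g_val*y = f_val to about `digits` digits, near y_approx."""
    getcontext().prec = digits + 10          # a few guard digits
    g_val, f_val = Decimal(g_val), Decimal(f_val)
    root = (g_val * g_val + 4 * f_val).sqrt()
    candidates = [(-g_val + root) / 2, (-g_val - root) / 2]
    # The low-precision value picks the correct sign of the square root.
    return min(candidates, key=lambda y: abs(y - Decimal(y_approx)))

# Toy values: g(x) = 3 and f(x) = 2, with y known to be roughly 0.56.
y = refine_y(3, 2, 0.56, 50)
print(y)
```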

The binary tree step is harder to analyze. Instead of having polynomials \(A(X) = \prod _{i\in S} (X-x_i(\tau _i))\) to multiply for various subsets \(S\subset \{1,2,\ldots , n\}\), we will have pairs \((F,Q)\) with \(F = A(X)+B(X)Y\) and

$$\begin{aligned} \textrm{div}(F) = \sum _{i\in S} (P_i) + (Q) - (\#S+1)\mathcal {D}. \end{aligned}$$

Instead of a single multiplication \(A_1A_2\) to go from \(S_1\) and \(S_2\) to \(S_3 = S_1\sqcup S_2\), we now need to compute the point \(Q_3 = Q_1+Q_2\) (with the elliptic curve group law) and a function \(F_3\) with

$$\begin{aligned} \textrm{div}(F_3) = \sum _{i\in S_3} (P_i) + (Q_3) - (\#S_3+1)\mathcal {D}= \textrm{div}(F_1)+\textrm{div}(F_2) + (Q_3) + (O) - (Q_1) - (Q_2). \end{aligned}$$

The following formula can be used:

$$\begin{aligned} F_3&= \frac{F_1\ F_2\ R \quad \textrm{mod}\quad (Y^2-f(X))}{(X-x(Q_1))(X-x(Q_2))},\quad \text{ where } \end{aligned}$$
(6.1)
$$\begin{aligned} R&= (x(Q_1)-x(Q_2))\ Y\ +\ (y(-Q_2)-y(-Q_1))\ X \end{aligned}$$
(6.2)
$$\begin{aligned}&\ \ + x(Q_2)y(-Q_1) - x(Q_1)y(-Q_2), \end{aligned}$$
(6.3)

and where the reduction modulo \(Y^2-f(X)\) keeps the outcome of degree \(\le 1\) in Y.

We can multiply \(F_1\) with \(F_2\) using three multiplications of half the degree, by the same trick that is used in Karatsuba multiplication. Indeed, let

$$\begin{aligned} C = A_1A_2,\quad D = B_1B_2,\quad \text{ and }\quad E=(A_1+B_1)(A_2+B_2) \end{aligned}$$

to get \(F_1F_2 = (C+Df) + (E-C-D)Y\). So computing \(F_3\) involves three polynomial multiplications of half the degree of \(F_1\) and \(F_2\), as well as various multiplications and long divisions by fixed-degree polynomials and various additions and subtractions. The most serious computations in the binary tree are now done with half the degree and half the number of digits, but three times as often, which takes 3/16th of the time with naive multiplication and still less than 3/4 of the time with quasi-linear-time multiplication. The impact of the extra additions and subtractions, as well as the extra multiplications by a linear polynomial in X and Y and long division by the denominator of (6.1) requires further analysis, but we expect this to be minor. Regardless, for large discriminants, the main bottleneck is in memory complexity (as noted in [7, Sect. 7]), and here we obtain an improvement of a factor of 1/2 when passing from \(H_\tau [x]\) to \(H_\tau [C]\).
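The three-multiplication product can be sketched as follows. In this Python illustration, polynomials in X are coefficient lists with the constant term first, a function \(A(X)+B(X)Y\) is a pair of such lists, and the helper names are ours. The sketch uses \(E=(A_1+B_1)(A_2+B_2)\), for which \(E-C-D = A_1B_2+A_2B_1\). The multiplications are schoolbook multiplications, so the sketch only illustrates the algebra and not the running time.

```python
def poly_add(p, q):
    """Sum of two coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_sub(p, q):
    """Difference p - q of two coefficient lists."""
    return poly_add(p, [-c for c in q])

def poly_mul(p, q):
    """Schoolbook product of two nonempty coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def mul_on_curve(F1, F2, f):
    """(A1 + B1*Y)(A2 + B2*Y) reduced modulo Y^2 - f(X), as a pair (A, B)."""
    (A1, B1), (A2, B2) = F1, F2
    C = poly_mul(A1, A2)                                # large product 1
    D = poly_mul(B1, B2)                                # large product 2
    E = poly_mul(poly_add(A1, B1), poly_add(A2, B2))    # large product 3
    # Multiplying D by the fixed-degree polynomial f is comparatively cheap.
    A = poly_add(C, poly_mul(D, f))
    B = poly_sub(poly_sub(E, C), D)                     # equals A1*B2 + A2*B1
    return A, B

f = [1, -1, 0, 1]                                       # toy f(X) = X^3 - X + 1
print(mul_on_curve(([1, 2, 3], [4, 5]), ([-1, 0, 2], [7, 1]), f))
```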

6.3 Adapting the CRT method

6.3.1 Overview of CRT class polynomial computation

We now heuristically estimate the expected speed-up when computing \(H_\tau [C]\) instead of \(H_\tau [x]\) using the (currently state-of-the-art) CRT method for class polynomial computation [14, 33, 34]. We restrict to the case of C such that all q-expansion coefficients of x and y are rational, and will analyse some steps only in the main case where C is a quotient of \(X^0_+(N)\). To keep our exposition simple, we will not treat the main improvement of [34], even though we do expect it to combine well with our generalized class invariants. We plan to give a more detailed account and an implementation in future work.

For the CM method, it is more efficient to directly compute class polynomials modulo q using the online CRT as in [33, Sect. 2]. In other words, we never write down \(H_\tau [C]\in \textbf{Z}[X,Y]\), but instead compute \((H_\tau [C]\ \textrm{mod}\ q)\in \textbf{F}_q[X,Y]\) directly from \((H_\tau [C]\ \textrm{mod}\ p)\) for p in a set S of small primes. The space complexity of the CM method is then \(n\log (q)\), which is independent of our choice of class function. The set S must be chosen in such a way that \(\prod _{p\in S} p\) is larger than 4 times the largest coefficient.
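The role of the bound on \(\prod _{p\in S} p\) can be seen in a plain (offline) Chinese remainder reconstruction of a signed integer coefficient. The following Python sketch is not the online CRT of [33]; the function names and the choice of primes are ours, and the residues are computed from known coefficients only for the sake of illustration.

```python
from math import prod

def enough_primes(primes, bound):
    """The condition from the text: prod(primes) > 4 * (largest coefficient)."""
    return prod(primes) > 4 * bound

def crt_signed(residues, primes):
    """Recover the integer c with |c| < prod(primes)/2 from its residues c mod p."""
    M = prod(primes)
    x = 0
    for r, p in zip(residues, primes):
        Mp = M // p
        x += r * Mp * pow(Mp, -1, p)      # pow(..., -1, p) is the inverse modulo p
    x %= M
    return x - M if x > M // 2 else x     # lift to the symmetric interval around 0

# Toy data: three signed integers standing in for polynomial coefficients.
coeffs = [-5151296875, 3491750, 1]
primes = [101, 103, 107, 109, 113, 127]
recovered = [crt_signed([c % p for p in primes], primes) for c in coeffs]
print(recovered)
```

Halving the number of digits of the largest coefficient roughly halves the number of primes needed for `enough_primes` to hold.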

By cutting the number of digits in half when switching from x to C, we essentially cut \(\# S\) in half. If the amount of work that we do for each prime p does not grow too much, then our class function \(H_\tau [C]\) yields a speed-up over the classical class polynomial \(H_\tau [x]\).

What needs to be done for each p is the following.

  1. (1)

    Enumerate all elliptic curves \(E/\textbf{F}_p\) with endomorphism ring \({\mathcal {O}}\) and compute the appropriate points in \(C(\textbf{F}_p)\).

  2. (2)

    Compute \((F\ \textrm{mod}\ p)\) by putting together the information from Step 1.

In practice, for “typical” discriminants D with 9 to 14 digits, Sutherland [33, Sects. 8.3 and 8.4] finds that performing Steps 1 and 2 together \(\#S\) times is the dominant part of the CRT method.

We will now argue why we expect each of these steps to take (much) less than twice as long with the generalized class polynomial for suitable C. Together with the fact that our set S is only half the original size due to the reduction factor, this means that computing \(H_\tau [C]\) takes less time than computing \(H_\tau [x]\).

6.3.2 Enumerating via the Fricke involution

Step 1 is already very subtle in the case of a single class invariant f. Indeed, there could be multiple Galois orbits of values \(f(\tau )\) for the same order \(\mathcal {O}\), and hence multiple irreducible class polynomials \(H_{\tau _i}[f]\in K[X]\). In the CRT method, one has to make sure to compute the polynomials \((H_{\tau _i}[f]\ \textrm{mod}\ p)_p\) for the same value of i, and only for \(\tau _i\) for which this is a class invariant. This issue is addressed in detail in [14, Sect. 4].

We will first explain how to adapt one solution to our main case of quotients C of \(X^0_+(N)\) where N is coprime to the conductor of \(\mathcal {O}\) and \(D=\textrm{disc}(\mathcal {O})\) is a square modulo 4N.

We adapt the method of Sect. 4.3 of [14] as follows. We have \(\textbf{Q}(X^0(N)) = \textbf{Q}(j,j_N)\), where \(j_N(z) = j(z/N) = j(W_N z)\) for the Fricke-Atkin-Lehner involution \(W_N : z\mapsto -N/z\) (this follows for example from [32, Proposition 6.9]). In particular, every function \(f\in \textbf{Q}(C)\) for a quotient C of \(X^0(N)\) can be expressed as a rational function in j and \(j_N\). In practice, these expressions can be quite large, but (analogously to [14, Lemma 2]) we can also obtain the value f(z) as a root of \(\gcd (\Psi _{f}(X,j(z)),\Psi _{f\circ W_N}(X,j_N(z)))\) instead.

In the particular case where C is a quotient of \(X^0_+(N)\), we even have \(\textbf{Q}(C)\subset \textbf{Q}(X^0_+(N)) = \textbf{Q}(j+ j_N,j\cdot j_N)\), and we can use \(\Psi _{f}\) instead of \(\Psi _{f\circ W_N}\).

So instead of enumerating just the j-values, we wish to link them with the corresponding \(j_N\)-values, and we do that as follows.

Suppose that N is coprime to the conductor of \(\mathcal {O}\) and that D is a square modulo 4N. Then by Lemma 4.4 we get \(a,b,c\in \textbf{Z}\) with \(a,c>0\), \(b^2-4ac = D\), \(N\mid c\), and \(\gcd (ac/N,N)=\gcd (a,b,c)=1\). In line with Lemma 2 of [14] we could even take \(c=N\) by replacing a by ac/N. We take \(z = \frac{-b+\sqrt{D}}{2a}\), \({\mathfrak {n}} = a{\overline{z}}\textbf{Z}+N\textbf{Z}\), and \(\mathfrak {a}= z\textbf{Z}+\textbf{Z}\). Then we have \(\mathcal {O} = az\textbf{Z}+\textbf{Z}\), and we find that \({\mathfrak {n}}\) is an invertible \(\mathcal {O}\)-ideal with \(\mathcal {O}/{\mathfrak {n}}\cong \textbf{Z}/N\textbf{Z}\). In fact, we find \(\overline{{\mathfrak {n}}}\mathfrak {a}= z\textbf{Z}+N\textbf{Z}\) and hence

$$\begin{aligned} \sigma _{[{\mathfrak {n}}]}j(z) = j({\mathfrak {n}}^{-1}\mathfrak {a}) = j(\overline{{\mathfrak {n}}}\mathfrak {a}) = j_N(z). \end{aligned}$$

Exactly as in Sect. 4.3 of [14], we list the j-values of elliptic curves over \(\textbf{F}_p\) with endomorphism ring \(\mathcal {O}\), and arrange them into unoriented \([{\mathfrak {n}}]\)-isogeny cycles. If C is a quotient of \(X^0_+(N)\) over \(\textbf{Q}\), then for each edge of this graph, we find the f-value from the two j-values of the end points. (In the case where the \([{\mathfrak {n}}]\)-isogeny cycles are 2-cycles, we only get one f-value per 2-cycle and we get a lower-degree class polynomial \(H_\tau [f]\).)

In practice, we could do this for \(f=x\) exactly as in [14], and then solve for y using \(\Psi _{C}(x,y,j)=0\), which is linear in y. The only additional work compared to what is done in [14] is computing and solving the linear equation to get y, which is much faster than all the other steps.

In particular, Step 1 takes much less than twice as long with C as with x, while we need to do it only half as often, which leads to a speed-up. Further research into these modular polynomials is needed in order to determine the exact gain.

To also make this work for quotients of \(X^0(N)\) that are not quotients of \(X^0_+(N)\), one would need to compute oriented \([{\mathfrak {n}}]\)-isogeny cycles.

6.3.3 Other tricks for enumerating

The methods from [14, Sects. 4.1 and 4.2] also seem amenable to generalization.

The main computational tool at the beginning of Sect. 4.1 is the modular polynomial \(\Phi _{\ell ,f}\), which we generalize from f to C as follows.

Let \(\Phi _{\ell ,C}\) be a Gröbner basis of the ideal in \(\textbf{Q}[X_1,Y_1,X_2,Y_2]\) of polynomials that vanish on \(\{(\psi (z),\psi (\ell z)) : z\in \textbf{H}\}\), with respect to the lexicographic ordering with \(X_1>Y_1>X_2>Y_2\). To get from \(\psi (z)\) to all possible values of \(\psi (\ell z)\), one substitutes \(\psi (z)\) for \((X_1,Y_1)\), and then solves first for \(X_2\) and then for \(Y_2\). For each C and \(\ell \) this works for all but a finite set of primes p. Such multivariate modular polynomials would need to be precomputed. One possible starting point for computing these would be [24, 25], which compute multivariate (Hilbert) modular polynomials, each with a different method. For yet another approach to computing modular polynomials, see [3].

We expect the reduction factor to also give a reduction of the size of these multivariate modular polynomials, but on the other hand, we need two of them: one to solve for x of an isogenous curve, and one to evaluate in x and get y. As evaluating is faster than solving, we expect the use of these modular polynomials to take much less than twice as long (and we need to do it only half as often, because we have half as many primes).

The ‘Trace Trick’ of [14, Sect. 4.2] enables the use of the Weber function \({\mathfrak {f}}\) in the CRT method. Should this trick also be needed for some more exotic curves C, we could consider applying it with arbitrary functions \(f\in \textbf{Q}(C)\) such as \(f = ax+by\) for small integers a and b. In loc. cit. the relevant trace is computed with far fewer primes, so it is acceptable to apply the trick with the lower reduction factor of f.

We did not yet consider the general algorithm of [14, Sect. 4.4]. It is the method that works for all class invariants, but is only practical under additional restrictions. We do not have examples of generalized class invariants where this trick is needed. The challenging step to generalize is factoring a large-degree function in \(\textbf{Q}(C)\) in order to obtain the small class functions.

6.3.4 Constructing a function from its roots

In the CRT setting the multiplications and long divisions by small-degree polynomials of Sect. 6.2 only take time \(O(nM(\log (p)))\) per level, which is asymptotically dominated by the \(O(M(n\log (p)))\) time of the multiplications of large-degree polynomials. Therefore, Step 2 seems to take about 1.5 times as long per prime p for \(H_\tau [C]\) when compared to \(H_\tau [x]\).
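For orientation, the large-degree multiplications referred to here are those of a product tree. The following is a minimal Python sketch of the classical univariate case only, building \(\prod _i (X-r_i)\) modulo p from its roots. It uses schoolbook multiplication as a stand-in for a fast multiplication routine, so it illustrates the tree structure rather than the stated complexity. The function names are hypothetical.

```python
def poly_mul(f, g, p):
    """Schoolbook product of two coefficient lists (low degree first) mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def poly_from_roots(roots, p):
    """Return prod (X - r) mod p, computed level by level in a product tree."""
    if not roots:
        return [1]
    # Leaves of the tree: the linear factors X - r.
    level = [[(-r) % p, 1] for r in roots]
    while len(level) > 1:
        # Multiply neighbouring pairs; an unpaired last factor is carried up.
        nxt = [poly_mul(level[i], level[i + 1], p)
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Toy values only: (X-1)(X-2)(X-3) over F_7, coefficients low degree first.
coeffs = poly_from_roots([1, 2, 3], 7)
```

In an actual implementation `poly_mul` would be replaced by a fast multiplication, and for \(H_\tau [C]\) each level would additionally involve the small-degree operations of Sect. 6.2.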

6.3.5 The total running time

Concluding this preliminary analysis, we estimate the cost of computing \(H_\tau [C]\) to be significantly lower compared to \(H_\tau [x]\), though further research, in particular into (the implementation of) modular polynomials for C, is required to determine the exact gain. This is beyond the scope of the current paper, which focuses on introducing the generalized class functions and their height reduction. We plan to give a more detailed account and an implementation in future work.

7 General curves and bases

Now suppose that our modular curve \(C\) is not necessarily an elliptic curve. Let \(\mathcal {D}\) be an effective divisor over \(\textbf{Q}\) on \(C\) and let \(\mathcal {B}=\{b_0,b_1,\ldots \}\) be a \(\textbf{Q}\)-basis of \(\mathcal {L}(\infty \mathcal {D})\) ordered by ascending degree.

The classical case is the case where we have one modular function f and we take \(C= \textbf{P}^1\), \(\psi = f = (f:1)\), \(\mathcal {D}= ((1:0))=(\infty )\), and \(\mathcal {B}=\{1,f,f^2,\ldots \}\). The case of all previous sections of this paper is the case where \(C\) is an elliptic curve given by a Weierstrass equation, \(\mathcal {D}= ((0:1:0))\), and \(\mathcal {B}=\{1,x,y,x^2,xy,x^3,x^2y,\ldots \}\).

Example 7.1

One systematic way to choose a \(\textbf{Q}\)-basis of \({{\mathcal {L}}}(\infty \mathcal {D})\) is as follows. First choose \(x\in {{\mathcal {L}}}(\infty \mathcal {D})\setminus \textbf{Q}\) of some degree d. (For example, one can take \(x=f\) with \(d=1\) in the classical case, and \(x=x\) with \(d=2\) in the elliptic case.) Now, let \(y_0 = 1\) and choose \(y_j\) for \(j = 1,2,\ldots , d-1\) in such a way that

$$\begin{aligned} y_j \in {{\mathcal {L}}}(m_j \mathcal {D})\setminus \langle y_k x^e : k < j, e\in \textbf{Z}\rangle , \end{aligned}$$

where \(m_j\) is minimal such that this set is non-empty. This way we obtain a vector \(\textbf{y} = (y_0,\ldots , y_{d-1})\) of d functions. (For example, in the classical case we have \(\textbf{y}=1\), and in the elliptic case we chose \(\textbf{y} =(1,y)\).) Then \(\mathcal {B}=\{x^e y_j : e\in \textbf{Z}_{\ge 0}, j\in \{0,1,2,\ldots ,d-1\}\}\) is a basis of \({\mathcal {L}}(\infty \mathcal {D})\). We order this basis by ascending degree \(de+m_j\), and if two elements have the same degree, then we put the one with lowest j first.
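The ordering rule can be made concrete with a short Python sketch. It assumes that the degree d of x and the degrees \(m_j\) of the \(y_j\) (with \(m_0=0\)) are already known, and it returns exponent pairs \((e,j)\) standing for \(x^e y_j\). The function name is hypothetical.

```python
def ordered_basis(d, m, count):
    """First `count` pairs (e, j), each standing for x^e * y_j.

    Elements are sorted by degree d*e + m[j]; ties are broken by smaller j.
    """
    assert len(m) == d and m[0] == 0
    # Exponents e < count suffice: the powers x^0, ..., x^(count-1) already
    # give `count` elements of degree strictly below d*count.
    keyed = [(d * e + m[j], j, e) for e in range(count) for j in range(d)]
    keyed.sort()
    return [(e, j) for _, j, e in keyed[:count]]


# Elliptic case of the text: x has degree 2 and y_1 = y has degree 3.
elliptic = ordered_basis(2, [0, 3], 7)
# Classical case: x = f has degree 1 and the only y_j is y_0 = 1.
classical = ordered_basis(1, [0], 3)
```

For the elliptic case the pairs correspond to \(1,x,y,x^2,xy,x^3,x^2y\), and for the classical case to \(1,f,f^2\).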

Example 7.2

Consider the modular curve \(X^0_+(191)\) (not to be confused with 119), which is hyperelliptic with model \(t^2 = s^6+2s^4+2s^3+5s^2-6s+1\) [16, Table 3] and has its unique cusp at \(\mathcal {D}=((1:1:0))\). One of the possible bases of \(\mathcal {L}(\infty \mathcal {D})\) obtained by the recipe above is \(\mathcal {B}=\{1,x,y_1,y_2,x^2,xy_1,xy_2,\ldots \}\), where \(x=(t+s^3+s+1)/2\), \(y_1=sx\), and \(y_2=s(y_1+1)\). The degrees of \(x\), \(y_1\), and \(y_2\) are 3, 4, and 5, respectively.

The function x is, up to multiplicative and additive constants, equal to the Atkin function \(A_{191}\). The reduction factors are \(r(C) = 96\), \(r(s) = 48\), and \(r(A_{191}) = 32\).

As in Sect. 2, let \(\tau \in \textbf{H}\) be imaginary quadratic and assume that \((b_i,\tau )\) is a class invariant for every \(b_i\in \mathcal {B}\). Then, again unique up to scaling, we obtain a non-zero function \(F_{\tau }[C,\mathcal {B}]=\sum _{i=0}^k a_ib_i\in K(C)\) (\(a_i\in K\)) with k minimal such that \(\sum _{i=0}^k a_ib_i(\tau )=0\).

Definition 7.3

We call this \(F_{\tau }[C,\mathcal {B}]\) the generalized class function for the triple \(C,\mathcal {B},\tau \).

If \(\mathcal {B}\) is as in Example 7.1 then we again refer to the associated polynomial \(H_{\tau }[C,\mathcal {B}]\in K[X,Y_1,\ldots ,Y_{d-1}]\) (of total degree \(\le 1\) in \(Y_1,\ldots , Y_{d-1}\) and such that \(H_{\tau }[C,\mathcal {B}](x,y_1,\ldots , y_{d-1}) = F_{\tau }[C,\mathcal {B}]\)) as the generalized class polynomial.

Example 7.4

It turns out that, already for the case of elliptic curves, the freedom in the choice of basis of \(\mathcal {L}(\infty \mathcal {D})\) can lead to better practical reduction factors. Revisiting our main example \(C:=X^0_+(119)\), denote by \(w:={\mathfrak {w}}_{7,17}\) the function (2.9) and by \(z:=x+y\) the sum of the Weierstrass coordinates for the model (2.8). Now consider the basis \(\mathcal {B}:=\{1,x,z,w,xz,wx,wz,w^2,wxz,w^2x,\ldots \}\) of \(\mathcal {L}(\infty \mathcal {D})\). The resulting generalized class polynomials corresponding to the discriminants of Table 1 are listed in Table 5. We get practical reduction factors in Fig. 3 that are better than those in Fig. 1.

A likely explanation for this improvement is that now not only the poles, but also the zeroes are as much restricted to the cusps of \(X^0_+(119)\) as possible. Indeed, the points \(O=(0:1:0)\) and \(P=(0,0)\) are the cusps, while 2P and 3P are rational CM points. Now \(\textrm{div}(w) = 4(P)-4(O)\), \(\textrm{div}(x) = (P) + (3P) - 2(O)\), and \(\textrm{div}(y) = 2(P) + (2P) -3(O)\). In particular, the function w is a modular unit. As explained in Remark 3.15, modular units in the classical setting give better practical reduction factors than non-units, even though the reduction factors are asymptotically the same.
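As a consistency check on these divisors (a sketch of the arithmetic, not part of the original argument): a degree-zero divisor on an elliptic curve is principal exactly when its points sum to O in the group law. The divisor of w therefore forces \(4P=O\), and the other two divisors agree with this:

$$\begin{aligned} \textrm{div}(w):\ 4P = O, \qquad \textrm{div}(x):\ P + 3P = 4P = O, \qquad \textrm{div}(y):\ 2P + 2P = 4P = O. \end{aligned}$$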

Table 5 Some conjecturally correct generalized class functions for the curve \(C=X^0_+(119)\) using the \(\mathcal {L}(\infty \mathcal {D})\)-basis \(\mathcal {B}:=\{1,x,z,w,xz,wx,wz,w^2,wxz,w^2x,\ldots \}\)
Fig. 3 Practical reduction factors for \(H_{\tau }[X^0_+(119),\mathcal {B}]\) for fundamental discriminants D with \(\gcd (D,N)=1\) and prime class number \(n<100\)

Theorem 7.5

Let \(C: y^2 + g(x) y = f(x)\) with \(f,g\in \textbf{Q}[X]\) be a hyperelliptic curve such that \(4f(x)+g(x)^2\) has odd degree and \(\textrm{Jac}(C)(\textbf{Q})\) is finite. Set \(\mathcal {D}\) to be the unique point at infinity and choose the basis \(\mathcal {B}=\{1,x,x^2,y,xy,x^2y,\ldots \}\) of \(\mathcal {L}(\infty \mathcal {D})\). Then Theorem 3.7 and Proposition 3.13 also hold for C and \(H_{\tau }[C,\mathcal {B}]\).

Proof

The original proof now goes through with only the following change. There are finitely many possibilities for the class c of the divisor \( - \sum _{\sigma } ((\sigma (\psi (\tau ))) - \mathcal {D})\) by our assumption that \(\textrm{Jac}(C)(\textbf{Q})\) is finite. For every c, choose a representative \(\sum _{i=1}^{m} ((P_i)-\mathcal {D})\) with m minimal and consider a primitive polynomial \(T\in \textbf{Z}[X]\) with roots \(x(P_i)\) for \(i=1,\ldots , m\). \(\square \)

Remark 7.6

Our proofs of Theorems 3.7 and 7.5 heavily rely on the fact that Heegner points are torsion. To completely remove the assumption on ranks, one would therefore need to bound the Heegner points, even in the rank-one case. Moreover, the proofs rely on the hyperelliptic equation where we use that \(|a| \le |a+bi|\) for real numbers a and b. Though we expect an analogue of these results to hold for general modular curves, this would require additional ideas. Do note that such an analogue would yield arbitrarily high reduction factors for generalized class polynomials by (3.2). For example, for \(C=X^0_+(239)\) of genus 3 we already obtain \(r(C)=120\), exceeding the Bröker–Stevenhagen bound.