Generalized class polynomials

Houben, Marc; Streng, Marco

doi:10.1007/s40993-022-00400-2


Research
Open access
Published: 16 November 2022

Volume 8, article number 103, (2022)

Research in Number Theory


Abstract

The Hilbert class polynomial has as roots the j-invariants of elliptic curves whose endomorphism ring is a given imaginary quadratic order. It can be used to compute elliptic curves over finite fields with a prescribed number of points. Since its coefficients are typically rather large, there has been continued interest in finding alternative modular functions whose corresponding class polynomials are smaller. Best known are Weber’s functions, which reduce the size by a factor of 72 for a positive density subset of imaginary quadratic discriminants. On the other hand, Bröker and Stevenhagen showed that no modular function will ever do better than a factor of 100.83. We introduce a generalization of class polynomials, with reduction factors that are not limited by the Bröker–Stevenhagen bound. We provide examples matching Weber’s reduction factor. For an infinite family of discriminants, their reduction factors surpass those of all previously known modular functions by a factor at least 2.


1 Introduction

The Hilbert class polynomial $H_D[j]$ of the imaginary quadratic order ${\mathcal {O}}$ of discriminant D is the minimal polynomial of the j-invariant of an elliptic curve with endomorphism ring ${\mathcal {O}}$. It is a defining polynomial of the ring class field of ${\mathcal {O}}$ and can be used for constructing elliptic curves over a finite field with a given number of points. Its coefficients are however rather large, which limits its practical usefulness. Already in 1908, Weber [37] therefore introduced alternative class invariants to be used instead of j, which resulted in class polynomials with coefficients that have roughly 1/72 of the digits of the coefficients of the Hilbert class polynomial for certain discriminants.

There has been continued interest in alternative class invariants ever since (e.g. [2, 4, 8,9,10,11,12, 14, 17, 18, 29, 30]). None however matched, let alone surpassed, the factor 72 of Weber’s functions. Moreover, Bröker and Stevenhagen [4] showed that no class invariant will ever do better than a factor 100.83. Under Selberg’s eigenvalue conjecture [31, Conjecture 1], this bound reduces to 96.

We introduce generalized (multivariate) class polynomials, define an appropriate notion of their reduction factor, and show that this notion indeed gives a measure of their “size” compared to the Hilbert class polynomial (Sect. 3). Contrary to classical class polynomials, the reduction factors of generalized class polynomials are not limited by the Bröker–Stevenhagen bound.

We give a family of generalized class polynomials for which we prove that the reduction factor matches Weber’s 72 for a large range of values of D, including infinitely many values of D where no reduction of 36 or better was previously known (Sect. 4). We also give an example that possibly achieves the factor 120 (Remark 7.6).

Though the focus of this paper is on introducing the generalized class invariants and studying their height, we also give a preliminary analysis indicating that the height reduction leads to a speed-up in their computation (Sect. 6), and we show how to use them for constructing elliptic curves over finite fields (Sect. 5).

2 Generalized class polynomials

Definition 2.1

By a modular curve over $\textbf{Q}$ we mean a smooth, projective, geometrically irreducible curve $C$ over $\textbf{Q}$ together with a map $\psi : \textbf{H}\rightarrow C(\textbf{C})$ from the upper half space $\textbf{H}\subset \textbf{C}$ with the following property. There exists a positive integer N such that for every function $f\in \textbf{Q}(C)$, the function $f\circ \psi $ is a modular function for $\Gamma (N)$ with all q-expansion coefficients in $\textbf{Q}^{\textrm{ab}}$.

We identify f with $f\circ \psi $ and we identify $\psi $ with the induced morphism of curves $X(N)\rightarrow C$.

For an order $\mathcal {O}$ in an imaginary quadratic number field K, we denote by $K_{\mathcal {O}}$ the associated ring class field. Let f be a modular function and $\tau \in \textbf{H}$ imaginary quadratic, say a root of $aX^2+bX+c$ for coprime integers a, b, c. The pair $(f,\tau )$ is called a class invariant for the imaginary quadratic order $\mathcal {O}=\textbf{Z}[a\tau ]$ if $f(\tau )$ lies in the ring class field $K_{\mathcal {O}}$. The discriminant D of the class invariant is the discriminant of $\mathcal {O}$. The Galois group G of $K(f(\tau ))/K$ is isomorphic via the Artin map to a quotient of the Picard group $\textrm{Cl}(\mathcal {O})$. Associated to a class invariant is its minimal polynomial over K, also known as the class polynomial,

$$\begin{aligned} H_\tau [f]:= \prod _{\sigma \in G}\big (X-\sigma (f(\tau ))\big )\quad \in K[X]. \end{aligned}$$

Under additional restrictions, class polynomials can sometimes be shown to have coefficients in $\textbf{Q}$ (cf. [9, Thm. 4.4], [13, Thm. 5.4]); in that case we call the class polynomials real. Oftentimes, a modular function admits class invariants for an infinite family of discriminants, determined by a certain congruence condition ([30], [9, Thm. 4.3]). Sometimes the discriminant uniquely determines the class polynomial for a given modular function.

Example 2.2

The modular j-function admits a unique class polynomial for any discriminant $D<0$, called the Hilbert class polynomial $H_D[j]:=H_\tau [j]$. It can be seen as a function on $\textbf{P}^1$ whose zeros are the j-invariants of elliptic curves with CM by the imaginary quadratic order of discriminant D and whose poles are restricted to the point at infinity.

We propose a generalization of class polynomials, seen as functions on modular curves of higher genus, for which the classical class polynomials can be viewed as the genus zero case. We will mostly restrict ourselves to the case of genus one, as this will make notation considerably less complicated. We discuss the arbitrary genus case in Sect. 7. Let C be a modular curve over $\textbf{Q}$ with a smooth Weierstrass model $y^2+a_1xy+a_3y=x^3+a_2x^2+a_4x+a_6$, and suppose that $(x,\tau ),(y,\tau )$ are class invariants for some imaginary quadratic $\tau \in \textbf{H}$. Consider $G = \textrm{Gal}(K(x(\tau ),y(\tau ))/K)$ and $m=\#G$. If we denote by $\mathcal {D}$ the divisor of the unique point at infinity of C, then $\mathcal {L}(\infty \mathcal {D})$ has a basis $b_0=1,b_1=x,b_2=y,b_3=x^2,b_4=xy,b_5=x^3,b_6=x^2y,\ldots $ (ordered by ascending degree). There exist $a_i\in K$, not all zero, such that

$$\begin{aligned} \sum _{i=0}^m a_i b_i(\tau )=0. \end{aligned}$$

(2.3)

In fact, up to scaling by an element of $K^{\times }$, there exists a unique function $F_{\tau }[C]=\sum _{i=0}^m a_ib_i\in K(C)$ such that

$$\begin{aligned} \textrm{div}F_{\tau }[C] = \left[ \sum _{\sigma \in G} \left( \sigma (\psi (\tau ))\right) \right] +\left( -\sum _{\sigma \in G}\sigma (\psi (\tau ))\right) -(m+1)\mathcal {D}. \end{aligned}$$

(2.4)
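As a small illustration of the ordering of the basis $b_0,b_1,b_2,\ldots $ used above, the following Python sketch lists the exponent pairs $(i,j)$ of the monomials $x^iy^j$ with $j\in \{0,1\}$, sorted by their pole order $2i+3j$ at the point at infinity. The function name `basis_exponents` is ours and purely illustrative; it is not taken from the paper.

```python
def basis_exponents(m):
    """Exponent pairs (i, j) such that b_0, ..., b_m = x^i * y^j.

    On a Weierstrass model, x has a pole of order 2 and y a pole of
    order 3 at infinity, so x^i * y^j has pole order 2*i + 3*j.
    Restricting to j in {0, 1} gives each pole order 0, 2, 3, 4, ...
    exactly once, so sorting by pole order is unambiguous.
    """
    monomials = [(i, j) for i in range(m + 2) for j in (0, 1)]
    monomials.sort(key=lambda e: 2 * e[0] + 3 * e[1])
    return monomials[: m + 1]


# b_0, ..., b_6 = 1, x, y, x^2, xy, x^3, x^2 y
first_seven = basis_exponents(6)
```

For $i\ge 1$ the function $b_i$ has pole order $i+1$, which matches the term $-(m+1)\mathcal {D}$ in (2.4).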

Definition 2.5

We call $F_{\tau }[C]$ as in (2.4) a generalized class function for $\tau $. The associated generalized class polynomial is the unique $H_{\tau }[C]\in K[X,Y]$ of degree $\le 1$ in Y such that $H_{\tau }[C](x,y) = F_{\tau }[C]$.

We note that the polynomial $H_{\tau }[C]$ depends on the choice of x and y, but we leave this out of the notation. In Sect. 7 (and in particular Definition 7.3) we will allow more general divisors $\mathcal {D}$ and bases $\mathcal {B}$, leading to more general functions $F_{\tau }[C,\mathcal {B}]$ and polynomials $H_{\tau }[C,\mathcal {B}]$.

Definition 2.6

We call the point $P = \sum _{\sigma \in G}\sigma (\psi (\tau ))\in C(K)$ the Heegner point of the class function F.

If the Heegner point P is the point at infinity, then $a_m=0$. Otherwise, the point $-P$ is a zero of F. In particular, if $P=-(0,0)$, then $a_0=0$.

For $N\in \textbf{Z}_{>0}$, we denote by $X^0(N)$ the smooth, projective, geometrically irreducible curve over $\textbf{Q}$ with function field consisting of the modular functions for the modular group $\Gamma ^0(N)=\{\left( \begin{smallmatrix} a & b\\ c & d \end{smallmatrix}\right) \in {{\,\textrm{SL}\,}}_2(\textbf{Z})\mid b\equiv 0 \pmod {N}\}$ that have rational q-expansion. We denote by $X^0_+(N)$ the quotient of $X^0(N)$ by the Fricke-Atkin-Lehner involution $z\mapsto -N/z$, and write $\eta (z)$ for the Dedekind $\eta $-function

$$\begin{aligned} \eta (z) = q^{1/24}\prod _{n=1}^{\infty }(1-q^n), \quad \text{ where }\quad q = \exp (2\pi i z). \end{aligned}$$

Example 2.7

Consider the genus one modular curve $C:=X^0_+(119)$. Its conductor as an elliptic curve is 17 (Cremona label 17a4)^{Footnote 1}. A Weierstrass model for C is given by^{Footnote 2}

$$\begin{aligned} y^2+3xy-y=x^3-3x^2+x, \end{aligned}$$

(2.8)

where $x,y\in \textbf{Q}(C)$ have respective q-expansions

$$\begin{aligned} x&= q^{-2} + q^{-1} + 1 + q + 2q^2 + 2q^3 + 3q^4 + 3q^5 + 4q^6 + 5q^7 + \ldots ,\\ y&= q^{-3} + 1 + 2q + 2q^2 + 4q^3 + 4q^4 + 7q^5 + 9q^6 + 12q^7 +\ldots , \end{aligned}$$

where this time $q = \exp (2\pi i z/119)$.

The “double eta quotient” ${\mathfrak {w}}_{7,17}$ given by

$$\begin{aligned} {\mathfrak {w}}_{7,17}(z) =\frac{\eta (z/7)\eta (z/17)}{\eta (z)\eta (z/119)} \end{aligned}$$

(2.9)

is invariant under the action of $\Gamma ^0(119)$ [26, Thm. 1] and the Fricke-Atkin-Lehner involution [11, Thm. 2], hence also forms an element of the (rational) function field of C. It is related to x and y by

$$\begin{aligned} {\mathfrak {w}}_{7,17}=-y+x^2-x. \end{aligned}$$

(2.10)

The curve $X^0_+(119)$ has two cusps, and they are both rational. In the given Weierstrass model, these correspond to the point (0, 0) and the point at infinity. Numerical examples of generalized class polynomials specifically for $X^0_+(119)$ are given in Sect. 4.5. We will treat this curve as our main test case in the rest of the paper.
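The data of this example can be sanity-checked directly from the printed q-expansions. The following Python sketch (the helper names `mul`, `lin`, `coeffs` and `eta_quotient_w` are ours) verifies the Weierstrass equation (2.8) and the relation (2.10) to the precision that the truncated expansions allow: through $q^3$ for (2.8) and through $q^5$ for (2.10). It is only a consistency check on finitely many coefficients, not a proof.

```python
from collections import defaultdict

# Truncated q-expansions from Example 2.7 (exponent -> coefficient),
# both known modulo q^8, with q = exp(2*pi*i*z/119).
X = {-2: 1, -1: 1, 0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 5}
Y = {-3: 1, 0: 1, 1: 2, 2: 2, 3: 4, 4: 4, 5: 7, 6: 9, 7: 12}


def mul(f, g):
    """Product of two Laurent polynomials given as dicts."""
    h = defaultdict(int)
    for i, a in f.items():
        for j, b in g.items():
            h[i + j] += a * b
    return dict(h)


def lin(*terms):
    """Linear combination lin((c1, f1), (c2, f2), ...)."""
    h = defaultdict(int)
    for c, f in terms:
        for i, a in f.items():
            h[i] += c * a
    return dict(h)


def coeffs(f, lo, hi):
    """Coefficients of q^lo, ..., q^hi."""
    return [f.get(n, 0) for n in range(lo, hi + 1)]


def eta_quotient_w(prec):
    """c[0..prec-1] with w_{7,17} = q^-4 * sum c[n] q^n.

    In terms of q, (2.9) becomes
    q^-4 * prod (1-q^(17n)) (1-q^(7n)) / ((1-q^(119n)) (1-q^n)).
    """
    c = [1] + [0] * (prec - 1)
    for step, invert in ((17, False), (7, False), (119, True), (1, True)):
        for m in range(step, prec, step):
            if invert:  # divide by (1 - q^m)
                for n in range(m, prec):
                    c[n] += c[n - m]
            else:       # multiply by (1 - q^m)
                for n in range(prec - 1, m - 1, -1):
                    c[n] -= c[n - m]
    return c


X2 = mul(X, X)
lhs = lin((1, mul(Y, Y)), (3, mul(X, Y)), (-1, Y))   # y^2 + 3xy - y
rhs = lin((1, mul(X2, X)), (-3, X2), (1, X))         # x^3 - 3x^2 + x
# Only coefficients up to q^3 are unaffected by the truncation.
weierstrass_ok = coeffs(lhs, -6, 3) == coeffs(rhs, -6, 3)

w_from_xy = lin((-1, Y), (1, X2), (-1, X))           # -y + x^2 - x
# Coefficients of q^-4, ..., q^5 are unaffected by the truncation.
eta_ok = coeffs(w_from_xy, -4, 5) == eta_quotient_w(10)
```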

3 Estimates and reduction factors

3.1 Reduction factors

We define the reduction factor of a modular curve $C$ to be

$$\begin{aligned} r(C) = \frac{\deg (j : X(N)\rightarrow \textbf{P}^1)}{\deg (\psi : X(N) \rightarrow C)}. \end{aligned}$$

(3.1)

In the case $C=\textbf{P}^1$, we denote this number also by $r(\psi )$ and our notation and terminology coincide with that of [4]. The number $r(\psi )^{-1}$ is denoted by ${\widehat{c}}(\psi )$ in [8] and by $c(\psi )$ in [9]. Bröker and Stevenhagen [4, Theorem 4.1]^{Footnote 3} show $r(\psi )\le 32768/325 \le 100.83$. Under Selberg’s eigenvalue conjecture, one can even prove $r(\psi )\le 96$. The best known $\psi $ achieves $r(\psi ) = 72$. This result does not however apply directly to $r(C)$. For example, we have

$$\begin{aligned} r(X^0(N)) = N\prod _{p\mid N} (1+\frac{1}{p}) \quad \text{ and }\quad r(X^0_+(N)) = \frac{1}{2} r(X^0(N)) \quad \text{ if }\quad N>1. \end{aligned}$$

(3.2)

Our main example $C= X^0_+(119)$ therefore achieves $r(C) = \frac{1}{2}(7+1)(17+1)=72$. For (hyper)elliptic modular curves $C$ we get $r(C) \le 201.65$ (or $r(C) \le 192$ under Selberg’s eigenvalue conjecture), by applying the bounds to the x-function. Surprisingly, all elliptic curve quotients of $X^0(N)$ we found so far have $r\le 72$ (Sect. 4.7). In Sect. 7 we will discuss higher-genus curves, which allow for unbounded $r(C)$.
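Formula (3.2) is easy to evaluate. The following Python sketch computes $r(X^0(N))$ and $r(X^0_+(N))$; the function names are ours and chosen only for illustration.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def r_X0(N):
    """r(X^0(N)) = N * prod_{p | N} (1 + 1/p), as in (3.2)."""
    r = N
    for p in prime_divisors(N):
        r = r // p * (p + 1)
    return r


def r_X0_plus(N):
    """r(X^0_+(N)) = r(X^0(N)) / 2, valid for N > 1 by (3.2)."""
    if N <= 1:
        raise ValueError("formula (3.2) requires N > 1")
    return Fraction(r_X0(N), 2)
```

For example, `r_X0_plus(119)` evaluates to 72, in agreement with the value stated above, and `r_X0_plus(71)` evaluates to 36.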

Remark 3.3

In the applications we have in mind, the reduction factor is the main source of improvement in computational efficiency. It is important to note, however, that this number $r(C)$ does not tell the complete story, even in the “classical” setting ($C\cong \textbf{P}^1$), for example for the following reasons.

(1)
There are many challenges when computing class polynomials, and even more with generalized class polynomials. See Sect. 6.
(2)
In the CM method (Sect. 5), we will want to find a j-invariant in $\textbf{F}_p$ from a point in $C(\textbf{F}_p)$. This is done using the minimal polynomial of the j-function over $\textbf{Q}(C)$, known as the modular polynomial (Lemma 5.1). This works best if the degree of j over $\textbf{Q}(C)$ is small. For example, this degree is 1 for $C = X^0(N)$, is 2 for $C = X^0_+(N)$, and ranges from 1 to 20 in [9, Table 7.1], making $X^0_+(119)$ a good choice in this respect.
(3)
If the (generalized) class polynomial is not real, then its coefficients lie in an imaginary quadratic extension of $\textbf{Q}$; roughly doubling its bit size. This issue can be avoided by imposing additional restrictions on $C$ or $\tau $, see Sects. 4.2 and 4.3.

On the other hand, there are two important tricks that may be used in complementary directions, providing computational improvements beyond the reduction factor $r(C)$:

(1)
Under some constraints, typically when all primes dividing the level of the modular curve ramify in the CM field, both the degree and height of the class polynomial are cut in half. This happens for example in the record-computation of [14] for the Atkin invariant $A_{71}$ when 71 divides the discriminant, leading to class polynomials that are $2^2\cdot 36 = 144$ times smaller than the Hilbert class polynomial (note that the reduction factor $r(A_{71})$ is 36 in this case). The same trick also applies to generalized class polynomials, see Sect. 4.4, which in the case of $X^0_+(119)$ leads to a factor $2^2\cdot 72 = 288$ in size reduction.
(2)
When the class number is composite, one can decompose the ring class field into a tower of fields whose defining polynomials have smaller degrees, also leading to a significant speed-up in the CM method [34].

These last two tricks only work when the class number is composite. We expect both of them to work well for generalized class polynomials, so we will mainly restrict to the case of prime class number in our examples, as this more clearly illustrates the role of the parameter $r(C)$.

The goal of the rest of this section is to show under some hypotheses that the reduction factor $r(C)$ is indeed an asymptotic reduction factor of the size of the polynomials involved. For that, we will first introduce the appropriate notions of “size”.

3.2 Measures of polynomials and heights of their roots

For a polynomial $A\in \textbf{C}[X]$, let $|A|_1$ (resp. $|A|_\infty $) be the sum (resp. maximum) of the absolute values of the coefficients of A. The Mahler measure of a polynomial $A = a\prod _{i=1}^{n} (X-\alpha _i)\in \textbf{C}[X]$ is

$$\begin{aligned} {\mathcal {M}}(A) = |a| \prod _{i} \max \{1,|\alpha _i|\}. \end{aligned}$$

Lemma 3.4

We have

$$\begin{aligned} |A|_\infty \quad&\le \quad |A|_1 \quad \le \quad (n+1)|A|_\infty ,\\ {\mathcal {M}}(A)\quad&\le \quad |A|_1 \quad \le \quad 2^n {\mathcal {M}}(A),\\ \left| \log |A|_1 - \log |A|_\infty \right|&\quad \le \quad \log (n+1),\\ \left| \log |A|_\infty - \log ({\mathcal {M}}(A)) \right|&\quad \le \quad n\log (2). \end{aligned}$$

Proof

The first two inequalities are by definition and the third is Equation (6) of [23]. For its converse, observe that we have $|AB|_1 \le |A|_1|B|_1$, and hence also $ |A|_1 \le |a| \prod _i \max \{2, 2|\alpha _i|\} \le 2^n{\mathcal {M}}(A).$ Then take logarithms. $\square $
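To make the quantities in Lemma 3.4 concrete, here is a minimal Python sketch that builds a polynomial from prescribed roots and evaluates $|A|_1$, $|A|_\infty $ and ${\mathcal {M}}(A)$. The helper names and the sample polynomial $A = 2(X-3)(X+5)(X-\tfrac{1}{2})$ are our own choices for illustration.

```python
from fractions import Fraction


def poly_from_roots(lead, roots):
    """Coefficients (highest degree first) of lead * prod (X - r)."""
    coeffs = [Fraction(lead)]
    for r in roots:
        new = coeffs + [Fraction(0)]
        for k, ck in enumerate(coeffs):
            new[k + 1] -= r * ck     # multiply by (X - r)
        coeffs = new
    return coeffs


def norm_1(coeffs):
    return sum(abs(c) for c in coeffs)


def norm_inf(coeffs):
    return max(abs(c) for c in coeffs)


def mahler_measure(lead, roots):
    """|lead| * prod max(1, |root|), computed from the known roots."""
    m = abs(Fraction(lead))
    for r in roots:
        m *= max(1, abs(r))
    return m


roots = [Fraction(3), Fraction(-5), Fraction(1, 2)]
A = poly_from_roots(2, roots)       # 2X^3 + 3X^2 - 32X + 15
n = len(roots)
lemma_holds = (
    norm_inf(A) <= norm_1(A) <= (n + 1) * norm_inf(A)
    and mahler_measure(2, roots) <= norm_1(A) <= 2 ** n * mahler_measure(2, roots)
)
```

Here $|A|_1 = 52$, $|A|_\infty = 32$ and ${\mathcal {M}}(A) = 30$, so all inequalities of the lemma hold with room to spare.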

For an element $\alpha $ in a number field L of degree n, we define its (absolute logarithmic) height to be

$$\begin{aligned} h(\alpha ) = \frac{1}{n}\sum _{v} \max \{0,\log |\alpha |_v\}, \end{aligned}$$

where the sum ranges over the Archimedean and non-Archimedean absolute values, suitably normalized (that is, those denoted $||\cdot ||_v$ in [19, §B.1]). If $\alpha $ is a root of an irreducible $A\in \textbf{Z}[X]$ of degree n, then we have

$$\begin{aligned} \log ({\mathcal {M}}(A)) = n h(\alpha ). \end{aligned}$$

(3.5)

Remark 3.6

Another measure for the complicatedness of A would be its total bit size, or the sum s of the logarithms of the absolute values of the nonzero coefficients. We will instead focus on $|A|_\infty $ for the following reasons.

First of all, for computational purposes, it is more useful to look at $p = \deg (A)\cdot \log |A|_\infty $, as the required precision (or number of primes with the CRT approach) is proportional to $\log |A|_\infty $ and the number of computations to do with that precision is proportional to $\deg (A)$.

Secondly, we get the impression from numerical computations that s is close to p. For example, the value of s/p is spread out over the interval (0.75, 0.9) for the larger discriminants in both Sect. 4.5 and Example 7.4.

Finally, it is hard to prove lower bounds on s other than $s\ge \log |A|_\infty $, as it seems to already be hard to show that a sufficient proportion of coefficients is nonzero.

3.3 Proof of the height reduction

Theorem 3.7

Let C be a modular curve over $\textbf{Q}$ and suppose that C is an elliptic curve of rank 0 with Weierstrass coordinates x and y. Suppose that $\tau \in \textbf{H}$ ranges over a sequence of imaginary quadratic points for which C yields real generalized class polynomials $H_\tau [C]$, and with

$$\begin{aligned} \frac{h(j(\tau ))}{\log (\log (\#\textrm{Cl}({\mathcal {O}})))}\rightarrow \infty . \end{aligned}$$

(3.8)

Scale each $H_\tau [C]$ such that it has coprime coefficients in $\textbf{Z}$. Then

$$\begin{aligned} d\cdot \frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [j]|_\infty }\rightarrow \frac{1}{r(C)}, \end{aligned}$$

where d is the degree of $K_{\mathcal {O}}$ over $K(\psi (\tau ))$.

Remark 3.9

We argue that the hypothesis (3.8) is very reasonable. Under GRH, we have

$$\begin{aligned} \#\textrm{Cl}(\mathcal {O}) = O(\sqrt{|D|}\log (\log |D|)), \end{aligned}$$

(3.10)

where D is the discriminant of $\mathcal {O}$ (see [22, 9.Theorem 1 and 11. on page 371], suitably extended to arbitrary D.)

Moreover, [8, §6.2] gives the approximation $\log |H_\tau [j]|_\infty \approx \pi \sqrt{|D|} S(D)$, with $S(D) = \sum _Q a^{-1}$, where the sum ranges over reduced primitive quadratic forms $Q = ax^2+bxz+cz^2$ of discriminant D. We now give a heuristic lower bound of this sum on average over all $|D|\le X$. We have $\sum _D S(D) \approx \sum _{Q} a^{-1}$, where this time the sum is taken over all reduced quadratic forms of negative discriminant $>-X$ (using the heuristic that imprimitive forms have a negligible contribution). As we are only computing a lower bound, we may restrict to $a \le \sqrt{X/8}$. Then b ranges from $-a$ to a, and c ranges from a or $a+1$ to $\lfloor (X+b^2)/(4a)\rfloor $; a range that contains at least $\lfloor X/(8a)\rfloor $ integers. This yields at least roughly X/4 values of b and c for each a, hence $\sum _D S(D)$ is roughly at least $(X/4)\sum _{a^2\le X/8} a^{-1} \ge \frac{1}{8} X\log (X)$.

It follows that the average S(D) is at least proportional to $\log |D|$. Thus, for “average” S(D), we have that $\log |H_\tau [j]|_\infty $ is at least proportional to $\sqrt{|D|}\log |D|$. Combined with (3.10), (3.5), and Lemma 3.4, we find for such D that $h(j(\tau ))/\log (\log (\#\textrm{Cl}(\mathcal {O})))$ is at least proportional to $\log |D| / (\log (\log |D|))^2$. We thus see that (3.8) indeed holds for “average” S(D).
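The sum $S(D)$ in this remark can be computed by enumerating reduced forms. The following Python sketch does so with the standard reduction conditions $-a < b \le a \le c$ (and $b\ge 0$ if $a=c$); the function names are ours, and `log_height_estimate` simply evaluates the approximation $\pi \sqrt{|D|}S(D)$ quoted from [8, §6.2] without any claim about its accuracy.

```python
from fractions import Fraction
from math import gcd, pi, sqrt


def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    forms = []
    a = 1
    while 3 * a * a <= -D:               # reduced forms have 3a^2 <= |D|
        for b in range(-a + 1, a + 1):   # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:   # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms


def S(D):
    """S(D) = sum of 1/a over reduced primitive forms of discriminant D."""
    return sum(Fraction(1, a) for a, _, _ in reduced_forms(D))


def log_height_estimate(D):
    """The approximation pi * sqrt(|D|) * S(D) of log |H_D[j]|_inf."""
    return pi * sqrt(-D) * float(S(D))
```

For instance, $D=-23$ has the three reduced forms $(1,1,6)$, $(2,-1,3)$, $(2,1,3)$, so that $S(-23) = 1+\tfrac{1}{2}+\tfrac{1}{2} = 2$.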

Theorem 3.7 is the analogue of the following result.

Theorem 3.11

(cf. Enge-Morain [8]) Let f be a modular function and suppose that $\tau \in \textbf{H}$ ranges over a sequence of imaginary quadratic points for which $(f,\tau )$ is a class invariant with $h(j(\tau ))\rightarrow \infty $. Then $d\cdot \frac{\log |H_\tau [f]|_\infty }{\log |H_\tau [j]|_\infty }\rightarrow \frac{1}{r(f)}$, where d is the degree of $K_{\mathcal {O}}$ over $K(f(\tau ))$.

The goal of the remainder of Sect. 3 is to prove Theorem 3.7. We start with a proof of Theorem 3.11.

Proof

Let m be the degree of $K(f(\tau ))$ over K and let $n = dm$ be the degree of $K_{\mathcal {O}}$ over K. By Lemma 3.4 and (3.5), we get $|\frac{1}{n}\log |H_\tau [j]|_\infty - h(j(\tau ))|\le \log (2)$ and $|\frac{d}{n}\log |H_\tau [f]|_\infty - h(f(\tau ))| \le \log (2)$.

As $h(j(\tau ))\rightarrow \infty $, we also get

$$\begin{aligned} \frac{h(f(\tau ))}{h(j(\tau ))} \rightarrow \frac{1}{r(f)} \end{aligned}$$

(3.12)

by [19, Proposition B.3.5(b)]. Altogether, this gives the result. $\square $

Proposition 3.13

Let C be a modular curve over $\textbf{Q}$ and suppose that C is an elliptic curve of rank 0 with Weierstrass coordinates x and y. For every imaginary quadratic $\tau \in \textbf{H}$ for which C yields a real generalized class polynomial $H_\tau [C]$, let m be the degree of $K(\psi (\tau ))$ over K and let ${d'}\in \{1,2\}$ be the degree of $K(\psi (\tau ))/K(x(\tau ))$. Scale each $H_\tau [C]$ such that it has coprime coefficients in $\textbf{Z}$. Then we have

$$\begin{aligned} \left| \log |H_\tau [C]|_\infty - \frac{{d'}}{2}\log |H_\tau [x]|_\infty \right| < B\max \{1,m\log (\log (m))\}, \end{aligned}$$

for some constant B that only depends on C and the choice of Weierstrass model.

Proof

We first put the equation for C in a nice form. We have $C : y^2 + g(x)y = f(x)$. Without loss of generality we have $g=0$ and $f\in \textbf{Z}[X]$ monic of odd degree such that $f(z)\le -1$ for all real $z\le 0$. Indeed, we obtain $g=0$ by the substitution $y' = y+\frac{1}{2}g(x)$, then do scalings $x' = vx$ and $y'=wy$ to make f integral and (thanks to its odd degree) monic, and then do a substitution $x' = x+c$ to make $f(z)\le -1$ for all $z\le 0$. This affects $H_\tau [C]=A+BY$ and $H_\tau [x]$ as follows. The first substitution changes A into $A+\frac{1}{2} g(X)B$, the second changes A into A(vX) and B into wB(vX), and the third changes A into $A(X+c)$. Each of these substitutions changes $\log (\max \{|A|_1,|B|_1\})$ at most by O(m), as does clearing the denominators afterwards.

Next, we relate a norm of $H_\tau [C]$ to $H_\tau [x]$. The extra elliptic curve point $(a/b^2,c/b^3) := -\sum _{\sigma \in G} \sigma (\psi (\tau ))\in C(\textbf{Q})$ from (2.4) (which is minus the Heegner point) is torsion by our assumption that $C$ has rank 0. There are finitely many torsion points in $C(\textbf{Q})$, hence finitely many possibilities for the polynomial $T = b^2X-a$. Writing $H_\tau [C]= A(X) + B(X)Y$, we get that $N(H_\tau [C]) = A(X)^2 + (-f(X))B(X)^2$ has the same divisor as the primitive polynomial $H_\tau [x]^{d'}\cdot T$, hence there is a constant $s\in \textbf{Z}\setminus \{0\}$ with $N(H_\tau [C]) = s H_\tau [x]^{d'}\cdot T$.

We claim that $s = \pm 1$. If not, take a prime $p\mid s$ and consider the highest-weight term of $(H_\tau [C]\bmod p)$, where X has weight 2 and Y has weight $\deg (f)$. This gives rise to the highest-degree term of $(N(H_\tau [C])\bmod p)$, which is therefore nonzero, a contradiction.

Now we use interpolation to bound $H_\tau [C]$ in terms of $H_\tau [x]$. We will choose interpolation points $z= g(i) \le 0$. Note that for $z\le 0$ we have

$$\begin{aligned} A(z)^2, B(z)^2 \le A(z)^2 + (- f(z))B(z)^2 = N(H_\tau [C])(z) \le \max \{1,|z|\}^m |H_\tau [x]|_1^{d'} |T|_1, \end{aligned}$$

and since there are finitely many polynomials T, we get

$$\begin{aligned} \log |A(z)|, \log |B(z)| \le \frac{m}{2} \max \{0,\log |z|\} + \frac{{d'}}{2} \log |H_\tau [x]|_1 + O(1). \end{aligned}$$

Interpolation then gives, for $P \in \{A,B\}$:

$$\begin{aligned} P(X) = \sum _{i=1}^{k} P(g(i)) \prod _{j\not =i} \frac{X-g(j)}{g(i)-g(j)}, \end{aligned}$$

(3.14)

where $k = \deg (P)+1 = O(m)$.

Taking $g(u) = -\log (eu)^2$, we find $|g(i)-g(j)| \ge |i-j|\min _{u\in [1,k]} |g'(u)| = |i-j|\min _{u\in [1,k]} 2\frac{\log (eu)}{u} = 2|i-j|\frac{\log (ek)}{k}$. So for each i there are at most $k/\log (k)$ values of $j\not =i$ with $|g(i)-g(j)| < 1$ and each of them has $|g(i)-g(j)| \ge 1/k$. We get

$$\begin{aligned} \log \prod _{j\not =i} \frac{1}{|g(i)-g(j)|} \le (k/\log (k)) \log (k) = k = O(m). \end{aligned}$$
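The two estimates on the interpolation nodes $g(i) = -\log (ei)^2$ can be checked numerically for a given k. The following Python sketch (function names ours) tests the spacing bound $|g(i)-g(j)|\ge 2|i-j|\log (ek)/k$ and returns the largest value of $\log \prod _{j\not =i}|g(i)-g(j)|^{-1}$ over all i, which the argument above bounds by k. It is a floating-point sanity check for individual values of k only.

```python
import math


def g(u):
    """Interpolation node function g(u) = -(log(e*u))^2."""
    return -(math.log(math.e * u) ** 2)


def check_nodes(k):
    """Return (spacing_ok, worst) for the nodes g(1), ..., g(k), k >= 2.

    spacing_ok: whether |g(i) - g(j)| >= 2|i - j| log(e k) / k for all i != j
                (with a tiny relative tolerance for rounding).
    worst:      max over i of sum_{j != i} -log|g(i) - g(j)|.
    """
    nodes = [g(i) for i in range(1, k + 1)]
    spacing = 2 * math.log(math.e * k) / k
    spacing_ok = all(
        abs(nodes[i] - nodes[j]) >= spacing * abs(i - j) * (1 - 1e-12)
        for i in range(k) for j in range(k) if i != j
    )
    worst = max(
        sum(-math.log(abs(nodes[i] - nodes[j])) for j in range(k) if j != i)
        for i in range(k)
    )
    return spacing_ok, worst


spacing_ok_50, worst_50 = check_nodes(50)
```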

For the other factors in (3.14), we have $\log |X-g(j)|_1 \le \log (1 + \log (em)^2) = O(\log (\log (m)))$, so $\log \prod _{j} |X-g(j)|_1 = O(m\log (\log (m)))$, as well as $\log |P(g(i))| \le \frac{{d'}}{2}\log |H_\tau [x]|_1 + O(m\log (\log (m)))$. Taking the sum in (3.14) gives another $+\log (k)$, so that the end result is $\log |P(X)|_1 \le \frac{{d'}}{2}\log |H_\tau [x]|_1 + O(m\log (\log ( m)))$. By Lemma 3.4, this also holds with $|\cdot |_\infty $, which proves the upper bound on $\log |H_\tau [C]|_\infty $.

For the lower bound, note that $H_\tau [x]^{d'}$ is a factor of $Q =A^2-f(X)\cdot B^2$, and we have $|Q|_1 \le |A|_1^2 + |f|_1 |B|_1^2 \le |f|_1(m+1)^2|H_\tau [C]|_\infty ^2$. Using the fact that ${\mathcal {M}}$ is multiplicative by definition and is related to $|\cdot |_1$ and $|\cdot |_\infty $ by Lemma 3.4, we get exactly what we need: ${d'}\log |H_\tau [x]|_\infty \le {d'}\log {\mathcal {M}}(H_\tau [x]) + O(m) \le \log {\mathcal {M}}(Q) + O(m) \le \log |Q|_1 + O(m) \le 2\log (|H_\tau [C]|_\infty ) + O(m)$. $\square $

Proof of Theorem 3.7

Denote again by $n=\#\textrm{Cl}(\mathcal {O})$ the degree of $K_{\mathcal {O}}$ over K. First we apply Theorem 3.11 to x and get $d{d'}\frac{\log |H_\tau [x]|_\infty }{\log |H_\tau [j]|_\infty }\rightarrow \frac{2}{r(C)}$. Proposition 3.13, together with the hypothesis (3.8), that is, $h(j(\tau ))/\log (\log (n))\rightarrow \infty $, gives $\frac{1}{{d'}}\frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [x]|_\infty }\rightarrow \frac{1}{2}$ (as in the proof of Theorem 3.11). The product of these two limits gives the result. $\square $
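Written out in one line, the final step of this proof is the following identity of ratios. The first factor tends to $2/r(C)$ because x has degree 2 as a function on C, so that $r(x) = r(C)/2$, and $K_{\mathcal {O}}$ has degree $d{d'}$ over $K(x(\tau ))$:

$$\begin{aligned} d\cdot \frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [j]|_\infty } = \left( d{d'}\frac{\log |H_\tau [x]|_\infty }{\log |H_\tau [j]|_\infty }\right) \cdot \left( \frac{1}{{d'}}\frac{\log |H_\tau [C]|_\infty }{\log |H_\tau [x]|_\infty }\right) \rightarrow \frac{2}{r(C)}\cdot \frac{1}{2} = \frac{1}{r(C)}. \end{aligned}$$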

Remark 3.15

Theorem 3.7 states that asymptotically the effect of the choice of a model of the curve $C$ is negligible, as is the effect of replacing f by 2f or $f+1$ or any other element of $\textbf{Q}(f)$ in Theorem 3.11.

However, in practice the error terms can be quite large and depend on these choices. For example, if f is integral over $\textbf{Z}[j]$ then $H_\tau [f]$ is monic, and if $f^{-1}$ is integral over $\textbf{Z}[j]$, then the constant coefficient of $H_\tau [f]$, scaled to be primitive, is a unit. This can make a difference in practical examples as it forces the coefficients at the beginning and end to be small, though this improvement is negligible asymptotically by the theorems. See also Remark 3.6.

4 Class invariants for $X^0(N)$ and $X^0_+(N)$

In this section we assume that C is a quotient over $\textbf{Q}$ of $X^0(N)$; in other words, C is a smooth, projective, geometrically irreducible curve over $\textbf{Q}$ with function field consisting only of modular functions for $\Gamma ^0(N)$ that have rational q-expansion. We will show how to obtain generalized class functions for every discriminant $D<0$ that is a square modulo 4N (Sect. 4.1).

In some cases we get further reductions from class invariants generating subfields of $K_{\mathcal {O}}$ or from real class polynomials (Sects. 4.2–4.4).

In Sects. 4.5–4.6 we study what this means for $X^0_+(119)$ and in Sect. 4.7 we look for more examples of elliptic curve quotients of $X^0(N)$.

4.1 Class invariants for $X^0(N)$

The following result does not require C to be an elliptic curve, except that (unless C is an elliptic curve) one needs to read the definitions in Sect. 7 for the parts about generalized class polynomials.

Proposition 4.1

(based on Schertz [30]) Let $C = (C, \psi )$ be a quotient over $\textbf{Q}$ of $X^0(N)$ and let $D<0$ be a square modulo 4N.

There exist $a,b,c\in \textbf{Z}$ with $a,c>0$, $b^2-4ac = D$, $N\mid c$, and $\gcd (a,N) = \gcd (a,b,c) = 1$. Choose such a, b, c, let $\tau \in \textbf{H}$ be a root of $aX^2 + bX + c$, with order $\mathcal {O} = \textbf{Z}[a\tau ]$, which has discriminant D. Then we have

$$\begin{aligned} \psi (\tau )\in C(K_{{\mathcal {O}}}), \end{aligned}$$

thus giving rise to a generalized class polynomial $H_{\tau }[C]$.

The Galois orbit of $\psi (\tau )$ can be computed as follows. There exists an N-system, that is, there exist $\tau _1,\ldots , \tau _n\in \textbf{H}$ such that $(\tau _i\textbf{Z}+\textbf{Z})_i$ is a system of representatives of $\textrm{Cl}(\mathcal {O})$ and such that $\tau _i$ is a root of $a_iX^2 + b_iX + c_i$ with $\gcd (a_i,N) = \gcd (a_i,b_i,c_i)=1$ and $b_i\equiv b\ \textrm{mod}\ 2N$. Moreover, for any such choice, we have

$$\begin{aligned} \textrm{Gal}(K_{\mathcal {O}}/K)\cdot \psi (\tau ) = \{\psi (\tau _i) : i=1,\ldots ,n\}. \end{aligned}$$

Proof

For the existence of a, b, c, take an arbitrary square root b of D modulo 4N, let $a=1$, and $c= (b^2-D)/4$. Then the existence of an N-system is [30, Proposition 3].

For any $f\in \textbf{Q}(C)$, Theorem 4 of Schertz [30] states $f(\tau ) \in K_{\mathcal {O}}\cup \{\infty \}$ and gives the $\textrm{Gal}(K_{\mathcal {O}}/K)$-orbit as $\{f(\tau _i) : i\}$, under an additional condition on the function f(1/z). However, the condition on f(1/z) is not needed, as stated in Theorems 3.9 and 4.4 of [13]. This proves the result. $\square $
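The existence part of this proof is constructive. A minimal Python sketch, with the illustrative name `find_triple`, searches for a square root b of D modulo 4N and returns the triple with $a=1$ used in the proof; since $b^2 \bmod 4N$ only depends on $b \bmod 2N$, it suffices to try $0\le b<2N$.

```python
def find_triple(D, N):
    """Return (a, b, c) with a = 1, b^2 - 4ac = D and N | c, or None.

    Follows the proof of Proposition 4.1: pick b with b^2 = D mod 4N
    and set c = (b^2 - D) / 4. Returns None if D is not a square
    modulo 4N.
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    for b in range(2 * N):
        if (b * b - D) % (4 * N) == 0:
            return 1, b, (b * b - D) // 4
    return None


triple = find_triple(-475, 119)     # -475 = 1 - 4 * 119
```

For $N=119$ and $D=-475$ this gives $(a,b,c) = (1,1,119)$, whereas `find_triple(-3, 119)` returns `None` because $-3$ is not a square modulo 17.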

4.2 Real class polynomials from ramification

There are some situations in which we can actually get real class polynomials, cutting the total required bit size in half. The first such situation is when all primes dividing N ramify.

Proposition 4.2

(based on Enge-Morain [9]) Let $C = (C, \psi )$ be a quotient over $\textbf{Q}$ of $X^0(N)$ and let $D<0$ be a discriminant divisible by N if N is odd and by 4N if N is even.

There exist $a,b,c\in \textbf{Z}$ with $a,c>0$, $N\mid b,c$, $\gcd (a,N)=1$, and $b^2-4ac = D$. Choose such a, b, c, let $\tau \in \textbf{H}$ be a root of $aX^2 + bX + c$, with order $\mathcal {O} = \textbf{Z}[a\tau ]$, which has discriminant D.

Then the $\textrm{Gal}(K_{\mathcal {O}}/K)$-orbit of $\psi (\tau )$ is stable under complex conjugation, and hence we may take $H_{\tau }[C]\in \textbf{Q}[X,Y]$.

Proof

If D is odd, take $b=N$, and if D is even, take $b=0$. If N is even, then we find $4N\mid b^2-D$. If N is odd, then we find both $4\mid b^2-D$ and $N\mid b^2-D$, hence also $4N\mid b^2-D$. Let $a = 1$ and $c = (b^2-D)/4$.

The complex conjugate of $\psi (\tau )$ is $\psi (-{\overline{\tau }})$ by the fact that the q-expansion coefficients are real. Here $-{\overline{\tau }}$ is a root of $aX^2 - bX + c$, and as $N\mid b$, we can choose the N-system in Proposition 4.1 in such a way that $-{\overline{\tau }} =\tau _i$ for some i. This proves the result. $\square $

4.3 Real class polynomials from $X^0_+(N)$

The second situation in which we get real class polynomials is when working with quotients of $X^0_+(N)$.

Proposition 4.3

(based on Theorem 3.4 of Enge-Schertz [10]) In the situation of Proposition 4.1, suppose furthermore that C is a quotient of $X^0_+(N)$, and that $\gcd (c/N,N) = 1$.

Then the $\textrm{Gal}(K_{\mathcal {O}}/K)$-orbit of $\psi (\tau )$ is stable under complex conjugation, and hence we may take $H_{\tau }[C]\in \textbf{Q}[X,Y]$.

Proof

The complex conjugate of $\psi (\tau )$ is $\psi (-{\overline{\tau }})$ by the fact that the q-expansion coefficients are real. As $\psi $ is invariant under the Fricke-Atkin-Lehner involution, this in turn is $\psi (\tau ')$ with $\tau ' = N/{\overline{\tau }}$, a root of $(c/N)X^2 + bX + Na$. As c/N is coprime to N, we can choose the N-system in Proposition 4.1 in such a way that $\tau ' =\tau _i$ for some i. This proves the result. $\square $
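The claim that $\tau ' = N/{\overline{\tau }}$ is a root of $(c/N)X^2+bX+Na$ can be checked numerically. The following Python sketch does this in floating-point arithmetic; the function name `fricke_image` and the sample values $N=119$, $(a,b,c)=(1,1,238)$, corresponding to $D=-951$, are our own choices for illustration.

```python
import cmath


def fricke_image(a, b, c, N):
    """Return (tau, tau_prime, residual) for tau a root of aX^2 + bX + c.

    tau is the root in the upper half plane, tau_prime = N / conj(tau),
    and residual is the value of (c/N) X^2 + b X + N a at tau_prime,
    which should vanish up to rounding. Assumes N divides c.
    """
    D = b * b - 4 * a * c
    tau = (-b + cmath.sqrt(D)) / (2 * a)   # principal sqrt: Im(tau) > 0
    tau_prime = N / tau.conjugate()
    residual = (c // N) * tau_prime ** 2 + b * tau_prime + N * a
    return tau, tau_prime, residual


tau, tau_prime, residual = fricke_image(1, 1, 238, 119)
```

In this sample case $c/N = 2$ is coprime to N, and one finds $\tau ' = \tau /2$, a root of $2X^2+X+119$.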

To use this result, we will need $\gcd (c/N,N)=1$, which can be achieved most of the time, as follows.

Lemma 4.4

If D is a square modulo 4N and $D = F^2D_0$ for a negative fundamental discriminant $D_0$ and a positive integer F coprime to N, then there exist a, b, c as in Proposition 4.1 with $\gcd (c/N,N) = 1$.

More generally, let $D<0$ be a square modulo 4N. Then there exist a, b, c as in Proposition 4.1 with $\gcd (c/N,N) = 1$ if and only if none of the following holds.

(1)
there exists a prime $p\mid N$ with ${{\,\textrm{ord}\,}}_p(N)$ odd and ${{\,\textrm{ord}\,}}_p(D) > {{\,\textrm{ord}\,}}_p(4N)$,
(2)
$m:={{\,\textrm{ord}\,}}_2(N) > 0$ and D is of the form $2^{m+1}d$ with $d\equiv 1\ (\textrm{mod}\ 4)$,
(3)
$m:={{\,\textrm{ord}\,}}_2(N) > 0$ and D is of the form $2^{m}d$ with $d\equiv 1\ (\textrm{mod}\ 8)$.

Proof

The triple (a, b, c) exists if and only if there exists $b\in \textbf{Z}$ such that for all $p\mid N$: ${{\,\textrm{ord}\,}}_p(b^2-D) = {{\,\textrm{ord}\,}}_p(4N)$.

Suppose that we are not in case (1), (2), or (3). By the Chinese remainder theorem, it suffices to find one $b\in \textbf{Z}$ for each $p\mid N$. So let $p\mid N$ be prime and let $k = {{\,\textrm{ord}\,}}_p(4N)$ and $l = {{\,\textrm{ord}\,}}_p(D)$. If $k < l$, then as we are not in case (1), we find that k is even, and we can take $b = p^{(k/2)}$. If $k=l$, then we can take $b = p^e$ with $e>k/2$. Now the case $k>l$ remains. As D is a square modulo 4N, there exists $b_0\in \textbf{Z}$ such that $D\equiv b_0^2\ (\textrm{mod}\ 4N)$. If ${{\,\textrm{ord}\,}}_p(b_0^2-D) = {{\,\textrm{ord}\,}}_p(4N)$, then we are done, so suppose ${{\,\textrm{ord}\,}}_p(b_0^2-D)>k$.

Note that $2{{\,\textrm{ord}\,}}_p(b_0) = l$, hence l is even. Let $b = b_0 + p^{e}$ with e to be determined later. We get $b^2-D = (b_0^2-D) + 2p^{e}b_0 + p^{2e}$, and the terms have valuation $>k$, $e+(l/2)+{{\,\textrm{ord}\,}}_p(2)$, 2e respectively.

If $p\not =2$, then we choose $e=k-(l/2)$, so $2e = k+(k-l) > k$, hence ${{\,\textrm{ord}\,}}_p(b^2-D) = k$. If $p=2$ and $k > l+2$, then we choose $e=k-(l/2)-1$, so $2e = k+(k-l-2)>k$, hence ${{\,\textrm{ord}\,}}_p(b^2-D) = k$.

Now only the case $p=2$ with $k-l\in \{1,2\}$ remains. Write $d = 2^{-l}D$ and $b_1 = 2^{-(l/2)} b_0$, so $b_1$ is odd and $b_1^2-d$ is divisible by $2^{k-l}$.

In the case $k-l = 1$, we get $b_1^2 - d \equiv 0\pmod {2}$, and we claim that this is nonzero modulo 4. Indeed, $b_1^2$ is 1 modulo 4 and d is not (as we are not in case (2)). Therefore ${{\,\textrm{ord}\,}}_2(b_1^2-d) = 1$ and ${{\,\textrm{ord}\,}}_2(b_0^2-D) = 1+l=k$, so we take $b = b_0$.

In the case $k-l = 2$, we get $b_1^2 - d\equiv 0\pmod {4}$, and we claim that this is nonzero modulo 8. Indeed, $b_1^2$ is 1 modulo 8, and d is not (as we are not in case (3)). Therefore ${{\,\textrm{ord}\,}}_2(b_1^2-d) = 2$ and ${{\,\textrm{ord}\,}}_2(b_0^2-D) = 2 + l = k$, so we take $b = b_0$.

Conversely, suppose that b exists.

In case (1), we have ${{\,\textrm{ord}\,}}_p(D)> {{\,\textrm{ord}\,}}_p(4N)$, hence $2{{\,\textrm{ord}\,}}_p(b) ={{\,\textrm{ord}\,}}_p(4N)$ is odd, contradiction.

In case (2), we have ${{\,\textrm{ord}\,}}_2(b^2-2^{m+1}d) = m+2$, hence $m+1 = 2{{\,\textrm{ord}\,}}_2(b)=: 2e$. Write $b = 2^{e}b_1$ and note ${{\,\textrm{ord}\,}}_2(b_1^2-d) = 1$, but $b_1^2-d$ is 0 modulo 4.

In case (3), we have ${{\,\textrm{ord}\,}}_2(b^2-2^{m}d) = m+2$, hence $m = 2{{\,\textrm{ord}\,}}_2(b)=:2e$. Write $b = 2^eb_1$ and note ${{\,\textrm{ord}\,}}_2(b_1^2-d) = 2$, but $b_1^2-d$ is 0 modulo 8.

It remains only to prove the first statement, for which it suffices to show that the exceptions (1), (2), and (3) all imply $\gcd (N,F)>1$. In case (1), we see that $p^2\mid D$ and if $p=2$, then $p^4\mid D$, hence $p\mid F$. In cases (2) and (3), write $D = 2^v d$ with $v\in \{m,m+1\}$. As D is a square modulo $2^{m+2}$, we find that v is even, and hence $D = (2^{v/2})^2 d$ for a discriminant d, so $2\mid F$. $\square $

Lemma 4.5

Let N be the product of distinct odd primes $p_1,\ldots , p_k$. The negative discriminants that are a square modulo 4N and not in one of the exceptions of Lemma 4.4 have density

$$\begin{aligned} \prod _{i=1}^k \frac{p_i^2 + p_i-2}{2p_i^2} \end{aligned}$$

in the set of all negative discriminants.

The negative fundamental discriminants that are a square modulo 4N (which are not in one of the exceptions of Lemma 4.4) have density

$$\begin{aligned} \prod _{i=1}^k \frac{p_i^2 + p_i-2}{2(p_i^2-1)} \end{aligned}$$

in the set of all fundamental negative discriminants.

Proof

Being a discriminant is the condition of being 0 or 1 modulo 4. It is equivalent to being a square modulo 4. This is independent of being a square modulo $p_i$ that does not suffer from (1), which happens for the $(p_i-1)/2$ residue classes modulo $p_i$ that are nonzero squares modulo $p_i$, and the $p_i-1$ nonzero residue classes modulo $p_i^2$ that are zero modulo $p_i$. As $p_i(p_i-1)/2 + p_i-1 = (p_i^2+p_i-2)/2$, we get the first statement.

Being a fundamental discriminant means being nonzero modulo the squares of all odd primes and being 1, 5, 8, 9, 12, 13 modulo 16. This happens for $\zeta (2)^{-1} (1-1/4)^{-1} \frac{6}{16}$ of all negative integers. In order to restrict this to products that satisfy the conditions of Lemma 4.4, we have to adjust the Euler product exactly by the given factor. $\square $

For example, if $N = 119 = 7\cdot 17$, then the numbers in Lemma 4.5 are $> 0.2898$ and $19/64 > 0.2968$.
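These two numbers can be checked with exact rational arithmetic. The following Python sketch is only an illustration and is not part of the computations reported here; the function names are ours.

```python
from fractions import Fraction

def density_all(primes):
    """Product of (p^2 + p - 2) / (2 p^2) over distinct odd primes p,
    i.e. the first density of Lemma 4.5."""
    result = Fraction(1)
    for p in primes:
        result *= Fraction(p * p + p - 2, 2 * p * p)
    return result

def density_fundamental(primes):
    """Product of (p^2 + p - 2) / (2 (p^2 - 1)) over distinct odd primes p,
    i.e. the second density of Lemma 4.5."""
    result = Fraction(1)
    for p in primes:
        result *= Fraction(p * p + p - 2, 2 * (p * p - 1))
    return result
```

For $N = 7\cdot 17$, `density_all([7, 17])` is the fraction 4104/14161, which exceeds 0.2898, and `density_fundamental([7, 17])` is exactly 19/64.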

4.4 Lower-degree class polynomials from ramification

In the case where all primes dividing N ramify, we get an even greater size reduction. The point $\psi (\tau )$ will then be defined over a subfield, cutting the degree of its minimal polynomial in half. This in turn also cuts the height of the coefficients of this polynomial in half, as we get $d\ge 2$ in Theorem 3.7. The amount of work required for computing the class polynomial, as well as the bit size of the polynomial (Remark 3.6), is related to the degree times the logarithm of the largest coefficient, and this product is reduced by a factor $\ge 2\times 2 \times r(C)= 4r(C)$.

Proposition 4.6

(based on Enge-Schertz [12]) Let $C = (C, \psi )$ be a quotient over $\textbf{Q}$ of $X^0(N)$ and let $D=F^2D_0<0$ be such that $N\mid D$, $\gcd (F,N)=1$, and $D\not \in \{-N,-4N\}$.

There exist $a,b,c\in \textbf{Z}$ with $a>0$, $N\mid b$, $c=N$, $b^2-4ac = D$, and $\gcd (a,b,c) = 1$. Choose such a, b, c, let $\tau \in \textbf{H}$ be a root of $aX^2 + bX + c$, with order $\mathcal {O} = \textbf{Z}[a\tau ]$, which has discriminant D.

Let ${\mathfrak {n}} = ((-b+\sqrt{D})/2, a)$, and let $K_{\mathcal {O}}^{[{\mathfrak {n}}]}$ be the subfield of $K_{\mathcal {O}}$ fixed by the image of ${\mathfrak {n}}$ under the Artin map. Then $[{\mathfrak {n}}]$ has order 2 in $\textrm{Cl}(\mathcal {O})$ and $\psi (\tau )\in C(K_{\mathcal {O}}^{[{\mathfrak {n}}]})$, where $K_{\mathcal {O}}$ has degree 2 over $K_{\mathcal {O}}^{[{\mathfrak {n}}]}$.

We get $m \le \#\textrm{Cl}(\mathcal {O})/2$ in the definition of $H_\tau [C]$, we get $H_\tau [C]\in \textbf{Q}[X,Y]$, and we get $d\ge 2$ in Theorem 3.7.

If ${\mathfrak {a}}_i$ are the ideals $\tau _i\textbf{Z}+\textbf{Z}$ of an N-system, then $\mathfrak {a}_i$ and $\mathfrak {a}_i{\mathfrak {n}}$ yield the same point $\psi (\tau _i)$, while $\mathfrak {a}_i^{-1}$ and $\mathfrak {a}_i^{-1}{\mathfrak {n}}$ yield $\overline{\psi (\tau _i)}$.

Proof

This is exactly what we get when applying [12, Theorem 9] to the coordinate functions f of C. $\square $

4.5 Numerical results for $X^0_+(119)$

For the rest of this section we will return to our main Example 2.7, so set $N=119=7\cdot 17$. For any $\tau $ as in Proposition 4.3, we have $H_{\tau }[C]\in \textbf{Q}[X,Y]$. By scaling, we may assume that the coefficients of $H_{\tau }[C]$ are integral and coprime, and that the leading coefficient (i.e. the coefficient of the monomial of highest degree as a function on C) is positive, and this uniquely determines $H_{\tau }[C]\in \textbf{Z}[X,Y]$.

For any discriminant $D<0$ coprime to N such that D is a square modulo N, there are two generalized class polynomials (depending on the choice of $\tau $). We experimentally computed both of these for all fundamental discriminants of prime class number $<100$. The main reason for restricting to prime class number is to exclude the two tricks of Remark 3.3; for these discriminants, the reduction factor thus provides a fair comparison with the Hilbert class polynomial. The method we employ numerically evaluates class invariants by their q-expansions, and finds a minimal polynomial relation (2.3) using lattice basis reduction (LLL). We leave faster methods for future research, but see Sect. 6 for the first ideas. Since the q-expansions can only be evaluated up to finite precision, this does not result in provably correct polynomials, although – based on heuristic estimates – they are highly unlikely to be incorrect.

A few examples of computed polynomials are listed in Table 1. Here, for the given discriminant D, we consistently chose $\tau $ such that its primitive equation is $X^2+bX+(b^2-D)/4$ with $b\in \textbf{Z}_{>0}$ minimal satisfying $b^2\equiv D\pmod {4N}$ and $\gcd ((b^2-D)/(4N) ,N) = 1$.
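The choice of b just described is easy to reproduce by a direct search. The sketch below is a naive illustration under our own naming (`minimal_b` is hypothetical and not taken from the implementation used for Table 1); it searches $b$ up to $4N^2$, which suffices because both conditions depend only on $b$ modulo $4N^2$.

```python
from math import gcd

def minimal_b(D, N):
    """Smallest b > 0 with b^2 = D (mod 4N) and gcd((b^2 - D)/(4N), N) = 1.

    Returns None if no such b exists, for example when D is not a
    square modulo 4N.
    """
    modulus = 4 * N
    # Both conditions only depend on b modulo 4*N*N, so this range is enough.
    for b in range(1, 4 * N * N + 1):
        diff = b * b - D
        if diff % modulus == 0 and gcd(diff // modulus, N) == 1:
            return b
    return None
```

For instance, with $N = 119$ and $D = -19$ one finds $b = 95$, since $95^2 + 19 = 9044 = 4\cdot 119\cdot 19$ and $\gcd (19, 119) = 1$.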

Table 1 Some conjecturally correct generalized class functions for $C=X^0_+(119)$


Still assuming that $H_{\tau }[C]$ is scaled such that it has coprime coefficients in $\textbf{Z}$, we denote by

$$\begin{aligned} r_A(\tau ):=\frac{\log |H_{\tau }[j]|_{\infty }}{\log |H_{\tau }[C]|_{\infty }} \end{aligned}$$

the practical reduction factor of $\tau $. Under the assumption $h(j(\tau ))/\log (\log (n))\rightarrow \infty $ for $n=\#\textrm{Cl}(\mathcal {O})$ (cf. Theorem 3.7) we have $d^{-1}r_A(\tau )\rightarrow r(C)$. Experimentally obtained practical reduction factors, plotted against both the class number n and $\log (|H_{\tau }[j]|_{\infty })/\log (\log (n))$, can be seen in Fig. 1. To visualize the role of the class number and the hypothesis $h(j(\tau ))/\log (\log (n))\rightarrow \infty $, the points of higher class number are given a darker color in the second figure.

The values of the practical reduction factor $r_A(\tau )$ seem to be around their expected asymptotic value $r(C)=72$ (represented by the horizontal grey line), though the convergence is not apparent; especially compared to, e.g. some classical class polynomials [8, Fig. 1]. However, in practical applications (see Sect. 5), the class numbers employed are typically several orders of magnitude higher (cf. e.g. [34]), so here we expect the speed of convergence not to cause major deviations in expected running times (cf. Sect. 6). For small class numbers, one can in practice even take advantage of this phenomenon by constructing generalized class polynomials with surprisingly good practical reduction factors by selecting a basis of $\mathcal {L}(\infty \mathcal {D})$ different from $1,x,y,x^2,xy,\ldots $ (see Example 7.4).

4.6 Comparison with existing class invariants

Real class invariants typically arise subject to congruence conditions on the discriminant. For example, Weber’s functions with reduction factor 72 are not known to give class invariants for discriminants $\equiv 5\pmod {8}$. The reduction factors obtained by class invariants coming from the family of (double) eta quotients ${\mathfrak {w}}_n$ and ${\mathfrak {w}}_{p,q}$ (such as the Weber function ${\mathfrak {w}}_2$, as well as the function ${\mathfrak {w}}_{7,17}$ of Example 2.7) have been extensively studied; cf. most notably [9]. These modular functions are not known to yield class invariants if D is not a square modulo 4n or 4pq. Hence, to the best of our knowledge, they also are not applicable to discriminants $\equiv 5\pmod {8}$ as soon as n, p or q is even. Excluding these cases, the (double) eta quotient with highest known reduction factor is ${\mathfrak {w}}_9$, with a reduction factor of 36 [9, Table 7.1].

A less-studied generalization are multiple eta quotients [12], which are quotients of products of $2^k$ eta functions. As far as we know these do not yield reduction factors better than 36 for $k>1$.

The only other known family of “good” class invariants (in the sense that they have large reduction factors) are the Atkin functions $A_p$ for prime numbers p, defined to be the smallest-degree functions in ${\mathcal {L}}(\infty \mathcal {D})$, where $\mathcal {D}$ is the unique cusp of $X^0_+(p)$. The “best” known one here is $A_{71}$, again with a reduction factor of 36, owing to the fact that $X^0_+(71)$ has genus zero [14, §3].

The curve $C=X^0_+(119)$ has a reduction factor $r(C) = 72$ and yields real class invariants whenever D is a square modulo $4\cdot 7\cdot 17$ and not divisible by $7^2$ or $17^2$. The set of such D has density $> 28.98\%$ among the set of all negative discriminants (by Lemma 4.5). Out of these discriminants, one-fourth are $\equiv 5\pmod {8}$. Hence, for at least $28.98\%\cdot \frac{1}{4}> 7.24\%$ of imaginary quadratic discriminants, the reduction factor exceeds the previously best known reduction factors by a factor of at least two.

Remark 4.7

One should note that the above comparison does not take into account the discussion of Remark 3.3. Most importantly, the reduction factor is not synonymous with the true size reduction of the class polynomials. Indeed, as noted in that remark, the record-breaking CM construction [34] uses the Atkin invariant $A_{71}$ of reduction factor 36, because the effective size reduction of class polynomials is by a factor of roughly $2^2\cdot 36 = 144$ for certain discriminants. However, by Sect. 4.4, the same trick applies to generalized class polynomials, leading for $X^0_+(119)$ to a size reduction of $2^2\cdot 72 = 288$, again for a positive density subset of discriminants. In Fig. 2 we plot the practical reductions in bit size we found compared to the Hilbert class polynomial using this trick.

Remark 4.8

Note that the “classical” class polynomial $H_\tau [x]$, arising from the function x on $X^0_+(119)$ by itself attains a reduction factor of 36 for the same $28.98\%$ of discriminants. This beats all previously-known class invariants for a smaller subset ($\approx 1.2\%$) of discriminants: those that additionally are non-square modulo both 3 and 71. This x can be viewed as a generalisation of the Atkin functions to non-prime levels: it is the function of minimal degree in ${\mathcal {L}}(\infty \mathcal {D})$ for one of the cusps $\mathcal {D}$ of $X^0_+(119)$.

Similarly, the degree-two map of the hyperelliptic curve $X^0_+(191)$ (not to be confused with 119) has reduction factor 48, as observed by David Kohel in the AGC${}^2$T 2021 Zulip group chat. This beats the reduction factor 32 of the Atkin function $A_{191}$ of degree 3 on the same curve (see Example 7.2).

This shows that the search for generalized class invariants can even uncover new “classical” class invariants.

4.7 More modular curves of genus one

We searched for more elliptic curves that could be used, and the results are in Tables 2, 3, and 4. In our search, we used the fact that $X_0(N)$ is well-studied and that there is an isomorphism $X_0(N)\rightarrow X^0(N): z\mapsto Nz$. Surprisingly, we found lots of elliptic curves with reduction factor 72 and no elliptic curves with a greater reduction factor.

In Sect. 7, we will allow curves of higher genus, which do achieve arbitrarily high values of r(C). Moreover, our search is by no means exhaustive, as Tables 2 and 3 restrict to maps $\phi : X\rightarrow C$ of degree $\le 2$ and Table 4 only looks at one curve $X = X^0(N)$ per isomorphism class of curves C. For example, the curve $C = X^0_+(119)$ has $r(C) = 72$. However, in the Cremona database, it is listed as 17a4, and comes with a modular parametrization $\phi _{17} : X_0(17)\rightarrow C$ of degree 1, which has $r(\phi _{17}) = 18$. This is why C does not appear in Table 4.

Finally, the tables are restricted to quotients of $X^0(N)$. Letting go of $X^0(N)$, we find that the genus-one modular curves $7C^1$, $8K^1$, $9H^1$, $12V^1$, $15I^1 = X_1(15)$, $16M^1$, $24J^1$, $27C^1$, $32E^1$ in the Pauli-Cummins database [6] all achieve $r(C) \in \{84, 96, 108\}$. We have not pursued these curves yet, as Proposition 4.1 does not apply to them.

Table 2 The curves $X = X^0_+(N)$ for which there exists a map $\phi : X \rightarrow C$ of degree $\le 2$ with $g(C) \le 1$ and $r(C) \ge 48$


Table 3 The curves $X = X^0(N)$ for which there exists a map $\phi : X \rightarrow C$ of degree $\le 2$ with $g(C) \le 1$ and $r(C) \ge 48$ and N is not already in Table 2


Table 4 The elliptic curves $E/\textbf{Q}$ of conductor $< 500{,}000$ such that the modular parametrization $\phi : X \rightarrow E$ according to the LMFDB [5, 35, 36] gives $r(C) \ge 66$ or gives $r(C) \ge 48$ and odd N


5 Application: the CM method

Class polynomials are used in the CM method for constructing elliptic curves over finite fields with a specified characteristic polynomial of Frobenius.

The input to the CM method is a monic quadratic polynomial $P = x^2 - tx + q\in \textbf{Z}[x]$, where q is a prime power coprime to t, and the discriminant $d=t^2-4q$ is negative. The output is an elliptic curve $E/\textbf{F}_q$ with $q+1-t$ rational points, which has P as its characteristic polynomial of Frobenius.

The algorithm of the classical CM method (without using class invariants for now) is as follows. Let $K = \textbf{Q}(\sqrt{d})$.

(1)
Compute the Hilbert class polynomial $H_K$ of ${\mathcal {O}}_K$.
(2)
Find a root $j_0\in \textbf{F}_q$ of $H_K$ (which is known to split into linear factors in $\textbf{F}_q$).
(3)
Construct an elliptic curve $E/\textbf{F}_q$ with $j(E)=j_0$. Compute all twists of E and return the one with $q+1-t$ rational points.

In practice, one can discard the curves for which $(q+1-t)Q\not =O$ for some random point Q, although there are also more straightforward methods to select the correct twist [28].
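Step (3) can be illustrated on a toy scale as follows. This is a minimal sketch, not an efficient implementation: it assumes that q = p is a prime greater than 3 and that $j_0\not \in \{0,1728\}$ (so that there are exactly two twists), it counts points by brute force, and all function names are ours. It uses `pow(x, -1, p)` for modular inverses, which requires Python 3.8 or later.

```python
def legendre(v, p):
    """Legendre symbol of v modulo an odd prime p, as -1, 0 or 1."""
    s = pow(v % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def count_points(a, b, p):
    """Number of points on y^2 = x^3 + a x + b over F_p (brute force)."""
    return p + 1 + sum(legendre(x * x * x + a * x + b, p) for x in range(p))

def curve_from_j(j0, p):
    """(a, b) with j(y^2 = x^3 + a x + b) = j0, assuming j0 is not 0 or 1728."""
    k = j0 * pow(1728 - j0, -1, p) % p
    return (3 * k) % p, (2 * k) % p

def j_invariant(a, b, p):
    """j-invariant 1728 * 4a^3 / (4a^3 + 27b^2) modulo p."""
    num = 1728 * 4 * pow(a, 3, p)
    den = (4 * pow(a, 3, p) + 27 * b * b) % p
    return num * pow(den, -1, p) % p

def select_twist(j0, p, t):
    """Return (a, b) with j-invariant j0 and p + 1 - t points, or None."""
    a, b = curve_from_j(j0, p)
    d = next(v for v in range(2, p) if legendre(v, p) == -1)  # a non-residue
    # The quadratic twist by d is y^2 = x^3 + a d^2 x + b d^3.
    for a_tw, b_tw in ((a, b), (a * d * d % p, b * pow(d, 3, p) % p)):
        if count_points(a_tw, b_tw, p) == p + 1 - t:
            return a_tw, b_tw
    return None
```

In a real application one would replace the brute-force count by the random-point test mentioned above or by the methods of [28].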

As the degree and height of the Hilbert class polynomial grow quickly with the absolute value of the discriminant $\Delta _K$ of K, the CM method is only feasible for small values of $|\Delta _K|$. The record computation of [34] uses class invariants, specifically arising from the Atkin function $A_{71}$. Combined with the tricks listed in Remark 3.3, this makes it possible to handle a case where $|\Delta _K| > 10^{16}$.

We will now describe how to apply the CM method using generalized class polynomials. Hence let C be an elliptic modular curve. Since we are working with alternative class invariants instead of the usual j-invariant, we will relate the two using modular polynomials as follows.

Lemma 5.1

Let $d_j:=[\textbf{Q}(C,j):\textbf{Q}(C)]$. Then there exists a polynomial $\Psi _C=\sum _{i=0}^{d_j} f_iZ^i\in \textbf{Z}[X,Y][Z]$ of degree $d_j$ in Z such that

(i)
$\Psi _C(j)=0$;
(ii)
$\deg _{Y}(f_i) \le 1$ for each i;
(iii)
the coefficients (in $\textbf{Z}$) of $\Psi _C$ viewed as an element of $\textbf{Z}[X,Y,Z]$ are coprime;
(iv)
viewed as elements of $\textbf{Q}(C)$, the $f_i$ have at most one common zero in $C({\overline{\textbf{Q}}})$.

Furthermore, $\Psi _C$ is unique up to sign.

Proof

Consider the minimal polynomial $\Psi _C^0 = \sum _{i=0}^{d_j} g_iZ^i\in \textbf{Q}(C)[Z]$ of j over $\textbf{Q}(C)$. Let

$$\begin{aligned} \mathcal {E}:=\sum _{P\in C\setminus \{O\}} \min _i({{\,\textrm{ord}\,}}_P(g_i))(P). \end{aligned}$$

Then $\mathcal {E} - \left( \sum _{P\in C}{{\,\textrm{ord}\,}}_P(\mathcal {E})P\right) - (\deg (\mathcal {E})-1)(O)$ is a $\textbf{Q}$-rational principal divisor. There is a function g, unique up to $\textbf{Q}^\times $-scaling, such that $\textrm{div}(g)$ equals this principal divisor. Dividing each $g_i$ by g gives functions in ${{\mathcal {L}}}(\infty (O)) = \textbf{Q}[x,y]$ satisfying (iv) and unique up to $\textbf{Q}^\times $. Now take representatives $f_i$ satisfying (ii) and scale to get (iii), which makes $\Psi _C$ unique up to sign. $\square $

For each curve C with which we would like to apply the generalized CM method, the polynomial $\Psi _C\in \textbf{Z}[X,Y,Z]$ can be precomputed and stored. Next we need a criterion for which discriminants D yield class invariants. For example, if $C=X^0_+(N)$ then this is given by Proposition 4.1. Now, given a desired characteristic polynomial of Frobenius $x^2-tx+q$ such that $D=t^2-4q$ satisfies this criterion, we have the following algorithm for sufficiently large |D|.

(1)
Compute a generalized class function F of discriminant D as well as its Heegner point Q.
(2a)
Find a zero $P=(x,y)\in C(\textbf{F}_q)$ of F that is neither $-Q$ nor a common root of the polynomials $f_1,\ldots , f_{d_j}$ of Lemma 5.1.
(2b)
Find all roots $j_0\in \textbf{F}_q$ of the polynomial $\Psi _C(x,y,Z)\in \textbf{F}_q[Z]$.
(3)
For each $j_0$, construct an elliptic curve $E/\textbf{F}_q$ with $j(E)=j_0$ and all of its twists up to isomorphism over $\textbf{F}_q$. Return one with $q+1-t$ rational points.

The main advantage compared to the classical CM method, both in terms of memory and speed, is expected to be in the (dominant) first step (1) (see Sect. 6). Out of the computationally non-dominant steps, only (2a) is less straightforward. One way to proceed would be as follows.

(i)
Compute $F_x:=N_{\textbf{F}_q(C)/\textbf{F}_q(x)}(F)$.
(ii)
Find a root $x\in \textbf{F}_q$ of $F_x$.
(iii)
Solve for the corresponding value of y using the linear polynomial $H_\tau [C](x,Y)$, or continue with both solutions y coming from the Weierstrass equation.

Remark 5.2

The polynomial $F_x$ is very close to the classical class polynomial $H_{\tau }[x]$; indeed, it has the same roots, together with one additional root at the x-coordinate of the Heegner point of F. The norm computation in step (i) is, however, asymptotically dominated by the computation of F.

6 The computational benefits of our invariants

6.1 Space complexity of the functions

The advantage of using generalized class functions lies in their size. This already gives a serious advantage when storing one or more class polynomials for later use, e.g. for various values of q in the CM method. Additionally, one would expect the smaller size to make the generalized class polynomials less expensive to compute. Again for C a modular elliptic curve with a given Weierstrass model, we present a preliminary analysis of the cost of computing a generalized class polynomial $H_{\tau }[C]$ when compared to the “classical” class polynomial $H_{\tau }[x]$ (though recall that the latter already dominates all previously-known class invariants along a positive density subset of discriminants for $C=X^0_+(119)$, cf. Sect. 4.6).

6.2 Speed of complex analytic computation

We now explain how to adapt the complex analytic approximation algorithm to generalized class polynomials.

To compute the classical class polynomial $H_\tau [x]$ one first evaluates $x(\tau )$ and all its conjugates, which are of the form $x_i(\tau _i)$, where $x_i$ and $\tau _i$ can be obtained using Shimura’s reciprocity law [18] or N-systems [30]. Then one multiplies the linear polynomials $X-x_i(\tau _i)$ together in a binary tree using fast multiplication algorithms.

As $H_\tau [C]$ has roughly half the height, we only need half the precision at each step. This gives a great speed-up when evaluating $x_i(\tau _i)$, but then we also need to compute $y_i(\tau _i)$. Fortunately that should only take a fraction of the time required for computing $x_i(\tau _i)$, as we can first compute it to low precision and then obtain as many digits as desired quickly using

$$\begin{aligned} y = \frac{-g(x) + \sqrt{g(x)^2+4f(x)}}{2} \end{aligned}$$

for $C : y^2 + g(x) y = f(x)$.
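The idea of using a low-precision value of y only to select the correct square root can be sketched as follows. This is an illustration in double precision with Python's `cmath`; an actual implementation would use multiprecision arithmetic, and the function name and calling convention are our own assumptions.

```python
import cmath

def refine_y(x, y_approx, g, f):
    """Solve y^2 + g(x) y = f(x) for y and return the root closest to y_approx.

    x        -- an accurate value of the x-coordinate
    y_approx -- a rough approximation of y, used only to pick the branch
    g, f     -- callables giving the polynomials of the Weierstrass model
    """
    gx, fx = g(x), f(x)
    root = cmath.sqrt(gx * gx + 4 * fx)
    candidates = ((-gx + root) / 2, (-gx - root) / 2)
    return min(candidates, key=lambda y: abs(y - y_approx))
```

For example, on $y^2 + y = x^3 - x$ with $x = 2$, the two solutions are 2 and $-3$, and a rough value such as 1.9 selects the first.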

The binary tree step is harder to analyze. Instead of having polynomials $A(X) = \prod _{i\in S} (X-x_i(\tau _i))$ to multiply for various subsets $S\subset \{1,2,\ldots , n\}$, we will have pairs (F, Q) with $F = A(X)+B(X)Y$ and

$$\begin{aligned} \textrm{div}(F) = \sum _{i\in S} (P_i) + (Q) - (\#S+1)\mathcal {D}. \end{aligned}$$

Instead of a single multiplication $A_1A_2$ to go from $S_1$ and $S_2$ to $S_3 = S_1\sqcup S_2$, we now need to compute the point $Q_3 = Q_1+Q_2$ (with the elliptic curve group law) and a function $F_3$ with

$$\begin{aligned} \textrm{div}(F_3) = \sum _{i\in S_3} (P_i) + (Q_3) - (\#S_3+1)\mathcal {D}= \textrm{div}(F_1)+\textrm{div}(F_2) + (Q_3) + (O) - (Q_1) - (Q_2). \end{aligned}$$

The following formula can be used:

$$\begin{aligned} F_3&= \frac{F_1\ F_2\ R \quad \textrm{mod}\quad (Y^2-f(X))}{(X-x(Q_1))(X-x(Q_2))},\quad \text{ where } \end{aligned}$$

(6.1)

$$\begin{aligned} R&= (x(Q_1)-x(Q_2))\ Y\ +\ (y(-Q_2)-y(-Q_1))\ X \end{aligned}$$

(6.2)

$$\begin{aligned}&\ \ + x(Q_2)y(-Q_1) - x(Q_1)y(-Q_2), \end{aligned}$$

(6.3)

and where the reduction modulo $Y^2-f(X)$ keeps the outcome of degree $\le 1$ in Y.
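The function R is the line through $-Q_1$ and $-Q_2$, whose third intersection point with the curve is $Q_1+Q_2$. A small sanity check of (6.2) and (6.3) can be sketched as follows, assuming a model $Y^2 = f(X)$ (so that $-(x,y) = (x,-y)$) and $x(Q_1)\ne x(Q_2)$; the helper names are ours.

```python
def line_R(Q1, Q2):
    """Coefficients (cY, cX, c0) of R = cY*Y + cX*X + c0 as in (6.2)-(6.3).

    Q1, Q2 are affine points (x, y) on Y^2 = f(X) with distinct
    x-coordinates; the negative of (x, y) is taken to be (x, -y).
    """
    (x1, y1), (x2, y2) = Q1, Q2
    n1, n2 = -y1, -y2  # y(-Q1) and y(-Q2)
    return (x1 - x2, n2 - n1, x2 * n1 - x1 * n2)

def evaluate(R, P):
    """Value of the linear function R at the affine point P = (x, y)."""
    cY, cX, c0 = R
    x, y = P
    return cY * y + cX * x + c0
```

On the toy curve $Y^2 = X^3 + 1$ with $Q_1 = (0,-1)$ and $Q_2 = (2,-3)$, this gives $R = -2Y + 2X + 2$, which vanishes at $-Q_1 = (0,1)$, at $-Q_2 = (2,3)$, and at $Q_1+Q_2 = (-1,0)$.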

We can multiply $F_1$ with $F_2$ using three multiplications of half the degree, by the same trick that is used in Karatsuba multiplication. Indeed, let

$$\begin{aligned} C = A_1A_2,\quad D = B_1B_2,\quad \text{ and }\quad E=(A_1+B_1)(A_2+B_2) \end{aligned}$$

to get $F_1F_2 = (C+Df) + (E-C-D)Y$. So computing $F_3$ involves three polynomial multiplications of half the degree of $F_1$ and $F_2$, as well as various multiplications and long divisions by fixed-degree polynomials and various additions and subtractions. The most serious computations in the binary tree are now done with half the degree and half the number of digits, but three times as often, which takes 3/16th of the time with naive multiplication and still less than 3/4 of the time with quasi-linear-time multiplication. The impact of the extra additions and subtractions, as well as the extra multiplications by a linear polynomial in X and Y and long division by the denominator of (6.1) requires further analysis, but we expect this to be minor. Regardless, for large discriminants, the main bottleneck is in memory complexity (as noted in [7, Sect. 7]), and here we obtain an improvement of a factor of 1/2 when passing from $H_\tau [x]$ to $H_\tau [C]$.
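The three-multiplication product can be sketched with plain integer coefficient lists (lowest degree first). This is an illustration with schoolbook polynomial arithmetic, not the fast multiplication one would use in practice, and the helper names are ours. It writes $F_i = A_i + B_iY$, takes $E = (A_1+B_1)(A_2+B_2)$ so that $E - C - D = A_1B_2 + A_2B_1$, and reduces modulo $Y^2 - f(X)$.

```python
def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_sub(p, q):
    return poly_add(p, [-c for c in q])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def mul_mod_curve(F1, F2, f):
    """Product of F1 = (A1, B1) and F2 = (A2, B2) modulo Y^2 - f(X).

    Uses three large multiplications (C, D, E) plus one multiplication
    by the fixed-degree polynomial f.
    """
    (A1, B1), (A2, B2) = F1, F2
    C = poly_mul(A1, A2)
    D = poly_mul(B1, B2)
    E = poly_mul(poly_add(A1, B1), poly_add(A2, B2))
    constant_part = poly_add(C, poly_mul(D, f))  # C + D f
    y_part = poly_sub(poly_sub(E, C), D)         # E - C - D
    return constant_part, y_part
```

For example, with $f = X^3 - X$, $F_1 = (1+2X) + 3Y$ and $F_2 = X + (1+X)Y$, the result is $(-2X - X^2 + 3X^3 + 3X^4) + (1 + 6X + 2X^2)Y$.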

6.3 Adapting the CRT method

6.3.1 Overview of CRT class polynomial computation

We now heuristically estimate the expected speed-up when computing $H_\tau [C]$ instead of $H_\tau [x]$ using the (currently state-of-the-art) CRT method for class polynomial computation [14, 33, 34]. We restrict to the case of C such that all q-expansion coefficients of x and y are rational, and will analyse some steps only in the main case where C is a quotient of $X^0_+(N)$. To keep our exposition simple, we will not treat the main improvement of [34], even though we do expect it to combine well with our generalized class invariants. We plan to give a more detailed account and an implementation in future work.

For the CM method, it is more efficient to directly compute class polynomials modulo q using the online CRT as in [33, Sect. 2]. In other words, we never write down $H_\tau [C]\in \textbf{Z}[X,Y]$, but instead compute $(H_\tau [C]\ \textrm{mod}\ q)\in \textbf{F}_q[X,Y]$ directly from $(H_\tau [C]\ \textrm{mod}\ p)$ for p in a set S of small primes. The space complexity of the CM method is then $n\log (q)$, which is independent of our choice of class function. The set S must be chosen in such a way that $\prod _{p\in S} p$ is larger than 4 times the largest coefficient.
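The role of the bound on $\prod _{p\in S} p$ can be illustrated with ordinary (not online) Chinese remaindering: once the product of the primes is large enough, a signed integer coefficient is determined by its residues. The sketch below is our own illustration, reconstructs the representative of smallest absolute value, and uses `math.prod` and `pow(x, -1, p)`, which require Python 3.8 or later.

```python
from math import prod

def crt_signed(residues, primes):
    """Integer c with c = residues[i] (mod primes[i]) and |c| < prod(primes)/2.

    The primes are assumed to be distinct.
    """
    M = prod(primes)
    x = 0
    for r, p in zip(residues, primes):
        Mp = M // p
        x += r * Mp * pow(Mp, -1, p)  # Mp * Mp^{-1} is 1 mod p, 0 mod the others
    x %= M
    return x - M if x > M // 2 else x
```

For instance, the coefficient $-12345$ is recovered from its residues modulo 101, 103 and 107, whose product 1113121 exceeds $4\cdot 12345$.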

By cutting the number of digits in half when switching from x to C, we essentially cut $\# S$ in half. If the amount of work that we do for each prime p does not grow too much, then our class function $H_\tau [C]$ yields a speed-up over the classical class polynomial $H_\tau [x]$.

What needs to be done for each p is the following.

(1)
Enumerate all elliptic curves $E/\textbf{F}_p$ with endomorphism ring ${\mathcal {O}}$ and compute the appropriate points in $C(\textbf{F}_p)$.
(2)
Compute $(F\ \textrm{mod}\ p)$ by putting together the information from Step 1.

In practice, for “typical” discriminants D with 9 to 14 digits, Sutherland [33, Sects. 8.3 and 8.4] finds that performing Steps 1 and 2 together $\#S$ times is the dominant part of the CRT method.

We will now argue why we expect each of these steps to take (much) less than twice as long with the generalized class polynomial for suitable C. Together with the fact that our set S is only half the original size due to the reduction factor, this means that computing $H_\tau [C]$ takes less time than computing $H_\tau [x]$.

6.3.2 Enumerating via the Fricke involution

Step 1 is already very subtle in the case of a single class invariant f. Indeed, there could be multiple Galois orbits of values $f(\tau )$ for the same order $\mathcal {O}$, and hence multiple irreducible class polynomials $H_{\tau _i}[f]\in K[X]$. In the CRT method, one has to make sure to compute the polynomials $(H_{\tau _i}[f]\ \textrm{mod}\ p)_p$ for the same value of i, and only for $\tau _i$ for which this is a class invariant. This issue is addressed in detail in [14, Sect. 4].

We will first explain how to adapt one solution to our main case of quotients C of $X^0_+(N)$ where N is coprime to the conductor of $\mathcal {O}$ and $D=\textrm{disc}(\mathcal {O})$ is a square modulo 4N.

We adapt the method of Sect. 4.3 of [14] as follows. We have $\textbf{Q}(X^0(N)) = \textbf{Q}(j,j_N)$, where $j_N(z) = j(z/N) = j(W_N z)$ for the Fricke-Atkin-Lehner involution $W_N : z\mapsto -N/z$ (this follows for example from [32, Proposition 6.9]). In particular, every function $f\in \textbf{Q}(C)$ for a quotient C of $X^0(N)$ can be expressed as a rational function in j and $j_N$. In practice, these expressions can be quite large, but (analogously to [14, Lemma 2]) we can also obtain the value f(z) as a root of $\gcd (\Psi _{f}(X,j(z)),\Psi _{f\circ W_N}(X,j_N(z)))$ instead.

In the particular case where C is a quotient of $X^0_+(N)$, we even have $\textbf{Q}(C)\subset \textbf{Q}(X^0_+(N)) = \textbf{Q}(j+ j_N,j\cdot j_N)$, and we can use $\Psi _{f}$ instead of $\Psi _{f\circ W_N}$.

So instead of enumerating just the j-values, we wish to link them with the corresponding $j_N$-values, and we do that as follows.

Suppose that N is coprime to the conductor of $\mathcal {O}$ and that D is a square modulo 4N. Then by Lemma 4.4 we get $a,b,c\in \textbf{Z}$ with $a,c>0$, $b^2-4ac = D$, $N\mid c$, and $\gcd (ac/N,N)=\gcd (a,b,c)=1$. In line with Lemma 2 of [14] we could even take $c=N$ by replacing a by ac/N. We take $z = \frac{-b+\sqrt{D}}{2a}$, ${\mathfrak {n}} = a{\overline{z}}\textbf{Z}+N\textbf{Z}$, and $\mathfrak {a}= z\textbf{Z}+\textbf{Z}$. Then we have $\mathcal {O} = az\textbf{Z}+\textbf{Z}$, and we find that ${\mathfrak {n}}$ is an invertible $\mathcal {O}$-ideal with $\mathcal {O}/{\mathfrak {n}}\cong \textbf{Z}/N\textbf{Z}$. In fact, we find $\overline{{\mathfrak {n}}}\mathfrak {a}= z\textbf{Z}+N\textbf{Z}$ and hence

$$\begin{aligned} \sigma _{[{\mathfrak {n}}]}j(z) = j({\mathfrak {n}}^{-1}\mathfrak {a}) = j(\overline{{\mathfrak {n}}}\mathfrak {a}) = j_N(z). \end{aligned}$$

Exactly as in Sect. 4.3 of [14], we list the j-values of elliptic curves over $\textbf{F}_p$ with endomorphism ring $\mathcal {O}$, and arrange them into unoriented $[{\mathfrak {n}}]$-isogeny cycles. If C is a quotient of $X^0_+(N)$ over $\textbf{Q}$, then for each edge of this graph, we find the f-value from the two j-values of the end points. (In the case where the $[{\mathfrak {n}}]$-isogeny cycles are 2-cycles, we only get one f-value per 2-cycle and we get a lower-degree class polynomial $H_\tau [f]$.)

In practice, we could do this for $f=x$ exactly as in [14], and then solve for y using $\Psi _{C}(x,y,j)=0$, which is linear in y. The only additional work compared to what is done in [14] is computing and solving the linear equation to get y, which is much faster than all the other steps.

In particular, Step 1 takes much less than twice as long with C than with x, while we need to do it only half as often, which leads to a speed-up. Further research into these modular polynomials is needed in order to determine the exact gain.

To also make this work for quotients of $X^0(N)$ that are not quotients of $X^0_+(N)$, one would need to compute oriented $[{\mathfrak {n}}]$-isogeny cycles.

6.3.3 Other tricks for enumerating

The methods from [14, Sects. 4.1 and 4.2] also seem amenable to generalization from f to C.

The main computational tool at the beginning of Sect. 4.1 is the modular polynomial $\Phi _{\ell ,f}$, which we generalize from f to C as follows.

Let $\Phi _{\ell ,C}$ be a Gröbner basis of the ideal in $\textbf{Q}[X_1,Y_1,X_2,Y_2]$ of polynomials that vanish on $\{(\psi (z),\psi (\ell z)) : z\in \textbf{H}\}$, with respect to the lexicographic ordering with $X_1>Y_1>X_2>Y_2$. To get from $\psi (z)$ to all possible values of $\psi (\ell z)$, one substitutes $\psi (z)$ for $(X_1,Y_1)$, and then solves first for $X_2$ and then for $Y_2$. For each C and $\ell $ this works for all but a finite set of primes p. Such multivariate modular polynomials would need to be precomputed. One possible starting point for computing these would be [24, 25], which compute multivariate (Hilbert) modular polynomials, each with a different method. For yet another approach to computing modular polynomials, see [3].

We expect the reduction factor to also give a reduction of the size of these multivariate modular polynomials, but on the other hand, we need two of them: one to solve for x of an isogenous curve, and one to evaluate in x and get y. As evaluating is faster than solving, we expect the use of these modular polynomials to take much less than twice as long (and we need to do it only half as often, because we have half as many primes).

The ‘Trace Trick’ of [14, Sect. 4.2] enables the use of the Weber function ${\mathfrak {f}}$ in the CRT method. In case we would also need this trick, for some more exotic curves C, we could consider applying it with arbitrary functions $f\in \textbf{Q}(C)$ such as $f = ax+by$ for small integers a and b. In loc. cit. the relevant trace is computed with far fewer primes, so it is acceptable to apply this with the lower reduction factor of f.

We did not yet consider the general algorithm of [14, Sect. 4.4]. It is the method that works for all class invariants, but is only practical under additional restrictions. We do not have examples of generalized class invariants where this trick is needed. The challenging step to generalize is factoring a large-degree function in $\textbf{Q}(C)$ in order to obtain the small class functions.

6.3.4 Constructing a function from its roots

In the CRT setting the multiplications and long-divisions by small-degree polynomials of Sect. 6.2 only take time $O(nM(\log (p)))$ per level, which is asymptotically dominated by the $O(M(n\log (p)))$ time of the multiplications of large-degree polynomials. Therefore, Step 2 seems to take about 1.5 times as long per prime p for $H_\tau [C]$ when compared to $H_\tau [x]$.
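For orientation, the large-degree multiplications referred to here are those of a product tree. The following sketch builds $\prod (X-r)$ modulo p from its roots in the classical univariate setting; it uses schoolbook multiplication in place of a fast $M(n)$ algorithm, and the extra small-degree steps for C from Sect. 6.2 are not shown. The function names are our own.

```python
def poly_mul(f, g, p):
    """Schoolbook product of two coefficient lists (lowest degree first) mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for k, b in enumerate(g):
            out[i + k] = (out[i + k] + a * b) % p
    return out

def poly_from_roots(roots, p):
    """Product tree: coefficients (lowest degree first) of prod (X - r) mod p."""
    if not roots:
        return [1]
    level = [[(-r) % p, 1] for r in roots]
    while len(level) > 1:
        # Multiply neighbours pairwise; an odd leftover moves up unchanged.
        nxt = [poly_mul(level[i], level[i + 1], p)
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

For example, the roots 1, 2, 3 modulo 7 give $X^3-6X^2+11X-6 \equiv X^3+X^2+4X+1$, that is, the coefficient list `[1, 4, 1, 1]`.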

6.3.5 The total running time

Concluding this preliminary analysis, we estimate the cost of computing $H_\tau [C]$ to be significantly lower compared to $H_\tau [x]$, though further research, in particular into (the implementation of) modular polynomials for C is required to determine the exact gain. This is beyond the scope of the current paper, which focuses on introducing the generalized class functions and their height reduction. We plan to give a more detailed account and an implementation in future work.

7 General curves and bases

Now suppose that our modular curve $C$ is not necessarily an elliptic curve. Let $\mathcal {D}$ be an effective divisor over $\textbf{Q}$ on $C$ and let $\mathcal {B}=\{b_0,b_1,\ldots \}$ be a $\textbf{Q}$-basis of $\mathcal {L}(\infty \mathcal {D})$ ordered by ascending degree.

The classical case is the case where we have one modular function f and we take $C= \textbf{P}^1$, $\psi = f = (f:1)$, $\mathcal {D}= ((1:0))=(\infty )$, and $\mathcal {B}=\{1,f,f^2,\ldots \}$. The case of all previous sections of this paper is the case where $C$ is an elliptic curve given by a Weierstrass equation, $\mathcal {D}= ((0:1:0))$, and $\mathcal {B}=\{1,x,y,x^2,xy,x^3,x^2y,\ldots \}$.

Example 7.1

One systematic way to choose a $\textbf{Q}$-basis of ${{\mathcal {L}}}(\infty \mathcal {D})$ is as follows. First choose $x\in {{\mathcal {L}}}(\infty \mathcal {D})\setminus \textbf{Q}$ of some degree d. (For example, one can take $x=f$ with $d=1$ in the classical case, and $x=x$ with $d=2$ in the elliptic case.) Now, let $y_0 = 1$ and choose $y_j$ for $j = 1,2,\ldots , d-1$ in such a way that

$$\begin{aligned} y_j \in {{\mathcal {L}}}(m_j \mathcal {D})\setminus \langle y_k x^e : k < j, e\in \textbf{Z}\rangle , \end{aligned}$$

where $m_j$ is minimal such that this set is non-empty. This way we obtain a vector $\textbf{y} = (y_0,\ldots , y_{d-1})$ of d functions. (For example, in the classical case we have $\textbf{y}=1$, and in the elliptic case we chose $\textbf{y} =(1,y)$.) Then $\mathcal {B}=\{x^e y_j : e\in \textbf{Z}_{\ge 0}, j\in \{0,1,2,\ldots ,d-1\}\}$ is a basis of ${\mathcal {L}}(\infty \mathcal {D})$. We order this basis by ascending degree $de+m_j$, and if two elements have the same degree, then we put the one with lowest j first.
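The ordering rule can be made concrete with a short sketch. It takes d and the degrees $m_0=0,m_1,\ldots ,m_{d-1}$ as input and lists the first elements $x^e y_j$ of $\mathcal {B}$ as pairs $(e,j)$; the function name is hypothetical.

```python
def ordered_basis(d, m, count):
    """First `count` pairs (e, j), meaning x^e * y_j, sorted by the
    degree d*e + m[j], with ties broken by the smallest j."""
    assert len(m) == d and m[0] == 0
    # Exponents e < count suffice: the elements x^e * y_0 with e < count
    # already give `count` elements of degree below d * count.
    candidates = [(d * e + m[j], j, e) for e in range(count) for j in range(d)]
    candidates.sort()
    return [(e, j) for _, j, e in candidates[:count]]
```

In the elliptic case, `ordered_basis(2, (0, 3), 7)` yields the pairs corresponding to $1,x,y,x^2,xy,x^3,x^2y$, in agreement with the basis used in the previous sections.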

Example 7.2

Consider the modular curve $X^0_+(191)$ (not to be confused with 119), which is hyperelliptic with model $t^2 = s^6+2s^4+2s^3+5s^2-6s+1$ [16, Table 3], and the unique cusp is at $\mathcal {D}=((1:1:0))$. One of the possible bases of $\mathcal {L}(\infty \mathcal {D})$ obtained by the recipe above is $\mathcal {B}=\{1,x,y_1,y_2,x^2,xy_1,xy_2,\ldots \}$, where $x=(t+s^3+s+1)/2$, $y_1=sx$, and $y_2=s(y_1+1)$. The degrees of x, $y_1$, and $y_2$ are 3, 4, and 5, respectively.

The function x is, up to multiplicative and additive constants, equal to the Atkin function $A_{191}$. The reduction factors are $r(C) = 96$, $r(s) = 48$, and $r(A_{191}) = 32$.
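These values can be reproduced numerically. The sketch below assumes that $r(X^0_+(N))$ equals the degree $\psi (N)/2$ of the j-map, where $\psi $ is the Dedekind psi function, and that $r(f)=r(C)/\deg f$ for $f\in \textbf{Q}(C)$; both assumptions are consistent with the numbers quoted in this paper, but the formal definitions are those of Sect. 3.1. The function names are our own.

```python
def dedekind_psi(N):
    """N * prod over primes p | N of (1 + 1/p)."""
    result, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result = result // n * (n + 1)
    return result

def reduction_factor_plus(N):
    """Assumed reduction factor of X^0_+(N): half the Dedekind psi value."""
    return dedekind_psi(N) // 2

r_191 = reduction_factor_plus(191)   # 96
r_s = r_191 // 2                     # s has degree 2 on the hyperelliptic model
r_atkin = r_191 // 3                 # x (the Atkin function) has degree 3
```

The same computation gives 72 for $N=119$ and 120 for $N=239$.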

As in Sect. 2, let $\tau \in \textbf{H}$ imaginary quadratic and assume that $(b_i,\tau )$ is a class invariant for every $b_i\in \mathcal {B}$. Then, again unique up to scaling, we obtain a non-zero function $F_{\tau }[C,\mathcal {B}]=\sum _{i=0}^k a_ib_i\in K(C)$ ($a_i\in K$) with k minimal such that $\sum _{i=0}^k a_ib_i(\tau )=0$.

Definition 7.3

We call this $F_{\tau }[C,\mathcal {B}]$ the generalized class function for the triple $C,\mathcal {B},\tau $.

If $\mathcal {B}$ is as in Example 7.1 then we again refer to the associated polynomial $H_{\tau }[C,\mathcal {B}]\in K[X,Y_1,\ldots ,Y_d]$ (of total degree $\le 1$ in $Y_1,\ldots , Y_d$ and such that $H_{\tau }[C,\mathcal {B}](x,y_1,\ldots , y_d) = F_{\tau }[C,\mathcal {B}]$) as the generalized class polynomial.
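The minimality of k amounts to finding the first linear dependency among the columns of a matrix of basis values. As a toy analogue over $\textbf{F}_p$ in the elliptic case, the sketch below lets a handful of points stand in for $\psi (\tau )$ and its conjugates; the curve, the prime, and all names are hypothetical.

```python
def basis_exponents(count):
    """First `count` elements of {1, x, y, x^2, xy, ...} as (e, j) = x^e y^j,
    ordered by pole order 2e + 3j."""
    out, deg = [], 0
    while len(out) < count:
        for j in (0, 1):
            rest = deg - 3 * j
            if rest >= 0 and rest % 2 == 0:
                out.append((rest // 2, j))
        deg += 1
    return out[:count]

def first_dependency(points, p):
    """Return (exps, coeffs) with k minimal such that
    sum coeffs[i] * b_i(P) = 0 mod p for all P, and coeffs[k] = 1."""
    exps = basis_exponents(len(points) + 1)
    m = [[pow(x, e, p) * pow(y, j, p) % p for (e, j) in exps] for (x, y) in points]
    pivots = []  # pivots[r] is the pivot column of row r
    for c in range(len(exps)):
        r = len(pivots)
        piv = next((i for i in range(r, len(m)) if m[i][c]), None)
        if piv is None:
            # Column c depends on the earlier (pivot) columns.
            coeffs = [0] * (c + 1)
            coeffs[c] = 1
            for row, pc in enumerate(pivots):
                coeffs[pc] = -m[row][c] % p
            return exps[:c + 1], coeffs
        m[r], m[piv] = m[piv], m[r]
        inv = pow(m[r][c], -1, p)
        m[r] = [v * inv % p for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c]:
                f = m[i][c]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[r])]
        pivots.append(c)
    raise AssertionError("unreachable: one more column than rows")

# Toy input: four affine points of y^2 = x^3 + x + 1 over F_101.
P = 101
points = [(x, y) for x in range(P) for y in range(P)
          if (y * y - (x ** 3 + x + 1)) % P == 0][:4]
exps, coeffs = first_dependency(points, P)
```

The returned coefficients play the role of the $a_i$; the elimination stops at the first dependent column, which is what makes k minimal.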

Example 7.4

It turns out that, already for elliptic curves, the freedom in the choice of basis of $\mathcal {L}(\infty \mathcal {D})$ can lead to better practical reduction factors. Revisiting our main example $C:=X^0_+(119)$, denote by $w:={\mathfrak {w}}_{7,17}$ the function (2.9) and by $z:=x+y$ the sum of the Weierstrass coordinates for the model (2.8). Now consider the basis $\mathcal {B}:=\{1,x,z,w,xz,wx,wz,w^2,wxz,w^2x,\ldots \}$ of $\mathcal {L}(\infty \mathcal {D})$. The resulting generalized class polynomials corresponding to the discriminants of Table 1 are listed in Table 5. We get practical reduction factors in Fig. 3 that are better than those in Fig. 1.

A likely explanation for this improvement is that now not only the poles, but also the zeroes are as much restricted to the cusps of $X^0_+(119)$ as possible. Indeed, the points $O=(0:1:0)$ and $P=(0,0)$ are the cusps, while 2P and 3P are rational CM points. Now $\textrm{div}(w) = 4(P)-4(O)$, $\textrm{div}(x) = (P) + (3P) - 2(O)$, and $\textrm{div}(y) = 2(P) + (2P) -3(O)$. In particular, the function w is a modular unit. As explained in Remark 3.15, modular units in the classical setting give better practical reduction factors than non-units, even though the reduction factors are asymptotically the same.
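These divisors can be cross-checked against the group law. As a sketch, we use only the standard fact that a divisor $\sum n_i(Q_i)$ on an elliptic curve is principal if and only if $\sum n_i=0$ and $\sum [n_i]Q_i=O$:

$$\begin{aligned} \textrm{div}(w):\quad&4-4=0,&[4]P&=O,\\ \textrm{div}(x):\quad&1+1-2=0,&P+3P&=[4]P=O,\\ \textrm{div}(y):\quad&2+1-3=0,&[2]P+2P&=[4]P=O. \end{aligned}$$

All three conditions reduce to $[4]P=O$. Since O and P are cusps while 2P and 3P are CM points, the point 2P differs from O, so P has exact order 4.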

Table 5 Some conjecturally correct generalized class functions for the curve $C=X^0_+(119)$ using the $\mathcal {L}(\infty \mathcal {D})$-basis $\mathcal {B}:=\{1,x,z,w,xz,wx,wz,w^2,wxz,w^2x,\ldots \}$

Theorem 7.5

Let $C: y^2 + g(x) y = f(x)$ with $f,g\in \textbf{Q}[X]$ be a hyperelliptic curve such that $4f(x)+g(x)^2$ has odd degree and $\textrm{Jac}(C)(\textbf{Q})$ is finite. Set $\mathcal {D}$ to be the unique point at infinity and choose the basis $\mathcal {B}=\{1,x,x^2,y,xy,x^2y,\ldots \}$ of $\mathcal {L}(\infty \mathcal {D})$. Then Theorem 3.7 and Proposition 3.13 also hold for C and $H_{\tau }[C,\mathcal {B}]$.

Proof

The original proof now goes through with only the following change. There are finitely many possibilities for the class c of the divisor $ - \sum _{\sigma } ((\sigma (\psi (\tau ))) - \mathcal {D})$ by our assumption that $\textrm{Jac}(C)(\textbf{Q})$ is finite. For every c, choose a representative $\sum _{i=1}^{m} ((P_i)-\mathcal {D})$ with m minimal and consider a primitive polynomial $T\in \textbf{Z}[X]$ with roots $x(P_i)$ for $i=1,\ldots , m$. $\square $

Remark 7.6

Our proofs of Theorems 3.7 and 7.5 heavily rely on the fact that Heegner points are torsion. To completely remove the assumption on ranks, one would therefore need to bound the Heegner points, even in the rank-one case. Moreover, the proofs rely on the hyperelliptic equation where we use that $|a| \le |a+bi|$ for real numbers a and b. Though we expect an analogue of these results to hold for general modular curves, this would require additional ideas. Do note that such an analogue would yield arbitrarily high reduction factors for generalized class polynomials by (3.2). For example, for $C=X^0_+(239)$ of genus 3 we already obtain $r(C)=120$, exceeding the Bröker–Stevenhagen bound.

Data availability

The authors declare that the data supporting the findings of this study are available within the article and its supplementary information files.

Notes

One way to deduce this is as follows. Using the command J0(119).decomposition() in SageMath [36] one finds that C has conductor 17. For each of the Weierstrass models of the now finitely many possible curves [35], there are finitely many options for the divisor of the function ${\mathfrak {w}}_{7,17}$ given by (2.9). The curve C has two rational CM points (both of discriminant $-19$), so given a possible Weierstrass model together with a possible divisor for ${\mathfrak {w}}_{7,17}$, one can first determine ${\mathfrak {w}}_{7,17}$ as a function of the Weierstrass coordinates x, y by evaluating in one CM point, and then determine whether it has the expected value in the other CM point. This process excludes all but one of the options, and we thereby deduce both the Weierstrass model (2.8) and the relation (2.10) between ${\mathfrak {w}}_{7,17}$ and x and y.
We note that a slightly “simpler” Weierstrass model $v^2+uv+v=u^3-u^2-u$ exists by taking $u=x$ and $v=-y-2x$, but the given model (2.8) turns out to yield slightly better practical reduction factors (see Sect. 4.5).
The arXiv version v1 of [4] has weaker bounds than the final publication and needs to be combined with [21, Appendix 2] to get the same result.

References

1. Bars, F.: Bielliptic modular curves. J. Number Theory 76(1), 154–165 (1999)
2. Birch, B.J.: Weber’s class invariants. Mathematika 16, 283–294 (1969)
3. Bröker, R., Lauter, K., Sutherland, A.V.: Modular polynomials via isogeny volcanoes. Math. Comput. 81(278), 1201–1231 (2012)
4. Bröker, R., Stevenhagen, P.: Constructing elliptic curves of prime order. In: Computational Arithmetic Geometry. Contemporary Mathematics, vol. 463, pp. 17–28. American Mathematical Society, Providence, RI (2008)
5. Cremona, J.: Algorithms for Modular Elliptic Curves. Cambridge University Press, Cambridge (1992)
6. Cummins, C.J., Pauli, S.: Congruence subgroups of ${{\rm PSL}}(2,{\mathbb{Z}})$ of genus less than or equal to 24. Exp. Math. 12(2), 243–255 (2003)
7. Enge, A.: The complexity of class polynomial computation via floating point approximations. Math. Comput. 78(266), 1089–1107 (2009)
8. Enge, A., Morain, F.: Comparing invariants for class fields of imaginary quadratric fields. In: Algorithmic Number Theory (Sydney, 2002). Lecture Notes in Computer Science, vol. 2369, pp. 252–266. Springer, Berlin (2002)
9. Enge, A., Morain, F.: Generalised Weber functions. Acta Arith. 164(4), 309–342 (2014)
10. Enge, A., Schertz, R.: Constructing elliptic curves over finite fields using double eta-quotients. Journal de Théorie des Nombres de Bordeaux 16, 555–568 (2004)
11. Enge, A., Schertz, R.: Modular curves of composite level. Acta Arith. 118(2), 129–141 (2005)
12. Enge, A., Schertz, R.: Singular values of multiple eta-quotients for ramified primes. LMS J. Comput. Math. 16, 407–418 (2013)
13. Enge, A., Streng, M.: Schertz style class invariants for genus two (2016). preprint. arXiv:1610.04505
14. Enge, A., Sutherland, A.V.: Class invariants by the CRT method. In: Algorithmic Number Theory. Lecture Notes in Computer Science, vol. 6197, pp. 142–156. Springer, Berlin (2010)
15. Furumoto, M., Hasegawa, Y.: Hyperelliptic quotients of modular curves $X_0(N)$. Tokyo J. Math. 22(1), 105–125 (1999)
16. Galbraith, S.: Equations for modular curves. PhD thesis, St. Cross College. https://www.math.auckland.ac.nz/~sgal018/thesis.pdf
17. Gee, A.: Class invariants by Shimura’s reciprocity law, vol. 11, pp. 45–72. 1999. Les XXèmes Journées Arithmétiques (Limoges, 1997)
18. Gee, A., Stevenhagen, P.: Generating class fields using Shimura reciprocity. In: Algorithmic Number Theory (Portland, OR, 1998). Lecture Notes in Computer Science, vol. 1423, pp. 441–453. Springer, Berlin (1998)
19. Hindry, M., Silverman, J.H.: Diophantine Geometry. Graduate Texts in Mathematics, vol. 201. Springer, New York (2000). (An introduction)
20. Jeon, D.: Bielliptic modular curves $X_0^+(N)$. J. Number Theory 185, 319–338 (2018)
21. Kim, H.H.: Functoriality for the exterior square of ${\rm GL}_4$ and the symmetric fourth of ${\rm GL}_2$. J. Am. Math. Soc. 16(1), 139–183 (2003). (With appendix 1 by Dinakar Ramakrishnan and appendix 2 by Kim and Peter Sarnak)
22. Littlewood, J.E.: On the class-number of the corpus $p(\sqrt{-k})$. Proc. Lond. Math. Soc. 27, 358–372 (1928)
23. Mahler, K.: An application of Jensen’s formula to polynomials. Mathematika 7, 98–100 (1960)
24. Martindale, C.: Hilbert modular polynomials. J. Number Theory 213, 464–498 (2020)
25. Milio, E., Robert, D.: Modular polynomials on Hilbert surfaces. J. Number Theory 216, 403–459 (2020)
26. Newman, M.: Construction and application of a class of modular functions. Proc. Lond. Math. Soc. 3(7), 334–350 (1957)
27. Ogg, A.P.: Hyperelliptic modular curves. Bull. Soc. Math. France 102, 449–462 (1974)
28. Rubin, K., Silverberg, A.: Choosing the correct elliptic curve in the CM method. Math. Comput. 79(269), 545–561 (2010)
29. Schertz, R.: Die singulären Werte der Weberschen Funktionen $\mathfrak{f} $, $\mathfrak{f} _1$, $\mathfrak{f} _{2}$, $\gamma _{2}$, $\gamma _{3}$. J. Reine Angew. Math. 286(287), 46–74 (1976)
30. Schertz, R.: Weber’s class invariants revisited. J. Théor. Nombres Bordeaux 14(1), 325–343 (2002)
31. Selberg, A.: On the estimation of Fourier coefficients of modular forms. In: Proceedings of Symposia in Pure Mathematics, vol. VIII, pp. 1–15. American Mathematical Society, Providence, RI (1965)
32. Shimura, G.: Introduction to the Arithmetic Theory of Automorphic Functions. Kanô Memorial Lectures, No. 1. Iwanami Shoten Publishers, Tokyo; Princeton University Press, Princeton, NJ (1971). (Publications of the Mathematical Society of Japan, No. 11)
33. Sutherland, A.V.: Computing Hilbert class polynomials with the Chinese remainder theorem. Math. Comput. 80(273), 501–538 (2011)
34. Sutherland, A.V.: Accelerating the CM method. LMS J. Comput. Math. 15, 172–204 (2012)
35. The LMFDB Collaboration. The L-functions and modular forms database. http://www.lmfdb.org (2022). [Online; accessed March and August 2022]
36. The Sage Developers. SageMath, the Sage Mathematics Software System (Version 9.5) (2022). https://www.sagemath.org
37. Weber, H.: Algebraische Zahlen. Lehrbuch der Algebra, vol. 3. Friedrich Vieweg (1908)

Acknowledgements

The authors would like to thank Karim Belabas, Peter Bruin, Andreas Enge, David Kohel, Filip Najman, and Andrew Sutherland for helpful discussions and Edgar Costa for helping us use the LMFDB. The authors would further like to thank the organisers Samuele Anni, Valentijn Karemaker, and Elisa Lorenzo García of AGC${}^2$T 2021 for the inspiring setting in which the first ideas for this project came to be. Finally, the authors would like to thank the anonymous referees for their many comments that led to improvements of the exposition. The first-listed author is supported by the Research Foundation – Flanders (FWO) under a PhD Fellowship Fundamental Research.

Author information

Authors and Affiliations

Departement Wiskunde, KU Leuven, Celestijnenlaan 200B – bus 2400, 3001, Leuven, Belgium
Marc Houben
imec-COSIC, KU Leuven, Kasteelpark Arenberg 10/2452, 3001, Leuven, Belgium
Marc Houben
Mathematisch Instituut, Universiteit Leiden, P.O. Box 9512, 2300 RA, Leiden, The Netherlands
Marc Houben
Mathematisch Instituut, Universiteit Leiden, P.O. Box 9512, 2300 RA, Leiden, The Netherlands
Marco Streng

Corresponding author

Correspondence to Marc Houben.

Ethics declarations

Conflict of interest

The authors assert that there are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Supplementary file 1 (txt 0 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Houben, M., Streng, M. Generalized class polynomials. Res. number theory 8, 103 (2022). https://doi.org/10.1007/s40993-022-00400-2

Received: 30 August 2022
Accepted: 30 September 2022
Published: 16 November 2022
DOI: https://doi.org/10.1007/s40993-022-00400-2
