Abstract
In this paper, we provide refined sufficient conditions for the quadratic Chabauty method on a curve X to produce an effective finite set of points containing the rational points \(X({\mathbb {Q}})\), with the condition on the rank of the Jacobian of X replaced by a condition on the rank of a quotient of the Jacobian plus an associated space of Chow–Heegner points. We then apply this condition to prove the effective finiteness of \(X({\mathbb {Q}})\) for any modular curve \(X=X_0^+(N)\) or \(X_\mathrm{{ns}}^+(N)\) of genus at least 2 with N prime. The proof relies on the existence of a quotient of their Jacobians whose Mordell–Weil rank is equal to its dimension (and at least 2), which is proven via analytic estimates for orders of vanishing of L-functions of modular forms, thanks to a Kolyvagin–Logachev type result.
1 Introduction
The Chabauty–Kim method is a method for determining the set \(X({\mathbb {Q}})\) of rational points of a curve X over \({\mathbb {Q}}\) of genus bigger than 1. The idea is to locate \(X({\mathbb {Q}})\) inside \(X({\mathbb {Q}}_p )\) by finding an obstruction to a p-adic point being global. The method developed in [39, 40] produces a tower of obstructions
$$\begin{aligned} X({\mathbb {Q}}_p ) \supset X({\mathbb {Q}}_p )_1 \supset X({\mathbb {Q}}_p )_2 \supset \cdots \supset X({\mathbb {Q}}). \end{aligned}$$
In [5], it is conjectured that \(X({\mathbb {Q}}_p )_n =X({\mathbb {Q}})\) for all \(n\gg 0\), and in [40] it is proved that standard conjectures in arithmetic geometry imply \(X({\mathbb {Q}}_p )_n \) is finite for all \(n\gg 0\), but in general these results are not known.
The first obstruction set \(X({\mathbb {Q}}_p )_1\) is the one produced by Chabauty’s method. In situations when \(X({\mathbb {Q}}_p )_1 \) is finite, it can often be used to determine \(X({\mathbb {Q}})\).
The main results of this paper concern the finiteness of the Chabauty–Kim set \(X({\mathbb {Q}}_p )_2 \) when X is one of the modular curves \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\) or \(X_0 ^+ (N)\) (N a prime different from p), whose definition and properties we now recall briefly (more details are given in Sect. 4).
The curve \(X_0 ^+ (N)\) is the quotient of \(X_0 (N)\) by the Atkin–Lehner involution \(w_N\). The curve \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\) is the quotient of X(N) by the normalizer of a nonsplit Cartan subgroup. Determining the rational points of \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\) would resolve Serre’s uniformity question [58, §4.3]: is there an \(N_0\) such that, for all \(N>N_0\) and all elliptic curves E defined over \({\mathbb {Q}}\) without complex multiplication, the mod N Galois representation
$$\begin{aligned} {\overline{\rho }}_{E,N} :{\text {Gal}}({\overline{{\mathbb {Q}}}}/{\mathbb {Q}}) \rightarrow {\text {Aut}}(E[N]) \cong \mathrm {GL}_2 ({\mathbb {F}}_N) \end{aligned}$$
is surjective? The cases of Serre’s uniformity question corresponding to Borel subgroups and to normalizers of split Cartan subgroups were answered positively in the celebrated papers [50] and [11, 12], respectively.
Mazur’s proof may, very crudely, be described as having two stages.
1. Construct a non-constant map \( f :X\rightarrow A \) from X to an abelian variety of rank zero over \({\mathbb {Q}}\).
2. Compute the finite set \(A({\mathbb {Q}})\), and the pre-image \(f^{-1}(A({\mathbb {Q}}))\supset X({\mathbb {Q}})\).
As is explained in Sect. 4, in contrast to \(X_0 (N)\) and \(X_{\mathrm {s}}^+ (N)\), for \(X=X_0 ^+ (N)\) or \(X=X_{{{\,\mathrm{ns}\,}}}^+ (N)\), the Birch–Swinnerton-Dyer conjecture implies that there are no non-constant maps from X to abelian varieties of rank zero over \({\mathbb {Q}}\). It is hence natural to ask whether we can attempt to mimic Mazur’s strategy, with the set \(A({\mathbb {Q}})\) replaced by the set \(X({\mathbb {Q}}_p )_n\) for some n. In Sect. 4, we show that the Birch–Swinnerton-Dyer conjecture similarly implies \(X({\mathbb {Q}}_p )_1\) is infinite, hence we expect to need \(n>1\). The main result of this paper is to carry out the first stage of Mazur’s strategy for \(n=2\).
Theorem 1
1. For all primes N such that \(g(X_0 ^+ (N)) \ge 2\), \(X_0 ^+ (N)({\mathbb {Q}}_p )_2 \) is finite for any \(p\ne N\).
2. For all primes N such that \(g(X_{{{\,\mathrm{ns}\,}}}^+ (N)) \ge 2\) and \(X_{{{\,\mathrm{ns}\,}}}^+ (N)({\mathbb {Q}})\ne \emptyset \), \(X_{{{\,\mathrm{ns}\,}}}^+ (N)({\mathbb {Q}}_p )_2 \) is finite for any \(p\ne N\).
Remark 1
- For all primes N for which one of the curves X above has genus 0 or 1, \(X({\mathbb {Q}})\) is infinite. Indeed, the prime numbers N such that \(g(X_0^+(N)) \le 1\) make up a finite list with maximal element 131 [27, Propositions 3.1 and 3.2], and the elliptic cases can then be checked on the LMFDB [48] by looking at the corresponding explicit elliptic curves sorted by conductor (the genus 0 case is automatic due to the rational cusp). For the nonsplit Cartan modular curve, the genus formula [55, Proposition 13] proves that \(g(X_\mathrm{{ns}}^+(N)) \le 1\) if and only if \(N \le 11\), and these 5 cases are handled similarly, as one can always find a rational point associated to an elliptic curve with CM by one of the 9 imaginary quadratic fields of class number one.
- The only reason for the assumption that \(X_{{{\,\mathrm{ns}\,}}}^+ (N)({\mathbb {Q}})\) is nonempty is that the definition of \(X({\mathbb {Q}}_p )_2 \) currently assumes that X has a rational point (if Serre’s uniformity question has a positive answer, then there are infinitely many N for which \(X_{{{\,\mathrm{ns}\,}}}^+ (N)({\mathbb {Q}})\) is empty). One can modify the definition of \(X({\mathbb {Q}}_p )_2 \), for example in a similar manner to [33], to remove this assumption, and then \(X_{{{\,\mathrm{ns}\,}}}^+ (N)({\mathbb {Q}}_p )_2 \) will be finite whenever the genus of \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\) is greater than 1. In particular, such a modification should in principle give a method to prove that \(X_{{{\,\mathrm{ns}\,}}}^+(N)({\mathbb {Q}}_p )_2\), and hence \(X_{{{\,\mathrm{ns}\,}}}^+ (N)({\mathbb {Q}})\), is empty in these cases (although the large genera of such curves mean that in practice they are currently beyond the scope of existing computational methods for other reasons). As this involves several techniques not relevant to the proof of Theorem 1, we do not pursue this point in this paper.
- Finally, results of [3], together with Edixhoven and Parent’s explicit models for \(X_{{{\,\mathrm{ns}\,}}}(N)\) [23], allow us to deduce from our result an explicit bound (polynomial in N) on the number of rational points on \(X_0^+(N)\) and \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\), which we do in Sect. 3.1.
In this paper we say nothing about carrying out the second stage of Mazur’s strategy (i.e. computing the finite set \(X({\mathbb {Q}}_p )_2\)). However, as alluded to above, for a given X, if one can prove \(X({\mathbb {Q}}_p )_2\) is finite there has been significant recent progress in computing it, and \(X({\mathbb {Q}})\), in practice. For example, when \(N=13\), the rational points of \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\) are computed in [8], by computing \(X({\mathbb {Q}}_p )_2\). Similarly for \(X=X_0 ^+ (N)\), the rational points of all X of genus 2 are computed in [4], and in forthcoming work [9], the case of all X of genus three is handled.
The proof of Theorem 1 proceeds along the lines of the quadratic Chabauty method, which requires a precise inequality (namely (2)) in terms of invariants of the Jacobian J of X to hold (see Sect. 1.1). This inequality is expected to hold asymptotically for \(X=X_0 ^+ (N)\) or \(X=X_{{{\,\mathrm{ns}\,}}}^+ (N)\) conditionally on the Birch and Swinnerton-Dyer conjecture (see §4.1), but looks out of reach unconditionally for N beyond the computable range. There are thus two important steps in the proof of Theorem 1:
- For p a prime of good reduction of a smooth projective geometrically irreducible curve X over \({\mathbb {Q}}\) with \(X({\mathbb {Q}}) \ne \emptyset \), \(X({\mathbb {Q}}_p)_2\) is finite under the condition that a similar inequality to (2) holds not for J but for a quotient abelian variety A of J, and under an additional hypothesis (C) on X, J, A.
- For \(X=X_0^+(N)\) or \(X=X_{{{\,\mathrm{ns}\,}}}^+ (N)\), there is a quotient abelian variety A of J satisfying the analogue of (2) and such that X, J, A satisfy (C), if for \(M=N\) (resp. \(N^2\)) there are two distinct normalised eigenforms \(f \in S_2 (\varGamma _0 (M))^{+,\text {new}} \) such that \(L'(f,1) \ne 0\).
The final input in the proof of Theorem 1 is the following Theorem.
Theorem 2
For all \(M=N\) or \(N^2\) with N prime, if the space \(S_2 (\varGamma _0 (M))^{+,\mathrm {new}} \) is of dimension at least two, it contains two distinct normalised newforms f such that \(L'(f,1) \ne 0\).
As explained in Remark 8, this nonvanishing result is in fact quite weak compared to known or expected asymptotic estimates (which give a positive linear proportion of nonvanishing values). The main difficulty in the proof of Theorem 2 therefore lies in making such estimates effective enough to prove the result for all but small N, so that the remaining cases can be checked algorithmically.
1.1 Chow–Heegner points and quadratic Chabauty
In general, \(X({\mathbb {Q}}_p )_n \) cannot unconditionally be proved to be finite without some assumptions on the Jacobian of X (Kim showed that the Bloch–Kato conjectures imply that \(X({\mathbb {Q}}_p )_n \) is finite for all \(n\gg 0\) [40, Observation 2]). In the case \(n=1\) (which reduces to the classical set-up of Chabauty’s method) it is known that a sufficient condition is that
$$\begin{aligned} {{\,\mathrm{rk}\,}}(J) < \dim (J), \end{aligned} \tag{1}$$
where \({{\,\mathrm{rk}\,}}(J)\) is the Mordell–Weil rank of \(J({\mathbb {Q}})\). The simplest instance extending Chabauty’s method when finiteness of \(X({\mathbb {Q}}_p )_n\) can be proved for \(n >1\) is the following Lemma. To state the Lemma, let J denote the Jacobian of X, and recall that the Picard number \(\rho (J)\) is defined to be the rank of the Néron–Severi group \({{\,\mathrm{NS}\,}}(J):={\text {Pic}}(J)/{\text {Pic}}^0 (J)\). By [51, Proposition 17.2], this is the same as the dimension of the subspace denoted by \({\text {End}}^\dagger (J)\) of \({\text {End}}^0 (J):={\text {End}}(J) \otimes {\mathbb {Q}}\) consisting of endomorphisms that are symmetric, i.e. fixed by the Rosati involution.
Lemma 1
([6], Lemma 3.2) If
$$\begin{aligned} {{\,\mathrm{rk}\,}}(J) < \dim (J) + \rho (J) - 1, \end{aligned} \tag{2}$$
then \(X({\mathbb {Q}}_p )_2 \) is finite. In particular, if \({{\,\mathrm{rk}\,}}(J) = \dim (J)\), then \(X({\mathbb {Q}}_p )_2 \) is finite whenever \(\rho (J)>1\).
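As a quick sanity check on how the hypothesis of Lemma 1 is used, here is a minimal Python sketch that tests the inequality \({{\,\mathrm{rk}\,}}(J) < \dim (J) + \rho (J) - 1\) for given numerical invariants. The function name and the sample values are hypothetical illustrations, not data from this paper; computing the actual rank and Picard number of a Jacobian is a separate and much harder problem.

```python
def satisfies_lemma_1_bound(rank: int, dimension: int, picard_number: int) -> bool:
    """Test the inequality rk(J) < dim(J) + rho(J) - 1 from Lemma 1.

    All three inputs are assumed to be known in advance; this helper only
    performs the comparison.
    """
    if rank < 0 or dimension < 1 or picard_number < 1:
        raise ValueError("expected rank >= 0, dimension >= 1, picard_number >= 1")
    return rank < dimension + picard_number - 1


# Hypothetical invariants with rank equal to dimension, as in the
# 'in particular' clause of the lemma.
print(satisfies_lemma_1_bound(rank=3, dimension=3, picard_number=2))  # True
print(satisfies_lemma_1_bound(rank=3, dimension=3, picard_number=1))  # False
```

With `picard_number=1` the test reduces to the classical Chabauty condition \({{\,\mathrm{rk}\,}}(J) < \dim (J)\), which is why a Picard number greater than 1 is needed as soon as the rank equals the dimension.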
By Kolyvagin–Logachev type results due to Nekovář and Tian (see Proposition 8 and its Corollary 4), Theorem 2 implies that the Jacobians of \(X_0 ^+ (N)\) and \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\), which we will henceforth denote by \(J_0 ^+ (N)\) and \(J_{{{\,\mathrm{ns}\,}}}^+ (N)\) respectively, do have \({\mathbb {Q}}\)-isogeny factors A satisfying \({{\,\mathrm{rk}\,}}(A) <\dim (A) + \rho (A)-1\), but it seems unattainable to prove unconditionally such a result for the full Jacobian. To deduce Theorem 1, we thus need a ‘quadratic Chabauty for quotients’ result, analogous to the well-known fact that Chabauty’s method also works under the relaxed condition \({{\,\mathrm{rk}\,}}(A)<\dim (A)\), i.e. (1) for an isogeny factor A instead of J (in fact, for modular curves, Mazur–Kamienny’s method refines this for factors A such that \({{\,\mathrm{rk}\,}}(A)=0\), see e.g. [2]).
As explained below, in general such a result seems non-trivial. Fix a basepoint \(b\in X({\mathbb {Q}})\), and let \(\mathrm {AJ}:X\rightarrow J\) be the corresponding Abel–Jacobi map. Let A, B be abelian varieties over \({\mathbb {Q}}\), satisfying \({\text {Hom}}(A,B)=0\), and suppose we have a surjection \((\pi _A ,\pi _B ) :J \rightarrow A \times B\).
A slight modification denoted by \({\widetilde{\mathrm {AJ}}}^*\) of the pullback by \(\mathrm {AJ}\) (which basically amounts to considering the restriction of \(\mathrm {AJ}^*\) on symmetric line bundles, see §2.1) vanishes on \({\text {Pic}}^0(J)\), so it factors through \({{\,\mathrm{NS}\,}}(J)\) and \({\widetilde{\mathrm {AJ}}}^* :{{\,\mathrm{NS}\,}}(J) \rightarrow {\text {Pic}}(X)\) will denote this factorisation by abuse of notation. It induces a map
and therefore a map
which is called the Chow–Heegner construction (see Definition 3 for details).
Remark 2
As an alternative definition (useful for the proofs), for any correspondence \(Z \subset X\times X\), we can associate a cycle \(D_Z (b)\in {\text {Pic}}^0(X)\) (see (16)), and this defines a homomorphism \({{\,\mathrm{NS}\,}}(X \times X) \rightarrow {\text {Pic}}^0(X)\) so that the composition
where \(\mathrm {AJ}^{(2)} :X \times X \rightarrow J\) is defined by \((x,y) \mapsto [x] + [y] - 2[b]\), is equal to \({\widetilde{\mathrm {AJ}}}^*\) on \(({\widetilde{\mathrm {AJ}}}^*)^{-1}({\text {Pic}}^0(X))\), which then allows us to retrieve \(\theta _{X,\pi _A,\pi _B}\) on cycles Z coming from \({\text {Ker}}d_{\pi _A}\).
The ‘quadratic Chabauty for quotients’ result that we prove in this paper says that we can replace J with A, but the price we pay is that we replace \(\rho (J)-1\) with the rank of \({\text {Ker}}(\theta _{X,\pi _A ,\pi _B})\), which can be smaller than \(\rho (A)-1\).
Proposition 1
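Let X be a curve as above. Suppose J admits an isogeny \((\pi _A ,\pi _B) :J\rightarrow A\times B\), where \({\text {Hom}}(A,B)=0\). If
$$\begin{aligned} {{\,\mathrm{rk}\,}}(A) < \dim (A) + {{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X,\pi _A ,\pi _B})), \end{aligned} \tag{C}$$
then \(X({\mathbb {Q}}_p )_2 \) is finite.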
In the case where \({{\,\mathrm{rk}\,}}(A)=\dim (A)\) which we will focus on, we can simplify this condition in terms of nice correspondences, defined in Sect. 2.1. More precisely, \((\pi _A, \pi _B)\) induces an isomorphism \({\text {End}}^0(J) \cong {\text {End}}^0(A) \times {\text {End}}^0(B)\), and \(X({\mathbb {Q}}_p )_2 \) is finite whenever there exists a nontrivial nice correspondence Z on \(X\times X\) whose corresponding endomorphism of J is zero in \({\text {End}}^0(B)\), and whose corresponding Chow–Heegner point \(D_Z (b) \in {\text {Pic}}^0(X)\) is torsion when projected to B.
Remark 3
Note that, since \({{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X, \pi _A ,\pi _B })) \le \rho (A) -1\), inequality (C) implies that A satisfies the naive analogue of Lemma 1
$$\begin{aligned} {{\,\mathrm{rk}\,}}(A) < \dim (A) + \rho (A) - 1. \end{aligned} \tag{5}$$
However, in general (C) is strictly stronger than (5). In fact, the trivial lower bound on \({{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X,\pi _A,\pi _B}))\) is \(\rho (A)-1 - {{\,\mathrm{rk}\,}}(B)\) and if the latter were positive, it would imply (2). This is why Proposition 2 looks quite particular to modular curves. Moreover, understanding the rank of \({\text {Ker}}(\theta _{X,\pi _A ,\pi _B })\) in general seems somewhat subtle: as becomes apparent in Example 1 and Sect. 3.2, this quantity is not an invariant of the pair (A, B), or even of the triple (X, A, B), and does not seem to behave so well functorially even under quite strong hypotheses. Finally, as explained in the first appendix, this quantity is also related to the Gross–Kudla–Schoen cycles constructed in [31].
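To make the comparison between (C) and (5) concrete, the implication can be written as a one-line chain. This is only a sketch: it assumes that (C) is the rank inequality for A in which \(\rho (A)-1\) is replaced by \({{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X,\pi _A ,\pi _B}))\), as described just before Proposition 1.

```latex
\begin{aligned}
\mathrm{rk}(A)
  \;<\; \dim(A) + \mathrm{rk}\,\mathrm{Ker}(\theta_{X,\pi_A,\pi_B})
  \;\le\; \dim(A) + \rho(A) - 1 .
\end{aligned}
```

The first inequality is (C) and the second is the upper bound on the rank of the kernel, so the two conditions coincide exactly when \({{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X,\pi _A ,\pi _B})) = \rho (A)-1\).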
The following proposition emphasises that in fact, the supplementary condition (C) can always be satisfied for our modular curves.
Proposition 2
Let \(X=X_0 ^+ (N)\) or \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\), and \(J={\text {Jac}}(X)\). Assume Theorem 2 holds, and the genus of X is at least two. Then J admits an isogeny \((\pi _A,\pi _B) :J \rightarrow A \times B\) satisfying
1. \({{\,\mathrm{rk}\,}}(A) = \dim A \ge 2\).
2. \(\rho (A)>1\).
3. \({{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X,\pi _A,\pi _B})) =\rho (A)-1\).
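For the record, properties 1 to 3 give condition (C) directly. The following lines spell out the arithmetic, under the assumption that (C) is the inequality \({{\,\mathrm{rk}\,}}(A) < \dim (A) + {{\,\mathrm{rk}\,}}({\text {Ker}}(\theta _{X,\pi _A,\pi _B}))\).

```latex
\begin{aligned}
\mathrm{rk}\,\mathrm{Ker}(\theta_{X,\pi_A,\pi_B}) &= \rho(A) - 1 \;\ge\; 1
  && \text{(properties 2 and 3)}, \\
\dim(A) + \mathrm{rk}\,\mathrm{Ker}(\theta_{X,\pi_A,\pi_B}) &\ge \dim(A) + 1 \;>\; \dim(A) = \mathrm{rk}(A)
  && \text{(property 1)}.
\end{aligned}
```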
As will become apparent in the proof, in fact we take A to be the maximal isogeny factor of J whose analytic rank is equal to its dimension and B its complement, otherwise we might not be able to ensure that the kernel of \(\theta _{X,\pi _A,\pi _B}\) is nontrivial. This idea relies heavily on the use of (traces of) Heegner points on the modular curves \(X_0(N),X_\mathrm{{ns}}(N)\), which generate \(A({\mathbb {Q}})\) up to finite index, but will automatically be torsion in \(B({\mathbb {Q}})\), both situations being ultimately by-products of the generalised Gross–Zagier formula (see Sect. 4.2). Note that in this case the kernel of the theta morphism is not only nontrivial, but as large as it can be, which might indicate a deeper phenomenon at play.
The structure of the paper is as follows. In Sect. 2, we give some reminders on Néron–Severi groups, Chow groups and correspondences, and describe the map \(\theta _{X,\pi _A,\pi _B}\) in terms of cycles. In Sect. 3 we prove Proposition 1. In Sect. 4, we prove Proposition 2 assuming Theorem 2, after some discussion on (C), and using generalised Gross–Zagier formulas. In Sect. 5, we prove Theorem 2. Finally, for the sake of clarity and for lack of easily available references in the literature, we gather in Appendix 6 results about the Chow–Heegner construction above and explain in Appendix 7 the proof of the Kolyvagin–Logachev type result needed to translate Theorem 2 into an algebraic rank result.
1.2 Notation and conventions
Unless stated otherwise, we adopt the following conventions in this paper.
\(\bullet \) X is a smooth projective geometrically irreducible curve of genus \(\ge 2\) over \({\mathbb {Q}}\). J is the Jacobian of X and \(\mathrm {AJ}:X \rightarrow J\) is the Albanese morphism with a fixed base point \(b \in X({\mathbb {Q}})\). The notation \({\widetilde{\mathrm {AJ}}}^*\) refers to twice the pullback on symmetric line bundles of J to \({\text {Pic}}(X)\) (see (13)), and then factors through \({{\,\mathrm{NS}\,}}(J)\) (this is not the same as just the pullback \(\mathrm {AJ}^*\) from \({\text {Pic}}(J)\) to \({\text {Pic}}(X)\), which does not vanish on \({\text {Pic}}^0(J)\)).
\(\bullet \) For any n and any \(S \subset \{1, \ldots , n\}\), the morphism
$$\begin{aligned} i_S(b) :X \rightarrow X^n \end{aligned} \tag{6}$$
is defined so that the j-th coordinate of \(i_S(b)(x)\) is x if \(j \in S\) and b otherwise. When there is no ambiguity on b we denote it simply by \(i_S\). Similarly, the morphism
$$\begin{aligned} \pi _S :X^n \rightarrow X^{|S|} \end{aligned} \tag{7}$$
denotes the projection of \((x_1, \ldots , x_n)\) on the coordinates belonging to S.
\(\bullet \) Morphisms between algebraic varieties over \({\mathbb {Q}}\) and their structures (line bundles, divisors, etc) are assumed to be defined over \({\mathbb {Q}}\).
\(\bullet \) For a smooth projective algebraic variety Y over \({\mathbb {Q}}\), \({{\,\mathrm{NS}\,}}(Y)\) is the Néron–Severi group of Y, and \(\rho (Y):= {{\,\mathrm{rk}\,}}{{\,\mathrm{NS}\,}}(Y)\) is the Picard number of Y (see §2.1).
\(\bullet \) For any abelian variety A over \({\mathbb {Q}}\) (in particular for J), \({{\,\mathrm{rk}\,}}(A)\) is the rank of the finitely generated \({\mathbb {Z}}\)-module \(A({\mathbb {Q}})\) and \({\text {End}}^0 (A) := ({\text {End}}_{\mathbb {Q}}A) \otimes {\mathbb {Q}}\).
\(\bullet \) N is a prime number (the level of our modular curves) and \(M=N\) or \(N^2\).
\(\bullet \) \(X_0(N)\) (resp. \(X_\mathrm{{s}}^+(N)\), \(X_\mathrm{{ns}}^+(N)\)) is the modular curve quotient of X(N) corresponding to the Borel structure (resp. normaliser of split Cartan, normaliser of nonsplit Cartan), and \(X^+_0(N)\) is the quotient of \(X_0(N)\) by the Atkin–Lehner involution \(w_N\). The Jacobians of these modular curves are denoted by \(J_0(N), J_\mathrm{{s}}^+(N), J_\mathrm{{ns}}^+(N), J_0^+(N)\) respectively (see Sect. 4).
\(\bullet \) For X a variety over a field \(K\subset \mathbb {C}\), \(H^k (X,\mathbb {Z})\) refers to the singular cohomology of \(X({\mathbb {C}})\).
\(\bullet \) Given a unipotent group U, the central series filtration of U is defined by \(U^{(1)} = U\) and \(U^{(i+1)}= [U,U^{(i)}]\), and \({\text {gr}}_i (U):=U^{(i)}/U^{(i+1)}\) (in particular \({\text {gr}}_1 (U)=U^{\mathrm {ab}}\)). If a group G acts continuously on U, then G acts on the set of normal subgroups of U, and we say that a quotient U/H is G-stable if the normal subgroup H is stabilised by G. In this case there is a unique G-action on U/H making the surjection G-equivariant.
\(\bullet \) The letter p denotes a prime number different from N which will be used (except in Appendix 7) only in the context of p-adic numbers.
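To illustrate the central series filtration \(U^{(i+1)}=[U,U^{(i)}]\) from the conventions above, here is a minimal Python sketch. It works with a stand-in example that is an assumption of this illustration and does not appear in the paper: the Lie algebra of strictly upper triangular \(d\times d\) matrices, used in place of a unipotent group (in characteristic zero the two filtrations correspond). All function names are hypothetical.

```python
from itertools import product


def bracket(e, f):
    """Bracket of elementary matrices E_ij and E_kl with i < j and k < l.

    [E_ij, E_kl] = delta_jk E_il - delta_li E_kj, and for strictly upper
    triangular indices at most one term survives. Returns the index pair of
    the surviving elementary matrix (the sign is ignored), or None if zero.
    """
    (i, j), (k, l) = e, f
    if j == k:
        return (i, l)
    if l == i:
        return (k, j)
    return None


def lower_central_series(d):
    """Bases of U^(1), U^(2), ... for strictly upper triangular d x d matrices.

    Each term is stored as the set of index pairs (i, j) of the elementary
    matrices spanning it; the list ends with the zero subspace (empty set).
    """
    full = {(i, j) for i in range(d) for j in range(i + 1, d)}
    series = [full]
    while series[-1]:
        # U^(m+1) = [U, U^(m)], spanned by brackets of basis elements.
        nxt = {bracket(e, f) for e, f in product(full, series[-1])}
        nxt.discard(None)
        series.append(nxt)
    return series


def graded_dimensions(d):
    """Dimensions of gr_i(U) = U^(i) / U^(i+1) for i = 1, 2, ..."""
    series = lower_central_series(d)
    return [len(a) - len(b) for a, b in zip(series, series[1:])]


print(graded_dimensions(3))  # [2, 1]
```

For \(d=3\) this is the Heisenberg Lie algebra: \({\text {gr}}_1\) has dimension 2 and \({\text {gr}}_2\) has dimension 1, the smallest instance of a two-step extension of an abelian piece by a central piece.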
2 The quadratic Chabauty condition (C) for a quotient
2.1 Reminders on Chow groups and Néron–Severi groups
We recall here the basic notions on correspondences of curves, and the Chow groups and Néron–Severi groups that we need. A good reference on correspondences is Smith’s thesis [61, Chapter 3], and classical ones are [13, section 11.5] for the complex case and [26, Chapter 16] for the general case.
Definition 1
For any geometrically smooth and irreducible projective variety Y over \({\mathbb {Q}}\) and any \(k \le \dim Y\):
- The Chow group \(\mathrm {CH}^{k}(Y)\) is the group of cycles of Y of codimension k up to rational equivalence.
- \(c_k :\mathrm {CH}^k (Y)\rightarrow H^{2k}(Y,\mathbb {Z})\) is the cycle map, and \(\mathrm {CH}^{k}_0(Y) :={\text {Ker}}(c_k )\) is its subgroup of homologically trivial cycles (in \(Y({\mathbb {C}})\)).
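In particular, there are canonical isomorphisms
$$\begin{aligned} \mathrm {CH}^1 (Y) \cong {\text {Pic}}(Y), \qquad \mathrm {CH}^1 _0 (Y) \cong {\text {Pic}}^0 (Y). \end{aligned}$$
The Néron–Severi group \({{\,\mathrm{NS}\,}}(Y) := {\text {Pic}}(Y)/ {\text {Pic}}^0(Y)\) is thus embedded in \(H^2(Y({\mathbb {C}}),{\mathbb {Z}})\).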
We can also define a geometric étale cycle map [21, Cycle]
and an absolute étale cycle map
By the Artin comparison theorem we have \({\text {Ker}}(\prod _l c_k ^{l,{\acute{\mathrm{e}}\mathrm{t}}})=\mathrm {CH}^k _0 (Y)\). The étale Abel–Jacobi morphism is a homomorphism
which may be defined using the Leray spectral sequence or (equivalently but more directly) by realising the extension class of a homologically trivial cycle Z inside \(H^{2k-1}((Y-Z)_{{\overline{{\mathbb {Q}}}}},{\mathbb {Q}}_p (k))\) (see Jannsen [38, II.9] or Nekovář [53, 5.1]). By Poincaré duality, we may equivalently think of the target of \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}} \) as being
In particular, when \(Y=X\) is a curve, and for \(k=1\), the target of \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}\) is
where J is the Jacobian of X and \(V_p(J) = T_p(J) \otimes _{{\mathbb {Z}}_p} {\mathbb {Q}}_p\).
Let us now review the basic definitions of correspondences.
Definition 2
For two curves \(X_1,X_2\) as before:
- A correspondence Z on \(X_1,X_2\) is an element of \({\text {Div}}(X_1 \times X_2)\); it is prime if the underlying divisor is. It is called fibral if its prime components are horizontal or vertical divisors.
- If Z is a nonfibral prime correspondence, the two projections \(\pi _{1,Z}, \pi _{2,Z} :Z \rightarrow X_1, X_2\) are nonconstant so \(\psi _Z :=(\pi _{2,Z})_* \circ \pi _{1,Z}^*\) defines a morphism from \({\text {Div}}(X_1)\) to \({\text {Div}}(X_2)\), inducing a morphism between the Jacobians of \(X_1\) and \(X_2\), and two rationally equivalent divisors define the same morphism. This defines by linearity (extending to 0 for fibral prime divisors) a surjective morphism
$$\begin{aligned} {\psi } :{\text {Pic}}(X_1 \times X_2) \rightarrow {\text {Hom}}({\text {Jac}}(X_1), {\text {Jac}}(X_2)), \end{aligned} \tag{8}$$
with kernel \(\pi _1 ^* {\text {Pic}}(X_1)\oplus \pi _2 ^* {\text {Pic}}(X_2)\) with notation (7) ([13, Theorem 11.5.1] or [61, Theorem 3.3.12]).
When \(X=X_1=X_2\), with the choice of a base point b, using notation from (6) and (7), we obtain from \(\pi _1 \circ i_1 = {\text {Id}}_X\) and similar relations the identities
(see [61, Proposition 3.3.8], as homologically trivial cycles are homomorphically trivial) which induces a decomposition
where the last direct factor then canonically identifies with \({\text {End}}(J)\) via (8). By abuse of notation, we thus denote
the inverse of this isomorphism. Now, the morphism \(i_{1,2}^* - i_1^* - i_2^*\) is trivial when restricted to \({\text {Pic}}^0(X \times X)\), hence induces a morphism
Define
We have \({\widetilde{\mathrm {AJ}}}^* =[2]^* \circ \mathrm {AJ}^* -2\mathrm {AJ}^*\) so for \([{{\mathcal {L}}}] \in {\text {Pic}}(J)\),
using the classical identity \([n]^* ({\mathcal {L}})\simeq {\mathcal {L}}^{\otimes (\frac{n^2+n}{2})}\otimes [-1]^* ({\mathcal {L}}^{\otimes (\frac{n^2 -n}{2})})\). In particular, \({\widetilde{\mathrm {AJ}}}^*\) is twice the usual pullback by \(\mathrm {AJ}\) on symmetric line bundles.
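As a worked instance of the classical identity, take \(n=2\), so that the exponents are \(\frac{n^2+n}{2}=3\) and \(\frac{n^2-n}{2}=1\). The sketch below compares \([2]^*{\mathcal {L}}\) with \({\mathcal {L}}^{\otimes 2}\) on J in the symmetric and antisymmetric cases; writing the comparison as the line bundle \([2]^*{\mathcal {L}}\otimes {\mathcal {L}}^{\otimes -2}\) is a notational choice of this illustration.

```latex
\begin{aligned}
[2]^*\mathcal{L} &\simeq \mathcal{L}^{\otimes 3} \otimes [-1]^*\mathcal{L}, \\
[-1]^*\mathcal{L} \simeq \mathcal{L} \ (\text{symmetric})
  &\;\Longrightarrow\; [2]^*\mathcal{L} \otimes \mathcal{L}^{\otimes -2} \simeq \mathcal{L}^{\otimes 2}, \\
[-1]^*\mathcal{L} \simeq \mathcal{L}^{-1} \ (\text{antisymmetric})
  &\;\Longrightarrow\; [2]^*\mathcal{L} \otimes \mathcal{L}^{\otimes -2} \simeq \mathcal{O}_J .
\end{aligned}
```

The symmetric case recovers the factor of two, and the antisymmetric case shows why the modified pullback kills antisymmetric classes.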
For any divisor D of \(X \times X\), the degree of \(\varphi (D)\) is equal to the rational trace of \(\psi (D)\) ([13, Proposition 11.5.2]). This induces a morphism
By [52, IV.20], the rule \({\mathcal {L}}\mapsto \lambda _{{\mathcal {L}}}\) defined by \(\lambda _{{\mathcal {L}}}(P) = T_P^*{{\mathcal {L}}}\otimes {{\mathcal {L}}}^{-1} \in {\text {Pic}}^0(J)\) induces an isomorphism
where \({{\mathcal {P}}}:J \overset{\cong }{\rightarrow } {\widehat{J}}\) is a natural principal polarisation given by a theta divisor. This is the same as applying the composition \(- \psi \circ (\mathrm {AJ}^{(2)})^*\). Indeed, via the natural morphisms \({\widehat{J}} \cong {\text {Pic}}^0(J)\) and \({\text {Pic}}^0(X) \cong J\), the inverse \({\widehat{J}} \rightarrow J\) of the principal polarisation given by a theta divisor on J is equal to \(- \mathrm {AJ}^*\) from \({\text {Pic}}^0(J)\) to \({\text {Pic}}^0(X)\) [13, Proposition 11.3.5].
Now, in terms of line bundles, by definition, given a line bundle L on \(X \times X\), the endomorphism of \({\text {Pic}}(X)\) associated to it is given on points by \(x \mapsto i_2^*(x)(L)\) with notation (6). As \((\mathrm {AJ}^{(2)} \circ i_2(x)) = T_{[x]-[b]} \circ \mathrm {AJ}\), for a line bundle \({{\mathcal {L}}}\) on \({\text {Pic}}(J)\) and x, y points of X the endomorphism associated to \(L=(\mathrm {AJ}^{(2)})^* {{\mathcal {L}}}\) sends \([x]-[y]\) to
which gives the equality up to \(-1\). Hence, if we define
and
then we have the following commutative diagram to sum up all the previous properties. Every symbol \(\circlearrowleft \) means that the diagram around it commutes, and every \(\circlearrowleft _{-}\) means that one composition is equal to \(-1\) times the other. Dashed arrows indicate that the morphisms are only defined on part of the domain or with small codomain, but in each case, it admits a natural extension. By abuse of notation, \(\psi \) and \((\mathrm {AJ}^{(2)})^*\) are used both on Picard groups and Néron–Severi groups.

Remark 4
In [8], an element of \({\text {Pic}}(X\times X)\) whose image under \(\psi \) lies in \({\text {End}}^\dagger (J)^{{{\,\mathrm{tr}\,}}=0}\) is referred to as a ‘nice correspondence’.
2.2 Chow–Heegner points and diagonal cycles
We recall an equivalent version of the morphism \({\widetilde{\theta }}_{X,b}\), which appears in [18] and [6]. As our discussion applies in fairly broad generality, we take X to be a smooth geometrically irreducible projective curve over a field K of characteristic zero. Fix \(b\in X(K)\) and, for \(S \subset \{ 1,\ldots , n\}\), let \(X_S\) denote the image of X under the closed immersion \(i_S (b)\) defined in (6). For any \(Z \in {\text {Div}}(X \times X)\), let \(C_Z(b) := (i_{\{1,2\}}^*(b) -i_{\{1 \}}^*(b) -i_{\{2 \} }^*(b) )(Z) = \varphi ([Z])\) and
We refer to \(D_Z (b)\) and \(C_Z (b)\) as Chow–Heegner points, following [19].
The map \(Z\mapsto D_Z (b)\) factors through \({\text {Pic}}(X\times X)\), and has the following relation to \({\widetilde{\theta }}_{X,b}\). The projection
associated to (910) is given by \((1-\pi _1 ^* \circ i_1 ^* -\pi _2 ^* \circ i_2 ^* )\), giving the identities
Since \(\deg (C_Z (b))=\deg (\varphi (\varPi ([Z])))\), for any Z in \({\text {Pic}}(X\times X)\) which lies in the kernel of \(\deg \varphi \), we have
These computations also prove the claims of Remark 2 using the diagram (15). We define \(Z^t \in \mathrm {CH}^1 (X\times X)\) to be the pull-back of Z under the involution
Lemma 2
In the notation of Definition 2, we have
Proof
We have \(i_{\{1,2\}}(b)=i_{\{1,2\}}(b')\). Hence
By definition of the correspondences, we then have
and
which proves the equality for \(C_Z(b') - C_Z(b)\), thus for \(D_Z(b') - D_Z(b)\) as the degrees are then equal. \(\quad \square \)
Definition 3
Given a surjective homomorphism \(\pi _B:J\rightarrow B\) of abelian varieties, we obtain a homomorphism
By Lemma 2 and (17), for a divisor Z on \(X \times X\), if \(\psi _{\varPi (Z)}\) has image contained in \({\text {Ker}}(\pi _B )\), then the image of [Z] in B via (18) is independent of the choice of basepoint. In particular, if we have a surjection \((\pi _A ,\pi _B):J\rightarrow A\times B\), and \({\text {Hom}}(A,B)=0\), then we obtain a homomorphism independent of b, which we will denote by
Remark 5
This construction also has a direct description in terms of line bundles, although this is not the one we use to calculate \(\theta _{X,\pi _A ,\pi _B}\) in examples. Given a line bundle \({{\mathcal {L}}}_A\) on A whose pull-back to X via \(\mathrm {AJ}^* \circ \pi _A ^* \) has degree zero, we may also consider the projection of \(\mathrm {AJ}^* \circ \pi _A ^* ({{\mathcal {L}}}_A)\) to B. Variants of this construction are studied in the thesis of Michael Daub [20] when \({\text {Hom}}(A,B)=0\). By (13) and because \({\text {Pic}}^0(J)\) contains all classes of antisymmetric line bundles, we have the identity [20, Proposition 3.3.3]
where p is the projection from \({\text {Pic}}(A)\) to \({{\,\mathrm{NS}\,}}(A)\) restricted to \(p^{-1}({\text {Ker}}(d_{\pi _A}))\). In particular, the right-hand side does vanish on \({\text {Pic}}^0(A)\) [20, Proposition 3.3.2].
Example 1
Note that \(\theta _{X,\pi _A ,\pi _B }\) is not an invariant of A and B, or even of X, A, B. For example, let A and B be distinct isogeny factors of \(J_0 (N)\), and let \(X=X_0 (N^2 )\). Let \(f_1 ,f_2 :X\rightarrow X_0 (N)\) be the two natural morphisms, and let \((\pi _{A,i},\pi _{B,i})\) be the morphisms \({\text {Jac}}(X)\rightarrow A\times B\) obtained by composing the surjection \(J_0 (N)\rightarrow A\times B\) with \(f_{i*}\). Then \(\theta _{X,\pi _{A,i} ,\pi _{B,i} }\) can be nonzero (see [18] for examples); however, if \(i\ne j\), \(\theta _{X,\pi _{A,i},\pi _{B,j}}\) is identically zero, since for any choice of line bundle \([{{\mathcal {L}}}]\) in \({{\,\mathrm{NS}\,}}(A)\), the associated point \(D_{[{{\mathcal {L}}}]}(b)\) will lie in \(f_{i}^* J_0 (N)\), hence the projection to \(f_{j*}J_0 (N)\) will be torsion.
3 Proof of finiteness of the Chabauty–Kim set under (C)
The strategy of proof of Proposition 1 is very similar to that of [6, Lemma 3.2]. To explain this strategy, we need to establish some notation. X, A, B are as in the proposition. Define
Let \(U_n (b)\) denote the maximal n-unipotent quotient of the \({\mathbb {Q}}_p \)-unipotent fundamental group of \({\overline{X}}\) at some basepoint b as defined in [22, §10]. Let U be a Galois-stable quotient of \(U_n (b)\) (i.e. a quotient by a Galois-stable normal subgroup of \(U_n(b)\)). Let \(T_0 \) be the set of primes of bad reduction for X, and let \(T=T_0 \cup \{ p \}\). Denote the maximal quotient of \({\text {Gal}}({\overline{{\mathbb {Q}}}} /{\mathbb {Q}})\) unramified outside T by \(G_{{\mathbb {Q}},T}\), and for \(v\in T\) denote \({\text {Gal}}({\overline{{\mathbb {Q}}}}_v /{\mathbb {Q}}_v )\) by \(G_{{\mathbb {Q}}_v} \). Then by [39, 40], we have a commutative diagram

with the following properties.
1. For \(G=G_{{\mathbb {Q}},T}\) or \(G_{{\mathbb {Q}}_v }\), and all \(i<k\), the sets \(H^1 (G,U^{(i)}/U^{(k)})\) have the structure of \({\mathbb {Q}}_p \) points of an algebraic variety, so that the algebraic structure on \(H^1 (G,{\text {gr}}_i U)\) is just the usual scheme structure on a vector space, and the maps
$$\begin{aligned} H^1 (G,{\text {gr}}_i U)\rightarrow H^1 (G,U/U^{(i+1)})\rightarrow H^1 (G,U/U^{(i)}) \end{aligned}$$
come from morphisms of algebraic varieties. The maps \({\text {loc}}_v \) are then algebraic for these structures.
2. For \(v\in T_0 \), the map \(j_v \) has finite image.
3. The image of the map \(j_p \) is contained inside the subvariety \(H^1 _f (G_{{\mathbb {Q}}_p },U)\) of crystalline torsors.
The following Lemma is proved in [6, Lemma 3.1] (although the result is stated only in the case \(A=J\), the proof generalises to the case where A is an arbitrary quotient of J).
Lemma 3
Let U be a Galois-stable quotient of \(U_2 (b)\). Suppose U is an extension of \(V_A\) by \({\mathbb {Q}}_p (1)^n\), where A is some abelian variety over \({\mathbb {Q}}\) and \(V_A= T_p (A) \otimes {\mathbb {Q}}_p\). If
$$\begin{aligned} {{\,\mathrm{rk}\,}}(A({\mathbb {Q}})) < \dim (A) + n, \end{aligned}$$
then \(X({\mathbb {Q}}_p )_2\) is finite. In particular, if \({{\,\mathrm{rk}\,}}(A({\mathbb {Q}}))=\dim (A)\), then \(X({\mathbb {Q}}_p )_2 \) is finite whenever \(n> 0\).
To prove Proposition 1, we construct a quotient U of \(U_2(b)\) as in Lemma 3, with \(n={{\,\mathrm{rk}\,}}({\text {Ker}}\theta _{X,\pi _A,\pi _B})\). We again take X to be a smooth projective geometrically irreducible curve over a field K of characteristic zero.
The group \(U_2 (b)\) is an extension
Hence for any \(\xi \in {\text {Ker}}({{\,\mathrm{NS}\,}}(J) \overset{{\widetilde{\mathrm {AJ}}}^*}{\rightarrow } {{\,\mathrm{NS}\,}}(X))\), we may quotient by the kernel of the dual of the Chern class \(c_p ^{{\acute{\mathrm{e}}\mathrm{t}}}(\xi ) \in H^2(X_{{\overline{{\mathbb {Q}}}}},{\mathbb {Q}}_p(1))\) (see Sect. 1.1)
to obtain a quotient \(U_Z\) of \(U_2 (b)\) which is an extension of V by \({\mathbb {Q}}_p (1)\). Similarly, for any nice correspondence on \(X\times X\), we obtain a quotient of \(U_2 (b)\) which is an extension of V by \({\mathbb {Q}}_p (1)\).
Lemma 4
([6], Theorem 6.3) Let U be a Galois-stable quotient of \(U_2 (b)\) of the form
coming from a correspondence \(Z\subset X\times X\) as above. Then the associated extension class of \(\mathrm {Lie}(U)\) in \({{\,\mathrm{Ext}\,}}^1 _{G_K }(V_p(J),{\mathbb {Q}}_p (1))\) is equal to the étale Abel–Jacobi class of the cycle \(D_Z (b)\) (see Sect. 2.1).
Proof
Let \({\mathcal {E}}(\mathrm {Lie}(U))\) be the universal enveloping algebra of \(\mathrm {Lie}(U)\), and let \(I(\mathrm {Lie}(U))\) be the kernel of the co-unit morphism \({\mathcal {E}}(\mathrm {Lie}(U))\rightarrow {\mathbb {Q}}_p \). In [6, §6], a Galois representation \(E_Z\) is constructed as a quotient of \({\mathcal {E}}(\mathrm {Lie}(U))\). The image of \(I(\mathrm {Lie}(U))\) in \(E_Z\) is an extension \(IE_Z \) of V by \({\mathbb {Q}}_p (1)\). By [6, Theorem 6.3], the extension class of \(IE_Z\) in \({{\,\mathrm{Ext}\,}}^1_{{{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}}(V_p(J),{\mathbb {Q}}_p(1))\) is the Abel–Jacobi class of \(D_Z (b)\). The restriction of \(I(\mathrm {Lie}(U))\rightarrow IE_Z\) to \(\mathrm {Lie}(U)\subset I(\mathrm {Lie}(U))\) is an isomorphism, and hence the extension class of \(\mathrm {Lie}(U)\) is equal to the Abel–Jacobi class of \(D_Z (b)\). \(\quad \square \)
As explained in Appendix 6, Lemma 4 is really a consequence of Hain and Matsumoto’s computation of the extension class of \(\mathrm {Lie}(U_2 )\) in terms of the Ceresa cycle. Hence to complete the proof of Proposition 1, it will be enough to prove the following Lemma.
Lemma 5
Let \(U'\) denote the quotient of \(U_2 \) obtained from the surjection \({\text {gr}}_2 (U_2 )\rightarrow {\text {Ker}}(d_{\pi _A} )^* \otimes {\mathbb {Q}}_p (1)\). There exists a Galois stable quotient U of \(U'\) which is an extension of \(V_A\) by \({\text {Ker}}(\theta _{X,\pi _A ,\pi _B })\):

Proof
It will be enough to prove the corresponding statement for the Lie algebra \(L'\) of \(U'\). The commutator map
is the composite of the commutator on \(U_2 \), given by
with the surjection
Since the latter map factors through projection onto \(\wedge ^2 V_A /{\mathbb {Q}}_p (1)\), the composite map factors through projection onto \(V_A \times V_A \). Hence for any quotient Q of \({\text {Ker}}(d_{\pi _A })^* \otimes {\mathbb {Q}}_p (1)\), we can construct a Lie algebra quotient of \(L'\) which is an extension of \(V_A\) by Q. It remains to show that, when \(Q={\text {Ker}}(\theta _{X,\pi _A ,\pi _B})\), we can make this quotient Galois stable. That is, we first quotient out by \(({\text {Ker}}(d_{\pi _A })/{\text {Ker}}(\theta _{X,\pi _A ,\pi _B }))^* \otimes {\mathbb {Q}}_p (1)\), to form an extension
The surjection \(L''\rightarrow V_B \) induces a Galois equivariant short exact sequence of Lie algebras
and to construct the quotient \(U\rightarrow U'\), it is enough to show that this short exact sequence admits a Galois equivariant section. Here \(L'\) sits in a short exact sequence
and since \(L''/{\text {Ker}}(\theta _{X,\pi _A ,\pi _B})^* \otimes {\mathbb {Q}}_p (1)=V_A \oplus V_B\), it is enough to show that the image of \([L'']\) under the composite map
is zero.
Equivalently, we want to show that \({\text {Ker}}(\theta _{X,\pi _A ,\pi _B})\) is contained in the kernel of the homomorphism
sending \(\xi \in {\text {Ker}}(d_{\pi _A })\) to the \(V_B\) component of the extension class in \({{\,\mathrm{Ext}\,}}^1 (V_A \oplus V_B ,{\mathbb {Q}}_p (1))\) associated to the quotient of \(L'\) defined by \(c _p ^{{\acute{\mathrm{e}}\mathrm{t}}}(\xi )\):

By Lemma 4, this extension class is equal to the étale Abel–Jacobi class of \(D_{c_p ^{{\acute{\mathrm{e}}\mathrm{t}}}(\xi )}(b)\), and hence its \(V_B\) component is equal to the étale Abel–Jacobi class of \(\theta _{X,\pi _A ,\pi _B }(c_p ^{{\acute{\mathrm{e}}\mathrm{t}}}(\xi ))\). Under the hypothesis, the latter is 0 so the extension class is trivial, which concludes the proof of Proposition 1. \(\square \)
3.1 Bounding the number of rational points on curves satisfying (C)
Following [3], the proof of finiteness of \(X({\mathbb {Q}}_p )_2\) may be used to prove an explicit upper bound on \(\# X({\mathbb {Q}}_p )_2\). To explain this, we introduce some notation. By [41, Corollary 1], for all \(v\ne p\), the size of the image of \(X({\mathbb {Q}}_v )\) in \(H^1 (G_{{\mathbb {Q}}_v },U_2 )\) is finite, and is equal to one for all primes of good reduction for X. Let \(T_0\) denote the set of primes of bad reduction for X, and for \(v\in T_0\) let \(n_v \) denote the size of the image of \(X({\mathbb {Q}}_v )\) in \(H^1 (G_{{\mathbb {Q}}_v },U_2 )\).
Corollary 1
Suppose X satisfies the hypotheses of Proposition 1, and furthermore that the rank of \(A({\mathbb {Q}})\) is equal to its dimension, and the p-adic closure of A has finite index in \(A({\mathbb {Q}}_p )\). Let \(n:=\prod _{v\in T_0 }n_v \). Let D be an effective divisor on X, let \(Y\subset X_{\mathbb {Z}_p }\) be the complement of the support of a normal crossings divisor on Y with generic fibre D, and let \(\{ \omega _0 ,\ldots ,\omega _{2g-1}\}\) be a set of differentials in \(H^0 (X,\varOmega (D))\) forming a basis of \(H^1 _{\text {dR}} (X)\). Then there are \(a_{ij},a_i \in {\mathbb {Q}}_p\), \(\eta \in H^0 (X,\varOmega (D))\) and \(g\in H^0 (X,\varOmega (2D))\), and \(\alpha _1 ,\ldots ,\alpha _n \) in \({\mathbb {Q}}_p \), such that
Proof
The argument is identical to the proof of [7, Proposition 6.4]; however, as the hypotheses are different, we explain the steps. Arguing as in loc. cit., there are \(b_{ij}\), \(b_i \) in \({\mathbb {Q}}_p \) such that \(X({\mathbb {Q}}_p )_2 \cap Y(\mathbb {Z}_p )\) is contained in the finite set of \(x\in Y(\mathbb {Z}_p )\) satisfying
for some \((\phi _v )\) in \(\prod _{v \in T_0 }j_v (X({\mathbb {Q}}_v )).\) Here \(A_Z (b)^{(\phi _v )}\) denotes the twist of \(A_Z (b)\) by \(\phi _v \).
Hence we deduce (20) from the formula for \(h_p (A_Z (x))\) given in [7, Lemma 6.7], and the formula
\(\square \)
Corollary 2
Suppose X satisfies the hypotheses of Proposition 1, and furthermore that the rank of \(A({\mathbb {Q}})\) is equal to its dimension. Then
where \(\kappa _p :=1+\frac{p-1}{p-2}\frac{1}{\log (p)}\).
Proof
It is enough to prove that, for all \(x_0 \in X(\mathbb {Z}_p )\), we can choose \(D,\omega _i \) such that \({\overline{x}}:={{\,\mathrm{red}\,}}(x_0 )\) lies in \(Y(\mathbb {F}_p )\), and
This follows from [3, Proposition 3.2] together with [3, §4, below Lemma 4.4]. \(\square \)
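For orientation, the constant \(\kappa _p\) appearing in Corollary 2 is easy to evaluate numerically. The following Python sketch computes it directly from the formula \(\kappa _p =1+\frac{p-1}{p-2}\frac{1}{\log (p)}\). The function name `kappa_p` is ours, and we assume that \(\log \) denotes the real natural logarithm; the formula only makes sense for \(p>2\).

```python
import math

def kappa_p(p: int) -> float:
    """Evaluate kappa_p = 1 + (p-1)/((p-2)*log(p)) for an odd prime p.

    Assumption: log is the real natural logarithm.
    """
    if p <= 2:
        raise ValueError("the formula requires p > 2")
    return 1.0 + (p - 1) / ((p - 2) * math.log(p))

if __name__ == "__main__":
    # Tabulate kappa_p for a few odd primes.
    for p in (3, 5, 7, 11, 101):
        print(p, round(kappa_p(p), 4))
```

Under this reading, \(\kappa _p\) decreases towards 1 as p grows, so the constant is largest for \(p=3\).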
Remark 6
In [10], it is proved that the size of \(j_{2,v}(X({\mathbb {Q}}_v ))\) can be bounded by the number of irreducible components of a regular semistable model of X over a finite extension of \({\mathbb {Q}}_v \). Hence using work of Edixhoven and Parent on stable models of \(X_{{{\,\mathrm{ns}\,}}}^+(N)\) [23], one can use the above corollary, together with Theorem 1, to give explicit bounds on the size of \(X_{{{\,\mathrm{ns}\,}}}^+(N)({\mathbb {Q}})\) and \(X_0 ^+ (N)({\mathbb {Q}})\).
3.2 Functoriality properties of (C)
The heart of the proof of Proposition 3 is an interpretation of diagonal cycles on \(X_0 (N)\) and \(X_{{{\,\mathrm{ns}\,}}}(N)\) in terms of Heegner points. The following Lemma allows us to use this to deduce something about diagonal cycles on \(X_0 ^+ (N)\) and \(X_\mathrm{ns }^+ (N)\). This lemma is a special case of a theorem of Daub [20, Proposition 3.3.5].
Lemma 6
1. Let \(f:X'\rightarrow X\) be a non-constant morphism of curves over a field K. Suppose \(b'\in X'(K)\) maps to \(b\in X(K)\) under f, and let Z be an element of \(\mathrm {CH}^1 (X\times X)\). Then
$$\begin{aligned} D_{(f,f)^* Z}(b')=f^* (D_Z (b)). \end{aligned}$$
2. Let \(f:X'\rightarrow X\) and \(b'\) be as above, and let \(f_*\) denote the induced surjection \(J':={\text {Jac}}(X')\rightarrow J:={\text {Jac}}(X)\). Let \((\pi _A ,\pi _B)\) be a surjective homomorphism from J to \(A\times B\). Then
$$\begin{aligned} {\text {Ker}}(\theta _{X,\pi _A ,\pi _B})={\text {Ker}}(\theta _{X' ,\pi _A \circ f_* ,\pi _B \circ f_*}). \end{aligned}$$
Proof
For \(*=\{1\},\{2\}\) or \(\{1,2\}\), the diagram

commutes. Hence we obtain, in \(\mathrm {CH}^1 (X')\),
and the result follows for \(D_Z(b)\). The second item follows from the first, as we now prove. Let \({{\mathcal {L}}}\) be a line bundle on A belonging to \({\text {Ker}}d_{\pi _A}\). By definition of \(\theta _{X,\pi _A,\pi _B}\) and the right part of diagram (15), we fix some cycle Z on \(X \times X\) such that \([Z] = (\mathrm {AJ}^{(2)})^* \circ \pi _A ^* ([{{\mathcal {L}}}])\), and then by (17)
Now, considering the morphism \(f : X' \rightarrow X\) with those choices of base points, we have \(f_* \circ \mathrm {AJ}_{X'}^{(2)} = \mathrm {AJ}_{X}^{(2)} \circ (f,f)\). Consequently, with the same \({{\mathcal {L}}}\) and Z, \([(f,f)^* Z] = (\mathrm {AJ}_{X'}^{(2)})^* \circ (\pi _A \circ f_*)^* ([{{\mathcal {L}}}])\), so the kernels of \(d_{\pi _A}\) and \(d_{\pi _A \circ f_*}\) are the same, and on this common kernel,
In particular, \(\theta _{X,\pi _A,\pi _B}\) and \(\theta _{X',\pi _A \circ f_*,\pi _B \circ f_*}\) have the same kernel. \(\square \)
Note that while the behaviour of diagonal cycles under pull-backs is tautological, their behaviour under push-forwards is not. For this reason it seems difficult to deduce statements about diagonal cycles on \(X_{{{\,\mathrm{ns}\,}}}(N)\) from results on \(X_{\mathrm {s}}(N)\), in spite of the explicit isogeny relating their Jacobians explained below.
4 Proof of (C) for \(X_0 ^+ (N)\) and \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\)
Given Proposition 1, it will be enough to prove Theorem 2, and the following.
Proposition 3
Assume Theorem 2. Then, for \(X=X_0 ^+ (N)\) or \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\) of genus at least 2, there exists an isogeny
where \({{\,\mathrm{rk}\,}}(A) = \dim (A) = \rho (A) \ge 2\) and such that, for all \({{\mathcal {L}}}\) in \({\text {Ker}}(d_{\pi _A })\), \( \theta _{X,\pi _A ,\pi _B }({{\mathcal {L}}}) \) is torsion (see Definition 4 for the choices of A and B).
We recall the definitions of some of the modular curves which appear, for example, in [16]. Define \(C_{{{\,\mathrm{ns}\,}}}^+ (N),C_{\mathrm {s}}^+ (N)\) to be the normalisers in \({\text {GL}}_2 (\mathbb {Z}/N\mathbb {Z})\) of fixed choices of a non-split Cartan subgroup \(C_\mathrm{{ns}}(N)\) and a split Cartan subgroup \(C_\mathrm{{s}}(N)\) of \({\text {GL}}_2 (\mathbb {Z}/N\mathbb {Z})\). The (normaliser of) split and nonsplit Cartan modular curves are defined by
Similarly we define \(X_{{{\,\mathrm{ns}\,}}}(N)\) and \(X_{\mathrm {s}}(N)\) to be the quotients of X(N) by \(C_{{{\,\mathrm{ns}\,}}}(N)\) and \(C_{\mathrm {s}}(N)\) respectively. Since \(C_{{{\,\mathrm{ns}\,}}}(N)\) and \(C_{\mathrm {s}}(N)\) contain the centre of \({\text {GL}}_2 (\mathbb {Z}/N\mathbb {Z})\) and their determinant maps surject onto \(({\mathbb {Z}}/N{\mathbb {Z}})^*\), the curves \(X_{{{\,\mathrm{ns}\,}}}(N)\), \(X_{\mathrm {s}}(N)\) and their Atkin–Lehner quotients are all geometrically connected and defined over \({\mathbb {Q}}\).
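The two group-theoretic facts used here (both Cartan subgroups contain the centre, and their determinants surject onto \(({\mathbb {Z}}/N{\mathbb {Z}})^*\)) can be checked by brute force for a small prime. The Python sketch below does this for \(N=7\), under the assumption that the non-split Cartan is realised as the nonzero elements of \(\mathbb {F}_N [\phi ]\) with \(\phi ^2 =\varepsilon \) a non-square, and the split Cartan as the invertible diagonal matrices. These explicit models and all function names are our choices for illustration, not taken from the source.

```python
from itertools import product

def smallest_nonsquare(N):
    """Smallest non-square in (Z/NZ)^* for an odd prime N."""
    squares = {(x * x) % N for x in range(1, N)}
    return next(e for e in range(2, N) if e not in squares)

def nonsplit_cartan(N, eps):
    """All matrices a*I + b*phi with phi = ((0, eps), (1, 0)) and (a, b) != (0, 0)."""
    return [((a, (b * eps) % N), (b, a))
            for a, b in product(range(N), repeat=2) if (a, b) != (0, 0)]

def split_cartan(N):
    """All invertible diagonal matrices over Z/NZ."""
    return [((a, 0), (0, d)) for a, d in product(range(1, N), repeat=2)]

def det(mat, N):
    """Determinant of a 2x2 matrix, reduced modulo N."""
    (a, b), (c, d) = mat
    return (a * d - b * c) % N

if __name__ == "__main__":
    N = 7
    units = set(range(1, N))
    groups = (("non-split", nonsplit_cartan(N, smallest_nonsquare(N))),
              ("split", split_cartan(N)))
    for name, group in groups:
        dets = {det(g, N) for g in group}
        has_centre = all(((a, 0), (0, a)) in group for a in units)
        # Print: name, group order, determinant surjective?, contains scalars?
        print(name, len(group), dets == units, has_centre)
```

In this model the non-split Cartan has order \(N^2 -1\) and the split Cartan has order \((N-1)^2\).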
Non-cuspidal points of \(X_{\mathrm {s}}(N)\) (in characteristic not dividing N) correspond to elliptic curves E together with a pair \(C_1 ,C_2 \) of cyclic subgroups of E of order N generating E[N]. We have an isomorphism
which sends a point \((f:E\rightarrow E' )\) to \((E'',C_1 ,C_2 )\), where \(E'' :=E/(N\cdot {\text {Ker}}(f))\), \(C_1 \) is the image of \({\text {Ker}}(f)\) in \(E''\), and \(C_2 \) is the image of E[N] in \(E''\).
The curve \(X_{\mathrm {s}}(N)\) is naturally a degree two cover of \(X_{\mathrm {s}}^+ (N)\), and there is an isomorphism \(X_{\mathrm {s}}^+ (N)\simeq X_0 ^+ (N^2 )\) compatible with (21).
4.1 Jacobians of modular curves and the asymptotics of the quadratic Chabauty condition
We recall a formula for the Picard numbers and ranks of modular Jacobians and their quotients, due to Siksek [59]. Let \({\mathcal {B}}_{N^k} \) denote a normalised eigenbasis for the space of newforms in \(S_2 (\varGamma _0 (N^k ))\). Let \({\mathcal {B}}_{N^k }/{{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\) denote a choice of representatives of the orbits of \({{\mathcal {B}}}_{N^k}\) under \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\). We denote by \({\mathcal {B}}_{N^k }^+\) the subset of \({\mathcal {B}}_{N^k }\) with Atkin–Lehner eigenvalue 1 for \(w_{N^k }\). The Jacobians \(J_0 (N^k )^{\mathrm {new}}\) and \(J_0 ^+ (N^k )^{\mathrm {new}}\) admit \({\mathbb {Q}}\)-isogenies
where \(A_f \) denotes the \({\mathbb {Q}}\)-simple abelian variety associated to f by the Eichler–Shimura correspondence (which is independent of the choice of representative of the orbit). Because \(X_s^+(N)\) is isomorphic to \(X_0^+(N^2)\) as we have seen above,
and by a theorem of Chen [16, Theorem 1], we also have a \({\mathbb {Q}}\)-isogeny
The following lemma says that one would not expect to be able to use Chabauty’s method to understand \(X({\mathbb {Q}})\).
Lemma 7
Let \(X=X_0 ^+ (N)\) or \(X_{{{\,\mathrm{ns}\,}}}^+ (N)\). Then the weak Birch–Swinnerton-Dyer conjecture implies \(X({\mathbb {Q}}_p )_1 =X({\mathbb {Q}}_p )\).
Proof
The weak Birch–Swinnerton-Dyer conjecture implies that, for \(f\in {\mathcal {B}}_{N^k}\), \(A_f\) will have positive rank whenever f has positive analytic rank. Since \(f\in {\mathcal {B}}_{N^k}\) has odd analytic rank whenever \(w_{N^k}(f)=1\), and \(A_f\) is simple over \({\mathbb {Q}}\), the Birch–Swinnerton-Dyer conjecture hence implies that every isogeny factor of \({\text {Jac}}(X)\) (over \({\mathbb {Q}}\)) has positive rank.
Since \({\text {End}}(A_f )\) is an order in the totally real field \(K_f\), every isogeny factor of \({\text {Jac}}(X)\) has rank at least equal to its dimension. To prove the lemma, we must show that the image of \(A_f ({\mathbb {Q}})\) in \(\mathrm {Lie}(A_f )_{{\mathbb {Q}}_p }\) under the p-adic logarithm map generates \(\mathrm {Lie}(A_f )_{{\mathbb {Q}}_p }\) as a \({\mathbb {Q}}_p \)-vector space. This is equivalent to the statement that the image of \(A_f ({\mathbb {Q}})\) in \(\mathrm {Lie}(A_f )_{\mathbb {C}_p }\) generates the latter as a \(\mathbb {C}_p \)-vector space. Since \(\mathrm {Lie}(A_f )_{{\overline{{\mathbb {Q}}}}}\) decomposes as a sum of one-dimensional isotypic components \(\mathrm {Lie}(A_f )_{{\overline{{\mathbb {Q}}}},g}\), for g conjugate to f, and the p-adic logarithm is \({\text {End}}(A_f )\)-equivariant, we deduce that if the image of \(A_f ({\mathbb {Q}})\) does not span \(\mathrm {Lie}(A_f )_{\mathbb {C}_p }\) then there is a g conjugate to f such that the image of \(A_f ({\mathbb {Q}})\) in \(\mathrm {Lie}(A_f )_{\mathbb {C}_p ,g}\) is zero. By the p-adic analytic subgroup theorem [49, Theorem 1], [25, Theorem 2.2] if \(P\in A_f ({\overline{{\mathbb {Q}}}} )\) has the property that \(\log (P)\in \mathrm {Lie}(A_f )_{\mathbb {C}_p }\) lies in a proper subspace defined over \({\overline{{\mathbb {Q}}}}\), then P lies in a proper commutative sub-variety \(B\subset A_{f,{\overline{{\mathbb {Q}}}}}\). Hence we deduce that if \(A_f ({\mathbb {Q}})\) does not generate \(\mathrm {Lie}(A_f )_{\mathbb {Q}_p }\), then \(A_f ({\mathbb {Q}})\) lies in a proper commutative subvariety of \(A_{f,{\overline{{\mathbb {Q}}}}}\), since the isotypic components of \(\mathrm {Lie}(A_f )_{\mathbb {C}_p }\) are defined over \({\overline{{\mathbb {Q}}}}\).
We claim that this contradicts the Birch–Swinnerton-Dyer conjecture. More generally, if A is a simple abelian variety over \({\mathbb {Q}}\) and \(\pi :A_K \rightarrow B\) is a non-zero morphism of abelian varieties over a finite Galois extension \(K/{\mathbb {Q}}\), we claim that \(P\in A({\mathbb {Q}})\) is torsion if and only if its image in B(K) is torsion (in particular, when \(A=A_f \) and B is an isogeny factor, we deduce that \(A_f\) has rank zero over \({\mathbb {Q}}\) if and only if there is an isogeny factor B of \(A_{f,{\overline{{\mathbb {Q}}}}}\) such that the image of \(A_f ({\mathbb {Q}})\) in B is torsion). To see this claim, for \(\sigma \in {\text {Gal}}(K/{\mathbb {Q}})\) let \(\pi ^{\sigma }\) denote the conjugate homomorphism \(A_K \rightarrow B^{\sigma }\). If \(\pi (P)\) is torsion then \(\pi ^{\sigma }(P)=\pi (P)^{\sigma }\) is torsion for all \(\sigma \), hence the image of P under the map
is torsion. However, this map descends to a non-zero morphism over \({\mathbb {Q}}\), and hence by simplicity of A, if \(\pi (P)\) is torsion then P is torsion. \(\square \)
Moreover, two abelian varieties \(A_f\), \(A_g\) for \(f,g \in {{\mathcal {B}}}_{N^k}\) are non-isogenous unless f and g are conjugate by \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\), and \({\text {End}}^{\dagger }(A_f)\) is always totally real of rank \(\dim (A_f )\), which proves that each of the Jacobians \(J = J_0^+(N),J_{\mathrm {s}}^+(N), J_\mathrm{{ns}}^+(N)\) satisfies \(\rho (J) = \dim J\), and hence the condition (2) becomes
(for a more general such condition for modular curves, see the main result of [59]). Using the isogenies above, the Birch–Swinnerton-Dyer conjecture implies
There is a whole literature on analytic estimates for these types of analytic ranks. In particular, using [45, Theorem 1.4] one can show that the Birch–Swinnerton-Dyer conjecture implies that
and in particular asymptotically that (2) is always satisfied. It is likely that the same result can be obtained for \(J_\mathrm{{ns}}^+(N)\), but the square level (we are looking at \(J_0^+(N^2)^\mathrm{{new}}\)) raises serious technical difficulties for analytic estimates of second moments used there.
On the other hand, by Corollary 4, Theorem 2 implies that we have an isogeny factor A of J satisfying \(\rho (A)>1\) and \({{\,\mathrm{rk}\,}}(A)=\dim (A)\), hence to prove Proposition 3 it suffices to construct a nonzero \([L] \in {\text {Ker}}({{\,\mathrm{NS}\,}}(A)\rightarrow {{\,\mathrm{NS}\,}}(X))\) satisfying \(\theta _{X,\pi _A ,\pi _B}([L])=0\), where B is the isogeny factor consisting of modular abelian varieties associated to modular forms whose L-functions have analytic rank greater than 1. It will be shown that for any L, its image \(\theta _{X,\pi _A ,\pi _B}(L)\) can be represented by a divisor supported on cusps and Heegner points, and hence is torsion by the generalised Gross–Zagier formula ([67, Theorem 6.1]). This motivates the following definition.
Definition 4
(Heegner quotient) Let \(M=N\) or \(N^2\). The Heegner quotient A of \(J_0(M)^\mathrm{{new}}\) is the product
and its complement is
(so that \(A \times B\) is isogenous to \(J_0^+(M)^\mathrm{{new}}\), not the full \(J_0(M)^\mathrm{{new}}\)).
In particular, Corollary 4 implies that \({{\,\mathrm{rk}\,}}(A) = \dim (A)\) (assuming the Birch–Swinnerton-Dyer conjecture, it is the largest factor of \(J_0^+(M)\) with this property) and the generalised Gross–Zagier formula implies that all images of traces of Heegner points on \(X_0 (N)\) in B are torsion (see Sects. 4.2 and 4.3). In the case of \(X_{{{\,\mathrm{ns}\,}}}(N)\), there is also a notion of Heegner point due to Kohen and Pacetti, inspired by the points used in Zhang’s Gross–Zagier formula for \(X_{{{\,\mathrm{ns}\,}}}(N)\) (and more general Shimura curves).
The main result of the next section is the following lemma, which refers to \(X_0 (N)\) and \(X_{{{\,\mathrm{ns}\,}}}(N)\) rather than their Atkin–Lehner quotients. However, by Lemma 6 it implies Proposition 3.
Lemma 8
Let \(X=X_0 (N)\) or \(X_{{{\,\mathrm{ns}\,}}}(N)\), and A, B the Heegner quotient and its complement as defined above, endowed with the natural projections \((\pi _A ,\pi _B ):{\text {Jac}}(X)\rightarrow A\times B.\) Then for all [L] in \({\text {Ker}}(d_{\pi _A })\), \( \theta _{X,\pi _A ,\pi _B }([L]) \) is torsion. In particular the rank of the kernel of \(\theta _{X,\pi _A,\pi _B}\) is maximal (and hence at least 1 if \(\dim A \ge 2\)).
4.2 How to prove (C) using Heegner points under the analytic hypothesis: \(X=X_0 (N)\)
In this section we prove Lemma 8. We will deduce it from the Gross–Zagier–Zhang theorem. In the case of \(X_0 (N)\), as explained in [20] or [19], we could also deduce it from the Yuan–Zhang–Zhang formula for the height of diagonal cycles (see Sect. 4.4). By a Heegner point on \(X_0 (N)\) we will mean a point
on \(Y_0 (N)\) such that E and \(E'\) have CM by the same order of an imaginary quadratic field K, not necessarily maximal but assumed to be with conductor prime to N (see [28] for a review of their properties, in particular N has to be split or ramified in K).
An eigenform \(f \in S_2 (\varGamma _0 (N))^{+,\text {new}}\) defines by Eichler-Shimura theory a \({\mathbb {Q}}\)-simple quotient \(\pi :J_0(N) \rightarrow A_f\) of \(J_0(N)\) (in fact of \(J_0 ^+ (N)\)) and the Heegner points behave on \(A_f\) in the following way.
Lemma 9
1. If \(L'(f,1)\ne 0\), then \({{\,\mathrm{rk}\,}}(A_f)= \dim (A_f )\) (and \(A_f({\mathbb {Q}})\) is generated by the projection of a trace of a suitable choice of Heegner point).
2. If \(L'(f,1)=0\), then for any P in \({\text {Div}}^0 (X_0(N))({\overline{{\mathbb {Q}}}} )^{{\text {Gal}}({\overline{{\mathbb {Q}}}}|{\mathbb {Q}})}\) supported on the set of Heegner points, the image \(\pi (P)\) is torsion in \(A_f ({\mathbb {Q}})\).
Remark 7
The original Gross–Zagier formula [32, Theorem I.6.3] is not sufficient for the second part of the Lemma, as it only deals with Heegner points for which the discriminant of the order is squarefree (in particular, the order is maximal) and prime to N, which we cannot afford to assume here. This is why we need Zhang’s formula and the ensuing technical interpretation.
Proof
The first part is given by Proposition 8. The second part is a consequence of the generalised Gross–Zagier formula of Zhang [67, Theorem 6.1], which in this case is made completely explicit in [15, Theorem 1.1]; see also [15, Example after Theorem 1.5]. We use the following notation: \(f \in S_2(\varGamma _0(N))\) is a normalised eigenform, K an imaginary quadratic number field in which N is not inert, c prime to N, \({{\mathcal {O}}}_c = {\mathbb {Z}}+ c {{\mathcal {O}}}_K\), and \(1_c\) the trivial ring class character on \({\text {Pic}}({{\mathcal {O}}}_c)\). We denote by \(H_c\) the ring class field of K with conductor c. If P is a Heegner point on \(X_0(N)\) with CM by \({{\mathcal {O}}}_c\), it belongs to \(X_0(N)(H_c)\), and we define
On the other hand, if \( J(H_c) \otimes {\mathbb {C}}\) denotes the extension of scalars of \(J(H_c )\) endowed with the extended Néron-Tate height, we have the decomposition into isotypical components
where g runs over all eigenforms of weight 2 of \(J_0(N)\), so that \(J_0(N)_g\) is exactly the isotypical part where \(T_n\) acts by multiplication by \(a_n(g)\). We denote by \(P_{1_c}^f\) the projection of \(P_{1_c}\) on the f-isotypical component. The statement of [15, Theorem 1.1] then says (which is sufficient for us) that \(L'(f,1_c,1)\) as defined there is proportional (by an explicit nonzero factor) to the extended Néron-Tate height of \(P_{1_c}^f\).
We have the equality of L-functions
with \(1_K\) the trivial class character on \({\text {Pic}}({{\mathcal {O}}}_K)\) and \(\chi _K\) the Dirichlet character associated to K. In particular (and given the signs of functional equations on the right), our hypothesis \(L'(f,1)=0\) guarantees that \(L(f,1_K,s)\) vanishes with order at least 2 at 1, so the left-hand side of [15, Theorem 1.1] is zero for \(c=1\). This also holds for any c prime to N, because by construction \(L(f,1_{c},s)\) is a multiple of \(L(f,1_K,s)\) around 1 (given the definition again). We have thus proved that \(P_{1_c}^f\) is zero in \(J_0(N)(H_c) \otimes {\mathbb {C}}\).
Now, the group \({\text {Aut}}({\mathbb {C}})\) acts on \(J_0(N)(H_c) \otimes {\mathbb {C}}\) by the identity on the left factor and the natural action on the right factor. For every \(\alpha \in {\text {Aut}}({\mathbb {C}})\) acting in this way, we have \(P_{1_c}^\alpha = P_{1_c}\), and hence \((P_{1_c}^g)^{\alpha } = P_{1_c}^{\alpha (g)}\), where \(\alpha (g)\) is the eigenform obtained by conjugating the coefficients of g (see [32, Corollary V.1.2]). Now, as we also have the decomposition
into subrepresentations of the Hecke algebra, the sum of all \(P_{1_c}^g\) for g conjugate to f is proportional to the projection \(\pi \) of the trace of \(P- (\infty )\) (belonging to \(J_0(N)(K)\)) in \(A_f(K) \otimes {\mathbb {C}}\), so we have proven that this projection in \(A_f(K)\) is torsion. \(\square \)
We now explain how to deduce Lemma 8 from this result. Let m be an integer coprime to N. Define the Hecke correspondence \({\widetilde{C}}_{m}\) to be the image of \(X_0 (mN)\) in \(X_0 (N)\times X_0 (N)\) under the product of the two natural maps \(X_0 (mN) \rightarrow X_0 (N)\). We define
to be the projection of \({\widetilde{C}}_{m}\) onto the \({\text {End}}(J_0 (N))\) component of \({\text {Pic}}(X_0 (N)\times X_0 (N))\) (see (8)). Then \(C_{m}\) lands in the subspace \({{\,\mathrm{NS}\,}}(J_0(N)) \subset {\text {End}}(J_0 (N))\) of endomorphisms symmetric with respect to the Rosati involution. When m is square-free, \(C_{m}\) is the Hecke operator \(T_{m}\). In general, \(C_{m}\) is a linear combination of \(T_{m/d}\) for d divisors of m.
Recall that \(i_{1,2} :X_0 (N)\hookrightarrow X_0 (N)\times X_0 (N)\) denotes the diagonal morphism. A non-cuspidal point in the support of \(i_{1,2} ^* ({\widetilde{C}}_{m} )\) is a cyclic N-isogeny \(f:E_1 \rightarrow E_2 \), together with cyclic subgroups \(G_i\) of \(E_i \) of order m such that \(f(G_1 )=G_2 \), and isomorphisms
which commute with f and the induced isogeny \(E_1 /G_1 \rightarrow E_2 /G_2 \). In particular, the ring of endomorphisms of each \(E_i\), of discriminant denoted by \(D_i\), thus contains an element of norm m so there exist \(A_i ,B_i \) in \(\mathbb {Z}\) for which
The isogeny being cyclic, \(A_i\) and \(B_i\) must be coprime here. The point \(E_1 \rightarrow E_2 \) is a Heegner point of \(Y_0(N)\) if and only if \(D_1 =D_2 \).
Lemma 10
Let \(X=X_0 (N)\), let m be prime to N, and let \({\widetilde{C}}_m\) be the Hecke correspondence defined above. Then the divisor \(i_{1,2} ^* {\widetilde{C}}_{m}\) is supported on the set of Heegner points whenever m is less than \(N^2 /4\).
Proof
Let \((E_1 \rightarrow E_2 )\) be a non-cuspidal point in the support of \(i_{1,2}^* {\widetilde{C}}_m\) as above. Suppose the point is not Heegner. Since \(E_1 \) and \(E_2 \) are N-isogenous, \(D_2 = \lambda ^2 D_1\) for some rational \(\lambda >0\) a power of N. Since \(\lambda \ne 1\), we must have \(D_i\) divisible by \(N^2\) for some i, and hence \(m>N^2 /4\), by (24). Finally, if the conductor of the order was not prime to N, we would also have \(N^2 | D_i\) which leads to the same inequality. \(\square \)
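The numerical content of this argument can be illustrated by a small search. The Python sketch below assumes that relation (24) has the standard norm form \(4m=A_i ^2 -D_i B_i ^2 \) with \(D_i <0\) and \(B_i \ne 0\); this form is our assumption, since only the reference to (24) is available here, and the function name `norm_representations` is ours. It lists the discriminants divisible by \(N^2\) that admit an element of norm m, and the list is empty as long as \(4m<N^2\).

```python
import math

def norm_representations(N, m):
    """Triples (D, A, B) with D < 0, N^2 | D, D = 0 or 1 mod 4, B >= 1, A >= 0
    and A^2 - D*B^2 = 4*m (assumed shape of relation (24))."""
    found = []
    k = 1
    while N * N * k <= 4 * m:
        D = -N * N * k
        if D % 4 in (0, 1):  # D must be a quadratic discriminant
            B = 1
            while -D * B * B <= 4 * m:
                rest = 4 * m + D * B * B  # candidate value of A^2
                A = math.isqrt(rest)
                if A * A == rest:
                    found.append((D, A, B))
                B += 1
        k += 1
    return found

if __name__ == "__main__":
    N = 5
    # No representation can exist while 4*m < N^2.
    print([m for m in range(1, 30) if norm_representations(N, m)])
```

The sketch does not impose the coprimality of \(A_i\) and \(B_i\); it only illustrates that divisibility of the discriminant by \(N^2\) forces m to be large.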
By the following Lemma (essentially just the Sturm bound) we have enough Hecke operators \(C_m\) for which \(i_{1,2}^* C_m\) is supported on cusps and Heegner points to complete the proof of the first part of Lemma 8.
Lemma 11
Let N be a prime. Then, any element of \({\text {End}}^\dagger (J_0 ^+ (N))^{{{\,\mathrm{tr}\,}}=0}\), viewed as a subspace of \({\text {End}}^\dagger (J_0 (N))^{{{\,\mathrm{tr}\,}}=0}\), can be written as a \({\mathbb {Z}}\)-linear combination of endomorphisms associated to the Hecke correspondences \(C_m\), for \(m<N^2 /4\) prime to N.
Proof
By the Sturm bound ([62, Theorem 9.18]), the set of Hecke operators \(T_m\) for \(m<N^2 /4\) spans the Hecke algebra of endomorphisms of \(J_0(N)\). Since \(a_N (f)=-1\) on newforms such that \(f_{|w_N} = -f\), the set of Hecke operators \(T_m\) for \(m<N^2 /4\) prime to N spans the Hecke algebra of endomorphisms of \(J_0 ^+ (N)\) (which is the full endomorphism algebra over \({\mathbb {Q}}\)). \(\square \)
This completes the proof of case (1) of Proposition 3. Indeed, Lemma 11 implies that any nice correspondence Z on \(X_0 (N)\) can be written as a linear combination of the \(C_m \) for \(m<N^2 /4\) prime to N. By Lemma 10, for any such Z, \(D_Z (b)\) is supported on Heegner points and cusps, so by Lemma 9 (part 2), its image by \(\pi _B\) is torsion.
4.3 How to prove (C) using Heegner points under the analytic hypothesis: \(X=X_{{{\,\mathrm{ns}\,}}}^+ (N)\)
The second case is similar to the first, but we must replace the classical notion of Heegner point with Heegner points on non-split Cartan modular curves in the sense of Zhang/Kohen–Pacetti, and replace Gross–Zagier–Zhang on \(X_0 (N)\) with Zhang’s Gross–Zagier theorem on \(X_{{{\,\mathrm{ns}\,}}}(N)\).
To make results easier to state, we use the moduli interpretation of \(X_\mathrm{{ns}}(N)\) and \(X_\mathrm{{ns}}^+(N)\) given in [42] and its consequences. To do so, one fixes an \(\varepsilon \in \mathbb {F}_N\) which is not a square. A pair \((E,\phi _\varepsilon )\) is then an elliptic curve E together with an endomorphism \(\phi _\varepsilon \) of E[N] whose square is multiplication by \(\varepsilon \). Such an endomorphism has eigenvalues in \(\mathbb {F}_{N^2} \backslash \mathbb {F}_N\), and two pairs \((E,\phi _\varepsilon )\) and \((E',\phi _\varepsilon ')\) are isomorphic if there is an isomorphism \(\psi : E \rightarrow E'\) such that on E[N], \(\psi \circ \phi _\varepsilon = \phi _{\varepsilon }' \circ \psi \).
\(X_\mathrm{{ns}}(N)\) is the compactified moduli space of such pairs up to isomorphism [42, §1.2]. Furthermore, the natural involution on this modular curve is given by \((E,\phi _\varepsilon ) \mapsto (E, - \phi _\varepsilon )\).
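The linear algebra behind this moduli description can be checked on a toy example. In the Python sketch below, \(\phi _\varepsilon \) is represented by the matrix \(\left( {\begin{matrix} 0 &{} \varepsilon \\ 1 &{} 0 \end{matrix}}\right) \) in some basis of E[N]; this choice of matrix, the values \(N=7\), \(\varepsilon =3\), and the function names are assumptions made for illustration only. The sketch verifies that \(\phi _\varepsilon ^2 =\varepsilon \), that \(-\phi _\varepsilon \) satisfies the same relation (so the involution is well defined on such pairs), and that \(\phi _\varepsilon \) has no eigenvalue in \(\mathbb {F}_N\).

```python
def mat_mul(x, y, N):
    """Product of two 2x2 matrices with entries reduced modulo N."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % N for j in range(2))
        for i in range(2)
    )

def has_eigenvalue_in_base_field(eps, N):
    """True if x^2 = eps has a solution in F_N, i.e. phi has an eigenvalue in F_N."""
    return any((x * x - eps) % N == 0 for x in range(N))

if __name__ == "__main__":
    N, eps = 7, 3                      # 3 is a non-square modulo 7
    phi = ((0, eps), (1, 0))           # assumed matrix of phi_eps on E[N]
    minus_phi = ((0, (-eps) % N), ((-1) % N, 0))
    scalar_eps = ((eps, 0), (0, eps))
    print(mat_mul(phi, phi, N) == scalar_eps)              # phi^2 = eps
    print(mat_mul(minus_phi, minus_phi, N) == scalar_eps)  # (-phi)^2 = eps
    print(has_eigenvalue_in_base_field(eps, N))            # eigenvalue in F_N?
```

Since the characteristic polynomial of this matrix is \(x^2 -\varepsilon \), the absence of a root in \(\mathbb {F}_N\) is exactly the statement that the eigenvalues lie in \(\mathbb {F}_{N^2} \backslash \mathbb {F}_N\).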
First, we define Hecke correspondences \(\widetilde{C_m} \subset X_{{{\,\mathrm{ns}\,}}}(N)\times X_{{{\,\mathrm{ns}\,}}}(N)\) (for m prime to N) as follows. We have a curve \(X_{{{\,\mathrm{ns}\,}}}(N,m)=X_{{{\,\mathrm{ns}\,}}}(N)\times _{X(1)}X_0 (m)\) given by adding an auxiliary \(\varGamma _0(m)\) structure. We have two maps \(X_{{{\,\mathrm{ns}\,}}}(N,m)\rightarrow X_{{{\,\mathrm{ns}\,}}}(N)\), the forgetful one, and the one sending \((E, \phi _\varepsilon ,C)\) to \((E/C,\overline{\pi _C} \circ \phi _\varepsilon \circ \overline{\pi _C}^{-1})\) where C is a cyclic subgroup of order m, \(\pi _C : E \rightarrow E/C\) the natural projection, and \(\overline{\pi _C}\) the induced map \(E[N] \rightarrow (E/C)[N]\). Furthermore, Chen morphisms between \(J_\mathrm{{ns}}(N)\) and \(J_0(N^2)\) are equivariant with respect to the Hecke actions [42, Theorem 1.11].
We will again use Zhang’s generalised Gross–Zagier formula [67], in a slightly different context. We follow the notation of [67, §6]. Let \(K/{\mathbb {Q}}\) be an imaginary quadratic field inert at N (instead of split or ramified as in the previous case), and let \(K\hookrightarrow M_2({\mathbb {Q}})\) be an embedding associated to an integral basis of \({{\mathcal {O}}}_K\). For a choice of order \({{\mathcal {O}}}_c\) of K of conductor c prime to N, define
(notice the index of \(N {{\mathcal {O}}}_K\) is \(N^2\)). The Shimura variety \(M_{U_c}\) is then uniformised as
where \(U_c\) can be defined as \({\text {GL}}_2({\mathbb {Z}}_v)\) for places v not dividing N, and \((R_c \otimes {\mathbb {Z}}_N)^* \subset {\text {GL}}_2({\mathbb {Z}}_N)\) at N. Note that \({\text {GL}}_2({\mathbb {Q}})_+ \cdot U_c ={\text {GL}}_2(\mathbb {A}_f )\) and \({\text {GL}}_2({\mathbb {Q}})_+ \cap U_c \subset {\text {SL}}_2({\mathbb {Z}})\) contains the subgroup \(\varGamma (N)\) of \({\text {SL}}_2({\mathbb {Z}})\) of all matrices congruent to the identity modulo N, and the quotient is a conjugate of \(C_{{{\,\mathrm{ns}\,}}}(N) \cap {\text {SL}}_2({\mathbb {Z}}/N{\mathbb {Z}})\), where the precise choice of \(C_\mathrm{{ns}}(N)\) comes from the reduction modulo N of \({{\mathcal {O}}}_c\) inside \(M_2({\mathbb {Z}}/N{\mathbb {Z}})\) given by the embedding (it is nonsplit precisely because N is inert in \({{\mathcal {O}}}_c\)). This gives an isomorphism
The CM points on \(M_{U_c}\) in the sense of Zhang are then the double cosets of pairs \((h_0 ,i_c )\), where \(h_0 \) is fixed by the image T of the torus \(K^\times \), and \(i_c\) has the property that
in other words the nonsplit Cartan structure of level N is the one determined by the endomorphism ring of the CM elliptic curve.
On the other hand, we say that \((E,\phi _\varepsilon ) \in Y_\mathrm{{ns}}(N)\) is a Heegner point (in the sense of Kohen–Pacetti) with multiplication by \({{\mathcal {O}}}_c\) if \({\text {End}}(E) \cong {{\mathcal {O}}}_c\) (with c prime to N) and \(\phi _\varepsilon \) comes from an endomorphism \(\beta \) of E. Note that this implies that N is inert in \({{\mathcal {O}}}_c\), since the minimal polynomial of \(\beta \) modulo N is then irreducible.
This discussion thus implies the following equivalence of definitions.
Lemma 12
Under the identification \(M_{U_c} \simeq Y_{{{\,\mathrm{ns}\,}}}(N)\) for every order \({{\mathcal {O}}}_c\) of conductor c prime to N, Zhang’s CM points correspond to Heegner points with CM by \({{\mathcal {O}}}_c\) in \(Y_\mathrm{{ns}}(N)\) in the sense of Kohen–Pacetti.
Let f be an eigenform in \(S_2 (\varGamma _0 (N^2 ))^{+,\mathrm {new}} \). It can be seen as an automorphic form on an \(M_{U_c}\) as above, using the isomorphism of Hecke modules \(S_2 (\varGamma _0 (N^2 ))^{+,\mathrm {new}} \cong S_2(\varGamma _\mathrm{{ns}}^+(N))\) and the isomorphism \(M_{U_c}({\mathbb {C}}) \cong Y_\mathrm{{ns}}(N)_{\mathbb {C}}\). By Eichler–Shimura theory, we again have a \({\mathbb {Q}}\)-simple quotient \(A_f\) of \(J_\mathrm{{ns}}^+(N)\).
The consequence of Zhang’s result that we will use is the following.
Theorem 3
([67], Theorem 6.1) With notation as above, let \(1_c\) be the trivial character of \({\text {Gal}}(H_c /K)\) and P a Heegner point on \(Y_\mathrm{{ns}}(N)\) with CM by \({{\mathcal {O}}}_c\) in the sense of Kohen–Pacetti. Denote by \(P_{1_c}\) the projection of \(P - \xi \) (\(\xi \) the Hodge class) in \(J_{{{\,\mathrm{ns}\,}}}(N)(K) = J_{{{\,\mathrm{ns}\,}}}(N)(H_c )^{1_c}\). Let \(P_{1_c}^f\) be the projection of \(P_{1_c}\) onto the f-isotypical component of \(J_{{{\,\mathrm{ns}\,}}}(N)(H_c )\otimes \mathbb {C}\).
If \(L'(f,1)=0\), then \(P_{1_c}^f=0\) and \(\pi _f (P_{1_c})\) is torsion in \(A_f(H_c)\).
Proof
Using the previous lemmas and discussion, we can translate everything in terms of the Shimura curve \(M_{U_c}\): the Heegner point P becomes a CM point in the sense of Zhang and f becomes an automorphic representation \(\phi \). These changes are compatible with Hecke operators and Galois actions, so they preserve the decompositions into isotypical components above. We can then proceed along the same lines as the proof of Lemma 9 part 2 to deduce the conclusion from Zhang’s theorem. \(\square \)
We are now ready to prove the analogue of Lemma 10 with \(X_0 (N)\) replaced by \(X_{{{\,\mathrm{ns}\,}}}^+(N)\).
Lemma 13
Let \(X=X_{{{\,\mathrm{ns}\,}}} (N)\), let m be prime to N, and let \({\widetilde{C}}_m\) be the Hecke correspondence defined above. Then the divisor \(i_{1,2} ^* {\widetilde{C}}_{m}\) is supported on Heegner points in the sense of Kohen–Pacetti and cusps whenever m is less than \(N^2 /4\).
Proof
By the moduli interpretation of \(X_\mathrm{{ns}}(N)\) and the Hecke correspondences, a noncuspidal point in the support of \(i_{1,2} ^* {\widetilde{C}}_{m}\) is a pair \((E,\phi _\varepsilon )\) such that there exists an endomorphism \(\alpha \) of E of norm m with cyclic kernel (of order m) such that, if \({\overline{\alpha }}\) is the induced endomorphism of E[N], then \({\overline{\alpha }} \circ \phi _\varepsilon \circ {\overline{\alpha }}^{-1} = \phi _\varepsilon \). This implies that \({\overline{\alpha }}\) belongs to the nonsplit Cartan subgroup associated to \(\phi _\varepsilon \) (which is also the group of invertible elements of \({\mathbb {Z}}[\phi _\varepsilon ]\)). We claim \({\overline{\alpha }}\) is not scalar: if it were, we could write \(\alpha = k + N \beta \) with \(k \in {\mathbb {Z}}\) and \(\beta \in {\text {End}}(E)\), and then the norm of \(\alpha \) being \(m <N^2/4\) forces \(\beta \) to be an integer as well, contradicting the assumption that \(\alpha \) has cyclic kernel.
From this, we deduce that \({\mathbb {Z}}[{\overline{\alpha }}] = {\mathbb {Z}}[\phi _\varepsilon ]\), as both are \({\mathbb {Z}}/N{\mathbb {Z}}\)-vector spaces of dimension 2 and the former is included in the latter. This implies that \(\phi _\varepsilon \) is induced by the action of an element of \({\mathbb {Z}}[\alpha ] \subset {\text {End}}(E)\) on E[N]. The ring of endomorphisms has conductor prime to N for the same reasons as in \(X_0(N)\), and its discriminant is automatically prime to N as discussed after defining Heegner points in the sense of Kohen–Pacetti. \(\square \)
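The step "norm \(m < N^2/4\) forces \(\beta \) to be an integer" in the proof above can be made concrete. The following is a minimal sketch under the assumption that \({\text {End}}(E)\) is an imaginary quadratic order of discriminant \(D<0\) with basis \(1, \omega \), \(\omega = (D+\sqrt{D})/2\); then \(4\,\mathrm{N}(x+y\omega ) = (2x+Dy)^2 + |D|y^2\), so any \(\alpha = k+N\beta \) with \(\beta \notin {\mathbb {Z}}\) has norm at least \(3N^2/4\). The helper names are hypothetical.

```python
def norm(x, y, D):
    """Norm of x + y*omega in the quadratic order of discriminant D < 0,
    where omega = (D + sqrt(D)) / 2, so 4*norm = (2x + D*y)^2 + |D|*y^2."""
    return x * x + D * x * y + (D * D - D) // 4 * y * y

def min_norm_with_nonintegral_beta(N, D, max_b=3):
    """Smallest norm of alpha = k + N*beta with beta = a + b*omega, b != 0.

    Writing alpha = x + y*omega, x = k + N*a is an arbitrary integer and
    y = N*b is a nonzero multiple of N.
    """
    best = None
    for b in range(1, max_b + 1):  # norm(x, -y) == norm(-x, y), so b > 0 suffices
        y = N * b
        center = (-D * y) // 2     # the parabola in x is minimal near -D*y/2
        for x in range(center - 2, center + 3):
            value = norm(x, y, D)
            if best is None or value < best:
                best = value
    return best

if __name__ == "__main__":
    for N in (5, 7, 11, 13):
        for D in (-3, -4, -7, -8, -11):
            print(N, D, min_norm_with_nonintegral_beta(N, D), N * N / 4)
```

In every case printed, the minimal norm exceeds \(N^2/4\), which is the inequality used to rule out scalar \({\overline{\alpha }}\).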
By the compatibility with Hecke correspondences on \(X_0 (N^2 )\) (which is a consequence of Chen’s theorem without quotient by Atkin–Lehner involutions, e.g. [60, Théorème 2]), Lemma 11 implies that any nice correspondence Z on \(X_\mathrm{{ns}}^+(N)\) can be written as a linear combination of \(C_m\) for \(m<N^2 /4\) prime to N. By Lemma 13, for any such Z, \(D_Z (b)\) is supported on Heegner points (in the sense of Kohen–Pacetti) and cusps. Hence, Zhang’s Gross–Zagier theorem (together with Manin–Drinfeld) implies \(\pi _B (D_Z (b))\) is torsion. Assuming the conclusions of Theorem 2 hold for M, the Heegner quotient A of \(J_0^+(M)^\mathrm{{new}}\) is of dimension at least 2 so \(\rho (A) \ge 2\). This completes the proof of case (2) of Proposition 3.
4.4 An alternative approach
In this subsection, we sketch an alternative and less ad hoc approach for proving Proposition 3 in the case \(X=X_0 ^+ (N)\), using the Theorem of Yuan–Zhang–Zhang on the heights of diagonal cycles.
Theorem 4
(Darmon–Rotger–Sols [19], Theorem 3.7) Let \(X=X_0 (N)\), and let f, g be non-conjugate eigenforms in \(S_2 (\varGamma _0 (N))\). Let \(Z\in {{\,\mathrm{NS}\,}}(J_0 (N))\) lie in the image of \({{\,\mathrm{NS}\,}}(A_g )\). Suppose \(\epsilon (f)=-1\) and \(\epsilon ({{\,\mathrm{Sym}\,}}^2 (g)\otimes f)=1\). If the projection of \(D_Z (b)\) to \(A_f\) is non-torsion, then \(L'(f,1)\ne 0\).
The result above holds for arbitrary N, but is most useful when N is prime, since in this case we have \(\epsilon (f\otimes g\otimes g)=-a_N (f)a_N (g)^2 =-a_N (f)\) (see e.g. [30]). Hence in this case Theorem 4 implies that the image of \(D_Z (b)\) in \(A_f\) is torsion for all eigenforms f in \(S_2 ^+ (\varGamma _0 (N))\), which gives an alternative proof for \(X_0 ^+ (N)\). One way to view Proposition 3 is that it shows that it is easier to prove diagonal cycles are torsion than it is to prove they are non-torsion. On the other hand, one can show directly that the image of \(D_Z (b)\) in \(A_f \) is torsion for all eigenforms f satisfying \(w_N (f)=-f\), as explained in [20, Theorem 3.3.8]: by Lemma 6, we have
Since \(w_N ^* (Z)=Z\), and \(w_N ^* \) acts as \(-1\) on \(A_f\), we deduce \(\pi _{f *}(D_Z (b))\) is torsion.
5 Proof of the analytic part
In this section, we prove Theorem 2 using techniques of analytic weighted averages, following the guiding principles of e.g. [37] and [24]. For convenience and consistency, the notation below is as close as possible to that from [47].
Notation
- N is a prime number and \(M = N\) or \(N^2\) in all of the following.
- If \(f,g \in S_2(\varGamma _0(M))\), we denote their Petersson scalar product by
$$\begin{aligned} \langle f,g \rangle _M = \int _{{\mathcal {D}}}\overline{f(x+iy)} g(x+iy) dx dy, \end{aligned}$$
where \({{\mathcal {D}}}\) is a fundamental domain of \(\varGamma _0(M)\), and the associated Petersson norm by \(\Vert \cdot \Vert _M\).
- For \(\varepsilon = \pm 1\), the space \(S_2(\varGamma _0(M))^\varepsilon \) refers to the subspace of modular forms f of \(S_2(\varGamma _0(M))\) such that \(f_{|w_M} = \varepsilon \cdot f\), where \(w_M\) is the Fricke involution of \(S_2(\varGamma _0(M))\). Note that in weight 2, this is the space of modular forms f such that L(f, s) has root number \(- \varepsilon \).
- For A, B linear forms on \(S_2(\varGamma _0(M))\) (resp. on a subspace indicated by superscripts), we write
$$\begin{aligned} \langle A,B \rangle _M = \sum _f \frac{\overline{A(f)} B(f)}{\Vert f\Vert _M^2}, \end{aligned}$$
where f goes through an orthogonal basis of \(S_2(\varGamma _0(M))\) (it is readily checked not to depend on this choice of basis), resp. of the prescribed subspace. We will add superscripts \(\{+,-,\mathrm {new},\mathrm {old}\}\) to refer to the sum restricted to an orthogonal basis of the corresponding subspaces of \(S_2(\varGamma _0(M))\).
- We denote by \(a_m\) (for \(m \in {\mathbb {N}}_{\ge 1}\)) and \(L'\) the linear forms on \(S_2(\varGamma _0(M))\) which to f associate respectively the m-th coefficient of the q-expansion of f, and \(L'(f,1)\) (defined properly in the next paragraph).
- The (positive) greatest common divisors of integers a, b or integers a, b, c are respectively denoted by (a, b) and (a, b, c).
- For any positive number B, \(O_1(B)\) refers to a complex number of absolute value \(\le B\).
The proof of Theorem 2 relies on the following lemma.
Lemma 14
Theorem 2 holds for M if
Proof
If \(\langle a_1,L'\rangle _M^{+,\text {new}} \ne 0\), by definition of this sum, there must be at least one normalised newform \(f \in S_2(\varGamma _0(M))^{+,\text {new}}\) such that \(L'(f,1) \ne 0\). As a byproduct of the Gross–Zagier formula ([32], Corollary V.1.3), this implies that \(L'(g,1) \ne 0\) for all normalised newforms g which are conjugates of f by \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\). Thus Theorem 2 holds for M unless the field of coefficients of f is \({\mathbb {Q}}\) and this f is unique, which we assume now. As f is normalised, its coefficients are algebraic integers, hence belong to \({\mathbb {Z}}\). Now, one has
by hypothesis, so \(a_2(f) \notin {\mathbb {Z}}\) which leads to a contradiction and Theorem 2 holds. \(\square \)
Remark 8
The statement of this lemma appears quite ad hoc so let us explain the main motivations behind it.
- As we will see later, as long as m is small compared to \(\sqrt{M}\), one has
$$\begin{aligned} \frac{ \langle a_m,L' \rangle _{M}^{+,\text {new}}}{4 \pi } = \ln (\sqrt{M}) + C - \ln (m) + O(m/\sqrt{M}) \end{aligned}$$
with explicit implied constants. This proves that the hypotheses of the lemma are indeed satisfied for large M.
- The error terms of the estimate above are smaller when the m’s are smaller, hence the choices of \(m=1\) and 2 for the ratio.
- There are far better asymptotic estimates on the number of newforms f in \(S_2(\varGamma _0(M))^{+,\text {new}}\) such that \(L'(f,1) \ne 0\): for example, by [45] (at least for \(M=N\) prime), the proportion of such forms is asymptotically at least 7/8, so in particular there are far more than just 2 for M large. These techniques, which also use estimates of second moments and of the norms \(\Vert f\Vert _M\), are harder to make explicit, and we suspect that the effective bounds obtained by following the arguments step by step would be huge. Lemma 14, while very crude (and giving a weaker result), is tailor-made to be efficient enough for precise estimates and approachable bounds.
5.1 Splitting of the terms to estimate the first moments
The starting point to estimate the weighted averages \(\langle a_m, L'\rangle _N^\mathrm{{new}}\) is the following trace formula of Petersson adapted by Akbary (and proven in greater generality in [47]).
Proposition 4
Let m, n, M be three positive integers, and \(\varepsilon = \pm 1\). Then, we have
where S is the notation for Kloosterman sums
(except for \(c=1\) where its value is 1 by convention), \(Q^{-1}\) means the inverse of Q modulo d in the Kloosterman sums and \(J_1\) is the Bessel function of the first kind and order 1.
The sums on the right-hand side are absolutely convergent thanks to the following well-known uniform bounds: \(|J_1(x)| \le |x|/2\) for all x, and the Weil bounds
with \(\tau \) the divisor-counting function, which improve, if M is a prime power dividing c, to
([36], (3.2), (3.3), Theorem 11.11 and Corollary 11.12).
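To get a feeling for the Weil bounds quoted above, here is a minimal numerical sketch. It assumes the classical Kloosterman sum \(S(m,n;c)=\sum _{x \in ({\mathbb {Z}}/c{\mathbb {Z}})^*} e^{2i\pi (mx+nx^{-1})/c}\) and the classical bound \(|S(m,n;c)| \le \tau (c)(m,n,c)^{1/2}c^{1/2}\); the sums appearing in Proposition 4 carry an additional twist by \(Q^{-1}\), which is not modelled here. The function names are hypothetical.

```python
import cmath
from math import gcd, pi, sqrt

def kloosterman(m, n, c):
    """Classical Kloosterman sum S(m, n; c), with value 1 for c = 1."""
    if c == 1:
        return 1 + 0j
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            x_inv = pow(x, -1, c)  # modular inverse, Python 3.8+
            total += cmath.exp(2j * pi * ((m * x + n * x_inv) % c) / c)
    return total

def num_divisors(c):
    """Divisor-counting function tau(c), by trial division."""
    return sum(1 for d in range(1, c + 1) if c % d == 0)

def weil_bound(m, n, c):
    """Right-hand side tau(c) * (m, n, c)^(1/2) * c^(1/2) of the Weil bound."""
    return num_divisors(c) * sqrt(gcd(gcd(m, n), c)) * sqrt(c)

if __name__ == "__main__":
    for c in (3, 7, 12, 49, 60):
        s = kloosterman(1, 2, c)
        print(c, round(abs(s), 6), round(weil_bound(1, 2, c), 6))
```

The sums are real up to rounding, and the printed absolute values stay below the bound, often by a wide margin; this slack is what Sect. 5.3 exploits.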
Now, our normalisation of the L-function associated to a form \(f \in S_2(\varGamma _0(M))\) is given by
and this L-series converges uniformly on any compact subset of \(\{{\text {Re}}(s)>2 \}\).
One can express \(L'(f,1)\) itself in terms of the Fourier coefficients of f in the following way.
Lemma 15
For any \(M \ge 1\) and any \(f \in S_2(\varGamma _0(M))^+\), one has
where \(E_1\) is the exponential integral function, defined on \(]0,+\infty [\) by
Proof
We define the completed L-function \(\varLambda \) associated to L by
By standard arguments (e.g. [14], section 1.5), this function extends to a holomorphic function on \({\mathbb {C}}\) and satisfies the functional equation
The expression of \(L'(f,1)\) is then deduced from the functional equation of \(\varLambda \) by integration of residues on vertical axes and Mellin transform (see e.g. [36] (26.10) where the definition of L is translated by 1/2). \(\square \)
With this formula and by uniform convergence of the terms involved, we obtain:
where
and
The main term in (29) will be \(E_1(2 \pi m /\sqrt{M})\) as long as \(m \ll \sqrt{M}\).
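The size of this main term is easy to explore numerically. The sketch below evaluates \(E_1\) by its standard convergent series \(E_1(x) = -\gamma - \ln x - \sum _{k \ge 1} (-x)^k/(k \cdot k!)\) (a classical expansion, not taken from the text) and compares \(E_1(2\pi m/\sqrt{M})\) with the leading part \(-\gamma - \ln (2\pi m/\sqrt{M})\), which has the shape \(\ln (\sqrt{M}) + \text {constant} - \ln (m)\). The function names are hypothetical, and the series is only intended for small arguments.

```python
from math import log, pi, sqrt

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x, terms=60):
    """E_1(x) for small x > 0 (say x <= 5), via the convergent power series."""
    if x <= 0:
        raise ValueError("E_1 is only defined here for x > 0")
    total = -EULER_GAMMA - log(x)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k      # term = (-x)^k / k!
        total -= term / k   # subtract (-x)^k / (k * k!)
    return total

def main_term(m, M):
    """E_1(2*pi*m / sqrt(M))."""
    return exp_integral_e1(2 * pi * m / sqrt(M))

def logarithmic_part(m, M):
    """Leading part -gamma - ln(2*pi*m/sqrt(M)) of the main term."""
    return log(sqrt(M)) - log(2 * pi) - EULER_GAMMA - log(m)

if __name__ == "__main__":
    for M in (1000, 45341, 269 ** 2):
        for m in (1, 2):
            x = 2 * pi * m / sqrt(M)
            gap = main_term(m, M) - logarithmic_part(m, M)
            print(M, m, round(main_term(m, M), 6), round(gap, 6), round(x, 6))
```

The gap between the two quantities is positive and smaller than \(2\pi m/\sqrt{M}\), in line with an error term of order \(m/\sqrt{M}\).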
The trace formula does not separate the old and new spaces, which we need for \(M=N^2\). This is taken care of in the following lemma.
Lemma 16
For N prime and \(m \ge 1\) not divisible by N,
Proof
By orthogonality of the new and old subspaces,
To prove the formula on the old part, we need to be a bit careful with the definitions of completed L-functions: although the definition of L(f, s) does not depend on the ambient space of modular forms, the definition of the completed L-function \(\varLambda (f,s)\) in (27) does. The degeneracy operators are denoted by \(A_n\) as in the original article [1]. Let
Notice that \((A_N W_{N^2} W_N^{-1})/N\) belongs to \(\varGamma _0(N)\), thus for \(f \in S_2(\varGamma _0(N))\) such that \(f_{|W_N} = \varepsilon _f \cdot f\), one has
hence also
Consequently, an orthogonal basis of \(S_2(\varGamma _0(N^2))^{+,\text {old}}\) (see the computations of section 4 of [47] for example) is given by the \(f_{|A_1} + (f_{|A_1})_{|W_{N^2}}\), where f goes through an eigenbasis of \(S_2(\varGamma _0(N))\). The aforementioned computations also prove with (32) that if \(f_{|W_N} = \varepsilon _f \cdot f\), then
If N does not divide m (so that \(a_m(f_{|A_N})=0\)), this implies that
where f goes through an orthonormal basis of \(S_2(\varGamma _0(N))\). Now, by the functional equation of \(\varLambda (f,s)\) in (28), \( \varLambda '(f_{|A_1},1) = \varLambda '((f_{|A_1})_{|W_{N^2}},1)\) but
The first equality is a direct application of the definition of \(\varLambda \), the second one uses that \(L(f_{|A_N},1) = L(f,1)\) (easy to show by the integral formula of L(f, 1)) and the results above. Thus, to compute \(L'(f_{|A_1} + (f_{|A_1})_{|W_{N^2}},1)\), it is enough to know the sum of the two right-hand terms, which is the sum of the two left-hand terms, which equal one another. Now, if \(\varepsilon _f=1\) then \(L(f,1)=0\) by the sign of the functional equation of \(\varLambda (f,s)\) (in level N here!), and if \(\varepsilon _f = -1\), \(\varLambda '(f,1) =0\). We thus obtain in this case
and get the lemma by summation over those forms f, gathered by sign of \(\varepsilon _f\). \(\square \)
5.2 First estimates
We recall that \(M=N\) or \(N^2\).
Lemma 17
Using the Weil bounds, we get for every c multiple of M and d prime to M:
where for every integer k, \(f(k) = \sum _{k'|k} \frac{1}{\sqrt{k'}}\). For \(m=2\) and c, d even, these estimates are improved to
Proof
In the definitions of \({{\mathcal {S}}}(c)\) (and similarly for \({{\mathcal {T}}}(d)\)), we separate the terms in n depending on the values of \((m,n,c) = m'\), which is a divisor of (m, c). Then, using \(|J_1(x)| \le |x|/2\), it only remains to control the sum of the \(E_1(2 \pi m'n/\sqrt{M})\) for n from 1 to \(+ \infty \), which after sum-integral comparison and a change of variables is smaller than \(\sqrt{M}/(2\pi m')\).
In the specific case where \(m=2\) and c or d even, the cases are made from the beginning on the values of \((m,n,c)^{1/2}\) instead of bounding by \((m,c)^{1/2}\), and a careful computation gives those bounds. \(\square \)
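The sum-integral comparison used in this proof can be checked numerically: since \(E_1\) is positive and decreasing with integral 1 on \(]0,+\infty [\), one expects \(\sum _{n \ge 1} E_1(2\pi m'n/\sqrt{M}) \le \sqrt{M}/(2\pi m')\). The sketch below is an illustration only; it evaluates \(E_1\) by its power series for \(x \le 1\) and by a standard continued fraction (modified Lentz scheme) for \(x > 1\), and all function names are hypothetical.

```python
from math import exp, log, pi, sqrt

EULER_GAMMA = 0.5772156649015329

def e1_series(x, terms=60):
    """E_1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!), for small x."""
    total = -EULER_GAMMA - log(x)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k
        total -= term / k
    return total

def e1_continued_fraction(x, max_iter=300, eps=1e-15):
    """E_1(x) = e^{-x} / (x+1 - 1/(x+3 - 4/(x+5 - ...))), for x > 1."""
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * exp(-x)

def e1(x):
    """Exponential integral E_1 on ]0, +infinity[."""
    if x <= 0:
        raise ValueError("x must be positive")
    return e1_series(x) if x <= 1.0 else e1_continued_fraction(x)

def e1_sum(step, cutoff=60.0):
    """Sum of E_1(step * n) over n >= 1, truncated where step * n > cutoff."""
    total, n = 0.0, 1
    while step * n <= cutoff:
        total += e1(step * n)
        n += 1
    return total

if __name__ == "__main__":
    for M in (1000, 45341):
        for m_prime in (1, 2):
            step = 2 * pi * m_prime / sqrt(M)
            print(M, m_prime, round(e1_sum(step), 4), round(1 / step, 4))
```

The truncation at `cutoff` discards terms of size below \(e^{-60}\), so it does not affect the comparison.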
This allows us to bound the sum of the \({{\mathcal {S}}}(c)/c\) for all multiples c of M. By multiplicativity of \(\tau \),
the sum on c being exactly \(\zeta (3/2)^2\). We denote
hence (and similarly for \({{\mathcal {T}}}\)):
which gives
For \(m=2\), the previous refinements can be exploited and we get instead
hence
Identical bounds are found for
as the integral of \(e^{-t}\) on \([0,+\infty [\) is equal to 1 like the one of \(E_1\). Thus, by similar computations,
Gathering those bounds, we get for all m prime to N,
and slightly better ones for \(m=2\) coming from refinements above (it suffices to replace 86mg(m) by 213 and 43mg(m) by 97 above).
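The identity \(\sum _{c \ge 1} \tau (c)c^{-3/2} = \zeta (3/2)^2\) used above is a special case of the standard Dirichlet series \(\sum _c \tau (c)c^{-s} = \zeta (s)^2\). The following sketch checks it numerically; the value of \(\zeta (3/2)\) is obtained by Euler–Maclaurin summation, and the tail of the divisor sum is approximated using the classical average order \(\ln t + 2\gamma \) of \(\tau \), which is an assumption of this illustration and not part of the argument in the text. Function names are hypothetical.

```python
from math import log, sqrt

EULER_GAMMA = 0.5772156649015329

def zeta_three_halves(cutoff=10000):
    """zeta(3/2) via a partial sum plus Euler-Maclaurin tail correction."""
    s = 1.5
    head = sum(n ** -s for n in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1)
            + 0.5 * cutoff ** -s
            + s * cutoff ** (-s - 1) / 12)
    return head + tail

def divisor_counts(limit):
    """tau(1), ..., tau(limit) by a sieve; index 0 is unused."""
    tau = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            tau[multiple] += 1
    return tau

def divisor_series_partial(limit):
    """Partial sum of tau(c) / c^(3/2) for c <= limit."""
    tau = divisor_counts(limit)
    return sum(tau[c] / c ** 1.5 for c in range(1, limit + 1))

def heuristic_tail(limit):
    """Integral of (ln t + 2*gamma) * t^(-3/2) over [limit, infinity)."""
    return (2 * log(limit) + 4 + 4 * EULER_GAMMA) / sqrt(limit)

if __name__ == "__main__":
    target = zeta_three_halves() ** 2
    partial = divisor_series_partial(20000)
    print(target, partial, partial + heuristic_tail(20000))
```

The partial sums converge slowly (the tail is of order \(\ln X/\sqrt{X}\)), which is why the closed form \(\zeta (3/2)^2\) is the convenient quantity to use in the bounds.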
By computations in Sage, we deduce the following first estimates.
Proposition 5
With the previous estimates, one finds
Hence, Lemma 14 applies and Theorem 2 is true for \(N \ge 45341\) for \(X_0 ^+ (N)\) and for \(N \ge 269\) for \(X_\mathrm{{ns}}^+(N)\).
For \(M=N\), the estimates of \(\langle a_m,L'\rangle _N\) are readily obtained, but the slowness of convergence is much more visible. This is mainly due to the fact that the error term is in \(m/\sqrt{N}\) instead of m/N.
5.3 Improving the estimates for prime level
To bring the bound \(N \ge 45341\) down to a range where all remaining primes can be checked by a different method, one needs to improve upon the worst error term appearing in \(\langle a_m,L' \rangle _N^+\), which is of order \(m/\sqrt{N}\) and comes from the estimates of \({{\mathcal {T}}}(d)\), as can be seen from (33).
The following arguments rely on cancellations of Kloosterman sums not exploited by the Weil bounds. For \(d=1\), the Kloosterman sum is always 1 (see the convention) so this case has to be dealt with separately. A careful analysis proves that
which will slightly improve the bounds later.
Assume now that \(d \ge 2\). The main term contributing to the bound is \(E_1(2\pi n/\sqrt{N})\), hence we write
where \({{\mathcal {T}}}_M(d)\) is the sum of terms for which \(n \le 3 \sqrt{N}/\pi \) and \({{\mathcal {T}}}_R(d)\) is the remainder.
By the Weil bounds, using the fact that the integral of \(E_1\) on \([5,+\infty [\) is less than \(10^{-4}\), we obtain
where \(\lambda _m = 43\) for \(m=1\) and 97 for \(m=2\) as before, so this contribution will be very small. For \({{\mathcal {T}}}_M(d)\), we will exploit Pólya–Vinogradov-type estimates ([46], Lemma 5.9).
Proposition 6
For every \(d>1\), every k invertible modulo d and every \(m,K,K' \in {\mathbb {N}}\),
Now, assume \(N \ge 1000\), so that for \(m=1\) or 2, \(d \ge 2\) and \(n \le 5 \sqrt{N}/(2 \pi )\), we have \(4 \pi \sqrt{mn}/(d \sqrt{N}) \le 1.5\). This implies that in the considered range for n, the function \(t \mapsto \left( J_1(4 \pi \sqrt{mt}/(d \sqrt{N}))/\sqrt{t}\right) E_1(2 \pi t/\sqrt{N})\) is decreasing and positive (as the product of two decreasing positive functions). Its total variation on \([1,5 \sqrt{N}/(2 \pi )]\) is then bounded by its first value (itself controlled by \(E_1(2 \pi /\sqrt{N})/2\)).
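The two numerical facts used in this paragraph can be checked directly. The sketch below (an illustration with hypothetical function names) computes the largest argument \(4 \pi \sqrt{mn}/(d \sqrt{N})\) in the worst case \(N=1000\), \(m=2\), \(d=2\), and verifies on integer values of t that the factor \(J_1(4 \pi \sqrt{mt}/(d \sqrt{N}))/\sqrt{t}\) is positive and decreasing. The Bessel function is evaluated by its standard power series \(J_1(x)=\sum _{k \ge 0} (-1)^k (x/2)^{2k+1}/(k!(k+1)!)\).

```python
from math import pi, sqrt

def bessel_j1(x, terms=40):
    """Bessel function J_1 via its power series (fine for moderate |x|)."""
    half = x / 2.0
    term = half          # k = 0 term
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (k + 1))
        total += term
    return total

def worst_argument(N, m, d):
    """Value of 4*pi*sqrt(m*n)/(d*sqrt(N)) at n = 5*sqrt(N)/(2*pi)."""
    n_max = 5 * sqrt(N) / (2 * pi)
    return 4 * pi * sqrt(m * n_max) / (d * sqrt(N))

def j1_factor(t, N, m, d):
    """The factor J_1(4*pi*sqrt(m*t)/(d*sqrt(N))) / sqrt(t)."""
    return bessel_j1(4 * pi * sqrt(m * t) / (d * sqrt(N))) / sqrt(t)

if __name__ == "__main__":
    print(worst_argument(1000, 2, 2))
    print([round(j1_factor(t, 1000, 2, 2), 5) for t in range(1, 26)])
```

Since the worst case is attained for the smallest N and d and the largest m, the bound 1.5 then holds throughout the stated range.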
By Abel transform and the previous proposition, we thus obtain
Compared to Weil bounds in Lemma 17, the new bound is approximately the best for \(d \le f(N)= \lfloor N/(2.5^2 E_1(2\pi /\sqrt{N})^2) \rfloor \). We then obtain
with Lemma 5.11 of [46]. By Weil bounds and the same lemma, for \(m=1\),
and for \(m=2\),
Combining these arguments, we get, for \(N \ge 1000\),
and
and finally
for \(N \ge 8641\), which is much more reasonable than 45341.
The same improvements for the bounds apply exactly for \(M=N^2 \ge 1000\), thus allowing us to replace the estimate in 43/N in (37)–(38) by the same expressions as above with f(M) instead of f(N).
One gets that \(\langle a_2,L' \rangle _{N^2}^{+,\mathrm {new}} >0\) for \(N \ge 71\) instead of 97, and that
for \(N \ge 151\).
We now discuss how to deal with the remaining cases, namely those for which \(N \le 8641\) and \(g(X_0^+(N)) \ge 2\), and those for which \(N \le 151\) and \(g(X_{\text {ns}}^+(N)) \ge 2\).
The most natural approach is the following: for any small N, compute a basis of eigenforms for \(S_2(\varGamma _0(M))^{+,\text {new}}\), and for every f (normalised) in this basis, compute \(L'(f,1)\) up to sufficient precision to ensure that \(L'(f,1) \ne 0\).
Recall that by ([32], Corollary V.1.3), if \(L'(f,1) \ne 0\) under the same assumptions, the same is true for the Galois conjugate eigenforms, so only one check needs to be performed for each Galois orbit. Theorem 2 requires exactly that the sum of the sizes of those Galois orbits is at least 2, so we only need to check that for two Galois orbits of size 1 (or one of size at least 2), one has \(L'(f,1) \ne 0\).
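The bookkeeping in this criterion is simple enough to state as code. The sketch below uses a hypothetical data format: each Galois orbit of normalised newforms is recorded as a pair (size of the orbit, whether \(L'(f,1) \ne 0\) was certified for one representative). The actual nonvanishing checks were done in MAGMA and are not reproduced here.

```python
def nonvanishing_count(orbits):
    """Number of newforms with L'(f,1) != 0, given one check per Galois orbit.

    `orbits` is a list of (size, nonvanishing) pairs; by the Gross-Zagier
    corollary quoted above, nonvanishing for one form in an orbit implies
    nonvanishing for all of its Galois conjugates.
    """
    return sum(size for size, nonvanishing in orbits if nonvanishing)

def criterion_holds(orbits):
    """True when at least two newforms have nonvanishing derivative at 1."""
    return nonvanishing_count(orbits) >= 2

if __name__ == "__main__":
    # Purely illustrative inputs, not data from the paper.
    print(criterion_holds([(1, True), (1, True)]))   # two rational newforms
    print(criterion_holds([(3, True)]))              # one orbit of size 3
    print(criterion_holds([(1, True), (4, False)]))  # only one form certified
```

Only one high-precision evaluation of \(L'(f,1)\) per orbit is needed, which is what keeps the verification feasible for small N.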
We have performed these verifications in MAGMA, and obtained the following.
\(\bullet \) For any prime \(N \le 2000\) such that \(X_0 ^+ (N)\) is of genus at least two, there are at least two distinct normalised newforms such that \(L'(f,1) \ne 0\), hence Theorem 2 holds. In fact, we have also checked that for all such N, \(L'(f,1) \ne 0\) for all the eigenforms in \(S_2(\varGamma _0(N))^{+}\), therefore by Proposition 8, \({{\,\mathrm{rank}\,}}J_0^+(N) ({\mathbb {Q}}) = \dim J_0^+(N)\) unconditionally for all those small primes.
\(\bullet \) Similarly, for any prime \(N \le 53\) such that \(X_\mathrm{{ns}}^+(N)\) is of genus at least two, \(L'(f,1) \ne 0\) for all the eigenforms in \(S_2(\varGamma _0(N^2))^{+,\text {new}}\), therefore by the same arguments, \({{\,\mathrm{rank}\,}}{\text {Jac}} (X_\mathrm{{ns}}^+(N))({\mathbb {Q}}) = \dim {\text {Jac}} (X_\mathrm{{ns}}^+(N))\) for all those small primes.
Unfortunately, these algorithms require explicit embeddings of the fields of coefficients \(K_f\) of f into \({\mathbb {C}}\), which makes them very slow when N becomes larger than 2000 (then, the degree of \(K_f\) can be larger than 100). We thus could not complete the argument using this method alone. Let us explain how to deal with the intermediate range \(N \in [2000,9000]\) for \(X_0^+(N)\) and \(N \in [59,151]\) for \(X_\mathrm{{ns}}^+(N)\).
The idea is to look at the simple quotients of the two relevant Jacobians which are elliptic curves. If there are none then, since in this range we have proved that \(\langle a_1,L' \rangle _M^{+,\text {new}} \ne 0\), there must be some f such that \(L'(f,1) \ne 0\), and it generates a simple quotient of dimension at least 2 by hypothesis, so we are done.
Now, if there are elliptic curves in there, it is sufficient to find two of them of rank 1 for the same reasons. Quotients of \(J_0(M)^{+,\text {new}}\) of dimension 1 are in one-to-one correspondence with isogeny classes of elliptic curves of conductor M and root number \(-1\) (the fact that this correspondence is surjective is a consequence of Cremona’s tables in this range but also a particular case of modularity theorems).
One can thus eliminate all levels N except the ones for which there exists exactly one (up to isogeny) elliptic curve E of analytic rank 1 and conductor N. Using Cremona’s tables, we obtain a list of respectively 70 (\(M=N\)) and 7 (\(M=N^2\)) possible exceptions, namely N in \(\{61,67,73,101,109,113\}\) for the latter.
Now, we use a last argument: if the modular form \(f_E\) associated to E is really the only one such that \(L'(f,1) \ne 0\) in the space, one should have
(the fact that this equality holds without a normalisation factor comes from the Manin constant being equal to 1 here, which is true in this range by results of Cremona).
Now, the left-hand side is larger than 4/5 for \(M=N\), \(N \ge 2000\), and larger than 1/2 for \(M=N^2\), \(N \ge 53\), by the (optimised) lower bounds given above, while the right-hand side is computable in terms of periods of E. This idea turns out to eliminate all remaining possible exceptions for both values of M, which concludes the proof.
Remark 9
In some sense, this heuristic is natural: all terms in the sum defined by \(\langle a_1,L' \rangle _{M}^{+,\text {new}}\) are positive (another consequence of the Gross–Zagier formula), hence there is no cancellation among them, and the idea is that one of them alone cannot be enough to approach the estimates given for the sum.
References
Atkin, A., Lehner, J.: Hecke operators on \(\Gamma _{0}(m)\). Math. Ann. 185, 134–160 (1970)
Baker, M.H.: Kamienny’s criterion and the method of Coleman and Chabauty. Proc. Am. Math. Soc. 127(10), 2851–2856 (1999)
Balakrishnan, J., Dogra, N.: An effective Chabauty–Kim theorem. Compos. Math. 155(6), 1057–1075 (2019). https://doi.org/10.1112/s0010437x19007243
Balakrishnan, J.S., Best, A.J., Bianchi, F., Lawrence, B., Müller, J.S., Triantafillou, N., Vonk, J.: Two recent p-adic approaches towards the (effective) Mordell conjecture. arXiv preprint arXiv:1910.12755 (2019)
Balakrishnan, J.S., Dan-Cohen, I., Kim, M., Wewers, S.: A non-abelian conjecture of Tate–Shafarevich type for hyperbolic curves. Math. Ann. 372(1–2), 369–428 (2018)
Balakrishnan, J.S., Dogra, N.: Quadratic Chabauty and rational points, I: \(p\) -adic heights. Duke Math. J. 167(11), 1981–2038 (2018). https://doi.org/10.1215/00127094-2018-0013
Balakrishnan, J.S., Dogra, N.: Quadratic Chabauty and rational points II: generalised height functions on Selmer varieties. Int. Math. Res. Not. (2020). https://doi.org/10.1093/imrn/rnz362
Balakrishnan, J.S., Dogra, N., Müller, J.S., Tuitman, J., Vonk, J.: Explicit Chabauty–Kim for the split Cartan modular curve of level 13. Ann. Math. (2) 189(3), 885–944 (2019). https://doi.org/10.4007/annals.2019.189.3.6
Balakrishnan, J.S., Dogra, N., Müller, J.S., Tuitman, J., Vonk, J.: Quadratic Chabauty for modular curves: algorithms and examples. In preparation (2020)
Betts, L.A., Dogra, N.: Ramification of étale path torsors and harmonic analysis on graphs. arXiv preprint arXiv:1909.05734 (2019)
Bilu, Y., Parent, P.: Serre’s uniformity problem in the split Cartan case. Ann. Math. 2(173), 569–584 (2011)
Bilu, Y., Parent, P., Rebolledo, M.: Rational points on \({X}_0 ^+ (p^r)\). Annales de l’Institut Fourier 63, (2013)
Birkenhake, C., Lange, H.: Complex abelian varieties, 2nd edn. Grundlehren der Mathematischen Wissenschaften , vol. 302. Springer, Berlin, Heidelberg (2004)
Bump, D.: Automorphic forms and representations. Cambridge University Press, Cambridge (1996)
Cai, L., Shu, J., Tian, Y.: Explicit Gross-Zagier and Waldspurger formulae. Algebra Number Theory 8(10), 2523–2572 (2014). https://doi.org/10.2140/ant.2014.8.2523
Chen, I.: On relations between Jacobians of certain modular curves. J. Algebra 231(1), 414–448 (2000)
Colombo, E., van Geemen, B.: Note on curves in a Jacobian. Compos. Math. 88(3), 333–353 (1993)
Darmon, H., Rotger, V.: Diagonal cycles and Euler systems I: A \(p\)-adic Gross-Zagier formula. Ann. Sci. Éc. Norm. Supér. (4) 47(4), 779–832 (2014)
Darmon, H., Rotger, V., Sols, I.: Iterated integrals, diagonal cycles and rational points on elliptic curves. Publications Mathématiques de Besançon 2, 19–46 (2012). https://doi.org/10.5802/pmb.a-145
Daub, M.: Complex and \(p\)-adic computations of Chow–Heegner points. PhD Thesis, Berkeley (2013)
Deligne, P.: Cohomologie étale. Lecture Notes in Mathematics, vol. 569. Springer-Verlag, Berlin (1977)
Deligne, P.: Le groupe fondamental de la droite projective moins trois points. In: Galois groups over \({\bf Q}\) (Berkeley, CA, 1987), Math. Sci. Res. Inst. Publ., vol. 16, pp. 79–297. Springer, New York (1989)
Edixhoven, B., Parent, P.: Semistable reduction of modular curves associated with maximal subgroups in prime level. arXiv preprint arXiv:1907.02418 (2019)
Ellenberg, J.S.: Galois representations attached to \(\mathbb{Q}\)-curves and the generalized Fermat equation \(A^4+ B^2= C^p\). Am. J. Math. 126(4), 763–787 (2004)
Fuchs, C., Pham, D.H.: The \(p\)-adic analytic subgroup theorem revisited. P-adic Numbers Ultrametr Anal Appl 7(2), 143–156 (2015)
Fulton, W.: Intersection theory, Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 2, second edn. Springer-Verlag, Berlin (1998). https://doi.org/10.1007/978-1-4612-1700-8
González, J., Lario, J.C.: Rational and Elliptic parametrizations of \(\mathbb{Q}\)-curves. J. Number Theory 72(1), 13–31 (1998)
Gross, B.: Heegner points on \(X_0(N)\). In: Modular forms (Durham, 1983), Ellis Horwood Ser. Math. Appl.: Statist. Oper. Res., pp. 87–105. Horwood, Chichester (1984)
Gross, B.H.: Kolyvagin’s work on modular elliptic curves. In: \(L\)-functions and arithmetic (Durham, 1989), London Math. Soc. Lecture Note Ser., vol. 153, pp. 235–256. Cambridge Univ. Press, Cambridge (1991). https://doi.org/10.1017/CBO9780511526053.009
Gross, B.H., Kudla, S.S.: Heights and the central critical values of triple product \(L\)-functions. Compos. Math. 81(2), 143–209 (1992)
Gross, B.H., Schoen, C.: The modified diagonal cycle on the triple product of a pointed curve. Ann. Inst. Fourier 45(3), 649–679 (1995)
Gross, B.H., Zagier, D.B.: Heegner points and derivatives of \(L\)-series. Invent. Math. 84(2), 225–320 (1986). https://doi.org/10.1007/BF01388809
Hain, R.: Rational points of universal curves. J. Am. Math. Soc. 24(3), 709–769 (2011). https://doi.org/10.1090/S0894-0347-2011-00693-0
Hain, R., Matsumoto, M.: Galois actions of fundamental groups of curves and the cycle \(C-C^-\). J. Inst. Math. Jussieu 4(3), 363–403 (2005)
Hindry, M., Silverman, J.H.: Diophantine geometry. An introduction., vol. 201. New York, NY: Springer (2000)
Iwaniec, H., Kowalski, E.: Analytic number theory., vol. 53. Providence, RI: American Mathematical Society (AMS) (2004)
Iwaniec, H., Sarnak, P.: The non-vanishing of central values of automorphic \(L\)-functions and Landau-Siegel zeros. Isr. J. Math. 120, 155–177 (2000)
Jannsen, U.: Mixed motives and algebraic \(K\)-theory. Lecture Notes in Mathematics, vol. 1400. Springer-Verlag, Berlin (1990)
Kim, M.: The motivic fundamental group of \(\mathbf{P}^1 -\{0,1,\infty \}\) and the theorem of Siegel. Invent. Math. 161(3), 629–656 (2005)
Kim, M.: The unipotent Albanese map and Selmer varieties for curves. Publ. Res. Inst. Math. Sci. 45(1), 89–133 (2009)
Kim, M., Tamagawa, A.: The \(l\)-component of the unipotent Albanese map. Math. Ann. 340(1), 223–235 (2008). https://doi.org/10.1007/s00208-007-0151-x
Kohen, D., Pacetti, A.: Heegner points on Cartan non-split curves. Canad. J. Math. 68(2), 422–444 (2016). https://doi.org/10.4153/CJM-2015-047-6
Kolyvagin, V.A.: Euler systems. In: The Grothendieck Festschrift, Vol. II, Progr. Math., vol. 87, pp. 435–483. Birkhäuser Boston, Boston, MA (1990)
Kolyvagin, V.A., Logachev, D.Y.: Finiteness of the Shafarevich-Tate group and the group of rational points for some modular abelian varieties. Leningr. Math. J. 1(5), 1229–1253 (1990)
Kowalski, E., Michel, P., VanderKam, J.: Non-vanishing of high derivatives of automorphic \(L\)-functions at the center of the critical strip. J. Reine Angew. Math. 526, 1–34 (2000)
Le Fourn, S.: Surjectivity of Galois representations associated with quadratic \(\mathbb{Q}\)-curves. Math. Ann. 365(1), 173–214 (2016)
Le Fourn, S.: Nonvanishing of central values of \(L\)-functions of newforms in \(S_2 (\Gamma _0(dp^2))\) twisted by quadratic characters. Canad. Math. Bull. 60(2), 329–349 (2017)
LMFDB Collaboration, T.: The L-functions and modular forms database. http://www.lmfdb.org (2019)
Matev, T.: The \(p\)-adic analytic subgroup theorem and applications. arXiv preprint arXiv:1010.3156 (2010)
Mazur, B.: Rational isogenies of prime degree (with an appendix by D. Goldfeld). Invent. Math. 44(2), 129–162 (1978)
Milne, J.S.: Abelian varieties. In: Arithmetic geometry (Storrs, Conn., 1984), pp. 103–150. Springer, New York (1986)
Mumford, D.: Abelian varieties. With appendices by C. P. Ramanujam and Yuri Manin. Corrected reprint of the 2nd ed. 1974. New Delhi: Hindustan Book Agency/distrib. by American Mathematical Society (AMS); Bombay: Tata Institute of Fundamental Research (2008)
Nekovář, J.: On \(p\)-adic height pairings. In: Séminaire de Théorie des Nombres, Paris, 1990–91, Progr. Math., vol. 108, pp. 127–202. Birkhäuser Boston, Boston, MA (1993). https://doi.org/10.1007/s10107-005-0696-y
Nekovář, J.: The Euler system method for CM points on Shimura curves. In: \(L\)-functions and Galois representations, London Math. Soc. Lecture Note Ser., vol. 320, pp. 471–547. Cambridge Univ. Press, Cambridge (2007). https://doi.org/10.1017/CBO9780511721267.014
Rebolledo, M., Wuthrich, C.: A moduli interpretation for the non-split Cartan modular curve (2017). ArXiv:1402.3498
Ribet, K.: Abelian Varieties over \({\mathbb{Q}}\) and Modular Forms. In: Modular Curves and Abelian Varieties, pp. 241–261. Birkhäuser (2004)
Ribet, K.A.: Galois action on division points of Abelian varieties with real multiplications. Am. J. Math. 98(3), 751–804 (1976). https://doi.org/10.2307/2373815
Serre, J.P.: Propriétés galoisiennes des points d’ordre fini des courbes elliptiques. Invent. Math. 15(4), 259–331 (1972)
Siksek, S.: Quadratic Chabauty for modular curves (2017). ArXiv:1704.00473
de Smit, B., Edixhoven, B.: Sur un résultat d’Imin Chen. Math. Res. Lett. 7, 147–153 (2000)
Smith, B.: Explicit endomorphisms and correspondences (2005). PhD thesis
Stein, W.: Modular forms, a computational approach, Graduate Studies in Mathematics, vol. 79. American Mathematical Society, Providence, RI (2007). https://doi.org/10.1090/gsm/079. With an appendix by Paul E. Gunnells
Tate, J.: WC-groups over \(p\)-adic fields. Séminaire Bourbaki 10e année, Textes des Conférences, Exposé No. 156, 13 p. (1958)
Tian, Y.: Euler systems of CM points on Shimura curves (2003). PhD Thesis, Columbia University
Tian, Y., Zhang, S.W.: Euler systems of CM points on Shimura curves. In preparation
Vignéras, M.F.: Valeur au centre de symétrie des fonctions L associées aux formes modulaires. In: Séminaire de Théorie des Nombres, Paris 1979-1980, Progress in Mathematics, vol. 12, pp. 331–356. Birkhäuser, Boston (1981)
Zhang, S.W.: Gross-Zagier formula for \(\rm GL(2)\). II. In: Heegner points and Rankin \(L\)-series, Math. Sci. Res. Inst. Publ., vol. 49, pp. 191–214. Cambridge Univ. Press, Cambridge (2004). https://doi.org/10.1017/CBO9780511756375.008
Acknowledgements
The authors heartily thank Samir Siksek, who initiated this project and contributed to its progression, but declined to be listed as a co-author. He also graciously authorised us to include his original argument from his preprint [59], which is found in paragraph 4.1. We would also like to thank Daniel Kohen and Jan Vonk for helpful discussions. Most of this paper was written during the second author’s postdoctoral position at the University of Warwick, funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 793646, titled LowDegModCurve. The first author is supported by a Royal Society University Research Fellowship.
Additional information
Communicated by Wei Zhang.
Appendices
Appendix: Chow–Heegner points and Ceresa cycles
In this appendix we explain how Lemma 4 is a consequence of Hain and Matsumoto’s work relating the extension \([\mathrm {Lie}(U_2 )]\) to the Ceresa cycle.
1.1 Ceresa cycles and Gross–Kudla–Schoen cycles
We recall some properties of modified diagonal cycles studied in [17, 31] and [19]. As our discussion applies in fairly broad generality, we take X to be a smooth geometrically irreducible projective curve over a field K of characteristic zero. Let \(\pi _S\) denote the projection
defined by projecting onto the coordinates in S as in (7). The Gross–Kudla–Schoen cycle is defined to be
where \(X_S \) is as defined in section 2.2.
It defines an element of the group \(\mathrm {CH}^2 (X^3 )\) of codimension two cycles in the triple product \(X\times X \times X\). By [31, Proposition 3.1], the class of \(\varDelta _{GKS}\) lies in the subspace \(\mathrm {CH}^2 _0 (X^3 )\) of homologically trivial cycles.
Now let \(Z\subset X\times X\) be a correspondence, and let
be the composite map
where the second map is the intersection product with \(Z \times X^2 \subset X^4 \).
Lemma 18
([19] Lemma 2.1) We have
1.2 The Gross–Kudla–Schoen cycle and the Ceresa cycle
Since \([\varDelta _{GKS}]\) is homologically trivial, it has (Sect. 2.1) an étale Abel–Jacobi class
By [31, Corollary 2.6], the cycle class \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}([\varDelta _{GKS}])\) lies in the image of the Künneth projector
and hence may be thought of as an element of \(H^1 (G_K ,V^{\otimes 3}(-1))\) (here \(V:=H^1 _{{\acute{\mathrm{e}}\mathrm{t}}}(X_{{\overline{K}}},{\mathbb {Q}}_p (1))\)). The action of \(S_3 \) on \(X^3 \) induces an action on \(V^{\otimes 3}(-1)\), which is given by \(\epsilon \otimes \sigma \), where \(\epsilon \) is the sign of a permutation and \(\sigma \) is the natural action of \(S_3 \) on \(V^{\otimes 3}\). Since \(\varDelta _{GKS}\) is invariant under the \(S_3 \) action, it lies in the image of \(H^1 (G_K ,\wedge ^3 V (-1))\) under the map induced by the inclusion
For the relations to fundamental groups, it will be helpful to recall the relation between \(\varDelta _{GKS}\) and the Ceresa cycle. By [31, Proposition 5.3], the image of \(\varDelta _{GKS}\) in \(\mathrm {CH}^{g-1}(J)\) under the map
is rationally equivalent to
The Ceresa cycle \(C_b\) is defined to be
Proposition 7
(Colombo–van Geemen, [17], Proposition 2.9) We have
in \(H^1 (G_K ,\wedge ^3 V(-1)).\)
We first recall Hain and Matsumoto’s description of the Galois action on \(U_2\). We again take X to be a smooth projective geometrically irreducible curve over a field K of characteristic zero. The group \(U_2 \) is an extension
with \(V = T_p J \otimes {\mathbb {Q}}_p\) again. We define
and write the image of \(v_1 \wedge v_2 \) in \(\overline{\wedge ^2 V}\) as \(\overline{v_1 \wedge v_2 }\). Taking the Lie algebra \(L_2 \) of \(U_2\), we obtain an element \([L_2 ]\in {{\,\mathrm{Ext}\,}}^1 _{G_K }(V,\overline{\wedge ^2 V})\), or equivalently an element of \(H^1 (G_K ,V(-1)\otimes \overline{\wedge ^2 V})\). The following theorem of Hain and Matsumoto characterises this extension class in terms of the Gross–Kudla–Schoen cycle.
Theorem 5
(Hain–Matsumoto [34], Theorem 3) Let \(\alpha :\wedge ^3 V\rightarrow V\otimes \overline{\wedge ^2 V}\) be the injective homomorphism
Then \([L_2 ]\in H^1 (G_K ,V(-1)\otimes \overline{\wedge ^2 V})\) is equal to \(\alpha (-1)_* (\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}[C_b ])\), where \([C_b]\) is the class of the Ceresa cycle in \(\mathrm {CH}^{g-1}(J)\), and \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}([C_b ])\) is its image in \(H^1 (G_K ,\wedge ^3 V(-1))\).
Via the relation between the Ceresa cycle and the Gross–Kudla–Schoen cycle, this has the following corollary.
Corollary 3
The extension class \([L_2 ]\in H^1 (G_K ,V(-1)\otimes \overline{\wedge ^2 V})\) is equal to the image of \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}([\varDelta _{GKS}])\) under the map
induced by the quotient
Proof
Let \(\iota :\wedge ^3 V\rightarrow V^{\otimes 3}\) be the inclusion (41), and \(\tau ':V^{\otimes 3}\rightarrow \wedge ^3 V\) the quotient map \(v_1 \otimes v_2 \otimes v_3 \mapsto v_1 \wedge v_2 \wedge v_3 \). By Proposition 7, the image of \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}([\varDelta _{GKS}])\) in \(H^1 (G_K ,\wedge ^3 V(-1))\) under \(\tau ' _*\) is equal to \(\frac{1}{3}\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}([C_b ])\). Since \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}([\varDelta _{GKS}])\) lies in the image of \(\iota _*\), and
we have
Hence we deduce from Theorem 5 that
\(\square \)
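The interplay between \(\iota \) and \(\tau '\) used in this proof can be checked on a toy example. The following is a minimal sketch, not taken from the paper: it assumes that \(\iota \) is the unnormalised antisymmetrisation \(v_1 \wedge v_2 \wedge v_3 \mapsto \sum _{\sigma \in S_3} \epsilon (\sigma )\, v_{\sigma (1)}\otimes v_{\sigma (2)}\otimes v_{\sigma (3)}\) (the normalisation of (41) may differ, in which case the scalar below changes accordingly), and the names `iota`, `tau_prime` and `composite_is_six` are purely illustrative. Under this assumption, \(\tau ' \circ \iota \) is multiplication by \(3! = 6\) on \(\wedge ^3 V\).

```python
from itertools import combinations, permutations


def perm_sign(seq):
    """Sign of the arrangement `seq` (distinct integers), via inversion count."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign


def iota(wedge):
    """ASSUMED normalisation: e_i ^ e_j ^ e_k -> sum over S_3 of sign * permuted tensor.

    `wedge` maps sorted index triples (i < j < k) to coefficients; the result
    maps arbitrary index triples (basis tensors of V^{(x)3}) to coefficients.
    """
    tensor = {}
    for triple, coeff in wedge.items():
        for positions in permutations(range(3)):
            key = tuple(triple[q] for q in positions)
            tensor[key] = tensor.get(key, 0) + perm_sign(positions) * coeff
    return tensor


def tau_prime(tensor):
    """Quotient map e_a (x) e_b (x) e_c -> e_a ^ e_b ^ e_c, in the sorted wedge basis."""
    wedge = {}
    for key, coeff in tensor.items():
        if len(set(key)) < 3:
            continue  # a repeated index wedges to zero
        triple = tuple(sorted(key))
        wedge[triple] = wedge.get(triple, 0) + perm_sign(key) * coeff
    return {t: c for t, c in wedge.items() if c != 0}


def composite_is_six(dim):
    """Check that tau_prime(iota(w)) = 6 * w on every basis wedge of a dim-dimensional V."""
    return all(
        tau_prime(iota({t: 1})) == {t: 6} for t in combinations(range(dim), 3)
    )
```

For instance, `composite_is_six(4)` runs over the four basis elements of \(\wedge ^3 V\) for a four-dimensional V.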
We now return to the case where \(K={\mathbb {Q}}\). Via the commutative diagram

(where c denotes the Chern class), we hence obtain a homomorphism
where \(L_2 :=\mathrm {Lie}(U_2 )\). The extensions obtained come from points on J. They can be related to the Gross–Kudla–Schoen cycle via the theorem of Hain and Matsumoto (the argument given below follows Darmon, Rotger and Sols [19], who prove a Hodge theoretic analogue of the Lemma below using the theorems of Harris and Pulte, which are Hodge theoretic analogues of the Hain–Matsumoto theorem).
Lemma 19
Let \(Z\subset X\times X\) be a codimension 1 cycle. Let \(i_1 ,i_2 ,i_3 :X\hookrightarrow X\times X\) be the closed immersions defined by the subschemes \(\{ b\} \times X,X\times \{ b\} \) and the diagonal \(\varDelta _X\) of \(X\times X\) respectively. For \(j=1,2,\{1,2\}\), let \(i_j ^*\) denote the pull-back morphism
Then the extension class in \(H^1 (G_K ,V)\) associated to the Lie algebra \(L_Z\) is given by \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}(D_Z (b))\), with \(D_Z(b)\) as in (16).
Proof
The class \([L_Z]\) is the image of \([L_2 ]\) under the morphism
induced by \(\pi _Z :\overline{\wedge ^2 V}\rightarrow {\mathbb {Q}}_p (1)\). We have a commutative diagram

By Corollary 3, the extension class \([L_2 ]\) is given by \(\mathrm {AJ}_{{\acute{\mathrm{e}}\mathrm{t}}}(\varDelta _{GKS})\), hence
by Lemma 18. \(\square \)
Appendix: Proof of the Kolyvagin–Logachev type result
In this appendix, we fix the following notation:
\(\bullet \) M is a fixed odd level (which for our applications will be N or \(N^2\))
\(\bullet \) \(f \in S_2(\varGamma _0(M))^{+,\mathrm {new}}\) is a normalised eigenform.
\(\bullet \) \(A=A_f\) is its associated quotient of \(J_0(M)\), together with the canonical projection \(\pi :J_0(M) \rightarrow A\) (independent of the choice of f in its Galois orbit).
We explain here the following result, attributed to Kolyvagin and Logachev.
Proposition 8
(Rank 1 BSD for modular abelian varieties) If \(L'(f,1) \ne 0\), the rank of \(A({\mathbb {Q}})\) is exactly \(g:=\dim A\).
Corollary 4
If \(L'(f,1) \ne 0\) for at least two distinct newforms f, for the Heegner quotient A of \(J_0(M)^{+,\text {new}}\) (Definition 4),
Proof of the Corollary
By Proposition 8, the rank of A is equal to its dimension, as this is true for each of its factors \(A_f\). Now, we recall that all endomorphisms of an \(A_f\) are symmetric and the latter is of \({\text {GL}}_2\)-type; in particular \({\text {End}}^\dagger (A_f)\) is of rank \(\dim A_f\) (see Sect. 4.1). Finally, for two newforms f and \(f'\) which are not Galois conjugates, there is no nonzero morphism between \(A_f\) and \(A_{f'}\) (by multiplicity one in the new part), so the endomorphism ring splits and we get the last equality. \(\square \)
Remark 10
This result is well known if \(\dim A=1\) ([43] for the original reference, [29] for a survey), and proven in much greater generality in [54], all of these along the lines of a stronger result in the rank zero case proved in [44]. It is also (a slightly weaker version of) the main result of Tian’s thesis [64] and of a paper of Tian and Zhang in preparation [65], for which we could not find quotable material. In any case, we felt it sufficiently different from the former references (from which we borrow constantly) to deserve a proof for nonexperts. For the same reasons, we will simply refer to those papers for parts of the proofs which generalise seamlessly and focus on the more technical points.
Convention We use a well-chosen prime number p to obtain Proposition 8. As we only need one such p, throughout this appendix, whenever a property holds for p large enough, we automatically assume that it does without further mention.
We will prove Proposition 8 by reducing it successively to other statements which will be emphasized.
Notation Throughout this text, \(\tau \) denotes the usual complex conjugation and when it acts on a \({\mathbb {Z}}\)-module \({{\mathcal {M}}}\), \({{\mathcal {M}}}^+\) and \({{\mathcal {M}}}^-\) denote the spaces of \(m \in {{\mathcal {M}}}\) respectively fixed and reversed by \(\tau \). If \({{\mathcal {M}}}\) is finite of odd order, \({{\mathcal {M}}}= {{\mathcal {M}}}^+ \oplus {{\mathcal {M}}}^-\), which we will frequently use implicitly.
Given a Galois extension L/K of number fields and \({\mathfrak {P}}\) a prime ideal of L unramified over \({\mathfrak {p}}\), \(({\mathfrak {P}},L/K)\) denotes the Frobenius of \({\mathfrak {P}}\) for this extension, and \(({\mathfrak {p}},L/K)\) the conjugacy class of such Frobenius elements in \({\text {Gal}}(L/K)\).
1.1 Structure of the p-torsion and reduction to Selmer groups
Let \(K_f\) be the number field of coefficients of f. By [44, section 2.1], there is an isomorphism \([\cdot ]: \, K_f \rightarrow {\text {End}}_{\mathbb {Q}}A \otimes {\mathbb {Q}}\) such that for every prime \(\ell \not \mid N\), \([a_\ell (f)] \in {\text {End}}_{\mathbb {Q}}A\) and
The inverse image of \({\text {End}}_{\mathbb {Q}}A\) is thus an order in \(K_f\) denoted by \({{\mathcal {O}}}\), and A is endowed with a structure of \({{\mathcal {O}}}\)-module.
We now fix p an odd prime totally split in \(K_f\) and prime to the conductor of \({{\mathcal {O}}}\) (there are infinitely many such primes by the Cebotarev density theorem), so that \(p {{\mathcal {O}}}= {\mathfrak {P}}_1 \ldots {\mathfrak {P}}_g\) as a decomposition into prime ideals. In all the following, the notation \({\mathfrak {P}}\) will run through \({\mathfrak {P}}_1, \ldots , {\mathfrak {P}}_g\).
Remark 11
It is likely that the proof still holds for any type of decomposition of p, but this hypothesis makes the exposition much more symmetric (and there are infinitely many such p, so we can choose it as large as necessary). In the opposite situation, if there is an inert prime in \(K_f\), the proof should be a bit simpler.
One of the key ideas to get closer to the case of elliptic curves is decomposing every structure of \({{\mathcal {O}}}/(p)\)-modules using those prime ideals. Our tool is the following Lemma, often used without mention.
Lemma 20
By the Chinese remainder theorem, \( {{\mathcal {O}}}/(p) \cong \bigoplus _{{\mathfrak {P}}} {{\mathcal {O}}}/{\mathfrak {P}}\); in particular each \({{\mathcal {O}}}/{\mathfrak {P}}\) is projective and flat over \({{\mathcal {O}}}/(p)\). Every \({{\mathcal {O}}}/(p)\)-module \({{\mathcal {M}}}\) splits canonically into sub-\({{\mathcal {O}}}/(p)\)-modules
\({{\mathcal {M}}}= \bigoplus _{{\mathfrak {P}}} {{\mathcal {M}}}[{\mathfrak {P}}] \cong \bigoplus _{{\mathfrak {P}}} {{\mathcal {M}}}/{\mathfrak {P}}{{\mathcal {M}}}\)
and projections are given by elements of \({{\mathcal {O}}}\). All these isomorphisms are canonical, and for every \(m \in {{\mathcal {M}}}\), we will denote by \(m_{\mathfrak {P}}\) its projection onto \({{\mathcal {M}}}[{\mathfrak {P}}]\) (or in \( {{\mathcal {M}}}/{\mathfrak {P}}{{\mathcal {M}}}\) depending on the context).
Proof
The \({\mathfrak {P}}\) are pairwise coprime so the Chinese remainder theorem holds. Tensoring \({{\mathcal {M}}}\) by \({{\mathcal {O}}}/(p)\) leaves it unchanged on the one hand, and on the other hand decomposes it canonically into \(\bigoplus _{\mathfrak {P}}{{\mathcal {M}}}/{\mathfrak {P}}{{\mathcal {M}}}\). The latter clearly identifies each \({{\mathcal {M}}}/{\mathfrak {P}}{{\mathcal {M}}}\) with the \({\mathfrak {P}}\)-torsion part of \({{\mathcal {M}}}\), and the other statements follow. \(\square \)
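The statement that the projections of Lemma 20 are given by elements of \({{\mathcal {O}}}\) can be made concrete in a toy case. The sketch below assumes, purely for illustration, that \({{\mathcal {O}}}= {\mathbb {Z}}[x]/(f)\) for a monic polynomial f which splits into distinct linear factors modulo p (the order \({{\mathcal {O}}}\) of the text need not be of this form); the function names are hypothetical. Each root r of f modulo p corresponds to one prime \({\mathfrak {P}}\), and the Lagrange polynomial attached to r is an idempotent of \({{\mathcal {O}}}/(p)\) projecting onto the corresponding factor.

```python
def poly_mul(a, b, p):
    """Product of two polynomials over F_p (coefficient lists, lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def poly_mod(a, f, p):
    """Remainder of a modulo the monic polynomial f over F_p, padded to length deg f."""
    a = [c % p for c in a]
    deg_f = len(f) - 1
    while len(a) > deg_f:
        lead = a.pop()          # leading coefficient, of degree len(a) after the pop
        shift = len(a) - deg_f  # replace x^deg_f by -(f minus its leading term)
        for i in range(deg_f):
            a[shift + i] = (a[shift + i] - lead * f[i]) % p
    return a + [0] * (deg_f - len(a))


def idempotents(f, p):
    """Idempotents of F_p[x]/(f), indexed by the roots of f mod p (toy model of O/(p))."""
    roots = [r for r in range(p)
             if sum(c * pow(r, i, p) for i, c in enumerate(f)) % p == 0]
    if len(roots) != len(f) - 1:
        raise ValueError("f does not split into distinct linear factors mod p")
    result = {}
    for r in roots:
        e = [1]
        for s in roots:
            if s != r:
                inv = pow((r - s) % p, -1, p)  # modular inverse, Python 3.8+
                e = poly_mul(e, [(-s * inv) % p, inv], p)  # factor (x - s) / (r - s)
        result[r] = poly_mod(e, f, p)
    return result
```

With \(f = x^2+1\) and \(p=5\) (so \({{\mathcal {O}}}= {\mathbb {Z}}[i]\) and 5 splits), `idempotents([1, 0, 1], 5)` returns the classes of \(3+4x\) and \(3+x\), which are orthogonal idempotents summing to 1.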
The \({{\mathcal {O}}}\)-linear representation A[p] of \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\) thus splits into \(\bigoplus _{{\mathfrak {P}}} A[{\mathfrak {P}}]\) and for any extension L of \({\mathbb {Q}}\), we have canonical isomorphisms of \({{\mathcal {O}}}/(p)\)-modules
If L is a number field, for every place v of L, the natural localisation maps \({\text {loc}}_v\) give rise to a commutative diagram

inherited by flatness from the commonly known analogous diagram for the ideal (p) (for references on those facts and the Selmer groups, see [35, Appendix C.4]). Let us define the \({\mathfrak {P}}\)-Selmer group as
again canonically identified to \({\text {Sel}}_{p}(L,A)[{\mathfrak {P}}]\) hence fitting by the same arguments into the exact sequence

Now, consider an imaginary quadratic field K whose discriminant \(D_K <-4\) is squarefree, prime to the level M and a square modulo M. These conditions guarantee that there is a Heegner point (we fix definitively \({\mathfrak {n}}\) and \([{\mathfrak {a}}_0]\))
in the notation of [28], where H is the Hilbert class field of K. As \(f_{|w_M} = f\), \(\pi \circ w_M = \pi \) therefore by elementary properties of Heegner points [28, formulas (4.1) to (5.2)], for \(y_1 = \pi ((x)-(\infty )) \in A(H)\), one has
Now, using a theorem of Waldspurger [66, Théorème 2.3], let us fix once and for all a K such that \(L(f \otimes \varepsilon _K,1) \ne 0\) where \(\varepsilon _K\) is the Dirichlet character associated to K. By the Gross–Zagier formula ([32, Theorem I.6.3]), the point \(y_K\) is then nontorsion in A(K) and has an integer multiple in \(A({\mathbb {Q}})\) by (50). The subgroup \({{\mathcal {O}}}\cdot y_K\) is thus a subgroup of A(K) of rank g (as nonzero elements of \({{\mathcal {O}}}\) act by isogenies), which leads us to the following.
Reduction 1 Prove that \({{\mathcal {O}}}\cdot y_K\) is of finite index in A(K).
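The conditions imposed above on the discriminant \(D_K\) (negative, less than \(-4\), squarefree, prime to M and a square modulo M) are elementary to test. The following is a minimal sketch with hypothetical function names; it additionally uses the standard fact that a squarefree discriminant of a quadratic field is congruent to 1 modulo 4, and it tests the square condition by brute force, which is only reasonable for small M.

```python
from math import gcd


def is_squarefree(n):
    """True when no square d^2 > 1 divides the nonzero integer n."""
    n = abs(n)
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return n != 0


def satisfies_heegner_hypothesis(D, M):
    """Check the conditions on D = D_K relative to the (odd) level M."""
    if D >= -4 or D % 4 != 1:
        return False  # need D < -4, and a squarefree discriminant is 1 mod 4
    if not is_squarefree(D) or gcd(D, M) != 1:
        return False
    # D must be a square modulo M (brute-force search over residues)
    return any((x * x - D) % M == 0 for x in range(M))
```

For example, \(D=-7\) qualifies for \(M=11\) and \(M=121\) (since \(-7 \equiv 2^2\) modulo 11), but not for \(M=13\).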
Now, for p large enough,
which further leads by (47) to
Reduction 2 Prove that for all \({\mathfrak {P}}\), \(\delta (\overline{y_K})\) generates \({\text {Sel}}_{\mathfrak {P}}(K,A)\).
Proof
If this claim holds, every \({\text {Sel}}_{\mathfrak {P}}(K,A)\) is an \({{\mathcal {O}}}/{\mathfrak {P}}\cong \mathbb {F}_p\)-vector space of dimension 1, so \(A(K)/{\mathfrak {P}}A(K)\) is of dimension at most 1 by (47), and
is of dimension at most g over \(\mathbb {F}_p\). This imposes that the Mordell–Weil rank of A(K) over \({\mathbb {Z}}\) is at most g, hence the equality using \({{\mathcal {O}}}\cdot y_K\). \(\square \)
To conclude this paragraph, \(\tau \) acts naturally on \(A({\overline{{\mathbb {Q}}}}), A[{\mathfrak {P}}]\), \(H^1(K,A[{\mathfrak {P}}])\) and \({\text {Sel}}_{\mathfrak {P}}(K,A)\), and the action of \({{\mathcal {O}}}\) and the morphisms between those in (44) and (45) are \(\tau \)-equivariant. We fix from now on a polarisation \(A \rightarrow {\widehat{A}}\) of degree prime to p (otherwise choose a larger prime p), which thus defines a Weil pairing \(A[p] \times A[p] \rightarrow \mu _p\). Its elementary properties [51, Lemma 16.2] then imply the following structural result, crucial for our understanding.
Lemma 21
For every \({\mathfrak {P}}\) and \(\varepsilon = \pm 1\), the following are true.
\(\bullet \) The 2g spaces \(A[{\mathfrak {P}}]^\varepsilon \) are pairwise orthogonal for the Weil pairing, except the \(A[{\mathfrak {P}}]^\varepsilon \) with the same \({\mathfrak {P}}\) and opposite sign.
\(\bullet \) The two spaces \(A[p]^\varepsilon \) are isotropic for the Weil pairing.
\(\bullet \) Each \(A[{\mathfrak {P}}]^\varepsilon \) is then of dimension 1 over \(\mathbb {F}_p\) and \(\dim _{\mathbb {F}_p} A[{\mathfrak {P}}] = 2\).
1.2 Pairing the Galois group and Selmer groups, and Kolyvagin primes
Throughout this appendix, we fix
(notice L is Galois over \({\mathbb {Q}}\)).
Proposition 9
For p large enough:
(a) \(A[{\mathfrak {P}}]\) is (absolutely) irreducible as a representation of \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\).
(b) The canonical restriction morphism
is injective, with the action of G on \({\text {Gal}}(L^\mathrm{{ab}}/L)\) defined by conjugation in \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\).
Remark 12
Here is an important difference with the \(\dim A=1\) case: the Galois representation \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\rightarrow {\text {GL}}(A[{\mathfrak {P}}]) \cong {\text {GL}}_2(\mathbb {F}_p)\) is not proven to be surjective ([57] does not cover the case where M is a square), but we will manage with (a) and (b), although this introduces significant changes compared to some arguments in [29].
Proof
(a) is Lemma 3.7 of [56] and (b) is Proposition 6.1.2 of [54]. \(\square \)
We now choose S a finite sub-\({{\mathcal {O}}}\)-module of \(H^1(K,A[{\mathfrak {P}}])\), stable by \(\tau \) (this will be first \({\text {Sel}}_{\mathfrak {P}}(K,A)\) and then an auxiliary module for the proof). By Proposition 9 (b), there is a pairing
which is injective on the left. We define \(L_S\) to be the extension of L whose absolute Galois group is the orthogonal of S, and thus obtain a nondegenerate pairing between finite abelian p-torsion groups
Keeping track of the actions of \(\tau \) and the \(\sigma \in G\), we have that
In particular, the extension \(L_S/{\mathbb {Q}}\) is Galois.
Lemma 22
This pairing induces a perfect bilinear pairing from \(S^\varepsilon \times H_S^+\) to \(A[{\mathfrak {P}}]^\varepsilon \cong \mathbb {F}_p\), hence a duality between \(S^\varepsilon \) and \(H_S^+\).
Proof
By (52), these two pairings (for \(\varepsilon = \pm 1\)) are well-defined. Let us prove that they are injective on the left and on the right; they will then be perfect as everything is finite(-dimensional). For \(s \in S^\varepsilon \), if \([s, H_S^+]_S=0\), then
by the same arguments, but \([s,H_S]_S\) is stable by \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\) by (52) again. As \(A[{\mathfrak {P}}]\) is irreducible by Proposition 9 (a), it imposes \([s,H_S]_S=0\) therefore \(s=0\) by nondegeneracy. Now, assume \([S^\varepsilon ,h]_S=0\) for some \(h \in H_S^+\). This holds for all conjugates \(\sigma h \sigma ^{-1}\) of h in \(H_S\) by (52), so on the group \(H' \subset H_S\) they generate. Again, this forces \([S,H']_S \subset A[{\mathfrak {P}}]^{-\varepsilon }\), but this group is stable by \({{\text {Gal}}({\overline{{\mathbb {Q}}}}/ {\mathbb {Q}})}\) hence \(H'=0\). \(\square \)
Lemma 23
Fix \(\varepsilon =\pm 1\) and \(I_S^+\) a proper subgroup of \(H_S^+\). Then, \(s \in S^\varepsilon \) is 0 if for all \(\rho \in H_S^+ \backslash I_S^+\), \([s,\rho ]_S =0\).
Proof
It is a trivial consequence of the perfect duality above, knowing that the sub-\(\mathbb {F}_p\)-vector space generated by \( H_S^+ \backslash I_S^+\) is \(H_S^+\) itself, e.g. by a counting argument. \(\square \)
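The linear algebra fact used in this proof, namely that the complement of a proper subgroup of a finite-dimensional \(\mathbb {F}_p\)-vector space spans the whole space, can be verified exhaustively in a small case. The sketch below is only a toy check on \(\mathbb {F}_p^2\) (the actual dimension of \(H_S^+\) is not specified here), with hypothetical function names.

```python
from itertools import product


def span(vectors, p, dim):
    """F_p-span of `vectors` inside F_p^dim, built by adjoining one vector at a time."""
    spanned = {(0,) * dim}
    for v in vectors:
        multiples = [tuple((c * x) % p for x in v) for c in range(p)]
        spanned = {
            tuple((a + b) % p for a, b in zip(u, w))
            for u in spanned
            for w in multiples
        }
    return spanned


def complement_generates(p):
    """Check on H = F_p^2 that H minus I spans H for every proper subspace I."""
    whole = set(product(range(p), repeat=2))
    # The proper subspaces of F_p^2 are {0} and the lines, i.e. spans of one vector.
    proper_subspaces = {frozenset(span([v], p, 2)) for v in whole}
    return all(span(whole - sub, p, 2) == whole for sub in proper_subspaces)
```

The counting argument alluded to in the proof is visible here: a line has p elements, so its complement has \(p^2-p\) elements, too many to fit in another proper subspace.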
Reduction 3 For all \({\mathfrak {P}}\), apply Lemma 23 to (\(s_0=0\), \(\varepsilon =-1\)) (resp. \(\delta \overline{y_K}\), \(\varepsilon =1\)) to prove that \({\text {Sel}}_{\mathfrak {P}}(K,A)^- = 0\) (resp. \({\text {Sel}}_{\mathfrak {P}}(K,A)^+ = \langle \delta \overline{y_K} \rangle \)).
The next subsection will show us how to compute the pairing \([\cdot ,\cdot ]_S\).
1.3 Kolyvagin primes
Definition 5
\(\bullet \) A Kolyvagin prime \(\ell \) is a prime number such that:
(a) \(\ell \) does not divide \(D_K M p\) (or the conductor of \({{\mathcal {O}}}\)), so is unramified in L.
(b) The conjugacy class of \((\ell ,L/{\mathbb {Q}})\) is the one of \(\tau \) in \({\text {Gal}}(L/{\mathbb {Q}})\). In particular, \(\ell {{\mathcal {O}}}_K =: \lambda _\ell \) is inert over \(\ell \). We will often shorten it to \(\lambda \) if \(\ell \) is nonambiguous, and for any extension \(K'\) of K, \(\lambda _{K'}\) will be a choice of prime ideal of \({{\mathcal {O}}}_{K'}\) above \(\lambda \) (in a consistent fashion if multiple extensions are considered).
\(\bullet \) A Kolyvagin number n is a squarefree product of Kolyvagin primes \(\ell \).
In the same fashion as in [28, (3.3)], Kolyvagin primes have many strong properties.
Proposition 10
For a Kolyvagin prime \(\ell \), \(\lambda \) splits completely in L. Furthermore:
in \({{\mathcal {O}}}\), and all the points of \(A[{\mathfrak {P}}]\) are defined over \(K_{\lambda }\). Moreover, the two eigenspaces \((A(K_\lambda )/ {\mathfrak {P}}A(K_\lambda ))^\pm \) for the action of \({\text {Frob}}(\ell )\) are of dimension 1 over \(\mathbb {F}_p\).
Proof
Up to conjugation, \((\lambda _L,L/K) = (\lambda _L,L/{\mathbb {Q}})^{f(\lambda /\ell )} = \tau ^2 = {\text {Id}}\) so \(\lambda _L/\lambda \) is totally split. Now, by Eichler-Shimura theory [44, formula (2.1.8)], the characteristic polynomial of the Frobenius endomorphism \({\text {Frob}}(\ell )\) on the reduction \({\widetilde{A}}\) of A modulo \(\ell \) (as an \({{\mathcal {O}}}\)-linear endomorphism) is \(X^2 - a_\ell (f) X + \ell \) and that of complex conjugation is \(X^2-1\), and they must agree on \({\widetilde{A}}[p]\). In particular, \({\text {Frob}}(\ell )^2\) acts trivially on \(A[{\mathfrak {P}}]\) so \({\widetilde{A}}[{\mathfrak {P}}] = {\widetilde{A}}[{\mathfrak {P}}] (\mathbb {F}_\lambda )\) and we can lift those points to \(K_\lambda \). By the same arguments, one also has the decomposition
in two nontrivial spaces, given the characteristic polynomial of \({\text {Frob}}(\ell )\), so each of the two spaces on the right-hand side is of dimension 1 over \(\mathbb {F}_p\). We deduce immediately by the structure of finite abelian groups that as groups,
which proves that each \(({\widetilde{A}}(\mathbb {F}_\lambda )/{\mathfrak {P}}{\widetilde{A}}(\mathbb {F}_\lambda ))^{\varepsilon }\) must be of dimension 1 over \(\mathbb {F}_p\), and this also lifts to \(K_\lambda \) (without increasing the dimension as the group of elements reducing to 0 modulo \(\lambda \) is p-divisible). \(\square \)
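Comparing the coefficients of the two characteristic polynomials \(X^2 - a_\ell (f) X + \ell \) and \(X^2-1\) in the proof above gives \(a_\ell (f) \equiv 0\) and \(\ell \equiv -1\) modulo each \({\mathfrak {P}}\). Together with the inertness of \(\ell \) in K, these are necessary conditions for \(\ell \) to be a Kolyvagin prime, and they are easy to test numerically. The sketch below is not the full Frobenius condition of Definition 5; it assumes for simplicity that \(a_\ell (f)\) is a rational integer supplied by the caller (as when \(\dim A = 1\)), that \(\ell \) is already known to be prime, and the function names are hypothetical.

```python
def is_inert(ell, D):
    """True when the odd prime ell (not dividing D) is inert in Q(sqrt(D)).

    Uses Euler's criterion: D is a non-residue mod ell iff D^((ell-1)/2) = -1 mod ell.
    """
    return pow(D % ell, (ell - 1) // 2, ell) == ell - 1


def necessary_kolyvagin_congruences(ell, p, D, M, a_ell):
    """Necessary (not sufficient) conditions for the prime ell to be a Kolyvagin prime.

    D is the discriminant D_K, M the level, p the auxiliary prime and a_ell the
    (assumed integral) Hecke eigenvalue at ell.
    """
    if ell == 2 or (D * M * p) % ell == 0:
        return False  # ell must not divide D_K * M * p
    return is_inert(ell, D) and (ell + 1) % p == 0 and a_ell % p == 0
```

For instance, with \(p=3\), \(D_K=-7\) and \(M=11\), the prime \(\ell =5\) is inert in K and satisfies \(\ell +1 \equiv 0\) modulo 3, so it passes the test exactly when the supplied value of \(a_5\) is divisible by 3.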
To state the next result, recall that for a finite place \(v \not \mid p\) of good reduction of A, the image of \(A(K_v)/pA(K_v)\) in \(H^1(K_v,A[p])\) is precisely the inflation of \(H^1(K_v^{\text {unr}}/K_v,A[p])\), called the unramified part. The latter is isomorphic to A[p] when all the p-torsion is defined over \(K_v\), via the evaluation of the cocycles at \({\text {Frob}}(v)\) the topological generator of \({\text {Gal}}(K_v^{\text {unr}}/K_v)\). The same argument translates for \(A[{\mathfrak {P}}]\) by tensoring by \({{\mathcal {O}}}/{\mathfrak {P}}\) again.
Proposition 11
Let \({{\mathcal {L}}}\) be an unramified prime ideal of \(L_S\) whose Frobenius in \({\text {Gal}}(L_S/{\mathbb {Q}})\) is \(\tau h\) for \(h \in H_S\). It is above a Kolyvagin prime \(\ell \) and for every \(s \in S\) whose localisation at \(\lambda = \ell {{\mathcal {O}}}_K\) is unramified,
through the identification described above, as all \(A[{\mathfrak {P}}]\) is defined over \(K_\lambda \).
Proof
By hypothesis, \(({{\mathcal {L}}},L_S/{\mathbb {Q}})_{|L} = \tau \) so \(\lambda _L = {{\mathcal {L}}}\cap {{\mathcal {O}}}_L\) is indeed above a Kolyvagin prime \(\ell \). On the other hand, \(({{\mathcal {L}}},L_S/L)=({{\mathcal {L}}},L_S/{\mathbb {Q}})^2 = (\tau h)^2\) as the inertia degree does not change between K and \(L_S\). Now, the diagram

is clearly commutative, which establishes the equality by definition. \(\square \)
Remark 13
The set of all \((\tau h)^2\) thus obtained is exactly \(H_S^+\), by the Cebotarev density theorem.
Now, for any place v of K, we can construct ([63, section 2]) a canonical bilinear pairing obtained from Tate duality
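The group-theoretic part of this remark is the identity \((\tau h)^2 = (\tau h \tau ^{-1})\, h\), which in additive notation on the \(\mathbb {F}_p\)-vector space \(H_S\) reads \(h + \tau \cdot h \in H_S^+\); as p is odd, every element of \(H_S^+\) is of this form. The following toy check is only a sketch under explicit assumptions: \(H_S\) is modelled as \(\mathbb {F}_p^2\) with \(\tau \) acting by \((a,b) \mapsto (a,-b)\), so that \(H_S^+\) is the first coordinate axis, and \({\text {Gal}}(L_S/L)\) extended by \(\tau \) is modelled as the semi-direct product of \(H_S\) by \(\{1,\tau \}\). None of these choices come from the paper.

```python
from itertools import product


def tau(h, p):
    """Assumed action of complex conjugation on the toy model H = F_p^2."""
    a, b = h
    return (a % p, (-b) % p)


def multiply(x, y, p):
    """Product in the semi-direct product of H by {1, tau}; elements are pairs (h, e)."""
    (h1, e1), (h2, e2) = x, y
    twisted = tau(h2, p) if e1 else h2  # tau acts on H by conjugation
    return (tuple((u + v) % p for u, v in zip(h1, twisted)), (e1 + e2) % 2)


def squares_of_tau_h(p):
    """Set of all (tau h)^2 for h in H; each square lies in H."""
    tau_elt = ((0, 0), 1)
    squares = set()
    for h in product(range(p), repeat=2):
        th = multiply(tau_elt, (h, 0), p)
        sq, e = multiply(th, th, p)
        assert e == 0  # the square has trivial tau-component
        squares.add(sq)
    return squares
```

In this model `squares_of_tau_h(p)` is the set of vectors \((a,0)\), that is, exactly the \(+\) eigenspace, for every odd p.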
The key use of Tate duality is the following Proposition, which is a slight generalisation of [29, Proposition 8.2].
Proposition 12
If for a prime \(\lambda \) of K (above a Kolyvagin prime) and a \(\gamma \in H^1(K,A)^{\varepsilon }[{\mathfrak {P}}]\), one has \({\text {loc}}_v \gamma = 0\) for all \(v \ne \lambda \) but \({\text {loc}}_\lambda \gamma \ne 0\), then for every \(s \in {\text {Sel}}_{\mathfrak {P}}(K,A)^{\varepsilon }\), \({\text {loc}}_\lambda s=0\).
Proof
By its definition, (53) comes from the Weil pairing in the sense that the latter induces a cup product
for which \(\delta _v(A(K_v)/pA(K_v))\) is isotropic, and the resulting quotiented pairing is exactly \(\langle \cdot ,\cdot \rangle _{K_v}\). Now, the so-called global Tate duality states that for any \(s \in {\text {Sel}}_p(K,A)\), \(\gamma \in H^1(K,A)[p]\),
where \({\text {inv}}_v : {\text {Br}}(K_v) \rightarrow {\mathbb {Q}}/{\mathbb {Z}}\) is the Brauer invariant isomorphism for all v. Indeed, let us lift \(\gamma \) to \({\widetilde{\gamma }} \in H^1(K,A[p])\), so that for every \(v \in M_K\),
with the analogous definition of \((\cdot ,\cdot )_{K}\) on K, and \({\text {loc}}_{v,\mathrm {Br}}: {\text {Br}}(K) \rightarrow {\text {Br}}(K_v)\) the usual localisation. Now, by properties of Brauer groups, the sum of \({\text {inv}}_v \circ {\text {loc}}_v\) is 0 on \({\text {Br}}(K)\) hence the formula.
Under our assumptions on \(\gamma \) and s, we thus have \( \langle \delta _\lambda ^{-1} {\text {loc}}_\lambda s, {\text {loc}}_\lambda \gamma \rangle _{K_\lambda } = 0\) but \({\text {loc}}_\lambda \gamma \ne 0\), let us show how this implies that \({\text {loc}}_\lambda s =0\).
By the original arguments of [63], the pairing \(\langle \cdot , \cdot \rangle _{K_\lambda }\) is a perfect pairing. Being inherited from the Weil pairing, the \({\mathfrak {P}}\) and \({\mathfrak {P}}'\)-parts for \({\mathfrak {P}}\ne {\mathfrak {P}}'\) are orthogonal, so it induces a duality
Now, it is also invariant by \({\text {Gal}}(K_\lambda /{\mathbb {Q}}_\ell )\)-action (there is a difference with the Weil pairing here, but it is also inherited from the cup product \((\cdot ,\cdot )_{K_\lambda }\)), so the \(+\) and \(-\) spaces on each side are orthogonal. We thus have for \(\varepsilon =\pm 1\) a duality
but making use of the fact that \(\lambda \) is above a Kolyvagin prime, each space of the duality is thus of dimension 1 over \(\mathbb {F}_p\) (Proposition 10), and so the pairing can be 0 only if one of the terms is 0, hence \({\text {loc}}_\lambda s=0\). \(\square \)
1.4 Construction of the Kolyvagin classes
Following [44], one takes the classes \([{\mathfrak {a}}]\) and prime ideal \({\mathfrak {n}}\) induced by the choices made in (48) on orders of \({{\mathcal {O}}}_K\), and for any Kolyvagin number n, we get Heegner points
where by class field theory, \(K_n\) is the ring class field of conductor n (\(K_1=H\)).
The notation \(\lambda _{n,\ell }\) will refer to a choice of prime ideal of \(K_n\) above \(\ell \) a Kolyvagin prime, consistent in case of towers of extensions, shortened to \(\lambda _n\) if there is no doubt on \(\ell \). One has that \(G_n := {\text {Gal}}(K_n/K_1) \cong ({{\mathcal {O}}}_K/n {{\mathcal {O}}}_K)^* / ({\mathbb {Z}}/n{\mathbb {Z}})^*\) and the following diagrams for \(n=\ell m\) by class field theory.

In particular, \(\mathbb {F}_{\lambda _n} = \mathbb {F}_{\lambda _m}= \mathbb {F}_{\lambda }\), a fact which will be ubiquitous and used without further mention in the end of the argument.
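The isomorphism \(G_n \cong ({{\mathcal {O}}}_K/n {{\mathcal {O}}}_K)^* / ({\mathbb {Z}}/n{\mathbb {Z}})^*\) makes the order of \(G_n\) easy to compute: by the Chinese remainder theorem it is the product over \(\ell | n\) of \((\ell ^2-1)/(\ell -1) = \ell +1\), since each \(\ell \) is inert in K. The sketch below checks this by brute force. It models \({{\mathcal {O}}}_K/\ell {{\mathcal {O}}}_K\) as \(\mathbb {F}_\ell [t]/(t^2-D_K)\), which is legitimate for \(\ell \) odd, and the function names are hypothetical.

```python
def units_mod_inert_prime(ell, D):
    """Count the units of O_K / ell O_K, modelled as F_ell[t]/(t^2 - D) for ell odd.

    An element a + b t is a unit iff its norm a^2 - D b^2 is nonzero mod ell.
    """
    return sum(
        1
        for a in range(ell)
        for b in range(ell)
        if (a * a - D * b * b) % ell != 0
    )


def order_G_n(kolyvagin_primes, D):
    """Order of (O_K/n)^* / (Z/n)^* for n the product of the given inert primes."""
    order = 1
    for ell in kolyvagin_primes:
        order *= units_mod_inert_prime(ell, D) // (ell - 1)
    return order
```

For \(D_K=-7\), the primes 5 and 13 are inert in K, and `order_G_n([5, 13], -7)` gives \(6 \cdot 14 = 84\).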
The crucial properties of these points (making them a Kolyvagin system) are the following, \({\widetilde{A}}\) denoting the (good) reduction of A modulo \(\ell \) and \({\text {Frob}}(\ell )\) the associated Frobenius endomorphism on \({\widetilde{A}}\).
Proposition 13
For \(n = \ell m\) a Kolyvagin number,
for some \(\sigma \in {{\mathcal {G}}}_n := {\text {Gal}}(K_n/K)\).
Proof
By classical properties of Heegner points ( [28], paragraphs 4 and 5) and class field theory for \(K_n/K_m\),
as divisors on \(X_0(N)\), which proves (55) when combined with (43). We obtain (57) with the same properties.
Looking at the diagrams (54), as \(\lambda _n/\lambda _m\) is totally ramified, the reduction of the left-hand side of (58) is \((\ell +1) x_n \, {\text {mod}} \,\lambda _n\), and the one of the right-hand side has one term equal to \({\text {Frob}}(\ell ) x_m\) by the Eichler-Shimura relation \(T_\ell = {\text {Frob}}(\ell ) + \widehat{{\text {Frob}}(\ell )}\), so there exists \(\sigma \in {\text {Gal}}(K_n/K_m)\) such that the reduction of \(\sigma x_n\) is \({\text {Frob}}(\ell ) \widetilde{x_m}\), but every \(\sigma \) reduces to the identity on \({\widetilde{A}}(\mathbb {F}_\lambda )\) so the equality is true term by term hence (56). See also [44, Corollaries 2.3.3 and 2.3.4] for the \(n=\ell \) case. \(\square \)
Proposition 14
For every Kolyvagin number n, one can define in successive order (using the Heegner points \(y_m\) for m|n):
\(\bullet \) A point \(P_n \in A(K_n)\) whose class \([P_n] \in A(K_n)/p A(K_n)\) is fixed by \({{\mathcal {G}}}_n\) (and \(P_1=y_K\)).
\(\bullet \) The unique class \(c(n) \in H^1(K,A[p])\) whose restriction to \(H^1(K_n,A[p])^{{{\mathcal {G}}}_n}\) comes from \([P_n]\), and its image d(n) in \(H^1(K,A)[p]\). They correspond to one another in the following commutative diagram with exact rows and columns

Proof
The construction and properties of \(P_n\) proceed exactly as in ([29], (3.5) to (4.1)). The only nontrivial thing to prove (to define c(n) from \([P_n]\)) is that the central row of (59) is an isomorphism. The extension \(K_n/{\mathbb {Q}}\) is unramified outside primes dividing \(D_K n\), and the extension \({\mathbb {Q}}(A[p])/{\mathbb {Q}}\) is unramified outside primes dividing Mp, so as \(D_Kn\) and pM are coprime by construction, these extensions are linearly disjoint. In particular, \(K_n(A[p])/K_n\) has Galois group isomorphic to \({\text {Gal}}({\mathbb {Q}}(A[p])/{\mathbb {Q}})\) and thus has no fixed point in A[p] by Proposition 9 (a). The isomorphism follows by [29, (4.2)]. \(\square \)
These points enjoy a wealth of very strong properties detailed below.
Proposition 15
For every Kolyvagin number n:
(a) \([P_n]\) (resp. c(n), d(n)) lives in the \(\mu (n)\)-eigenspace of \(A(K_n)/pA(K_n)\) (resp. \(H^1(K,A[p])\), \(H^1(K,A)[p]\)), where \(\mu (n)\) is the Moebius function.
(b) The class \(c(n)_{\mathfrak {P}}\in H^1(K,A[{\mathfrak {P}}])\) (resp. \(d(n)_{\mathfrak {P}}\in H^1(K,A)[{\mathfrak {P}}]\)) is trivial if and only if \(P_n \in {\mathfrak {P}}A(K_n)\) (resp. \({\mathfrak {P}}A(K_n) + A(K)^{\mu (n)}\)).
(c) For every place v of K, the class \({\text {loc}}_v d(n)\) is trivial except if v|n.
(d) If \(n=\ell m\) and \(\lambda = \ell {{\mathcal {O}}}_K\), the class \({\text {loc}}_{\lambda } d(n)_{\mathfrak {P}}\) is trivial if and only if \(P_m \in {\mathfrak {P}}A(K_{\lambda _m})\) if and only if \({\text {loc}}_\lambda c(m)_{\mathfrak {P}}= 0\).
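To make the sign in part (a) concrete: assuming, as is standard, that a Kolyvagin number is a squarefree product of Kolyvagin primes, \(\mu (n)\) is simply \((-1)^k\) where k is the number of prime factors of n. The following minimal sketch computes \(\mu (n)\) by trial division; the function name and the sample values are illustrative only and do not come from the paper.

```python
def mobius(n: int) -> int:
    """Moebius function: 0 if n has a square factor, else (-1)^(number of primes)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sign = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # d^2 divided the original n
            sign = -sign  # one more distinct prime factor
        d += 1
    if n > 1:
        sign = -sign  # the remaining cofactor is a prime
    return sign


if __name__ == "__main__":
    # n = 1, a single prime, and a product of two primes (sample values only).
    print(mobius(1), mobius(7), mobius(7 * 11))
```

In particular the sign alternates each time a new prime is adjoined to n, which is the pattern exploited when passing from \(c(\ell )\) to \(d(\ell \ell ')\) at the end of the argument.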
Proof
(a) for \([P_n]\) is inherited from (57) by the construction of \(P_n\) (see Proposition 5.4 of [29]), and deduced for c(n), d(n) by \(\tau \)-equivariance of the morphisms of (59).
(b) is obtained by tensoring (59) by \({{\mathcal {O}}}/{\mathfrak {P}}\), which preserves exactness by flatness, and by observing that \([P_n]\) seen in \(A(K_n)/pA(K_n) \otimes {{\mathcal {O}}}/{\mathfrak {P}}\) is exactly the image of \(P_n\) in \(A(K_n)/{\mathfrak {P}}A(K_n)\). The proof of (c) is given by Proposition 6.2 of [29].
For (d), define \(D={\text {Gal}}((K_n)_{\lambda _n}/K_\lambda )\), which is cyclic generated by some \(\sigma _\ell \). We thus have injective arrows (defined below)
where, for a cocycle \(c \in Z^1(D, A)\), \({\text {red}}(c) = c(\sigma _\ell ) \, {\text {mod}} \,\lambda _n\); this is invariant up to coboundary because \(K_n/K_m\) is totally ramified at \(\lambda _m\), so \({\text {red}}\) is well-defined. As \(A^1((K_n)_{\lambda _n})\) is a pro-\(\ell \)-group, \(H^1(D,A^1)[p]=0\), which proves that \({\text {red}}\) is injective. The map \(\iota \) is the quotiented connecting homomorphism, automatically injective. As \({\widetilde{A}}(\mathbb {F}_\lambda )\) is a finite abelian group, the orders of \({\widetilde{A}}(\mathbb {F}_\lambda )[p]\) and \( {\widetilde{A}}(\mathbb {F}_\lambda )/p {\widetilde{A}}(\mathbb {F}_\lambda ) \) are readily seen to be equal, so \(\iota \) is also an isomorphism. By [29, Proposition 6.2(2)], the image of \({\text {loc}}_\lambda d(n)\) in \({\widetilde{A}}(\mathbb {F}_\lambda )[p]\) by \({\text {red}}\) is
where \(\widetilde{R_m}\) is any choice of p-th root of \(\widetilde{P_m}\) in \({\widetilde{A}}\). By the proof of Proposition 10, its image by \({\text {Frob}}(\ell )\) is then
but the injection \(\iota \) from (60) is explicitly given by taking a p-th root and applying \(({\text {Frob}}(\ell )^2 - {\text {Id}})\), as \({\text {Frob}}(\ell )^2 = {\text {Frob}}(\lambda )\) (see [44, Lemma 3.4.2] for details). The image of \({\text {loc}}_\lambda d(n)\) in \({\widetilde{A}}(\mathbb {F}_\lambda )/p {\widetilde{A}}(\mathbb {F}_\lambda )\) via (60) is thus exactly \(-{\text {Frob}}(\ell )^{-1} \cdot \widetilde{P_m}\), and its \({\mathfrak {P}}\)-part is trivial if and only if the \({\mathfrak {P}}\)-part of \(\widetilde{P_m}\) is. Finally, \(A^1(K_{\lambda _m})\) is p-divisible, which gives the isomorphism of \({{\mathcal {O}}}/(p)\)-modules \(A(K_{\lambda _m})/pA(K_{\lambda _m}) \cong {\widetilde{A}}(\mathbb {F}_\lambda )/p{\widetilde{A}}(\mathbb {F}_\lambda )\). Therefore \({\text {loc}}_\lambda d(n)_{\mathfrak {P}}\) is trivial if and only if \([P_m] \in A(K_{\lambda _m})/pA(K_{\lambda _m})[{\mathfrak {P}}]\), which is equivalent to \(P_m \in {\mathfrak {P}}A(K_{\lambda _m})\); the equivalence in terms of c(m) is straightforward. \(\square \)
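The counting step used in the proof of (d), namely that \(G[p]\) and \(G/pG\) have the same order for a finite abelian group G, follows from the exact sequence \(0 \rightarrow G[p] \rightarrow G \rightarrow G \rightarrow G/pG \rightarrow 0\) in which the middle map is multiplication by p. The following brute-force sketch checks this on a small example. Here G is modelled as a product of cyclic groups \({\mathbb {Z}}/n_1 \times \cdots \times {\mathbb {Z}}/n_r\); the specific orders are hypothetical and are not computed from any actual \({\widetilde{A}}(\mathbb {F}_\lambda )\).

```python
from itertools import product


def p_torsion_order(orders, p):
    """Count the elements g of G = Z/n_1 x ... x Z/n_r with p*g = 0."""
    return sum(
        1
        for g in product(*(range(n) for n in orders))
        if all((p * gi) % n == 0 for gi, n in zip(g, orders))
    )


def p_cokernel_order(orders, p):
    """Order of G/pG, computed as #G divided by the size of the image pG."""
    image = {
        tuple((p * gi) % n for gi, n in zip(g, orders))
        for g in product(*(range(n) for n in orders))
    }
    group_order = 1
    for n in orders:
        group_order *= n
    return group_order // len(image)


if __name__ == "__main__":
    # Hypothetical group Z/4 x Z/6 x Z/9 and prime p = 3.
    orders, p = (4, 6, 9), 3
    print(p_torsion_order(orders, p), p_cokernel_order(orders, p))
```

Equality of the two orders is what upgrades the injection \(\iota \) to an isomorphism in the argument above.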
1.5 End of the proof
Let \(S = {\text {Sel}}_{\mathfrak {P}}(K,A)\). By (51), \(P_1 = y_K \notin {\mathfrak {P}}A(K)\), hence it defines a nonzero \(s_K:=c(1) \in S^+\) (Proposition 15 (a)). Fixing \(s \in S\), for every \(h \in H_S\), by the Chebotarev density theorem, there is a prime ideal \({{\mathcal {L}}}\) such that \(({{\mathcal {L}}},L_S/{\mathbb {Q}}) = \tau h\), and by Proposition 11,
where \(\lambda \) is the prime ideal of K below \({{\mathcal {L}}}\), and above \(\ell \) which is a Kolyvagin prime. Outside of \(I_S^+\) (defined as the \(+\)-part of the orthogonal of \(s_K\)), this formula proves that \({\text {loc}}_\lambda s_K \ne 0\), so \({\text {loc}}_\lambda d(\ell )_{\mathfrak {P}}\ne 0\) and all other localisations of \(d(\ell )_{\mathfrak {P}}\) are trivial by Proposition 15. By Proposition 12, if \(s \in S^-\), \( {\text {loc}}_\lambda s = 0\) so \([s,(\tau h)^2]_S = 0\), hence \(S^-=0\) by Lemma 23.
Now, consider \(s \in S^+\) such that for some \({{\mathcal {L}}}\) as above (fixed, so it fixes \(\lambda \) and h above), \({\text {loc}}_\lambda s=0\). We have \({\text {loc}}_\lambda s_K \ne 0\) by hypothesis on h, so in turn \({\text {loc}}_\lambda d(\ell )_{\mathfrak {P}}\ne 0\) by Proposition 15 (d) and \(c(\ell )_{\mathfrak {P}}\) does not belong to S. By the perfect pairing result of Lemma 22 applied to \(\langle S,c(\ell ) \rangle \) if \((\tau h)^2 \notin I_S^+\), the extensions \(L_S\) and \(L_{\langle c(\ell ) \rangle }\) are linearly disjoint over L, which allows us, for any \(h' \in H_S\), to choose \({{\mathcal {L}}}'\) a prime ideal of \(L_S L_{\langle c(\ell ) \rangle }\) whose Frobenius restricted to \(L_S\) is \(\tau h'\) and whose Frobenius restricted to \(L_{\langle c(\ell ) \rangle }\) is of the shape \(\tau h_0\) and not orthogonal to \(c(\ell )_{\mathfrak {P}}\). Denoting by \(\ell '\) the corresponding Kolyvagin prime and by \(\lambda '\) the ideal of \({{\mathcal {O}}}_K\), we thus have
this formula being legitimate because \({\text {loc}}_{\lambda '}(d(\ell )_{\mathfrak {P}}) = 0\) by Proposition 15 (c). All this proves that \( {\text {loc}}_{\lambda '} c(\ell )_{\mathfrak {P}}\ne 0\), so \({\text {loc}}_{\lambda '} d(\ell \ell ')_{\mathfrak {P}}\ne 0\) by Proposition 15 (d), and it belongs to \(H^1(K,A)^+[{\mathfrak {P}}]\). Now, for our s above, the global Tate duality between s and \(d(\ell \ell ')\) in the proof of Proposition 12 has two possible nonzero terms (at \(\lambda \) and \(\lambda '\)), but by hypothesis \({\text {loc}}_\lambda s=0\), so the \(\lambda '\)-term is the only one remaining and must therefore vanish as well. This implies by Proposition 12 that \({\text {loc}}_{\lambda '} s = 0\) for all such \(\lambda '\), therefore \(s=0\) in this case by Lemma 23.
Finally, for \(s \in S^+\), as \({\text {loc}}_\lambda s_K \ne 0\) and the space \((A(K_\lambda )/{\mathfrak {P}}A(K_\lambda ))^+\) is one-dimensional (Proposition 10), there is \(k \in {\mathbb {Z}}\) such that \(s - k s_K\) satisfies the previous hypothesis and then \(s=k s_K\), so we have proved that \(S^+ = \langle s_K \rangle \).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Dogra, N., Le Fourn, S. Quadratic Chabauty for modular curves and modular forms of rank one. Math. Ann. 380, 393–448 (2021). https://doi.org/10.1007/s00208-020-02112-3
Mathematics Subject Classification
- 11G18
- 14G05
- 11G30