In this chapter, we revisit classes of quaternion ideals: organizing ideals of given norm in terms of their classes, we find modular forms.

1 \(\triangleright \) Brandt matrices, neighbors, and modular forms

Let B be a quaternion algebra over \(\mathbb {Q}\). A major theme of this text has been the study of classes of quaternion ideals, beginning with chapter 17. When B is indefinite, we saw (Theorem 17.8.3, treated broadly in chapter 28) that strong approximation applies: via the reduced norm, very often the conclusion is that the class set is trivial.

We are left with the case that B is definite. By the geometry of numbers (see section 17.5) we found that the number of ideal classes of an order is finite, generated by ideals of small reduced norm. (Studying the zeta function we found a mass formula in chapter 25, and then studying quadratic embeddings we found a class number formula in section 30.8.) We now pursue this further: there is an exquisite arithmetic and combinatorial structure to be found by counting right ideals of given norm by their classes as follows. We begin in this section by an introduction and survey (working over \(\mathbb {Q}\)).

Let \(\mathcal {O}\subset B\) be an order. Let \({{\,\mathrm{Cls}\,}}\mathcal {O}\) be the right class set of \(\mathcal {O}\), keeping track of the isomorphism classes of invertible right \(\mathcal {O}\)-ideals in B. Let \(h :=\#{{\,\mathrm{Cls}\,}}\mathcal {O}\) be the (right) class number of \(\mathcal {O}\), and let \(I_1,\dots ,I_h \subseteq B\) represent the distinct classes in \({{\,\mathrm{Cls}\,}}\mathcal {O}\).

Let \(n \in \mathbb Z _{\ge 1}\). We define an \(h \times h\)-matrix \(T(n) \in {{\,\mathrm{M}\,}}_h(\mathbb Z )\) with nonnegative integer entries, called the n-Brandt matrix, by

$$\begin{aligned} \begin{aligned} T(n)_{ij}&:=\#\{J \subset I_j : {{\,\mathrm{nrd}\,}}(J)=n{{\,\mathrm{nrd}\,}}(I_j) \text { and } [J]=[I_i]\} \\&= \#\{J \subset I_j : [I_j:J]=n^2 \text { and } [J]=[I_i]\}. \end{aligned} \end{aligned}$$
(41.1.1)

(The notation T(n) deliberately overloads that of the Hecke operators defined in section 40.5: keep reading to see why!) The Brandt matrix T(n) depends on \(\mathcal {O}\), but for brevity we do not include this in the notation. In the jth column of the Brandt matrix T(n), we look at the subideals of \(I_j\) with index \(n^2\) and count them in the ijth entry according to the class \([I_i]\) they belong to. If \(n=p\) is prime and \(p \not \mid N = {{\,\mathrm{disc}\,}}\mathcal {O}\), then there are exactly \(p+1\) such ideals, so the sum of the entries in every column in T(p) is equal to \(p+1\).

Example 41.1.2

We continue with Example 17.6.3. We have \(B=\displaystyle {\biggl (\frac{-1,-23}{\mathbb {Q}}\biggr )}\) of discriminant 23 and a maximal order \(\mathcal {O}\) with three ideal classes \([I_1],[I_2],[I_3]\). In (17.6.5), we found three ideals in \(I_1=\mathcal {O}\): two belong to the class \([I_2]\) and the third is principal, belonging to \([I_1]\). This gives the first column of the matrix as \((1,2,0)^{\textsf {t} }\). Computing further, we find

$$\begin{aligned} T(2) = \begin{pmatrix} 1 &{} 1 &{} 0 \\ 2 &{} 1 &{} 3 \\ 0 &{} 1 &{} 0 \end{pmatrix}. \end{aligned}$$

In a similar manner, we compute

$$\begin{aligned} T(3) = \begin{pmatrix} 0 &{} 1 &{} 3 \\ 2 &{} 3 &{} 0 \\ 2 &{} 0 &{} 1 \end{pmatrix}, \quad T(101) = \begin{pmatrix} 30 &{} 28 &{} 24 \\ 56 &{} 54 &{} 60 \\ 16 &{} 20 &{} 18 \end{pmatrix}. \end{aligned}$$
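The column-sum property is easy to confirm by machine. The following is a minimal sketch that copies the three matrices above; the names `BRANDT` and `column_sums` are our own and purely illustrative.

```python
# Brandt matrices from Example 41.1.2 (discriminant 23), copied from the text.
BRANDT = {
    2: [[1, 1, 0], [2, 1, 3], [0, 1, 0]],
    3: [[0, 1, 3], [2, 3, 0], [2, 0, 1]],
    101: [[30, 28, 24], [56, 54, 60], [16, 20, 18]],
}

def column_sums(matrix):
    """Return the column sums of a square matrix given as a list of rows."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix))]

for p, T in BRANDT.items():
    # Column j collects the p + 1 subideals of I_j of index p^2, sorted by class.
    print(p, column_sums(T), "expected", p + 1)
```

Each column should sum to \(p+1\), since none of the primes 2, 3, 101 divides 23.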

41.1.3

There is a second and computationally more efficient way to define the Brandt matrix using representation numbers of quadratic forms. Let \(q_i={{\,\mathrm{nrd}\,}}(I_i)\), let \(\mathcal {O}_i=\mathcal {O}{}_{\textsf {\tiny {L}} }(I_i)\), and let \(w_i=\#\mathcal {O}_i^\times /\{\pm 1\}<\infty \). Then

$$\begin{aligned} T(n)_{ij} = \frac{1}{2w_i} \#\{\alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )q_i/q_j = n \}: \end{aligned}$$

indeed, \(\alpha I_i = J \subseteq I_j\) with \({{\,\mathrm{nrd}\,}}(J)=n{{\,\mathrm{nrd}\,}}(I_j)\) if and only if \(\alpha \in I_jI_i^{-1}=(I_j:I_i){}_{\textsf {\tiny {L}} }\) and \({{\,\mathrm{nrd}\,}}(\alpha )q_i=nq_j\), and \(\alpha \) is well defined up to right multiplication by \(\mu \in \mathcal {O}_i^\times \). Now

$$\begin{aligned} \begin{aligned} Q_{ij} :I_jI_i^{-1}&\rightarrow \mathbb Z \\ Q_{ij}(\alpha )&= {{\,\mathrm{nrd}\,}}(\alpha )\frac{q_i}{q_j} \end{aligned} \end{aligned}$$
(41.1.4)

is a positive definite quadratic form, so it suffices to enumerate lattice points!

Example 41.1.5

Returning to our example, we have

$$\begin{aligned} T(n)_{ii} = \frac{1}{2w_i} \#\{\gamma \in \mathcal {O}_i : {{\,\mathrm{nrd}\,}}(\gamma )=n\}. \end{aligned}$$

For \(i=1\), we have \(w_1=2\) and with \(\gamma =t+x\alpha + y\beta + z\alpha \beta \) and \(t,x,y,z \in \mathbb Z \), by (17.6.6)

$$\begin{aligned} {{\,\mathrm{nrd}\,}}(\gamma )=t^2+ty+x^2+xz+6y^2+6z^2 \end{aligned}$$

so \(T(n)_{11}\) counts one quarter of the number of representations of n by this positive definite quaternary quadratic form (since \(2w_1=4\)).
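To make the lattice point count concrete, here is a brute-force sketch for the entry \(T(n)_{11}\) in this example, using \(T(n)_{11}=\frac{1}{2w_1}\#\{\gamma : {{\,\mathrm{nrd}\,}}(\gamma )=n\}\) with \(w_1=2\). The function names and the search bounds are our own choices; a serious implementation would use a proper short-vector enumeration instead of a box search.

```python
from math import isqrt

def nrd(t, x, y, z):
    """The reduced norm form of the example: t^2 + ty + x^2 + xz + 6y^2 + 6z^2."""
    return t * t + t * y + x * x + x * z + 6 * y * y + 6 * z * z

def brandt_entry_11(n, w1=2):
    """Count lattice points with nrd = n and divide by 2 * w1."""
    # Completing the square: t^2 + ty + 6y^2 = (t + y/2)^2 + (23/4) y^2,
    # so |y| <= sqrt(4n/23) and |t| <= sqrt(n) + |y|/2; likewise for (x, z).
    bound_yz = isqrt(4 * n // 23) + 1
    bound_tx = isqrt(n) + bound_yz + 1
    count = 0
    for y in range(-bound_yz, bound_yz + 1):
        for z in range(-bound_yz, bound_yz + 1):
            for t in range(-bound_tx, bound_tx + 1):
                for x in range(-bound_tx, bound_tx + 1):
                    if nrd(t, x, y, z) == n:
                        count += 1
    return count // (2 * w1)

for n in (2, 3, 101):
    print(n, brandt_entry_11(n))
```

The printed values can be compared with the upper left entries of T(2), T(3), and T(101) in Example 41.1.2.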

There is a third way to understand Brandt matrices which is visual and combinatorial.

Definition 41.1.6

Let \(I,J \subseteq \mathcal {O}\) be invertible right \(\mathcal {O}\)-ideals. We say J is an n-neighbor of I if \(J \subseteq I\) and \(n{{\,\mathrm{nrd}\,}}(I)={{\,\mathrm{nrd}\,}}(J)\).

The n-Brandt graph is the directed graph with vertices \({{\,\mathrm{Cls}\,}}\mathcal {O}\) and a directed edge from \([I_i]\) to [J] for each n-neighbor \(J \subseteq I_i\).

There is no extra content here, just a reinterpretation: the n-Brandt matrix is simply the adjacency matrix of the n-Brandt graph.
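As a small illustration of this reinterpretation, the sketch below turns a Brandt matrix into a list of weighted directed edges. The helper `brandt_graph_edges` is hypothetical (our own name), and we follow the convention of (41.1.1), so that the entry \(T(n)_{ij}\) records edges from \([I_j]\) to \([I_i]\).

```python
def brandt_graph_edges(T):
    """List (source, target, multiplicity) triples for the graph encoded by T.

    T[i][j] counts the n-neighbors J of I_j with [J] = [I_i], so each such J
    contributes one edge from class j to class i. Classes are numbered 1..h.
    """
    h = len(T)
    return [
        (j + 1, i + 1, T[i][j])
        for j in range(h)
        for i in range(h)
        if T[i][j] > 0
    ]

# T(2) for discriminant 23, copied from Example 41.1.2.
T2 = [[1, 1, 0], [2, 1, 3], [0, 1, 0]]
for source, target, mult in brandt_graph_edges(T2):
    print(f"[I_{source}] -> [I_{target}] with multiplicity {mult}")
```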

Example 41.1.7

Returning a third time to our example, we have the 2-Brandt graph, as in Figure 41.1.8.

Figure 41.1.8: The 2-Brandt graph for discriminant 23

41.1.9

For \(n=p\) prime, there is another way to think of the p-Brandt graph. Consider the directed graph whose vertices are invertible right \(\mathcal {O}\)-ideals whose reduced norm is a power of p, and draw a directed edge from I to J if J is a p-neighbor of I. If \(p \not \mid N\), then this graph is a (\(p+1\))-regular directed tree, that is to say, from each vertex there are \(p+1\) directed edges. The notion of belonging to the same ideal class induces an equivalence relation on this graph, and the quotient is the p-Brandt graph.

It is helpful to think of the matrices T(n) as operators on a space, so we define the Brandt module \(M_2(\mathcal {O})\) to be the \(\mathbb {C}\)-vector space with basis \({{\,\mathrm{Cls}\,}}\mathcal {O}\) and equipped with the action of Brandt matrices T(n) for \(n \in \mathbb Z _{\ge 0}\) on the right.

The Brandt matrices have two important properties. First, they commute: by a quaternionic version of the Sun Zi theorem (CRT), if \(\gcd (m,n)=1\) then

$$\begin{aligned} T(m)T(n)=T(n)T(m). \end{aligned}$$
(41.1.10)

Informally, we might say that the process of taking m-neighbors commutes with the process of taking n-neighbors, when m and n are coprime. Second, they are self-adjoint for the pairing on \(M_2(\mathcal {O})\) given by

$$\begin{aligned} \langle [I_i],[I_j] \rangle = {\left\{ \begin{array}{ll} 1/w_i, &{} \text { if}\, i=j; \\ 0, &{} \text {else.} \end{array}\right. } \end{aligned}$$

The proof of self-adjointness is contained in the equality \(w_i T(n)_{ij} = w_j T(n)_{ji}\), and this follows from a bijection induced by the standard involution.

Therefore the matrices T(n) are semisimple (diagonalizable) and \(M_2(\mathcal {O})\) has a simultaneous basis of eigenvectors, which we call an eigenbasis. The row \(e=(1,1,\dots ,1)\) is always an eigenvector (by the sum of columns) with eigenvalue \(a_p(e)=p+1\) for \(p \not \mid N\).

Example 41.1.11

We check that \(T(2)T(3)=T(3)T(2)\) from Example 41.1.2; and \(w_1,w_2,w_3=2,1,3\), so we verify that

$$\begin{aligned} \begin{pmatrix} 2 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 3 \end{pmatrix} T(2) = \begin{pmatrix} 2 &{} 2 &{} 0 \\ 2 &{} 1 &{} 3 \\ 0 &{} 3 &{} 0 \end{pmatrix} \end{aligned}$$

is symmetric. The characteristic polynomial of T(2) is \((x-3)(x^2+x-1)\), and for T(3) it is \((x-4)(x^2-5)\). We find the eigenbasis \(e=(1,1,1)\) and \(f_{\pm }=(4,\pm \sqrt{5}-3,\mp 3\sqrt{5}+3)\), and observe the required orthogonality

$$\begin{aligned} \langle e, f_{\pm } \rangle = \frac{1}{2}\cdot 4 + (\pm \sqrt{5}-3)+\frac{1}{3}(\mp 3\sqrt{5}+3)=0. \end{aligned}$$
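The checks in this example can be reproduced with a few lines of code. The sketch below uses exact integer arithmetic for commutativity and for the symmetry \(w_iT(n)_{ij}=w_jT(n)_{ji}\), and floating point (with a tolerance) for the eigenvectors \(f_{\pm }\); all function names are our own. The eigenvalues \((\pm \sqrt{5}-1)/2\) for T(2) and \(\mp \sqrt{5}\) for T(3) are what one obtains by multiplying out the row vectors by hand.

```python
from math import isclose, sqrt

T2 = [[1, 1, 0], [2, 1, 3], [0, 1, 0]]
T3 = [[0, 1, 3], [2, 3, 0], [2, 0, 1]]
W = [2, 1, 3]  # w_1, w_2, w_3

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def row_times(v, A):
    """Right action of A on the row vector v."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def pairing(u, v):
    """The pairing with <[I_i], [I_i]> = 1 / w_i, extended bilinearly."""
    return sum(u[i] * v[i] / W[i] for i in range(len(W)))

def is_w_symmetric(T):
    """Check w_i T_ij = w_j T_ji for all i, j."""
    return all(W[i] * T[i][j] == W[j] * T[j][i]
               for i in range(len(W)) for j in range(len(W)))

def close(u, v):
    return all(isclose(a, b, abs_tol=1e-9) for a, b in zip(u, v))

def check_eigenvector(sign):
    """Check that f_sign is an eigenvector of T(2), T(3) and orthogonal to e."""
    s = sign * sqrt(5)
    f = [4, s - 3, -3 * s + 3]
    ok2 = close(row_times(f, T2), [(s - 1) / 2 * c for c in f])
    ok3 = close(row_times(f, T3), [-s * c for c in f])
    orthogonal = isclose(pairing([1, 1, 1], f), 0, abs_tol=1e-9)
    return ok2 and ok3 and orthogonal

print(matmul(T2, T3) == matmul(T3, T2))
print(is_w_symmetric(T2), is_w_symmetric(T3))
print(check_eigenvector(+1), check_eigenvector(-1))
```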

By now, hopefully the reader is convinced that the Brandt matrices capture interesting arithmetic information about the order \(\mathcal {O}\) and that they are not difficult to compute.

Now come the modular forms: the second way of viewing Brandt matrices shows that we should be thinking of a generating series for the representation numbers of the quadratic forms \(Q_{ij}\) defined in (41.1.4). As in section 40.4, we define the theta series for the quadratic form \(Q_{ij}\)

$$\begin{aligned} \Theta _{ij}(q) :=\sum _{n=0}^{\infty } T(n)_{ij} q^n = \frac{1}{2w_i} \sum _{\gamma \in I_jI_i^{-1}} q^{Q_{ij}(\gamma )} \in \mathbb Z [[q]]. \end{aligned}$$

Letting \(q :=e^{2\pi iz}\) for z in the upper half-plane, by Theorem 40.4.4 (a consequence of Poisson summation), the function \(\Theta _{ij}\) is a modular form of weight 2 for an explicit congruence subgroup of \({{\,\mathrm{SL}\,}}_2(\mathbb Z )\): for example, if \(\mathcal {O}\) is an Eichler order with reduced discriminant N, then \(\Theta _{ij}(q) \in M_2(\Gamma _0(N))\) (trivial character).

There is something enduringly magical about the fact that the entries of Brandt matrices (arithmetic) give Fourier coefficients of holomorphic modular forms (geometric, analytic).

Example 41.1.12

Returning one last time to our example, the space \(M_2(\Gamma _0(23))\) of modular forms of weight 2 and level \(\Gamma _0(23)\) has eigenbasis \(e_{23},f_+,f_-\) where

$$\begin{aligned} e_{23}(z)=\frac{11}{12}+\sum _{n=1}^{\infty } \sigma ^*(n) q^n, \qquad \sigma ^*(n)=\sum _{\begin{array}{c} d \mid n \\ 23 \not \mid d \end{array}} d \end{aligned}$$

and

$$\begin{aligned} f_{\pm }(z) :=q - \frac{\pm \sqrt{5}+1}{2} q^2 \pm \sqrt{5}q^3 + \dots \end{aligned}$$

are cusp forms matching the eigenbasis in Example 41.1.11.
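The Eisenstein part of this match can be checked numerically: the eigenvalue of \(e=(1,1,1)\) under T(n) is the common column sum of T(n), and it should agree with the coefficient \(\sigma ^*(n)\) of \(e_{23}\). The sketch below compares the two for the matrices of Example 41.1.2; `sigma_star` is our own naive helper, written for clarity rather than speed.

```python
def sigma_star(n, level=23):
    """Sum of the positive divisors d of n that are not divisible by level."""
    return sum(d for d in range(1, n + 1) if n % d == 0 and d % level != 0)

# Brandt matrices copied from Example 41.1.2.
BRANDT = {
    2: [[1, 1, 0], [2, 1, 3], [0, 1, 0]],
    3: [[0, 1, 3], [2, 3, 0], [2, 0, 1]],
    101: [[30, 28, 24], [56, 54, 60], [16, 20, 18]],
}

for n, T in BRANDT.items():
    first_column_sum = sum(row[0] for row in T)
    print(n, sigma_star(n), first_column_sum)
```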

One of the main applications of Brandt matrices is to express the trace of the Hecke operator in terms of arithmetic data, as follows.

Theorem 41.1.13

Let B be a definite quaternion algebra over \(\mathbb {Q}\) of discriminant D and \(\mathcal {O}\subseteq B\) a maximal order. For \(d<0\) a fundamental discriminant, define

$$\begin{aligned} h_{D}(d) :=\frac{h(d)}{w_d} \prod _{p \mid D}\left( 1-\biggl (\displaystyle {\frac{d}{p}}\biggr )\right) \end{aligned}$$

where h(d) is the class number of \(\mathbb {Q}(\sqrt{d})\), \(w_d\) its number of roots of unity, and \(\biggl (\displaystyle {\frac{d}{p}}\biggr )\) is the Kronecker symbol. For \(d'<0\) a discriminant with \(d'=df^2\) and d fundamental, define \(h_D(d')=h_D(d)\). Then the trace of the nth Brandt matrix associated to \(\mathcal {O}\) is

$$ {{\,\mathrm{tr}\,}}T(n) = \sum _{\begin{array}{c} t \in \mathbb Z \\ t^2<4n \end{array}} h_{D}(t^2-4n) + {\left\{ \begin{array}{ll} \varphi (D)/12, &{} \text { if}\, n\, \text {is a square;} \\ 0, &{} \text { otherwise} \end{array}\right. } $$

where \(\varphi \) is the Euler totient function.

Theorem 41.1.13 is a special case of Main Theorem 41.5.2, see Example 41.5.8. In a surprising way, it exhibits a relationship between traces of Hecke operators and (modified) class numbers of imaginary quadratic fields!

2 Brandt matrices

To begin, we define the all-important Brandt matrices in the generality considered in this text.

Let R be a global ring with eligible set \({{{\texttt {\textit{S}}}}}\subseteq {{\,\mathrm{Pl}\,}}F\), let B be an \({{{\texttt {\textit{S}}}}}\)-definite quaternion algebra over F of discriminant \({{\,\mathrm{disc}\,}}B=\mathfrak D \) and let \(\mathcal {O}\subset B\) be an R-order in B with reduced discriminant \({{\,\mathrm{discrd}\,}}\mathcal {O}=\mathfrak N \). The reader will probably have in mind the case where F is a totally real (number) field, \({{{\texttt {\textit{S}}}}}\) the set of real (archimedean) places of F, and B a definite quaternion algebra over F; but the arguments hold just as well with a larger set \({{{\texttt {\textit{S}}}}}\) of ramified places or in the function field case.

41.2.1

Let \({{\,\mathrm{Cls}\,}}\mathcal {O}\) be the right class set of \(\mathcal {O}\). By Corollary 27.6.20, the class number \(h=\#{{\,\mathrm{Cls}\,}}\mathcal {O}<\infty \) is finite. Let \(I_1,\dots ,I_h\) be a set of representative invertible right \(\mathcal {O}\)-ideals for \({{\,\mathrm{Cls}\,}}\mathcal {O}\). For \(i=1,\dots ,h\), let \(\mathcal {O}_i=\mathcal {O}{}_{\textsf {\tiny {L}} }(I_i)\) be the left order of \(I_i\); then \(\mathcal {O}_i\) depends on the choice of \(I_i\) but its isomorphism class (i.e., type) is independent of the choice of \(I_i\). (Due to the possible presence of two-sided ideals, there may be repetition of types among the orders \(\mathcal {O}_i\).)

41.2.2

Let \(\mathfrak n \subset R\) be a nonzero ideal. For each j, we consider the set of invertible right \(\mathcal {O}\)-ideals \(J \subseteq I_j\) with \({{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I_j)\), and we count them according to their class in \({{\,\mathrm{Cls}\,}}\mathcal {O}\):

$$\begin{aligned} T(\mathfrak n )_{ij} :=\#\{J \subseteq I_j : {{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I_j) \text { and }[J]=[I_i]\} \in \mathbb Z _{\ge 0}. \end{aligned}$$
(41.2.3)

A containment \(J \subseteq I_j\) of right \(\mathcal {O}\)-ideals yields a compatible product \(J'=JI_j^{-1}\) and thus an invertible right \(\mathcal {O}_j\)-ideal with reduced norm \({{\,\mathrm{nrd}\,}}(J')={{\,\mathrm{nrd}\,}}(JI_j^{-1})=\mathfrak n \), and conversely. So equivalently

$$\begin{aligned} T(\mathfrak n )_{ij} = \#\{J' \subseteq \mathcal {O}_j : {{\,\mathrm{nrd}\,}}(J')=\mathfrak n \text { and }[J'I_j]=[I_i]\}. \end{aligned}$$

(We could also rewrite \([J' I_j]=[I_i]\) in terms of classes of right ideals of \(\mathcal {O}_j\).)

Definition 41.2.4

The \(\mathfrak n \)-Brandt matrix for \(\mathcal {O}\) is the matrix \(T(\mathfrak n ) \in {{\,\mathrm{M}\,}}_h(\mathbb Z )\) whose (ij)th entry is equal to \(T(\mathfrak n )_{i,j}\).

41.2.5

To make the definition more canonical, we define

$$\begin{aligned} M_2(\mathcal {O}) :={{\,\mathrm{Map}}}({{\mathrm{Cls}\,}}\mathcal {O},\mathbb Z ) \end{aligned}$$

to be the set of maps from \({{\,\mathrm{Cls}\,}}\mathcal {O}\) to \(\mathbb Z \) (as sets). Then \(M_2(\mathcal {O})\) has the structure of an abelian group under addition of maps, and it is a free \(\mathbb Z \)-module on the characteristic functions for \({{\,\mathrm{Cls}\,}}\mathcal {O}\). The \(\mathfrak n \)-Hecke operator is defined to be

$$\begin{aligned} \begin{aligned} T(\mathfrak n ) :M_2(\mathcal {O})&\rightarrow M_2(\mathcal {O}) \\ (T(\mathfrak n ) f)([I])&= \sum _{\begin{array}{c} J \subseteq I \\ {{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I) \end{array}} f([J]) \end{aligned} \end{aligned}$$
(41.2.6)

again the sum over all invertible right \(\mathcal {O}\)-ideals \(J \subseteq I\) with condition on the reduced norm. Visibly, this definition does not depend on the choice of representative I in its right ideal class. And in the basis of characteristic functions for \(I_i\), the matrix of \(T(\mathfrak n )\) is precisely the \(\mathfrak n \)-Brandt matrix.
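In coordinates, grouping the ideals \(J \subseteq I_j\) in (41.2.6) by their class gives \((T(\mathfrak n )f)([I_j])=\sum _i T(\mathfrak n )_{ij} f([I_i])\). The following sketch applies this formula to a function stored as a list of its values on \(I_1,\dots ,I_h\); the name `apply_hecke` is hypothetical, and the test data is borrowed from Example 41.1.2 (so \(R=\mathbb Z \)).

```python
def apply_hecke(T, f):
    """Return the values of T f on I_1, ..., I_h.

    T is a Brandt matrix as a list of rows and f[i] is the value f([I_{i+1}]).
    The j-th output is sum_i T[i][j] * f[i], as obtained from (41.2.6).
    """
    h = len(T)
    return [sum(T[i][j] * f[i] for i in range(h)) for j in range(h)]

# T(2) for discriminant 23, copied from Example 41.1.2.
T2 = [[1, 1, 0], [2, 1, 3], [0, 1, 0]]

print(apply_hecke(T2, [1, 1, 1]))  # constant function
print(apply_hecke(T2, [1, 0, 0]))  # characteristic function of [I_1]
```

Applied to the characteristic function of \([I_i]\), the output is the ith row of the Brandt matrix.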

Brandt matrices may be given in terms of elements instead of ideals. Let \(w_i=[\mathcal {O}_i^\times : R^\times ]\). By Proposition 32.3.7, since B is \({{{\texttt {\textit{S}}}}}\)-definite, the unit index \(w_i<\infty \) is finite.

Lemma 41.2.7

Let \(\mathfrak n _{ij} = \mathfrak n {{\,\mathrm{nrd}\,}}(I_j)/{{\,\mathrm{nrd}\,}}(I_i)\) for \(i,j=1,\dots ,h\). Then the following statements hold.

  1. (a)

    We have

    $$\begin{aligned} \begin{aligned} T(\mathfrak n )_{ij}&= \# \left\{ \alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )R = \mathfrak n _{ij}\right\} /\mathcal {O}_i^\times \\&= \frac{1}{w_i} \# \left\{ \alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )R = \mathfrak n _{ij}\right\} /R^\times \end{aligned} \end{aligned}$$
    (41.2.8)

    where we count orbits under right multiplication by \(\mathcal {O}_i^\times \) and \(R^\times \), respectively.

  2. (b)

    If the class of \(\mathfrak n _{ij}\) in \({{\,\mathrm{Cl}\,}}^+ R\) is nontrivial, then \(T(\mathfrak n )_{ij}=0\).

  3. (c)

    Suppose that \(\mathfrak n _{ij}=n_{ij} R\) with \(n_{ij} \in F_{>0}^\times \) totally positive. Then

    $$\begin{aligned} T(\mathfrak n )_{ij} = \frac{1}{2w_i} \sum _{u R^{\times 2} \in R_{>0}^\times /R^{\times 2}} \#\left\{ \alpha \in I_j I_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha ) = un_{ij} \right\} \end{aligned}$$
    (41.2.9)

    where the sum is over a choice of representatives of totally positive units of R modulo squares.

Proof

We claim that there is a bijection

$$\begin{aligned} \begin{aligned}&\{J \subseteq I_j : {{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I_j) \text { and } [J]=[I_i]\} \\&\quad \leftrightarrow \left\{ \alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )R=\mathfrak n _{ij} \right\} /\mathcal {O}_i^\times \end{aligned} \end{aligned}$$
(41.2.10)

with orbits under right multiplication by \(\mathcal {O}_i^\times =\mathcal {O}{}_{\textsf {\tiny {R}} }(I_i^{-1})^\times \). Indeed, a containment \(J \subseteq I_j\) of invertible right \(\mathcal {O}\)-ideals with \([I_i]=[J]\) corresponds to \(\alpha \in B^\times \) such that \(\alpha I_i=J\), so in fact \(\alpha \in (J:I_i){}_{\textsf {\tiny {L}} }= JI_i^{-1}\), and \({{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I_j)\) translates into \({{\,\mathrm{nrd}\,}}(\alpha ){{\,\mathrm{nrd}\,}}(I_i)={{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I_j)\) or \({{\,\mathrm{nrd}\,}}(\alpha )R=\mathfrak n _{ij}\). Writing \(JI_i^{-1}=\alpha \mathcal {O}_i\), we see that \(\alpha \) is unique up to multiplication on the right by \(\mathcal {O}_i^\times \). To finish (a), we note that the right action by \(\mathcal {O}_i^\times \) is free; and (b) follows from (a), since \({{\,\mathrm{nrd}\,}}(B^\times ) \le F_{>0}^{\times }\) as B is \({{{\texttt {\textit{S}}}}}\)-definite.

For (c), we just need to organize our generators; the sum in (41.2.9) is finite by the Dirichlet \({{{\texttt {\textit{S}}}}}\)-unit theorem. If \({{\,\mathrm{nrd}\,}}(\alpha )R=\mathfrak n _{ij}\) then \({{\,\mathrm{nrd}\,}}(\alpha )=un_{ij}\) for some \(u \in R_{>0}^\times \). Multiplying \(\alpha \) by an element of \(R^\times \), which changes the reduced norm by a square, we may suppose that \({{\,\mathrm{nrd}\,}}(\alpha )/n_{ij}=u\) belongs to a set of representatives for \(R_{>0}^{\times }/R^{\times 2}\), and for \(v \in R^\times \), we have \({{\,\mathrm{nrd}\,}}(v\alpha )=v^2{{\,\mathrm{nrd}\,}}(\alpha )={{\,\mathrm{nrd}\,}}(\alpha )\) if and only if \(v=\pm 1\), which gives us an extra factor 2. \(\square \)

41.2.11

The advantage of (41.2.9) is that it can be expressed simply in terms of a quadratic form. Suppose that F is a number field of degree \(n=[F:\mathbb {Q}]\) and \(R=\mathbb Z _F\), in which case this observation is especially clean. Since B is totally definite, as in 17.7.10, the space \(B \hookrightarrow B \otimes _{\mathbb {Q}} \mathbb R \cong \mathbb H ^n \cong \mathbb R ^{4n}\) comes equipped with the positive definite quadratic form \(Q={{\,\mathrm{Tr}\,}}_{F/\mathbb {Q}} {{\,\mathrm{nrd}\,}}:B \rightarrow \mathbb R \), and if J is an R-lattice, then \(J \cong \mathbb Z ^{4n}\) embeds as a Euclidean lattice \(J \hookrightarrow \mathbb R ^{4n}\) with respect to this quadratic form. Therefore,

$$\begin{aligned} \{\alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )=un_{ij}\} \subseteq \{\alpha \in I_jI_i^{-1} : Q(\alpha )={{\,\mathrm{Tr}\,}}_{F/\mathbb {Q}} un_{ij}\} \end{aligned}$$

where the latter set is finite and effectively computable.

Finally, Brandt matrices are adjacency matrices.

Definition 41.2.12

Let \(I,J \subseteq \mathcal {O}\) be invertible right \(\mathcal {O}\)-ideals. We say J is an \(\mathfrak n \)-neighbor of I if \(J \subseteq I\) and \({{\,\mathrm{nrd}\,}}(J)=\mathfrak n {{\,\mathrm{nrd}\,}}(I)\).

The \(\mathfrak n \)-Brandt graph is the directed graph with vertices \({{\,\mathrm{Cls}\,}}\mathcal {O}\) and a directed edge from \([I_i]\) to [J] for each \(\mathfrak n \)-neighbor \(J \subseteq I_i\).

By definition, the adjacency matrix of the \(\mathfrak n \)-Brandt graph is the \(\mathfrak n \)-Brandt matrix \(T(\mathfrak n )\).

41.2.13

Let \(\mathfrak p \not \mid \mathfrak N \) be prime and suppose that the class of \(\mathfrak p \) generates \({{\,\mathrm{Cl}\,}}_{G(\mathcal {O})} R\). Then by Proposition 28.5.18, we may take the ideals \(I_i\) to have reduced norm a power of \(\mathfrak p \). Consider the directed graph whose vertices are the right \(\mathcal {O}\)-ideals whose reduced norm is a power of \(\mathfrak p \) with directed edges for each \(\mathfrak p \)-neighbor relation. (This graph is a regular directed tree: by Proposition 41.3.1 below, every vertex has out-degree equal to \({\,{\mathsf {N}}\mathfrak p +1}\).) The equivalence relation of belonging to the same right ideal class (left equivalent by an element of \(B^\times \)) respects edges, and the quotient by this equivalence relation is the \(\mathfrak p \)-Brandt graph.

Remark 41.2.14 The Brandt graphs have interesting graph theoretic properties: they are Ramanujan graphs, a class of expander graphs with high connectivity, and are potentially useful in communication networks. In the simplest case where \(F=\mathbb {Q}\) and B is the quaternion algebra of discriminant p, they were first studied by Ihara, then studied in specific detail by Lubotzky–Phillips–Sarnak [LPS88] and Margulis [Marg88]; for further reading, see the books by Lubotzky [Lub2010] and Sarnak [Sar90]. Over totally real fields, see work of Livné [Liv2001] as well as Charles–Goren–Lauter [CGL2009]. The proof that Brandt graphs are Ramanujan relies on the Ramanujan–Petersson conjecture, a deep statement proven by Deligne [Del74], giving bounds on coefficients of modular forms.

Remark 41.2.15. The space of functions on \({{\,\mathrm{Cls}\,}}\mathcal {O}\) can itself be understood as a space of modular forms, a special case of the theory of algebraic modular forms due to Gross [Gro99]. This general formulation harmonizes with the double coset description given in section 38.7, via the canonical bijection \({{\,\mathrm{Cls}\,}}\mathcal {O}\leftrightarrow B^\times \backslash \widehat{B}^\times / \widehat{\mathcal {O}}^\times \), but without the geometry!

3 Commutativity of Brandt matrices

In this section, we examine basic properties of Brandt matrices—including that they commute.

Proposition 41.3.1

The following statements hold.

  1. (a)

    The sum of the entries \(\sum _i T(\mathfrak n )_{ij}\) in every column of \(T(\mathfrak n )\) is constant; if \(\mathfrak n \) is coprime to \(\mathfrak N \), then this constant is equal to \(\sum _{\mathfrak d \mid \mathfrak n } {\mathsf {N}}(\mathfrak d )\), where \({\mathsf {N}}(\mathfrak a )=\#(R/\mathfrak a )\) is the absolute norm.

  2. (b)

    If \(\mathfrak m ,\mathfrak n \) are relatively prime, then

    $$\begin{aligned} T(\mathfrak m \mathfrak n )=T(\mathfrak m )T(\mathfrak n )=T(\mathfrak n )T(\mathfrak m ). \end{aligned}$$
    (41.3.2)

Proof

First we prove (a). The orders \(\mathcal {O}_j\) are locally isomorphic, so by the local-global dictionary for lattices (Theorem 9.4.9), the number of invertible right \(\mathcal {O}_j\)-ideals with given reduced norm is independent of j, giving the first statement. For the second statement, under the hypothesis that \(\mathfrak n \) is coprime to \(\mathfrak N \), for all \(\mathfrak p \mid \mathfrak n \) we have \(\mathcal {O}_\mathfrak p \simeq {{\,\mathrm{M}\,}}_2(R_\mathfrak p )\), and we counted right ideals in our pursuit of the zeta function: by Proposition 26.3.9, these counts are multiplicative, and by Lemma 26.4.1(b), the number of right ideals of reduced norm \(\mathfrak p ^e\) is \(1+q+\dots +q^e\) where \({q=\,{\mathsf {N}}(\mathfrak p )}\).

Statement (b) follows similar logic but with “unique factorization” of right ideals. As above, an invertible right \(\mathcal {O}_j\)-ideal of reduced norm \(\mathfrak m \mathfrak n \) by Lemma 26.3.6 factors uniquely into a compatible product of invertible lattices of reduced norm \(\mathfrak m \) and \(\mathfrak n \): organizing by classes, this says precisely that

$$\begin{aligned} T(\mathfrak m \mathfrak n )_{ij} = \sum _{k=1}^h T(\mathfrak m )_{ik}T(\mathfrak n )_{kj} \end{aligned}$$
(41.3.3)

which gives the matrix product \(T(\mathfrak m \mathfrak n )=T(\mathfrak m )T(\mathfrak n )\). Repeating the argument interchanging the roles of \(\mathfrak m \) and \(\mathfrak n \), the result is proven. \(\square \)

For prime powers coprime to \(\mathfrak N \), we have a recursion for the \(\mathfrak p ^r\)-Brandt matrices that is a bit complicated: the uniqueness of factorization fails when the product is a two-sided ideal, so we must account for this extra term. To this end, we need to keep track of the effect of multiplication by ideals of R on the class set.

41.3.4

For an ideal \(\mathfrak a \subseteq R\), let \(P(\mathfrak a ) \in {{\,\mathrm{M}\,}}_h(\mathbb Z )\) be the permutation matrix given by \(I_i \mapsto \mathfrak a I_i\). In other words, we place a 1 in the (ij)th entry according as \([\mathfrak a I_j]=[I_i]\) (with 0 elsewhere). The matrix \(P(\mathfrak a )\) only depends on the class \([\mathfrak a ] \in {{\,\mathrm{Cl}\,}}R\): in particular, if \(\mathfrak a \) is principal then \(P(\mathfrak a )\) is the identity matrix. Therefore we have a homomorphism

$$\begin{aligned} P:{{\,\mathrm{Cl}\,}}R&\rightarrow {{\,\mathrm{GL}\,}}_h(\mathbb Z ) \\ [\mathfrak a ]&\mapsto P(\mathfrak a ). \end{aligned}$$

We have

$$\begin{aligned} P(\mathfrak a \mathfrak b )=P(\mathfrak a )P(\mathfrak b )=P(\mathfrak b )P(\mathfrak a ) \end{aligned}$$

and in particular \(P(\mathfrak a )P(\mathfrak a ^{-1})=1\) and the image \(P({{\,\mathrm{Cl}\,}}R) \subseteq {{\,\mathrm{GL}\,}}_h(\mathbb Z )\) is an abelian subgroup; however, this map need not be injective. Moreover, for all \(\mathfrak a ,\mathfrak n \) we have

$$\begin{aligned} P(\mathfrak a )T(\mathfrak n )=T(\mathfrak n )P(\mathfrak a ) \end{aligned}$$
(41.3.5)

by commutativity of multiplication by \(\mathfrak a \).

As in 26.4.3, we say an integral right \(\mathcal {O}\)-ideal I is primitive if we cannot write \(I=\mathfrak a I'\) with \(I'\) integral and \(\mathfrak a \subsetneq R\).

Proposition 41.3.6

Let \(\mathfrak p \not \mid \mathfrak N \) be prime. Then for \(r,s \in \mathbb Z _{\ge 0}\),

$$\begin{aligned} T(\mathfrak p ^r)T(\mathfrak p ^s) = \sum _{i=0}^{\min (r,s)} \,{\mathsf {N}}(\mathfrak p )^i T(\mathfrak p ^{r+s-2i}) P(\mathfrak p )^i. \end{aligned}$$
(41.3.7)

In particular, for all \(r \ge 0\),

$$\begin{aligned} T(\mathfrak p ^{r+2}) = T(\mathfrak p ^{r+1})T(\mathfrak p )- \,{\mathsf {N}}(\mathfrak p ) T(\mathfrak p ^r)P(\mathfrak p ). \end{aligned}$$
(41.3.8)

Proof

When \(s=0\), the matrix T(1) is the identity and the result holds. We next treat the case \(s=1\), and afterwards proceed by induction on s. Consider the product \(T(\mathfrak p ^r)T(\mathfrak p )\): its ijth entry

$$\begin{aligned} (T(\mathfrak p ^r)T(\mathfrak p ))_{ij} = \sum _{k=1}^h T(\mathfrak p ^r)_{ik} T(\mathfrak p )_{kj} \end{aligned}$$

counts the number of compatible products of right ideals \(J_r' J'\) where \(J_r'\) is an invertible \(\mathcal {O}_i,\mathcal {O}_k\)-ideal with \({{\,\mathrm{nrd}\,}}(J_r')=\mathfrak p ^r\) and \(J'\) is an invertible \(\mathcal {O}_k,\mathcal {O}_j\)-ideal with \({{\,\mathrm{nrd}\,}}(J')=\mathfrak p \). The issue: these products may not all be distinct when they are imprimitive. If the product \(J_r' J'\) is imprimitive, then we rewrite it as a compatible product

$$\begin{aligned} J_r' J' = (J_r' (\overline{J}')^{-1}) \overline{J'} J' = \mathfrak p (J_r' (\overline{J}')^{-1}) \end{aligned}$$

where now \(J_{r-1}'=\mathfrak p ^{-1} J_r' J'\) has reduced norm \(\mathfrak p ^{r-1}\). This procedure works in reverse as well.

With apologies for the temporarily annoying notation, define \(T_{prim }(\mathfrak p ^{r+1})\) and \(T_{imprim }(\mathfrak p ^{r+1})\) to be the \(\mathfrak p ^{r+1}\)-Brandt matrices counting classes of primitive or imprimitive ideals, respectively. Then

$$\begin{aligned} T(\mathfrak p ^{r+1}) = T_{prim }(\mathfrak p ^{r+1}) + T_{imprim }(\mathfrak p ^{r+1}). \end{aligned}$$
(41.3.9)

Under multiplication by \(\mathfrak p \), we have

$$\begin{aligned} T_{imprim }(\mathfrak p ^{r+1})=T(\mathfrak p ^{r-1})P(\mathfrak p ). \end{aligned}$$
(41.3.10)

Since there are \(\,{\mathsf {N}}(\mathfrak p )+1\) right \(\mathcal {O}\)-ideals of reduced norm \(\mathfrak p \), with the previous paragraph we obtain

$$\begin{aligned} \begin{aligned} T(\mathfrak p ^r)T(\mathfrak p )&= T_{prim }(\mathfrak p ^{r+1}) + (\,{\mathsf {N}}(\mathfrak p )+1) T_{imprim }(\mathfrak p ^{r+1}) \\&= T(\mathfrak p ^{r+1}) + \,{\mathsf {N}}(\mathfrak p ) T_{imprim }(\mathfrak p ^{r+1}) \\&= T(\mathfrak p ^{r+1}) + \,{\mathsf {N}}(\mathfrak p ) T(\mathfrak p ^{r-1})P(\mathfrak p ). \end{aligned} \end{aligned}$$
(41.3.11)

This proves the result for \(s=1\), and it gives (41.3.8) upon rearrangement and shifting indices.

We now proceed by (an ugly but harmless) induction on s:

$$\begin{aligned} \begin{aligned}&T(\mathfrak p ^r)\left( T(\mathfrak p ^{s+1})+\,{\mathsf {N}}(\mathfrak p )T(\mathfrak p ^{s-1})P(\mathfrak p )\right) \\&\qquad = \sum _{i=0}^s \left( \,{\mathsf {N}}(\mathfrak p )^i T(\mathfrak p ^{r+s+1-2i})P(\mathfrak p )^i \right. \\&\qquad \qquad \qquad \left. +\,{\mathsf {N}}(\mathfrak p )^{i+1} T(\mathfrak p ^{r+s+1-2(i+1)})P(\mathfrak p )^{i+1} \right) \end{aligned} \end{aligned}$$
(41.3.12)

so

$$\begin{aligned} \begin{aligned} T(\mathfrak p ^r)T(\mathfrak p ^{s+1})&= \sum _{i=0}^s \left( \,{\mathsf {N}}(\mathfrak p )^i T(\mathfrak p ^{r+s+1-2i})P(\mathfrak p )^i \right. \\&\qquad \qquad \qquad \left. + \,{\mathsf {N}}(\mathfrak p )^{i+1} P(\mathfrak p )^{i+1} T(\mathfrak p ^{r+s+1-2(i+1)})\right) \\&\qquad - \sum _{i=0}^{s-1} \,{\mathsf {N}}(\mathfrak p )^{i+1}P(\mathfrak p )^{i+1} T(\mathfrak p ^{r+s+1-2(i+1)}) \\&= \sum _{i=0}^{s+1} \,{\mathsf {N}}(\mathfrak p )^i P(\mathfrak p )^i T(\mathfrak p ^{r+s+1-2i}) \end{aligned} \end{aligned}$$
(41.3.13)

as claimed. \(\square \)
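The recursion (41.3.8) gives a practical way to compute \(T(\mathfrak p ^r)\) from \(T(\mathfrak p )\) and \(P(\mathfrak p )\). The sketch below is one possible implementation, tested on the discriminant 23 example over \(\mathbb {Q}\) from Example 41.1.2, where \(P(2)\) is the identity because every ideal of \(\mathbb Z \) is principal. The function names are our own, and the check of (41.3.7) is only a numerical sanity test, not a substitute for the proof.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(h):
    return [[int(i == j) for j in range(h)] for i in range(h)]

def brandt_prime_powers(Tp, norm_p, P, max_r):
    """Return [T(1), T(p), ..., T(p^max_r)] using the recursion (41.3.8)."""
    h = len(Tp)
    powers = [identity(h), Tp]
    while len(powers) <= max_r:
        product = matmul(powers[-1], Tp)       # T(p^{r+1}) T(p)
        correction = matmul(powers[-2], P)     # T(p^r) P(p)
        powers.append([[product[i][j] - norm_p * correction[i][j]
                        for j in range(h)] for i in range(h)])
    return powers[: max_r + 1]

def product_formula_holds(powers, norm_p, P, r, s):
    """Check (41.3.7) for the given r, s; needs powers up to index r + s."""
    h = len(P)
    total = [[0] * h for _ in range(h)]
    P_power = identity(h)
    for i in range(min(r, s) + 1):
        term = matmul(powers[r + s - 2 * i], P_power)
        total = [[total[a][b] + norm_p ** i * term[a][b] for b in range(h)]
                 for a in range(h)]
        P_power = matmul(P_power, P)
    return matmul(powers[r], powers[s]) == total

T2 = [[1, 1, 0], [2, 1, 3], [0, 1, 0]]
powers = brandt_prime_powers(T2, 2, identity(3), 4)
print(powers[2])
print(product_formula_holds(powers, 2, identity(3), 2, 2))
```

As a consistency check with Proposition 41.3.1(a), the columns of the rth matrix should each sum to \(1+2+\dots +2^r\).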

Definition 41.3.14

The Hecke algebra  \(\mathbf{T} (\mathcal {O})\) is the subring of \({{\,\mathrm{M}\,}}_h(\mathbb Z )\) generated by the matrices \(T(\mathfrak n )\) with \(\mathfrak n \) coprime to \(\mathfrak N \).

Corollary 41.3.15

The ring \(\mathbf{T} (\mathcal {O})\) is a commutative \(\mathbb Z \)-algebra.

Proof

By Proposition 41.3.1(b), we reduce to showing that \(T(\mathfrak p ^r)T(\mathfrak p ^s)=T(\mathfrak p ^s)T(\mathfrak p ^r)\) for all \(r,s \ge 0\), and this holds by Proposition 41.3.6: the right-hand side of (41.3.7) is symmetric under interchanging r and s. \(\square \)

Example 41.3.16

Let \(F=\mathbb {Q}(\sqrt{10})\) and \(R=\mathbb Z _F=\mathbb Z [\sqrt{10}]\) its ring of integers. Then the class group \({{\,\mathrm{Cl}\,}}R \simeq \mathbb Z /2\mathbb Z \) is nontrivial, represented by the class of the ideal \(\mathfrak p _2=(2,\sqrt{10})\), and the narrow class group \({{\,\mathrm{Cl}\,}}^+ R \simeq {{\,\mathrm{Cl}\,}}R\) is no bigger: the fundamental unit is \(3+\sqrt{10}\) of norm \(-1\).

Let \(B=({-1, -1} \mid {F})\). Since 2 is not split in F, the ramification set \({{\,\mathrm{Ram}\,}}B\) is the set of real places of F. A maximal order is given by

$$\begin{aligned} \mathcal {O}=R \oplus \mathfrak p _2^{-1}(1+i) \oplus \mathfrak p _2^{-1}(1+j) \oplus R\frac{1+i+j+ij}{2}. \end{aligned}$$

We find that \(\#{{\,\mathrm{Cls}\,}}\mathcal {O}=4\), and

$$\begin{aligned} T(\mathfrak p _2) = \begin{pmatrix} 0 &{} 0 &{} 0 &{} 1 \\ 0 &{} 0 &{} 3 &{} 2 \\ 0 &{} 2 &{} 0 &{} 0 \\ 3 &{} 1 &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$

In this case, the matrix \(P(\mathfrak p _2)\) is the identity matrix: for example, we have \(\mathfrak p _2 \mathcal {O}= (1+i)\mathcal {O}\). Thus

$$\begin{aligned} T(\mathfrak p _2^2)=T(2R) = T(\mathfrak p _2)^2 - 2 = \begin{pmatrix} 1 &{} 1 &{} 0 &{} 0 \\ 6 &{} 6 &{} 0 &{} 0 \\ 0 &{} 0 &{} 4 &{} 4 \\ 0 &{} 0 &{} 3 &{} 3 \end{pmatrix}. \end{aligned}$$
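The arithmetic in this example is easy to confirm by machine. The sketch below (plain Python, with hypothetical helper names) recomputes \(T(\mathfrak p _2)^2 - 2\) from the matrix \(T(\mathfrak p _2)\) above, and also checks the column sums, taking as given the standard count that an invertible right ideal has \({\mathsf {N}}(\mathfrak p )+1=3\) subideals of reduced norm \(\mathfrak p _2\) and \(1+2+4=7\) of reduced norm \(\mathfrak p _2^2\) when \(\mathfrak p _2\) is coprime to the discriminant.

```python
def matmul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def column_sums(A):
    """List of column sums of a square matrix."""
    return [sum(row[j] for row in A) for j in range(len(A))]


# Brandt matrix T(p_2) copied from the example.
T_p2 = [[0, 0, 0, 1],
        [0, 0, 3, 2],
        [0, 2, 0, 0],
        [3, 1, 0, 0]]
NORM_P2 = 2  # absolute norm of p_2; P(p_2) is the identity here

# T(p_2^2) = T(p_2)^2 - N(p_2) P(p_2) = T(p_2)^2 - 2.
square = matmul(T_p2, T_p2)
T_p2_sq = [[square[i][j] - (NORM_P2 if i == j else 0) for j in range(4)]
           for i in range(4)]

print(T_p2_sq)
print(column_sums(T_p2), column_sums(T_p2_sq))
```

The first printed matrix should agree with the displayed value of \(T(\mathfrak p _2^2)\).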

4 Semisimplicity

We now equip the space \(M_2(\mathcal {O})={{\,\mathrm{Map}\,}}({{\,\mathrm{Cls}\,}}\mathcal {O},\mathbb Z )\) with a natural inner product, and we show that the Hecke operators are normal with respect to this inner product.

41.4.1

For \([I] \in {{\,\mathrm{Cls}\,}}\mathcal {O}\), we define \(w_{[I]} :=[\mathcal {O}{}_{\textsf {\tiny {L}} }(I)^\times : R^\times ]\); this is well-defined, as a different choice of representative gives an isomorphic (conjugate) order.

Let \(1_{[I]}\) be the characteristic function of \([I] \in {{\,\mathrm{Cls}\,}}\mathcal {O}\); then \(1_{[I]}\) for \([I] \in {{\,\mathrm{Cls}\,}}\mathcal {O}\) form a basis for \(M_2(\mathcal {O})\). We define the bilinear form

$$\begin{aligned} \begin{aligned} \langle \phantom {x},\phantom {x} \rangle :M_2(\mathcal {O}) \times M_2(\mathcal {O})&\rightarrow \mathbb Z \\ \langle 1_{[I]}, 1_{[J]} \rangle&:=w_{[I]} \delta _{[I],[J]} \end{aligned} \end{aligned}$$
(41.4.2)

where \(\delta _{[I],[J]}=1,0\) according as \([I]=[J]\) or not, and extend linearly. The matrix of this pairing in the basis of characteristic functions is the diagonal matrix \({{\,\mathrm{diag}\,}}(w_i)_i\), where \(w_i=[\mathcal {O}_i^\times :R^\times ]\). The pairing is symmetric and nondegenerate.

The inner product (41.4.2) defines an adjoint \(T \mapsto T^*\).

Proposition 41.4.3

We have

$$\begin{aligned} P(\mathfrak n )^*&= P(\mathfrak n ^{-1}) \end{aligned}$$
(41.4.4)
$$\begin{aligned} T(\mathfrak n )^*&= P(\mathfrak n ^{-1})T(\mathfrak n ). \end{aligned}$$
(41.4.5)

The Hecke operators \(T(\mathfrak n )\) are normal with respect to the inner product (41.4.2), and for \(\mathfrak n \) trivial in \({{\,\mathrm{Cl}\,}}^+ R\) the operators \(T(\mathfrak n )\) are self-adjoint.

Proof

We may show the proposition for the Brandt matrices. Let \(W={{\,\mathrm{diag}\,}}(w_i)_i\) define the inner product on \(\mathbb Z ^h\) with the Brandt matrices acting on the right on row vectors. Then the inner product is \(\langle x,y\rangle = x W y^{\textsf {t} }\) and accordingly the adjoint \(\langle x T, y \rangle = \langle x, T^* y \rangle \) is defined by

$$\begin{aligned} T^* = W^{-1} T^{\textsf {t} }W \end{aligned}$$
(41.4.6)

We note that the transpose of a permutation matrix is its inverse and that \(\mathcal {O}{}_{\textsf {\tiny {L}} }(\mathfrak n I_i)=\mathcal {O}{}_{\textsf {\tiny {L}} }(I_i)\), so that the unit groups match up, whence

$$\begin{aligned} P(\mathfrak n )^* = P(\mathfrak n )^{-1} = P(\mathfrak n ^{-1}) \end{aligned}$$
(41.4.7)

giving (41.4.4).

For the Brandt matrices, we refer to Lemma 41.2.7(a), giving

$$\begin{aligned} T(\mathfrak n )_{ij} = \frac{1}{w_i} \# \left\{ \alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )R = \mathfrak n _{ij}\right\} /R^\times \end{aligned}$$

where \(\mathfrak n _{ij} = \mathfrak n {{\,\mathrm{nrd}\,}}(I_j)/{{\,\mathrm{nrd}\,}}(I_i)\). Let

$$\begin{aligned} \Theta (\mathfrak n )_{ij} = \left\{ \alpha \in I_jI_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )R = \mathfrak n _{ij}\right\} /R^\times . \end{aligned}$$

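By (41.4.6), the adjoint has entries

$$\begin{aligned} (T(\mathfrak n )^*)_{ij} = \frac{w_j}{w_i} T(\mathfrak n )_{ji} = \frac{1}{w_i} \#\Theta (\mathfrak n )_{ji}. \end{aligned}$$

We extend the definition of \(\Theta (\mathfrak n )\) to include all fractional ideals \(\mathfrak n \). For each i, let \(i'\) be such that \([\mathfrak n ^{-1} I_i]=[I_{i'}]\), so that \(\mathfrak n ^{-1} I_i = \beta _i I_{i'}\) for some \(\beta _i \in B^\times \); the induced action on indices is given by the permutation matrix \(P(\mathfrak n ^{-1})\). We claim that the map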

$$\begin{aligned} \begin{aligned} \Theta (\mathfrak n )_{ij}&\rightarrow \Theta (\mathfrak n )_{ji'} \\ \alpha&\mapsto (\alpha \beta _i)^{-1} = \beta _i^{-1} \alpha ^{-1} \end{aligned} \end{aligned}$$
(41.4.8)

is well-defined and bijective.

Indeed, if \(\alpha \in I_jI_i^{-1}\) then

$$\begin{aligned} \overline{\alpha } \in \overline{I_jI_i^{-1}} = \overline{I_i^{-1}}\,\overline{I_j} = I_i I_j^{-1} \frac{{{\,\mathrm{nrd}\,}}(I_j)}{{{\,\mathrm{nrd}\,}}(I_i)} \end{aligned}$$
(41.4.9)

since \(I\overline{I}={{\,\mathrm{nrd}\,}}(I)\) for an invertible R-lattice I, so

$$\begin{aligned} \alpha ^{-1} \in \mathfrak n ^{-1} I_i I_j^{-1} = \beta _i I_{i'} I_j^{-1} \end{aligned}$$

and therefore \(\beta _i^{-1} \alpha ^{-1} \in I_{i'} I_j^{-1}\) as claimed. And \({{\,\mathrm{nrd}\,}}(\alpha )=\mathfrak n _{ij}\) implies \({{\,\mathrm{nrd}\,}}(\alpha ^{-1})=\mathfrak n ^{-2}\mathfrak n _{ji}\) so \({{\,\mathrm{nrd}\,}}(\beta _i^{-1} \alpha ^{-1}) = \mathfrak n _{ji'}\). We can run the argument in the other direction to produce an inverse, and we thereby conclude the map is bijective.

The map (41.4.8) together with the action by permutation and \(W^{\textsf {t} }= W\) yields

$$\begin{aligned} WT(\mathfrak n ) = \Theta (\mathfrak n ) = P(\mathfrak n ^{-1}) \Theta (\mathfrak n )^{\textsf {t} }= P(\mathfrak n ^{-1}) W T(\mathfrak n )^* \end{aligned}$$

and thus \(T(\mathfrak n )^* = P(\mathfrak n )^* T(\mathfrak n )\), and substituting (41.4.7) gives (41.4.5).

For the final statement, by (41.3.5) we have \(T(\mathfrak n )\) commuting with \(P(\mathfrak n )\), so \(T(\mathfrak n )\) commutes with \(T(\mathfrak n )^*\); and when \(\mathfrak n \) is narrowly principal, then \(P(\mathfrak n ^{-1})\) is the identity matrix so \(T(\mathfrak n )^*=T(\mathfrak n )\). \(\square \)

By the spectral theorem in linear algebra, we have the following corollary.

Corollary 41.4.10

\(\mathbf{T} (\mathcal {O})\) is a semisimple commutative ring, and there exists a basis of common eigenvectors (eigenfunctions) for the Hecke operators. Each \(T(\mathfrak n )\) with \(\mathfrak n \) narrowly principal has real eigenvalues.
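Proposition 41.4.3 and Corollary 41.4.10 can be illustrated on the matrix \(T(\mathfrak p _2)\) of Example 41.3.16, where \(P(\mathfrak p _2)\) is the identity, so that self-adjointness amounts to \(WT(\mathfrak p _2)\) being symmetric. The sketch below is only a plausibility check: the weights \((12,2,3,4)\) are an assumption, not stated in the example, obtained by solving \(w_i T_{ij} = w_j T_{ji}\) (which determines them only up to a common scalar). It also checks that \(T(\mathfrak p _2)\) is annihilated by \((x^2-9)(x^2-2)\), a polynomial with distinct real roots \(\pm 3, \pm \sqrt{2}\).

```python
def matmul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(A, c):
    """Return A - c*I."""
    return [[A[i][j] - (c if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]


T = [[0, 0, 0, 1],
     [0, 0, 3, 2],
     [0, 2, 0, 0],
     [3, 1, 0, 0]]

# Hypothetical weights w_i, inferred from the symmetry condition; only
# their ratios matter for this check.
weights = [12, 2, 3, 4]

# Self-adjointness T* = W^{-1} T^t W = T is equivalent to W T symmetric.
WT = [[weights[i] * T[i][j] for j in range(4)] for i in range(4)]
is_symmetric = all(WT[i][j] == WT[j][i] for i in range(4) for j in range(4))

# (T^2 - 9)(T^2 - 2) = 0: distinct real roots, so T is diagonalizable
# over the reals with eigenvalues among +-3 and +-sqrt(2).
T2 = matmul(T, T)
annihilated = matmul(shift(T2, 9), shift(T2, 2))

print(is_symmetric, annihilated)
```

Since the annihilating polynomial is squarefree with real roots, this is consistent with the real eigenvalues predicted for narrowly principal \(\mathfrak n \).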

5 Eichler trace formula

In this section, we compute the trace of the Brandt matrices in terms of embedding numbers. We continue notation from the previous section.

We begin by recalling the main ingredients. Let \(K \supset F\) be a separable quadratic field extension and let \(S \subseteq K\) be a quadratic R-order. Let \(h(S)=\#{{\,\mathrm{Pic}\,}}S\). Let \(m(S,\mathcal {O};\mathcal {O}^\times )\) be the number of \(\mathcal {O}^\times \)-conjugacy classes of optimal embeddings \(S \hookrightarrow \mathcal {O}\). Then by Theorem 30.4.7,

$$\begin{aligned} \sum _{[I] \in {{\,\mathrm{Cls}\,}}\mathcal {O}} m(S,\mathcal {O}{}_{\textsf {\tiny {L}} }(I);\mathcal {O}{}_{\textsf {\tiny {L}} }(I)^\times ) = h(S) m(\widehat{S},\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times ) \end{aligned}$$
(41.5.1)

We also recall

$$\begin{aligned} {{\,\mathrm{mass}\,}}({{\,\mathrm{Cls}\,}}\mathcal {O}) :=\sum _{i=1}^h \frac{1}{w_i} \end{aligned}$$

and that the Eichler mass formula (Main Theorem 26.1.5) gives an explicit formula for this mass in terms of the relevant arithmetic invariants.

Main Theorem 41.5.2

(Trace formula). If \(\mathfrak n \) is not narrowly principal, then \({{\,\mathrm{tr}\,}}T(\mathfrak n ) = 0\). If \(\mathfrak n =nR\) is narrowly principal with \(n \in F_{>0}^\times \), then

$$ {{\,\mathrm{tr}\,}}T(\mathfrak n ) = \frac{1}{2}\sum _{(u,t,S)} \frac{h(S)}{w_S} m(\widehat{S},\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times ) +{\left\{ \begin{array}{ll} {{\,\mathrm{mass}}}({{\mathrm{Cls}\,}}\mathcal {O}), &{} \text { if}\, \mathfrak n =c^2R, c \in F^\times \\ 0, &{} \text { otherwise} \end{array}\right. } $$

where \(w_S :=[S^\times :R^\times ]\) and the sum is over finitely many triples \((u,t,S)\) where:

  • u belongs to a set of representatives of \(R_{>0}^\times /R^{\times 2}\);

  • \(t \in R\) satisfies \(t^2-4un \in F_{<0}^\times \); and

  • \(S \supseteq R[x]/(x^2-tx+un)\).

Proof

We have \({{\,\mathrm{tr}\,}}T(\mathfrak n ) = \sum _{i=1}^{h} T(\mathfrak n )_{ii}\). By Lemma 41.2.7, since \(\mathfrak n _{ii} = \mathfrak n \) we conclude \({{\,\mathrm{tr}\,}}T(\mathfrak n )=0\) if \(\mathfrak n \) is not narrowly principal. So suppose \(\mathfrak n =nR\) is narrowly principal, with \(n \in F_{>0}^\times \). Then by (41.2.9) we have

$$\begin{aligned} w_i T(\mathfrak n )_{ii} = \frac{1}{2}\sum _{uR^{\times 2} \in R_{>0}^\times /R^{\times 2}} \#\{\alpha \in \mathcal {O}_i : {{\,\mathrm{nrd}\,}}(\alpha )=un\}. \end{aligned}$$
(41.5.3)

We are free to organize by reduced trace, giving

$$\begin{aligned} w_i T(\mathfrak n )_{ii} = \frac{1}{2}\sum _{u} \sum _{t \in R} \#\{\alpha \in \mathcal {O}_i : {{\,\mathrm{trd}\,}}(\alpha )=t,\ {{\,\mathrm{nrd}\,}}(\alpha )=un\}. \end{aligned}$$
(41.5.4)

Since B is definite, we have \({{\,\mathrm{disc}\,}}(\alpha )=t^2-4un\) either zero or totally negative, so the inner sum is over finitely many \(t \in R\) either satisfying \(t^2=4un\) or \(t^2-4un \in F_{<0}^\times \).

If \(t^2=4un\) (equivalently \(\alpha =t/2 \in F\)), then \(\mathfrak n = nR = c^2R\) with \(c=\pm t/2\); conversely, if \(\mathfrak n = nR = c^2 R\) for some \(c \in F^\times \), then there exists a unique representative \(u \in R_{>0}^\times /R^{\times 2}\) such that \(un=c^2\). Consequently, exactly when \(\mathfrak n =c^2 R\) is a square of a principal ideal, there is a contribution of \((1/2)(2)=1\) to the sum.

For the remaining terms, we have \(\alpha \not \in F\) and \(R[\alpha ] \simeq R[x]/(x^2-tx+un)\) is a domain. The embedding \(R[\alpha ] \hookrightarrow \mathcal {O}_i\) need not be optimal, but nevertheless corresponds to the optimal embedding \(S \hookrightarrow \mathcal {O}_i\) for a unique superorder \(S \supseteq R[\alpha ]\), and conversely. We count these up to units: the action of conjugation by \(\mu \in \mathcal {O}_i^\times \) centralizes such an embedding if and only if \(\mu \in S^\times \), so letting \(w_S :=[S^\times :R^\times ]\) we have

$$\begin{aligned} \#\{\alpha \in \mathcal {O}_i : {{\,\mathrm{trd}\,}}(\alpha )=t,\ {{\,\mathrm{nrd}\,}}(\alpha )=un\} = \sum _{S \supseteq R[x]/(x^2-tx+un)} m(S,\mathcal {O}_i;\mathcal {O}_i^\times ) \frac{w_i}{w_S}. \end{aligned}$$

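Plugging these into (41.5.4), and recalling that the terms with \(t^2=4un\) contribute 1 exactly when \(\mathfrak n \) is the square of a principal ideal, we obtain

$$\begin{aligned} w_i T(\mathfrak n )_{ii} = \frac{1}{2} \sum _{(u,t,S)} \frac{w_i}{w_S} m(S,\mathcal {O}_i;\mathcal {O}_i^\times ) + \delta \end{aligned}$$
(41.5.5)

where \(\delta =1,0\) according as \(\mathfrak n \) is a square of a principal ideal or not. Dividing through by \(w_i\) and summing (41.5.5) over i, we have

$$\begin{aligned} {{\,\mathrm{tr}\,}}T(\mathfrak n ) = \frac{1}{2} \sum _{(u,t,S)} \sum _{i=1}^h \frac{1}{w_S} m(S,\mathcal {O}_i;\mathcal {O}_i^\times ) + \delta \sum _i \frac{1}{w_i}. \end{aligned}$$
(41.5.6)

Now substituting (41.5.1) and the definition of mass, the theorem is proven. \(\square \)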
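The starting point (41.5.3) of the proof can be tested by brute force in a small case. The sketch below assumes \(F=\mathbb {Q}\), \(R=\mathbb Z \), and \(\mathcal {O}\) the Hurwitz order in \(({-1,-1} \mid {\mathbb {Q}})\), with \(\#\mathcal {O}^\times =24\) and hence \(w=[\mathcal {O}^\times :\mathbb Z ^\times ]=12\); these inputs, and the function names, are assumptions made for illustration. Since \(\mathbb Z _{>0}^\times =\{1\}\), the sum over u has a single term and (41.5.3) reads \(12\,T(n) = \tfrac{1}{2}\#\{\alpha \in \mathcal {O}: {{\,\mathrm{nrd}\,}}(\alpha )=n\}\).

```python
from fractions import Fraction
from itertools import product
from math import isqrt


def count_hurwitz_elements(n):
    """Count alpha in the Hurwitz order with nrd(alpha) = n.

    Elements are written (x0 + x1*i + x2*j + x3*ij)/2 with x0..x3 integers
    of the same parity, so that nrd(alpha) = (x0^2 + ... + x3^2)/4.
    """
    bound = isqrt(4 * n)
    count = 0
    for x in product(range(-bound, bound + 1), repeat=4):
        same_parity = len({c % 2 for c in x}) == 1
        if same_parity and sum(c * c for c in x) == 4 * n:
            count += 1
    return count


def brandt_entry_hurwitz(n):
    """T(n) for the (1 x 1) Brandt matrix, via (41.5.3) with w = 12."""
    w = 12  # assumed index [O^x : Z^x] = 24 / 2
    return Fraction(count_hurwitz_elements(n), 2 * w)


for n in (1, 3, 5, 7):
    print(n, count_hurwitz_elements(n), brandt_entry_hurwitz(n))
```

For odd n the counts come out as 24 times the sum of divisors of n, so the resulting entries agree with the value \(T(n)=[\sigma (n)]\) for the Hurwitz order.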

Corollary 41.5.7

We have

$$\begin{aligned} \#{{\,\mathrm{Cls}\,}}\mathcal {O}= {{\,\mathrm{mass}\,}}({{\,\mathrm{Cls}\,}}\mathcal {O}) + \frac{1}{2}\sum _{(u,t,S)} \frac{h(S)}{w_S} m(\widehat{S},\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times ) \end{aligned}$$

where the sum is over \((u,t,S)\) with \(n=1\).

Proof

For \(\mathfrak n =R\), we have T(1) the identity matrix, so \({{\,\mathrm{tr}\,}}T(1)=\#{{\,\mathrm{Cls}\,}}\mathcal {O}\). \(\square \)

Corollary 41.5.7 gives a different way to prove (and interpret) the Eichler class number formula (Main Theorem 30.8.6): for the exact comparison, see Exercise 41.2.

Example 41.5.8

Suppose \(F=\mathbb {Q}\) and \(R=\mathbb Z \). For \(d \in \mathbb Z \) a nonsquare discriminant, we write \(S_d :=\mathbb Z [(d+\sqrt{d})/2]\) for the unique quadratic ring of discriminant d, so for \(d<0\) we have \(w_{S_d}=[S_d^\times :\mathbb Z ^\times ]=1\) except that \(w_{S_{-3}}=3\) and \(w_{S_{-4}}=2\). In the trace formula (Main Theorem 41.5.2), the quaternion algebra that appears is definite, and so the only quadratic orders that embed are imaginary quadratic, with discriminant \(d<0\). Simplifying in this way, the trace formula then becomes

$$\begin{aligned} {{\,\mathrm{tr}\,}}T(n) = \frac{1}{2} \sum _{\begin{array}{c} t \in \mathbb Z \end{array}} \sum _{\begin{array}{c} df^2 = t^2-4n<0 \end{array}} \frac{h(S_d)}{w_{S_d}} m(\widehat{S}_d,\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times ) \end{aligned}$$
(41.5.9)

for n not a square (adding a mass term for n a square).

To simplify notation a bit further, we define modified Hurwitz class numbers

$$\begin{aligned} h_{\mathcal {O}}(S) :=\frac{h(S)}{w_S} m(\widehat{S},\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times ) \end{aligned}$$

where the factor \(m(\widehat{S},\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times )\) is defined by purely local data, given in section 30.5 for maximal orders and section 30.6 for Eichler orders. Writing \(h_{\mathcal {O}}(d)=h_{\mathcal {O}}(S_d)\) for the order of discriminant d, we arrive at a pleasing formula:

$$\begin{aligned} {{\,\mathrm{tr}\,}}T(n) = \frac{1}{2} \sum _{\begin{array}{c} t \in \mathbb Z \end{array}} \sum _{\begin{array}{c} df^2 = t^2-4n<0 \end{array}} h_{\mathcal {O}}(d) \end{aligned}$$
(41.5.10)

again for n not a square.

Taking \(n=1\) as in Corollary 41.5.7 (and adding back the mass term) gives

$$\begin{aligned} \begin{aligned} \#{{\,\mathrm{Cls}\,}}\mathcal {O}&= {{\,\mathrm{mass}}}({{\mathrm{Cls}\,}}\mathcal {O}) + \frac{1}{2}\sum _{\begin{array}{c} t \in \mathbb Z \\ t^2<4 \end{array}} h_\mathcal {O}(t^2-4) \\&= {{\,\mathrm{mass}\,}}({{\mathrm{Cls}\,}}\mathcal {O}) + \frac{1}{2}h_\mathcal {O}(-4) + h_\mathcal {O}(-3). \end{aligned} \end{aligned}$$
(41.5.11)

For \(\mathcal {O}\) an Eichler order, after substitution we recover the Eichler class number formula (Theorem 30.1.5).
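As a quick worked instance of the sum over \(t \in \{0,\pm 1\}\) in (41.5.11), take \(\mathcal {O}\) to be the Hurwitz order, of discriminant 2. We take as given (these inputs are assumptions imported from elsewhere, not derived here) that the Eichler mass formula gives \({{\,\mathrm{mass}\,}}({{\,\mathrm{Cls}\,}}\mathcal {O})=(2-1)/12=1/12\), and that the local embedding number is \(1-({d} \mid {2})\) when \(S_d\) is maximal at 2. Then \(h_\mathcal {O}(-4)=\tfrac{1}{2}\cdot 1\) from \(t=0\) and \(h_\mathcal {O}(-3)=\tfrac{1}{3}\cdot 2\) from each of \(t=\pm 1\), so

$$\begin{aligned} \#{{\,\mathrm{Cls}\,}}\mathcal {O}= \frac{1}{12} + \frac{1}{2}\left( \frac{1}{2} + \frac{2}{3} + \frac{2}{3}\right) = \frac{1}{12} + \frac{11}{12} = 1, \end{aligned}$$

consistent with the Hurwitz order having class number 1.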

Example 41.5.12

As an illustration of Example 41.5.8, we compute \({{\,\mathrm{tr}\,}}T(3)\) for the Hurwitz order. We compute the following table of values:

$$\begin{aligned} \begin{array}{cc|ccc|c} t &{} d &{} h(S_d) &{} w_{S_d} &{} m(\widehat{S}_d,\widehat{\mathcal {O}};\widehat{\mathcal {O}}^\times ) &{} h_{\mathcal {O}}(d) \\ \hline 0 &{} -12 &{} 1 &{} 1 &{} 0 &{} 0 \\ 0 &{} -3 &{} 1 &{} 3 &{} 2 &{} 2/3 \\ \pm 1 &{} -11 &{} 1 &{} 1 &{} 1-({-11} \mid {2})=2 &{} 2 \\ \pm 2 &{} -8 &{} 1 &{} 1 &{} 1-({-8} \mid {2})=1 &{} 1 \\ \pm 3 &{} -3 &{} 1 &{} 3 &{} 1-({-3} \mid {2})=2 &{} 2/3 \end{array} \end{aligned}$$

Summing then gives

$$\begin{aligned} {{\,\mathrm{tr}\,}}T(3) = (1/2)(0 + 2/3) + (2 + 1 + 2/3) = 4. \end{aligned}$$

Indeed, more generally since the Hurwitz order has \(\#{{\,\mathrm{Cls}\,}}\mathcal {O}=1\), the matrix T(n) is a \(1\times 1\)-matrix with \(T(n)=[\sigma (n)]\) for n odd. This observation implies a nontrivial (and otherwise surprising) relationship between class numbers of imaginary quadratic orders!
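The computation in this example can be automated. The following is a minimal sketch, under stated assumptions rather than a definitive implementation: it evaluates (41.5.10) for the Hurwitz order, computing \(h(S_d)\) by counting reduced primitive binary quadratic forms, taking the embedding number to be 0 when \(S_d\) is not maximal at 2 and \(1-({d} \mid {2})\) otherwise (as in the table above), and adding the assumed mass \(1/12\) when n is a square. All function names are hypothetical.

```python
from fractions import Fraction
from math import gcd, isqrt


def class_number(d):
    """h(S_d) for a discriminant d < 0, via reduced primitive forms (a, b, c)."""
    assert d < 0 and d % 4 in (0, 1)
    count, a = 0, 1
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):            # -a < b <= a
            if (b * b - d) % (4 * a):
                continue
            c = (b * b - d) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                count += 1
        a += 1
    return count


def embedding_number_hurwitz(d):
    """Assumed local count at 2: 0 if S_d is not maximal at 2, else 1 - (d | 2)."""
    if d % 4 == 0 and (d // 4) % 4 in (0, 1):
        return 0                                  # even conductor
    if d % 2 == 0:
        return 1                                  # 2 ramified, symbol 0
    return 0 if d % 8 == 1 else 2                 # 2 split or inert


def modified_hurwitz(d):
    """h_O(d) = h(S_d) / w_{S_d} * m, with w = 3, 2 for d = -3, -4 and 1 otherwise."""
    w = {-3: 3, -4: 2}.get(d, 1)
    return Fraction(class_number(d) * embedding_number_hurwitz(d), w)


def trace_brandt_hurwitz(n):
    """tr T(n) from (41.5.10), plus the mass term 1/12 when n is a square."""
    total = Fraction(0)
    t_max = isqrt(4 * n - 1)                      # largest t with t^2 < 4n
    for t in range(-t_max, t_max + 1):
        disc = t * t - 4 * n
        f = 1
        while f * f <= -disc:
            if disc % (f * f) == 0 and (disc // (f * f)) % 4 in (0, 1):
                total += modified_hurwitz(disc // (f * f))
            f += 1
    total /= 2
    if isqrt(n) ** 2 == n:
        total += Fraction(1, 12)
    return total


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


for n in (1, 3, 5, 7, 9):
    print(n, trace_brandt_hurwitz(n), sigma(n))
```

For each odd n tested, the trace computed from class numbers agrees with \(\sigma (n)\), which is the class number relation alluded to above.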

Remark 41.5.13. Brandt [Bra43, §III] defined Brandt matrices in the same paper as his groupoid; he called them Hecke matrices, as he claimed to follow parallels with certain operators defined by Hecke. Indeed, Hecke [Hec40, §9, Satz 53] conjectured that the space of cusp forms of weight 2 on \(\Gamma _0(p)\) for p prime was spanned by certain linear combinations of theta series, and it was this observation that motivated Brandt. (Eichler [Eic56a, footnote 16] says that Brandt should not have named them after Hecke, since it was really Brandt who interpreted function-theoretic results of Hecke using pure arithmetic.)

Eichler [Eic56a] proved that the ring generated by the Brandt matrices was a commutative, semisimple ring and proved the trace formula for Brandt matrices [Eic56a, §6]. In this early work, he already foresaw the application of Brandt matrices to other base fields: as an application, he used Brandt matrices to give class number relations between imaginary quadratic fields, and in the function field case these become relations among divisor class groups for hyperelliptic curves. Eichler [Eic77, Chapter II] presented the generalization to totally real fields, giving a treatment of Hecke operators, Brandt matrices, and theta series, and he proved that the Brandt matrices realize Hecke operators in certain spaces of Hilbert modular forms.

Eichler then later gave a self-contained presentation [Eic73, Chapter II] of the theory of Brandt matrices over \(\mathbb {Q}\), with the intended application the solution to Hecke’s conjecture (suitably corrected), now known as the basis problem for \(\Gamma _0(p)\): to give bases of linearly independent forms of spaces of modular forms in terms of theta series of quadratic forms coming from quaternion algebras. This line of work was followed by generalizations by Hijikata [Hij74] and Hijikata–Saito [HS73] for general Eichler orders, Pizer [Piz76b, Piz76c] for residually split orders, culminating in a solution over the rational numbers to the basis problem by Hijikata–Pizer–Shemanske [HPS89a].

The method of proof for the solution to the basis problem is the use of the trace formula, for which a key ingredient is the theory of optimal embeddings: see Remark 30.6.18.

Indeed, it is much more involved analytically, but one can similarly compute the trace of the Hecke operator acting on classical spaces of modular forms or more generally spaces of Hilbert modular forms. These trace formulae are quite complicated, but one notices that they have a shape similar to that of the above trace formula; and in fact, under certain hypotheses and after restricting to an appropriate new subspace, the traces are equal. But since both rings are semisimple, this implies that the same systems of eigenvalues for the Hecke operators arise! Such a correspondence was first given by Eichler, as above; it was generalized to totally real fields by Shimizu [Shz72] using theta series, and the most general formulation was given by Jacquet–Langlands [JL70]. This correspondence was conjectured to generalize to the principle of Langlands functorial transfer: for an introduction to this vast area, see Gelbart [Gel84].

In light of the preceding epic remark, we hope we have inspired the reader to pursue the relationship between Brandt matrices and modular forms! Unfortunately, it would require another book to respectfully develop this subject.

Remark 41.5.14. Pizer [Piz80a] was the first to give an algorithm for computing classical modular forms using Brandt matrices (on \(\Gamma _0(N)\) for N not a perfect square); see also the work of Kohel [Koh2001] over \(\mathbb Z \). This algorithm was generalized to compute Hilbert modular forms over a totally real field of narrow class number 1 by Socrates–Whitehouse [SW2005], with algorithmic improvements by Dembélé [Dem2007]. The assumption on the class number was removed by Dembélé–Donnelly [DD2008]. A survey of these methods is given by Dembélé–Voight [DV2013, §4, §8].

Exercises

Unless otherwise specified, in these exercises let R be a global ring with eligible set \({{{\texttt {\textit{S}}}}}\subseteq {{\,\mathrm{Pl}\,}}F\), let B be an \({{{\texttt {\textit{S}}}}}\)-definite quaternion algebra over F and let \(\mathcal {O}\subset B\) be an R-order in B.

1.:

Extend the definition of the Brandt matrix to include the case \(\mathfrak n =(0)\) of the zero ideal, following (41.2.8): define \(T(0)_{ij}=1/w_i\) for \(i,j=1,\dots ,h\). Conclude \({{\,\mathrm{tr}\,}}T(0) = {{\,\mathrm{mass}\,}}({{\,\mathrm{Cls}\,}}\mathcal {O})\).

2.:

Show that Corollary 41.5.7 agrees with Main Theorem 30.8.6. [Hint: organize by \(q :=[S^\times : R^\times ]\), observe that in \(S \supseteq R[x]/(x^2-tx+u)\) we have necessarily \([S^\times : R^\times ] \ge 2\) and each such S contains \(q-1\) orders of the form \(R[x]/(x^2-tx+u)\).]

3.:

Refine Lemma 41.2.7(c) in a special case as follows. Suppose \({{\,\mathrm{Cl}\,}}^+ R\) is trivial. Show that

$$\begin{aligned} T(\mathfrak n )_{ij} = \frac{1}{w_{i,1}} \#\{\alpha \in I_j I_i^{-1} : {{\,\mathrm{nrd}\,}}(\alpha )=n_{ij}\} \end{aligned}$$

where \(w_{i,1} :=\#\mathcal {O}_i^1\).

4.:

Suppose \(R=\mathbb Z \) and \(F=\mathbb {Q}\), suppose \({{\,\mathrm{disc}\,}}B=p\) is prime and \(\mathcal {O}\) is a maximal order. Show that

$$ {{\,\mathrm{tr}\,}}T(p) = {\left\{ \begin{array}{ll} 1, &{} \text { if}\, p=2,3; \\ h_{\mathcal {O}}(-4p), &{} \text { if}\, p>3. \end{array}\right. } $$

where

$$ h_{\mathcal {O}}(-4p) = {\left\{ \begin{array}{ll} h(-4p)/2, &{} \text { if}\, p \equiv 1 ~(\text{mod } ~{4}); \\ h(-p), &{} \text { if }\,p \equiv 7 ~(\text{mod } ~{8}); \\ 2h(-p), &{} \text { if}\, p \equiv 3 ~(\text{mod } ~{8})\, \text {and}\, p>3. \end{array}\right. } $$

What does this say about the number of maximal orders in B up to isomorphism such that every two-sided ideal is principal?

5.:

Give another proof of Proposition 41.3.1(b) using the local-global dictionary for lattices.

6.:

Prove (41.3.12) using induction and then expand to verify (41.3.13).

7.:

Let \(B=\displaystyle {\biggl (\frac{-1,-11}{\mathbb {Q}}\biggr )}\) with \({{\,\mathrm{disc}\,}}B=11\) and let \(\mathcal {O}=\mathbb Z \langle i, \frac{1}{2}(j+1) \rangle \).

(a):

Show that \(\mathcal {O}\) is a maximal order with \(\#\mathcal {O}^\times =4\).

(b):

Show that the ternary quadratic form associated to \(\mathcal {O}\) is similar to \(x^2-xz+y^2+3z^2\).

(c):

Show that \({{\,\mathrm{Cls}\,}}\mathcal {O}=\{[\mathcal {O}],[I_2]\}\) where \(I_2=2\mathcal {O}+ \frac{1}{2}(1+2i+j)\mathcal {O}\). [Hint: Follow Example 17.6.3.] Along the way, show that

$$\begin{aligned} T(2) = \begin{pmatrix} 1 &{} 3 \\ 2 &{} 0 \end{pmatrix}. \end{aligned}$$
(d):

Pause and show that \(\mathcal {O}_2 :=\mathcal {O}{}_{\textsf {\tiny {L}} }(I_2)\) has \(\#\mathcal {O}_2^\times =6\) and associated ternary quadratic form \(x^2-xy-xz+y^2+yz+4z^2\).

(e):

Show that \(M_2(\mathcal {O})\) has two eigenspaces for the Hecke algebra, one spanned by a form e with \(T(p)(e)=(p+1)e\) for all \(p \ne 11\), and the other spanned by a form f with \(T(2)(f)=-2f\).

(f):

Verify the trace formula (41.5.9) for \({{\,\mathrm{tr}\,}}T(2)=1\) by computing class numbers.

[There is a unique normalized cusp form \(f \in S_2(\Gamma _0(11))\) of weight 2 and level 11 with

$$\begin{aligned} f(q)=q\prod _{n=1}^{\infty } (1-q^n)^2 \prod _{n=1}^{\infty } (1-q^{11n})^2 = q - 2q^2 - q^3 + \dots = \sum _{n=1}^{\infty } a_n q^n \end{aligned}$$

matching f in the sense that \(T(n)f = a_n f\) for all \(11 \not \mid n\).]

8.:

Let \(\mathcal {O}(3,5)\) be an Eichler order of level 5 and reduced discriminant 15 (in a quaternion algebra B of discriminant 3) and similarly let \(\mathcal {O}(5,3)\) be an Eichler order of level 3 and reduced discriminant 15 (in a quaternion algebra of discriminant 5).

(a):

Show that \(\#{{\,\mathrm{Cls}\,}}\mathcal {O}(3,5)=\#{{\,\mathrm{Cls}\,}}\mathcal {O}(5,3)=2\).

(b):

Show that there is a unique eigenvector for the Brandt matrix T(2) for both orders with eigenvalue \(-1\). As far as you can compute, show that this eigenvector shares the same eigenvalues for T(n) for both orders.

9.:

We consider an example of Brandt matrices not restricted to maximal orders. Let \(B=\displaystyle {\biggl (\frac{-1,-1}{\mathbb {Q}}\biggr )}\) and let

$$\begin{aligned} \mathcal {O}=\mathbb Z \langle 2i, 2j \rangle = \mathbb Z \oplus \mathbb Z (2i) \oplus \mathbb Z (2j) \oplus \mathbb Z (4ij). \end{aligned}$$
(a):

Show that \(\mathcal {O}\) is an order with \({{\,\mathrm{discrd}\,}}\mathcal {O}= N = 64\).

(b):

Compute that \(\#{{\,\mathrm{Cls}\,}}\mathcal {O}= 4\).

(c):

Under the action of the Brandt matrices T(n), show that there are 3 irreducible factors of dimensions 1, 1, 2. In a basis of characteristic functions, identify the one-dimensional factors as:

$$\begin{aligned} (1,1,1,1)&\leftrightarrow e(q) :=\frac{1}{24} + \sum _{n=1}^{\infty } \sigma ^*(n) q^n \\ (1,-1,-1,1)&\leftrightarrow e_\chi (q) :=\frac{1}{24} + \sum _{n=1}^{\infty } \sigma ^*(n)\chi (n) q^n \end{aligned}$$

where now \(\sigma ^*(n) :=\displaystyle {\sum _{2 \not \mid d \mid n} d}\) and \(\chi (n)=\biggl (\displaystyle {\frac{-1}{n}}\biggr )\).

[The two-dimensional space has basis \((1,0,0,-1),(0,1,-1,0) \leftrightarrow f_1,f_2\) and \(a_p(f_1)=a_p(f_2)\) for all p, with

$$\begin{aligned} f_i = q + 2q^5 - 3q^9 - 6q^{13} + 2q^{17} + \ldots \end{aligned}$$

corresponding to the isogeny class of the elliptic curve \(E :y^2=x^3+x\) of conductor 64.]