Moonshine

Monstrous moonshine relates distinguished modular functions to the representation theory of the Monster $\mathbb{M}$. The celebrated observations that

$$ 1=1,\ \ \ 196884=1+196883,\ \ \ 21493760=1+196883+21296876,\ \ \ \dots\dots \qquad (*)$$

illustrate the case of $J(\tau)=j(\tau)-744$, whose coefficients turn out to be sums of the dimensions of the 194 irreducible representations of $\mathbb{M}$. Such formulas are dictated by the structure of the graded monstrous moonshine modules. Recent works in moonshine suggest deep relations between number theory and physics. Number theoretic Kloosterman sums have reappeared in quantum gravity, and mock modular forms have emerged as candidates for the computation of black hole degeneracies. This paper is a survey of past and present research on moonshine. We also compute the quantum dimensions of the monster orbifold and obtain exact formulas for the multiplicities of the irreducible components of the moonshine modules. These formulas imply that such multiplicities are asymptotically proportional to dimensions. For example, the proportion of 1's in (*) tends to

$$\frac{\dim(\chi_{1})}{\sum_{i=1}^{194}\dim(\chi_{i})}=\frac{1}{5844076785304502808013602136}=1.711\ldots \times 10^{-28}. $$

2010 Mathematics Subject Classification: 11F11; 11F22; 11F37; 11F50; 20C34; 20C35


Introduction
This story begins innocently with peculiar numerics, and in its present form exhibits connections to conformal field theory, string theory, quantum gravity, and the arithmetic of mock modular forms. This paper is an introduction to the many facets of this beautiful theory.
We begin with a review of the principal results in the development of monstrous moonshine in §§2-4. We refer to the introduction of [97], and the more recent survey [109], for more on these topics. After describing these classic works, we discuss the interplay between moonshine and Rademacher sums in §5, and related observations which suggest a connection between monstrous moonshine and three-dimensional quantum gravity in §6. The remainder of the paper is devoted to more recent works. We describe a generalization of moonshine, the moonshine tower, in §7, and in §8 we discuss the distributions of irreducible representations of the monster arising in monstrous moonshine. We present a survey of the recently discovered umbral moonshine phenomenon in §9, and conclude, in §10, with problems for future work.
Fischer and Griess independently produced evidence for the monster group in 1973 (cf. [112]). Well before it was proven to exist, Tits gave a lecture on its conjectural properties at the Collège de France in 1975. In particular, he described its order (2.1). Around this time, Ogg had been considering the automorphism groups of certain algebraic curves, and had arrived at the set of primes

$$\{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 41, 47, 59, 71\} \qquad (2.3)$$

in a purely geometric way (cf. the Corollaire of [183]). Making what may now be identified as the first observation of monstrous moonshine, Ogg offered a bottle of Jack Daniels$^{1}$ for an explanation of this coincidence (cf. Remarque 1 of [183]).
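The coincidence Ogg noticed can be checked mechanically. The sketch below is only an illustration: the prime factorization of the monster's order is the standard published value, supplied here as an assumption because the display (2.1) is not reproduced in this text, and the names `MONSTER_FACTORS`, `OGG_PRIMES`, `monster_order` and `prime_divisors` are hypothetical helpers.

```python
from math import prod

# Standard prime factorization of the order of the monster (an assumption
# of this sketch; the display (2.1) is not reproduced in the text).
MONSTER_FACTORS = {2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3, 17: 1, 19: 1,
                   23: 1, 29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1}

# Ogg's set of primes, as listed in (2.3).
OGG_PRIMES = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 41, 47, 59, 71}


def monster_order():
    """Order of the monster, rebuilt from the assumed factorization."""
    return prod(p ** e for p, e in MONSTER_FACTORS.items())


def prime_divisors(n):
    """Set of primes dividing n, found by trial division."""
    found, p = set(), 2
    while n > 1:
        if n % p == 0:
            found.add(p)
            while n % p == 0:
                n //= p
        p += 1
    return found


if __name__ == "__main__":
    order = monster_order()
    print(order)
    print(prime_divisors(order) == OGG_PRIMES)
```

The check only confirms that the two sets of primes agree; it says nothing about why they agree, which is the content of Ogg's question.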
Ogg's observation would ultimately be recognized as reflecting another respect in which the monster is distinguished amongst finite simple groups: as demonstrated by the pioneering construction of Frenkel-Lepowsky-Meurman [95,96,97], following the astonishing work of Griess [113,114], the "most natural" representation of the monster is infinite-dimensional.
(2.5)

$^{1}$ We refer the reader to [83] for a recent analysis of the Jack Daniels problem.
The right hand sides of (2.4) and (2.5) are familiar to finite group theorists, as simple sums of dimensions of irreducible representations of the monster $\mathbb{M}$. In fact, the irreducible representations appearing in (2.4) and (2.5) are just the first five, of a total of 194, in the character table of $\mathbb{M}$ (cf. [54]), when ordered by size. We have that

$$\chi_1(e) = 1,\quad \chi_2(e) = 196883,\quad \chi_3(e) = 21296876,\quad \chi_4(e) = 842609326,\quad \chi_5(e) = 18538750076.$$

Here $e$ denotes the identity element of $\mathbb{M}$, so $\chi_i(e)$ is just the dimension of the irreducible representation of $\mathbb{M}$ with character $\chi_i$.
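The arithmetic behind such decompositions is easy to verify. The following sketch assumes the standard values of the first four coefficients of $J$ and of the five smallest character degrees of the monster; the multiplicities in `MULT` give one decomposition that is numerically consistent with these values, and are an illustrative assumption rather than a statement taken from the text above.

```python
# Degrees of the five smallest irreducible characters of the monster
# (standard values, assumed here rather than derived).
CHI = [1, 196883, 21296876, 842609326, 18538750076]

# Coefficients c(1)..c(4) of J(tau) = q^-1 + sum_n c(n) q^n (standard values).
C = {1: 196884, 2: 21493760, 3: 864299970, 4: 20245856256}

# Assumed multiplicities of chi_1..chi_5 in the degree n pieces, n = 1..4.
MULT = {
    1: [1, 1, 0, 0, 0],
    2: [1, 1, 1, 0, 0],
    3: [2, 2, 1, 1, 0],
    4: [3, 3, 1, 2, 1],
}


def graded_dimension(n):
    """Dimension obtained by summing character degrees with multiplicity."""
    return sum(m * d for m, d in zip(MULT[n], CHI))


if __name__ == "__main__":
    for n in sorted(C):
        print(n, graded_dimension(n), graded_dimension(n) == C[n])
```

Because the character degrees satisfy linear relations, matching dimensions does not by itself pin down the multiplicities; that requires the full graded traces.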
At the time that Thompson's conjecture was made, the monster had not yet been proven to exist, but Griess [112], and Conway-Norton [56], had independently conjectured the existence of a faithful representation of dimension 196883, and Fischer-Livingstone-Thorne had constructed the character table of M, by assuming the validity of this claim (cf. [56]) together with conjectural statements (cf. [112]) about the structure of M.
Thompson also suggested [210] investigating the properties of the graded-trace functions

$$T_g(\tau) := \sum_{n \geq -1} \operatorname{tr}(g|V^\natural_n)\, q^n \qquad (3.2)$$

for $g \in \mathbb{M}$, now called the monstrous McKay-Thompson series, that would arise from the conjectural monster module $V^\natural$. Using the character table constructed by Fischer-Livingstone-Thorne, it was observed [56,210] that the functions $T_g$ are in many cases directly similar to $J$ in the following respect: the first few coefficients of each $T_g$ coincide with those of a generator for the function field of a discrete group $\Gamma_g < SL_2(\mathbb{R})$, commensurable with $SL_2(\mathbb{Z})$, containing $-I$, and having width one at infinity, meaning that the subgroup of upper-triangular matrices in $\Gamma_g$ coincides with

$$\Gamma_\infty := \left\{ \pm\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} \,\middle|\, n \in \mathbb{Z} \right\}, \qquad (3.3)$$

for all $g \in \mathbb{M}$.
This observation was refined and developed by Conway-Norton [56], leading to their famous monstrous moonshine conjectures.

Conjecture 3.2 (Monstrous Moonshine: Conway-Norton). For each $g \in \mathbb{M}$ there is a specific group $\Gamma_g < SL_2(\mathbb{R})$ such that $T_g$ is the unique normalized principal modulus for $\Gamma_g$.

This means that each $T_g$ is the unique $\Gamma_g$-invariant holomorphic function on $\mathbb{H}$ which satisfies

$$T_g(\tau) = q^{-1} + O(q), \qquad (3.4)$$

as $\Im(\tau) \to \infty$, and remains bounded as $\tau$ approaches any non-infinite cusp of $\Gamma_g$. We refer to this feature of the $T_g$ as the principal modulus property of monstrous moonshine.
The hypothesis that $T_g$ is $\Gamma_g$-invariant, satisfying (3.4) near the infinite cusp of $\Gamma_g$ but having no other poles, implies that $T_g$ generates the field of $\Gamma_g$-invariant holomorphic functions on $\mathbb{H}$ that have at most exponential growth at cusps, in direct analogy with $J$. In particular, the natural Riemann surface structure on $\Gamma_g\backslash\mathbb{H}$ (cf. e.g. [201]) must be that of the Riemann sphere $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ with finitely many points removed, and for this reason the groups $\Gamma_g$ are said to have genus zero, and the principal modulus property is often referred to as the genus zero property of monstrous moonshine.
The reader will note the astonishing predictive power that the principal modulus property of monstrous moonshine bestows: the fact that a normalized principal modulus for a genus zero group Γ g is unique, means that we can compute the trace of an element g ∈ M, on any homogeneous subspace of the monster's natural infinite-dimensional representation V ♮ , without any information about the monster, as soon as we can guess correctly the discrete group Γ g . The analysis of Conway-Norton in [56] establishes very strong guidelines for the determination of Γ g , and once Γ g has been chosen, the "theory of replicability" (cf. [1,56,181]) allows for efficient computation of the coefficients of the normalized principal modulus T g , given the knowledge of just a few of them (cf. [92], or (3.10)).
It was verified by Atkin-Fong-Smith [202], using results of Thompson [210] (cf. also [197]), that a graded (possibly virtual) infinite-dimensional monster module $V^\natural$, such that the functions $T_g$ of (3.2) are exactly those predicted by Conway-Norton in [56], exists.

Theorem 3.3 (Atkin-Fong-Smith). There exists a (possibly virtual) graded $\mathbb{M}$-module $V^\natural = \bigoplus_{n=-1}^{\infty} V^\natural_n$ such that if $T_g$ is defined by (3.2), then $T_g$ is the Fourier expansion of the unique $\Gamma_g$-invariant holomorphic function on $\mathbb{H}$ that satisfies $T_g(\tau) = q^{-1} + O(q)$ as $\tau$ approaches the infinite cusp, and has no poles at any non-infinite cusps of $\Gamma_g$, where $\Gamma_g$ is the discrete subgroup of $SL_2(\mathbb{R})$ specified by Conway-Norton in [56].

Thus Thompson's conjecture was confirmed, albeit indirectly. By this point in time, Griess, in an astonishing tour de force, had constructed the monster explicitly, by hand, by realizing it as the automorphism group of a commutative but non-associative algebra of dimension 196884 [113,114]. (See also [53,212].) Inspired by Griess' construction, and by the representation theory of affine Lie algebras, which also involves graded infinite-dimensional vector spaces whose graded dimensions enjoy good modular properties (cf. e.g. [138,139,140,143]), Frenkel-Lepowsky-Meurman established Thompson's conjecture in a strong sense.
Frenkel-Lepowsky-Meurman generalized the homogeneous realization of the basic representation of an affine Lie algebra $\hat{\mathfrak{g}}$ due, independently, to Frenkel-Kac [94] and Segal [198], in such a way that Leech's lattice $\Lambda$ [157,158], the unique [51] even self-dual positive-definite lattice of rank 24 with no roots, could take on the role played by the root lattice of $\mathfrak{g}$ in the Lie algebra case. In particular, their construction came equipped with rich algebraic structure, furnished by vertex operators, which had appeared first in the physics literature in the late 1960s.
We refer to [94], and also the introduction to [97] for accounts of the role played by vertex operators in physics (up to 1988) along with a detailed description of their application to the representation theory of affine Lie algebras. The first application of vertex operators to affine Lie algebra representations was obtained by Lepowsky-Wilson in [161].
Borcherds described a powerful axiomatic formalism for vertex operators in [13]. In particular, he introduced the notion of a vertex algebra, which can be regarded as similar to a commutative associative algebra, except that multiplications depend upon formal variables $z_i$, and can be singular, in a certain formal sense, along the canonical divisors (cf. [18,93]).
The appearance of affine Lie algebras above, as a conceptual ingredient for the Frenkel-Lepowsky-Meurman construction of $V^\natural$, hints at an analogy between complex Lie groups and the monster. Borcherds' vertex algebra theory makes this concrete, for Borcherds showed [13] that both in the case of the basic representation of an affine Lie algebra, and in the case of the moonshine module $V^\natural$, the vertex operators defined by Frenkel-Kac, Segal, and Frenkel-Lepowsky-Meurman extend naturally to vertex algebra structures.
In these examples the Virasoro algebra $\mathcal{V}$, spanned by generators $L(m)$ for $m \in \mathbb{Z}$ together with a central element $\mathbf{c}$, acts naturally on the underlying vector space. (See [144] for a detailed analysis of $\mathcal{V}$. The generator $L(m)$ lies above the vector field $-t^{m+1}\frac{d}{dt}$.) This Virasoro structure, which has powerful applications, was axiomatized in [97], with the introduction of the notion of a vertex operator algebra. If $V$ is a vertex operator algebra and the central element $\mathbf{c}$ of the Virasoro algebra acts as $c$ times the identity on $V$, for some $c \in \mathbb{C}$, then $V$ is said to have central charge $c$.
For the basic representation of an affine Lie algebra $\hat{\mathfrak{g}}$, the group of vertex operator algebra automorphisms (i.e. those vertex algebra automorphisms that commute with the Virasoro action) is the adjoint complex Lie group associated to $\mathfrak{g}$. For the moonshine module $V^\natural$, it was shown by Frenkel-Lepowsky-Meurman in [97] that the group of vertex operator algebra automorphisms is precisely the monster.
Theorem (Frenkel-Lepowsky-Meurman). The moonshine module $V^\natural$ is a vertex operator algebra of central charge 24 whose graded dimension is given by $J(\tau)$, and whose automorphism group is $\mathbb{M}$.
Vertex operator algebras are of relevance to physics, for we now recognize them as "chiral halves" of two-dimensional conformal field theories (cf. [98,99]). From this point of view, the construction of $V^\natural$ by Frenkel-Lepowsky-Meurman constitutes one of the first examples of an orbifold conformal field theory (cf. [69,70,71]). In the case of $V^\natural$, the underlying geometric orbifold is the quotient

$$\left(\mathbb{R}^{24}/\Lambda\right)/(\mathbb{Z}/2\mathbb{Z}),$$

of the 24-dimensional torus by the Kummer involution $x \mapsto -x$, where $\Lambda$ denotes the Leech lattice. So in a certain sense, $V^\natural$ furnishes a "24-dimensional" construction of $\mathbb{M}$. We refer to [93,97,142,160] for excellent introductions to vertex algebra, and vertex operator algebra theory.
Affine Lie algebras are special cases of Kac-Moody algebras, first considered by Kac [137] and Moody [174,175], independently. Roughly speaking, a Kac-Moody algebra is "built" from copies of sl 2 , in such a way that most examples are infinite-dimensional, but much of the finite-dimensional theory carries through (cf. [141]). Borcherds generalized this further, allowing also copies of the three-dimensional Heisenberg Lie algebra to serve as building blocks, and thus arrived [14] at the notion of generalized Kac-Moody algebra, or Borcherds-Kac-Moody (BKM) algebra, which has subsequently found many applications in mathematics and mathematical physics (cf. [128,196]).
One of the most powerful such applications occurred in moonshine, when Borcherds introduced a particular example, the monster Lie algebra $\mathfrak{m}$, and used it to prove [16] the moonshine conjectures of Conway-Norton. His method entailed using monster-equivariant versions of the denominator identity for $\mathfrak{m}$ to verify that the coefficients of the McKay-Thompson series $T_g$, defined by (3.2) according to the Frenkel-Lepowsky-Meurman construction of $V^\natural$, satisfy the replication formulas conjectured by Conway-Norton in [56]. This powerful result reduced the proof of the moonshine conjectures to a small, finite number of identities, that he could easily check by hand.
Theorem 3.6 (Borcherds). Let $V^\natural$ be the moonshine module vertex operator algebra constructed by Frenkel-Lepowsky-Meurman, whose automorphism group is $\mathbb{M}$. If $T_g$ is defined by (3.2) for $g \in \mathbb{M}$, and if $\Gamma_g$ is the discrete subgroup of $SL_2(\mathbb{R})$ specified by Conway-Norton in [56], then $T_g$ is the unique normalized principal modulus for $\Gamma_g$.
Recall that an even self-dual lattice of signature $(m, n)$ exists if and only if $m - n \equiv 0 \pmod 8$ (cf. e.g. [60]). Such a lattice is unique up to isomorphism if $mn > 0$, and is typically denoted $II_{m,n}$. In the case that $m = n = 1$ we may take

$$II_{1,1} = \mathbb{Z}e + \mathbb{Z}f,$$

where $e$ and $f$ are isotropic, $\langle e, e\rangle = \langle f, f\rangle = 0$, and $\langle e, f\rangle = 1$. Then $me + nf \in II_{1,1}$ has square-length $2mn$. Note that $II_{25,1}$ and $\Lambda \oplus II_{1,1}$ are isomorphic, for $\Lambda$ the Leech lattice, since both lattices are even and self-dual, with signature $(25,1)$.
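The square-length claim for $II_{1,1}$ is a one-line computation, sketched here under the assumption that the lattice is spanned by the isotropic vectors $e$ and $f$ with $\langle e, f\rangle = 1$. The names `GRAM`, `pairing` and `square_length` are hypothetical helpers introduced only for this illustration.

```python
# Gram matrix of II_{1,1} in the basis (e, f):
# <e,e> = <f,f> = 0 and <e,f> = 1.
GRAM = ((0, 1), (1, 0))


def pairing(u, v):
    """Bilinear form on coordinate vectors with respect to (e, f)."""
    return sum(u[i] * GRAM[i][j] * v[j] for i in range(2) for j in range(2))


def square_length(m, n):
    """Square-length of the lattice vector m*e + n*f."""
    return pairing((m, n), (m, n))


def gram_determinant():
    """Determinant of the Gram matrix; +-1 means the lattice is self-dual."""
    return GRAM[0][0] * GRAM[1][1] - GRAM[0][1] * GRAM[1][0]


if __name__ == "__main__":
    print(square_length(3, -2))   # equals 2 * 3 * (-2)
    print(gram_determinant())
```

Every square-length $2mn$ is even, and the Gram determinant is $-1$, which is consistent with the lattice being even and self-dual of signature $(1,1)$.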
In physical terms the monster Lie algebra $\mathfrak{m}$ is ("about half" of) the space of "physical states" of a bosonic string moving in the quotient of the 26-dimensional torus

$$II_{25,1} \otimes_{\mathbb{Z}} \mathbb{R}/II_{25,1} \simeq \mathbb{R}^{24}/\Lambda \oplus \mathbb{R}^{1,1}/II_{1,1}$$

by the Kummer involution $x \mapsto -x$. The monster Lie algebra $\mathfrak{m}$ is constructed in a functorial way from $V^\natural$ (cf. [35]), inherits an action by the monster from $V^\natural$, and admits a monster-invariant grading by $II_{1,1}$.
The denominator identity for a Kac-Moody algebra $\mathfrak{g}$ equates a product indexed by the positive roots of $\mathfrak{g}$ with a sum indexed by the Weyl group of $\mathfrak{g}$. A BKM algebra also admits a denominator identity, which for the case of the monster Lie algebra $\mathfrak{m}$ is the beautiful Koike-Norton-Zagier formula

$$p^{-1}\prod_{\substack{m>0\\ n\in\mathbb{Z}}} \left(1 - p^m q^n\right)^{c(mn)} = J(\sigma) - J(\tau), \qquad (3.9)$$

where $\sigma \in \mathbb{H}$ and $p = e^{2\pi i\sigma}$ (and $c(n)$ is the coefficient of $q^n$ in $J(\tau)$, cf. (2.6)). Since the right hand side of (3.9) implies that the left hand side has no terms $p^m q^n$ with $mn \neq 0$, this identity imposes many non-trivial polynomial relations upon the coefficients of $J(\tau)$. Among these is

$$c(4n+2) = c(2n+2) + \sum_{k=1}^{n} c(k)\, c(2n-k+1), \qquad (3.10)$$

which was first found by Mahler [163] by a different method, along with similar expressions for $c(4n)$, $c(4n+1)$, and $c(4n+3)$, which are also entailed in (3.9). Taken together these relations allow us to compute the coefficients of $J(\tau)$ recursively, given just the values

$$c(1) = 196884,\quad c(2) = 21493760,\quad c(3) = 864299970,\quad c(5) = 333202640600. \qquad (3.11)$$

To recover the replication formulas of [56,181] we must replace $J$ with $T_g$, and $c(n) = \dim(V^\natural_n)$ with $\operatorname{tr}(g|V^\natural_n)$ in (3.9), and for this we require a categorification of the denominator identity, whereby the positive integers $c(mn)$ are replaced with $\mathbb{M}$-modules of dimension $c(mn)$.
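A relation of the kind labeled (3.10) can be tested numerically. The sketch below assumes the standard form of Mahler's recursion, $c(4n+2) = c(2n+2) + \sum_{k=1}^{n} c(k)c(2n-k+1)$, and computes the coefficients of $J$ independently from the classical expression $j = E_4^3/\Delta$. That route is chosen here for convenience and is not a method described in the text; the function names are hypothetical.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def series_mul(a, b):
    """Product of two power series, truncated to len(a) terms."""
    size = len(a)
    out = [0] * size
    for i, ai in enumerate(a):
        if ai:
            for j in range(size - i):
                out[i + j] += ai * b[j]
    return out


def j_coefficients(n_max):
    """Return {n: c(n)} for 1 <= n <= n_max, where J = q^-1 + sum c(n) q^n.

    Uses q*j = E4^3 / prod_{n>0} (1 - q^n)^24 with exact integer arithmetic.
    """
    size = n_max + 2  # coefficients of q^0 .. q^(n_max+1) of q*j
    e4 = [1] + [240 * sigma3(n) for n in range(1, size)]
    # 1 / prod (1 - q^n) is the generating function of partitions.
    inv_euler = [1] + [0] * (size - 1)
    for n in range(1, size):
        for k in range(n, size):
            inv_euler[k] += inv_euler[k - n]
    qj = series_mul(series_mul(e4, e4), e4)
    for _ in range(24):
        qj = series_mul(qj, inv_euler)
    # qj[0] = 1 (the q^-1 term of j), qj[1] = 744, qj[n+1] = c(n) for n >= 1.
    return {n: qj[n + 1] for n in range(1, n_max + 1)}


def mahler_rhs(c, n):
    """Right hand side of the assumed recursion for c(4n+2)."""
    return c[2 * n + 2] + sum(c[k] * c[2 * n - k + 1] for k in range(1, n + 1))


if __name__ == "__main__":
    c = j_coefficients(10)
    for n in range(3):
        print(4 * n + 2, c[4 * n + 2] == mahler_rhs(c, n))
```

The case $n = 0$ is trivial, so the first informative check is $c(6) = c(4) + c(1)c(2)$.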
A categorification of the denominator formula for a finite-dimensional simple complex Lie algebra was obtained by Kostant [155], following an observation of Bott [21]. This was generalized to Kac-Moody algebras by Garland-Lepowsky [111], and generalized further to BKM algebras by Borcherds in [16]. In its most compact form, it is the identity of virtual vector spaces

$$\bigwedge\nolimits_{-1}(\mathfrak{e}) = H(\mathfrak{e}), \qquad (3.12)$$

where $\mathfrak{e}$ is the sub Lie algebra of a BKM algebra corresponding to its positive roots (cf. [133,134,141]).

In (3.12) we understand $\bigwedge_{-1}(\mathfrak{e})$ to be the specialization of the formal series

$$\bigwedge\nolimits_{t}(\mathfrak{e}) := \sum_{k\geq 0} \wedge^k(\mathfrak{e})\, t^k \qquad (3.13)$$

to $t = -1$, where $\wedge^k(\mathfrak{e})$ is the $k$-th exterior power of $\mathfrak{e}$, and we write

$$H(\mathfrak{e}) := \sum_{k\geq 0} (-1)^k H_k(\mathfrak{e}) \qquad (3.14)$$

for the alternating sum of the Lie algebra homology groups of $\mathfrak{e}$.
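In the case of the monster Lie algebra $\mathfrak{m}$, the spaces $\wedge^k(\mathfrak{e})$ and $H_k(\mathfrak{e})$ are graded by $II_{1,1}$, and acted on naturally by the monster. If we use the variables $p$ and $q$ to keep track of the $II_{1,1}$-gradings, then the equality of (3.12) holds in the ring $R(\mathbb{M})[[p, q]][q^{-1}]$ of formal power series in $p$ and $q$ (allowing finitely many negative powers of $q$), with coefficients in the (integral) representation ring of $\mathbb{M}$. More precisely, (3.12) becomes

$$\bigwedge\nolimits_{-1}\Big(\sum_{\substack{m>0\\ n\in\mathbb{Z}}} V^\natural_{mn}\, p^m q^n\Big) = \sum_{m\in\mathbb{Z}} V^\natural_m\, p^{m+1} - \sum_{n\in\mathbb{Z}} V^\natural_n\, p\, q^n, \qquad (3.15)$$

and the identity (3.9) follows when we replace $V^\natural_k$ with $\dim(V^\natural_k) = c(k)$, and divide both sides by $p$. More generally, replacing $V^\natural_k$ with $\operatorname{tr}(g|V^\natural_k)$ for $g \in \mathbb{M}$, the identity (3.15) implies

$$p^{-1}\exp\Big(-\sum_{k>0}\sum_{\substack{m>0\\ n\in\mathbb{Z}}} \frac{1}{k}\operatorname{tr}(g^k|V^\natural_{mn})\, p^{mk} q^{nk}\Big) = T_g(\sigma) - T_g(\tau) \qquad (3.16)$$

(cf. [16], and also [136]), which, in turn, implies the replication formulas formulated in [56,181]. Taking $g = e$ in (3.16) we recover (3.9), so (3.16) furnishes a natural, monster-indexed family of analogues of the identity (3.9).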

Modularity
Despite the power of the BKM algebra theory developed by Borcherds, and despite some conceptual improvements (cf. [61,134,135]) upon Borcherds' original proof of the moonshine conjectures, a conceptual explanation for the principal modulus property of monstrous moonshine is yet to be established. Indeed, there are generalizations and analogs of the notion of replicability which hold for generic modular functions and forms (for example, see [30]), not just those modular functions which are principal moduli.
Zhu explained the modularity of the graded dimension $\sum_n \dim(V^\natural_n) q^n$ of $V^\natural$ in [227], by proving that this is typical for vertex operator algebras satisfying quite general hypotheses, and Dong-Li-Mason extended Zhu's work in [74], obtaining modular invariance results for graded trace functions arising from the action of a finite group of automorphisms.
To prepare for a statement of the results of Zhu and Dong-Li-Mason, we mention that the module theory for vertex operator algebras includes so-called twisted modules, associated to finite order automorphisms. If $g$ is a finite order automorphism of $V$, then $V$ is called $g$-rational in case every $g$-twisted $V$-module is a direct sum of simple $g$-twisted $V$-modules. Dong-Li-Mason proved [73] that a $g$-rational vertex operator algebra has finitely many simple $g$-twisted modules up to isomorphism. So in particular, a rational vertex operator algebra has finitely many simple (untwisted) modules. More generally, if $G$ is a finite subgroup of $\operatorname{Aut}(V)$ and $V$ is $g$-rational for every $g \in G$, then the graded trace functions $\sum_n \operatorname{tr}(\tilde{h}|M_n) q^n$, attached to the triples $(g, \tilde{h}, M)$, where $g, h \in G$ commute, $M$ is a simple $h$-stable $g$-twisted module for $V$, and $\tilde{h}$ is a lift of $h$ to $GL(M)$, span a finite-dimensional representation of $SL_2(\mathbb{Z})$.
We refer to the Introduction of [74] (see also §2 of [75]) for a discussion of $h$-stable twisted modules, and the relevant notion of lift. Note that any two lifts for $h$ differ only by multiplication by a non-zero scalar, so $\sum_n \operatorname{tr}(\tilde{h}|M_n) q^n$ is uniquely defined by $(g, h, M)$, up to a non-zero scalar.
In the case of $V^\natural$, there is a unique simple $g$-twisted module $V^\natural_g = \bigoplus_n (V^\natural_g)_n$ for every $g \in \mathbb{M} = \operatorname{Aut}(V^\natural)$ (cf. Theorem 1.2 of [73]), and $V^\natural_g$ is necessarily $h$-stable for any $h \in \mathbb{M}$ that commutes with $g$. Therefore, Theorem 4.1 suggests that the functions

$$T_{(g,\tilde{h})}(\tau) := \sum_n \operatorname{tr}(\tilde{h}|(V^\natural_g)_n)\, q^n, \qquad (4.1)$$

associated to pairs $(g, h)$ of commuting elements of $\mathbb{M}$, may be of interest.
Indeed, this was anticipated a decade earlier by Norton, following observations of Conway-Norton [56] and Queen [188], which associated principal moduli to elements of groups that appear as centralizers of cyclic subgroups in the monster. Norton formulated his generalized moonshine conjectures in [180] (cf. also [182], and the Appendix to [171]).

Conjecture 4.2 (Generalized Moonshine: Norton). There is an assignment of holomorphic functions $T_{(g,\tilde{h})} : \mathbb{H} \to \mathbb{C}$ to every pair $(g, h)$ of commuting elements in the monster, such that the following are true:

(2) For every $\gamma \in SL_2(\mathbb{Z})$ we have that $T_{(g,\tilde{h})\gamma}(\tau)$ is a scalar multiple of $T_{(g,\tilde{h})}(\gamma\tau)$.

(3) The coefficient functions $\tilde{h} \mapsto \operatorname{tr}(\tilde{h}|(V^\natural_g)_n)$, for fixed $g$ and $n$, define characters of a projective representation of the centralizer of $g$ in $\mathbb{M}$.

(4) We have that $T_{(g,\tilde{h})}$ is either constant or a generator for the function field of a genus zero group $\Gamma_{(g,h)} < SL_2(\mathbb{R})$.

Remark. In Conjecture 4.2 (2) above, the right-action of $SL_2(\mathbb{Z})$ on commuting pairs of elements of the monster is given by

$$(g, h)\gamma := (g^a h^c, g^b h^d) \quad \text{for} \quad \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

The (slightly ambiguous) $T_{(g,\tilde{h})\gamma}$ denotes the graded trace of a lift of $g^b h^d$ to $GL(V^\natural_{g^a h^c})$. Norton's generalized moonshine conjectures reduce to the original Conway-Norton moonshine conjectures of [56] when $g = e$.
Conjecture 4.2 is yet to be proven in full, but has been established for a number of special cases. Theorem 4.1 was used by Dong-Li-Mason in [74], following an observation of Tuite (cf. [72], and [213,214,215] for broader context), to prove Norton's conjecture for the case that $g$ and $h$ generate a cyclic subgroup of $\mathbb{M}$, and this approach, via twisted modules for $V^\natural$, has been extended by Ivanov-Tuite in [131,132]. Höhn obtained a generalization of Borcherds' method by using a particular twisted module for $V^\natural$ to construct a BKM algebra adapted to the case that $g$ is in the class named 2A in [54] (the smaller of the two conjugacy classes of involutions in $\mathbb{M}$), and in so doing established [125] generalized moonshine for the functions $T_{(g,\tilde{h})}$ with $g \in 2A$. So far the most general results in generalized moonshine have been obtained by Carnahan [33,34,35]. (See [36] for a recent summary.)

Theorem 4.1 explains why the McKay-Thompson series $T_g(\tau)$ of (3.2), and the $T_{(g,\tilde{h})}(\tau)$ of (4.1) more generally, should be invariant under the actions of (finite index) subgroups of $SL_2(\mathbb{Z})$, but it does not explain the surprising predictive power of monstrous moonshine. That is, it does not explain why the full invariance groups $\Gamma_g$ of the $T_g$ should be so large that they admit normalized principal moduli, nor does it explain why the $T_g$ should actually be these normalized principal moduli.
A program to establish a conceptual foundation for the principal modulus property of monstrous moonshine, via Rademacher sums and three-dimensional gravity, was initiated in [77] by the first author and Frenkel.

Rademacher Sums
To explain the conjectural connection between gravity and moonshine, we first recall some history. The roots of the approach of [77] extend back almost a hundred years, to Einstein's theory of general relativity, formulated in 1915, and the introduction of the circle method in analytic number theory, by Hardy-Ramanujan [121]. At the same time that pre-war efforts to quantize Einstein's theory of gravity were gaining steam (see [203] for a review), the circle method was being refined and developed, by Hardy-Littlewood (cf. [120]), and Rademacher [189], among others. (See [216] for a detailed account of what is now known as the Hardy-Littlewood circle method.) Despite being contemporaneous, these works were unrelated in science until this century: as we will explain presently, Rademacher's analysis led to a Poincaré series-like expression-the prototypical Rademacher sum-for the elliptic modular invariant J(τ ). It was suggested first in [67] (see also [170]) that this kind of expression might be useful for the computation of partition functions in quantum gravity.
Rademacher "perfected" the circle method introduced by Hardy-Ramanujan, and he obtained an exact convergent series expression for the combinatorial partition function $p(n)$. In 1938 he generalized this work [190] and obtained such exact formulas for the Fourier coefficients of general modular functions. For the elliptic modular invariant $J(\tau) = \sum_n c(n) q^n$ (cf. (2.6)), Rademacher's formula (which was obtained earlier by Petersson [187], via a different method) may be written as

$$c(n) = 4\pi^2 \lim_{K\to\infty} \sum_{\substack{0<c<K\\ 0\leq a<c\\ (a,c)=1}} \frac{e^{-2\pi i\frac{a}{c}}\, e^{2\pi i n\frac{d}{c}}}{c^2} \sum_{k\geq 0} \frac{(4\pi^2)^k}{c^{2k}}\, \frac{n^k}{k!\,(k+1)!}, \qquad (5.1)$$

where $d$, in each summand, is a multiplicative inverse for $a$ modulo $c$, and $(a, c)$ is the greatest common divisor of $a$ and $c$. Having established the formula (5.1), Rademacher sought to reverse the process, and use it to derive the modular invariance of $J(\tau)$. That is, he set out to prove directly that $J_0(\tau + 1) = J_0(-1/\tau) = J_0(\tau)$, when $J_0(\tau)$ is defined by setting $J_0(\tau) = q^{-1} + \sum_{n>0} c(n) q^n$, with $c(n)$ defined by (5.1).
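A formula of this type can be explored numerically. The sketch below assumes the standard Kloosterman-Bessel form of the Rademacher-Petersson series, $c(n) = 4\pi^2 \sum_{c>0} c^{-2} \sum_{a} e^{2\pi i (nd - a)/c} \sum_{k\geq 0} (4\pi^2 n/c^2)^k/(k!(k+1)!)$, with $a$ running over residues coprime to $c$ and $d$ an inverse of $a$ modulo $c$. This may differ in notation from (5.1). The function name and the truncation parameters `c_max` and `k_max` are hypothetical, and the modular inverse via `pow(a, -1, c)` needs Python 3.8 or later.

```python
import math


def rademacher_coefficient(n, c_max=400, k_max=80):
    """Truncated Rademacher-Petersson series for the coefficient c(n) of J.

    Only real parts are summed: the terms for a and -a are complex
    conjugates, so the imaginary parts cancel within each inner sum.
    """
    total = 0.0
    for c in range(1, c_max):
        # Inner sum over a modulo c with gcd(a, c) = 1.
        ksum = 0.0
        for a in range(c):
            if math.gcd(a, c) != 1:
                continue
            d = pow(a, -1, c) if c > 1 else 0  # inverse of a modulo c
            ksum += math.cos(2 * math.pi * (n * d - a) / c)
        # Bessel-type series sum_k x^k / (k! (k+1)!) with x = 4 pi^2 n / c^2.
        x = 4 * math.pi ** 2 * n / c ** 2
        term, bessel = 1.0, 0.0
        for k in range(k_max):
            bessel += term
            term *= x / ((k + 1) * (k + 2))
        total += ksum * bessel / c ** 2
    return 4 * math.pi ** 2 * total


if __name__ == "__main__":
    print(rademacher_coefficient(1))  # should approach 196884 as c_max grows
```

The term with $c = 1$ already accounts for almost all of $c(1) = 196884$. The remaining terms converge slowly, so the truncated value is only an approximation.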
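Rademacher achieved this goal in [191], by reorganizing the summation into a Poincaré series-like expression for $J$. More precisely, Rademacher proved that

$$J(\tau) + 12 = e^{-2\pi i\tau} + \lim_{K\to\infty} \sum_{\substack{0<c<K\\ -K^2<d<K^2\\ (c,d)=1}} \left( e^{-2\pi i\frac{a\tau+b}{c\tau+d}} - e^{-2\pi i\frac{a}{c}} \right), \qquad (5.3)$$

where $a, b \in \mathbb{Z}$ are chosen arbitrarily, in each summand, so that $ad - bc = 1$. We call the right hand side of (5.3) the first Rademacher sum.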
Rademacher's expression (5.3) for the elliptic modular invariant $J$ is to be compared to the formal sum

$$\sum_{\substack{c,d\in\mathbb{Z}\\ (c,d)=1}} e^{-2\pi i m\frac{a\tau+b}{c\tau+d}}, \qquad (5.4)$$

for $m$ a positive integer, which we may regard as a (formal) Poincaré series of weight zero for $SL_2(\mathbb{Z})$. In particular, (5.4) is (formally) invariant for the action of $SL_2(\mathbb{Z})$, as we see by recognizing the matrices $\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)$ as representatives for the right coset space $\Gamma_\infty\backslash\Gamma$, where $\Gamma = SL_2(\mathbb{Z})$ and $\Gamma_\infty$ is defined in (3.3): for a fixed bottom row $(c, d)$ of matrices in $SL_2(\mathbb{Z})$, any two choices for the top row $(a, b)$ are related by left-multiplication by some element of $\Gamma_\infty$.
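The coset statement can be made concrete. In the sketch below, which is an illustration only, a top row $(a, b)$ is found for a coprime bottom row $(c, d)$ by the extended Euclidean algorithm, a second top row is produced by left-multiplying with an upper-triangular matrix from $\Gamma_\infty$, and the summand $e^{-2\pi i m(a\tau+b)/(c\tau+d)}$ is checked to be unchanged. All function names are hypothetical.

```python
import cmath


def ext_gcd(x, y):
    """Return (g, s, t) with s*x + t*y = g = gcd(x, y)."""
    if y == 0:
        return (abs(x), 1 if x >= 0 else -1, 0)
    g, s, t = ext_gcd(y, x % y)
    return (g, t, s - (x // y) * t)


def top_row(c, d):
    """One pair (a, b) with a*d - b*c = 1, for coprime (c, d)."""
    g, s, t = ext_gcd(d, c)
    if g != 1:
        raise ValueError("bottom row must be coprime")
    return s, -t


def translate(n, row, c, d):
    """Top row of [[1, n], [0, 1]] times [[a, b], [c, d]]."""
    a, b = row
    return a + n * c, b + n * d


def translation_between(row1, row2, c, d):
    """Integer n with row2 = translate(n, row1, c, d)."""
    da, db = row2[0] - row1[0], row2[1] - row1[1]
    n = da // c if c != 0 else db // d
    if (da, db) != (n * c, n * d):
        raise ValueError("rows do not share the bottom row (c, d)")
    return n


def summand(m, a, b, c, d, tau):
    """The term exp(-2 pi i m (a tau + b)/(c tau + d)) of the formal sum."""
    return cmath.exp(-2j * cmath.pi * m * (a * tau + b) / (c * tau + d))


if __name__ == "__main__":
    c, d, tau = 5, 7, complex(0.3, 1.2)
    row = top_row(c, d)
    other = translate(3, row, c, d)
    print(row, other, translation_between(row, other, c, d))
    print(abs(summand(1, *row, c, d, tau) - summand(1, *other, c, d, tau)))
```

Changing the top row shifts $(a\tau+b)/(c\tau+d)$ by the integer $n$, which is why each summand depends only on the bottom row.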
The formal sum (5.4) does not converge for any $\tau \in \mathbb{H}$, so a regularization procedure is required. Rademacher's sum (5.3) achieves this, for $m = 1$, by constraining the order of summation, and subtracting the limit as $\Im(\tau) \to \infty$ of each summand $e^{-2\pi i\frac{a\tau+b}{c\tau+d}}$, whenever this limit makes sense. Rademacher's method has by now been generalized in various ways by a number of authors. The earliest generalizations are due to Knopp [147,148,149,150], and a very general negative weight version of the Rademacher construction was given by Niebur in [178]. We refer to [44] for a detailed review and further references. A nice account of the original approach of Rademacher appears in [146].
We note here that one of the main difficulties in establishing formulas like (5.3) is the demonstration of convergence. When the weight w of the Rademacher sum under consideration lies in the range 0 ≤ w ≤ 2, then one requires non-trivial estimates on sums of Kloosterman sums, like (for the case that w = 0 or w = 2). The demonstration of convergence generally becomes more delicate as w approaches 1.
In [77] the convergence of a weight zero Rademacher sum R (−m) Γ (τ ) is shown, for m a positive integer and Γ an arbitrary subgroup of SL 2 (R) that is commensurable with SL 2 (Z). Assuming that Γ contains −I and has width one at infinity (cf. (3.3)), we have where the summation, for fixed K, is over non-trivial right cosets of Γ ∞ in Γ (cf. (3.3)), having representatives ( a b c d ) such that 0 < c < K and |d| < K 2 . The modular properties of the R (−m) Γ are also considered in [77], and it is at this point that the significance of Rademacher sums in monstrous moonshine appears. To state the relevant result we give the natural generalization (cf. §3.2 of [77]) of the Rademacher-Petersson formula (5.1) for c(n), which is where the summation, for fixed K, is over non-trivial double cosets of Γ ∞ in Γ (cf. (3.3)), having representatives ( a b c d ) such that 0 < c < K. Note that this formula simplifies for n = 0, to for any group Γ < SL 2 (R) that is commensurable with SL 2 (Z). If Γ has width one at infinity (cf. (3.3)), then also The following theorem by the first author and Frenkel summarizes the central role of Rademacher sums and the principal modulus property.
In the case that the normalized Rademacher sum T is an abelian integral of the second kind for $\Gamma$, in the sense that it has at most exponential growth at the cusps of $\Gamma$, and satisfies T

Theorem 5.1 is used as a basis for the formulation of a characterization of the discrete groups $\Gamma_g$ of monstrous moonshine in terms of Rademacher sums in §6.5 of [77], following earlier work [55] of Conway-McKay-Sebbar. It also facilitates a proof of the following result, which constitutes a uniform construction of the McKay-Thompson series of monstrous moonshine.

Theorem. For every $g \in \mathbb{M}$ we have $T_g = T^{(-1)}_{\Gamma_g}$.

Proof. Theorem 3.6 states that $T_g$ is a normalized principal modulus for $\Gamma_g$, and in particular, all the $\Gamma_g$ have genus zero. Given this, it follows from Theorem 5.1 that $T^{(-1)}_{\Gamma_g}$ is also a normalized principal modulus for $\Gamma_g$. A normalized principal modulus is unique if it exists, so we conclude $T_g = T^{(-1)}_{\Gamma_g}$ for all $g \in \mathbb{M}$, as we required to show.
Perhaps most importantly, Theorem 5.1 is an indication of how the principal modulus property of monstrous moonshine can be explained conceptually. For if we can develop a mathematical theory in which the underlying objects are graded with graded traces that are provably (1) modular invariant, for subgroups of SL 2 (R) that are commensurable with SL 2 (Z), and (2) given explicitly by Rademacher sums, such as (5.6), then these graded trace functions are necessarily normalized principal moduli, according to Theorem 5.1.
We are now led to ask: what kind of mathematical theory can support such results? As we have alluded to above, Rademacher sums have been related to quantum gravity by articles in the physics literature. Also, a possible connection between the monster and three-dimensional quantum gravity was discussed in [222]. This suggests the possibility that three-dimensional quantum gravity and moonshine are related via Rademacher sums, and was a strong motivation for the work [77]. In the next section we will give a brief review of quantum gravity, since it is an important area of physical inquiry which has played a role in the development of moonshine, but we must first warn the reader: problems have been identified with the existing conjectures that relate the monster to gravity, and the current status of this connection is uncertain.

Quantum Gravity
Witten was the first to predict a role for the monster in quantum gravity. In [222] Witten considered pure quantum gravity in three dimensions with negative cosmological constant, and presented evidence that the moonshine module V ♮ is a chiral half of the conformal field theory dual to such a quantum gravity theory, at the most negative possible value of the cosmological constant.
To explain some of the content of this statement, note that the action in pure three-dimensional quantum gravity is given explicitly by

$$I_{EG} := \frac{1}{16\pi G} \int d^3x\, \sqrt{-g}\, (R - 2\Lambda), \qquad (6.1)$$

where $G$ is the Newton or gravitational constant, $R$ denotes the Ricci scalar, and the cosmological constant is the scalar denoted by $\Lambda$.
The case that the cosmological constant $\Lambda$ is negative is distinguished, since then there exist black hole solutions to the action (6.1), as was discovered by Bañados-Teitelboim-Zanelli [8]. These black hole solutions, the BTZ black holes, are locally isomorphic to three-dimensional anti-de Sitter space [7], which is a Lorentzian analogue of hyperbolic space, and may be realized explicitly as the universal cover of a hyperboloid

$$-X_{-1}^2 - X_0^2 + X_1^2 + X_2^2 = -\ell^2 \qquad (6.2)$$

in $\mathbb{R}^{2,2}$ (cf. e.g. [164]). The parameter $\ell$ in (6.2) is called the radius of curvature. For a locally anti-de Sitter (AdS) solution to (6.1), the radius of curvature is determined by the cosmological constant, according to

$$\ell^2 = -\frac{1}{\Lambda}. \qquad (6.3)$$

In what has become the most cited paper in the history of high energy physics, Maldacena opened the door on a new and powerful approach to quantum gravity in [165], by presenting evidence for a gauge/gravity duality, in which gauge theories serve as duals to gravity theories in one dimension higher. (See [164] for a recent review.) In the simplest examples, the gauge theories are conformal field theories, and the gravity theories involve locally AdS spacetimes. The gauge/gravity duality for these cases is now known as the AdS/CFT correspondence.
Maldacena's duality furnishes a concrete realization of the holographic principle, introduced by 't Hooft [206], and elaborated on by Susskind [205]. Following refinements to Maldacena's proposal due to Gubser-Klebanov-Polyakov [117] and Witten [221], it is expected that gravity theories with (d + 1)-dimensional locally AdS spacetimes can be understood through the analysis of d-dimensional conformal field theories defined on the boundaries of these AdS spaces. Thus in the case of AdS solutions to three-dimensional quantum gravity, a governing role may be played by two-dimensional conformal field theories, which can be accessed mathematically via vertex operator algebras (as we have mentioned in §3).
The conjecture of [222] is that the two-dimensional conformal field theory corresponding to a tensor product of two copies of the moonshine module V♮ (one "left-moving," the other "right-moving") is the holographic dual to pure three-dimensional quantum gravity with ℓ = 16G, and therefore
$$ \Lambda=-\frac{1}{256\,G^{2}}. \qquad (6.4) $$
It is also argued that the only physically consistent values of ℓ are ℓ = 16Gm, for m a positive integer, so that (6.4) is the most negative possible value for Λ, by force of (6.3).
Shortly after this conjecture was formulated, problems with the quantum mechanical interpretation were identified by Maloney-Witten in [168]. Moreover, Gaiotto [107] and Höhn [127] cast doubt on the relevance of the monster to gravity by demonstrating that it cannot act on a holographically dual conformal field theory corresponding to ℓ = 32G (i.e. m = 2), at least under the hypotheses (namely, an extremality condition, and holomorphic factorization) presented in [222].
Interestingly, the physical problems with the analysis of [222] seem to disappear in the context of chiral three-dimensional gravity, which was introduced and discussed in detail by Li-Song-Strominger in [162] (cf. also [167,204]). This is the gravity theory which motivates much of the discussion in §7 of [77].
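In order to define chiral three-dimensional gravity, we first describe topologically massive gravity, which was introduced in 1982 by Deser-Jackiw-Templeton [65,66]. (See also [64].) The action for topologically massive gravity is given by
$$ I_{TMG} := I_{EG}+\frac{1}{\mu}I_{CSG}, \qquad (6.5) $$
where I_EG is the Einstein-Hilbert action (cf. (6.1)) of pure quantum gravity, and I_CSG denotes the gravitational Chern-Simons term
$$ I_{CSG} := \frac{1}{32\pi G}\int d^{3}x\,\sqrt{-g}\,\epsilon^{\lambda\mu\nu}\,\Gamma^{\rho}_{\lambda\sigma}\left(\partial_{\mu}\Gamma^{\sigma}_{\rho\nu}+\frac{2}{3}\Gamma^{\sigma}_{\mu\tau}\Gamma^{\tau}_{\nu\rho}\right). \qquad (6.6) $$
The Γ^{*}_{**} are Christoffel symbols, and the parameter µ is called the Chern-Simons coupling constant.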
Chiral three-dimensional gravity is the special case of topologically massive gravity in which the Chern-Simons coupling constant is set to µ = 1/ℓ = √−Λ. It is shown in [162] that at this special value of µ, the left-moving central charges of the boundary conformal field theories vanish, and the right-moving central charges are
$$ c=\frac{3\ell}{2G}=24m \qquad (6.7) $$
for m a positive integer, ℓ = 16Gm.
Much of the analysis of [222] still applies in this setting, and the natural analogue of the conjecture mentioned above states that V♮ is holographically dual to chiral three-dimensional quantum gravity at ℓ = 16G, i.e. m = 1. However, as argued in detail in [167], the problem of quantizing chiral three-dimensional gravity may be regarded as equivalent to the problem of constructing a sequence of extremal chiral two-dimensional conformal field theories (i.e. vertex operator algebras), one for each central charge c = 24m, for m a positive integer. Here, a vertex operator algebra V = ⊕_n V_n with central charge c = 24m is called extremal, if its graded⁵ dimension function satisfies
$$ \sum_{n}\dim(V_{n})\,q^{n}=q^{-m}\prod_{n>1}\frac{1}{1-q^{n}}+O(q). \qquad (6.8) $$
The moonshine module is the natural candidate for m = 1 (indeed, it is the only candidate if we assume the uniqueness conjecture of [97]), as the right hand side of (6.8) reduces to q^{−1} + O(q) in this case, but the analysis of [107,127] also applies here, indicating that the monster cannot act non-trivially on any candidate⁶ for m = 2. Thus the role of the monster in quantum gravity is still unclear, even in the more physically promising chiral gravity setting.
Nonetheless, the moonshine module V ♮ may still serve as the holographic dual to chiral three-dimensional quantum gravity at ℓ = 16G, m = 1. In this interpretation, the graded dimension, or genus one partition function for V ♮ -namely, the elliptic modular invariant J-serves as the exact spectrum of physical states of chiral three-dimensional gravity at µ = √ −Λ = 1/16G, in spacetime asymptotic to the three-dimensional anti-de Sitter space (cf. (6.2)).
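Recall that if V is a representation of the Virasoro algebra 𝒱 (cf. (3.5)), then a vector v ∈ V is called a Virasoro highest weight vector if it is an eigenvector for L(0) and L(k)v = 0 for k > 0, and a Virasoro descendant of v is a vector of the form L(m_1) · · · L(m_k)v, where v is a Virasoro highest weight vector, and m_1 ≤ · · · ≤ m_k ≤ −1.
⁵ We regard all vertex operator algebras as graded by L(0) − c/24. Cf. (3.5).
⁶ The existence of extremal vertex operator algebras with central charge c = 24m for m > 1 remains an open question. We refer to [103,108,127,224] for analyses of this problem.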
Assuming that V♮ is dual to chiral three-dimensional gravity at m = 1, the Virasoro highest weight vectors in V♮ define operators that create black holes, and the Virasoro descendants of a highest weight vector describe black holes embellished by boundary excitations. In particular, the 196883-dimensional representation of the monster which is contained in the 196884-dimensional homogeneous subspace V♮_1 < V♮ (cf. (2.4) and (3.1)) represents a 196883-dimensional space of black hole states in the chiral gravity theory.
More generally, the black hole states in the theory are classified, by the monster, into 194 different kinds, according to which monster irreducible representation they belong to.
Question 6.1. Assuming that the moonshine module V♮ serves as the holographic dual to chiral three-dimensional quantum gravity at m = 1, how are the 194 different kinds of black hole states distributed amongst the homogeneous subspaces V♮_n < V♮? Are some kinds of black holes more or less common than others?
This question will be answered precisely in §8. A positive solution to the conjecture that V♮ is dual to chiral three-dimensional gravity at m = 1 may furnish a conceptual explanation for why the graded dimension of V♮ is the normalized principal modulus for SL_2(Z). For on the one hand, modular invariance is a consistency requirement of the physical theory (the genus one partition function is really defined on the moduli space SL_2(Z)\H of genus one curves, rather than on H), and on the other hand, the genus one partition function of chiral three-dimensional gravity is given by a Rademacher sum, as explained by Manschot-Moore [170], following earlier work [67] by Dijkgraaf-Maldacena-Moore-Verlinde. (Cf. also [167,168,169].) So, as we discussed in §5, the genus one partition function must be the normalized principal modulus J(τ) for SL_2(Z), according to Theorem 5.1.
In the analysis of [169,170], the genus one partition function of chiral three-dimensional gravity is a Rademacher sum (5.6), because it is obtained as a sum over three-dimensional hyperbolic structures on a solid torus with genus one boundary, and such structures are naturally parameterized by the coset space Γ_∞\SL_2(Z) (cf. (3.3)), as explained in [166] (see also §5.1 of [67]). The terms e^{−2πim(aτ+b)/(cτ+d)} in (5.6) are obtained by evaluating e^{−I_TMG}, with µ = √−Λ = 1/16Gm, on a solution with boundary curve C/(Z + τZ), and the subtraction of e^{−2πim(a/c)} represents quantum corrections to the classical action.
In [77], the above conjecture is extended so as to encompass the principal modulus property for all elements of the monster, with a view to establishing a conceptual foundation for monstrous moonshine. More specifically, the first main conjecture of [77] states the following.
From the point of view of vertex operator algebra theory, T_g(−1/τ), which coincides with T^{(−1)}_{Γ_g}(−1/τ) according to Theorems 3.6 and 5.1, is the graded dimension of the unique simple g-twisted V♮-module V♮_g (cf. §4). This non-trivial fact about the functions
Geometrically, the twists of the above conjecture are defined by imposing (generalized) spin structure conditions on solutions to the chiral gravity equations, and allowing orbifold solutions of certain kinds. See §7.1 of [77] for a more complete discussion. The corresponding sums over geometries are then indexed by coset spaces Γ_∞\Γ, for various groups Γ < SL_2(R), commensurable with SL_2(Z). According to Theorem 5.1, the genus one partition function corresponding to such a twist, expected to be a Rademacher sum on physical grounds, will only satisfy the basic physical consistency condition of Γ-invariance if Γ is a genus zero group. One may speculate that a finer analysis of physical consistency will lead to the list of conditions given in §6.5 of [77], which characterize the groups Γ_g for g ∈ M, according to Theorem 6.5.1 of [77]. Thus the discrete groups Γ_g of monstrous moonshine may ultimately be recovered as those defining physically consistent twists of chiral three-dimensional gravity.
On the other hand, it is reasonable to expect that twisted chiral gravity theories are determined by symmetries of the underlying untwisted theory. Conceptually then, but still conjecturally, the monster group appears as the symmetry group of chiral three-dimensional gravity, for which the corresponding twists exist. The principal modulus property of monstrous moonshine may then be explained as a consequence of Theorem 5.1, together with the statement that the genus one partition function of a twisted theory is T^{(−1)}_Γ, the normalized Rademacher sum attached to the subgroup Γ < SL_2(R) that parameterizes the geometries of the twist.
For more background on the mathematics and physics of black holes we refer the reader to [116]. We refer to [31,32] for reviews that focus on the particular role of conformal field theory in understanding quantum gravity.

Moonshine Tower
An optimistic view on the relationship between moonshine and gravity is adopted in §7 of [77]. In particular, in §7.2 of [77] the consequences of Conjecture 6.2 for the second quantization of chiral three-dimensional gravity are explored. (We warn the reader that the notion of second quantized gravity is very speculative at this stage.) Motivated in part by the results on second quantized string theory in [68], the existence of a tower of monster modules V^{(−m)}, parameterized by positive integer values of m, is predicted in §7.2 of [77]. Moreover, it is suggested that the graded dimension of V^{(−m)} should be given by
$$ m\,\hat{T}(m)J(\tau), $$
where $\hat{T}(m)$ denotes the (order m) Hecke operator, acting on SL_2(Z)-invariant holomorphic functions on H according to the rule
$$ (\hat{T}(m)f)(\tau) := \frac{1}{m}\sum_{\substack{ad=m\\ 0\le b<d}} f\left(\frac{a\tau+b}{d}\right). $$
Standard calculations (cf. e.g. Chp. VII, §5 of [199]) determine that $m\hat{T}(m)J$ is an SL_2(Z)-invariant holomorphic function on H, whose Fourier coefficients are
$$ \sum_{k\mid(m,n)}\frac{m}{k}\,c\left(\frac{mn}{k^{2}}\right) $$
for n > 0, where c(n) denotes the coefficient of q^n in J(τ), and (m, n) denotes the greatest common divisor of m and n. In particular, $m\hat{T}(m)J(\tau) = q^{-m} + O(q)$ as ℑ(τ) → ∞. There is only one such SL_2(Z)-invariant holomorphic function on H, so $m\hat{T}(m)J$ coincides with the normalized Rademacher sum T^{(−m)}_Γ, according to (5.10) and Theorem 5.1, when Γ = SL_2(Z). So the graded dimension of V^{(−m)} is also a normalized Rademacher sum.
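The coefficient rule for the Hecke image of J can be checked on small cases. The following is a minimal sketch, not taken from [77]: it assumes the normalization J(τ) = q^{−1} + Σ_{n>0} c(n)q^n with the first few c(n) hard-coded, and it reads the rule as "the coefficient of q^n in mT(m)J is the sum over k | (m, n) of (m/k)·c(mn/k²)". The function names are illustrative only. As a consistency check, the case m = 2 is compared with the q-expansion of J², since J² − 2c(1) is also SL_2(Z)-invariant with expansion q^{−2} + O(q).

```python
from math import gcd

# Coefficients c(n) of J(tau) = q^-1 + sum_{n>0} c(n) q^n (hard-coded; c(0) = 0).
J = {-1: 1, 0: 0, 1: 196884, 2: 21493760, 3: 864299970, 4: 20245856256}


def hecke_coefficient(m: int, n: int) -> int:
    """Coefficient of q^n (n > 0) in m*T(m)J: sum over k | gcd(m, n) of (m/k) * c(m*n/k^2)."""
    g = gcd(m, n)
    return sum((m // k) * J[m * n // (k * k)] for k in range(1, g + 1) if g % k == 0)


def j_squared_coefficient(n: int) -> int:
    """Coefficient of q^n in J(tau)^2, obtained by multiplying the q-series directly."""
    return sum(J[a] * J[n - a] for a in range(-1, n + 2))


if __name__ == "__main__":
    # For n > 0 the two columns should agree, because 2*T(2)J = J^2 - 2*c(1).
    for n in (1, 2):
        print(n, hecke_coefficient(2, n), j_squared_coefficient(n))
```

Only finitely many coefficients are stored, so the sketch is limited to mn ≤ 4; extending the table `J` extends the range.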
We would like to investigate the higher order analogues of the McKay-Thompson series T_g (cf. (3.2)), encoding the graded traces of monster elements on V^{(−m)}, but for this we must first determine the M-module structure on each homogeneous subspace V^{(−m)}_n. A solution to this problem is entailed in Borcherds' proof [16] of the monstrous moonshine conjectures, and the identity (3.15), in particular. To explain this, recall the Adams operation ψ^k on virtual G-modules, defined, for k ≥ 0 and G a finite group, by requiring that
$$ \operatorname{tr}(g|\psi^{k}(V)) = \operatorname{tr}(g^{k}|V) \qquad (7.7) $$
for g ∈ G. (Cf. [6,151] for more details on Adams operations.) Using the ψ^k we may equip V^{(−m)} with a virtual M-module structure (we will see momentarily that it is actually an M-module) by defining
$$ V^{(-m)}_{n} := \bigoplus_{k\mid(m,n)}\mathbb{C}^{m/k}\otimes\psi^{k}\left(V^{\natural}_{mn/k^{2}}\right) \qquad (7.8) $$
for n > 0, where C^{m/k} denotes the trivial M-module of dimension m/k. For convenience later on, we also define V (0) = V
Proof. We will use Borcherds' identity (3.15). To begin, note that T^{(−m)}_g is given explicitly in terms of traces on V♮ by
$$ T^{(-m)}_{g}(\tau) = q^{-m}+\sum_{n>0}\sum_{k\mid(m,n)}\frac{m}{k}\operatorname{tr}\left(g^{k}\,|\,V^{\natural}_{mn/k^{2}}\right)q^{n} $$
according to (7.8) and (7.9). Recall that R(G) denotes the integral representation ring of a finite group G.
Then it is a general property of the Adams operations (cf. §5.2 of [136]) that for the logarithm of the left hand side of (3.15).
, then the generating series Σ_{m>0} p^m V^{(−m)}(q) is obtained when we apply −p∂_p to (7.12), according to the definition (7.8) of the V^{(−m)}_n as elements of R(M). So apply −p∂_p log( · ) to both sides of (3.15) to obtain the identity (7.13). The right hand side of (7.13) really is a Taylor series in p, for we use V♮(p)^{−1} as a short hand for , and V♮(q) with T_g(τ), etc. and shows that T
Remark. The identity obtained by taking the trace of g ∈ M on (7.13) may be compactly rewritten as (7.14), where p = e^{2πiσ} and T_g(σ) = Σ_m tr(g|V♮_m) p^m. This expression (7.14) is proven for some special cases by a different method in [11].
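The Adams operations used in the argument above can produce genuinely virtual modules, which is why it needs to be proved that the V^{(−m)}_n are honest M-modules. The following toy sketch illustrates the defining rule tr(g|ψ^k(V)) = tr(g^k|V) for the symmetric group S_3 in place of the monster (an assumption made purely for illustration: the character table and the squaring map on conjugacy classes of S_3 are hard-coded, and the function names are hypothetical). It computes ψ² of the 2-dimensional irreducible character and decomposes the result by orthogonality of characters.

```python
from fractions import Fraction

# Conjugacy classes of S_3, in order: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
GROUP_ORDER = 6
# SQUARE_MAP[c] is the class containing g^2 when g lies in class c.
SQUARE_MAP = [0, 0, 2]
# Irreducible characters of S_3 (all real-valued, so no complex conjugation is needed).
CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def adams_square(chi):
    """Character of psi^2(V): g -> tr(g^2 | V)."""
    return [chi[SQUARE_MAP[c]] for c in range(len(chi))]


def multiplicity(class_function, chi):
    """Inner product <class_function, chi> with respect to the group average."""
    total = sum(s * a * b for s, a, b in zip(CLASS_SIZES, class_function, chi))
    return Fraction(total, GROUP_ORDER)


def decompose(class_function):
    """Multiplicities of each irreducible character in a (virtual) character."""
    return {name: multiplicity(class_function, chi) for name, chi in CHARACTERS.items()}


if __name__ == "__main__":
    print(decompose(adams_square(CHARACTERS["standard"])))
```

In this toy case the sign character occurs with multiplicity −1, so ψ² of the standard representation is virtual but not an actual module.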
Recall that the monster group has 194 irreducible ordinary representations, up to equivalence. Let us denote these by M i , for 1 ≤ i ≤ 194, where the ordering is as in [54], so that the character of M i is the function denoted χ i in [54]. Proof. The claim follows from the modification of Borcherds' proof of Theorem 3.6 presented by Jurisich-Lepowsky-Wilson in [136]. In [136] a certain free Lie sub algebra u − of the monster Lie algebra m is identified, for which the identity Λ(u − ) = H(u − ) (or rather, the logarithm of this) yields m,n>0 k|(m,n) The coefficient of p m q n in the right hand side of (7.17) is evidently a non-negative integer combination of the M-modules V ♮ n , so the proof of the claim is complete.
In §8 we will determine the behavior of the multiplicity functions m_i(−m, n) (cf. (7.15)) as n → ∞. For applications to gravity a slightly different statistic is more relevant. Recall from §6 that it is the Virasoro highest weight vectors, i.e. those v ∈ V♮_n with L(k)v = 0 for k > 0, that represent black hole states in chiral three-dimensional gravity at m = 1. Such vectors generate highest weight modules for 𝒱, the structure of which has been determined by Feigin-Fuchs in [91]. (See [5] for an alternative treatment.) Specializing to the case that the central element c (cf. (3.5)) acts as c = 24m times the identity, for some positive integer m, we obtain from the results of [91] that the isomorphism type of an irreducible highest weight module for 𝒱 depends only on the L(0)-eigenvalue of a generating highest weight vector, v, and if L(0)v = hv for h a non-negative integer, then
$$ \sum_{n}\dim(L(h,c)_{n})\,q^{n} = \begin{cases} q^{h-m}(q)_{\infty}^{-1} & \text{if } h>0,\\ q^{-m}(1-q)(q)_{\infty}^{-1} & \text{if } h=0, \end{cases} \qquad (7.18) $$
where L(h, c) denotes the irreducible highest weight 𝒱-module generated by v. We write L(h, c)_n for the subspace of L(h, c) on which L(0) − c/24 acts as multiplication by n in (7.18) (all the Virasoro modules in this work are graded by L(0) − c/24), and
$$ (q)_{\infty} := \prod_{n>0}(1-q^{n}). \qquad (7.19) $$
(See [119] for details of the calculation that returns (7.18) in the case that m = 1.)
Remark. We may now recognize the leading terms in (6.8) as exactly those of the graded dimension of the Virasoro module L(0, 24m).
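The remark above can be made concrete with a short computation. The sketch below assumes that the graded dimension of L(0, 24m) is q^{−m}(1 − q)/(q)_∞ = q^{−m}∏_{n≥2}(1 − q^n)^{−1}, and reads the extremality condition (6.8) as fixing the coefficients of q^{−m}, …, q^0. The function names are illustrative, and the value of c(1) is hard-coded. For m = 2 the prescribed terms are compared with those of J² − 393767.

```python
def vacuum_module_coefficients(order: int) -> list:
    """Coefficients of (1 - q)/(q)_inf = prod_{n>=2} 1/(1 - q^n), up to q^order."""
    coeffs = [1] + [0] * order
    for n in range(2, order + 1):
        # Multiply by 1/(1 - q^n): ascending in-place update.
        for d in range(n, order + 1):
            coeffs[d] += coeffs[d - n]
    return coeffs


def extremal_polar_part(m: int) -> dict:
    """Coefficients of q^-m, ..., q^0 prescribed by the extremality condition for c = 24m."""
    coeffs = vacuum_module_coefficients(m)
    return {k - m: coeffs[k] for k in range(m + 1)}


# Terms of J(tau) needed to expand J^2 up to the constant term.
J = {-1: 1, 0: 0, 1: 196884}
j_squared_polar = {n: sum(J[a] * J[n - a] for a in range(-1, n + 2)) for n in (-2, -1, 0)}
# Shift the constant term so that it matches the extremal value for m = 2.
candidate_m2 = dict(j_squared_polar)
candidate_m2[0] -= 393767

if __name__ == "__main__":
    print(extremal_polar_part(1), extremal_polar_part(2), candidate_m2)
```

The agreement for m = 2 only concerns the graded dimension; it says nothing about the existence of a vertex operator algebra structure.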
It is known that V♮ is a direct sum of highest weight modules for the Virasoro algebra (cf. e.g. [119]). Since the Virasoro and monster actions on V♮ commute, we have an isomorphism
$$ V^{\natural}\simeq\bigoplus_{n\ge -1}L(n+1,24)\otimes W^{\natural}_{n} \qquad (7.20) $$
of modules for 𝒱 × M, where W♮_n denotes the subspace of V♮_n spanned by Virasoro highest weight vectors. To investigate how the black hole states in V♮ are organized by the representation theory of the monster, we define non-negative integers n_i(n) by requiring that
$$ W^{\natural}_{n}\simeq\bigoplus_{i=1}^{194}M_{i}^{\oplus n_{i}(n)} \qquad (7.21) $$
for n ≥ −1.
Evidently n i (n) ≤ m i (−1, n) for all i and n since W ♮ n is a subspace of V ♮ n .
To determine the precise relationship between the n_i(n) and m_i(−1, n), define U_g(τ) for g ∈ M by setting
$$ U_{g}(\tau) := \sum_{n\ge -1}\operatorname{tr}(g|W^{\natural}_{n})\,q^{n} \qquad (7.22) $$
(cf. (7.21)). Combining (7.18), (7.20) and (7.21), together with the definitions (3.2) of T_g and (7.22) of U_g, we obtain
$$ T_{g}(\tau) = (q)_{\infty}^{-1}\left(U_{g}(\tau)-1\right), \qquad (7.23) $$
or equivalently,
$$ U_{g}(\tau) = (q)_{\infty}T_{g}(\tau)+1 \qquad (7.24) $$
for all g ∈ M. (This computation also appears in [119].) In §8 we will use (7.24) to determine the asymptotic behavior of the n_i(n) (cf. Theorem 8.1), and thus the statistics of black hole states, at ℓ = 16G, in the conjectural chiral three-dimensional gravity theory dual to V♮.
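For g = e the relation (7.24) reads U_e = (q)_∞ J + 1, so the dimensions of the spaces W♮_n of Virasoro highest weight vectors can be computed directly from the coefficients of J. The following is a minimal sketch under that reading, with the first coefficients of J hard-coded and illustrative function names.

```python
# Coefficients c(n) of J(tau) = q^-1 + sum_{n>0} c(n) q^n (hard-coded; c(0) = 0).
J = {-1: 1, 0: 0, 1: 196884, 2: 21493760, 3: 864299970, 4: 20245856256}


def euler_product(order: int) -> list:
    """Coefficients of (q)_inf = prod_{n>0} (1 - q^n), up to q^order."""
    coeffs = [1] + [0] * order
    for n in range(1, order + 1):
        # Multiply by (1 - q^n) in place, from high degree to low.
        for d in range(order, n - 1, -1):
            coeffs[d] -= coeffs[d - n]
    return coeffs


def primary_counts(max_n: int) -> dict:
    """Coefficients of U_e = (q)_inf * J + 1 for -1 <= n <= max_n, i.e. dim W_n."""
    a = euler_product(max_n + 1)
    counts = {}
    for n in range(-1, max_n + 1):
        counts[n] = sum(a[k] * J[n - k] for k in range(0, n + 2))
    counts[0] += 1  # the "+ 1" in (7.24)
    return counts


if __name__ == "__main__":
    print(primary_counts(4))
```

In this range each value is the dimension of a single irreducible representation of the monster; for instance the coefficients of q and q² are 196883 and 21296876, the dimensions appearing in (*).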
Remark. Note that we may easily construct modules for V × M satisfying the extremal condition (6.8), for each positive integer m, by considering direct sums of the monster modules V (−m) constructed in Proposition 7.2. A very slight generalization of the argument just given will then yield formulas for the graded traces of monster elements on the corresponding Virasoro highest weight spaces. Since it has been shown [107,127] that such modules cannot admit vertex operator algebra structure, we do not pursue this here.

Monstrous Moonshine's Distributions
We now address the problem of determining exact formulas and asymptotic distributions of irreducible components. This work will rely heavily on the modularity of the underlying McKay-Thompson series (i.e. Theorems 3.3 and 7.1).
We prove formulas for the multiplicities m i (−m, n) and n i (n) which in turn imply the following asymptotics.
These asymptotics immediately imply that the following limits are well-defined .
In particular, we have that We illustrate these asymptotics explicitly, for χ_1, χ_2, and χ_194, in the case that m = 1, in Table 1, where we let δ(m_i(−m, n)) denote the proportion of components corresponding to χ_i in V^{(−m)}_n.
Table 1. The precise values given in the bottom row of Table 1

8.1. The modular groups in monstrous moonshine. To obtain exact formulas, we begin by recalling the modular groups which arise in monstrous moonshine. Suppose Γ_* < GL_2(R) is a discrete group which is commensurable with SL_2(Z). If Γ_* defines a genus zero quotient of H, then the field of modular functions which are invariant under Γ_* is generated by a single element, the principal modulus (cf. (3.4)). Theorem 3.6 implies that the T_g (defined by (3.2)) are principal moduli for certain groups Γ_g. We can describe these groups in terms of groups E_g which in turn may be described in terms of the congruence subgroups where W_e, W_f, etc. are representatives of Atkin-Lehner involutions on Γ_0(N/h). We use the notation W_g := {1, e, f, . . . } to denote this list of Atkin-Lehner involutions contained in E_g. We also note that Γ_0(N|h) + e, f, . . . contains Γ_0(Nh).
The groups E g are eigengroups for the T g , so that if γ ∈ E g , then T g (γτ ) = σ g (γ)T g , where σ g is a multiplicative group homomorphism from E g to the group of h-th roots of unity. Conway and Norton [56] give the following values for σ g evaluated on generators of N|h + e, f, . . . . Lemma 8.3 (Conway-Norton). Assuming the notation above, the following are true: This information is sufficient to properly describe the modularity of the series T (−m) g (τ ) on E g . In section 8.4, we give an explicit procedure for evaluating σ g . The invariance group Γ g , denoted by Γ 0 (N||h) + e, f, . . . (or by the symbol N||h + e, f, . . . ), is defined as the kernel of σ g . A complete list of the groups Γ g can be found in the Appendix ( §A) of this paper, or in table 2 of [56]. is the unique weakly holomorphic modular form of weight zero for Γ g that satisfies T (−m) g = q −m + O(q) as τ approaches the infinite cusp, and has no poles at any cusps inequivalent to the infinite one.

Remark.
A weakly holomorphic modular form is a meromorphic modular form whose poles (if any) are supported at cusps.

Harmonic Maass forms.
Maass-Poincaré series allow us to obtain formulas for weakly holomorphic modular forms and mock modular forms. We begin by briefly recalling the definition of a harmonic Maass form of weight k ∈ (1/2)Z and multiplier ν (a generalization of the notion of a Nebentypus). If τ = x + iy with x and y real, we define the weight k hyperbolic Laplacian by
$$ \Delta_{k} := -y^{2}\left(\frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}\right)+iky\left(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\right). $$
Suppose Γ is a subgroup of finite index in SL_2(Z) and 3/2 ≤ k ∈ (1/2)Z. Then a real analytic function F(τ) is a harmonic Maass form of weight k on Γ with multiplier ν if:
(a) The function F(τ) satisfies the modular transformation with respect to the weight k slash operation, for every matrix γ ∈ Γ, where if k ∈ Z + 1/2, the square root is taken to be the principal branch. In particular, if ν is trivial, then F is invariant under the action of the slash operator.
(b) We have that ∆_k F(τ) = 0.
(c) The singularities of F (if any) are supported at the cusps of Γ, and for each cusp ρ there is a polynomial P_{F,ρ}(q^{−1}) ∈ C[q^{−1/t_ρ}] and a constant c > 0 such that F(τ) − P_{F,ρ}(e^{−2πiτ}) = O(e^{−cy}) as τ → ρ from inside a fundamental domain. Here t_ρ is the width of the cusp ρ.
Remark. The polynomial P F,ρ above is referred to as the principal part of F at ρ. In certain applications, condition (c) of the definition may be relaxed to admit larger classes of harmonic Maass forms. However, for our purposes we will only be interested in those satisfying the given definition, having a holomorphic principal part.
More precisely, we have the following. The Fourier expansion of harmonic Maass forms F at a cusp ρ (see Proposition 3.2 of [29]) splits into two components. As before, we let q := e 2πiτ .
By inspection, we see that weakly holomorphic modular forms are themselves harmonic Maass forms. In fact, under the given definition, all harmonic Maass forms of positive weight are weakly holomorphic. On the other hand, if the weight is non-positive, then the space of harmonic Maass forms may be larger than the space of weakly holomorphic modular forms. However, as with the weakly holomorphic modular forms, a harmonic Maass form is uniquely defined by its principal parts at all cusps. This is clear if the form is weakly holomorphic. If the nonholomorphic part is non-zero, we have the following lemma which follows directly from the work of Bruinier and Funke [29].
Lemma 8.6. If F ∈ H_{2−k}(Γ_0(N)) has the property that F^− ≠ 0, then the principal part of F is nonconstant for at least one cusp.
Sketch of the Proof. Bruinier and Funke define a pairing {•, •} on harmonic weak Maass Forms. The particular quantity {2iy k ∂ ∂τ F , F } can be described in terms of either a Petersson norm, or in terms of products of coefficients of the principal part of F with coefficients of the nonholomorphic part. Since the Petersson norm is non-zero, at least one coefficient of the principal part is also non-zero.
Hence, if F and G are harmonic Maass forms of non-positive weight whose principal parts are equal at all cusps, then F − G is holomorphic and vanishes at cusps, and therefore identically 0.

8.3.
Maass-Poincaré series. The Maass-Poincaré series define a basis for a space of harmonic Maass forms and provide exact formulas for their coefficients. The following construction of the Maass-Poincaré series follows the method and notation of Bringmann and the third author [24] which builds on the early work of Rademacher, followed by more contemporary work of Fay, Niebur, among many others [90,177,178]. The Poincaré series we construct in this section are modular for congruence subgroups Γ 0 (N) which we will then use to construct the McKay-Thompson series. Although we could follow similar methods to construct Poincaré series for the groups Γ g directly, we restrict our attention to these groups since the congruence subgroups Γ 0 (N) are more standard.
For s ∈ C, w ∈ R \ {0}, and k ≥ 3/2, k ∈ 1 2 Z, let where M ν,µ (z) is the M-Whittaker function which is a solution to the differential equation and (here and throughout this paper) τ = x + iy. Using this function, let It is easy to check that φ s (τ ) is an eigenfunction of ∆ 2−k with eigenvalue The right hand side of (8.8) converges absolutely for ℜ(s) > 1, however Bringmann and the third author establish conditional convergence when s ≥ 3/4 [24], giving Theorem 8.7 below. The theorem is stated for the specific case Γ = Γ 0 (N) for some N and k ≥ 3/2, in which case we modify the notation slightly and define (8.9) P L (τ, m, N, 2 − k, ν) := 1 Γ(k) P L (τ, m, Γ 0 (N), 2 − k, k 2 , ν) In the statement of the theorem below, K c is a modified Kloosterman sum given by (8.10) If ν is trivial, we omit it from the notation. We also have that δ L,S (m) is an indicator function for the cusps ρ = L −1 ∞ and µ = S −1 ∞ given by if µ ∼ ρ in Γ 0 (N).
Using this notation, we have the following theorem which gives exact formulas for the coefficients and principal part of P L (τ, m, N, 2−k, ν), which is a generalization of Theorem 3.2 of [24].
Moreover, if n > 0, then a + (n) is given by where I k is the usual I-Bessel function.
(2) If S ∈ SL 2 (Z), then there is some c ∈ C so that the principal part of P L (τ, m, N, 2− k) at the cusp µ = S −1 ∞ is given by Remark. We shall use Theorem 8.7 to prove Theorem 8.1. We note that one could prove Theorem 8.1 without referring to the theory of Maass-Poincaré series. One could make use of ordinary weakly holomorphic Poincaré series. However, we have chosen to use Maass-Poincaré series and the theory of harmonic Maass forms because these results are more general, and because of the recent appearance of harmonic Maass forms in the theory of umbral moonshine (see §9).
Sketch of the proof. Writing ρ = α/γ, Bringmann and the third author prove this theorem for the case that γ | N and (α, N) = 1, along with the assumption that µ and ρ are in a fixed complete set of inequivalent cusps, so that δ_{µ,ρ} = 1 or 0. This general form is useful to us particularly since it works equally well for the cusp ∞ with L taken to be the identity, and for 0 with L taken to be $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.
Here and for the remainder of the paper we let S_N denote any complete set of inequivalent cusps of Γ_0(N), and for each ρ ∈ S_N, we fix some L_ρ with ρ = L_ρ^{−1}∞. Rankin notes [195] (Proof of Theorem 4.1.1(iii)) that given some choice of S_N, each right coset of Γ_0(N)\SL_2(Z) is in Γ_0(N) · L_ρ T^r for some unique ρ ∈ S_N. Moreover, the r in the statement is unique modulo t_ρ, so the function δ_{L,S}(m) given above is well-defined on all matrices in SL_2(Z).
In the proof given by Bringmann and the third author, the sum of Kloosterman sums Since harmonic Maass forms with a nonholomorphic part have a non-constant principal part at some cusp, we have the following theorem.

Exact formulas for T^{(−m)}_g.
Using Theorems 8.7 and 8.8, we can write exact formulas for the coefficients of the T^{(−m)}_g provided we know its principal parts at all cusps of Γ_0(Nh). With this in mind, we now regard T_g as a modular function on Γ_0(Nh) with trivial multiplier. The location and orders of the poles were determined by Harada and Lang [118].
Lemma 8.9. [118, Lemma 7, 9] Suppose that Γ_g is given by the symbol N||h + e, f, . . . , and let L = $\begin{pmatrix} -\delta & \beta \\ \gamma & -\alpha \end{pmatrix}$ ∈ SL_2(Z). Then T_g|_0 L has a pole if and only if (γ/(γ,h), N/h) = N/(eh), for some e ∈ W_g (note that here we allow e = 1). The order of the pole is given by (h,γ)  , then LU is an Atkin-Lehner involution W_e ∈ E_g. Therefore, we have that
Harada and Lang do not compute σ_g(LU), however we will need these values in order to apply Theorem 8.8. Using Lemma 8.3, the following procedure allows us to compute σ_g(M) for any matrix M ∈ E_g.
Given a matrix M ∈ E_g, we may write M as M = $\begin{pmatrix} ae & b/h \\ cN & de \end{pmatrix}$ with e ∈ W_g and ade − bcN/(eh) = 1. We may also factor h as the product of h_e, the largest divisor of h co-prime to e, and a complementary divisor. Since cN/(eh) is co-prime to both d and e, we may choose integers A, B, and C (mod h) such that: Combined with Lemma 8.9, this leads us to the following proposition.
Proposition 8.10. Given a matrix L = $\begin{pmatrix} -\delta & \beta \\ \gamma & -\alpha \end{pmatrix}$ ∈ SL_2(Z), let u and U be chosen as above, and define ǫ_g(L) := σ_g(LU) · e 2πi u·(h,γ) eh 2 . Then by (8.11), we have that Using this notation, we are equipped to find exact formulas for the T^{(−m)}_g.
Theorem 8.11. Let g ∈ M, with Γ_g = N|h + e, f, . . . , and let S_{Nh} and W_g be as above. If m and n are positive integers, then there is a constant c for which
Proof. Every modular function is a harmonic Maass form. Therefore, the idea is to exhibit a linear combination of Maass-Poincaré series with exactly the same principal parts at all cusps as T is a polynomial in T_g, and as such must be the sum of powers of T_g each of which is congruent to m (mod h). Therefore, if M ∈ Γ_g, then T^{(−m)}|_0 M = σ_g(M)^m T^{(−m)}. Given L ∈ SL_2(Z), let U be a matrix as in (8.11) so that LU ∈ Γ_g. By applying Proposition 8.10, we find Theorem 8.8, along with the observations that t ρ = (h,γ) 2 eh 2 and κ_ρ = 0 for every ρ = α/γ = L_ρ^{−1}∞, implies the first part of the theorem. The formula for the coefficients follows by Theorem 8.7.

8.5. Exact formulas for U_g up to a theta function. Following a similar process to that in the previous section, we may construct a series Û_g(τ) = q^{−23/24} + O(q^{1/24}) with principal parts matching those of η(τ)T_g(τ) at all cusps. Then according to (7.24), the difference q^{1/24}(U_g − 1) − Û_g is a weight 1/2 holomorphic modular form, which by a celebrated result [200] of Serre-Stark, is a finite linear combination of unary theta functions. This will not affect the asymptotics in Theorem 8.1. The functions T_g and Û_g differ primarily in their weight, and in that Û_g has a non-trivial multiplier ν_η : M ↦ η(Mτ)/((M : τ)^{1/2} η(τ)). They also have slightly different orders of poles, which is accounted for by the fact that the multiplier ν_η implies that κ_ρ = t_ρ/24 at every cusp ρ for the Û_g, rather than 0 for the T_g.
The proof of the following theorem is the same as that of Theorem 8.11, mutatis mutandis. Theorem 8.12. Let g ∈ M, with Γ g = N|h + e, f, . . . , and let S N h and W g be as above. If m is a positive integer then For n a non-negative integer, the coefficient of q n+ 1 24 in U g is given by This immediately admits the following corollary.
Corollary 8.13. Given the notation above, there is a weight 1/2 linear combination of theta functions h_g(τ) for which the coefficient of q^n in U_g(τ) − q^{−1/24}h_g(τ) coincides with the coefficient of q^{n+1/24} in Û_g, given explicitly in Theorem 8.12.
Proof of Theorem 8.1. Following Harada and Lang [118], we begin by defining the functions The orthogonality of characters imply that for g and h ∈ M, Here |C M (g)| is the order of the centralizer of g in M. Since the order of the centralizer times the order of the conjugacy class of an element is the order of the group, (8.13) and (8.12) together imply the inverse relation In particular we have that T , and therefore we can identify the m i (−m, n) as the Fourier coefficients of the T Using Theorem 8.11, we obtain exact formulas for the coefficients of T (−m) χ i (τ ). Let g ∈ M with Γ g = N g ||h g + e g , f g , . . . . If m and n are positive integers, then the nth coefficient is given exactly by where S Nghg and W g are given as above.
Using the well-known asymptotics for the I-Bessel function we see that the formula for m i (−m, n) is dominated by the c = 1 term which appears only for g = e (so that N e = h e = 1). This term yields the asymptotic as in the statement of the theorem. The asymptotics for n i (n) follows similarly, using the formula We note that the coefficients of the theta functions h g (τ ) in Corollary 8.13 are bounded by constants and so do not affect the asymptotics. This yields √ 23|24n+1| as in the theorem.
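The role of the c = 1 term can be illustrated numerically in the simplest case g = e, m = 1. Inserting the leading behaviour I_1(x) ≈ e^x/√(2πx) into that term gives the classical estimate c(n) ≈ e^{4π√n}/(√2·n^{3/4}) for the coefficients of J. The sketch below compares this estimate with the first few exact coefficients, which are hard-coded; it is only a sanity check of the leading term, not a computation of the multiplicities m_i(−1, n), and the function name is illustrative.

```python
import math

# Exact coefficients c(n) of J(tau) for n = 1, ..., 4 (hard-coded).
J = {1: 196884, 2: 21493760, 3: 864299970, 4: 20245856256}


def leading_asymptotic(n: int) -> float:
    """Leading-order estimate e^{4 pi sqrt(n)} / (sqrt(2) n^{3/4}) for c(n)."""
    return math.exp(4 * math.pi * math.sqrt(n)) / (math.sqrt(2) * n ** 0.75)


# Ratios exact / estimate; they should drift towards 1 as n grows.
ratios = [J[n] / leading_asymptotic(n) for n in sorted(J)]

if __name__ == "__main__":
    for n, r in zip(sorted(J), ratios):
        print(n, round(r, 4))
```

Even for these small n the estimate is accurate to within a few percent, which is consistent with the exponential dominance of the c = 1 term.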

8.7.
Examples of the exact formulas. We conclude with a few examples illustrating the exact formulas for the McKay-Thompson series. These formulas for the coefficients generally converge rapidly. However the rate of convergence is not uniform and often requires many more terms to converge to a given precision.
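To give a feeling for the convergence in the simplest case g = e, the sketch below evaluates partial sums of the classical Rademacher-type series c(n) = (2π/√n) Σ_{c>0} K(n, −1; c)/c · I_1(4π√n/c) for the coefficients of J. The assumptions here are: K is taken to be the ordinary Kloosterman sum (not the modified sum K_c of (8.10)), I_1 is evaluated by its power series, and the function names are illustrative. Python 3.8 or later is assumed for the modular inverse `pow(d, -1, c)`.

```python
import cmath
import math


def kloosterman(m: int, n: int, c: int) -> float:
    """Ordinary Kloosterman sum S(m, n; c); it is real, so the real part is returned."""
    total = 0j
    for d in range(1, c + 1):
        if math.gcd(d, c) == 1:
            d_inv = pow(d, -1, c) if c > 1 else 0  # inverse of d modulo c
            total += cmath.exp(2j * math.pi * (m * d + n * d_inv) / c)
    return total.real


def bessel_i1(x: float, terms: int = 80) -> float:
    """Modified Bessel function I_1(x) via sum_k (x/2)^(2k+1) / (k! (k+1)!)."""
    half = x / 2.0
    term = half  # k = 0 term
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + 1))
        total += term
    return total


def rademacher_partial_sum(n: int, max_c: int) -> float:
    """Partial sum over 1 <= c <= max_c of the series for the coefficient of q^n in J."""
    prefactor = 2 * math.pi / math.sqrt(n)
    return prefactor * sum(
        kloosterman(n, -1, c) / c * bessel_i1(4 * math.pi * math.sqrt(n) / c)
        for c in range(1, max_c + 1)
    )


if __name__ == "__main__":
    for max_c in (1, 5, 50):
        print(max_c, rademacher_partial_sum(1, max_c))
```

For n = 1 the c = 1 term alone already accounts for most of the exact value 196884, while the remaining terms contribute slowly decaying corrections.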
Example. We first consider the example g = e. In this case we have Γ g = SL 2 (Z), which has only the cusp infinity. In this case Theorem 8.11 reduces to the well known expansion Example. The second example we consider is g in the conjugacy class 4B. In this case we have Γ g = 4||2 + 2 ⊃ Γ 0 (8). The function T g has a pole at each of the four cusps of Γ 0 (8): (1) The cusp ∞ has e = 1, width t = 1, and coefficient ǫ(L ∞ ) = 1.
2) whose coefficients are virtual vector bundles, obtained as sums of tensor products of the exterior and symmetric powers of the holomorphic tangent bundle T M , and its dual bundle T * M . (Cf. also §1 of [115] or Appendix A of [68].) In (9.2) we interpret t E in direct analogy with t E (cf. (3.13)), replacing exterior powers ∧ k E with symmetric powers ∨ k E.
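Since a complex K3 surface $M$ has trivial canonical bundle, and hence vanishing first Chern class, its elliptic genus is a weak Jacobi form $Z_M(\tau, z)$ of weight zero and index $\dim(M)/2 = 1$ (see [115] or [126] for proofs of this fact), once we set $q = e(\tau)$ and $y = e(z)$. This means$^{7}$ that $Z_M(\tau, z)$ is a holomorphic function on $\mathbb{H} \times \mathbb{C}$, satisfying
$$Z_M(\tau, z) = e\!\left(-\frac{cz^2}{c\tau+d}\right) Z_M\!\left(\frac{a\tau+b}{c\tau+d}, \frac{z}{c\tau+d}\right) = e(\lambda^2\tau + 2\lambda z)\, Z_M(\tau, z + \lambda\tau + \mu)$$
for $\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right) \in SL_2(\mathbb{Z})$ and $\lambda, \mu \in \mathbb{Z}$, with a Fourier expansion $Z_M(\tau, z) = \sum_{n,r} c(n, r)\, q^n y^r$ such that $c(n, r) = 0$ whenever $n < 0$.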

Mathieu McKay-Thompson series
$\operatorname{tr}(g|K_{n-1/8})\, q^{n-1/8}$ (9.11) associated to a graded $M_{24}$-module $K = \bigoplus_{n>0} K_{n-1/8}$, such that $K_{7/8}$ is the sum of the two 45-dimensional irreducible representations of $M_{24}$, and $K_{15/8}$ the sum of the two 231-dimensional irreducible representations, etc. (See [41] for a detailed review of Mathieu moonshine, and explicit descriptions of the $H_g$ in particular.) We have the following beautiful recent result of Gannon [110].
Theorem 9.1 (Gannon). There is a graded $M_{24}$-module $K = \bigoplus_{n>0} K_{n-1/8}$ for which (9.11) is true (given that the $H_g$ are as described in [41]).

Remark. A concrete construction of K remains unknown.
The reader may ask: how were the $H_g$ determined, if the module $K$ is as yet unknown? To explain this, note that the subscript in $M_{24}$ is a reference to the fact that $M_{24}$ is distinguished amongst permutation groups: it may be characterized as the unique proper subgroup of the alternating group $A_{24}$ that acts quintuply transitively on 24 points (cf. [62]). Write $\chi_g$ for the number of fixed points of an element $g \in M_{24}$ in this defining permutation representation.
The first few terms of $H_g$ are determined by the Eguchi-Ooguri-Tachikawa observation on (9.9), for it indicates that the coefficient of $q^{7/8}$ in $H_g$ should be the trace of $g$ on the sum of the two 45-dimensional irreducible representations, and the coefficient of $q^{15/8}$ should be the trace of $g$ on the sum of the two 231-dimensional irreducible representations, etc. To determine the remaining infinitely many terms, modularity may be used: the series $H_g$, determined in [37,85,100,101], have the property that the function $Z_g(\tau, z)$ of (9.12) is a weak Jacobi form of weight zero and index one for $\Gamma^J_0(N) := \Gamma_0(N) \ltimes \mathbb{Z}^2$ (with non-trivial multiplier when $\chi_g = 0$), where $N = o(g)$ is the order of $g$, and $\Gamma_0(N)$ is as in (8.3).
Thus Mathieu moonshine entails twisted, or twined, versions (9.12) of the K3 elliptic genus (9.4), but the single variable series $H_g(\tau)$ may also be studied in their own right, as automorphic objects of a particular kind: it turns out that they are mock modular forms$^{8}$ of weight 1/2, for various groups $\Gamma_0(N)$, with shadows $\chi_g\, \eta(\tau)^3$. This means that the completed functions $\widehat{H}_g(\tau)$ are harmonic Maass forms of weight 1/2, with the same multiplier system as $\eta(\tau)^{-3}$ when $\chi_g \neq 0$. (In case $\chi_g = 0$, i.e. when $H_g$ is already a modular form, the multiplier is slightly different. See e.g. [41]. The groups $\Gamma_0(o(g))$ for which $\chi_g = 0$ are characterized in [43].) The function $H_g(\tau)$, being the holomorphic part of $\widehat{H}_g(\tau)$, is the mock modular form.
In contrast to the twined K3 elliptic genera Z g , the mock modular forms H g are distinguished, in a manner directly analogous to the McKay-Thompson series T g of monstrous moonshine: it is shown in [42] that the H g admit a uniform description in terms of Rademacher sums, in direct analogy with Theorem 5.2. (We refer to [42] or the review [41] for a precise statement of this result.) Since the coincidence between the monstrous McKay-Thompson series and (normalized) Rademacher sums depends in a crucial way upon the genus zero property of monstrous moonshine, as evidenced by Theorem 5.1, it is natural to identify the Rademacher sum realization of the H g as the Mathieu moonshine counterpart to the genus zero property of monstrous moonshine.
As we have hinted above, the Rademacher sum property that distinguishes the $T_g$ and $H_g$ does not hold for the weight zero Jacobi forms $Z_g$ (cf. (9.12)). A Poincaré series approach to Jacobi forms is described in [22], using the foundations established in [25,26], and it is verified there that the $Z_g$ are not all realized in this way. On the other hand, the main result of [22] is the Poincaré series construction of certain Maass-Jacobi forms of weight one, naturally associated to elements of $M_{24}$. Thus we can expect that Jacobi forms of weight one, rather than the $Z_g$ of (9.12), will play an important role in a comprehensive conceptual explanation of the Mathieu moonshine phenomenon.

9.2. Niemeier Lattices. Vector-valued versions of the Rademacher sums that characterize the $H_g$ were used in [46] to identify Mathieu moonshine as a special case of six directly similar correspondences, between conjugacy classes in certain finite groups and distinguished (vector-valued) mock modular forms of weight 1/2. Since the mock modular forms arising seemed to be characterized by their shadows, this was dubbed umbral moonshine in [46].
The conjectures of [46] were greatly expanded in [47], following an observation of Glauberman (cf. the Acknowledgement in [47]), that the finite groups identified in [46] also appear as automorphism groups of codes associated to deep holes in the Leech lattice (cf. [57] or [54]).
To explain the significance of this, recall that an integral lattice is a free abelian group $L$ together with a symmetric bilinear form $\langle \cdot\,, \cdot \rangle : L \times L \to \mathbb{Z}$. A lattice $L$ is called positive-definite if $\langle \lambda, \lambda \rangle \geq 0$ for all $\lambda \in L$, with equality only when $\lambda = 0$. It is called even if $\langle \lambda, \lambda \rangle \in 2\mathbb{Z}$ for all $\lambda \in L$, and self-dual if $L = L^*$, for $L^*$ the dual of $L$,
$$L^* := \{\lambda \in L \otimes_{\mathbb{Z}} \mathbb{Q} \mid \langle \lambda, \mu \rangle \in \mathbb{Z} \text{ for all } \mu \in L\}. \quad (9.14)$$
The even self-dual positive-definite lattices of rank 24 have been classified [179] (see also [58,217]) by Niemeier: there are 24 in total, up to isomorphism. They are characterized by their root systems (i.e. the configurations of their vectors with square length equal to 2), and the Leech lattice $\Lambda$ (cf. §3) is the unique such lattice whose root system is empty. We refer to the remaining 23 as the Niemeier lattices. The Niemeier root systems are the root systems of the Niemeier lattices, and they are described explicitly as
$$A_1^{24},\ A_2^{12},\ A_3^{8},\ A_4^{6},\ A_6^{4},\ A_{12}^{2}, \quad (9.15)$$
$$A_5^4D_4,\ A_7^2D_5^2,\ A_8^3,\ A_9^2D_6,\ A_{11}D_7E_6,\ A_{15}D_9,\ A_{17}E_7,\ A_{24},\ D_4^6,\ D_6^4,\ D_8^3,\ D_{10}E_7^2,\ D_{12}^2,\ D_{16}E_8,\ D_{24},\ E_6^4,\ E_8^3, \quad (9.16)$$
in terms of the irreducible, simply-laced (i.e. ADE type) root systems. (See [60] or [129] for more on root systems.) In (9.15) and (9.16) we use juxtaposition as a shorthand for direct sum, so that $A_1^{24}$ denotes 24 copies of the $A_1$ root system, and $A_{11}D_7E_6$ is shorthand for $A_{11} \oplus D_7 \oplus E_6$. One can check that the Niemeier root systems (9.15), (9.16) are exactly those unions of ADE type root systems for which the total rank is 24, and the Coxeter number is constant across irreducible components.
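The closing claim about ranks and Coxeter numbers is easy to check by machine. The sketch below assumes the standard list of the 23 Niemeier root systems and the standard Coxeter numbers ($n+1$ for $A_n$, $2n-2$ for $D_n$, and $12, 18, 30$ for $E_6, E_7, E_8$); the data layout and function names are illustrative choices, not taken from the references.

```python
def coxeter_number(kind, rank):
    """Coxeter number of an irreducible simply-laced root system."""
    if kind == "A":
        return rank + 1
    if kind == "D":
        return 2 * rank - 2
    return {6: 12, 7: 18, 8: 30}[rank]  # kind == "E"


# Each Niemeier root system as a list of (type, rank, multiplicity).
NIEMEIER_ROOT_SYSTEMS = {
    "A1^24": [("A", 1, 24)], "A2^12": [("A", 2, 12)], "A3^8": [("A", 3, 8)],
    "A4^6": [("A", 4, 6)], "A6^4": [("A", 6, 4)], "A12^2": [("A", 12, 2)],
    "A5^4 D4": [("A", 5, 4), ("D", 4, 1)],
    "A7^2 D5^2": [("A", 7, 2), ("D", 5, 2)],
    "A8^3": [("A", 8, 3)],
    "A9^2 D6": [("A", 9, 2), ("D", 6, 1)],
    "A11 D7 E6": [("A", 11, 1), ("D", 7, 1), ("E", 6, 1)],
    "A15 D9": [("A", 15, 1), ("D", 9, 1)],
    "A17 E7": [("A", 17, 1), ("E", 7, 1)],
    "A24": [("A", 24, 1)],
    "D4^6": [("D", 4, 6)], "D6^4": [("D", 6, 4)], "D8^3": [("D", 8, 3)],
    "D10 E7^2": [("D", 10, 1), ("E", 7, 2)],
    "D12^2": [("D", 12, 2)],
    "D16 E8": [("D", 16, 1), ("E", 8, 1)],
    "D24": [("D", 24, 1)],
    "E6^4": [("E", 6, 4)], "E8^3": [("E", 8, 3)],
}


def total_rank(components):
    return sum(rank * mult for _, rank, mult in components)


def coxeter_numbers(components):
    """Set of Coxeter numbers occurring among the irreducible components."""
    return {coxeter_number(kind, rank) for kind, rank, _ in components}


if __name__ == "__main__":
    for name, components in NIEMEIER_ROOT_SYSTEMS.items():
        print(name, total_rank(components), sorted(coxeter_numbers(components)))
```

Each entry should report total rank 24 and a single Coxeter number, which is the number $m(X)$ that enters the umbral moonshine constructions.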
For $X$ a Niemeier root system and $N^X$ the corresponding Niemeier lattice, define the outer automorphism group of $N^X$ by setting $\operatorname{Out}(N^X) := \operatorname{Aut}(N^X)/W^X$, (9.18) where $W^X$ denotes the subgroup of $\operatorname{Aut}(N^X)$ generated by reflections in root vectors. Applying this construction to the Leech lattice, corresponding to $X = \emptyset$, we obtain the Conway group, $Co_0 := \operatorname{Aut}(\Lambda)$, (9.19) so named in light of Conway's detailed description [50,52] of its structure. A number of the 26 sporadic simple groups appear as subgroups, or quotients of subgroups of $Co_0$, including the three sporadic simple Conway groups, $Co_1$, $Co_2$ and $Co_3$. The Conway group $Co_0$ is a double cover of the first, and largest of these, $Co_1 \simeq Co_0/\{\pm \mathrm{Id}\}$. (9.20) Note that $M_{24}$ is naturally a subgroup of $Co_0$, and also $Co_1$, for if $\{\lambda_i\} \subset \Lambda$ is a set of 24 vectors such that $\langle \lambda_i, \lambda_j \rangle = 8\delta_{ij}$, then the subgroup of $Co_0$ that stabilizes this set $\{\lambda_i\}$ is a copy of $M_{24}$.
According to Conway-Parker-Sloane [57], the Niemeier root systems classify the deepest holes in the Leech lattice, being the points in Λ⊗ Z R at maximal distance from vectors in Λ. Moreover, this correspondence is strong enough that the Niemeier outer automorphism groups Out(N X ) are also visible inside the Conway group, Co 0 . More precisely, if x ∈ Λ ⊗ Z R is a deep hole, with corresponding Niemeier root system X according to [57], then the stabilizer Aut(Λ, x) of x in Aut(Λ) has a normal subgroup C x such that Aut(Λ, x)/C x ≃ Out(N X ). (9.22) The subgroup C x even encodes a method for constructing N X , as is explained in detail in [59], for if L X denotes the sub lattice of N X generated by roots, then N X is determined by its image in (L X ) * /L X (cf. (9.14)) under the natural map N X → (L X ) * /L X . Write C X for this subgroup of (L X ) * /L X , called the glue code of X in [59] (see also [58]). Then C x is isomorphic to C X , according to [59].
Thus $\operatorname{Out}(N^X)$ acts as automorphisms on the glue code $C^X$, and Glauberman's observation suggests an extension of the results of [46], whereby distinguished vector-valued mock modular forms $H^X_g = (H^X_{g,r})$ are associated to elements $g$ in the umbral groups $G^X := \operatorname{Out}(N^X)$, (9.24) for each Niemeier root system $X$. The realization of this suggestion is described in detail in [47]. For $X = A_1^{24}$, the glue code $C^X$ is a copy of the extended binary Golay code (cf. [60] or [192]), and $G^X$ is its full automorphism group, $M_{24}$. Thus, from the Niemeier root system perspective, Mathieu moonshine is the special case of umbral moonshine corresponding to the root system $A_1^{24}$.

In (9.15) we have separated out the Niemeier root systems of the form $A_n^d$ with $d = 24/n$ even. It is exactly these cases of umbral moonshine that are discussed in [46]. The original Mathieu moonshine observation of Eguchi-Ooguri-Tachikawa stemmed from consideration of the weight zero, index one weak Jacobi form $Z_M$ (cf. (9.4)), realized as the K3 elliptic genus. The analysis of [46] is, to some extent, similarly motivated, including the attachment of a weight zero, index $n$ weak Jacobi form $Z^{(n+1)}_g(\tau, z)$ to each $g \in G^X$, for each Niemeier root system $X = A_n^d$ with $d = 24/n$ even. A notion of extremal Jacobi form is formulated in [46], motivated by the representation theory of the $N = 4$ superconformal algebra, and it is proven$^{9}$ there that the six functions $Z^{(n+1)} := Z^{(n+1)}_e$, for $n \in \{1, 2, 3, 4, 6, 12\}$, exhaust all examples. Thus the cases (9.15) of umbral moonshine considered in [46] are distinguished from the point of view of Jacobi forms of weight zero.
By contrast, there seems to be no natural way to associate weight zero Jacobi forms to the Niemeier root systems not$^{10}$ of the pure A-type, $A_n^d$. Rather, the mock modular forms $H^X_g$ described in [47] naturally appear as the theta-coefficients of finite parts of certain meromorphic Jacobi forms $\psi^X_g$ of weight 1 and index $m$, $H^X_{g,r}(\tau)\,\theta_{m,r}(\tau, z)$, (9.25) where $m = m(X)$ is the Coxeter number of any irreducible component of $X$ (cf. (9.17)).

$^{9}$ The main step in the classification given in [46] is a demonstration that the existence of an extremal Jacobi form of index $m - 1$ implies the vanishing of $L(f, 1)$ for all newforms $f$ of weight 2 and level $m$, where $L(f, s)$ is the Dirichlet series naturally attached to $f$ (cf. e.g. §3.6 of [201]). At this point one expects extremal Jacobi forms to be very few in number, on the strength of the Birch-Swinnerton-Dyer conjecture (cf. [12,219]), for example. This machinery is evidently quite powerful, and we may anticipate further applications to umbral moonshine in the future.

$^{10}$ The cases $A_8^3$ and $A_{24}$ do come with weight zero Jacobi forms attached, which are obtained via a slight weakening of the notion of extremal Jacobi form formulated in [46]. Cf. §4.3 of [47].
Here, meromorphic means that we allow poles in the functions z → ψ X g (τ, z), at torsion points z ∈ Qτ + Q. The Weierstrass ℘ function is a natural example (with weight two and index zero).
The decomposition (9.25) is described in detail in [47], following the general structural results on meromorphic Jacobi forms established in [63,228]. For now let us just mention that the first summand on the right hand side is the polar part of $\psi^X_g$, defined as in §8.2 of [63], and
$$\theta_{m,r}(\tau, z) := \sum_{k \in \mathbb{Z}} y^{2km+r}\, q^{(2km+r)^2/4m} \quad (9.27)$$
evidently depends only on $r$ modulo $2m$.
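The dependence of $\theta_{m,r}$ on $r$ modulo $2m$ only can be seen by listing the monomials that occur. The following is a small sketch, with hypothetical function names, that enumerates the pairs (exponent of $y$, exponent of $q$) up to a cutoff and compares the resulting sets for different representatives of $r$.

```python
from fractions import Fraction


def theta_terms(m, r, max_abs_exponent):
    """Monomials y^n q^(n^2/4m) of theta_{m,r} with |n| <= max_abs_exponent.

    Returns a set of pairs (n, n^2/(4m)), where n = 2*k*m + r runs over the
    integers congruent to r modulo 2m. Every such monomial has coefficient 1.
    """
    # Enough values of k to reach every n in the window, for any integer r.
    k_range = max_abs_exponent // (2 * m) + abs(r) // (2 * m) + 2
    terms = set()
    for k in range(-k_range, k_range + 1):
        n = 2 * k * m + r
        if abs(n) <= max_abs_exponent:
            terms.add((n, Fraction(n * n, 4 * m)))
    return terms


if __name__ == "__main__":
    for n, exponent in sorted(theta_terms(3, 1, 20)):
        print(f"y^{n} q^({exponent})")
```

Replacing $r$ by $r + 2m$ merely shifts the summation index $k$, so the two lists agree. Replacing $r$ by $-r$ negates every exponent of $y$, which reflects the identity $\theta_{m,-r}(\tau, z) = \theta_{m,r}(\tau, -z)$.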
A number of the meromorphic Jacobi forms attached to Niemeier root systems by umbral moonshine also appear amongst the specific examples of [63], where the main application is the computation of quantum degeneracies of black holes in certain string theories. However, whilst some speculations are offered in §5.5 of [46], no direct relationship between umbral moonshine and string theory has been formulated as yet.
We have seen in §9.1 that the mock modular forms attached to M 24 by Mathieu moonshine (i.e. umbral moonshine for X = A 24 1 ) may be characterized as Rademacher sums, and this serves as an umbral analogue of the principal modulus/genus zero property of monstrous moonshine, on the strength of Theorem 5.1. It is natural to ask for an extension of this result to all cases of umbral moonshine.
Conjecture 5.4 of [46] amounts to the prediction that vector-valued generalizations of the Rademacher sums of [42] will recover the H X g for X = A d n with d even (cf. (9.15)), and Conjecture 3.2 of [48] is an extension of this to all Niemeier root systems X. Thus a positive solution to Conjecture 3.2 of [48] will verify the umbral analogue of the principal modulus property of monstrous moonshine. So far, the Rademacher sum conjecture for umbral moonshine is known to be true only in the case that X = A 24 1 , but a program to analyze the Rademacher sum conjecture for more general cases of umbral moonshine, via the theory of Maass-Jacobi forms (cf. [25,26]), has been initiated in [22].
A notion of optimal growth was formulated in §6.3 of [47], following the work [63], with a view to extending Conjecture 5.4 of [46]. It is now known that this condition does not uniquely determine the $H^X_g$ for general $X$ (see [48] for a full discussion of this), but all the $H^X_g$ serve as examples. With this in mind, it is interesting to note that many of Ramanujan's mock theta functions [193,194] appear as components of the umbral McKay-Thompson series $H^X_g$. (Cf. §4.7 of [46] and §5.4 of [47].)

9.3. Modules. As we have explained above, the Rademacher sum property of umbral moonshine is a natural counterpart to the principal modulus, or genus zero property of monstrous moonshine (cf. §3), formulated in a detailed way by Conway-Norton [56].
Conjecture 9.2 (Cheng-Duncan-Harvey). For each Niemeier root system $X$, there is a bi-graded $G^X$-module $\check{K}^X$ (9.28) such that the vector-valued umbral McKay-Thompson series $H^X_g = (H^X_{g,r})$ is recovered$^{11}$ from the graded trace of $g$ on $\check{K}^X$ via (9.29), for $r \in I^X$.
In (9.28) and (9.29), $m = m(X)$ is the Coxeter number of any irreducible component of $X$, as in (9.25). The $H^X_{g,r}$ satisfy $H^X_{g,-r} = -H^X_{g,r}$, so the umbral McKay-Thompson series $H^X_g$ is determined by its components $H^X_{g,r}$ with $0 < r < m$. If the highest rank irreducible component of $X$ is of type D or E then there are more symmetries amongst the $H^X_{g,r}$, and the definition of the set $I^X \subset \mathbb{Z}/2m\mathbb{Z}$ reflects this: if $X$ has an A-type component then $I^X := \{1, 2, 3, \dots, m - 1\}$. If the highest rank component of $X$ is of type D then $m \equiv 2 \bmod 4$, and $I^X := \{1, 3, 5, \dots, m/2\}$. The remaining cases are $X = E_6^4$, in which case $I^X := \{1, 4, 5\}$, and $X = E_8^3$, for which $I^X := \{1, 7\}$.

As mentioned in §9.1, the existence of the module $\check{K}^X$ for $X = A_1^{24}$ has been proven by Gannon [110]. More specifically, Gannon has shown that the coefficients of the non-negative powers of $q$ in $H^X_g$ for $X = A_1^{24}$ are traces of elements of $M_{24}$ on direct sums of irreducible $M_{24}$-modules. A priori, we might have needed $\mathbb{C}$-linear combinations of such traces in order to recover the $H^X_{g,r}$. In forthcoming work [78], the authors confirm the validity of Conjecture 9.2.

$^{11}$ In the original formulation, Conjecture 6.1 of [47], the function $H^X_{g,r}$ in (9.29) is replaced with $3H^X_{g,r}$ in the case that $X = A_8^3$. It also predicted that $\check{K}^X_{r,-D/4m}$ is a virtual $G^X$-module in case $X = A_8^3$ and $D = 0$. Recently, a modification of the specification of the $H^X_g$ for $X = A_8^3$ has been discovered, which leads to the simpler, more uniform formulation appearing here. We refer to [48] for a full discussion of this.

Theorem 9.3 (Duncan-Griffin-Ono). Conjecture 9.2 is true.

Theorem 9.3 serves, to a certain extent, as the umbral counterpart to Borcherds' result, Theorem 3.6.
Indeed, the method of [78] may be used to give an alternative proof of the existence of the M-module V ♮ , for which the associated graded trace functions are the normalized principal moduli of the genus zero groups Γ g , identified by Conway-Norton in [56].
Nonetheless, there is still work to be done, for in order to have a direct counterpart to Theorem 3.6 we require concrete constructions of theǨ X . In the case of monstrous moonshine, the construction of V ♮ due to Frenkel-Lepowsky-Meurman came equipped with rich algebraic structure, ultimately leading to the notion of vertex operator algebra, and powerful connections to physics. We can expect that a full explanation of the umbral moonshine phenomena will require analogues of this for all theǨ X .
Just such an analogue for $X = E_8^3$ has recently been obtained in [79], where a super vertex operator algebra $V^X$ is constructed, together with an action of $G^X \simeq S_3$, such that the components of the vector-valued mock modular forms $H^X_g = (H^X_{g,r})$ are recovered from traces of elements of $G^X$ on canonically-twisted modules for $V^X$. The main ingredient in the construction of [79] is an adaptation of the familiar (to specialists) lattice vertex algebra construction (cf. [13,97]), to cones in indefinite lattices. The choice of cone is in turn inspired by Zwegers' work [229] on a particular pair of the fifth order mock theta functions of Ramanujan.

In [45,82] a different approach to the module problem is considered, whereby meromorphic Jacobi forms associated to the $H^X_g$ are recovered as graded traces on canonically-twisted modules for certain super vertex algebras. In [82] constructions are given for the $\psi^X_g$ of (9.25), for $X \in \{A_3^8, A_4^6, A_6^4, A_{12}^2\}$. In [45] certain half-integral index analogues of the $\psi^X_g$ are recovered, for $X \in \{D_6^4, D_8^3, D_{12}^2, D_{24}\}$.

As we will explain in more detail in the next section, recent work [39] constructs modules underlying assignments of vector-valued mock modular forms to the sporadic simple groups $M_{23}$ and $M_{22}$. Here, $M_{23}$ denotes the maximal subgroup of $M_{24}$ composed of elements fixing any given point in the defining permutation representation (cf. §9.1), and $M_{22}$ is obtained similarly from $M_{23}$, as the subgroup stabilizing a point in its natural permutation representation of degree 23. Although the mock modular forms realized in [39] are not directly related to the $H^X_g$, it seems likely that the construction used therein holds important hints for future developments in the module problem for umbral moonshine.

By a result of Mukai [176] (cf. also [152]), the finite groups of symplectic automorphisms of complex K3 surfaces are, up to isomorphism, precisely the subgroups of $M_{24}$ that have at least five orbits in the unique non-trivial permutation representation on 24 points, including at least one fixed point.
Since a symplectic automorphism of a complex K3 surface M induces a supersymmetry preserving automorphism of a sigma model attached to M (cf. §9.1), and since it is the supersymmetry preserving automorphisms of a K3 sigma model that can be used to twine the K3 elliptic genus (9.4), the problem of classifying the supersymmetry preserving automorphism groups of nonlinear K3 sigma models was considered in [102].
One might have anticipated that all supersymmetry preserving K3 sigma model automorphism groups would be contained in M 24 , but this is not the case. Rather, the main result of [102], being a quantum analogue of Mukai's classification of finite symplectic automorphism groups of K3 surfaces, is that the groups of supersymmetry preserving automorphisms of K3 sigma models are, up to isomorphism, precisely the subgroups of Co 0 = Aut(Λ) (cf. (9.19)) that fix a sublattice of Λ with rank at least four.
Thus the K3 sigma models do not furnish quite the right theoretical setting for solving the mysteries of umbral moonshine. In particular, not all of the M 24 -twinings (9.12) of the K3 elliptic genus (9.4) arise as twinings defined by K3 sigma model automorphisms, since, for example, there are elements of M 24 (cf. (9.21)) that do not fix a rank four lattice in the Leech lattice, Λ.
Nonetheless, we can expect to learn useful information about umbral moonshine from further investigation of K3 sigma models. The history of monstrous moonshine provides a useful point of comparison: in advance of his proof of the Conway-Norton conjectures, Borcherds considered a certain BKM algebra (cf. §3) in [15], which was, at the time, called the monster Lie algebra, although it turned out to be only indirectly connected to the monster. The Lie algebra constructed in [15] is now known as the fake monster Lie algebra (cf. §2 of [16]), and has found a number of applications outside of moonshine. For example, the denominator function of the fake monster Lie algebra (cf. Example 2 in §10 of [17]) is used to prove facts about families of K3 surfaces in [19,153,225].
At the level of vertex operator algebras, the fake monster Lie algebra corresponds to the lattice vertex algebra $V_\Lambda = \bigoplus_{n=-1}^{\infty} V_{\Lambda,n}$ attached to the Leech lattice. This may be regarded as a "fake" moonshine module, for it has exactly the same graded dimension as $V^\natural$, up to the constant term,
$$\sum_{n=-1}^{\infty} \dim(V_{\Lambda,n})\, q^n = J(\tau) + 24.$$
There is no action of the monster on $V_\Lambda$, although there is an action by a group$^{12}$ of the shape $2^{24}.Co_0 = 2^{24}.(2.Co_1)$ (cf. (9.20)), whereas the monster contains a maximal subgroup with the shape $(2.2^{24}).Co_1$. The Frenkel-Lepowsky-Meurman construction of $V^\natural$ takes $V_\Lambda$ as a main ingredient. (Cf. (3.6).)

It is striking that the Conway group $Co_0 = 2.Co_1$ plays a prominent role in so many of the objects we have discussed: it is visible within the monster, and within the automorphism group of $V_\Lambda$. It serves for K3 sigma models as $M_{24}$ does for K3 surfaces, as discussed above, and all of the umbral moonshine groups (9.24) are visible within $Co_0$, according to (9.22).
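The agreement of graded dimensions up to the constant term can be checked to low order. The sketch below assumes the standard description of the graded dimension of a rank 24 lattice vertex algebra as $\Theta_\Lambda(\tau)/\eta(\tau)^{24}$, together with the first few coefficients $1 + 196560q^2 + 16773120q^3 + \dots$ of the Leech lattice theta series; both are inputs supplied for this illustration, and the function names are hypothetical.

```python
def inverse_euler_product(power, order):
    """Coefficients of prod_{n>=1} (1 - q^n)^(-power), up to q^order."""
    coeffs = [1] + [0] * order
    for n in range(1, order + 1):
        for _ in range(power):
            # Multiply in place by 1/(1 - q^n) = 1 + q^n + q^(2n) + ...
            for i in range(n, order + 1):
                coeffs[i] += coeffs[i - n]
    return coeffs


# Numbers of Leech lattice vectors with <v, v> = 0, 2, 4, 6 (assumed input).
LEECH_THETA = [1, 0, 196560, 16773120]


def lattice_voa_graded_dimension(order=3):
    """Coefficients of q^(-1), q^0, q^1, ... in Theta_Leech / eta^24."""
    oscillators = inverse_euler_product(24, order)
    return [
        sum(LEECH_THETA[i] * oscillators[k - i] for i in range(k + 1))
        for k in range(order + 1)
    ]


if __name__ == "__main__":
    print(lattice_voa_graded_dimension())
```

The constant term $24$ comes from the weight one subspace spanned by $\Lambda \otimes_{\mathbb{Z}} \mathbb{C}$, while the next two coefficients reproduce $196884$ and $21493760$, the first two coefficients of $J(\tau)$.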
Moreover, there is moonshine for the Conway group, in direct analogy with that for the monster, in the sense that there is an assignment of normalized principal moduli T s g to elements g ∈ Co 0 which are realized as trace functions on a graded infinite-dimensional Co 0 -module. A proof of this statement has recently appeared in [80].
To explain this, take $g \in Co_0$, let $\{\varepsilon_i\}$ be the eigenvalues associated to the action of $g$ on $\Lambda \otimes_{\mathbb{Z}} \mathbb{C}$, and define the function $T^s_g(\tau)$, in which $\chi_g = \sum_i \varepsilon_i$ is the character value associated to the action of $g$ on $\Lambda \otimes_{\mathbb{Z}} \mathbb{C}$. Then $T^s_g(2\tau) = q^{-1} + O(q)$ is the normalized principal modulus for a genus zero group, according to Conway-Norton [56] and Queen [188]. (See also [154].) It has been demonstrated in [80] that the functions $T^s_g$ are the graded traces attached to the action of $Co_0$ on a distinguished$^{13}$ super vertex operator algebra, $V^{s\natural} = \bigoplus_{n=-1}^{\infty} V^{s\natural}_{n/2}$,
$$T^s_g(\tau) = \sum_{n=-1}^{\infty} \operatorname{tr}(-g|V^{s\natural}_{n/2})\, q^{n/2}. \quad (9.33)$$
Thus the super vertex operator algebra $V^{s\natural}$ solves the Conway moonshine analogue of Thompson's Conjecture 3.1, and $V^{s\natural}$ is the natural analogue of the moonshine module $V^\natural$ for the Conway group, $Co_0$.
The Conway module $V^{s\natural}$ is closely related to monstrous moonshine, for, in addition to being directly analogous to $V^\natural$, many of the discrete groups $\Gamma_g < SL_2(\mathbb{R})$, for $g \in \mathbb{M}$, also arise as invariance groups of principal moduli attached to $Co_0$ via its action on $V^{s\natural}$. (Cf. [80].) On the other hand, $V^{s\natural}$ enjoys a close connection to K3 sigma models, for it is shown in [81] that the data defining a K3 sigma model gives rise to a bi-grading on a distinguished, canonically-twisted$^{14}$ $V^{s\natural}$-module $V^{s\natural}_{\rm tw} = \bigoplus_{n,r} (V^{s\natural}_{\rm tw})_{n,r}$, (9.34) such that the associated graded traces of compatible elements of $Co_0$ are weak Jacobi forms.

$^{13}$ The super vertex operator algebra $V^{s\natural}$ admits actions by both $Co_0$ (cf. (9.19)) and the simple group $Co_1$ (cf. (9.20)). Its construction as a $Co_1$-module was sketched first in §15 of [96], described later in §5 of [20], and subsequently studied in detail in [76]. The $Co_0$-module structure on $V^{s\natural}$ is mentioned in [76], following [20], but it seems that the modular properties of the trace functions associated to the $Co_0$-action were not considered until [80].
More specifically, following §2.1 of [102], we may regard the data of a K3 sigma model as equivalent$^{15}$ to a choice of positive-definite 4-space $\Pi < II_{4,20} \otimes_{\mathbb{Z}} \mathbb{R}$ (cf. (3.7)) satisfying the condition (9.35). Then the supersymmetry preserving automorphism group of the nonlinear sigma model defined by $\Pi$ is the group $G_\Pi := \operatorname{Aut}(II_{4,20}, \Pi)$, (9.36) composed of orthogonal transformations of $II_{4,20}$ that extend to the identity on $\Pi$, according to §2.2 of [102]. One of the main results of [102] is that $G_\Pi$ may be identified with a subgroup of $Co_0$.
The construction of [81] uses V s♮ tw to attach a graded trace function φ g (τ, z) := n,r tr(−g|(V s♮ tw ) n,r )q n y r (9.37) to each pair (g, Π), where Π < II 4,20 ⊗ Z R satisfies (9.35), and g ∈ G Π . It is shown in [81] that φ g is a weak Jacobi form of weight zero and index one for Γ J 0 (N) (cf. (9.12)), for some N, for all choices of Π and g ∈ G Π . Moreover, φ g is found to coincide with the g-twined K3 elliptic genus associated to the sigma model defined by Π, for all the examples computed in [102,106,218]. (These examples account for about half of the conjugacy classes of Co 0 that fix a 4-space in Λ ⊗ Z R.) In particular, taking g = e in (9.37) recovers the K3 elliptic genus (9.4), but in the form φ e (τ, z) = −2 11 θ 2 (τ, z) 2 θ 2 (τ, 0) 2 ∆(2τ ) ∆(τ ) + 1 2 24 . Thus V s♮ tw serves as a kind of universal object for K3 sigma models. This is interesting, for generally it is difficult to construct the Hilbert spaces underlying a K3 sigma model, and therefore difficult to compute the associated twined K3 elliptic genera, for instance, for all but a few special examples.
In [130], Huybrechts has related the positive-definite 4-spaces Π < II 4,20 ⊗ Z R satisfying (9.35) to pairs (X, σ), where X is a projective complex K3 surface, and σ is a stability condition on the bounded derived category of coherent sheaves on X. In this way he has obtained an alternative analogue of Mukai's result [176], whereby symplectic automorphisms of K3 surfaces are replaced by symplectic derived autoequivalences. (The results of [81] are formulated in this language.) A number of the functions Z g (cf. (9.12)) arising in Mathieu moonshine are realized as φ g for some g ∈ Co 0 . So the construction of [81] relates V s♮ to umbral moonshine, but the connection goes deeper, for it is shown in [81] that a natural generalization of the definition (9.37) recovers a number of the Jacobi forms attached to other root systems of the form X = A d n (cf. §9.2), beyond the special case X = A 24 1 . It is interesting to compare this to the results of [49] (see also the precursor [123]), which demonstrate a role for K3 surface geometry in all cases of umbral moonshine (i.e., for all the Niemeier root systems), by considering sigma models attached to K3 surfaces admitting du Val singularities. Indeed, many of the Jacobi forms computed in [49] appear also in [81].
From the discussion above we see that the Conway module V s♮ provides evidence for a deep connection between monstrous and umbral moonshine. Further support for the notion that monstrous and umbral moonshine share a common origin is obtained in [185], where the generalized Borcherds products of [28] are used to relate the trace functions of monstrous and umbral moonshine directly.
We elaborate now on the results of [39], which, as mentioned at the end of §9.3, fall outside of umbral moonshine as formulated in [47], but are nonetheless related. Actually, this work is another application of the Conway module $V^{s\natural}$, for in [39] the canonically-twisted $V^{s\natural}$-module $V^{s\natural}_{\rm tw}$ (cf. (9.34)) is equipped with module structures for the $N = 2$ and $N = 4$ superconformal algebras, which in turn give rise to an assignment of distinguished vector-valued mock modular forms to elements of the sporadic simple Mathieu groups $M_{23}$ and $M_{22}$, respectively. This work furnishes the first examples of concretely constructed modules for sporadic simple groups, such that the associated graded trace functions define mock modular forms. Interestingly, the sporadic groups of McLaughlin and Higman-Sims (both rather larger than $M_{24}$, or any of the other umbral groups $G^X$) also make an appearance in this setting.
As a further indication of the important role that K3 sigma models will play in illuminating umbral moonshine, we mention the interesting work [105,207,208,209], which seeks to explain the Mathieu moonshine observation by formulating a precise mechanism for combining symmetries of distinct K3 sigma models into a single group. We note in particular, that a fixed-point-free maximal subgroup of M 24 is constructed in this way in [209].
We conclude this section with references [40,122,123,124,145,186,223] to a number of other occurrences of umbral groups in geometry and physics, all promising connections to Mathieu moonshine, or umbral moonshine more generally. We also note the recent work [38], which analyzes all cases of generalized umbral moonshine, thereby extending the investigation of generalized Mathieu moonshine that was initiated in [104].

Open Problems
We conclude the article by identifying some open problems for future research that are suggested by our results in §7 and §8, and the developments described in §9.
Problem 10.1. We have seen that the known connections between monstrous moonshine and physics owe much to the Frenkel-Lepowsky-Meurman construction of the moonshine module $V^\natural$, and its associated vertex operator algebra structure. Just as the vertex operator algebra structure on $V^\natural$ gives a strong solution to Thompson's conjecture, Conjecture 3.1, we can expect concrete constructions of the $\check{K}^X$ (whose existence is now guaranteed thanks to Theorem 9.3) to be necessary for the elucidation of the physical origins of umbral moonshine. As we have described in §9.3, progress on this problem has been obtained recently in [45,79,82], and the related work [39] may also be useful in the determination of a general, algebraic solution to the module problem for umbral moonshine.
Problem 10.2. Norton's generalized moonshine conjectures were discussed in §4, and generalized umbral moonshine has been investigated in [38,104]. A special case of generalized moonshine for the Conway group is established in [80], but the full formulation and proof of generalized Conway moonshine remains open. Given the close connections between $V^{s\natural}$ and umbral moonshine discussed in §9.4, it will be very interesting to determine the precise relationship between the corresponding generalized moonshine theories. We can expect that the elucidation of these structures will be necessary for a full understanding of the role that umbral moonshine plays in physics.

Problem 10.3. As discussed in §9.2, the fact (cf. Theorem 5.2) that the McKay-Thompson series of monstrous moonshine are realized as Rademacher sums admits conjectural analogues for umbral moonshine. (See Conjecture 3.2 of [48] for a precise formulation.) So far this has been established only for $X = A_1^{24}$, corresponding to Mathieu moonshine (cf. [42]), and the general case remains open. As explained in §9.2, a positive solution to Conjecture 3.2 of [48] will establish an umbral moonshine counterpart to the principal modulus/genus zero property of monstrous moonshine.

Problem 10.4. As discussed above, in §5 and in §9.2, Rademacher sums play a crucial role in both monstrous and umbral moonshine, by serving to demonstrate the distinguished nature of the automorphic functions arising in each setting. In the case of monstrous moonshine, the Rademacher sum property also indicates a potentially powerful connection to physics, via three-dimensional gravity, as explained in §6. Thus it is an interesting problem to formulate umbral moonshine analogues of the conjectures of [77], discussed here in §§6, 7.
Problem 10.5. Relatedly, it follows from the results of [77] that the McKay-Thompson series $T^{s}_{g}$ (cf. (9.33)), attached to elements $g$ in the Conway group $\mathrm{Co}_0$ via its action on $V^{s\natural}$ (cf. §9.4), are also realized as Rademacher sums. That is, Theorem 5.2 generalizes naturally to Conway moonshine. Thus it is natural to investigate the higher order analogues $V^{s(-m)}$ of the super vertex operator algebra $V^{s\natural}$, and the Conway group analogues of the three-dimensional gravity conjectures of [77]. Some perspectives on this are available in [127,168,222].
Problem 10.6. The notion of extremal vertex operator algebra is defined by (6.8). So far the only known example is the moonshine module $V^{\natural}$. As explained in §6, the construction of a series of extremal vertex operator algebras, with central charges the positive integer multiples of 24, would go a long way towards the construction of a chiral three-dimensional quantum gravity theory. This problem also has a super analogue, cf. [127].
Problem 10.8. Relatedly, V × M-modules satisfying the extremal condition (6.8) may be easily constructed from the monster modules $V^{(-m)}$, as is mentioned at the conclusion of §7. What is the algebraic significance of these spaces? We know from [107,127] that they cannot admit a vertex operator algebra structure compatible with the given V × M actions. Is there some other kind of algebraic structure which is compatible with this symmetry?
Problem 10.9. The result of Corollary 8.2 implies that the $V^{(-m)}_{n}$ and $W^{\natural}_{n}$ tend to direct sums of copies of the regular representation of M, as $n \to \infty$. This means that if we write each homogeneous subspace of each module, particularly the moonshine module $V^{\natural}$, as the sum of a free part (free over the group ring of M) and a non-free part, then the non-free part tends to 0 (relative to the free part) as $n \to \infty$. Is there something to be learnt from an analysis of the non-free parts of $V^{(-m)}$, $W^{\natural}$? As one can see from Table 8, some irreducible representations of the monster feature more often in the non-free part than others. We thank Bob Griess for posing this question.
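The free/non-free splitting in Problem 10.9 can be sketched in symbols as follows. The notation $U_i$, $m_i(n)$, $f(n)$ and $r_i(n)$ is introduced here purely for illustration and is an assumption, not the notation of §8; the final limit is the asymptotic proportionality of multiplicities to dimensions stated in the abstract.

```latex
% Illustrative notation only (not taken from the paper):
%   U_i     irreducible module of the monster with character \chi_i, 1 <= i <= 194
%   m_i(n)  multiplicity of U_i in the homogeneous subspace V^\natural_n
%   f(n)    rank of a maximal free summand of V^\natural_n
%   r_i(n)  multiplicity of U_i in the remaining non-free part
\[
  V^{\natural}_{n} \;\cong\; \bigoplus_{i=1}^{194} U_i^{\oplus m_i(n)} .
\]
% The regular representation contains U_i exactly \dim(\chi_i) times, so by
% semisimplicity of the group ring the largest free summand has rank
\[
  f(n) \;=\; \min_{1 \le i \le 194}
  \left\lfloor \frac{m_i(n)}{\dim(\chi_i)} \right\rfloor ,
  \qquad
  r_i(n) \;=\; m_i(n) - f(n)\,\dim(\chi_i) .
\]
% Asymptotic regularity: multiplicities become proportional to dimensions.
\[
  \lim_{n \to \infty} \frac{m_i(n)}{\sum_{j=1}^{194} m_j(n)}
  \;=\; \frac{\dim(\chi_i)}{\sum_{j=1}^{194} \dim(\chi_j)} .
\]
```

For instance, in the graded piece of dimension $196884 = 1 + 196883$ only two irreducible representations appear, each with multiplicity 1, so $f = 0$ under this reading and the whole subspace is non-free; the free part can only become dominant in much higher degree.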
Problem 10.10. It is a natural problem to generalize the methods employed in §8, to determine the distributions of irreducible representations of the umbral groups $G^{X}$ (cf. (9.24)) in the umbral moonshine modules $\check{K}^{X}$ (cf. (9.28)). Similarly, one may also consider the distributions of the $\mathrm{Co}_0$-modules in $V^{s\natural} = V^{s(-1)}$, and in the $V^{s(-m)}$ more generally. In all of these cases, questions analogous to Problem 10.9 may reward investigation.

Appendix A. Monstrous Groups
The table below contains the symbols $\Gamma_g = N||h + e, f, \ldots$, for each conjugacy class of the monster. Following [55], if $h = 1$, we omit the '||1' from the symbol. If $W_g = \{1\}$, then we write $N||h$, whereas if it contains every exact divisor of $N/h$, we write $N||h+$.