Universal Dynamics of Heavy Operators in CFT$_2$

We obtain an asymptotic formula for the average value of the operator product expansion coefficients of any unitary, compact two dimensional CFT with $c>1$. This formula is valid when one or more of the operators has large dimension or -- in the presence of a twist gap -- has large spin. Our formula is universal in the sense that it depends only on the central charge and not on any other details of the theory. This result unifies all previous asymptotic formulas for CFT$_2$ structure constants, including those derived from crossing symmetry of four point functions, modular covariance of torus correlation functions, and higher genus modular invariance. We determine this formula at finite central charge by deriving crossing kernels for higher genus crossing equations, which give analytic control over the structure constants even in the absence of exact knowledge of the conformal blocks. The higher genus modular kernels are obtained by sewing together the elementary kernels for four-point crossing and modular transforms of torus one-point functions. Our asymptotic formula is related to the DOZZ formula for the structure constants of Liouville theory, and makes precise the sense in which Liouville theory governs the universal dynamics of heavy operators in any CFT. The large central charge limit provides a link with 3D gravity, where the averaging over heavy states corresponds to a coarse-graining over black hole microstates in holographic theories. Our formula also provides an improved understanding of the Eigenstate Thermalization Hypothesis (ETH) in CFT$_2$, and suggests that ETH can be generalized to other kinematic regimes in two dimensional CFTs.


Introduction and discussion
Two dimensional conformal field theories are among the most important and interesting quantum field theories. They describe important condensed matter and statistical mechanics systems at criticality and, remarkably, possess an infinite dimensional group of symmetries related to local conformal transformations [1]. In this paper we will be interested in irrational CFTs with c > 1 and an infinite number of primary states. Although these theories are not exactly solvable, they are nevertheless under much greater analytic control than their higher dimensional cousins. In this paper we will describe a particular example of this fact: the dynamics of heavy (i.e. high dimension) operators is universal in two dimensional CFTs, in the sense that these dynamics are determined only by the central charge and not by any other details of the theory.
The basic dynamical data that defines a CFT$_2$ is a list of primary operators $O_i$, along with
• Their scaling dimensions $\Delta_i \equiv h_i + \bar h_i$ and spins $J_i \equiv h_i - \bar h_i$, and
• The operator product expansion (OPE) coefficients $C_{ijk}$.
These data, along with the central charge $c$, uniquely determine the correlation functions of the theory in flat space as well as on an arbitrary surface. Ideally one would like to solve the constraints of unitarity and conformal invariance to determine the possible allowed values of the $\{h_i, \bar h_i, C_{ijk}\}$, and hence completely classify two dimensional CFTs. In the absence of such a complete classification, however, we will ask a more modest question: which features of this data are universal (i.e. true in any conformal field theory) and which are theory dependent?
A simple example of a universal feature is the dimension and spin of the identity operator:$^1$
$h_{\mathbb{1}} = 0 = \bar h_{\mathbb{1}}$ (1.1)
which is the same in every CFT$_2$. A second and somewhat more subtle universal feature is Cardy's formula for the growth of the high energy density of primary states [2]:$^2$ (1.2)
$^1$ We restrict our attention in this paper to unitary, compact CFTs, defined to have a discrete spectrum with a unique sl(2)-invariant ground state. The same approach will, however, apply more generally with some modest modifications. We focus on theories with $c_L = c_R = c$ for simplicity, but the modification of our results to theories with $c_L \neq c_R$ is straightforward.
$^2$ Throughout this paper we use the notation $a \sim b$ to denote that $a/b \to 1$ in the limit of interest. We will also use the notation $a \approx b$ to denote that $a$ and $b$ have the same leading scaling in the limit of interest.
Equation (1.2) is true in any compact CFT$_2$ with $c > 1$, and is universal in the sense that it depends only on the central charge $c$ and not on any other details of the theory. In fact, these two universal features (1.1) and (1.2) are closely related: they are "dual," in the sense that they are related by modular invariance. Cardy's formula is the statement that the identity operator has dimension zero, albeit interpreted in a dual channel in the computation of the torus partition function.
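For orientation, the leading exponential growth usually associated with Cardy's formula for Virasoro primaries at $c > 1$ can be written as follows. This is a standard form quoted here as an assumption about what (1.2) expresses; the precise prefactor and presentation of (1.2) may differ.

```latex
\rho(h,\bar h) \;\approx\;
\exp\!\left[\,2\pi\sqrt{\frac{c-1}{6}\left(h-\frac{c-1}{24}\right)}\,\right]
\exp\!\left[\,2\pi\sqrt{\frac{c-1}{6}\left(\bar h-\frac{c-1}{24}\right)}\,\right],
\qquad h,\bar h \to \infty .
```

Here $\approx$ is used in the sense of footnote 2, i.e. only the leading scaling is asserted.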
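Every unitary, compact CFT possesses an additional universal feature: the identity operator will appear in the fusion of any operator with itself. In terms of the OPE data, this means that the coefficient $C_{ii\mathbb{1}}$ is nonvanishing for every operator $O_i$ (1.3). In analogy with Cardy's formula, it is natural to ask what this statement implies when it is interpreted in a dual channel. We will answer this question in this paper. The result is a universal asymptotic formula for the average value of the OPE coefficients, given in equations (1.4) and (1.5).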
(1.5) In this equation $\prod_\pm$ denotes a product of eight terms with all possible sign permutations. Here, rather than using the central charge $c$ and dimensions $h$ and $\bar h$ to write our formula, we have used the "Liouville parameters"
$c = 1 + 6Q^2 = 1 + 6(b + b^{-1})^2$, $h = \alpha(Q - \alpha)$, $\alpha = \frac{Q}{2} + iP$. (1.6)
Just as with Cardy's formula, this result is universal in the sense that it is true in any (compact, unitary) CFT, and the only free parameter appearing in this formula is the central charge $c$.
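Since the displayed forms of (1.4) and (1.5) are not reproduced here, we record the form in which this universal formula is commonly quoted. This is a reconstruction under stated assumptions: $\Gamma_b$ denotes the Barnes double gamma function, the arguments are written in terms of the parameters $P$ of (1.6), and the overall normalization should be checked against the original equations.

```latex
\overline{|C_{ijk}|^2} \;\sim\; C_0(P_i,P_j,P_k)\, C_0(\bar P_i,\bar P_j,\bar P_k),
\qquad
C_0(P_i,P_j,P_k) \;=\; \frac{1}{\sqrt{2}}\,
\frac{\Gamma_b(2Q)}{\Gamma_b(Q)^3}\,
\frac{\prod_{\pm\pm\pm}\Gamma_b\!\left(\tfrac{Q}{2} \pm iP_i \pm iP_j \pm iP_k\right)}
     {\prod_{a\in\{i,j,k\}} \Gamma_b(Q+2iP_a)\,\Gamma_b(Q-2iP_a)} .
```

The numerator is the product over the eight sign choices referred to above.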
In interpreting this formula, a few comments are in order. The first is that equation (1.4) is an expression for the average OPE coefficient, with the heavy operator weight(s) averaged over all Virasoro primary operators, which is valid for any finite $c > 1$. In this sense, our result differs from most of the previous results in the literature. The second is that, although we have only written one formula, equation (1.4) is secretly three different formulas hiding in one. In particular, this formula is valid in three different regimes, and is derived using three types of crossing symmetry. Equation (1.4) holds:
• When two operators are taken to be fixed and the third is taken to be heavy, in which case it follows from the crossing symmetry of four-point functions with pairwise identical external operators.
• When one operator is fixed and the other two are heavy, in which case it follows from the modular covariance of torus two-point functions of identical operators.
• When all of the operators are taken to be heavy, in which case it follows from modular invariance of the genus two partition function.
In each case, the averaging taken in equation (1.4) should be understood as an average over the heavy operator(s), but not over the other operators which are held fixed.$^4$ The surprising result is that we obtain exactly the same formula in each case.
Various authors have previously considered the asymptotic behaviour of three point coefficients in each of these three separate limits [3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]. The asymptotic formulas which were obtained generally relied on detailed computations of the conformal blocks, and, while correct, required assumptions about the behaviour of the blocks in certain kinematic regimes or the simplification of large central charge. Our single asymptotic formula (1.5) unifies all of these previous results, and in the darkness binds them. Moreover, it holds for any finite value of the central charge $c > 1$, and interpolates between all of the previously known results in the literature.
Before describing the details of our derivation, in the remainder of the introduction we will describe the strategy underlying our derivation and comment in more detail on the interpretation of this result.

The strategy: bootstrap without the blocks
In order to illustrate our basic strategy, consider the following simple example where one extracts the asymptotic behaviour of OPE coefficients from crossing symmetry of four point functions:
$\langle O(0)\, O(x)\, O(1)\, O(\infty) \rangle = \sum_{O_s} |C_{OOO_s}|^2\, x^{h_s - 2h_O}\, \bar x^{\bar h_s - 2\bar h_O}$
$\qquad = \sum_{O_t} |C_{OOO_t}|^2\, (1-x)^{h_t - 2h_O}\, (1-\bar x)^{\bar h_t - 2\bar h_O}$ (1.7)
where in the first and second lines we have expanded in a basis of intermediate operators in the S-channel and T-channel, respectively. In this simple version of the computation the sums run over all operators in the theory, both primaries and descendants, and we are not organizing the states into representations of the conformal group. The functions $x^{h_s - 2h_O}$ and $(1-x)^{h_t - 2h_O}$ play the role of conformal blocks in the S- and T-channel, respectively. This four point function has a pole at $x = 1$ coming from the operator $\mathbb{1}$ in the T-channel, which allows us to determine the asymptotic behaviour of the S-channel expansion coefficients $|C_{OOO_s}|^2$ when $h_s$ is large. We do so by expanding the T-channel conformal block of the identity operator into S-channel blocks:
$(1-x)^{-2h_O} = \sum_{n=0}^{\infty} \binom{2h_O+n-1}{n}\, x^n$ (1.8)
The binomial coefficient $\binom{2h_O+n-1}{n}$ appearing in this expression is a simple example of a crossing kernel: the coefficients which appear when we expand a conformal block in one channel in terms of conformal blocks in a dual channel.$^5$ Comparing the two channel decompositions of our correlation function, we see that our crossing kernel must equal the average value of the OPE coefficients at $h_s = 2h_O + n$ in the limit where the operator $O_s$ is heavy:
$\overline{|C_{OOO_s}|^2}_{\rm scaling} \sim \binom{2h_O+n-1}{n} \binom{2\bar h_O+\bar n-1}{\bar n}$, with $h_s = 2h_O + n$, $\bar h_s = 2\bar h_O + \bar n$ (1.9)
The subscript 'scaling' reminds us that, as we did not organize into representations of the conformal group, the average here is over all heavy operators $O_s$, both primaries and descendants, of dimensions $h_s$, $\bar h_s$.
$^4$ As we will elaborate on below, "heavy" in this context means that $h$ and $\bar h$ are much larger than both the central charge and the dimensions of the other operators which are held fixed. For this reason the three different regimes described above are distinct, and there is a priori no reason to expect to get the same result in each regime.
We have also not specified the exact nature of the average which is being taken, i.e. over how wide a range of operators one must average in order for the result (1.9) to hold. We will return to this subtlety below.
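The toy crossing kernel above is easy to check numerically. The following is a minimal sketch, assuming only the standard generalized binomial series for $(1-x)^{-2h_O}$; the function names are hypothetical and chosen for this illustration. It verifies that the S-channel sum reproduces the T-channel identity block, and that the kernel grows like $n^{2h_O-1}/\Gamma(2h_O)$ at large $n$.

```python
import math


def crossing_coefficient(two_h: float, n: int) -> float:
    """Generalized binomial C(2h + n - 1, n) = Gamma(2h + n) / (Gamma(2h) n!).

    Computed with log-gamma so that large n does not overflow. Requires 2h > 0.
    """
    return math.exp(
        math.lgamma(two_h + n) - math.lgamma(two_h) - math.lgamma(n + 1)
    )


def s_channel_sum(two_h: float, x: float, n_max: int) -> float:
    """Truncated S-channel expansion of the T-channel identity block."""
    return sum(crossing_coefficient(two_h, n) * x**n for n in range(n_max + 1))


def t_channel_identity_block(two_h: float, x: float) -> float:
    """The identity 'block' (1 - x)^(-2h) of the scaling-block example."""
    return (1.0 - x) ** (-two_h)


def large_n_estimate(two_h: float, n: int) -> float:
    """Leading large-n growth of the kernel: n^(2h - 1) / Gamma(2h)."""
    return n ** (two_h - 1.0) / math.gamma(two_h)


if __name__ == "__main__":
    two_h, x = 1.5, 0.3  # illustrative values, not taken from the text
    print(s_channel_sum(two_h, x, 200), t_channel_identity_block(two_h, x))
    n = 10**6
    print(crossing_coefficient(two_h, n) / large_n_estimate(two_h, n))
```

The power-law growth in $n$ is the scaling-block counterpart of the heavy-operator asymptotics discussed in the text.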
In order to determine the asymptotic behaviour of primary operator OPE coefficients we must improve this computation by organizing the sum over intermediate states into a sum over representations of the conformal group. This is accomplished by taking $O_s$ and $O_t$ above to be primary operators and replacing the functions $x^{h_s - 2h_O}$ and $(1-x)^{h_t - 2h_O}$ by the appropriate conformal blocks. We then expand the identity block for the T-channel in terms of the S-channel blocks for heavy operators, exactly as in (1.8). The average value of the primary operator OPE coefficients is then given by the analog of the binomial coefficient appearing in this expansion. As conformal blocks for Virasoro symmetry are not known analytically one might think that this computation is impossible. Remarkably, this is not the case, as Ponsot and Teschner obtained explicit (but complicated) expressions for the crossing kernel of Virasoro blocks for four-point functions [21,22].$^6$ However, when we take the operator in the T-channel to be $\mathbb{1}$ these crossing kernels simplify considerably, and they are essentially given by our expression (1.5).
This computation will be carried out in more detail below, but already several features are apparent. The first is that, as conformal blocks are purely kinematic objects (i.e. they depend on the central charge and the dimensions of the operators under consideration but not on which theory we are studying), the crossing kernels are purely kinematic as well. This guarantees that our resulting asymptotic formula will be universal, in the sense that it depends only on the central charge but not on any other details of the theory. The second is that, from this point of view, conformal blocks can be bypassed altogether and one can work directly with crossing kernels. In particular, as long as one is interested in understanding the constraints that crossing symmetry imposes on the dynamical data of a CFT (the spectrum and OPE coefficients) the conformal blocks represent an unnecessary complication. Blocks are only needed if one wishes to extract an observable, such as a correlation function, from this basic dynamical data.
$^5$ Note however that this crossing kernel is only supported on a discrete set of intermediate operator weights (namely $h_s = 2h_O + n$ for $n$ a non-negative integer); this is similar to the situation for global $SL(2,\mathbb{R})$ conformal blocks, which can be expanded as a sum over double-twist blocks and their derivatives in the cross channel (see [20] for an explicit decomposition). This is unlike the case of Virasoro blocks that will be the subject of this paper, as the cross-channel decomposition of the Virasoro block will typically involve a continuum.
$^6$ The higher-dimensional analog of the Virasoro fusion kernel is the 6j symbol for the principal series representations of the Euclidean global conformal group $SO(d+1,1)$ [23], which serves as a crossing kernel for conformal partial waves.
The above discussion shows that crossing symmetry of four point functions will determine the asymptotic behaviour of OPE coefficients in the limit where one operator is taken to be heavy and the others are held fixed. In order to obtain other constraints, we must consider crossing symmetry and modular invariance for more general observables. The most general observable is an n-point correlation function of Virasoro primaries on a Riemann surface of genus $g$, which we will denote $G_{g,n}(\{q_i\})$, where the $q_i$ are a set of continuous variables which parameterize the moduli of the Riemann surface as well as the locations of the insertion points of these primary operators. We then expand this observable as a sum over intermediate operators propagating in a particular channel, as (1.10) Here the $\{O_j\}$ are the internal operators which contribute to this observable, and the $C_{\{O_j\}}$ are the corresponding products of OPE coefficients. We are organizing into conformal families, and the conformal block $F(\{P_j\}|\{q_i\})$ encodes the contribution of all descendants of the operators $\{O_j\}$. As the conformal blocks are kinematic, they depend only on the spins and dimensions of the operators $\{O_j\}$, which we are writing in terms of the parameters $\{P_j\}$ defined by equation (1.6). In order to keep the notation compact, in this formula $\{P_j\}$ and $\{q_i\}$ denote both the holomorphic and anti-holomorphic weights of the internal operators and moduli of the punctured Riemann surface, and the block $F(\{P_j\}|\{q_i\})$ includes contributions from both left- and right-moving descendants. For simplicity we have suppressed the dependence on the external operators. In the last line we have introduced a "density of OPE coefficients" $\rho$ which is a function only of the $P_j$.$^7$ In (1.10) we have reduced the correlation function to a sum of products of OPE coefficients.
On a higher genus Riemann surface this is in principle a complicated procedure, as one must decompose the Riemann surface into pairs-of-pants and then sum over internal operators which propagate through the cuffs of these pairs of pants. This makes the computation of the conformal blocks quite difficult. The advantage of our approach is that by working directly with crossing kernels rather than conformal blocks, almost all of the details of this construction are irrelevant. Thus it is possible to understand the constraints of modular invariance and crossing symmetry without the need to explicitly construct the Riemann surface.
$^7$ Strictly speaking $\rho$ is a distribution rather than a function. Moreover, the $P_j$ will be either real or purely imaginary depending on dimensions and spins of the operators $O_j$, and the definition of the integral in (1.10) includes contributions from all states.
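The statement that the $P_j$ are either real or purely imaginary follows directly from the parameterization (1.6), since $h = \alpha(Q-\alpha) = Q^2/4 + P^2$ with $Q^2/4 = (c-1)/24$. A minimal sketch of this bookkeeping, with hypothetical helper names, is:

```python
import cmath
import math


def liouville_Q(c: float) -> float:
    """Q defined by c = 1 + 6 Q^2 (requires c > 1)."""
    return math.sqrt((c - 1.0) / 6.0)


def liouville_b(c: float) -> complex:
    """A root of b + 1/b = Q.

    For c >= 25 this root is real with 0 < b <= 1; for 1 < c < 25 it is a
    pure phase, |b| = 1. The other root is 1/b.
    """
    Q = liouville_Q(c)
    return (Q - cmath.sqrt(Q * Q - 4.0)) / 2.0


def momentum(c: float, h: float) -> complex:
    """P with h = Q^2/4 + P^2.

    Real for h >= (c - 1)/24 and purely imaginary below that threshold.
    """
    return cmath.sqrt(h - (c - 1.0) / 24.0)


def weight(c: float, P: complex) -> complex:
    """h = alpha (Q - alpha) with alpha = Q/2 + i P, as in (1.6)."""
    Q = liouville_Q(c)
    alpha = Q / 2.0 + 1j * P
    return alpha * (Q - alpha)


if __name__ == "__main__":
    c = 30.0  # illustrative value
    print(momentum(c, 0.0))   # identity operator: purely imaginary, i Q / 2
    print(momentum(c, 5.0))   # heavy operator: real
```

In this convention the identity operator sits at $P = iQ/2$, while all operators with $h \geq (c-1)/24$ have real $P$.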
We now wish to compare this to the expansion of our observable in another channel: (1.12) Here we denote the OPE coefficients, the Virasoro conformal blocks, and the OPE spectral density in this alternate channel with a tilde. We have also denoted the moduli on which the conformal blocks depend with a tilde to emphasize that the blocks in different channels typically admit perturbative expansions in different parameterizations of the moduli. In general the relationship between the two coordinate systems $q_i$ and $\tilde q_i$ on moduli space is quite complicated. Our strategy of working entirely with crossing kernels ensures, however, that we never need to determine this relationship explicitly.
Associativity of the operator product expansion implies that our two different operator product expansions must agree. We then compare these two different expansions by introducing the crossing kernel K defined by: (1.13) Plugging this into equation (1.10) and comparing with (1.12) gives us the crossing equation. (1.14) In cases where the same OPE data appears in both channels, the solutions to the crossing equation are the unit eigenvectors of the crossing kernel.
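The logic of this step can be summarized schematically as follows. This is a sketch rather than a transcription of (1.10)-(1.14): the integration measure, the ordering of the indices on $K$, and the label $R_k$ for the internal weights in the second channel are assumptions made for this illustration.

```latex
\begin{aligned}
G_{g,n}
 &= \int \! d\{P_j\}\; \rho(\{P_j\})\, F(\{P_j\}|\{q_i\})
  = \int \! d\{R_k\}\; \tilde\rho(\{R_k\})\, \tilde F(\{R_k\}|\{\tilde q_i\}), \\[4pt]
F(\{P_j\}|\{q_i\})
 &= \int \! d\{R_k\}\; K_{\{R_k\}\{P_j\}}\, \tilde F(\{R_k\}|\{\tilde q_i\}), \\[4pt]
\tilde\rho(\{R_k\})
 &= \int \! d\{P_j\}\; K_{\{R_k\}\{P_j\}}\, \rho(\{P_j\}) .
\end{aligned}
```

The last line follows by substituting the second line into the first expansion and comparing the coefficients of $\tilde F$; the blocks themselves never need to be evaluated.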
We now wish to extract universal features of the OPE coefficients $C_{\{O_j\}}$ by considering limits where the identity operator dominates in one channel. In particular, we would like to consider cases where the right hand side of the crossing equation (1.14) is dominated by the identity operator (i.e. dominated by the term with all $O_j = \mathbb{1}$) when the internal weights $R_k$ are taken to infinity. This will occur when (1.15) In this limit the density of OPE coefficients is just given by the corresponding crossing kernel of the identity operator: $\tilde\rho(\{R_k\}) \sim K_{\{R_k\}\mathbb{1}}$ (1.16) This is the generalization of our earlier result (1.9), that the crossing kernel of the identity operator serves as the universal asymptotic behaviour of the OPE coefficients for heavy states.
We emphasize that, although we have phrased it more abstractly, this is equivalent to the familiar strategy where one studies the crossing equation in a kinematic regime in which the exchange of the identity operator dominates in one channel. For example, in the case of the four-point function the limit we are considering is equivalent to the one where the cross ratio $x \to 1$. Similarly, the application of this strategy to the torus partition function gives Cardy's formula. A final example is the lightcone bootstrap [24,25], where the spectrum and OPE data of CFT$_{d>2}$ approach those of mean field theory at large spin. However, these arguments typically require the detailed knowledge of conformal blocks in certain Lorentzian kinematic regimes, which in the Virasoro case is out of reach except in the simplest cases. The advantage of our approach is that we only require the crossing kernel, bypassing the need to compute the conformal blocks explicitly.
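The torus partition function gives the simplest instance of this strategy. As a sketch, we assume the standard closed form for the modular S kernel of the Virasoro vacuum character, $4\sqrt{2}\,\sinh(2\pi b P)\sinh(2\pi P/b)$, with a flat measure $dP$ on $P \geq 0$; the normalization depends on this measure convention, and the function names are hypothetical. The code checks that the kernel grows as $\sqrt{2}\,e^{2\pi Q P}$, and that $2\pi Q P$ is the familiar Cardy exponent when written in terms of $c$ and $h$.

```python
import math


def vacuum_modular_kernel(P: float, b: float) -> float:
    """Assumed closed form of the identity modular S kernel (see lead-in)."""
    return (
        4.0 * math.sqrt(2.0)
        * math.sinh(2.0 * math.pi * b * P)
        * math.sinh(2.0 * math.pi * P / b)
    )


def cardy_exponent_from_P(P: float, b: float) -> float:
    """Leading exponent 2 pi Q P with Q = b + 1/b."""
    return 2.0 * math.pi * (b + 1.0 / b) * P


def cardy_exponent_from_weights(c: float, h: float) -> float:
    """The same exponent written as 2 pi sqrt((c-1)/6 * (h - (c-1)/24))."""
    return 2.0 * math.pi * math.sqrt((c - 1.0) / 6.0 * (h - (c - 1.0) / 24.0))


if __name__ == "__main__":
    b, P = 0.8, 3.0                      # illustrative values
    Q = b + 1.0 / b
    c = 1.0 + 6.0 * Q * Q
    h = Q * Q / 4.0 + P * P
    log_kernel = math.log(vacuum_modular_kernel(P, b))
    print(log_kernel - cardy_exponent_from_P(P, b))   # tends to log(sqrt(2))
    print(cardy_exponent_from_weights(c, h) - cardy_exponent_from_P(P, b))
```

The only input is the identity kernel; no character or conformal block is evaluated.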

The Moore-Seiberg construction of crossing kernels
We now wish to apply this construction to constrain the asymptotics of the squared OPE coefficients $|C_{ijk}|^2$. To begin, recall that $C_{ijk}$ is the correlation function $\langle O_i O_j O_k \rangle_{S^2}$ on the sphere, with the operators inserted at three points. Thus to study $|C_{ijk}|^2$ we must consider observables obtained by sewing together two copies of the sphere at these insertion points. For example, the four point function on the sphere is obtained by sewing together these two spheres at a single point (say, the insertion point of the operators $O_k$) to give:$^8$ (1.17) where $F(P_k|x)$ is an appropriate holomorphic conformal block. Applying the crossing arguments of the previous section will then lead to an asymptotic formula for the $|C_{ijk}|^2$ in the limit where $O_k$ is taken to be heavy but the operators $O_i$ and $O_j$ are held fixed. Similarly, we can sew together the spheres at a pair of points, the locations of the operators $O_j$ and $O_k$, to obtain the two point function on the torus: (1.18) where $F(P_j, P_k|\tau, v)$ is now a conformal block for two point functions on the torus. This will lead to an asymptotic formula for $|C_{ijk}|^2$ in the limit where both $O_j$ and $O_k$ are heavy and $O_i$ is fixed. Finally, sewing together all three insertion points gives the genus two partition function: (1.19) where $q$ is a collection of genus two modular parameters and $F(P_i, P_j, P_k|q)$ is a holomorphic genus two conformal block. This will lead to an asymptotic formula which is valid when all of the operators are taken to be heavy. The strategy described above is only useful, however, if we can accomplish two things: we first need to find a dual channel where the identity operator dominates, and we must then compute the relevant crossing kernels.
To accomplish this we will follow the strategy of Moore and Seiberg [26], who argued that all of the constraints of the associativity of the OPE are completely captured by crossing symmetry of four point functions on the sphere and modular covariance of one-point functions on the torus. This is because any crossing transformation for any observable can be constructed by composing "elementary" crossing transformations: four point crossing on the sphere (or fusion), and modular transformations for one-point functions on the torus (along with braiding, which we will not use in this paper). The crossing kernels for these elementary crossing transformations were written down explicitly in [21,22,27,28]. Thus, by assembling these together using the Moore-Seiberg construction, we can obtain explicit formulas for general crossing transformations, such as those on higher genus Riemann surfaces, without ever computing a conformal block.
We will write this down very explicitly below, but the general strategy is easy to understand. The two elementary crossing transformations we use can be represented pictorially as in figure 1. The first of these is the crossing transformation for four point functions on the sphere, where we have chosen to represent the four external operators by holes rather than infinitesimal points. The S- and T-channel decompositions of the four point function then correspond to the two different ways of constructing this four-holed sphere as two pairs-of-pants glued together shown above. Similarly, the second picture in figure 1 describes the crossing transformation between two different channels for a one-point function on the torus.
We can now construct crossing transformations for two point functions on the torus by composing these elementary transformations, as in figure 2. We recognize the first of these as the modular S transformation for one point functions on the torus, and the second as the fusion move for four point functions on the sphere. The result is an expression for this more complicated crossing kernel as a product of these two elementary kernels. Indeed, we recognize the channel on the far right as precisely the one which gives the square of the OPE coefficients in equation (1.18). The asymptotic formula for $|C_{ijk}|^2$ is then obtained by considering the kinematic limit which is dominated by the identity operator $\mathbb{1}$ propagating in the channels (marked by yellow circles) on the far left.
We can construct the crossing transformations at genus two in a similar manner, as in figure 3: we have first done two crossing moves for torus one point functions, followed by a four-point crossing move on the sphere. Again, the channel on the far right gives the square of the OPE coefficients considered in equation (1.19) where the operators $O_i$, $O_j$ and $O_k$ propagate through the three blue circles. The asymptotic formula for $|C_{ijk}|^2$ when these three operators are taken to be heavy is found by considering the limit where the identity operator $\mathbb{1}$ dominates in the channel decomposition depicted on the far left. This formula is given in terms of a genus two crossing kernel which, by construction, is a product of the elementary crossing kernels which were written down by Ponsot and Teschner.
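The composition of elementary moves can be summarized by the usual rule for composing integral transforms. The following is a schematic sketch only: the labels $K^{(1)}$, $K^{(2)}$ and the index ordering are assumptions made for this illustration, and the explicit kernels are those of the cited references.

```latex
F(P) = \int \! dP'\; K^{(1)}_{P'P}\, F'(P'), \qquad
F'(P') = \int \! dR\; K^{(2)}_{RP'}\, F''(R)
\quad\Longrightarrow\quad
F(P) = \int \! dR \left[ \int \! dP'\; K^{(2)}_{RP'}\, K^{(1)}_{P'P} \right] F''(R) .
```

When successive moves act on different internal legs of the channel decomposition, as in the sequences described above, no intermediate label is shared and the bracket reduces to a plain product of elementary kernels.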
The result is an asymptotic formula for the averaged OPE coefficients $|C_{ijk}|^2$ in the three limits described above, where either one, two or all three operators are taken to be heavy, and only the heavy operators are averaged over. For example, in the case where the differences between the heavy operator dimensions and all spins $J_i$ are held fixed in the large-dimension limit, we can state all of our asymptotic formulas as follows:$^9$
Remarkably, all of these formulas (appearing in equations (4.4), (4.12), (4.13), (4.20) and (4.21)) are realized as limits of the same underlying formula (1.5). This is perhaps the most surprising feature of our result, and is a consequence of the Moore-Seiberg procedure which constructs all of these different crossing kernels from the same elementary building blocks.
$^9$ In addition to these, there are other distinct asymptotic limits, for example fixing the ratios of $\Delta_i$ instead of differences as in (4.12) and (4.20), which are also controlled by (1.4).

Generalizations to other observables
We emphasize that, although we have applied our strategy to the computation of the asymptotics of the $|C_{ijk}|^2$, this argument works much more generally. Whenever one can find a kinematic limit where the identity block dominates a CFT observable, there is a corresponding universal formula for the OPE data in the dual channel; it is just a matter of assembling the appropriate crossing kernel. In this sense our strategy should be regarded as defining an entire class of CFT asymptotic formulas which govern the universal dynamics of heavy operators in two dimensional CFTs. It would clearly be worth exploring these dynamics in more detail.
In addition, while our main focus is on universal asymptotic formulas (namely those which are constructed only from the propagation of the identity operator in a cross channel), one can also consider non-universal quantities which are constructed from other light operators propagating in a cross channel. For example, the leading corrections to the universal formulas described above will come from the other light operators in the theory, and one can obtain improved (but non-universal) asymptotic formulas which depend on the data (such as the spectrum and OPE coefficients) of whatever light operators are present in the theory.
The most interesting example of this type would be one where the contribution from $\mathbb{1}$ in the cross channel vanishes, in which case the asymptotic behaviour would be non-universal and depend on the light data of the theory. The prototypical example is the average value of the Light-Heavy-Heavy OPE coefficient $C_{iij}$, where the state $i$ is heavy and averaged over, while the state $j$ is held fixed. This is determined by considering the modular covariance of one point functions $\langle O_j \rangle_{T^2}(\tau)$ on the torus in the limit $\tau \to 0$ [5]. The contribution from the identity operator propagating in the dual channel (i.e. taking $\tau \to -1/\tau$) is just the one-point function of $O_j$ on the plane, which vanishes. The first non-vanishing contribution will come from the lightest operator $\chi$ which has $C_{j\chi\chi} \neq 0$. Previous results have either worked only at large central charge, or have organized into scaling blocks or global blocks, rather than full conformal blocks (so that the average in $C_{iij}$ is an average over quasi-primaries or over all states in the theory, rather than over Virasoro primaries) [5]. We can now write down the complete answer at finite central charge, where the average is taken only over primaries; this will be discussed in Section 6.

Large central charge limit
One important special case is the large central charge limit, which is relevant for holographic theories with an AdS gravity dual. In this case a generic heavy state is interpreted as a microstate of a BTZ black hole. The observation that the average OPE coefficients take a universal form then has a natural physical interpretation, as the emergence of a semi-classical black hole geometry which arises upon coarse-graining over heavy states. That our formula depends only on the central charge and the dimensions and spins of the operators reflects the fact that this semi-classical configuration is purely geometric: the holographically computed OPE coefficient depends on Newton's constant and the masses and spins of the objects under consideration, but not on any other details of the state. Our formulas can thus be regarded as an extrapolation of the usual gravitational "no hair" theorems to CFT. Indeed, various limits of our formula have already been shown to reproduce the classical dynamics of particles in black hole backgrounds [4][5][6][7][10], and appear in closely related gravitational computations of semiclassical conformal blocks [29,30]. We note that from the point of view of classical gravity it is not at all obvious that there should be a single formula that interpolates between the three different limits we are considering (where either one, two or all three of the operators are taken to be heavy). Indeed, our formula reflects this: it smoothly interpolates between these three limits at finite $c$, but not after taking a $c \to \infty$ limit.
Perhaps the most important point to emphasize here is that, as we take c → ∞, the "heavy" operators appearing in our formula should still be understood to have dimension large compared to c. This is necessary in order for the identity operator to still dominate in the dual channel. Such a state, however, will be interpreted as a black hole whose horizon area is very large in AdS units. A black hole whose size is order one in AdS units would correspond to an operator whose dimension is order c. It is therefore natural to ask under what circumstances the regime of validity of our asymptotic formulas could be extended to operators with finite h/c in the large c limit. Generically, this will only happen if we impose severe restrictions on the "light" data in our theory. For example, the regime of validity of Cardy's formula can be extended all the way down to dimensions of order c only if the density of states of the light spectrum is sufficiently sparse [31]. It would be interesting to ask whether similar considerations could be applied to our asymptotic formulas. We expect that the corresponding sparseness constraint will be considerably more subtle, however, and may require more than just a constraint on the density of OPE coefficients of light operators -see [30,32] for discussions of this in the context of higher genus partition functions of symmetric product orbifolds and holographic CFTs.

Chaos, integrability and eigenstate thermalization
Our results have an important role to play in the study of chaos in two dimensional CFTs. To see this, we first note that while we have written formulas of the form (1.23), we have not yet stated precisely what range of states one must average over. The weakest possible statement would be that our asymptotic formula is true only in an integrated sense, where rather than averaging over a small window of states one simply sums over all states below some (large) cutoff. We expect, however, that a much stronger version is true, where one needs to integrate only over a small window; results that establish this kind of behaviour go under the general name of Tauberian theorems (see e.g. [16,33,34,35,36,37] for recent applications of Tauberian theorems in this context). In the present case we would require new results for several variables, adapted to the Virasoro crossing transforms. This is an important avenue for future research, which is not merely a mathematical subtlety but a question of important physical interest.
In particular, our expectation is that in a generic, chaotic theory one would need to average only over a very small window in order to obtain the asymptotic result (1.23). In other words, in a chaotic theory the typical OPE coefficient should be rather close to the average one. In an integrable theory, however, many OPE coefficients will vanish due to selection rules, so any average result is obtained only by including many different states in the average. We expect that in a chaotic theory one would need to average over a window of size not much larger than $e^{-S}$, where $S$ is the microcanonical entropy, while in an integrable theory one must average over a window of some fixed width rather than one that is exponentially small at high energies. It is important to emphasize that all of our results are derived from crossing and modular constraints which hold in any CFT. Thus our result (1.23) will be equally true in integrable and chaotic theories. The crucial difference will be in the way in which this average is realized. Indeed, we would propose that the size of the window one must average over should be used as a sharp criterion for chaos in conformal field theory: a chaotic theory is one where one needs to average only over windows of size $O(e^{-S})$. It would be interesting to compare this to other proposed characterizations of chaos in quantum field theory.
Indeed, our asymptotic formulas also play an important role in the Eigenstate Thermalization Hypothesis (ETH) [38,39], which states that in a chaotic theory the matrix elements of an operator $O$ take the form $\langle i|O|j\rangle = f_O(E_i)\,\delta_{ij} + g_O(E_i,E_j)\,R_{ij}$ (1.24) for states $i$ and $j$ of fixed energy density in a large volume thermodynamic limit. Here, $f_O$ and $g_O$ are smooth functions of energy related to the microcanonical one- and two-point functions, and $R_{ij}$ is a random variable of zero mean and unit variance; if the one- and two-point functions are of order one, then $f_O$ is of order one and $g_O$ of order $e^{-S/2}$. In a scale-invariant theory, the large volume thermodynamic limit is equivalent to a large energy limit at fixed volume, which is the heavy limit we have been studying. When $O$ is a local operator, ETH is a statement about the statistics of structure constants (see [19, 40-53] for more detailed discussion of ETH in the context of conformal field theories).
In a two dimensional CFT it is natural to take this to be a statement about primary operator OPE coefficients; descendant state OPE coefficients are completely determined by Ward identities, and hence by definition do not provide any information about the chaotic dynamics of the theory. Indeed, dynamics within a particular Virasoro representation will never thermalize due to the infinitude of conserved quantities. At infinite central charge this distinction is largely irrelevant, as the typical high energy state is, if not a primary state itself, then very close to one. For CFTs with finite $c$, however, these considerations become important and the most sensible definition of ETH is one where (1.24) is interpreted as a statement about the statistics of primary operators.
In this case our asymptotic formulas for the averages of $C_{Oii}$ and $|C_{Oij}|^2$ determine the functions $f_O$ and $g_O$. Thus our formulas provide a precise formulation of ETH for CFTs with finite central charge $c$. It is important to emphasize that our asymptotic formulas predict the form of the smooth functions $f_O$ and $g_O$ (and provide the consistency check that $|C_{Oij}|^2 \sim e^{-S}$), but say nothing about the statistics of the remainder term $R_{ij}$. The statement that $R_{ij}$ has zero mean and unit variance, severely constraining the fluctuations of matrix elements, is an important component of ETH and one which is invisible using the techniques of this paper. Indeed, all CFTs are crossing invariant, so no argument based on crossing symmetry alone can distinguish between a chaotic and an integrable theory. Our arguments establish the universal behaviour of averaged OPE asymptotics, and so are not sensitive to the fine-grained statistics of individual eigenstates. Some additional input must be included in order to use crossing arguments to probe this more refined structure of ETH. One might hope that assuming no additional currents would be sufficient to ensure the theory is chaotic, but while we make use of this assumption to establish universal formulas that apply at large spin, it is not clear how to use it to say more about statistics of OPE coefficients relevant for ETH.
An important feature of the ETH formula is that it is expected to govern the statistics of OPE coefficients in the Heavy-Heavy-Light limit, where the operators $i$ and $j$ are heavy but $O$ is fixed. On the other hand, our asymptotic formulas for OPE coefficients smoothly interpolate between this limit and the Light-Light-Heavy and Heavy-Heavy-Heavy regimes. This immediately suggests that the ETH conjecture (1.24) should be generalized to these regimes as well. It also suggests that a version of ETH should hold not just at large dimension, but also for operators with large spin at fixed twist. We expect this extended regime of validity to be a special feature of CFTs (where there is a state-operator correspondence) rather than general QFTs. One intriguing aspect of this conjecture is that while the Heavy-Heavy-Light version of ETH has a natural thermodynamic interpretation (it captures the intuitive notion that in a chaotic theory every state should be approximately thermal in the thermodynamic limit), the interpretation of equation (1.24) in this extended regime is much more mysterious.
A second important point is that the behaviour of the two functions $f_O$ and $g_O$ is quite different in two dimensional CFTs from their behaviour in higher dimensions. In a higher dimensional theory the diagonal terms in the OPE coefficients are exponentially larger than the off-diagonal terms: $f_O$ is of order one, while $g_O \sim e^{-S/2}$ is exponentially suppressed. In a two dimensional CFT this behaviour is modified, as $f_O$ itself is exponentially small. This can be seen by noting that at high temperature a thermal one point function becomes a one point function on the cylinder $S^1\times\mathbb{R}$, which is (by the usual radial quantization map) conformally equivalent to the plane. Hence thermal one point functions will be exponentially small at high temperature, with exponent determined by the dimension of the lightest operator which couples to the operator $O$. Thus we expect that the off-diagonal terms for a generic primary operator $O$ will be exponentially suppressed relative to the diagonal terms, but with an exponent that is not set by the entropy alone but rather is determined by the size of the gap in the theory. This is a consequence of the strange fact that in CFT$_2$ thermal one point functions vanish at high temperature, while thermal two point functions do not.
In the extreme case, where the size of the gap in the theory is sufficiently large, the off-diagonal terms will be the same size as the diagonal terms. We will clarify this statement in section 6 and show that this will occur when the lightest non-vacuum primary that couples to $O$ has dimension greater than or equal to $\frac{c-1}{16}$ (in the case that this lightest operator is a scalar). This fact will be a simple consequence of the structure of the corresponding crossing kernels. A theory with a gap of size $O(c)$ would be interpreted as a theory of pure gravity in AdS$_3$ in the large $c$ limit, as the spectrum of perturbations around empty AdS would include only boundary gravitons (i.e. descendants of the identity operator). We therefore come to a remarkable conclusion: a theory of pure gravity in AdS$_3$ is precisely one where the off-diagonal terms in ETH are not suppressed relative to the diagonal ones. This provides an intriguing link between black hole dynamics and quantum chaos. A similar conclusion was recently reached for JT gravity in two dimensions in [54].

Discussion
Before moving on to a derivation of our formula, we discuss a few final interesting features of our result.
We first note that our asymptotic formula (1.5) is very similar to the DOZZ formula for the structure constants of Liouville theory [55,56]. The exact relation is: where µ is the Liouville cosmological constant and The only difference is a universal normalization factor. That the structure constants of Liouville theory can be written in terms of the vacuum fusion kernel follows purely from Virasoro representation theory; indeed there is a precise sense in which Liouville theory can be regarded as the diagonal A-series Virasoro minimal model for c > 1 [57].
One might therefore be tempted to interpret our result as describing the precise sense in which Liouville theory captures the universal dynamics of heavy operators, a point of view that has been advocated in the context of holographic theories in [58,59]. We should not, however, interpret this too literally, since $C_{\rm DOZZ}$ has a very different interpretation to $C_0$: Liouville theory has only scalar primary operators, with OPE coefficients $C_{\rm DOZZ}$, whereas our results give OPE coefficients for all spins, from a square of $C_0$ (a product of left- and right-moving factors). Indeed, a unitary compact CFT with $c>1$ will necessarily contain primary operators with arbitrarily large spin [60], and Liouville theory falls outside the scope of our asymptotic formula precisely because it is not compact. Rather, we regard the relation (1.26) as a consequence of the fact that Liouville dynamics is governed by precisely the same Virasoro representation theory that determines our asymptotic formula.
While our asymptotic formula (1.5) might look arbitrary, it is in fact extremely highly constrained if we assume analyticity. In fact, equation (1.5) is almost completely determined by its analytic structure and simple physical considerations. To see this, we note that $C_0(P_i,P_j,P_k)$ is a meromorphic function of its arguments which has both zeros and poles, and is invariant under reflections $P_i\to -P_i$ and permutations of the $(P_i,P_j,P_k)$. These zeros occur precisely when $O_i$ has a null Virasoro descendant at level $rs$. The poles occur precisely when the weights of $O_i$ are equal to the weights of a double twist operator built out of $O_j$ and $O_k$ [20]. A meromorphic function is uniquely determined by its poles and zeros, up to the exponential of a polynomial. Thus in retrospect, once one postulates the existence of a meromorphic function that interpolates between the asymptotic regimes, one could have completely determined $C_0(P_i,P_j,P_k)$ up to the exponential of a polynomial in the $(P_i,P_j,P_k)$, simply by demanding the existence of zeros at null states and poles at double twist operators. One might even argue that this polynomial must be a constant in order to guarantee the convergence of the operator product expansion (although this argument is subtle because we are varying the $(P_i,P_j,P_k)$ as complex variables independently). This suggests that the function $C_0(P_i,P_j,P_k)$ can be completely determined by analyticity and simple physical constraints. It would, naturally, be quite exciting to apply this simple strategy to other conformal field theories.
We will now move on to the derivation of our result. We begin in section 2 with a detailed warm-up exercise, where we describe the derivation of various versions of Cardy's formula using the crossing kernel for modular transformations. We then proceed to discuss the Moore-Seiberg procedure in more detail in section 3.1, before turning to the elementary crossing kernels in sections 3.2 and 3.3. We apply this to compute higher genus crossing kernels and OPE asymptotics in section 4. Large central charge limits, and comparisons to the literature, are discussed in section 5. Section 6 discusses the computation of the average value of the light-heavy-heavy OPE coefficients using the modular covariance of torus one-point functions. We relegate some details of the elementary crossing kernels and their asymptotics to the appendices.

Cardy's formula from crossing kernels
To illustrate the main idea of the paper, we first revisit the derivation of the Cardy formula for primary states (and its large-spin version [20, 61-64]) using the modular S-matrix, a strategy which we will generalize in later sections. We follow the presentation and notation of [64], which contains some more details and applications. The relationship between the Cardy formula and the modular S-matrix was first elucidated in [65].

Natural variables for Virasoro representation theory
As a preliminary, we introduce a parameterization of the CFT data that turns out to be natural for the representation theory of the Virasoro algebra. The central charge $c$ can be written in terms of a "background charge" $Q$ or "Liouville coupling" $b$ as $c = 1 + 6Q^2$, $Q = b + b^{-1}$ (2.1). We will make the choice that $c>25$ corresponds to $0<b<1$, while $1<c<25$ corresponds to $b$ a pure phase in the first quadrant. To label Virasoro representations we use a variable $P$, or sometimes the equivalent $\alpha = \frac{Q}{2} - iP$, which is related to the more usually seen conformal weight by $h = \left(\frac{Q}{2}\right)^2 + P^2 = \alpha(Q-\alpha)$, and similarly $\bar P$ or $\bar\alpha$ for $\bar h$. Two things about this parameterisation should be noted. First, it is redundant, being invariant under the reflection $P\to -P$ (or $\alpha\to Q-\alpha$). Secondly, it naturally splits unitary values of the weights ($h\geq 0$) into two distinct ranges: $h\geq\frac{c-1}{24}$, which corresponds to real $P$ (or $\alpha\in\frac{Q}{2}+i\mathbb{R}$), and $0\leq h<\frac{c-1}{24}$, which corresponds to imaginary $P$ (or $\alpha\in(0,\frac{Q}{2})$).
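The parameterization above can be made concrete with a short Python sketch. It assumes the conventions $c = 1+6Q^2$, $Q = b+b^{-1}$ and $h = (Q/2)^2 + P^2 = \alpha(Q-\alpha)$; the function names are illustrative and do not come from the text.

```python
import cmath
import math


def liouville_parameters(c):
    """Return (Q, b) with c = 1 + 6*Q**2 and Q = b + 1/b, for c > 1."""
    if c <= 1:
        raise ValueError("this sketch assumes c > 1")
    Q = math.sqrt((c - 1) / 6)
    root = cmath.sqrt(Q * Q - 4)  # real for c >= 25, purely imaginary for 1 < c < 25
    # Pick the root of b**2 - Q*b + 1 = 0 matching the convention in the text:
    # 0 < b <= 1 for c >= 25, and a pure phase in the first quadrant otherwise.
    b = (Q - root) / 2 if c >= 25 else (Q + root) / 2
    return Q, b


def weight_from_momentum(P, c):
    """Conformal weight h = (Q/2)**2 + P**2; P may be real or imaginary."""
    Q, _ = liouville_parameters(c)
    return (Q / 2) ** 2 + P ** 2


def weight_from_alpha(alpha, c):
    """The same weight written as h = alpha * (Q - alpha)."""
    Q, _ = liouville_parameters(c)
    return alpha * (Q - alpha)
```

For example, `weight_from_momentum(0.0, c)` returns the threshold $\frac{c-1}{24}$ separating real from imaginary $P$, and an imaginary momentum $P = iQ/2$ gives $h=0$.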

The partition function and density of primary states
Now consider the torus partition function of a compact$^{10}$ CFT with $c>1$. The partition function encodes the spectrum of the theory, admitting a decomposition into Virasoro characters: $Z(\tau,\bar\tau) = \sum_i \chi_{P_i}(\tau)\,\bar\chi_{\bar P_i}(\bar\tau)$ (2.3). The sum runs over Virasoro primary states labelled by $i$, with conformal weights labelled by $P_i,\bar P_i$, and $\bar\chi$ denotes the corresponding antiholomorphic character. The nondegenerate Virasoro characters $\chi_P$, packaging together all states in a conformal multiplet, are given by $\chi_P(\tau) = \frac{q^{P^2}}{\eta(\tau)}$, where $q = e^{2\pi i\tau}$. The identity character $\chi_{\mathbb{1}}$ is distinguished because the corresponding representation is degenerate ($L_{-1}$ annihilates the vacuum state), so $\chi_{\mathbb{1}}(\tau) = \frac{(1-q)\,q^{-\frac{c-1}{24}}}{\eta(\tau)}$. If there are any other conserved currents (operators with $h=0$ or $\bar h=0$) in the theory, we should similarly use this degenerate character for either the left- or right-moving half.
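We can rewrite the character decomposition of the partition function in terms of a density of primary states $\rho$, writing $Z(\tau,\bar\tau) = \int\frac{dP}{2}\frac{d\bar P}{2}\,\rho(P,\bar P)\,\chi_P(\tau)\,\bar\chi_{\bar P}(\bar\tau)$ (2.6), where $\rho$ is a distribution given by a sum of delta-functions $\delta(P-P_i)\,\delta(\bar P-\bar P_i)$ for each primary.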
Using the reflection symmetry, we make the choice that $\rho$ is an even distribution, so each primary contributes four terms related by reflections in $P,\bar P$, and we introduce the factors of $\frac{1}{2}$ in the integrals to avoid overcounting. It is also convenient to always use nondegenerate characters in the expansion, so for the identity (and other currents, if present), $\rho$ includes delta-functions with negative weight at $P,\bar P = \pm\frac{i}{2}\left(b^{-1}-b\right)$ to subtract the null descendants. Finally, we note that $\rho$ is a somewhat unconventional distribution, since it has support at imaginary values for operators with $h,\bar h<\frac{c-1}{24}$. This is nonetheless rigorously defined if we integrate against analytic test functions, of which the characters should form a complete set in an appropriate topology (see [64] for more details).
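The subtraction of the null descendant can be checked numerically. The sketch below assumes $c>25$ (so that $b$ is real), the weight formula $h = \frac{c-1}{24}+P^2$, and the standard form $\eta(\tau)\chi_{\mathbb 1}(\tau) = (1-q)\,q^{-\frac{c-1}{24}}$ of the vacuum character; all function names are hypothetical.

```python
import math


def identity_momenta(c):
    """Momenta at which rho carries the vacuum and its subtracted null state (c > 25)."""
    if c <= 25:
        raise ValueError("this sketch assumes c > 25 so that b is real")
    Q = math.sqrt((c - 1) / 6)
    b = (Q - math.sqrt(Q * Q - 4)) / 2
    P_vacuum = 0.5j * (1 / b + b)  # weight h = 0
    P_null = 0.5j * (1 / b - b)    # weight h = 1, the null descendant L_{-1}|0>
    return P_vacuum, P_null


def weight(P, c):
    """h = (c - 1)/24 + P**2, equivalent to (Q/2)**2 + P**2."""
    return (c - 1) / 24 + P ** 2


def stripped_identity_character(q, c):
    """eta(tau) * chi_1(tau), written as a difference of nondegenerate terms q**(P**2)."""
    P_vacuum, P_null = identity_momenta(c)
    return q ** (P_vacuum ** 2) - q ** (P_null ** 2)
```

The positive-weight delta-function sits at the vacuum momentum ($h=0$) and the negative-weight one at $P=\frac{i}{2}(b^{-1}-b)$ ($h=1$), so their difference reproduces the factor $(1-q)$ in the degenerate character.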

The modular S-transform
Locality of a CFT implies invariance of the torus partition function under the modular S-transform, $Z(-1/\tau,-1/\bar\tau) = Z(\tau,\bar\tau)$, which in turn constrains the allowed CFT spectrum. We will reformulate this constraint directly on the density of states $\rho(P,\bar P)$. To do this, first note that the modular S-transformation $\tau\to -1/\tau$ acts on individual characters as a Fourier (cosine) transform in the momentum: $\chi_P(-1/\tau) = \int\frac{dP'}{2}\,\chi_{P'}(\tau)\,S_{P'P}[\mathbb{1}]$, with $S_{P'P}[\mathbb{1}] = 2\sqrt{2}\cos(4\pi PP')$ (2.7). The kernel of this integral transform is the 'modular S kernel' $S_{P'P}[\mathbb{1}]$, where the $[\mathbb{1}]$ label indicates that the partition function is a trivial example of the torus one-point function of the identity operator, with the generalization to nontrivial operators to follow. The notation emulates the situation in rational CFTs, where there are a finite number of representations, so the modular kernel $S[\mathbb{1}]$ becomes a finite-dimensional matrix.
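The action of the S-transform on characters can be verified numerically. This Python sketch assumes the standard nondegenerate character $\chi_P = q^{P^2}/\eta(\tau)$, the cosine kernel $2\sqrt2\cos(4\pi PP')$ with measure $dP'/2$, and the identity $\eta(-1/\tau)=\sqrt{-i\tau}\,\eta(\tau)$, so that the $\eta$ factors can be stripped off both sides. The cutoff, step count and function names are illustrative choices.

```python
import cmath
import math


def s_kernel(P_new, P_old):
    """Assumed modular S kernel: 2*sqrt(2)*cos(4*pi*P_new*P_old)."""
    return 2 * math.sqrt(2) * cmath.cos(4 * math.pi * P_new * P_old)


def stripped_character(P, tau):
    """eta(tau) * chi_P(tau) = q**(P**2) with q = exp(2*pi*i*tau)."""
    return cmath.exp(2j * math.pi * tau * P * P)


def transformed_character(P, tau, cutoff=6.0, steps=6000):
    """Trapezoid estimate of the integral over P' of (1/2) * S[P', P] * q**(P'**2)."""
    dP = 2 * cutoff / steps
    total = 0
    for n in range(steps + 1):
        P_new = -cutoff + n * dP
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * 0.5 * s_kernel(P_new, P) * stripped_character(P_new, tau)
    return total * dP


def exact_transformed_character(P, tau):
    """eta(tau) * chi_P(-1/tau) = exp(-2*pi*i*P**2/tau) / sqrt(-i*tau)."""
    return cmath.exp(-2j * math.pi * P * P / tau) / cmath.sqrt(-1j * tau)
```

The two functions agree both for real $P$ and for imaginary $P$ (a light operator), since the Gaussian decay of $q^{P'^2}$ dominates the growth of the kernel whenever $\mathrm{Im}\,\tau>0$.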
Given a function $Z(\tau,\bar\tau)$ expanded in characters using a density of primary states as in (2.6), we can take a modular S-transform and use the kernel (2.7) to rewrite the transformed characters: $Z(-1/\tau,-1/\bar\tau) = \int\frac{dP}{2}\frac{d\bar P}{2}\,\rho(P,\bar P)\int\frac{dP'}{2}\frac{d\bar P'}{2}\,S_{P'P}[\mathbb{1}]\,S_{\bar P'\bar P}[\mathbb{1}]\,\chi_{P'}(\tau)\,\bar\chi_{\bar P'}(\bar\tau)$. Exchanging order of integration between the primed and unprimed variables, we can interpret this as an expansion (2.6) of the modular transformed function with a transformed density of primary states: $\tilde\rho(P',\bar P') = \int\frac{dP}{2}\frac{d\bar P}{2}\,S_{P'P}[\mathbb{1}]\,S_{\bar P'\bar P}[\mathbb{1}]\,\rho(P,\bar P)$ (2.9). Since the partition function uniquely determines the spectrum, this equation expresses the modular S-transform as a Fourier transform acting on the density of primary states $\rho$.$^{11}$ In particular, a physical spectrum corresponding to a modular invariant theory is invariant under this Fourier transform: Modular invariance $\iff$ $\tilde\rho(P,\bar P) = \rho(P,\bar P)$ (2.10). From (2.9), we can think of the modular S-matrix as the contribution of a single operator to the density of states in the transformed channel. The only exception to this is the degenerate representations with $h=0$ (or $\bar h=0$), so we introduce an 'identity S-matrix' $S_{P\mathbb{1}}[\mathbb{1}] = 4\sqrt{2}\,\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$, which encodes the contribution of such a degenerate state. The density of states $S_{P\mathbb{1}}[\mathbb{1}]\,S_{\bar P\mathbb{1}}[\mathbb{1}]$ dual to the vacuum will be of central importance for us.
Footnote 11: We can strip off the characters since, by assumption, they are complete in the relevant space of test functions. This just means that a distribution is defined by its integral against all characters, i.e. its corresponding partition function. The same applies for the more complicated transforms we encounter later.
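The statement that the S-transform acts on densities as a cosine transform, and that invariant densities are its fixed points, can be illustrated with a one-variable toy computation. The sketch assumes the kernel $2\sqrt2\cos(4\pi PP')$ with measure $dP/2$; the Gaussian densities are test functions chosen for convenience, not CFT spectra, and the names are illustrative.

```python
import math


def s_transform(rho, cutoff=6.0, steps=1200):
    """Return P_new -> integral dP/2 * 2*sqrt(2)*cos(4*pi*P_new*P) * rho(P).

    The integral is truncated to |P| <= cutoff and evaluated with the trapezoid
    rule, so rho should be smooth, even, and rapidly decaying.
    """
    dP = 2 * cutoff / steps
    grid = [-cutoff + n * dP for n in range(steps + 1)]
    samples = [rho(P) for P in grid]

    def rho_tilde(P_new):
        total = 0.0
        for n, (P, value) in enumerate(zip(grid, samples)):
            weight = 0.5 if n in (0, steps) else 1.0
            total += weight * math.sqrt(2) * math.cos(4 * math.pi * P_new * P) * value
        return total * dP

    return rho_tilde


def self_dual(P):
    """exp(-2*pi*P**2): a fixed point of the transform (toy example)."""
    return math.exp(-2 * math.pi * P * P)


def generic(P):
    """An even test density that is not invariant under the transform."""
    return math.exp(-3 * P * P)
```

Applying `s_transform` once to `generic` changes it, while applying it twice returns the original even function, reflecting that the S-transform squares to the identity on even densities. The function `self_dual` is unchanged after a single application.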

Cardy formulas
The density of states $\rho(P,\bar P)$ is a sum of delta-functions for each primary operator, so for a modular invariant spectrum, by taking the S-transform we can instead write it as a sum over modular S-matrices: $\rho(P,\bar P) = S_{P\mathbb{1}}[\mathbb{1}]\,S_{\bar P\mathbb{1}}[\mathbb{1}] + \sum_i S_{PP_i}[\mathbb{1}]\,S_{\bar P\bar P_i}[\mathbb{1}]$ (2.12), where the sum runs over non-vacuum primaries. We have not explicitly included any nontrivial primary currents, which would contribute the identity S-matrix in $P$ and the nondegenerate S-matrix in $\bar P$ or vice versa. If such currents are present, it is most natural to organise the states into multiplets of an extended algebra, under which all currents are descendants of the vacuum, and use the modular S-matrix pertaining to the extended algebra.
Now consider this sum in the limit of large $P$ and/or $\bar P$. In this limit, the relative importance of the terms is determined by $P_i,\bar P_i$: for a state with $0<h<\frac{c-1}{24}$, the relevant S-matrix is exponentially suppressed relative to the vacuum: $\frac{S_{PP_i}[\mathbb{1}]}{S_{P\mathbb{1}}[\mathbb{1}]}\approx e^{-4\pi\alpha_i P}$ as $P\to\infty$ (2.13), where $\alpha_i\in(0,\frac{Q}{2})$ labels the weight $h_i=\alpha_i(Q-\alpha_i)$. From this, we find (at least naively; we revisit this more carefully at the end of the section) that the density of states at large $P,\bar P$ asymptotically approaches the vacuum S-matrix: $\rho(P,\bar P)\sim S_{P\mathbb{1}}[\mathbb{1}]\,S_{\bar P\mathbb{1}}[\mathbb{1}]$ as $P,\bar P\to\infty$ (2.14). This is of course nothing but Cardy's formula for the asymptotic density of primary states at large dimension, correct up to corrections exponential in $\sqrt{h},\sqrt{\bar h}$ coming from the lightest non-vacuum primary state.$^{12}$ With this derivation, it becomes clear that the Cardy formula (2.14) is also valid in a 'large spin' regime where we fix $h$ and take $\bar h\to\infty$ [20, 61-64]. In this limit, the relative suppression (2.13) of non-vacuum blocks is controlled by the 'barred' dimension only, so we require the additional assumption of a 'twist gap' ($h$ is bounded away from zero for all non-vacuum operators, so in particular there are no extra conserved currents). In this limit, for any fixed $h>\frac{c-1}{24}$, the density of states grows with spin $\ell$ as $e^{2\pi\sqrt{\frac{c-1}{6}\ell}}$, with a prefactor determined by $\rho_0(P)$; for any $h<\frac{c-1}{24}$, this prefactor is formally zero, which means that the density grows more slowly (perhaps still exponentially in $\sqrt{\ell}$, but with a smaller coefficient).
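The exponential suppression of a light operator's kernel relative to the vacuum kernel can be checked directly. The sketch assumes $c>25$ (real $b$), the nondegenerate kernel $2\sqrt2\cos(4\pi PP_i)$ evaluated at imaginary $P_i$, and the vacuum kernel $4\sqrt2\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$; the names and the sample values of $c$, $h$ and $P$ are arbitrary illustrations.

```python
import math


def real_b(c):
    """Liouville coupling 0 < b < 1 for c > 25."""
    if c <= 25:
        raise ValueError("this sketch assumes c > 25 so that b is real")
    Q = math.sqrt((c - 1) / 6)
    return (Q - math.sqrt(Q * Q - 4)) / 2


def vacuum_kernel(P, c):
    """Assumed identity S-matrix: 4*sqrt(2)*sinh(2*pi*b*P)*sinh(2*pi*P/b)."""
    b = real_b(c)
    return 4 * math.sqrt(2) * math.sinh(2 * math.pi * b * P) * math.sinh(2 * math.pi * P / b)


def light_kernel(P, h_light, c):
    """Nondegenerate S kernel for 0 < h_light < (c-1)/24, where P_i is imaginary."""
    gap = (c - 1) / 24 - h_light  # equals |P_i|**2
    if h_light <= 0 or gap <= 0:
        raise ValueError("expected 0 < h_light < (c - 1)/24")
    return 2 * math.sqrt(2) * math.cosh(4 * math.pi * math.sqrt(gap) * P)


def alpha_of(h_light, c):
    """The root of alpha*(Q - alpha) = h_light lying in (0, Q/2)."""
    Q = math.sqrt((c - 1) / 6)
    return Q / 2 - math.sqrt((c - 1) / 24 - h_light)
```

At $c=30$, $h=0.5$ and $P=3$, the ratio `light_kernel / vacuum_kernel` already matches $e^{-4\pi\alpha P}$ closely, with the residual difference itself exponentially small in $P$.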
We therefore find that the asymptotic spectrum of CFTs is quite generally determined by the simple formula $\rho_0(P)\,\rho_0(\bar P)$, with $\rho_0(P) = S_{P\mathbb{1}}[\mathbb{1}] = 4\sqrt{2}\,\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$, which we refer to as the 'universal density of states' for $c>1$ compact CFTs without extended current algebras. Our derivation emphasizes that this object comes from the representation theory of the Virasoro algebra, describing the decomposition of the trivial representation after modular transformation.$^{13}$ In the remainder of the paper, we will show that another representation theoretic object similarly controls the OPE coefficients in a variety of limits. Now, our argument for the asymptotic formula (2.14) was very imprecise, and indeed the result is simply false if interpreted literally, so we briefly discuss the sense in which it holds. The equation (2.12) expressing the density of states as a sum of modular S kernels does not converge in the usual sense (and uniform convergence would be necessary for our argument to apply immediately), and since $\rho$ is a sum of delta functions, it does not have smooth asymptotic behaviour. Rather, the sum converges in the sense of distributions (it should converge when integrated against any test function), which requires some 'smearing', and the asymptotic formulas should be interpreted accordingly. The most conservative statement is that the formula applies in an integrated sense: the total number of states below a given energy or spin is asymptotic to the integral of the Cardy formula (see [35-37] for a more detailed discussion and rigorous results). In the particular case of the Cardy formula, a very interesting recent paper [35] has shown that if the averaging window is of fixed width in the large dimension limit, corrections due to the finite size of the averaging window only affect the order-one term in the expansion of the logarithm of the density of states at large dimension.
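As a consistency check on the universal density, the following sketch verifies two things for real $b$ (assumed $c>25$): that $4\sqrt2\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$ equals the difference of two nondegenerate cosine kernels at the vacuum and null-state momenta, and that its logarithm grows as the Cardy exponent $2\pi\sqrt{\frac{c-1}{6}\left(h-\frac{c-1}{24}\right)}$. The kernel form and all names are assumptions of this illustration.

```python
import cmath
import math


def real_b(c):
    """Liouville coupling 0 < b < 1 for c > 25."""
    if c <= 25:
        raise ValueError("this sketch assumes c > 25 so that b is real")
    Q = math.sqrt((c - 1) / 6)
    return (Q - math.sqrt(Q * Q - 4)) / 2


def universal_density(P, c):
    """rho_0(P) = 4*sqrt(2)*sinh(2*pi*b*P)*sinh(2*pi*P/b)."""
    b = real_b(c)
    return 4 * math.sqrt(2) * math.sinh(2 * math.pi * b * P) * math.sinh(2 * math.pi * P / b)


def vacuum_from_nondegenerate(P, c):
    """Cosine kernel at the h = 0 momentum minus the kernel at the h = 1 null state."""
    b = real_b(c)
    kernel = lambda P_old: 2 * math.sqrt(2) * cmath.cos(4 * math.pi * P * P_old)
    return (kernel(0.5j * (1 / b + b)) - kernel(0.5j * (1 / b - b))).real


def cardy_exponent(h, c):
    """Leading growth 2*pi*sqrt((c-1)/6 * (h - (c-1)/24)) of log rho_0."""
    return 2 * math.pi * math.sqrt((c - 1) / 6 * (h - (c - 1) / 24))
```

At large $P$ the difference `log(universal_density) - cardy_exponent` approaches the constant $\frac12\log 2$, so the exponent captures the growth up to an order-one prefactor.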
For chaotic theories, we expect the far stronger statement that the asymptotic formula applies to a microcanonical density of states averaged over a small window (we require only that the window contains parametrically many states, so its width can shrink as fast as $e^{-S}$); this is a consequence of the eigenstate thermalization hypothesis (ETH) [38,39]. The exact interpretation of our asymptotic formulas is not the focus of this paper, so we will henceforth leave this aspect for future study.

The Moore-Seiberg construction
In two dimensional CFTs, the most general correlation function of local operators, comprising $n$ operators $O_1,\ldots,O_n$ on a surface $\Sigma_g$ of arbitrary genus $g$ (which we denote by $G_{g,n}$), can be formulated entirely in terms of the basic data of the theory, namely the spectrum and OPE coefficients of primary operators.$^{14}$ Note that this is far better than the situation in higher dimensions, where it is unclear how to determine general correlation functions, even on conformally flat manifolds such as the torus $(S^1)^d$, in terms of data of the theory on $\mathbb{R}^d$. Here, we review the construction of general correlation functions, and the crossing relations required to consistently formulate the theory on an arbitrary surface.
The basic strategy is to break the surface into simple constituent pieces, separated by circular boundaries, and insert a complete set of states along each boundary. First, we insert a circle surrounding each operator insertion; by the state-operator correspondence, the operator insertion is equivalent to deleting a disc to produce a boundary, and projecting onto the corresponding state on that boundary. Label the resulting $n$ boundaries by an index $e\in E$ (for 'external') and let $k_e$ denote the operator on each boundary, falling in Virasoro representations $P_{k_e},\bar P_{k_e}$.
We are then left with a genus g surface with n boundaries, which we can decompose into 2g + n − 2 pairs of pants (that is, topological 3-holed spheres, occasionally called 'trinions'), which we label by indices t ∈ T , by cutting along a further 3g +n−3 circles. Along each of these 3g +n−3 'cuffs' where the pants are joined to one another, labelled by an index i ∈ I (for 'internal'), we insert a complete set of states. Each term in the sum over states is then a product of amplitudes for each pair of pants, which can be conformally mapped to sphere three-point functions, and thus is fixed by the structure constants of the corresponding Virasoro primaries.
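The counting of pairs of pants and cuffs follows from the Euler characteristic: each pair of pants has $\chi=-1$, gluing along circles does not change $\chi$, and every cuff is shared by two pant legs while every external boundary uses one. A minimal Python sketch of this bookkeeping (the function name is illustrative):

```python
def pants_decomposition_counts(genus, boundaries):
    """Return (number of pairs of pants, number of internal cuffs).

    Uses chi = 2 - 2*genus - boundaries for the surface and chi = -1 per pair
    of pants; requires a surface with negative Euler characteristic.
    """
    euler = 2 - 2 * genus - boundaries
    if euler >= 0:
        raise ValueError("no pants decomposition: need 2*genus + boundaries > 2")
    pants = -euler
    # Each pair of pants has three legs; cuffs use two legs, external boundaries one.
    cuffs = (3 * pants - boundaries) // 2
    return pants, cuffs
```

For instance, the sphere four-point function gives `(2, 1)` and the torus two-point function gives `(2, 2)`, matching $2g+n-2$ and $3g+n-3$.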
The contribution of descendants propagating along each cuff is completely fixed by Virasoro symmetry, proportional to the OPE coefficients of the primaries from which they descend. We may therefore package together the contribution of all descendants of a particular set of primaries (labelled by $\{k_i\}_{i\in I}$) into a 'conformal block'. In other words, this is the sum over states described above, but restricting the states along each cuff $i$ to some chosen multiplet of the symmetry, in the representation $P_{k_i},\bar P_{k_i}$. By construction, the blocks are purely kinematic, depending on the surface $\Sigma_g$ $^{15}$ and the pair of pants decomposition$^{16}$, the locations of operator insertions, the central charge, and the conformal weights $P_{k_e},\bar P_{k_e}$ and $P_{k_i},\bar P_{k_i}$ labelling the representations of the $n$ external and $3g+n-3$ internal operators. Since the conformal algebra factorizes into holomorphic and antiholomorphic sectors, the blocks also factorize in this way, so we can write them as a product $\mathcal{F}\bar{\mathcal{F}}$: $\mathcal{F} = \mathcal{F}[P_e](P_i|\sigma)$ depends on the $n$ external representations $P_e$ (for $e\in E$), the $3g+n-3$ internal representations $P_i$ (for $i\in I$), and kinematic variables collectively labelled by $\sigma$; we similarly have $\bar{\mathcal{F}} = \bar{\mathcal{F}}[\bar P_e](\bar P_i|\bar\sigma)$. For Euclidean correlation functions, the kinematic variables $\sigma$ are (once a conformal frame has been specified) $3g-3+n$ complex numbers parameterising the complex structure moduli of $\Sigma_g$ and complex coordinates of the locations $x_e$ of operator insertions, and $\bar\sigma$ are complex conjugates of $\sigma$; more generally, $\sigma$ and $\bar\sigma$ need not be related in this way (for example, for Lorentzian kinematics they often become independent and real 'lightcone' coordinates).
Footnote 14: This excludes correlation functions on surfaces with boundaries and/or nonorientable surfaces, both of which require additional data.
Footnote 15: The blocks (and the correlation functions) depend on the metric on the surface in two distinct ways. Firstly, there are finitely many moduli (the $3g+n-3$ complex parameters $\sigma$) determining the metric and operator locations up to equivalence under diffeomorphisms and Weyl transformations $g\to e^{2\omega}g$, upon which the correlation function and blocks depend nontrivially. Secondly, there is the choice of metric within each such conformal class, which changes the correlation function only by kinematic factors: the conformal anomaly, and local conformal factors for each operator.
Footnote 16: In fact, the decomposition into pairs of pants is not quite sufficient to determine the blocks. A Dehn twist, a relative rotation by angle $2\pi$ around a cuff, introduces phases $e^{2\pi i(h-c/24)}$ and $e^{-2\pi i(\bar h-\bar c/24)}$ in $\mathcal{F}$ and $\bar{\mathcal{F}}$ respectively, so extra topological data is needed to keep track of these relative phases. When we combine blocks into the product $\mathcal{F}\bar{\mathcal{F}}$ with $c-\bar c\in 24\mathbb{Z}$ (here, we always have $c=\bar c$) and integer spin ($h-\bar h\in\mathbb{Z}$), this ambiguity cancels. We also require this extra data to fix an ambiguity in ordering of OPE coefficients, which pick up a sign under odd permutations of indices if the total spin is odd: $C_{\pi(1)\pi(2)\pi(3)} = \mathrm{sgn}(\pi)^{\ell_1+\ell_2+\ell_3}\,C_{123}$ for $\pi\in S_3$. Relatedly, note that the condition for unitarity is $C_{123}C_{321}\geq 0$, so for total odd spin $\ell_1+\ell_2+\ell_3$, $C_{123}$ is pure imaginary.
The dynamical data of the theory appears through the spectrum of operators, and the OPE coefficients $C_{\partial t}$ for each pair of pants $t\in T$, where $\partial t$ denotes a triple of indices $k_e$ or $k_i$ labelling the primary operators propagating in the three cuffs bounding $t$. The result is an expression of the following form for the correlation function: $G_{g,n} = \sum_{\{k_i\}_{i\in I}}\Big(\prod_{t\in T}C_{\partial t}\Big)\,\mathcal{F}[P_{k_e}](P_{k_i}|\sigma)\,\bar{\mathcal{F}}[\bar P_{k_e}](\bar P_{k_i}|\bar\sigma) = \int\prod_{i\in I}\frac{dP_i}{2}\frac{d\bar P_i}{2}\,\rho_{\rm spec}(P_i,\bar P_i)\,\mathcal{F}[P_e](P_i|\sigma)\,\bar{\mathcal{F}}[\bar P_e](\bar P_i|\bar\sigma)$, with $\rho_{\rm spec}(P_i,\bar P_i) = \sum_{\{k_i\}_{i\in I}}\Big(\prod_{t\in T}C_{\partial t}\Big)\prod_{i\in I}\delta(P_i-P_{k_i})\,\delta(\bar P_i-\bar P_{k_i}) + (\text{reflections})$ (3.1). The last line defines a 'spectral density' $\rho_{\rm spec}$ analogous to the density of states in (2.6), now with several internal operators, weighted by OPE coefficients; the 'reflections' refers to an additional three terms with $P_{k_i}\to -P_{k_i}$ and/or $\bar P_{k_i}\to -\bar P_{k_i}$ so that $\rho_{\rm spec}$ is an even function of these variables. This general case is rather abstract, but we will ultimately be interested in a few simple instances, for which we write concrete versions of (3.1) in later sections; for now, one illustrative example is shown in figure 4.
Figure 4 (caption excerpt): The OPE coefficients $C_{e_1e_2i_1}$, $C_{i_1i_2i_2}$ are associated with the pairs of pants labelled $A$, $B$ respectively, with $\partial A = (e_1,e_2,i_1)$ and $\partial B = (i_1,i_2,i_2)$.
While our quick argument is sufficient to demonstrate that the conformal blocks exist, and are determined by Virasoro symmetry, it is another matter entirely to actually compute them. Closed form expressions are known only in very special cases. The most efficient way to compute them numerically is via recursion relations [66-69], but even these are organised using different kinematic parameters and conformal frames for different channels, so it remains a challenging task to formulate crossing symmetry using them. The technical obstacles remain formidable even with the simplification of large central charge, where there are still few analytic results, and one must also confront the possibility of Stokes phenomena that are not well understood [14,20,45]. Fortunately, we will see later that for our purposes, it is not required to know anything about the blocks directly!
While we have a systematic procedure for constructing the correlation functions by sewing pairs of pants, it is far from unique, since there are infinitely many distinct ways to decompose a surface into pairs of pants. We refer to a choice of decomposition as a "channel", each channel giving rise to a corresponding conformal block decomposition of the correlation function. Consistency requires that the conformal block decompositions (3.1) give the same result for the correlation function, whichever channel we choose to use. This is a generalized statement of crossing symmetry or modular invariance, which imposes strong constraints on the data of the CFT.
To formulate this notion of crossing symmetry more directly in terms of the data of the CFT, we must first consider how to relate the block decompositions in different channels. Following the work of Moore and Seiberg [26,70], we can relate any two of the infinite collection of possible channels by repeated composition of a small number of elementary 'moves', which can be described by purely topological relationships between pair of pants decompositions. We will make use of two such moves, 'fusion' and 'modular S' (or just S), illustrated and described in figure 5, along with an example where the two are composed.$^{17}$
Footnote 17: For a complete set of moves, we also require 'braiding', which acts on any two joined pairs of pants by adding a half twist to the separating cycle. The extra topological data required to fix the phases from footnote 16 is also necessary to uniquely prescribe the fusion/braiding moves among the infinitely many ways to split a sphere with four boundaries into two pairs of pants. It was only recently proved in [71] that fusion, braiding and S moves form a complete set of generators to relate any channels. We are grateful to Xi Yin for bringing [71] to our attention.
Figure 5 (caption, opening words missing): [...] four-punctured sphere and the once-punctured torus, or more generally anywhere that these appear as pieces of any decomposition of a surface. The associated crossing kernels relate Virasoro conformal blocks in the corresponding channels. The fusion kernel (top) relates sphere four-point Virasoro blocks in the S- and T-channels, and the modular kernel (middle) relates torus one-point blocks in modular S-transformed frames. In the final line, we show an example relating two channels in the torus two-point function $G_{1,2}$ by composing these moves.
Now, we may informally think of the set of conformal blocks in any particular channel, labelled by the set of internal representations $\{P_i\}_{i\in I}$, as forming a basis for correlation functions.
Given a second channel, with a new set of internal cuffs $I'$, there should be a change of basis matrix to the new variables $\{P_{i'}\}_{i'\in I'}$, relating the two corresponding sets of blocks. From this point of view, it is plausible that the conformal blocks in any two channels can be related by an integral transform, with some 'crossing kernel' $\mathbb{K}$: $\mathcal{F}[P_e](P_i|\sigma) = \int\prod_{i'\in I'}\frac{dP_{i'}}{2}\,\mathbb{K}_{P_{i'}P_i}[P_e]\,\mathcal{F}'[P_e](P_{i'}|\sigma')$. We allow for a change of kinematic variables $\sigma\to\sigma'$ because natural variables (e.g. those appropriate for recursion relations) may be different in each channel. This equation is a generalisation of the relationship (2.7) between characters in channels related by a modular transform, where the kernel $\mathbb{K}[P_e]$ was given by the modular S-matrix $S[\mathbb{1}]$. Furthermore, if we relate two channels by a composition of the elementary moves described above and in figure 5, the crossing kernel itself can be built by composing the kernels for the elementary moves.$^{18}$ Remarkably, not only do these kernels exist, but for the elementary moves they are known in closed form! This is surprising and powerful when we consider how little analytic control we have regarding the conformal blocks. We will introduce these elementary kernels in the following subsections.
If the blocks are to be regarded as basis vectors, then the corresponding components of any particular correlation function are the OPE coefficients, as encoded in the spectral densities $\rho_{\rm spec}$. Given a change of basis matrix $\mathbb{K}$, we can therefore relate the spectral densities in two channels by an integral transform with kernel $\mathbb{K}$, generalising (2.9):$^{19}$ $\tilde\rho_{\rm spec}(P_{i'},\bar P_{i'}) = \int\prod_{i\in I}\frac{dP_i}{2}\frac{d\bar P_i}{2}\,\mathbb{K}_{P_{i'}P_i}\,\bar{\mathbb{K}}_{\bar P_{i'}\bar P_i}\,\rho_{\rm spec}(P_i,\bar P_i)$. This is a direct statement of crossing or modular invariance, which makes no reference to the correlation function, the kinematics or the conformal blocks. As a corollary to the Moore-Seiberg construction, invariance under elementary moves implies invariance in complete generality, so four-point function crossing symmetry and torus one-point modular invariance for all operators suffice to prove consistency of a theory formulated on any surface. Nonetheless, more complicated correlation functions encode an infinite set of these constraints in a natural way, so more general crossing relations are still useful to learn about the theory, as we will see.
Footnote 18: For this, it is important that the same kernels apply for the elementary moves when the external operators are descendants of a given primary (which we sum over when these external legs become internal legs for a more complicated correlation function). This follows because descendant correlators can be obtained by acting with differential operators which are independent of the channel decomposition.

Footnote 19: This requires that the space of blocks is not overcomplete, so there is no nontrivial linear combination of blocks that gives the zero correlation function. This is extremely plausible; for example, the short distance behaviour of the correlator should be determined by the minimal dimension on which the spectral density has support.

The elementary moves do not act freely on the space of channels, so they themselves are also highly constrained by the relations between moves. For example, we can consider a five-point function, made up of three pairs of pants, joined with two internal cuffs. Applying fusion moves alternately on each of the cuffs, we return to the original channel after five moves, and imposing that this combination of five $\mathbb{F}$'s acts trivially gives us the 'pentagon identity'. Assuming analyticity of the kernels, along with properties of degenerate representations, such identities suffice to determine the kernels uniquely [21,22,27].
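The five-move cycle behind the pentagon identity can be checked at the purely combinatorial level. The sketch below assumes the standard identification (not spelled out in the text) of the channels of a sphere five-point function, with the operators in a fixed cyclic order, with triangulations of a pentagon: each internal cuff is a diagonal, and a fusion move flips one diagonal. The function names are hypothetical, and the code says nothing about the kernels themselves, only about the sequence of channels.

```python
N = 5  # five external operators, drawn as the vertices of a pentagon


def sides(n=N):
    """Boundary edges of the n-gon, as sorted vertex pairs."""
    return {tuple(sorted((i, (i + 1) % n))) for i in range(n)}


def flip(cuffs, diagonal, n=N):
    """Fusion move: replace `diagonal` by the other diagonal of the
    quadrilateral formed by the two triangles adjacent to it."""
    edges = sides(n) | set(cuffs)
    a, c = diagonal
    # Vertices that close a triangle with the diagonal (a, c).
    apexes = [x for x in range(n) if x not in (a, c)
              and tuple(sorted((a, x))) in edges
              and tuple(sorted((c, x))) in edges]
    assert len(apexes) == 2, "diagonal must separate exactly two triangles"
    return tuple(sorted(apexes))


def alternate_flips(start, moves):
    """Apply fusion moves alternately on the two cuffs, recording each channel."""
    cuffs = list(start)
    history = [frozenset(cuffs)]
    for step in range(moves):
        i = step % 2
        cuffs[i] = flip(cuffs, cuffs[i])
        history.append(frozenset(cuffs))
    return history


history = alternate_flips([(0, 2), (0, 3)], 5)
```

In this toy model the channel recurs only after the fifth move, and the two cuff labels come back exchanged, which is why the identity involves five kernels rather than fewer.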
The considerations we have described here have been understood and exploited for several decades, but largely in the context of rational models, for which only finitely many representations appear, so the kernel K is a finite-dimensional matrix (for a review, see [72,73]). When applied to irrational theories, the technicalities are somewhat more subtle, and our aims must be more modest (we should certainly not hope to classify and solve all theories!), but this point of view nonetheless seems to be the most powerful way to formulate the constraints of crossing, even for irrational CFTs.
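As a concrete instance of the finite-dimensional case, one can take the critical Ising model ($c = 1/2$), whose three characters are mixed under $\tau \to -1/\tau$ by a $3\times 3$ matrix. This example is not part of the paper; it is included only to illustrate that composing kernels reduces to matrix multiplication for rational models. The basis ordering $(\mathbb{1}, \epsilon, \sigma)$ and the helper names are choices made here.

```python
import math

R2 = math.sqrt(2.0)

# Modular S-matrix of the c = 1/2 Ising model in the basis (1, epsilon, sigma).
ISING_S = [
    [0.5, 0.5, R2 / 2],
    [0.5, 0.5, -R2 / 2],
    [R2 / 2, -R2 / 2, 0.0],
]


def compose(a, b):
    """Compose two finite-dimensional kernels, i.e. multiply the matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Applying the S move twice returns to the original channel.
S_SQUARED = compose(ISING_S, ISING_S)
```

Here the S move applied twice acts as the identity, since all three Ising primaries are self-conjugate. In the irrational case the sum over `k` is replaced by the integral over internal momenta.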
For the remainder of the section, we move beyond the abstract discussion to discuss more concretely the kernels for the elementary fusion and S moves, and their salient properties.

Elementary crossing kernels 1: fusion
The first of our elementary crossing moves arises when we consider the sphere four-point function where $z,\bar z$ denote the conformal cross ratios. By successively taking the OPE between pairs of operators (corresponding to inserting a complete set of states in radial quantization), this can be written as a sum over products of three-point functions of pairs of the external operators and intermediate operators: where $\mathcal{F}\begin{bmatrix}P_2 & P_1\\ P_3 & P_4\end{bmatrix}(P|z)$ are the S-channel Virasoro blocks. In the second line we have written this decomposition as an integral against the S-channel 'spectral density' $\rho_s$ (leaving implicit the dependence on external operators), which for a discrete spectrum is a sum of delta-functions weighted by the OPE coefficients $C_{12s}C_{34s}$; this is an example of the general decomposition (3.1), analogous to (2.6) for the partition function.
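For this expression, we have chosen to take the OPE between operators $O_1$ and $O_2$, giving the S-channel expansion (equivalently, we decompose the four-holed sphere into two pairs of pants, with cuffs $1, 2, s$ and $s, 3, 4$). But the result must be the same if we instead choose to use the T-channel expansion, taking the OPE of operators $O_2$ and $O_3$. This associativity of the OPE leads to the crossing equation:

$$\int \frac{dP_s}{2}\frac{d\bar P_s}{2}\,\rho_s(P_s,\bar P_s)\,\mathcal{F}\begin{bmatrix}P_2 & P_1\\ P_3 & P_4\end{bmatrix}(P_s|z)\,\mathcal{F}\begin{bmatrix}\bar P_2 & \bar P_1\\ \bar P_3 & \bar P_4\end{bmatrix}(\bar P_s|\bar z) = \int \frac{dP_t}{2}\frac{d\bar P_t}{2}\,\rho_t(P_t,\bar P_t)\,\mathcal{F}\begin{bmatrix}P_2 & P_3\\ P_1 & P_4\end{bmatrix}(P_t|1-z)\,\mathcal{F}\begin{bmatrix}\bar P_2 & \bar P_3\\ \bar P_1 & \bar P_4\end{bmatrix}(\bar P_t|1-\bar z)$$

The T-channel spectral density $\rho_t$ appearing here is similar to $\rho_s$, but weighted by different OPE coefficients $C_{41t}C_{23t}$. This is the crossing relation between the two pair-of-pants decompositions of the four-holed sphere pictured on the top line of figure 5.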
Continuing to follow the philosophy we applied to modular invariance in section 2 and generalised in subsection 3.1, we will rewrite the crossing equation directly as a transform relating S- and T-channel spectral densities. To do this, we require an object expressing the decomposition of the T-channel Virasoro blocks in terms of S-channel blocks. This is the fusion kernel (or crossing kernel, or 6j symbol), with the defining relation which is analogous to the relation (2.7) between the modular S-matrix and characters.
It is not a priori obvious that such an object should even exist, but it is a remarkable fact that it does, and an even more remarkable fact that it has been explicitly constructed by Ponsot and Teschner [21,22,27]. A closed form expression is given in (A.2) in appendix A, which contains the necessary technical results, many of which were derived in [20]. We discuss the most relevant properties in a moment.
With the fusion kernel $\mathbb{F}$ in hand, we can now write the crossing equation as a transform relating the spectral density in each channel, just as in (2.9):

$$\rho_s(P_s,\bar P_s) = \int \frac{dP_t}{2}\frac{d\bar P_t}{2}\,\mathbb{F}_{P_sP_t}\,\mathbb{F}_{\bar P_s\bar P_t}\,\rho_t(P_t,\bar P_t) \qquad (3.8)$$

Here we have suppressed the notation labelling the external operators, but it should be borne in mind that the kernel of this transform depends on the external operator dimensions $P_{1,2,3,4}$ (footnote 20).

Just as the modular transform of the vacuum (2.15) was the most important object in section 2, the fusion transform of the vacuum will play a correspondingly central role for our new asymptotic formulas. This can only appear in the case that the external operator dimensions are equal in pairs, $P_1 = P_4$ and $P_2 = P_3$ (in the T-channel). In that case, the fusion kernel simplifies (footnote 21) [20], and we find it convenient to write it as

$$\mathbb{F}_{P_s\mathbb{1}}\begin{bmatrix}P_2 & P_1\\ P_2 & P_1\end{bmatrix} = \rho_0(P_s)\,C_0(P_1,P_2,P_s), \qquad (3.10)$$

where $\rho_0(P)$ is the density of states appearing as the modular S-transform of the vacuum (2.15). It turns out that $C_0$ is then symmetric under the exchange of all three of its arguments, and has a simple explicit expression in terms of the special function $\Gamma_b$: The product in the numerator runs over the eight combinations related by the reflections $P_k \to -P_k$. The function $\Gamma_b$ is a 'double' gamma function, which is meromorphic, with no zeros, and with poles at argument $-mb - nb^{-1}$ for nonnegative integers $m, n$ (similarly to the usual gamma function, which has poles at nonpositive integers).

Footnote 20: There is a similar transform to write the S-channel spectral density in terms of U-channel data (with density weighted by OPE coefficients $C_{13u}C_{u24}$) using the braiding kernel. This is a fusion kernel conjugated by phases, which become signs for integer spins: The resulting signs for odd spins are much the same as for U-channel inversion in [74], for example.
If external operators are sufficiently light (specifically, $\alpha_1 + \alpha_2 \le \frac{Q}{2}$ or $\alpha_3 + \alpha_4 \le \frac{Q}{2}$), the fusion kernel has a new subtlety, arising from poles in $P_s$ that cross the real axis. In order to maintain analyticity in the parameters, the contour in the decomposition (3.7), which is implicitly taken to run along the real $P_s$ axis, must be deformed. We can take the deformed contour to run along the real $P_s$ axis, but must additionally include circles surrounding the poles which have crossed the axis, contributing residues. This gives rise to a finite sum of S-channel operators with imaginary $P_s$ ($h_s < \frac{c-1}{24}$) in the decomposition of the T-channel conformal block. See [20] for more details. We can describe this by including a sum of $\delta$-functions supported at imaginary $P_s$ in the kernel $\mathbb{F}$ [64].
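The lightness condition is easy to evaluate once weights are converted to Liouville variables. The sketch below assumes the usual parametrisation $c = 1 + 6Q^2$, $h = \alpha(Q-\alpha) = \frac{Q^2}{4} + P^2$ with $\alpha = \frac{Q}{2} + iP$ (set up in an earlier section and not restated here); the function names are hypothetical.

```python
import math


def background_charge(c):
    """Q defined through c = 1 + 6 Q^2 (assumed parametrisation)."""
    return math.sqrt((c - 1) / 6)


def momentum_squared(h, c):
    """P^2 = h - (c - 1)/24; a negative value means P is imaginary."""
    return h - (c - 1) / 24


def alpha_from_h(h, c):
    """The root alpha <= Q/2 of h = alpha (Q - alpha), for h <= (c - 1)/24."""
    Q = background_charge(c)
    disc = Q * Q / 4 - h
    if disc < 0:
        raise ValueError("h > (c-1)/24: alpha is complex, use real P instead")
    return Q / 2 - math.sqrt(disc)


def needs_contour_deformation(h1, h2, c):
    """True when alpha_1 + alpha_2 <= Q/2, the condition quoted in the text."""
    Q = background_charge(c)
    return alpha_from_h(h1, c) + alpha_from_h(h2, c) <= Q / 2
```

For example, at $c = 25$ (so $Q = 2$) a pair of operators with $h = 0.19$ has $\alpha_1 + \alpha_2 = 0.2$ and triggers the deformation, while a pair with $h = 0.96$ does not.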
The non-vacuum kernels with T-channel dimension $h_t > 0$ will be important for us only to compare their asymptotic contribution to the S-channel. The key result, established in [20], is precisely analogous to (2.13) for the modular S-matrix: This result is accurate up to a factor independent of $P_s$, see equation (B.3).

Elementary crossing kernels 2: modular S
The second elementary move is a modular transform applied to one-point functions of Virasoro primary operators on the torus where $\tau$ labels the complex structure of the torus, and $h_0, \bar h_0$ are the conformal weights of the external operator $O_0$. The translation invariance of the torus means that the correlation function is independent of the location of the operator.
Generalizing the modular invariance of the torus partition function (which is the special case where the external operator $O_0$ is the identity), $G_{1,1}$ transforms covariantly under modular transformations, in particular the S-transform $\tau \to -1/\tau$:

$$G_{1,1}(-1/\tau,-1/\bar\tau) = \tau^{h_0}\bar\tau^{\bar h_0}\,G_{1,1}(\tau,\bar\tau) \qquad (3.14)$$

The factor $\tau^{h_0}\bar\tau^{\bar h_0} = |\tau|^{\Delta} e^{-i\ell_0 \arg\tau}$ comes from rescaling and rotating the torus so the thermal circle becomes the spatial circle (footnote 22). It occurs because the definition of the one-point function implicitly makes a choice of metric on the torus, namely the flat metric in which the spatial circle has length $2\pi$; after modular transform, the cycle interpreted as the spatial circle changes, and hence the metric is rescaled. The discussion of subsection 3.1 implicitly assumed that we use the same metric for every channel, so there were no such factors.
We can write this correlation function in terms of the usual CFT data by inserting a complete set of states on the spatial circle, and collecting the contributions from each Virasoro representation into torus conformal blocks $\mathcal{F}[P_0](P|\tau)$ with internal primary weight $P$ (footnote 23): In the second line we have defined the thermal spectral density $\rho[O_0]$ for the external operator $O_0$, consisting of $\delta$-functions for each internal operator with coefficient $C_{OOO_0}$, analogously to (2.6) and (3.5), and another special case of (3.1).
Reprising the same strategy, we will recast modular covariance as invariance of $\rho_{O_0}(P,\bar P)$ under an S-transform, directly generalizing (2.9) for the density of states. To do this, we introduce the torus one-point kernel, the object which decomposes torus one-point conformal blocks into the modular-S transformed frame: Given this object, the modular S transformation acts on the spectral density by an integral transform against this kernel, and modular covariance of $G_{1,1}$ is stated as invariance of the spectral density under that transform.

Once again, we are fortunate to have an explicit expression for the modular S-kernel due to Teschner [28] (see also [75,76]). We reproduce the precise formula in (A.8) of appendix A, where we demonstrate various important properties of the kernel, the most salient of which we now state.

Footnote 22: Performing this transform twice corresponds to rotating the torus through an angle $\pi$ and gives a factor $(-1)^{\ell_0}$, from which we conclude that $G_{1,1}$ is zero for operators with odd spin, since any nonzero expectation value would break this $\mathbb{Z}_2$ symmetry.

Footnote 23: Explicitly, $\mathcal{F}[P_0](P|\tau) = {\rm Tr}_P\left(e^{2\pi i\tau L_0} O_0\right)$, where the trace is taken over the representation of the Virasoro algebra with weight labelled by $P$, normalising the expectation value of $O_0$ in the lowest weight state to unity.
Most important for us is that, like the fusion kernel, the modular S-kernel simplifies when the external operator is the identity, taking $h_0 \to 0$ ($P_0 \to i\frac{Q}{2}$). In this limit, we find that recovering the modular S-matrix for non-degenerate torus characters (2.7) from section 2. Note that the kernel relevant for inversion of the vacuum character, namely as in equation (2.11), is not recovered by a straightforward $\alpha \to 0$ limit of (3.18), because the degenerate vacuum character is not given simply by the $h \to 0$ limit of the non-degenerate character. This is unlike the fusion kernel, where the identity kernel is obtained by an $\alpha_t \to 0$ limit of the generic kernel with external operators identical in pairs: in that case the null descendants continuously decouple in the $h_t \to 0$ limit (footnote 24).

The second important property for us will be the behaviour of the kernel in the large dimension limit $P \to \infty$, which we normalise by the vacuum S-matrix $\mathbb{S}_{P\mathbb{1}}[\mathbb{1}] \approx e^{2\pi QP}$ for comparison: These formulas, derived in appendix B.2, are accurate up to a constant (that is, independent of $P$) factor. Crucially, this ratio is exponentially suppressed at large $P$, as long as $h > 0$. This result reduces to (2.13) when the external operator is the identity.
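The exponential suppression can be seen numerically in the simplest case, where the external operator is the identity and the statement reduces to (2.13). The sketch below assumes the standard closed forms $\mathbb{S}_{PP'}[\mathbb{1}] = 2\sqrt{2}\cos(4\pi PP')$ and $\mathbb{S}_{P\mathbb{1}}[\mathbb{1}] = 4\sqrt{2}\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$ (these belong to section 2 and are not restated in this section), evaluates the first at $P' = i(\frac{Q}{2} - \alpha)$ for a light operator, and restricts to $c \ge 25$ so that $b$ is real. Function names are hypothetical.

```python
import math


def liouville_b(c):
    """Real b <= 1 with Q = b + 1/b and c = 1 + 6 Q^2 (assumes c >= 25)."""
    Q = math.sqrt((c - 1) / 6)
    return (Q - math.sqrt(Q * Q - 4)) / 2


def vacuum_s(P, c):
    """Assumed form of the vacuum S-matrix element S_{P 1}[1]."""
    b = liouville_b(c)
    return (4 * math.sqrt(2) * math.sinh(2 * math.pi * b * P)
            * math.sinh(2 * math.pi * P / b))


def light_s(P, alpha, c):
    """2 sqrt(2) cos(4 pi P P') continued to P' = i (Q/2 - alpha)."""
    Q = math.sqrt((c - 1) / 6)
    return 2 * math.sqrt(2) * math.cosh(4 * math.pi * P * (Q / 2 - alpha))


def suppression(P, alpha, c):
    """Ratio of a light non-vacuum S-matrix element to the vacuum one."""
    return light_s(P, alpha, c) / vacuum_s(P, c)


ratio = suppression(3.0, 0.3, 30.0)
rescaled = ratio * math.exp(4 * math.pi * 0.3 * 3.0)
```

With these inputs the ratio approaches $e^{-4\pi\alpha P}$, so `rescaled` is close to one; setting `alpha = 0` removes the suppression, in line with the requirement $h > 0$.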

OPE asymptotics from crossing kernels
Now that we have formulated the consistency conditions as statements about transforms of spectral densities, it is simple to repeat the arguments of section 2, which led to the Cardy formula, in a variety of new situations. Specifically, we study crossing for the three correlation functions which decompose into two pairs of pants, and extract asymptotic formulas for squares of OPE coefficients.

Footnote 24: Since $\langle h|L_1 O_0 L_{-1}|h\rangle = \left(2h + h_0(h_0-1)\right)\langle h|O_0|h\rangle$, we can take a vacuum limit in which the null descendant is decoupled by fixing $h = -\frac{1}{2}h_0(h_0-1) \sim \frac{1}{2}h_0$ and taking $h_0 \to 0$. Indeed, taking a limit $\alpha_0, \alpha \to 0$ with $\alpha \sim \frac{1}{2}\alpha_0$, one can explicitly check that $\mathbb{S}_{PP'}[P_0] \to \mathbb{S}_{P\mathbb{1}}[\mathbb{1}]$ (for a derivation, see (A.14) and surrounding discussion). In contrast, for the fusion kernel we can take a more direct limit because the matrix elements $\langle h_t|O_2|h_2\rangle$ go to zero as $h_t \to 0$.

Sphere four-point function: heavy-light-light
For our first example, we study the constraints of crossing symmetry for the four-point function of pairwise identical operators. We have already introduced all the required definitions and results in subsection 3.2; in particular, we have the fusion transformation (3.8) relating S- and T-channel spectral densities,

$$\rho_s(P_s,\bar P_s) = \int \frac{dP_t}{2}\frac{d\bar P_t}{2}\,\mathbb{F}_{P_sP_t}\,\mathbb{F}_{\bar P_s\bar P_t}\,\rho_t(P_t,\bar P_t), \qquad (4.1)$$

and the result (B.3) that the fusion kernel for operators of positive dimension $h_t > 0$ is exponentially suppressed compared to the identity at large $P_s$. This is precisely the same situation we had for the modular S-matrix when we derived the Cardy formula (2.14), so repeating that argument gives us an analogous result for the S-channel spectral density: This finding is not new, but was one of the main results of [20]. The focus of that paper was the large spin limit of fixed $P_s$ and $\bar P_s \to \infty$, but we here emphasise that this also holds for large dimension (both $P_s, \bar P_s \to \infty$), and in fact more generally, since we need not assume the existence of a twist gap in that case.
In higher dimensional CFTs, the analogous operation of expanding the T-channel identity block (which is simply the product of two-point functions) into the S-channel defines the spectrum and OPE coefficients of 'double trace' operators of mean field theory (MFT). The identity fusion kernel can therefore be thought of as a deformation of MFT to include Virasoro symmetry, and the corresponding spectral data was accordingly dubbed "Virasoro mean field theory" (VMFT) in [20]. The large-spin universality of the identity kernel is the d = 2 analogue of the result for d > 2 that there exist 'double-twist' operators whose dimensions and OPE coefficients approach those of MFT at large spin [24,25].
The analogy with double-twist operators in higher dimensions is sharpest for $h < \frac{c-1}{24}$. If the external operators $O_1$, $O_2$ have sufficiently low twist, then there are a finite number of trajectories that asymptote at large spin to discrete values of $h < \frac{c-1}{24}$; see [20] for details. There is also a continuum starting at $h = \frac{c-1}{24}$ described by the smooth VMFT OPE density, which has no known analog in higher dimensions.
For $h > \frac{c-1}{24}$, either fixed in the large spin limit or taken to be large simultaneously with $\bar h$, the asymptotic spectrum encoded in the fusion kernel is a smooth function of $P, \bar P$. Just as for the Cardy formula explained in section 2, (3.5) should then be interpreted as a microcanonical statement about the asymptotic spectral density integrated over a window of energies. We can translate the result to a microcanonical average of OPE coefficients, by dividing by the Cardy formula (2.14) giving the asymptotic density of primary states $\rho(P_s,\bar P_s) \sim \rho_0(P_s)\rho_0(\bar P_s)$ in the relevant limits. Writing the identity fusion kernel in the form (3.10) of the universal density $\rho_0(P_s)$ times $C_0(P_1,P_2,P_s)$, we find that $C_0$ gives the microcanonical average of the OPE coefficients:

$$|C_{12s}|^2 \sim C_0(P_1,P_2,P_s)\,C_0(\bar P_1,\bar P_2,\bar P_s), \qquad P_s,\bar P_s \to \infty.$$

This result is valid for any two fixed operators $O_1$, $O_2$, averaging over operators $O_s$ in either a large dimension or large spin limit.
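To make the phrase 'microcanonical average' concrete: for a discrete spectrum, the window-integrated OPE density divided by the window-integrated density of states is simply the mean of the squared OPE coefficients over the states in the window. The toy sketch below uses entirely synthetic data (evenly spaced levels, a made-up smooth profile `smooth`, and an artificial alternating fluctuation); none of these numbers come from a CFT, and the names are hypothetical.

```python
import math


def microcanonical_average(levels, weights, lo, hi):
    """Mean of `weights` over the levels P with lo <= P < hi."""
    selected = [w for p, w in zip(levels, weights) if lo <= p < hi]
    if not selected:
        raise ValueError("window contains no states")
    return sum(selected) / len(selected)


def smooth(p):
    """Synthetic stand-in for a smooth asymptotic prediction."""
    return math.exp(-p)


# Synthetic spectrum: 200 evenly spaced levels with strongly fluctuating weights.
levels = [5.0 + 0.001 * n for n in range(200)]
weights = [smooth(p) * (1.0 + 0.5 * (-1) ** n) for n, p in enumerate(levels)]

average = microcanonical_average(levels, weights, 5.0, 5.2)
```

Individual weights differ from the smooth profile by 50 percent, yet the window average agrees with it to better than one percent. This is the sense in which an asymptotic formula constrains averages rather than individual coefficients.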
The asymptotic form of $C_0$ in this limit was computed in [20]: The result involves a special function that appears in the large-argument asymptotics of $\Gamma_b$; see appendix A of [20] for more details. The first factor exactly cancels a similar factor in the conformal blocks ($\mathcal{F} \approx (16q)^{h_s}$ [66]), ensuring that the block expansion has the correct domain of convergence. A formula of this form for the asymptotics of the averaged heavy-light-light structure constants was first obtained in [9]. In that paper, the authors used the asymptotics of the Virasoro four-point blocks in the heavy limit $h_s \to \infty$ [66], subsequently taking a $z \to 1$ limit to reproduce the OPE singularity from the T-channel identity operator. Their result matches the leading asymptotics of our formula (4.4) when written in terms of the conformal weights and central charge (as in equation (1.20)); we find new terms appearing at subleading order arising from a subtlety in the order of the $h_s \to \infty$ and $z \to 1$ limits. Working directly with the spectral densities allows us to avoid such difficulties in studying conformal blocks.

Torus two-point function: heavy-heavy-light
For our second example, we study the two-point function of identical Virasoro primaries on the torus: There are two qualitatively distinct ways to decompose such a correlation function into conformal blocks. Firstly, we can take the OPE between the two operators and insert a single complete set of states around a cycle of the torus, which we call the OPE channel. Secondly, we can insert two complete sets of states between the operators on each side of the thermal circle, which we call the necklace channel.
(4.6) The second and fourth lines define 'necklace' and 'OPE' spectral densities $\rho_N$, $\rho_{\rm OPE}$. We have written the blocks using different kinematic variables, since the natural parameters (for recursion relations, for example [69]) are different in the two channels. In the necklace channel, $q_1$ and $q_2$ encode a Euclidean time evolution, between the two operator insertions, and then round the torus back to the first operator insertion; in the OPE channel, there is only one such parameter $q$, along with a separation $v$ between the operators controlling the OPE. These parameters can be related to one another, but all our results are derived without explicit reference to any kinematics.

Figure 6: The sequence of Moore-Seiberg moves to express the OPE channel torus two-point block in terms of necklace channel blocks: a modular S, followed by a fusion move.
We will consider the crossing kernel that decomposes torus two-point blocks for identical operators in the OPE channel (with internal Liouville momenta $P'_1, P'_2$) into two-point blocks in the necklace channel (in the modular S-transformed frame). This sewing procedure is illustrated in figure 6, from which we see that the required kernel is simply given by the product of the torus one-point kernel and the sphere four-point kernel:

$$\mathbb{K}_{P_1P_2;P'_1P'_2}[P_0] = \mathbb{S}_{P_1P'_1}[P'_2]\;\mathbb{F}_{P_2P'_2}\begin{bmatrix}P_0 & P_1\\ P_0 & P_1\end{bmatrix}$$

$$\rho_N(P_1,P_2,\bar P_1,\bar P_2) = \int \frac{dP'_1}{2}\frac{d\bar P'_1}{2}\frac{dP'_2}{2}\frac{d\bar P'_2}{2}\;\mathbb{K}_{P_1P_2;P'_1P'_2}[P_0]\,\mathbb{K}_{\bar P_1\bar P_2;\bar P'_1\bar P'_2}[\bar P_0]\;\rho_{\rm OPE}(P'_1,P'_2,\bar P'_1,\bar P'_2) \qquad (4.7)$$

In an appropriate limit, the necklace channel data will be dominated by the identity propagating in both internal cuffs of the OPE channel, described by the identity kernel Once again, the asymptotics of $C_0$ universally governs the asymptotics of OPE coefficients, this time in a 'heavy-heavy-light' limit, where one operator is fixed, and the other two operators are taken to have large dimensions. Corrections to this identity contribution due to the exchange of non-vacuum primaries in the OPE channel are exponentially suppressed when we take $P_1, P_2$ to be large, just as we have seen before. The technical result required to show this is $\approx e^{-2\pi\alpha_1 P_1}$ (4.9) in the limit $P_1, P_2 \to \infty$, with either the ratio or difference of $P_1$ and $P_2$ held fixed. This result is asymmetric in $P_1$ and $P_2$ because the OPE channel does not treat operators symmetrically (footnote 25); it guarantees suppression of all non-vacuum blocks because $\alpha_2$ cannot be nonzero unless $\alpha_1$ is also nonzero. See the discussion in appendix B.1.1 for more details.
As in the case of the sphere four-point function, this result means that the necklace channel spectral density is well approximated by exchange of the vacuum Verma module in the OPE channel when the internal weights are taken to be heavy:

$$\rho^{(P_0,\bar P_0)}_{\rm necklace}(P_1,\bar P_1;P_2,\bar P_2) \approx \mathbb{K}_{P_1P_2;\mathbb{1}\mathbb{1}}[P_0]\,\mathbb{K}_{\bar P_1\bar P_2;\mathbb{1}\mathbb{1}}[\bar P_0], \qquad P_1,P_2,\bar P_1,\bar P_2 \to \infty \qquad (4.10)$$

Thus the kernel corresponding to propagation of the identity in the OPE channel (4.8) encodes an asymptotic formula for OPE coefficients in the heavy-heavy-light regime, averaged over the heavy operators, and for any fixed light operator. Stripping off the density of states of the heavy operators, we have

$$|C_{012}|^2 \sim C_0(P_0,P_1,P_2)\,C_0(\bar P_0,\bar P_1,\bar P_2), \qquad P_1,P_2,\bar P_1,\bar P_2 \to \infty. \qquad (4.11)$$

As in the case of the sphere four-point function, in the presence of a nonzero twist gap the above asymptotic formula also holds in the large-spin regime when only $P_1, P_2$ or $\bar P_1, \bar P_2$ are taken to be large.
Now that there are multiple internal weights, there are several distinct ways to take the large-weight limit. First, we can take the weights to infinity at fixed ratio $P_2/P_1$, assuming without loss of generality that $P_1 > P_2$. We will take this limit by writing $P_i = x_i P$, with $x_i$ fixed in the large-$P$ limit. One finds: (4.12) The other interesting limit takes the difference $P_1 - P_2 = 2\delta$ to be fixed, with the average $P \to \infty$. Note that in terms of dimensions $h$, this means that $h_1 - h_2$ is of order $\sqrt{h}$. In this limit one finds the following asymptotics $\log C_0(P_0, P-\delta, P+\delta)$ (4.13)

Several recent papers have studied asymptotics of the averaged off-diagonal heavy-heavy-light structure constants in CFT$_2$, including [10-12]. The most directly comparable result is equation (2.33) of [10], which studied these OPE asymptotics by considering the torus two-point function in a particular kinematic limit, imposing modular covariance, and performing an inverse Laplace transform to extract the spectral density. While the first line of our result (4.13) reproduces the entropic suppression $e^{-S/2}$ expected from the eigenstate thermalization hypothesis, there appears to be a nontrivial difference between our subleading terms (written in terms of the dimensions and the central charge in equation (1.21)) and those of [10]. Again, we would like to emphasize the technical simplicity of our argument, which does not rely on carefully establishing the behaviour of conformal blocks in simultaneous large-weight and kinematic limits.
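The statement that a fixed momentum difference corresponds to a weight difference of order $\sqrt{h}$ follows from $h = \frac{Q^2}{4} + P^2$, since $(P+\delta)^2 - (P-\delta)^2 = 4\delta P$. A minimal numerical check, assuming that parametrisation and using hypothetical function names:

```python
import math


def weight(P, Q):
    """Conformal weight h = Q^2/4 + P^2 (assumed parametrisation)."""
    return Q * Q / 4 + P * P


def weight_gap(P, delta, Q):
    """Magnitude of h_1 - h_2 for momenta P + delta and P - delta."""
    return weight(P + delta, Q) - weight(P - delta, Q)


gap = weight_gap(1000.0, 0.5, 2.0)
gap_over_sqrt_h = gap / math.sqrt(weight(1000.0, 2.0))
```

The gap grows linearly in $P$, while its ratio to $\sqrt{h}$ tends to the constant $4\delta$ (here 2).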

Genus-two partition function: heavy-heavy-heavy
The final constraint from crossing we will study arises from modular invariance of the genus two partition function G 2,0 . We will relate the conformal block decomposition in two channels, which we call 'sunset' and 'dumbbell'; these channels and the relation between them are illustrated in figure 7.
(4.14) We have here suppressed the dependence of $G_{2,0}$ and the blocks on the moduli, since by now it is hopefully clear that we have no need of them. This is fortunate, because for $g \ge 2$ the description of the moduli spaces and relations between different channels becomes technically very challenging, and in particular, we must contend more directly with the factors arising from the conformal anomaly.

Figure 7: The sequence of moves expressing a genus 2 'dumbbell' channel block in terms of 'sunset' channel blocks.
To study the consequences of the genus-two modular crossing equation, we will employ the crossing kernel that relates dumbbell channel genus-two Virasoro blocks to those in the sunset channel. From figure 7, we see that, like the crossing kernel for the torus two-point function, this kernel is simply a product of sphere four-point and torus one-point kernels: Once again, we will find that in appropriate limits, the spectral density in the sunset channel is dominated by the contribution of the identity in all internal cuffs of the dumbbell channel. The corresponding spectral density is given by the following identity kernel:

$$\mathbb{K}_{P_1P_2P_3;\mathbb{1}\mathbb{1}\mathbb{1}} = \rho_0(P_1)\,\rho_0(P_2)\,\rho_0(P_3)\,C_0(P_1,P_2,P_3). \qquad (4.16)$$

Thus, once again, the asymptotic behaviour of the OPE coefficients, now when all three operators are heavy, is determined by the asymptotics of the universal object $C_0(P_1,P_2,P_3)$. Precisely as in (4.9), corrections to this asymptotic formula due to the exchange of non-vacuum primaries in the dumbbell channel are exponentially suppressed by the ratio in the limit where the ratios or differences between the $P_i$ are held fixed. In the original dumbbell channel, $\alpha_2$ cannot be nonzero unless both $\alpha_1$ and $\alpha_3$ are nonzero, so this is always exponentially small. More details are contained in appendix B.1.1.
The conclusion is that the sunset channel OPE density is well-approximated by the exchange of the vacuum Verma module in the dumbbell channel when the internal weights all become heavy:

$$\rho_{\rm sunset}(P_1,\bar P_1;P_2,\bar P_2;P_3,\bar P_3) \approx \mathbb{K}_{P_1P_2P_3;\mathbb{1}\mathbb{1}\mathbb{1}}\,\mathbb{K}_{\bar P_1\bar P_2\bar P_3;\mathbb{1}\mathbb{1}\mathbb{1}}, \qquad P_i,\bar P_i \to \infty \qquad (4.18)$$

Thus the kernel (4.16) encodes an asymptotic formula for OPE coefficients in the heavy-heavy-heavy regime, averaged over the weights of all three heavy operators:

$$|C_{123}|^2 \sim C_0(P_1,P_2,P_3)\,C_0(\bar P_1,\bar P_2,\bar P_3), \qquad P_i,\bar P_i \to \infty. \qquad (4.19)$$

As before, in the presence of a nonzero twist gap this formula holds at large spin, in which only the left-moving momenta $P_1, P_2, P_3$ or the right-moving momenta $\bar P_1, \bar P_2, \bar P_3$ are taken to be large.
We can now recover asymptotic formulas for the microcanonical average of all heavy OPE coefficients from the relevant asymptotics of $C_0$. For example, if we fix ratios of $P_i$, parameterizing as $P_i = x_i P$ with $x_i > 0$ fixed and $P \to \infty$, we have (4.20) In the case where $|P_i - P_j|$ is fixed in the limit, we instead have (4.21) These limits were studied using genus 2 modular invariance in [14], using conformal block techniques. This analysis used the same underlying crossing relation, relating the heavy blocks in the sunset channel to the identity in the OPE channel (or, equivalently, a different necklace channel, obtained by an additional fusion move; the identity blocks in these two channels are identical). Results were only obtained for large $c$, where additional techniques to analyse conformal blocks are available, only included terms up to order $P \sim \sqrt{h}$ in $\log C_0$, and did not have a complete result for the term scaling exponentially in $P^2 \sim h$ (the first line of (4.21)) valid at general ratios of operator dimensions. Nonetheless, all our formulas match those in [14], including confirming a conjectured correction $c \to c-1$ from finite central charge. Our new method, with far less work, extends these results to higher orders and finite central charge.

Semiclassical limits
Throughout this paper we have emphasized that our asymptotic formulas apply in any two-dimensional irrational CFT for any $c > 1$, providing universal results in a kinematic limit of large dimension or spin. However, it is natural to expect our results to be particularly powerful in holographic theories with a weakly coupled AdS$_3$ dual, and to have a corresponding gravitational interpretation. The basic reason for this is simple: the corrections to the asymptotic formula come from the lightest operators in the theory, and existence of a holographic dual requires having few such operators (a sparse light spectrum) [31,77,78]. For example, in higher dimensions generic theories contain double-twist operators with anomalous dimensions suppressed at large spin [24,25]; in holographic theories, the 't Hooft limit extends this to double-trace operators with anomalous dimensions suppressed at large $N$, now at finite spin. The corresponding gravitational interpretation involves two-particle states in AdS, which generically are weakly interacting only with very large orbital angular momentum, when the particles are widely separated, but in holographic theories also interact weakly at finite separation. An example in $d = 2$ is the density of states, which for holographic theories is given by the Cardy formula not just for very heavy operators, but also at large $c$ for energies of order $c$ [31], interpreted as the Bekenstein-Hawking entropy of BTZ black holes [79].
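For orientation, the leading entropy implied by the universal density can be written in terms of weights and the central charge. The sketch below assumes $\rho_0(P) = 4\sqrt{2}\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$ with $h = \frac{c-1}{24} + P^2$ and $Q^2 = \frac{c-1}{6}$ (conventions from earlier sections, not restated here), so that $\log\rho_0(P) \approx 2\pi QP = 2\pi\sqrt{\frac{c-1}{6}\left(h - \frac{c-1}{24}\right)}$ at large $P$. Function names are hypothetical, and `log_rho0` assumes $c \ge 25$ so that $b$ is real.

```python
import math


def cardy_entropy(h, hbar, c):
    """Leading entropy 2 pi Q (P + Pbar) for weights above (c - 1)/24."""
    Q = math.sqrt((c - 1) / 6)
    P = math.sqrt(h - (c - 1) / 24)
    Pbar = math.sqrt(hbar - (c - 1) / 24)
    return 2 * math.pi * Q * (P + Pbar)


def log_rho0(P, c):
    """Logarithm of the assumed universal density rho_0(P), for c >= 25."""
    Q = math.sqrt((c - 1) / 6)
    b = (Q - math.sqrt(Q * Q - 4)) / 2
    return math.log(4 * math.sqrt(2) * math.sinh(2 * math.pi * b * P)
                    * math.sinh(2 * math.pi * P / b))


entropy = cardy_entropy(5.0, 5.0, 25.0)
```

At $c = 25$ and $h = \bar h = 5$ this gives $P = \bar P = 2$, $Q = 2$ and an entropy of $16\pi$. The expression has the familiar Cardy form with $c$ replaced by $c - 1$, and `log_rho0` differs from its chiral half only by the constant $\frac{1}{2}\log 2$ up to exponentially small terms.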
With this in mind, in this section we will give gravitational interpretations of our universal OPE coefficients C 0 in various large c limits. We will not attempt here to pin down precisely when these formulas apply, in terms of constraints on the theory and regime of operator dimensions; see [17] for recent work in this direction.
Nonetheless, it is simpler to interpret and understand this regime in the gravitational description. Since our formulas come from expanding an identity block in an alternative channel, we can interpret our formulas as a microcanonical version of 'vacuum block domination', giving the density of states in a regime where a correlation function is well-approximated by only the identity Virasoro block in the appropriate channel [80-83]. At large $c$, an identity block is given by the gravitational action of a particular locally empty AdS solution (which could be a BTZ black hole or handlebody at higher genus), along with worldlines of particles propagating between external operator insertions [6,30,84-86]: We therefore expect our formulas to be applicable when the gravitational path integral is dominated by such a solution, up to loop corrections (footnote 26). This holds for a kinematic regime of parametrically low temperature or small cross-ratios, but for holographic theories is expected to extend to a regime of kinematics which are fixed in the large $c$ limit. The question is how far this regime extends before encountering a phase transition. The simplest such phase transitions are first-order 'Hawking-Page' transitions, where an identity block in a different channel dominates. However, note that even for local, weakly coupled gravitational theories, there need not be any channel in which the vacuum dominates: for example, there may be a phase in which a scalar field condenses after a second-order phase transition [30,32]. Vacuum dominance is potentially particularly subtle for correlation functions in kinematic regimes such as those with operators out of time order [87].
We now give our examples of gravitational interpretations of the universal OPE coefficients C 0 in various limits. These are all explored in more detail elsewhere, but we present them here together as consequences of the same formula, emphasising the unifying nature of our results. Furthermore, the list may well not be exhaustive, since we have not included all possible semiclassical limits, and our understanding of the connections to gravity is far from complete.

Spectral density of BTZ black holes
For our first example, we take a large $c$ limit of $C_0$ which probes the physics of BTZ black holes. We take two operators to be heavy, with dimensions $h_1, h_2$ scaling with $c$, to correspond to black hole states, but with similar dimensions, $h_1 - h_2$ fixed as $c \to \infty$. The third operator, acting as a probe of the geometry, has $h$ fixed in the limit. In terms of the momentum variables $P$, we take and fix $p, \delta, h$ in the $b \to 0$ limit. We can then interpret $C_0$ as governing the matrix elements $\langle BH_2|O|BH_1\rangle$ of the probe operator $O$ of dimension $h$ between black hole states of nearby energies.
This limit of the fusion kernel was studied in [20], with the result This is the left-moving half of the spectral density associated to free matter propagating on a BTZ black hole background (footnote 27) [88]. In particular, the poles at imaginary $\delta$ are associated with the frequencies of quasinormal modes governing the approach to equilibrium. This result is sufficient to recover the 'heavy-light' limit of conformal blocks [89,90]; see [20] for more details.

Near-extremal BTZ and the Schwarzian theory
Our second example (based on results to appear [91]) is similar to the first, but treats the distinct case where the black hole of interest is very close to extremality.
Rotating BTZ black holes exist for dimensions above the extremality bound $h > \frac{c-1}{24}$, and we will tune our operators close to this, with $h - \frac{c-1}{24}$ of order $c^{-1}$. Our third operator will remain a light probe. This means we have where we fix $k_1$, $k_2$, $h$ and take $b \to 0$.
In this limit, our universal density of states $\rho_0$ and OPE coefficients $C_0$ are given by where the ± refers to a product of four terms with all possible sign combinations. These expressions may be familiar from the Schwarzian theory, which governs the dynamics of weakly broken conformal symmetry [92][93][94]. This theory arises in near-extremal black holes, which have a near-horizon AdS$_2$ region with dynamics governed by Jackiw-Teitelboim gravity [93,95]. Specifically, $\rho_0$ is proportional to the density of states for the Schwarzian theory, and $C_0$ to a transition amplitude appearing in calculations of correlation functions [94,96,97].
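To see the Schwarzian density emerge numerically, the following minimal sketch assumes two ingredients that are not displayed in this section: the universal density $\rho_0(P) = 4\sqrt{2}\sinh(2\pi bP)\sinh(2\pi b^{-1}P)$, and the near-extremal scaling $P = bk$ with $k$ fixed, so that $h - \frac{c-1}{24} = P^2$ is of order $c^{-1}$. The function names are hypothetical. Under these assumptions, $\rho_0(bk)$ should approach $8\sqrt{2}\pi b^2\, k\sinh(2\pi k)$ as $b \to 0$, which is proportional to the Schwarzian density of states $k\sinh(2\pi k)$.

```python
import math


def rho0(P, b):
    """Assumed universal density: 4*sqrt(2)*sinh(2*pi*b*P)*sinh(2*pi*P/b)."""
    return 4 * math.sqrt(2) * math.sinh(2 * math.pi * b * P) * math.sinh(2 * math.pi * P / b)


def schwarzian_density(k):
    """Schwarzian density of states, up to normalization: k*sinh(2*pi*k)."""
    return k * math.sinh(2 * math.pi * k)


def near_extremal_ratio(k, b):
    """Ratio of rho0 at P = b*k to its expected small-b limit.

    Analytically this equals sinh(x)/x with x = 2*pi*b**2*k,
    so it tends to 1 as b -> 0 at fixed k.
    """
    limit = 8 * math.sqrt(2) * math.pi * b**2 * schwarzian_density(k)
    return rho0(b * k, b) / limit


if __name__ == "__main__":
    for b in (0.3, 0.1, 0.03, 0.01):
        print(b, near_extremal_ratio(1.0, b))
```

The ratio differs from one only by terms of relative order $b^4 k^2$, which illustrates why the Schwarzian form is a good approximation throughout the near-extremal window.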
The appearance of these quantities is a sign that there is a universal sector of large c CFTs which knows about quantum geometry, where the metric fluctuations are not suppressed. The connection between the Schwarzian theory, near-extremal BTZ and universality in CFT will be explored in much greater detail in forthcoming work [91].

Conical defect action
Finally, we consider a regime where all three operators have dimensions scaling with $c$. If we take $\frac{24h}{c} > 1$ in this limit, as required for asymptotic formulas, $C_0$ should be interpreted as giving a three-point function of black hole microstates. It is unclear whether there is a direct calculation of this quantity, giving the semiclassical limit of $C_0$ as an on-shell action. However, perhaps surprisingly, if we fix $\frac{24h}{c} < 1$ and take $c \to \infty$, there is such an interpretation, shown in [4]. Those authors computed the vacuum fusion kernel in a large central charge limit, and equated it to a suitably regularised on-shell action of a geometry corresponding to three heavy particles running between the asymptotic boundary and a trivalent vertex. The action in this case is Einstein-Hilbert, plus an action $m_i L_i$ for each particle, where $L_i$ is a regularised proper length of the particle's worldline and $m_i \sim \frac{c}{3}\eta_i$ is its mass. Since the particles have masses of order $c$, they backreact to form three conical defects in the geometry, meeting at the vertex 28 .
In our notation, we can express the result of [4] as a limit of $C_0$: (5.8)
where $F(z) = I(z) + I(1-z)$ for $I(z) = \int_{1/2}^{z} dy\, \log\Gamma(y)$. The action $b^{-2} S_{\rm grav}$ appearing here is precisely the gravitational action for the conical defect network described above. When left- and right-moving sectors are combined, for scalars the phase $\theta$ cancels.
28 No particle action was included in [4], but they also included no singular contribution to the Einstein-Hilbert action localised on the worldline. These two terms are equal and opposite, so the results are equivalent.
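The function $F(z)$ entering the conical defect action is easy to evaluate numerically. The sketch below is only an illustration: it takes $I(z) = \int_{1/2}^{z} dy\,\log\Gamma(y)$ and $F(z) = I(z) + I(1-z)$ as stated in the text, uses a hand-written Simpson rule (an arbitrary choice of quadrature), and restricts to $z$ well inside $(0,1)$ so that the integrand is smooth. The helper names are hypothetical.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def I(z):
    """I(z) = integral from 1/2 to z of log Gamma(y) dy, for z > 0."""
    return simpson(math.lgamma, 0.5, z)


def F(z):
    """F(z) = I(z) + I(1 - z), defined here for 0 < z < 1."""
    return I(z) + I(1 - z)


if __name__ == "__main__":
    for z in (0.2, 0.35, 0.5, 0.65, 0.8):
        print(z, F(z))
```

Two properties follow directly from the definition and serve as sanity checks: $F(1/2) = 0$ and $F(z) = F(1-z)$. Differentiating gives $F'(z) = \log\Gamma(z) - \log\Gamma(1-z)$.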
When conformal blocks are computed at large c as an on-shell gravitational action, this conical defect action, and hence this limit of C 0 , appear as the natural normalisation of the blocks [29,30]. While the relation with our universal asymptotic formulas is suggestive, it remains rather mysterious from that point of view, and deserves to be better understood.

Torus one-point functions & the Eigenstate Thermalization Hypothesis
Although the primary focus of our paper is on the asymptotic behaviour of the $C_{ijk}^2$, similar techniques can be applied to other observables in two-dimensional conformal field theory. For example, by studying the modular covariance of the torus one-point function of an operator $O_0$ one obtains an asymptotic formula for diagonal heavy-heavy-light structure constants $C_{OHH}$, where we average over the heavy operator $H$. This was discussed in [5], who found in the limit that $\Delta_H \to \infty$. Here $\chi$ is the lightest operator to which $O_0$ couples (i.e. for which $C_{0\chi\chi} \neq 0$), and is assumed to be sufficiently light, $\Delta_\chi < \frac{c-1}{12}$. The normalization factor $N_0$ depends only on $c$, $\Delta_\chi$ and $\Delta_0$. This analysis was performed at the level of the scaling blocks in [5] and was generalized to include the contribution of global blocks in [6]. When regarded as a formula for the average over primary operators, however, equation (6.1) is true only at leading order in $1/c$; the inclusion of Virasoro blocks provides corrections which are only subleading at large $c$.
We can now write down the finite $c$ version of this formula using the modular S kernel introduced in section 3.3 for torus one-point functions. Following the same logic that led to our other asymptotic formulas, we conclude that provided that $\chi$, the lightest operator that couples to $O_0$, is sufficiently light ($\alpha_\chi$ lies in the discrete range in the sense of [20]) and that there exists a gap above this lightest operator so that corrections due to the inversion of the contributions of other operators in the original channel are indeed suppressed. The large $P$ asymptotics of this formula are straightforward to find by taking the large $P_H$ limit of the modular S kernel, namely This reproduces the earlier result (6.1) in the appropriate limit.
We would like to emphasize two important qualitative differences between this formula and our other asymptotic formulas. The first is that it is not universal in the same sense as our other formulas, as it explicitly depends on the lightest operator that couples to $O_0$, both through its conformal weights and OPE coefficient (this is because the vacuum Verma module cannot propagate as an intermediate state in either channel of the torus one-point function). Second, its derivation is on even less rigorous footing than our other asymptotic formulas because the structure constants that appear in the conformal block decomposition of the torus one-point function need not be positive, and so the spectral densities ρ[O 0 ],ρ[O 0 ] do not in general have definite sign and may oscillate when integrated. This is unlike the products of structure constants that appear in the necklace channel conformal block decomposition of the torus two-point function of identical operators or the sunset channel of the genus-two partition function, which are positive in a unitary CFT. In fact, if the lightest operator that couples to $O_0$ is sufficiently heavy (in particular, if it has twist greater than $\frac{c-1}{12}$), then one cannot even argue that the asymptotics of the structure constants are universal, as corrections due to the propagation of other operators in the original channel are not parametrically suppressed.
As discussed in section 1.5, the fact that the averaged diagonal heavy-heavy-light OPE coefficients are exponentially suppressed (via e.g. (6.3)) implies a different hierarchy of suppression between the averaged diagonal and non-diagonal heavy-heavy-light structure constants than would naively have been expected from the usual statement of the Eigenstate Thermalization Hypothesis, where $f_O$ is order one and $g_O \approx e^{-\frac{1}{2}S(\Delta)}$. Indeed, if the lightest operator that couples to $O_0$ satisfies $\mathrm{Re}(\alpha_\chi + \bar\alpha_\chi) \geq \frac{Q}{2}$ (for scalars, this corresponds to dimension $\Delta_\chi \geq \frac{c-1}{16}$), then there is no suppression whatsoever of the averaged off-diagonal structure constants compared to the diagonal, and indeed the diagonal terms may be even smaller than the off-diagonal in this regime. This may be seen by comparing equation (6.3) with equation (4.13). This contrast is particularly sharp in holographic theories with a large gap in the spectrum of primary operators, with only Planckian degrees of freedom. Indeed the dual of a theory of "pure" quantum gravity in AdS$_3$ is in a sense one where the averaged diagonal heavy-heavy-light structure constants are smallest.
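The thresholds quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes the standard parametrization $c = 1 + 6Q^2$ and $h = \alpha(Q - \alpha)$, which is not restated in this section, and uses hypothetical helper names. For a scalar, $\alpha = \bar\alpha$, so $\mathrm{Re}(\alpha + \bar\alpha) = \frac{Q}{2}$ corresponds to $\alpha = \frac{Q}{4}$, while the top of the discrete range $\alpha = \frac{Q}{2}$ corresponds to $h = \frac{c-1}{24}$.

```python
import math


def Q_from_c(c):
    """Background charge Q, assuming c = 1 + 6*Q**2 with c > 1."""
    return math.sqrt((c - 1) / 6)


def weight(alpha, Q):
    """Conformal weight h = alpha*(Q - alpha) (assumed parametrization)."""
    return alpha * (Q - alpha)


def scalar_dimension(alpha, Q):
    """Dimension of a scalar with alpha = alpha-bar: Delta = 2*h."""
    return 2 * weight(alpha, Q)


if __name__ == "__main__":
    c = 25.0
    Q = Q_from_c(c)
    # alpha = Q/4 gives Delta = (c-1)/16; alpha = Q/2 gives Delta = (c-1)/12.
    print(scalar_dimension(Q / 4, Q), (c - 1) / 16)
    print(scalar_dimension(Q / 2, Q), (c - 1) / 12)
```

Since $\frac{c-1}{16} < \frac{c-1}{12}$, there is a window of scalar dimensions in which $\chi$ is still light enough for the diagonal formula to apply but the off-diagonal coefficients are no longer suppressed relative to the diagonal ones.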

A Explicit forms of elementary crossing kernels
In this section we will review the explicit forms of the elementary crossing kernels used in this paper, with a focus on the analytic structure of the kernels as a function of the intermediate weights.

A.1 Sphere four-point
We will start by reviewing the explicit form of the fusion kernel, which implements the fusion transformation relating sphere four-point Virasoro conformal blocks in different OPE channels (see equation (3.7)). The fusion kernel was worked out in explicit detail by Ponsot and Teschner [21,22]. The expression involves the special functions $\Gamma_b(x)$, which is a meromorphic function with no zeros that one may think of as a generalization of the ordinary gamma function, but with simple poles at $x = -(mb + nb^{-1})$ for $m, n \in \mathbb{Z}_{\geq 0}$, and $S_b(x) = \frac{\Gamma_b(x)}{\Gamma_b(Q - x)}$. (A.1)
Many properties of these special functions, including large argument and small $b$ asymptotics, were summarized in [20] (see in particular appendix A of that paper). The explicit expression for the kernel involves a contour integral and is given by where the prefactor $P_b$ is given by and the arguments of the special functions in the integrand are The contour $C$ runs from $-i\infty$ to $i\infty$, traversing between the towers of poles running to the left at $s = -U_i - mb - nb^{-1}$ and to the right at $s = Q - V_j + mb + nb^{-1}$ in the complex $s$ plane, for $m, n \in \mathbb{Z}_{\geq 0}$.
Viewed as a function of the internal weight $P_s$, the kernel (A.2) has eight semi-infinite lines of poles extending to both the top and bottom of the complex plane. $\mathbb{F}_{P_s P_t}\begin{bmatrix} P_2 & P_1 \\ P_3 & P_4 \end{bmatrix}$: simple poles at $P_s = \pm i\left(\frac{Q}{2} + iP_0 + mb + nb^{-1}\right)$, $m, n \in \mathbb{Z}_{\geq 0}$, where $P_0 = P_1 + P_2,\ P_3 + P_4$ (and six permutations under reflection $P_i \to -P_i$). (A.5)
Roughly, half of these poles are explicit singularities of special functions in the prefactor (A.3), while the other half arise from singularities of the contour integral, which occur when poles of the integrand pinch the contour. In the case particularly relevant for this paper of pairwise identical operators $P_4 = P_1$, $P_3 = P_2$, these singularities are enhanced to double poles, although there is an exception when the T-channel internal weight $P_t$ is degenerate ($P_t = \pm\frac{i}{2}\left((m+1)b + (n+1)b^{-1}\right)$, $m, n \in \mathbb{Z}_{\geq 0}$), in which case the poles remain simple when the external operators have weights consistent with the fusion rules.
In most cases, the contour of integration over the internal weight P s in the fusion transformation (3.7) can be taken to run along the real axis. However, as emphasized in [20,61], when the external operators are sufficiently light, in particular when then some poles of the fusion kernel (A.5) cross the real P s axis and the contour must be deformed, leading to a finite number of discrete residue contributions to the S-channel decomposition of the T-channel Virasoro block. These correspond to the Virasoro analog of double-twist operators [20].
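The counting of discrete contributions can be sketched as follows. The example assumes that the relevant poles sit at $P_s = \pm i\left(\frac{Q}{2} + i(P_1 + P_2) + mb + nb^{-1}\right)$, which with $\alpha = \frac{Q}{2} + iP$ is the double-twist condition $\alpha_s = \alpha_1 + \alpha_2 + mb + nb^{-1}$, and that a pole has crossed the real $P_s$ axis when $\alpha_s < \frac{Q}{2}$. Both statements are assumptions of this sketch (the explicit lightness condition is not displayed above), the external $\alpha_i$ are taken real, and the function name is hypothetical.

```python
def discrete_double_twists(alpha1, alpha2, b, max_index=50):
    """List (m, n, alpha_s) with alpha_s = alpha1 + alpha2 + m*b + n/b < Q/2.

    Assumes real alpha1, alpha2 below Q/2; each entry is a candidate
    discrete (Virasoro double-twist) contribution to the S-channel
    decomposition under the assumptions stated in the lead-in.
    """
    Q = b + 1 / b
    found = []
    for m in range(max_index):
        for n in range(max_index):
            alpha_s = alpha1 + alpha2 + m * b + n / b
            if alpha_s < Q / 2:
                found.append((m, n, alpha_s))
    return found


if __name__ == "__main__":
    # Light external operators at small b: several discrete contributions.
    for entry in discrete_double_twists(0.55, 0.55, b=0.2):
        print(entry)
    # Heavier external operators: no pole crosses the contour.
    print(discrete_double_twists(1.5, 1.5, b=0.2))
```

Because each step in $m$ raises $\alpha_s$ by $b$, the number of discrete contributions is always finite, and for small $b$ it grows like $\left(\frac{Q}{2} - \alpha_1 - \alpha_2\right)/b$.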
In the special case of pairwise identical operators with T-channel exchange of the identity, the contour integral can be computed very explicitly and the fusion kernel takes the following simple form, which makes the analytic structure manifest: $\mathbb{F}_{P_s \mathbb{1}}\begin{bmatrix} P_2 & P_1 \\ P_2 & P_1 \end{bmatrix} = \rho_0(P_s)\, C_0(P_1, P_2, P_s)$. (A.7)

(A.8)
This integral representation only converges when Outside of this range, the kernel is defined via analytic continuation, using the fact that it satisfies a shift relation that we will make explicit shortly.
The integral contributes the following series of poles in the $P$ plane, one extending to the top and the other extending to the bottom. Integral: poles at $P = \pm\frac{i}{2}\left(\frac{Q}{2} + iP_0 + mb + nb^{-1}\right)$, $m, n \in \mathbb{Z}_{\geq 0}$. (A.10)
Together with the prefactor, the full kernel has the following polar structure in the $P$ plane. $\mathbb{S}_{PP'}[P_0]$: poles at $P = \frac{i}{2}\left(\frac{Q}{2} - iP_0 + mb + nb^{-1}\right)$, $m, n \in \mathbb{Z}_{\geq 0}$, and all possible reflections (in $P$, $P_0$). (A.11)
One can think of these poles as arising in the case that the external operator is a (Virasoro) double-twist of the internal operator. Unlike the case of the fusion kernel, for unitary values of the weights none of these poles can cross the contour of integration $\mathrm{Im}(P) = 0$.
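The double-twist interpretation of (A.11) can be verified directly. The sketch below reads (A.11) as $P = \frac{i}{2}\left(\frac{Q}{2} - iP_0 + mb + nb^{-1}\right)$ and assumes the parametrization $\alpha = \frac{Q}{2} + iP$; the function names are hypothetical. It checks that at such a pole $\alpha_0 = 2\alpha + mb + nb^{-1}$, and that for unitary external weights (real $P_0$, or imaginary $P_0$ with $0 < \alpha_0 < \frac{Q}{2}$) every reflected pole stays off the real $P$ axis.

```python
def modular_pole(P0, m, n, b):
    """Pole location read off from (A.11): (i/2)*(Q/2 - i*P0 + m*b + n/b)."""
    Q = b + 1 / b
    return 0.5j * (Q / 2 - 1j * P0 + m * b + n / b)


def alpha_of(P, b):
    """Assumed map from momentum to alpha: alpha = Q/2 + i*P."""
    Q = b + 1 / b
    return Q / 2 + 1j * P


def reflected_poles(P0, m, n, b):
    """The four images of a pole under P -> -P and P0 -> -P0."""
    return [sign * modular_pole(s0 * P0, m, n, b)
            for sign in (1, -1) for s0 in (1, -1)]


if __name__ == "__main__":
    b = 0.6
    Q = b + 1 / b
    for P0 in (0.8, 0.3j * Q):  # above threshold, and alpha0 = 0.2*Q
        P = modular_pole(P0, 2, 1, b)
        print(alpha_of(P0, b), 2 * alpha_of(P, b) + 2 * b + 1 / b)
        print([p.imag for p in reflected_poles(P0, 2, 1, b)])
```

The imaginary parts are bounded away from zero by a positive combination of $\alpha_0$ (or $Q - \alpha_0$), $mb$ and $nb^{-1}$, which is the reason no contour deformation is needed in the modular case.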
Similarly to the case of the fusion kernel, the modular S kernel can be straightforwardly evaluated in the case that the external operator is the identity, $P_0 = \frac{iQ}{2}$. In this case, the prefactor vanishes and so we only need to extract the singularities of the contour integral. By carefully studying this limit, one finds $\mathbb{S}_{PP'}[\mathbb{1}] = 2\sqrt{2}\cos(4\pi P P')$, (A.12) precisely reproducing the non-degenerate modular S matrix for the Virasoro characters (2.7). To study the limit in which the internal operator in the original channel is also the identity one must be more careful, for the simple reason that the Virasoro vacuum character is not the same as the $h \to 0$ limit of the non-degenerate Virasoro character; in the latter case, there are null states that do not decouple continuously.
To study this limit more carefully, we note that the modular kernel satisfies the following shift relation (see e.g. [75]) 2 cosh(2πbP )S P P [P 0 ] = Γ(b(Q + 2iP ))Γ(2ibP ) Γ(b( Q 2 + i(2P − P 0 )))Γ(b( Q 2 + i(2P + P 0 ))) (A.13) Now consider the limit P → i b −1 2 of this equality. The first term on the right-hand side will be singular unless we take P 0 to i Q 2 at the same time. To facilitate the study of this limit, we write P = i 2 (b −1 − ), P 0 = i Q 2 − , and take → 0. Taking the limit, we find lim →0 S P, i 2 (Q− ) i( (A.14) precisely reproducing the modular S matrix for the inversion of the Virasoro vacuum character (2.11). Note that one cannot recover this by taking the appropriate limit of (A.8), as α 0 = 2α is at the boundary of the regime of convergence of the integral representation.
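A complementary way to see why the identity limit is subtle is to build the vacuum row of the modular S matrix by hand. The sketch below assumes the standard fact that, for $c > 1$, the vacuum character is the difference of the non-degenerate characters at $h = 0$ and $h = 1$ (the level-one null state is subtracted), i.e. at $P = \frac{i}{2}(b^{-1} + b)$ and $P = \frac{i}{2}(b^{-1} - b)$ with $h = \frac{Q^2}{4} + P^2$. It then applies (A.12) to each term and compares with the closed form $4\sqrt{2}\sinh(2\pi bP')\sinh(2\pi b^{-1}P')$, which is the expression we assume for the vacuum kernel. Function names are hypothetical.

```python
import cmath
import math


def S_nondegenerate(P, P_prime):
    """Non-degenerate modular S kernel (A.12): 2*sqrt(2)*cos(4*pi*P*P')."""
    return 2 * math.sqrt(2) * cmath.cos(4 * math.pi * P * P_prime)


def S_vacuum_row(P_prime, b):
    """Vacuum row obtained by subtracting the h = 1 term from the h = 0 term."""
    P_h0 = 0.5j * (1 / b + b)  # h = 0
    P_h1 = 0.5j * (1 / b - b)  # h = 1, the level-one null descendant
    return S_nondegenerate(P_h0, P_prime) - S_nondegenerate(P_h1, P_prime)


def rho0(P_prime, b):
    """Assumed closed form: 4*sqrt(2)*sinh(2*pi*b*P')*sinh(2*pi*P'/b)."""
    return (4 * math.sqrt(2) * math.sinh(2 * math.pi * b * P_prime)
            * math.sinh(2 * math.pi * P_prime / b))


if __name__ == "__main__":
    for P_prime in (0.1, 0.3, 1.0):
        print(P_prime, S_vacuum_row(P_prime, 0.7).real, rho0(P_prime, 0.7))
```

The agreement is the identity $\cosh(A + B) - \cosh(A - B) = 2\sinh A \sinh B$ with $A = 2\pi b^{-1}P'$ and $B = 2\pi bP'$. The naive $h \to 0$ limit of (A.12) alone would instead give $2\sqrt{2}\cosh(2\pi QP')$, which does not vanish at $P' = 0$.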

B Asymptotics of crossing kernels
In this section we will collect results for the asymptotic form of the elementary crossing kernels when some of the weights are taken to be heavy. These results are important for establishing both the form of our asymptotic formulas and their validity, via the suppression of corrections due to the propagation of non-vacuum primaries.

B.1 Sphere four-point
In [20], the asymptotic form of the fusion kernel when the S-channel internal weight $P_s$ was taken to be heavy with fixed external weights was extensively studied. The main result of that analysis was the following asymptotic form of the vacuum fusion kernel (A.7) with pairwise identical operators, which follows directly from the asymptotics of the special function $\Gamma_b$ that were established in that paper, valid as $P_s \to \infty$. By carefully studying the asymptotics of the contour integral in the definition of the fusion kernel, in [20] it was also established that the fusion kernel with non-zero T-channel weight is exponentially suppressed compared to the vacuum kernel as $P_s \to \infty$.

(B.3)
Thus we learn that corrections to the heavy-light-light asymptotic formula (4.4) due to the exchange of non-vacuum primaries in the T-channel are exponentially suppressed.

B.1.1 With heavy external operators
In order to establish the validity of the off-diagonal HHL and HHH asymptotic formulas, we need to ensure that the propagation of non-vacuum primaries is suppressed compared to that of the vacuum. The only nontrivial step is establishing the suppression of the ratio $\mathbb{F}_{P_2 P_2'}\begin{bmatrix} P_1 & P_3 \\ P_1 & P_3 \end{bmatrix} \Big/ \mathbb{F}_{P_2 \mathbb{1}}\begin{bmatrix} P_1 & P_3 \\ P_1 & P_3 \end{bmatrix}$ (B.4) when one or both of the external operators $P_1$, $P_3$ are taken to be heavy along with the S-channel internal weight $P_2$.

B.2 Torus one-point
In order to establish the validity of the heavy-heavy-light and heavy-heavy-heavy universal formulas, we also need to study the asymptotics of the torus one-point kernel in the limit that the internal weight in one of the channels becomes heavy, namely the limit P → ∞. In this limit, the prefactor Q b reduces to the following log Q b (P, P , P 0 ) ∼2π(Q − α 0 )P + h 0 log(2P ) To study the asymptotics of the contour integral, we start by considering scaling the integration variable with P , ie. ξ = σP . Then the integrand behaves in the following way at large P depending on the imaginary part of σ (B.14) In this limit, there are poles extending to the left and right at Im(σ) = ±1 pinching the contour.
For α in the discrete range, we cannot evaluate the integral by deforming the contour and summing over residues e.g. in the ξ right half-plane since the integrand does not decay exponentially along the arc at infinity. However, so long as the internal weight α obeys the condition (A.9), the integral along the contour Re(ξ) = 0 converges nicely and the integral in this limit can easily be computed by using the asymptotics (B.14). When α ∈ (0, Q 2 ), we have C dξ i e −4πξP T b (ξ, P, P 0 ) ≈ Q 2 − iP 0 2π(−2iP )( Q 2 + i(2P − P 0 )) e −2πP ( Q 2 +i(2P −P 0 )) .

(B.18)
Notice that the exponential part of the prefactor cancels the different exponential asymptotics of the shifted kernel S P,P −i n 2 b so that the overall asymptotics are preserved.