Typicality and thermality in 2d CFT

We identify typical high energy eigenstates in two-dimensional conformal field theories at finite $c$ and establish that correlation functions of the stress tensor in such states are accurately thermal as defined by the standard canonical ensemble. Typical states of dimension $h$ are shown to be typical level $h/c$ descendants. In the AdS$_3$/CFT$_2$ correspondence, it is such states that should be compared to black holes in the bulk. We also discuss the discrepancy between thermal correlators and those computed in high energy primary states: the latter are reproduced instead by a generalized Gibbs ensemble with extreme values chosen for the chemical potentials conjugate to the KdV charges.


Introduction
In this paper we address the question of whether typical microstates in two-dimensional conformal field theories appear thermal in a suitable sense. For a wide range of physical systems, the usefulness of the basic notion of temperature as applied to an isolated system is predicated on this fact, while the eigenstate thermalization hypothesis (ETH) has sharpened the criteria for the emergence of a thermal description [1][2][3].
2d CFTs provide the most tractable class of interacting quantum field theories, so provide a natural arena to address such questions. On the other hand, this tractability arises due to infinite dimensional Virasoro symmetry, which in turn gives rise to an infinite number of conserved charges that commute with the Hamiltonian (the so-called KdV charges [4]). There is an obvious tension between the existence of this infinite tower of charges and the standard description of an ensemble characterized by a finite number of control parameters, i.e. the temperature and chemical potentials. Hence the notion of a thermal description may need to be refined in this context, for instance by passing to a generalized Gibbs ensemble with an infinite number of chemical potentials.
We focus here on universal aspects of this question. Namely, suppose we are handed a typical energy eigenstate of the CFT: do correlation functions of stress tensors and conserved currents appear thermal in such a state, at least in some regime of parameters? We can answer this question without committing to a specific CFT, and if this fails to hold then there will be no effective thermal description of the CFT microstates.
Previous work with similar aims includes [5,6]. These papers considered specific theories, namely $\mathcal{N} = 4$ super Yang-Mills and the D1-D5 CFT, and considered simplifying features, such as focussing on BPS states or free fields. Typical microstates were shown to behave approximately "thermally", with small deviations encoding the specific state. In bulk language, this provides evidence that a black hole serves as a coarse grained description of collections of microstates. As noted above, we proceed here without assuming supersymmetry or making reference to a specific CFT, although we do restrict to two dimensions and to specific universal probes. Also relevant is [7], which considers states that are random superpositions of energy eigenstates in a small window, concluding that physically accessible observables have values that are close to thermal, with an error that is exponentially small in the entropy. It was also noted that the nonthermal features can be enhanced to be of order unity by considering imaginary time correlators. Additional work and reviews include [8,9].
One motivation for this work is to resolve an apparent puzzle regarding a mismatch in the expectation values of KdV charges in microstates versus the thermal ensemble. The simplest example of this mismatch will suffice here. We consider the stress tensor $T(w)$ along with the conformal normal ordered product $:\!TT\!:$, obtained by subtracting power law divergences in the OPE and then taking the coincident limit. The zero modes of these two operators mutually commute, and define the lowest two members of the infinite tower of mutually commuting KdV charges. We first consider the CFT on an infinite line at inverse temperature $\beta$, and compute
$$\langle T\rangle_\beta = -\frac{\pi^2 c}{6\beta^2}, \qquad \langle :\!TT\!: \rangle_\beta = \left(\frac{\pi^2 c}{6\beta^2}\right)^2 + \frac{11\pi^4 c}{90\beta^4}. \tag{1.1}$$
We next consider the CFT on a spatial circle of circumference $L$. Let $|h_p\rangle$ denote a Virasoro primary state of dimension $h_p$, hence obeying $L_0|h_p\rangle = h_p|h_p\rangle$, $L_{n>0}|h_p\rangle = 0$. The expectation values in this state are
$$\langle h_p|T|h_p\rangle = -\left(\frac{2\pi}{L}\right)^2\left(h_p - \frac{c}{24}\right), \qquad \langle h_p|:\!TT\!:|h_p\rangle = \left(\frac{2\pi}{L}\right)^4\left(h_p^2 - \frac{c+2}{12}h_p + \frac{c(5c+22)}{2880}\right). \tag{1.2}$$
To compare, we should take $L \to \infty$ with $h_p/L^2$ fixed so as to maintain a finite energy density in the limit. Demanding $\langle T\rangle_\beta = \langle h_p|T|h_p\rangle$ fixes the relation between $h_p/L^2$ and $\beta$ as
$$\frac{h_p}{L^2} = \frac{c}{24\beta^2}, \tag{1.3}$$
in terms of which the primary state result becomes
$$\langle h_p|:\!TT\!:|h_p\rangle = \left(\frac{\pi^2 c}{6\beta^2}\right)^2. \tag{1.4}$$
Comparing to (1.1) we note a discrepancy, which is subleading at large $c$. In this work we consider arbitrary $c$, not necessarily large, in which case the discrepancy is in no sense small. The same type of discrepancy persists for quasi-primaries and the higher KdV charges [10,11].
One possible response to this discrepancy is that expectation values computed in the primary state should be compared with those in the generalized Gibbs ensemble rather than the usual canonical ensemble, with the infinite number of chemical potentials adjusted to yield equality for the KdV expectation values. This avenue has been explored in [12][13][14][15][16]. Here we take another point of view: we regard the discrepancy as a reflection of the fact that primary states are atypical, and we should not expect the canonical ensemble to accurately reproduce results in such atypical states. Indeed, in any system which has a thermal description there will exist atypical states which appear highly nonthermal.
As we discuss, a typical state of dimension $h$ is not primary but rather a typical level $h/c$ descendant of a dimension $h_p = \frac{c-1}{c}h$ primary. These states have the form
$$|\psi_h\rangle \equiv \prod_n (L_{-n})^{N_n}|h_p\rangle, \qquad \sum_n N_n n = \frac{h}{c},$$
with the $N_n$ being non-negative integers drawn from a Boltzmann distribution, such that $\langle N_n\rangle$ agrees with the Bose-Einstein distribution. We show that if one chooses a typical state of this form, then the above discrepancy is resolved: $\langle T\rangle_\beta = \langle\psi_h|T|\psi_h\rangle$ and $\langle :\!TT\!: \rangle_\beta = \langle\psi_h|:\!TT\!:|\psi_h\rangle$, where $\beta$ is given by (1.3) but with $h_p$ replaced by $h$.
We will actually establish a much more general result (4.20), namely agreement between stress tensor correlators computed in the typical microstate versus the thermal ensemble. More precisely, consider the case of the two-point function $\langle T(w)T(0)\rangle$. For real $w$, corresponding to spatial and/or real Lorentzian time separation between the two points, the microstate correlator is accurately thermal provided $L \gg \beta$. On the other hand, in Euclidean time, corresponding to imaginary $w$, agreement is present only inside the strip $|\mathrm{Im}(w)| < \beta$. For example, the thermal correlator is periodic under $w \to w + i\beta$, while this is not even approximately true for the microstate correlator outside the strip.
The underlying mechanism responsible for the thermal behavior of microstate correlators is the following. A stress tensor correlator involves a weighted sum over expectation values of the form $\langle\psi|L_{n_1}\cdots L_{n_k}|\psi\rangle$, where $\sum_i n_i = 0$. These expectation values vary considerably depending on which microstate we choose. However, the relevant part of the correlator is an infinite sum of the above expectation values multiplied by factors $\cos\frac{2\pi n w}{L}$, and what matters is the variance of this object evaluated over the space of dimension $h$ microstates. We compute this variance in the large $L$ limit and show that it is small provided $\beta$ is held fixed as $L \to \infty$. This follows from the fact that $\langle\psi|L_{n_1}\cdots L_{n_k}|\psi\rangle$ for different choices of the $n_i$ are statistically independent in this regime. Since the variance is small, the correlator takes approximately the same value in a typical microstate as in the thermal ensemble, with corrections suppressed by $1/L$. As mentioned above, typical microstate correlators cannot be approximated by thermal correlators outside the strip $|\mathrm{Im}(w)| < \beta$; this follows from the analytic continuation needed to define the thermal correlators in such cases.
Once we have established that in a typical microstate stress tensor correlators assume their thermal values, up to small corrections, it immediately follows that all KdV charges will have nearly thermal expectation values. With this in mind we can come back to the relevance of the generalized Gibbs ensemble. We have noted that a typical state of dimension $h$ is based on a primary of dimension $h_p = \frac{c-1}{c}h$, but suppose we instead choose to look at a state based on a different value of $h_p$. In this case we do not expect correlation functions computed in a typical descendant of such a state to agree with those in the canonical ensemble, but one can ask whether they would agree with correlators in the generalized Gibbs ensemble for appropriately chosen potentials. It would be interesting to answer this question following the large $c$ analysis in [12,15], but we do not address it here. The case of a primary state, with $h_p = h$, is the extreme version of this; for example, a primary state has the lowest value of the second KdV charge among all states of a given energy. In general, if, for whatever reason, one is interested in the properties of states whose KdV charges differ significantly from their values in the canonical ensemble, then the generalized Gibbs ensemble is appropriate. However, since such states are rare, the canonical ensemble suffices to describe most states.
The rest of this paper is organized as follows. In section 2 we warm up with the technically simple case of spin-1 current correlators. We present the general argument that typical microstates yield thermal correlators, and then verify this numerically. In section 3 we turn to the stress tensor correlators. The statistical independence of the expectation values of a string of Virasoro generators, which is the key result needed for approximate thermality, is established in section 4 for the case of the two-point function. The general case is considered in Appendix B. We close with some comments in section 5. Appendix A derives the equivalence between two different forms of the current two-point functions.

Current correlators
In this section we consider correlation functions of spin-1 currents J(z). This provides a technically simple context to compare and contrast correlation functions computed in microstates versus a thermal ensemble.

Thermal correlator
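We normalize $J(z)$ such that its 2-point function on the Euclidean plane is
$$\langle J(z)J(0)\rangle = -\frac{1}{z^2}. \tag{2.1}$$
Transforming to the infinite line at inverse temperature $\beta$ via $z = e^{\frac{2\pi}{\beta}w}$ gives
$$\langle J(w)J(0)\rangle_\beta = -\frac{(\pi/\beta)^2}{\sinh^2(\pi w/\beta)}. \tag{2.2}$$
The current can be realized in terms of a free boson as $J(z) = \partial\phi(z)$, where the free boson stress tensor is $T(z) = -\frac{1}{2}\partial\phi\partial\phi$. Higher point correlators are obtained from factorization into 2-point functions, as in Wick's theorem.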
We next introduce a Euclidean torus with coordinate $w = x + it$ obeying $w \cong w + L \cong w + i\beta$, corresponding to a theory on a spatial circle of size $L$ at temperature $T = 1/\beta$. We write the corresponding torus 2-point function as $\langle J(w_1)J(w_2)\rangle_{L,\beta}$. $L$ and $\beta$ are interchanged by taking $w \to iw$, which is the modular S-transformation in terms of the modular parameter
$$\tau = \frac{i\beta}{L}. \tag{2.3}$$
The 2-point function obeys
$$\langle J(w_1)J(w_2)\rangle_{L,\beta} = -\langle J(iw_1)J(iw_2)\rangle_{\beta,L}.$$
The 2-point function is a meromorphic function on the torus with a single pole $-\frac{1}{(w_1-w_2)^2}$. This, along with the modular property, determines the 2-point function up to a position independent constant. The constant is determined in terms of the generalized partition function with a chemical potential, $Z(q,y) = \mathrm{Tr}[q^{L_0-\frac{c}{24}}y^Q]$, where $Q$ denotes the charge corresponding to the current $J$. This structure arises from Ward identities, and explicit formulas are provided in [17]. In the case of a free scalar we have
$$\langle J(w)J(0)\rangle_{L,\beta} = -\wp(w) - \frac{\pi^2}{3L^2}E_2(\tau) + \frac{\pi}{L\beta}, \tag{2.5}$$
where $\wp(w)$ is the Weierstrass function with periods $L$ and $i\beta$, and the Eisenstein series is $E_2(\tau) = 1 - 24\sum_{n=1}^\infty \frac{nq^n}{1-q^n}$ with $q = e^{2\pi i\tau}$. We will use this free boson result in the following, keeping in mind that the general correlator just differs from this by a position independent constant.
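For what follows, it will be useful to reexpress the correlator as a mode sum in the free boson theory. The mode expansion on the cylinder is
$$J(w) = \frac{2\pi}{L}\sum_{n}\alpha_n e^{-2\pi i n w/L},$$
with
$$[\alpha_m,\alpha_n] = m\,\delta_{m+n,0}. \tag{2.8}$$
The thermal correlator is
$$\langle J(w)J(0)\rangle_{L,\beta} = \frac{1}{Z(\tau)}\mathrm{Tr}\left[q^{L_0-\frac{1}{24}}\bar q^{\bar L_0-\frac{1}{24}}J(w)J(0)\right],$$
with $Z(\tau) = \mathrm{Tr}\, q^{L_0-\frac{1}{24}}\bar q^{\bar L_0-\frac{1}{24}}$ and
$$L_0 = \frac{1}{2}\alpha_0^2 + \sum_{n=1}^\infty \alpha_{-n}\alpha_n.$$
We work in a basis of eigenstates of $\alpha_{-n}\alpha_n$ with eigenvalues $N_n n$, $N_n$ being the occupation number. In the canonical ensemble, the probability distribution over occupation numbers is given by the normalized Boltzmann factor,
$$P(N_n) = \frac{e^{2\pi i\tau N_n n}}{\sum_{N_n=0}^\infty e^{2\pi i\tau N_n n}} = (1 - e^{2\pi i\tau n})e^{2\pi i\tau N_n n}. \tag{2.11}$$
The average occupation number is given by the Bose-Einstein distribution
$$\langle N_n\rangle_{L,\beta} = \frac{1}{e^{2\pi\beta n/L}-1}.$$
This yields the thermal correlator
$$\langle J(w)J(0)\rangle_{L,\beta} = \frac{\pi}{L\beta} - \frac{(\pi/L)^2}{\sin^2(\pi w/L)} + 2\left(\frac{2\pi}{L}\right)^2\sum_{n=1}^\infty \frac{n\cos(2\pi n w/L)}{e^{2\pi\beta n/L}-1}. \tag{2.13}$$
Here we have used $\langle\alpha_0^2\rangle_{L,\beta} = \frac{L}{4\pi\beta}$, as derived in Appendix A. An important point for what follows is that the sum in (2.13) converges in the strip $|\mathrm{Im}(w)| < \beta$, due to the competition between the cosine in the numerator and the Bose-Einstein exponential in the denominator, but diverges outside the strip. Inside the strip the correlator is periodic under $w \to w + i\beta$, and we use this relation to analytically continue the correlator to the full $w$-plane.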
The equivalence of (2.5) and (2.13) is shown in Appendix A.1.
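As a quick numerical illustration of the mode sum, the sketch below evaluates the free boson thermal correlator in the form $\frac{\pi}{L\beta} - \frac{(\pi/L)^2}{\sin^2(\pi w/L)} + 2(2\pi/L)^2\sum_n \frac{n\cos(2\pi nw/L)}{e^{2\pi\beta n/L}-1}$ and compares it with the infinite-line result $-(\pi/\beta)^2/\sinh^2(\pi w/\beta)$. The function names and the cutoff `x_cut` are hypothetical choices made for this example, and the mode-sum form is an assumption of the sketch. Under these assumptions the two results differ by an offset close to $-\pi/(L\beta)$, which vanishes as $L \to \infty$.

```python
import math

def torus_mode_sum(w, L, beta, x_cut=40.0):
    """Mode-sum form of <J(w)J(0)>_{L,beta} for the free boson (assumed form)."""
    # Terms with n > x_cut * L / beta carry a Bose-Einstein factor far below double precision.
    n_max = int(x_cut * L / beta)
    zero_mode = math.pi / (L * beta)  # (2 pi / L)^2 <alpha_0^2> with <alpha_0^2> = L / (4 pi beta)
    vacuum = -(math.pi / L) ** 2 / math.sin(math.pi * w / L) ** 2
    osc = sum(
        n * math.cos(2 * math.pi * n * w / L) / math.expm1(2 * math.pi * beta * n / L)
        for n in range(1, n_max + 1)
    )
    return zero_mode + vacuum + 2 * (2 * math.pi / L) ** 2 * osc

def thermal_line(w, beta):
    """<J(w)J(0)> on the infinite line at inverse temperature beta."""
    return -(math.pi / beta) ** 2 / math.sinh(math.pi * w / beta) ** 2

if __name__ == "__main__":
    beta, w = 1.0, 0.3
    for L in (50.0, 200.0, 800.0):
        diff = torus_mode_sum(w, L, beta) - thermal_line(w, beta)
        print(f"L = {L:6.0f}   torus - line = {diff:+.6e}   -pi/(L beta) = {-math.pi / (L * beta):+.6e}")
```

The check only probes real $w$; for imaginary $w$ outside the strip $|\mathrm{Im}(w)| < \beta$ the sum diverges and this direct evaluation is not meaningful.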

Microstate correlator
In a microstate, $|\psi\rangle$, the current two-point function takes a similar form,
$$\langle\psi|J(w)J(0)|\psi\rangle = \left(\frac{2\pi}{L}\right)^2\langle\psi|\alpha_0^2|\psi\rangle - \frac{(\pi/L)^2}{\sin^2(\pi w/L)} + 2\left(\frac{2\pi}{L}\right)^2\sum_{n=1}^\infty nN_n\cos\left(\frac{2\pi n w}{L}\right). \tag{2.14}$$
We have assumed that $|\psi\rangle$ is an eigenstate of the number operator, $\alpha_{-n}\alpha_n|\psi\rangle = N_n n|\psi\rangle$ (for $n > 0$). We now ask to what extent the correlator evaluated in a typical microstate agrees with the thermal correlator at an appropriate temperature. First, we need to establish what we mean by a typical microstate. As above, we restrict to states that are eigenstates of $\alpha_{-n}\alpha_n$. The total energy $E = \frac{2\pi}{L}\sum_{n=1}^\infty N_n n$ is assumed to be large, $EL \gg 1$, and we define the corresponding inverse temperature $\beta$ by matching the energy, $E = \frac{2\pi}{L}\sum_{n=1}^\infty n\langle N_n\rangle_{L,\beta}$. Standard statistical reasoning implies that if we choose such a state at random, the occupation number of the $n$th level will have the probability distribution $P(N_n)$ as in (2.11). Accordingly, our definition of typicality corresponds to randomly choosing occupation numbers according to this probability distribution.$^5$ We further impose $N_n = 0$ for sufficiently large $n$, say $\frac{2\pi n}{L} > E$; this is convenient for numerics and also ensures that we consider only states of finite energy.
It is easy to see that the microstate correlator will differ completely from the thermal correlator outside the strip $|\mathrm{Im}(w)| < \beta$, a point that was emphasized in [7]. To see this, we recall that the mode sum in (2.13) diverges outside the strip, and the thermal correlator is defined there by analytic continuation from inside the strip. On the other hand, the microstate correlator is a finite sum since the total energy is assumed to be finite, and so no issue of nontrivial analytic continuation arises. If $N_n \approx \langle N_n\rangle_{L,\beta}$ then the sum in the microstate correlator looks approximately like a truncated version of the thermal sum. While the two sums can approximately agree inside the strip they will differ outside it, just as the sum $\sum_{n=0}^{N}x^n$ for large $N$ will approximately agree with $1/(1-x)$ for $|x| < 1$, but looks completely different for $|x| > 1$.
With this in mind, we now restrict attention to $|\mathrm{Im}(w)| < \beta$. We now argue that the microstate correlator will look approximately thermal provided $L \gg \beta$. To see this, we first note that if we simply insert $N_n = \langle N_n\rangle_{L,\beta}$ along with $\langle\psi|\alpha_0^2|\psi\rangle = \frac{L}{4\pi\beta}$ in the microstate correlator, then we reproduce the thermal result. Of course, no microstate is precisely compatible with this since $\langle N_n\rangle_{L,\beta}$ are not integers in general, but we can consider a microstate for which these relations are approximately true. Such microstates are rare, since $N_n$ has large fluctuations over the space of all microstates of a given energy: by differentiating the partition function one finds
$$\langle N_n^2\rangle_{L,\beta} - \langle N_n\rangle_{L,\beta}^2 = \langle N_n\rangle_{L,\beta}\left(\langle N_n\rangle_{L,\beta} + 1\right),$$
so that the relative fluctuation is $\delta N_n/\langle N_n\rangle_{L,\beta} = e^{\pi\beta n/L} \geq 1$. This is not small, which implies that $N_n \neq \langle N_n\rangle_{L,\beta}$ even in typical microstates.
$^5$ The situation here is equivalent to studies of random partitions of large integers and limit shapes of their corresponding Young diagrams [18]. The Bose-Einstein distribution also determines the limiting profile of the appropriate Young diagram.
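However, the correlator is itself an infinite sum of such terms, the relevant piece of which is
$$K(w) = \left(\frac{2\pi}{L}\right)^2\sum_{n=1}^\infty nN_n\cos\left(\frac{2\pi n w}{L}\right). \tag{2.16}$$
We can evaluate the fluctuations in this operator using
$$\langle N_nN_m\rangle_{L,\beta} - \langle N_n\rangle_{L,\beta}\langle N_m\rangle_{L,\beta} = \delta_{nm}\langle N_n\rangle_{L,\beta}\left(\langle N_n\rangle_{L,\beta}+1\right). \tag{2.17}$$
If we take $L \to \infty$ at fixed $w$ and $\beta$ $^6$ we can convert sums to integrals and find
$$\begin{aligned}\langle K(w)\rangle_{L,\beta} &= \frac{1}{\beta^2}\int_0^\infty dx\,\frac{x\cos(xw/\beta)}{e^x-1} = \frac{1}{2w^2} - \frac{\pi^2}{2\beta^2\sinh^2(\pi w/\beta)},\\ \langle K^2(w)\rangle_{L,\beta} - \langle K(w)\rangle_{L,\beta}^2 &= \frac{2\pi}{L\beta^3}\int_0^\infty dx\,\frac{x^2\cos^2(xw/\beta)\,e^x}{(e^x-1)^2}.\end{aligned}$$
We first note that by using the first line and taking $L \to \infty$ we find that (2.14) correctly reduces to (2.2). The integral appearing in the expression for $\langle K^2(w)\rangle_{L,\beta}$ above can be formally evaluated in terms of Hurwitz zeta functions but its explicit form is not illuminating. We then compute the size of the fluctuations as
$$\frac{\delta K(w)}{\langle K(w)\rangle_{L,\beta}} \sim \sqrt{\frac{\beta}{L}},$$
which is just the standard magnitude of finite size corrections to the thermodynamic limit. Since the sum (2.16) is sharply peaked in the ensemble of microstates in this regime, the correlator in a typical microstate approximates the thermal result.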
The situation changes slightly if we hold fixed $w/L$ as we take $L \to \infty$. In this case we cannot replace the sums by integrals due to the relatively rapid variation of the cosines, and we have
$$\delta K(w)^2 = \left(\frac{2\pi}{L}\right)^4\sum_{n=1}^\infty n^2\cos^2\left(\frac{2\pi n w}{L}\right)\frac{e^{2\pi\beta n/L}}{\left(e^{2\pi\beta n/L}-1\right)^2}. \tag{2.20}$$
For $w = 0$, or any multiple of $L/2$, the cosine factor becomes unity, which allows us to replace the sum by an integral, yielding $\delta K(w) = \sqrt{\frac{2\pi^3}{3L\beta^3}}$. On the other hand, if $\Delta w/L$ is kept nonzero and fixed in the limit, where $\Delta w$ denotes the distance to the nearest multiple of $L/2$, then the cosine factor is rapidly varying compared to the rest of the summand, and can be replaced by its average, namely $1/2$, yielding $\delta K(w) = \sqrt{\frac{\pi^3}{3L\beta^3}}$. All that really concerns us is that, as above, $\delta K \sim \frac{1}{\sqrt{L}}$, and so the fluctuations in the correlator are once again suppressed in the large $L$ limit.
$^6$ We relax the condition on $w$ below.
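The two limiting values of the fluctuation can be checked by summing the variance directly. The sketch below assumes $K(w) = (2\pi/L)^2\sum_n nN_n\cos(2\pi nw/L)$ with independent occupation numbers of variance $\langle N_n\rangle(\langle N_n\rangle+1)$; the function name and the cutoff `x_cut` are hypothetical. It compares the sum with $2\pi^3/(3L\beta^3)$ at $w = 0$ and with $\pi^3/(3L\beta^3)$ at a generic fixed $w/L$.

```python
import math

def delta_K_squared(w, L, beta, x_cut=40.0):
    """Variance of K(w) over the canonical ensemble, summed mode by mode."""
    n_max = int(x_cut * L / beta)  # the Bose-Einstein tail beyond this is negligible
    total = 0.0
    for n in range(1, n_max + 1):
        mean = 1.0 / math.expm1(2 * math.pi * beta * n / L)   # Bose-Einstein occupation
        var = mean * (mean + 1.0)                              # variance of a geometric variable
        total += n ** 2 * math.cos(2 * math.pi * n * w / L) ** 2 * var
    return (2 * math.pi / L) ** 4 * total

if __name__ == "__main__":
    beta = 1.0
    for L in (100.0, 400.0, 1600.0):
        at_zero = delta_K_squared(0.0, L, beta) / (2 * math.pi ** 3 / (3 * L * beta ** 3))
        generic = delta_K_squared(0.3 * L, L, beta) / (math.pi ** 3 / (3 * L * beta ** 3))
        print(f"L = {L:6.0f}   ratio at w = 0: {at_zero:.4f}   ratio at w = 0.3 L: {generic:.4f}")
```

Both ratios should approach 1 as $L/\beta$ grows, with deviations of relative order $\beta/L$ coming from replacing the sum by an integral.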
These arguments are readily verified by numerical analysis. To implement this we generate a list of occupation numbers, $(N_1, N_2, \ldots)$, by drawing numbers according to the probability distribution $P(N_n)$. We then insert these occupation numbers in the microstate correlator (2.14) and plot the result.
[Figure. Right panel: comparison of the term that differs between the thermal and microstate correlators, i.e. the third terms in (2.13) and (2.14) respectively. The microstate in the plot is defined by the partition $N_n$ shown in the left panel.]
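A minimal sketch of such a numerical experiment is given below. It draws each $N_n$ from the geometric distribution $P(N_n) = (1-e^{-x_n})e^{-x_nN_n}$ with $x_n = 2\pi\beta n/L$ by inverse-CDF sampling, and compares the microstate sum $(2\pi/L)^2\sum_n nN_n\cos(2\pi nw/L)$ with its thermal counterpart. The helper names, the seed, and the mode cutoff `x_cut` (used here in place of the energy cutoff described above) are assumptions of this example, not the procedure used to produce the figure.

```python
import math
import random

def sample_occupations(L, beta, rng, x_cut=40.0):
    """Draw N_n for n = 1..n_max from P(N_n) = (1 - e^{-x_n}) e^{-x_n N_n}, x_n = 2 pi beta n / L."""
    n_max = int(x_cut * L / (2 * math.pi * beta))  # hypothetical cutoff: N_n = 0 beyond x_n = x_cut
    occ = []
    for n in range(1, n_max + 1):
        x = 2 * math.pi * beta * n / L
        u = 1.0 - rng.random()                 # uniform in (0, 1], avoids log(0)
        occ.append(int(-math.log(u) / x))      # inverse-CDF sample: P(N >= k) = e^{-x k}
    return occ

def K_micro(w, L, occ):
    """Microstate sum (2 pi / L)^2 sum_n n N_n cos(2 pi n w / L)."""
    return (2 * math.pi / L) ** 2 * sum(
        n * N * math.cos(2 * math.pi * n * w / L) for n, N in enumerate(occ, start=1)
    )

def K_thermal(w, L, beta, n_max):
    """Same sum with N_n replaced by the Bose-Einstein average."""
    return (2 * math.pi / L) ** 2 * sum(
        n * math.cos(2 * math.pi * n * w / L) / math.expm1(2 * math.pi * beta * n / L)
        for n in range(1, n_max + 1)
    )

if __name__ == "__main__":
    L, beta = 2000.0, 1.0
    occ = sample_occupations(L, beta, random.Random(0))
    for w in (0.2, 0.5, 1.0, 2.0):
        print(f"w = {w:4.1f}   microstate: {K_micro(w, L, occ):+.4f}   thermal: {K_thermal(w, L, beta, len(occ)):+.4f}")
```

Increasing `L` at fixed `beta` should shrink the scatter of the microstate values around the thermal ones roughly like $1/\sqrt{L}$.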
For $|w| \ll L$ the correlators decay exponentially in $|w|$. However, they must eventually increase to respect the periodicity $w \cong w + L$. The minimal value is reached for $w \approx L/2$, and as shown in Appendix A.2, $\langle J(\frac{L}{2})J(0)\rangle_{L,\beta} \sim -\frac{\pi}{\beta L}$, which vanishes as $L \to \infty$ as expected. It is worth commenting on some related plots that appear in [5] (see their fig. 1). That paper considers the free CFT corresponding to the D1-D5 system at the symmetric orbifold point. At large $N$, this theory has a large degeneracy of Ramond-Ramond ground states, which are chiral primaries. The coarse grained description of these ground states is dual to the $M = 0$ BTZ black hole, as was verified by comparison of a two-point function computed in the two descriptions. At large $N$ the typical ground state correlator is well approximated by the coarse grained correlator for time separation $t < O(\sqrt{N})$. For larger $t$ the correlator displays an erratic behavior that depends sensitively on the microstate. The common feature in the two examples is the appearance of a coarse grained description, but the details differ.

Stress tensor correlators
We now turn to the case of stress tensor correlators. The general approach follows the previous discussion of current correlators, although the details are a bit more involved. The conclusion is the same: correlators computed in typical microstates look thermal in the appropriate regime of parameters.

Two-point functions
Stress tensor correlators are highly constrained by conformal invariance; in this section we collect a few results. On the plane we have
$$\langle T(z)T(0)\rangle = \frac{c}{2z^4}.$$
We transform to new coordinates $w(z)$ using
$$T(w) = \left(\frac{dz}{dw}\right)^2T(z) + \frac{c}{12}\{z;w\}, \qquad \{z;w\} = \frac{z'''}{z'} - \frac{3}{2}\left(\frac{z''}{z'}\right)^2. \tag{3.2}$$
The correlator on the line at inverse temperature $\beta$ is generated by $z = e^{\frac{2\pi}{\beta}w}$, yielding
$$\langle T(w)T(0)\rangle_\beta = \frac{c}{2}\frac{(\pi/\beta)^4}{\sinh^4(\pi w/\beta)} + \left(\frac{\pi^2c}{6\beta^2}\right)^2. \tag{3.3}$$
The thermal expectation value of the normal ordered product $:\!TT\!:$, obtained by taking the coincident limit $w' \to w$ after removing singular terms in the Laurent expansion, is
$$\langle :\!TT\!: \rangle_\beta = \left(\frac{\pi^2c}{6\beta^2}\right)^2 + \frac{11\pi^4c}{90\beta^4}. \tag{3.4}$$
The stress tensor two-point function on the torus is fixed by a combination of conformal invariance and knowledge of the torus partition function, the latter quantity depending on the specific CFT. The two-point function is meromorphic, and so determined up to a constant by its singularities, which are in turn fixed by the OPE, $T(w)T(0) \sim \frac{c/2}{w^4} + \frac{2T(0)}{w^2} + \frac{\partial T(0)}{w}$. The coefficient of the double pole is therefore fixed by the one-point function, which is in turn given by differentiating the partition function with respect to the modular parameter. The undetermined constant part of the correlator is fixed by Ward identities. The explicit formula for the correlator may be found in [17]. The same logic applies to higher genus Riemann surfaces as well.
Next, we would like the result for the stress tensor two-point function on a spatial circle evaluated in a primary state. If $O_{h_p}$ is a primary operator then on the plane we have
$$\langle O_{h_p}(\infty)T(z)T(z')O_{h_p}(0)\rangle = \frac{c}{2(z-z')^4} + \frac{2h_p}{zz'(z-z')^2} + \frac{h_p^2}{z^2z'^2}.$$
This is fixed by the conformal Ward identity for stress tensor insertions (or, equivalently, by the fact that it must be a meromorphic function of $z$ and $z'$, with singularities fixed by the OPE). As usual, $O_{h_p}(\infty) = \lim_{z\to\infty}z^{2h_p}O_{h_p}(z)$. We now transform to the cylinder with a spatial circle of circumference $L$ via $z = e^{\frac{2\pi i}{L}w}$, which gives
$$\langle h_p|T(w)T(0)|h_p\rangle = \left(\frac{2\pi}{L}\right)^4\left[\frac{c}{32\sin^4(\pi w/L)} - \frac{h_p}{2\sin^2(\pi w/L)} + \left(h_p - \frac{c}{24}\right)^2\right]. \tag{3.6}$$
A naive test of thermality consists of comparing (3.3) to (3.6) in the thermodynamic limit. In particular we take $L \to \infty$ while simultaneously holding $h_p/L^2$ fixed to maintain a finite energy density. For the correlators to match at large separation, which yields $\langle T\rangle^2$, we should take
$$\frac{h_p}{L^2} = \frac{c}{24\beta^2}$$
in the limit. The primary state result becomes
$$\langle h_p|T(w)T(0)|h_p\rangle = \frac{c}{2w^4} - \frac{\pi^2c}{3\beta^2w^2} + \left(\frac{\pi^2c}{6\beta^2}\right)^2.$$
Comparing to (3.3) we see that the two results share the same short distance singularities and (by construction) long distance limit, but differ otherwise. For example, the primary state result yields
$$\langle h_p|:\!TT\!:|h_p\rangle = \left(\frac{\pi^2c}{6\beta^2}\right)^2,$$
as opposed to (3.4). On general grounds, we expect that in the thermodynamic limit expectation values computed in typical states should agree with those computed in the thermal ensemble, and so the mismatch is an indication that primary states are not typical. On the other hand, the mismatch goes away at large $c$, indicating that in this regime primary states are typical.
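To make the comparison concrete, the following sketch evaluates both two-point functions as functions of $w$. It assumes the thermal correlator $\frac{c}{2}(\pi/\beta)^4/\sinh^4(\pi w/\beta) + (\pi^2c/6\beta^2)^2$ and the thermodynamic-limit primary-state correlator $\frac{c}{2w^4} - \frac{\pi^2c}{3\beta^2w^2} + (\pi^2c/6\beta^2)^2$; the function names are hypothetical. The difference tends to $11\pi^4c/(90\beta^4)$ at short distance, which is the mismatch in $\langle :\!TT\!: \rangle$, and becomes negligible relative to the correlator at large separation.

```python
import math

def thermal_TT(w, beta, c):
    """Thermal <T(w)T(0)> on the infinite line (assumed form)."""
    return 0.5 * c * (math.pi / beta) ** 4 / math.sinh(math.pi * w / beta) ** 4 \
        + (math.pi ** 2 * c / (6 * beta ** 2)) ** 2

def primary_TT_limit(w, beta, c):
    """Primary-state <T(w)T(0)> for L -> infinity with h_p / L^2 = c / (24 beta^2) (assumed form)."""
    return c / (2 * w ** 4) - math.pi ** 2 * c / (3 * beta ** 2 * w ** 2) \
        + (math.pi ** 2 * c / (6 * beta ** 2)) ** 2

if __name__ == "__main__":
    beta, c = 1.0, 2.0
    print("short-distance mismatch 11 pi^4 c / (90 beta^4) =", 11 * math.pi ** 4 * c / (90 * beta ** 4))
    for w in (0.01, 0.1, 0.5, 1.0, 5.0, 20.0):
        diff = thermal_TT(w, beta, c) - primary_TT_limit(w, beta, c)
        print(f"w = {w:5.2f}   thermal - primary = {diff:.6f}")
```

Since the connected mismatch scales like $c$ while the disconnected piece scales like $c^2$, rerunning with larger `c` shows the relative discrepancy shrinking, in line with the large $c$ remark above.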

Typical states
The Hilbert space of a two-dimensional CFT can be decomposed into representations of the Virasoro algebra. Each conformal family is labelled by a primary operator of some conformal dimension h p and consists of the primary state |h p and its conformal descendants obtained by acting with strings of L −n operators. We consider unitary representations at c > 1 with no null states. The full CFT has both left and right moving Virasoro algebras, but since we will only be considering correlators of T (z) we can restrict attention to one chiral half.
To characterize the typical state $|\psi_h\rangle$ at some specific $h \gg 1$, we note that there are two competing effects. On the one hand, the number of primary states grows exponentially with $h_p$, but on the other hand so too does the number of descendant states at level $h - h_p$. The typical value of $h_p$ will be the one that balances these effects.
We write the partition function of the CFT as
$$Z = \mathrm{Tr}\, q^{L_0-\frac{c}{24}} = \sum_{h_p}d(h_p)\,\mathrm{Tr}_{h_p}\, q^{L_0-\frac{c}{24}}.$$
The anti-holomorphic dependence is not made explicit; in what follows, the correlation functions of the stress tensor and/or its modes will be determined by holomorphic derivatives ($\partial_\tau$ or $\partial_q$) of the partition function. $\mathrm{Tr}_{h_p}$ above denotes a trace over states in the conformal family labelled by the primary of weight $h_p$, and $d(h_p)$ is the number of primaries at weight $h_p$. The corresponding Virasoro character $Z_{h_p}$ is
$$Z_{h_p} = \mathrm{Tr}_{h_p}\, q^{L_0-\frac{c}{24}} = \frac{q^{h_p-\frac{c-1}{24}}}{\eta(q)},$$
where $\eta(q) = q^{1/24}\prod_{n=1}^\infty(1-q^n)$ is the standard Dedekind eta function. Writing $q = e^{-\frac{2\pi\beta}{L}}$, at high temperature we have
$$Z_{h_p} \approx \sqrt{\frac{\beta}{L}}\, e^{\frac{\pi L}{12\beta}}\, q^{h_p-\frac{c-1}{24}},$$
as follows from the modular behavior of the eta function. The high temperature behavior of the full partition function is obtained by modular transformation of the vacuum contribution,
$$Z \approx e^{\frac{\pi cL}{12\beta}}.$$
These imply the asymptotic degeneracy of primaries [19]
$$d(h_p) \approx e^{2\pi\sqrt{\frac{c-1}{6}h_p}}, \tag{3.14}$$
which takes the same form as the Cardy density of states [20], except with the replacement $c \to c-1$. Next, for a given primary state of weight $h_p$, we need to count up the number of descendant states at level $h - h_p$. This corresponds to the number of partitions of $h - h_p$, which is given by the Hardy-Ramanujan formula, $e^{2\pi\sqrt{\frac{h-h_p}{6}}}$. Maximizing the product of the two degeneracies with respect to $h_p$ gives
$$h_p = \frac{c-1}{c}h, \qquad h - h_p = \frac{h}{c}.$$
At large $c$, the typical state is nearly primary in the sense that $h_p \approx h$. However, we will not be making any such large $c$ assumption here. At finite $c$, the typical states with weight $h$ are level $h/c$ descendants of a weight $h_p$ primary.
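The balance between the two degeneracies can be checked with a direct scan. The sketch below maximizes the leading exponent $2\pi\sqrt{(c-1)h_p/6} + 2\pi\sqrt{(h-h_p)/6}$ over a grid of $h_p$ values and compares the maximizer with $(c-1)h/c$. The function names and the grid resolution are hypothetical, and only the leading exponential behavior is kept, with power-law prefactors dropped.

```python
import math

def log_degeneracy(h_p, h, c):
    """Leading log of (number of primaries at h_p) x (number of descendants at level h - h_p)."""
    return 2 * math.pi * math.sqrt((c - 1) * h_p / 6) + 2 * math.pi * math.sqrt((h - h_p) / 6)

def typical_primary_weight(h, c, steps=10000):
    """Grid search for the h_p in [0, h] that maximizes the leading log-degeneracy."""
    best_k = max(range(steps + 1), key=lambda k: log_degeneracy(h * k / steps, h, c))
    return h * best_k / steps

if __name__ == "__main__":
    h = 1000.0
    for c in (1.5, 5.0, 25.0, 100.0):
        print(f"c = {c:6.1f}   grid maximum h_p = {typical_primary_weight(h, c):8.2f}   (c-1) h / c = {(c - 1) * h / c:8.2f}")
```

As `c` grows the maximizer approaches `h`, which is the sense in which the typical state becomes nearly primary at large $c$.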

Typical state two-point function
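On the Euclidean cylinder with a spatial circle, $w \cong w + L$, the mode expansion of the stress tensor is
$$T(w) = -\left(\frac{2\pi}{L}\right)^2\left(\sum_{n}L_ne^{-2\pi inw/L} - \frac{c}{24}\right),$$
where the generators obey the Virasoro algebra
$$[L_m,L_n] = (m-n)L_{m+n} + \frac{c}{12}m(m^2-1)\delta_{m+n,0}.$$
Let $|\psi_h\rangle$ be an eigenstate of $L_0$, $L_0|\psi_h\rangle = h|\psi_h\rangle$. Using the mode expansion and the commutation relations it is straightforward to derive the following expression for the two-point function in such a state,
$$\begin{aligned}\langle\psi_h|T(w)T(0)|\psi_h\rangle = \left(\frac{2\pi}{L}\right)^4\Bigg[&\frac{c}{32\sin^4(\pi w/L)} - \frac{h}{2\sin^2(\pi w/L)} + \left(h - \frac{c}{24}\right)^2\\ &+ 2\sum_{n=1}^\infty\langle\psi_h|L_{-n}L_n|\psi_h\rangle\cos\left(\frac{2\pi nw}{L}\right)\Bigg].\end{aligned} \tag{3.19}$$
For example, suppose that $|\psi_h\rangle$ is primary, so that $L_{n>0}|\psi_h\rangle = 0$ and the second line vanishes. We then recover (3.6). We wish to evaluate this for a typical state. As discussed in the previous section, a typical state with weight $h$ is a level $h/c$ Virasoro descendant of a primary state $|h_p\rangle$ whose dimension is $h_p = \frac{c-1}{c}h$. The expectation value of $L_{-n}L_n$ depends on which particular descendant state we choose. However, we will show in section 4 that in the thermodynamic limit the variance of (3.19) over the ensemble of such states is small. Therefore, the expectation value in such states can be approximated by an average weighted by a Boltzmann factor, with the temperature chosen so as to yield the desired average weight. Let $\langle X\rangle_{h_p,\beta}$ denote the average of $X$ defined in this sense,
$$\langle X\rangle_{h_p,\beta} = \frac{\mathrm{Tr}_{h_p}\left[q^{L_0-\frac{c}{24}}X\right]}{\mathrm{Tr}_{h_p}\left[q^{L_0-\frac{c}{24}}\right]}. \tag{3.20}$$
Here $q = e^{-\frac{2\pi\beta}{L}}$ as before. $\beta$ is fixed by demanding $\langle L_0\rangle_{h_p,\beta} = h$, which can be written as
$$h = h_p + \sum_{n=1}^\infty\frac{n}{q^{-n}-1},$$
so that $h - h_p = h/c \approx \frac{L^2}{24\beta^2}$ at large $L/\beta$. The average of $L_{-n}L_n$ is
$$\langle L_{-n}L_n\rangle_{h_p,\beta} = \frac{2nh + \frac{c}{12}n(n^2-1)}{q^{-n}-1}, \tag{3.23}$$
and in a typical state we make the replacement
$$\langle\psi_h|L_{-n}L_n|\psi_h\rangle \to \langle L_{-n}L_n\rangle_{h_p,\beta}. \tag{3.24}$$
Using (3.23), converting the sum to an integral at large $L$, and using
$$\int_0^\infty dx\,\frac{x\cos(ax)}{e^x-1} = \frac{1}{2a^2} - \frac{\pi^2}{2\sinh^2(\pi a)}, \qquad \int_0^\infty dx\,\frac{x^3\cos(ax)}{e^x-1} = -\frac{3}{a^4} + \frac{2\pi^4}{\sinh^2(\pi a)} + \frac{3\pi^4}{\sinh^4(\pi a)},$$
we obtain
$$\langle\psi_h|T(w)T(0)|\psi_h\rangle \approx \frac{c}{2}\frac{(\pi/\beta)^4}{\sinh^4(\pi w/\beta)} + \left(\frac{\pi^2c}{6\beta^2}\right)^2. \tag{3.26}$$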
This reproduces the thermal correlator in (3.3), thus verifying that the stress tensor two-point function in a typical state appears thermal, provided $L \gg \beta$. It immediately follows that $\langle\psi_h|:\!TT\!:|\psi_h\rangle = \langle :\!TT\!: \rangle_\beta$.
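The agreement can also be checked numerically at finite $L$ without converting the sum to an integral. The sketch below assumes the two-point function in an $L_0$ eigenstate takes the form $(2\pi/L)^4\big[\frac{c}{32\sin^4(\pi w/L)} - \frac{h}{2\sin^2(\pi w/L)} + (h-\frac{c}{24})^2 + 2\sum_n\langle L_{-n}L_n\rangle\cos(2\pi nw/L)\big]$, inserts the ensemble average $\langle L_{-n}L_n\rangle = \frac{2nh + \frac{c}{12}n(n^2-1)}{e^{2\pi\beta n/L}-1}$, and uses the thermodynamic-limit relation $h = cL^2/(24\beta^2)$. Function names and the cutoff `x_cut` are hypothetical.

```python
import math

def typical_state_TT(w, L, beta, c, x_cut=40.0):
    """Two-point function with <L_{-n} L_n> replaced by its Boltzmann-weighted average."""
    h = c * L ** 2 / (24 * beta ** 2)          # assumed large-L relation between h and beta
    s = math.sin(math.pi * w / L)
    first_line = c / (32 * s ** 4) - h / (2 * s ** 2) + (h - c / 24) ** 2
    n_max = int(x_cut * L / beta)              # the Bose-Einstein tail beyond this is negligible
    osc = 0.0
    for n in range(1, n_max + 1):
        x_n = (2 * n * h + c * n * (n * n - 1) / 12) / math.expm1(2 * math.pi * beta * n / L)
        osc += x_n * math.cos(2 * math.pi * n * w / L)
    return (2 * math.pi / L) ** 4 * (first_line + 2 * osc)

def thermal_TT(w, beta, c):
    """Thermal <T(w)T(0)> on the infinite line (assumed form)."""
    return 0.5 * c * (math.pi / beta) ** 4 / math.sinh(math.pi * w / beta) ** 4 \
        + (math.pi ** 2 * c / (6 * beta ** 2)) ** 2

if __name__ == "__main__":
    beta, c, w = 1.0, 2.0, 0.3
    for L in (100.0, 1000.0, 4000.0):
        rel = typical_state_TT(w, L, beta, c) / thermal_TT(w, beta, c) - 1.0
        print(f"L = {L:6.0f}   relative deviation from thermal = {rel:+.3e}")
```

The relative deviation should fall off roughly like $\beta/L$, consistent with the requirement $L \gg \beta$.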
The key step in obtaining this result was the replacement (3.24), whose validity depends on the microstate expectation value being sharply peaked over the ensemble of states. Obtaining analogous results for higher point correlators of the stress tensor will similarly depend on establishing that operators built out of sums of more $L_n$ are similarly sharply peaked. We turn to these questions in the next section.

Statistics of Virasoro generator expectation values
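We shall now consider the following quantity
$$X(w) \equiv \sum_{n=1}^\infty X_n\cos\left(\frac{2\pi nw}{L}\right), \qquad X_n \equiv L_{-n}L_n.$$
The replacement (3.24) is valid if $X(w)$ is sharply peaked over the thermal ensemble. In order to verify this, we will study its fluctuations $\delta X^2 \equiv \langle X^2\rangle - \langle X\rangle^2 = \delta X^2_{\rm diag} + \delta X^2_{\rm off\text{-}diag}$, with
$$\delta X^2_{\rm diag} = \sum_{n}\cos^2\left(\frac{2\pi nw}{L}\right)\left(\langle X_n^2\rangle - \langle X_n\rangle^2\right)$$
and
$$\delta X^2_{\rm off\text{-}diag} = \sum_{m\neq n}\cos\left(\frac{2\pi mw}{L}\right)\cos\left(\frac{2\pi nw}{L}\right)\left(\langle X_nX_m\rangle - \langle X_n\rangle\langle X_m\rangle\right). \tag{4.4}$$
The mode number $n$ is taken to be of order $n \sim L/\beta$. The relevance of this scaling follows from the fact that when we convert sums to integrals we write $x = \frac{2\pi\beta n}{L}$. The Bose-Einstein factor then appears as $(e^x-1)^{-1}$, leading to exponential suppression of the $x \gg 1$ regime. The same was true in the current correlator case.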
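Let us first consider (4.4). To evaluate such expectation values we will repeatedly make use of a simple trick: using
$$q^{L_0}L_{-n} = q^nL_{-n}q^{L_0}$$
and cyclicity of the trace, one can move the leftmost operator to the right end of the string and then rewrite the resulting expression as a sum of commutators plus the expectation value of the original string. For example, to compute $\langle X_n\rangle$ we write $\langle L_{-n}L_n\rangle = q^n\langle L_nL_{-n}\rangle = q^n\left(\langle L_{-n}L_n\rangle + \langle[L_n,L_{-n}]\rangle\right)$. Then
$$\langle X_n\rangle = \frac{\langle[L_n,L_{-n}]\rangle}{q^{-n}-1} = \frac{2n\langle L_0\rangle + \frac{c}{12}n(n^2-1)}{q^{-n}-1}.$$
Similarly, to compute $\langle X_nX_m\rangle$ we apply the same manipulation to $\langle L_{-n}L_nL_{-m}L_m\rangle$, which expresses it as a sum of three terms, each containing a single commutator, the third of which contains $[L_n,L_{-n}]$ (4.9). We now show that for $m \neq n$ the first two terms are subleading compared to the third in the thermodynamic limit. After evaluating the commutators, each of the three terms is proportional to an expectation value of the form $\langle L_mL_nL_p\rangle$ with $m+n+p = 0$. In the third term one of $(m,n,p)$ equals 0, unlike for the first two terms. If none of $(m,n,p)$ equals 0 then the same trick reduces $\langle L_mL_nL_p\rangle$ to expectation values of two generators multiplied by coefficients linear in the mode numbers, so that it scales as $(L/\beta)^4$. By contrast, $\langle L_0L_{-p}L_p\rangle \sim (L/\beta)^5$. Hence we see that the appearance of an $L_0$ insertion in the third term of (4.9) leads to an $L/\beta$ enhancement compared to the first two terms. The same enhancement arises from the $[L_n,L_{-n}] \sim n^3$ contribution in the third term. Therefore, for $m \neq n$, $\langle X_nX_m\rangle \approx \langle X_n\rangle\langle X_m\rangle$ up to corrections suppressed by $\beta/L$ in this regime. Returning to (4.4), using (4.8) and accounting for the two extra powers of $L/\beta$ that come from replacing the sums by integrals, we have $\delta X^2_{\rm off\text{-}diag}/\langle X\rangle^2 \sim \beta/L$ for all $w$ at high temperatures.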
To compute $\delta X^2_{\rm diag}$, we need to evaluate (4.9) when $m = n$. In this case, the first and third terms are equal and so $\langle X_n^2\rangle - \langle X_n\rangle^2$ is of the same order as $\langle X_n\rangle^2$. This yields the same scaling as for the off-diagonal piece. Altogether, we find that $\delta X^2/\langle X\rangle^2 \sim \beta/L$. This implies the fluctuation in the correlator vanishes in the large $L$ thermodynamic limit. It is straightforward to derive explicit expressions for the fluctuations, analogous to the case of the current correlator, by taking the limit $L \to \infty$ with $w$ and $\beta$ fixed and substituting $\langle X_n\rangle$ from (3.23). The fluctuations in the case with $w/L$ fixed can also be treated as for the current correlator. Higher-point correlation functions of the stress tensor take the form
$$\langle T(w_1)\cdots T(w_n)\rangle = (-1)^n\left(\frac{2\pi}{L}\right)^{2n}\sum_{m_1,\ldots,m_n}\langle L_{m_1}\cdots L_{m_n}\rangle\, e^{-\frac{2\pi i}{L}\sum_jm_jw_j}. \tag{4.19}$$
It is implicit in the above expression that the $L_0$'s are shifted by $-c/24$. Equality between (4.19) in a typical state and its thermal value will follow if the sum is sharply peaked over the ensemble. We will show in appendix B.1 that the fluctuations in (4.19) are again small as long as the number of stress tensor insertions is small compared to $L/\beta$. It then follows that equality of thermal and typical correlation functions extends to $n$-point functions of the stress tensor
$$\langle\psi_h|T(w_1)\cdots T(w_n)|\psi_h\rangle \approx \langle T(w_1)\cdots T(w_n)\rangle_\beta. \tag{4.20}$$
The above arguments hold when the number of stress tensor insertions $n$ is held fixed as $L/\beta \to \infty$, but can fail if $n$ is allowed to grow in the limit. This can be understood on general grounds as follows. We write the thermal correlation function of $n$ stress tensors as
$$\langle T(w_1)\cdots T(w_n)\rangle_\beta = \frac{1}{Z(\beta)}\int dE\,\rho(E)\,e^{-\beta E}\langle E|T(w_1)\cdots T(w_n)|E\rangle,$$
where $\langle E|T(w_1)\cdots T(w_n)|E\rangle$ denotes the average over all states of energy $E$, and $\rho(E)$ is the density of states. At high temperature, we think of evaluating the integral by locating a saddle point. Since $\langle E|T(w_1)\cdots T(w_n)|E\rangle \sim E^n$, if $n$ is held fixed as $\beta \to 0$ the saddle point location is unaffected by the presence of the stress tensors. The fact that the same saddle point energy arises independent of the length of the string, provided it is held fixed, is what is responsible for the factorization properties that imply small fluctuations.
On the other hand, if n ∼ L/β (or any more rapid growth) then the saddle point location does depend on the size of the string and the location of the stress tensors. Such correlators will therefore be sensitive to the particular microstate, which is not surprising given that in this regime we can arrange the stress tensors uniformly across the system with a spacing less than the thermal wavelength λ ∼ β.

Discussion
The main result of this paper confirms a general physical expectation: correlation functions in typical high energy states appear thermal. To reach this conclusion we needed to be sufficiently careful about what constitutes a typical state. A number of past works [10, 12-15, 21, 22] have compared expectation values in primary states to those in the thermal ensemble, and in some cases agreement was found. As we have seen here, the agreement in these cases requires working in the large c limit, since at finite c primary states are highly atypical. This atypicality is responsible for the mismatch between the expectation values of KdV charges computed in primary states versus the canonical ensemble. Typical states are instead descendants at level h/c, and taking this into account restores the agreement. We focussed here on correlation functions of conserved currents and stress tensors, but these remarks apply generally to correlators computed away from the large c limit. Our results have nontrivial implications for the comparison between CFT and black hole physics. Quantities computed in a black hole background are inherently coarse-grained and should therefore be compared with those evaluated in typical states of the CFT, rather than in primary states. For example, we expect disagreement between correlation functions in a heavy primary state (as studied e.g. in [23][24][25][26]) and the corresponding Witten diagrams or HRRT surfaces evaluated in the black hole background beyond leading order in 1/c.
We conclude with a few comments about the connection to the eigenstate thermalization hypothesis (ETH). The usual statement of ETH is that energy eigenstates of chaotic systems obey [1,2]
$$\langle a|O|b\rangle = \langle O\rangle_{\beta_E}\, \delta_{ab} + e^{-S(E)/2} f(E_a, E_b)\, R_{ab} \qquad (5.1)$$
where $E$ denotes the average of the nearby energies $E_a$ and $E_b$, $S(E)$ is the entropy at energy $E$, $\langle O\rangle_{\beta_E}$ is the thermal average of the "few-body" operator $O$ at the temperature $\beta_E$, $f(E_a, E_b)$ is a smooth function of the energies, and $R_{ab}$ is a random matrix. Although the full range of validity of this ansatz remains to be understood, it leads to physically reasonable behavior regarding the approach to thermal equilibrium in generic states. Our results are perfectly compatible with ETH, and further imply agreement between the vacuum block contribution to CFT quantities (such as the entanglement entropy) in thermal and typical states. However, since the only operators $O$ that we study are conserved currents and the stress tensor, we are not really testing the core elements of ETH. For example, the second term in (5.1) is not respected by taking $O$ to be the stress tensor, since the stress tensor has a strictly vanishing matrix element between states in different conformal families.
The ETH ansatz ensures that the expectation value of a local operator averaged over a long time will agree with its thermal value. In particular, even if one chooses an initial state for which an expectation value is far from thermal, the expectation value will simply fluctuate around its thermal value for almost all times, provided the matrix elements of the operator satisfy ETH. Such time-dependent behavior of course requires the system to be in a non-energy eigenstate (though with a sharply distributed energy), with the time dependence coming from the off-diagonal terms in (5.1). In this paper we have restricted attention to energy eigenstates, and although we have considered time dependent correlators this time dependence refers to the relative, as opposed to overall, location of the operators. Thus questions regarding thermalization are beyond our present scope, but under investigation.

A Current two-point function

A.1 Equivalence of two forms of thermal correlator
Here we establish the equivalence of (2.5) and (2.13). We start working on (2.5) by carrying out the sum over $m$ using . We have assumed $0 < {\rm Im}(w) < \beta$, so that the sums converge. Using and performing the sum over $n$, we find we arrive at (A.3).

A.2 Minimal size of thermal correlator
We are interested in taking $L \to \infty$ with $w = L/2$ at fixed $\beta$. This gives the minimal size of the thermal correlator, since periodicity under $w \cong w + L$ implies symmetry around this point.
We proceed by first performing the sum over $n$ in (2.5), which yields (A.8). We have the modular transformation From this we deduce

The higher-point functions of the stress tensor in the microstate take the form analogous to (3.19) for the 2-point case. At finite $h$ each $L_0$ should be replaced with $L_0 - c/24$, but the difference is subleading in the thermodynamic limit. In order to demonstrate approximate equality between the microstate and thermal correlators (4.20) we must show that this quantity is sharply peaked over the ensemble of states at fixed $h$. Accordingly, we study the fluctuations of Once again the fluctuations can be split into off-diagonal and diagonal pieces: Here off-diagonal refers to a term of the form with all of the $i_k$ distinct from all of the $j_k$. As in the main text, the kinematic factors just go along for the ride.
The diagonal terms are subleading in the thermodynamic limit, as in section 4. To see this we make use of a result (proven below) on the expectation values of strings of Virasoro generators. Suppose that $X$ is a string of Virasoro generators of length $\ell$ whose mode numbers sum to zero, with $s$ the largest number of non-overlapping substrings within $X$ whose mode numbers sum to zero. If the levels of the generators in $X$ scale like $L/\beta$ and $\ell \ll L/\beta$, then $\langle X\rangle \sim (L/\beta)^{\ell+s}$ (B.5) as $(L/\beta) \to \infty$. These are the levels that are relevant in the thermodynamic limit, as in the main text. For such $X$, it also follows that

First consider the scaling of the connected part of an off-diagonal term, The expectation value scales as $(L/\beta)^{2n+s+s'}$, where $s$ ($s'$) is the number of zero substrings in $\{i\}$ ($\{j\}$). We get additional factors of $(L/\beta)$ when we convert the sums to integrals: $(L/\beta)^{n-s}$ and $(L/\beta)^{n-s'}$ from the sums over $\{i\}$ and $\{j\}$ respectively. Accordingly this term scales as $L^0$, and one can check that the disconnected piece scales in the same way. These terms will make an $O(1)$ contribution to $\delta Y^2$ unless they cancel. Now consider a diagonal term, say with $i_1 = j_1$. The expectation value still scales as $(L/\beta)^{2n+s+s'}$ but there is one fewer sum since we have fixed $i_1 = j_1$. This term therefore scales as $L^{-1}$ and vanishes in the thermodynamic limit. Other diagonal terms will similarly make vanishing contributions to $\delta Y^2$ in the limit.
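The power counting above can be collected in one place. As a sketch, assume that each stress tensor insertion carries an overall factor $(2\pi/L)^2$ from the mode expansion on the cylinder, so that $\delta Y^2$ carries $(2\pi/L)^{4n}$. This prefactor is not restated in this appendix. Write $s$ and $s'$ for the numbers of zero substrings in $\{i\}$ and $\{j\}$. Then
\begin{align}
\text{off-diagonal:} \quad & \left(\frac{2\pi}{L}\right)^{4n} \left(\frac{L}{\beta}\right)^{2n+s+s'} \left(\frac{L}{\beta}\right)^{n-s} \left(\frac{L}{\beta}\right)^{n-s'} = \left(\frac{2\pi}{\beta}\right)^{4n} \sim L^{0} , \\
\text{diagonal } (i_1 = j_1): \quad & \left(\frac{2\pi}{L}\right)^{4n} \left(\frac{L}{\beta}\right)^{2n+s+s'} \left(\frac{L}{\beta}\right)^{2n-s-s'-1} = \left(\frac{2\pi}{\beta}\right)^{4n} \frac{\beta}{L} \sim L^{-1} .
\end{align}
The dependence on $s$ and $s'$ drops out in both cases, so the suppression of the diagonal terms comes entirely from the single missing sum.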
Returning to the off-diagonal terms, we see that $\delta Y^2$ will be $O(1)$ unless which we now demonstrate. We start from where we used the manipulations from section 4. Similarly, (B.10) The second term has the same length as the first but one fewer zero substring, so by (B.5) it has one fewer power of $(L/\beta)$. The first term therefore gives the leading behavior in the thermodynamic limit: This procedure can be iterated on all the $L_{i_k}$ until one is left with only terms of the form where we made use of (B.6). Thus the expectation value of the $j$ string factors out: We see that the leading term cancels, and the off-diagonal contribution to $\delta Y^2$ starts at $O(L^{-1})$. This gives rise to a fluctuation $\sim 1/\sqrt{L}$ at finite size, as for the two-point function.
When the number of insertions scales with $L/\beta$ they can be arrayed across the entire system with separation smaller than the thermal wavelength, so we have a very fine-grained probe. In this limit the argument above breaks down: eq. (B.5) no longer holds and the $q\partial_q \langle X\rangle$ term in (B.6) cannot be discarded. Thus the expectation value of the $j$ string does not factor out, and $Y$ has $O(1)$ fluctuations across the ensemble: such high-point correlators depend sensitively on the details of the microstate.

Proof of equation (B.5)
Suppose that $X$ is a string of Virasoro generators of length $\ell$ whose mode numbers sum to zero, with $s$ the largest number of non-overlapping substrings within $X$ whose mode numbers sum to zero. We will show that if the levels of the generators in $X$ scale as $L/\beta$ and $\ell \ll L/\beta$, then $\langle X\rangle \sim (L/\beta)^{\ell+s}$ (B.14) in the thermodynamic limit. We proceed by induction; the base case was shown in section 4. Now, suppose that the expectation value of an $(\ell, s)$ string scales as $(L/\beta)^{\ell+s}$ and consider an arbitrary $(\ell + 1, s)$ string provided $m \neq 0$. To obtain the last line we used the inductive assumption: the first term is the sum of $(\ell, s)$ strings multiplied by one factor of $(m - a_i) \sim (L/\beta)$, while the second term is the sum of $(\ell - 1, s - 1)$ strings multiplied by $m^3 \sim (L/\beta)^3$. If $m = 0$ then we have an $(\ell + 1, s + 1)$ string $\langle L_0 L_{a_1} \ldots L_{a_\ell}\rangle \approx \langle L_{a_1} \ldots L_{a_\ell}\rangle \cdot q\partial_q \ln Z \sim (L/\beta)^{\ell+s+2}$ (B.16) where we used (B.6) and the inductive hypothesis. This proves the claim. When $\ell \sim L/\beta$ these statements no longer hold: there are factors $\sim e^{L/\beta}$ relating different orderings of the string, so there will be some strings contributing to $\delta Y$ for which (B.5), (B.6) and (B.13) all break down.
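As a consistency check on the inductive step, the two kinds of terms have the structure expected from moving a generator through the string with the standard Virasoro commutator. This is a sketch, with $\ell$ the string length and $s$ the number of zero substrings:
\begin{align}
[L_m, L_a] &= (m - a)\, L_{m+a} + \frac{c}{12}\, m (m^2 - 1)\, \delta_{m+a,0} , \\
\text{first term:} \quad & (m - a_i) \times (\ell, s) \text{ string} \sim \left(\frac{L}{\beta}\right)^{1} \left(\frac{L}{\beta}\right)^{\ell+s} = \left(\frac{L}{\beta}\right)^{(\ell+1)+s} , \\
\text{second term:} \quad & m^3 \times (\ell - 1, s - 1) \text{ string} \sim \left(\frac{L}{\beta}\right)^{3} \left(\frac{L}{\beta}\right)^{(\ell-1)+(s-1)} = \left(\frac{L}{\beta}\right)^{(\ell+1)+s} .
\end{align}
Both contributions scale as required for an $(\ell + 1, s)$ string. The central term removes two generators and one zero substring, and its factor of $m^3$ makes up the difference.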

B.2 Ordering independence
In this subsection we argue that the ordering of operators defining the descendant state does not affect the expectation value of a string of operators in the thermodynamic limit. This does not affect our results, but leads to an effectively one-to-one correspondence between integer partitions and descendant states for purposes of computing expectation values.
Consider two descendants that differ only in the ordering of two Virasoro generators: We wish to argue that as $\beta/L \to 0$. First consider $L_a = L_b = X = 1$: which scales as $(L/\beta)^5$. This is the key point: terms that arise in the difference $\langle\psi_h|X|\psi_h\rangle - \langle\psi'_h|X|\psi'_h\rangle$ have one factor of $f_h$ and two of the levels in place of $f_h^2$ in $\langle\psi_h|X|\psi_h\rangle$, so the difference is subleading to the expectation values themselves in the thermodynamic limit.
If we now let $L_a$, $L_b$ and $X$ be arbitrary the above reasoning still applies. The objects to compare are which can be computed by commuting $L_{-i}$ and $L_{-j}$ (or $[L_{-i}, L_{-j}]$) all the way to the left. In the first, a term with two factors of $f_h$ is generated, while the second has at most one factor of $f_h$ and two factors of the levels. The remainder of the computation is the same in both cases, so the difference is subleading in the limit.
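The correspondence with integer partitions can be made concrete by counting. The sketch below assumes a generic Verma module with no null states, so that the number of ordering-independent descendants at level $N$ is the number of partitions $p(N)$. It computes $p(N)$ by a standard dynamic program and compares it with the Hardy-Ramanujan asymptotic $p(N) \sim e^{\pi\sqrt{2N/3}}/(4\sqrt{3}\,N)$. The function names `partition_counts` and `hardy_ramanujan` are hypothetical.

```python
import math


def partition_counts(max_level):
    """Return [p(0), ..., p(max_level)], the integer partition numbers.

    Assumption: each partition of N labels one descendant at level N
    (generic Verma module, ordering of generators ignored).
    """
    counts = [0] * (max_level + 1)
    counts[0] = 1
    # Allow parts of size 1, 2, ... one size at a time so that each
    # partition is counted once regardless of the order of its parts.
    for part in range(1, max_level + 1):
        for total in range(part, max_level + 1):
            counts[total] += counts[total - part]
    return counts


def hardy_ramanujan(level):
    """Leading asymptotic estimate of p(level) for large level."""
    return math.exp(math.pi * math.sqrt(2.0 * level / 3.0)) / (
        4.0 * math.sqrt(3.0) * level
    )


if __name__ == "__main__":
    p = partition_counts(100)
    for level in (10, 50, 100):
        print(level, p[level], hardy_ramanujan(level) / p[level])
```

The exponential growth $e^{\pi\sqrt{2N/3}}$ is the Cardy growth of a single chiral boson, which gives a sense of how many descendant states contribute at a given level.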