Echoes of chaos from string theory black holes

The strongly coupled D1-D5 conformal field theory is a microscopic model of black holes which is expected to have chaotic dynamics. Here, we study the weak coupling limit of the theory where it is integrable rather than chaotic. In this limit, the operators creating microstates of the lowest mass black hole are known exactly. We consider the time-ordered two-point function of light probes in these microstates, normalized by the same two-point function in vacuum. These correlators display a universal early-time decay followed by late-time sporadic behavior. To find a prescription for temporal coarse-graining of these late fluctuations we appeal to random matrix theory, where we show that a progressive time-average smooths the spectral form factor (a proxy for the 2-point function) in a typical draw of a random matrix. This coarse-grained quantity reproduces the matrix ensemble average to a good approximation. Employing this coarse-graining in the D1-D5 system, we find that the early-time decay is followed by a dip, a ramp and a plateau, in remarkable qualitative agreement with recent studies of the Sachdev-Ye-Kitaev (SYK) model. We study the timescales involved, comment on similarities and differences between our integrable model and the chaotic SYK model, and suggest ways to extend our results away from the integrable limit.


Introduction
Black holes are the most entropic objects in the universe. Their entropy $S_{BH} = A_{BH}/4G_N$ is proportional to the horizon area and implies that the energy spectrum of microstates has a minuscule gap ($\delta E \sim e^{-A_{BH}/4G_N}$), which becomes infinitesimal in the classical $\hbar \to 0$ limit. Various lines of evidence also suggest that the dynamics of the Hamiltonian acting on these microstates is chaotic [1,2,3], implying that the spectrum of excitations must be irregular [4]. Around a typical state of such a bounded system, general arguments from quantum mechanics suggest that the gapped, irregular spectrum will lead to temporal correlations showing a universal initial decay which gives way at very late times to rapid, small fluctuations whose precise structure is determined by the actual microstate.
Random matrix theory (RMT), where the Hamiltonian is drawn from a fixed ensemble, has been proposed as a universal description of this sort of behavior. In this theory it has been shown that the ensemble average of the spectral form factor, a proxy for the two-point function related to the 'easy' version of the information paradox [5,6,7], exhibits a characteristic initial decay, followed by an increasing ramp, and then a plateau. Recently, it was shown that the Sachdev-Ye-Kitaev (SYK) model [8,9,10], which is a strongly coupled model of quenched disorder inspired by black hole physics, also displays the decay, ramp and plateau phenomena [16]. Like random matrix theory, the SYK model contains an average over Hamiltonians: the coupling of the theory is drawn from a distribution, and the smooth ramp and plateau arise after averaging over this ensemble. We would like to understand whether this behavior occurs generally in black hole physics.
String theory contains many examples of black holes whose microscopic descriptions are well understood. The simplest setting is Type IIB string theory compactified on a torus with five asymptotically flat dimensions. This theory contains charged black holes whose extremal limit still has a large entropy. The low-energy excitations of such a black hole are described by a two-dimensional conformal field theory [18], the "D1-D5 CFT", which is a marginal deformation of a sigma model on the symmetric product target space $(T^4)^N/S_N$. Here $S_N$ is the permutation group acting on $N$ copies of $T^4$. The marginal deformation parameter acts as the coupling in the theory. When it is large the theory is expected to be chaotic as it describes a macroscopic black hole. When it goes to zero, the theory approaches the symmetric product limit where it is integrable.
These five-dimensional black holes can be reduced through a sequence of near-horizon, low-energy limits to extremal black holes in three- and two-dimensional Anti-de Sitter space. Indeed, through these limits, the D1-D5 CFT is known to be exactly dual to type IIB string theory on $AdS_3 \times S^3 \times T^4$ [19,20]. The SYK model was inspired by $AdS_2$ black holes, and hence it may be that the D1-D5 CFT at finite coupling has a reduction to an SYK-like model [13,15]. Here, we instead study the weak coupling limit of the theory. In this limit, the theory is integrable rather than chaotic but, remarkably, we show that many of the qualitative features of the chaotic RMT and SYK dynamics are already present.
Specifically, we consider dynamics around Ramond ground states of the D1-D5 theory which are typical microstates of the lightest black hole. These black holes have a large microscopic entropy, although it is not large enough to produce a classical black hole horizon. The temporal correlation function of graviton operators in these states shows an initial universal decay followed by sporadic fluctuations [21]. A similar structure occurs in observables computed with a single draw of a Hamiltonian in a random matrix theory. We argue, and numerically demonstrate, that the ensemble average in RMT can be mimicked by a progressive time-average in a single draw from the theory, over windows that scale proportionally to time. Applying this progressive time average to correlators in a typical ground state of the D1-D5 theory reveals an initial decay, followed by a long ramp and a plateau, qualitatively resembling both the RMT and SYK theories. The initial decay exactly reproduces the expected results in a black hole background [21]. We present analytic calculations of the plateau height and the shape of the ramp, and comment on the reasons for the quantitative differences between our results and those in fully chaotic theories like RMT and SYK. An interesting challenge for the future is to perturbatively turn on the marginal deformation that takes the integrable limit of the D1-D5 theory into a chaotic regime.
The paper is organized as follows. In section 2 we give a short review of the D1-D5 system at the orbifold point and the two point functions in the Ramond ground states based on [21]. In section 3 we introduce progressive time averaging in the context of random matrix theory and present evidence that it is capable of capturing both qualitative and quantitative features of the ensemble average. In section 4 we apply this time average to the D1-D5 graviton two point function and present the main results of the paper, including analytic estimates for the ramp and the plateau. In section 5 we conclude the paper. We include two appendices with some additional details of the discussion in section 4.

Correlators in the D1-D5 CFT at the orbifold point
We will consider Type IIB string theory compactified to five dimensions on $S^1 \times T^4$ with $N_1$ D1-branes wrapped on the $S^1$ and $N_5$ D5-branes wrapped on the entire compact space. At low energies, the effective theory describing the dynamics of excitations is a certain marginal deformation of an $\mathcal{N} = (4, 4)$ supersymmetric sigma model on the symmetric product target space $M_0 = (T^4)^N/S_N$, where $N = N_1 N_5$ and $S_N$ is the permutation group acting on the $N$ copies of $T^4$ [18]. The sigma model on $M_0$ describes $N$ "strands" of string propagating on $T^4$. While this is a free orbifold theory, it has an interesting spectrum and correlation functions, as we will see. The marginal deformation corresponds to turning on an interaction that allows splitting and joining of strings.
Taking an appropriate limit isolates the part of the spacetime that is exactly described by the CFT. In this low-energy limit we say that the D1-D5 CFT is holographically dual to Type IIB string theory on $AdS_3 \times S^3 \times T^4$. To have a large, weakly coupled $AdS_3$ space, $N$ must be large and, in addition, the CFT must be strongly coupled, i.e. deformed far from the orbifold point. We are going to study the theory in the opposite, weakly coupled limit, but still at large $N$.
The extremal, supersymmetric black holes in the five-dimensional asymptotically flat theory descend in this construction to the BTZ black holes of $AdS_3$ with periodic boundary conditions for fermions around the asymptotic circle in the $AdS_3$ geometry, i.e., they are in the Ramond sector of the theory. The lightest black hole, which is massless, has the quantum numbers of a ground state in this sector.
The construction of Ramond ground states of the D1-D5 CFT at the orbifold point is reviewed in detail in Appendix A of [21]; here we provide a brief summary. We think of the CFT as describing $N$ distinct "strands" of string, each of which propagates on $T^4$. The ground states of the theory are formed by joining strands into various closed strings, which may be "short" (consisting of one strand) or "long" (consisting of multiple strands). The strands are attached together by applying elementary bosonic ($\sigma$) and fermionic ($\tau$) twist operators which create $n$-wound string sectors. Each twist operator has 8 polarizations associated with the global symmetries of the theory. A general Ramond sector ground state is created by multiplying together bosonic and fermionic twist operators to achieve a total twist of $N = N_1 N_5$:
$$\prod_{n,\mu} (\sigma_n^\mu)^{N_{n\mu}} (\tau_n^\mu)^{N'_{n\mu}}, \qquad \sum_{n,\mu} n\,(N_{n\mu} + N'_{n\mu}) = N,$$
where $\mu = 1, \dots, 8$ labels the polarizations, and $n = 1, 2, \dots, N$ labels the possible lengths of strings (i.e. the number of strands a string is made of). For our purposes, the integers $N_{n\mu}$ and $N'_{n\mu}$, which count the various twist operators, uniquely specify a Ramond ground state of the theory. Note that while Ramond ground states all have the same energy, the spectrum of excitations around each of them is different.

For example, consider a case with $N = 4$ strands. Three possible states are four strings of length 1 ($N_1 = 4$), two long strings of length 2 ($N_2 = 2$), and one string of length 3 with another of length 1 ($N_3 = 1$, $N_1 = 1$). If we take the CFT to be on a circle of circumference $L$, the momentum spectrum of excitations is very different in these sectors: the first has modes gapped by $1/L$ with four-fold degeneracy, the second has modes gapped by $1/(2L)$ with two-fold degeneracy, and the third spectrum is a union of modes gapped by $1/L$ and $1/(3L)$, each with unit degeneracy. Thus correlation functions computed in each of the microstates will be different and will depend on the twist distributions $\{N_{n\mu}, N'_{n\mu}\}$.
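The $N = 4$ example above can be reproduced by brute force. The following is a minimal sketch under simplifying assumptions: it ignores the polarization label $\mu$ and the distinction between bosonic and fermionic twists, so a state is just a partition of $N$ into string lengths. The function names `partitions` and `momentum_gaps` are ours and purely illustrative.

```python
from collections import Counter
from fractions import Fraction


def partitions(total, max_part=None):
    """Yield every partition of `total` as a non-increasing tuple of string lengths."""
    if max_part is None:
        max_part = total
    if total == 0:
        yield ()
        return
    for part in range(min(total, max_part), 0, -1):
        for rest in partitions(total - part, part):
            yield (part,) + rest


def momentum_gaps(partition, circumference=1):
    """Map each momentum gap 1/(n L) to the number of strings of length n."""
    counts = Counter(partition)
    return {Fraction(1, n * circumference): k for n, k in counts.items()}


if __name__ == "__main__":
    # List every way of joining 4 strands, with the gaps and their degeneracies.
    for p in partitions(4):
        print(p, momentum_gaps(p))
```

Each partition corresponds to a different set of gaps and degeneracies, which is why the correlators depend on the twist distribution.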
There are of order $e^{2\pi\sqrt{2N}}$ such partitions, leading to an enormous ground state degeneracy in the theory, corresponding to an entropy $S = 2\pi\sqrt{2N}$. Nearly all partitions of a large integer lie very close to a certain "typical partition" [21]. This means that most Ramond states will in fact have twist distributions $\{N_{n\mu}, N'_{n\mu}\}$ that lie close to a certain typical distribution. Thus, although correlation functions measured in individual microstates will depend on the precise form of the state, for almost all microstates the generic correlation functions will take a typical form, which we seek to investigate. Microcanonically, we should study all partitions of integers that lead to a total twist of $N$. The easiest way, however, to derive the form of the typical state is to use the grand canonical ensemble with a "chemical potential" $\eta$ to fix the total "charge" $N$ for eight types ($\mu = 1, \dots, 8$) of bosons ($\sigma^\mu_n$) and fermions ($\tau^\mu_n$) with integral charges $n$. When $N$ is large, the grand canonical average populations for $\{N_{n\mu}, N'_{n\mu}\}$ will also be typical, in the sense that most configurations will be very close to the average (the standard deviation over the mean will tend to zero). Thus we can derive that most of the Ramond ground states have twist distributions close to the Bose-Einstein and Fermi-Dirac forms:
$$N_{n\mu} = \frac{1}{e^{\eta n} - 1}, \qquad N'_{n\mu} = \frac{1}{e^{\eta n} + 1}.$$
For further reference, we note that the entropy scales as $S \sim 1/\eta$. Now that we know the form of the typical Ramond ground state in the D1-D5 system it remains to calculate the correlation function. Again following [21], we will consider bosonic non-twist operators, which do not cut and join the $N$ strands of the CFT. (An operator describing a fluctuation of the metric in the $T^4$ directions is an example.)
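To make the role of the chemical potential concrete, the sketch below fixes $\eta$ numerically from the total twist. It assumes Bose-Einstein and Fermi-Dirac populations for eight polarizations each, so that the number of strings of length $n$ summed over polarizations is $8/(e^{\eta n}-1) + 8/(e^{\eta n}+1) = 8/\sinh(\eta n)$. The helper names and the bisection bounds are illustrative choices, not taken from [21].

```python
import math


def strings_of_length(n, eta):
    """Average number of strings of length n, summed over 8 bosonic and 8 fermionic polarizations.

    Assumes 8 / (exp(eta n) - 1) + 8 / (exp(eta n) + 1) = 8 / sinh(eta n).
    """
    x = eta * n
    return 0.0 if x > 700.0 else 8.0 / math.sinh(x)  # guard against overflow in sinh


def total_twist(eta, n_max):
    """Total twist sum_n n * N_n carried by the average populations, with n <= n_max."""
    return sum(n * strings_of_length(n, eta) for n in range(1, n_max + 1))


def solve_eta(total, tol=1e-12):
    """Bisect for the chemical potential that fixes the total twist to `total`."""
    lo, hi = 1e-4, 10.0  # total_twist decreases monotonically as eta grows
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_twist(mid, total) > total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    N = 10_000
    eta = solve_eta(N)
    # Replacing the sum by an integral gives eta ~ pi * sqrt(2 / N) at large N.
    print(eta, math.pi * math.sqrt(2.0 / N))
```

Under these assumptions the numerical solution approaches $\eta \approx \pi\sqrt{2/N}$ at large $N$, which is consistent with the scaling $S \sim 1/\eta$ noted above.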
We focus on $S_N$-invariant operators obtained as sums of copies acting on each strand, We are interested in two-point functions of the form Since the state as a whole splits into a product of strings of lengths $n$, the correlator splits into a sum of terms, each of which reduces to a two-point function in a CFT on a spatial circle that is $n$ times as long. After some algebra (see [21] for details), the correlation function becomes Here $h$ and $\bar h$ are the left and right conformal dimensions, $C$ is a constant, the sum on $n$ accounts for the contribution from strings of length $n$, and the sum on $k$ accounts for the placement of operators on different strands of a long string. Also, $w = w_1 - w_2$ and $\bar w = \bar w_1 - \bar w_2$ are differences in the lightcone positions of the probe operators. In Lorentzian signature we will set where $\phi$ and $t$ are dimensionless angular and time coordinates in the CFT, normalized by setting the length of the spatial circle to be equal to $2\pi$. The correlator $G(w, \bar w)$ exhibits physical lightcone divergences on the cylinder. We can regularize these divergences by dividing by the vacuum correlation function, which is fixed by conformal invariance. Focusing, for definiteness, on operators with conformal dimensions $h = \bar h = 1$ this results in [21] (2.9) Setting $\phi = 0$, we can evaluate the temporal correlation function numerically; the result is plotted in Fig. 1. We see a smooth initial decay followed by sporadic behavior, which is qualitatively similar to the behavior of observables in a single draw from a random matrix theory or in the SYK model before the average over disorder.
What is the origin of this sporadic behavior? As shown in Appendix A, the two-point function (2.9) for φ = 0 receives contributions from frequencies of the form m/n, with n an integer labeling the length of a component string (so 1 ≤ n ≤ N ). This is a dense spectrum consisting of all rational numbers with denominators smaller than N + 1. The mixing of this large number of incommensurate frequencies produces the rapid late time oscillations. A feature of the theory is that excitations on different long strings do not interact at the orbifold point. This means that the smallest frequencies that occur in the two-point function are much larger than implied by the dense spectrum because no terms depend on the energy differences between excitations of strings of different lengths. (Hence, while the spectrum contains excitations with energies 1/(N − 1) and 1/N the two point function does not contain the difference 1/(N − 1) − 1/N .) If we were dealing with a fully chaotic system, we would expect all the degeneracies in the spectrum to be broken, leading to exponentially small energy spacings. When we later analyze smooth, time-averaged versions of our two-point function, we will see that the relatively large frequency gap will cause the late-time plateau value to be reached earlier than would have been the case for smaller gaps.
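The contrast between the dense spectrum and the frequencies that actually enter the correlator can be illustrated with exact rational arithmetic. This is a toy sketch: it simply enumerates the rationals $m/n$ in $(0, 1]$ with $1 \le n \le N$, and the function names are ours.

```python
from fractions import Fraction


def frequencies(n_max):
    """Distinct frequencies m/n in (0, 1] with 1 <= n <= n_max."""
    return sorted({Fraction(m, n) for n in range(1, n_max + 1) for m in range(1, n + 1)})


def smallest_spacing(freqs):
    """Smallest gap between neighbouring entries of a sorted frequency list."""
    return min(b - a for a, b in zip(freqs, freqs[1:]))


if __name__ == "__main__":
    N = 12
    freqs = frequencies(N)
    print("lowest frequency in the correlator:", freqs[0])          # 1/N
    print("smallest spacing in the dense set:", smallest_spacing(freqs))  # 1/(N(N-1))
```

For $N = 12$ the smallest spacing is $1/11 - 1/12 = 1/132$, while the lowest frequency that can appear in the two-point function is $1/12$, since differences between modes of different strings never enter.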
One might wonder why the averaging over states we have performed by going to the grand canonical ensemble has not led to a smoothing of the sporadic behavior, in the way that ensemble averaging does for random matrix theory and SYK. The reason is that every state in the ensemble has exactly the same spectrum, albeit with different degeneracies. Our ensemble average therefore does not have the same effect as the averaging over different spectra that produces smoothing in random matrix theory and SYK. Thus, to obtain smooth late-time behavior, we will need another way of coarse graining, to which we turn next.

Random matrices and progressive time-averaging
The sporadic late-time fluctuations of the two-point correlation function (2.9) are reminiscent of similar behavior found in [16] for the SYK model. There, smooth curves were obtained by averaging over random couplings. The main object of study in [16] was the spectral form factor where the sum runs over all the eigenvalues $E_n$ of a Hamiltonian drawn from an ensemble. The spectral form factor displays sporadic late-time behavior, which can be smoothed by averaging over the ensemble of Hamiltonians. The main result of [16] is that the averaged SYK form factor agrees very well with the spectral form factor in random matrix theory, again after averaging over the random Hamiltonians. One motivation for studying the spectral form factor is found in the spectral decomposition of the thermal two-point function, It is believed that the late-time behavior is controlled by the phases $e^{i(E_m - E_n)t}$, so that some features of the two-point function in this regime are captured by the spectral form factor (3.1). Another motivation is that the spectral form factor is a more primitive quantity than two-point functions, in that it can be obtained directly from the partition function and does not require the introduction of operators. As a result, it can be straightforwardly studied in random matrix theory. Our D1-D5 CFT has a definite Hamiltonian, so we cannot resort to disorder averaging for smoothing the sporadic late-time behavior of two-point functions. Is there any other meaningful way in which the late-time oscillations can be smoothed? A natural idea is to coarse-grain the correlator over time. One quickly notices, however, that averaging with any fixed time window either fails to remove the late-time oscillations or significantly distorts the early-time decay. This leads to the idea of using a time window that grows with time, which we refer to as progressive time-averaging. In order to motivate a specific prescription we turn to random matrix theory.
We will ask whether the well-known result of ensemble averaging could be alternatively obtained by progressive time-averaging applied to a single Hamiltonian drawn from the ensemble. We will find that this is the case for a time window that grows linearly with time, which will motivate our use of this procedure in the context of the D1-D5 CFT.

Ergodicity in random matrix theory
We consider Hamiltonians that are $L \times L$ matrices drawn from a random matrix ensemble. An important phenomenon in random matrix theory is self-averaging of certain quantities, i.e. the agreement of a quantity evaluated on a typical instance of the ensemble with the ensemble average of the same quantity. Ergodic quantities are an interesting generalization of self-averaging quantities. For ergodic quantities the result of averaging over random Hamiltonians can be approximately reproduced by using a single Hamiltonian drawn from the ensemble and coarse-graining in time.
It is known in random matrix theory that the spectral form factor is self-averaging for sufficiently short times but not for longer times [22]. On the other hand, we can study the ergodicity of the form factor by considering suitable time averages where $g(t, \Delta t)$ is some smearing function with peak at $t = 0$, width $\Delta t$ and $\int dt\, g(t, \Delta t) = 1$. We could imagine it to be a Gaussian or a step function, but its details should not matter too much. The spectral form factor for a Gaussian random matrix ensemble, which is related to the late time behavior of the SYK model, is not ergodic for any fixed time window $\Delta t$ [22].

Progressive time-averaging
We will now provide evidence suggesting that a progressive time average with $\Delta t \sim t$ gives a good approximation to the ensemble average for Gaussian random matrices. This is equivalent to averaging over fixed windows in $\log t$. We first present a heuristic motivation, followed by numerical evidence. In Gaussian random matrix ensembles, the probability distribution for the difference of two neighboring energy levels $s = E_{n+1} - E_n$ with $s > 0$ is given by [24] where $s_0$ is the average value of $s$, $A$ is the normalization, and the constants $\alpha$ and $\beta$ depend on the specific ensemble. In the spectral form factor (3.1), the phases $\exp(\pm i s t)$ appear with the same weight and so add to give a term proportional to $\cos(st)$. Let us ask what happens to this cosine upon ensemble averaging: We see a cancellation of the random phases in the average, resulting in Gaussian decay.
Can we reproduce this with a time average? Consider for example a Gaussian smearing function applied to a typical phase $e^{i s_0 t}$ and its conjugate: We see that we need to set $\sigma = t/\sqrt{2\alpha}$ in order to reproduce the decay of the ensemble average. At the qualitative level the argument depends relatively little on the smearing function. For instance, for a step function we find which for $\sigma \sim t$ is again a decaying function with width of order $1/s_0$. The argument above focuses on time dependences associated to differences of neighboring energy levels rather than generic energy differences, so that we may really trust it only at very late times. While it would be interesting to find a more precise analytic argument, in the present paper we will simply use the heuristic argument as motivation, verify numerically that the resulting prescription produces good results in random matrix theory, and then apply it to our system of interest.

Now we present numerical evidence suggesting that the progressive time-average with window $\Delta t \sim t$ of the spectral form factor (3.1) is ergodic. We work with $1024 \times 1024$ matrices and, for numerical speed up, discretized time averages We draw a single pseudorandom Hermitian matrix $H$ from the Gaussian Unitary Ensemble (GUE), $p(H) = \frac{1}{2^{L/2}\pi^{L^2/2}}\, e^{-\frac{L}{2}\mathrm{Tr}H^2}$ with $L = 2^{10}$. The spectral form factor for such a single matrix is plotted in Fig. 2. We see self-averaging for early times which is quickly overtaken by noise for late times.
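The step-function estimate in the heuristic argument can be checked directly. The sketch below averages a single phase $e^{i s_0 t'}$ over a window of width $at$ centred at $t$ and compares the result with the elementary closed form $e^{i s_0 t}\,\sin(s_0 \sigma/2)/(s_0 \sigma/2)$ for a window of width $\sigma$. The midpoint discretization, the default $a = 0.8$ and the function names are our own illustrative choices, not the discretization used for the figures.

```python
import cmath
import math


def window_average(f, t, width, samples=2000):
    """Midpoint-rule average of f over [t - width/2, t + width/2]."""
    step = width / samples
    start = t - width / 2
    return sum(f(start + (j + 0.5) * step) for j in range(samples)) / samples


def progressive_average(f, t, a=0.8, samples=2000):
    """Step-function time average with a window that grows as a * t."""
    return window_average(f, t, a * t, samples)


def step_average_of_phase(s0, t, width):
    """Closed form of the window average of exp(i s0 t')."""
    x = s0 * width / 2
    return cmath.exp(1j * s0 * t) * math.sin(x) / x


if __name__ == "__main__":
    s0 = 1.3
    for t in (5.0, 20.0, 80.0):
        fixed = abs(step_average_of_phase(s0, t, 2.0))      # fixed window: no decay in t
        growing = abs(progressive_average(lambda u: cmath.exp(1j * s0 * u), t))
        print(t, fixed, growing)
```

With a fixed window the magnitude of the averaged phase does not depend on $t$, whereas with a window proportional to $t$ it is bounded by $2/(s_0 a t)$ and therefore decays on a time scale of order $1/s_0$.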
The ensemble average of the spectral form factor for Gaussian ensembles is a well studied quantity. In particular, it behaves universally for large $L$, exhibiting an early slope from an initial value $\sim L^2$, followed by a dip, a linear rise over time scale $\sim L$, and finally a plateau which is the infinite-time average and is of order $L$ [25]. We plot this ensemble average for 500 random matrices in Fig. 3 and Fig. 4, with the results of time-averaging for a single matrix drawn from GUE superposed, with various fixed windows and a progressive window $\Delta t = 0.8t$, respectively (see footnote 4). It is clear that the progressive time window provides a much better approximation to the ensemble average than the fixed time windows.

Echoes of chaos in D1-D5 two-point functions
Our examination of the D1-D5 theory at the orbifold point will focus on the regularized Lorentzian two-point function (2.9) evaluated at temporal distance $t$ and equal location in space, which we denote $\hat{G}(t)$ (4.1). We start by applying the progressive time average of the previous section to this two-point function. For numerical simplicity, we will use the pointwise averaging of (3.9) with $\Delta t = t$ (see footnote 5).

Footnote 4: The order one coefficient in front of $t$ should be smaller than 2 because otherwise, in (3.9), we are calculating the total integral of the function. Other than this constraint, the late part of the ramp and the plateau are rather insensitive to this coefficient, as follows from the argument presented around (3.6). On the other hand, the location of the dip depends slightly on the choice of the coefficient. The coefficient could be tuned to minimize deviations from the early self-averaging part of the curve in order to extract the dip time.

Footnote 5: One might expect a Gaussian kernel to produce smoother curves, but it is numerically more challenging.
The results are presented in Fig. 5. The smoothed curve has a dip that is lower than the late time average (plateau), which it reaches after climbing a ramp whose length increases with N . This is in qualitative agreement with random matrix theory and the SYK model. The remainder of this section is devoted to an analytic study of the late time ramp and plateau, highlighting the quantitative differences from random matrix theory.
In the following, we will find two ways of rewriting $C_n(t)$ useful. The first was obtained in [21] by explicitly evaluating the sum over $k$: (4.2) The second rewriting (worked out in Appendix A) will be handy for deriving analytically the late-time behavior of the correlator (4.1). In addition, it emphasizes the similarity between the D1-D5 two-point function and the spectral form factor: Here the 'spectral weights' follow a 'triangle law' (see Fig. 6) and $G_n(x) = n$ if $n$ divides $x$ and $G_n(x) = 0$ otherwise (4.5). Below, we analyze the behavior of (4.1) after progressive time-averaging.

Plateau
At very late times, the progressive time averages of the quantities $C_n(t)$ tend to limiting values: $\bar{C}$ The function $G_n(2m + 2)$ vanishes unless $2m + 2$ is a multiple of $n$. This requires that for even $n$ and $m = n - 1$ for odd $n$. This leads to $\bar{C}_{n(\mathrm{even})} = 3/2$ and $\bar{C}_{n(\mathrm{odd})} = 1$ (4.8), from which we obtain: (4.9) We may approximate $\bar{G}$ for the typical state using the grand canonical ensemble in which the total twist $N$ is fixed with a chemical potential $\eta$ according to equations (2.3) and (2.4).
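The plateau values $3/2$ and $1$ can be cross-checked numerically. The sketch below assumes the closed form $C_n(t) = \frac{2}{n}\left(\frac{\sin(t/2)}{\sin(t/n)}\right)^2\left(1 + \frac{\sin t}{n\tan(t/n)}\right)$, which is our reading of the integrand of eq. (4.16) and should be treated as an assumption. It averages this function over one full period $2\pi n$, which equals the infinite-time average for a periodic function. The function names are illustrative.

```python
import math


def c_n(n, t):
    """Assumed closed form of the length-n contribution at phi = 0 for h = hbar = 1.

    (2/n) (sin(t/2) / sin(t/n))^2 (1 + sin(t) / (n tan(t/n)))
    """
    ratio = math.sin(t / 2) / math.sin(t / n)
    return (2.0 / n) * ratio**2 * (1.0 + math.sin(t) / (n * math.tan(t / n)))


def long_time_average(n, samples=20000):
    """Average c_n over one period 2*pi*n with the midpoint rule.

    An even number of samples keeps the midpoints off the removable
    singularities at t = n * pi * m.
    """
    period = 2 * math.pi * n
    step = period / samples
    return sum(c_n(n, (j + 0.5) * step) for j in range(samples)) / samples


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, round(long_time_average(n), 6))
```

As a sanity check on the assumed form, $n = 2$ gives $C_2(t) = 3/2 + \tfrac{1}{2}\cos t$ and $n = 3$ gives $C_3(t) = \left(16\cos^4(t/3) + 8\cos(t/3) + 3\right)/9$, whose averages are $3/2$ and $1$ respectively.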
In this approximation, we may let the sums in (4.9) run to infinity, (4.10) and then, when $N$ is large, approximate them with integrals: Here $\delta$ is an $O(1)$ number that parameterizes the discretization error at the lower limit. Evaluating the integrals gives:

Ramp
To exhibit the ramp behavior of the correlator (4.1), we again employ progressive time averaging. For numerical convenience, we use a simple step-function averaging reminiscent of eq. (3.9):
$$\tilde{G}(t) = \frac{1}{t}\int_{t/2}^{3t/2} dt'\, \hat{G}(t'). \qquad (4.14)$$
The progressively time-averaged $\tilde{G}(t)$ decomposes into progressively time-averaged contributions from individual modes,
$$\tilde{G}(t) = \frac{1}{N}\sum_n N_n \tilde{C}_n(t), \qquad (4.15)$$
each of which takes the form:
$$\tilde{C}_n(t) = \frac{2}{nt}\int_{t/2}^{3t/2} \left(\frac{\sin(t'/2)}{\sin(t'/n)}\right)^2 \left(1 + \frac{\sin t'}{n\tan(t'/n)}\right) dt'. \qquad (4.16)$$
This rewriting follows from eq. (4.2). An intuitive approach to estimate the ramp is to recognize that for each individual $n$, the contribution of $\tilde{C}_n(t)$ to the correlator jumps fairly quickly from the low point in the dip to the plateau. This is because there is only one timescale, as the gap (which sets the plateau time as $\gamma \times 1/\mathrm{gap}$ for some constant $\gamma$) and the level spacing are the same. We also know from the above that for odd $n$ the contribution to the plateau is 1 and for even $n$ it is 3/2. This suggests that we can write the ramp part of the correlator as: In this equation we have taken into account that by a time $t$ the contribution of any $n$ with $n < t/\gamma$ will have hit its plateau, and we approximate the other modes as being 0.
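The step approximation just described is easy to evaluate directly. The sketch below assumes typical occupation numbers $N_n = 8/\sinh(\eta n)$ with $\eta = \pi\sqrt{2/N}$ (the large-$N$ grand canonical values under our reading of (2.3) and (2.4)) and sums the plateau contributions of all modes with $n \le t/\gamma$. It then compares the growth between two times with the $\tanh$ form of eq. (4.20), in which the $O(1)$ constant $\delta$ and the $\log 2$ term cancel. The function names and the choice $\gamma = 1$ are illustrative.

```python
import math


def occupation(n, eta):
    """Typical number of strings of length n, assuming N_n = 8 / sinh(eta n)."""
    x = eta * n
    return 0.0 if x > 700.0 else 8.0 / math.sinh(x)  # guard against overflow in sinh


def ramp_estimate(t, total, gamma=1.0):
    """Step approximation: modes with n <= t / gamma sit on their plateau, the rest contribute 0."""
    eta = math.pi * math.sqrt(2.0 / total)  # assumed large-N chemical potential
    n_cut = min(int(t / gamma), total)
    height = 0.0
    for n in range(1, n_cut + 1):
        plateau = 1.5 if n % 2 == 0 else 1.0  # plateau values for even and odd n
        height += occupation(n, eta) * plateau
    return height / total


def log_ramp_difference(t1, t2, total, gamma=1.0):
    """Growth between t1 and t2 predicted by the tanh form of the ramp."""
    eta = math.pi * math.sqrt(2.0 / total)
    num = math.tanh(t2 * eta / (2 * gamma))
    den = math.tanh(t1 * eta / (2 * gamma))
    return 5 * eta / math.pi**2 * math.log(num / den)


if __name__ == "__main__":
    N = 10**6
    for t in (10, 100, 1000, 10000):
        print(t, ramp_estimate(t, N))
```

In this sketch the estimate rises roughly logarithmically for $1 \ll t \ll 1/\eta$ and saturates once $t\eta/\gamma \gg 1$, that is, at times of order $\sqrt{N}$.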
Here $\gamma$ is some $O(1)$ number that relates the scale of the gap to the precise timescale of the plateau. This may depend on the spectrum and on the operator. The sum includes a unit contribution from the plateau for any $n$ and the second sum includes the additional 1/2 that is present for even $n$. Putting in the occupation numbers, we get . For the first sum $\tau = t/\gamma$ and for the second sum $\tau = t/(2\gamma)$. Also, note that $\eta$ in the second sum is multiplied by a factor of 2. Putting this all together and doing some elementary algebra gives
$$\tilde{G}(t) = \frac{5\eta}{\pi^2}\log\left(\frac{1}{\eta\delta}\tanh\frac{t\eta}{2\gamma}\right) + \frac{8\eta}{2\pi^2}\log 2, \qquad (4.20)$$
where we can ignore the last term for our purposes. The characteristic time scale is the plateau time $t_p \sim 1/\eta \sim \sqrt{N}$. Let us understand the time dependence. First, note that $t$ cannot be taken to zero since the integral approximation is invalid in that case. Thus, $t$ is at least $O(1)$ and $\tilde{G}(t) \ge 0$. We can consider two useful limits. First so the ramp rises logarithmically, in contrast to the linear rise for random matrices. In this range,

Dip
In this subsection, we consider the temporal coarse graining of (4.1) with generic progressive time window of width ∆t = at (generalizing (4.14)), which we will denote bỹ Such a generalization does not modify the conclusions about the late part of the ramp and the plateau time. However, we do expect the precise location of the dip to be sensitive to the parameter a and therefore the specific coarse graining that we pick. As we will see, the scaling with the entropy is independent of a.
The strategy we use is to approximate (4.24) as the sum of the contribution coming from the regularized $M = 0$ BTZ two-point function [21] In order to extract the dip time, we want to find the minimum of this curve. Let us assume that the dip happens at $1 \ll t_d \ll \sqrt{N}$. In this case, we can approximate the ramp part with the logarithmic rise of (4.21). On the other hand, the BTZ part asymptotes to 9 1 at so that for times 1 see the right panel of Fig. 8 for an illustration. The minimum of this curve is at which scales as $t_d \sim \sqrt{S}$ with the entropy. Note that we indeed have $1 \ll t_d \ll \sqrt{N}$. This establishes a parametrically long ramp. Notice that we have $t_d \sim \sqrt{t_p}$, which is also valid for random matrices.

Variances
Eqs. (4.13) and (4.18) apply to the typical Ramond ground state of the D1-D5 system. We may ask by how much other Ramond ground states differ from these typical values.
We start by considering eigenstates of the occupation numbers $N_n$. The middle expression in eq. (4.17) casts $\tilde{G}(t)$ as a linear combination of distinct occupation numbers $N_n$, which in the grand canonical ensemble are independent random variables. Thus, the variance in the ramp part of the progressively time-averaged correlator, $\tilde{G}(t)$, can be approximated in terms of variances in $N_n$: The variance in the plateau height is obtained by taking the late time limit of the above expression: where we have expanded in small $\eta$ (large $N$) in the last expression. So the standard deviation in the plateau height divided by the mean (4.13) scales as $1/\log(1/\eta) \sim 1/\log N$. Note that the primary sources of the deviation are the 'relatively short long strings' with n 1. We can also estimate the variance of the slope of the ramp. From (4.17), we recognize that after sufficient coarse-graining, $d\tilde{G}(t)/dt \propto N_{t/\gamma}/N$. Therefore, At central times in the ramp $t \sim \sqrt{N}$. Since $\eta \sim 1/\sqrt{N}$ this means that the hyperbolic functions on the right hand side are $O(1)$. So the standard deviation in the slope is $O(1/N)$. At central times in the ramp, we showed above that the slope is also $O(1/N)$. Thus the slope varies significantly between occupation number eigenstates.
However, a typical Ramond ground state is a superposition of these eigenstates. Since $\tilde{G}(t)$ is not an eigenvalue, but an expectation value of a quantum mechanical operator, it makes sense to discuss the variances among all ensemble members, including superpositions. Such a variance, with superposition states weighted with a uniform measure over $\mathbb{CP}^{\exp S}$, was computed in [26]. This paper showed that the variance in the expectation value of a quantum mechanical operator is suppressed relative to the variance among its eigenstates by an extra factor of the dimension of the Hilbert space. Thus, all the variances computed in the preceding paragraphs receive an additional factor:
$$\mathrm{var}_{\text{superpositions}} = e^{-S}\, \mathrm{var}_{N_n\,\text{eigenstates}} = e^{-2\pi\sqrt{2N}}\, \mathrm{var}_{N_n\,\text{eigenstates}}. \qquad (4.36)$$
Thus we can conclude that over the entire Hilbert space, almost all states will show a coarse-grained two-point function that lies very close to the results that we have computed in the typical state.

Discussion
We have studied the time-ordered two-point correlation function of certain operators in typical states of the Ramond sector of the D1-D5 CFT. At strong coupling these are black hole microstates and the theory is expected to be chaotic. Here, we studied the weak coupling limit of this theory, where it is integrable. After temporal coarse-graining, the late time two-point function displays a characteristic dip, ramp and plateau. These features are remarkably similar to those seen in random matrix theory (RMT) and the SYK model, showing that the qualitative form does not specifically arise from the chaos present in those models.
A key quantitative difference is that the slopes of the ramps in RMT and SYK are constant, while in our model the slope decreases with time. Also, while the RMT and SYK plateaus are exponentially suppressed in the entropy S, our plateau scales as log S/S. Finally, the plateau in RMT and SYK is reached at a time that is exponential in the entropy, while in our case it is reached at times proportional to the entropy.
These quantitative differences arise from the different structures of the excitation spectra. In a chaotic theory the energy eigenvalues are typically non-degenerate and have spacings that are exponentially small in the entropy. Random matrix theories also demonstrate a phenomenon of spectral rigidity, in which repulsion between eigenvalues of the Hamiltonian produces long-range correlations in the spectrum. The exponentially small gap leads to the exponentially large plateau time, and the linear ramp is partially a consequence of the spectral rigidity [16]. By contrast, although the D1-D5 theory at the orbifold point has a dense spectrum, there is a very large degeneracy of each energy level and the gaps are not exponentially small. This leads to a much shorter timescale for the plateau. The plateau is also much higher because the theory explores its phase space less completely than a chaotic model.
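The effect of degeneracy on the plateau height can be illustrated with the spectral form factor, used here as a proxy for the two-point function. The following is a minimal sketch under stated assumptions: the two spectra are toy stand-ins (one draw of a GUE matrix, and a hypothetical set of 16 integer levels that are each 16-fold degenerate), not the D1-D5 or SYK spectra, and the helper names are ours. For the infinite-temperature form factor $|\sum_j e^{-iE_j t}|^2/D^2$, the long-time average is $\sum_E d_E^2/D^2$, where $d_E$ is the degeneracy of level $E$; this equals $1/D$ for a non-degenerate spectrum and is larger when levels are degenerate.

```python
import numpy as np

def form_factor(energies, times):
    """Infinite-temperature spectral form factor |sum_j exp(-i E_j t)|^2 / D^2."""
    phases = np.exp(-1j * np.outer(times, energies))
    return np.abs(phases.sum(axis=1)) ** 2 / energies.size ** 2

def plateau_height(energies, decimals=9):
    """Long-time average of the form factor: sum_E d_E^2 / D^2."""
    _, degeneracies = np.unique(np.round(energies, decimals), return_counts=True)
    return np.sum(degeneracies ** 2) / energies.size ** 2

rng = np.random.default_rng(1)
dim = 256

# Chaotic stand-in: eigenvalues of a single GUE draw (generically non-degenerate).
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
gue_levels = np.linalg.eigvalsh((a + a.conj().T) / 2)

# Integrable stand-in (hypothetical): 16 integer levels, each 16-fold degenerate.
degenerate_levels = np.repeat(np.arange(16, dtype=float), 16)

plateau_gue = plateau_height(gue_levels)         # 1 / dim
plateau_deg = plateau_height(degenerate_levels)  # 1 / 16

# Cross-check: the integer spectrum is periodic in t with period 2*pi, so an
# average over one period on a uniform grid reproduces the plateau height.
times = 2 * np.pi * np.arange(64) / 64
averaged = form_factor(degenerate_levels, times).mean()

print(plateau_gue, plateau_deg, averaged)
```

In this toy setting both spectra have the same number of states, but the degenerate one has a plateau 16 times higher. Its level spacing is also of order one, so its recurrence time is $2\pi$ rather than a time set by an exponentially small gap. This is only a qualitative illustration of the mechanism described above and does not reproduce the $\log S/S$ scaling of our plateau.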
The authors of [27] argue that in general 2d CFTs, the dip occurs at times proportional to the entropy. Likewise, the authors of [28] predict a breakdown of the semiclassical description of the two-point function at times of order the entropy. (See also [29] for related work in the context of the D1-D5 system.) In contrast, the location of our dip scales with the entropy as $\sqrt{S}$. The reason for the difference is that all of these works address finite temperature states, and require generalization to apply to the zero temperature, large entropy system that we examine.$^{10}$ It would also be useful to see what the results in [27,28] imply for the late time, finite temperature two-point function in the orbifold D1-D5 theory.
It would be very interesting to see how these phenomena change as the D1-D5 theory is deformed from the integrable point that we studied to the strongly coupled region where it is expected to be chaotic and dual to weakly coupled AdS$_3$ gravity. One strategy for making progress is to turn on this marginal deformation perturbatively [30], although it may be challenging to sum the perturbation series with sufficient accuracy to capture the late time physics. Another interesting avenue is to consider correlation functions of twist operators that induce interaction between the long string components of the state. The resulting mixing should break degeneracies between energy levels and lead to much smaller gaps. This will in turn lead to much longer timescales for the ramp and the plateau in the two-point function.
$^{10}$ In SYK, the range of parameters where there is both chaotic behavior and IR conformal symmetry is $1 \ll \beta J \ll N$, where $\beta$ is the inverse temperature and $J$ is the coupling [11]. We see that in the zero temperature limit we need to switch off the coupling to stay in this regime. Our situation is somewhat similar to this.
We plot this function in Fig. 6.

B More detailed ramp estimate
For all except very small values of n, the quantities $\tilde{C}_n(t)$ defined in (4.16) show a universal behavior.$^{11}$ Following a rapid decay from their initial values, the $\tilde{C}_n(t)$ hover near zero for a time $\sim 0.67\,n\pi$. At that time, the even-n quantities undergo one sharp jump to near their asymptotic value of 3/2 and, thereafter, many smaller jumps and gentle decays that keep the $\tilde{C}_n(t)$ near 3/2. For odd n, the $\tilde{C}_n(t)$ rise to near their asymptotic value of 1 in two distinct sharp jumps that happen at approximately $0.67\,n\pi$ and $1.34\,n\pi$, also followed by many smaller jumps and gentle decays which keep the $\tilde{C}_n(t)$ near 1. We have not derived these statements analytically, but they are manifest from the plots in Fig. 9. As a coarse approximation to the time dependence of $\tilde{G}(t)$, we may model the $\tilde{C}_n(t)$ as simple step functions:
$$\tilde{C}_n(t) \approx \begin{cases} \tfrac{3}{2}\,\Theta(t - 0.67\,n\pi) & n \text{ even}, \\ 0.54\,\Theta(t - 0.67\,n\pi) + 0.46\,\Theta(t - 1.34\,n\pi) & n \text{ odd}. \end{cases} \qquad (B.1)$$
As a next step, we substitute the occupation numbers for the typical state and replace the sums with integrals. After these approximations, it will not be meaningful to keep track of the various O(1) coefficients appearing in (B.2).
Figure 9: The progressively time-averaged correlator contributions of individual modes $\tilde{C}_n(t)$ for even n (left; n = 100, 500, 1500) and odd n (right; n = 101, 501, 1501). We have also marked the Θ-functions from eq. (B.1).
Thus, we introduce a single O(1) coefficient γ that parameterizes the average rate at which the successive modes join the ramp:
The factor of 5/4 is the average height of the jumps undergone by the even (3/2) and odd (1 = 0.54 + 0.46) modes. Since $N = 2\pi^2/\eta^2 \gg 1$, this reduces to: