Ergodic Equilibration of Rényi Entropies and Replica Wormholes

We study the behavior of Rényi entropies for pure states from standard assumptions about chaos in the high-energy spectrum of the Hamiltonian of a many-body quantum system. We compute the exact long-time averages of Rényi entropies and show that the quantum noise around these values is exponentially suppressed in the microcanonical entropy. For delocalized states over the microcanonical band, the long-time average approximately reproduces the equilibration proposal of H. Liu and S. Vardhan, with extra structure arising at the order of non-planar permutations. We analyze the equilibrium approximation for AdS/CFT systems describing black holes in equilibrium in a box. We extend our analysis to the situation of an evaporating black hole, and comment on the possible gravitational description of the new terms in our approximation.


Introduction
Recent attempts to derive the Page curve from semiclassical Euclidean gravity in low-dimensional models have proven remarkably successful [1][2][3]. The lesson is that the sharp turnover of the Page curve for Rényi entropies can be reproduced à la Hawking-Page from an exchange of dominance at the Page time between the disconnected Euclidean black hole saddle and the so-called 'replica wormhole' saddle. For the purity of the state of the radiation $\rho_R$, the replica calculation in gravity outputs a formula consisting of these two leading contributions,
$$\mathrm{Tr}_R\,\rho_R^2 \approx e^{-S^R_\beta} + e^{-S^{\bar R}_\beta}, \tag{1.1}$$
where $S^R_\beta$ is the thermal second Rényi entropy of the radiation and $S^{\bar R}_\beta$ is the thermodynamic 'Bekenstein-Hawking' second Rényi entropy of the black hole at inverse temperature $\beta$.
Even if replica wormholes 'unitarize' the Rényi entropy of the radiation, their inclusion seems to lead to a fundamental incompatibility with a conventional quantum mechanical description. The semiclassical computation of the state of the radiation $(\rho_R)_{ij}$ produces the well-known thermal result, up to perturbative corrections. The most natural way to reconcile a thermal density matrix $(\rho_R)_\beta$ with (1.1) is to declare that Euclidean gravity is effectively reproducing an averaged description of the 'pseudo-random' properties of the discrete UV spectrum of the theory [4][5][6][7][8][9][10][11][12][13].
In this note, we reexamine the possibility introduced in [5] that replica wormholes give an approximation to the equilibrated physics of a single unitary theory. We appeal to quantum ergodicity in a high-energy microcanonical band of the Hamiltonian and derive a microscopic version of the 'equilibrium approximation' of [5] for Rényi entropies. Our results have some additional structure that enters at the level of non-planar diagrams of the previous approximation. We then show that for a certain class of states our results can be further approximated by the microcanonical and the canonical ensembles. In the case of the canonical ensemble, each term of our approximation can be computed from a Euclidean path integral over replicas of the system with different patterns of connectivity between them.
We then consider initial pure states in AdS/CFT systems that lead to a large black hole in equilibrium with its radiation. For the purity of the radiation subsystem, the equilibrium approximation in this case consists of three terms,
$$\mathrm{Tr}_R\,\rho_R^2 \approx e^{-S^R_\beta} + e^{-S^{\bar R}_\beta} - e^{-S^R_\beta - S^{\bar R}_\beta}. \tag{1.2}$$
The first two terms agree with (1.1) and, in this case, the equilibrium approximation itself prescribes a boundary path integral to compute each of these terms. The pattern of connectivity between the replicas in each of these path integrals is ultimately related to the topology of the corresponding saddle of the semiclassical gravitational path integral. Similar considerations hold for higher Rényi entropies.
We finally engineer a different AdS/CFT setup that contains an evaporating black hole in the quasi-equilibrium approximation. We get a result qualitatively similar to (1.2) for the purity of the radiation at each epoch of evaporation. The last term in (1.2) is responsible for recovering the exact pure state for the radiation $\rho_R$ at the endpoint of evaporation. The contribution of these terms is reminiscent of higher genus saddles of the gravitational path integral in models of JT gravity. The minus sign in (1.2), however, seems to be prescribed by the exact unitary description, and it may appear ad hoc from the point of view of semiclassical gravity.
The paper is organized as follows: In section 2 we derive the microscopic equilibrium approximation for Rényi entropies from standard properties of chaotic many-body quantum systems. In section 3 we give an estimate of the average quantum noise around the microscopic equilibrium value. In section 4 we consider a class of delocalized states over a microcanonical band and we obtain the microcanonical and canonical equilibrium approximations for the Rényi entropies. In section 5 we analyze the equilibration of Rényi entropies for a black hole inside a finite box, and obtain (1.2) and higher Rényi analogs. We also consider the case of a slowly evaporating black hole. We end with some conclusions, and Appendix A contains some technical details.

Ergodicity and Long-Time Averaging
In this section, we will study the behavior of Rényi entropies for pure states under mild assumptions about chaos in a high-energy microcanonical band of the Hamiltonian of a many-body quantum system. We will start by considering a microcanonical band $\mathcal{H}_{E,\epsilon}$ of the Hamiltonian $H$, consisting of states with energies in the interval $[E-\epsilon, E+\epsilon]$. The energy window will be narrow, $\epsilon \ll E$, but still spacious enough to accommodate a large microcanonical entropy $S = \log N$, where $N$ is the number of energy eigenstates $\{|E_i\rangle\}$ on the band. We will assume that the spectrum of the Hamiltonian $\{E_i\}$ is non-degenerate in this band, and that there are no rational relations between different energy eigenvalues. In particular, the energy differences $E_i - E_j$ will be rationally independent, which translates into the lack of resonances in the system.
Given some initial state localized on this band, $|\Psi\rangle = \sum_i c_i |E_i\rangle$, its time evolution will involve an effective number $N_{\rm eff}$ of energy eigenstates, where
$$N_{\rm eff}^{-1} = \sum_i |c_i|^4 .$$
For any $t > 0$, the state vector $|\Psi(t)\rangle$ lies on a torus determined by these $N_{\rm eff}$ real phases, $\mathbb{T}^{N_{\rm eff}}$, and in fact the lack of resonances makes its trajectory an ergodic cover of this torus.
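As a minimal numerical sketch, assume that the effective number of eigenstates is the inverse participation ratio $N_{\rm eff}^{-1} = \sum_i |c_i|^4$, which is consistent with $\mathrm{Tr}\,\rho^2 = N_{\rm eff}^{-1}$ for the diagonal density matrix used later in this section. The helper name `effective_dimension` is hypothetical and introduced only for illustration.

```python
import math

def effective_dimension(coeffs):
    """Inverse participation ratio of the amplitudes c_i in the energy basis.

    Assumption of this sketch: N_eff^{-1} = sum_i |c_i|^4 for a normalized state.
    The amplitudes are normalized here, so any overall scale is allowed.
    """
    norm = sum(abs(c) ** 2 for c in coeffs)
    return norm ** 2 / sum(abs(c) ** 4 for c in coeffs)

# A state spread evenly over 8 eigenstates explores an 8-dimensional torus,
# while a single eigenstate only acquires a global phase.
uniform = [1 / math.sqrt(8)] * 8
single = [1.0] + [0.0] * 7
print(effective_dimension(uniform), effective_dimension(single))
```

Phases of the amplitudes drop out, so only the populations $|c_i|^2$ determine $N_{\rm eff}$.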
In what follows, we will assume a bipartition of the full system, $\mathcal{H} = \mathcal{H}_R \otimes \mathcal{H}_{\bar R}$, and study the time-evolution of the entanglement spectrum of $|\Psi(t)\rangle$ with respect to this bipartition. More precisely, we will study the set of Rényi entropies
$$S^{(R)}_n = \frac{1}{1-n}\log Z^{(R)}_n, \qquad Z^{(R)}_n = \mathrm{Tr}_R\,\rho_R^n,$$
where $\rho_R = \mathrm{Tr}_{\bar R}\,|\Psi(t)\rangle\langle\Psi(t)|$ is the reduced density matrix of subsystem $R$. We will exploit the spectral properties of $H$ on the band to compute the long-time averages of Rényi entropies which, as we shall see, will give us an idea of the 'equilibrium' values of Rényi entropies.
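For later convenience, and in order to make contact with [5], we will introduce some notation. Density matrices like $|\Psi\rangle\langle\Psi|$ can be viewed as states $|\Psi\rangle \otimes |\Psi^*\rangle \in \mathcal{H} \otimes \mathcal{H}$, where the star denotes some antiunitary operation like CPT. Similarly, $Z^{(R)}_n$ can be regarded as an amplitude on $(\mathcal{H} \otimes \mathcal{H})^n$, namely as
$$Z^{(R)}_n = \langle R, \bar R|\,\big(|\Psi(t)\rangle \otimes |\Psi^*(t)\rangle\big)^{\otimes n}. \tag{2.3}$$
All the information about the partial tracing is kept in the bra $\langle R, \bar R|$ that lives in the dual space to this replicated Hilbert space (see Fig. 1).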
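We will also define a set of states on the replicated Hilbert space that will turn out to be particularly useful for notational purposes. Given a density matrix $\rho = \rho_{ij} |E_i\rangle\langle E_j|$ and some permutation $\sigma \in S_n$, we define a state $|\rho, \sigma\rangle \in (\mathcal{H} \otimes \mathcal{H})^n$ built from $n$ copies of $\rho$, in which $\sigma$ fixes how the indices of the copies are paired among the replicas.

First, we will compute the long-time average of the purity, $Z^{(R)}_2$. The amplitude (2.3) in this case involves four replicas and its long-time average is given by
$$\overline{Z^{(R)}_2} = \lim_{T\to\infty}\frac{1}{T}\int_0^T dt \sum_{i_1, j_1, i_2, j_2} c_{i_1} c^*_{j_1} c_{i_2} c^*_{j_2}\, e^{-i(E_{i_1} - E_{j_1} + E_{i_2} - E_{j_2})t}\; \mathrm{Tr}_R\big[\mathrm{Tr}_{\bar R}(|E_{i_1}\rangle\langle E_{j_1}|)\,\mathrm{Tr}_{\bar R}(|E_{i_2}\rangle\langle E_{j_2}|)\big].$$
The long-time integral of the phase vanishes whenever the total frequency is non-zero, so a non-vanishing contribution in this case requires
$$E_{i_1} + E_{i_2} = E_{j_1} + E_{j_2}.$$
From the assumption that there are no rational relations between energy eigenvalues, this condition can only hold when the $i$ and $j$ eigenvalues are identified, which leads to three different long-time saddles,
$$\delta_{i_1 j_1}\delta_{i_2 j_2} + \delta_{i_1 j_2}\delta_{i_2 j_1} - \delta_{i_1 j_1}\delta_{i_2 j_2}\delta_{i_1 i_2},$$
and we are not adopting the convention of summing over repeated indices for the last term. This last term is essential in order to avoid over-counting the configuration $i_1 = i_2 = j_1 = j_2$.

Following these considerations, we derive the average value of the purity,
$$\overline{Z^{(R)}_2} = \mathrm{Tr}_R(\mathrm{Tr}_{\bar R}\,\rho)^2 + \mathrm{Tr}_{\bar R}(\mathrm{Tr}_R\,\rho)^2 - \langle R, \bar R|\phi\rangle, \tag{2.7}$$
for the microscopic equilibration density matrix $\rho = \sum_i |c_i|^2 |E_i\rangle\langle E_i|$. The unnormalized state $|\phi\rangle$ is a multipartite entangled state in the replicated Hilbert space,
$$|\phi\rangle = \sum_i |c_i|^4\, |E_i\rangle|E_i^*\rangle|E_i\rangle|E_i^*\rangle, \qquad \langle R, \bar R|\phi\rangle = \sum_i |c_i|^4\, \mathrm{Tr}_R\big(\mathrm{Tr}_{\bar R}|E_i\rangle\langle E_i|\big)^2 .$$
The first two terms in (2.7) match the equilibrium proposal of [5], but in this case $\rho$ possesses microscopic information about the initial pure state $|\Psi\rangle$. From the symmetries of $|\phi\rangle$ under permutations of the replicas, it is straightforward to see that the whole expression is invariant under $R \leftrightarrow \bar R$, which is expected from an average over the long-time ensemble of pure states.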
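The new term that we obtain is essential to preserve exact purity in the limit in which $R$ becomes the whole system, $\mathcal{H}_R = \mathcal{H}$. In this limit, $\mathrm{Tr}_R\,\rho = 1$ and $\mathrm{Tr}_R\,\rho^2 = N_{\rm eff}^{-1}$, so the long-time average of the purity becomes
$$\overline{Z^{(R)}_2} = N_{\rm eff}^{-1} + 1 - N_{\rm eff}^{-1} = 1,$$
which is consistent with exact unitarity, from $\mathrm{Tr}\,(|\Psi(t)\rangle\langle\Psi(t)|)^2 = 1$.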
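The three-term structure of the averaged purity can be checked on a toy model. The sketch below is illustrative only. It assumes a two-qubit system whose energy eigenstates are the four Bell states with rationally independent energies (a choice made for this example, not a model from the text), and it takes the closed expression to be $\mathrm{Tr}_R(\mathrm{Tr}_{\bar R}\rho)^2 + \mathrm{Tr}_{\bar R}(\mathrm{Tr}_R\rho)^2 - \sum_i |c_i|^4\,\mathrm{Tr}_R(\mathrm{Tr}_{\bar R}|E_i\rangle\langle E_i|)^2$, which is our reading of the three long-time saddles. All function names are hypothetical.

```python
import math

DIM_R, DIM_B = 2, 2  # one qubit in R, one qubit in the complement
S = 1 / math.sqrt(2)
# Toy energy eigenbasis (assumption): the four Bell states.
# Amplitudes are indexed as r * DIM_B + b and are real, so no conjugation is needed.
EIGENSTATES = [[S, 0, 0, S], [S, 0, 0, -S], [0, S, S, 0], [0, S, -S, 0]]

def reduce_to_R(u, v):
    """Partial trace of |u><v| over the complement: a DIM_R x DIM_R matrix."""
    return [[sum(u[r * DIM_B + b] * v[q * DIM_B + b] for b in range(DIM_B))
             for q in range(DIM_R)] for r in range(DIM_R)]

def reduce_to_complement(u, v):
    """Partial trace of |u><v| over R: a DIM_B x DIM_B matrix."""
    return [[sum(u[r * DIM_B + b] * v[r * DIM_B + c] for r in range(DIM_R))
             for c in range(DIM_B)] for b in range(DIM_B)]

def trace_of_product(a, b):
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

def weighted_sum(matrices, weights):
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]

def averaged_purity_by_pairings(coeffs):
    """Sum over index tuples (i, j, k, l) whose phase survives the time average."""
    n = len(coeffs)
    total = 0j
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    # No rational relations: E_i + E_k = E_j + E_l only if {i, k} = {j, l}.
                    if sorted((i, k)) != sorted((j, l)):
                        continue
                    weight = (coeffs[i] * coeffs[j].conjugate()
                              * coeffs[k] * coeffs[l].conjugate())
                    total += weight * trace_of_product(
                        reduce_to_R(EIGENSTATES[i], EIGENSTATES[j]),
                        reduce_to_R(EIGENSTATES[k], EIGENSTATES[l]))
    return total.real

def averaged_purity_closed_form(coeffs):
    """Two 'planar' terms minus the over-counting correction."""
    probs = [abs(c) ** 2 for c in coeffs]
    on_R = [reduce_to_R(e, e) for e in EIGENSTATES]
    on_B = [reduce_to_complement(e, e) for e in EIGENSTATES]
    rho_R = weighted_sum(on_R, probs)
    rho_B = weighted_sum(on_B, probs)
    correction = sum(p * p * trace_of_product(m, m) for p, m in zip(probs, on_R))
    return (trace_of_product(rho_R, rho_R)
            + trace_of_product(rho_B, rho_B) - correction)

coeffs = [0.8, 0.6j, 0, 0]
print(averaged_purity_by_pairings(coeffs), averaged_purity_closed_form(coeffs))
```

For Bell-state eigenvectors every reduced eigenstate is maximally mixed, so both functions reduce to $1 - 1/(2N_{\rm eff})$; a single eigenstate gives purity $1/2$, as it should for a Bell state.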
The equilibration value of $Z^{(R)}_n = \mathrm{Tr}_R\,\rho_R^n$ will similarly be given by a long-time integral over the $2n$ replicas. The long-time integral of the phase again imposes the constraint that the total frequency is zero,
$$\sum_{a=1}^n E_{i_a} = \sum_{a=1}^n E_{j_a}.$$
The lack of rational relations in the spectrum of the Hamiltonian allows us to perform this integral without knowing the particular spectrum, essentially reducing it to a simple combinatorial problem for the long-time saddles, which is explained in detail in Appendix A. We import the result here: the surviving index configurations are organized by partitions $\pi$, each entering with a coefficient $\alpha_\pi$ and a product of identifications $\prod_{B\in\pi}\prod_{a,b\in B}\delta_{i_a i_b}$, where the sum is over the partitions $\Pi_n$ of the set $N_n = \{1, 2, ..., n\}$. A given partition has the form $\pi = \{B_1, ..., B_r\}$ and the $B_k$ are 'boxes' containing $n_k$ elements of $N_n$. The number of terms in the sum is the number of partitions of $N_n$, which is known as the $n$-th Bell number $B_n$, and grows super-exponentially for large $n$. The coefficients $\alpha_\pi$ can be found recursively from a relation involving the sum over finer partitions $\pi' < \pi$. The coefficient for the finest partition $\pi_e = \{\{1\}, \{2\}, ..., \{n\}\}$ is set to $\alpha_{\pi_e} = 1$ in this normalization. The values of $\alpha_\pi$ for some of the finest partitions are explicitly computed in Appendix A.
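To make the counting of partitions concrete, the following sketch enumerates the partitions of $N_n$ into boxes and counts them, which gives the Bell numbers. It does not compute the coefficients $\alpha_\pi$; the function names `set_partitions` and `bell_number` are hypothetical.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of boxes."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing box in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a box of its own.
        yield partition + [[first]]

def bell_number(n):
    """Number of partitions of {1, ..., n}, obtained by direct enumeration."""
    return sum(1 for _ in set_partitions(list(range(1, n + 1))))

for n in range(1, 7):
    print(n, bell_number(n))
```

Removing the finest partition from the count leaves the number of non-trivial partitions that carry projections; for $n = 2$ there is exactly one, the single box $\{1, 2\}$ responsible for the extra term in the purity.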
In this way, we arrive at the long-time average of the $n$-th Rényi entropy,
$$\overline{Z^{(R)}_n} = \sum_{\sigma\in S_n} \langle R, \bar R|\rho, \sigma\rangle + \sum_{\pi\in\Pi^*_n} \alpha_\pi \sum_{\sigma\in S_n/S_\pi} \langle R, \bar R|\phi_\pi, \sigma\rangle. \tag{2.13}$$
The first sum exactly reproduces the terms in the equilibrium ansatz of [5], but again one has to consider the microscopic equilibration density matrix $\rho = \sum_i |c_i|^2 |E_i\rangle\langle E_i|$, which has information about the initial state $|\Psi\rangle$ of the system. The second term is a sum over non-trivial partitions $\Pi^*_n = \Pi_n \setminus \{\pi_e\}$ and over the permutation orbit $S_n/S_\pi$ of each partition $\pi$. Here $S_\pi < S_n$ is the stabilizer subgroup of a given partition $\pi$, which can be intuitively understood as the set of permutations that preserve the content of the 'boxes' $B_l \in \pi$. The new states correspond to different multipartite entangled states with a new 'source' of entanglement coming from the projections $\delta_{i_a i_b}$ associated to each partition $\pi$. We can also write $|\phi_\pi, \sigma\rangle = P_\pi |\rho, \sigma\rangle$, for $P_\pi$ the projection operator acting as in (2.14) on half of the $\mathcal{H}$-factors of the replica Hilbert space. The number of new states $|\phi_\pi, \sigma\rangle$ also scales super-exponentially with $n$. These states are essential to restore the exact purity in (2.13) when the subsystem $R$ is allowed to gradually approach the size of the full system.

The long-time average of the Rényi entropy (2.13) is invariant under $R \leftrightarrow \bar R$. This property follows from $\langle R, \bar R|\rho, \sigma\rangle = \langle \bar R, R|\rho, \sigma\tau\rangle$, with the cyclic permutation $\tau = (1, 2, ..., n-1, n) \in S_n$. Right-multiplication by $\tau$ is a bijection of $S_n$ and therefore the first sum is trivially invariant. All possible partitions are present for each permutation in the second sum, making the whole sum also invariant.
We can estimate the magnitude of each term in (2.13) if we introduce the effective ranks $n_R$ and $n_{\bar R}$ of the density matrix $\rho$ on each of the subsystems. The matrix elements of $\rho$ in an orthonormal basis $\{|r, \bar r\rangle\}$ of $\mathcal{H}_R \otimes \mathcal{H}_{\bar R}$ will have magnitude $(\rho)_{r\bar r, r'\bar r'} \sim (n_R n_{\bar R})^{-1}$. In this basis, each amplitude is a contraction of $n$ such matrix elements, with the index pattern fixed by $\sigma$ and by $\eta = (n, n-1, ..., 2, 1) \in S_n$. Following the argument in [5], the diagrammatic representation of each amplitude shows the number of $R$ and $\bar R$ loops present, and each loop corresponds to a factor of $n_R$ and $n_{\bar R}$ respectively (see Fig. 2). The number of $\bar R$-loops in $\langle R, \bar R|\rho, \sigma\rangle$ can be shown to be the number of cycles of the permutation $\sigma$, denoted $k(\sigma)$. The number of $R$-loops, on the other hand, can be shown to be $k(\eta^{-1}\sigma)$. Altogether, the magnitude of the amplitude is
$$\langle R, \bar R|\rho, \sigma\rangle \sim n_{\bar R}^{\,k(\sigma)-n}\, n_R^{\,k(\eta^{-1}\sigma)-n}. \tag{2.18}$$
The dominant permutations corresponding to planar diagrams are constructed from the 'noncrossing partitions' of $N_n$ and saturate the inequality
$$k(\sigma) + k(\eta^{-1}\sigma) \le n + 1. \tag{2.19}$$
For $n_R < n_{\bar R}$ the leading permutation in (2.18) is $\sigma = e$, while for $n_R > n_{\bar R}$ it is $\eta$.
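The loop counting can be explored by brute force for small $n$. The sketch below enumerates $S_n$ and tabulates $k(\sigma) + k(\eta^{-1}\sigma)$, taking the planar bound to be $k(\sigma) + k(\eta^{-1}\sigma) \le n + 1$ (our reading of the inequality saturated by noncrossing partitions). Permutations are stored as 0-indexed tuples, and the function names are hypothetical.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple with i -> perm[i]."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start in seen:
            continue
        count += 1
        j = start
        while j not in seen:
            seen.add(j)
            j = perm[j]
    return count

def compose(p, q):
    """Composition (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def loop_statistics(n):
    """Return (max of k(s) + k(eta^-1 s), number of permutations reaching n + 1)."""
    # eta = (n, n-1, ..., 1) maps i -> i - 1 cyclically, so its inverse maps i -> i + 1.
    eta_inverse = tuple((i + 1) % n for i in range(n))
    totals = [cycle_count(s) + cycle_count(compose(eta_inverse, s))
              for s in permutations(range(n))]
    return max(totals), totals.count(n + 1)

for n in (2, 3, 4, 5):
    print(n, loop_statistics(n))
```

The number of saturating permutations is the Catalan number $C_n$ (2, 5, 14, 42 for $n = 2, ..., 5$), which is the number of noncrossing partitions of $N_n$; all remaining permutations lose at least two loops.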
Let us define the 'order' of a given partition $\pi = \{B_1, ..., B_r\}$ of boxes of size $|B_k| = n_k$ as the product $|\pi| \equiv n_1 n_2 \cdots n_r$. Then, we can see that the effect of the projector $P_\pi$ in $\langle R, \bar R|\phi_\pi, \sigma\rangle$ is nothing but to reduce the previous result by a factor of $N_{\rm eff}^{1-|\pi|}$. The contribution of the new states is then
$$\langle R, \bar R|\phi_\pi, \sigma\rangle \sim N_{\rm eff}^{\,1-|\pi|}\, n_{\bar R}^{\,k(\sigma)-n}\, n_R^{\,k(\eta^{-1}\sigma)-n},$$
and in particular we can see that $|\pi| > 1$ implies that the new terms enter the long-time value at least at the order of permutations with non-planar diagrams. Note, however, that this hierarchy is less pronounced when $|\Psi\rangle$ involves a small number of energy eigenstates, and in this case the new terms can become comparable to certain 'planar' permutations. In fact, for a single eigenstate, all the terms in (2.13) are of the same order of magnitude.
Figure 2: The total number of loops is $k(\sigma) + k(\eta^{-1}\sigma) = F$, where $F$ is the number of white faces (with the exterior included). In general, $F = E - V + 1 - 2g$, where $E$ is the number of edges and $V$ is the number of vertices of the blue 'polygon', while $g$ is the genus of the surface in which the polygon is embedded. The inequality (2.19) follows from $E = 3n$ and $V = 2n$.

Quantum Noise
In the previous section, we have shown that the long-time average of the Rényi entropies in a microcanonical band of a many-body chaotic system produces a microscopic version of the equilibrium ansatz proposed in [5]. In this section, we will show that when $n_R$ and $n_{\bar R}$ are large, quantum fluctuations are suppressed with respect to the average value in the long run. In this sense, it is reasonable to expect that the long-time average is a measure of the 'equilibrated' value of the Rényi entropy, at least for timescales $t \ll t_P \sim E^{-1}\exp(N_{\rm eff})$ with no Poincaré recurrences on the system.
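In order to simplify the discussion, we will neglect the terms coming from the projectors, since they will always be subdominant with respect to the leading permutations. The long-time noise $\Delta^{(R)}_n$ of the Rényi entropy is defined as the square root of the long-time variance,
$$\big(\Delta^{(R)}_n\big)^2 = \overline{\big(Z^{(R)}_n\big)^2} - \Big(\overline{Z^{(R)}_n}\Big)^2,$$
which can be written in the compact notation
$$\big(\Delta^{(R)}_n\big)^2 = \sum_{\sigma\in A_{2n}} \big(\langle R, \bar R|_n \otimes \langle R, \bar R|_n\big)\,|\rho, \sigma\rangle, \tag{3.2}$$
where $A_{2n} = S_{2n} \setminus (S_n \times S_n)$ is the set of 'connected' permutations between the two $Z^{(R)}_n$ factors.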
The magnitude of each term in (3.2) can also be estimated from a double-line diagrammatic counting à la 't Hooft (see Fig. 3). In the product basis $\{|r, \bar r\rangle\}$, each amplitude is a contraction of $2n$ matrix elements of $\rho$, with the index pattern fixed by $\sigma$ and by $\eta_2 = (2n, ..., n+1)(n, ..., 1) \in S_{2n}$. The number of $\bar R$-loops of the diagram does not change with respect to the previous estimation, since $\langle R, \bar R|_n \otimes \langle R, \bar R|_n$ has the same $\bar R$-tracing pattern as the bra $\langle R, \bar R|_{2n}$, where the subindex represents the number of replicas. Therefore, we will have $k(\sigma)$ of such loops, each of them yielding a factor of $n_{\bar R}$. The number of $R$-loops, however, notices the new 'factorized' tracing pattern and the total number will be given in this case by $k(\tau_2\sigma)$, where
$$\tau_2 = \eta_2^{-1} = (1, ..., n)(n+1, ..., 2n).$$
The total contribution is then
$$\big(\langle R, \bar R|_n \otimes \langle R, \bar R|_n\big)\,|\rho, \sigma\rangle \sim n_{\bar R}^{\,k(\sigma)-2n}\, n_R^{\,k(\tau_2\sigma)-2n}. \tag{3.4}$$
Note that for any 'disconnected' $\sigma \in S_n \times S_n$ this expression recovers the square of (2.18). However, disconnected permutations are not allowed because we need to restrict to $A_{2n}$. In particular, for $\sigma \in A_{2n}$ the following inequality holds:
$$k(\sigma) + k(\tau_2\sigma) \le 2n. \tag{3.5}$$

Figure 3: Double-line diagram corresponding to $\big(\langle R, \bar R| \otimes \langle R, \bar R|\big)|\rho, \sigma\rangle$ for $n = 2$ and the connected permutation $\sigma = (13)(24) \in A_4$. For connected diagrams, the total number of loops is $k(\sigma) + k(\tau_2\sigma) = 2n - 2g$. Therefore, only planar permutations $\sigma \in A_{2n}$ saturate (3.5).
Assuming that $n_R < n_{\bar R}$, (3.4) and (3.5) show that the quantum noise $\Delta^{(R)}_n$ for the Rényi entropy is suppressed, relative to the average value, by inverse powers of the effective rank $n_R$. We emphasize that, for finite dimensional systems, $n_R \sim d_R$ is expected to generally be the dimensionality of the subsystem, which scales with the total entropy as $e^{f_R S}$, given a fraction $0 < f_R < 1/2$ (here we are assuming that $S = \log N$ accounts for almost the total dimensionality of the system). Therefore, the quantum noise is expected to be suppressed by $\exp(-f_R S)$.
In a similar way, we can generalize our results to the $m$-th long-time moment, written as a sum over $A_{mn} \subset S_{mn}$, the set of totally connected permutations between all of the $Z^{(R)}_n$ factors. Again, the $\bar R$-loops do not notice the different tracing pattern, while the $R$-loops do. The estimate is then controlled by $k(\sigma)$ and $k(\tau_m\sigma)$, with $\tau_m = (1, ..., n)(n+1, ..., 2n)\cdots(1+n(m-1), ..., nm) \in S_{nm}$. In this case, any totally connected permutation $\sigma \in A_{mn}$ will satisfy $k(\tau_m\sigma) + k(\sigma) \le nm - m + 2$.

These considerations lead to the conclusion that the long-time averaging induces an effective probability distribution $P(Z^{(R)}_n)$ for the value of the Rényi entropy which is sharply peaked at the average value (2.13), with a variance (3.2) which is exponentially suppressed in the effective number of degrees of freedom $\log n_R \gg 1$. The typical timescale for quantum fluctuations is the Heisenberg time $t_H \sim (\Delta E)^{-1}$, where $\Delta E$ is the average energy difference between the eigenstates participating in $|\Psi\rangle$. We expect that in general these fluctuations are effectively 'frozen' for timescales $t \ll t_H$ and that the Rényi entropy is 'equilibrated' with a value given by (2.13).

Equilibrium Approximation for Rényi Entropies
As we have seen, Rényi entropies are very fine-grained measures of the system that retain the information about the initial state $|\Psi\rangle$ for arbitrarily long times. In this section, we will make further assumptions about the Hamiltonian $H$ to approximate the long-time values of Rényi entropies for a general class of initial states by the corresponding 'equilibrium values' arising from different thermodynamic ensembles.
To start, we can consider the reduced set of initial states for which the energy wavefunction $c_i$ is delocalized enough that it excites a large fraction of the energy eigenstates of the band. More precisely, let $N_{\rm eff}/N = 1 - x^2$ for some $x \ll 1$. For such states, the microscopic equilibration density matrix $\rho$ will be very close to the microcanonical density matrix on the band with respect to the trace distance, $\|\rho - \rho_{\rm mc}\|_1 \lesssim x$. In this case, the long-time purity (2.7) will be given at leading order in $x$ by
$$\overline{Z^{(R)}_2} \approx \mathrm{Tr}_R(\mathrm{Tr}_{\bar R}\,\rho_{\rm mc})^2 + \mathrm{Tr}_{\bar R}(\mathrm{Tr}_R\,\rho_{\rm mc})^2 - \frac{1}{N}\Big[\frac{1}{N}\sum_i \mathrm{Tr}_R\big(\mathrm{Tr}_{\bar R}|E_i\rangle\langle E_i|\big)^2\Big], \tag{4.1}$$
where $\rho_{\rm mc}$ is the microcanonical density matrix on the band, and the bracket is the average purity of the energy eigenstates of the band. The first two terms in the expression (4.1) exactly match the microcanonical equilibrium approximation of [5], while the extra term in our approximation comes from the exact unitary description and it is subleading by a factor of $N^{-1}$.
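The closeness of $\rho$ to $\rho_{\rm mc}$ can be made quantitative with a short check. Both matrices are diagonal in the energy basis, so with the convention $\|A\|_1 = \sum |\lambda|$ (an assumption of this sketch, with no factor of $1/2$) the distance is $\sum_i |p_i - 1/N|$ with $p_i = |c_i|^2$, and the Cauchy-Schwarz inequality gives $\|\rho - \rho_{\rm mc}\|_1 \le \sqrt{N/N_{\rm eff} - 1} = x/\sqrt{1 - x^2}$. The function names and the random populations below are illustrative choices.

```python
import math
import random

def trace_distance_to_uniform(probs):
    """Trace norm of rho - rho_mc for populations p_i (both matrices are diagonal)."""
    n = len(probs)
    return sum(abs(p - 1.0 / n) for p in probs)

def delocalization_parameter(probs):
    """The parameter x defined through N_eff / N = 1 - x^2."""
    n = len(probs)
    n_eff = 1.0 / sum(p * p for p in probs)
    return math.sqrt(max(0.0, 1.0 - n_eff / n))

def cauchy_schwarz_bound(probs):
    """Upper bound x / sqrt(1 - x^2) on the trace norm of rho - rho_mc."""
    x = delocalization_parameter(probs)
    return x / math.sqrt(1.0 - x * x)

# Illustrative populations: nearly uniform over N = 1000 eigenstates.
random.seed(0)
weights = [1.0 + 0.1 * random.random() for _ in range(1000)]
total = sum(weights)
probs = [w / total for w in weights]
print(delocalization_parameter(probs),
      trace_distance_to_uniform(probs),
      cauchy_schwarz_bound(probs))
```

For $x \ll 1$ the bound reduces to $x$ at leading order, which is the estimate quoted above.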
The requirement $x \ll 1$ is too restrictive. In order to make more general statements corresponding to a much larger set of initial states, we will need to make an extra assumption about the structure of the Hamiltonian. A particularly convenient guiding principle to study quantum chaos is to look at properties of eigenstates of typical Hamiltonians, and to see which of these properties could be approximate features of the chaotic Hamiltonian $H$. One such property is the typicality of the chaotic eigenstates with respect to 'small' observables, which is the essence of the eigenstate thermalization hypothesis (ETH) [14][15][16].
Our assumption about $H$ will be much milder, since we are not going to consider the properties of single eigenstates, but rather averaged properties over a large number of them. Given a state $|\Psi\rangle$ involving a large number of eigenstates $N_{\rm eff}$, consider the density matrix $\rho_0 = \Pi_{|\Psi\rangle}\,\rho_{\rm mc}\,\Pi_{|\Psi\rangle}$ that is constructed by projecting the microcanonical density matrix $\rho_{\rm mc}$ into the $N_{\rm eff}$-dimensional subspace generated by the eigenstates in which $|\Psi\rangle$ has larger support. Of course, $\rho_0$ is a very good approximation to the microscopic density matrix $\rho$. The corresponding estimate carries an extra factor of $N_{\rm eff}^{-1}$, coming from the fact that $\rho_0$ is a much better approximation to the state $\rho$ than $\rho_{\rm mc}$.
Let $S_{\rm eff} = \log N_{\rm eff}$ be the 'microcanonical entropy' of $\rho_0$. We expect that, whenever this entropy is comparable to the entropy of the band, $(S - S_{\rm eff})/S \ll 1$, then generally the subset of eigenstates taking part in $\rho_0$ will be a good representative of the full microcanonical ensemble for any quantity which has desirable 'convergence' properties, and in particular for the entanglement spectrum of $R$ and $\bar R$. Under this assumption, initial states involving a large fraction of the entropy of the band will reproduce the microcanonical value for the long-time average of the Rényi entropy.

A similar approximation can be done in terms of the canonical ensemble. The thermodynamic entropy can be smoothly defined by $e^{S(E)} \equiv \sum_{E_i} \delta_\epsilon(E - E_i)$, where $\delta_\epsilon$ is the 'regularized Dirac delta' of width $\epsilon$ that accounts for the discreteness of the spectrum. For the inverse temperature $\beta = \partial S/\partial E$ evaluated at the energy of the band, the ensemble trace distance can be evaluated from a saddle point approximation at large $S$, yielding the well-known estimate in terms of the canonical density matrix $\rho_\beta = e^{-\beta H}/Z_\beta$, with $Z_\beta = \mathrm{Tr}\,e^{-\beta H}$ the canonical partition function.
For delocalized initial states $|\Psi\rangle$ that excite a large number of eigenstates of the band, we can also approximate their long-time average purity (2.7) by the canonical equilibration value
$$\overline{Z^{(R)}_2} \approx \mathrm{Tr}_R(\mathrm{Tr}_{\bar R}\,\rho_\beta)^2 + \mathrm{Tr}_{\bar R}(\mathrm{Tr}_R\,\rho_\beta)^2 - \sum_i \frac{e^{-2\beta E_i}}{Z_\beta^2}\,\mathrm{Tr}_R\big(\mathrm{Tr}_{\bar R}|E_i\rangle\langle E_i|\big)^2. \tag{4.6}$$
The sums are now implicitly ranging over all of the eigenstates of the Hamiltonian. In a completely analogous way, from (2.13) we can obtain the canonical approximation (4.8) for higher Rényi entropies, with $\rho$ replaced by $\rho_\beta$.

To further elucidate the structure of the equilibrium approximation for canonical equilibration, let us reintroduce the basis $\{|r, \bar r\rangle\}$ of $\mathcal{H}_R \otimes \mathcal{H}_{\bar R}$. In this basis, each term in (4.8) becomes a sum over basis indices of products of amplitudes of $e^{-\beta H}$, contracted by delta functions as in (4.9) and (4.10). For local theories, each of the terms in (4.9) and (4.10) can be written as a Euclidean path integral over $n$ replicas of the system. The amplitudes correspond to the Euclidean time-evolution by an amount $\beta$, with two different states inserted at the boundaries of the strip, $\tau = 0$ and $\tau = \beta$. The delta functions determine the gluing pattern of these $n$ path integrals. To be more specific, let $\Phi_i(\tau) \equiv \{\Phi^\mu_i(\tau, x)\}$ denote the collective set of fields on the $i$-th replica, and let $S_E[\Phi(\tau)]$ be the Euclidean action of the theory. In this notation, each term is a path integral over the replica fields weighted by $\exp(-\sum_i S_E[\Phi_i])$, with the boundary conditions at $\tau = 0$ and $\tau = \beta$ glued according to the delta functions.

We have shown that for a general class of initial states, the long-time averaged values of Rényi entropies will yield a version of the equilibrium ansatze of [5] for the microcanonical and canonical ensembles, with extra terms that contribute at the level of the non-planar permutations. In our derivation, we mainly used quantum ergodicity of the Hamiltonian $H$, and a restriction to initial states involving a large number of eigenstates of the microcanonical band. Similar results can also be obtained by Haar averaging $Z^{(R)}_n$ either over initial states $|\Psi\rangle$ in the microcanonical band or over time-evolution operators. Our derivation, on the other hand, directly applies to atypical initial states that remain atypical for $t < t_H$.

Black Hole in a Box and Replica Wormholes
So far, we have been quite general in our discussion about the nature of the chaotic many-body system under consideration. In this section, we will describe the relevance of the 'equilibrium approximation' for Rényi entropies in the context of AdS/CFT systems describing black holes in equilibrium.
We consider a system consisting of a holographic CFT on a spatial sphere $S^{d-1}$ of radius $\ell_{\rm AdS}$, which we denote $\mathcal{H}_{\bar R}$. The CFT sphere is contained in an external 'radiation box' $\mathcal{H}_R$ with no dynamical gravity, and of finite volume $L^d$, with $L > \ell_{\rm AdS}$. The full Hamiltonian of the system is
$$H = H_R + H_{\bar R} + H_{\rm int}, \tag{5.1}$$
where $H_R$ is the weakly coupled Hamiltonian of the box, $H_{\bar R}$ is the CFT Hamiltonian, and $H_{\rm int}$ is a small interaction that allows for transparent boundary conditions in the gravitational description of the system. The Hamiltonian (5.1) will satisfy the spectral requirements introduced in section 2, which are mainly inherited from the properties of the black hole band of the CFT Hamiltonian.
The system is initialized at a state $|\Psi\rangle = \sum_i c_i |E_i\rangle$ that belongs to a high-energy microcanonical band of total energy $E \gg L^{-1}$. For our purposes we take $|\Psi\rangle$ to be a semiclassical state of this band such as, for instance, some configuration of matter in AdS. We will assume that the state $|\Psi\rangle$ involves a large number $N_{\rm eff}$ of eigenstates of the band. Under time-evolution, the matter will eventually collapse and form a large black hole in AdS. The size of this black hole will depend on the size of the radiation box $L$, where we are assuming that the energy is sufficiently large compared to $L^{-1}$, i.e. $EL \gg (L/\ell_P)^\alpha$ for some $\alpha > 1$ that depends on the dimension.
Strictly speaking, this black hole is not an equilibrium state, since the state vector $|\Psi(t)\rangle$ will indefinitely explore the $N_{\rm eff}$-dimensional ergodic torus, $\mathbb{T}^{N_{\rm eff}}$. In particular, there will be quasiperiodic quantum fluctuations entering at Heisenberg timescales $t_H \sim (\Delta E)^{-1}$, where $\Delta E$ is the average energy difference between the energy eigenstates participating in $|\Psi\rangle$. In the very long run, at timescales $t_P \sim E^{-1}\exp N_{\rm eff}$, these fluctuations will coherently lead to Poincaré recurrences. For certain decaying correlation functions, the quantum noise becomes the leading contribution at late times, and thus the Lorentzian semiclassical description of the state is not good enough to reproduce this non-perturbative quantum gravitational effect [21][22][23][24][25][26][27][28].
The time $t_{\rm eq}$ required for the effective equilibration of the entanglement spectrum of $\rho_R$ is expected to be very small compared to $t_H$. For the local Hamiltonian $H_R$, entanglement is expected to propagate in the form of a wavefront at an effective lightcone velocity $v_{\rm eff}$ [29][30][31][32][33][34][35][36], and thus the timescale for equilibration of the degrees of freedom of the box is of the order $t_{\rm eq} \sim L/v_{\rm eff}$. For the black hole degrees of freedom, information spreading occurs much faster, so we expect that the entanglement of these degrees of freedom equilibrates after a few scrambling times, $t_{\rm eq} \sim t_s$.
For $t \gg t_{\rm eq}$ we have provided general arguments in the previous sections to declare that the Rényi entropies of the radiation will equilibrate to values which are approximately given by (4.8), where $\beta = \partial S/\partial E$ is the inverse temperature associated to the microcanonical band. Assuming that the interaction $H_{\rm int}$ is small, we can neglect its contribution to the canonical ensemble and perform the factorization
$$\rho_\beta \approx \rho^R_\beta \otimes \rho^{\bar R}_\beta. \tag{5.3}$$
For notational purposes, we will define the quantities
$$Z^R_{n\beta} \equiv \mathrm{Tr}\,e^{-n\beta H_R}, \qquad Z^{\bar R}_{n\beta} \equiv \mathrm{Tr}\,e^{-n\beta H_{\bar R}},$$
which are in fact related to the thermal Rényi entropies on each of the subsystems.
For the factorized equilibration density matrix (5.3) the canonical equilibration value of the purity (4.6) simplifies and yields
$$\overline{\mathrm{Tr}_R\,\rho_R^2} \approx \frac{Z^R_{2\beta}\,(Z^{\bar R}_\beta)^2 + (Z^R_\beta)^2\, Z^{\bar R}_{2\beta} - Z^R_{2\beta}\, Z^{\bar R}_{2\beta}}{(Z^R_\beta)^2\,(Z^{\bar R}_\beta)^2}. \tag{5.6}$$
Note that the first two terms correspond to $e^{-S^R_\beta}$ and $e^{-S^{\bar R}_\beta}$ from the definition of the thermal second Rényi entropies $S^R_\beta$ and $S^{\bar R}_\beta$ of each of the subsystems. The equilibrated value of the purity in terms of the second Rényi entropies is then
$$\overline{\mathrm{Tr}_R\,\rho_R^2} \approx e^{-S^R_\beta} + e^{-S^{\bar R}_\beta} - e^{-S^R_\beta - S^{\bar R}_\beta}. \tag{5.7}$$

We will now analyze the origin of each term in (5.6) from the point of view of the CFT (the $\bar R$ system). First of all, there is an overall normalization of this expression which is given by $(Z_\beta)^2 \approx (Z^R_\beta)^2 (Z^{\bar R}_\beta)^2$ due to the form of the canonical density matrix (5.3). The numerator of the first term involves a CFT path integral $(Z^{\bar R}_\beta)^2$ which precisely cancels the $\bar R$-part of this normalization. Therefore, this term corresponds to two disconnected CFT path integrals which in bulk variables will be dominated by a disconnected saddle consisting of two copies of the Euclidean black hole at inverse temperature $\beta$. This disconnected term matches the contribution of the 'disconnected saddle' in previous replica calculations in gravity.
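The step from the canonical purity to the Rényi-entropy form can be checked numerically. The sketch below assumes that the interaction is neglected, so that the energy eigenstates are products of eigenstates of the two subsystems and each has unit purity on $R$; it also assumes small toy spectra chosen only for illustration. Under these assumptions the three terms evaluated from Boltzmann weights agree with $e^{-S^R_\beta} + e^{-S^{\bar R}_\beta} - e^{-S^R_\beta - S^{\bar R}_\beta}$, where $S_\beta = -\log(Z_{2\beta}/Z_\beta^2)$. All function names are hypothetical.

```python
import math

def partition_function(levels, beta):
    """Z(beta) for a discrete list of energy levels."""
    return sum(math.exp(-beta * e) for e in levels)

def renyi2_entropy(levels, beta):
    """Thermal second Renyi entropy, S = -log(Z(2 beta) / Z(beta)^2)."""
    return -math.log(partition_function(levels, 2 * beta)
                     / partition_function(levels, beta) ** 2)

def purity_from_entropies(levels_rad, levels_bh, beta):
    """Three-term purity written in terms of the second Renyi entropies."""
    s_rad = renyi2_entropy(levels_rad, beta)
    s_bh = renyi2_entropy(levels_bh, beta)
    return math.exp(-s_rad) + math.exp(-s_bh) - math.exp(-s_rad - s_bh)

def purity_from_spectrum(levels_rad, levels_bh, beta):
    """Same purity from Boltzmann weights of a factorized canonical state."""
    z_rad = partition_function(levels_rad, beta)
    z_bh = partition_function(levels_bh, beta)
    p = [math.exp(-beta * e) / z_rad for e in levels_rad]
    q = [math.exp(-beta * e) / z_bh for e in levels_bh]
    first = sum(w * w for w in p)            # Tr_R (Tr_Rbar rho)^2
    second = sum(w * w for w in q)           # Tr_Rbar (Tr_R rho)^2
    # Product eigenstates (assumption) are pure on R, so each contributes its weight squared.
    third = sum((a * b) ** 2 for a in p for b in q)
    return first + second - third

rad_levels = [0.0, 0.3, 0.7, 1.1]   # toy 'radiation' spectrum (assumption)
bh_levels = [0.0, 0.5, 0.9]         # toy 'black hole' spectrum (assumption)
print(purity_from_spectrum(rad_levels, bh_levels, 1.0),
      purity_from_entropies(rad_levels, bh_levels, 1.0))
```

When the second subsystem has a single level its entropy vanishes and the purity is exactly one, which is the role played by the last term at the endpoint of evaporation.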
On the other hand, the second term is the CFT path integral $Z^{\bar R}_{2\beta}$. It corresponds to two Euclidean strips of length $\beta$, interpreted in this context as two copies of the CFT system, which are glued together in such a way that they form a single thermal circle of length $2\beta$. Since the replicas are already connected through the boundary conditions, in bulk terms this path integral will be dominated by a Euclidean black hole at inverse temperature $2\beta$ which will connect the two replicas. This connected term matches the contribution of the 'replica wormhole' in previous replica calculations in gravity.
These observations lead to the hypothesis introduced in [5] that replica computations of the purity of the radiation using the gravitational path integral, $(\mathrm{Tr}_R\,\rho_R^2)_{\rm grav}$, are effectively reproducing each term of the equilibrium approximation (5.7).
However, the interpretation of the third term in (5.6) and (5.7) as arising from a subleading gravitational saddle is less clear. In fact, (5.7) can be rewritten to make the suppression of the third term explicit. In JT gravity models [1][2][3] this suppression, exponential in the thermal Rényi entropy, agrees heuristically with the genus expansion in powers of the extremal entropy $S_0$, so these terms might appear at the level of higher genus saddles of the gravitational path integral. However, there is a somewhat obscure minus sign for each handle arising from the global minus sign of the last term in (5.7). A possibility is that this term is prescribed from the long-time average in the exact unitary description, and that it goes beyond the semiclassical gravitational path integral. In this sense, it can be viewed as a 'counterterm' to restore exact unitarity at the level of subleading saddles of the previous two quantities, $Z^{\bar R}_{2\beta}$ and $(Z^{\bar R}_\beta)^2$.

Generalizing these results to higher Rényi entropies is again a matter of combinatorics. Let $|\sigma| \equiv k(\sigma)$ denote the number of cycles of $\sigma \in S_n$, with cycle lengths $\{s_1, ..., s_{|\sigma|}\}$, and similarly for $\sigma' = \tau\sigma$ with cycle lengths $\{s'_1, ..., s'_{|\sigma'|}\}$, where $\tau = (1, ..., n)$. Given a non-trivial partition $\pi \in \Pi_n \setminus \{\pi_e\}$ and a permutation $\sigma \in S_n$, let $\sigma_\pi \in S_n$ be the 'coarse-grained' permutation that is constructed from $\sigma$ by the rule of merging two of its cycles $(a_1, ..., a_{L_s})$ and $(b_1, ..., b_{L_m})$ whenever some $a$ and some $b$ belong to the same 'box' $B_l \in \pi$. Let $|\sigma_\pi| \le |\sigma|$ denote the number of cycles of this permutation, of lengths $\{q_1, ..., q_{|\sigma_\pi|}\}$, and similarly for $\sigma'_\pi$, with lengths $\{q'_1, ..., q'_{|\sigma'_\pi|}\}$. In this notation, we can compute (4.8) for the product density matrix (5.3), which gives a first sum over permutations and a second sum over non-trivial partitions. In the first sum, only the $\sigma = e$ contribution is related to a totally disconnected path integral $(Z^{\bar R}_\beta)^n$, with the dominant bulk saddle being $n$ copies of the Euclidean black hole at inverse temperature $\beta$.
The rest of the terms involve a leading contribution of at least one connected geometry coming from $Z^{\bar R}_{s_k\beta}$ for $s_k > 1$. The connectivity pattern of these leading geometries is ultimately related to the topology of the replica wormhole that reproduces each term, which also follows the hierarchy between 'planar' and 'non-planar' contributions [1]. The gravitational replica calculation of the Rényi entropy of the radiation, $(\mathrm{Tr}_R\,\rho_R^n)_{\rm grav}$, seems to be effectively capturing each term of the equilibrium approximation (5.2).
For the second sum, the terms are again reminiscent of further suppressed contributions to the gravitational path integral, and they are responsible for restoring the exact unitary description of the equilibrated value. In this case, the extra terms involve the coefficients $\alpha_\pi$, which do not seem to emerge from a boundary path integral like (4.12), but rather seem to be combinatorial coefficients prescribed by the long-time average.
We will now consider the situation of an evaporating black hole in this setup. We first place the previous system inside a larger box $R'$ of length $L' \gg L$. Let us couple the small box to the large one by a Hamiltonian $H_{\rm ev}$ responsible for evaporation, in such a way that radiation escapes slowly compared to the equilibration time $t_{\rm eq}$ of the small box (see Fig. 4). This 'adiabatic approximation' is a natural assumption for standard black hole evaporation. For example, for a Schwarzschild black hole the evaporation time is of the order of $t_{\rm ev} \sim \beta S_{\rm BH}$, while the equilibration timescale for the black hole degrees of freedom is of the order of the scrambling time $t_{\rm eq} \sim \beta \log S_{\rm BH}$. Under this approximation, we can divide the full evaporation process into epochs of duration $\Delta t$, with $t_{\rm eq} \ll \Delta t \ll t_{\rm ev}$. For example, we can consider $\Delta t$ as the time that it takes for the black hole to lose 1% of its initial entropy. At epoch $k$, the system inside the small box will have energy $E_k$, which is a monotonically decreasing function of $k$. We are assuming that the initial energy $E$ is so large that even after many emissions, say when it has lost 99% of its initial entropy, $E_k$ is still large enough to correspond to a stable microcanonical black hole in AdS.
Consider the radiation subsystem ${\rm rad} = R \cup R'$. The Rényi entropies and many observables of the small box rapidly equilibrate, and therefore we can perform a slightly stronger equilibrium approximation $\mathrm{Tr}_{\rm rad}\, \rho^n_{{\rm rad},k}(t) \approx \left.\mathrm{Tr}_{\rm rad}\, \rho^n_{{\rm rad},k}\right|_{\rm eq} \approx \left.\mathrm{Tr}_{\rm rad}\, \rho^n_{{\rm rad},k}\right|_{\rm no\ ev}$ (5.10). The long-time average in the last estimate is taken in a different system where no evaporation is allowed. This 'eternal' system consists of a black hole inside the box of length $L$, both at inverse temperature $\beta_k$ associated to the energy of the epoch, $E_k$, with no interaction with the larger box of length $L'$. In this way, we can perform the long-time integration with no risk of losing track of the black hole subsystem.
For simplicity, we consider the case in which $H_R = \sum_\omega \omega\, a^\dagger_\omega a_\omega$ is a free Hamiltonian, with the corresponding one-particle Hamiltonian $h = \sum_\omega \omega\, |\omega\rangle\langle\omega|$. The one-particle canonical partition functions are defined as $z_n(\beta) = \sum_\omega e^{-n\beta\omega}$, and $s_n(\beta)$ denotes the corresponding Rényi entropy. Let $\delta s$ be the total number of degrees of freedom emitted between epochs. Once each emission happens, we can take $L'$ to be arbitrarily large and assume from locality that the emitted quanta do not interact with previously emitted radiation. Under these assumptions, the state $\rho_k$ that approximates the equilibrium properties of $|\Psi(t)\rangle$ at epoch $k$ is the product state (5.11). The equilibrium approximation (5.10) for this density matrix leads to the purity of the radiation (5.12), where $S^R_k = \delta s \sum_{j=1}^{k-1} s_2(\beta_j)$ is the total second Rényi entropy emitted before the epoch. Similar considerations hold for higher Rényi entropies.
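The one-particle quantities defined above are easy to evaluate numerically. The sketch below assumes the standard relation between the Rényi entropy of a canonical state and its partition functions, namely that the $n$-th moment of $e^{-\beta h}/z_1(\beta)$ equals $z_n(\beta)/z_1(\beta)^n$, so that $s_n = \frac{1}{1-n}\log\left[z_n/z_1^n\right]$. The one-particle spectrum, the list of epoch temperatures and all function names are hypothetical inputs chosen for illustration.

```python
import math

def z(n, beta, omegas):
    """One-particle partition function z_n(beta) = sum_omega exp(-n beta omega)."""
    return sum(math.exp(-n * beta * w) for w in omegas)

def renyi(n, beta, omegas):
    """Renyi entropy s_n(beta) of the one-particle canonical state (n != 1).

    Uses Tr rho^n = z_n / z_1^n for rho = exp(-beta h) / z_1.
    """
    log_moment = math.log(z(n, beta, omegas)) - n * math.log(z(1, beta, omegas))
    return log_moment / (1 - n)

def emitted_second_renyi(k, delta_s, betas, omegas):
    """S^R_k = delta_s * sum_{j=1}^{k-1} s_2(beta_j).

    betas[0] plays the role of beta_1, so epoch k uses betas[0 : k-1].
    """
    return delta_s * sum(renyi(2, betas[j], omegas) for j in range(k - 1))

if __name__ == "__main__":
    omegas = [0.5, 1.0, 1.5, 2.0]    # hypothetical one-particle spectrum
    betas = [1.0, 1.1, 1.25, 1.5]    # hypothetical epoch temperatures
    for k in range(1, len(betas) + 2):
        print(k, emitted_second_renyi(k, delta_s=10, betas=betas, omegas=omegas))
```

With this indexing nothing has been emitted before the first epoch, so the accumulated entropy at $k = 1$ vanishes.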
We have shown that the equilibrium approximation for the purity of the radiation (5.12) reproduces an exact Page curve. At each epoch, all the terms in (5.12) can be formulated as Euclidean path integrals over the corresponding subsystems. In particular, the second term corresponds at leading order to a connected saddle of the gravitational path integral. Remarkably, (5.12) yields an exact pure state for the radiation ρ R at the end of evaporation, even though the adiabatic approximation becomes completely unjustified at the last stages, where the emission timescale is comparable to the scrambling time of the black hole. The last term of (5.12) with the corresponding minus sign is responsible for this effect. As we have argued, this minus sign seems to be prescribed from the exact unitary description of the equilibrated purity. Similar considerations hold for higher Rényi entropies.

Conclusions and Outlook
In this note we have analyzed the equilibration of Rényi entropies under mild assumptions about the chaotic spectral properties of the Hamiltonian of a many-body quantum system. The ergodic long-time average provides a microscopic equilibrium value for the Rényi entropy (2.13) which retains information about the initial state of the system. The averaged quantum noise relative to the long-time value is exponentially suppressed in the microcanonical entropy whenever the subsystem comprises a non-negligible fraction of the full system.
For initial states that excite a large number of energy eigenstates of the microcanonical band, the long-time average of the Rényi entropy is approximated by the microcanonical (4.4) or the canonical (4.8) equilibrium ansätze of [5], with some extra structure that contributes at the level of non-planar permutations. For local systems, each of the terms in the canonical equilibrium approximation for the n-th Rényi entropy can be formulated as a Euclidean path integral over n replicas of the system. The extra structure corresponds to path integrals (4.12) with a more complicated pattern of connectivity between the replicas, which is imposed by the extra projections.
Our results have certain similarities with Rényi entropies for Haar-typical states in the microcanonical band [37][38][39][40][41][42], although we did not assume randomness of the initial state nor of the Hamiltonian. Our considerations apply to atypical initial states which remain atypical for Heisenberg timescales under generic k-local time evolution [43]. It would be interesting to further investigate the scope of our results for states containing a few eigenstates of the Hamiltonian, or even at the level of single chaotic eigenstates, where all the terms in (2.13) become comparable [17][18][19][20].
In the context of AdS/CFT systems describing the semiclassical formation of a black hole that reaches equilibrium with its radiation, the equilibrium approximation yields (5.7) for the purity of the radiation, and (5.9) for the higher Rényi entropies. These expressions have the same form as the replica calculations in semiclassical gravity, and the connection is strengthened from the point of view of the CFT path integral that reproduces each term of the equilibrium approximation. We leave the potential identification of the subleading terms of the equilibrium approximation with subleading effects of the semiclassical path integral for future investigation.
The case of the evaporating black hole can also be treated in this formalism under the adiabatic approximation, which allows one to divide the evaporation process into quasi-equilibrium epochs. A stronger version of the equilibrium approximation yields the purity at each epoch (5.12), where β k = β(t k ) is the inverse temperature of the black hole at time t k . Each of the terms in (5.12) also has a formulation in terms of a Euclidean path integral over replicas of the whole system, which is given at leading order by a gravitational saddle. The last term in (5.12) is responsible for recovering an exact pure state of the radiation at the end of evaporation. It would be interesting to understand whether the semiclassical path integral is able to capture this term with the minus sign in front, and in general the α π coefficients of the corresponding terms for higher Rényi entropies.

A Moments of Ergodic Long-Time Averaging
In this appendix, we compute the value of several long-time integrals for a Hamiltonian with no rational spectral order. The integrals correspond to certain complex moments of the homogeneous probability distribution on the ergodic torus T N . We will define the n-th moment as the following integral: where the value of the i and j indices is unspecified in the range from 1 to N .
Generally, the integral as a principal value will only be non-zero when the overall phase vanishes. For instance, the first moment yields a delta function For the second moment, the final expression will consist of more terms than the naive permutation of the indices. The reason is that we need to be careful with the over-counting of configurations in which i 1 = i 2 , since this particular case must yield The explicit general formula that captures the above case is where we are not using the convention of summing repeated indices for the last term.
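The over-counting argument for the second moment can be checked by brute force. The sketch below assumes that, for a spectrum with no rational relations, the long-time average equals 1 exactly when the multiset of $i$-indices coincides with the multiset of $j$-indices, and 0 otherwise. It compares this with one way of writing the corrected second moment: the two naive pairings minus a single term that is only non-zero when $i_1 = i_2$. The explicit form of the correction and the function names are illustrative assumptions, not a transcription of the formula in the text.

```python
from itertools import product

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

def second_moment_formula(i1, i2, j1, j2):
    """Naive sum over the two pairings, minus the over-counted coincident case."""
    naive = delta(i1, j1) * delta(i2, j2) + delta(i1, j2) * delta(i2, j1)
    # No summation over the repeated indices in the correction term.
    correction = delta(i1, i2) * delta(i1, j1) * delta(i2, j2)
    return naive - correction

def second_moment_exact(i1, i2, j1, j2):
    """Assumed exact value: the phase cancels iff the index multisets agree."""
    return 1 if sorted((i1, i2)) == sorted((j1, j2)) else 0

def check_second_moment(num_levels):
    """Compare both expressions on every index assignment in range(num_levels)."""
    return all(
        second_moment_formula(*idx) == second_moment_exact(*idx)
        for idx in product(range(num_levels), repeat=4)
    )

if __name__ == "__main__":
    print(check_second_moment(4))
```

Dropping the correction term makes the coincident configuration count twice, which is the over-counting described above.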
Let us now proceed to the case of the third moment. There are several possibilities for coincident $i$-indices. The first one is that only two of them coincide, for instance $i_1 = i_2 \neq i_3$. This case will be given by where the prefactor arises from the double-counting of the permutations that swap the equal $i$-indices. It can also be the case that all three $i$-indices coincide. In that case, the prefactor must be different One can easily check that the formula that encapsulates all the above cases is given by
The generalization to general $n$ is straightforward once we do a little bit of combinatorics. The integral will have the form $\Big(\sum_{\sigma \in S_n} \prod_{a=1}^{n} \delta_{i_a j_{\sigma(a)}}\Big) \sum_{\pi \in \Pi_n} \alpha_\pi \prod_{B \in \pi} \prod_{a,b \in B} \delta_{i_a i_b}$ (A.8). Let us explain this formula in more detail. The prefactor just represents the naive ways of assigning the $i$ indices to the $j$ indices. The sum is over the set of partitions of the set $N_n = \{1, 2, ..., n\}$, denoted by $\Pi_n$. A given partition $\pi = \{B_1, ..., B_k\}$ just represents that the indices in each of the blocks $B_i$ are equal. For example, for $n = 7$, the partition $\pi = \{\{3, 4\}, \{2, 5\}, \{1, 6, 7\}\}$ represents that $i_3 = i_4$, $i_2 = i_5$ and $i_1 = i_6 = i_7$. This is the reason for the two products, the first one being over the blocks of the partition, and the second one over the indices in each of the blocks, which must be set equal with the delta function. The number of terms in the sum is therefore the number of partitions of a set of $n$ elements, which is called the Bell number $B_n$. This number grows super-exponentially for large $n$.
Now, we need to fix the $\alpha_\pi$ coefficients in front of each of the terms in the sum. We will do this recursively. First of all, the trivial partition $\pi_e = \{\{1\}, \{2\}, ..., \{n\}\}$ that represents the counting when all indices are different will have $\alpha_{\pi_e} = 1$. For a generic partition $\pi = \{B_1, ..., B_k\}$, the total number of configurations that should be counted is just given by the ways to arrange $n$ elements in different boxes of sizes $|B_i| = n_i$, that is, $n!/(n_1! \cdots n_k!)$.
The reason is that all permutations that can be related by a permutation of the identical indices leave the partition invariant.
The coefficient $\alpha_\pi$ will only depend on the sizes of the boxes $n_1, ..., n_k$. That is, it depends on the equivalence class of the partition, where two partitions are equivalent if their blocks have the same sizes. These equivalence classes are just given by the partitions of the natural number $n$, which are all the ways to decompose $n$ as a sum of positive integers. For instance, the trivial partition $\pi_e$ is the only element of the class $n = 1 + 1 + \dots + 1$. Partitions that consist of a triplet will belong to the class $n = 3 + 1 + \dots + 1 \equiv [3, 1^{n-3}]$. Partitions with two distinct pairs will be of the class $n = 2 + 2 + 1 + \dots + 1 \equiv [2^2, 1^{n-4}]$, and so on.
Let us start from all partitions of the class $[2, 1^{n-2}]$, that is, $\pi$ consists of just one pair. Already the trivial partition $\pi_e$ accounts for $n!$ cases of this kind. All other terms in (A.8) do not contribute since we are assuming that only two indices are equal, and therefore only the partition itself and possibly finer partitions can contribute. We therefore have that $\frac{n!}{2!} = n!\,\big(1 + \alpha_{[2,1^{n-2}]}\big)$, that is, $\alpha_{[2,1^{n-2}]} = -\frac{1}{2}$. For a general partition $\pi = \{B_1, ..., B_k\}$, only when we know all the coefficients for finer partitions $\pi' < \pi$ will we be able to solve for the coefficient through the following condition: $\frac{n!}{n_1! \cdots n_k!} = n! \sum_{\pi' \leq \pi} \alpha_{\pi'} \;\Rightarrow\; \alpha_\pi = \frac{1}{n_1! \cdots n_k!} - \sum_{\pi' < \pi} \alpha_{\pi'}$. The coefficient for all these partitions can be read from Table 1. The formula (A.8) then particularizes exactly to (A.7).
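The recursion for the coefficients can be carried out mechanically. The following is a minimal sketch that enumerates all set partitions of $\{1, ..., n\}$, orders them from finest to coarsest, and solves the condition above with exact rational arithmetic. It reads "finer" as the usual refinement order (every block of the finer partition is contained in a block of the coarser one); the helper names are hypothetical and the values it produces have not been compared against Table 1.

```python
from fractions import Fraction
from math import factorial

def set_partitions(elements):
    """Yield every partition of a list as a list of blocks (lists)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        yield [[first]] + smaller                      # `first` alone in a block
        for idx, block in enumerate(smaller):          # or joined to a block
            yield smaller[:idx] + [[first] + block] + smaller[idx + 1:]

def canonical(partition):
    """Hashable, order-independent representation of a partition."""
    return frozenset(frozenset(block) for block in partition)

def is_finer(p, q):
    """True if every block of p is contained in some block of q."""
    return all(any(bp <= bq for bq in q) for bp in p)

def alpha_coefficients(n):
    """Solve alpha_pi = 1/(n_1! ... n_k!) - sum of alpha over strictly finer partitions."""
    parts = [canonical(p) for p in set_partitions(list(range(1, n + 1)))]
    parts.sort(key=len, reverse=True)  # more blocks first, i.e. finest first
    alpha = {}
    for pi in parts:
        target = Fraction(1)
        for block in pi:
            target /= factorial(len(block))
        # Strictly finer partitions have more blocks, so they are already solved.
        finer = sum((alpha[q] for q in alpha if is_finer(q, pi)), Fraction(0))
        alpha[pi] = target - finer
    return alpha

def by_class(alpha):
    """Group coefficients by block sizes, e.g. (2, 1) for the class [2, 1]."""
    classes = {}
    for pi, value in alpha.items():
        shape = tuple(sorted((len(block) for block in pi), reverse=True))
        classes[shape] = value
    return classes

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, by_class(alpha_coefficients(n)))
```

For $n = 3$ the sketch gives $1$, $-1/2$ and $2/3$ for the classes $[1^3]$, $[2, 1]$ and $[3]$, so that the coefficients of all partitions add up to $1/3!$ when the three indices coincide.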
The case $n = 4$ consists of $B_4 = 15$ terms, which can be read from the table. The case $n = 5$ already has $B_5 = 52$ terms, so writing (A.8) in an explicit form becomes really tedious.
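The number of terms quoted above can be reproduced with the Bell triangle, a standard way of generating Bell numbers. This is only a small counting aid, with a hypothetical function name, and it is independent of the physics.

```python
def bell_numbers(count):
    """Return [B_0, ..., B_{count-1}] computed with the Bell triangle."""
    bells = []
    row = [1]
    for _ in range(count):
        bells.append(row[0])       # the first entry of each row is a Bell number
        next_row = [row[-1]]       # the next row starts with the last entry of this row
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return bells

if __name__ == "__main__":
    print(bell_numbers(8))
```

The rapid growth of this sequence is why an explicit form of (A.8) quickly becomes impractical beyond the first few moments.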