Replica wormholes and the black hole interior

Recent work has shown how to obtain the Page curve of an evaporating black hole from holographic computations of entanglement entropy. We show how these computations can be justified using the replica trick, from geometries with a spacetime wormhole connecting the different replicas. In a simple model, we study the Page transition in detail by summing replica geometries with different topologies. We compute related quantities in less detail in more complicated models, including JT gravity coupled to conformal matter and the SYK model. Separately, we give a direct gravitational argument for entanglement wedge reconstruction using an explicit formula known as the Petz map; again, a spacetime wormhole plays an important role. We discuss an interpretation of the wormhole geometries as part of some ensemble average implicit in the gravity description.


Introduction
Hawking's calculation [1] of black hole evaporation by thermal radiation raises deep puzzles about the consistency of black holes and quantum mechanics. In particular, if the black hole starts in a pure quantum state, how can the apparently thermal radiation at the end of evaporation actually be in a pure state, as quantum mechanics requires? A quantitative form of this puzzle is encapsulated in the Page curve [2], a plot of the entanglement entropy of the radiation as a function of time. It increases at short times because of the thermal character of the radiation, but then must decrease to zero at the end of evaporation to be consistent with unitary quantum mechanical evolution.
Holographic duality has supplied a powerful tool for computing entanglement entropy, the Ryu-Takayanagi formula [3]. This formula has been generalized and refined in numerous ways since its original formulation. 1 In its most general formulation, the Engelhardt-Wall (EW) prescription [11], it says that the entropy of a holographic boundary region B is given by the generalized entropy of the minimal quantum extremal surface (QES).
Building on earlier ideas in [12,13], the authors of [14,15] considered a black hole in anti-de Sitter space that was allowed to evaporate into an auxiliary system R using absorbing boundary conditions. They showed that, at exactly the Page time, there was a phase transition in the minimal QES. This caused the boundary entropy to begin decreasing, in accordance with the Page curve. However, as emphasized in [15], a naive calculation of the entropy of the Hawking radiation showed that it continued to increase, since the semiclassical bulk physics had not changed.
As advocated in [14], and commented on in [15], the Page curve for the radiation would result from a variant of the usual rules for computing entropy holographically. 2 This idea was given an elegant "doubly holographic" realization in [16], where it was called the "island conjecture." Explicitly, the prescription states that the actual entropy S(R) of the Hawking radiation is given by where I is some "island" region of the bulk. 3 Here S bulk (I ∪ R) is the entropy of the island plus the Hawking radiation, computed semiclassically in the original fixed geometry. 4 Since the Ryu-Takayanagi formula can be derived from gravitational path integrals using replicas [7,8,9,23], the prescription (1.1) should also be derivable directly from the gravitational path integral. Deriving it is one of the main goals of this paper. We will show explicitly how this prescription arises from a replica computation with Euclidean wormholes connecting the different replicas. While we present our arguments in the context of simple models, their structure is not model dependent and should apply in a general context.
Wormholes of a similar type have recently been used to analyze the late-time behavior of the spectral form factor [24,25] and correlation functions [26], inspired by the puzzle posed in [27]. In both that setting and here, the basic point of the wormholes is to give small but nonzero overlaps between naively orthogonal bulk states. This resonates with the longstanding suspicion that some small corrections to the Hawking calculation (of order e −S BH ) could be responsible for making black hole evolution unitary. 5 Understanding the Page curve is only part of the black hole information problem. One also wants to understand how information that was thrown into the black hole ends up escaping in the Hawking radiation. As argued in [14,15], this can be addressed using entanglement wedge reconstruction. 6 In the case of the island conjecture, this idea implies that R contains all of the information in the island I, which itself contains much of the black hole interior. Existing derivations of entanglement wedge reconstruction have been indirect, 7 relying on throwing entropy calculations into the meat grinder of modern quantum information theory. So, the second main goal of this paper is to show using a bulk argument that operators in the radiation can manipulate the interior of the black hole. We show this directly by using gravitational path integrals to evaluate matrix elements of an explicit reconstruction operator defined using the Petz map [41,42] (see [39,43] for a description and recent discussion of the Petz map). Again, Euclidean wormholes play a crucial role. Here they are in some sense connecting the interior of the black hole to the quantum computer acting on the Hawking radiation.
We now give a brief summary of this paper.
In section two we introduce a simple toy model of an evaporating black hole in Jackiw-Teitelboim (JT) gravity [44,45,46], a two dimensional truncation of near extremal black hole dynamics. We represent the Hawking radiation by an auxiliary reference system whose states are entangled with interior partner modes represented by "end of the world branes" (EOW branes) in the black hole interior. This model is simple enough that we can exactly evaluate the full gravitational path integral for the Rényi entropies by summing over all planar topologies with the correct boundary conditions. We find that, before the analog of the Page time, the dominant topology consists of n disconnected copies, one for each replica, of the single replica geometry. In contrast, after the Page time, the dominant topology is connected, with an n-boundary Euclidean wormhole connecting all the different replicas. Our calculation uses a type of Feynman-diagram resummation method applied to spacetime geometry. This is inspired by techniques from the theory of free probability, and it makes it possible to continue the sum over topologies in n.
We also compute the overlap between the different black hole microstates |ψ i . We find that ψ i |ψ j = 0 for i = j, but that | ψ i |ψ j | 2 ∼ e −S due to a wormhole contribution. This can only be consistent if the gravitational path integral represents an ensemble of quantum theories. 8 The Rényi entropies are sums of large numbers of these overlaps. The Page transition comes from the buildup of these small errors in orthogonality.
In section three we explicitly reconstruct the black hole interior from the radiation in this simple model. Our tool is the Petz map which we introduce and explain. We then evaluate matrix elements of Petz map reconstructions using gravitational path integrals. We employ an analytic continuation similar to the replica trick and so consider geometries of different topologies. After analytic calculation, the problem reduces to a bulk field theory calculation in the original spacetime geometry.
After the Page time, when the connected topology dominates, the matrix elements of the Petz map reconstruction and original interior operator agree. In contrast, before the Page time, the disconnected topology dominates and the Petz map reconstruction fails to learn anything about the interior.
Again we are able to sum these topologies and thereby follow this transition in detail. The error in the reconstruction increases linearly with the dimension of the code subspace. As a result, even long after the Page time, reconstruction is only possible for sufficiently small code subspaces, an example of state dependence. This explains the state dependence required in the Hayden-Preskill decoding process [47] and gives a direct derivation of the results about state dependence in entanglement wedge reconstruction from [13].
The toy model calculations have some simplifying features that actual calculations for evaporating black holes do not. We therefore extend our calculations to two less idealized models.
In section four we calculate an analog of the Page curve using the replica trick in JT gravity coupled to a conformal matter, where the matter sector is allowed to flow freely across the boundaries between two thermofield double black holes. Perturbative semiclassical gravity would suggest that the entropy of each thermofield double black hole would grow forever. In contrast, unitarity and more explicitly the island formula (1.1) implies that it must saturate at twice the Bekenstein-Hawking entropy. We analyze this situation using replicas as before. In this case, we cannot explicitly find the saddle point for integer n Rényi entropies. But we are able to calculate the von Neumann entropy by showing that, as n → 1, the dominant saddle point after the Page time replicates about the minimal QES. This reproduces the answer from (1.1). This argument essentially reproduces the general arguments for the EW prescription from [23] in this specific context.
In section five, we do essentially the same calculations in a UV-complete theory, the SYK model. In this case, we can numerically find a replica-nondiagonal saddle point in the G, Σ action that describes the saturation of the n = 2 Rényi entropy. According to the usual intuition in this context for associating topologies to saddles for the G, Σ action [24], this saddle point corresponds to a Euclidean wormhole that connects the two replicas, as in our gravity calculations.
In section six we turn briefly to de Sitter space. We discuss a generalization of the no-boundary proposal for replica computations, and use it to find a Page-like transition in the entropy in a toy model using EOW branes in the JT gravity realization of de Sitter space [48]. A notable feature is that the density matrix itself changes character at the analog of the Page transition.
In section seven we offer some preliminary remarks about the role of wormholes in theories without ensemble averaging.
We end with several appendices: In appendix A we discuss details of the n-replica gravitational partition function. In appendix B we present analogous calculations in more general theories of two dimensional dilaton gravity, including flat space. In appendix C we give a direct gravitational path integral derivation of general entanglement wedge reconstruction using the Petz map. In appendix D we present an explicit Hilbert space ensemble precisely dual to the sum over all topologies of the simple model. In appendix E we discuss in some detail the connection between random tensor networks and fixed area states. In appendix F we present a more refined analysis of the Page transition in the simple model.
Closely related work has been done independently by Ahmed Almheiri, Tom Hartman, Juan Maldacena, Edgar Shaghoulian and Amirhossein Tajdini [49]. We have arranged with these authors to coordinate submission of our papers. In this section, we will introduce a simple 2d gravity model in which one can derive the "island" prescription (and corrections to it) in a very explicit way using the replica trick. 9 The starting point is to consider a black hole in JT gravity, with an "end of the world brane" (EOW brane) behind the horizon, sketched in figure 1. This can be understood as a Z 2 quotient of an ordinary two-sided black hole, with a particle of mass µ behind the horizon [50]. The Euclidean action for the system is where the integral is along the worldine of the EOW brane, and the pure JT action is In our discussion of the model, two different types of boundary conditions will be relevant. At the standard asymptotic boundary, we impose Here τ can be interpreted as the imaginary time coordinate of the holographic boundary dual to the JT gravity system. At the EOW brane, we impose the "dual" boundary conditions ∂ n φ = µ, K = 0, (2.4) where ∂ n means the derivative normal to the EOW brane boundary, and µ ≥ 0. We will be interested in the case where the EOW brane has a very large number k of possible internal states, each orthogonal to the others. To model an evaporating black hole, we will think of these states as describing the interior partners of the early Hawking radiation. We will entangle them with an auxiliary system R, which will model the early radiation of an evaporating black hole. So, all together, the state of the whole system is

5)
Here |ψ i B is the state of the black hole B with the brane in state i, and |i R is a state of the auxiliary "radiation" system R. 9 A similar model has recently been discussed independently in [19].
island k ≪ e S k ≫ e S BH BH Figure 2: We show the expected entanglement wedges for our simple model based on the "island" conjecture.
In the k e S BH phase, the entanglement wedge of the B system (boundary dual of the gravity theory) is the whole spacetime, shown hatched. In the k e S BH phase, an island develops. The entanglement wedge of the B system retreats to the exterior of the horizon, and the entanglement wedge of the auxiliary R system is the shaded blue island behind the horizon.
Let's compute the entropy of the R system in the state (2.5), using the island conjecture (1.1) of [14,15,16]. If we take the island to be the empty set, then the answer for the entropy is just log(k), representing the entanglement between the R system and the bulk state of the brane. However, we can also consider the case with a nontrivial island, as shown in figure 2. This island contains the EOW brane, so the bulk entropy term S naive (R ∪ I) in the island formula (1.1) will give zero. The remaining area term is interpreted in JT gravity as Area (∂I) 4G N → S 0 + 2π φ(∂I). The transition between these two answers, as a function of k, is analogous to the Page curve. For the simple model described above, we will derive this answer using the replica trick, along with corrections. We will start by deriving a qualitative first approximation to the full answer, and then gradually progress to calculating the full non-perturbative answer in Section 2.5.

Computation of the purity
We would like to use the 2d gravity path integral to compute the entropy of system R, in the state |Ψ given in (2.5). We can start by discussing the density matrix ρ R : The matrix elements of ρ R are gravity amplitudes ψ i |ψ j . These are determined (up to normalization of the states |ψ i ) by a gravity calculation with the following boundary conditions ψ i |ψ j = (2.9) or Figure 3: The boundary conditions for | ψ i |ψ j | 2 are shown at left, and two ways of filling in the geometry are shown at right. To compute the purity, we want to sum over i, j by connecting the dashed lines together. For the disconnected geometry, this will lead to a single k index loop, and for the connected geometry it will lead to two loops.
Here, the black line is the asymptotic boundary of type (2.3), with renormalized length β. We drew the line with a wiggle to emphasize that the boundary conditions do not constraint its shape. The arrow represents the direction of time evolution, from the ket to the bra. 10 At the locations where the dashed blue lines intersect the solid black line, we require that an EOW brane of type i or j should intersect the asymptotic boundary. Here and below, dashed lines carry the index of the EOW brane state. The leading gravity configuration that satisfies these boundary conditions is the following classical solution ψ i |ψ j ≈ (2.10) The black asymptotic boundary (with arrow) borders a portion of the hyperbolic disk, and a solid blue EOW brane follows a geodesic between the i and j endpoints. An important feature is that the same EOW brane connects to both the i index and the j index. Because we assume that these correspond to orthogonal internal states of the EOW brane, the result for ψ i |ψ j will be proportional to δ ij . So, based on this computation, it looks like ρ R will be maximally mixed, with entropy log(k): However, we can also try to compute the entropy directly, using the replica trick. Let's start by considering the so-called "purity" Tr(ρ 2 R ), which is closely related to the Renyi 2-entropy. From (2.8), one can easily work out that | ψ i |ψ j | 2 . (2.12) The boundary conditions to compute | ψ i |ψ j | 2 are shown in the left panel of figure 3. To compute the purity, we will sum over i, j by connecting the dashed lines in the obvious way. The crucial point is that there are two different ways of filling these boundary conditions in with 2d geometry, as shown in the right panel of figure 3. We can either have a disconnected geometry with the topology of two disks, or a connected "Euclidean wormhole" geometry with the topology of a single disk. 11 To study a two-sided version of this model where we don't take the Z 2 quotient, we would glue two copies of the EOW brane geometries together along the EOW branes. Then the topology would be two disks in the disconnected case, and one cylinder in the connected case.
In both topological classes, there is a classical solution in JT gravity. This is somewhat nontrivial for the connected (wormhole) case, for the following reason. There is a parameter that characterizes the wormhole geometry, which we can take to be the regularized distance from one of the asymptotic boundaries to the other, through the wormhole. By itself, the action of JT gravity provides a pressure that would prefer this distance to be large. But the tension provided by the mass µ of the particle or EOW brane prefers the length to be short, and provides a counterbalancing force that leads to a stable minimum for the action. 12 To describe the contributions of these geometries to Tr(ρ 2 R ), we will use the notation Z n = Z n (β) to represent the gravity path integral on a disk topology with a boundary that consists of alternating segments of n physical boundaries of renormalized lengths β, and n EOW branes. Using this notation, we can evaluate the sum of the two contributions in figure 3 as In the numerator, we have the contributions of the two geometries in figure 3: the disconnected geometry at left has one k-index loop, and two copies of the geometry that defines Z 1 . The connected geometry at right has two k-index loops, and a single copy of the geometry that defines Z 2 . In the denominator, we have divided by the gravity computation that normalizes the density matrix. 13 We will work out exact formulas for Z n in JT gravity below, but the basic point can be seen already in a very crude approximation, where we retain only the dependence on the topological S 0 term in the JT gravity action (2.2). This term weights the contribution of a given topology by e S 0 χ , where χ is the Euler characteristic. Since the topology relevant for Z n is disk-like for any value of n, and since χ = 1 for the disk, we will have Z n ∝ e S 0 . (2.14) Using this formula, we see that the second term on the RHS of (2.13) is proportional to e −S 0 . So If k is reasonably small the disconnected geometry dominates, and we will find that the purity is 1/k, which is consistent with (2.11). However, if k gets very big, then the connected geometry dominates and we will find the purity is e −S 0 , independent of k. This interchange of dominance of the two saddlepoints is the basic mechanism that prevents the entropy (here the Renyi entropy) of the radiation from growing indefinitely. Note that because the state of the entire system is pure, this implies that the entropy of the excitations in the black hole (EOW brane states) remains finite even when k is very large. The basic mechanism for this is the small nonorthogonality of these states, as we discuss in the next section. When k becomes of order e S BH , this nonorthogonality adds up to a large effect. 11 Euclidean wormholes are sometimes also referred to as spacetime wormholes, to differentiate them from the more commonly studied spatial wormholes such as Einstein-Rosen bridges. 12 Note, however, that having a classical solution isn't necessary for the contribution to be meaningful. 13 There is a subtlety here. If we normalize the state so that Tr(ρ R ) = 1, we will find that Tr(ρ R ) 2 is not exactly one. This can be interpreted as due to the small fluctuation in the normalization of the states. This effect can be taken into account, but in the planar approximation that we will describe below, it is consistent to ignore it.

Factorization and averaging
The calculation in the previous section raises some important subtleties. For example, how can this result for Tr(ρ 2 R ) be reconciled with the formula for ρ R in (2.11)? To address this, let's first consider the amplitude ψ i |ψ j . For this quantity, the gravity path integral gives two seemingly-contradictory answers (2.16) The first of these equations follows from the gravitational calculation in (2.10): the answer is proportional to δ ij because the same EOW brane connects to both the i index and the j index. The two terms in the second equation correspond to the two terms in the RHS of figure 3. Note that in the second term i does not have to be equal to j. Of course, the two equations in (2.16) are incompatible in a strict sense. However, suppose that we imagine that the true quantum amplitude is where R ij is a random variable with mean zero. Then if we interpret the gravity path integral as computing some type of average over the microscopic R ij quantities, then we would interpret the gravity answers as telling us that Here, the bar indicates an average over R ij . Now the equations are compatible, and the extra term in the second equation tells us the variance of R ij . Note that the correction to orthogonality in (2.17) is very small. But the large number of terms in (2.12) enables these small corrections to dominate the final answer for k e S 0 , giving a qualitatively different result than (2.11). The conclusion seems to be that the gravity path integral is not literally computing quantum amplitudes, but is instead computing some coarse-grained version, where we average over some microscopic information R ij . The lack of factorization of the resulting quantities is familiar from the connection between Euclidean wormholes and disorder. 14 This connection goes back to the work of Coleman [54] and of Giddings and Strominger [55], and has been recently discussed in JT gravity in [25]. It raises important questions in the present context, which we will return to in section 7.
In appendix D we give a precise description of the random ensemble that is dual to the JT gravity plus EOW branes model.

First look at the von Neumann entropy
The standard way to compute the von Neumann entropy using replicas is to use For this to be useful, it is important to be able to compute Renyi entropies in a way that has a good continuation in n. This is rather nontrivial, since as n grows large, there are many different geometries that can fill in the boundary conditions for computing Tr(ρ n ) (shown in the left panel of figure 4). We will show how to handle this problem in the next section. First, though, we can do a simplified analysis that is appropriate in two extreme limits. In the or . . . or Figure 4: The boundary conditions for Tr(ρ n R ) with n = 6 are shown at left, along with two extreme ways of filling in the geometry, corresponding to the completely connected and completely disconnected options. Note that the geometry at right contains a fixed point of the Z n symmetry that rotates the replicas.
case where k e S BH , completely disconnected geometries with the topology of n disks dominate. They have a single k-index loop, and give the contribution (2.20) Using (2.20), we find S R = log(k), consistent with (2.7).
In the opposite limit where k e S BH , the completely connected geometry with the topology of a single disk dominates. There are n index loops (see figure 4), and the contribution is The k factors cancel out, so this answer is purely gravitational. In order to compute the von Neumann entropy, we need to continue Z n to near n = 1. A trick for doing this (see section 3.2 of [7]) is to notice that the geometry associated to Z n has a Z n replica symmetry, and that the Z n quotient of the geometry can be continued in n. In the limit n → 1, this becomes the original unreplicated geometry, and the computation of the von Neumann entropy reduces to S 0 + 2πφ h , where φ h is the value of the dilaton at the horizon, which is the fixed point of the Z n symmetry in the n → 1 limit. So we conclude that in this phase, the answer is just the thermodynamic entropy of the black hole. Again, this is consistent with (2.7). The main conceptual point that we want to emphasize is that the island extremal surface descends from a Renyi entropy computation that involves replica wormholes.

Summation of planar geometries
So far, we have discussed the completely disconnected and completely connected replica geometries. In the regimes where these dominate, we get the two different answers for the von Neumann entropy, predicted by the island conjecture. In order to make this more precise, and to understand the transition between the two regimes, we need to sum over replica geometries that are intermediate between the completely disconnected and complete connected cases shown in figure 4.
The starting point is the boundary condition for Tr(ρ n R ), shown in the left panel of figure 4. In principle, we need to sum over all ways of filling this boundary condition in. To simplify, we will assume that e S 0 and k are both large. In this approximation, we only need to sum over geometries that are planar, in the sense explained in figure 5. Non-planar geometries are suppressed either by powers of e −2S 0 (for adding handles), or by powers of 1/k 2 (for introducing crossings).
Depending on the relative size of k or e S 0 , either highly disconnected or highly connected geometries will tend to dominate. We do not assume any particular relationship between k and e S 0 , so it The middle figure has an extra handle and is down by e −2S 0 . The right figure involves a crossing, and is down by k −2 (it has two dashed index loops instead of four).
will be necessary to sum over all planar geometries. In order to do this, it is convenient to define the resolvent matrix R ij (λ) of ρ R : We will write down a Schwinger-Dyson equation for R ij using the planarity property and then use its solution to find the entanglement spectrum of ρ R . 15 The boundary conditions for R ij correspond to an infinite sum over different numbers of asymptotic AdS 2 boundaries, connected by the k index lines associated to the R system: The dashed index lines come with factors of 1/λ, and the solid lines with arrows come with factors 1/(kZ 1 ) that normalize the gravitational path integral. We fill these in with all possible planar geometries: For the one-boundary terms in (2.23), there is only a single bulk geometry that can fill it in. For the two-boundary term, there are two possible geometries, and we sum over them in (2.24). For the three-boundary term (not shown), there would be five possible geometries. We can write a Schwinger-Dyson equation that sums these geometries:

(2.25)
15 This analysis was inspired by the "free probability" results discussed in [56] and e.g. figure 1 of the earlier [57].
On the RHS, the second term sums all planar geometries in which the first boundary is disconnected from all other boundaries. The third term sums all planar geometries in which the first boundary is connected to one other boundary, and so on. To write this identity as an equation, it is convenient to define R as the trace of the resolvent matrix, (2.26) We can think of the δ ij /λ as a "bare propagator." It is the first term in (2.24). In the remaining terms, n labels the number of boundaries of the geometry that contains the leftmost boundary. The Z n factor is the gravitational path integral of an n boundary geometry, which we will analyze below. For each of the n boundaries, we divided by kZ 1 in order to normalize the density matrix ρ R . Taking the trace of (2.27), we find (2.28) In JT gravity, it is possible to make this equation very explicit. 16 The exact formula for Z n can be worked out using the boundary particle formalism [58,59]. We decompose the path integral for Z n in the following way Here, the object I 2n ( 1 , . . . , 2n ) is the JT gravity path integral with 2n geodesic boundaries of fixed regularized lengths . It has the following expression [58] where K is the modified Bessel function. The object ϕ in (2.29) is the Hartle-Hawking state in the geodesic basis [51]. It computes the path integral from the asymptotic boundary (characterized by renormalized length β) to the geodesic of regularized length . It has the expression [59,58] The boundary of I 2n consists of n geodesics that correspond to EOW branes, and n geodesics that need to be glued to Hartle-Hawking states. In both cases, the procedure is similar: we integrate over the length of the geodesic, with a "wave function" that is either the e −µ weighting for the EOW brane, or the ϕ β ( ) weighting for the Hartle-Hawking states. Including the correct measure factor for the integral over geodesic lengths (see appendix A), the formula for Z n is (2.32) In going to the second line from the first one, we inserted the integral representations (2.30) and (2.31), and then did the integrals using formulas discussed in appendix A. In y(s), the Boltzmann factor exp(− βs 2 2 ) and the Gamma function factor 2 1−2µ |Γ(µ − 1 2 + is)| 2 come from the integral with the Hartle-Hawking state and with the brane state, respectively.
Physically, the s parameter can be viewed as giving the energy s 2 /2 of a particular asymptotic region. The main simplification in the above calculation is the fact that the energy must be the same for all asymptotic boundaries in a single connected geometry.
Mathematically, (2.32) gives an integral representation for Z n , with the property that the n dependence is very simple inside the integral. The sum needed for the Schwinger-Dyson equation (2.25) becomes a geometric series. In order to unclutter the equations that follow, we will define rescaled variables Then, after summing the geometric series, one finds that the equation for the resolvent is This equation contains all of the information about the entanglement sepctrum of ρ R , and it is exact in the planar approximation. 17 By solving the equation for R(λ), and then taking the discontinuity across the real axis, one can find the density of eigenvalues of the matrix ρ R : The von Neumann entropy can then be computed by evaluating

Microcanonical ensemble: Page's result
Let's first use this planar resummation to compute the entanglement spectrum for the case where the black hole is in a microcanonical ensemble. This means that we fix the energy in each asymptotic region, rather than fixing the renormalized length β. Let s be the chosen value of s = √ 2E, and let ∆s be the width of a small interval that defines the microcanonical ensemble. It is convenient to define the boldface quantities S, Z 1 and w: e S = ρ(s)∆s, Z n = ρ(s)y(s) n ∆s, w(s) = y(s) (2.37) 17 A similar equation applies to arbitrary dilaton gravity theories, at least in the classical limit and with large µ, see equation B.5.
Here S is the entropy of our microcanonical ensemble, and Z n is the microcanonical version of Z n .
The advantage of this microcanonical ensemble is that the resolvent equation (2.28) simplifies to a quadratic equation for R. This equation, and the corresponding solution for the density of eigenvalues (2.35), are given by (2.38) These equations are precisely the same ones found by Page for the entanglement spectrum of a subsystem of dimension k in a random state of total dimension ke S , in the planar approximation [2]. The fact that we get the random state answer can be understood from the fact that for fixed energy, the random ensemble dual to our JT gravity + EOW branes model (see appendix D) is the same random state ensemble discussed by Page. 1. When k e S , the range of the spectrum is very narrow and the first term in D(λ) is a very narrow semicircle, roughly a delta function. So the spectrum consists of k eigenvalues of size 1/k. This describes a maximally mixed density matrix and the von Neumann entropy is log k.
2. As k approaches e S (the Page time), the distribution develops a 1/ √ λ singularity at the origin. After the Page time, the smooth part of the distribution contains only e S eigenvalues. The remaining k − e S are exactly zero. The von Neumann entropy has a rather sharp transition from log k to S during the Page transition: S = log m − m 2n , where m = min(k, e S ), and n = max(k, e S ) [2]. 3. When k e S , the smooth term can again be approximated as a delta function. The distribution describes a density matrix with e S states that are maximally mixed, and the rest unentangled. The von Neumann entropy stays equal to S.

Canonical ensemble: smoothing out the transition
In the canonical ensemble, we did not find a way to solve the resolvent equation (2.34) exactly. However, we can get some intuition by solving the equation numerically. To do this, one can first evaluate λ as a function of R, find the locus where λ is real, and then compute the inverse. The results of this procedure are sketched in figure 6. Long before the Page transition, the distribution of eigenvalues D(λ) is very narrow, localized near λ = 1/k as in the microcanonical case. Long after the Page transition, the distribution resembles the thermal spectrum of the black hole, with a rather sharp cutoff at s = s k chosen so that the total number of eigenvalues is k: (definition of s k ). (2.39) In between the two regimes, the curve smoothly interpolates, with no singular feature during the transition.
We will now try to analyze the resolvent equation approximately, with the goal of computing the von Neumann entropy in the semiclassical, small G N regime. In our formulas for JT gravity, we did not include a G N parameter explicitly, but it can be restored by taking so the semiclassical small G N limit corresponds to small β. We give a much more detailed analysis of the same problem in Appendix F.  Figure 6: Sketch (not to scale) of the entanglement spectrum in the canonical ensemble. Before the Page transition, the density of states has a very narrow distribution. Near the Page transition, it becomes a shifted thermal spectrum, with a cutoff at some energy. After the Page time, it becomes an ordinary thermal spectrum, again with a cutoff at some energy.
As a first step, we can determine the location of the bottom edge in the distribution D(λ), i.e. the smallest eigenvalue λ 0 . This corresponds to a branch point in R(λ), which means a location where dλ/dR = 0. To determine this point, we first write the resolvent equation in the form In the region near the lower endpoint of the spectrum, R is real and negative, and the above integral can be approximated by dividing the s integral into two intervals where the two different terms in the denominator dominate: Here s R is defined by w(s R )R = −k. This approximation is justified in more detail in Appendix F. Setting to zero the derivative of (2.42) with respect to R, we find that the first two terms on the RHS should be equal. Comparison with (2.39) then implies that s R ≈ s k . Plugging this back into (2.42), we find that the smallest eigenvalue of ρ R , and the corresponding value of R are . (2.43) The next step is to write the equation for the resolvent (2.34) and break up the s integral into two parts: In the region from s k to infinity, we replaced the factor of k − w(s)R in the denominator by k. This is justified on the following grounds. First, we are going to study the equation in the region λ > λ 0 , and in this region |R| < |R(λ 0 )|. Second, w(s) is a decreasing function of s, so for s > s k , we have |w(s)R| ≤ |w(s k )R(λ 0 )| = k. Intuitively, we have separated the s integral into two terms in (2.44) corresponding to the "pre-Page" and "post-Page" parts of the thermal ensemble. The high energy states with s > s k are effectively before the Page transition, and the planar resummation (represented by the nontrivial denominator) is not necessary for these states. A nice feature of (2.44) is that the final term can be recognized as λ 0 R, so (2.44) can be rewritten We can solve this equation in an approximation where the the second term is a small perturbation, and the zero-th order solution is just k/(λ − λ 0 ). Iterating the equation once, we find the first-order solution . (2.46) This approximation is good as long as k, which will be true as long as with δ a control parameter. So we conclude that for eigenvalues λ satisfying (2.47), we can compute the resolvent to good accuracy. Unfortunately, there is a gap between this value and the actual bottom of the spectrum λ 0 , and we do not have control over the distribution of eigenvalues in this region. Before and during the Page transition, we are saved by the fact that λ 0 w(s k ), so for the purposes of computing the von Neumann entropy, this is a narrow region where the density of eigenvalues can be approximated by a delta function with unknown weight. After the Page transition, this is no longer true, but the eigenvalues in the unknown region contribute a small amount to the entropy.
So, using (2.44) in the region (2.47) and parametrizing our ignorance in the remaining eigenvalues with a delta function, we have (2.48) In the region between s k − δ and s k , λ 0 w(s), so we can approximate λ 0 + w(s) ≈ λ 0 . This means that we can let the integral run all the way to s k , at the cost of changing the coefficient of the delta function piece. The resulting combined coefficient can be shown to vanish, by the following argument. The distribution of eigenvalues has to satisfy two normalization conditions, dλ D(λ) = k, dλ D(λ)λ = 1. (2.49) The first condition says that ρ R has k total eigenvalues, and the second condition says that ρ R is normalized. The first of these equations implies that is correctly normalized without any further delta function piece. One can check that the second condition follows from our result for λ 0 in (2.43). This is our final expression for the entanglement spectrum. It can be given a very simple interpretation: we have the spectrum of the first k states of the thermal spectrum of the black hole, shifted so that the total normalization is one. This shift can be understood as the contribution of all of the remaining states in the spectrum of the black hole: Here P is the projector into the post-Page subspace with s < s k , andP is the projector into the pre-Page subspace with s > s k . The operator ρ BH is the density matrix of the thermal ensemble (including the brane) for the black hole. Using this result, one can evaluate the von Neumann entropy. Inserting (2.50) into (2.36) and doing the λ integral using the delta function, we find (2.52) A careful analysis shows that the error in this approximation peaks around the Page transition, when it is of order G N (see appendix F for details). As a function of k, this expression exhibits a smoothedout version of the Page transition. It differs from the naive answer min(k, S BH ) by an amount of order G −1/2 N near the Page transition, but one can show that it agrees to within O(1) precision with the average of the micro-canonical answer over the full thermal ensemble. In figure 7 we plot the exact formula for the entropy for β = 3, obtained by solving the resolvent equation numerically. We also plot the approximation (2.52) and the exact microcanonical answer for comparison.
We would like to emphasize one point about the continuation in n. Because they are sensitive to different parts of the thermal spectrum, the different Renyi entropies Tr(ρ n ) experience Page transitions at different values of k. In fact, there are values of k for which the Page transition for all of the integer Renyi entropies has already taken place, but the von Neumann entropy is still very close to log(k). In this regime, a fully connected geometry dominates the computation of the Renyi entropy for every integer n, but its continuation to n = 1 would give the wrong answer in the von Neumann limit. Nevertheless, the full resummation gives the right answer.

Reconstruction behind the horizon via the Petz map
In the previous section, we showed (in the simple model) that for large k, the entanglement entropy saturates at the thermodynamic entropy of the black hole, and the RT surface is at the horizon. This means that the entanglement wedge of the radiation R contains an island behind the horizon.
The location of the entanglement wedge is significant because of the notion of "entanglement wedge reconstruction," 18 which implies that in the large k phase, the radiation system R should describe the island. The arguments in favor of entanglement-wedge reconstruction [8,37,38,39] are strong but non-constructive. In the next section we will show directly from the gravitational path integral that operators acting in the R system can control the region behind the horizon. We will see that the Euclidean wormholes described above play an essential role, connecting the region behind the horizon to the R system.
Although we will focus on the simple model in our discussion below, a similar argument applies to the general case of operator reconstruction within the entanglement wedge, e.g. the famous case of two intervals in the AdS 3 vacuum. The argument is essentially the same and we describe this generalization in appendix C.

Setup
We would like to slightly enrich the simple model discussed above, so that there is something nontrivial to reconstruct. To do this, we will imagine coupling our JT gravity system to a bulk field theory, containing some propagating degrees of freedom. For simplicity, we will assume that these propagating degrees of freedom cannot feel the difference between different states of the EOW brane.
We will consider a "code subspace" of states corresponding to small excitations of the bulk fields propagating on the background described so far (i.e. by the background corresponding to the state (2.5), and to the geometry in figure 1). A basis for the code subspace is provided by the states Here, |ψ aj is a state in which the EOW brane is in state j, and we also have a small excitation of the propagating bulk fields, labeled by a.
Because of the variety of geometries that will be important below, we actually have to be somewhat more precise at this point, and define |ψ aj by a boundary condition rather than a bulk statement. What we really mean is that the a index corresponds to some insertion in the boundary conditions for the Euclidean path integral that defines the bulk state. So, for example, the boundary conditions for an inner product of two such states would be where the labeled "x" marks represent the insertions associated to a and b. We assume that e.g. the a insertion is arranged so that on the leading disk topology, it produces a particular state of the bulk field theory |a FT ,  Relative to the state with no insertions, |a FT could contain excitations behind the horizon, outside the horizon, or both. To simplify the formulas below, we will assume that the bulk states are indexed and normalized so that to a good approximation a|b FT = δ ab . We will specialize to a more concrete model in Section 3.7. We let O FT be some operator that acts within this subspace of bulk states. We will denote its matrix elements in the bulk field theory states as O ab : We define O to be the representation of the bulk operator O FT in the full boundary system Note that in general, this "global" representation of the operator acts nontrivially on both R and B. Now we can get to the point. Suppose that we are in the phase with k e S BH , so that the entanglement wedge of R includes the island behind the horizon. Then if O FT acts within the "island" behind the horizon, the general arguments for entanglement-wedge reconstruction suggest that we should be able to find an operator O R , acting only on R, such that What we would like to do is show by a direct bulk calculation that this is possible. The tool we will use is the so-called "Petz map," which essentially gives a guess for what operator O R to choose. We will do a bulk calculation to show that it actually works, demonstrating entanglement-wedge reconstruction explicitly.

Petz Lite
Before we discuss the Petz map itself, we will go through a simpler ("Petz Lite") version, which will be good enough for a certain limited class of states. The Petz Lite map works as follows. Given an operator O on the combined system BR, we can define a operator on the R system using the partial trace Here, the constant c 0 should be chosen so that the identity operator maps to the identity operator. This seems like a very naive guess for an operator mapping, and in fact this Petz Lite version will not work for general states. But it does work in a special class of states, analogous to the "fixed area" states of Akers, Rath, Dong, Harlow and Marolf [60,61]. In JT gravity, fixed area is replaced by fixed dilaton φ, and we choose to fix the value of φ at the horizon to be the value that it has in the classical solution. The advantage of doing this is that when φ is fixed, we don't impose the equation R + 2 = 0 at that point, and instead we allow any value of R, including a delta function singularity that corresponds to a conical excess. This freedom makes replica geometries very easy to construct [60,61]: we just take n copies of the n = 1 geometry, and glue them together in such a way that the horizon is a conical singularity with total angle 2πn.
To proceed, let's define boldface states |ψ ai to be fixed-dilaton versions of the states |ψ ai defined above. Making this modification in (3.5) and tracing over the B system as in (3.7), we find the candidate operator How does this operator manage to affect the region behind the horizon? Note that although O R acts only on R, its explicit form depends in detail on amplitudes ψ ai |ψ bj in the black hole theory B. Let's imagine that some quantum computer will apply O R to the R system. Then in order to compute these amplitudes, the quantum computer will do computations that are equivalent to simulating system B. This introduces a second copy of B into the game. The basic point seems to be that the physical copy of the black hole B can connect via a Euclidean wormhole to this second copy, providing a geometrical connection between the black hole and R. This connection is equivalent to inserting an operator behind the horion of the physical black hole.

= physical BH
copy in quantum computer acting on R cuts glued Figure 8: The two-replica geometry that contributes to the expectation value (3.11) of the Petz Lite operator.
The wavy line represents the bulk particle. Note that gluing the right geometry to the left geometry has the same effect (from the perspective of the physical left system) as inserting a creation operator behind the horizon. Another sketch of the glued geometry is shown at right. The marked point is the point of fixed dilaton, and it has conical angle 4π. The worldline of the particle created in the quantum computer (wiggly orange line) travels through the wormhole and into the physical black hole. The brane on the top half of each of the figures is a brane of type j and the brane on the top half is of type i.
Let's see how this works in a very simple case, where the code subspace consists of two states: the state |0 FT with no bulk particles excited, and |1 FT with a single particle behind the horizon. We consider the operator which creates a particle. For this case, Petz Lite gives Here |ψ 0j is a state with no particle, and |ψ 1i is a state with the particle present. If we were to evaluate the amplitudes ψ 0j |ψ 1i using a single replica, we would find zero because the bulk field theory states with and without a particle are orthogonal. So the amplitude would vanish, and we would conclude that O R = 0. However, as discussed in section 2.3, the gravity path integral seems to require an interpretation as a disorder average, where the amplitude ψ 0j |ψ 1i is only zero "on average" in some ensemble implicit in the gravitational description. So we should postpone our concern that the operator vanishes, and see what it does when we compute a matrix element: On the RHS we have a product of two gravitational amplitudes. The first factor corresponds to the physical black hole, and the second factor corresponds to the simulated one in the quantum computer acting on R. In the large k phase, the largest contribution to the expression on the RHS comes from a connected geometry, where the physical black hole is connected to the simulated one by a wormhole. This wormhole is particularly easy to describe for the fixed-dilaton states. We simply glue two copies of the one-replica geometry together along a cut behind the horizon. This corresponds to a geometry with a 4π angle at the horizon, as sketched in figure 8. From the perspective of the physical black hole, this wormhole corresponds to an insertion of an operator behind the horizon. It is an operator with a particle present in the ket vector (top half of the cut) and no particle present in the bra vector (bottom half of the cut). So the operator acts as a creation operator behind the horizon. This is the basic mechanism that underlies entanglement-wedge reconstruction. However, in order to go beyond fixed-dilaton states, we will have to use the full Petz map, rather than the Petz Lite version discussed above.

Quantum information discussion of the Petz map
We now introduce the full Petz map, in its natural environment of quantum channels. The remarks below will not be necessary for our gravity discussion, but we include them for background.
A reversible quantum channel N can be defined by the property that it has a type of inverse R, such that (R • N )(ρ) = ρ for all inputs to the channel N . The Petz map [41,42,62] gives a universal formula for R in terms of N , in cases where exact recovery is possible. We will not write the full formula, since we are only interested in a special case of it, but we refer the reader to [63,43] for details and previous discussions in the context of entanglement wedge reconstruction.
As a representative example, let's consider a quantum channel that embeds H code ⊆ H A ⊗ HĀ, and then traces over HĀ. The channel is reversible if it is possible to recover H code from H A . In this situation, for any operator O code acting on H code , the Petz map gives an equivalent operator O A acting on A: Here V is the encoding map from H code into H A ⊗ HĀ, and Π code is the projector onto the image of H code in H A ⊗ HĀ. By saying that O A is equivalent to O code , we mean that for any state in H code , the following two operations produce the same state: (a) acting with O code and then encoding with V and (b) encoding with V first and then acting with O A . To see that (3.12) works, we note that reversiblity of the channel implies that there is an isomorphism H A ∼ = H 1 ⊗ H 2 ⊕ H 3 where H 1 ∼ = H code such that any state |ψ ∈ H code is mapped by V to the state |ψ |χ ∈ H 1 ⊗ H 2 ⊗ HĀ, and where |χ ∈ H 2 ⊗ HĀ is some fixed state. 19 We can now evaluate (3.12). The encoding map acts as where χ 2 = TrĀ|χ χ|. Finally For channels that are not perfectly reversible, the Petz map will not work perfectly. However, it still achieves very good performance, with an "average error" at most twice the optimal average error (using any recovery channel) [64]. For large code spaces, the performance may be suboptimal for particular"worst-case" input states, whereas recent work has shown that a slightly more complicated "twirled Petz map" has good provable worst-case performance [65]. However, for our application, the ordinary Petz map will work well enough. 20 For reversible channels, the Petz map reconstruction of a unitary code space operator is itself always a unitary operator (at least when restricted to the image of the code space). 21 However, the individual elements that make up the Petz map reconstruction, in particular the factors of σ −1/2 A , are highly nonunitary. 22 The Petz map reconstruction may therefore have much higher complexity, as a unitary operator, than the simple description (3.12) would suggest. In fact, for evaporating black holes, there is good reason to think that it is exponentially complicated [66,67].

Gravitational replica trick for the Petz map
The above discussion of the Petz map motivates us to amend the naive "Petz Lite" formula to (3.17) Here σ R is the trace over B of the projector onto the code subspace, To check how well (3.17) works, we will compute the LHS of (3.6) in gravity and compare it with the RHS. The main complication in this calculation is the fact that it involves the −1/2 power of an operator. To deal with this, we will use a version of the replica trick, defining For positive integer n, matrix elements of this operator can be evaluated using replicas. In a first pass, we will take a naive approach and analytically continue the dominant term in the bulk calculation from integer n to n = −1/2. Later, we will show how to do this analytic continuation more precisely, using a summation of planar geometries.
We will now begin the calculation. After writing out the integer n expression and evaluating the inner products in the R system explicitly, one gets an expression that depends only on inner products of the |ψ ai states in the B system: In this expression, all indices except for a and b are summed over. The pattern of index contractions is best understood by writing a diagram for the boundary conditions for the corresponding bulk path integral. Using the translation of inner products to boundary conditions shown in (3.2), and dropping the "x" symbols for clarity, we find 21 For approximately reversible channels, it will be approximately unitary. 22 For the Petz Lite reconstruction, the problematic nonunitary part is the large constant factor c 0 .
In this diagram, and in similar ones below, we draw the case n = 2. In general there will be 2n + 2 black segments corresponding to asymptotic boundaries. In the phase k e S BH where the connected geometries dominate, the boundary conditions will be filled in by a completely connected geometry We now need to evaluate this bulk computation. There are 2n + 2 index loops for the k states of the EOW brane, which gives a factor of k 2n+2 that cancels against the k 2n+2 in the denominator of (3.21). What remains is a purely gravitational calculation, involving a path integral over gravity and over the propagating bulk fields. As a first approximation (order G −1 N ), we can ignore the matter fields altogether and just work out the answer from the gravity computation. In this approximation, the geometry has a Z 2n+2 symmetry and we can analytically continue the Z 2n+2 quotient of the geometry in n. In the limit n → −1/2, it becomes a disk with a single boundary, for which the gravitational path integral gives Z 1 . This cancels against the Z 2n+2 1 in the denominator, so the gravitational answer is simply one.

Reducing to a bulk field theory calculation
Now that we have evaluated the leading-order gravitational contribution, we need to compute the contribution from the propagating bulk fields. In the leading approximation, this means a field theory calculation on the fixed background, with the Z 2n+2 symmetry. We can view the geometry as built out of 2n + 2 pieces, which we separate with dotted lines: The field theory path integral on each of these pieces gives an operator with bra and ket corresponding to the dotted lines that border each piece. The particular operator we get in each case depends on the angle θ between the dotted lines, and it also depends on the field theory boundary conditions at the asymptotic boundaries, in particular, it depends on the a and b indices in (3.23). We will refer to this operator as M , and a certain sum over such operators as M : M (a, a; θ). (3.24) In this notation, the field theory computation we are interested in is The trace in this expression is over the Hilbert space of the bulk fields on one of the dotted lines.
Since O (n) R involves 2n + 2 replicas, we should set θ = 2π/(2n + 2). There are two sources of n-dependence in (3.25): the number of powers of M , and the value of θ. We would like to continue to n = −1/2, and it is convenient to do this in stages. In the first stage, we set θ = 2π, leaving the number of powers of M general. When θ = 2π, the M operator becomes a path integral on the standard unbackreacted geometry, In deriving the second equality here, we used (3.3). The notation Tr outside means a trace over the region outside the horizon, in the Hilbert space of the bulk field theory. Substituting this in, we find that with θ = 2π, the RHS of (3.25) can be written as a|O As a second stage, we now take the remaining dependence on n to −1/2, and we note that the answer takes the form of a Petz map, but now for a simpler problem defined entirely within the Hilbert space of the bulk field theory on the fixed background. Specifically, it is the Petz map O FT → O inside for reconstructing the bulk field theory operator O FT using only the region inside the horizon. If O FT acts behind the horizon, then this reconstruction is possible, so this auxiliary Petz map will succeed, meaning that This follows from the properties of the Petz map discussed in section 3.3, but it can also be shown directly as follows.
Then one can check that O inside = A inside , and the reconstruction works perfectly. Let's now summarize. We were originally interested in a reconstruction problem where the goal is to reconstruct a bulk field theory operator O FT using an operator O R acting only on the "radiation" system R. We started with a candidate for O R suggested by the Petz map. Using a bulk argument, we showed that where the matrix element in the RHS is defined purely in the bulk field theory on a fixed background. The operator O inside can be interpreted as the result of an auxiliary Petz map problem defined entirely in the field theory on the fixed background. This auxiliary Petz map attempts to reconstruct O FT using only the region behind the horizon. Such reconstruction will obviously be possible if O FT acts only behind the horizon, so (3.29) will hold and therefore (3.6) will hold also. So we conclude that in the island phase, reconstruction of operators behind the horizon is possible using only R.

The disconnected phase
To understand this result a little better, we can ask why reconstruction fails if we are in the disconnected phase with k e S BH . In this phase, the geometry that dominates (3.20) is the completely disconnected one, similar to the geometry shown in the center of figure 4. At leading order, the gravitational part of this computation reduces to Z 2n+2 1 , with an additional factor of k for the single k-index loop. These factors cancel the prefactors in (3.21) in the limit n → −1/2. So the gravitational and k-index part of the computation gives one, just as in the connected phase. 23 The computation therefore reduces again to a purely field-theory calculation on a fixed background.
Formally, this field theory calculation can be interpreted as an auxiliary Petz-map calculation, but of a rather trivial kind. The analog of the geometries in (3.26) is the one shown on the RHS of (3.26), but with no cut at all. As a field theory operator, this corresponds to Tr entire bulk |b a| FT . (3.31) So we replace Tr outside by a trace over the entire bulk Hilbert space. The associated Petz map is one that attempts to reconstruct the operator O FT from the trivial subsystem of the bulk theory, after tracing out everything. This is obviously not possible, so reconstruction fails.

Planar resummation for the Petz map
In the above analysis of the Petz map, for the most part we assumed that the connected geometry dominates. In this section, we will do the full sum over planar geometries, using a method similar to the one we used to compute the entanglement spectrum of ρ R . In this section, we will remove the bulk fields from the theory, and go back to considering the theory with only the EOW brane. In order to have a nontrivial code subspace of states we will interpret both the a and j indices as describing the state of the EOW brane. The j index is entangled with the radiation, and the a index labels the state within the code subspace. To reiterate: in our previous discussion, this a index labeled some state of the bulk fields, but to make the following analysis simple we will now take a to describe another aspect of the EOW brane (e.g. the state of some qubit that propagates right alongside the EOW brane). The main challenge in the computation is the existence of the operators σ −1/2 in the formula for the Petz map. It will be convenient to write a version of this formula for integer powers of σ as In the rest of this section, we will use the rescaled σ defined here, which is normalized as a density matrix. What we would like to do is make sense of the "analytic continuation to n 1 = n 2 = −1/2" of this expression. To make this precise, it will be helpful to introduce a generating function of these operators for all values of n 1 , n 2 as O R (λ 1 , λ 2 ): Figure 9: Boundary conditions (left) and a planar contribution to Ψ a |O R (λ 1 , λ 2 )|Ψ b . This contribution is an element of the first class, since the leftmost boundary is in the same connected geometry as the rightmost one.
Later we will see how to deform the integration contour to get an answer that can be continued to n 1 = n 2 = − 1 2 , which gives the actual Petz map. We would like to compute the sum over planar bulk geometries of the matrix elements of O R (λ 1 , λ 2 ). The geometries can be divided into two classes. In the first class, the asymptotic boundary corresponding to the physical black hole (with a, b indices in (3.21)) is connected to the asymptotic boundary on the other side of the circle, where the operator O a b acts. In the second class, these two asymptotic boundaries are not in the same connected component of the geometry. The contribution of the first class of geometries to the matrix element is proportional to O ab , since the index lines associated to the EOW branes require a = a and b = b . In the second class, the contribution is proportional to δ ab Tr O. In other words, we have the matrix elements: Ideal reconstruction would mean that c 1 = 1 and c 2 = 0. Since the Petz map is normalized so that the identity operator always reconstructs perfectly, we must have c 1 + c 2 d code = 1. We can compute c 1 by summing all geometries from the first class. Geometries belonging to the first class can be further grouped as follows. We are guaranteed that there is a geometry that connects the boundary of the physical black hole to the "opposite side" boundary. In addition, this geometry is connected to some number m 1 of boundaries in the "top half" of the circle, and m 2 boundaries in the "bottom half." This defines one connected component with 2 + m 1 + m 2 boundaries, but in between each of these boundaries we can have a full summation of planar geometries. These can be summed over using the resolvent for σ R : See figure 9 for a graphical description of this. The resulting expression for c 1 (λ 1 , λ 2 ) in the planar approximation is (3.37) In this formula, d code and e S 0 only appear in the combination d code e S 0 . This is clear for the explicit factors of d code , since they multiply either Z 1 or Z m 1 +m 2 +2 , both of which are proportional to e S 0 . But it is also true for the implicit factors in R, see (3.36). This will have an important implication: in order for the connected geometries to dominate the Petz map computation, we will need to have k d code e S BH , rather than the Page time condition k e S BH . This is in keeping with expectations based on [47,13,14]. In analyzing the equations below, we will simplify by setting d code = 1, and remembering that it can be restored by rescaling e S 0 → d code e S 0 .
To compute the coefficient c n 1 ,n 2 1 , we can use the contour integral in (3.34) and interchange the s and λ integrals to write .
For non-integer n, the function λ n has a branchpoint at infinity, so we can't continue the integral as written. However, we can first deform the contour for the λ 1 , λ 2 integrals to a new region where continuation will be possible. In particular, for each of the λ 1 , λ 2 integrals, we separately deform the integration contour to surround the cut in the resolvent along the positive real axis. 24 After doing this there is no problem in continuing in n, and for n 1 = n 2 = −1/2, we find (3.40)

Microcanonical ensemble
Let's first analyze this equation in the microcanonical ensemble with fixed s. Then the Schwinger-Dyson equation for the resolvent (2.34) implies Substituting this into the formula for c 1 (3.40), one finds We can do the integral using the result for the resolvent or the density of states in (2.38), and one finds (restoring d code by taking e S → e S d code ) So the reconstruction works well in the region k d code e S where connected geometries dominate. In this region, (3.43) gives the leading small correction due to disconnected geometries.

Canonical ensemble
To analyze c 1 in the canonical ensemble, we can go back to (3.40) and insert the leading expression so we find the expression Near the Page transition, the integral is dominated by the region s < s k , where the w(s) term in the denominator is larger than λ 0 . So we can approximate the integral as This has a simple interpretation: part of the thermal ensemble is "pre-Page," and part of the thermal ensemble is "post-Page." The above integral gives the fraction of the ensemble that is Post-page. Note that this notion of post-and pre-Page refers to the Page time with e S 0 replaced by d code e S 0 . After the Page transition, the correction to one in (3.46) becomes very small, and the leading correction comes from a new place: the correction to the assumed form k/(λ − λ 0 ) of the resolvent. This correction is computed in the final term in (2.46), and working out the first-order change in the LHS of (3.44), one finds In order to simplify this answer, we set λ 0 = 0, which is a good approximation after the Page transition. This integral is dominated by the region where s 1 ≈ s 2 . Making this approximation, it then evaluates to the weighted average of the error in the microcanonical ensemble (3.43), with the weighted average taken over the Post-page portion of the thermal ensemble. This is consistent with the lower bounds on the reconstruction error derived in [13,14].
In figure 10, we give a plot of the exact answer for c 1 in the microcanonical ensemble, and in the canonical ensemble with β = 3.

Radiating black holes in equilibrium
In the simple model discussed above, the Page transition results from a competition between the entropy of the black hole and the number of states k of an EOW brane. The interesting regime is a rather artificial one, where k is extremely large, of order e S BH . However, for a physical radiating black hole, we expect that a similar effect is accomplished by going to late time, where the large number of states is provided automatically by the large amount of Hawking radiation.
The analysis for a radiating black hole (even in a simple theory like JT gravity) is more complicated than for the simple model: it involves intrinsically Lorentzian physics, and the backreaction of the Hawking radiation is essential for finding the wormhole solutions. So we will not be able to go as far as for the simple model. However, in this section we will make some preliminary comments. First, we will explain the relationship between the wormhole topologies and the island extremal surface. Second, we will analyze the continuation near n = 1 of the replica-symmetric wormhole explicitly, and make contact with the quantum extremal surface. Finally, we will explain qualitatively why we expect solutions to exist for integer values of n > 1.
The setup we will discuss is the following. We consider two thermodynamically stable black holes (e.g. large black holes in AdS) that are weakly coupled together in such a way that they can exchange Hawking radiation. We refer to the two black holes as 1 and 2. As the Hawking radiation of 1 falls into 2, and vice versa, the entropy of each black hole grows. In thermal equilibrium, the two black holes will exchange Hawking radiation at a constant rate. A naive semiclassical computation will show that the entanglement between the black holes grows linearly with time forever. This eventually exceeds the coarse-grained entropy by an arbitrarily large amount. The Page curve in this context would be an entanglement entropy that follows the linear growth for a while, before saturating at a value that corresponds to the coarse-grained entropy of one of the two black holes.

JT gravity setup
We now analyze this situation in JT gravity. (2d gravity is convenient for drawing pictures, but the topological argument relating replica wormholes to the island extremal surface is similar in any spacetime dimension: one just replaces each point in the discussion below by a sphere.) We will consider two black holes in JT gravity, coupled together by a bulk matter CFT in a way that will be described below. Before getting into replicas and wormholes, let's begin by discussing the computation that gives the ordinary thermal partition function Z(β) for this combined system.
For a single black hole, the boundary conditions (2.3) for computing Z(β) would be to specify that the boundary is a loop of length β/ and to set the dilaton to 1/ at this boundary. Here, → 0 is a holographic renormalization parameter. We would then look for gravity configurations that can "fill in" these boundary conditions. Up to symmetries, the unique classical solution is the hyperbolic disk where we use the portion of the geometry inside the radius This cutoff radius has been chosen so the length of the boundary is β/ . The dilaton profile is The value of φ h was chosen so that φ = 1/ at the boundary. Let's now discuss the computation of Z(β) with two coupled black holes. Neglecting the matter fields for the moment, the boundary conditions will be the same in the two copies. We can fill these in with the same solution (4.1), but where we now take the two boundaries to be at radii The region r < r − is the first black hole, and the region r > r + is the second black hole. (Note that this is another copy of the hyperbolic disk, as one can check by taking r → 1/r.) A sketch of the full configuration is here: Next we should discuss the bulk matter fields and their coupling between the two black holes. A convenient choice is to include a matter CFT propagating in the bulk theory, and then allow these matter fields to pass freely from one system to the other, with "transparent" boundary conditions. So, for example, let's consider the geometry described above. For small , the geometry of the two black holes is conformal to the entire plane, and (up to a Weyl transformation that affects some observables), the bulk matter fields are simply propagating on this plane without seeing any unusual feature at r = 1 where we transition from one black hole to the other.
So far, we have described a situation in Euclidean signature, appropriate for computing Z(β). But we can also continue it into Lorentzian signature. The setup corresponds to two copies of the thermofield double black hole. In other words, each of the two black holes 1 and 2 have both a left asymptotic boundary L and a right asymptotic boundary R. The boundary conditions for the matter fields allow particles to pass between the two L asymptotic boundaries, and similarly for the two R boundaries. Up to a Weyl rescaling, the bulk matter fields feel like they are propagating on a Lorentzian cylinder, where half of the spatial S 1 is in black hole 1, and the other half of the spatial S 1 is in black hole 2.
In this situation, we would like to compute the entropy of the two-sided black hole 1, namely the entropy of the subsystems 1L ∪ 1R. Let's start by considering the boundary conditions to compute the Renyi n-entropy in the t = 0 state prepared by cutting the Euclidean path integral in half. These boundary conditions correspond to n replicas of (4.5), with boundary twist operators inserted, shown as zigzag cuts here:  As we pass through these twist operators from below to above, we transition from replica α of the inner circle to replica α + 1. If the systems were not interacting with each other, this would be a trivial relabeling. But with interactions it becomes nontrivial: below the twist operators replica α of the system 2 (outer circle) interacts with replica α of system 1 (inner circle), and above them replica α of system 2 interacts with replica α + 1 of system 1.
The boundary conditions described above will compute the Renyi n-entropy at time zero of the two-sided BH 1. By moving the location of the twist operators, we can compute the entropy at other times. In principle, we can move the two twist operators independently. From the rotational symmetry of (4.6), it is clear that if we rotate the twist operators around the circle by the same angle, we will not change the answer. In Lorentzian signature, this corresponds to invariance of the answer under forwards evolution on the R system, and backwards evolution on the L system. However, if we move both forwards in Lorentzian time, there is nontrivial time-dependence.

Renyi topologies for the empty set and for the island
Having specified the boundary conditions, we can now ask what bulk geometries can fill them in. Our goal is to understand what topologies for the Renyi entropy computation lead to the "island" and "empty set" extremal surfaces in the von Neumann limit.
Let's first discuss the empty set. This arises from a Renyi entropy computation with the topology of 2n disk geometries: n for system 1, and n for system 2. Each replica of the boundary conditions is associated to a distinct replica in the bulk, and the different geometries are only connected by their boundary interactions. The geometry can be represented in a convenient way by filling in the interior and exterior of (4.6), and extending the cuts corresponding to the twist operators into the bulk, from one side of system 1 to the other: The path integral of the bulk matter fields on this geometry computes the Renyi n-entropy of the bulk matter fields on a particular bulk slice. This slice corresponds to the location of the cut in (4.7), and it can be understood as passing across the full Einstein-Rosen bridge of system one. 25 In other words, we are computing the total entropy of the matter fields in system one. As we move the twist operators forwards in time, this entropy will grow, due to the fact that the Hawking radiation of one black hole falls into the other and vice versa. In general, the Renyi entropy computation will backreact on the gravity computation, so the geometry of the disks will be modified. But in the von Neumann limit n → 1 where the entropy computation does not backreact on the geometry, we find simply S vN = (entropy on full slice across ERB of black hole 1). (4.8) In writing this expression, we used the fact that the gravitational contribution to the path integral gives one in non-backreacting limit n → 1. As we move the twist operators forwards in time, the RHS of (4.8) will grow linearly in time forever. Let's now discuss the island. For the Renyi n-entropy, we leave the n disks of system 2 in place, but we replace the n disks of system 1 by a completely connected Z n -symmetric geometry that we will refer to as a "pinwheel." A sketch of the pinwheel for n = 6 (with its Z n quotient shaded) looks like this: +1 (4.9) For the moment, let's not worry about whether the geometry in (4.9) is a solution or not (we will come back to this). What we would like to do is understand what it looks like in the limit n → 1. Strictly speaking, this geometry does not have a continuation in n, but its Z n quotient does. The quotient (shaded orange in the above sketch) is characterized by the requirement that the conical angle around the two points marked by black dots should be 2π/n.
The shaded region intersects the asymptotic boundaries in two semicircles, shown with heavy black curves. These two semicircles are the two inner semicircles of (4.6), so in particular, they are joined by bulk field theory interactions to the same replica of system 2, which will be filled in by a disk geometry. For n ≈ 1, the quotient of the full geometry including system 2 therefore looks like the following: +1 (4.10) The inner region is the continuation to n near one of the shaded portion of (4.9). Note that the conical angle around the black dots is close to 2π/n, which is close to 2π.
One can evaluate the contribution of this topology to the gravity path integral, in a very crude approximation where we keep only the topological S 0 term in the JT action. In order to compute the von Neumann entropy, we need to consider Here Z n is the bulk computation with the boundary conditions for the Renyi n entropy, and Z 1 is the bulk computation of the partition function of the combined system. At the level of the topological action, these partition functions are simply e χS 0 , where χ is the total Euler characteristic: Here, the ∼ symbol means equivalence at the level of the topological part of the action. Substituting these expressions in (4.11), we find that the topological part of the gravitational path integral contributes S vN ∼ 2S 0 . Finally, let's discuss the contribution from the bulk field theory. It is helpful to compare the island geometry in (4.10) to the empty-set one in (4.7). In the empty-set geometry (4.7), the cut associated to the twist operators on the boundary extended all the way across the bulk. In the limit n → 1, the corresponding field theory computation was the entropy of the bulk fields on that slice. In the island geometry (4.10), the cut ends on the two black dots, which are the fixed points of the Z n symmetry. In the limit n → 1, the corresponding field theory computation is the entropy of bulk fields in the portion of system 1 outside the black dots. Adding this together with the gravity contribution described above, we have very roughly S vN = 2S 0 + (bulk entropy in BH 1 outside black dots). (4.14) At the level of our topological analysis, this is consistent with the island conjecture (2.7). In order to go beyond this crude analysis, we would like to find actual on-shell solutions in JT gravity coupled to a bulk CFT, which will require going to late times. As in many discussions of JT gravity, it is convenient to think about doing the path integral in two stages. First, we integrate over the dilaton. This imposes a delta function constraint that the metric should be hyperbolic, with R = −2. Next, we integrate over the moduli space of hyperbolic metrics, and over the space of "boundary wiggles," which represent the shape of the cutoff within the parent hyperbolic manifold [48]. Finding true solutions appears to involve two complications: first, in order to find on-shell solutions, we need to continue the twist operators forwards in Lorentzian time a certain amount. Second, for integer Renyi index n, the solution seems to involve an interesting but somewhat nontrivial configuration of these boundary wiggles. This problem can be avoided by continuing the off-shell action near n = 1, and observing that in this limit, only a finite dimensional space of wiggles can get excited. We study this problem next.

The pinwheel geometry for n ≈ 1
Let's begin by analyzing the pinwheel geometry in more detail, and working out its off-shell action near n = 1. The simple final answer could be derived quickly from the methods in [7,23], but we believe the following derivation may generalize better to integer n > 1.
We are interested in the special case of a pinwheel geometry with Z n symmetry, and with a hyperbolic metric R = −2. Such geometries can be parametrized by the length b of the geodesic that separates off any one of the asymptotic regions. Other geometrical quantities can be computed in terms of this b parameter. For example, consider the two fixed points of Z n symmetry, shown with black dots in (4.9). If we set 2ρ to be the geodesic distance between the two points, then one can show using hyperbolic geometry that An important special case of this formula is the limit n → 1, where The pinwheel has n of these geodesics of length b, separating off the n asymptotic regions from the rest of the geometry. The asymptotic region outside each of these geodesics is a "trumpet" geometry with σ ≥ 0. The locus σ = 0 is the geodesic of length b, and the asymptotic boundary is at σ → ∞.
In addition to the parameter b, the pinwheel geometry is characterized by more subtle "boundary wiggles," which represent the shape of the cutoff surface at large σ. These can be parametrized by giving the angular coordinate θ in (4.17) as a function of the proper length along the cutoff surface. It is convenient to use a rescaled proper-length coordinate τ , which runs from zero to β. The full off-shell action for the Z n -symmetric pinwheel geometry reduces [48] to the Schwarzian action for the trumpet geometry [25] nI (4.18) Putting the boundary wiggles on shell means no wiggles at all, θ(τ ) = 2π β τ . However, the result is still not on shell with respect to b, because one finds I ∝ b 2 .
So far, we have taken n to be general. But for n very close to one, the Z n quotient of the pinwheel geometry becomes the hyperbolic disk, with two marked points corresponding to the fixed points of the Z n symmetry, see (4.10). The hyperbolic disk has exact SL(2,R) symmetry, and for n close to one, we expect an approximate SL(2,R) symmetry. This leads to a separation of scales in the action: four special modes become parametrically soft. Three of these are parametrized by an SL(2,R) transformation, and one is parametrized by the distance 2ρ between the Z n fixed points. We would like to work out a "slightly off-shell" action with arbitrary values of these modes, but with everything else on-shell.
A finite SL(2,R) transformation that preserves the unit circle is The corresponding configuration of the Schwarzian variable θ(τ ) is determined by The Schwarzian derivative Sch(e − θb 2π , τ ) for this function θ(τ ) can be computed using the composition rule Sch(f • g, τ ) = Sch(f, g)g 2 + Sch(g, τ ), where f (x) = x ib 2π and where g(τ ) = e iθ given in (4.20). The integral over τ can be converted to an integral over e iθ , and can then be done by contour integration. The result for the action is (4.21) Now taking the limit n → 1, we can use (4.16) to expand this off-shell action near n = 1 as It will be convenient to think about this action in a slightly different way. We can act with an SL(2,R) transformation on the entire geometry (4.10) to "straighten out" the boundary mode (4.20). This transformation will leave the metric invariant, since the hyperbolic disk metric (4.1) is invariant under (4.19), but it will move the two marked points. A straightforward calculation shows that where ρ 1 , ρ 2 are the ρ coordinates of the new marked points, and is the dilaton profile in the n = 1 geometry. This simple form of the final answer would follow more directly from the general results in [7,23]. Let's briefly summarize. For n near one, the pinwheel geometry has four special modes with action of order (n − 1). These can be parametrized by the locations of the two Z n fixed points, considered as marked points in the hyperbolic disk. The off-shell action, as a function of the locations of these points, is related to the value of the dilaton at these points, (4.23).

The bulk field theory calculation
So far we have just studied the gravitational action of the pinwheel geometry. We also need to analyze the contribution from the bulk matter CFT. As explained in [8], the matter path integral can be understood as computing Tr( ρ n n ) (4.25) where ρ n is the un-normalized density matrix prepared by the path integral on the Z n quotient geometry. For n ≈ 1, this approaches (4.10). In principle, we need to consider two types of corrections to n = 1, The first O(n − 1) term can be shown to correspond to a small shift in the background value of the dilaton at the fixed points, due to quantum fluctuations of the propagating matter fields [8,23]. This term will be subleading in the discussion below, and we will neglect it. However, the second O(n − 1) term leads to an important contribution for n ≈ 1: log Z n ⊃ (n − 1) × (bulk entropy in cut region of BH 1). (4.27) We need to evaluate this entropy. This sounds potentially difficult, because the cut region in (4.10) consists of two intervals. But in the late-time limit of interest, it can be well-approximated [18] using the formula for the entropy of a single interval, which is simple in 2d CFT. Let's review the CFT formula for the entropy of an interval. Suppose we are interested in a metric of the form ds 2 = Ω(x) 2 dx i dx i , i = 1, 2 (4.28) We consider the state to be the vacuum in the x coordinates. For a conformal field theory, one would naively expect the entropy to be Weyl-invariant. But this is not quite true because of the need for a cutoff. If we assume that the appropriate cutoff is one in the physical distance, then the regularized entropy of an interval is (see e.g. [68]) In the twist-operator formalism for the entropy, this Ω dependence comes from the Weyl-transformation properties of the twist operators. As a warm-up, we would like to apply this to work out the regularized entropy of the interval between r = r 1 and r = 1 (at fixed θ) in the metric This metric is conformal to the plane, and we consider the state that is the ordinary vacuum in the plane. From (4.29), the entropy of an interval [r 1 , r 2 ] at fixed θ would be + (UV divergence ind. of r 1 , r 2 ). (4.31) 1 2 Figure 11: The black dots on the boundaries correspond to the boundary twist operators, and the black dots in the center are the marked points corresponding to the Z n fixed points. For large t, the entropy on the two intervals marked with wavy lines is approximately the same as the sum of the entropies on the two intervals taken separately.
In the present case we are interested in taking r 2 = 1. As r 2 approaches 1, the expression diverges. This reflects the divergent entanglement between the two black holes due to the coupling at the boundary. This is an IR divergence in the bulk and a UV divergence in the boundary. In principle, this should be cut off by smearing out the coupling between the two boundaries so that very highenergy modes are not coupled. Although we will not discuss the details of this regularization, it will lead to an expression of the form where the constant depends on the regularization. Now, the entropy that we actually need to compute is the one for two intervals, where each interval stretches between one of the Z n fixed points in (4.10) and one of the boundary twist operators. This problem simplifies in a late time Lorentzian configuration like the one shown in figure 11. As we increase t, the two intervals become far apart and uncorrelated with each other, and the entropy approaches the sum of two separate single-interval computations, In this formula, we assumed that the bulk and boundary twist operators were at the same Killing times. We won't work out the formula for different times, but we note that the action is stationary under first-order changes of these Killing times.

Saddle point
We can now put the gravity computation together with the bulk CFT computation and find a saddle point. For n near one, we have the off-shell contribution where the integral is over the locations of the two marked points in the disk geometry. In the case where the boundary twist operators have been continued to late time, this integral has a saddle point for a Lorentzian configuration of the two marked points, corresponding to the one shown in figure 11.
In principle, we should extremize the action over both coordinates. But for late time, the extremum will be such that the points x 1 and x 2 are at the same Killing time as the corresponding boundary twist operators. 26 We then have to extremize over the radii. The action is This action has an extremum at sinh(ρ 1 ) = sinh(ρ 2 ) = βc/24π 2 . This is analogous to the quantum extremal surface of [14,15,16,18]. Here we have shown how it arises from a n → 1 limit of a Euclidean wormhole calculation. 27

Comments on integer n
Finding saddle points explicitly for integer n > 1 seems to be be significantly more difficult than in the n ≈ 1 limit. One reason is that it is no longer possible to restrict to the SL(2,R) modes of the Schwarzian theory: a more general θ(τ ) is necessary. Gluing the two black holes together then involves a "conformal welding" problem. Without getting into the details of this, we would like to explain qualitatively why the Renyi entropy should be finite in the limit t → ∞, using an argument similar to one in [24]. We caution the reader that the discussion in this section and the next one 4.7 is preliminary. As a first step, consider the auxiliary computation (4.36) Here H 1+2 represents the Hamiltonian of the combined boundary dual of BH 1 and BH 2, including the interaction. In the first equality, we inserted some cancelling factors of e ±iHt , and in the figure at right we sketched a path integral contour that would compute this quantity. The solid and dashed lines represent the boundary duals of the two BH systems. The "caps" at the left and right end represent the Euclidean evolution by β and β , and the long horizontal portions represent the forwards and backwards Lorentzian evolution. This quantity is exactly independent of time, and it is helpful to imagine taking t to be very large. Then the contour has a time-translation invariance that is broken only near the endpoints. The gravity solution that fills this in will also have this property, and in order for the answer to be independent of t, this time-translation-invariant gravity geometry must have zero action per unit time. In fact, the relevant geometry is easy to identify; it is a piece of a thermofield-double-like geometry, which is invariant under forwards evolution on one side and backwards evolution on the other. The action is zero per unit time because of a cancellation between the two sides. 28 Now let's use this auxiliary computation. The contour for the Renyi 2-entropy is shown in figure  12(b) and 12(c). The long Lorentzian part of this contour is exactly the same as for two copies 26 This is because the calculation is invariant under time-reversal symmetry, once we ignore correlations between the two intervals. 27 In the setup described here, both entangled systems are gravitational, and we could consider another saddle point where BH 1 remains disk-like and BH 2 forms a pinwheel. By making the S 0 parameters of the two theories different, we can make one of these configurations dominate over the other. 28 We say TFD-like, because if β = β , the two boundaries will not be on opposite sides, but at some general angle. Also, this TFD-like geometry actually consists of two TFD-like geometries, one for each of BH 1 and BH 2. However, because of the interactions between the quantum fields, it is not exactly the same as two separate TFDs; in particular, the quantum state of the matter fields will be somewhat entangled.  Figure 12: In (a) we sketch the boundary time contours for two copies of (4.36), where the long horizontal part is the Lorentzian portion and the "caps" at the ends are Euclidean. In (b) we insert boundary twist operators (wiggly lines) in order to compute the purity Tr(ρ 2 1 ). In (c) we represent these twist operators explicitly by changing the pattern of connection of the contours. We can make a configuration with bounded action as t → ∞ by pasting the solution from the corresponding region of (a) into the shaded regions of (c). The configuration is thermofield-double-like in these two regions.
of (4.36), shown in figure 12(a). The difference has to do with the way the contours are connected together at the ends (and also the amount of Euclidean evolution at the ends). To find a configuration whose action does not grow with time, we can paste the solution from part (a) into the shaded region of (c), and then fill in the remainder in some way that is consistent with the new boundary conditions. Since this modification happens near the ends of the contours, it will cost an amount of action that does not depend on the length of the time interval, at least in the limit of large t. In two dimensions, the topology of the resulting gravity configuration will consist of a cylinder for BH 1 and two separate disks for BH 2.
This explains why there are gravity configurations for the Renyi entropy that have bounded action as t → ∞, but it doesn't show that there are classical solutions. A potential subtlety is the following. In (4.36) there are two parameters β, β . The action along the Lorentzian part of the contours will be zero for any values of these parameters. We expect that these are stabilized to saddle point values when we take into account the action penalty from gluing in the caps at the ends. We have not shown this reliably in gravity, but we will verify it in a related context by finding numerical solutions in the SYK model in the next section.

A factorization problem
In the context of the simple model, we argued in section 2.3 that the answers from the gravity path integral must be interpreted in terms of some implicit ensemble average. In order to make this argument, we showed that the gravity path integral predicted ψ i |ψ j = 0 but | ψ i |ψ j | 2 = 0. Can one say something similar for a more physical black hole?
In the simple model, |ψ i was the state of the black hole after projecting the radiation R onto a definite state |i . For a radiating black hole, an analog is as follows. We consider a case with only one black hole system, which starts out in a microcanonical version of the TFD state, called |E 29 (4.37) To project onto a state in which the black hole radiates a particular sequence of Hawking quanta, 29 For concreteness, we assume a width ∆ of order the thermal scale associated to energy E. Figure 13: In (a) we sketch the boundary time contours for computing ψ A |ψ A ψ B |ψ B . In (b) we sketch the contours for computing | ψ A |ψ B | 2 , and in (c) we rearrange them. In the shaded regions, we glue the relevant part of the solution from (a).
we could act on this initial state with a sequence of annihilation operators where i j refer to a particular sequence of Hawking radiation modes, and t j are the times at which these modes are extracted. Acting with a sequence of this type will lower the energy of the black hole, and eventually lead to evaporation. However, in order to make our arguments as sharp as possible, we would like the "evaporation" process to continue forever, so we will intersperse creation operators between the annihilation operators in such a way that the energy of the black hole remains constant. Concretely, we will evolve forwards in time on the R boundary of the two-sided state |E , applying a sequence of m creation and annihilation operators to various bulk field theory modes, roughly once per unit time, Here, for simplicity, we are taking the bulk matter theory to be a product of free field theories. It will be convenient to constrain O so that it has approximately zero conserved charges in the boundary theory. So, in particular, we balance the number of creation and annihilation operators so that when we act with O on a state, we do not change the energy significantly. Let's choose two different operators of the type (4.39), and call them A and B. Then we can consider states obtained by acting with these operators on the R system of the two-sided state |E : We expect that in JT gravity coupled to a bulk free theory, the inner product between these states will either be zero or will approach zero as we increase the number of operator insertions m, We can think of this inner product as the one-point function of A † B in the state |E . As described in section 6.2 of [26], we can get a nonzero answer for the square of this one point function, by considering a Euclidean wormhole with two boundaries. What we would like to do now is argue that the gravity answer for approaches a constant as the number of insertions m goes to infinity. The argument is very similar to the one we used in the last section to argue that Renyi entropies approach a finite answer as t → ∞. The basic point is that the long Lorentzian portions of the contour (where the operators act) is the same for the numerator and denominator of (4.42), see figure 13. By gluing the gravity saddle point for the computation of the denominator into the long Lorentzian parts of the computation of the numerator, we will find a finite answer, which is suppressed only by the action involved in gluing on the different set of "caps" at the ends. The resulting topology is that of a cylinder connecting the two asymptotic boundary circles, as described in [26]. 30

SYK computations with two replicas
JT gravity coupled to matter fields is not a UV complete theory, due to a divergence in the moduli space integral for long thin tubes. There is no reason to suspect that this is a problem for the current calculations, but to build confidence we undertake a related calculation in the UV complete SYK model [69,70,71]. We will numerically find a solution that gives a non-decaying contribution to the Renyi 2-entropy, which provides a check of the argument given in section 4.6.
We study the same physical arrangement as described in section 4, but with the black hole systems 1 and 2 replaced by SYK systems 1 and 2, interacting with each other in a way that will be described below. The goal is to compute the purity Tr(ρ 2 ), for system 1, as a function of time.
Using the replica-diagonal SYK saddle point, this purity will decrease exponentially in time forever (in a microcanonical ensemble). But in the exact theory, it must saturate at an exponentially small floor value. This is a good target for a nontrivial "wormhole" saddle point, and the interchange of dominance is the Renyi 2-entropy version of the Page transition.
This SYK setup, and the replica-diagonal saddle point, was considered previously by Gu, Lucas, and Qi in [72]. In order to find the "wormhole" saddle point, we will essentially import an appropriate modification of the "double cone" solution found in [24].
To start, define two separate SYK models, called 1 and 2, with N 1 and N 2 fermions, respectively. In order to compute the Renyi 2-entropy, we will need to consider two replicas of this system. So we will write a general fermion with three different indices, ψ α i,a Note that the number of fermions in the two physical systems a = 1, 2 are in general different, and the couplings J i 1 ...iq;1 and J i 1 ...iq;2 are independently drawn. However, the couplings do not depend on the replica index α. So far, the two physical systems are independent. We would like to introduce a coupling between them. In the computation of the Renyi entropy, different replicas will need to interact with each other, so we define a general operator with two replica indices α and α : Regardless of the replica indices, this operator always couples the two physical systems 1 and 2 to each other, and never to themselves. 32 30 In the microcanonical setting described here, we expect to find a stabilized solution based on this approach. In the canonical ensemble we do not expect one, based on arguments from [24,26]. 31 We take q to be a multiple of four to avoid some factors below. 32 We will takeq to be even, so that V preserves separate fermion parity symmetries on the two subsystems.  Figure 14: The time contour C, divided into C 1 and C 2 . Note that τ = β is identified with τ = 0, so C is closed. One swap operator is inserted at τ = 0 and another is inserted at τ = β/2 + it.
Let's now specify the physical setup more precisely. The system starts out in a thermofield double state for the combined interacting 1 + 2 system, with four subsystems 1L, 1R, 2L, 2R. For weak interactions, this thermofield double is a state with relatively little entanglement between the 1 system and the 2 system. (Most of the entanglement is between e.g. 1L and 1R, the two sides of the TFD.) We then evolve the systems forward in time, computing the entropy of system 1L ∪ 1R. The state of the combined system is invariant under forwards evolution on the R systems and backwards evolution on the L systems. We will evolve forwards on R only, leaving L alone.
After evolving for time t, we would like to compute the entropy of the 1L ∪ 1R system. This measures the entanglement between the total 1 system and the total 2 system. We expect this entropy to grow linearly in time, before saturating at a late time value determined by a nontrivial saddle point. More precisely, at time t, we will study the purity Tr (ρ 1 (t) 2 ) of the combined 1L ∪ 1R subsystem. This quantity can be computed by a path integral with two replicas, and with insertions of a swap operator that swaps the replica index of the 1L and 1R subsystems. We expect it to decay exponentially until saturating at a floor value.
Including the effect of the swap operators, the path integral for the purity can be written as where the action is [72] 33 Each system has two replicas, and we can either couple a particular replica of system 1 to the same replica on system 2, or to the opposite one. These two possibilities are realized on the two components of the contour C = C 1 ∪ C 2 . Switching this coupling at the transition between the two contours is equivalent to inserting swap operators at those transition points. It will be helpful below to write the full action with a condensed notation where g(τ ) is the identity matrix for τ in C 1 , and g(τ ) is the swap operator for τ in C 2 .
We can now compute the disorder average (over couplings J) of (5.4), by taking the couplings to be Gaussian random variables with mean zero and with As usual in SYK calculations, the result for the disorder average can be written in terms of G, Σ collective fields 34 Tr Note that one only needs collective fields that are diagonal in the physical system index a, although in general we need off-diagonal fields in the replica indices α, α . The action is explicitly The saddle point equations that stationarize this action are In the second line, we are using the notationâ to mean the opposite physical system, so if a = 1 then a = 2. As usual in SYK equations, the first equation is more complicated than it seems: for fixed a, the quantities G a and Σ a are viewed as matrices, acting on the vector space parametrized by τ, α.
The ∂ τ operator is also viewed in these terms, but is taken to be diagonal. These equations can be discretized on the contour (14), and solved numerically, using the standard iterative approach that is common in the SYK literature. There are two different types of solutions that will be important for us, and we will discuss them in turn.

The replica-diagonal solution
We start with the replica-diagonal solution shown in the top row of figure 15. This is the solution that was considered in [72]. As shown there, it has an action that initially grows linearly with time, implying an exponential decrease of the purity.
The growth in the action can be understood as follows. First consider the case with no interaction between system 1 and system 2, so J = 0. In this case, the twist operators have no effect, and action evaluated on the saddle point G, Σ configuration is exactly independent of time. This saddle point is simply the standard thermal solution G(τ ) of the SYK model for each of the four copies (two noninteracting physical systems, with two replicas each), analytically continued along the contour C:   On this solution, if we evaluate the action with J = 0, the answer must be exactly independent of t. Now, let's start with this solution and treat theĴ 2 term on the second line of (5.10) as a perturbation. Evaluating the action by plugging in the unperturbed solution (5.12), the second line of (5.10) works out to Here, the O(1) term represents a non-growing contribution as t becomes large. So the action will grow linearly in time, at least for while. This corresponds to an exponential decrease in the purity. This derivation is only valid for early times J 2 t/J 1, so that the interaction can be treated perturbatively. A subtle detail [72] is that in the canonical ensemble, this linear growth does not continue forever; instead, the action saturates at some finite value (see the left panel of figure 16). One might be tempted to interpret this saturation as the Page transition. However, what this saturation actually represents is the contribution of very low energies in the tail of the canonical ensemble, where the dynamics are slow enough that the two physical systems essentially do not evolve and entangle. 35 35 At the quantum mechanical level, the contribution of such states will still decrease with time (a power-law decrease This is a real effect in the Renyi 2-entropy, but such "tail" effects cannot significantly effect the von Neumann entropy (which isn't affected by tail effects the way Renyi entropies are), so we view it as a distraction. In order to avoid this subtlety altogether, we can work in the microcanonical ensemble. Then one finds that the action continues to increase linearly, as shown in the right panel of 16. So we conclude that in the microcanonical ensemble, the replica-diagonal saddle point gives an answer for the purity that exponentially decays forever.

The wormhole solution
The other class of solutions can be motivated by the discussion in section 4.6, which suggests that there should be a solution with an approximately time-translation invariant portion corresponding to a TFD-like correlation between different replicas, glued in some way to a configuration with the correct boundary conditions at the ends of the contour. We refer to this as a wormhole solution because its pattern of correlation is the same as that of the wormhole geometry from section 4.6.
In order to find such a solution numerically, the procedure we followed was as follows. For the first few iterations of the Schwinger-Dyson equations, we included an explicit source in the SYK equations that encourages replica-off-diagonal correlations like the ones expected based on section 4.6. After a few iterations, we then set the source to zero and continued iterating until convergence. If the time t is larger than a critical value (discussed below), we found that the iterations converged to a nontrivial solution like the one shown in figure 15. The pattern of correlations in this solution is precisely the one expected based on the argument in section 4.6.
The numerical value of the action is independent of time to a good approximation, see figure 16. 36 To reiterate the discussion from section 4.6, this can be understood as follows: as we make the time larger, the only aspect of the solution that changes is the TFD-like portion in the middle of the time contours gets extended. Since this TFD-like configuration has exactly zero action, the action does not change. 37 in the purity), but we do not see this at the level of the classical action. 36 To compute the action accurately, we used extrapolation in the size of the matrices that represent the discretized G, Σ variables. The maximum size we used was 1600 × 1600 each for system 1 and system 2. To speed up convergence, one can start with a refinement of the converged solution for smaller grid size. 37 Note that the SYK action is not local in time, but for solutions that decay exponentially in time like these, it is Since the action of the replica-diagonal solution increases linearly (in the microcanonical ensemble), there will eventually be a transition between the two solutions. For the setup that we have described here, the transition is at a time that is independent of N , but proportional to 1/Ĵ 2 .
It would be desirable to understand this solution better, since it seems to involve several interesting aspects of thermalization and chaos. One example of this has to do with the critical time at which the solution first starts to exist. Empirically, this is based on the following: near the endpoints of the contour, the solution needs to have a "normal" pattern of correlation, in which contours C 1 and C 2 of each replica are highly correlated with each other. However, as we move away from the endpoints, this pattern of correlation is replaced by correlation between C 1 and C 2 of opposite replicas. The decay of the first type of correlation appears to be due to a scrambling effect, sourced by the perturbation due to the coupling between the systems,Ĵ. Based on this, one would predict that the critical time at which the "wormhole" solutions start to exist is logarithmic in the couplingĴ t first exist ∝ log(J/ J). (5.15) This appears to be consistent with numerics (not shown).

Entropy and replica wormholes in de Sitter
In this section, we will tentatively discuss some entropy computations using replica wormholes in de Sitter space. 38 Our starting point is the no-boundary proposal [73] for the wave function of de Sitter space, or more precisely its generalization in [74,75,48,76] to a no-boundary proposal for the density matrix. In this version, we compute a density matrix for the universe by summing over all geometries that end on the boundary conditions for the bra and ket vectors of the density matrix.
In this sum, one can have separate disconnected geometries attached to the bra and ket (these terms would also be included in the original no-boundary proposal) but also connected geometries in which the bra and ket are distinct boundaries of a single connected spacetime. One can also generalize this further to a no-boundary proposal for the tensor product of n copies of the density matrix ρ. In this case, we have n bra-type boundaries and n ket-type boundaries, and we sum over all spacetimes (connected or otherwise) that end on these 2n boundary conditions. This immediately leads to a somewhat surprising conclusion. Naively, it would appear that connected geometries will lead to a mixed density matrix. However, to check this, let's compare Tr(ρ n ) and Tr(ρ) n . In both cases, the boundary conditions consist of 2n boundary components: n ket-type boundaries and n bra-type boundaries. To compute either quantity, we identify these boundaries in bra-ket pairs, and integrate over the mutual boundary conditions for each pair. Tr(ρ n ) and Tr(ρ) n correspond to two different ways of pairing up the 2n boundaries. However, the noboundary rules described above are invariant under arbitrary permutations of the ket-type boundaries and the bra-type boundaries. So Tr(ρ n ) = Tr(ρ) n and the state is pure.
This seems at odds with the fact that the no-boundary answer for a single copy of the density matrix will be mixed. As in the discussion of black holes in section 2.3, a possible interpretation is that the gravity path integral is describing an ensemble of theories in which the state of the universe is pure. The average of the density matrix is the mixed no-boundary density matrix. But the average of the entropy is zero. (Another possible interpretation is that our no-boundary prescription for ρ ⊗n is too aggressive. But we will keep it for the moment and see where else it leads.) local enough for this argument to apply. 38 We thank Jorrit Kruthoff, Juan Maldacena, Mehrdad Mirbabayi and Eva Silverstein for discussions about de Sitter space.
So far, we have only discussed the entropy of the whole universe. The entropy of subsystems is more interesting, but more difficult to compute. So we will study a simple model inspired by the end-of-the-world brane model for near-extremal black holes in section 2. This model is a two dimensional nearly de Sitter space described by Jackiw-Teitelboim gravity with positive cosmological constant [48,77]. We take the asymptotic time slice to be a segment that starts and ends on two EOW branes with a large number of orthogonal internal states. The boundary conditions for an unnormalized ket vector look like the following ij|Φ = (6.1) Here, we are imagining that we measure the length of the spatial interval to be some renormalized value , which is held fixed in the rest of the discussion. The density matrix ρ consists of two copies of this boundary condition, which can be filled in as follows: The fact that there are multiple topologies already for the density matrix makes the story in de Sitter space richer than for the black hole. Note that due to the second term, this is a mixed state. Here, in nearly de-Sitter gravity, the path integrals are given by analytic continuation of JT gravity path integrals, see [48,77] (or more generally [78,79]) for details of this analytic continuation. For the simple case where the brane is massless, we have In order to compute e.g. Tr(ρ), the primed and unprimed indices will be contracted. The first diagram will contribute e 2S 0 k, and the second will contribute e S 0 k 2 . So if we take a very large value of k, there can be an interchange of dominance between the two. We can now compute the entropy of the left brane degree of freedom, corresponding to the i-type index. We will refer to its density matrix as σ. For small k, a completely disconnected topology dominates, and one finds Tr(σ n ) Tr(σ) n = k| Z 1 | 2n The von Neumann entropy will be log(k). In the opposite limit k e S 0 , the fully connected geometry dominates, both for the computation of Tr(σ n ) and for Tr(σ) n . We find The s integral just gives an order one coefficient, so the answer is roughly e (1−n)S 0 and the von Neumann entropy is approximately S 0 . So there is a "Page-like" transition in our simple de Sitter model. As we increase the number of EOW brane states, the von Neumann entropy of one of the branes has a transition from the naive result log k to the de Sitter entropy S 0 . Although the EOW brane setup is rather artificial, this does give a hint at a microscopic role for the de Sitter entropy [80] (see also [81,82,83,84]).
It would be interesting if a similar effect could be seen in more physical setup in which the large number of states k emerges naturally from some bulk computation, rather than being put in by hand as we did here. A starting point could be the "centaur" geometry [85] or possibly the large number of states produced by quantum field theory evolution over many e-folds, which was previously considered in the context of the de Sitter entropy in [86,87]. 39

Wormholes in non-averaged systems
The arguments in this paper use spacetime wormhole geometries in an essential way. But the results in section 2.3 for the overlap of individual black hole microstates |ψ i computed using such wormholes seem only to be consistent if they are interpreted as ensemble averages. In this case it seems natural to interpret the wormholes as part of an effective description, not a microscopic one. They do not know about the exponentially large amount of microscopic data contained in the fluctuating phases R ij in the non-averaged matrix elements ψ i |ψ j = δ ij + e −S 0 /2 R ij . We should stress that the inconsistency in overlaps that is resolved by an averaged description is not limited to the simple model discussed in section 2. As discussed in section 4.7, we expect a similar situation for the radiating black hole.
The question we want to address here is what role wormholes play in systems without averaging. We would like to describe a somewhat analogous situation in semiclassical quantum chaos, which may provide some guidance [25]. Consider a few body quantum chaotic system, like a quantum billiard. Semiclassically, matrix elements in the position basis can be written as sums over classical trajectories connecting the bra point to the ket point. This should allow an analysis of the overlap puzzle. A simpler situation that illustrates many of the same ideas is to consider the Minkowski signature partition function Tre −iHt which can be written semiclassically as a sum over periodic orbits a. The product of two such partition functions, the spectral form factor, can be rewritten as a double sum over periodic orbits, which we can express schematically as follows: Here S a is the classical action of orbit a. The spectral form factor obviously factorizes into the product of partition functions -this requires a double sum over orbits. On averaging, 40 only the "diagonal" terms corresponding to the same orbit in each sum (up to a time shift) survive. This gives the ramp. 41 After the pairing connection is made factorization is lost, as expected for an averaged system. In this picture the diagonal pairing pattern is an effective, coarse-grained description of the exponentially large number of long, diagonally paired orbits, multiplied by their exponentially small one-loop determinant.
Let's now try to make an analogy to quantum gravity. We view the quantum Hamiltonian of the billiard, H as the "boundary" description. The sum over orbits would be the microscopic "dual bulk" description. The diagonal pairing pattern in the coarse-grained sum over long orbits we take to be the analog of the wormhole geometry in the gravitational context. 42 It is well-known that wormholes conflict with the factorization of e.g. partition functions of decoupled systems [90]. In a non-averaged situation where factorization must hold, what is one supposed to do with the wormholes? The solution the periodic orbit analogy suggests is related to one already offered in [90]. To restore factorization in the non-averaged theory, one doesn't eliminate the paired diagonal terms corresponding to the wormhole. Instead one adds back in all the other unpaired off-diagonal terms. So to have a gravitational bulk understanding of non-averaged theories we need a gravitational bulk understanding of these off-diagonal terms. These might well not have a simple geometrical description, even an effective one.
The periodic orbits are defined in the microscopic phase space that semiclassically determines all the microstates of the quantum system. So a variant of the issue at hand is to have a gravitational bulk understanding, geometrical or not, of all the microstates of the system. This is related to the fuzzball program. 43 On occasion another idea has been suggested: wormholes in non-averaged decoupled systems could be ruled out because they are not actual saddle points. However, in JT gravity, the wormhole describing the ramp is a saddle point in the microcanonical ensemble [24,26]. We see no obstruction to the existence of such microcanonically stable wormholes in more complicated higher dimensional gravitational theories dual to non-averaged systems [24]. But they would give the wrong answer in a non-averaged theory. 44 They do not describe the erratic fluctuations due to the fine-grained structure of energy levels that we expect, as illustrated for example in figure 10 of [94]. These erratic fluctuations are not a small effect -their magnitude is of the order of the signal itself. So the existence of a saddle point is not a sharp criterion for including wormholes. 45 Again, we stress that the periodic orbit analogy tells us that the off-diagonal contributions are responsible for the erratic behavior. Any complete gravitational bulk description must contain a description of the analog of these off-diagonal terms.
In the periodic orbit example we proposed that the wormhole geometry is analogous to the diagonal pairing pattern, an effective, coarse-grained description of which orbits contribute. This raises the question of whether bulk geometry in general is only an effective description of some different, more fundamental degrees of freedom -the analog of the periodic orbits. Another possibility is that the fundamental bulk description contains "geometric" degrees of freedom, like perturbative strings, in addition to other non-geometric ones -complicated configurations of large numbers of branes, for example. It is even conceivable that geometry could actually describe everything, in some subtle way as yet not understood. 46 The current situation, though, is that there is no known bulk description of a gravitational theory that is rich enough to include the microscopic information necessary to explain, for example, the erratic behavior in the spectral form factor. The nature, or perhaps even the existence, of such a bulk description remains one of the deepest mysteries in quantum gravity.
We now turn to a more pragmatic question: if spacetime wormholes are only an effective descripanalogy to the microstate overlaps discussed in section 2.3 would be to compute the averages of individual density matrix elements, not purities. 43 For reviews see [91,92]. For a critique of this program see [93]. 44 We thank Phil Saad for discussions on this point. 45 It would be interesting to find some internal signature of this failure within the geometric bulk theory. In the presence of bulk matter there is a UV "Hagedorn" type divergence at small wormhole diameter [25] that may have some bearing on this issue. 46 An impressive example of microstate information being completely described by geometry is the elegant calculation of [95] of the exact entropy of certain supersymmetric black holes. The exact integer degeneracies are recovered from a sum over an infinite number of instanton corrections computed using supersymmetric localization. What would be required in the present context is an analogous result that would explain the much more complicated pattern of energy levels present in these non BPS chaotic systems as a sum over gravitational configurations.
tion, are such configurations useful in non-averaged bulk theories? We believe the answer is clearly "yes." For example, the entropies computed in section 2 are "self-averaging." This means that they have small variance in an averaged theory, basically because they are sums of large numbers of fluctuating terms. This variance can be computed from the bulk by considering additional wormholes linking the two copies used to compute the variance. Roughly speaking, self-averaging quantities are those where the "off-diagonal" terms make a small contribution compared to the diagonal ones. 47 We expect that a wormhole calculation of these self-averaging quantities will be quite accurate even in a single realization taken from the ensemble of theories, and hence in a non-averaged system. But we emphasize that wormholes will not give the exact answer. Worse, without understanding the bulk origin of the off-diagonal microscopic effects, there is no clear procedure to systematically improve the calculation into an arbitrarily accurate one.
Systems with a direct coupling between subsystems like those discussed in [96] are another example where self-averaging behavior occurs. Here again a wormhole calculation will be useful, even in a non-averaged theory. But again, a precise calculation would require the microscopic information.
In the absence of a complete microscopic description how is one to decide whether to trust a wormhole calculation in a non-averaged theory? An empirical test might be the following. Pretend the theory is part of an ensemble and compute the variance of the quantity of interest, again using wormholes. If it is small, the wormhole calculation should be trustworthy. If it is of order of the signal, beware! z 1 = z 2 = 1, x 1 = 0, we get (x 2 = e 2 ): For computation of Z n>1 , we can first gauge fix x 1 = 0, x 2 = 1, x 3 = 2, the Faddeev-Popov determinant gives: By defining the regularized length ij = 2 log(|x i − x j |) − log z i − log z j , we can introduce a new basis { 12 , 13 , 23 ... 1n , n−1,n }, which forms a triangulation of a hyperbolic polygon: Its Jacobian matrix can be shown to be a lower triangular matrix: whose determinant is: This gives the measure:

B General two dimensional dilaton gravity
Near-extremal black holes are universally described by Jackiw-Teitelboim gravity with negative cosmological constant, and therefore they also have replica wormhole configurations where the area of the transverse direction is the dilaton field (together with S 0 ). It is interesting to ask whether replica wormholes exist for more general classes of black holes, especially those far from extremality. Assuming spherical symmetry, general gravitational systems can be reduced to two dimensional dilaton gravity [97], so we are led to consider JT gravity with a more general dilaton potential in our simple model: Again we assume the brane follows a geodesic. And again we entangle the brane states with the radiation, and consider the behavior of the entropy of the radiation when we vary k. Note that flat space black holes (Rindler space) can be described in this framework using the special case of a constant dilaton potential. While in general understanding euclidean wormholes requires a good understanding of off-shell geometries, in the planar limit only knowledge of the path integral on the disk topology Z n is required. The derivation of the Schwinger-Dyson equation still applies since it only depends on the topological action and k. Moreover in the heavy brane case, the branes become local near the boundary and Z n becomes proportional to the disk partition function of boundary length nβ. 48 Since the effects of the heavy branes are local, they cancel out in the ratio of Zn In general the exact form of ρ(E) is not known and one can only solve the equation in the classical limit. Using the classical thermodynamic relation for general dilaton gravity (Appendix E.3 in [48]): is called the prepotential, we can write down the general semiclassical resolvent equation: In the microcanonical case, we will still get Page's result. In the canonical ensemble case, one needs to analyze the equation based on W (φ h ) and we expect that most of our analysis will still apply.

C General entanglement-wedge reconstruction using Petz
In this appendix, we will indicate how the argument in sections 3.4 and 3.5 extends to the case of general entanglement-wedge reconstruction. We start with a code space of bulk field theory excitations |a FT around some particular background, and an operator O FT acting in this space. |Ψ a is the boundary CFT state corresponding to |a , and O is the CFT operator corresponding to O FT . If A is a subregion of the boundary theory, then the Petz map gives a guess for the reconstruction of O on region A: Defining a replica version as in (3.19), we have In the bulk dual, the RHS of (C.2) is a gravitational path integral with operator insertions to prepare the different state |Ψ a in the code subspace. At order G −1 N , the answer doesn't depend on these operator insertions, and it reduces to the gravitational path integral for the Renyi (2n + 2)-entropy, which we will refer to as Z grav 2n+2 . At order G 0 N , we have a field theory computation on this fixed background, which we can write Ψ a |O (n) This formula requires some explanation. The replica-symmetric geometry M 2n+2 that dominates Z grav 2n+2 has a codimension-two surface Σ that is fixed by the cyclic replica symmetry. We divide M 2n+2 into 2n + 2 equal pieces by cutting along codimension-one surfaces A n that connect the 2n + 2 copies of region A on the boundary to Σ. Each surface A n can be understood as a backreacted Renyi version of the Cauchy slice of the entanglement wedge of A. They intersect Σ with equally spaced angles θ = 2π/(2n + 2). The operator M (a, b; θ) is defined as the path integral on the geometry between two of these cuts, with boundary conditions that include the operator insertions for the states |Ψ a and |Ψ b . In the limit where we take the total number of replicas to one, we find the simple formula M (a, b; 2π) = Tr A |a b| FT . (C.5) As in section 3.5, one can now take the n → −1/2 limit, and find where the RHS is an auxiliary Petz map computation, defined purely in the bulk field theory. A is the Cauchy slice of the entanglement wedge of A, and is the Petz map for the channel corresponding to erasure of the complement of the entanglement wedge. In particular, for the case where O FT acts within the entanglement wedge A to a good approximation, then O A = O FT , and reconstruction succeeds. We also note that there is a more general version of the Petz reconstruction map, defined using a fixed, but arbitrary, state ρ that is not necessarily maximally mixed. In this case, we define O (ρ) It is easy to see, by analogous arguments to those above, that this reconstruction reduces in the bulk to the field theory Petz reconstruction O (ρ) An advantage of this more general construction is that it can be made well-defined even for infinitedimensional code spaces, where the maximally mixed state does not exist. However, one has to be somewhat careful here: if the code space includes degrees of freedom outside the entanglement wedge (i.e. we have a subsystem rather than a subspace code), the field theory Petz map, constructed using an arbitrary, non-maximally mixed state ρ, will not necessarily recover the original operator. Even if we use the twirled Petz map, the reconstruction is only guaranteed to work if the code space state ρ has no entanglement between the inside and outside of the entanglement wedge [39].
In an infinite-dimensional code space (such as the entire field theory Hilbert space), such product states do not necessarily exist. However, by using a thermal state at very high temperature, we can make the entanglement arbitrarily short range. We should then expect that the field theory Petz map will recover the original operator with high accuracy so long as the inverse temperature β is much smaller than the distance from the original operator to the edge of the entanglement wedge. This can be verified in simple cases where the action of the field theory modular flow (and hence the field theory Petz map) is known explicitly.
Finally, we emphasize that the definition of the Petz map reconstructions relies on being able to create arbitrary states in the code space using gravitational path integrals. For most situations of interest, this is not a problem. For example, if our code space is the state of a diary that was thrown into a black hole, we can easily construct arbitrary states in the code space simply by changing the state of the diary, before it was thrown into the black hole.
If we want to reconstruct the interior partners of Hawking quanta -to understand, for example, whether there is a firewall [29] -the situation is somewhat more complicated. We cannot directly manipulate the interior modes because they become trans-Planckian when evolved back in time. Instead, they are always produced, together with the Hawking quanta themselves, in a fixed, entangled state |ψ 0 . However, we can use the gravitational path integral to manipulate the state of the Hawking quanta. Because the state |ψ 0 has maximal rank, using these manipulations, we can produce an overcomplete basis of states for the Hawking quanta and interior partners.
By taking linear superpositions of such path integrals, we can therefore construct arbitrary states in code spaces that include interior partner modes. We can use these path integrals to construct Petz map reconstructions of interior modes, after the Page time, that act only on the Hawking radiation.
This construction assumes that the interior modes were initially in the state |ψ 0 , i.e. that there wasn't a firewall. Indeed, it has always been the case that gravitational calculations implied the absense of a firewall. The new result that one can see using the Petz map is that gravitational calculations also imply that the interior modes can be reconstructed on the Hawking radiation, i.e. ER=EPR [30,31,32,33,34].

D A ensemble boundary dual of the simple model
In this appendix, we show that the simple model, with pure JT gravity plus EOW branes, is dual to a boundary ensemble of theories, just like pure JT gravity [25]. This ensemble of theories is defined by a randomly chosen Hamiltonian H, and a set of randomly chosen special states |i (the brane states). The dual bulk theory is valid to all order in e −S 0 . It includes nontrivial bulk topologies, including topologies with handles (suppressed by powers of e −S 0 ). However it doesn't include any contributions of EOW brane loops. All brane world lines have to begin and end on the boundary. This provides some justification for the fact that we ignored the possibility of end-of-the-world brane loops in all our bulk gravity calculations in this paper.
We now give a precise definition of the ensemble of boundary theories. First, the Hamiltonian H is chosen from the usual JT gravity ensemble of Hamiltonians, as in [25]. Let the eigenstates of this Hamiltonian be labelled |E a . Then the brane states are chosen to be where the coefficients C i,a are i.i.d. complex Gaussian random variables.
Let us see why this works. Our aim is to show that expressions of the form Tr(e − βnH ) , where the expectation is over the ensemble of Hamiltonians and states (D.1), matches a bulk computation in JT gravity with the following boundary conditions. We have p asymptotic boundaries that are intervals of renormalized lengths {β m } bounded by EOW branes, and we have q standard S 1 boundaries with renormalized lengths { β n }.
We now simply integrate out the Gaussian random variables C i,a . We find that (D.2) equals Tr(e − βnH ) Here, we are summing over permutations π on p elements that take m to π(m) and the subsets γ ⊆ {1, 2, . . . , p} are elements of the set c(π) of cycles of the permutation π. This formula no longer involves the brane states |i . We can therefore hope to make contact with the arguments from [25] for evaluating products of partition functions, in this ensemble of Hamiltonians, using bulk JT gravity path integrals. Of course, (D.3) isn't quite in the right form that would allow us to do this, since it includes insertions of |Γ(µ − 1 The expectation values in this formula can be evaluated using bulk JT gravity by considering all topologies (including topologies with handles) of JT gravity with asymptotic boundaries of renormalised length m∈γ (β m +β m ) for each cycle γ in the permutation π, as well as additional asymptotic boundaries of renormalised length β n for each n.
Once we include the sum over permutations π, this is exactly the set of topologies that appear in the simple model. The different permutations correspond to the different ways of pairing up brane start and end points. The factors of δ i π(m) ,jm ensure that we only get nonzero contributions when the state of each brane is the same at both ends; it tells us that the branes have no dynamics, as desired.
We are still not quite done however. In this formula, we consider geometry of fixed asymptotic length, and integrate over different lengths. In contrast, in JT gravity with branes, the topologies are partially bounded by geodesics, which contribute the EOW brane action µl. Fortunately, as argued in [58,26], any boundary segment of length β m can be replaced in the JT gravity calculations by the insertion of the Wheeler-de Witt wavefunction at the homologous, non-selfintersecting bulk geodesic χ. In the l basis, this Wheeler-de Witt wavefunction is given by as in (2.31). When we rewrite the JT gravity calculation in this way, the only dependence of the expectation value in (D.5) on the auxiliary variables β m comes in the choice of wavefunction that we insert at the geodesic.
We can now do the integrals over β m explicitly by simply taking a superposition over these Wheeler-de Witt wavefunctions. After doing this superposition, we find that the full wavefunction on each geodesic is given by Using (A.1) and Eqn. 25 of [98], which is the orthogonality condition for the l-basis of wavefunctions, we find This is just the insertion of a brane wavefunction on the geodesic χ. We have therefore recovered the bulk theory of JT gravity with EOW branes. We can now use the random state formula (D.1) to write a random matrix formula for the density matrix ρ R in the state (2.5): The matrix H is drawn from the double-scaled matrix integral dual to ordinary JT gravity. The matrix C is a rectangular k × ∞ complex matrix with Gaussian random entries, rescaled so that the density matrix is correctly normalized.

E Fixed Area States and Random Tensor Networks
In this appendix, we show that the non-perturbative corrections to the Ryu-Takayanagi formula from additional extremal surfaces are identical in a) "fixed-area" states in AdS/CFT [60,61] and b) random tensor networks [89]. This is consistent with the intuition from [60,61] that fixed-area states are analogous to tensor network states, since both have flat entanglement spectra at leading order. However, our calculations suggest that the connection to random tensor networks specifically (as opposed, for example, to perfect tensor networks [99]) goes much deeper than might previously suspected, with non-perturbative instanton corrections that are the same in both models. Note: this appendix was added in v2; similar results were derived independently in upcoming work by Akers, Faulkner, Lin and Rath [100]. We first calculate the von Neumann entropy, including non-perturbative corrections, for the simplest possible random tensor network: a bipartite pure state with subsystem Hilbert space dimensions d B and dB. (We assume for notational convenience that d B < dB.) This is, of course, just the famous Page calculation [2]. Let us review how the answer can be found using the replica trick.
We first evaluate Tr(ρ n B ) , where the expectation is over the Haar measure for the random state. Using the standard result where the sum is over permutations π on the n copies of the system and π B and πB are operators that permute the n copies of their respective subsystem, we find Tr(ρ n B ) = Tr(ρ n B ) ψ|ψ n = π Tr(τ B π B )Tr(πB) π Tr(π B )Tr(πB) where τ is any fixed cyclic permutation and C(Π) is the number of cycles in the permutation Π. Here we focus on the limit where d B and dB are both large but their ratio is arbitrary. In this limit E.2 simplifies to where the sum is now over permutations in the set G of permutations such that C(π) + C(τ • π) takes its maximal value, which is n + 1. These are the permutations that lie on a geodesic (shortest paths in permutation space, where each step is a transposition) between the identity permutation and the cyclic permutation τ . Such permutations are in one-to-one correspondence with the set of non-crossing partitions. We note that, in the limit of large d B , dB, fluctuations are suppressed and so the expectation over states is unnecessary. The number of non-crossing permutations such that C(τ •π) = k is equal to the Narayana number, N (n, k). on n. However, we can use the fact that the binomial coefficients vanish for all k > n to extend the sum from k = 1 to infinity. The expression is then easy to continue in n, and we find The first term comes from the identity permutation (k = 1) and the second comes from the permutations consisting of a single transposition (k = 2), which give the leading correction in the limit dB d B . We therefore find Naively, one would expect additional subleading corrections from permutations that are suppressed by higher powers of d B /dB, as there were for Tr(ρ n B ). However, one can check that for k > 2, the continuation of the Narayana number N (n, k) near n = 1 is proportional (n − 1) 2 , which does not contribute to the von Neumann entropy. Morally speaking, the von Neumann entropy is "oneinstanton exact".
We note that much of the above discussion extends to more complicated random tensor networks. For example, when evaluating Tr(ρ 2 B ), one ends up with the partition function of a classical Ising model. However, this will not be necessary for our present purposes. See [89] for more details.
Instead, we shall move on to discussing fixed-area states in quantum gravity. For concreteness, we shall mostly focus on the example of two disjoint boundary regions (which we collectively denote by B) in AdS 3 , although the story is completely general. In this case, there are two extremal surfaces, one homotopic to the region B (with area A 1 ) and the other homotopic to the complementary region B (with area A 2 ).
We consider states where the area of both surfaces has been measured, and so saddle points of the gravitational path integral can have conical singularities at the extremal surfaces, so long as the deficit angle is constant everywhere on the surface. As discussed in [60,61] and section 3.2, this means that the replicated geometries used to calculate Tr(ρ n A ) are very simple: they are just n copies of the original unbackreacted geometry, with different sheets glued together around the extremal surfaces.
More specifically, as shown in figure 17, the boundary conditions used to compute Tr(ρ n B ) mean that the bottom half (the 'ket') of the bulk geometry near region B needs to be glued to the top half of the bulk geometry on the neighbouring sheet. Passing upwards through this region applies a T U U Figure 18: A simple random tensor network, with a single random tensor |T and three bulk legs. One bulk leg flows into |T , while the other two are mapped directly into the boundary Hilbert spaces B andB.
cyclic permutation τ to the different sheets. In contrast, when upwards through the bulk geometry near regionB, one must remain on the same sheet. What about the central region, between the two extremal surfaces? This is not fixed by the boundary conditions, and so there should be saddles associated to each possible permutation π of the sheets.
What is the action of each of these saddle points? We first note that, when the state is correctly normalised (dividing through by Tr(ρ B ) n ) the contribution to the gravitational action from everything except the conical singularities cancels between numerator and dominator because the geometry is unbackreacted. The conical singularites themselves give a contribution to the action of (φ − 2π)A/8πG N , whereφ is the angle around the conical singularity. If the extremal surfaces had conical singularities with angles φ 1 and φ 2 in the unreplicated geometry, the contribution to the action in the replicated geometry is After normalisation, the dependence on φ 1 and φ 2 vanishes and we are left with the final result Note that, since A 1 and A 2 are individually divergent, all permutations π that do not minimise (C(τ • π) + C(π) are infinitely suppressed, even at finite G N . This is exactly the result that we found for the bipartite random pure state, except with d B and dB replaced by exp(A 1 /4G N ) and exp(A 2 /4G N ) respectively. It therefore follows that the nonperturbative corrections to the von Neumann entropy for the fixed-area state are identical to the corrections that we found above for the bipartite random pure state.
Finally, we comment briefly on what happens when bulk degrees of freedom are added in each case. For a much more careful analysis of perturbative correction in fixed-area states see [101]. If we treat the gravitational path integral semiclassically, the Euclidean field theory path integral in the bottom-half of the geometry prepares a particular state |ψ bb b of the bulk quantum fields. Here b is the bulk region near B,b is the bulk region nearB and b is the central region between the extremal surfaces. The field theory partition function on the full replicated geometry then evaluates to where the operators τ b and π b permute the subsystems in regions b and b respectively.
What about if we add bulk legs to the random tensor network, as shown in figure 18? The answer is exactly the same thing. If we input some bulk state |ψ bb b and then evaluate Tr(ρ n B ) by replacing the random tripartite states |T by a sum over permutations π, as in (E.3), we find that the term associated to each permutation has an additional factor that is exactly (E.8). The analogy between fixed-area states and random tensor networks continues to hold.

F The Page transition in the simple model
The aim of this appendix is to comprehensively analyse the full Page transition in the simple model. Despite the simplicity of this model, the structure of the transition turns out to be somewhat complicated. Different features have transitions at different times, creating a number of distinct "subphases".
As a simple example, the transition happens earlier for larger n Rényi entropies, creating an infinite number of different Page transitions. We will ignore the Rényi entropies, and instead focus only on changes in either (a) qualitative features of the shape of the entanglement spectrum, or (b) in the von Neumann entropy.
Nevertheless, we still find seven distinct phases within the transition, as the entanglement spectrum slowly switches from a flat spectrum, with all eigenvalues equal to 1/k, to a thermal spectrum, with entropy given by the Bekenstein-Hawking entropy of the black hole. Because the details of the calculations in this appendix are fairly technical, we first summarise the main conclusions about each phase, and then proceed to analysing each phase in detail.

Assumptions and Notation
In this appendix, we shall always assume that we are in the semiclassical limit β 1. 49 At certain points, we shall also assume, for convenience, that the brane mass µ 1/β. We first formalise some notation. Recall that Z n = e S 0 dsρ(s)y(s) n . (F.1) We use notation where the saddle point value of s for Z n is denoted by s (n) . In the limit µ 1/β, we have ρ(s) y(s) n ∼ s 2π 2 y(0) n e 2πs−nβs 2 /2 .
There is one more saddle point that will be relevant for our calculations. This is the saddle point for e S 0 dsρ(s) 2 y(s), (F.4) 49 Recall that, in our units, this corresponds to the limit G N → 0.
which we denote by s . For µ 1/β, s = 4π/β. Note that we have s > s (1) and s (n) > s (m) for n < m.
For a particular choice k, there are two other values of s which will be important. The first, denoted by s k , is the value of s corresponding to the kth largest thermal eigenstate, as defined in (2.39). In the semiclassical limit, this is defined by The second, which we denote by s k labels the thermal eigenstates with eigenvalue 1/k. This is defined by Note that the normalisation of the thermal state ensures that we always have s k < s k . Finally, an important value of k, which we shall denote by k 3→4 , is defined to be the smallest value of k for which For µ 1/β, we have s k = 2π(2s k /β − 2π/β 2 ) + o(1/β) and hence s k 3→4 = (4 − 2 √ 2)π/β + o(1/β). More generally, we expect that s (2) < s k 3→4 < s (1) . Note that for k k 3→4 , the left hand side of (F.7) is much smaller than the right hand side.

Summary of the phase structure
At small k, specifically k Z 2 /y(0) 2 (phase I), the shape of the entanglement spectrum is a narrow semicircle distribution, of width 4 Z 2 /kZ 2 1 , centred on 1/k. The von Neumann entropy is approximately equal to log k, with a small correction that comes from the finite width of the semicircle. This is exponentially small with respect to (S BH − log k).
For Z 2 /y 2 (0) k e S 0 ρ(s (2) ) (phase II), the entanglement spectrum is dominated by the same narrow semicircle distribution. The width of this semicircle continues to give the leading correction to the von Neumann entropy. However, there is now a small tail of much larger eigenvalues, that emerge out of the semicircle as k increases. This tail looks like the thermal spectrum, except all the eigenvalues are increase by a constant shift of δλ = 1/k. When e S 0 ρ(s (2) ) k k 3→4 (phase III), the original semicircle disappears. The spectrum looks like a thermal spectrum, shifted by δλ = 1/k, which is then cutoff after O(k) eigenvalues. We do not have good analytic control over the cutoff region itself. The von Neumann entropy is dominated by eigenvalues near the cutoff, and is still approximately equal to log k. The largest correction to the entropy comes from the width of the cutoff region and has magnitude O(k 2 y(s k ) 2 /Z 2 1 ). When k 3→4 k e S 0 ρ(s (1) )e −O(1/β) (phase IV), the leading correction instead comes from the eigenvalues in the shifted thermal spectrum, and has size O(e S 0 ρ( s k )/k). Unlike in phase III, this correction is well controlled analytic; its exact size is given by a simple integral.
The actual Page transition (phase V) for the von Neumann entropy happens at k ∼ e S 0 ρ(s (1) ). At this point, the shift in the thermal tail of eigenvalues starts decreasing in order to preserve the normalisation of the state. The shift is now (1 − p k )/k, where p k is the probability of the thermal ensemble being in one of its k largest eigenvalues. The von Neumann entropy is well approximated, up to a correction that peaks at O(β) size near the transition, by the entropy of a shifted thermal spectrum, with a hard cutoff after exactly k eigenvalues. This gives a large O(1/ √ β) correction compared to the naive Page curve where S BH is the black hole entropy in the canonical ensemble. This is parametrically larger than the correction for the microcanonical ensemble, because of the fluctuations in the energy.
As k continues to grow, the shift in the thermal part of the spectrum decays, until it becomes neglible for all eigenvalues where the spectrum actually looks thermal. The entanglement spectrum is now the ordinary unshifted thermal spectrum, except that it is smoothly cutoff after k eigenvalues. The von Neumann entropy is approximately equal to the black hole entropy S BH .
Initially, the largest correction to that entropy still comes from the existence of the cutoff (phase VI) and has size However, when k e S 0 ρ(s ) (phase VII), this correction is smaller than the effect of small corrections to the thermal spectrum at s ∼ s . These corrections give a correction to the entropy of size In the limit k → ∞, we also regain some analytic control over the shape of the spectral cutoff. Throughout the transition, the von Neumann entropy is well approximated by assuming that the density of states is a thermal spectrum, with a hard cutoff after k eigenvalues, and with the eigenvalues increased by a uniform shift to ensure the correct normalisation. This approximation is worst at the Page transition, when it gives an O(β) error.
It is important to note, however, that this shifted thermal spectrum only accurately computes the leading correction to the naive Page curve in phases IV and V. Outside this range, the shifted thermal spectrum entropy is no more accurate than the naive Page curve. Nonetheless, the shifted thermal spectrum is still very accurate everywhere else, since the naive Page curve is itself a very good approximation away from the transition.
In Figure 19, we plot the correction δS to the naive Page curve entropy over the course of the Page transition.
When k is sufficiently small, we can solve the resolvent equation by assuming that the resolvent R satsifies |y(s)R(λ)| kZ 1 for all λ and s. We justify this assumption by checking that the resulting solution for R is self-consistent. We then have (F.14) where we chose the correct solution by demanding that R → 0 as λ → ∞. The density of states D(λ) has support for 1/k − 4Z 2 /kZ 2 1 ≤ λ ≤ 1/k + 4Z 2 /kZ 2 1 , where it has the form This is a semicircle distribution that is sharply peaked around 1/k. We now need to check that our assumptions were self-consistent. We have |R(λ)| k 3/2 for all s and λ. Our assumption that |R(λ)y(s)| kZ 1 is therefore valid so long as k Z 2 /y(0) 2 , as claimed.
What about the von Neumann entropy in this phase? At leading order, it is clearly given by log k. To calculate the leading correction to this, we expand λ about 1/k to second order. We find and we therefore obtain This is very similar to the Page result, except that the correction is enhanced by a factor of Z 2 /(Z 1 y(s (1) )), because the largest correction comes from values of s near s (2) rather than s (1) . It is also worth noting that (F. 19) and (F.20) must also be true in the exact solution for D(λ), since we know Tr(ρ 0 ) = k and Tr(ρ) = 1. This ensures that higher-order perturbative corrections to D(λ) give corrections to the von Neumann entropy S that are suppressed compared to (F.22).
A simpler way to reach the same result, without going through the full density of states calculation, is to look at the leading correction to the disconnected geometry in the calculation of the purities Tr(ρ n ) when k is small. This leading correction comes from geometries where two replicas are connected, with the rest disconnected. There are n(n − 1)/2 such geometries (from all the ways of pairing replicas), and each geometry is suppressed compared to the disconnected geometry by a factor of Z 2 k/Z 2 1 . This gives a correction to the Rényi entropy of size δS n = −nZ 2 k/2Z 2 1 . Taking the limit n → 1, reproduces (F.22).
Phase II: Z 2 /y(0) 2 k e S 0 ρ(s (2) ) In this regime, the approximation from the previous subsection breaks down at sufficiently small values of s. We need a new approach. Our strategy is to split the integral over s into two pieces: s < s tran and s > s tran as We treat the integral for s > s tran in the same way as before. This assumes that |y(s tran )R(λ)| kZ 1 (F. 25) for all λ. We then treat the integral for s < s tran as a small perturbation that can be ignored to leading order. For this second approximation to be justified, we require In the first approximate inequality we used the fact that |y(s)R| |kZ 1 − y(s)R|, except for a small neighbourhood near the pole at kZ 1 = y(s)R. This neighbourhood only gives a small contribution to the integral. Can we choose s tran to simultaneously satisfy the required conditions on both parts of the integral? The answer is yes. To satisfy our assumption (F.26), we require s k − s tran 1. In this phase, we have s k s (2) s (1) and so Treating the second term on the RHS of (F.24) as a small perturbation, we find that the unperturbed solution for the resolvent R is again given by (F.15), as in phase I. Is this result consistent with our assumption (F.25)? We have since ρ(s k )y(s k ) 2 ρ(s (2) )y(s (2) ) 2 . We can therefore consistently choose s tran = s k −κ for some large, but O(1), constant κ, and simultaneously satisfy (F.25) and (F.26).
Having justified our assumptions, we now treat the integral as a small perturbation. In the limit λ − 1/k Z 2 /kZ 2 1 , the unperturbed solution (F.15) reduces to Including the perturbation (F.29), we find the first order correction Since we now have y(0)/Z 1 Z 2 /kZ 2 1 , the pole at λ − 1/k = y(s)/Z 1 , for sufficiently small s, is in the regime where our approximations are valid. This gives a nonzero density of states, at far larger eigenvalues than anywhere in the semicircle.
Specifically, for λ − 1/k Z 2 /kZ 2 1 , we have This looks like a thermal spectrum, with all the eigenvalues shifted upwards by 1/k. At λ − 1/k ≈ 2 Z 2 /kZ 2 1 , this shifted thermal spectrum merges into the main semicircle. The full spectrum is therefore a semicircle, centred on 1/k and with width 4 Z 2 /kZ 2 1 , plus a very small tail of larger eigenvalues that look like the largest eigenvalues of the thermal spectrum, shifted upwards by δλ = 1/k. Note that there is no tail of small eigenvalues, outside of the semicircle, because in this regime the unperturbed solution is negative, and the perturbed solution does not include any poles.
To understand the exact transition between the shifted thermal tail and the semicircle, we would need to consider the first order perturbation of the full unperturbed solution (F.15). However, we will not do so here, since it is unimportant for our purposes.
We note that the exact solution must satisfy because of the constraints Tr(ρ 0 ) = k and Tr(ρ) = 1. These results will be crucially to controlling the size of the correction to the von Neumann entropy. What is the von Neumann entropy? Again, to leading order, it is given by log k. There are two possible sources for the dominant correction. The first is the nonzero width of the semicircle, which gives the same correction that was found in (F.22). The second comes from the existence of the small tail of large eigenvalues. The existence of this tail gives a correction to the entropy of size The first term here is the direct contribution to the entropy from eigenvalues in the small tail, while the second and third terms come from corrections to the contribution from eigenvalues in the semicircle (calculated by the formula in (F.18)) given (F.33) and (F.34) respectively. We therefore find where y(s * ) = 2 Z 2 /k (and hence s * s tran ). Since 0 ≥ We conclude that (F.22) is still the leading correction to the von Neumann entropy. We note that the largest contribution to (F.36) comes from values of λ ∼ 1/k + O( Z 2 /kZ 2 1 ). In this regime, the approximation R 0 (λ) = k/(λ − 1/k) is not very accurate. Hence the precise size of the correction (F.36) should not be trusted. This is fine: since (F. 19) and (F.20) are fixed by the constraints, the leading effect on the von Neumann entropy of any small correction to D(λ), with λ − 1/k = O( Z 2 /kZ 2 1 ), will be subleading compared to (F.21). It will therefore be small compared to the correction in (F.22). The point of the calculation in (F.36) is to bound the correction from the small density of states with λ − 1/k Z 2 /kZ 2 1 , where a small correction to D(λ) could, in principle, have given a larger correction to the von Neumann entropy than (F.22).
Phase III: e S 0 ρ(s (2) ) k k 3→4 In this phase, we will have somewhat less control over the shape of the peak of the density of states D(λ). We will focus on understanding the density of states away from this peak. Fortunately this approach will still give us good control over the von Neumann entropy. Our strategy will be similar to our strategy for phase II. However we will find that our approximation for the resolvent R(λ) will now only be valid outside of a small range of λ (which includes the peak of the spectrum). Explicitly, we choose some s tran as for Z 2 /y(0) 2 k e S 0 ρ(s (2) ), and approximate the resolvent equation by where the second term on the right hand side is treated as a small perturbation. As before, for this approximation to be valid, we require s k − s tran 1. as well as |y(s tran )R(λ)| kZ 1 , for the values of λ where we want to be able to trust our approximation.
Unlike for phase II, we do not include a term in our approximation for the integral over s > s tran . In this phase we will have s tran s (2) , causing (F.41) to be dominated by s ∼ s tran . It will therefore be much smaller than k, and can be safely ignored, whenever our approximations are themselves valid.
Ignoring the small perturbation, we obtain the initial unperturbed solution Here, we have used the fact that s tran < s k s (1) , to see that Our unperturbed solution R 0 (λ) has a pole at λ = 1/k, so it is clearly not consistent with our assumption that |y(s tran )R(λ)| kZ 1 for all values of λ. What values of λ allow us to choose s tran so that our solution is self-consistent? We have |y(s k )R(λ)| kZ 1 so long as Since s k = O(1/β) in the semiclassical limit, we have y(s) ∝ e O(s) for s ≈ s k . Both assumptions are therefore valid so long as (F.44) holds and we choose s tran = s k − κ, for some large, but O(1), constant κ. As for phase II, the first order correction to (F.42) is given by (F.31). We find that the density of states D(λ) = 0 for sufficiently small λ, satisfying 1/k −λ y(s k )/Z 1 . For large values of λ, with λ − 1/k y(s k )/Z 1 , we have D(λ) = e S 0 dsρ(s)δ(λ − 1/k − y(s)/Z 1 ), (F. 45) as in (F.32). This region only contains a small fraction of the k total eigenvalues in the entanglement spectrum -by definition, the kth largest eigenvalue in the shifted thermal spectrum is given by λ = 1/k + y(s k )/Z 1 . The remaining eigenvalues must lie in the range 1/k − O(y(s k )/Z 1 ) λ 1/k + O(y(s k )/Z 1 ). Since we know that R 0 (λ) is not a good approximate solution for any values of λ within this range, we conclude that the density of states D(λ) is large throughout this region, rather than being concentrated parametrically closer to 1/k.
In summary, we conclude the following. The density of states looks like a thermal spectrum, shifted by δλ = 1/k, cutoff by some unknown function of width O(y(s k )/Z 1 ) at λ = 1/k. We do not know any compelling way to get analytic traction on this cutoff function, although it can of course be calculated numerically, by solving either the entire resolvent equation numerically or an appropriate approximation to it.
We note that when k ∼ e S 0 ρ(s (2) ), the width of the semicircle that we found in phase II is equal to the width of this new uncontrolled cutoff function. There is a smooth transition between the two.
We now move on to calculating the von Neumann entropy. To leading order we still have S = log k. As for phase II, there are two potential sources for the leading correction. The first is a correction from the width of the cutoff, as in (F.18). This is equal to To calculate the exact size of this correction, we would need to use a numerical approximation for the density of states in the cutoff region. The second possible correction comes from the existence of the thermal tail of large eigenvalues. As in (F.36), this is given by Since we have s (2) s k s (1) , this integral is dominated by values of s with ky(s) ∼ Z 1 . Unlike for k e S 0 ρ(s (2) ), this means that the dominant contribution comes from eigenvalues where the approximation can be trusted. Its size is therefore O(e S 0 ρ( s k )/k), where s k is defined in (F.6).
Comparing the sizes of the two corrections, we have since we are assuming that k k 3→4 , for k 3→4 defined in (F.7). We therefore conclude that the dominant correction in this phase comes from the width of the cutoff rather than the thermal tail.
This phase can be dealt with using exactly the same strategy as phase III. The only change is that we now find that the source of the leading correction to the von Neumann entropy (which is still given by log k at leading order) has changed. Specifically, the leading correction δS 2 = O(e S 0 ρ( s k )/k) now comes from the existence of the thermal tail, rather than the width of the cutoff region. Its exact size is given (F.47), with high accuracy in the semiclassical limit.
One might worry about whether we can still assume that e S 0 ∞ stran dsρ(s)y(s) ≈ Z 1 . (F.49) As we shall see in the next section, a more accurate treatment of this integral would decrease the shift in the thermal tail by p k /k, where p k is the probability that the thermal ensemble is in one of its k largest eigenvalues. When k/e S 0 ρ(s (1) ) = e −O(1/β) , we have p k /k = O(y(s k )/Z 1 ), so this shift can be ignored everywhere that the shifted thermal spectrum can be trusted. The leading uncontrolled correction to the von Neumann entropy still comes from the width of the cutoff region, and has size δS 1 = O(k 2 y(s k ) 2 /Z 2 1 ), as in (F.46).
Phase V: e S 0 ρ(s (1) )e −O(1/β)) k e S 0 ρ(s (1) )e O(1/β) We have now finally reached the actual Page transition for the von Neumann entropy! In this phase, we can still use the same approximations that we used in phases III and IV to find the shape of the density of states away from its peak. Again, we want to choose s tran = s k − κ, for a large, but O(1), constant κ. However, we can no longer assume that where p k is the probability of the thermal density matrix being in one of its k largest eigenvalues. We therefore find that the density of states D(λ) is given by We see that the shift in the thermal part of the spectrum decreases as we move through the Page transition. This can be easily understood as the eigenvalues adjusting to match the twin constraints that Tr(ρ) = 1 and Tr(ρ 0 ) = k; if we had O(k) thermal eigenvalues, all shifted by δλ = 1/k, we would have Tr(ρ) > 1.
(F. 54) It follows that y(s k )/Z 1 (1 − p k )/k = O √ βρ(s k )y(s k ) (1 − p k )ρ(s (1) )y(s (1) ) 1, (F. 55) since in this phase we have s k − s (1) 1/β. The width of the uncontrolled region is therefore always small compared to the size of a typical eigenvalue in the uncontrolled region.
Let us try to calculate the von Neumann entropy. As a first approximation, we can consider the von Neumann entropy of a thermal spectrum, with a hard cutoff after k eigenvalues, and with each eigenvalue increased by (1 − p k )/k. This agrees with the actual spectrum everywhere that we have control, and obeys the constraints Trρ 0 = k and Trρ = 1. The von Neumann entropy of such a state would be given by What is the error from this approximation? We have significantly changed the spectrum in the small region where λ − 1−p k k = O( y(s k ) Z 1 ). To leading order, this part of the spectrum gives a contribution to the von Neumann entropy of (1 − p k ) log((1 − p k )/k). Since the zeroth and first moments are fixed by the rest of the spectrum and the constraints Tr(ρ) = 1 and Tr(ρ 0 ) = k, the leading difference between the shifted thermal spectrum and the actual spectrum will be controlled by where ∆D(λ) is the diference between the two spectral densities. To find this formula, we expanded λ to second order about (1 − p k )/k. In estimating its size, we used the fact that most of the k eigenvalues are in this region of the spectrum. How large can this error get? It is largest when s k ∼ s (1) + O( 1/β). In this case, we have (1 − p k ) = O(1), so the size of the correction is O(β).
A cruder approximation is to replace log (y(s)/Z 1 + (1 − p k )/k) in (F.56) by log max(y(s)/Z 1 , (1− p k )/k). The largest error from this simplification comes from values of s where y(s)/Z 1 ∼ (1 − p k )/k. From such eigenvalues, we have an O(1) error in our estimate of the logarithm. When s k ∼ s (1) , this gives a total error of O(ρ(s)y(s)/Z 1 ) = O( √ β), which is parametrically worse than the shifted thermal spectrum approximation.
How large is the difference between the shifted thermal entropy (F.56) and a naive Page curve given by S = min(log k, S BH )? Assuming µ 1/β, S BH = e S 0 Z 1 dsρ(s)y(s) log(Z 1 /y(s)) = S 0 + 4π 2 β + 1 2 log 2π β + O(1). (F.58) Let k = Z 1 /y(s (1) ), i.e s k = s (1) . The naive Page curve entropy would be log k. Instead, using the crude approximation discussed above, we find (F. 59) In the first line, we used the fact that p k = 1/2 + o(1). In the second line, we used the fact that log(Z 1 /y(s (1) ) = log k and assumed µ 1/β. The leading correction is therefore O(1/ √ β); it becomes very large in the semiclassical limit. This is in sharp contrast to the Page curve for Haar random states, or the microcanonical ensemble, where the correction is never larger than O(1). The energy fluctuations in a thermal state parametrically increase the size of the corrections to the naive Page curve.
Phase VI: e S 0 ρ(s (1) )e O(1/β) k e S 0 ρ(s ) In this phase, we have (1 − p k )/k = O(y(s k )/Z 1 ). Hence the approximations made in phase V no longer give us good control over the bottom of the spectrum. Instead, we treat the regimes λ y(s k )/Z 1 and λ y(s k )/Z 1 separately. For the former case, we can treat the entire second term in as a small perturbation. Our initial solution is R 0 (λ) = k/λ. Hence we have R 0 y(s k ) kZ 1 and our treatment of the second term (F.60) as a small perturbation is self-consistent. We therefore have the first order perturbative correction The thermal spectrum is essentially unshifted, everywhere that it is under good analytic control. For λ y(s k )/Z 1 , it is easiest to consider λ as a function of negative, real R. Rewriting (F.60), we find For small negative R, the first term dominates, and so λ is negative. For very large negative R, the second term dominates, and so λ is positive, although it approaches zero as R → −∞. Note that for negative R, the contribution from the integral is positive for all values of s. At some intermediate value of R, λ should have a maximum. This value will be the bottom of the eigenvalue spectrum. Suppose we have R = −κkZ 1 /y(s k ) for some large, but O(1), constant κ as usual. In this case, the second term dominates and we have We can therefore be confident that D(λ) = 0 for λ y(s k )/Z 1 . As for phases III-V, we do not have good control over the shape of the spectrum in the cutoff region, when λ = O(y(s k )/Z 1 ).
The largest correction to the von Neumann entropy comes from the cutoff region. For eigenvalues in this region, there is an O(1) multiplicative uncertainty in the eigenvalue, which corresponds to an O(1) uncertainty in log λ. The total probability of an eigenvalue being in this region is O(1 − p k ). Hence we have where S BH is the entropy of the canonical ensemble. Note that this correction is the same order of magnitude as the difference between S BH and the entropy of the shifted thermal spectrum (F.56). However, we have no good reason to think that the two corrections are the same. The size will be affected by the details of the cutoff, over which we have very little control.
Phase VII: k e S 0 ρ(s ) In the limit k → ∞, the correction to the thermal entropy from the cutoff region, discussed above, decays as O( ρ(s k )y(s k ) Z 1 ) = O(e − β 2 (s k −s (1) ) 2 ). (F. 66) This decays much faster than the largest corrections to the Renyi entropies, which come from planar diagrams that consist of two discs and are suppressed by a factor of O(1/k) compared to the leading connected topology. One can also make general arguments that it is inconsistent with entanglement wedge reconstruction for the correction to decay faster than O(1/k) in this limit [13]. It follows that there should exist some other correction, which cannot come from the cutoff region, that becomes dominant at sufficiently large k. The source of that correction is the secondorder perturbative correction to the resolven. As for phase VI, the unperturbed solution is given by R 0 (λ) = k/λ and the first order perturbative correction R 1 (λ) is given by (F.61). The second order correction is R 2 (λ) = e 2S 0 Z 1 k ds 1 ds 2 ρ(s 1 )ρ(s 2 )y(s 1 )y(s 2 ) (Z 1 λ − y(s 1 )) 2 (Z 1 λ − y(s 2 )) . (F.67) The second order perturbative correction to the density of states is therefore D 2 (λ) = e 2S 0 kZ 1 ds 1 ds 2 ρ(s 1 )ρ(s 2 )y(s 1 )y(s 2 ) y(s 2 ) − y(s 1 ) δ (λ − y(s 1 )/Z 1 ). (F.68) The pole at s 1 = s 2 is dealt with by taking the principal value. The contributions proportional to δ(λ − y(s 1 )/Z 1 ) and δ(λ − y(s 2 )/Z 1 ), which would otherwise exist, cancel under the relabelling s 1 ↔ s 2 . We therefore get a correction to the von Neumann entropy of δS = − where s is the saddle point for ρ(s) 2 y(s). This is proportional to 1/k, as expected. We emphasize that this final result depends crucially on the fact that k e S 0 ρ(s ). If we still had k e S 0 ρ(s ), the integral would be dominated by values of s close to s k and we would have This is the same order of magnitude as the correction we found in phase VI. We emphasize that the exact size of the correction still cannot be computed, since the perturbative approximation is poor for s = s k − O(1).
As with the correction to the entropy in phases I and II, there is a much simpler way to find this entropy correction, by analytically continuing the leading correction to the Renyi entropies. The leading correction to the purities Tr(ρ n ) in the limit k → ∞ comes from planar diagrams consisting of two connected components. These give a contribution to the purity of δ Tr(ρ n ) = n 2kZ n 1 n−1 p=1 Z p Z n−p = ne 2S 0 2kZ n 1 ds 1 ds 2 ρ(s 1 )ρ(s 2 ) y(s 1 ) n y(s 2 ) − y(s 1 )y(s 2 ) n y(s 1 ) − y(s 2 ) . (F.74) They therefore give a correction to the Rényi entropies S n = 1/(1 − n) log(T r(ρ n )) of δS n = − ne 2S 0 (n − 1)2kZ n ds 1 ds 2 ρ(s 1 )ρ(s 2 ) y(s 1 ) n y(s 2 ) − y(s 1 )y(s 2 ) n y(s 1 ) − y(s 2 ) , (F. 75) In the limit n → 1, this gives the correction to the von Neumann entropy δS = − e 2S 0 2kZ 1 ds 1 ds 2 ρ(s 1 )ρ(s 2 ) y(s 1 )y(s 2 ) y(s 1 ) − y(s 2 ) log y(s 1 ) y(s 2 ) , (F. 76) which is exactly what we found before. The disadvantage of this approach is that it does not know when this calculation breaks down. If we only looked at the two connected component topologies, we would expect that the corrections to the Page curve would become large when k = O(e 2S 0 ρ(s ) 2 y(s )/Z 1 ), which is much larger than the value of k at the Page transition. In contrast, our more careful approach using the resolvent can see that this large correction doesn't exist when k e S 0 ρ(s ), because the integral in (F.71) is cutoff at s k .
Finally, in the limit of very large k, we actually regain some increased analytic control over the cutoff region. Specifically, we need s k 1/β. We assume for this section that the brane mass µ 1/β and so we have y(s) = y(0)e −βs 2 /2 . Suppose we rewrite (F.60) as λR = k − e S 0 ds ρ(s) For self-consistency of our assumptions, we will need log(−R y(0)/kZ 1 ) 1. We know that in the cutoff region we have |R| = O(kZ 1 /y(s k )), which implies s R ≈ s k . Our approximation is therefore well controlled so long as s k O(1/β), as claimed. So far, we have only considered negative, real R. For imaginary or positive R, the integral is seemingly not so simple. However, we can simply deform the integration contour in the complex plane, without passing through any poles, to absorb the phase of R into a phase of y(s R ). Hence (F.79) is valid when R has arbitrary phase so long as we use complex logarithms. Writing R = −kZ 1 e r+iθ /y(0), we obtain s R = 2r/β + iθ √ 2βr , (F. 80) where we used the fact that r 1/β. We can also do the integral in (F.77) explicitly to get −Z 1 λe r+iθ = 1 − e S 0 k s R 4π 3 e 2πs R . (F.81) Comparing the imaginary parts of the left and right hand sides, we find Solutions for this exist with θ = 0 for λ < 2π y(s k )/(eZ 1 βs k ) = λ min , so there are no eigenvalues below this cutoff. At λ = λ min , we have ∆r = 1. For λ > λ min , we have λ e ∆r−1 λ min cos sinc −1 λ min λe ∆r−1 = ∆r.
(F. 89) In principle, this equation can be solved to find the fixed function ∆r(λ/λ min ), which does not depend on any parameters. For λ λ min , we have cos(sinc −1 λ min e −∆r+1 /λ ≈ 1 and so ∆r = −W 0 (λ/λ min ) where W 0 is the Lambert W function. The resolvent R decays at large λ, as expected, and we reenter the regime where our perturbative approximations are well controlled.
Finally, we obtain the density of states D(λ) = Z 1 ke r sin θ π = kZ 1 e ∆r(λ/λ min ) πy(s k ) sin sinc −1 2π y(s k ) e ∆r(λ/λ min ) Z 1 λβs k . in the first step, we used ρ(s) ≈ ρ(s k ), while the second step used y (s) = β s y(s). The cutoff spectrum connects smoothly to the thermal spectrum. This completes our analysis of the entanglement spectrum of the simple model.