Phases of information release during black hole evaporation

In a recent article, we have shown how quantum fluctuations of the background geometry modify Hawking's density matrix for black hole (BH) radiation. Hawking's diagonal matrix picks up small off-diagonal elements whose influence becomes larger with the number of emitted particles. We have calculated the "time-of-first-bit", when the first bit of information comes out of the BH, and the "transparency time", when the rate of information release becomes order unity. We have found that the transparency time is equal to the "Page time", when the BH has lost half of its initial entropy to the radiation, in agreement with Page's results. Here, we improve our previous calculation by keeping track of the time of emission of the Hawking particles and their back-reaction on the BH. Our analysis reveals a new time scale, the radiation "coherence time", which is equal to the geometric mean of the evaporation time and the light crossing time. We find, as for our previous treatment, that the time-of-first-bit is equal to the coherence time, which is much shorter than the Page time. But the transparency time is now much later than the Page time, just one coherence time before the end of evaporation. Close to the end, when the BH is parametrically of Planckian dimensions but still large, the coherence time becomes parametrically equal to the evaporation time, thus allowing the radiation to purify. We also determine the time dependence of the entanglement entropy of the early and late-emitted radiation. This entropy is small during most of the lifetime of the BH, but our qualitative analysis suggests that it becomes parametrically maximal near the end of evaporation.


Introduction
That black holes (BHs) radiate thermally was a remarkable finding [1,2] but has also led to some infamous puzzles. For instance, an initially pure state of matter can collapse to form a BH and eventually evaporate into a mixed state of thermal radiation. This is in direct conflict with quantum mechanics, which postulates a unitary time evolution and so forbids a pure state from evolving into a mixed one. This is, in essence, the BH information-loss paradox [3]. (For reviews, see [4,5,6].) Over the years, a myriad of explanations have been suggested for how this tenuous situation gets resolved. Initially, Hawking thought that the laws of quantum mechanics have to be changed. Others have sometimes claimed that a theory more fundamental than general relativity, such as string theory, or some exotic new physics, such as highly entropic remnants, is needed to resolve the matter. However, strong circumstantial evidence has been gathered, indicating that general relativity and ordinary quantum mechanics are sufficient for consistently describing the process of BH evaporation. In particular, the quantum-information treatments of Page [7] and then of Hayden and Preskill [8] demonstrate that a slowly burning matter system, be it the complete works of Shakespeare or a Schwarzschild BH, must emit all of its information by the end of evaporation. Consequently, a thermal mixed state cannot be the final product. Once this logic is accepted, the challenge then becomes to identify what is still missing from the standard treatments; namely, the information-release mechanism that is responsible for restoring unitarity by the end of the BH evaporation. The review articles [4,5,6] contain further discussion of the issues concerning the BH information paradox.
In [9], it was proposed that the origin of the BH information paradox is the use of a strictly classical geometry for the BH. (See [10,11,12] for overlapping ideas.) It was also argued that the leading semiclassical corrections that account for the quantum fluctuations of the background geometry should be taken into account by assigning a wavefunction to the BH. The contention was that the parameter which controls the strength of the semiclassical corrections is the ratio of the Compton wavelength of the BH, $\lambda_{BH} = \hbar/M_{BH}$, to its radial size $R_S$. In [13], we have proposed a concrete scheme for evaluating the semiclassical corrections using the wavefunction of [14,9]. The parameter that controls the strength of the semiclassical corrections was denoted by $C_{BH}$ and determined more precisely, $C_{BH} = 1/S_{BH} = \lambda_{BH}/(2\pi R_S)$. We have, in a recent article [15], gone on to apply this idea to the calculation of the Hawking radiation. There, Hawking's calculation was repeated but with one additional input: the assignment of a Gaussian wavefunction to the collapsing shell of matter. The main distinction between our treatment and Hawking's is the introduction of a new scale, the width of the wavefunction. On the basis of the Bohr correspondence principle [14,9], this width should be Planckian.
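As a quick numerical illustration of the two expressions for the classicality parameter, the following sketch evaluates $C_{BH}$ for a solar-mass BH both as the inverse entropy and as the Compton-wavelength ratio. It assumes SI units with the speed of light restored (the text sets it to unity), the area-law entropy $S_{BH} = \pi R_S^2/l_p^2$ and rounded constants; the function names are ours and purely illustrative.

```python
import math

# Rounded SI constants. The speed of light is restored here (assumption:
# the text works in units where it equals 1).
HBAR = 1.054571817e-34   # J s
G = 6.67430e-11          # m^3 kg^-1 s^-2
C_LIGHT = 2.99792458e8   # m / s
M_SUN = 1.98847e30       # kg


def schwarzschild_radius(mass_kg):
    """Horizon radius R_S = 2 G M / c^2."""
    return 2.0 * G * mass_kg / C_LIGHT**2


def bh_entropy(mass_kg):
    """Dimensionless area-law entropy S_BH = pi R_S^2 / l_p^2."""
    planck_length_sq = HBAR * G / C_LIGHT**3
    return math.pi * schwarzschild_radius(mass_kg) ** 2 / planck_length_sq


def classicality_from_entropy(mass_kg):
    """C_BH = 1 / S_BH."""
    return 1.0 / bh_entropy(mass_kg)


def classicality_from_compton(mass_kg):
    """C_BH = lambda_BH / (2 pi R_S), with lambda_BH = hbar / (M c)."""
    compton = HBAR / (mass_kg * C_LIGHT)
    return compton / (2.0 * math.pi * schwarzschild_radius(mass_kg))


if __name__ == "__main__":
    print(classicality_from_entropy(M_SUN))
    print(classicality_from_compton(M_SUN))
```

The two functions agree to floating-point accuracy, and the result for a solar-mass BH is extremely small, which is the sense in which an astrophysical BH is deep in the semiclassical regime.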
After computing the appropriate expectation values, we obtained a picture that is different from that found by Hawking and consistent with Page's.
Most pertinently, Hawking's density matrix for the BH radiation is strictly diagonal whereas our matrix contains small off-diagonal elements of order $\sqrt{\hbar}$ in the same basis. The effect of these elements on the eigenvalues of the matrix is initially small but, as the number $N$ of emitted particles grows, so do the changes to the eigenvalues. The parameter that controls these changes to the matrix was found to be $N C_{BH}$.
We have calculated the time when the rate of information release becomes of order unity. This "transparency time" $t_{trans}$ was found to coincide with the time when $N C_{BH} = 1$, which is, in turn, the same as the "Page time" when the BH has lost half of its initial entropy to the radiation. Hence, this result is in agreement with the analysis of Page [7].
We have also calculated the "time-of-first-bit" $t_{1bit}$, when the first bit of information comes out of the BH. This occurs when $N^2 C_{BH} = 1$, which is much earlier than the Page time and apparently in disagreement with Page's calculation. However, we have tracked the information in the correlations between the shell and radiation as well as in the radiation subsystem. Page, on the other hand, tracked only the latter, which is an exponentially suppressed quantity until after the transparency time.
To keep the calculations in [15] as simple as possible, we have ignored the fact that the Hawking particles are emitted over a time scale spanning the lifetime of the BH. In effect, we were assuming that all the Hawking particles are being emitted coherently. Here, we will improve upon our previous calculation by keeping track of the time of emission of the Hawking particles and their back-reaction on the BH. Our analysis reveals a new time scale, the radiation coherence time $t_{coh} = R_S^2/l_p$, which is the geometric mean of the evaporation time $R_S^3/l_p^2$ and the light crossing time $R_S$. The number of coherent Hawking particles $N_{coh}$ that is emitted during this time is typically given by $N_{coh} = 1/\sqrt{C_{BH}} = \sqrt{S_{BH}}$. This estimate for $N_{coh}$ is valid during most of the lifetime of the BH but gets modified at the last stages of evaporation (see below).
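The hierarchy of scales just described can be checked with a few lines of arithmetic. The sketch below works in Planck units ($l_p = 1$) and uses the parametric expressions quoted above with all numerical prefactors dropped, except for the $\pi$ in the assumed area law $S_{BH} = \pi R_S^2/l_p^2$; the function name and the dictionary keys are illustrative choices of ours.

```python
import math


def time_scales(r_s):
    """Parametric time scales for a BH of radius r_s (in Planck lengths).

    Numerical prefactors are dropped, as in the parametric estimates of
    the text, so only the scaling with r_s is meaningful.
    """
    t_cross = r_s            # light crossing time  ~ R_S
    t_evap = r_s ** 3        # evaporation time     ~ R_S^3 / l_p^2
    t_coh = r_s ** 2         # coherence time       ~ R_S^2 / l_p
    entropy = math.pi * r_s ** 2   # assumed area law S_BH = pi R_S^2 / l_p^2
    return {
        "t_cross": t_cross,
        "t_coh": t_coh,
        "t_evap": t_evap,
        "n_coh": math.sqrt(entropy),   # N_coh = 1/sqrt(C_BH) = sqrt(S_BH)
    }


if __name__ == "__main__":
    scales = time_scales(1.0e10)
    print(scales)
    # The coherence time is the geometric mean of the other two scales.
    print(math.sqrt(scales["t_evap"] * scales["t_cross"]), scales["t_coh"])
```

For any large radius the ordering is $t_{cross} \ll t_{coh} \ll t_{evap}$, with the coherence time sitting exactly halfway between the other two on a logarithmic scale.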
What we find is that the Page time splits into two. The time-of-first-bit is the same as found before: much earlier than the Page time. It can now be identified with the coherence time, $t_{1bit} = t_{coh}$. On the other hand, the transparency time turns out to be much later than the Page time. This phase now transpires at one coherence time before the time of final evaporation,  (10)). For $\Delta N$, the width is determined by the inverse of the width in $\omega$, and so it is proportional to C 1/2 The situation changes at the late stages of evaporation, although this era can only be discussed in a qualitative way because our methods become less accurate for this region of parameters. Here, the BH is parametrically nearing Planckian dimensions, but still large and semiclassical, so that $C_{BH}$ is becoming larger and approaching order unity. It is clear that the width of $\omega$ is decreasing and approaching unity but, somewhat surprisingly, the width of $\Delta N$ is growing. To understand this unexpected result, note that the factor $C_{BH}$ in the product $C_{BH}\,\Delta N$ is determined by the time of emission of the Hawking particle and is small for most emitted particles. Therefore, $N_{coh}$ goes at the end as $1/C_{BH} \sim S_{BH}$, where $S_{BH}$ means the BH entropy at an earlier epoch, so that $N_{coh} \gg 1$. Based on this qualitative analysis, it will be argued that, by this point in the evaporation, the entirety of the emitted Hawking particles becomes coherent, $N_{coh} \sim$ total number of particles. Consequently, the radiation purifies at a high rate.
The sequence by which the correlations between the emitted particles evolve now becomes clear: always a delta function for Hawking, since $N_{coh} = 1$; whereas in our case, first a smoothed delta function when $N_{coh} \sim 1/\sqrt{C_{BH}}$, followed by a theta function when $N_{coh} \sim 1/C_{BH}$.
Taking into account the time-dependent emissions, we are able to determine the evolution of the entanglement between the early and late radiation.
We find that this entanglement is initially very small but becomes significant at $t_{trans}$ and then grows quickly to be (parametrically) maximal when the radiation purifies.
The rest of the paper proceeds as follows: First, in Section 2, we summarize our preceding work [15]. This is essential for understanding the remainder, as we adopt this initial framework and build up the analysis from there.
Next, in Section 3, we determine the effect of time-dependent emissions and use this to better understand how the evaporation process evolves.
In Section 4, a calculation of the trace of the square of the density matrix enables us to analyze the rate of information transfer, to quantify the various time scales and to qualitatively demonstrate that the radiation does indeed become purified by the end. Then, in Section 5, we calculate the entanglement entropy for the early-and late-emitted radiation and qualitatively show that it becomes parametrically maximal at late times. The paper concludes with a summary (Section 6) and an appendix.
Recently, a modern interpretation of the information-loss paradox, known as the "firewall" problem, has attracted considerable attention [16] (also see [17,6,18] for earlier versions and [19,20,21,22,23,24,25,26,27,28,29,30,31,32] for a sample of the related literature). We expect that the current analysis will be an essential step toward a resolution of this puzzle, but defer a specific discussion to a future publication [33].
2 Review of previous results on semiclassical corrections to Hawking radiation

Conventions
We will now review our preceding paper [15], which the reader can refer to for an in-depth discussion. This review will also serve to introduce notations and conventions that we will use in the following sections.
We choose units such that Planck's constant $\hbar$ and Newton's constant $G$ are explicit, and all other fundamental constants are set to unity. In some instances, the Planck length $l_p = \sqrt{\hbar G}$ is used instead.
We assume a four-dimensional Schwarzschild BH (generalizations to higher dimensions are straightforward) of large but finite mass $M_{BH} \gg \sqrt{\hbar/G}$, with the metric $ds^2 = -\left(1 - \frac{R_S}{r}\right)dt^2 + \left(1 - \frac{R_S}{r}\right)^{-1}dr^2 + r^2 d\Omega^2$. Here, $R_S = 2M_{BH}G$ denotes the horizon radius.
We use a dimensionless advanced-

Semiclassical density matrix
The objective of [15] was to calculate the modifications due to a fluctuating geometry to Hawking's thermal density matrix for the radiation emitted by a collapsing shell of matter. As the geometry is sourced by the collapsing shell, we have assigned it a wavefunction, where $R_S$ is the Schwarzschild radius of the incipient BH, $R_{shell}$ is the radius of the shell, $N \simeq 4\pi R_S^2\sqrt{\pi\hbar G}$ is a normalization constant and $C_{BH} = S_{BH}^{-1}$ is the aforementioned "classicality" parameter. This form of BH wavefunction was first justified in [14] and then further motivated in [9,13,15].
The classicality parameter $C_{BH}$ can be viewed as a dimensionless (scaled) $\hbar$ that evolves in time, $C_{BH} = \hbar(t)$. It is initially very small for a large BH but steadily grows as the BH evaporates. In this sense, semiclassical corrections to observables can be expressed as powers of this dimensionless $\hbar$.
For a discussion of BH radiation, it is more convenient to use the advanced time of the shell $v_{shell}$. The conversion to $\Psi_{shell}(v_{shell})$ is made by observing that, in the near-horizon limit, where $v_0$ is the advanced time at which the shell crosses its horizon. One can then compute the expectation value of a generic operator $O$ as follows: where $\sigma^2 = R_S^2 C_{BH}/2 = l_p^2/2\pi$. Equation (2) is a particular case of our more general prescription [9,13], which amounts to applying the standard rules of quantum mechanics.
Hawking's calculation [2] relates the in-going modes with the out-going modes (the Hawking particles) by a Bogolubov transformation, Here, $F_\omega$ is an out-mode that has been traced back to past null infinity, $\frac{1}{\sqrt{2\pi}}e^{i\omega' v}$ is a basis function for an in-mode and the $\alpha$'s ($\beta$'s) are the positive-energy (negative-energy) Bogolubov coefficients. Recall that, unlike Hawking (and unlike in [15]), we are using dimensionless frequencies.
The Hawking single-particle density matrix for the out-modes can then be expressed as with $|0_{in}\rangle$ denoting the vacuum annihilated by positive-frequency in-modes.
Hawking used a procedure of ray tracing that exploited the geometric optics of the modes to determine that The logarithm in the top line takes into account the discontinuity in the phase of the modes as they pass across the shell at an advanced time v close to v 0 . This phase discontinuity turns out to be central to our findings.
As discussed in [15], only the Bogolubov coefficients are sensitive to the effects of the fluctuating background geometry. Hence, applying our prescription (2), we obtain the "semiclassical" density matrix, The "semiclassical" coefficients $\beta_{\omega''\omega,\,SC}$ are derived in the same way as Hawking does but now take into account that the discontinuity in the phase depends on $v_{shell}$ rather than on $v_0$, The $v$ integral in Eq. (6) can be expressed as a sum of a classical term and the leading semiclassical correction. Denoting this integral as The integral on the left, $I_C$, is a delta function $\delta(\omega - \omega')$ and yields Hawking's classical result of a diagonal density matrix. The expectation value of the integral on the right, $\Delta I_{SC}(C_{BH})$, leads to the off-diagonal elements.
The expectation value of interest then goes as which was evaluated in [15] to leading order in $C_{BH}$, Substituting the full expressions for the Bogolubov coefficients [2] into Eq. (4), we can write the semiclassical correction to Hawking's matrix as where $t_\omega$ is the transmission coefficient through the gravitational barrier.
These remaining integrals can be computed analytically with some amount of effort. One caveat is a logarithmic divergence on the line $\omega = \omega'$. We have handled this by isolating the divergent piece and then recognizing that this is just a small (order C which we will subsequently denote as $C_{BH}^{1/2}\Delta\rho_{OD}$ (for off-diagonal). Recall that Hawking's classical matrix with dimensionless frequencies is given by We will assume that the semiclassical matrix has been renormalized to give
The next step in [15] was constructing a multi-particle density matrix for $N$ identical, independent particles. This, in effect, amounts to the assumption that all the particles are coherent, so that the timing of their emissions does not affect the correlations among them. This will be corrected later.
This multi-particle matrix consists of $N \times N$ blocks, $\rho_{IJ}(\omega, \omega)$ with $I, J = 1, \ldots, N$, and any of the $N^2$ blocks is a matrix of the same dimensionality as the single-particle density matrix. Each diagonal entry is the single-particle Hawking matrix $\rho^{(N)}_{II} = \rho_H(\omega, \omega)$ (plus subdominant semiclassical corrections) and each off-diagonal element contains the semiclassical part $\rho^{(N)}_{I \neq J} = C_{BH}^{1/2}\Delta\rho_{OD}$. Each block is multiplied by a phase $e^{\Theta_{IJ}}$ ($\Theta_{IJ} = -\Theta_{JI}$), but these are of no consequence to our discussion. The normalized $N$-particle density matrix can then be expressed as where the symbol $\not{I}_{N \times N}$ denotes an $N \times N$ matrix of ones off the diagonal (up to the implied phases) and zeros on it. This matrix can be used to track the information flowing out of the BH.

Entropy and information
The von Neumann entropy per particle $S_N = -\mathrm{Tr}\,\rho^{(N)} \ln \rho^{(N)}$ of the radiation$^1$ can be calculated perturbatively in the small parameter $C_{BH}$. This calculation yields, to leading order, Here $S_H$ is the thermal entropy or, equivalently, the von Neumann entropy for the Hawking diagonal matrix, and $K$ is a positive numerical factor of order one.
From Eq. (15), it is possible to deduce that the parameter controlling the semiclassical corrections is $N C_{BH}$ rather than $C_{BH}$. This outcome is a consequence of having roughly $N$ times more off-diagonal elements than diagonal ones, so that, when $N C_{BH} = 1$, the semiclassical corrections become large and one expects a significant change.
$^1$ Alternatively, one can symmetrize the particles and use the normalization $1/N!$.
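The counting argument behind the effective parameter $N C_{BH}$ can be made concrete with a toy model. The sketch below replaces every block of the multi-particle matrix by a single number: $1/N$ on the diagonal and $\epsilon/N$ off the diagonal, with all phases ignored. Identifying $\epsilon^2$ with $C_{BH}$ is an illustrative assumption of ours (motivated by the off-diagonal blocks being of order $C_{BH}^{1/2}$), and the function names are hypothetical.

```python
import math


def toy_purity(n, eps):
    """Tr(rho^2) for the toy matrix rho = (1/n) * (identity + eps * off-diagonal ones)."""
    rho = [[(1.0 if i == j else eps) / n for j in range(n)] for i in range(n)]
    # Tr(rho^2) = sum over i, j of rho[i][j] * rho[j][i]
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n))


def relative_correction(n, c_bh):
    """Off-diagonal contribution to Tr(rho^2) relative to the diagonal value 1/n."""
    eps = math.sqrt(c_bh)     # assumption: off-diagonal entries scale as sqrt(C_BH)
    diagonal_only = 1.0 / n   # purity of the purely diagonal (Hawking-like) toy matrix
    return (toy_purity(n, eps) - diagonal_only) / diagonal_only


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, relative_correction(n, 1.0e-3))
```

In this toy model the relative correction equals $(N-1)\,C_{BH}$, because there are $N(N-1)$ off-diagonal entries against $N$ diagonal ones. The correction therefore becomes of order one when $N C_{BH} \simeq 1$, even though each individual off-diagonal entry stays small.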
In [15], the back-reaction of the Hawking particles on the geometry was incorporated in the following (incomplete) way: It was assumed that the BH radiates thermally as a black body, which is clearly a good approximation during most of the lifetime of the BH. We further assumed that the radiated particles carry an energy equal to $T_H$, with the Hawking temperature $T_H$ also taken to be time dependent. Then $dN = dM\,\frac{dN}{dM} = -\frac{dM}{T_H(t)}$, which can be integrated to give $N(t) = S_{BH}(0) - S_{BH}(t)$ and therefore, because Also, since As $N(t)$ and $C_{BH}(t)$ are both monotonically increasing functions of time, their product is growing and will eventually reach and then surpass unity.
Indeed, the transition out of the perturbative regime takes place at the Page time [7], when the BH has lost half of its initial entropy to radiation. This finding appears to substantiate Page's claim that this moment represents a tipping point in the evaporation process.
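The statement that $N C_{BH}$ reaches unity at the Page time can be checked numerically. The sketch below integrates $dN = -dM/T_H$ in Planck units ($\hbar = G = 1$), assuming the standard expressions $T_H = 1/(8\pi M)$ and $S_{BH} = 4\pi M^2$ for a Schwarzschild BH; the function names and the midpoint integration scheme are our own choices.

```python
import math


def hawking_temperature(mass):
    """Hawking temperature T_H = 1 / (8 pi M) in Planck units (assumed standard form)."""
    return 1.0 / (8.0 * math.pi * mass)


def entropy(mass):
    """Bekenstein-Hawking entropy S_BH = 4 pi M^2 in Planck units."""
    return 4.0 * math.pi * mass ** 2


def emitted_particles(initial_mass, mass, steps=10000):
    """Midpoint-rule integral of dM / T_H from `mass` up to `initial_mass`."""
    dm = (initial_mass - mass) / steps
    total = 0.0
    for k in range(steps):
        midpoint = mass + (k + 0.5) * dm
        total += dm / hawking_temperature(midpoint)
    return total


if __name__ == "__main__":
    m0 = 100.0
    m_page = m0 / math.sqrt(2.0)   # mass at which half of the initial entropy is gone
    n = emitted_particles(m0, m_page)
    print(n, entropy(m0) - entropy(m_page))   # N equals the entropy lost so far
    print(n / entropy(m_page))                # N * C_BH at the Page time
```

Under these assumptions the number of emitted particles equals the entropy lost, $N = S_{BH}(0) - S_{BH}(t)$, so that $N C_{BH} = N/S_{BH}(t)$ is equal to one exactly when the BH has lost half of its initial entropy.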
We have also looked in [15] at the rate of information flow. The information contained in the radiation is defined in the standard way, where Eq. (18) has been applied (both here and below).
It follows that and so $dI/dS_H$ is small (order $C_{BH}$) before the Page time and of order unity at it. But, at later times, the previous calculation formally breaks down.
We can use Eqs. (19) and (20) to calculate $t_{1bit}$ and $t_{trans}$. Recall that $t_{1bit}$ is defined to be the time when the first bit of information comes out of the BH. And so, using Eq. (19), we see that this happens when $N \simeq \sqrt{S_{BH}(0)}$, which is the same as the coherence time. On the other hand, the transparency time $t_{trans}$ occurs when $dI/dS_H = 1$. From Eq. (20), this transpires when $N C_{BH} \simeq 1$, which is indeed the Page time.
Another clue is found by looking at the purity of the density matrix, The smallness of this ratio implies that the density matrix is still close to thermal, even at the Page time. However, the Page time appears to be the moment when deviations from thermality are starting to become significant, just as Page had asserted.
Inspecting the purity, one can see that the radiation is already close to pure when $C_{BH} \lesssim 1$. Unlike the previous calculation of the information rate, which entailed expanding out a logarithm, Eq. (21) is reliable also for values of $N C_{BH} \gg 1$ provided that $C_{BH} < 1$. Hence, we can conclude that the radiation does parametrically purify.
3 Time-dependent radiation density matrix

A model of time-dependent emission
We will now provide a more accurate account of the back-reaction of the emitted particles. Let us begin by assigning a time-dependent wavefunction to the shell. Then both the mean position of the shell and its width could, in principle, become time dependent. Specifically, $R_S$ and $C_{BH}$ are now both functions of time, However, in this particular case, the width What is required is the wavefunction in terms of $v$. For this, we use Here, the time dependence of $v_{shell}(t)$ is classical and due only to the classical The resulting wavefunction is For this parametrization, the width is time dependent.
It is more convenient to use the number of emitted particles $N$ as our time coordinate rather than $t$ or $v$. The Stefan-Boltzmann law for blackbody emission leads to is the BH lifetime. We will use $N_T$ to denote the total number of particles emitted by a certain time, so that the multi-particle matrix is now an $N_T \times N_T$ block matrix. The time of emission of specific particles will be denoted by
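The conversion between Schwarzschild time and the particle-number clock can be sketched as follows. The code assumes the standard blackbody evaporation law $dM/dt \propto -1/M^2$ (greybody factors ignored), which gives $S_{BH}(t) = S_{BH}(0)\,(1 - t/\tau_{BH})^{2/3}$, together with $N = S_{BH}(0) - S_{BH}(t)$. These closed forms are assumptions made for this illustration, and the function names are hypothetical.

```python
import math


def particles_emitted(t_over_tau, s0):
    """Number of particles emitted by the fraction t/tau of the BH lifetime.

    Assumes S_BH(t) = s0 * (1 - t/tau)^(2/3) and N = s0 - S_BH(t).
    """
    return s0 * (1.0 - (1.0 - t_over_tau) ** (2.0 / 3.0))


def time_of_emission(n, s0):
    """Inverse map: fraction t/tau of the lifetime at which the n-th particle is emitted."""
    return 1.0 - (1.0 - n / s0) ** 1.5


if __name__ == "__main__":
    s0 = 1.0e6
    for fraction in (0.1, 0.5, 0.9, 0.99):
        print(fraction, particles_emitted(fraction, s0))
    # Page time: half of the initial entropy has been radiated away.
    print(time_of_emission(0.5 * s0, s0))
```

With these assumptions the Page time, defined by $N = S_{BH}(0)/2$, falls at $t/\tau_{BH} = 1 - 2^{-3/2} \approx 0.65$; the particle clock runs slowly at first and speeds up toward the end of evaporation.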

The time-dependent density matrix
We can further improve on the previous results by taking into consideration that the time of emission differs for the different Hawking particles. In particular, the phase discontinuities associated with the logarithm in Eq. (5) depend on these emission times. This effect is not relevant to the classical Hawking calculation but could be relevant to the phases of the semiclassical $\beta$ coefficients; cf. Eq. (7). This is because the shell-crossing time $v_{shell}(t)$ is different for different modes due to the shell continually depleting its mass; cf. Eq. (23). Now suppose that a given particle is emitted at "time" $N'$ and another at $N''$. Then the density matrix of Eq. (6) should be replaced with The density matrix depends on the three times $N'$, $N''$ and $N_T$. The width of the wavefunction at $N_T$ controls the fluctuations in $v_{shell}$ and is a property of the BH, while $N'$ and $N''$ are the emission times of the specific particles and are intrinsic to the quantum matter fields.
The $N'$, $N''$ dependence enters only through the $\beta$'s, The wavefunction, on the other hand, depends on $N_T$, and so the density matrix depends on additional phases that are missed when it is evaluated at a common time as in [15]. In particular, The expectation value in the last line of Eq. (27) is the same as that of the time-independent situation, so that the difference between the treatments is in the additional phase factors in the second line. These phases can be re-expressed as The details of the calculation leading to the phase factor (28) and the rest of the evaluation of $\rho_{SC}$ are relegated to the Appendix. The final time-dependent result is rather simple: an additional "suppression" factor multiplying the time-independent matrix of Eq. (12), The suppression factor is given by The expression in Eq. (29) for the semiclassical correction to the Hawking density matrix is limited in its validity to the region of parameter space when

The coherence time
The semiclassical correction to the density matrix in Eq. (29) now contains an extra suppression factor. The contribution from a particle emitted at time and so a new time scale appears, We have identified $N_{coh}(N_T; N)$ as the coherence scale for particle emissions. In Schwarzschild time, this new scale is parametrically equal to the coherence time For instance, at the Page time, t coh (t P age ; t P age ) = 640 π 2 R 2 S (0) lp . The coherence scale is also the time that it takes the BH to emit order of $\sqrt{S_{BH}}$ particles. Consider that where the dots stand for terms that are higher order in $N_T/S_{BH}(0)$. The point being that, as long as $N$ is close to $N_T$, the corrections are subleading and This new timescale $N_{coh}$ (or $t_{coh}$) is the central result of our paper. We use it in an extensive way in the following analysis and the rest of our results depend crucially on its existence. The appearance of $t_{coh}$ in our formalism is quite natural for the following reason: Let us consider the time over which the wavefunction $\Psi_{shell}$ changes significantly. An inspection of Eq. (1) indicates that this happens when the Schwarzschild radius shrinks by an amount Hence, the coherence time means the interval over which the overlap of the wavefunction at different times becomes small. The fact that $N_{coh} \ll S_{BH}$ ($t_{coh} \ll \tau_{BH}$) can be attributed to the width of the wavefunction being much smaller than the Schwarzschild radius or, equivalently, to the BH being semiclassical, $C_{BH} \ll 1$.

A simplified qualitative description of BH evaporation
Let us start at time $N_T = 0$ and follow the evaporation for one interval of the coherence time, $N^{(0)} \equiv N_{coh}(0; 0) = \sqrt{S_{BH}(0)}$. This will define a block of size $\sqrt{S_{BH}(0)} \times \sqrt{S_{BH}(0)}$ in the multi-particle matrix. We then start the clock over at time $N_T = N^{(0)}$ and pretend that this is a newly born BH of smaller size (the original BH minus the first block). We again follow the evaporation for a time set by the coherence scale, but with the scale now determined by this smaller-sized BH. That is, $N^{(1)} \equiv N_{coh}(N^{(0)}; N^{(0)}) = \sqrt{S_{BH}(N^{(0)})}$. Then, by continually repeating this process, we can parse the matrix into about $N_T/\sqrt{S_{BH}(0)}$ square blocks that are roughly of size $\sqrt{S_{BH}(0)} \times \sqrt{S_{BH}(0)}$ (although each additional block is slightly smaller than the previous one).
The difference between Hawking's original description of BH evaporation and ours is that, for Hawking's picture, there are N T blocks of size 1, as each emitted particle is independent of all the other particles. Conversely, for our previous time-independent treatment, there is a single block of size N T × N T .
To a good approximation, each block can be viewed as the evolution of a newly born BH for its coherent time scale, as the regions of the density matrix external to any block are highly suppressed. Then, as long as we proceed one block at a time, the suppression factor can be ignored. All particles in the same block are approximately coherent and indistinguishable. Hence, the results of our time-independent treatment can be applied.
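The block construction described above is easy to simulate. The sketch below repeatedly carves off one coherence-sized block, using the estimate $N_{coh} \simeq \sqrt{S_{BH}}$ evaluated at the birth of each block. It deliberately ignores the late-time modification of the coherence scale, so it is only indicative of the counting during most of the BH lifetime; the function name and the stopping rule are our assumptions.

```python
import math


def parse_into_blocks(s0):
    """Return the sizes of successive coherence blocks for initial entropy s0.

    Assumption: each block contains N_coh = sqrt(S_BH) particles, with S_BH
    the entropy remaining when the block is born. The loop stops once the
    remaining entropy is of order one.
    """
    sizes = []
    remaining = float(s0)
    while remaining > 1.0:
        size = math.sqrt(remaining)
        sizes.append(size)
        remaining -= size   # the BH loses one unit of entropy per emitted particle
    return sizes


if __name__ == "__main__":
    s0 = 1.0e6
    blocks = parse_into_blocks(s0)
    print("first block size:", blocks[0])
    print("number of blocks:", len(blocks), "vs sqrt(S0) =", math.sqrt(s0))
```

The first block has size $\sqrt{S_{BH}(0)}$, each later block is slightly smaller, and the total number of blocks comes out as a number of order $\sqrt{S_{BH}(0)}$, in line with the parametric counting used in the text.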
Let us now consider the effective expansion parameter for a block that is "born" at a time $N_T$ which is not too late in the evaporation process. By construction, the total particle number of the block is about the same as the number of correlated particles at this time, $N_{coh}(N_T; N)$ (where $N$ means a time close to $N_T$). The classicality parameter is approximately $C_{BH}(N_T)$, except near the end of the evaporation. The effective expansion parameter is then the product $N_{coh}(N_T; N)\,C_{BH}(N_T) \simeq \sqrt{C_{BH}(N_T)}$, which is obviously less than one until the BH reaches the Planck scale. To compare, the effective expansion parameter for the time-independent treatment is $N_T C_{BH}(N_T)$, which is much larger than that of the block picture.
Let us next determine the "time of last block emission" $N_*$. This is the time when the number of particles remaining to be emitted, $S_{BH}(0) - N_*$, is equal to the coherence time, where $N$ is again a time close to $N_*$. For future reference, since C BH (N * ) = To see this, let us use Eq. (32) to rewrite Eq. (35) as from which it follows that We now use another approximation which will turn out to be the correct way to estimate the emission time of the last block. Expanding the right-hand side of Eq. (38), we have Then, since which is satisfied at the transparency $t_{trans}$ (as will be made explicit in Sub- and semiclassical, although its Schwarzschild radius is parametrically smaller than the initial radius $R_S(0)$ and the size of the remaining block is parametrically larger than $\sqrt{S_{BH}(0)}$.
Since the formalism of Section 2 can be applied to the block picture before it breaks down, we can estimate some associated quantities. For instance, the von Neumann entropy of a block that is born at $N_T$ is (cf. Eq. (15)) where we have also used that the thermal entropy of a block is approximately the same as its particle number.
More interesting is the information output per block. According to Eq. (20) and the above observations, the rate is meaning that, over the "lifetime" of the block, That is, each block emits about one bit of information. This can also be seen directly from Eq. (41).
As there are roughly $\sqrt{S_{BH}(0)}$ blocks in total, the implication of the above simplified picture is that only $\Delta I_{BH} \simeq \sqrt{S_{BH}(0)}$ ever gets released.
However, this is incorrect because, even besides the breakdown at $t_{trans}$, the independent block picture is not perfectly accurate. The blocks overlap, and correlations get built and accumulate. To pick up the information that comes out, one has to monitor the BH continuously; otherwise, the information gets lost after each coherence time.

Time dependence of information release
This section will focus on how the suppression factor and coherence scale impact upon the purification of the density matrix and the transfer of information.

Time-dependence of the purity
The purity of the density matrix is $P(\rho) = \mathrm{Tr}\,\rho^2$, which will be calculated next. This result will be the initial step towards distinguishing the different phases of information release.
We first re-express the multi-particle density matrix of Eq. (14) but with the suppression factor now included. In terms of the variables $N'$, $N''$, each ranging from 0 to $N_T$ and with frequency labels suppressed, this is As the matrix has already been normalized to yield unit trace, we need only calculate $\mathrm{Tr}\,(\rho^{(N_T)})^2$. Moreover, we do not have to consider the diagonal contributions because these will contribute at order $N_T^{-1}$ and do not "mix" with off-diagonal terms as far as this trace is concerned (see [15]).
Hence, for current purposes, we can consider a simplified matrix for the off-diagonal correction, where the exponents in Eq. (30) for $D$ are now expressed in terms of $N_{coh}$.
Since $N_T$ is large, we can treat the discrete arguments of the density matrix as continuous. Now consider that where $I$ is given by Recalling that the suppression factors restrict $N'$, $N''$ to take on values close to $N_T$ and that $C_{BH}(N)$ is a slowly varying function except at late times, we can make the approximation $C_{BH}(N') \simeq C_{BH}(N'') \simeq C_{BH}(N_T)$ and then evaluate the Gaussian integrals. For instance, where $N_T \gg 1$ has also been used to treat the upper boundary of the $x$ integral as infinite.
In this way, one ends up with Then, reinserting the other factors from $(\rho_{OD}^{(N_T)})^2$ and dropping the subleading term, we arrive at As the purity of the Hawking matrix is $1/N_T$, the purity of $\rho_{OD}$ is smaller by a factor of $N_{coh} C_{BH} \simeq C_{BH}^{1/2} \ll 1$. It appears that the purity of the off-diagonal correction only catches up to the small purity of the Hawking matrix when $C_{BH}(N_T) \simeq 1$, implying that there is no chance for purification. However, it will be shown later on that such a conclusion is premature.

The rate of information transfer
It is interesting to compare the preceding calculation for $\mathrm{Tr}\,(\rho_{OD}^{(N_T)})^2$ with that of our earlier study [15]. We previously obtained $\mathrm{Tr}\,(\rho_{OD}^{(N_T)})^2 \sim C_{BH}(N_T)$, so that the modified result in Eq. (50) is smaller by a factor of order $C_{BH}^{1/2}$. This estimate can be substantiated as follows: $N_T$ and $S_{BH} = C_{BH}^{-1}$ are parametrically equal for a "typical BH", meaning for times $t_{1bit} < t < t_{trans}$. In which case, the time-dependent model effec- Much in the same way, our previous time-independent estimates for the rate of information transfer can be corrected for time dependency by replacing where appropriate. Here and below, $N$ means a time close enough to $N_T$ for insignificant suppression.
For instance, consider the estimate for the information I in Eq. (19). It should now be modified, with a numerical factor modifying K to K.
We can use Eq. (51) to determine when the first bit of information comes out of the BH. For such early times, so the first bit of information comes out when . This happens at the coherence time. Of course, we already knew this, since each coherence-sized block contains one bit of information; cf. Eq. (43). Hence, $t_{1bit} = t_{coh}$, the same as for the previous time-independent treatment.
It will be shown below (also see Eq. (36)) that the transparency time occurs when $N_{coh}(N_T; N)\,C_{BH}(N_T) \simeq 1$. This and Eq. (51) tell us that the amount of information released by this time is The value of $S_H$ by that time is parametrically equal to the total entropy of the BH, $S_H(t_{trans}) \simeq S_{BH}(0)$. So, parametrically, all the BH information is released by $t_{trans}$.
Another useful approximation for the information I that is valid up to the transparency time is the following: BH . The latter can be ignored to leading order, and so from which it is evident that the information transfer rate is initially small but becomes order unity at the late stages of evaporation.
We have defined the transparency time as the moment at which the rate of information transfer is unity, and so $t_{\rm trans}$ is the time at which $N_{\rm coh}(N_T; N)\, C_{\rm BH}(N_T) \simeq 1$, as already stated. Let us recall that $t_{\rm trans}$ has replaced the Page time in this respect.
We again see that there is nothing particularly special about the original Page time in our updated framework.
We now want to verify that the transparency time occurs at one coherence time before the end of evaporation. By this time, the BH still has an entropy of $S_{\rm BH}(t_{\rm trans}) = C_{\rm BH}^{-1}(t_{\rm trans}) \simeq S$ with the first equality following from $\frac{\partial N}{\partial R_S} = -\frac{2\pi R_S}{G}$. Integrating, we then have $[\Delta R_S]_{\rm trans} \simeq l_p S^{1/3}$. The results of this section are summarized in Figure 1, showing the rate of information release. Let us substantiate our arguments by looking at the relevant integral, which is that of Eq. (47) with $C_{\rm BH}(N_T) \lesssim 1$ meant as a number of order unity but still smaller than 1,

Qualitative discussion of the final purification
The magnitude of any of the exponents is at most of order unity. For instance, setting N ′ = 1 (i.e., the initial emission of radiation), one finds that the first exponent goes as 1 Clearly, for N ′ , N ′′ > 1, similar estimates are also valid. Hence, the Gaussians are turning into Heaviside functions and, parametri- Then, using the estimate I ∼ 4(N T ) 2 in Eq. (46), we have from which P (ρ (N T ) ) ∼ 1 follows. The interpretation is that the density matrix does appear to have purified towards the end of the evaporation. We expect to provide a more rigorous analysis of the late-time purification at a later time [34]. Another surprise is the apparent suddenness of the purification. After all, the "action" only seems to begin at t trans , which is but one coherence time before the end. This is, to some extent, an artifact of the choice of time parameter; the evolution of the BH is more gradual when described in terms of the monotonically increasing classicality parameter C BH (N T ).
As this parameter measures the degree of classicality of the geometry, one could argue that it is also the most natural choice of clock for the current framework.

Early-Late Entanglement
Let us now address the entanglement between early and late-time radiation, both for a "typical" BH and for an "old" one. The results should be relevant for an eventual resolution of the firewall paradox [16], as this puzzle is often posed as a conflict as to which subsystem the late radiation is entangled with and how strongly. Here, we will calculate the time dependence of the early-late entanglement but defer addressing the implications to the firewall paradox until a later article [33].
We will now use the multi-particle density matrix in explicit Dirac notation, where the suppression factor D(N T ; N ′ , N ′′ ) is defined in Eq. (30).

Entanglement for t < t trans
Let us first discuss the case that the BH is typical, namely, for times $t_{\rm 1bit} < t < t_{\rm trans}$. First, we have to choose a reference time $N_{\rm cut}$ and factor the Hilbert space $|N\rangle$ into the states of "early emissions" $|N_E\rangle$, for which $N_E \le N_{\rm cut}$, and "late emissions" $|N_L\rangle$, for which $N_L \ge N_{\rm cut}$. In our framework, the natural choice of "cutoff" is one coherence time prior to $N_T$; that is, $N_{\rm cut} = N_T - N_{\rm coh}$. At the end of evaporation, $N_{\rm cut}$ is the transparency time.
The density matrix is then expressed on the product space $|N_E\rangle \otimes |N_L\rangle$. It is given by where $N_{\rm prod} = N_{\rm cut}(N_T - N_{\rm cut})$. The products $\rho_H \otimes \rho_H$, $\Delta\rho_{\rm OD} \otimes \Delta\rho_{\rm OD}$ should be regarded as shorthand notation for for the first term inside the curly brackets. It is the second term within the curly brackets that stores the entanglement between early and late radiation. One can already see the source of entanglement: the density matrix does not factor into $\rho_E \otimes \rho_L$; rather, there are correlations.
Let us now trace over the late radiation to obtain the reduced density matrix for the early radiation ρ E . The trace over the late radiation of the Hawking term is calculated in a straightforward way and that of the first term within the curly brackets of Eq. (62) trivially vanishes. The only relevant term in Eq. (62) is therefore the second term in the curly brackets. To evaluate it, we need to perform the following integral: where b is either 0, 1 or 2 depending on which of the four different products of exponents in expression (63) is being considered.
It can be seen that, for a BH of typical age, the width of the Gaussian Applying this estimate of $J_b$, we then obtain a reduced density matrix of the form Here, ∆ρ 2 . Unlike $\Delta\rho_{\rm OD}$, which is purely off-diagonal, $\Delta\rho^2_{\rm OD}$ does have a diagonal component. Let us now consider times $t_{\rm 1bit} < t < t_{\rm trans}$. In this case, the Gaussian-suppressed terms are subdominant, leaving The Gaussian suppression has disappeared and has been replaced by a factor of $C_{\rm BH}$ on the correction term. The von Neumann entropy per particle is given by (see Footnote 1), In fact, to leading order, we need only consider contributions from the diagonal elements of $\rho_E(N_T; N'_E, N''_E)$. This is because the large number of off-diagonal elements, a factor of $\sim N_{\rm cut}$ more of these than diagonal ones, enters only at quadratic order, leading to the additional suppression $N_{\rm cut} C_{\rm BH}^2 \ll C_{\rm BH}$. Now, if one uses the standard definition of entanglement for pure states and applies it to the Hawking density matrix, it comes out as entangled.
We know that the Hawking part of the matrix is thermal because of the tracing over the negative energy in-modes, and so it does not represent any entanglement between late and early radiation. Formally, one has to use an appropriate definition for mixed-state entanglement such as the "positive partial transpose" criterion [35,36]. Rather than use this sophisticated criterion, we will calculate the entanglement entropy and subtract from it the contribution from the Hawking density matrix.
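For orientation, the positive partial transpose criterion mentioned above can be illustrated on a two-qubit toy system. This is only a sketch of the general criterion, not of the calculation performed here: the states, dimensions, and function names are illustrative assumptions. A negative eigenvalue of the partially transposed matrix signals entanglement, whereas a maximally mixed (thermal-like) state passes the test.

```python
import numpy as np


def partial_transpose_b(rho: np.ndarray, dim_a: int, dim_b: int) -> np.ndarray:
    """Transpose only the second (B) subsystem of a bipartite density matrix."""
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)  # indices (i, j; k, l)
    return r.transpose(0, 3, 2, 1).reshape(dim_a * dim_b, dim_a * dim_b)


def min_pt_eigenvalue(rho: np.ndarray, dim_a: int, dim_b: int) -> float:
    """Smallest eigenvalue of the partial transpose; negative means entangled."""
    return float(np.linalg.eigvalsh(partial_transpose_b(rho, dim_a, dim_b)).min())


# Bell state (|00> + |11>)/sqrt(2): a pure, maximally entangled toy example.
bell = np.zeros(4)
bell[0] = bell[3] = 1.0 / np.sqrt(2.0)
rho_bell = np.outer(bell, bell)

# Maximally mixed two-qubit state: diagonal, with no entanglement.
rho_mixed = np.eye(4) / 4.0

print(min_pt_eigenvalue(rho_bell, 2, 2))   # negative
print(min_pt_eigenvalue(rho_mixed, 2, 2))  # positive
```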
We proceed by expanding the logarithm of the density matrix of Eq. (66) in the von Neumann formula in Eq. (67) to linear order in C BH (N T ). Only the diagonal elements of ρ contribute to this order, as just explained. We then subtract from the answer the zeroth order result coming from the Hawking matrix. We also drop a factor of ln N T /2 that is due to the resolution of Gibbs' paradox for indistinguishable particles. The final result of this procedure is then For a typical BH, N cut ≃ N T ≃ C −1 BH (N T ) . We can conclude that the entanglement entropy is of order unity, S ent ∼ 1 .
Let us next consider the entanglement entropy at the transparency time, T . This is significant compared to earlier times but well short of that expected at the purification scale, S ent ≃ N T . Hence, the time scale for maximal entanglement must still be later than the transparency time.

Qualitative discussion of the entanglement entropy for t > t trans
Our expectation is that purification is indeed attained at the last phase of evaporation t > t trans . Previously, we presented a qualitative argument based on the purity of the radiation density matrix. Here, we will discuss in a qualitative way the entanglement entropy at late times when C BH (N T ) approaches unity. We hope to be able to present a more precise analysis in the future [34].
Let us begin by revising the form of the reduced density matrix. We In this case, the "correction" term in ρ E is the dominant one. The Hawking contribution is "only" diagonal whereas the correction uniformly fills up the entire matrix.
Let us now recall that a uniform M × M matrix filled with (say) c's can be diagonalized to yield a single non-vanishing eigenvalue, λ = cM . In this way, the correction part of the matrix can be reduced to a diagonal matrix with a single non-zero entry, λ = C BH (N T )Tr [∆ρ 2 OD ] . Once the Hawking contribution is discarded, the entanglement entropy can be calculated in terms of the eigenvalue λ given above, That is, a late-time entanglement of order N cut ≃ N T ≃ S BH (0) as expected.
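The linear-algebra fact used above, that a uniform $M \times M$ matrix with entries $c$ has a single non-vanishing eigenvalue $\lambda = cM$, is easy to confirm numerically. The values of `m` and `c` below are arbitrary illustrative choices.

```python
import numpy as np


def uniform_matrix_eigenvalues(m: int, c: float) -> np.ndarray:
    """Eigenvalues (ascending) of the m x m matrix whose entries all equal c."""
    mat = np.full((m, m), float(c))
    return np.linalg.eigvalsh(mat)


m, c = 50, 0.02
eigs = uniform_matrix_eigenvalues(m, c)
print(eigs[-1], c * m)          # largest eigenvalue vs. c * M
print(np.abs(eigs[:-1]).max())  # remaining eigenvalues vanish up to rounding
```

The matrix has rank one, which is why all of its weight collapses onto a single eigenvector (the uniform superposition).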
An order N T entanglement indicates a pure state while a small entanglement is an indication of a product state. Hence, the radiation does (parametrically) purify, but only in the final stages of the BH evaporation.
It is worth re-emphasizing that this conclusion should only be viewed as a qualitative one. We have greatly simplified matters by treating the Gaussian suppression factors as theta functions in the late-time limit. In reality, these late-time Gaussian factors are not exactly uniform. Nevertheless, the matrix is close enough to uniform to suggest that our qualitative results will survive a more accurate treatment.
Let us further emphasize that the changing coherence scale is the physical mechanism which enables the entanglement between late and early modes to We have also identified a clear physical reason for the appearance of the coherence time: The wavefunction of the BH at one time is nearly orthogonal to the wavefunction at another when the time separation is t coh , causing the emissions of particles that are separated by more than t coh to become incoherent.
That some of the particle emissions are coherent is what allows for a unitary process of evaporation. In this way, the wavefunction is serving as the conduit for total information flow from the burning matter system to the final state of external radiation. This conclusion was substantiated by three calculations: First, the trace of the square of the radiation density matrix becomes larger at late times and parametrically approaches unity, Tr(ρ 2 ) ∼ 1 . Second, the total information released by the BH is of the same order as its total information content, I ∼ S BH (0) . Third, the late-time entanglement entropy between the early and late-emitted radiation is also of this order, S ent ∼ S BH (0) .
Qualitative arguments show that the number of coherent particles begins to grow rapidly one coherence time before the end of evaporation and spans the entirety of the emitted particles by the very end. This growth is surprising and deserves a more precise treatment. Evidently, the key to this mechanism is the wavefunction of the collapsing shell and the existence of several different time scales. This wavefunction provides a Gaussian width for the time lapse between particle emissions that depends on these particular emission times as well as the time scale in the evolution of the BH. The former scales are fixed by the geometry at the times of emission, whereas the latter is changing as the back-reaction from the particles effectively shrinks the shell.
We have identified the time-of-first-bit t 1bit as occurring at a time t coh after the emission of the first Hawking particle. On the other hand, the transparency time t trans , which is the moment when the rate of information flow reaches order unity, occurs at a time t coh before the end of evaporation.
Finally, the purification time only occurs after t trans when the BH is still large but parametrically approaching Planckian dimensions. The Page time [7], which is attributed to the time of transparency in the Page model, no longer has any specific meaning in our framework. It has been split into two different time scales t 1bit and t trans . We expect that this distinction could be essential to resolving the recently posed firewall paradox [16]. For instance, let us suppose that a firewall is symptomatic of a transparent BH, as is often implied to be the case. Then our revised picture would delay the need for a firewall from the Page time to a parametrically smaller interval before the end of evaporation. This is a matter that we hope to report on in the near future [33].
The emerging picture of the phases of information release during BH evaporation is then the following: The first bit of information comes out from the BH after one coherence time. Then the information continues to come out of the BH at a nearly constant rate of 1 bit per coherence time until the transparency time is approached. By this time, the rate of information release becomes unity. The amount of information released by the transparency time is already of the order of the total entropy of the BH. After t trans , our description of the BH radiation is only qualitative. But, based on the scaling of the quantities that could be calculated, we have argued that the radiation purifies quickly when the BH evaporation nears its final stages.

Acknowledgments
We thank Sunny Itzhaki for discussions.
We will sometimes use a different choice of variables and change from N ′ , , so that We wish to express the phase factors in Eq. (71) in terms of the particle number. For this, we define We will assume that the differences $\Delta N'$ and $\Delta N''$ are small (in a sense made explicit below) and expand the phase factors accordingly. Our premise is that the expectation value on the left-hand side of Eq. (27) rapidly approaches zero for large enough $\Delta N'$ and $\Delta N''$. This assumption will be justified by its self-consistency.
We will expand the phases by using a suitably modified version of Eq. (23), . The point here is that the product of the dimensionless frequency and dimensionless advanced time does not depend on the time-dependent scale R S (t) that is used to make both dimensionless. Then, as our purpose is to expand out the entire phases and not just the v's, the canceled-out factors of R S (t) should not be included in the expansions.
The partial derivative $\partial R_S/\partial N$ is also required and can be evaluated using the fact that $N(t) = S_{\rm BH}(0) - S_{\rm BH}(t) = \text{const.} - \pi (R_S(t))^2/G$, which gives us where the $\cdots$ denote higher orders in $C_{\rm BH}$. The second expansion is well defined provided that $C_{\rm BH}(N')\,\Delta N' \lesssim 1$ (and similarly for the other one), which is equivalent to This is on the order of the total number of Hawking particles that will be emitted during the whole period from $N'$ to the end of the lifetime of the BH, and so this restriction is a weak one. We can conclude that the first-order term in the expansions is a good approximation to the exact value until $N_T$ becomes of the order of $S_{\rm BH}(0)$ and, then, is valid if the differences $\Delta N'$, $\Delta N''$ are smaller than $S_{\rm BH}(0)$.
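For reference, the derivative and the first-order expansion of the horizon radius can be written out from the stated relation for $N(t)$. This is a sketch that assumes $\hbar = 1$ and $C_{\rm BH} = S_{\rm BH}^{-1} = G/(\pi R_S^2)$; it is not meant to reproduce the numbered equations of the original.

```latex
\begin{align}
N &= S_{\rm BH}(0) - \frac{\pi R_S^2}{G}
  \quad\Longrightarrow\quad
  \frac{\partial N}{\partial R_S} = -\frac{2\pi R_S}{G}\,, \\
\frac{\partial R_S}{\partial N} &= -\frac{G}{2\pi R_S}
  = -\frac{R_S}{2}\, C_{\rm BH}\,, \\
R_S(N + \Delta N) &\simeq R_S(N)\left[\,1 - \tfrac{1}{2}\, C_{\rm BH}(N)\, \Delta N + \cdots \right].
\end{align}
```

The expansion parameter is the product $C_{\rm BH}(N)\,\Delta N$, which is why the first-order term is reliable as long as $\Delta N$ stays below $S_{\rm BH}(N) = C_{\rm BH}^{-1}(N)$.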
Evaluating the expectation value of Eq. (27) in this way, we obtain a modified form for the quantity ∆I SC (C BH (N T )) that appears in Eq.
We then need to evaluate the integral, Following [15], we change variables to $Y = \omega' - \omega''$ and $Z = (\omega' + \omega'')/Y$. Let us first consider one of the $Z$ integrals (the top one), So that anything besides the leading-order result is suppressed by powers of $\delta N$, as well as by powers of $Z^{-1}$ which make the integral less singular.
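The measure associated with the change of variables to $Y$ and $Z$ follows from inverting the definitions. The short derivation below is a sketch of that Jacobian only; it does not attempt to reconstruct the integrand or the limits of integration.

```latex
\begin{align}
\omega' &= \tfrac{1}{2}\, Y (Z + 1)\,, \qquad
\omega'' = \tfrac{1}{2}\, Y (Z - 1)\,, \\
\frac{\partial(\omega', \omega'')}{\partial(Y, Z)} &=
\begin{vmatrix}
  \tfrac{1}{2}(Z + 1) & \tfrac{1}{2}\, Y \\[2pt]
  \tfrac{1}{2}(Z - 1) & \tfrac{1}{2}\, Y
\end{vmatrix}
= \tfrac{1}{2}\, Y\,, \\
d\omega'\, d\omega'' &= \tfrac{1}{2}\, |Y|\, dY\, dZ\,.
\end{align}
```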
And so the conclusion is that, to leading order in small parameters, the $Z$ integral picks up a $Y$-dependent phase factor relative to the time-independent calculation. However, its leading-order behavior can be determined by the following argument: By shifting the integration variables to account for the phase factor