Emergent geometry from stochastic dynamics, or Hawking evaporation in M(atrix) theory

We develop a microscopic model of the M-theory Schwarzschild black hole using the Banks-Fischler-Shenker-Susskind Matrix formulation of quantum gravity. The underlying dynamics is known to be chaotic, which allows us to use methods from Random Matrix Theory and non-equilibrium statistical mechanics to propose a coarse-grained bottom-up picture of the event horizon -- and the associated Hawking evaporation phenomenon. The analysis is possible due to a hierarchy between the various timescales at work. Event horizon physics is found to be non-local at the Planck scale, and we demonstrate how non-unitary physics and information loss arise from the process of averaging over the chaotic unitary dynamics. Most interestingly, we correlate the onset of non-unitarity with the emergence of spacetime geometry outside the horizon. We also write a mean field action for the evolution of qubits -- represented by polarization states of supergravity modes. This evolution is shown to have similarities to a recent toy model of black hole evaporation proposed by Osuga and Page -- a model aimed at developing a plausible no-firewall scenario.


Introduction and highlights
The study of black holes in M(atrix) theory holds a treasure trove of insight into quantum gravity and the nature of spacetime. As a non-perturbative formulation of M-theory, Matrix theory [1,2] can in principle access and potentially resolve many of the puzzles we associate with black holes. Early attempts at staging Matrix black holes have consisted of promising sketches [3]-[6] and numerical simulations [7]-[10]. We have learned that understanding black holes is related to studying strongly coupled Yang-Mills at finite temperature [11]-[13], and that there might be intricate non-local dynamics near the event horizon [14,15]. More recently, we have learned that Matrix theory is characteristically chaotic [6,16,17], and interactions can scramble initial value data at the fastest possible rate that is allowed by the postulates of quantum mechanics [18]-[25], as also expected from black hole physics.
In this work we ask if one can write a mean field coarse-grained description of the strongly coupled microscopic dynamics of Matrix theory in a manner that captures the essential features of black holes and informs us about the geometry near the event horizon. To illustrate through an analogy, if M(atrix) theory is to black hole quantum mechanics as BCS theory is to superconductivity, we are looking for the analogue of a Landau-Ginzburg description of the quantum physics of black holes, with the underpinning element of stochastic chaotic evolution.
We know that Matrix theory is chaotic, and we know that one can often use the language of random variables, or in this case Random Matrix theory (RMT) [25]-[33], [6], to capture chaotic dynamics. We also know that RMT is closely related to the strong damping regime of Fokker-Planck stochastic evolution [26,34,35,36] whereby a statistical description of ergodic motion is effectively described with macroscopic variables. The suggestion is then to formulate a description of Matrix black holes where the entries of the Matrices are described through particles moving in a mean field potential, one that is obtained by coarse-graining over microscopic degrees of freedom that are engaged in ergodic motion.
In this work, we show that such an effective description of black holes is indeed possible using Matrix theory. In the process of developing this effective model, we settle on a microscopic picture of Matrix black holes that is both intuitive and complex. Entries on the diagonal of the matrices incorporate the thermodynamics and encode information. These can be thought of as particles that mostly hang around near the surface of the would-be horizon. They are subject to a mean field potential whose shape we determine. An additional 'goo' of off-diagonal matrix entries glues these particles into clusters, effectively acting like bound states. These clusters contain around $d$ particles each, for a black hole in $d$ space dimensions. Figure 1 depicts a cartoon of the model. In the figure, the clusters are depicted as cells. The configuration is far from static, and in fact we expect that the cells continuously exchange particles and rearrange themselves. The rest of the matrix degrees of freedom, which constitute the overwhelming majority of the total, condense in a quantum ground state. It is possible that they should be thought of as a membrane stretched at the horizon, without any associated thermodynamics or entropy. Thermal energy is distributed in the dynamics of the cells as they slide near the horizon and interact with each other. We develop this model in detail, matching with all expectations from the dual M-theory supergravity description of a Schwarzschild black hole in the light-cone frame. In particular, Hawking evaporation [37]-[40] is reproduced and information loss is demonstrated to arise from the process of coarse-graining over otherwise unitary dynamics. It becomes clear that dynamics near the horizon has a non-local component when explored at short enough timescales, while being local at the longer timescales associated with Hawking radiation 3 .
Most interestingly, we demonstrate that non-unitary evolution and information loss arise at the timescales for which the Matrix dynamics is strongly coupled and spacetime geometry is expected to be emergent in the dual supergravity language. This suggests that Hawking information loss is inherently tied to the premise that geometry near the horizon of a large black hole is smooth and well-defined. The microscopic degrees of freedom underlying black hole dynamics are Planck sized bits that are interacting chaotically over Planckian timescales. Any description of the physics over timescales larger than the Planck time involves coarse graining over stochastic dynamics in a manner that leads to an effective quantum picture that is non-unitary. The notion of spacetime geometry arises at around those Planckian timescales, implying the breakdown of the geometrical picture of black hole evaporation as we approach the horizon. Put differently, the Hawking computation is robust when applied in smooth spacetime backgrounds over large enough timescales, yet the evaporation should still be regarded as unitary because the notion of geometry and spacetime is lost at the event horizon at short timescales.
The outline of the text is as follows. In the first section, we present a brief overview of Matrix theory, followed by a review of Fokker-Planck dynamics and the light-cone Schwarzschild black hole in supergravity. We then systematically develop the effective model for the Matrix black hole, matching and checking against expectations on the dual low energy M-theory side. In the second section, we focus on the time evolution of information within the Matrix black hole. We track information encoded in the polarization states of the low energy M-theory supergravity multiplet, and we write an effective qubit time evolution operator that is based on the stochastic model developed earlier. We show how the evolution becomes non-unitary at longer timescales because of the coarse-graining over chaotic dynamics, and correlate this with the emergence of spacetime geometry in the dual M-theory language. For short timescales, we write a unitary time evolution operator that describes the weakly coupled qubit dynamics near the event horizon. Finally, in the discussion section, we reflect on the implications and future directions.

The effective model

M(atrix) theory overview
The M(atrix) theory action is the dimensional reduction of 10 dimensional Super Yang-Mills (SYM) to 0 + 1 dimensions and is given by (1). The gauge group is $U(N)$, with the $X_i$'s ($i = 1, \ldots, 9$) and the $\Psi$ in the adjoint representation of the group. In our conventions, we have where $g_s$ is the string coupling and $\ell_s$ is the string length 4 . The Yang-Mills coupling is The length dimensions of the various quantities are: $X \sim \ell^1$, $t \sim \ell^1$, and $\psi \sim \ell^0$. The theory is purported to be a non-perturbative formulation of M-theory in the light-cone frame in the following scaling limit 5 This corresponds to focusing on energies that scale as $E \sim g_s/\ell_s$. It is sometimes convenient to introduce alternate M-theory variables , $\tau$, and $\xi$ that remain fixed in the scaling regime of interest For example, the corresponding light-cone M-theory energy scale is $\sim R/\ell_P^2 =$ fixed. In the map onto light-cone M-theory, $N/R$ is interpreted as total light-cone momentum. Light-cone energy scales inversely with light-cone momentum, hence as $(R/N) \times \text{mass}^2$. Depending on the coupling regime, the number of active degrees of freedom of a configuration scales as $N^k$, where $k = 2$ in the weakly coupled regime, and $k = 1$ at strong coupling.
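For orientation, the bosonic part of the D0-brane quantum mechanics is often written in the normalization sketched below. This is one commonly used convention, with $X^i$ carrying dimensions of length; it is an assumption that it agrees with the conventions of (1), and numerical factors, the sign convention in the covariant derivative, and the fermionic terms may differ.

```latex
S \;=\; \frac{1}{g_s \ell_s} \int dt \; \mathrm{Tr}\left[
      \frac{1}{2} \left( D_t X^i \right)^2
    + \frac{1}{4\,(2\pi \ell_s^2)^2} \left[ X^i , X^j \right]^2
    + \text{fermionic terms} \right],
\qquad
D_t X^i = \dot{X}^i - i \left[ A_0 , X^i \right]
```

In this form the commutator term is positive-definite as a potential energy, because $[X^i, X^j]$ is anti-Hermitian, and it vanishes on mutually commuting matrices; these are the flat directions along which diagonal entries can spread.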
Compactifying light-cone M-theory to d space dimensions, we can describe it through Matrix theory with d of the 9 X i matrices removed from the dynamics, assuming that the compact directions are small enough that associated modes are too heavy to excite. Alternatively, one can use d + 1 dimensional SYM for a full description of the compactified theory, obtained from the current setup via a T-duality map.
The relation between light-cone M-theory and Matrix theory is known to hold for $N \to \infty$, but the correspondence is valid for finite $N$ as well, between Discrete Light-Cone Quantized (DLCQ) M-theory and finite $N$ matrix theory, where $N$ is mapped onto units of M-theory discrete light-cone momentum [44]. In this work, we will work at finite but large $N$ in trying to describe an M-theory black hole that is large enough to have small curvature scales at its horizon.
4 Matrix theory is sometimes written in Planck scale conventions, related to the one we use by $X \to Y/\sqrt{R}$ and $t \to \tau/R$. Using units such that $2\pi \ell_P^3 = 1$ where $\ell_P$ is the eleven dimensional Planck length, the action takes the form where $\dot{Y} = dY/d\tau$. In this alternate convention, the length dimensions of the various quantities become $\ell_P$, then $X = \ell_s$, given that $\ell_P = g_s^{1/3} \ell_s$.
5 This scaling limit corresponds to the decoupling regime for holographic duality [41,11,42,43], as applied to D0 branes. The Matrix theory conjecture is thus in the same class of gravity-SYM correspondences that give rise to the AdS/CFT map.

From chaos to a stochastic evolution
Recently, Matrix theory has been demonstrated to be highly chaotic [16,17,6], with dynamics that can scramble initial value data in a time that scales logarithmically with the entropy [20,21,22,23,25], as opposed to the more common power law behavior. This allows one to capture Matrix theory physics, in the appropriate setting, by treating the matrix entries as random variables. Describing a non-extremal black hole is certainly a good candidate setup for exploring chaos in Matrix theory [33,45,25], and techniques from the well-established field of Random Matrix Theory (RMT) [26,27,28,29,30] can then be used to tackle the problem. RMT is most powerful when one is dealing with a theory with a single matrix; it then allows a robust statistical treatment of the eigenvalues of this matrix.
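As a minimal illustration of the single-matrix setting in which RMT is most powerful, the sketch below draws one Hermitian matrix from the Gaussian Unitary Ensemble and extracts nearest-neighbour eigenvalue spacings from the bulk of its spectrum. The ensemble choice, the matrix size, and the helper names `sample_gue` and `normalized_spacings` are assumptions made for illustration; none of them are taken from the text, and the mean-spacing normalization is only a crude stand-in for a proper spectral unfolding.

```python
import numpy as np


def sample_gue(n, rng):
    """Draw an n x n Hermitian matrix with Gaussian entries (GUE, up to normalization)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    # Symmetrizing with the conjugate transpose makes the matrix Hermitian.
    return (a + a.conj().T) / 2.0


def normalized_spacings(h):
    """Nearest-neighbour eigenvalue spacings in the central half of the spectrum,
    rescaled to unit mean (a crude substitute for unfolding)."""
    ev = np.linalg.eigvalsh(h)          # real eigenvalues, sorted ascending
    n = len(ev)
    bulk = ev[n // 4: 3 * n // 4]       # drop the spectral edges
    s = np.diff(bulk)
    return s / s.mean()


rng = np.random.default_rng(0)
h = sample_gue(200, rng)
s = normalized_spacings(h)

# Level repulsion shows up as a small fraction of very small spacings.
print("fraction of spacings below 0.1:", np.mean(s < 0.1))
```

A histogram of `s` collected over many samples can be compared with the Wigner surmise for the chosen ensemble; the statistical treatment of eigenvalues mentioned above starts from distributions of this kind.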
In our setup, we will be interested in studying a configuration of matrices in Matrix theory that represents a d dimensional Schwarzschild black hole in the dual light-cone M-theory. We will assume from the outset that we work with spherically symmetric configurations, where the different X i matrices are chaotic and uncorrelated in different space directions. Hence, each matrix entry in the d matrices X i , with i = 1, . . . , d, is random and not correlated with any other matrix entry. This configuration is to be mapped onto a black hole in the dual M-theory -with a fixed temperature and associated Hawking evaporation phenomenon. The fermionic matrix entries of Ψ in (1) will be treated as a component of the thermal soup -in equilibrium with the bosonic matrix entries. At finite temperature, we will hence mostly focus on the bosonic sector with a mirror image at play in the fermionic sector being implied. However, we do need to incorporate the one-loop quantum contribution of the fermionic degrees of freedom to the mean field potential for the bosonic stochastic variables. Furthermore, later on, we will use the fermionic variables as probes to track information evolution in this thermal soup.
We start by noting that RMT is closely related to stochastic physics. In particular, since the work by Dyson [26], it has been demonstrated that RMT dynamics can be properly captured by the strong damping regime of Fokker-Planck evolution. We present here a quick overview of the subject.
In RMT, each matrix entry can be thought of as a stochastic particle evolving in a mean field potential. For a particle with position $\mathbf{r}$ and velocity $\mathbf{v}$ in $d$ space dimensions, we can study it through the probability function which represents the probability of finding the particle at time $t$ within $\mathbf{r}$ and $\mathbf{r} + d\mathbf{r}$ and $\mathbf{v}$ and $\mathbf{v} + d\mathbf{v}$. In our setup, we will consider matrix configurations that are spherically symmetric in $d$ dimensions. We will then focus on probability profiles where Here, the $v_{\theta_i}$ are $d-1$ components of $\mathbf{v}$ in the angular directions, and $v = v_r$. Correspondingly, the mean field potential is spherically symmetric 6 , $V(\mathbf{r}) \to V(r)$ (9), and the Fokker-Planck equation takes the form where $T$ is the temperature of the environment, $\gamma$ is a damping parameter, and $m$ is the mass of the particle. This then allows us to study the evolution of the matrix entry in a statistical framework. The spherically symmetric Fokker-Planck equation is solved by the equilibrium time-independent profile $C$ here is a normalization constant. Note that this non-relativistic treatment is consistent with Matrix theory since light-cone M-theory has Galilean symmetry with dispersion relation $E_{LC} = p^2/2p_{LC}$, where the light-cone momentum $p_{LC} \sim 1/R$ plays the role of Galilean mass. As mentioned above, the relation between RMT and stochastic physics arises in the regime of strong damping (12). Focusing on this regime, we also write the probability profile as integrating over all velocities. The resulting evolution equation is known as the Smoluchowski equation (14). The radial probability current that follows from (14) takes the form which we will use later in understanding evaporation through stochastic diffusion. Our goal is to develop an effective model for strongly coupled chaotic Matrix theory, using the Smoluchowski equation with $r$ representing matrix entries in the bosonic matrix $\sqrt{\sum_i X_i^2} \sim X_i$ of (1), since different directions in space are statistically uncorrelated. We then need to identify the relevant mean field potential $V(r)$, mass $m$, temperature $T$, and damping parameter $\gamma$.
6 The model we develop involves time averaging over stochastic, chaotic dynamics. The cluster tiling of Figure 1 is not rigid and is very dynamical over timescales shorter than the Hawking timescale. It is then reasonable to expect that, at timescales larger than the characteristic timescale associated with cluster dynamics, an approximate spherical symmetry sets in. Of course, going beyond this coarse model one needs to consider the possible breaking of the spherical symmetry [31,32].
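For reference, the textbook strong damping (overdamped) limit of Fokker-Planck evolution can be sketched as follows, in units with $k_B = 1$ and for a spherically symmetric profile $P(r,t)$. These are the standard forms; it is an assumption that they agree with (14) and with the current that follows from it up to normalization conventions.

```latex
% Smoluchowski equation for a spherically symmetric profile in d dimensions
\frac{\partial P}{\partial t}
  = \frac{1}{m \gamma}\, \frac{1}{r^{d-1}} \frac{\partial}{\partial r}
    \left[ r^{d-1} \left( V'(r)\, P + T\, \frac{\partial P}{\partial r} \right) \right]

% Radial probability current, so that \partial_t P = - r^{1-d}\, \partial_r ( r^{d-1} J_r )
J_r = - \frac{1}{m \gamma} \left( V'(r)\, P + T\, \frac{\partial P}{\partial r} \right)

% Equilibrium: the current vanishes identically on the Boltzmann profile
P_{\rm eq}(r) = C\, e^{-V(r)/T} \quad \Longrightarrow \quad J_r = 0
```

Evaporation corresponds to a steady state with a small non-zero $J_r$ rather than to the equilibrium profile, which is why the current is the quantity matched against the Hawking flux.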
It is worthwhile noting that an alternate and equivalent approach is to track the evolution of moments of random matrix entries. If $\chi$ represents any matrix entry, then the Smoluchowski equation with a quadratic potential is equivalent to stochastic fluctuations given by which then imply the differential equations for the moments The timescale of stochastic evolution can then be easily read off as It is important to note that this is not the timescale over which one coarse-grains the random motion to arrive at a mean field potential for stochastic variables. This other timescale, which we call the stochastic timescale $t_{stoch}$, must be shorter than the thermal timescale, $t_{stoch} < t_T$, and is determined from the process of averaging over microscopic dynamics. We next need to determine the parameters of the model. We will build this effective description of strongly coupled chaotic Matrix theory by using knowledge of the gravity dual, and of the microscopic string theory dynamics that underlies Matrix theory.
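The moment evolution described above can be made concrete with the standard Ornstein-Uhlenbeck process, which is the overdamped dynamics in a quadratic potential $V = \frac{1}{2} k \chi^2$. The sketch below evaluates the first two moments in closed form. The curvature `k`, the function name `ou_moments`, and the normalization of the noise are assumptions introduced for illustration; the corresponding expressions in the text are not reproduced here.

```python
import math


def ou_moments(chi0, t, k, m, gamma, T):
    """Mean and variance at time t of an overdamped particle in V = k * chi**2 / 2.

    Assumed dynamics (k_B = 1):
        d(chi) = -(k / (m * gamma)) * chi * dt + sqrt(2 * T / (m * gamma)) * dW
    starting from the sharp initial value chi0.
    """
    tau = m * gamma / k                                  # relaxation timescale
    mean = chi0 * math.exp(-t / tau)                     # first moment decays as exp(-t / tau)
    var = (T / k) * (1.0 - math.exp(-2.0 * t / tau))     # variance relaxes to T / k
    return mean, var, tau


mean, var, tau = ou_moments(chi0=1.0, t=5.0, k=2.0, m=1.0, gamma=4.0, T=0.5)
print(f"tau = {tau}, <chi> = {mean:.4f}, var = {var:.4f}")
```

The single timescale `tau = m * gamma / k` controls both moments, which is the sense in which the timescale of stochastic evolution can be read off from the moment equations.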

The light-cone Schwarzschild black hole
We start by reviewing the dual gravity picture of the Matrix theory setup of interest, a light-cone M-theory Schwarzschild black hole [46]. The corresponding geometry is obtained by Lorentz boosting a $d$ dimensional Schwarzschild black hole in the light-cone direction with a boost factor given by $r_h/R$, where $r_h$ is the radius of the black hole horizon. While the horizon geometry is unchanged and the entropy or area in Planck units remains the same, the Hawking temperature is red-shifted The Hawking radiation flux from evaporation takes the form in general $d$ dimensions. The thermal timescale associated with the Hawking temperature is then $t_h$ of (22). The entropy is related to the black hole mass $M_{bh}$ as usual, $S \sim M_{bh} r_h$, and the evaporation process can be described by [47,48] Hence, the black hole lifetime is given by $t_{life}$ of (24). Besides the timescales $t_h$ and $t_{life}$, the shorter scrambling timescale $t_{scr}$ of (25) determines the timescale over which the black hole scrambles information. We have written all these relations in forms that can be compared to the Matrix theory stochastic model in the choice of units presented earlier. In our SYM choice of units, the entropy of the black hole is written as (26). For a large black hole, we see that we must require $r_h \gg \ell_s$ (27), leading to small curvature scales at the black hole horizon. The task next is to model an effective Matrix theory stochastic system that reproduces these properties of a light-cone Schwarzschild black hole.
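The ordering of the three black hole timescales can be illustrated with a rough order-of-magnitude sketch. The scalings used below are assumptions based on standard black hole lore combined with the boost factor $r_h/R$ quoted above, with all numerical factors dropped: a red-shifted temperature $T_h \sim R/r_h^2$, a scrambling time $t_{scr} \sim t_h \ln S$, and a lifetime $t_{life} \sim t_h S$. They are not copied from (22), (24) and (25), and the function name `lightcone_timescales` is hypothetical.

```python
import math


def lightcone_timescales(r_h, R, S):
    """Order-of-magnitude timescales for a boosted black hole (numerical factors dropped).

    Assumptions: T_h ~ R / r_h**2, t_scr ~ t_h * ln(S), t_life ~ t_h * S.
    """
    if S <= 1:
        raise ValueError("the estimates assume a large entropy, S > 1")
    t_h = r_h ** 2 / R           # thermal timescale, inverse of the red-shifted temperature
    t_scr = t_h * math.log(S)    # scrambling: logarithmic in the entropy
    t_life = t_h * S             # evaporation: roughly one bit per thermal time
    return t_h, t_scr, t_life


t_h, t_scr, t_life = lightcone_timescales(r_h=10.0, R=1.0, S=1.0e6)
print(t_h, t_scr, t_life)
```

For any large entropy the hierarchy $t_h < t_{scr} < t_{life}$ follows, with the scrambling time only logarithmically longer than the thermal time while the lifetime is longer by a full factor of $S$.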

A conjecture for an effective model
In a perturbative regime, Matrix theory consists of $\sim N^2$ degrees of freedom as all matrix entries participate in the dynamics. In early models of a Schwarzschild black hole in Matrix theory, the authors of [3,4,5] noted however that, to reproduce the correct equation of state of a light-cone black hole, one must have the entropy proportional to $N$ at strong coupling, $S \sim N$. This implies that only $N$ of the entries in each matrix $X_i$ are to participate in the thermodynamics of the Matrix black hole; that is, most degrees of freedom must be 'frozen', given that $N \gg 1$ follows from (26) and (27). Inspired by the works of [3,4,5], we then propose that the thermodynamics of the Matrix black hole is carried by the $N$ diagonal entries of the $X_i$ matrices. Information in the black hole would also be carried by diagonal degrees of freedom only. These entries can sometimes be interpreted as coordinates of the corresponding D0 branes underlying Matrix theory. Entropically, these order $\sim N$ degrees of freedom would like to spread to infinity; the theory even admits flat directions for this purpose. However, perturbatively there can be an initial cost in energy in doing so from strings stretching between the D0 branes, i.e. off-diagonal modes of the matrices. Presumably, taking strong coupling effects into account, the configuration forms a metastable ball of size $r_h$, the black hole radius, along with decay channels that implement the process of Hawking evaporation. As a diagonal matrix entry random walks its way out, a bit of the black hole evaporates away [10]. If $N$ diagonal degrees of freedom are to spread in a volume $r_h^d$, average inter-brane spacing is generically parametrically much larger with $N$ than if they are spread over an area $r_h^{d-1}$. And since inter-brane spacing is costly in energy, we can start seeing that the proper model of a Matrix black hole would involve the diagonal entries of the matrices spread on the surface of a would-be black hole horizon.
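The comparison between spreading the diagonal entries through a volume and over a surface comes down to a simple counting estimate, sketched below. The uniform-packing formulas and the function names are illustrative assumptions: $N$ points filling a region of linear size $r_h$ and dimension $D$ are taken to sit at a typical spacing $r_h N^{-1/D}$.

```python
def spacing_in_volume(r_h, n, d):
    """Typical spacing of n points filling a d-dimensional ball of radius ~ r_h."""
    return r_h * n ** (-1.0 / d)


def spacing_on_surface(r_h, n, d):
    """Typical spacing of n points spread over the (d-1)-dimensional horizon surface."""
    if d < 2:
        raise ValueError("a horizon surface requires d >= 2")
    return r_h * n ** (-1.0 / (d - 1))


r_h, n = 1.0, 10 ** 6
for d in (3, 5, 9):
    ratio = spacing_in_volume(r_h, n, d) / spacing_on_surface(r_h, n, d)
    print(d, ratio)   # ratio = n ** (1 / (d * (d - 1))) > 1 for n > 1
```

The ratio of the two spacings grows as $N^{1/(d(d-1))}$, so at large $N$ the surface arrangement keeps neighbouring branes parametrically closer, which is the energetically favoured configuration argued for above.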
Figure 1 shows a cartoon of the setup. Figure 2(a) shows a cartoon of a matrix $X_i$, focusing on a sub-block associated with a group of 'nearest-neighbor' branes 7 . Using the permutation subgroup of $U(N)$, we can always arrange to sort the matrix entries as depicted. We expect that a certain number of branes, of order $d-1$, whose coordinates appear as $x$ in the figure, would be close enough that corresponding matrix off-diagonal modes, labeled $\delta x$ in the figure, can be light. This still would not affect the $S \sim N$ requirement as the number of such modes would be independent of $N$. Branes much farther away, over a distance scale $r_h$, would be much heavier. We propose that beyond the $d \times d$ sub-block, all other off-diagonal modes would be too heavy to excite and would freeze or condense in a Bose-Einstein (BE) condensate. Indeed, if we look at the critical condensate temperature $T_c$, we would expect 8 which we can quickly see to be much larger than the Hawking temperature for $d \geq 2$. It is possible that this BE condensate describes a membrane-like configuration stretching at the black hole horizon [3,4,5,49,50]. In a coarse-grained effective language, we would set these heavy off-diagonal modes, the $\delta X$s in the figure, to zero. Interestingly, fuzzy spheres of various dimensions in Matrix theory have been shown to necessitate the activation of more off-diagonal modes that spread away from the diagonal [51,52].
Figure 2: (a) The $\delta X$s refer to the off-diagonal entries spanning clusters; the off-diagonal entries within a cluster are in the shaded block, denoted by $\delta x$. (b) General structure of non-zero entries in the matrices for different space dimensions $d$. The $d-1$ labels refer to the number of active columns or rows in the first row or column, respectively. The shaded diagonals start within the shaded square in (a).
For example, a 2-sphere (d = 3) is realized through SU (2) representations, which activate 3 diagonal lines along the matrix diagonals; and a 4-sphere (d = 5) activates 5 diagonal lines. Our model then fits well with this pattern. Figure 2(b) shows the general scheme. The diagonal entries within the d×d sub-block of matrices would be spread out from each other at a distance that is around the Planck scale and might naturally involve marginal bound state physics. In M-theory language, this would correspond to supergravity excitations carrying ∼ d units of light-cone momentum. These marginal bound states are conjectured to exist in Matrix theory and are a necessary ingredient for the dictionary between Matrix theory and M-theory [1]. The off-diagonal modes δx in these sub-blocks would remain relatively light and participate in making the physics of these clusters non-local, at around the Planck scale. They would correspond to strings joining nearest neighbor branes, and henceforth we refer to the δxs as 'off-diagonal nearest neighbor modes' 9 .
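The statement that SU(2) representations activate three diagonal lines can be checked directly. The sketch below builds the spin-$j$ generators in the basis where $J_z$ is diagonal and verifies that $J_x$ and $J_y$ only populate the first off-diagonals. The function names and the choice $N = 6$ are illustrative assumptions, and the overall rescaling that turns the generators into fuzzy sphere coordinates is omitted.

```python
import numpy as np


def spin_matrices(n):
    """Spin-j generators (n = 2j + 1) in the basis m = j, j-1, ..., -j."""
    j = (n - 1) / 2.0
    m = j - np.arange(n)
    jz = np.diag(m).astype(complex)
    # <m+1| J_+ |m> = sqrt(j(j+1) - m(m+1)), sitting on the first superdiagonal.
    raise_coeff = np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1))
    jp = np.diag(raise_coeff, k=1).astype(complex)
    jx = (jp + jp.conj().T) / 2.0
    jy = (jp - jp.conj().T) / 2.0j
    return jx, jy, jz


def bandwidth(a, tol=1e-12):
    """Largest |row - column| over the non-zero entries of a."""
    rows, cols = np.nonzero(np.abs(a) > tol)
    return int(np.max(np.abs(rows - cols))) if rows.size else 0


n = 6
jx, jy, jz = spin_matrices(n)
j = (n - 1) / 2.0

print("bandwidths:", bandwidth(jx), bandwidth(jy), bandwidth(jz))
print("algebra ok:", np.allclose(jx @ jy - jy @ jx, 1j * jz))
print("sphere ok: ", np.allclose(jx @ jx + jy @ jy + jz @ jz, j * (j + 1) * np.eye(n)))
```

Together the three matrices occupy the main diagonal and its two neighbours, which matches the three diagonal lines quoted for the 2-sphere, while the Casimir relation plays the role of the sphere constraint.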
Our stochastic model would then involve writing an effective theory of all the modes that remain active (diagonals $x$ and nearest neighbor modes $\delta x$) while integrating out all other $\delta X$ modes. We need to provide two separate stochastic treatments, one for the $x$ modes on the diagonal, and another for the off-diagonal nearest neighbor modes $\delta x$. The first would describe the coarse-grained thermal state of the black hole; the second would describe finer cluster physics within each matrix sub-block. We will next demonstrate how these two sectors effectively decouple and can reliably be treated through stochastic methods due to a hierarchy in the relevant timescales.
In the Matrix theory scaling regime time scales as $g_s/\ell_s$; this allows us to measure timescales through the effective Yang-Mills coupling $g_{eff}(\tau)^2$ defined as which remains finite in the scaling regime. Hence, larger effective coupling corresponds to longer times since 0 + 1 SYM is super-renormalizable. In this language, the first timescale $t_h$ from (22) arises from the thermodynamics of the diagonal modes, of order $N$ in number; this gives The scrambling timescale $t_{scr}$ of (25) is then given by The lifetime of the configuration $t_{life}$ from (24) should correspond to These statements follow from the expected black hole physics on the dual side of the correspondence. Note that all three timescales correspond to regimes where the Matrix theory SYM is strongly coupled.
7 ... and bottom left of each matrix are active as well. This is a detail in the description that, in the large $N \gg d$ limit, we assume has a subleading effect on the larger picture.
8 The right hand side is the expression for the number of degrees of freedom in a Bose condensate in $d$ dimensions.
9 Our treatment explicitly picks out a 'frame' or gauge where the diagonal and off-diagonal matrix entries have very different physical roles. We expect that this setup corresponds to a description of the Matrix black hole from the perspective of the outside observer. $U(N)$ gauge transformations would naturally change the perspective, while mixing the roles of diagonal and off-diagonal entries. More on this in the Discussion section.
On the SYM side, perturbatively, we know that off-diagonal modes have dynamics given by 10 where $\Delta r$ is the distance between the corresponding diagonal entries; this gives a frequency of We can then easily see that if $\Delta r \sim \ell_s$ for nearest neighbor off-diagonal modes, $\delta x$ modes can be treated as heavy and can hence be integrated out over timescales $t > t_o$. This is the strong coupling transition point for the SYM, a regime that we typically associate with emergence of geometry on the dual M-theory side. The relevant strong coupling benchmark is given by $g_{eff}(\tau)^2 \sim 1$, instead of the one using the 't Hooft effective coupling $g_{eff}(\tau)^2 N \sim 1$, because the dynamics in question is that of individual partons in the black hole soup, as opposed to the interaction of the black hole as a whole. More on the interplay between these two couplings and the emergence of a valid geometrical description can be found in the Discussion section.
Next, looking at off-diagonal modes $\delta X$ that straddle diagonal modes separated by a large distance of order $\Delta r \sim r_h$, we see from (36) that these can be integrated out for timescales $t > t_{stoch}$. This is the shortest of the timescales and determines the regime where a stochastic treatment is valid: it corresponds to timescales where integrating out the $\delta X$'s leads to a stochastic mean field potential for the diagonal modes. Note also that, for $r_h \gg \ell_s$, part of this regime overlaps with weak coupling in the Matrix SYM. Figure 3 summarizes the various timescales and clarifies the range of validity for the effective model that we propose. The stochastic formalism with a mean field potential for the diagonal modes requires coarse graining over timescales longer than $t_{stoch}$. For $t > t_{stoch}$, $\delta X$'s are frozen in a BE condensate. We can then incorporate the effect of the $\delta X$'s into a mean field potential for the modes on the diagonal. The nearest neighbor off-diagonal modes, the $\delta x$'s, cannot be integrated out at these timescales. We leave them as part of the degrees of freedom participating in the physics of cluster formation. For timescales $t > t_o$, the nearest neighbor modes are heavy as well and are associated with high frequency dynamics that can be coarse grained and described through a stochastic treatment. However, the $\delta X$ modes will always have a much higher frequency (for $r_h \gg \ell_s$) and hence will still determine the mean field potential for the diagonal modes. Finally, thermal timescales, $t_h$, $t_{scr}$, and $t_{life}$ are all much longer and live well within the regime of a stochastic treatment that coarse grains physics faster than $t_{stoch}$.
Figure 3: The hierarchy of timescales for event horizon dynamics. Timescales $t < t_o$ are associated with non-local physics within D0 brane clusters, but timescales $t > t_{stoch}$ allow a local description for coarser inter-cluster dynamics.
We then list in one place the set of observations underlying our model:
• We have a stochastic effective description for diagonal modes for $t > t_{stoch}$, or $(g_{eff}(\tau)^2)^{1/3} \gg \ell_s/r_h$. We integrate out the off-diagonal modes that straddle widely separated modes on the diagonal.
• Strong coupling corresponds to timescales $t > t_o$, or $(g_{eff}(\tau)^2)^{1/3} \gg 1$. In this regime, all off-diagonal modes are heavy, but the effect of nearest neighbor off-diagonal modes on diagonal modes is sub-leading. We associate emergence of geometry on the dual M-theory side with the onset of strong coupling in Matrix theory [2,53]. At timescales $t_{stoch} < t \lesssim t_o$, we might be able to write a stochastic effective description of D0 brane cluster dynamics. We expect that at around $t \sim t_o$, the degrees of freedom of Matrix theory organize in clusters of about $d$ nearest neighbor branes moving in the larger thermal soup.
• Hawking evaporation physics sets in at $t \gtrsim t_h$, or $(g_{eff}(\tau)^2)^{1/3} \sim (r_h/\ell_s)^2 \gg 1$, well within the regime of validity of the stochastic treatment.
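The hierarchy in the list above can be summarized as a small lookup, sketched here under stated assumptions. The thresholds are taken to be $(g_{eff}^2)^{1/3} \sim \ell_s/r_h$ for $t_{stoch}$, $\sim 1$ for $t_o$, and $\sim (r_h/\ell_s)^2$ for $t_h$; this is one reading of the scalings quoted in the list, with order-one factors ignored and the soft crossovers replaced by sharp boundaries. The function name and the regime labels are hypothetical.

```python
def horizon_regime(x, rh_over_ls):
    """Classify a timescale by x = (g_eff^2)^(1/3), for a black hole with r_h / l_s > 1.

    Assumed thresholds: l_s / r_h (t_stoch), 1 (t_o), (r_h / l_s)**2 (t_h).
    """
    if rh_over_ls <= 1.0:
        raise ValueError("the hierarchy assumes a large black hole, r_h > l_s")
    if x < 1.0 / rh_over_ls:
        return "t < t_stoch: no mean field description"
    if x < 1.0:
        return "t_stoch < t < t_o: stochastic diagonal modes, non-local cluster physics"
    if x < rh_over_ls ** 2:
        return "t_o < t < t_h: strong coupling, emergent geometry"
    return "t > t_h: Hawking evaporation"


for x in (1e-3, 0.1, 50.0, 1e5):
    print(x, "->", horizon_regime(x, rh_over_ls=100.0))
```

The window between the first two thresholds widens as $r_h/\ell_s$ grows, which is why the stochastic treatment of the diagonal modes becomes available well before the SYM becomes strongly coupled.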
It is useful to write some of these timescales in M-theory Planck units. Using (6), and the fact that light-cone time is boosted by a factor of $\ell_P/R$, we find and Hence we see that $\tau_o$ corresponds to Planck scale time in M-theory language. As we shall see, all this means that the chaotic microscopic dynamics that underlies black hole horizon physics is associated with a characteristic timescale that is given by the Planck scale. A well-defined notion of spacetime geometry necessitates coarse graining over longer timescales. Our next task is to develop the stochastic effective descriptions of diagonal and nearest neighbor off-diagonal modes, the first describing black hole thermodynamics and evaporation, the second giving us a crude peek into brane cluster/bound state dynamics.

Modes on the diagonal
In this section, we propose a mean field stochastic potential for diagonal modes, valid over timescales $t > t_{stoch}$. Using spherical coordinates, we posit writing $r^2 = x_i^2$, where $x_i$ is any diagonal mode of $X_i$. The potential is parametrized by two scales, $r_0$ and $V_0$, and we need to determine these two parameters by comparing the resulting dynamics to that of a light-cone black hole. Note also that we have incorporated quantum effects that we know would arise from the fermionic sector of Matrix theory: the $\theta(r_0 - r)$ flattens the potential so as to model the expected flattening of the potential from supersymmetry-based cancellations of zero mode energies 11 .
We start by noting that the only scale near the horizon of the Schwarzschild black hole is given by $r_h$ 12 . We then start by setting $r_0 \sim r_h$, fixing the size of the stochastic diagonal fluctuations to within the would-be horizon size.
The temperature of the soup should naturally be the Hawking temperature in the light-cone frame, $T = T_h$. The mass of a stochastic particle should be set to the mass of a D0 brane This leaves us with determining the damping parameter $\gamma$ and the potential scale $V_0$. We start by looking at evaporation flux from the thermal soup. Following [62], we arrange for a steady state scenario for the probability distribution given by where $u = r - r_0$ and $C$ is a normalization constant to be determined. We need to find $f(u)$ given the boundary conditions where the first one follows from matching with the equilibrium configuration at $r = 0$, while the second one amounts to absorbing the evaporation flux at $r = r_0$, corresponding to evaporation to infinity. The Fokker-Planck equation at strong damping then leads to where for the mean field potential at hand. The solution is given by the error function Integrating over the velocities we have which then leads to the current We will see below that when we find that $V_0 \sim T_h$. We then note that $\mathrm{erf}(-r_0 \kappa/2) \simeq -1$.
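How quickly the error function saturates can be checked numerically. The sketch below compares the exact deviation $1 + \mathrm{erf}(-x) = \mathrm{erfc}(x)$ with the leading term of its standard large-$x$ expansion, $e^{-x^2}/(x\sqrt{\pi})$. The sample values of $x$ are arbitrary and are not tied to the value of $r_0 \kappa/2$ in the model; the helper name `erfc_leading` is hypothetical.

```python
import math


def erfc_leading(x):
    """Leading large-x term of erfc(x) = 1 + erf(-x)."""
    return math.exp(-x * x) / (x * math.sqrt(math.pi))


for x in (2.0, 3.0, 4.0):
    exact = math.erfc(x)          # avoids the cancellation in 1 + erf(-x)
    approx = erfc_leading(x)
    print(f"x = {x}: erfc = {exact:.3e}, leading term = {approx:.3e}, "
          f"ratio = {approx / exact:.3f}")
```

Already at moderate $x$ the deviation from $-1$ is exponentially small, and the ratio of the leading term to the exact value approaches one as $x$ grows, with relative corrections of order $1/(2x^2)$.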
For $\mathrm{erf}(-x)$, the function for $x \gtrsim 1$ is very well approximated by $-1$, with corrections suppressed exponentially as $e^{-x^2}/x$. We determine the normalization factor $C$ using For this, we write near $r \simeq 0$, and near $r \simeq r_0$. We then get up to a numerical factor. The probability current near $r_0$ takes the form which then leads to the evaporation flux which we can then match with Hawking evaporation at temperature $T_h$ This gives one of the two conditions we need to determine $\gamma$ and $V_0$. The other condition comes from the well-known one-loop effective potential of a probe D0 brane in the background of $N$ D0 branes. Using M-theory Planck units, we have [49] where $v$ is the relative velocity of two partons at a separation $r \sim r_h$. While this is a perturbative result in the Matrix SYM, it is known to lead to an exact match with the dual M-theory scenario [49], implying that it is valid at strong coupling as well$^{14}$. Remembering that the black hole entropy is given by in Planck units, and saturating the Heisenberg uncertainty bound for each parton [3,4,5] we get the scale of the potential energy at the size of the horizon Rescaling to SYM units using (6) gives the same relation ($r_h \to r_h \sqrt{R}$, $E \to E/R$). We then naturally identify this energy scale with the depth of the mean field potential Finally, from $F = F_h$, we then get The latter relation implies that which corresponds to a borderline strong damping regime (12), as needed for consistency with RMT.
$^{13}$ If we want to include the kinetic energy of the evaporated bit, we would get with $\omega$ being the kinetic energy, giving the standard black body spectrum.
$^{14}$ There have been suggestions that a non-renormalization theorem perhaps underlies this finding [2].
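The large-argument behaviour of the error function quoted above can be checked numerically. The sketch below compares $\mathrm{erf}(-x)$ with the standard leading asymptotic form $-1 + e^{-x^2}/(x\sqrt{\pi})$. The sample values of $x$ are arbitrary illustrations and are not the paper's $r_0\kappa/2$; the function names are hypothetical.

```python
import math

def erf_neg_asymptotic(x: float) -> float:
    """Leading large-x approximation: erf(-x) ~ -1 + exp(-x^2) / (x sqrt(pi))."""
    return -1.0 + math.exp(-x * x) / (x * math.sqrt(math.pi))

def deviation_from_minus_one(x: float) -> float:
    """How far erf(-x) sits above -1; this equals erfc(x)."""
    return math.erfc(x)

for x in (1.0, 1.5, 2.0, 3.0):  # illustrative values only (assumption)
    exact = math.erf(-x)
    approx = erf_neg_asymptotic(x)
    print(f"x={x:.1f}  erf(-x)={exact:+.6f}  asymptotic={approx:+.6f}  "
          f"1+erf(-x)={deviation_from_minus_one(x):.2e}")
```

The last column shows how quickly the deviation from $-1$ shrinks once $x$ exceeds order one, which is the property the matching argument relies on.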
We can now look at the quantum and thermal vacuum expectation values of a mode $x$ on the diagonal, given by For the given potential and parameters, we have leading to a borderline thermal regime, which implies that the diagonal modes are barely excited above the ground state. We also note that odd moments vanish at equilibrium, so that We have then succeeded in developing a stochastic model for diagonal mode dynamics that matches Hawking evaporation. As a consistency check, this stochastic evolution has a characteristic timescale given by (19), as required.

Off-diagonal nearest neighbor modes
At timescales $t \sim t_o$, where Matrix theory enters the strongly coupled realm, we have the possibility to describe clusters of $d$ nearest neighbor branes through stochastic means. The clusters are marginally held together and we expect this dynamics to be a delicate one, given their natural overlap with the physics of D0 brane marginal bound state formation. Nevertheless, we will use the methods of stochastic dynamics to try to describe the problem, bearing in mind that we aim only to identify scaling relations of what is most likely a very subtle cluster formation process. We model the potential for the nearest neighbor off-diagonal modes $V_{\delta x}$ with a simple quadratic confining form, and the only relevant scale is the curvature $V''(0)$. For nearest neighbor diagonals, we expect an inter-brane separation of $\Delta r \sim \ell_s$, leading to a perturbative potential for the corresponding off-diagonal modes given by This is a perturbative result but we extend it to $t \simeq t_o$ as a scaling relation. The thermal and quantum vacuum expectation values are where in the thermal expression, we want to think of $T$ as a scale for kinetic energy within the bound system. We would expect ground state physics, implying as the expected scale for kinetic energy in the cluster. The mass parameter would still be given by Finally, we propose that the strong damping bound needed by RMT should be valid, and at worst saturated identifying the damping parameter $\gamma$ for cluster dynamics. As a sanity check, we can verify that the associated characteristic timescale for the stochastic dynamics is which again matches well with our expectations that the relevant dynamics is at the onset of strong coupling in the SYM theory. Finally, the expected size of the cluster becomes which also agrees well with our expectation that one thermal parton is to occupy one Planck area at the black hole horizon$^{15}$.
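For a quadratic confining potential, the interplay between quantum and thermal fluctuations can be illustrated with the textbook harmonic-oscillator result $\langle x^2 \rangle = \coth(\omega/2T)/(2 m \omega)$, with $\omega^2 = V''(0)/m$ and units $\hbar = k_B = 1$. The sketch below is not the paper's expression, which depends on its specific parameter choices; the function name and all numerical values are hypothetical.

```python
import math

def mean_square_displacement(curvature: float, mass: float, temperature: float) -> float:
    """<x^2> of a harmonic oscillator with V''(0) = curvature, units hbar = k_B = 1."""
    omega = math.sqrt(curvature / mass)
    if temperature == 0.0:
        return 1.0 / (2.0 * mass * omega)          # pure ground-state spread
    return 1.0 / (2.0 * mass * omega * math.tanh(omega / (2.0 * temperature)))

# Hypothetical parameters chosen only to show the two limits.
k, m = 4.0, 1.0                                     # omega = 2
quantum = mean_square_displacement(k, m, 0.0)       # 1 / (2 m omega)
cold = mean_square_displacement(k, m, 0.05)         # T << omega: ground-state dominated
hot = mean_square_displacement(k, m, 200.0)         # T >> omega: approaches T / V''(0)
print(quantum, cold, hot, 200.0 / k)
```

When the temperature scale is comparable to or below $\omega$, the spread is set by the ground state, which is the regime the text refers to as ground state physics.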

Quantum information
In this section, we want to describe how information evolves in the stochastic model we developed above. For this purpose, we need to look more closely at the fermionic degrees of freedom of the $\Psi$ matrix in (1). It is known that these correspond to the polarizations of the light-cone M-theory supergravity multiplet: the graviton, the gravitino, and the 3-form gauge field [1]. That is, in the low energy regime, we can think of an entry on the diagonal in the $X_i$'s as the coordinate of a supergravity particle whose flavor and polarization state is determined by the corresponding entry in the $\Psi$ matrix. We can expect that information in an M-theory black hole can be encoded in the polarization states of a thermal soup of supergravity excitations. We would then want to study the time evolution of the $\Psi$ matrix within the effective model we have developed. Note that the quantum contribution from the fermionic modes in their ground state has already been taken into account in the shape of the mean field potential for the diagonal bosonic modes.
In the spirit of RMT, the equilibrium dynamics of the fermionic and bosonic matrix entries are treated as statistically uncorrelated. This justifies working with the bosonic sector by itself, as we have done: it is assumed that a corresponding thermal state is also set up in the fermionic sector, as the two sectors are in thermal equilibrium. Our goal now is to track how information encoded in the polarization states evolves when this equilibrium configuration is slightly perturbed. We could for example consider one particularly interesting scenario, the emission of a supergravity particle from the stochastic soup, as a matrix entry of $X_i$ ventures off to large distances. We would choose a particular matrix configuration that can describe this situation, and analyze the evolution of the corresponding bit of quantum information in $\Psi$.

Qubit dynamics and M-theory polarizations
We start by considering a $d = 3$ matrix configuration that looks like$^{16}$ where $X_{bh}$ and $\Psi_{bh}$ are $(N-2) \times (N-2)$ sub-blocks representing part of the black hole, and the remaining $x_{bh}/\psi_{bh}$ and $x/\psi$ represent $1 \times 1$ entries that are bits of the black hole that will participate in an emission process. The particle with coordinate $x$ and polarization state $\psi$ has perhaps ventured outside the black hole via ergodic motion. The $\delta x$ mode is a nearest neighbor off-diagonal, implying that $x_{bh}$ and $x$ are part of a cluster. The rest of the matrix entries start off in an equilibrium state at temperature $T_h$. Note that $\delta x_{bh}$ and $\delta\psi_{bh}$ are $N-2$ component vectors. The fermionic part of the Matrix theory action is given by (1) Quantizing the fermionic matrix entries, we have where $\alpha$ and $\beta$ are 10 dimensional spinor indices, $\alpha, \beta = 1, \ldots, 16$, remembering that the matrix entries $\Psi_{ab}$ are Majorana-Weyl in 10 spacetime dimensions. Applying this quantization to the matrix configuration (82), we get for the off-diagonal modes while the diagonal entries lead to a Clifford algebra The latter means that we can introduce new raising/lowering spinors on the diagonal by where we now restrict $\alpha = 1, \ldots, 8$. We then have as needed. In general, the fermionic sector then consists of $8N(N-1)$ qubits from off-diagonal modes and $8N$ qubits from the diagonal modes for a total of $8N^2$ qubits corresponding to $2^8 = 256$ polarization states of the M-theory supergravity multiplet, one for each of the $N^2$ matrix degrees of freedom. Using (82), we can then expand the action (83) treating all matrix entries as stochastic variables. Furthermore, given spherical symmetry, we expect all spatial directions to be statistically equivalent so that we can write $x_i \to x$ for all $i$. We get the action Throughout, we use a symmetric representation for the $\Gamma_i$'s. Note that $\Gamma^2 = 1$ and $\mathrm{Tr}\,\Gamma = 0$ so that the eigenvalues of $\Gamma$ are $\pm 1$.
We will then choose the convenient representation where Taking the thermal vacuum expectation value of (89), we see that the thermal average of the action $S_{ferm}$ vanishes at equilibrium given that we know This is simply the statement that, once equilibrium is achieved, we have two separate systems, a bosonic and a fermionic one, that can be treated as two thermal components in equilibrium at the same temperature. The interesting physics arises when we consider a perturbed configuration, for example one corresponding to $x - x_{bh}$ being momentarily large, describing the process of evaporation of a bit of the Matrix black hole. The subsequent relaxation process would be driven by the couplings in (89) between bosonic modes and qubits. We can analyze this physical setup by looking at the stochastic effective action of the qubits provided we arrange proper boundary conditions where $x$ and $\delta x$ are initially perturbed away from equilibrium. In the next section, we develop this method of tracking qubit information evolution.
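The step from a Clifford algebra of real (Majorana) components to raising/lowering operators, and the qubit counting quoted above, can be illustrated in the smallest possible case. The sketch below pairs two Hermitian operators $\gamma_1, \gamma_2$ obeying $\{\gamma_a, \gamma_b\} = \delta_{ab}$ into $\psi^\pm = (\gamma_1 \pm i \gamma_2)/\sqrt{2}$ and checks $\{\psi^+, \psi^-\} = 1$. The $2 \times 2$ Pauli-matrix realization, the normalization, and the helper names are assumptions made for illustration; they are not the paper's conventions in (87).

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b, ca=1.0, cb=1.0):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

s = 1.0 / math.sqrt(2.0)
identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
gamma1 = [[0, s], [s, 0]]              # sigma_x / sqrt(2)  (assumed realization)
gamma2 = [[0, -1j * s], [1j * s, 0]]   # sigma_y / sqrt(2)  (assumed realization)

# Pair the two real components into one raising and one lowering operator.
psi_plus = add(gamma1, gamma2, s, 1j * s)
psi_minus = add(gamma1, gamma2, s, -1j * s)

def qubit_count(n: int) -> dict:
    """Counting quoted in the text: 8 qubits per matrix entry."""
    off_diagonal = 8 * n * (n - 1)
    diagonal = 8 * n
    return {"off_diagonal": off_diagonal, "diagonal": diagonal, "total": off_diagonal + diagonal}

print(close(anticommutator(psi_plus, psi_minus), identity))  # fermionic algebra holds
print(qubit_count(5), 2 ** 8)
```

Each such pair of real components yields one two-state system, so 16 real spinor components per diagonal entry give 8 qubits, consistent with the $2^8 = 256$ polarization states.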

Qubit action
We expect that a small perturbation should not affect the whole system appreciably on short enough timescales. This means that if we were to perturb $x$ and $\delta x$ in (82) off-equilibrium, $X_{bh}$ and $\delta X_{bh}$ (as well as $\Psi_{bh}$ and $\delta\Psi_{bh}$) would remain in equilibrium as long as $N \gg 1$. Using techniques from [63], given a stochastic variable $\chi$ coupling to other degrees of freedom $F(t)$ via $S = \int dt\, \chi F$, we can write an action where $\chi$ would be $x$ or $\delta x$ from earlier, and where the stochastic action is $T$ is the temperature to which the perturbed $\chi$ relaxes, and the path integration involves boundary conditions corresponding to the quenching process of interest. The potential $V$, the damping parameter $\gamma$, and the mass $m$ are all determined from our previous discussion in Section 2. $F(t)$ can be obtained from (89) and is bilinear in the qubit variables. It can easily be shown that the Smoluchowski equation for $\chi$ given by (14) follows from $S_{stoch}$ [63].
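The relaxation of a perturbed stochastic variable toward equilibrium can be visualized with a generic overdamped Langevin simulation, whose probability density obeys a Smoluchowski equation. The sketch below assumes a quadratic potential $V = k\chi^2/2$ (so the equilibrium value is $\chi_f = 0$), an Euler-Maruyama time step, and made-up parameter values. It is not derived from $S_{stoch}$ or from equation (14), and all names are hypothetical.

```python
import math
import random

def relax(chi_i, k, m_gamma, temperature, dt, steps, paths, seed=0):
    """Euler-Maruyama integration of m*gamma*dchi/dt = -k*chi + thermal noise.

    Returns the ensemble mean and variance of chi at the final time.
    """
    rng = random.Random(seed)
    drift = k / m_gamma
    kick = math.sqrt(2.0 * temperature * dt / m_gamma)
    finals = []
    for _ in range(paths):
        chi = chi_i
        for _ in range(steps):
            chi += -drift * chi * dt + kick * rng.gauss(0.0, 1.0)
        finals.append(chi)
    mean = sum(finals) / paths
    var = sum((c - mean) ** 2 for c in finals) / paths
    return mean, var

# Hypothetical quench: start far from equilibrium and relax for about 10 decay times.
mean, var = relax(chi_i=5.0, k=1.0, m_gamma=1.0, temperature=1.0,
                  dt=0.02, steps=500, paths=400)
print(f"mean = {mean:.3f} (expected near 0), variance = {var:.3f} (expected near T/k = 1)")
```

The memory of the initial value $\chi_i$ is lost exponentially on the timescale $m\gamma/k$, after which only the thermal spread $T/k$ remains; this is the sense in which the quench boundary conditions $\chi_i$ and $\chi_f$ control the early evolution.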
To evaluate the path integral, we start with the classical equations of motion where If $\chi$ represents a radial coordinate $x_i^2$ in a spherically symmetric setup as given by (42) we get instead Since $V_0 \sim T$ for any of the bosonic perturbations of interest, $\Omega$ then has the same scale irrespective of symmetry. We solve the sourceless classical equation and we easily find with $F = 0$, where $\chi_i$ is an initial off-equilibrium configuration, and $\chi_f$ is an equilibrium configuration that $\chi$ relaxes towards. The classical contribution to the action is then where we take the initial time $t_i = 0$. The quantum contribution is given by with the associated Green's function In summary, we arrive at an action for the qubit variables, hidden in the $F(t)$, of the form describing the evolution of the relevant qubits as the bosonic stochastic variable $\chi$ relaxes after a quench described by the boundary conditions $\chi_i$ and $\chi_f$. Note that the second part of (103) is imaginary and this implies that the qubit evolution would be in general non-unitary. This piece involves quartic qubit interactions and would be responsible for scrambling information away as the background evolves stochastically. This is not surprising, yet it is an important observation: we are then able to associate information loss in Hawking radiation with the scheme of coarse-graining over short timescales that results in an effective model of what otherwise is microscopic unitary evolution of information. That is, we see how averaging over chaotic dynamics in Matrix theory is responsible for information loss in the dual low energy M-theory or supergravity. Below, we will see that when this non-unitary piece of the effective dynamics becomes important, we expect the emergence of geometry on the dual M-theory side. Our goal next is to consider scenarios where $\chi$, or $x$ and $\delta x$, are perturbed away from equilibrium, and then we want to track the evolution of the qubits described by $\psi$ and $\delta\psi$.

Long timescales
Consider the qubit couplings given by (89) where $x$ and $\delta x$ are arranged to start off in an off-equilibrium configuration. Neglecting the back reaction of this perturbation onto the black hole, we can take $X_{bh} = x_{bh} = \delta X_{bh} = 0$ (104) so that we have We want to develop the action of the qubits using (93), which then gives (103) where $\chi$ represents $x$ or $\delta x$, and $F(t)$ can be read off from (105). Before looking at the details, notice that the second term in (103), which is quartic in the qubits, is imaginary and renders the evolution non-unitary. The term is the result of coarse-graining over the stochastic variables $x$ and $\delta x$ and naturally leads to information loss. The scale for this non-unitary piece is given that the propagator $G(t, t')$ scales as $\delta(t - t')/(\partial_t^2 - \Omega^2)$ and the fermions are dimensionless. Irrespective of whether $\chi$ represents $x$ or $\delta x$, we have From (67) and (96), we have when $\chi$ is identified with $x$; while from (74) and (96) we instead have when $\chi$ is identified with $\delta x$. Hence, for $t \lesssim t_o$, the non-unitary coupling scales as whether $\chi$ represents $x$ or $\delta x$. We then see that this coupling, and hence information loss, sets in for timescales of order $t \sim t_o$, where the effective dimensionless Yang-Mills coupling becomes order unity and Matrix theory starts describing emergent spacetime geometry in the dual formulation. For shorter timescales, $t \ll t_o$, the evolution is effectively unitary, given by the first semi-classical term in (103). Note however that for $t_{stoch} < t \ll t_o$, the dynamics is non-local, given by the Planck scale cluster physics and the light nearest neighbor off-diagonal modes $\delta x$ of the matrices.

Short timescales
Let us first start by writing the full qubit action (103) that follows from using (105). When $\chi$ is identified with the diagonal coordinate $x$ of (82), we have and obtained from (91) and (105), and where $\delta\psi'_\alpha \equiv \delta\psi_{\alpha+8}$ with $\alpha = 1, \ldots, 8$. The 'dot' represents a sum over 8 qubits, i.e. $\delta\psi \cdot \delta\psi \equiv \sum_\alpha \delta\psi_\alpha \delta\psi_\alpha$. As mentioned above, the second non-unitary piece is negligible for $t \ll t_o$. Looking at the first term of (111), we can see that it provides mass to the $\delta\psi$ and $\delta\psi'$ qubits, and it scales as For early times where $t < t_o$, this term is important only if $x_{cl}$ is large. This, for example, would be the case if the matrix entry labeled by $x$ were to evaporate away, $x_{cl} \gg r_h$. If the initial perturbation for the stochastic variable $x$ is such that $x_i \sim r_h$, the subsequent stochastic evolution is in a flat potential given the form of (42). This evolution, described by (17) and (18), or equivalently (99), results in $x_{cl}(t)$ growing to infinity$^{17}$. We then conclude that the effective qubit dynamics that arises from a perturbation on the diagonal, corresponding to $x$ evaporating away, is described by with $x_{cl} \sim r_h$ initially and growing larger thereafter. This is the statement that the off-diagonal qubits $\delta\psi$ and $\delta\psi'$ become heavier and heavier and condense as the bit evaporates away.
For the off-diagonal coordinate $\delta x$ in (105), the resulting action takes the form In arriving at this expression, we have used a complexified version of the action (93) where $\chi$ is complex, as is $\delta x$, since the integrated modes are most naturally represented by complex variables. We also have used the diagonal qubit operators $\psi^\pm$ and $\psi^\pm_{bh}$ defined in (87). Once again, as described above, the second non-unitary piece is negligible for $t \ll t_o$. The first term in (115) provides a coupling between qubits $\psi$, $\psi_{bh}$, and $\delta\psi$ and it scales as As the bit $x$ evaporates away, equations (17) and (18), or equivalently (99), tell us that the initial value of $\delta x_{cl}$ decays exponentially to zero on a timescale given by $t_o$, as the mode becomes heavy$^{18}$. At short times $t \ll t_o$, we write In summary, the qubit action is given by where we have switched from time $t$, $x$, and $\delta x$ to scaled variables $\tau$, $\xi$, and $\delta\xi$ (see equation (6)), and the effective coupling $g$ is defined as which has units of length such that $g\tau$ is the effective dimensionless coupling. In total, the system describes $8 \times 4$ qubits: $8 \times 2$ off-diagonal ones denoted by $\delta\psi_\alpha$ and $\delta\psi'_\alpha$, and $8 \times 2$ on the diagonal denoted by $\psi_\alpha$ and $\psi^{bh}_\alpha$. The stochastic relaxation from a quench is given by the classical profiles $\xi_{cl}(\tau) = x_{cl}(\tau)/\ell_s$ and $\delta\xi_{cl}(\tau) = \delta x_{cl}(\tau)/\ell_s$ that follow from (99).
$^{17}$ To account for the flat direction in (42), we can for example take $\Omega_x \to 0$, which gives from (99)
We now elaborate on the implications of the qubit evolution action (119), restricting our attention to early times $t_{stoch} < t \ll t_o$, before the onset of dissipation and emergence of geometry. For the remaining discussion, we will use the coherent state representation of the qubits, which we first briefly review. For a qubit with states $|0\rangle$ and $|1\rangle$, a representation over a coherent state $|\eta\rangle$ looks like [64] where $\eta$ is a Grassmann variable. A general state $|\Phi\rangle$ is then a function over the Grassmann variables, $\langle\eta|\Phi\rangle \equiv \Phi(\eta)$. A Bell state is then represented as The expectation value of an operator takes the form of a function over Grassmann variables The path integral measure is such that For a Hamiltonian of qubits referenced by the operators $\psi^\pm$, we would replace the operators $\psi^\pm$ by a Grassmann variable $\eta$ and its conjugate. For a simple bilinear and time-dependent structure with sources, we have The unitary evolution operator as a function over Grassmann variables takes the form where the propagator is given by We can then use this approach to write the unitary evolution operator for the qubits given by (119). The Grassmann variables will be labeled as $\delta\psi$, $\delta\psi'$, $\psi^-$, $\psi^-_{bh}$, and their complex conjugates, in correspondence with the respective operators. We then seek the evolution operator written as that acts on the qubit wavefunction $\Phi(\psi^+(0), \psi^+_{bh}(0), \delta\psi(0), \delta\psi'(0))$. We have the evolution of an $8 \times 4$ qubit system, half on the matrix diagonal and the other half off-diagonal; all 32 qubits are part of a cluster. The time evolution is obviously sensitive to the details of the quench, given by $x_{cl}(t)$ and $\delta x_{cl}(t)$. The initial wavefunction $\Phi$ is another input to the problem. Cluster formation dynamics might naturally involve the delicate physics of D0 bound state formation, akin to Cooper pair formation in superconductivity. The dynamics of the marginal bound states in Matrix theory is a complicated strong coupling problem that remains an open issue, and we will not be able to tackle the full problem here.
Instead, given the spirit of an effective approximate scaling analysis, we will next engage in a speculative analysis that is inspired by a recent toy model of black hole qubit evaporation due to Osuga and Page [65]. We will argue that the Matrix theory qubit evolution operator has the hallmarks of the toy model presented in [65], under a series of assumptions.
In [65], a toy model was proposed whereby the black hole Hilbert space is augmented to a tensor product that involves the black hole qubit sector and two other sectors, one for in-falling and another for outgoing radiation modes just inside and just outside the event horizon. Each black hole qubit is paired with two qubits that are in the singlet Bell state. The latter is proposed to represent the vacuum for the radiation pair of modes that assures smooth spacetime near the horizon. As a black hole qubit evaporates away, [65] proposes a unitary evolution operator that essentially exchanges the black hole qubit with a qubit of outgoing radiation, leaving the black hole sector qubit entangled in a Bell state with the qubit of incoming radiation. The result of this is that one qubit of information leaves the black hole (into the outgoing radiation sector), and a vacuum Bell state of two qubits (black hole and incoming radiation sectors) is left behind that is now to be interpreted as part of a bit of new empty spacetime created just outside a black hole as the latter shrinks in size. The key assumptions in this model are: interactions in the black hole qubit sector are non-local at the Planck scale, and a Bell vacuum state for black hole and incoming radiation qubits is tantamount to shrinking the black hole or equivalently expanding the vacuum space outside of it. The motivation for this toy example is to present a proof of concept model of black hole evaporation consistent with black hole complementarity.
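The qubit bookkeeping described above can be made concrete with three qubits ordered as (black hole, incoming, outgoing). In the sketch below a plain SWAP of the black hole and outgoing qubits stands in for the evolution operator of [65], whose detailed form is not reproduced here. The amplitudes `a` and `b` and all function names are illustrative assumptions.

```python
import math

def index(bh, inc, out):
    """Basis index for |bh, in, out> with bh as the most significant bit."""
    return (bh << 2) | (inc << 1) | out

def initial_state(a, b):
    """(a|0> + b|1>)_bh tensor the singlet (|01> - |10>)/sqrt(2) on (in, out)."""
    s = 1.0 / math.sqrt(2.0)
    state = [0j] * 8
    for bh, amp in ((0, a), (1, b)):
        state[index(bh, 0, 1)] += amp * s
        state[index(bh, 1, 0)] -= amp * s
    return state

def swap_bh_out(state):
    """Exchange the black hole qubit with the outgoing radiation qubit."""
    new = [0j] * 8
    for bh in (0, 1):
        for inc in (0, 1):
            for out in (0, 1):
                new[index(out, inc, bh)] = state[index(bh, inc, out)]
    return new

def expected_final(a, b):
    """Singlet on (in, bh) tensor (a|0> + b|1>)_out."""
    s = 1.0 / math.sqrt(2.0)
    state = [0j] * 8
    for out, amp in ((0, a), (1, b)):
        state[index(1, 0, out)] += amp * s   # |in=0, bh=1>
        state[index(0, 1, out)] -= amp * s   # |in=1, bh=0>
    return state

a, b = 0.6, 0.8j                              # arbitrary normalized qubit (assumption)
final = swap_bh_out(initial_state(a, b))
print(all(abs(x - y) < 1e-12 for x, y in zip(final, expected_final(a, b))))
```

After the exchange, the information originally carried by the black hole qubit sits in the outgoing sector, while the black hole and incoming qubits are left in a singlet, which is the configuration the text interprets as a bit of new vacuum.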
In our setup, we have an explicit quantum theory of gravity that dictates the qubit evolution operator. The partons of the matrix black holes are clusters of diagonal and off-diagonal matrix qubits, about $8 \times (d-1)^2$ qubits in $d$ space dimensions. For $d = 3$, that is 32 qubits. We propose that each cluster of qubits, a 32-qubit system, carries only 8 qubits worth of information, corresponding to the 256 supergravity states that can encode information; the remaining 24 qubits are scaffolding in a highly entangled Bell-like vacuum state that is the result of cluster dynamics. These represent the halo around the event horizon. Naturally, the information is on the diagonal qubits, say in $\psi_{bh}$ in the specific setup we have been considering. That means that $\delta\psi$, $\delta\psi'$, and $\psi$ start off in a maximally entangled vacuum Bell state of 24 qubits representing radiation or 'membrane goo' near the horizon. We then propose that the unitary evolution operator from (130) and (119), given a perturbation of the stochastic variables $x_{cl}(t)$ and $\delta x_{cl}(t)$ that describes the evaporation of the $x$ matrix entry, results in having the qubit of information $\psi_{bh}$ transferred to $\psi$, which exits the Matrix black hole. The end result leaves behind a vacuum Bell state of qubits for $\delta\psi$, $\delta\psi'$, and $\psi_{bh}$ that is to be interpreted as the production of a bit of new spacetime outside the black hole. As a result, the matrix black hole shrinks in size from $N$ to $N - 2$. Looking at the form of (119), we see a structure that has the right general form to potentially generate such an evolution of qubits. The analogue of the exchange operator from [65] in our language takes the form $\exp[\,i \alpha t\, (\psi^+ - \psi^+_{bh})(\psi^- - \psi^-_{bh})]$. Our effective Hamiltonian involves in addition the mediation of the light $\delta\psi$ modes in combinations of the form $\sim (\psi^+ - \psi^+_{bh})\,\delta\psi^-$ and its complex conjugate.
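Why an exponential of $(\psi^+ - \psi^+_{bh})(\psi^- - \psi^-_{bh})$ can act as an exchange operator may be seen from a short calculation. It assumes, which is not stated explicitly in the text, that $\psi^+ = (\psi^-)^\dagger$ and $\psi^+_{bh} = (\psi^-_{bh})^\dagger$ are canonically normalized fermionic operators, $\{\psi^-, \psi^+\} = \{\psi^-_{bh}, \psi^+_{bh}\} = 1$ with all other anticommutators vanishing, and it treats a single spinor component. The symbols $c_\pm$, $n_-$, and $U$ are introduced only for this illustration.

```latex
\begin{align}
c_\pm &\equiv \frac{\psi^- \pm \psi^-_{bh}}{\sqrt{2}}, \qquad
\{c_a, c_b^\dagger\} = \delta_{ab}, \qquad
n_- \equiv c_-^\dagger c_- , \\
(\psi^+ - \psi^+_{bh})(\psi^- - \psi^-_{bh}) &= 2\, c_-^\dagger c_- = 2\, n_- , \\
U(t) \equiv e^{\,i \alpha t\, (\psi^+ - \psi^+_{bh})(\psi^- - \psi^-_{bh})}
  &= e^{\,2 i \alpha t\, n_-}
  = 1 + \left(e^{2 i \alpha t} - 1\right) n_- , \\
U^\dagger c_- U &= e^{2 i \alpha t}\, c_- , \qquad U^\dagger c_+ U = c_+ , \\
\alpha t = \frac{\pi}{2}: \quad
U^\dagger \psi^- U &= \psi^-_{bh}, \qquad U^\dagger \psi^-_{bh} U = \psi^- .
\end{align}
```

Under these assumptions the operator only rotates the phase of the antisymmetric combination $c_-$, and at $\alpha t = \pi/2$ that phase is $-1$, which interchanges $\psi^-$ and $\psi^-_{bh}$. The additional couplings to the light $\delta\psi$ modes mentioned in the text are not included in this simplified calculation.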
Bell states with 24 qubits are very difficult to study and even determine in their own right. Added to this complication is the fact that (119) is in general non-local due to the light off-diagonal modes. As a result, it is a very challenging task to determine the evolution of the qubits using the action (119). To see this, note that the non-local couplings in (128) have scale given by where we used (99). For $\chi \to \xi_{cl} \gg 1$, given that $r_{bh} \gg \ell_s$. For $\chi \to \delta\xi_{cl} \sim 1$, given that the cluster length scale is $\ell_s$. In any scenario, the relevant dynamics is highly non-local. Noting some of the general similarities between the model of [65] and ours, we leave the analysis of the significantly more complex dynamics of our system for future work.

Discussion and Outlook
The analysis in this work is a first attempt to develop a quantum gravity-centric, bottom-up picture of black hole event horizon physics. The results can be summarized through two main conclusions:
1. We have determined that near horizon dynamics is non-local in space and time at the Planck scale. The thermal degrees of freedom of the black hole are 'cells' of around $d$ particles, for a black hole in $d$ space dimensions; each cell spans a size of order the Planck scale. One can think of each cell carrying bits of information, encoded in the polarization states of the fermionic variables of Matrix theory, or equivalently the polarization states of the supergravity multiplet on the dual side. The dynamics of black hole degrees of freedom is non-local and chaotic for short Planckian timescales, in a regime where the Yang-Mills theory is hovering just below strong coupling. At longer timescales and larger distances, the dynamics is effectively local both in time and space, while being strongly coupled. This is when and where an effective geometrical picture is possible.
2. When describing evaporation, one is dealing with a chaotic system near the would-be event horizon with a characteristic timescale given by the Planck scale. To describe the evaporation via a top-down approach, i.e. via Hawking's approach, one needs to average chaotic dynamics over super-Planckian timescales. Where a spacetime description is valid, one is necessarily left with a non-unitary effective picture for the evaporation arising from coarse graining over Planckian chaotic motion. The suggestion is that the resolution of the black hole information loss paradox cannot lie in any framework that relies on a well-defined smooth spacetime geometry at the event horizon. This is a plausibility argument: we demonstrated that, through a rather simple stochastic model with a single input scale, one can understand how Hawking evaporation is inherently non-unitary, naturally due to stochastic, chaotic UV physics. This simplest of settings necessitates however the breakdown of smooth geometry at the horizon. This observation, together with other independent evidence towards a breakdown of geometry at the horizon, constitutes strong evidence that one most likely needs to look for resolutions of the information paradox in models involving a new perspective on near horizon geometry. The geometrical description of black hole evaporation is inherently non-unitary as it arises from averaging over Planckian timescales that characterize the chaotic physics of the underlying degrees of freedom.
A couple of footnotes are in order. First, we identify emergent geometry at the benchmark of strong effective Yang-Mills coupling $g_{eff}(\tau)^2$, as opposed to strong effective 't Hooft coupling $g_{eff}(\tau)^2 N$, which is the natural coupling for large $N$. The subtlety here is that the coupling that governs the microscopic event horizon dynamics is one that arises from the interaction of 'order one' matrix entries on the diagonal. At most groups of order $d^2$ particles participate in the dynamics, hence the relevant effective coupling is not the $N$ dependent 't Hooft coupling. In describing the gravitational interaction of the whole black hole with entropy $S \sim N$, the relevant effective coupling is indeed the 't Hooft coupling; but microscopic event horizon dynamics does not involve the participation of all $N$ degrees of freedom.
The second footnote has to do with implicit connections to the issue of black hole complementarity [58]-[61]. In modeling the mean field potential for the degrees of freedom of the Matrix black hole, we note that there was no need to introduce a separate Planck scale near the horizon: the entire potential can be modeled using a single scale, the radius of the event horizon$^{19}$. This is not surprising since we were modeling the physics in a manner to match against expectations on the dual supergravity side. We also noted that the qubit action we arrived at has some of the features of the qubit evolution toy model proposed in the work of Osuga and Page [65]. The latter consisted of a proof-of-concept system that circumvents the need of a firewall by positing non-local interactions at the horizon and an exchange mechanism of qubits within a direct product of three Hilbert spaces. All these ingredients of this toy model emerge naturally from our Matrix theory discussion. However, our action is more complicated than the one in [65], and we leave a detailed analysis of the dynamics for future work. Nevertheless, these similarities between the two systems, ours and that of [65], might be hints that a firewall is not needed at the event horizon after all, and black hole complementarity prevails. This is consistent with [14,15,54] given the non-local nature of the interactions near the event horizon in Matrix theory, at the level of D0 brane clusters. There is however a significant conceptual challenge to this argument. Black hole complementarity is a statement about the perspective of an in-falling observer. This means that one needs to understand how a change of perspective between the observer at infinity and the one in-falling past the horizon is realized in the language of Matrix theory. Presumably, this involves a Matrix transformation in $U(N)$ since one expects that local spacetime coordinate invariance is embedded in the gauge group of the theory.
This in turn requires a more precise map between emergent geometry and metric, and matrix degrees of freedom. Without this critical missing ingredient, we cannot conclusively understand how the firewall paradox is addressed by our effective model. Related to this last point, we also note that our treatment explicitly chooses a frame for describing the black hole, presumably corresponding to the perspective of an outside observer.
This creates a clear separation between the roles of diagonal and off-diagonal matrix entries. The residual gauge freedom is the group of permuting diagonal entries, a subgroup of $U(N)$. The more interesting transformations would mix diagonal and off-diagonal entries, and we believe these correspond in part to switching the perspective of the observer. Very little is known or understood about this part of the Matrix-supergravity duality, and it seems a full treatment of the quantum black hole would necessitate progress in this direction.
This work is a step towards unravelling the microscopic details of black hole horizon physics within a theory of quantum gravity that is fully embedded in string/M-theory. The effective model approach opens up new directions for a range of possible investigations and extensions that can only add to our understanding of black holes and quantum gravity. We hope to report on some of these in future works.

Acknowledgments
This work was supported by NSF grant number PHY-0968726.