Entanglement in Fock space of random QFT states

Entanglement in random states has turned into a useful approach to quantum thermalization and black hole physics. In this article, we refine and extend the `random unitaries framework' to quantum field theories (QFT), and to include conserved charges. We show that in QFT, the connection between typical states, reduced subsystems and thermal dynamics is more transparent within the Fock basis. We provide generic formulae for the typical reduced density matrices and entanglement entropies of any given subset of particles. To illustrate our methods, we apply the generic framework to the simplest but non trivial cases, a massless scalar field in two dimensions and its generalization to the case of N scalar fields, including the large N limit. We find the effective temperature, by matching the reduced dynamics to a Gibbs ensemble, and derive the equation of state of the QFT. The deviations from perfect thermality are shown to be of order 1/S instead of exp(-S), a result which might be relevant for black hole physics. Finally we describe the analogue of the so-called `Page curve' in the QFT scenario as a function of the energy scale which divides high from low energy degrees of freedom.

1 Introduction and summary of results 1

.1 Quantum thermalization and random matrices
In its simplest formulation, the black hole information paradox [1] follows from the statement that black hole radiation [2] is precisely thermal, with the temperature depending just on the conserved charges defining the black hole. Given that there is an exponentially big number Ω = e S BH of initial states that could collapse to the given black hole, where S BH is the Bekenstein-Hawking entropy [3,2], it is concluded that after the black hole has evaporated the information about the initial state is lost. Mathematically, the paradox arises since under unitary evolution of the initial quantum state, we can never produce a thermal ensemble, where ρ Gibbs is an appropriate Gibbs distribution for the problem at hand. Interestingly, this problem is strikingly similar to one old problem in quantum mechanics, the problem of 'quantum thermalization', see [4,5,6] and references therein. The problem of quantum thermalization can be easily stated as: • In the context of microscopic unitarity, and given (1.1), how can Gibbs ensembles emerge?
In what follows we will assume the Hilbert space to be factorizable 1 : (1. 2) The number of factors in (1.2) is unconstrained, it can be finite or infinite. Besides, the Hilbert space factors do not need to be equal to each other. If the algebra of operators acting on H i is given by O a i i , with a i a discrete parameter, the algebra of operators acting on H is Reduced subalgebras/subsystems A are defined by specifying a set of Hilbert factors, and considering non-unity operators only in those subfactors: As is well known, given a unitarily evolving pure state ρ(t) = |ψ(t) ψ(t)|, the expectation value of any operator belonging to such subalgebra A is completely characterized by its 1 The ideas and results concerning the problem quantum thermalzation are most probably not sensitive to this condition. But in this framework the concepts can be explained more transparently, and a plethora of results can be derived. Besides, the models we will explicitly consider in this article fall in this factorizable class. associated reduced density matrix: (1.5) where ρ A (t) = TrĀ(ρ(t)) , (1.6) andĀ is the complement subalgebra/subsystem of A.
Although perfect thermality cannot be attained within unitary evolution, we might still expect certain approximate thermality for the correlation functions of the theory. The condition stating that all measurements (1.5) are approximately thermal can be more rigorously stated as with small errors, and where ρ Gibbs is the reduced thermal density matrix associated to A. Given these preliminaries, some immediate questions concerning quantum thermalization are: • What is the corresponding ρ Gibbs for the specific theory and state ρ(t) considered?
Or equivalently, how the parameters defining ρ Gibbs are related to the parameters defining ρ(t)?
• How does the error scale in the thermodynamic limit?
Considering (1.8) and the previous questions, one modern approach to tackle this problem is to compute the entanglement entropy of the reduced subsystem A, which is defined by If our state is such that relation (1.8) holds, the next insightful relation follows as well: S A E (t) = S A Gibbs ± error , (1.11) which states that the origin of thermal entropy in quantum mechanics lies in quantum entanglement.
The discussion has been until now entirely generic. To proceed we should take a specific Hamiltonian H and find the exact evolved quantum state: ρ(t) = |ψ(t) ψ(t)| = U |ψ ψ|U † = e −iHt |ψ ψ|e iHt . (1.12) This is certainly a difficult task, since we are usually interested in high energy states, with many particles excited 2 . But due to these complicated dynamics, we expect that some approximations can be done. In particular, one common procedure is to assume that after a certain time we can approximate the evolved state as a random vector 3 : which can be equivalently thought as being created by the application of a random unitary matrix to a given initial state: |ψ random = U random |ψ . (1.14) The question now is obvious: • How do we define |ψ random or U random ?
If we could answer this question generically for any theory and initial state, it would allows us to compute typical values of physical quantities in a unitary framework. The physical quantities in which we are interested in this article are the typical 4 reduced density matrix of a given subsystem A, given by 15) and the typical entanglement entropies (1. 16) This approach was pioneered in [13,5]. In those articles a bipartite Hilbert space was considered, with dimension AB, where A and B are the dimensions of the corresponding factors of the total Hilbert space. For this Hilbert space, |ψ random was defined by averaging over the continuum manifold of quantum states [13], with a constant measure, i.e the Haar ensemble of quantum states in the full Hilbert space. Again, this is equivalent to applying a random unitary U random picked from the Haar ensemble of random unitaries. With this specific definition of randomness in the given Hilbert space, the typical reduced density matrix (1.15) was found in [13] and, using those results, the typical entanglement entropy (1.16) was found in [5] 5 . In the light of the previous questions about quantum thermalization, their results for this specific case are the following: • The typical reduced density matrix of a subsystem A is proportional to the identity matrix. Therefore, this reduced density matrix might be associated to the thermal density matrix of any Hamiltonian (defined for the finite dimensional Hilbert space) in the infinite temperature limit. The first bullet point expresses the fact that unitary evolution typically leads to thermal behavior, with an infinite temperature density matrix for this specific case. The second point expresses the deviations from thermality, showing that the error size when A O(1) is exponentially suppressed. This result has been used repeatedly in the past to argue against the Hawking information paradox [2], since the available computations of correlation functions during black hole evaporation are insensitive to the expected corrections, of O(e −S BH ) if we trust the previous random approximations.
We remark here, see [11,12] for recent reviews and references therein, that the intuition coming from the random unitaries/states framework has had a strong impact in the context of black physics and quantum thermalization, given a plethora of new concepts, problems and solutions. But we believe that the framework deserves further development, since its present status has also deficiencies. Firstly, the random unitary framework has not been formulated generically in the context of QFT. Secondly, the random unitary evolution defined in [13] does not conserve any charge, such as energy, momentum, electric charge, etc. Besides, the framework just provides a Gibbs distribution with infinite temperature. It is then interesting to ask the following questions: • Can the framework be generalized systematically so as to include conservation of any given charge?
• Can it be formalized in a QFT context? 5 See [12], and references therein, for an analogous but somewhat different approach coming from quantum information theory.
• Can non-infinite effective temperatures emerge in the framework?
• Do the deviations from thermality differ from the previous case?
In this article we develop a framework to solve these questions, and which has interesting different potential applications. We summarize the results here 6 .

Summary of results
The natural answer for the first question is not to choose the Haar ensemble of random unitaries/states, but to choose the appropriate ensemble of unitary matrices conserving any given set of charges. The problem with this approach is to construct such an ensemble, and we do not see a systematic way of doing so. In this article we will use the Haar ensemble in an appropriate subspace of the Hilbert space, defined by the given initial charges. This procedure can also be seen as implictly defining an esemble of random unitaries different from the Haar ensemble in the full Hilbert space. Within this framework we will provide generic formulae for the typical reduced density matrix and entanglement entropy of any given subsystem and for any given set of conserved charges. These objects turn out to be written generically in terms of 'constrained multiplicities', number of different configurations with a given set of constraints. The results connect in a direct and transparent way modern approaches of quantum information theory, concerning for example the computation of entanglement entropies, with more traditional frameworks of microstate counting.
For the second question we argue that the natural way to generalize the framework to a QFT context is to use the Fock space formalism. Fock space provides an ideal setup to define the appropriate subspaces corresponding to the given set of conserved charges. Besides, it is also an ideal setup to define partitions/subsystems by tracing out different set of particles. One possibility is to integrate out all momentum modes but a given one, as done in [18] for the case of vacuum states. We can also integrate out positively charged particles, left movers of a CFT, etc. Interestingly, the formulas we find in the case of random states are generic for all types of subsystems.
For the third and fourth questions we need to sacrifice generality of the analysis and choose concrete models in which to apply the developed formulas. We choose the simplest but non-trivial cases, in which many things can be computed analytically: a massless scalar field theory in 1+1 dimensions and its generalization to the case of N scalar fields. For a given initial total energy, we derive the typical density matrix and prove that in the thermodynamic limit it is equal to the thermal density matrix. It turns out that the effective temperature is finite, and depends on the initial energy in the way expected from usual thermal considerations. In this way we are able to derive the equation of state of the QFT directly from random dynamics. The equation of state is seen as the typical macroscopic configuration of an underlying unitary quantum model.
We also compute the deviations from thermality, obtaining an important insight. As commented before, for the so-called Page case, developed in [13,5], the typical density matrix is exactly the thermal distribution withouth the need of taking the thermodynamic limit, and the deviations of the typical entropy from the entropy of the typical reduced density matrix are exponentially suppressed. This means that deviations from thermality are exponentially supressed, see Appendix A for a complete discussion of all cases. On the other hand, as we will show, in the QFT context the typical density matrix and thererefore the entanglement entropy of the typical density matrix are not directly equal to their thermal counterparts, the difference being of O(1/S). Therefore, to compute the leading deviations from thermality we just need to compute the entanglement entropy of the typical density matrix, a feature which will simplify matters in future developments. From other perspective, our results pinpoint the microcanonical ensemble as the right ensemble to use when considering pure state unitary evolution.
At any rate, the important upshot is that the errors expected in a scenario in which some charges are conserved are much bigger than the ones commented before, being of O(1/S) instead of O(e −S ). The applications of these results to conceptual problems involving black holes is still unclear and deserves further development. At any rate it is interesting that: • The corrections affect directly the diagonal entries of the density matrix, and therefore should be seen in observables such as mode occupation numbers. They are not nonperturbative, and should be visible in the first O(1/S) corrections.
• For any QFT, deviations from thermality can be computed exactly within our framework, and it would be interesting if this can guide the computations of these corrections directly from black hole geometric considerations.
At the end of the article we speculate on potential applications of this framework to the relation between geometry and entanglement [19,20]. In particular, our framework can compute entanglement entropies for any desired subsystem, and it is imaginable that we could get important insights into the emergence of near horizon regions through entanglement in the field theory. In previous literature there has been problems with the 'integration out' procedure of high momentum modes, or the so-called 'holographic renormalization', when trying to get to the near horizon region of black hole backgrounds, and therefore only the asymptotic region is well understood. It seems that our framework may overcome such difficulties in the specific context of random evolution, and might teach us important lessons about event horizons, stretched horizons and near horizon regions of black holes.

Reduced density matrices in Fock space and multiplicities counting
Our initial quantum state will be a pure quantum state within a Fock space, which is the natural basis of QFT and certain spin systems: (2.1) The notation works as follows. Labels i a , i b , · · · denote the number of particles in a given representation. Labels a, b, · · · specify the representation. For example, in a certain theory we could have a = p, s, q r , where p is the momentum, s the spin and q r some charges labelled by r. As usual we assume that the Fock states in (2.1) are attached with a total conserved energy E T and total conserved charges Q r T of the free theory, where r label different possible charges. Finally, when there is no need to write all the previous labels we will use greek letters α, β to simplify notation. They are just natural numbers running over the states 7 . For example, normalization of the basis vector is expressed as α|β = δ α,β .
We now follow the logic described in the introduction. Consider that the full interacting Hamiltonian of the system H commutes with the total charges, so that E T and Q r T are conserved. If we begin with an initial Fock space state |Ψ in = |β , with definite values of E T and Q r T , the various charge conservation laws force the unitary evolution of the initial state to be of the following form: where the sum does not run over all Fock states (2.1), but only for those |α with definite values of E T and Q r T . By definition there are Ω(E T , Q r T ) = e S(E T ,Q r T ) of those, i.e the microcanonical number of states for a given total energy and total charges.
As explained in the previous section, a useful approximation when considering chaotic and highly entropic sectors of the theory is to consider U (t t relax ) U random . In this article, with the objective of extending the random unitary framework to systems with conserved charges, we will use Haar randomness in the previous subspace, which is indeed defined by the given set of charges. Up to second order corrections (doubly exponentially 7 We suppose we have a discrete theory, probably with an infinite, but countable number of states such as in QFT. The examples we work out in the article fall in this class, since they are QFT defined on bounded domains. suppressed in the entropy of the given sector) this approximation can be defined by the following averaging procedure 8 Imposing average normalization of the state we obtain We want to remark that the ensemble of random states will always be the same, the previous gaussian ensemble of random states, which can be obtained by an analogue ensemble of random unitaries with random gaussian matrix entries. What will change is the sector of states in which gaussian randomness is considered. The sector of states will be the mathematical object which needs to be analyzed in several ways. From the perspective of the full Fock space, the relations (2.3) implicitly define the action of an ensemble of random unitaries, preserving any given number of conserved charges, on some state of the Hilbert space. With (2.3) we can now ask questions about typical properties of the field theoretic state |Ψ out . For example, the average of the global density matrix is given by |α α| , (2.5) which is just the microcanonical density matrix for a theory with a generic number of conserved charges, obtained here in a straightforward manner by applying random dynamics in the appropriate subspace. Random dynamics seem to pinpoint the maximally mixed microcanonical state, instead of the canonical ensemble. The next natural step, given the structure provided by Fock space, is to choose one particle type, say a = p, s, q r , and integrate out all the others. Notice that the case of momentum space entanglement, analyzed in [18] for vacuum states, is going to be seen here as a special case, arising when we consider all types of particles within a specified momentum shell. But otherwise the Fock space formalism provides naturally more possibilities, such as the entanglement between positively charged particles and the rest, particles with a given spin and the rest, left and right movers of a CFT, etc. All possible types of sets and reduced dynamics within the Fock basis of the Hilbert space are constructed by joining the appropriate set of a, b, · · · conforming the type of subsystem we want to study. We thus begin with the simplest one, given by one particle type a. We first write the state in the following form so that now α represents all other particle types different from a, but the sum still runs over the previously specified subspace. The reduced density matrix is This is directly a diagonal density matrix, without the need of averaging. Indeed, for a subsystem with one particle type a, it just happens that for two states |i a , α and |j a , α with the same energy and charge, they must obey i a = j a . It will no longer be true for subsystems with more than one particle type. Using (2.3) we can obtain the typical reduced density matrix where Ω(E T , Q r T , i a ) is the number of states with energies E T , charges Q r T , and a fixed number i a of particles of type a. The maximum value of the number of particles of type a is denoted by i max a and needs to be determined case by case. We discuss this in the example presented in the next section.
With the previous density matrix, the average entanglement entropy of a particle of a given type a, in the fixed E T and Q r T sector is given by: Being rigorous this is the entanglement entropy of the average density matrix, usually denoted by S E ([ρ a ]). The full average entropy is computed in Appendix A, where we show that the difference between the two is exponentially suppressed in the microcanonical entropy. The entropy of the typical density matrix will be enough for us, since we are not interested in the deviations between the typical entropy and the entropy of the typical density matrix, but instead, on the deviations from thermality. In the usual case of [5], typicality and thermality turn out to be exactly the same, and therefore deviations between thermal entropy and average entanglement entropy are equal to deviations between average entanglement entropy and entanglement entropy of the average (see Appendix A for the complete discussion). We will show that when conserved charges are taken into account this is not longer true. The leading deviations from thermal entropy are those already accounted for by the entropy of the average density matrix, a feature which simplifies present and further developments. At any rate, if needed, the full computations are described in Appendix A. Relations (2.8) and (2.9) are generic formulas which turn out to be written just in terms of 'constrained multiplicities', the number of different states with a given set of constraints. Notice that the result for the entanglement entropy is finite, and no divergences occur, even taking into account that we are dealing with quantum field theories. This is obviously because we are carefully applying conservation of charges. Although the multiplicities might be difficult to compute in general, there might be a class of theories in which they can be computed, and explicit connections with thermal density matrices (in particular generalized Gibbs ensembles) might be established. We do this in the following section for the case of certain scalar field theories. At the same time the generality of (2.8) and (2.9) suggests there might be a generic way to prove that the probabilities P (E T , Q r T , i a ) are well approximated by the Generalized Gibbs Ensemble. Also it would be interesting to compute this multiplicities for integrable theories. In those theories Generalized Gibbs Ensembles are expected to govern the dynamics but deviations from them might be bigger. This might aliviate several problems encountered in previous literature, see [7]. We leave these interesting paths for future work.
Generalizing the procedure to include any desired subset of particles is straightforward. We first form the set of a, b, · · · , c in which we are interested, and write |Ψ out as: where α now labels all particles types different from a, b, · · · , c. The reduced density matrix then reads: where α labels the set of particles which are traced out. As opposed to the previous oneparticle case, this is not a diagonal density matrix 9 . However, taking the average we find the following generic formula: Although it is not diagonal, it has a nice block diagonal structure, as used and described in Appendix A.
which is diagonal. The average probabilities P (E T , Q r T , i a , i b , · · · , i c ) are ready to be compared with thermal expectations, i.e with probabilities coming from Gibbs distributions. We will discuss examples in the next section.
The average entanglement entropy -i.e. the entanglement entropy of the average density matrix (see Appendix A)-is finally We concldue that the generic equations (2.12) and (2.13) apply to any theory with a Fock basis structure, such as a QFT or certain spin systems. In particular it is applicable to Holographic Field theories. It seems a rigorous framework to study entanglement between infrared and ultraviolet domains at finite temperature. An exciting possibility is that we could potentially extract physics from the near horizon regions of quantum black holes from the structure of entanglement in Fock space of the field theory. At any rate it is expected to give more insights into the connections between entanglement and quantum gravity [19,20], an exciting direction which we leave for future work.

Examples: massless scalar fields in two dimensions
In this section we analyze in detail two specific examples. The first example is a massless scalar field theory in 1+1 dimensions defined on a finite line segment. The second one is its generalization to the case of N scalar fields. For both cases we will show the emergence of Gibbs distributions as effective states for reduced subsystems. The procedure also allows the computation of deviations from precise thermality. Finally, entanglement entropy for single modes as a function of the momentum, and entanglement between high and low energy momentum modes for a given energy cutoff will be discussed. We will end with the analogue of Page curve [5] for the case at hand. This enlarges the program spelled out in [18] to the case of random states.

Massless real scalar field on a finite segment
In this case the energy/momentum dispersion relation for excitations over the vacuum, together with the quantization condition reads for a segment of length L and n = 1, 2, · · · . The Fock space is spanned by vectors of the type: where i n is the number of particles with momentum p n , and α is just a natural number running over all the infinite but countable eigenstates, used here to simplify notation. These are eigenstates of the free Hamiltonian with Dirichlet boundary conditions. If the true Hamiltonian conserves the total energy, which for a Fock state reads where n max is defined by p nmax = E T , then any initial state with definite total energy E T will evolve towards states of the form where the |α are all the states belonging to (3.2) with total energy E T . By definition these are Ω( we conclude that the number of states with a given E T is equal to the number of different partitions p(n) of the natural number n max = L π E T , which is given asymptotically for large n max by The global entropy can now easily be extracted and for large n max we find So, in the limit E T → ∞, this agrees with the Cardy formula for a CFT with central charge c = 1 and L 0 − c 24 = n max . If the interacting theory would be a conformal theory on a line segment of length L, the conformal dimension of the pure state |Ψ out would then be ∆ = n max . Now we can directly apply the generic formulas derived in the previous section. The approximation U (t t relax ) U random is operationally defined by (2.3), which we repeat here to emphasize that it does not change from one case to another: Imposing average normalization, we obtain [ Ψ out |Ψ out ] = ΛΩ(E T ) = 1. The average of the global density matrix is given by which is just the microcanonical density matrix, obtained here in a straightforward manner by applying random dynamics in the appropriate subspace.

Entanglement of a single momentum cell
We now study the entanglement entropy in a pure state of a single particle specie. In our example, choosing one particle type just amounts to choosing a definite momentum p n for a fixed n. The momentum cell can still be multiply occupied, so the subsystem has a finite dimension determined by the size of the cell, i.e. i max n + 1, where the maximum occupation number is given by i max n = E T /p n , the integer closest to E T /p n from below. Writing the state in the form where α now represents all other particles types different from p n , we arrive at This is directly a diagonal density matrix, due to energy conservation. The average density matrix is where Ω(E T , i n ) is the number of states with total energy E T and i n particles with momentum p n . Therefore, the average entanglement entropy of momentum cell n is given by Formulas (3.12) and (3.13) are the analogues of (2.8) and(2.9) for the case at hand. To compute these quantitities we need to find Ω(E T , i n ). In this example this number can be analytically computed. The procedure is explained in detail in Appendix B. Here we just quote the result: where p(n) is the number of partitions of the number n. Now we are ready to make the first crosscheck of our computations, which constitutes one of the main results of the article. The typical probability of finding i n particles with momentum p n is given by .
It is easy to check that all probabilities add up to unity. The leading term when i n p n E T , or equivalently i n i max n , is given by This is a thermodynamic limit. Indeed, i n p n E T means that the energy in momentum cell n is small compared to the total energy, or equivalently, momentum cell n is not well occupied. This implies that most of the energy is sitting in the other momentum cells, and that these cells form a heat bath for momentum cell n. Another way of saying this, is that this limit is a good approximation for low values of n. For higher values of n, the energy in this momentum cell will typically be too large, or i max n = n max /n will be too small for the thermodynamic limit to be a good approximation. If we would have used the usual Gibbs ensemble this probability would be given by (3.17) So in the limit i n p n E T , of for CFT's, large conformal dimensions ∆ = n max , the pure state is typically seen by the momentum cell as a thermal bath at a temperature given by We thus see that it is possible to derive generic Gibbs ensembles, with any desired effective temperature, using random dynamics in the approppriate subspace. Notice that (3.18) implies S(E T )T (E T ) = 2E T on the one hand, and on the other hand there is the general Combining the two, we can determine the pressure density: This is the same equation of state as for a two-dimensional CFT. In this way we expect to recover the known relation = (d − 1)P, valid for d-dimensional CFT's, using just random dynamics. More generically it should be possible to derive any equation of state by choosing the appropriate field theory and using the same procedure. The equation of state of a fluid system, at least for this case, is the typical macroscopic configuration of the true evolving pure quantum state. The next step is to compute the deviations form thermality. The next to leading term in the previous i n p n E T expansion is given by To proof this, we only needed the Hardy-Ramanujan asymptotic formula for the number of partitions, see (B.12) in Appendix B.
The error we find here is thus much bigger than the one is usually expected. Within the usual random unitary formalism, the errors are typically of O(e −S ), so exponentially suppressed in the entropy. In our case, if we take a momentum cell with energy i n p n T , we are finding errors of O(T /E) ∼ O(1/S). This seems just to be due to energy conservation, which is explicitly ensured in our formalism. That we find such large errors may have implications for black hole physics and bulk locality in AdS/CFT, see [11,12] and references therein, and we address this issue further in the next subsection when we discuss the large N limit.
Finally, the average entanglement entropy of the momentum mode is given by 10 : and we remind that i max n = E T /p n = n max /n . The dominant contributions to this sum come from the low occupation numbers, where the probabilities are approximately thermal. For high occupation numbers, i n i max n , the probabilities are exponentially suppressed in the entropy, as one can easily compute from (3.15).
At present we have not found a way to evaluate the sum algebraically in closed form as a function of n, though one can evaluate the sums explicitly for any given n and n max , e.g. on Mathematica. However, for high values of n this is possible. For example, in the case of n = n max , the highest momentum possible, there are only two terms in the sum in the entanglement entropy, i n = 0 and i n = 1 = i max n , and the result is where in the second line, we took the leading term in the limit n = n max → ∞. The relation (3.22) shows that there are extremely small entanglement entropies in random QFT states 11 . For conformal field theories, and in the context of the AdS/CFT correspondence, we do not expect these entanglement entropies to be captured by some geometric quantities in the bulk. The proposal for deriving entanglement entropy of CFT's holographically, developed in [19], is expected to capture entanglement entropies with a minimum size of O(1), since this would correspond to surfaces of Planckian size. Entanglement entropies of O(Se −S ) are clearly of non-perturbative nature from the point of view of AdS/CFT dualities [21].
On the other hand, the thermal entropy of a single mode is based on the Gibbs probability distribution (3.17). The result is well-known and can be directly computed using P (E T , i n ) and Shannon's expression for the entropy. It reads with inverse temperature β = 1/T given by (3.18), and we remind that p n = πn/L, such that βp n = nπ/ √ 6n max , which is independent of the size L. The function (3.23) is a monotonically decaying function.
The two functions, the entanglement entropy and the thermal entropy, are plotted against the momentum n in Figure 1. The dependence against n provides some short of entanglement 'running' on the energy scale of a thermal-like state. It would be interesting to find an asymptotic formula for the entanglement entropy for large E T and generic momentum p n . This would make easier the comparison with the thermal approximation. Numerical analysis, as can be seen from Figure 2, shows that for large values of n max , the entanglement entropy approaches the thermal entropy, but the corrections to the thermal result are not exponentially suppressed in the entropy, and go like 1/ √ n max ≈ 1/S. This was obviously expected from the corresponding errors in the probabilities themselves. At any rate, notice that since corrections die in the thermodynamic limit, the expectations coming from the simple analytical thermal expression (3.23) might be ready to compare with geometric dual formulations of the QFT, a very interestring direction to explore.

Entanglement entropy of low-energy degrees of freedom and Page curve
As described before we can also consider any subset of particles we wish. In this case a natural property to analyze is the entanglement between high-momentum and low-momentum modes for a given threshold p n . More explicitly we can integrate out all particles with momentum higher than p n . The reduced density matrix is then given by where α labels the set of particles with momentum greater than p n . As in the previous section, this is not a diagonal density matrix. Taking the average we find The average probabilities P (E T , i 1 , i 2 , · · · , i n ) can be computed using the same previous method of constrained partitions. This is explained in detail in Appendix B. The result is given by (B.17) divided by the total number of partitions p( L 2π E T ): This is a quite complicated expression for generic i 1 , i 2 , · · · , i n , though one can evaluate these sums using Mathematica rather easily. The situation becomes more cumbersome when evaluating the entanglement entropy, since we now have to perform additional sums over multiple occupation numbers. In practice this turns out to be rather hard using Mathematica, and we only succeeded to evaluate the entanglement entropy of a system of up to the first eight modes. However, a simplification occurs whenever n k i k p k E T , namely when the subsystem has much smaller energy then the total energy. Notice that if r n, then we can approximate the partitions which we can use in the probabilities (3.26). Indeed, if n k=1 i k p k E T holds, we can apply (3.27) to all terms in (3.26), so that we obtain where P (T, i 1 , i 2 , · · · , i n ) is the probability given by the Gibbs ensemble. One can again compute the subleading corrections, using (B.11) for each term in (3.28) and find powerlaw suppressed terms in the energy of the subsystem divided by the total energy. So the error analysis is similar to the case of a single momentum cell, and the leading corrections scale like 1/S. Hence for any subsystem, up to computable corrections, the entanglement entropy associated to ρ 1,2,··· ,n is just the sum of thermal entropies of each momentum mode. In this way we can derive the analogous curve in a QFT setup to the so-called Page's curve [5]. Obviously, if we express the entanglement entropy as a function of the corresponding thermal entropy we obtain directly Page's curve, with somewhat different deviations from thermality. But in this context it is more interesting to paint the entanglement entropy as a function of the energy scale used to divide the high-energy modes from the low-energy ones, given by p n . Hence we define two complementary subsystems A and B in momentum space, as depicted in Figure 3. Up to subleading corrections in the thermodynamic limit, we can use the thermal entropy for the low-energy modes, which becomes simply the sum of the thermal entropy of the individual modes The plot of the thermal entropy S A is given in Figure 4. It is hard to perform the sum in given n, as a function of n (horizontal axis). In the figure we took 1 ≤ n ≤ 200 and n max = 1500. The curve asymptotes, for n → n max , to 96.22, which is reasonably close to the total entropy of the system given by (3.7), which for n max = 1500 yields S 99.35. The discrepancy gets smaller for larger n max . The entanglement entropy approximates the thermal entropy well whenever the energy in subsystem A is small compared to the total energy, so for small values of n. Deviations then go like 1/S.
(3.29) analytically, except in the continuum limit where the modes βp n become infinitesimally spaced. Since β∆p = π/ √ 6n max , the continuum limit is n max → ∞, or L → ∞ keeping temperature fixed. In this limit we can approximate the sum by the integral

31)
and S is the total entropy (3.7). As a crosscheck one can verify that in the limit n → ∞, one obtains the total entropy. This can either be seen from the asymptotic expansion of the dilogarithm, or by directly doing the integral (3.32) We can also determine the thermal entropy of the complementary system, which we denote by B, namely the system of momentum modes that lie between a given n and n max . This is given by  Figure 4. The curve now asymptotes for n → 1, again to 96.22, which is reasonably close to the total entropy of the system given by (3.7), which for n max = 1500 yields S 99.35. The discrepancy gets smaller for larger n max . The thermal entropy of system B is reduced to half at mode number around n ≈ 21 again. For larger values n 21, the thermal entropy is a good approximation for the entanglement entropy, with corrections of order 1/S. thermal entropies associated to A and B have different functional structures as we vary the subsystem sizes. Given that the entanglement entropy of A should be equal to that of B it might naively seem there is a contradiction here. But indeed there is no contradiction, since the thermal entropy is a good approximation for the entanglement entropy of A for values up to n = 21 (half the entropy), and for the entanglement entropy of B for larger values of n. Therefore, what is meaningful here is the critical momentum mode p n for which the thermal entropy of the reduced subsystem is precisely half of the total entropy. This happens when the corrections to the first term in (3.30) cancel between each other. Remarkably, in the large L limit, there is an exact solution of this, due to the identity of dilogarithms which we will use for x = 2 and βp n = log 2 in (3.30). For these choices, all other terms cancel except for the first one. Hence the critical momentum dividing the QFT in two halfs maximally entangled with each other is given by giving a different interpretation of temperature in the QFT. From this perspective, the temperature T provides the energy scale wich divides the QFT into two equal parts maximally entangled with each other, with a entanglement entropy equal to S/2. In terms of the mode numbers, (3.35) can be written as

The large N limit
As commented in the introduction, part of the reasons to generalize the random unitary framework to the context of quantum field theory is to get closer to the physics of black holes 12 . In particular our formalism can be useful to describe black holes in anti de Sitter spacetimes via AdS/CFT [21]. In these theories we have a large gauge group, with a corresponding large number of field species. In AdS/CFT this number of field species is counted by the central charge c of the CFT, which is taken to be large to have a smooth gravitational dual background. With this in mind we would like to repeat the previous exercise for the case of N scalar fields, in the large N limit. For each scalar field, labelled by a = 1, · · · , N , we have the same energy and momentum dispersion relation for excitations over the vacuum for a segment of length L and n a = 1, 2, · · · . The Fock space is spanned by states of the type |α ≡ |i na , i n b , · · · , (3.38) where i na is the number of particles of the scalar field a with momentum p na , and α is again a natural number running over the infinite but countable set of eigenstates, used here to simplify notation. Conservation of total energy where n max is as in the one-field case p nmax = E T (so n max = LE T /π), implies that any initial state with definite total energy E T will evolve towards states of the form where |α run over all the states belonging to (3.38) with total energy E T . By definition these are Ω(E T , N ) = e S(E T ,N ) , where S(E T , N ) is the microcanonical entropy at energy E T of the theory with N scalar fields. In Appendix B this number is computed, and in the limit E T N π/L or equivalently, n max N , we obtain consistent with the Cardy formula for the microcanonical entropy: Now we can repeat the same generic procedure described in Section 2. In this case we will apply the generic formulas for the average density matrix and entanglement entropy (2.12) and (2.13) directly. The average of the global density matrix is given by which is again the microcanonical density matrix. The typical density matrix associated to a definite momentum p na is where Ω(E T , N, i na ) is the number of states with total energy E T and i na particles with momentum p na , and i max na = E T /p na . Therefore, the average entanglement entropy of a particle with momentum p na is given by (3.45) Formulas (3.44) and (3.45) are again the analogues of (2.8) and (2.9) for the case at hand. To analyze the previous formulas we need to find the constrained multiplicities Ω(E T , N, i na ). This is computed in Appendix B in the same limit E T N π/L as before. The result reads where we remind that n max = LE T /π. Expanding the probabilities P (E T , N, i na ) for i na n a n max , the leading term is given by which is just the Gibbs ensemble at temperature Notice that again T S = 2E T , so that the pressure density reads as expected for a two-dimensional conformal field theory.
We can now compute the deviations form thermality, to check if there is some extra dependence on the central charge c = N of the theory. The next to leading term in the previous expansion is given by: The error we obtain is again much bigger than usually considered. For a typical momentum mode with energy of O(T ), the error N E T (T ) is: In conformal field theory language, this translates into so for large conformal dimensions ∆ compared to the central charge c, the errors are small. Equivalently, for high temperatures, the errors are small. There is a simple intuition behind this result, given the approximation in which we are doing the computations. As explained in Appendix B, to compute the microcanonical degeneracy Ω(E T , N ) for the case of N fields, we need to perform a difficult sum over different partitions. To compute it, we use a saddle point approximation. The physics behind the saddle point approximation, expected to be valid for E T N π/L, is that the problem at hand is equivalent to N single scalar field theories, each one with a total energyĒ T = E T /N . In this approximation we can then use the results of the single field model with total energy given byĒ T . Formula (3.20) then provides an error of order in a pn ā E T = N in a pn a E T , and indeed, in the limit E T N π/L, the error is small. We want to remark here that one should not believe (3.51) at all energies, since the previous formulas for the multiplicities are only valid in the limit E T N π/L, or equivalently, for high temperatures T L 1. Naively it might seem that thermal physics should not be valid for E T N π/L, but this is not the case. For smaller energies it is difficult to do the exact computations. But to see that thermal dynamics is still a good approximation, and compute the deviations from it, we can consider the specific case in which the total energy is given by E T = π/L. The only Fock states with such an energy are the ones with one particle excited in the lowest momentum mode. There are N of such states, one per field specie. Therefore, using the previous random machinery, the typical reduced density matrix of the lowest momentum mode of a single field is given by This is just a mixed state, with probability equal to N −1 N for specie 1 to be in the vacuum and probability equal to 1 N of having its lowest momentum mode occupied. Matching with the associated Gibbs ensemble is not as transparent as before. One possibility is to match the first probability exactly, from which we obtain β = L π log N . With this effective temperature the other probability is given by 1 N (1 − 1/N ). We see that the deviation from thermality is of O(1/N ) in this specific case, so we conclude that thermal dynamics is still a good approximation to the expected unitary microscopic result in the large N limit. At any rate, it seems there might be a qualitative difference in the analysis when going from energies E T N π/L to E T N π/L, or equivalently from high temperatures to below a critical temperature T crit ∼ 1/L. Assuming that our results would also be valid for a CFT on a circle, and the field theory at large N is dual to a gravitational bulk with AdS 3 radius proportional to L, then this would signal the Hawking-Page transition [22], seen in the field theory in the size of deviations from thermality. This might have interesting implications in the context of AdS/CFT [23]. The regime we are working in corresponds to the black hole phase. Indeed, the black hole corresponds to a highly excited state in the CFT, so we require a large conformal dimension ∆. Furthermore, large central charge is required for the bulk theory to be weakly coupled, and finally, we require ∆ c or T T crit to be well above the Hawking-Page transition point.
Moving forward, the average entanglement entropy of the single momentum mode n a is given by (3.54) As stated before, this is a complicated sum for which we do not have a definite analytic expression. For the case of n a = n max one can explicitly do the sum, since there are only two terms, i na = 0 and i na = 1. The result is where in the second line we took the leading term in the limit n = n max → ∞. We see that including more field species does not spoil the appearance of exponentially suppressed entanglement entropies in momentum space. We remark again that we find it very unprobable that geometric quantities, such as geodesic lengths, can capture such physics, since the minimum proper lengths are of O(1) in Planck units. Other aspects of the entropy behavior do not change drastically either from the single field case. This is easily observed by noticing that the thermal entropy of a given momentum mode, whatever the field specie, is left basically unchanged by the inclusion of more field species. The only change comes again by the total energy associated to a given field, and therefore the effective temeprature associated to each mode. For N fields, the total energy given to one field isĒ T = E T /N , and therefore the temperature T = Integrating out all momentum modes higher than a certain critical momentum p n is also possible. The associated constrained multiplicity can be computed in the same way as the previous ones, and an analogous formula to (3.26) can be obtained. Therefore, one can approximate the typical reduced density matrix with the Gibbs ensemble in the same previous limit. We will not repeat the procedure here, since it is straightforward and those results are not affected by the inclusion of more fields.
Besides, the analogous of Page curve in this QFT setup is not going to change either. To observe this, notice that up to subleading corrections in the thermodynamic limit, we can use the thermal entropy for the low-energy modes. This is simply the sum of the thermal entropies of the individual modes of every field specie where we remind that β = πN L 6E T . Due to the simplicity of the previous relation, the critical momentum p n for which the thermal entropy of the reduced subsystem is precisely half of the total entropy does not change when considering N fields. Hence, the critical momentum dividing the QFT in two halfs maximally entangled with each other is again giving a different interpretation to temperature in the QFT.

Conclusions
The use of random unitary dynamics is having a strong impact in the physics of black holes and in the context of quantum thermalization, see the nice recent reviews [11,12] and references therein. Probably the most famous example comes when considering finite dimensional Hilbert spaces, and when using the Haar ensemble of random unitary matrices. This example was first studied in [13], in which the typical properties of reduced density matrices were derived. Later, in [5], their results were used to compute the average entanglement entropy of a given subsystem.
In this article we have taken some steps towards generalizing the previous random unitary framework, so that it can be applied to any physical system and phase of it 13 . We formulated the random unitary framework in Fock space. If unitary evolution constrains the state of the system to live in a specific subspace of the Hilbert space, defined by some set of charges, the typical properties of the subspace can be studied using the simple Gaussian relations (2.2) and (2.3). This Gaussian ensemble is not precisely the Haar ensemble, but deviations between both of them are exponentially suppressed. Indeed, just with those Gaussian relations we were able to rederive the old results of [13,5] in Appendix A.
Formulating the problem in Fock space makes the study of reduced dynamics very natural. Generically one can choose any subset of particles, charcaterized by several quantum numbers, and integrate out all the others. The main results of the article are equations (2.12) and (2.13). These are generic formulas for the typical reduced density matrix and the typical entanglement entropy associated to it. They are valid for any desired subsystem in any theory. They are written just in terms of 'constrained multiplicities' Ω(E T , Q r T , i a , i b , · · · , i c ), i.e. the number of configurations with a given set of constraints. In this way we connect modern approaches to quantum thermalization concerning quantum information theory, and in particular quantum entanglement, with more traditional microstate counting apporaches.
The standard deviations of both formulas are computed in Appendix A. For the case of entanglement entropy we developed a new method to compute the deviations. The method directly connects the differences between Page's curve and the exact thermal answer to the well-known Wigner's semicircle law for the eigenvalue statistics of the ensemble of Gaussian Unitary Matrices (GUE) 14 , see Appendix A. This is very satisfactory, since it connects in a transparent way the more traditional approach to quantum chaos provided by random matrices and gaussian ensembles, see [24] for a nice introduction to the subject, to more modern approaches based on entanglement entropy.
Finally we applied the generic formulas for a massless scalar field theory in a finite segment, and for its generalization to the case of N field species. In this context we arrived at the following results: • We showed the emergence of Gibbs dynamics with the correct effective temperature as a function of the total energy of the state. Any desired temperature can be produced, just by varying the total energy. The equation of state of the QFT is seen as the typical macroscopic configuration of the underlying microscopic quantum state.
• The typical density matrix is not equal to the thermal density matrix, but to the microcanonical density matrix. Both are seen to be equal just in the strict thermodynamic limit. This is a huge difference with the old model described in [13], in which typicality is equal to thermality.
• The deviations from thermality are of O(1/S) instead of O(e −S ). These are much bigger than expected, and it seems just due to energy conservation. This result might imply that information from black hole evaporation can be extracted within perturbation theory.
• We find exponentially suppressed entanglement entropies in the random state. The smallest ones are of O(Se −S ). In the context of holography, we do not expect this to be captured by geometrical bulk quantities, such as geodesic lengths, since these can only measure entanglement entropies with a minimum size of O(1).
• Drawing the analogue of Page's curve in the QFT scenario, as a function of the energy scale, provides a sort of entanglement running for the random state. The critical momentum dividing the QFT in two halfs maximally entangled with each other is given by p crit = T log 2, giving a different interpretation of temperature in the quantum theory.
We want to finish with some outlook. Since the generic equations (2.12) and(2.13) apply to any theory with a Fock basis structure, such as a QFT or certain spin systems, there are two interesting avenues for future research in this regard.
The first one is to apply the formulas to integrable systems, and see whether Generalized Gibbs Ensembles can be reproduced naturally, and what are the deviations from them. This might ultimately provide some more insight into generic differences between integrable and non-integrable quantum theories. Besides, although Generalized Gibbs Ensembles seem to fail in some situations, see [7], the random unitary ensembles we used in this article may represent better the physics of the integrable system.
The second one is to apply the framework to holographic field theories, beyond what is done in Section 3.2. The previous formulas seem a rigorous framework to study entanglement between infrared and ultraviolet domains at finite temperature. An exciting possibility is that it could turn into a workable route to extract physics from the near horizon region of quantum black holes. Aspects of these near horizon regions might be encoded in the structure of the Fock space entanglement of the field theory. At any rate we expect this framework to give more insights into the connections between entanglement and quantum gravity [19,20].

A Average entanglement entropy vs entanglement entropy of the average
One of the main objectives of this article has been to give a generic expression for the average entanglement entropy of a given reduced subsystem in a random state. Mathematically we needed to compute the following average We will now develop a novel method to derive the average. We will show that the method is able to reproduce Page's result as a specific case, but it is otherwise generalizable to our QFT framework as well.
Let us consider first the classical analogue of the previous problem. This will not be just a pedagogical exercise, since we will need the results of this classical case for the more complicated quantum one.
Consider a random probability distribution p i , for i = 1, · · · , n. With this we mean that each entry p i is an independent random variable with certain probability distribution itself, with the only constraint of global probability normalization i p i = 1. The average over the randomness in the probability distribution itself will be termed by [· · · ]. For example, the moments of p i are [p m i ] , (A.2) for m = 1, 2, · · · . The mean will be denoted simply by [p i ] =p i . The fact that they are independent random variables implies Given that the mean of a probability distribution cannot be zero, it is more interesting to work with the random variable δp i = p i −p i , the moments of which can be easily obtained from the moments of p i . In this context, problems naturally appear when we want functions of p i . The average value of a given function f (p i ), with respect to a single realization of p i , will be represented But since the probability distribution itself is a random variable, we have that [f (p i )] p is also a random variable, and we are more interested in its moments rather than in the variable itself. In particular the mean is given by To compute the average of the function we can proceed iteratively, by Taylor expanding the function in terms of the deviations from the mean. For example, in the case f (p i ) = − log p i we are computing the average of Shannon's entropy: which can be Taylor expanded as Since [δp i ] = 0 by construction, up to higher order terms in the Taylor expansion, the average Shannon's entropy is given by where DEA stands for Deviation from the Entropy of the Average, a quantity that needs to be computed to make sure the first term is the leading term in the themodynamic limit, and for other potential applications as well. We want to remark here, as it is more transparent now, that DEA is not a priori related to deviations from some definition of thermality. We want to make the case that there is a conceptual, precise and computable, difference between DEA, the Standard Deviations from Typicality σ 2 (S(p)), and Deviations from Thermality DT. Indeed one could also be interested in the average deviation of a typical realization from the mean [S(p)] 15 . This is the standard deviation from typicality To compute it, we need to find the first term in the sum using again the previous expansion method. We obtain (A.10) The average of this expression is given by and combining formulas we finally obtain: where the dots indicate higher order terms in δp.
We have written the average entropy (A.8) and the standard deviation from it (A.12) in terms of [(δp i ) 2 ]. If we could obtain this quantity for the problem of interest we could obtain the average quantities.
Let us come back now to the quantum case. As with the classical case we will call [ρ] =ρ. For all cases discussed in this article we have (ρ) ij = (ρ * ) ij = (ρ) ii δ ij , i.e the typical density matrix is a real and diagonal matrix. We follow the same procedure and change the integration variable, from the state ρ to the deviation from the mean: so by construction we have [(δρ) ii ] = 0 , (A.14) where ii are the matrix indices, and also We conclude that the reduced density matrix will be some typical density matrix plus some random matrix, with a structure that might depend on the case studied. Finally, we can translate the average over ρ to the average over its random eigenvalues p i : (A. 16) and therefore, at first order, over the random eigenvalues δp i of the random matrix δρ: In this way, if we know the statistics of the random matrix δρ and the statistics of its eigenvalues, we can compute the average value of the entanglement entropy and its standard deviation using formulas (A.8) and (A.12). We will compute it for bipartite systems next.

A.1 Random dynamics, GUE ensembles and Page's formula
Following Ref [5,13] we consider a bipartite system, with total dimension AB, where A is the dimension of subsystem A, and B of subsystem B. If |i , with i = 1, · · · , A, is a basis of A and |α , with α = 1, · · · , B, is a basis of B, the state of the system can generically be represented as The application of a random unitary to the previous state can be defined as in Section 2, except that we let it act in the full Hilbert space instead of the subspace of fixed energy. It reads At first order, the random unitary just produces AB independent gaussian random variables with squared deviation given by Λ. To fix this deviation we just fix the average normalization of the state We can now trace out subsystem B. The reduced state for subsystem A is given by We can now derive the statistics of the reduced state from the previous statistics (A. 19). The average is given by so the average density matrix is exactly equal to the thermal density matrix at infinite temperature. The mean probabilities are given byp i = 1/A for each i. We remark here that this is not generic, as we have shown in the article. When charges are conserved the average density matrix is naturally the microcanonical density matrix, and therefore it is close to the thermal density matrix, but not equal. In such cases computing the deviations from the entropy of the average is not as important, because the leading behavior of the deviations from thermality are just given by using the entropy of the average. The deviation from the mean is seen to be Therefore, the statistical properties of δρ A = ρ −ρ A are just given by [(δρ A ) ij ] = 0, and more significantly We conclude that the statistics of δρ A are those of a random matrix belonging to the Gaussian unitary ensemble (GUE), with σ 2 (δρ A ) = 1 A 2 B . Given this insight we can use one of the basic results in random matrix theory, see [24] for example, which is the single eigenvalue probability distribution f (λ). The function f (λ) is the celebrated Wigner's semicircle law. For a matrix of size N , with deviation σ for each of the entries, it is given by which is, to the level of accuracy in which we are working, Page's result for the average entanglement entropy [5]. In this case, because the mean density matrix is exactly equal to the thermal density matrix, these DEA deviations are equal to the deviations from thermality DEA = DT = A 2B , a feature which will not extend to the QFT case.
A nice feature of this framework is that it can be immediately extended to compute the deviations from typicality, which are given by relation (A.12) The standard deviation from typicality is therefore parametrically smaller than the previous DEA = A 2B .

A.2 Average entanglement in a QFT framework
The extension of the previous framework to the QFT case does not entail any more conceptual insights. It just brings some technical complexity, because the structure of the reduced density matrix is not as simple as the previous case. But the final result turns out to have the same form, i.e we will obtain DEA = A 2B for an appropriate A and B that we determine below. What is different is that for the QFT case we have DEA = DT, since the typical reduced density matrix is not equal to the thermal one. Therefore we rigorously justify the use of the entropy of the average through the article, since it provides the leading difference from thermality in scenarios in which several charges are conserved.
Let us first consider the case of a reduced density matrix for one particle type a. This case is particularly easy since the reduced density matrix is already diagonal before averaging, see (2.7). The diagonal entries, or the probabilities, are given by We can therefore compute the DEA directly from (A.8). One easily finds The deviation from the entropy of the average is then hence exponentially suppressed in the entropy S. Now we generalize to any desired subset of particles a, b, · · · , c. This is more difficult, since the reduced density matrix is no longer diagonal before averaging. To derive the deviations from the entropy of the average we have to understand the structure of the reduced density matrix. The reduced density matrix is given by formula (2.11), which we rewrite here where α labels the set of particles which are traced out. Although this is not a diagonal density matrix we notice that it has a certain block diagonal structure, due to energy and charge conservation. The entries outside the blocks are direcly zero, without the need of averaging. The blocks are defined by their total energy E ia,i b ,··· ,ic The statistics of the reduced density matrix are its averagē |i a , i b , · · · , i c i a , i b , · · · , i c | , (A.34) and the deviation from the mean, whis is given by: where we have used a short hand notation for i ≡ i a , i b , · · · , i c , etc, and where it is implicitly assumed that i, i , j, j belong to the same block, since in a different case the entries are directly zero.
From the previous expressions we can compute the statistics of δρ a,b,··· ,c = ρ a,b,··· ,c − ρ a,b,··· ,c . By construction [δρ a,b,··· ,c ] = 0. The deviations are given by: We conclude that δρ a,b,··· ,c is a block diagonal matrix, with each block B i characterized by its total energy E ia,i b ,··· ,ic T and its total charge Q ia,i b ,··· ,ic T , and where each block is a random matrix taken from GUE with size . If δp i are the eigenvalues of the block B i , the contribution to the deviation from the entanglement entropy of the average of B i is (see formula (A.8)) . (A.37) The total deviation between the average entropy and the entropy of the average is finally . (A.38) Although this expression seems opaque at first sight, it is exactly the same expression as for the Page's case. Notice that i S 2 B i counts the number of non-zero terms in the reduced density matrix, which for one particle subsystems is just i max a + 1. There is an analogous expression for the reduced density matrix of the complementary subsystem, say ī S 2 Bī . The direct product of these two numbers must be the total non-zero entries of the full density matrix, so we have i S 2 We conclude that i S 2 B i = A provides the right dimension of the chosen subsystem, while for the complementary we have ī S 2 Bī = B. Relation (A.38) is therefore exactly equivalent to relation (A.28). Finally, the deviations from typicality can be computed using (A.12) in a similar fashion.
We conclude that the difference between the average entropy and the entropy of the average density matrix behaves in the same way as the common Page's case [5], i.e it is of O(e −S ) for small subsystems with dimensions of O(1), and it is of O(1) for subsystems with dimensions of O(S).
On the other hand, the deviations from thermality are naturally defined by the difference between the thermal entropy and the average entanglement entropy In Page's case, the average reduced density matrix is equal to the thermal density matrix, and therefore the first two terms cancel each other, leaving DT = DEA = A/2B. In the QFT case, since for small subsystems we already have S β − Sρ ∼ (1/S), we do not need to consider the subtle DEA, since the leading corrections are already accounted by the entropy of the average density matrix.

B Partitions and constrained partitions
One of the main messages developed in the article concerns the direct connection between typical reduced dynamics and multiplicity counting. In particular the computation of reduced density matrices and their associated entanglement entropies boils down to the computation of microcanonical and constrained degeneracies. In the specific cases considered in this article this turns out to be possible, and we provide the detailed process in this appendix.
In the case of the real scalar field on a lign segment, we need to compute Ω(E T ), the number of states at energy E T , and Ω(E T , i n ), the number of states at energy E T with i particles with momentum p n . Ω(E T ) was seen to be the number of different partitions p(n) of the natural number L π E T , which is given asymptotically by In the same way, Ω(E T , i n ) is the number of different partitions p(n , i n ), of the natural number n = L 2π E T , in which n appears i n times. To compute it, notice the following theorem due to Euler: which provides a generating function for the number of different partitions p(m). This is because P (x) = (1+x+x 2 +· · · )(1+x 2 +x 4 +· · · )(1+x 3 +x 6 +· · · ) · · · (1+x n +x 2n +· · · ) · · · , (B.3) and picking a monomial in the k-th part looks like x i k k for arbitrary positive integers i k . In the product P (x), we get all monomials of the form x i 1 1 x i 2 2 x i 3 3 · · · = x i 1 +2i 2 +3i 3 +··· , which produce all possible partitions of the integer m = i 1 + 2i 2 + 3i 3 + · · · ni n , (B.4) with n ≤ m and i m = 0 or 1.
To fix the number of times a given number n appears in the partition, we just need to fix the monomial in the corresponding parenthesis, i.e we fix and select in the n-th bracket only the monomial x inn . This way, we get the partition of m in which the number n appears i n times for fixed (n, i n ). Hence, the generation function of p(m, i n ) is given by 17 Q in (x) = (1 + x + x 2 + · · · )(1 + x 2 + x 4 + · · · )(1 + x 3 + x 6 + · · · ) · · · × (1 + x n−1 + x 2(n−1) + · · · )x inn (1 + x n+1 + · · · ) · · · = m=∞ m=0 p(m, i n )x m . (B.5) In fact, the sum starts only from m = i n onwards, so p(m, i n ) = 0 for m < i n , which is also obvious from the definition of the partition. The previous expression implies For m − i n n < n, the identity is still true but without the second term. We now finally obtain Ω(E T , i n ) = p( L π E T , i n ) = p L π E T − i n n − p L π E T − n(i n + 1) , (B.8) a result that was used in Section 3.1.
One of our main result in the text is the sub-leading corrections to the thermal entropy, given in (3.20). It is easy to proof this formula using the Hardy-Ramanujan asymptotic formula for the number of partitions, 3 . (B.9) 17 The partition p(m, i n ) should not be confused with the partition denoted by p(m, n), which is the partition of m with largest part n. Higher order corrections are of order a 2 /N 2 or more. One would be inclined to also drop the subleading term in the exponent, such that the only next to leading order term is the a/N term in front of the exponent. This would be fine, unless a ∼ √ N , which is what we actually consider in the main text (when the momentum mode has energy of order T ). In that case, a/N ∼ 1/ √ N is still small, but expanding the exponential now gives subleading terms that are of the same order as a/N . The terms don't cancel out each other but merely change the coefficient in front of a/N . So we conclude that P (N, a, The corrections are hence of order a/N , and if we take a ∼ √ N (and similarly for b), the exponent is of order unity, and the total error scales like 1/ √ N . In the main text, 1/ √ N ∼ 1/S, so the error is inversely proportional to the entropy. Generalizing this procedure to include more constraints is straightforward. We could be interested in Ω(E T , i n , i l , · · · ), the number of states with i n units of momentum n, i l units of momentum l, etc. This is then equal to p(m, i n , i l , · · · ) the number of partitions in which n appears i n times, l appears i l times, etc, and where m = L 2π E T . We get this number from its corresponding generating function: (B.13) The previous formula can be applied for example to compute the entanglement between high energy momentum modes and low energy momentum modes. If we trace out momenta bigger than p n , we need to analyze the following generating function: (B.14) To convert this to a generating function, we use the following identity: (1 − x)(1 − x 2 ) · · · (1 − x n ) = 1 − This provides a very complicated expression for the number of modes with i 1 particles with momentum p 1 ,i 2 particles with momentum p 2 , etc., which is given by: l . (B.17) Although this seems an opaque expression, it can be matched exactly with predictions from the Gibbs distribution, as was explained in a previous section. Now we describe the computations in the second case, a theory of N scalar fields. We will proceed in much the same fashion as the previous case. First we compute the generating function Q N (x) of the number of ways p N (n), with n = LE T /π, of writing: It is simple to observe that: ) This is just seen by expanding each of the fractions, and verifying that the products provide all terms in (3.3). Therefore we obtain: p N (n) = m,r,··· ,l p(m)p(r) · · · p(l)δ n,m+r+···+l , (B.20) where the number of indices m, r, · · · , l is equal to N , and p(n) is the usual number of partitions. The previous sum can be evaluated by a saddle point approximation, when n N . Physically, the saddle point approximation is just the statement that with overwhelming probability the total energy E T N π/L will be equally distributed over all N scalar fields. Mathematically we have m = r = · · · = l = n/N , and the previous degeneracy reads: p N (n) (p(n/N )) N → λ N e 2π where λ = N 4n √ 3 . In this n N limit, the entropy now provides the usual Cardy formula: S 2π 1 6 cn max , (B.22) Finally, the generating function Q N iq a (x) of the number of states of total energy E T and i qa units of momentum q a associated to the field a is given by