Spread of entanglement and causality

We investigate causality constraints on the time evolution of entanglement entropy after a global quench in relativistic theories. We first provide a general proof that the so-called tsunami velocity is bounded by the speed of light. We then generalize the free particle streaming model of arXiv:cond-mat/0503393 to general dimensions and to an arbitrary entanglement pattern of the initial state. In more than two spacetime dimensions the spread of entanglement in these models is highly sensitive to the initial entanglement pattern, but we are able to prove an upper bound on the normalized rate of growth of entanglement entropy, and hence the tsunami velocity. The bound is smaller than what one gets for quenches in holographic theories, which highlights the importance of interactions in the spread of entanglement in many-body systems. We propose an interacting model which we believe provides an upper bound on the spread of entanglement for interacting relativistic theories. In two spacetime dimensions with multiple intervals, this model and its variations are able to reproduce intricate results exhibited by holographic theories for a significant part of the parameter space. For higher dimensions, the model bounds the tsunami velocity at the speed of light. Finally, we construct a geometric model for entanglement propagation based on a tensor network construction for global quenches.


I. INTRODUCTION AND SUMMARY
Understanding the evolution of quantum entanglement in non-equilibrium processes such as thermalization is a question of much interest. Entanglement could reveal quantum correlations not easily accessible by other observables such as thermodynamic quantities or correlation functions, and thermalization provides a dynamical setting to study the generation and spread of entanglement between subsystems.
A simplest physical context to probe this question is the evolution of entanglement entropy S Σ (t) after a global quench in a conformal field theory (CFT), where one injects a uniform energy density in a very short time interval at t = 0 and then lets the system evolve.
Here Σ denotes the entangling surface, whose characteristic size R will be taken to be much larger than the inverse equilibrium temperature, i.e. R 1/T . Recent studies of evolution of S Σ (t) for quench processes in (1 + 1) dimensions as well as in holographic systems of general dimensions have revealed a "universal" linear regime [1-3] 1 ∆S Σ (t) = v E s eq A Σ t, R t eq . (1.1) Here ∆S Σ (t) is the change of S Σ (t) from its value at t = 0, A Σ is the area of the entangling surface Σ, and s eq is the equilibrium entropy density. eq ∼ 1/T is the local equilibration time. More explicitly, we expect for a region A whose size R A is comparable to, but larger than eq , that its entanglement entropy saturates at the thermal value after the local equilibration time, i.e.
S A = s eq V A , for R A 1 T and t eq (1. 2) with V A the volume of A.
In (1.1) v E is a constant of dimension velocity and depends on macroscopic properties of the state. With the speed of light set to c = 1, for (1 + 1) dimensions it was found [1] that v E = 1, d = 2 , (1. 3) while for higher dimensional holographic systems at zero chemical potential [2,3], dubbed an "entanglement tsunami" [3] with v E interpreted as the velocity of the tsunami wave front. It was further observed in [3] that for holographic models after local equilibration the normalized rate of growth (1. 5) 1 See [4] for a proposed theory explaining this behavior. appears to be bounded by (1.4) (1. 6) In the linear regime (1.1), R Σ is constant given by v E , but in general it can have a complicated time dependence. Note that dS Σ /dt cannot be compared meaningfully across different systems or regions of different size as it generally scales with the geometric size of Σ and the number of degrees of freedom of a system. R Σ was designed to provide an intrinsic measure for the rate of growth. With dimension of velocity, we expect that in relativistic systems R Σ should be constrained from causality by some multiple of the speed of light. 2 The simplicity and universality of the linear growth (1.1) begs for an underlying physical mechanism. In particular, it would be desirable to relate v E and R Σ (t) to the speed of light, and to develop some intuition about the physical origin of the value of v E in (1.4).
In this paper we first derive a formula which relates the tsunami velocity v E to the mutual information of certain spacetime regions. The positivity of mutual information then leads to a proof that in relativistic theories v E is bounded by speed of light in all dimensions, i.e.
v E ≤ 1 , (1. 7) although, as we will discuss in later sections, likely for d > 2 the inequality cannot be saturated. A different proof of (1.7) has been found by T. Hartman [7], which uses the monotonicity of relative entropy.
We then consider various explicit models for entanglement propagation. Calabrese and Cardy [1] proposed a simple free particle streaming model to explain the linear behavior (1.1) in (1 + 1) dimensions. In this model, the injected energy density due to a global quench at t = 0 is assumed to create EPR pairs of entangled quasiparticles which subsequently propagate freely. At t = 0, entanglement correlations among quasiparticles are assumed to be local, 3 which eventually spread to large distances via free propagation of quasiparticles.
In this model (1.3) comes from quasiparticles traveling at the speed of light. That (1.1) can be captured by such a simple model is remarkable and surprising. It appears to indicate that interactions do not play a role in the growth of entanglement, with the long-range entanglement of the final state solely coming from the spread of initial short-distance correlations.
An indication that this success is likely an accident is that for more than one intervals, the model fails to reproduce the qualitative behavior of both holographic and CFT results [8][9][10].
The free streaming model can be generalized straightforwardly to a more general entan- where V A , VĀ denote the volume for A andĀ (complement of A) respectively, and s is a constant (which depends on the specific system). The measure is motivated from the result 3 At the onset of linear regime, the scale of entanglement correlations should be controlled by 1/T . It is a good approximation to treat them as strictly local when considering regions with R 1/T . 4 We study these aspects in detail for various choices of µ[A] in Appendix B. For example, while one again finds linear growth (1.1) at early times, the tsunami velocity v E now depends on µ[A].
of [11] where it was found that the average entropy for a subsystem (with size smaller than half of the total system) in a random pure state is to a very good approximation given by its size. The rough intuition behind RPS is that for a sufficiently "equilibrated" system, all degrees of freedom are entangled with one another in an equal way, and thus the entanglement entropy of a subsystem is proportional to the number of degrees of freedom it contains.
Using the strong subadditivity property one can show that the RPS measure in fact provides an upper bound for the propagation of entanglement among all free streaming models, leading to an upper bound for R Σ (t) (thus an upper bound for v E ) . This then motivates to introduce interactions in quasiparticle propagation. With interactions we then immediately face the problem of characterizing the quantum state of an interacting many-body system. Instead of confronting this very difficult problem directly, here we seek a qualitative understanding of how linear growth can arise in an interacting system and how interactions can enhance the spread of entanglement. For this purpose, we will consider the infinite scattering limit, i.e. we assume that scatterings among constituents of a system are so efficient that the typical scattering time can be taken to be zero compared to the scales of entangling region size R and time t of interests. Holographic systems after a quench in the limit of equilibrium temperature T → ∞ can be considered as an example, as there the local scattering rate is controlled by T . In the infinite scattering limit, the evolution simplifies as interactions do not introduce additional scales into the problem.
We propose a very simple model which applies the RPS measure (1.8) to certain spatial regions determined from the causal structure associated with the entangling surface. The use of RPS measure is natural in the infinite scattering limit as in this limit interactions are extremely efficient in redistributing entanglement.
The model, to be referred to as the maximal RPS model, appears to capture gross features of entanglement spread, including the linear growth, of holographic systems in the T → ∞ limit. In fact in d = 2, it does much better than expected. 5 Applying it to a single interval in d = 2 we again find (1.3). For two intervals, the model precisely recovers holographic results. For three and four intervals, we find the model (and its slight variations) reproduces intricate entanglement patterns exhibited by holographic systems for a significant part of parameter space. For general d > 2, this model gives in all dimensions v E = 1.
We also show that the failure in reproducing holographic results for certain regions of parameter space for three and four intervals in d = 2, as well as v E = 1 for d > 2 can be attributed to the fact that our model can violate the strong subadditivity condition and/or does not take full account of causality constraints. Given the already remarkable success of the maximal RPS model, in an attempt to better understand how quantum information is organized in holographic systems, we construct another model for the evolution of entanglement entropy that is inspired by the tensor network description of state evolution after a quench. This interacting model satisfies many natural geometric criteria on the entropy function, including strong subadditivity, and reproduces the holographic results for multiple intervals in d = 2. However, it does not provide a stronger constraint than v E ≤ 1 on the tsunami velocity for d > 2. It would be very interesting to find further criteria on the entropy function for the time evolution generated by a local Hamiltonian.
The plan of the paper is as follows. In the next section we provide a proof that the tsunami velocity is bounded by the speed of light. In particular, we derive a formula which relates the tsunami velocity v E to the mutual information of certain spacetime regions.
As a preparation for later discussions, in Sec. III we introduce the RPS measure for the entropy of subsystems and show that it is an upper bound for the entropy of systems of particles homogeneous in space. In Sec. IV we introduce free streaming models of entanglement propagation and discuss some explicit examples. A general upper bound on ballistic propagation of entanglement in these free streaming models is proven in Sec. V. The proof demonstrates that these models cannot account for the velocity of entanglement propagation in holographic theories. In Sec. VI we present an interacting model and apply it to various examples. It indicates that interactions can increase the tsunami velocity in higher dimensions and gives an upper bound for this velocity in relativistic theories. We also discuss shortcomings of the model. In Sec. VII we develop yet another description of entanglement evolution that is inspired by tensor network constructions. This model has the advantage of satisfying the strong subadditivity constraints and reproducing the right entanglement pattern for holographic theories in d = 2. In this context we also show the holographic result gives an absolute upper bound on the entropy after a quench for relativistic theories in d = 2.
In Sec. VIII we further examine a relation discovered in Sec. II between tsunami velocity and mutual information in the context of various models discussed in earlier sections. Some more technical results and proofs are given in the appendices. In particular in Appendix B we work out many aspects of ballistic propagation of entanglement in general dimensions, including mutual information and finite volume effects, for different entanglement patterns of the initial state.

II. PROOF OF A GENERAL UPPER BOUND ON THE TSUNAMI VELOCITY
For d > 2, so far the linear growth (1.1) has only been established for holographic systems [2,3]. In particular, both the linear growth (1.1) and tsunami velocity v E are independent of the shape of entangling surface Σ [3]. Assuming this behavior persists for general systems, here we prove that that the tsunami velocity v E is bounded by the speed of light.
Since (1.1) is shape-independent, it is enough to consider Σ a straight hyperplane, i.e. the entangling region is a half space. Let us consider at time t a half-space W (t) whose boundary is perpendicular to the x 1 -direction and another half space W (t + δt) at time t + δt. Since entropy in the quench evolution is translation invariant, we can take W (t+δt) infinitesimally displaced in the x 1 coordinate such that the boundaries of these two regions W (t) and W (t + δt) are connected by a strip of a null plane X (see Fig. 2). Consider the mutual information between W (t) and X where in the second equality we used S(W (t) ∪ X) = S(W (t + δt)) and in the third line the linear regime formula (1.1) for S(W (t)) and S(W (t + δt)).
Here we are interested in an excited state with vacuum entanglement subtracted, thus all quantities in (2.1) should be understood as so.
and from non-negativity of the mutual information It should be emphasized that here we are saying the mutual information with the vacuum part subtracted should also be non-negative. This can be justified since we are considering only contributions linear in s eq . We can take s eq large so the term proportional to s eq in the mutual information has to be non-negative by itself. 6 A half space at time t and another at time t+δt. The null region X connects the boundaries of these two regions. The horizontal direction is x 1 and vertical direction is time. Directions parallel to the boundary are not shown. 6 Note that in the vacuum, there are divergences associated with the sharp corner between W (t) and X, which could ruin an inequality like (2.3). In the same spirit, peculiarities for the entropy of sharp null surfaces in interacting theories [12] do not apply here since in the present context all geometries can be thought of as having a minimum curvature radius 1/T . respectively. Then, because of the monotonicity of mutual information, the mutual information between the null strips A and B must also vanish Now given that S(A ∪ B ) = S(C) and that by symmetry the entropy in the two null strips is the same, from (2.6) we find that i.e. the entropy of a small null strip is equal to the thermal entropy of its projection to a constant time slice. Plugging (2.7) into (2.3) we then get which proves that for any relativistic system v E must be bounded by the speed of light. A different proof of (2.8) has been found by T. Hartman [7], which uses the monotonicity of relative entropy.
Plugging the expression for S(X) into (2.2), we also find Equation (2.9) is an instructive formula which relates the deviation of a tsunami velocity from the speed of light to the entanglement between W (t) and the null surface X. In particular, it says for v E to be equal to 1, W (t) and X have to be unentangled.

III. RANDOM PURE STATE MEASURE
For the rest of the paper we investigate the propagation of entanglement in various free streaming and interacting models. Before doing that, we digress here to discuss the random pure state measure, which will play a crucial role in our later discussions.
Consider a gas of particles on a spatial manifold M in some pure state. We assume that the state is homogeneous, i.e. the particles are uniformly distributed on M. Now consider the entanglement entropy S(A) for a subregion A. We assume that S(A) satisfies the condition that as the size of A goes to zero where s is a constant. We will comment on the motivations for this condition a bit later.
We first show that given (3.1) the entanglement entropy S(A) is bounded by where V A , VĀ denote the volume for A andĀ (complement of A) respectively. We will refer to µ RPS [A] as a random pure state measure. We apply the strong subadditivity condition [13] to regions A, B, C shown in Fig. 4 to get the inequality: Using that B, C are infinitesimal, we then have where we have used (3.1) on right hand side of the equality. This inequality holds for any region A and any other infinitesimal region C outside A. It implies that as we increase the size of a region the entropy cannot increase faster than the volume of the region times s. 7 Therefore, for any region A with V A ≤ V /2 where V is the total volume we necessarily , proving our assertion (3.2). Let us now elaborate on the condition (3.1). Suppose that as V A → 0, S A is instead given For the case of super-volume behavior, α > 1, since α / → 0 as → 0, we conclude that S(A) is identically zero for all A. Clearly, this is unphysical.
For α < 1, for small A, S A can increase as V α A which is faster than (3.1). This is indeed possible, but if this power law behavior continues to large sizes no volume law or equilibrium entropy after a quench can be achieved. In fact, S A /V A → 0 for large volume if α < 1. Hence, our condition (3.1) simply means that for the quenched systems we are 7 Similarly, from S(A) = S(Ā) it also follows that S(A) − S(A ∪ C) ≤ sV C , i.e. as we increase the size of a region the entropy cannot decrease faster than the volume times s. interested in, which have large energy density with respect to subsystems sizes and times involved in the dynamics, we are assuming a local equilibrium has already taken place for small sizes and times of order 1/T . The condition (3.1) and the random pure state measure (3.2) are also reminiscent of [11] where it was found that the average entropy for a subsystem (smaller than half size of the total system) in a random pure state is to a very good approximation given by its size.

A. Setup
Here we consider a model of free propagation of entanglement after a quench at t = 0. We assume the system is homogeneous, isotropic and is in a pure state after the quench.
We take the initial state at t = 0 − as unentangled, and require that in equilibrium (defined as t → ∞) the entanglement entropy is proportional to the volume of a region. We make the following assumptions on the quench and the subsequent propagation of entanglement: 1. The quench generates a large amount of short-distance entanglement correlations which subsequently propagate freely. We will assume that the quench time interval is negligible and the initial entanglement correlations can be taken to be local. In other words, at t = 0, each point acts as an independent source of entanglement correlations, which then spread freely, limited only by causality. At time t, the entanglement relations from x = 0 spread at most to the sphere | x| = t.
2. For simplicity, we will assume that the correlations are concentrated on the light cone. It should be straightforward to generalize the discussion to include entanglement correlations inside the light cone, which we expect not to change qualitatively the physical picture and the upper bound on the propagation derived below. We could also simply postulate a µ[A] which satisfies all the properties of entanglement entropy as in the example of random pure state measure discussed below.
Given that each light cone is independent, the time evolution of the entanglement entropy of the region A enclosed by a surface Σ can be obtained by summing over the entanglement entropy of the parts of the light cones intersecting this region, i.e.
whereĀ denotes the complement of A on the light cone, and the strong subadditivity for any region A and B.
We also assume that measure µ[A] for small A (even with several disconnected pieces) is proportional to the normalized area ξ A of that region, which we already motivated around (3.5) in the last section. More precisely, for a region A included in a spherical cap of angular size ∆θ Here s is a constant and ω A denotes the volume of region A on a unit sphere and ω d−2 is the volume of a unit (d − 2)-sphere. In our current context, we will see in Sec. IV C 1 that imposing (4.4) is equivalent to the requirement of a final equilibrium state with a volume law entropy distribution. In fact, the final equilibrium entropy density is precisely given by the constant in (4.4), i.e. s eq = s.

By definition, µ[A]
and thus s has the dimension of an entropy density, i.e. 1/volume.
Given that s is the only scale of the system, for a scalable surface Σ, on dimensional grounds we can write S Σ in a scaling form where R is a characteristic size of Σ, or equivalently where λΣ is Σ rescaled by a factor λ. This scaling relation is also satisfied by holographic systems in the large size and long time limit, as we will discuss more in Sec. VI B. Equation (4.6) implies that for t small, when S Σ should be proportional to the area of Σ, S Σ must grow linear with t. At large times as t → ∞, if the system has an equilibrium, i.e. f (t/R) has a well defined t → ∞ limit, then s must be proportional the equilibrium entropy density.

B. Some examples
It is instructive to look at some specific examples of µ[A].

Entanglement carried by EPR pairs
One assumes that the quench creates a uniform density of independent EPR pairs of If A consists of a region included in one half of the light cone it immediately follows that where s is a normalization constant and ξ A is the area of region A normalized by the area of the whole light cone (4.4). The normalization constant s can be written more explicitly in terms of the particle density n as where ν the entanglement entropy within the pair from tracing out one of the particles.
For general A, not necessarily included in a half light cone sphere the measure is more complicated and depends on relative locations of different parts of A. It can be written formally as where n and n denote unit vectors on a unit sphere with Ω and Ω the respective associated angular measure. We can give a more compact expression for this measure as follows. We define A as the set of antipodal points in A, that is, A is the set of unit vectors n such that − n ∈ A. Then, as only those quasiparticles in A contribute to the entropy, whose pair (at the antipodal point) is inĀ, we have If A is included in one half of the light cone, A ∩Ā = A, and we get back (4.8).

2m particles forming a GHZ block
The EPR pair example has only bipartite entanglement among particles. We now consider an example with multipartite entanglement. One again assumes that the quench creates a uniform density of quasiparticles, but now particles from each point separate into uncorrelated blocks each of which consists of 2m particles, with m an integer. Within each block, the 2m particles are entangled. To satisfy momentum conservation, for simplicity we will take that the 2m particles within a block come in pairs with back to back momenta.
We consider the simplest entanglement relation that when tracing out any subset of particles within the block of 2m particles, one gets the same entanglement entropy ν. This is motivated by the so-called GHZ state for k qubits 12) which has this property and thus the name for this example. 8 Consider a region A which is included in half of the light cone, then we find that where s is again given by (4.9). To see (4.13), we note that : (i) n 2m is the number of GHZ blocks at each point; (ii) from pairwise momentum conservation there can be at most m particles lying in A; (iii) as far as there are particles lying in A, the entanglement entropy for any particular configuration is always ν; (iv) (1 − 2ξ A ) m is the probability that none of the 2m particles lies within A.
For a general region, we have to calculate the probability that none or all particles are inside A. These probabilities are given by (1 − ξ A∩A ) m and ξ m A∩A respectively. Hence, for any region A we have (4.14) This formula gives back (4.13) for a region included in half of the light cone, as ξ A∪A = 2ξ A and A ∩ A = ∅ for this case. The case m = 1 reproduces the EPR result (4.11) since (ξ A∪A − ξ A∩A ) = 2ξ A∩Ā . The formula (4.14) also obeys (4.2), which is most easily seen by rewriting it using 1 − ξ A∪A = ξ A∪A and A ∪ A =Ā ∩Ā as (4.15)

Random pure state measure
Another measure is the random pure state measure we discussed in Sec. III, i.e.
for any region A and s is a normalization constant. While µ RPS coincides with µ EPR when A consists of a region included in the half sphere, in general (4.11) is clearly different 8 We note that because the tripartite information I 3 ≥ 0 for GHZ states and holographic mutual information is monogamous [14], the GHZ pattern of entanglement cannot be realized using holographic geometry.
Nevertheless, it is a simple example that holographic results can be compared to.
from (4.16). Note that (4.16) only depends on the area of a region on the unit sphere (or its complement) for any A. This is not so for both µ EPR and µ GHZ . As we showed in Sec. III, given (4.4), the RPS measure provides an upper bound for other measures. This will enable to us to establish an upper bound on R Σ (t) and v E in Sec. IV.

C. Evolution of the entanglement entropy
We now use (4.1) to derive some general results for the evolution of S Σ (t). Here we will focus on universal features which do not depend on the specific form of µ[A]. In Appendix B we study many other aspects including mutual information and finite volume effects based on the specific examples of the last subsection.
We will consider a compact surface Σ with a characteristic size R. In other words, R collectively denotes the curvature radii of Σ. The equilibrium value may be defined as As t → ∞, the size of light cones become much greater than that of Σ, i.e. t R. Thus only small fractions of a light cone can be inside Σ. The collection of points whose light cones intersect with Σ will be denoted as N Σ (t) and is given by a shell centered around the region, see Fig. 5. The integration over N Σ (t) can be written schematically as where η denotes directions tangent to shell, while y denotes the integration along the width of the shell as indicated in Fig. 5. Clearly, the precise shape of the shell will depend on the shape of Σ and for an irregular shell there may not be a preferred splitting in (4.18), but as we will see, such details are not important.
Now fix an η and consider the integral (4.1) over y. As we vary over the range of y, the corresponding L Σ ( η, y; t) provides a foliation of region A enclosed by Σ, see Fig. 5. In particular, for t → ∞, L Σ ( η, y; t) corresponds to a tiny part of the light cone from x = ( η, y) and we can approximate where we used (4.4) for infinitesimal regions.
The normalized area ξ(L Σ ( η, y; t)) for L Σ ( η, y; t) can be further written as where a(y) is area of L Σ ( η, y; t). We thus find the integral of (4.1) over y gives where V Σ = dy a(y) is the volume of the region A enclosed by Σ. Note that the above integral is independent of η. Now integrating over η, to leading order in large t approximation we simply get the area of the cross section of the shell, which is in turn given by the area of a sphere of radius t. Such a factor precisely cancels the denominator of (4.21) and we conclude that with the equilibrium entropy density given by For the RPS example s is simply a normalization constant and nothing more can be said.
For the EPR and GHZ examples, the above equation can be further written in terms of the particle density n as which has a simple physical interpretation. Recall that for both examples ν is the entanglement entropy for a single particle when tracing out the others. As t → ∞, the entangled particles are separated by infinite distances and thus the entanglement entropy for any finite region is given by the particle number density times ν. For a generic interacting system the equilibrium entropy density s eq is expected to coincide with the thermal entropy density, and (4.24) is also natural from that perspective.
It is not difficult to realize from this proof that a uniform volume law for infinitesimal regions of any shape on the sphere as in eq. (4.4) is also necessary to get the volume law at late times (4.22).

Early linear growth
Now let us consider the early growth, i.e. for t R. At such times, the radius of a light cone is much smaller than the curvature radius at any point of Σ, see Fig. 6. We can then locally approximate Σ as a straight hyperplane, with translational symmetries along directions tangent to Σ. The integrations in (4.1) can then be factorized into an integral along Σ, which simply gives a factor A Σ (area of Σ), and the relative location y of centers of light cones with respect to Σ in the perpendicular direction. Then the early growth of the entropy will be determined completely by the measure µ applied to spherical caps. Let us introduce the notation explicitly as where the factor of 2 comes from the domain of integration y ∈ (−t, 0) , and ξ(x) is the normalized area of a spherical cap for a unit sphere with angular spread defined by the perpendicular distance x = y/t from the center (see Fig. 6). In (4.26) we used the time independence of the entropy measure on the light-cone. In (4.27) we chose to normalize the quantity by the equilibrium entropy density s eq .
Since (4.27) only involves a region smaller than half the sphere, µ EPR and µ RPS give the . v GHZ In Sec. IV we provide an upper bound for the speed v E for any entanglement measure.

D. Quadratic growth before local thermalization
In [3], it was found that for fast quenches in holographic theories, there is a period of quadratic growth before the linear growth, which sets in only after the local equilibration time t eq . The local equilibration time is defined as the time scale by which local thermodynamics already applies with a thermal entropy density s th , but long range correlations in which we are interested have not been established yet. For strongly interacting systems, various holographic studies [15,16] indicate that the local equilibration t eq ∼ 1/T where T is the final equilibrium temperature, and by this time the entanglement correlations from the quench will have at most spread to a length scale eq ∼ 1/T . For regions with size satisfying RT 1, we can treat t eq and eq as zero, which is essentially what we have being doing so far. In other words, our above discussion should be interpreted as applying only after t eq .
More explicitly, in our setup we have assumed that at t = 0 entanglement measure µ[A] on light cones have already been fully established. We believe this assumption is reasonable only after local equilibrium has been established, i.e. after the quench has finished, it still takes some time for a system to build up the local entanglement measure µ[A], and that time scale can be interpreted to be t eq .
We will now show that with some very simple assumptions, one can easily obtain quadratic growth of entanglement entropy with time. For definiteness of the discussion, we will use the EPR model as an example, although the discussion can be easily adapted for a generic measure µ[A]. Consider equation (4.8), except that now the prefactor s is taken to be a function of time for t < t eq . We will take the simplest possibility: a linear function, i.e.
For t > t eq the discussion is essentially the same as before, recovering the linear growth. For t < t eq , the integral (4.26) becomes In contrast, for holographic systems one finds [3] where is the energy density. This is not that different from (4.31) considering t eq ∼ 1/T and that for a CFT = d−1 d s T .

MENT
In Sec. III we showed that the RPS measure (4.16) is an upper bound for all measures, an immediate consequence of which is an upper bound for the entropy of any region at any moment of time in models with ballistic propagation by an explicit geometric function.
In this section we use this to prove an upper bound on R Σ (t).
First let us look at v E , which only involves spherical caps. Using .

(5.3)
Note that the EPR measure (4.28) saturates the bound. We note that there is an alternative proof of without using (3.2). In fact one can use the strong subadditivity condition to prove a stronger inequality i.e. µ cap (ξ) is a concave function. We give the proof of (5.5) in Appendix A. Because a concave function always lies below any of its tangents, and µ cap (0) = s eq , equation (5.4) then follows.
We now show that the normalized rate of growth of entropy R Σ (t) is bounded by v free E at all times. From (5.1) we get Therefore a bound on normalized grow rate would follow from a purely geometric inequality We now prove this last inequality (5.8). Let us write an integral representation for the normalized area: where Θ Σ ( z) is the characteristic function of Σ, which takes the value 1 for z inside Σ and 0 for z outside of it. The time derivative is given by where σ denote collectively a set of coordinates on the entangling surface Σ, n(σ) is the unit vector normal to Σ, and x(σ) the position vector on the surface. Then we have where we took the absolute value inside the integrals. By performing the integral over x, we can get rid of one of the delta functions, and we obtain By using the rotational symmetry of the integral over y in (5.12), we can make the replacement n(σ) → n, with a fixed (and arbitrary) unit vector n, thus the integral over σ can be where in the last line we recognized (4.28). This completes the proof.
Note that the bound for the growth rate can only be saturated if both inequalities (5.6) and (5.8) are saturated. The inequality (5.6) is only saturated for all light cones, if: (i) the measure is equivalent to the RPS measure; (ii) all light cones whose intersection area with the region A bounded by Σ is less than half the sphere volume are increasing their intersection with time; (iii) all light cones with intersection greater than half the sphere volume are decreasing their intersection with time (the latter two conditions are satisfied by any shape at t = 0). Even for the RPS measure this is not the case for long enough times (except for a planar entangling surface). We also note that R Σ (t) can be negative in some circumstances, as shown in Appendix B 1 d.

VI. INTERACTING MODELS
In this section we present an interacting model. 9 Although the model of [1] captures the time evolution of entanglement entropy for a single interval precisely in two dimensions, in higher dimensions, and qualitative differences also arise for entanglement entropy for multiple intervals in two dimensions [8,9]. Furthermore, in free propagation models the spread of entanglement depends sensitively on the entanglement pattern of the initial state as we have studied in detail in Sec. IV and Appendix B. In contrast, in holographic systems, as emphasized in [3], the linear growth emerges after a system has locally equilibrated with thermodynamical concepts such as temperature and entropy density already applying at scales of order 1/T . This implies that by the time the linear growth emerges, the details of the initial state should have already been largely erased by interactions, and the linear growth must be a consequence of interactions.
An immediate consequence of interactions is that the quantum state of the system can no longer be described as a tensor product of those resulting from each point at t = 0. As a result, our fundamental equation t > R. We will approximate each particle as a qubit and treat the scattering as a unitary transformation. In other words, after scattering, particles 2 and 3 maintain their original directions but the resulting state is related to the product state before the scattering by a unitary transformation. Note that an arbitrary unitary can encode not just forward, but backscattering as well. The latter is implemented by a unitary that swaps the wave functions of 2 and 3.
More explicitly, with the notation the scattering of 2 and 3 can be described by with U ij a unitary matrix. Without loss of generality the state for 1, 2, 3, 4 before the scattering can be taken to be with α parametrizing the entanglement among the pair. The entanglement entropy obtained from tracing out one of the particles in a single pair is The state after the scattering is For different situations depicted in Fig. 8, we are interested in different reduced density matrices: (a) In this case the relevant quantity is S 13 , i.e. the entropy for the reduced density matrix ρ 13 from tracing over particles 2 and 4. We will denote the entropies corresponding to (6.3) and (6.5) as S (i) 13 and S   (e) The relevant quantity is S 3 . When a pair is maximally entangled (|α| = 1), S 3 is not modified for any U , as the entanglement between 3 and others is simply redistributed.
For other values of α, the situation is a bit similar to that of S 13 discussed above in (a). is given in Appendix C.
One should be able to use the above analysis to construct a dilute gas model to obtain quantitative results, as there are only a small number of scatterings. We will leave this for the future. Here we consider a model for the infinite scattering limit.

B. The infinite scattering limit
When including multiple scatterings, one needs to consider states associated with more and more particles and clearly we lose control very quickly. Nevertheless based on intuitions obtained from single-scattering results, we will see that one could still draw some qualitative conclusions about the general interacting case.
On dimensional grounds we can write the evolution of entanglement entropy for a region where T is the equilibrium temperature, s eq is the equilibrium entropy density, and ξ collectively denote other length scales in the system. In strongly interacting systems such as holographic theories, the local scattering rate is controlled by 1/T . For weakly interacting systems, the mean free path is typically controlled by other scale(s) ξ 1/T . To simplify the analysis, we will work in the regime and assume that the scaling function f (4.6) has a finite limit in this case: We refer to this regime (6.7) as the infinite scattering limit. For example, holographic systems after local equilibration are governed by it. Note that (6.8) is of the same scaling form as (4.5) valid in the free propagation case, even though the underlying physics is very different. In free case (4.5) is a consequence of no scattering and s is determined by the initial state, while in (6.8) is essentially a consequence of infinite number of scatterings (as t is infinite compared with any scattering time) with s eq determined by dynamics. As already mentioned below (4.6), the scaling form (6.8) implies that if there is a regime that S Σ is proportional to the area of Σ, then the time dependence must be linear.
We again assume that the quench generates a finite density of identical particles, which then subsequently propagate at the speed of light isotropically, and we allow an arbitrary number of scatterings. We will assume that on average scattering events are isotropic and homogeneous in space, implying that both incoming and outgoing particles are uniformly distributed in all directions. As in the (1 + 1)-dimensional example of last section, scattering events will be treated as unitary transformations on all particles (which are assumed to be identical) that are at the same point at a given time. The labeling of particles after a scattering event is again arbitrary. Given the isotropy we can choose the labeling so that particles will not change directions after scatterings. This means that we can trace the whole spacetime trajectory of a particle from its origin even in the presence of interactions. The ability of tracing a particle trajectory will play an important role in our discussion below.
At any given time, to calculate the reduced density matrix ρ A for A (and thus the entanglement entropy), we need to simultaneously trace over all particles lying outside A, rather than restricting to a subspace like in the non-interacting or single-scattering case. Now note the following: 1. The situation in (b) of last subsection can be immediately generalized to conclude that scatterings that happened in the past domain dependence ofĀ (complement of A), i.e. in D − (Ā) are not relevant, as they amount to unitary transformations in HĀ that do not change ρ A . 10 In Fig. 9, where we depict the time evolution for one interval in (1 + 1) dimensions, these are regions shaded in red.
2. Similarly as in (c) and (d), scatterings among particles which fall inside region A are also not relevant, as such scatterings act by unitary transformation on H A and thus will not change ρ A . In other words, scatterings in region D − (A), shaded green in Fig. 9 can be neglected. Thus particles that spend their whole life in the green region do not give rise to any entanglement. Note that for t > R, the green region no longer intersects with the t = 0 spatial manifold.
3. Situations like (a) and (e) corresponds to scattering between particles one of which falls into A and the other falls intoĀ. We will refer to these as effective scatterings.
As in the single scattering case effective scatterings do affect entanglement.
The above discussion shows that we only need to consider particles originated from the region where M denotes the full spatial manifold at t = 0. 11 Furthermore, only those scatterings of these particles that take place in the white regions in Fig. 9 are relevant. This implies that disconnected regions of N (t) can be treated independently of one another.
In the regime (6.7) all the particles will have scattered essentially an infinite number of times. In such a situation we expect that any memory of the initial state will be forgotten, and we can simply assign a geometric measure for the entanglement entropy. Since now the relevant Hilbert space is that for all the particles in N , we will simply postulate a random pure state measure for the entanglement, i.e. 10) nĀ(x, t) , (6.11) 10 Recall that the past domain dependence D − (Ā) ofĀ is defined as the spacetime region where all futureextended causal curves pass through A. 11 Note that the region N Σ (t) we used in the free streaming model is in general a subset of N (t), see Sec. VI E for further discussion.

P"
A" A" x" where ν eq may be interpreted as average entropy per particle, and N A [N i (t)] is the number of particles originated in N i (t) that fall into A at time t. The number of such particles is given by the integral of the density of particles n A (x, t) that originated from x and fall into A at t. Taking the smaller value of the number of particles falling into A andĀ has the same rationale as the earlier postulate of random pure state, and ensures S A = SĀ. The sum i is over the disconnected components of N , i.e. N = ∪ i N i . As mentioned earlier, disconnected components of N i should be treated independently and thus summed separately.
At small t, the number of N i is always the same as the number of disconnected boundaries of A. As time evolves, different N i can join each other. Eventually all N i 's will merge into a single connected region N (t). At sufficiently late times, for A in an infinite space, it will always be the case that N A [N (t)] < NĀ [N (t)] asĀ is infinite. From homogeneity we can then conclude that S A = s eq V A , s eq = ν eq n, t sufficiently large, (6.12) where n is the number density.
We move on to analyze some examples. For the area of one extremal surface we get the one interval result (6.13) We have to add up these contributions and minimize over the pairings: where σ is a permutation. Note that for different times a different permutation may realize the minimum, which corresponds to the change of dominance of the extremal surfaces in holography.

One interval
Consider a single interval of length 2R in (1 + 1) dimensions, see Fig. 9. For t < R, we have two disconnected N i each of width 2t, and for any point n A = nĀ = n 2 , where n is the total particle density. We then find that S A (t) = ν eq × 2 × n 2 × 2t = 2s eq t, s eq ≡ nν eq , t < R .
For t ≥ R, there is only one connected N , and thus we conclude that We then find that The time evolution of entanglement is in agreement with (6.13).

Multiple intervals
Let us first explain how the calculation goes in an example of two intervals. For definiteness, consider intervals of lengths 2R 1 , 2R 2 separated by a distance L with L < 2R 1 < 2R 2 , see Fig. 10. As in Fig. 9, green light cones indicate D − (A) and red ones D − (Ā). At a given time, indicated by the horizontal blue line we decompose N (t) into disconnected pieces. At the time indicated in the plot we have two disconnected pieces. We then count the number of particles that end up in A and those that end up inĀ for each N i , and take the minimum of the two numbers.
From Fig. 10 and elementary geometric considerations one readily sees that the number of particles from each N i that end up in A andĀ is: and (6.10) can be rewritten in a simpler form: Equation (6.21) was proposed recently in [9] to capture the time evolution of entanglement entropy in holographic systems, motivated from the picture of entanglement tsunami [3]. In this interpretation, N i (t)'s are given by the regions covered by the tsunamis, 12 and one again 12 Entanglement tsunamis originate from the boundaries of entangled regions and propagate in both direc- One can readily check that (6.10) (and equivalently (6.21)) reproduces holographic results for two intervals (6.14) for all parameters, R 1 , R 2 , and L.  13 Even when (6.10) (and (6.21)) deviates from the holographic results, the overall trend is still quite similar, but it can give unphysical answers. We give an example in Fig. 11, where the entanglement entropy develops a discontinuous jump at some time. The mathematical reason for the jump is as follows. Let us denote the time of the jump as t c . At t c − (with → 0) we have two disjoint regions N 1 and N 2 : At t c + , N 1 and N 2 join into a single region and we should count the particles of N A and NĀ for N 1 ∪ N 2 , i.e.
There is a discontinuity from (6.22) no matter which is chosen in (6.23). This phenomenon is likely generic as one increases the number of intervals: whenever two regions where A and A dominate respectively join together, there will be a discontinuous jump. In Fig. 11 we also plot the corresponding holographic result which is continuous and is smaller than (6.21) for a period after t c . We will further discuss the physical origin of the jump in Secs. VI E and VI F.
Given its simplicity, it is quite remarkable that the maximal RPS model manages to reproduce intricate holographic results for multiple intervals for a significant part of the parameter space. As already alluded to in the Introduction and at the beginning of this section, the ballistic picture of the spread of entanglement discussed in Sec. IV-V of this paper cannot account for the holographic results for more than one interval in (1 + 1)dimensions. This was analyzed in detail in [8,9], and we do not repeat this comparison between the results of the ballistic and scattering pictures. Recently, a CFT analysis showed that rational CFTs behave according to the ballistic model, while non-rational CFTs are 13 We sampled the parameter space by fixing the leftmost and rightmost boundary points and by throwing the other boundary points between them randomly using the uniform distribution. Our sample size was 500. A" A" expected to interpolate between the free streaming and the holographic behavior, which is reproduced by (6.21) [10]. 14 Finally, note that there is a simple "phenomenological fix" to the discontinuity problem, as follows. When two RPS regions with different dominance join, say N 1 withĀ and N 2 with A as in the example of Fig. 11, immediately after the two regions start overlapping we still keep N 1 and N 2 as independent, i.e. the contribution from them is still given by S RPS (N 1 ) + S RPS (N 2 ) rather than the bigger value S RPS (N 1 + N 2 ). As time evolves we merge them into a single RPS region when S RPS (N 1 + N 2 ) = S RPS (N 1 ) + S RPS (N 2 ). If before these 14 The holographic behavior is universal in large central charge CFTs with a sparse low-lying operator spectrum.
two regions merge, other RPS regions start overlapping with either of them, we follow the same rule in deciding whether they should merge with other regions. Sampling over the parameter space, we found that in the improved model the failure rate in reproducing the holographic answer (6.14) is only 2% for three intervals and 8% for four intervals.

D. Higher dimensions
Let us first consider early times t R, for which we can approximate the boundary as a straight-line, then N (t) has is a strip of width 2t with Σ lying in the middle as indicated in Fig. 12. For this geometry the total number of particles from N falling into A orĀ are the same, so we can choose either of them. It then immediately follows from (6.10) that where in the above equation 2A Σ t is the volume of N (t) and nν eq is divided by two, as exactly half of all particles from N (t) will fall into region A because of isotropy. Thus, in this model the tsunami velocity v E is precisely given by the speed of light in all dimensions! It is interesting to contrast this computation with the earlier free propagation calculation of v E in Sec. IV C. We take the EPR (or equivalently the RPS) measure for the free streaming model. In both the free and interacting models we associate a measure to points on the t = 0 time slice. Consider the contributions from points P and Q in Fig. 12. In the earlier free propagation calculation, for light cones originating from P or Q we took the area of the smaller spherical cap of the intersection of the light cone with Σ, while for (6.10), we simply use the spherical caps inside A, which for P is the bigger spherical cap. This thus leads to an enhancement in v E . Physically, in free particle models entangled particles originating volume of the inner white annulus in in Fig. 13: Of course, it is a lot easier to use the simplification provided by (6.21). and the upper bound on the entanglement propagation is achieved when using RPS for each |ψ x . When including interactions clearly (6.26) does not apply. Nevertheless due to constraints from causality, within a finite interval t not all degrees of freedom can interact with one another. The basic idea behind (6.10) is that at time t, the full wave function can be factorized based on the casual structure of A, i.e.
where · · · denotes the factor of the wave function which is irrelevant for the entanglement of A. We then apply RPS to each |ψ N i (t) .
Instead of (6.27) one can in principle consider a finer partition of N (t) than connectivity, and for any α, there exists an i such that M α (t) ⊆ N i (t). We can obtain a general class of RPS models of entanglement propagation by applying RPS to each M α , i.e.
The free streaming RPS model (6.26) is a special case of (6.28) with M α given by a point. 16 In other words, the free streaming RPS model is the finest division of N (t), it is the minimal RPS model. In contrast, the model (6.10) is the coarsest division, and thus we will refer to it as the maximal RPS model. The fact that a subdivision decreases the entropy implies that with the class of all RPS models, the free streaming and the maximal RPS model provide respectively the lower and upper bounds, Furthermore from the discussion of Sec. III, the RPS measure provides an upper bound for the entanglement entropy among all possible entanglement measures with a given partition.
We thus conclude that the maximal RPS model (6.10) should provide an upper bound on entanglement propagation for all relativistic interacting systems after a global quench. This conjecture is consistent with our empirical observation that when the maximal RPS result deviates from the holographic one (6.14), it always lies above it. 17 That for d > 2 the maximal RPS gives v E = 1 is also consistent with the result of Sec. II.

F. Further discussion of the maximal RPS model
We expect that the RPS measure applies when all degrees of freedom in the relevant region (i.e. within each disconnected N i ) have fully "equilibrated," i.e. have interacted sufficiently with one another. Otherwise it provides an overestimate. Consider at some time t c , there are two regions N 1 and N 2 joining into a single connected region. (6.10) then dictates that at t c + (with → 0) we must apply the RPS measure to the whole N 1 ∪ N 2 . But physically in going from t c − to t c + , there is just not enough time for this "total equilibration" to happen. When N 1 and N 2 are dominated by A andĀ respectively before joining, such "lack of equilibration" will lead to a discontinuity, as in the example of Fig. 11. When N 1,2 are both dominated by A (orĀ), or for one of them A andĀ give equal contributions, there will not be a discontinuity. Nevertheless, one may have expected that even in such cases the "lack of equilibration" may also lead to deviations from holographic results during the subsequent evolution after the joining. It is then rather curious that we do not observe such deviations at least in our sampling of the parameter space. This indicates that holographic systems "equilibrate" remarkably efficiently.
The discontinuity in the example of Fig. 11 also means that the strong subadditivity The issue of "lack of equilibration" for N (t) becomes more significant in higher dimen- 17 In Sec. VII C we prove that in (1 + 1) dimensions holographic theories give the fastest possible entanglement spread. This then implies that whenever the maximal RPS result deviates from holography, it overestimates the entropy.
sions. Recall the calculation of v E illustrated in Fig. 12 Sec. VII we propose another approach -different from the family of RPS models -for the computation of entropy in the infinite scattering limit, which again gives v E = 1. In this regard, it is also interesting to note that in the Floquet systems discussed in [19], there is an exact causal light cone for any local operator and v E = 1 is actually achievable, at least for a certain class of initial states. But we should note that the systems of [19] do not appear to have a continuum limit and thus may not be describable as a relativistic system with a local Hamiltonian.
Finally, let us note that (6.10) is based on tracing over a local Hilbert space at t = 0. The tracing is not local at time t. This is a consequence of unitary transformations we performed so as to track the trajectories of particles. This is certainly not ideal. In (1 + 1)-dimension, the equivalent proposal (6.21) motivated from the tsunami picture is based on tracing out a local Hilbert space at time t, and thus is conceptually more appealing. But it appears not easy to generalize (6.21) to higher dimensions for regions of general shapes as the behavior of entanglement tsunamis become complicated at late times. We can view the wave function |ψ(t) prepared by the interacting model of Sec. VI B as a tensor network, and abandon the idea of associating entropy to the particles ending up in the region A. Instead we can regard the time evolution as a quantum circuit of depth t that prepares an entangled state from a product state through the action of the unitary scattering matrices U introduced in Sec. VI A. It will be easier for us to convert U into a four-indexed tensor V ij,kl , and to be agnostic about the dimension of the Hilbert space associated to the particles scattered, and simply denote it by χ; the indices run over i, j, · · · ∈ (1, χ). We explain our notation in Fig. 14.
The unitary time evolution prepares us a state where T i 1 i 2 ...i N (t) can be obtained from contracting V ij,kl according to the pattern described in Fig. 14. This description provides a convenient interpolation between the free streaming and the infinite scattering pictures. In the free streaming case V ij,kl = δ il δ jk 18 , while for strong scattering we expect V to be random. By tuning V , we should be able to learn how the behavior of entanglement spread interpolates between the two.
The network of Fig. 14 resembles that for global quench described in [2], but there is a fundamental difference. Here the vertical direction is physical time, i.e. Fig. 14 is a quantum circuit, while in [2] the vertical direction is an auxiliary RG time. Accordingly, here the state of the system at a given time t is described by a single slice of the network, while there the state is described by the whole network. Nevertheless, in the linear growth regime there appears some isomorphism between the network here and that of [2]. It would be interesting to understand this better. 18 Of course, in this case one has to supply a nontrivial locally entangled initial state. For tensor networks there exists a bound on the entanglement entropy of a region A: where cut is the length of the minimal cut through the tensor network that separates A fromĀ. An example of a minimal cut for two intervals in (1 + 1) dimensions is presented in Fig. 15. For large enough χ and for a generic unitary scattering matrix V the bound (7.2) is expected to be saturated. We then point out that from Fig. 15 it follows immediately that in (1 + 1) dimensions our tensor network exactly reproduces the holographic result (6.14) for arbitrary number of intervals. In the next section we discuss a higher dimensional model inspired by the minimal cut bound in tensor networks.
We emphasize that the tensor network of Fig. 14  we obtained from the picture of scattering quasiparticles, and a priori it has nothing to do with holography. The minimal cut in Fig. 15 is also a tensor network concept. There are tantalizing connections with holography though. With physical time replaced by RG time, the network here resembles that of [2] which in turns looks like a "nice slice" inside an eternal black hole. The minimal cut prescription is also reminiscent of the holographic extremal surfaces [20].

B. A geometric model inspired by tensor network
Working with a discrete tensor network in higher dimensions is inconvenient due to the breaking of rotational symmetry. Using the physical insight from the tensor network picture, we now propose a simple continuum model for the time dependence of the entropy of an arbitrary region that satisfies all the physical criteria we are aware of that such a function should satisfy. We emphasize that we do not know of any local Hamiltonian that would produce the entropy given by this model, and it would be very interesting to find further criteria that the entropy function should satisfy that would constrain or rule out the model we propose in this section.
The entropy function after a global quench should satisfy the following geometric requirements, some of which we have discussed in previous sections, others were implicit:  Using some insight from the tensor network picture, below we will produce a family of entropy functions S(A) -including one with v E = 1 -with these properties in any dimension. Of course, the free streaming models of Sec. IV also obey all these properties, For the construction let us introduce the set of all (d − 1)-dimensional surfaces Φ α on R d , x 0 > 0, that are smooth almost everywhere 19 and have a normal vector n = (n 0 , n), which satisfies in all points where the normal is well defined That is, the slope of the tangent vectors of the surface is bounded above by α; α = 1 means that the maximal slope is given by the speed of light.
For any Σ ∈ Φ α let us define V (Σ) as the volume of Σ using the metric ds 2 = d x 2 , which is degenerate. That is, V (Σ) is the volume of the spatial projection of Σ. Then, let us consider the following entropy function for an arbitrary spatial region For this formula we can consider surfaces Σ that hit the t = 0 boundary of the spacetime and end there. Equivalently, we can think that once they hit t = 0 on the boundary, they run on the t = 0 plane, but this part of the surface does not have any volume. Because for ds 2 only the spatial volume counts, a large number of surfaces give the same entropy. The minimal surfaces Σ can be thought of as the analog of a minimal cut in the tensor network; as explained in the caption of Fig. 15, the minimal cut is not unique either.
Regarding the tsunami velocity for small times, we see that the least volume will be given by a surface that runs to the past of A as fast as it can subject to the constraints on the slope to end on the t = 0 boundary. This gives v E = α .
From the arguments of Sec. II it is clear that v E > 1 violates SSA, which is condition (h) above. This then restricts the range of α to 0 < α ≤ 1, and we will show below that for any α in this range (h) is satisfied. In particular, for α = 1 we have a model that gives v E = 1 and satisfies all the conditions we listed above.
We now prove that SSA holds for 0 < α ≤ 1, but is violated in some circumstances for α > 1. Let us take a Cauchy slice containing two overlapping regions A and B, and for 19 That is, they can contain singular codimension-1 sets, where the tangent is not defined.
0 < α ≤ 1 use the degeneracy of the minimal surfaces to push Σ A and Σ B to the past so that they necessarily intersect, see Fig. 16. We can reinterpret the two surfaces as corresponding to A ∪ B and A ∩ B. They may not be minimal, but minimal surfaces corresponding to A ∪ B and A ∩ B will just decrease the entropy, so SSA follows. These steps are identical to the proof of SSA in holography for static situations [21].
A" t Here we chose α < 1, so the surfaces from Φ α are allowed to be steeper than the light cone. Right: Proof of SSA for the case 0 < α ≤ 1. We choose A to consist of a constant time and a lightlike strip (drawn by blue), while B is a union of two lightlike strips (drawn by green). (The proof works for arbitrary A and B on the same Cauchy slice.) Example minimal surfaces corresponding to A and B are drawn by dashed lines and colored in the same colors as A and B respectively. We can reinterpret the intersecting minimal surfaces as surfaces ending on A ∩ B (drawn by red) and A ∪ B (not drawn). Note that in the present case the red surface is minimal, while the one corresponding to A ∪ B may or may not be minimal depending on the position of the t = 0 boundary.
The above proof fails for α > 1, because in some cases the minimal surfaces corresponding to A and B cannot be made to intersect. SSA is violated in the setup of Fig. 17, as the entropies for A, B, and A ∪ B are all proportional to their spatial projection, while S (A ∩ B) is larger than the spatial projection of A ∩ B due to the turning point. Note that for a lightlike (or sufficiently boosted) strip the minimal surface is necessarily cuspy in this case, any smoothing would bring the surface out of the set Φ α ; we allowed for codimension-1 singularities on Σ in our assumptions.
This model for v E = 1 (α = 1) in (1 + 1) dimensions gives exactly the prescription of the tensor network discussed in the last subsection, and it coincides with the holographic result. For higher dimensions it again leads to v E = 1. Note that in this model in higher dimensions the minimal surface that gives v E = 1 corresponds to the situation where entanglement essentially only propagates along one direction (i.e. the direction perpendicular to the boundary), which is clearly a bit peculiar. This aspect is similar to the behavior of the maximal RPS model discussed in the last section. It would be nice to understand whether we are missing some more subtle isotropy constraint. Without using the isotropy of the system, one cannot hope to prove a more restrictive upper bound than v E ≤ 1, as many decoupled (1 + 1)-dimensional CFTs placed next to each other satisfy all our assumptions, and trivially yield v E = 1. Another question we leave for future research is how the present model compares with the holographic results once we set α = v holo E .

A" B"
\ A" B" A" t FIG. 17. Failure of SSA for α > 1. We use the same regions A and B as on Fig. 16. However, the minimal surface corresponding to A now cannot be pushed further to the past, while the one corresponding to B cannot be pushed further to the future. Thus, minimal surfaces corresponding to A and B cannot intersect, and the geometric proof of SSA breaks down. SSA is violated because the minimal surfaces corresponding to A∩B have larger volume than the spatial projection of A∩B.
Finally, we note that the family of models we constructed in this section gives for a region lying on a constant time slice S(A) ≤ s vol(A) for all times, and can give rise to nonzero mutual information at intermediate times.

C. Holographic theories maximize entanglement spread in (1 + 1) dimensions
The tensor network model seems to include by construction the heuristic idea of a "maximal spread of entanglement". If we think that strong interactions are effectively taken into account by random unitaries in the vertices of the tensor network, the model would try to produce locally a random pure state as fast as possible given the causality constraints.
In fact, we can prove that the holographic entropy formula S H (A) (6.14) gives the absolute maximum for the entropy S(A), for any number of intervals, and any global quench in relativistic theories in (1 + 1) dimensions.
In order to show this, let us take regions at a fixed time t and start with a single interval of size r. From SSA we have S (r) ≤ 0 (for any translationally invariant state). From the physical assumptions about the global quench we also have that S(r) ∼ sr for small r, as small intervals saturate at the volume law, and that the entropy saturates to an r independent constant S(r) ∼ 2sv E t for large r, as large intervals are in the linear regime. It is immediate that the holographic S H (r) = s min(r, 2v E t) is the maximum concave function with these two asymptotic behaviors.
Here v E = 1, but we write it explicitly for convenience.
For n intervals the proof is by induction. We start by assuming that S H (B) is the maximum possible entropy for any set B with less than n intervals.
For n intervals let us write A = {(a 1 , b 1 ), (a 2 , b 2 ), . . . , (a n , b n )}. Recall that the geodesics describing the bulk extremal surface for this region in the holographic setup all lie in some spacelike surface, and cannot cross each other [22] (otherwise another surface exists with smaller area). 20 It is not difficult to realize 21 that there must be a geodesic in the extremal surface for A that either joins the two end-points of an interval from A, say I j = (a j , b j ), or there is at least one geodesic that joins the two consecutive points b k , a k+1 for some k. This last possibility is equivalent to joining the two end points in an interval fromĀ, which itself is a union of intervals and two half lines. As these two cases are completely analogous, without loss of generality let us restrict our attention to the first case, i.e., that I j = (a j , b j ) determines a geodesic in the extremal surface. Then the rest of the geodesics 20 That S H (A) gives the absolute maximum for the entropy can be proven directly from the formula (6.14) without referring to geodesics, but perhaps it is easier to visualize the proof presented here. 21 This can also be proved by induction: a given geodesic splits the set of geodesics in two, to the ones outside and inside of it, and consequently the problem is mapped to the same problem with less intervals.
give the extremal surface for B = A − I j = A ∩Ī j . We have S H (A) = S H (I j ) + S H (B) ≥ S(I j ) + S(B) , (7.8) where in this last inequality we used the induction hypothesis. Now, by subadditivity

VIII. TSUNAMI VELOCITY AND MUTUAL INFORMATION
In this section we further examine the relation (2.9), which we copy here for convenience (see also Fig. 2) in the context of free streaming and interaction RPS models discussed in earlier sections, which sheds new insight onto the values of v E .
Let us first consider (1 + 1) dimensions. In a free propagation system with all particles traveling at the speed of light, from plot (a) of Fig. 18, it is clear that there is no entanglement between W (t) and X (see caption). This is consistent with v E = 1 in such a model. There 22 We need the condition of equal central charge to have the same entropy for a single interval.
is, however, entanglement between W (t) and X even in a free propagation system if there are also particles which can travel smaller than the speed of light. See plot (b) of Fig. 18.
This means that whenever there is propagation of entanglement inside the light cone we expect v E < 1.
In a system with only particles traveling at the speed of light, interactions will generically generate entanglement between W (t) and X, as indicated in plot (c) of Fig. 18. Thus in (1 + 1) dimensions, interactions slow down v E compared with the free propagation at the speed of light, and for a general interacting theory we should have v E < 1. This conclusion is consistent with v E = 1 from CFT [1], as well as holographic calculations [16], where the results apply in the "infinite scattering limit" R, t 1 T . In other words, in these theories, we expect with f a positive function. For holographic theories, the function f can be read from (5.38) of the second reference of [3] and is indeed positive. [17] also found that interactions that break the scaling symmetry slow down the tsunami. It can also be readily checked that in the maximal RPS model I(W (t), X) = 0.
The story in higher dimensions is rather different. Even in a free propagation system with all particles traveling at the speed of light, there is entanglement between W (t) and X, which can be immediately seen as follows. Particles from a point x with nonzero velocity in directions perpendicular to x 1 will have a velocity smaller than the speed of light in the x 1 direction. In other words, when projected to the x 1 direction, these particles propagate "inside" the light cone. Thus from plot (b) of Fig. 18 they will generate nonzero I(W (t), X).
This is perfectly consistent with our earlier discussion that v E < 1 for higher dimensional free propagation models. In fact by computing directly I(W (t), X) and then using (2.9) gives an independent derivation of v E , which by consistency should agree with our earlier expression (4.27). Below we will check this is indeed the case.
With already nontrivial entanglement between W (t) and X in free propagations, interactions can increase v E if they can reduce entanglement between W (t) and X. It appears hard to imagine that the entanglement can be reduced completely to zero. But at the moment we do not have any definite argument to exclude the possibility that in an isotropic system there exists some limit, in which v E = 1 can be approached like in (8.2).
Finally, as promised earlier we show that v E derived from (2.9) agrees with (4.27) for a (a)" at the speed of light, scatterings can also generate entanglement between W (t) and X. Suppose initially 1 and 2 are entangled. Scattering between 2 and 3 will generate entanglement between 1 and 3, thus leading to entanglement between W (t) and X. The region N (t) which can contribute to S(X) is also much larger than that of (a) and is the same as that of (b). general free propagating model. Consider a light cone from a point of distance y from the entangling surface (see e.g. Fig. 19), as in Fig. 6. Positive y is outside Σ, negative is inside Σ. A light cone centered at y intersects the regions W (t) and W (t) ∪ X in spherical caps with opening angles θ = arccos y t , (8.3) θ + δθ = arccos y − δt t + δt , (8.4) respectively. The corresponding spherical caps have normalized area ξ and ξ+δξ respectively. In complete analogy with the entropy, the mutual information can be calculated for every light cone independently, as in (4.1). It is given by a formula analogous to (4.28) where we used that the small annulus region that we get from the difference of the spherical caps has to obey (4.4). On Fig. 19 this region is the two small arcs between the blue and green planes. We series expand (8.5) to get where we introduced x = y/t. We calculate the first term using the expressions: δθ(x) = δt t 1 − cos θ(x) sin θ(x) . (8.8) Using these expressions we obtain after the change of variables dx = d cos θ: For the second term in (8.6) we use the chain rule to write: (8.13) where in the second line we did a partial integration and used µ cap (0) = µ cap (1) = 0, while in the third line we used (4.27). Combining the two terms then gives which reproduces the identity (2.9).
light cone. We then find which immediately leads to Second, we consider spherical caps in d > 3, where the geometry is more complicated.
The simple proof of the concavity of µ cap in d = 3 is similar to the entropic proof [23] of the C-theorem in d = 2, whereas the proof in d > 3 below uses the more complex method of [24] that was originally developed to prove the F-theorem in d = 3.
We are going to use strong subadditivity to prove the concavity of the function µ cap (ξ) as a function of the normalized area ξ. Strong subadditivity (4.

3) for two spherical caps
A and B involves the intersection and union of these regions, which are not spherical caps anymore. In order to have an inequality containing only spherical caps we can follow the procedure used in [24]. We use a large number N of spherical caps X i , i = 1, ..., N , which are copies of one spherical cap rotated around a point in the sphere, see figure 21. Strong The regions on the right hand side of this inequality are not spherical caps, but as shown in figure 21 they approach to spherical caps in the limit of large N 23 . In this limit (A3) 23 We know the wiggly caps of figure  will be converted into an inequality involving an integral of spherical caps with sizes varying between a maximum ξ max and a minimum ξ min (see figure 21). Following [24] we can then take ξ max − ξ min = and expand the inequality for small to get an inequality for µ cap (ξ) and its derivatives for a single ξ.
Though we can follow these same steps here, we can obtain directly the final result without doing the explicit calculations by arguing as follows. As a result of this procedure we should get a differential inequality for µ cap (ξ). Because strong subadditivity is linear in the entropy and involves four regions, it will always lead to linear differential inequalities containing at most second derivatives of the entropies. Hence we should get In this appendix we study in more detail various aspects of entanglement propagation using the three examples discussed in Sec. IV B. We will mostly use spherical regions for illustration, although in some cases we make general remarks valid for all measures and general shapes. More specifically, we will investigate the following issues: • full time evolution and the entanglement rate (1.5) for simple shapes such as the sphere, • for the EPR example, one can further define an entanglement density which can be used to better visualize the spread of entanglement, • saturation time for generic shapes, • finite volume effects, • mutual information.
Along the way we also compare the results obtained here for ballistic propagation with those of holographic systems. We take the width of the interval to be 2R. From Fig. 22 we conclude that the entanglement entropy has the time dependence: where s is given by (4.9). We divided s by 2 as n/2 is the density of quasi-particle pairs.
Both the slope of the linear growth and the saturation time agrees with those found from direct field theory calculation [1] and those obtained from holography [16].  (ξ A , ξĀ)). We can conveniently treat all measures at once by working with: where m = 1 gives the result for the EPR and RPS measures, while m > 1 corresponds to GHZ blocks. The absolute value comes from combining the relation ξĀ = 1 − ξ A with (4.13) valid when ξ A < 1/2.
To apply (4.1) we need to work out first the region N S d−2 (t), which is explained in Fig. 23.
We then find that with min(ξ(r), 1−ξ(r)) = 1 2 where we used the formula for the area of a spherical cap, and I z (a, b) is the regularized incomplete beta function. We did not find a way to perform this integral for general d, m, so we present some examples.
For the EPR case m = 1, we find (with τ ≡ t/R) Numerical plots for various m in d = 3, 4 for the GHZ measure are given in Fig. 24. Note that while for EPR and random state measure, the saturation time is always t s = R, for GHZ the saturation time is infinite. It can also be readily checked that for early times and at late time the entropy S S d−2 (t → ∞) = s R d−1 V B d−1 , consistent with our general discussion in Sec. IV C 1.
c. Strip in d = 3 We consider a strip of width a and length L in d = 3 as another simple example. Our discussion will be less detailed, than for the sphere case. For early times, t < a/2 the time evolution is exactly linear, while for t > a/2 the width of the strip starts to play a role.
The time dependence becomes more complicated as for some of the light cones L strip ( x; t) becomes a disconnected region, and we have to use (4.14). The result for early time valid for arbitrary measure is with τ = t/a. For the EPR model, we get for later times For the case of the strip the RPS model is not equivalent to the EPR model. The results for the RPS, the EPR, and GHZ block models are plotted in Fig. 25. It is remarkable that the measures originating from a quasiparticle picture give an infinite saturation time, while the RPS model saturates in τ = 1. We now briefly examine the entanglement entropy of two separated disks and strips. We We included the example of two strips, as it also displays negative entanglement growth rate, and this geometry will also play a role in our discussion of mutual information in Appendix B 5. In Fig. 26   for the RPS (blue) and the EPR (orange) models.

Entanglement density
For the case of EPR pairs we can introduce a local, and thus more refined, measure of entanglement: entanglement density. The entanglement density ρ Σ ( x, t) at a given point x ∈ A inside Σ is defined as ν times the density of quasiparticles whose entangled partners lie outside Σ. Recall that ν was introduced below (4.9). It then immediately follows that ρ Σ ( x, t) can be readily worked out as follows. In the EPR example, two point x, y are are only entangled for one moment, when their distance is exactly 2t. One can then introduce an entanglement "correlation function" The normalization of the above function is determined by requiring which is the total amount of entanglement x has with the full space. The entanglement density ρ Σ ( x, t) can then be obtained by integrating the above expression over all y that lies inĀ, i.e. the region outside Σ, The above expression can be easily described in words: draw a sphere of radius 2t around x, and calculate the portion of the surface that falls outside Σ. The simple intuitive picture this is that we are counting the quasiparticles at x that have their partners outside, which all lie on the sphere of radius 2t drawn around the point.
For Σ a sphere, the integral in (B14) can be readily performed. Actually, we have already performed this calculation in Appendix B 1 b, so all we have to do is to replace t → 2t in (B4). For d = 3 we get where the last two cases can only happen before and after t = R/2. The entanglement entropy saturates when ρ(r) reaches s everywhere inside the sphere. It is curious that at the center of the sphere r = 0, the density jumps from ρ(r = 0) = 0 to saturation value s at t = R/2. We plot (B15) for various times in Fig. 27.
The entanglement density can be used to give a precise definition of the entanglement tsunami introduced in [3]: we define the tsunami wave front as the boundary between regions of ρ = 0 and ρ = 0. From (B14), one can immediately conclude that under such a definition, locally the wave front should progress at the speed of 2, i.e. twice the speed of light. For Σ being a sphere, the wave front can be visualized from the lower half of the plots in Fig. 27, with the wave front reaching r = 0 at t = R/2. Note that when the wave comes in, the region covered by the wave is only partially entangled with outside, i.e. with a value smaller than the equilibrium value s. The curves in the upper half in Fig. 27 suggest there is a "reflected wave", whose wave front can be defined as the boundary between the region which has reached the equilibrium value s and the region which has not. This reflected wave starts at t = R/2 from the center and moves at speed 2 outwards, reaching Σ at t = R. In this example, the linear growth (4.26) and the associated v E can be considered as an average effect. The picture here is very different from that proposed in [3] for strongly coupled systems, where the region covered by the tsunami wave will already have reached their equilibrium value. The difference may be due to that in free theory as we are considering here, there is actually no dynamical process of equilibration. We should also keep in mind, as we will also elaborate below, that generically there is no unambiguous definition for an entanglement density. The entanglement density discussed above is specific to the EPR example. When there is multipartite entanglement, such a definition does not appear to exist even assuming a quasiparticle picture. The basic reason is that with multipartite entanglement, one could no longer localize entanglement to a point. This can be readily seen from the GHZ example illustrated in Fig. 28. We discuss how interactions change the perspective on the tsunami wave front in Sec. VI C.

Saturation time
In Sec. IV C we showed that the entanglement entropy has an equilibrium value S eq = sV Σ .
In Appendix B 1 b we saw that for a sphere of radius R, the EPR model has a finite saturation time given by t s = R, while for GHZ block example the saturation time is infinite. In this subsection we make some general remarks on the saturation time for general shapes.
Let us first consider the EPR example. From (B14) we can immediately conclude that for any Σ the saturation time t s equals half of the largest distance between two points on Σ.
If t > t s the entanglement density at any point inside Σ is ρ(x) = s, because the Dirac delta in (B14) has support on a circle of radius 2t around x, which now completely lies outside Σ; see Fig. 29. Phrased slightly differently, at time t, all quasiparticles which entangle with those at x lie on the circle of radius 2t centered at x, and the total number of them is n, the particle density at x (which is in fact the same everywhere due to homogeneity). When this circle lies completely outside Σ, all of them contribute to S Σ and the entropy does not change with time. The saturation time is thus t s = R for spherical Σ and t s = ∞ for any non-compact region. In particular, t s is infinite for a strip.
The RPS measure gives a rather different saturation behavior from the EPR model as evidenced by the finite saturation time even for a non-compact shape; for the example of the strip see Fig. 25. The criterion for saturation in the RPS model is that after the saturation time t s there should not exist any light cone, whose intersection with region A has a normalized area grater than 1/2.
The results of the EPR and RPS measures should be contrasted with holographic systems, for which t s is finite for a strip and given by where 2R is the width of the strip and v E is the "tsunami velocity" which appears in (1.4).
For holographic systems, the saturation time for a sphere is given by where we have quoted the value of c E for a neutral system. We thus see that for a strip, holographic systems saturate faster than the EPR model (t s = ∞) and the RPS model (t s = 2R, as can be calculated from the above criterion or read from Fig. 25), while for a sphere holographic systems saturate slower than the free streaming models that have t s = R.
Note that the velocity c E has also appeared in [25] as the "expansion" velocity of the the time evolution of a local operator in a thermal state.
For a general measure from our discussion in Sec. IV C 1 we expect that in the t → ∞ limit, there are generically 1/t corrections to the leading behavior (4.22). This implies that the saturation time is generically infinite. As an example, let us consider Σ a sphere, for which case the shell in Fig. 5 is also spherical. One can then compute the subleading corrections using (B3) by expanding µ cap (ξ) to higher orders in ξ, µ cap (ξ) = s eq ξ + a 2 ξ 2 + a 3 ξ 3 + . . . , where we used (4.23). We will not go into details here, except to mention that whenever the nonlinear term in (B18) are non-vanishing one gets subleading corrections in 1/t of the form, S(t) = s eq V Σ + n # a n t (n−1)(d−2) .
Thus all measures for which there are nonlinear terms in (B18) have infinite saturation time.

Finite volume effects
So far our discussion assumed the system has an infinite volume. If the system has a finite volume, the large time behavior of the entanglement entropy will be modified when carriers of entanglement can explore the whole volume.

a. EPR pairs and GHZ blocks
Let us first consider the EPR and GHZ examples which can be treated in a unified manner.
For a system with a finite volume, after a long time there will be no correlation between the positions of the particles that originated from the same point. (This assumptions should hold true except for resonant situations in special geometries.) Then we have a constant density n of quasiparticles, and a total of nV system of them, randomly distributed.
We get entanglement except in cases, when quasiparticles originating from one point are all inside or outside Σ, hence (B20) This expression is manifestly symmetric under V Σ → V system − V Σ , hence the requirement that the entropy of a region and its complement is equal in a pure state is satisfied. Another consistency check is that in the infinite V system limit we get S Σ = sV Σ . The maximum of (B20) is achieved for V Σ /V system = 1/2, In the EPR (m = 1) case the resulting expression is The maximum entanglement is S Σ max = s 4 V system .
b. Random pure state measure According to the discussion in Sec. III random pure states are expected to give: This result is consistent with (4.16); if we followed the time evolution of light cones in a compact geometry, a light cone would become a curve densely filling the whole volume V system . The maximum entropy is reached again at V Σ /V system = 1/2, and its value is S Σ max = s 2 V system , twice the value of the EPR pair model. Equation (B23) is of the form expected from a holographic system.

Mutual Information
We now consider the qualitative behavior of the time dependence of mutual information If the whole system is compact, the mutual information will tend asymptotically to where we have used (B22).
An interesting feature of the EPR example is that the corresponding mutual information is extensive, i.e.
where the tripartite information is defined by For GHZ, I(A, B) becomes non-zero at the time L min 2 , as in the EPR case. Here, however, for any time we can always find a light cone that intersects both A and B, and we can put some quasiparticles on these intersections without violating momentum conservation. This leads to a non-vanishing I(A, B) for all times. As t → ∞, as the normalized volume of intersection of a light cone with A and B will necessarily go to zero, therefore I(A, B) will asymptote to zero. This phenomenon is reminiscent of the differences in saturation between the GHZ and EPR cases for the entanglement of a single region.
Let us now examine the property of I 3 for GHZ. Motivated by the above discussion of the EPR case, we note that in order to get a possibly nonzero I 3 in the GHZ case, we need some number of particles in all three regions. From the property that tracing out any number of particles leads to the same amount of entanglement, we conclude that Note that GHZ states are very special, choosing a different multipartite entanglement pattern for the particles would generically lead to a non-definite sign of I 3 .
c. Random pure state measure Consider a light cone which intersects with A and B, and denote the area of its intersections as ω A,B respectively. From (4.16), assuming ω A > ω B , the contribution from this light cone to I(A, B) is: From this expression one can check explicitly that the mutual information is non-extensive and monogamous, i.e. which intersects both A and B, and more than half of its area is inside AB. This implies that the mutual information stays zero for all times, if the separation of the regions is too large compared to their sizes. Also for large times we always get zero mutual information for compact regions. These results are reminiscent of holographic examples: holographic mutual information is monogamous [14], and the time dependence of mutual information shows similar qualitative behavior in holography, see e.g. [18].
In finite volume we can determine the saturation value of I(A, B) from (B23). For which has the same form as (B31).
d. Example of two strips in d = 3 As an example consider the mutual information for two parallel strips A and B of width a and length L, separated by a distance l in d = 3. In this case the RPS model will have zero mutual information unless a l ≤ √ 2−1 2 0.207. If this holds the mutual information will be non zero starting at t = l/ √ 2, in contrast to the GHZ and EPR models which have non zero mutual information starting at t = l/2 for all values of a/l. In these latter models the mutual information decays asymptotically to zero with time, while in the RPS model is already zero for a finite time. This is illustrated in figure 30 where we plot the numerical evaluation of mutual information I(A, B)/(s eq La) as a function of time, τ = t/a for these models. Note that the higher m GHZ models produce slower decay of mutual information, just as they give slower saturation for the entanglement entropy for one strip shown on