Void formation in operator growth, entanglement, and unitarity

The structure of the Heisenberg evolution of operators plays a key role in explaining diverse processes in quantum many-body systems. In this paper, we discuss a new universal feature of operator evolution: an operator can develop a void during its evolution, where its nontrivial parts become separated by a region of identity operators. Such processes are present in both integrable and chaotic systems, and are required by unitarity. We show that void formation has important implications for unitarity of entanglement growth and generation of mutual information and multipartite entanglement. We study explicitly the probability distributions of void formation in a number of unitary circuit models, and conjecture that in a quantum chaotic system the distribution is given by the one we find in random unitary circuits, which we refer to as the random void distribution. We also show that random unitary circuits lead to the same pattern of entanglement growth for multiple intervals as in (1+1)-dimensional holographic CFTs after a global quench, which can be used to argue that the random void distribution leads to maximal entanglement growth.


I. INTRODUCTION
The Heisenberg evolution of operators in a quantum many-body system is in general extremely complicated. But during the last decade, remarkable universalities have been found, such as ballistic growth of operators [1] and growth of operator entanglement [2][3][4][5]. Such universal properties have played an important role in diverse problems like scrambling of quantum information, quantum many-body chaos, and entanglement growth during thermalization (see e.g. [6,7]).
In this paper, we discuss another universal feature of operator evolution: an operator can develop a void during its evolution. More explicitly, considering a spatial region A within the "lightcone" of an initial operator O, we can decompose its time evolution O(t) as where 1 A denotes the identity operator in A,ÕĀ is some operator inĀ (the complement of A) 1 , and O 2 (t) is an operator whose projection onto A is orthogonal to 1 A . Here we assume that the system has a finite-dimensional Hilbert space and tensor product structure associated with spatial regions. See Fig. 1 for an illustration. Given the space of all operators is a Hilbert space, we can also associate a weight or "probability" for O(t) to develop a void in region A P (A) . (1.2) Below we will refer to the presence of O 1 (t) in O(t) as void formation. Void formation is present in both integrable and chaotic systems, and is required by unitarity. We will show that it has important implications for unitarity of entanglement growth. For example, evolving from an initial product state, it is the contribution from O 1 (t) that ensures S A (t) = SĀ(t), where S A is the entanglement entropy for region A. Further, we will show that void formation is responsible for generating mutual information and multipartite entanglement among disjoint regions during the evolution of a system, as we illustrate using a cartoon picture in Fig. 2. We will also derive a number of general constraints on probability distributions of void formation from unitarity, which are applicable to both chaotic and integrable systems. When a single operator has some probability of evolving to an operator with disconnected parts in A_1, A_2, and A_3, forming a void within its lightcone, it contributes to multipartite entanglement between the three disjoint regions. shaded regions indicate where the operator has nontrivial support. A process like (a) contributes to multipartite entanglement between the three disjoint regions A 1 , A 2 , and A 3 .
To develop intuition for probability distributions of void formation, we study three types of unitary circuit models in one spatial dimension: (i) the random unitary model of [8][9][10], which can be considered a minimal model for quantum chaotic systems; (ii) a "free propagating" model [11] in which entanglement can only be spread, but not created, which may thus be considered a proxy for free theories; (iii) a circuit built from perfect tensors [12], which may be considered a model for interacting integrable systems. In the free-propagating and perfect tensor models, patterns of void formation depend sensitively on the initial operator.
In particular, in the perfect tensor circuit model, void formation exhibits a fractal structure.
In the random unitary circuit, we find at sufficiently late times in all (1 + 1)-dimensional systems [11], we are led to conclude that together with sharp light-cone growth, the random void distribution maximizes entanglement growth.
The plan of this paper is as follows. In Sec. II, after describing in detail our general set-up, we discuss implications of void formation for unitarity of entanglement growth and generation of mutual information and multi-partite entanglement, as well as constraints on void formation from requiring unitarity. In Sec. III, we discuss the random circuit model, derive the random void distribution (1.3), and discuss its implications. In Sec. IV, we discuss void formation in the free propagation and perfect tensor models. We conclude in Sec. V with future directions. We have included a number of Appendices for technical details.

II. VOID FORMATION AND IMPLICATIONS
In this section, we first describe our general setup, and then derive some simple constraints from unitarity on void formation during Heisenberg evolution.

A. Setup
For convenience, we will consider a one-dimensional lattice system with a finite-dimensional Hilbert space at each site. The discussion generalizes immediately to higher dimensions. We comment on generalizations to systems with an infinite local Hilbert space in the discussion section, Sec. V.
The Hilbert space at a site i will be denoted as H i , and is taken to have dimension q.
The full Hilbert space is H = ⊗H i , and has dimension q L , where L → ∞ is the system size.
Operators at a single site form a Hilbert space of dimension q 2 , which will be denoted as G i .
Operators of the full system form a Hilbert space G = ⊗ i G i of dimension q 2L . We will use O i a , a = 0, 1, · · · q 2 − 1, to denote an orthonormal basis of G i which is normalized as where 1 i is the identity operator of H i , and there is no summation over i in the above equation. Orthogonality with O i 0 implies O i a , a = 1, · · · q 2 − 1, are all traceless. A convenient choice of basis which we will use throughout the paper is where X and Z are respectively the shift and clock matrices 2 . An orthonormal basis for G, 2 Explicitly, Z = q−1 k=0 e 2πik/q |k k| and X = q−1 k=0 |k + 1 k|, where addition is defined mod q.
which will be denoted as O α , α = 0, 1, · · · q 2L − 1, can be obtained from tensor products of {O i a }. These basis operators satisfy O 0 is the identity operator for the full Hilbert space H, and all other O α 's are traceless.
Under time evolution, where U (t) denotes the evolution operator. From unitarity of U (t), We can interpret |c β α (t)| 2 as the probability of operator O α evolving to O β . Systems with a local Hamiltonian have an effective light-cone speed v c for how fast an operator can grow with time [13], so that |c β α (t)| 2 ≈ 0 for O β not contained within the light-cone of O α . For the purpose of not obscuring the conceptual picture with technicalities, throughout this paper, we will consider a particularly simple form of operator evolution in which the end points of an operator of interest move at the light-cone speed v c in opposite directions 3 .
We will refer to this assumption as sharp light-cone growth. See Fig. 3. From now on we will set v c = 1. Causality imposed by the light-cone structure will play a key role in the subsequent discussion. Throughout the paper, if not mentioned explicitly, we will always consider q large and ignore subleading 1/q corrections. The qualitative picture does not change in more general situations, but the story becomes technically more complex, and will be treated elsewhere.
The statement about light-cone growth does not say anything about the internal structure of an operator under time evolution. A basic question addressed in this paper is: can 3 Examples include the random unitary circuits in the large q limit [8,9], Cllifford circuit models to be discussed in Sec. IV, and (1 + 1)-dimensional CFTs in the large central charge limit [14]. This was also used as a toy model in [7,15].
A main goal of this paper is to explore the role of void formation in the entanglement structure of a system. A good observable for this purpose is the evolution of the second Renyi entropy, which can be expressed in terms of operator growth [7][8][9]15]. Suppose at t = 0, the system is described by a homogeneous pure product state, runs over all sites of the system and ρ i is a pure state which is the same for all sites. It is convenient to choose a basis so that with Z i the clock matrix at site i, and thus where I denotes the set of operators which can be built from powers of Z i 's. Note that the space I is q L -dimensional, in contrast to the q 2L -dimensional full space of operators.
Under time-evolution, the reduced density matrix for some region A is given by 4 where we have used (2.4), and the fact that due to tracelessness of all nontrivial basis operators, only operators of the form O β ⊗ 1Ā with O β an operator in region A (denoted by β ∈ A) contribute to TrĀO α (t). |A| denotes the size of region A. The second Renyi entropy for A can then be written as (2.10) We will now make a further simplification by ignoring the off-diagonal terms (i.e. terms with α 1 = α 2 ) in (2.10). For chaotic systems, one expects the phases of c β α to be random, so that the off-diagonal terms are suppressed by order O(q −|A| ) compared with diagonal terms [15]. For integrable systems one cannot make this argument. Nevertheless, there are often situations where the off-diagonal terms vanish identically. This is the case for all the explicit examples we discuss in Sec. III and Sec. IV. We then find  Throughout the paper, we will denote the union of two regions A 1 ∪ A 2 simply as A 1 A 2 .

B. Upper bound on average probability for void formation
To give some intuition for the motivation behind (1.3), we first discuss the average probability for an operator to become trivial in a given region.
Consider a Hilbert space G of dimension d and a subspace G ⊂ G of dimension d , with P the projector onto G . For a vector |ψ ∈ G, the probability that it transitions into a state in G under a unitary transformation is given by p = Tr P U |ψ ψ| U † . (2.14) The average probabilityp, obtained by averaging |ψ over all unit vectors in G with the unitarily invariant Haar measure, is then We take G to be the Hilbert space of all operators with dimension d = q 2L , and G to be the set of operators which are the identity in region A, which is a subspace of dimension d = q 2(L−|A|) . We thus conclude that the average probability for an operator to be trivial in (2.16) Apart from the assumption that U is unitary, (2.16) is independent of U and does not depend on the region A other than through its size. Note that (2.16) gives the probability for an operator to be trivial in A, which is larger than the probability for developing a void in A, as it also includes final operators which are trivial in some disconnected parts ofĀ.
Note that if we do not average |ψ but instead take U to be a random unitary matrix with the Haar distribution, we find the same answer. This is the motivation for the name "random void distribution" for (1.3).

C. Unitarity of entanglement growth for one interval
We now present a simple argument which shows that void formation plays a crucial role in ensuring that the entanglement growth of a system after a global quench is compatible with unitarity. The argument can be used to derive a constraint on void formation by requiring unitarity.
To find the second Renyi entropy (2.11) for some finite interval A, we need to find the is then given by the number of basis operators in the set I ∩ D(A), which leads to , s eq ≡ log q .  This discussion provides a simple explanation for the linear growth of entanglement entropy and its saturation, and shows that such behavior has its physical origin in ballistic operator growth, regardless of whether the system is chaotic or integrable. 5 But at this stage, there is an apparent violation of unitarity. Applying the above discussion to S    with O J(A) a nontrivial operator in J(A). In the discussion above, such an operator was assumed to have no contribution to NĀ(t), as naively its time evolution will be nontrivial in This result is intuitively appealing: when an operator develops a void in an interval A, it leads to mutual information between regions separated by A.
We stress that (2.20), and accordingly (2.22), are constrained by unitarity and should apply to any system, integrable or chaotic, which has sharp light-cone growth for the initial set of operators. We will see that (2.20) is indeed satisfied in various exactly solvable unitary circuit models in Sec. III and Sec. IV.

D. Mutual information and multi-partite entanglement
Here we explore further implications of void formation, as well as general constraints on this process, by examining the Renyi entropy for two and more disjoint intervals. Extending (2.22), we first show that the mutual information between two disjoint finite intervals is determined by the void formation function G(B,B; t) for some appropriate regions B and B. In a certain region of the parameter space, the corresponding mutual information is universal, determined by (2.20) from unitarity. We also derive new constraints on void formation from unitarity of entanglement growth for two intervals. For more disjoint intervals, we see that void formation in operator growth leads to multi-partite entanglement, and void formation functions provide new measures for characterizing multi-partite entanglement.
Consider S 2 for a region A = A 1 A 2 separated by an interval R. Without loss of generality, we can take |A 1 | ≤ |A 2 |. For t < |R|/2, from causality, N A (t) of (2.11) factorizes into a product of the functions for A 1 and A 2 where N A 1 (t) denotes the contribution from initial operators in the region D(A 1 ) and is given by (2.17). For t > |R| 2 , initial operators in the region FIG. 5. Different situations for two intervals. Different horizontal lines correspond to t = 0 slices Here we take |R| < |A 1 | and thus l 1 = |A 1 | and can now potentially form a void in region R and thus contribute to N A (t) which gives For further discussion, it is convenient to introduce the following notation: Note that when t < |R|/2, we have the factorized form (2.23) in any theory with sharp light-cone growth. For |A 1 |/2 > t > |R|/2 (which can happen for |A 1 | > |R|), we have K(R) = J(R), and from (2.20) , K(R) becomes an empty set, and thus G(R, K(R); t) = 1. So for t < l 1 /2 and for t > |A|+|R| 2 , G(R, K(R); t) and I 2 (A 1 , A 2 ) have a universal form in all systems with sharp light-cone growth 6 . For |A|+|R| 2 > t > l 1 /2, K(R) ⊂ J(R), the behavior of G(R, K(R); t) 6 In the discussion of [17] of Renyi entropies for two-dimensional conformal field theories (CFTs), these are indeed the regimes which are universal for all CFTs.
becomes system-dependent, and so does I 2 (A 1 , A 2 ). Now consider the entropy for the region Similar to the discussion immediately above, for t ≤ l 1 /2, the constraint (2.20) is enough to 2 , but for t > l 1 /2 new constraints arise. We find (see also Fig. 5) If we take A 2 to be the entire semi-infinite region to the right of R, then both regions K(A 1 ) and K(R) appearing in (2.29) and (2.30) depend only on A 1 and R, and l 2 /2 → ∞, so (2.29) holds for all t > l 1 /2. We can thus deduce from (2.29) a general constraint on the void formation functions for any two adjacent finite intervals A and B in an infinite system, where L 1 , L 2 are the semi-infinite regions to the left of A and to the right of B respectively.
The different regions appearing in (2.32) are shown in Fig. 6.
The above discussion can be generalized to express the second Renyi entropy for any number of intervals in terms of appropriate void formation functions. On the one hand, by requiring S for any A. We will see an explicit example of this in Sec. III C. Now consider a region consisting of n disjoint intervals A = A 1 · · · A n separated by intervals R 1 , · · · R n−1 , as in Fig. 7. Then the void formation function G(A, Q; t) gives a contribution to the entropy of the regionĀ = R 0 R 1 · · · R n , but does not contribute to the entropy of any region consisting of a proper subset of {R 0 , · · · , R n }. Thus void formation in A leads to multi-partite entanglement among all disconnected regions in A, which can be captured by the quantity G(A, Q; t).

III. RANDOM VOID DISTRIBUTION AND ENTANGLEMENT GROWTH
In this section, we consider the probability distribution of void formation for a generic operator in the random unitary circuit model in the large q limit. We show that it is given by the random void distribution (1.3). We then show that by assuming the random void distribution for all initial operators, we can correctly obtain the full expression for S in the random circuit model for A consisting of an arbitrary number of disjoint intervals.
Surprisingly, the resulting expression is found to coincide with the von Neumann entropy for holographic systems. In the next section, we will contrast the random void distribution with the void formation properties of two Clifford circuit models (which may be seen as non-chaotic systems).

A. Random unitary circuits
We first describe briefly the setup of the random unitary circuit discussed in [8][9][10], and its main properties. Consider a time-evolution of the system as in Fig. 8, where the evolution operator U (t) can be written as Here we take discrete time steps, with U n corresponding to the evolution operator at the n is a q 2 × q 2 unitary matrix that acts on two neighboring sites i and i + 1 at n-th step, and each such matrix is averaged over the Haar measure independently. We will consider the limit of large q.
The random unitary circuit can be seen as a discretized model for chaotic quantum systems. It is manifestly unitary and local, but replaces the local interactions between neighboring spins in a realistic Hamiltonian system by random ones. It provides a powerful playground for studying chaotic systems, as many observables such as entanglement entropy, OTOCs, and operator spreading coefficients are analytically calculable, and the resulting behavior has been found to be consistent with numerical results of realistic chaotic spinchain systems [8,9,[18][19][20].
Here are some features of the random circuit model in the large q limit which are relevant for our discussion [8-10]: 1. For a basis operator O α (introduced above (2.3)) with right and left endpoints given by x r and x l respectively, one finds which implies that the end points of any operator O move under time evolution in opposite directions with light cone speed 1 as in Fig. 3. Here and below, an overline denotes an average over the random unitaries. 3. In the large q limit, where N A (t) was defined in (2.11). Here the off-diagonal terms in (2.10) automatically vanish due to random averages. Furthermore, one has where S (A) denotes the von Neumann entropy. So our discussion below about S 2 can also be understood as being relevant for the von Neumann entropy.

B. Random void distribution
We now look at the probability distribution for a generic operator O to develop a void in some designated region A in the random circuit model at large q.
We can expand O in terms of basis operators as Then under time evolution, the probability for O to have a void in A is where P Oα was introduced in (2.6), and in the second equality the off-diagonal terms drop out due to the random average. In (3.7) by "β with void in A" we mean O β should be trivial in A and have support in each disconnected part ofĀ. For instance, in the case of Fig. 7, O β should be the identity in A while being nontrivial in each of R i 's. From now on, for notational simplicity we will suppress the explicit overline for averages.  Oα (t) in the large q limit. When the leading contribution is given by the former process, which can physically be seen as one of genuine "void formation," we still have (3.8). When one of the latter types of processes is Oα (t) is enhanced compared to (3.8). Nevertheless, for a generic operator (3.6), the part of the operator corresponding to O α with an initial void is suppressed due to a much smaller phase space. One can show that the enhancement from disconnected processes never overcomes the suppression. See Appendix A 2. Given that (3.8) is independent of O α other than through the causality constraint A ∈ J + (O α ), a generic operator O also satisfies (3.8).
Note that if we assumed that for a given initial operator, the probability of having any of the q 2 basis operators at any site between the endpoints of the operator at time t is the same, then the probability q −2|A| would immediately follow: having a void in A corresponds to fixing the operators at |A| sites in the final operator to be one of q 2 options, while allowing the remaining operators to take any value. A similar ergodicity assumption was used in the operator growth model introduced in [7]. Equation (3.8) also applies to the random circuit model at a finite q in the regime that region A and t are large (so that q |A| , q t are large), as we will discuss elsewhere.
From (3.8) one can obtain the void formation function (2.13) for any regions A and B.
In the simplest case with A consisting of a single interval lying within the future light cone of a region B, we have For the Renyi entropy of two intervals (2.24)-(2.25), we need G(R, K(R); t) with K(R) = J(R) ∩ D(A 1 RA 2 ), whose behavior depends on the relative sizes of |A 1 |, |A 2 | and R. For example, for |R| < |A 1 | we find that which upon using (2.24) leads to The mutual information between A 1 and A 2 is thus nonzero only for the case |R| < |A 1 |.
One can also obtain S using the partition function method of Appendix A 3, and one finds agreement with (3.11)-(3.12).
An alert reader may notice that the expressions (3.11)-(3.12) coincide with the expressions for the evolution of entanglement entropy after a global quench in holographic systems [21,22]. We will see below that this is not an accident; the results agree for any number of intervals.
To conclude this subsection, we note that if there exists some (t) will deviate from (3.11)-(3.12), and thus from the holographic expression.

C. Random void distribution and maximal entanglement growth
Consider a region A = A 1 · · · A n consisting of n intervals A 1 = [l 1 , r 1 ], · · · , A n = [l n , r n ], separated by intervals R 1 , ..., R n−1 , as in Fig. 7. The entanglement entropy S Each connected pair {l i , r j } in a configuration γ contributes |l i − r j |, while each unconnected point contributes t, so that we get (3.13).
Extending our discussion in section III B for two intervals, one can show that the same expression (3.13) follows by using only the following elements: (i) sharp light-cone growth; (ii) the random void distribution (3.8) for all initial operators; (iii) large q. The derivation is given in Appendix B.
The expression (3.13) can also be shown to be equivalent to the expression of entanglement entropy (in a scaling regime) for holographic systems after a quench. Holographic systems are a certain class of strongly coupled (1 + 1)-dimensional conformal field theories (CFT) which have a gravity dual. In the large central charge limit, their entanglement entropy can be calculated using classical gravity. More explicitly, in the regime with t, |A 1 |, · · · , |A n |, |R 1 |, · · · , |R n−1 | large while |A i |/t, |R i |/t are fixed, the holographic entanglement entropy after a global quench has the following form (see e.g. [11]) and σ are permutations of {1, ..., n}, and s eq is the equilibrium entropy density. The contribution we get from a permutation σ at time t on the right-hand side of (3.14) is equal to the contribution we get on the RHS of (3. We can check that a minimal configuration in (3.13) only involves connections between adjacent endpoints. So we get the same result if we restrict the definition of connectable points below (3.13) to adjacent endpoints l i and r j such that t > |l i − r j |/2.
As the number of intervals increases, equation (3.14) (or equivalently (3.13)) gives rise to intricate patterns of time-dependence when the relative sizes of the intervals |A i | and their separations |R i | are varied. It is remarkable that these patterns can be reproduced by the extremely simple underlying rules of sharp light-cone growth and the random void distribution. We note, however, that while for the random unitary circuits, S 2 coincides with the von Neumann entropy in the large q limit, this is no indication that this is true for holographic systems in the large c limit. 7 So the entanglement spectrum of holographic systems cannot be fully approximated by random unitary circuits, and while it is natural to expect that the random void distribution should play some role in holographic systems, it cannot be the full story.
In [11], it was shown using the strong subadditivity condition that the holographic expression (3.14) in fact maximizes the entanglement growth for an arbitrary number of intervals among all (1 + 1)-dimensional systems with a strict light cone. Thus we find that the random void distribution (3.8) together with sharp light-cone growth maximizes entanglement growth (recall that S 2 is upper-bounded by the von Neumann entropy). In the two-interval case, this statement implies that any system with S 2 (t) not equal to the holographic result can only be spread, but not created, which may thus be considered a proxy for free theories; (ii) a circuit built from perfect tensors, which may be considered a model for interacting integrable systems, as while it can generate entanglement in certain initial product states, like all Clifford circuits it does not lead to growth of operator entanglement, and also does not have the form of the out-of-time-ordered correlator (OTOC) expected in chaotic systems [8,9]. To study entanglement growth, now instead of generic homogeneous pure product states, we have to consider a more restricted set, as these Clifford circuit models do not generate entanglement in an arbitrary initial pure product state. Furthermore, initial basis operators in the entire system can in general both grow and decrease in size under the action of the circuit U . The set of initial states we look at are again of the form where the set I consists of q L basis operators, which are in general no longer just tensor products of powers of Z i . In each of the models, we will choose ρ 0 such that the end points of all basis operators in the associated I move outwards with v = 1 under time-evolution.
As before, the entanglement growth can be obtained from (2.9)-(2.13). Note that for Clifford circuits the off-diagonal terms in (2.10) vanish identically due to the one-to-one mapping between initial and final basis operators. The sharp light-cone growth of operators in I again implies that S Before discussing the models explicitly, here we summarize some common features, which are also shared by random unitary circuits in the large q limit: 1. While their void formation structure is very different from the random void distribution, we will see the corresponding void formation functions nevertheless satisfy the  It is tempting to speculate that given sharp light-cone growth, equation (4.2) is true in all unitary systems.

A. Free propagation model
In this model, the two-site unitary matrices in the circuit are given by [11] U a 1 a 2 ,b which is a discrete version of the quasiparticle models for entanglement growth proposed in [26].Ũ takes a product state to a product state at all times, but can spread the entanglement to large distances if we consider an initial state which has short-range entanglement between adjacent pairs of sites.
The evolution of basis operators 8 in this model has a simple form: the basis operators at different sites evolve independently from each other, and all operators that are initially at an even (odd) site move to the right (left) at speed 1, so that at an odd time t, for an initial The density matrix for (4.4) is of the form (4.1), with I the set of operators which can have any basis operator O i at an even site i, but at site i + 1 have some fixedÕ i+1 determined The time evolution of basis operators in I can be readily obtained from (4.5), see Fig. 11 for an illustration.
In Fig. 12, we see that all initial operators from sites in X (4.6) 8 We use the following conventions to avoid complications due to lattice effects which are not relevant in the continuum limit. All spatial regions we consider have their left endpoint at an even site and their right endpoint at an odd site. We always consider times at which an odd number of layers of unitaries have been applied.   Fig. 12). It can also be checked the constraints (2.29)-(2.31) obtained from unitarity are satisfied in this model.

B. Perfect tensor model
In this model, which was previously considered in [12], the Hilbert space at each site has dimension 3 (q below should be understood as being 3), with a basis {|0 , |1 , |2 }, andŨ † acts on the Hilbert space as: where |ψ 3 ≡ 1 √ 3 2 k=0 |k , for which I is the set of basis operators with powers of X on even sites and powers of Z on odd sites. From (4.8) one finds that acting on operators in the basis (2.2),Ũ sends any basis operator on two sites to another basis operator, and has the following properties: 1. It takes any operator with a power of X on site i and a power of Z on site i + 1 to an operator which is non-trivial on both i and i + 1. From items (1) and (2), we see that all operators in I grow outwards with speed 1, as illustrated in Fig. 13. Thus we have the same form of S (A) 2 (t) for a single interval A as we found in the random circuit and free propagation models with the initial states we considered there. Note that (4.9) is a product state, so unlike the free propagation model, the perfect tensor model can generate entanglement in an initial pure product state.  Fig. 15. The void formation in this model has a fractal structure, similar to the fractal Clifford circuits discussed in [24,25]. 10 The number of non- U at Q acts on an operator which is non-trivial on site i and trivial on site i + 1, giving an operator which is non-trivial on both i and i + 1. By repeatedly using the fact that operators nontrivial on a single site evolve to operators non-trivial on two sites, we see that the right endpoint moves to i + (t − 1) at time t.

V. CONCLUSIONS AND DISCUSSION
In this paper, we examined the implications of void formation in operator evolution for entanglement growth, and showed that it plays an important role in maintaining unitarity of entanglement growth and generation of mutual information and multi-partite entanglement.
We showed that the void formation probability for generic operators in random unitarity circuits is given by the random void distribution (3.8). We also showed that the intricate time-dependence of holographic entanglement entropies for an arbitrary number of intervals after a global quench can be understood as a consequence of the very simple rules of sharp light-cone growth and the random void distribution.
Furthermore, we found sharp differences between the void formation properties of random unitary circuits and the non-chaotic circuit models we studied, which suggests that void formation can be used to characterize differences in the operator evolution of chaotic and integrable systems which are not captured by the movement of the operator endpoints alone [28]. It is also an interesting question whether void distribution can be used to distinguish between different classes of chaotic systems. For example, it is conceivable that the random void distribution may only apply to highly chaotic systems.
In our discussion, for simplicity of presentation we have restricted to systems with sharp light-cone growth in the evolution of operators. In general systems, the fronts of operator growth should follow a distribution. For example, in random unitary circuits at finite q, the evolution of an operator exhibits a diffusive front around a butterfly velocity v B which is smaller than the lightcone velocity v c [8,9]. Our discussion, including the random void distribution, can be generalized to these situations, although the story is technically more complicated and will be presented elsewhere.
In our discussion we have focused on the second Renyi entropy, which can be conveniently expressed in terms of the expected number of operators that develop a void in a certain region. It would be interesting to explore the implications of void formation in operator growth for higher Renyi and von Neumann entropies 11 , as well as whether the unitarity of these quantities imposes further constraints on void formation. It is possible these quantities will involve other aspects of void formation, and not just the squared absolute values of the operator-spreading coefficients.
While we have restricted to one spatial dimension, our discussion can be immediately generalized to higher dimensions. In one dimension, a process of void formation separates both the original operator and the full space into disconnected parts. Thus it simultaneously creates "holes" in an operator and breaks it into disjoint pieces. This is not true in higher dimensions, where "hole formation" in an operator and breaking an operator apart are distinct void formation processes, as discussed in Fig. 2. In particular, it is the latter type of process which contributes to mutual information and multi-partite entanglement among disjoint regions. It is also a natural question whether the "hole formation" and "breaking apart" could follow different probability distributions in higher dimensions.
In this paper, we defined a void as a region of identity operators among regions of non- 11 although in some cases, like random unitary circuits in the large q limit, all the entropies are the same.
trivial support of an operator. This definition is only appropriate for a lattice system with a finite one-site Hilbert space at infinite temperature. For finite temperature or continuum systems, a mathematically rigorous definition is tricky. Operationally, one can define a void as the part of an operator which is given by the equilibrium density operator.
It would be interesting to explore the implications of void formation for other observables to see how it affects their behavior in integrable and chaotic systems, and also to see if the non-unitarity in the absence of void formation manifests itself in other observables. For example, one can show that in some simple models without void formation, the out-of-timeordered correlation functions (OTOCs) will also violate unitarity, although the violation appears less dramatic than that in the entanglement entropy. We will leave the exploration of this question for elsewhere.
It is important to see whether one can find other measures to characterize void formation besides than the probability functions we discussed in this paper. For example, how does one characterize the fractal structure of Fig. 15? A related question is whether such fractal structure is generic among interacting integrable systems.

Mapping to a classical Ising partition function
Consider the operator spreading coefficient |c β α (t)| 2 introduced in (2.4), which can be written as a matrix element on four copies of the Hilbert space: In the last line we have introduced, for any operator O in the system, "up" and "down" spin states on four copies of H, In the case where O = 1, we use the notation |↑ = |1 ↑ , |↓ = |1 ↓ .
The time evolution operator U for the entire system is a tensor product of random unitaries from the Haar ensemble applied at each time on pairs of sites, as shown in (3.1)-(3.2) and

Derivation of the random void distribution
Now let us consider the probability P Oα (t) = β with void A |c β α (t)| 2 for a basis operator O α to develop a void in region A = A 1 · · · A n in random unitary circuits in the large q limit.
We will take A to be in the future light cone of O α , as otherwise P Oα (t), it is convenient to consider a slightly different quantity which also includes possible contributions from processes in which O α evolves to operators trivial in some disconnected parts ofĀ. Such contributions are not in P Oα (t). Summing (A1) over all O β which have the identity in A we find where |↓Ā = ⊗ i∈Ā |↓ i and we have used the fact that where the sum over a runs over the complete set of basis operators at site i. Now (A4) can be calculated with a partition function with boundary conditions as shown in Fig. 21, with the interactions on the top boundary given by Fig. 20. The domain walls give a factor of q −(|A 1 |+...+|An|) . Combing these factors with the prefactor q |Ā| = q L−(|A 1 |+...+|An|) in (A4), we find the total contribution from this configuration is q −2(|A 1 |+...+|An|) . There are other possible domain wall configurations contributing to (A4), but all these other configurations correspond to the final operators in (A3) being trivial in some disconnected parts ofĀ. An example is given in Fig. 24. Thus only the configuration 12 At finite q, it is not sufficient to simply know the length of a domain wall, as a combinatorial factor needs to be included in each configuration to count different possible paths of a given length. But in the large q limit, we can ignore this q-independent combinatorial factor.  If O α has an initial void, then the domain wall configuration shown in Fig. 23 still exists, and evaluates to the same value. However, in this case, we also have the possibility that an initial void may evolve into a final void while the disconnected parts of the initial operator evolve independently. Such processes correspond to a new configuration in the partition function for P (A) Oα (t), shown in Fig. 25. Clearly when either |A| or |G| is larger than 2t, this is the only process which can contribute to P Oα (t). When |G| < 2t and |A| < 2t, in general both possibilities exist and compete. The domain wall configuration of Fig. 23 gives the probability that the initial void closes and opens up a new void, and we have the same value as (A6). The configuration of Fig. 25 gives a contribution q −|A|+|G|−2t = q −2|A| q |A|+|G|−2t which dominates when |G| > 2t − |A|. Note that for a generic operator (3.6), for O α with initial void G, we have |a α | 2 ∼ q −2|G| and thus the overall contribution from such operators is q −|A|−|G|−2t which is subdominant compared with q −2|A| for |A| < 2t.
To conclude this discussion, let us note that one can find the probability for an operator of size l i to evolve into an operator of size l f (assuming it is allowed by causality, and both operators do not have voids) in the large q limit. This probability is independent of the initial and final operators, and is given by (A7) From here one finds that the probability going to all final operators with length l f = l i + 2t allowed by causality is q −l i −l f −2t q 2l f = q l f −l i −2t = 1, which leads to the sharp light-cone growth noted in (3.3).
In obtaining (A9) we again used the fact that on summing over all operators in A, we obtain |↓ A by using (A5). Also recall that ρ 0 = ⊗ i ρ i with each ρ i given by (2.7). Note that the boundary conditions at top boundary are reversed compared with (A4). In the large q limit, one has e −S (A) 2 (t) = e −S (A) 2 (t) and all S n are the same [10], thus the (minus) logarithm of the partition function corresponding to (A9) gives S (A) 2 (t) and other entanglement entropies. The structure of the lattice is the same as in Fig. 21, but along the top boundary, we have ↑ spins inĀ and ↓ spins in A. On the lower boundary, we have ρ i↑ at each site. As in last subsection, the evaluation of (A9) boils down to summing over domain wall configurations.
From the fact that each ρ i is a projector onto a single state in the one-site Hilbert space, we get a factor of 1 q from each site on the lower boundary, irrespective of whether the lowermost bulk layer has an ↑ or ↓ spin at that site. 13 Thus irrespective of where the domain walls end, we get a factor of 1 q L from the lower boundary, cancelling with the prefactor q L in (A9). Effectively the bottom boundary does not play any role. Now consider a region A = A 1 · · · A n consisting of n intervals A 1 = [l 1 , r 1 ], ..., A n = [l n , r n ], separated by intervals R 1 , ..., R n−1 . Due to the boundary conditions, in each nonzero configuration in the partition function, we have 2n starting points of domain walls on the top boundary, from each of the l i and r i . As discussed in last subsection, in the large q limit, we only need to specify whether each domain wall starting on the top boundary reaches the bottom boundary, in which case we get a factor of q −t , or joins with another domain wall to enclose an interval of length l, in which case we get a factor of q −l . The latter possibility only exists for t > l/2.
and n γ is the number of unpaired points in γ. In the large q limit, the configuration with minimal f (γ) dominates. 14 We thus find S (A) 2 (t) = s eq min γ f γ , s eq = log q .
(A11) 14 In the large q limit we can also ignore any q-independent prefactor, as it contributes an O(1) term to S 2. Consider one of the factorized parts, in which 2t is greater than all R i in that part, for example, Q 1 in Fig. 27. There are possible multiple competing contributions to N (A 1 A 2 , D(Q 1 ); t), which are exhibited in Fig. 28. One contribution comes from operators which are nontrivial at all sites of D(Q 1 ), see Fig. 28(b), which from (3.8) gives where M D(Q 1 ) (t) = q |D(Q 1 )| = q |A 1 |+|A 2 |+|R 1 |−2t is the number of initial basis operators in D(Q 1 ). There is also a disconnected contribution from Fig. 28(a), where nontrivial operators in D(A 1 ) and D(A 2 ) separated by an initial void evolve independently to region A 1 and A 2 respectively. Note that one may also consider initial operators with a void like in Fig. 28(c), where the non-trivial parts of the operator are not contained within D(A 1 ) and D(A 2 ). Assuming sharp light-cone growth, such an operator cannot give a disconnected contribution in which O 1 , O 2 evolve independently to A 1 and A 2 .
However, such an operator can give a connected contribution, which corresponds to situations where the initial void closes and then opens an new void. This contribution is suppressed compared to (B3), as the phase space for initial operators with a void is suppressed, while the probability for an individual operator to develop a void is the same. FIG. 27. R 2 separates A into two parts with M 1 = A 1 A 2 and M 2 = A 3 A 4 , which can be independently considered. The initial operator must have identity in region D(R 2 ) in order to contribute to (B1), and thus the contributions from two sides of region D(R 2 ) factorize.
We will prove (3.13) by induction. For two intervals, n = 2, we already showed this explicitly in Sec. III B (the discussion leading to (3.11)-(3.12)). Here we give another derivation which connects more directly to the form (3.13). From item 1 above, for t < |R|/2, For t > |R|/2, from item 2 we have two competing contributions: the connected one, which is given by (B3), and the disconnected one, given by (B4). Comparing (B3) and (B4) in the large q limit leads to the two-interval result from (3.13), with the connected contribution (B3) corresponding to having the pair (r 1 , l 2 ) in the endpoint configuration γ.
Assuming we have (3.13) up to n = k intervals, let us now consider n = k + 1.
Let us denote the collection of R i 's which are greater than 2t by R big . Such The same thing happens to (3.13): if there exists a |R i | > 2t, then the two parts of A separated by interval R i can be independently minimized as there are no allowed pairing between end points on the left of R i and those on the right of R i . Thus equation (B5) also applies to (3.13), and each term in the sum of (B5) is given by (3.13) from our assumption regarding n ≤ k.
We increase t until t = r/2 when the set R big becomes empty, where r is size of the largest R i , which we call R max . Slightly before reaching t = r/2, from the random void distribution (3.8) initial operators cannot develop a void in region R max , which in the language of (3.13) corresponds to the fact that the two end points of R max cannot be paired with each other. Now when t > r/2, from the random void distribution (3.8), there are new contributions coming from initial operators developing a void in region R max , which compete with previously existing ones. Note the new contributions include the connected one for the full region D(Q), but also disconnected ones. See Fig. 29 for two examples. These new contributions are in one-to-one correspondence with new γ-configurations in (3.13) which come from pairing the two end points of R max . One can also readily check that their respective contributions agree. Thus concludes the proof.
In the above proof, we assumed the random void distribution for all initial operators, an assumption that is not precisely true in random unitary circuits in the large q limit due to the subtlety noted at the end of Appendix A 2. Yet we found by the partition function calculation of Appendix A 3 that equation (3.13) is true in random unitary circuits in the large q limit, indicating that the cases where P Oα (t) for initial operators with voids is not given by the random void distribution can be ignored in the calculation of S 2 . For instance, this means that in Fig. 28(c), the cases in random unitary circuits where the disconnected contribution (due to independent evolution of O 1 and O 2 respectively into A 1 and A 2 ) is dominant can be ignored. Indeed, on using (A7) and counting the number of relevant initial and final operators, we can check that the collective disconnected contribution from all initial operators like in Fig. 28(c) is the same as that from Fig. 28(a), changing S 2 at most by a q-independent O(1) term which can be neglected in the large q limit.