Entanglement renormalization, quantum error correction, and bulk causality

Entanglement renormalization can be viewed as an encoding circuit for a family of approximate quantum error correcting codes. The logical information becomes progressively better protected against erasure errors at larger length scales. In particular, an approximate variant of a holographic quantum error correcting code emerges at low energy for critical systems. This implies that two operators that are widely separated in scale behave as if they were spatially separated operators, in the sense that they obey a Lieb-Robinson type locality bound under a time evolution generated by a local Hamiltonian.


Introduction
In most physical theories, the notion of locality is imposed, as opposed to being derived from more elementary principles. The AdS/CFT correspondence indicates that this picture may need to be amended, at least for studying the quantum theory of gravity [1][2][3]. An interpretation of the duality in the language of quantum error correcting codes [4], and the proposal that spacetime may be built out of entanglement [5], suggest a fruitful avenue along which we can study these questions in the language of quantum information theory.
There has been a recent surge of activity devoted to constructing holographic quantum error correcting codes [6][7][8][9]. These are families of codes which can be formally expressed as an encoding map from the bulk theory to the boundary theory or vice versa. While the details behind these codes vary, they share a number of interesting properties. The operators in the bulk can be mapped to operators on the boundary that obey certain quantum error correction properties outlined in Ref. [4]. They can also reproduce, to some extent, the celebrated Ryu-Takayanagi formula [10].
However, several important issues remain unresolved. Most importantly, these codes are constructed from scratch, as opposed to being derived from a set of well-motivated assumptions. If we believe in the unitary equivalence of CFT and the quantum theory of gravity in AdS space, we should be able to explain how such codes emerge from the properties of the CFT. Second, the question of dynamics remains open. Modulo one exception [7], these codes are formally maps from the bulk to the boundary that are injective but not surjective. Therefore, acting with a Hamiltonian on a code state will generically produce a state that lies outside the code subspace. Furthermore, the boundary Hamiltonian, even if it is local, becomes generically non-local once it is mapped to an operator in the bulk. Resolving what it means to have causal bulk dynamics in the presence of these complications is clearly a nontrivial problem.
The purpose of this paper is to make progress on these important issues. First, we show that an approximate version of a holographic quantum error correcting code emerges at low energy at criticality, at scales large compared to the AdS radius, if the ground state can be well approximated by a certain multi-scale entanglement renormalization ansatz (MERA) [11] for which correlations decay polynomially with distance. Empirical evidence suggests that this is likely to be true for quantum spin systems at criticality [12]. If that is indeed true, our work implies that certain variants of holographic quantum error correcting codes naturally emerge in these systems. We also derive fundamental bounds on the error correcting capabilities of these codes. As for the dynamics, we derive a Lieb-Robinson type locality bound [13] between two observables that are widely separated in scale. It is important to note that these observables generally do not even commute with each other. Despite this fact, they behave as if they were spatially separated operators undergoing dynamics generated by a local Hamiltonian. In some sense, the causal dynamics in the bulk emerges from the universal structure of entanglement at low energy.
Our work supports the proposals to interpret MERA as a discrete analogue of the AdS/CFT correspondence [14][15][16][17], at least at scales large compared to the AdS radius. In order to accommodate locality at sub-AdS scale, one would need to incorporate more fine-grained structures. In its present form, our conclusion is so general that it is even applicable to free-fermion systems, which are unlikely to admit a semiclassical gravitational dual [18].
It has been known that tensor networks such as MERA lead to constructions of various quantum error correcting codes [19]. What is interesting is that, as suggested by Pastawski et al. [20], these codes naturally appear at low energies of critical systems. These codes differ greatly from the so-called topological codes [21] in that (i) erasures of bounded regions can be corrected up to a polynomially small error, rather than an exponentially small error, and (ii) one in fact has a family of codes that are related to the geometric data of the hyperbolic space. Our work provides a concrete framework and technical tools with which the structure of these codes can be studied.
The results presented here rely on a very general property of entanglement renormalization and on recent insights from the theory of approximate quantum error correction (AQEC) [22]. It is particularly illuminating to use entanglement renormalization in the "Heisenberg picture," wherein the renormalization group (RG) flow acts on the space of observables. This is an observation already made in the literature [23,24], which we generalize substantially in this paper. The only properties that we use are that this RG flow (i) preserves locality and (ii) is norm-nonincreasing. Both of these properties are manifestly true for various proposed forms of entanglement renormalization, but they are not the only possibilities. While we restrict ourselves to one-dimensional systems for concreteness, it should be clear that these two properties are sufficient to guarantee a similar conclusion in more general settings, e.g., higher dimensions and different spacetime geometries. The insight that we bring from AQEC is a duality relation between decoupling and recoverability; the degree to which quantum information can be recovered from a given region is exactly equal to the degree to which certain regions are completely decoupled from each other [22,25,26].
In Section 2, we review basic facts about entanglement renormalization and derive several identities that form the basis of our analysis. In Section 3, we sketch the relation between entanglement renormalization and error correction in the context of holography. We then derive fundamental properties of error correcting codes that emerge from entanglement renormalization in Section 4. In Section 5, we use these properties to constrain the support of the logical operator and derive a Lieb-Robinson type locality bound between two bulk observables.

Entanglement Renormalization
Entanglement renormalization was introduced by Vidal [11,27] to numerically study the critical behavior of one-dimensional quantum many-body systems. Generalizations to higher dimensions are known [28]. We review the basic notions underlying these constructions and recall several facts that are pertinent to this paper. MERA is a many-body quantum state that is created by applying a quantum circuit to a simple product state, say $|0\rangle^{\otimes N}$, where $N$ is the number of qubits.
There are two important properties that underlie this circuit, and these will form the basis of our argument. First, the circuit is hierarchical. It can be decomposed into a sequence of isometries, which are labeled by a parameter $s$ that ranges from 0 to $O(\log N)$. These isometries will play an important role; we denote them $W_s$. It should be noted that the isometry $W_s$ maps vectors of the Hilbert space at "scale" $s$ to the Hilbert space at scale $s-1$. These Hilbert spaces are denoted $H_s$. In particular, $H_0$ is the physical Hilbert space. Second, at every level $s$, the isometry $W_s$ preserves locality. Applying the dual of these isometries to a local operator results in another local operator; that is, the support of $W_s O_s W_s^\dagger$ can only be larger than the support of $O_s$ by a constant amount, where $O_s$ acts on $H_s$.
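The Heisenberg-picture use of an isometry can be checked numerically. The following is a minimal sketch, not the construction of Ref. [11]: it draws a single random isometry (an assumption made purely for illustration; the helper names `random_isometry` and `coarse_grain` are hypothetical) and verifies that the map $O \mapsto W^\dagger O W$ sends the identity to the identity and does not increase the operator norm. It does not model the locality structure of a MERA layer.

```python
import numpy as np

def random_isometry(dim_out, dim_in, rng):
    """Return a dim_out x dim_in matrix W with W^dagger W = identity (dim_out >= dim_in)."""
    a = rng.normal(size=(dim_out, dim_in)) + 1j * rng.normal(size=(dim_out, dim_in))
    q, _ = np.linalg.qr(a)  # reduced QR: the columns of q are orthonormal
    return q

def coarse_grain(op, w):
    """Heisenberg-picture map O -> W^dagger O W."""
    return w.conj().T @ op @ w

rng = np.random.default_rng(0)
w = random_isometry(8, 4, rng)  # toy isometry embedding a 4-dim space into an 8-dim space
op = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))

# Unitality: the identity is mapped to the identity.
unital_error = np.linalg.norm(coarse_grain(np.eye(8), w) - np.eye(4))
# Norm-nonincreasing: the spectral norm cannot grow.
norm_before = np.linalg.norm(op, 2)
norm_after = np.linalg.norm(coarse_grain(op, w), 2)
print(unital_error, norm_before, norm_after)
```

Both properties hold for any isometry, which is why the check does not depend on the particular random draw.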
In the original construction of Ref. [11], $W_s$ is the composition of a global product of isometries and disentanglers at scale $s$: $W_s := \bigotimes_{x_s} V_{x_s} \bigotimes_{y_s} U_{y_s}$, where $x_s$ and $y_s$ index the position along the chain at level $s$. In Figures 1 and 2, for convenience, we illustrate binary MERA constructions with uniform isometries $V_{x_s} = V$ and unitaries $U_{y_s} = U$; however, our results hold for the more general construction above.
While MERA is usually a single state, we will instead consider a family of subspaces, $C_s$. These subspaces are defined in terms of the isometries from $H_s$ to $H_0$: $C_s = \{ W_1 \cdots W_s |\phi_s\rangle : |\phi_s\rangle \in H_s \}$. In Fig. 1, we have illustrated $C_5$ for $W_s := \bigotimes_{x_s} V_{x_s} \bigotimes_{y_s} U_{y_s}$. It should be clear that for any finite MERA construction there exists an $s_{\max} = O(\log N)$ such that $C_s$ is trivial for all $s \ge s_{\max}$. For further analysis, it will be convenient to work in terms of a certain family of purified states. Consider a state $\rho$ acting on $H_s$. We would like to consider a family of states that are (i) first purified and (ii) then mapped into the Hilbert space $H_{s'}$ ($s' < s$) by applying an isometry $W_{s'} W_{s'+1} \cdots W_s$. In concrete terms, such a state is expressed as follows: where $|\Omega_s\rangle$ is a maximally entangled state between $H_s$ and a copy of $H_s$ which we call $H_{R_s}$, and $U_{R_s}$ is a unitary operator acting on $H_{R_s}$. In particular, $|\rho_s\rangle$ is a purification of $\rho_s \in C_s$: $\mathrm{tr}_{R_s}[|\rho_s\rangle\langle\rho_s|] = \rho_s$.

Renormalization in the Heisenberg Picture
The MERA formalism is especially well suited to studying expectation values of local observables. More generally, we will need to consider objects of the form: where $O_s$ is some operator that is supported on $H_s \otimes H_{R_s}$ and $|\rho_s\rangle$, $|\sigma_s\rangle$ are purifications of $\rho_s, \sigma_s \in C_s$, and $O_s$ will often have some additional locality structure on $H_s$. The reason for considering such objects will become evident once we explain their relation to quantum error correction in Section 3. For the moment though, it will be important to develop the machinery for their analysis.
For that purpose, it will be convenient to recast this object in an alternative form, which can be thought of as the "Heisenberg picture" of entanglement renormalization. Let us first note the following identity: . This map is completely-positive, trace-preserving (CPTP) and unital. Such maps are often referred to as (unital) quantum channels. In particular, it is norm-nonincreasing and maps the identity operator to the identity operator. We shall refer to the process of applying $\Phi_s^{s+1}$ to $O_s$ as the process of coarse-graining (renormalizing) the operator from scale $s$ to $s+1$. More generally, we will consider the map: which corresponds to the process of renormalizing an operator from scale $s$ to $s'$, where $s' > s$. It is clear that $\Phi_s^{s'}$ maps operators on $H_s$ to operators on $H_{s'}$. Under the renormalization procedure, the evolution of the operator can be broken down into two stages. In the first stage, the support of the operator shrinks monotonically, at a constant rate: if $A_s$ is a simply connected region at level $s$, then an operator $O_{A_s}$ supported on $A_s$ gets mapped to an operator $O_{A_{s+1}} \equiv \Phi_s^{s+1}(O_{A_s})$, where $|A_{s+1}| \le |A_s|/c$ for some constant $c > 1$. In the (binary) MERA network illustrated in Fig. 1, the constant $c$ is 2. The set of the supports over different scales, $\{A_s, A_{s+1}, \cdots, A_{s'-1}, A_{s'}\}$, is said to be the past causal cone of $A$ from $s$ to $s'$. When the range is obvious from the context, we shall simply say past causal cone, without specifying the range.
In other words, as an operator is renormalized from one scale to another, its support ($\ell$) shrinks exponentially with the scale separation. After $O(\log \ell)$ renormalization steps, the support size becomes $O(1)$, and the second stage begins. What distinguishes the second stage from the first is the fact that the support of the operator remains constant under further coarse graining. The minimal nontrivial regions which can support such operators are referred to as the elementary blocks of the MERA network (see Fig. 1). The aforementioned behavior of renormalized operators is, qualitatively speaking, independent of the details of the MERA network. That is, the conclusion remains intact even if the shape of the network differs at different scales or even if there is a spatial anisotropy. However, accommodating those generalizations would introduce unnecessary complications. This is why we shall consider MERA networks that are scale-invariant, which we define below.
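The two-stage behavior of the support can be illustrated with a toy recursion. The rule used here, "halve the width and add one boundary site," is an assumption chosen only to mimic a binary network with a constant-size fixed point; the precise widths in an actual MERA depend on the position of the region and on the network layout, and the function names are hypothetical.

```python
import math

def coarse_grained_width(width):
    """Toy rule (assumed for illustration): halve the support, plus one boundary site."""
    return math.ceil(width / 2) + 1

def causal_cone_widths(width, steps):
    """Widths of the support after 0, 1, ..., steps coarse-graining steps."""
    widths = [width]
    for _ in range(steps):
        widths.append(coarse_grained_width(widths[-1]))
    return widths

widths = causal_cone_widths(1024, 12)
# First scale at which the support fits into a block of at most 3 sites.
steps_to_block = next(i for i, w in enumerate(widths) if w <= 3)
print(widths)
print(steps_to_block, math.log2(1024))
```

In this toy model the width first shrinks by roughly a factor of two per step and then stays at a constant size, so the number of steps in the first stage grows logarithmically with the initial width.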

Definition 1.
A MERA network is scale invariant if there exist an isometry $V$ and a unitary $U$ such that $V_{x_s} = V$ for all $x_s, s$ and $U_{y_s} = U$ for all $y_s, s$.
$\Phi_s^{s'}$ is a quantum channel that maps operators on $H_s$ to operators on $H_{s'}$, which will typically be different spaces. However, if the MERA network is scale invariant, when $\Phi_s^{s+1}$ acts on an observable in an elementary block of $s$, it gets mapped to an observable in an elementary block in $s+1$. The unique channel $\Phi$ mapping operators between elementary blocks of $s$ and $s+1$ can be represented as one with identical input and output space (see Fig. 1a). This is extremely convenient as it allows us to map a "trail" of elementary blocks up the MERA network as the iteration of quantum channels (Fig. 1b). If the network is scale invariant, as is expected for critical systems, the dynamics between elementary blocks is governed by stationarity and mixing properties of the channel $\Phi$, which will also be referred to as the transfer operator.
Generic quantum channels (see the appendix for a discussion) that have the same input and output algebra can be written as where $\lambda_k$ are the eigenvalues of $\Phi$, and $L_k$, $R_k$ are the bi-orthonormal left and right eigenvectors: $\mathrm{tr}[L_k^\dagger R_l] = \delta_{kl}$. The spectrum of the channel is bounded by one ($|\lambda_k| \le 1$), and for generic quantum channels, there is only one eigenvalue of magnitude 1, corresponding to the unique stationary state of the channel (in the Schrödinger picture). Arranging the eigenvalues in order of decreasing real part, we get $\lambda_0 = 1$, with $L_0 = 1$ and $R_0 = \rho_{ss}$ a density matrix, which we will refer to as the stationary state for the elementary block. The eigenvalue $\lambda_1$ will play an important role in the remainder of the paper. For scale invariant MERA, $\nu := -\log_2(\mathrm{Re}\,\lambda_1)$ will be referred to as the scaling dimension.
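As an illustration of how a scaling dimension is read off from the spectrum of a transfer operator, the sketch below uses a toy unital map, $O \mapsto \lambda O + (1-\lambda)\,\mathrm{tr}(\rho_{ss} O)\,1$, acting on a single qubit. This channel, the value $\lambda = 0.5$, and the stationary state `rho_ss` are assumptions chosen for illustration; they are not derived from any MERA tensors, and the helper names are hypothetical.

```python
import numpy as np

def toy_transfer_operator(op, lam, rho_ss):
    """Assumed toy channel (Heisenberg picture): O -> lam*O + (1-lam)*tr(rho_ss O)*1."""
    d = op.shape[0]
    return lam * op + (1 - lam) * np.trace(rho_ss @ op) * np.eye(d)

def superoperator_matrix(channel, d):
    """Matrix of a linear map on d x d operators, in the basis of matrix units."""
    cols = []
    for i in range(d):
        for j in range(d):
            unit = np.zeros((d, d), dtype=complex)
            unit[i, j] = 1.0
            cols.append(channel(unit).reshape(-1))
    return np.array(cols).T

d, lam = 2, 0.5
rho_ss = np.diag([0.7, 0.3]).astype(complex)
phi = superoperator_matrix(lambda o: toy_transfer_operator(o, lam, rho_ss), d)

# Sort eigenvalues by decreasing real part; the leading one belongs to the identity.
eigs = sorted(np.linalg.eigvals(phi), key=lambda z: -z.real)
nu = -np.log2(eigs[1].real)
print(eigs[0].real, eigs[1].real, nu)
```

For this toy map every operator with $\mathrm{tr}(\rho_{ss} O) = 0$ is rescaled by $\lambda$ at each step, so its norm decays as $2^{-\nu n}$ after $n$ iterations.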
To conclude this section, we formally define the class of channels that we plan to work with:
Definition 2. For a scale invariant MERA network defined by the isometries $\{W_s := \bigotimes_{x_s} V_{x_s} \bigotimes_{y_s} U_{y_s}\}$, we say that the class of channels $\Phi_s^{s+1}(\cdot) = W_s^\dagger (\cdot) W_s$ is RG-regular if its action on elementary blocks can be written as in Eq. (2.5) with a scaling dimension $\nu := -\log_2(\mathrm{Re}[\lambda_1])$ strictly larger than zero.
If the subspaces $C_s$ are related to $H_s$ by an RG-regular channel $\Phi_0^s$, we will say that $C_s$ are RG-regular subspaces (or codes in the error correction language).

Calculus for entanglement renormalization
The action of the renormalization map $\Phi_s^{s'}$ on general, non-local operators plays an important role. We develop a calculus that facilitates this analysis below. The operators that we consider are, generally speaking, supported on three subsystems, which we denote as $A$, $A'$, and $R$. Here $A$ is a subsystem of the physical Hilbert space ($H_0$), $R$ is the purifying space, and $A'$ is yet another subsystem that is included neither in the physical Hilbert space nor in the purifying space. Let us denote such an operator as $O_{AA'R}$. Simply connected regions of the physical Hilbert space $H_0$ will be denoted without a subscript ($A$).
We consider a linear map of the following form: It is important to note that the output of this map is generally an operator, because $A'$ lies outside of the physical Hilbert space and the purifying space. We see that The first line follows from the definition of the state. The second line follows from the locality of the renormalization map; it maps an operator supported on $AA'R$ to an operator supported on $A_s A' R$. We will often write subsystems as superscripts in order to specify the MERA scale $s$ in the subscript. While the identity in Eq. (2.7) may seem a bit opaque, it has important implications. First, consider the case in which $A'$ is an empty set. The correlations between $A$ and $R$ for an arbitrary operator have a simple closed-form expression.
The claim follows from the trivial identity
There is in fact a more general identity, which plays a crucial role in our analysis.
Proof. Consider an operator Schmidt decomposition of $O_{ACR}$: where $O_{A,j}$ is an operator supported on $A$ and $O_{CR,j}$ is an operator supported on $CR$.
Because any operator admits such a decomposition, it suffices to prove the statement for an operator of a tensor product form between $A$ and $CR$. Without loss of generality,
1 Correspondingly, in view of Eq. (2.7), $O_{AR}$ should not be viewed as an operator supported on $A$ and the purifying space. It should be instead viewed as an operator supported on $A$ and a subsystem $R$ which is neither in the physical Hilbert space nor in the purifying space.
Note that $\mathrm{tr}_{A_s}[\rho_s^{A_s} \Phi_0^s(O_1)]\, O_2$ is an operator supported on $CR$, as the term appearing before $O_2$ is a scalar. By using the fact that tr[ρ CR . Since we assumed that $A_{s'} \cap C_{s'} = \emptyset$ for all $s' \le s$, the past causal cones of $A$ and $C$ never overlap with each other in this range. Therefore, . This completes the proof.

Quantum Error Correction and Holography
Recently, various quantum error correcting codes were proposed as models of holography [6][7][8][9]. These codes are equipped with a family of logical operators that are labeled by the coordinates in the bulk. The radial coordinate, which in our setup corresponds to the scale ($s$), is particularly interesting in the context of quantum error correction. The logical information becomes progressively better protected against erasures of boundary subsystems as it recedes further into the bulk. We will show that such codes naturally arise from the MERA construction. Our choice of logical operators follows the choice of bulk local operators defined in Refs. [16,17]. In our notation, the logical operators at scale $s$ will have the form $W_s \cdots W_1 O W_1^\dagger \cdots W_s^\dagger$, where $O \in B(H_0)$. We show that, as in the existing proposals [6][7][8][9], these operators are better protected as $s$ increases. We also derive several fundamental properties of these codes.
How are these results related to the discussion in Section 2? The answer lies in a well-known duality relation between two different concepts, which is perhaps one of the most fundamental insights behind quantum error correction. Erasure of a certain region is correctable if and only if the region contains no logical information [25,26]. In slightly more technical terms, an erasure is correctable if and only if the region is uncorrelated with the purifying space for all the codewords. This equivalence relation implies that it suffices to bound the correlations between the purifying space and a subsystem of interest. This is why we considered objects of the form of Eq. (2.2) in Section 2.
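The duality between correctability and decoupling can be made concrete on a small exact code. The sketch below is an illustration that is not taken from the paper: it uses a four-qubit code with codewords $(|0000\rangle + |1111\rangle)/\sqrt{2}$ and $(|0011\rangle + |1100\rangle)/\sqrt{2}$, purified by one reference qubit $R$, and evaluates $\|\rho_{AR} - \rho_A \otimes \rho_R\|_1$ for two choices of the erased region $A$. The choice of code and the helper names are assumptions.

```python
import numpy as np

def reduced_density_matrix(psi, keep, n):
    """Reduced state on the qubits listed in `keep` (in that order) of an n-qubit pure state."""
    drop = [i for i in range(n) if i not in keep]
    tensor = psi.reshape([2] * n).transpose(list(keep) + drop)
    mat = tensor.reshape(2 ** len(keep), -1)
    return mat @ mat.conj().T

def decoupling_distance(psi, region, ref, n):
    """Trace norm of rho_{AR} - rho_A (x) rho_R for region A and reference qubit R."""
    rho_ar = reduced_density_matrix(psi, list(region) + [ref], n)
    rho_a = reduced_density_matrix(psi, list(region), n)
    rho_r = reduced_density_matrix(psi, [ref], n)
    return np.linalg.norm(rho_ar - np.kron(rho_a, rho_r), "nuc")

# Qubits 0-3 are physical, qubit 4 is the reference R (leftmost bit = qubit 0).
# Purified code state: (|0_L>|0> + |1_L>|1>)/sqrt(2), expanded in the computational basis.
terms = ["00000", "11110", "00111", "11001"]
psi = np.zeros(2 ** 5, dtype=complex)
for bits in terms:
    psi[int(bits, 2)] = 0.5

single = decoupling_distance(psi, [0], 4, 5)   # one erased qubit
pair = decoupling_distance(psi, [0, 1], 4, 5)  # two erased qubits
print(single, pair)
```

A single qubit is exactly decoupled from the reference, consistent with the fact that this code can correct any single-qubit erasure, whereas the pair of qubits 0 and 1 supports a logical operator and remains correlated with $R$.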
It turns out, however, that much more can be learned about the structure of these codes by introducing a more refined notion of correctability. It is the notion of local correctability, which was introduced in Ref. [22] and used in the context of holography in Ref. [29]. As in Refs. [25,26], there is a similar duality relation between local correctability and the degree to which different subsystems are uncorrelated with each other. In words, erasure of a region $A$ is locally correctable from a recovery operation on $AB$ if and only if $ACR$ is decoupled into $A$ and $CR$, where $C$ is the complementary region of $AB$ and $R$ is the purifying space. It should be clear that this subsumes the less general case of $B$ being an empty set, which corresponds to Refs. [25,26]. Specifically, this result is encapsulated in Theorem 1.
Theorem 1 [22]. Consider a code $C$ whose underlying Hilbert space can be decomposed into a tensor product of $A$, $B$, and $C$. Let $R$ be the purifying space of $C$. Then the following two objects are equal: where inf is over all CPTP maps from $B(H_B)$ to $B(H_{AB})$ and $B(\cdot,\cdot)$ is the Bures distance.
A few remarks are in order. First, the Bures distance is a distance measure that can be easily related to a more familiar one, the trace norm: The trace norm between two quantum states has the operational interpretation that it quantifies the probability with which two states can be distinguished by a global measurement. Second, if $\rho_{ACR}$ is close to $\rho_A \otimes \rho_{CR}$, it implies that erasure of region $A$ can be corrected by some map supported on $AB$. This is because such a factorization implies that the expression in Eq. (3.1) is small, which subsequently implies that the expression in Eq. (3.2) is small. The latter equation, in words, says that the original state is close to the state that is created by (i) erasing $A$ and then (ii) applying some recovery map on $AB$. The converse direction also works. If there exists a recovery map on $AB$ that can correct the erasure of $A$, then $\rho_{ACR}$ should be close to the form $\omega_A \otimes \rho_{CR}$ by Theorem 1. Because these two states must also be close to each other over their subsystems, $\omega_A$ should be close to $\rho_A$, establishing the converse direction.
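The relation between a fidelity-based distance and the trace norm can be checked numerically. The sketch below assumes the convention $B(\rho,\sigma) = \sqrt{1 - F(\rho,\sigma)}$ with $F = \|\sqrt{\rho}\sqrt{\sigma}\|_1$, which may differ from the normalization used in Ref. [22], and tests the standard Fuchs-van de Graaf inequalities $1 - F \le \frac{1}{2}\|\rho - \sigma\|_1 \le \sqrt{1 - F^2}$ on a pair of single-qubit states chosen for illustration.

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(rho)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative eigenvalues from rounding
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho, sigma):
    """F = || sqrt(rho) sqrt(sigma) ||_1 (nuclear norm)."""
    return np.linalg.norm(psd_sqrt(rho) @ psd_sqrt(sigma), "nuc")

def trace_distance(rho, sigma):
    """Half of the trace norm of rho - sigma."""
    return 0.5 * np.linalg.norm(rho - sigma, "nuc")

rho = np.diag([1.0, 0.0])    # a pure state
sigma = np.diag([0.5, 0.5])  # the maximally mixed state
f = fidelity(rho, sigma)
t = trace_distance(rho, sigma)
bures = np.sqrt(1.0 - f)     # assumed convention for the Bures distance
print(f, t, bures)
```

With this convention a small Bures distance forces a small trace distance and vice versa, which is all that is needed to pass between the two measures up to constants.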
To summarize, by exploiting the basic structure of the MERA network, one can tightly bound correlations between two subsystems. This bound in turn, by using Theorem 1, implies that erasure of certain regions is correctable. This establishes how well the logical information at different scales is protected.

MERA as an approximate quantum error correcting code
We have already formally defined the code subspace $C_s \subset H_0$. What remains is to study the properties of the code subspace. What kind of erasures are correctable? If they are correctable, how well can those errors be reversed? As we shall see, the analysis follows naturally from the framework that we have constructed in Sec. 2. We begin with a simple warm-up exercise, wherein we study the correctability of simply connected regions. We then move on to studying the correctability of more general regions and deriving a fundamental tradeoff bound. The key technical result is Theorem 2, which establishes the local correctability of these codes.

Correctability of simply connected regions
As a warm-up exercise, we show that erasure of any simply connected region $A$ can be approximately corrected up to a small error if $s \gg \log_2 |A|$.

Lemma 3. Let $C_s$ be an RG-regular MERA code. Then for any $O_{AR} \in B(H_A \otimes H_R)$ where $A$ is a simply connected region, and any purified code state $\rho_{AA^cR}$,
Proof. First recall the following two identities. The first identity follows trivially from the definition and the second one follows from Lemma 1. Let us denote the left hand side of Eq. (4.1) as $\delta$. The two identities above imply One can see that Eq. (4.1) holds with a choice of constant $C = 2d^2$.
By invoking Theorem 1, we can easily show that the region A is correctable.

Corollary 1.
For an RG-regular MERA code $C_s$, and for any simply connected region $A$, there exists a CPTP map $\mathcal{R}$ acting on $H_0$ such that where $C$ is a numerical constant, and $A^c$ denotes the complement of $A$.
The proof simply follows by applying Theorem 1 and then using the relation between the Bures distance and the trace distance (Eq. (3.3)).
Our findings support the conclusion of Ref. [20], in which it was suggested that the low-energy states of critical systems should have a certain error correction property. In particular, our work provides a satisfying answer to the question: how does an error correcting code emerge in these systems? It arises from the fact that the ground state can be well approximated by a MERA state.

Local correctability
As was the case in Ref. [22], the notion of local correctability plays an important role in our applications. We derive this for RG-regular MERA codes.
Theorem 2. Let $C_s$ be an RG-regular MERA code. Let $A$ be a simply connected region and let $B$ be a region shielding $A$ such that $AB$ is the set of sites that are distance $x$ or less away from $A$ and $|AB| < 2^s$. Let $C$ be the complement of $AB$. Then there exists a recovery map $\mathcal{R}_{AB}$ for all purified code states $\rho_0^{ABCR}$, where $c$ is a numerical constant.
Proof. The proof is similar to that of Lemma 3. First recall the following two identities.
provided that the past causal cones of $A$ and $C$ do not overlap with each other all the way up to scale $s$. The first identity follows from the definition and the second identity follows from Lemma 2. Let us denote the left hand side of Eq. (4.6) as $\delta$. The two identities above imply that where $O_{A'C'R} = \Phi_0^{r_A}$ is an operator supported on $A' = A_{r_A}$, $C' = C_{r_A}$, and $R$. Here $r_A$ is chosen such that $A_{r_A}$ is contained in an elementary block and $r$ is chosen to be the scale after which the past causal cones of $A$ and $CR$ overlap with each other. This happens when $x$ is shrunk to a size of $O(1)$. Thus, it can be chosen to be $r = \log_2 x + O(1)$, where $O(1)$ is a non-universal constant of order unity. Now consider the following operator Schmidt decomposition:
(Figure caption, panel b) The minimal correctable region of $C_s$ is also the minimal support of a logical operator, which corresponds to the distance of the error correcting code. We see that it takes a Cantor-set type form, as already suggested in Ref. [29].
The action of $\Phi_{r_A}^{r}$ on this operator is of the following form:
An important consequence of local correctability is that two distant correctable regions are jointly correctable. In the context of quantum error correction this property is called the union lemma. Indeed, suppose that regions $A_1$ and $A_2$ are both locally correctable on $A_1B_1$ and $A_2B_2$ up to error $\epsilon$ each. Then $A_1A_2$ is locally correctable on $A_1B_1A_2B_2$ (see Lemma 11 in Ref. [22]).

Applications
There are many implications of Theorem 2. As was the case in Ref. [22], it forms the basis for deriving fundamental tradeoff bounds for MERA codes. Furthermore, it also implies that two observables whose separation in scale is large compared to $1/\nu$ behave as if they were space-like separated operators. In particular, they obey a Lieb-Robinson type locality bound.

Tradeoff bounds
In this section we will derive bounds on the minimal support of a bulk logical operator on the boundary. In terms of quantum error correcting codes, this quantity corresponds to the distance of the code. For simplicity, we consider the limit $\nu \to \infty$. We do not expect this limit to be physical, because to our knowledge, no such theory is known at this point. However, it is the limit in which all of our statements become exact. In particular, we partially recover the so-called "uberholography," which was suggested recently by Pastawski and Preskill [29]. In this limit, all correlations outside the bulk lightcone vanish completely.
Consider an RG-regular MERA code $C_s$ of $n$ physical qubits. At this point, we do not restrict $|C_s|$ to being constant. From Theorem 2, we know that any state $\rho \in C_s$ can be recovered from $\rho_{A^c}$ by applying a channel on $AB$, provided that $x \ge |A|$, where $AB$ is the set of sites that are distance $x$ or less away from $A$. By choosing $x = |A| + O(1)$, we see that a subsystem $A$ of size less than $2^s/z$ can be locally corrected from such $B$, where $z = |AB|/|A| = 3$.
Therefore, the logical information of $C_s$ can be recovered from the remaining $n - 2^s/z$ qubits. However, we can do better than this. By the so-called union lemma [22,30,31], two disconnected correctable regions $A_1$ and $A_2$ are jointly correctable if their local recovery maps have non-overlapping supports. This implies that $n/2^s + O(1)$ many regions of size $2^s/z$ are jointly correctable, implying that in fact only $n(1 - \frac{1}{z}) + O(1)$ many qubits are required to recover a code state.
It turns out that we can do even better. Let $\mathcal{R}_{AB}$ be the map recovering erasure of region $A$. $\mathcal{R}_{AB}$ takes as input the state $\rho_B$ and outputs $\rho_{AB}$: $\mathcal{R}_{AB}(\rho_B) = \rho_{AB}$. $B$ is composed of a left and a right component: are the left and right parts of $B_L$, then we get: $\mathcal{R}_{AB} \mathcal{R}_{A_1B_1}(\rho_{B_1}) = \rho_{AB}$ (see Fig. 2b for an illustration). We can now iterate until we are left with $2^g$ regions of constant size. We now estimate what value $g$ can take. The smallest elements have size $O((\frac{z-1}{2z})^g |AB|)$, with $AB$ the original region. We want to know what fraction of $AB$ is left after the $g$ steps, or $2^g = |AB|^\alpha$. This yields In terms of error correcting codes, we get that $|AB| = O(n/k)$, because $k = 2^{\log_2(n) - s}$ and $|AB| = 2^s$ by construction, so that the distance (the smallest support of a logical operator) satisfies $d \le C(n/k)^\alpha$, for some constant $C$, and $\alpha = \log(2)/\log(2z/(z-1)) \approx 0.63$ for $z = 3$. Note that our bound differs from Ref. [29]; there $\alpha \approx 0.78$, which yields a weaker bound. This is because our notion of local correctability is stronger than that of [29]; erasure of a simply connected region can be corrected if $x > |A|$ in our setup, but the setup of [29] requires $x > c|A|$, where $c \approx 2.414$. Indeed, the existence of operators with small scaling dimensions in holography implies that the $\nu \to \infty$ limit cannot be an adequate description of such theories.
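The exponent quoted above is simple arithmetic and can be evaluated directly. The sketch below computes $\alpha = \log 2 / \log(2z/(z-1))$ for $z = 3$. For comparison it also evaluates the same formula with $z = 1 + 2c$, which is an assumption about how a shielding distance $x = c|A|$ translates into the ratio $z = |AB|/|A|$, and it takes $c = 1 + \sqrt{2}$ as the exact value behind the quoted $c \approx 2.414$, which is also an assumption.

```python
import math

def distance_exponent(z):
    """alpha = log 2 / log(2z / (z - 1)), with z = |AB| / |A|."""
    return math.log(2) / math.log(2 * z / (z - 1))

# Shielding distance x = |A| on each side of A gives |AB| = 3|A|.
alpha_here = distance_exponent(3)

# Assumed translation for a larger shielding distance x = c|A|: z = 1 + 2c.
c = 1 + math.sqrt(2)  # about 2.414
alpha_ref = distance_exponent(1 + 2 * c)

print(round(alpha_here, 4), round(alpha_ref, 4))
```

Under these assumptions the formula gives an exponent close to 0.63 for $z = 3$ and close to 0.79 for $c \approx 2.414$, in line with the two values compared in the text: a smaller required shielding region yields a smaller exponent and hence a tighter upper bound on the distance.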

Emergent lightcone
In this section, we establish a bound on how fast correlations between bulk local observables build up in time. We will bound a commutator of the following form: where $O_1(t)$ is a local operator acting on $H_0$, $O_2$ is a logical operator of $C_s$, and the time evolution is generated by a local Hamiltonian. We derive an upper bound, which remains small provided that $|t|$ is small compared to $2^{\nu s}$ up to some multiplicative constants. It is interesting to compare this bound to the well-known Lieb-Robinson bound [13], which states that where $L$ is the distance between the nontrivial supports of $O_1$ and $O_2$, $v$ is the Lieb-Robinson velocity, and $\xi$ is a constant that depends on the underlying interaction graph. The main difference is that Eq. (5.3) holds in the entire Hilbert space, while Eq. (5.2) only holds in a low energy subspace, i.e., the code subspace $C_s$. Obviously, a more refined bound would involve the size and the location of the supports of $O_1$ and $O_2$, but that is beyond the scope of this paper. Here we focus on a simpler setting, in which $O_1$ is assumed to be a local operator and $O_2$ to be an arbitrary logical operator in the code subspace.
Because the dynamics in the physical Hilbert space is assumed to be generated by a local Hamiltonian, observables under this time evolution obey Eq. (5.3) with an appropriate choice of v, c, and ξ. From this fact, we can derive the following bound.
Theorem 3. For an RG-regular MERA code $C_s$, a local physical operator $O_1$ and a logical operator $O_2$ of $C_s$, we get where $O_1(t) = e^{iHt} O_1 e^{-iHt}$, $c$ is a constant, $\nu$ is the scaling dimension, $v$ is the Lieb-Robinson velocity of $H$, and $\xi$ is a numerical constant that depends on the interaction graph of $H$.
Proof. We consider the left hand side of Eq. (5.4) where $|\sigma'_0\rangle = O_2|\sigma_0\rangle$ and $|\rho'_0\rangle = O_2|\rho_0\rangle$ are states in $C_s$. From Eq. (5.3) it follows that there exists an operator $O_1^l(t)$, supported on a set of sites with distance $l$ or less away from the support of $O_1$, such that [32] ) |σ s . Now, we can decompose the action of $\Phi_0^s$ into $\Phi_0^r$ and $\Phi_r^s$ so that $\Phi_0^r(O_1^l(t))$ is contained in an elementary block. Denoting this operator as $O'$, The first term vanishes because both $\langle\rho'_s|\sigma_s\rangle$ and $\langle\rho_s|\sigma'_s\rangle$ are equal to $\langle\rho_0|O_2|\sigma_0\rangle$. The remaining term, , is bounded by $2^{-\nu s} d^2 \|O_1\|$. By choosing $l/\xi = \nu s + v|t|/\xi$, the bound is derived.
It should be noted that the bound on $\langle\rho_0|[O_1(t), O_2]|\sigma_0\rangle$ does not necessarily imply a bound on $\langle\rho_0|[O_1, O_2(t)]|\sigma_0\rangle$. This is because the action of the Hamiltonian may map a state in $C_s$ to a state outside of this subspace. However, this was to be expected, since we did not incorporate any relation between the code subspace and the Hamiltonian. One solution is to consider the action of the commutator on states which are eigenstates of $H$. One physically reasonable choice would be the ground state of $H$. If the ground state of $H$, $|\psi\rangle$, can be represented by a MERA such that the code subspace $C_s$ defined by this MERA is RG-regular, then
At this point, we have only considered a Lieb-Robinson bound between an observable in the bulk and another observable on the boundary inside the future light cone of the first. A more complete geometrical description of the Lieb-Robinson bounds for two observables anywhere in the bulk would be desirable. This is left for future work.

Conclusion
In this paper, we have outlined a mechanism by which certain toy models of holography can be derived from generic properties of states at criticality. It is straightforward to see that the main findings of this paper, e.g., Theorem 2 and its implications, follow in higher dimensions as well. This is because the derivation was based on a very general property of entanglement renormalization. This provides a partial explanation for the origin of these codes, assuming that the low energy states of conformal field theory can be well described by MERA. It also explains how causal dynamics can arise in these systems, despite the fact that the effective Hamiltonian in the bulk is not manifestly local.
However, many important issues remain. For one thing, it will be interesting to understand the entanglement wedge reconstruction [4] in our framework. The approximate nature of our bound makes this analysis challenging. It is also important to note that our bound is not strong enough to ensure locality at sub-AdS scale. This was to be expected because our bound does not incorporate the property that is expected to be satisfied by conformal field theories with a semi-classical gravitational dual: that there is a gap in the scaling dimension of the operators [33]. For both of these issues, a framework that can organize operators in terms of their scaling dimensions is desirable. One possibility would be operator algebra quantum error correction, as was suggested in Ref. [4], or an approximate version thereof. An analogous analysis would require a derivation of Theorem 1 for general operator algebras, which may be of independent interest.
In order to gain more refined insight, it will be important to study families of circuits that are equipped with a more refined set of structures. There are many interesting questions in this direction. Would a random tensor network of Ref. [8] emerge from the random MERA network in Ref. [15]? Can we import the constraints posed on the operators of the CFT into the language of quantum error correction? Would the tradeoff bounds on quantum error correction lead to nontrivial constraints on gravity? These are left for future work.