A General Proof of the Quantum Null Energy Condition

We prove a conjectured lower bound on $\left\langle T_{--}(x) \right\rangle_\psi$ in any state $\psi$ of a relativistic QFT dubbed the Quantum Null Energy Condition (QNEC). The bound is given by the second order shape deformation, in the null direction, of the geometric entanglement entropy of an entangling cut passing through $x$. Our proof involves a combination of the two independent methods that were used recently to prove the weaker Averaged Null Energy Condition (ANEC). In particular the properties of modular Hamiltonians under shape deformations for the state $\psi$ play an important role, as do causality considerations. We study the two point function of a "probe" operator $\mathcal{O}$ in the state $\psi$ and use a lightcone limit to evaluate this correlator. Instead of causality in time we consider \emph{causality in modular time} for the modular evolved probe operators, which we constrain using Tomita-Takesaki theory as well as certain generalizations pertaining to the theory of modular inclusions. The QNEC follows from very similar considerations to the derivation of the chaos bound and the causality sum rule. We use a kind of defect Operator Product Expansion to apply the replica trick to these modular flow computations, and the displacement operator plays an important role. Our approach was inspired by the AdS/CFT proof of the QNEC which follows from properties of the Ryu-Takayanagi (RT) surface near the boundary of AdS, combined with the requirement of entanglement wedge nesting. Our methods were, as such, designed as a precise probe of the RT surface close to the boundary of a putative gravitational/stringy dual of \emph{any} QFT with an interacting UV fixed point. We also prove a higher spin version of the QNEC.


Introduction and summary
Bounds on the stress tensor T −− of a QFT have important consequences for the semiclassical limit of gravity: using these bounds we can rule out pathological spacetimes that might arise when coupling gravity to matter in the form of this QFT. Typically these pathologies have their root in some form of causality violation of the resulting spacetime. However, naive bounds that apply to classical field theory, like the local Null Energy Condition (NEC) T −− > 0, are violated quantum mechanically. The NEC was central to the classical proofs of the black hole area law [1], singularity theorems [2], topological censorship [3], and other results [4]. In order to generalize these proofs to the quantum regime, several new energy conditions on T −− have been conjectured with various degrees of non-locality [5][6][7][8][9][10][11][12][13]. Despite their origin in gravitational physics, these generalizations often have a limit which applies directly to the QFT in curved space, and it is furthermore interesting to study their validity and consequences even in flat Minkowski space [14,15]. Perhaps most excitingly, their validity almost always relates to bounds on the behavior and manipulation of quantum information in the QFT [16,17], further strengthening the important connection between gravity and quantum information.
Recently two different proofs of the Averaged Null Energy Condition (ANEC) in any UV complete QFT have appeared in the literature [18,19]. In Minkowski space this is the positivity constraint:
$$\int_{-\infty}^{\infty} dx^- \, \langle T_{--} \rangle_\psi \geq 0 \qquad (1.1)$$
The proof [18] works in Minkowski space and for null integrals along a complete null geodesic generator of the horizon of a static black hole. This is sufficient to rule out using these black holes as traversable wormholes [20,21]. The proof of the ANEC in [19] was based on causality considerations applied to the two point function of a probe operator O evaluated in the state |ψ⟩. In [18] the proof was based on monotonicity of relative entropy under shape deformations of an entangling region A, which is taken to be a null deformed cut of the Rindler horizon x + = 0. This in turn imposed a condition on the negativity of the shape deformations of vacuum modular Hamiltonians for these entangling cuts, which was then proven to be related to the ANEC operator in (1.1). Indeed it does not take much more work to use these modular Hamiltonians, combined with monotonicity of relative entropy, to prove the quantum half-ANEC (1.2), which is then an important ingredient in the semi-classical proof of the Generalized Second Law for black hole mechanics [13]. It turns out, however, that we need an even more local constraint in order to generalize other classical gravitational theorems, such as the Bousso/covariant entropy bound [22], to the semi-classical regime. One such constraint is the Quantum Null Energy Condition (QNEC) [23][24][25], which is logically stronger than the ANEC and the half-ANEC, implying both of these if it is true. For certain special cases the QNEC is the functional derivative δ/δx − (y), or shape deformation, of (1.2). Since we don't expect a second order shape deformation of relative entropy to be constrained in sign for a general quantum system, a proof of the QNEC must involve an essentially new ingredient beyond monotonicity.
As we will see the QNEC follows from a more fine grained notion of causality compared to the results in [19], where we make use of the probe operator O, but at the same time add the action of modular Hamiltonians into the mix. We will prove a (slight) generalization of the QNEC. This is essentially an integrated version: where A and B are two spatial regions of a fixed Cauchy slice. They should satisfy the inclusion property D(B) ⊂ D(A) where D(C) is the domain of dependence of C. The x − parameterizes a null line passing from the entangling surfaces ∂A to ∂B located at fixed (x + = 0, y) and y locally labels the coordinates along the entangling surface with x ± labeling null coordinates transverse to the surface. We will require the surfaces ∂A, ∂B to be locally stationary at the point y, which means the extrinsic curvature in one of the null directions K + ij = 0 vanishes at y as well as a sufficient number of its y derivatives.
Other than this we only require that the domains of dependences D(B) and D(Ā) are non timelike separated, so for example A, B could have multiple disconnected components with non trivial topology etc.
Figure 1. Our setup involves two causally disconnected regions of Minkowski space, the domains of dependence of B and Ā (shaded green regions). These become close to null separated along a null line (gray curve in the left figure) along which we would like to prove the QNEC. The null separation along this line is the coordinate length δx − . We insert two operators (blue and red dots) in these respective regions in a lightcone limit close to the continuation of this null line. Axes: u = x − and v = x + .
The essential idea for the proof is to study the following correlator: 1 where the two probe operators O B , OĀ are inserted in the region D(B) and D(Ā) respectively. We then act on these operators with modular flow O B → e isK B O B e −isK B using the (full) modular Hamiltonians K B , K A defined for the sub regions B, A respectively and for the state |ψ . The modular Hamiltonian can be defined abstractly (with some technical assumptions on the state |ψ ) with respect to the algebra of bounded operators within the region, and Tomita-Takesaki theory [26,27] guarantees that the modular flowed operators for real s are still contained within the algebra of operators of that region. More constructively the modular Hamiltonian is related to 2πH A ≡ − ln ρ A the reduced density matrix of |ψ restricted to A and is sometimes referred to as the entanglement Hamiltonian, and modular flow simply involves time evolution using this Hamiltonian. In this paper we will mostly be interested in the full modular Hamiltonian which is K A = H A ⊗ 1Ā − 1 A ⊗ HĀ, and it is important to note that K A,B |ψ = 0 for the defining state. For some special cases modular evolution can be local, as for example the case where A → A 0 is a half-space/Rindler cut in Minkowski space and for |ψ → |0 the vacuum [28]. The action of K 0 A is then a boost holding fixed ∂A 0 . This is our definition of the term in the denominator of f (s) (1.4) where A 0 and B 0 are half space cuts such that ∂A 0 and ∂B 0 are parallel to ∂A and ∂B at y respectively. Later in the paper we will slightly refine this definition of the denominator to allow for ∂A 0 , ∂B 0 to be null cuts of the Rindler horizon agreeing locally with ∂A, ∂B. In this case the denominator can be constrained using the so called theory of half-sided modular inclusions [29][30][31][32][33] -the computation of which has some overlap with the recent paper [34].
Since the probe operators are initially spacelike separated they commute, and since D(B) and D(Ā) are spacelike separated the modular evolved operators will also commute for real s (1.5). This fact, pertaining to causality, translates into a statement about analyticity of f (s) in the complex s plane. Indeed a generalization of Tomita-Takesaki's modular theory [29,31,35,36] establishes the analytic extension of f (s) in the strip −π < Im s < π. We will use this analyticity to prove the QNEC. Roughly speaking, if the QNEC were violated the modular evolved operators could exit their respective causal domains and cause a branch cut along Im s = 0, giving a non-zero commutator (1.5). Since we are using modular evolved operators this is a subtle violation of causality, but one which makes sense in the context of AdS/CFT. Indeed it was recently shown that modular evolved operators give a way to reconstruct bulk operators localized within the entanglement wedge associated to some boundary sub region such as A [37,38]. The entanglement wedge is believed to be the largest region containing information reconstructible using operators acting on the sub Hilbert space H A in the QFT [39][40][41][42], and so these bulk regions are causally constrained by the boundary theory. Additionally the QNEC was proven for theories with an AdS/CFT dual using exactly this causality requirement [43][44][45], more specifically the entanglement wedge nesting (EWN) requirement. As we will explain, there is a very precise sense in which this paper can be thought of as studying subtle QFT causality requirements via the causality properties of a gravitational dual with an emergent radial direction [46]. Most QFTs do not have a classical gravity dual, but since we will only be studying properties of the gravitational system close to its boundary, the real stringy/strongly interacting nature of the dual gravitational system is in some sense suppressed.
Our results thus identify (1.5) with the QFT equivalent of EWN.
Of course having setup the problem it may seem hard to compute f (s) in any useful way, that is, retaining full generality over the state ψ as well as the generality of the entanglement cuts ∂A and ∂B. This is because K A,B are complicated non-local operators. We will manage to make progress here using a lightcone limit for the operators O B,Ā , as pictured in the setup of Figure 1, where the operators are separated in the (x + , x − ) direction by an amount (∆v, ∆u) and ∆v → 0 as ∆u is held fixed. This is a very similar limit to that considered in the causality ANEC proof [19] although now in the presence of entanglement cuts through points collinear with the operators. In this limit we can use the replica trick to compute properties of the general modular Hamiltonians K A,B , coupled with a defect lightcone OPE argument. The defect is the non-local co-dimension 2 twist operator of the n-replicated theory and a large part of our computations involve controlling the spectrum of local operators on the defect (referred to as defect operators) in the n → 1 limit.
For large s (but not too large as to move us out of the lightcone limit) we find the small but growing correction term: where Q − is the QNEC object in (1.3). Note that we have introduced a quantity we call G N via its usual relation to G N ∝ 1/c T in holographic theories (4.28) in units where R AdS = 1. There is no need for G N to be small. We have also defined z 1 in terms of the kinematics of the operator insertions which exactly plays the role of the radial z coordinate in an emergent AdS. Here δx − is the coordinate distance between ∂A and ∂B and ∆u > δx − must be true. All we have to do now, taking inspiration from the chaos bound [47] and causality bound [48] stories, is prove that Ref < 1 along the lines Ims = ±π/2 in the complex s-strip. Analyticity then allows us to extract Q − from (1.6) as an integral over (1 − Ref ) along these same lines which is then constrained to be positive thus proving the QNEC.
Let us add one more comment on the meaning of f (s). The modular flows generated by K B 0 ,A 0 simply boost the operators O B,Ā and the denominator is explicitly given by the two-point function for the boosted operators. The numerator generally lacks such a simple picture, and it involves complicated interactions between the operators and the defect. However, in the light-cone limit we can interpret the numerator approximately as a two-point function for local operators. The leading correction to f (s) in (1.6) can be viewed as giving a shift for the separation (−∆v) > 0 in the numerator relative to that of the denominator: This is suggestive of a tendency towards shifting the branch cut singularity in the two point function onto the real s axis if Q − < 0 for large enough s. This is not a precise argument since the shift is only important when the small correction in f (s) competes with 1, but that's okay since the precise argument was given above. However it allows us to identify the gravitational time delay/advance that we should look for in the bulk, and we will identify this by (very slightly) generalizing the arguments of [43] which proved the QNEC for holographic theories using entanglement wedge nesting (EWN) of the RT [49]/HRT [50]/quantum extremal [51] surface near the boundary. Many of the properties of the function f (s) are the same as the function f (t) defined in [47] for studying chaos using an out of time order four point function for two different operators W, V (t) in a thermal state. The analogy is strengthened by taking the thermal state to be that of the Rindler state, and time t to be generated by boosts using the Rindler Hamiltonian. 
This is then the same setup as [19] for proving the ANEC using causality -although the equivalent function is constrained in different kinematic regimes -determined by how large t is and whether the operators are in a lightcone limit for the causality bound (large t) or the Regge limit (even larger t, but not as large as the scrambling time ∼ ln c T ) for the chaos bound. The analog of our setup would evolve V not with the Rindler Hamiltonian but with the complicated modular Hamiltonian of the state W |Ω and now reduced to two different entangling regions. From this point of view our paper represents a generalization of the chaos and causality bound setup, however we have not yet explored the extent to which we can apply this setup usefully to non-relativistic quantum systems and generic perturbed thermal states. We also do not have much to say about the equivalent "even larger" s regime of f (s) in a large-N theory analogous to the Regge limit leaving these fascinating generalizations for future work.
The paper is organized as follows. In Section 2 we start with background on various known results that will be useful to us, including a discussion of the holographic proof of the QNEC as well as a discussion of geometric modular Hamiltonians and their action on local operators. We note an interesting relation to the well studied theory of half-sided modular inclusions. In Section 3 we discuss the use of the replica trick to compute properties of general modular Hamiltonians. We then consider the defect OPE which is necessary to carry out the replica trick computation. This includes a discussion of possible local defect operators that arise when n ≈ 1. In Section 4 we compute the matrix elements of the modular Hamiltonian in the state excited by O B,Ā in the lightcone limit. In Section 5 we use this result to find the action of modular flow in a perturbative expansion with respect to the lightcone limit which gives the result (1.6). In Section 6 we detail the general properties of f (s) which lead to the QNEC. In Section 7 we discuss some loose ends, including an understanding of local geometric contributions to entanglement entropy that can contaminate the QNEC quantity Q − and thus invalidate the bound for non-stationary entanglement cuts. We also discuss a higher spin version of the QNEC, generalizing the higher spin version of the ANEC proven in [19]. We conclude in Section 8 with several possible extensions. Some computations and details are relegated to the Appendices.
Note added: In this version of the paper, we are introducing a minor but technically important modification to the operators O B,Ā appearing in f (s) compared to the previous pre-print. In particular, we are inserting them at positions that are s-dependent in a way that we will specify later. The additional s-dependence is small at large s, and only affects the analytic properties of f (s) in a controlled way that decouples from our main arguments. The reason behind this modification is to eliminate contributions to f (s) that could potentially contaminate the relation between the bound for f (s) and the QNEC statement Q − ≥ 0 in the light-cone limit. These modifications were understood in [52] where they naturally arose from relative modular flow. More details will be explained as we lay out the actual proof.

Background
In this section we collect some known results from the literature that we will make use of throughout the paper. We will bring our own perspective to these results relevant to our discussion. Let us set the stage by setting up the problem we wish to study more precisely than in the introduction.

Setup and conventions
We take the metric to be flat: where we use u = x − = t − x and v = x + = t + x for null coordinates adapted to an entangling surface ∂A which passes through the point u = v = y i = 0 (we have set y = 0 relative to the introduction!). We use both v, u and x ± to maximize our variable options. Wick rotation is given by τ = it such that −u = z = x + iτ and v =z = x − iτ . The first entangling surfaces will be defined close to y = 0 via such that A is a space like region ending on ∂A to the "left" -roughly speaking within the wedge u > 0, v < 0 close to y = 0. The other entangling surface is displaced in the u direction at y = 0: where again B is a space-like region to the "left" of this cut. This description could break down far from the null line passing through both ∂A and ∂B at y = 0, but the details far away will not play a role in our computations. For now we will be agnostic to the exact shape of the entangling regions, except to require that D(B) ⊂ D(A). We will later discover that some further local conditions are required in order to claim the QNEC bound -these are similar conditions to those discussed in [43], that locally the entangling cuts should be stationary with (∂ y ) p X + A,B (0) = 0 for sufficiently many derivatives. These further conditions contain the special case where A and B are general null cuts of the Rindler horizon v = 0 such that X + A,B = 0 and X − A,B (y) are left arbitrary (except for the inclusion requirement X − B (y) > X − A (y) for all y .) However our results are much more general than this.
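As a quick consistency check of these conventions, the following sketch assumes a mostly-plus signature and a transverse line element $d\vec{y}^{\,2}$ (both are assumptions made here for illustration) and verifies that the stated Wick rotation follows from the definitions of u and v.

```latex
% Assumed for illustration: mostly-plus signature,
% ds^2 = -dt^2 + dx^2 + d\vec{y}^2.
\begin{align}
  u &= t - x, \qquad v = t + x
  \quad\Longrightarrow\quad
  -du\,dv = -(dt - dx)(dt + dx) = -dt^2 + dx^2, \\
  ds^2 &= -du\,dv + d\vec{y}^{\,2}, \\
  % Wick rotation \tau = i t, i.e. t = -i\tau:
  -u &= x - t = x + i\tau \equiv z, \qquad
  v = x + t = x - i\tau \equiv \bar z, \\
  ds^2_E &= dz\,d\bar z + d\vec{y}^{\,2}
          = dx^2 + d\tau^2 + d\vec{y}^{\,2}.
\end{align}
```

With these assignments the wedge u > 0, v < 0 corresponds to x < 0 with |t| < |x|, which is the region "to the left" of the cut.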
To lighten the notation we will use the following: where u B > 0, v B < 0 and uĀ < 0, vĀ > 0 and we have made explicit which state the modular Hamiltonian refers to. For example H Ω B 0 = −(2π) −1 ln TrB 0 |Ω Ω| where |Ω is the CFT vacuum. We will often suppress the y = 0 label on the operator insertions. We also suppress K 0 B 0 → K 0 B which should be understood from the superscript label. For most of the paper, except in Section 7 and below, we will take the undeformed regions ∂A 0 , ∂B 0 to be flat Rindler cuts that agree with ∂A and ∂B at y = 0. We turn now to a description of the modular Hamiltonians for these regions in vacuum.

Vacuum modular Hamiltonians and modular inclusions
We start by describing a special class of modular Hamiltonians: the so called local modular Hamiltonians, which apply for relativistic vacuum states and for simple flat Rindler cuts. Modular flow in this case is just a local boost around the entangling surface, and the modular Hamiltonians for two flat cuts of the same Rindler horizon form an algebra which is the one that naturally arises in the theory of half-sided modular inclusions [29]. This case applies to K 0 A,B , the modular Hamiltonians for the two uniform Rindler cuts D(B 0 ) ⊂ D(A 0 ) in vacuum, which are important for evaluating f (s) in the lightcone limit.
The action is simple: and for B: (2.8) Furthermore these satisfy an algebra: where P − = 1/2(H + P x ) is the translation operator in the x − direction iP − = ∂ − . This algebra is 2 dimensional and isomorphic to the algebra associated with the affine group u → au + b. For the pattern of modular flow in the correlator f (s) we find: where the U (b) generates a translation in the null direction u → u + b. Note that P − is clearly a positive operator via vacuum stability, which it must have been due to the negativity constraint on modular Hamiltonians under shape deformations [17]. In this paper we will work in a limit where the ψ modular Hamiltonians are well approximated by these modular Hamiltonians plus computable corrections. In Section 7 we will find that in order to account for some of these corrections it is useful to consider a more general class of vacuum modular Hamiltonians, the form of which was only recently elucidated [18,34,53]. 2 This class derives from arbitrarily shaped null cuts of the Rindler horizon in vacuum. An important result now comes from the theory of half-sided modular inclusions which can be used to prove that the algebra (2.9), suitably generalized, continues to apply in this more general case.
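The affine-group structure can be illustrated by composing the two boosts directly on the null coordinate. The sketch below assumes that a boost with parameter s about a cut at u = u_0 rescales u − u_0 by e^{−s}; the sign of the exponent, the ordering of the two flows, and the names β_A, β_B are assumptions introduced for illustration and are not meant to reproduce (2.10) exactly.

```latex
% Illustrative maps on the null coordinate u (sign conventions assumed).
\begin{align}
  \beta_A(s) &: \; u \mapsto e^{-s}\,u
  && \text{(boost fixing } \partial A_0 \text{ at } u = 0\text{)}, \\
  \beta_B(s) &: \; u \mapsto \delta x^- + e^{-s}\,(u - \delta x^-)
  && \text{(boost fixing } \partial B_0 \text{ at } u = \delta x^-\text{)}, \\
  \beta_B(s)\circ\beta_A(-s) &: \; u \mapsto
     \delta x^- + e^{-s}\,(e^{s}u - \delta x^-)
     = u + \delta x^-\,(1 - e^{-s}).
\end{align}
% The composite is a pure translation u -> u + b with
% b = \delta x^- (1 - e^{-s}), i.e. an element of the affine group
% u -> a u + b with a = 1. For large s the shift saturates at \delta x^-.
```

In this parameterization the nested flow never translates an operator by more than δx − in the null direction, however large s becomes.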
One proceeds in two steps, the details of which are given in Appendix A. Firstly we recall that half sided modular inclusions apply to the case where B 0 is an arbitrary null cut of the Rindler horizon v = 0 satisfying D(B 0 ) ⊂ D(A 0 ), where A 0 is a uniform Rindler cut ending on ∂A 0 : u = 0, v = 0 with an associated local modular Hamiltonian. The region B 0 then has the nesting property (2.11). These conditions are enough to prove the result that the algebra defined in (2.9) continues to hold with the replacements (2.12), where the notation K 0 {·} defines the modular Hamiltonian as a functional of the specific entangling cut of the Rindler horizon (recall that X + A,B = 0 and X − A = 0 for now). Intriguingly, one way to prove this is by studying a correlation function j(s) very similar to that which appears in f (s). Applying the nesting property (2.11) and positivity properties of K 0 , [29] argued for an analytic extension that is periodic and holomorphic in the thermal s strip: −π < Im s < π with j(s + iπ) = j(s − iπ). Similarly j(s) is necessarily bounded in this strip, and the only way to satisfy all these conditions is if j(s) is a constant independent of s. Expanding about s = 0 one derives the algebra in (2.12). 3 Continuing on, this algebra allows us to find an expression for the null deformed modular Hamiltonian in terms of an integral of the stress tensor. This was recently shown in [34] and we will give a slightly different proof of this in Appendix A. The main ingredients in our proof are the algebra (2.12) as well as the recent computation of linearized shape deformations to the Rindler modular Hamiltonian [18], which allows us to fix the modular Hamiltonian for small X − B (y).
2 These modular Hamiltonians for general QFTs are consistent with those of free theories which were worked out by A. Wall [13] based on light front quantization.
This result proves the conjectured answer in [18] that the higher order corrections in the X − B expansion are essentially trivial.
Note that these new modular Hamiltonians are not local in the sense that they do not generate local flows. With the result (2.12) in hand one can then just go and calculate the algebra when A and B are both deformed null cuts (see Appendix A and [34]): where: While the action of these modular Hamiltonians is not local, it will become local when acting on operators in the lightcone limit (close to the Rindler horizon.) This should allow us to compute the action of the vacuum modular Hamiltonian perturbatively in the lightcone limit, which goes into computing f (s). As we will see the details of this computation will not be important, excepting that they satisfy the modular inclusion algebra (2.15). We make a final point returning to the simple uniform null cuts. The nested boosts relevant to f (s) that we computed in (2.10) tells that for large s we simply have a null translation by a small amount δx − . This means we can take s large without the operators exploring too much of the spacetime and this will be important for us to claim the more general QNEC results. We can also understand what happens if we move the two entangling cuts away from each other slightly in the v direction by an amount δx + . The inclusion property D(B 0 ) ⊂ D(A 0 ) is now only true if δx + ≤ 0. These modular Hamiltonians are not constrained by the algebra of half sided modular Hamiltonians, but since here they are simple boosts we can just explicitly compute the modular flow. Consider the flow: which for large s still gives an operator shifted in the null u = uĀ + δx − direction, but the operator is now moving to large v ≈ vĀ − δx + e s . If we plug this into the vacuum correlator in the denominator of f (s) and expand for small δx + e s we have: So unless we set δx + = 0 we find a small but growing e s term which should be compared to (1.6) and (1.7). Without making this later δx + e s expansion the two operators will eventually become time-like separated from each other if δx + > 0. 
Since in this case the two domains of dependence D(Ā 0 ) and D(B 0 ) are not causally disconnected there is no issue with the necessary appearance of a branch cut in s along Im s = 0. However this gives us some intuition for the growing e s QNEC term we are claiming for more general modular Hamiltonians and states. Consider a holographic theory. If the QNEC is violated then as one moves slightly inwards in the holographic z direction the two bulk entanglement wedges forĀ and B will come into causal contact. Since the JLMS [37] result tells us that boundary modular flow equals bulk modular flow, a similar algebra for modular Hamiltonians should now apply except in the bulk and now determined via the relative position of the RT surface (elucidated further in the next subsection). Near the boundary we can approximate the cut with a bulk Rindler cut except slightly deformed due to the movement of the RT surface in the v = x + direction as we move inwards from the boundary. From this consideration we expect to find the same e s growing term that we found by shifting δx + on the boundary. In particular the wrong sign δx + > 0 which applies when Q − < 0 is an indication that the entanglement wedges are coming into causal contact. We now turn to a holographic calculation demonstrating that indeed this bulk causality consideration is determined by the sign of the QNEC quantity Q − .
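The e^s growth discussed above for cuts offset by δx + can be traced to the same composition of boosts once the fixed points differ in both null directions. As a sketch, assume a boost with parameter s rescales u by e^{−s} and v by e^{s} about its fixed cut (sign conventions and the names β_A, β_B are assumptions made for illustration), with ∂A_0 at (u, v) = (0, 0) and ∂B_0 at (δx^−, δx^+).

```latex
% Illustrative boosts acting on both null coordinates (conventions assumed).
\begin{align}
  \beta_A(-s) &: \; (u, v) \mapsto (e^{s}u,\; e^{-s}v), \\
  \beta_B(s) &: \; (u, v) \mapsto
     \big(\delta x^- + e^{-s}(u - \delta x^-),\;
          \delta x^+ + e^{s}(v - \delta x^+)\big), \\
  \beta_B(s)\circ\beta_A(-s) &: \; (u, v) \mapsto
     \big(u + \delta x^-(1 - e^{-s}),\;
          v - \delta x^+(e^{s} - 1)\big).
\end{align}
% Large s: u -> u + \delta x^-, while v \approx v - \delta x^+ e^{s},
% so a nonzero \delta x^+ produces a shift in v that grows like e^{s}.
```

The u shift saturates while the v shift grows exponentially, so for δx + > 0 the flowed operator is eventually pushed across the lightcone of the other insertion.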

Holographic proof and EWN
Let us attempt to identify the gravitational time delay/advance directly in the bulk. In this section we assume our CFT has a description in terms of a weakly coupled classical Einstein gravity theory. This is only true for a small class of theories, but these theories allow us to develop intuition for the general case. The results here are not new and were originally worked out in [43], relating the QNEC in holographic theories to entanglement wedge nesting (EWN.) The EWN property states that if two boundary regions satisfy D(B) ⊂ D(A) then the dual entanglement wedges must satisfy the same condition. The entanglement wedge of a region A is the domain of dependence of the spacelike A b region located between A on the boundary of AdS and the RT surface E A . This requirement can be understood as being basic to the program of entanglement wedge reconstruction [44]. We will work out a slight generalization for the integrated version of the QNEC in (1.3). For simplicity we will ignore many complications due to extrinsic curvature effects and effects arising due to a relevant deformation which takes us from the UV CFT to a more general QFT. These more complicated effects were discussed carefully in [43].
The metric solving Einstein's equations near the boundary of AdS has a Fefferman-Graham expansion: Similarly the two RT entangling surfaces parameterized via v, u = X ± RT (z, y) have an expansion: where we find this form by solving the extremal surface condition close to the boundary.
Here τ µν , p A,B − are not fixed by the asymptotic boundary conditions. They are state (ψ) dependent and can be related to the CFT stress tensor and the shape variation of the holographic EE respectively. The latter relation may be less familiar to the reader, but can be thought of as the usual Hamilton-Jacobi relation between conjugate coordinates (X − , p − ), in the sense that z is time and the area of the RT surface, or S EE , is like the action holding fixed the boundary value x − : p − ∼ ∂ z X + ∼ δS EE /δx − . We will take X + A,B (y) = 0 for simplicity to suppress additional leading terms that would arise in the z expansion of X + RT (z, y) multiplying various local extrinsic curvature invariants. We also remind the reader that X − A (0) = 0 and X − B (0) = δx − . Now we consider a high energy particle moving near the boundary of AdS in the ∂ u direction along a null geodesic with approximately fixed v, z and parameterized by the coordinate u. To leading order we only need to track the small change v → v(u). This particle will be analogous to our O probe. To see if the two entangling wedges are causally disconnected we consider this null geodesic to pass through the point v(0) = X + RT,A (z, y = 0) at u = 0, y = 0 and z fixed. The particle then picks up a delay in the v direction as it propagates. Comparing this new v coordinate to the position of the B RT surface, we find this is determined by the QNEC quantity, where we have used (2.23). This result should then be compared to (1.7) and (2.19) to find a consistent story between the bulk and the boundary. The lightcone limit allows us to study particles propagating near the boundary of AdS and weakly interacting via graviton exchange with the state ψ. This turns out to be a useful picture in any interacting CFT [54][55][56], and this was the original motivation for studying the lightcone limit in this context.
The delay one extracts from this picture is the total delay of the particle integrated over all boundary times −∞ < u < ∞; causality then imposes the ANEC constraint. The boundary theory ANEC is in this way related to the Gao-Wald causality condition on the bulk [57], that the fastest path in the full spacetime between two null separated points on the boundary is a null line on the boundary. By studying causal curves that reach into the bulk and are sensitive to the boundary theory stress tensor τ −− term in the metric, the authors of [57] used Gao-Wald to prove the ANEC. This does not usefully constrain a more local version of the NEC because propagating the particle from the boundary into a fixed z causes a large v delay which swamps the delay/advance due to the τ −− term in the metric. This can only be removed by taking the two points on the boundary to be infinitely separated in the null u direction. Entanglement wedge nesting is a much more fine grained version of causality that allows us to directly study the gravitational delay/advance at a fixed z coordinate via the introduction of the entangling surfaces. And it turns out the way to extract this from the boundary theory is with the correlator in f (s).

Tomita-Takesaki theory
In order to prove various (non-perturbative) properties of f (s) we will need a better understanding of modular flow for a more general class of states than the vacuum of a QFT. For now we will present (a brutalized version of) the abstract algebraic discussion of Tomita-Takesaki theory, see for example [27]. This will pertain to the action of a single modular flow; the double modular flow will be discussed later in Section 6. The idea is to consider a von Neumann algebra A of bounded operators associated with some local region in spacetime, say D(A). If we additionally have a state |ψ⟩ on the total Hilbert space that is cyclic and separating for A (meaning that the set of states O A |ψ⟩ with O A ∈ A is dense in the total Hilbert space, and that no non-zero O A ∈ A can annihilate |ψ⟩), then one can define the following modular operators: where J is anti-unitary and ∆ A is positive and Hermitian, but generally unbounded. To make contact with the (full) modular Hamiltonian one writes ∆ A = e −2πK A where now K A will not be a positive operator. One can then show that: where A′ is the commutant, which is then the algebra of bounded operators associated to the region D(Ā).

Figure caption: The lightcone limit of a four point function ⟨ψOOψ⟩ can be interpreted in terms of a dual holographic setup where the dual particle excitations to O and ψ stay far away from each other in AdS space by having a large relative angular momentum. We use this setup as inspiration for our computation, where we add into the mix two entangling surfaces which start null separated at the boundary and fall into the bulk. EWN is the statement that these surfaces should be spacelike separated as one moves into the bulk. The O particle in the high energy/lightcone limit probes these entangling surfaces near the boundary after we act with modular flow on this particle.
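For orientation, the standard Tomita-Takesaki statements that this discussion refers to can be sketched as follows. This uses textbook conventions; the symbols $S_A$ and $J_A$, and the precise normalizations, are assumptions here and may differ from the displayed equations of the original text.

```latex
\begin{align}
S_A\, O_A |\psi\rangle &= O_A^\dagger |\psi\rangle , \qquad O_A \in \mathcal{A}, \\
S_A &= J_A\, \Delta_A^{1/2}, \qquad \Delta_A = S_A^\dagger S_A = e^{-2\pi K_A}, \\
\Delta_A^{is}\, \mathcal{A}\, \Delta_A^{-is} &= \mathcal{A}, \qquad
J_A\, \mathcal{A}\, J_A = \mathcal{A}' .
\end{align}
```

The first line defines the (anti-linear) Tomita operator, the second is its polar decomposition, and the third is the content of the Tomita-Takesaki theorem: modular flow maps the algebra to itself, while modular conjugation exchanges the algebra with its commutant.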
Physically, cyclic and separating just means that the state has a large amount of entanglement between D(A) and D(Ā), and we expect that all reasonable QFT states one might consider have this property. For the case of the vacuum the Reeh-Schlieder theorem [58] rigorously establishes this fact. In a quantum system with a finite dimensional Hilbert space this condition would be equivalent to the statement that the reduced density matrix ρ A has full rank and so is invertible [59] (for a finite quantum system ∆ A = ρ A ⊗ ρ −1 Ā ). However it will be important to acknowledge that in an infinite quantum system, since ∆ A is unbounded, we have to carefully specify the domain on which it acts. For example it is known that A |ψ⟩ is generally in the domain of ∆ α A for 0 < α < 1/2 and A′ |ψ⟩ is generally in the domain of ∆ α A for −1/2 < α < 0 [27]. An important consequence of this structure is an abstract version of the KMS condition. To understand this we consider the correlator (which is a baby version of f (s)): which can be analytically continued into complex s in the strip −π < Im s < π. On the upper/lower edge we have: The difference across the cut, which arises in the s strip after we identify s ≡ s + 2πi at Im s = π, is just the expectation value of a commutator: Analyticity along Im s = 0 is simply related to the fact that the original operators O A and OĀ commute.

We can give a less rigorous discussion of these results by appealing to the analogy with thermal systems. For example, if the subspace had a trace, we could then rewrite the correlators as: where we have set s = t + iσ. This expression demonstrates where the strip −π < σ < π comes from. In an infinite dimensional system the sum over intermediate eigenstates of ρ A is not guaranteed to converge outside of this range. In our case there is no trace, however we could regulate things around the entangling surface with a hard wall cutoff in order to introduce a trace.
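The finite dimensional statement above can be checked numerically. The following is a minimal sketch, assuming a bipartite state of full Schmidt rank on two factors of equal dimension; the function names and the particular test state are hypothetical choices made only for illustration. It verifies that the modular operator, built as the product of the reduced density matrix on A with the inverse reduced density matrix on the complement, leaves the state invariant, and that it reproduces the norm identity implied by the polar decomposition of the Tomita operator.

```python
import numpy as np

def modular_operator(psi_matrix):
    """Delta_A = rho_A (x) rho_Abar^{-1} for |psi> = sum_ab Psi[a, b] |a>|b>."""
    rho_a = psi_matrix @ psi_matrix.conj().T       # reduced state on A
    rho_abar = psi_matrix.T @ psi_matrix.conj()    # reduced state on the complement
    return np.kron(rho_a, np.linalg.inv(rho_abar))

def demo(dim=3, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical test state: a perturbation of the maximally entangled state,
    # so that rho_A has full rank (the finite dimensional analogue of
    # "cyclic and separating").
    noise = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    psi_matrix = np.eye(dim) + 0.1 * noise
    psi_matrix /= np.linalg.norm(psi_matrix)
    psi = psi_matrix.reshape(-1)

    delta = modular_operator(psi_matrix)

    # A random operator acting on A only, embedded as O_A (x) 1.
    o_a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    big_o = np.kron(o_a, np.eye(dim))

    # (i) The state is invariant: Delta |psi> = |psi>.
    invariance_error = np.linalg.norm(delta @ psi - psi)

    # (ii) <psi| O^dag Delta O |psi> = || O^dag |psi> ||^2,
    #      as implied by S = J Delta^{1/2} with J anti-unitary.
    lhs = np.vdot(big_o @ psi, delta @ (big_o @ psi)).real
    rhs = np.linalg.norm(big_o.conj().T @ psi) ** 2
    return invariance_error, lhs, rhs
```

Calling `demo()` returns the invariance error together with the two sides of the norm identity, which should agree up to floating point error. In the continuum no such tensor factorization exists, which is why the domain questions discussed above become important.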
Moving forward we want to study the situation where there are now two algebras with a common cyclic and separating state and the inclusion property A B ⊂ A A . This is harder to study but we can use various results from the literature. We will explain these in a later section.

Replica trick for the modular Hamiltonian
Our computation of f (s) will now begin in earnest. Our first task is to compute matrix elements of K A sandwiched between |ψ excited by the O operator insertions. To do this we will need to use the replica trick.
Previous discussion of using the replica trick to compute the modular energy of excited states has appeared in [60] (also [61,62]). This was then used to study the modular energy in 2d CFTs. While we will take a very similar approach there will be an important difference. We would like to write the answer in terms of twist operators in the orbifold theory CF T n /Z n . It is not totally clear this is possible since, as noted in [60], the replica trick in this case explicitly breaks the Z n symmetry which cycles through the replicas. For this reason the results in [60] are left in the form of correlation functions on n-sheeted branched coverings without the Z n symmetry. On the other hand the orbifold theory is much more under control since we can use standard results about defect CFTs [63][64][65][66][67] in order to make progress with computations. This will be the main technical difficulty that we have to overcome here.

Replica trick
The replica trick is a way of computing properties of the operator ln ρ A using the limit: This is useful because it is sometimes possible to compute traces over ρ n A for integer n using a path integral. The limit is then only achievable once an analytic extension is found from integer n to complex n. While this is usually subtle the replica trick has yielded many powerful results relating to entanglement entropy in QFT [68][69][70][71].
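As a small numerical illustration of the limit involved, one can check on a finite dimensional density matrix that the logarithm is recovered from powers of the matrix near n = 1. The sketch below assumes the common form in which ln ρ is the derivative of ρ^(n−1) with respect to n at n = 1; the original displayed equation may be written differently, and the example matrix and function names are hypothetical.

```python
import numpy as np

def matrix_function(rho, func):
    """Apply a scalar function to a Hermitian matrix through its eigenvalues."""
    evals, evecs = np.linalg.eigh(rho)
    return (evecs * func(evals)) @ evecs.conj().T

def log_via_replica(rho, eps=1e-6):
    """Central finite-difference estimate of d/dn rho^(n-1) at n = 1."""
    plus = matrix_function(rho, lambda w: w ** eps)
    minus = matrix_function(rho, lambda w: w ** (-eps))
    return (plus - minus) / (2 * eps)

def example_density_matrix():
    # Hypothetical full-rank 3x3 density matrix, used only for illustration.
    mixing = np.array([[2.0, 0.5, 0.0],
                       [0.5, 1.0, 0.3],
                       [0.0, 0.3, 1.5]])
    rho = mixing @ mixing.T
    return rho / np.trace(rho)
```

Comparing `log_via_replica(rho)` with `matrix_function(rho, np.log)` shows agreement to high accuracy. In this toy setting the continuation in n is trivial because the spectrum is known; the subtlety mentioned in the text arises in QFT, where only integer n is directly accessible through the path integral.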
We will firstly be interested in simply evaluating the half modular Hamiltonian: 2πH A ≡ − ln ρ A ⊗1Ā, thought of as an operator on the total Hilbert space, between matrix elements of the defining state |ψ excited by local operator insertions. This is not a totally well defined object in the continuum and so will only be an intermediate step towards computing the full version: K A = H A − HĀ which is well defined. We will not be completely explicit about how we regulate H A to define it, but we will assume this regulator allows us to define a trace over the various tensor factors in the Hilbert space. Consider: which we can write as a trace over the Hilbert space H A ⊗ HĀ The trace in (3.3) can be computed using a path integral. We first write a path integral representation of ρ A by integrating over Euclidean space with a branch cut running along A and different boundary conditions above and below A used to represent the density matrix. To be concrete let us take the state |ψ to be defined via local operator insertions which we will also denote as ψ(x) (perhaps smeared appropriately). We place two operators ψ and ψ † on the Euclidean section above and below the Cauchy slice A ∪Ā on each replica The details of this state and the Euclidean path integral used to construct these states will not matter, except to note that for now we take |ψ to be a pure state. We will extend the proof to the case of mixed states in Section 7.
We now write a path integral representation for TrĀ (OĀ |ψ ψ| O B ) which differs from ρ A by the additional O operator insertions within the path integral. We imagine slicing open the path integral along radial lines emanating outwards from ∂A and integrating forward in a clockwise angular direction 4 -so the ordering of operator insertions (including the operators ψ(x) that create the state) in the H A Hilbert space language is always angular ordering.
Putting the various density matrices together and tracing, we can write the answer as a correlation function on a non-trivial manifold M n (A) which consists of n copies/replicas of the d dimensional Euclidean space which are cut and joined cyclically along A. Sometimes we will refer to this space as a branched manifold. The state operator insertions ψ and ψ † both arise on each replica, and O B , OĀ live on the same single replica. Then: where ψ ⊗n means insert the operator symmetrically on each replica. This is not yet an orbifold correlation function. The branched manifold can be alternatively represented by using a co-dimension 2 (non-local) twist defect operator living on ∂A: where the orbifold/gauging of CFT n by the discrete cyclic permutation symmetry is necessary in order to remove the (n − 1) extra conserved stress energy tensors from the new replicas, thus allowing us to apply standard CFT considerations to the orbifold theory on the original (unbranched) manifold R d but now in the presence of a co-dimension 2 twist operator. 5 Indeed the state operator insertions are clearly symmetric under the Z n symmetry so they are genuine orbifold operators. To unclutter the discussion moving forward we will often suppress the existence of these operators and consider them part of the definition of the twist operator Σ n (∂A)ψ ⊗n ψ †⊗n ≡ Σ ψ n (∂A) ≡ Σ n where the latter replacement is for further decluttering purposes.
Unfortunately the operators O B and OĀ are quite clearly not orbifold operators, so (3.5) is not yet well defined. We cannot simply symmetrize each individual O B or OĀ operator over the action of the Z n group since the two operators are necessarily inserted on the same replica. Thus we consider O B OĀ to be a bi-local operator with a non-local string attached, whose sole job is to keep track of the relative position of the operators on the different replicas, say when we move one of the operators around the twist defect relative to the other. We can then Z n -symmetrize this bi/non-local operator by summing this composite over the different replicas. We take this as our definition of (3.5) which we rewrite as: where the superscript notation O (k) (x B ) specifies which replica the operator descends from on M n (A). In Section 7 we will discuss a more precise definition of this bi-local operator where the string attached is actually a sum over Wilson lines for the orbifold gauge group Z n . For now we note that, due to the non-local nature of this operator, we must pick where we place the branch cuts in the definition of M n in order to define which local operator O lives on which replica (and thus define what we mean by k). Excepting the effective string that remains attached between x B and xĀ and moves past the twist operator intersecting the region Ā, the choice of exactly where we place the branch cut goes away upon moving to the orbifold theory, as it must.

Now we would like to compute K A ≡ (− ln ρ A + ln ρĀ)/2π by doing a similar replica trick to compute ln ρĀ. Notice that the difference here is the positioning of the branch cut. However in the orbifold theory, by definition, there is no knowledge of the position of the branch cut, so one might conclude incorrectly that the answer, upon subtraction, is 0. The reason we find a non-zero answer is that moving the position of the branch cut from Ā → A yields a different ordering for the bi-local operator O B OĀ. The conclusion is that we can compute the full modular Hamiltonian as: where for the latter non/bi-local operator we have moved OĀ relative to O B around the twist operator once. That is: At this point we have set up the problem. Computing any of these correlation functions seems difficult.

Footnote 4: We work clockwise because the entangling region A starts on the left of the cut. This results in some funny minus signs, for example the Euclidean holomorphic coordinate close to the entangling surface satisfies z = −ρe −iθ → u = ρe s , where θ increases in the clockwise direction and ρ > 0 is the radius, with θ = 0 specifying the A region. We Wick rotate as θ = is.

Footnote 5: Orbifolds of 2d CFTs are well studied [72]. The higher dimensional versions have received less attention, see [73] for a recent discussion which however is complicated by non-trivial topology. We can literally view the resulting theory as a discrete gauging of the replica symmetry, by coupling the theory to a continuum version of a discrete gauge theory as reviewed in [74].
We aim to make progress by bringing the O operators close to the twist defect Σ and using a defect operator product expansion (dOPE.) We turn to this now.

Defect OPE
If we take the pair of operators O close to the defect, say at a point y = 0 along the defect, we can imagine zooming out and replacing these with a sum over local defect operators on Σ n . That is, O B OĀ → Σ i β i O i (0). Note that we might have done this in two steps, first replacing the pair of operators by a sum of ambient local orbifold operators using the regular OPE, then bringing these operators close to the defect. This would look roughly like: One might even think that there is another way to do this: first bring one of the operators, OĀ, close to the defect and expand it in terms of defect operators. However this latter method is not possible because individually OĀ is not an orbifold operator. Thus there is really only one channel we can evaluate this correlator in. 6 Additionally we will choose to ignore the intermediate step of the ambient OPE involving the sum over J in (3.9), mostly because it turns out that in the limit n → 1 we can directly compute the dOPE coefficients β i for the full replacement O B OĀ → Σ i β i O i (0) without having to sum over an infinite set of intermediate operators.

Figure 3. (upper) The defect OPE argument is based on bringing the two O operators close to the defect and replacing these with a sum over defect operators. The dashed line between the operators represents the non-local string we need in order to study these operators in the orbifold theory. The dashed circle represents the radial quantization sphere in the defect theory on which we decompose the state in a basis of local operator insertions at the origin of this sphere. (lower) The OPE coefficients β i can be computed by making the same replacement, but now on a defect which allows us to do the computation, that is, on a flat defect in vacuum. The other operator O j is inserted elsewhere on the defect so we can extract the various β i .
We can compute the dOPE coefficients β i as follows. Firstly note that since the replacement is done locally we could have done the same replacement on a twist defect within a totally different setup but using the same replacement coefficient β i . 7 Thus let us consider a flat/planar twist defect defined in the vacuum of a CFT and living along ∂A 0 . To extract a particular OPE coefficient we must also insert some other defect operator O j far away from 0 at a point y. All of this still in the presence of the bi-local O B OĀ. Now in this new setup take the O B,Ā 's close to the defect simultaneously and make the same replacement we did above: where in the notation we established above Σ 0 n = Σ 1 n (∂A 0 ) and A 0 is the uniform half space Rindler cut and the superscript 1 denotes the state operator insertions appropriate for the vacuum |0 . See Figure 3 for a schematic of this replacement. This allows us then to extract β i after inverting the operator metric defined from the two point function of defect operators on the planar defect: The result is written as a sum over all local defect operators; and we are not being careful about the distinction between defect primaries and descendants, which is not really important as long as we compute the operator metric G ij carefully. Actually the operators we will eventually be interested in will all be defect primaries such that G will be diagonal.
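The extraction of the coefficients described above amounts to inverting a linear relation. A schematic sketch follows. The normalization by the twist operator expectation value, the hats on the defect operators, and the index placement are all assumptions made for illustration, and the expressions need not coincide with (3.10) and (3.11) of the original.

```latex
\begin{align}
\frac{\langle \Sigma^0_n\, \hat{O}_j(y)\, O_B O_{\bar{A}} \rangle}{\langle \Sigma^0_n \rangle}
&= \sum_i \beta_i\, G_{ij}, \qquad
G_{ij} \equiv \frac{\langle \Sigma^0_n\, \hat{O}_i(0)\, \hat{O}_j(y) \rangle}{\langle \Sigma^0_n \rangle}, \\
\beta_i &= \sum_j \left(G^{-1}\right)_{ji}\,
\frac{\langle \Sigma^0_n\, \hat{O}_j(y)\, O_B O_{\bar{A}} \rangle}{\langle \Sigma^0_n \rangle} .
\end{align}
```

When the operators of interest are defect primaries with a diagonal metric, as stated in the text, the inversion reduces to dividing the three point function by the corresponding two point function.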
Up until now we have kept things general, and for integer n all of the above should make sense. However controlling the spectrum of defect operators and the resulting β i OPE coefficients is difficult. If we can now argue for an analytic continuation in n then it turns out there are several big simplifications that occur when taking the limit n → 1.
One of these simplifications is that we can move the ∂ n so that it only acts on the last term in (3.11), the three point function term. This is because this is the only term that knows about the O B,Ā operator replica ordering which was discussed above. When we compute ln ρ A − ln ρĀ this term vanishes at n = 1, since then there is only one replica and the difference between O B OĀ and its counterpart with OĀ moved once around the twist operator vanishes. So if the ∂ n acts anywhere else the three point function term would give zero as we send n → 1.
The other simplification is that for small n ≈ 1 the various correlators we need to compute in the presence of Σ 0 n are fixed in terms of CFT correlators in flat space plus insertions of the modular Hamiltonian associated to the Rindler cut. Since this latter insertion is a known integral over the CFT stress tensor [76] we can compute β i using the local data of the CFT. So we need to argue for an analytic continuation in n of the defect operator spectrum as well as the OPE coefficients β i computed in (3.11). Thus we turn to a discussion of the defect operator spectrum.

Footnote 7: We are lying a little here. It turns out that β i is sensitive to the local extrinsic curvature of the defect at y = 0, in analogy to regular OPE coefficients being sensitive to local curvature invariants of the metric if we make an OPE expansion of two local CFT operators in curved space [75]. We will fix this lie in Section 7.

Local defect operators
We start by reviewing what is known about local ambient space operators (away from the defect) for the replicated orbifold theory. They take the schematic form: where we sum over Z n cyclic permutations of the different replicas. The conformal dimension of these operators is ∆ {α k } = Σ k ∆ α k . The operators are located at the same point on each replica. In terms of including the effect of these ambient operators in certain entanglement computations, the analytic continuation in n has been successfully found for low dimension operators with a small number (fixed and independent of n) of non-unit operators inserted on each replica. These operators are important when replacing non-local twist operators with local operators when viewed from a distance, as in [77][78][79][80][81], or when bringing a twist operator close to an anti-twist operator, as in [34,82]. Note that our dOPE should not be confused with the various OPE arguments used in these papers.
Here we will only concern ourselves with single operator insertions on one replica, symmetrized appropriately: where hopefully the notation does not cause confusion. In order to discover the defect operator spectrum we need to take the local ambient orbifold operators (3.12) and bring them close to the defect. Any defect operator that is not discoverable in this way will not contribute to the answer in (3.11). In Appendix C we argue that one can reproduce the full set of such defect operators by limiting oneself to single replica operators, (3.13), and bringing these close to the defect. In other words, the more general "multi-replica" operators given in (3.12) do not add to the list of local defect operators when we bring them close to the twist defect.
Schematically, when we bring the single replica bulk operators close to the defect we should be able to rewrite them as a sum over local defect operators O j : where we have written the expansion in Euclidean coordinates about the defect 8 : Here ∆ j is the defect operator dimension and ℓ j ∈ Z is the angular momentum in the plane transverse to the defect, i.e. the charge under SO(2) rotations w → we −iφ . Spinning ambient operators should be decomposed under the action of the SO(2) × SO(d − 2) subgroup of the full Euclidean rotation group, and in this paper we will only need to consider scalar operators under SO(d − 2). If the resulting operator O α transforms nontrivially under SO(2) rotations then ℓ α ≠ 0. In order to extract the operators in (3.14) generically we could draw a set of small radial quantization spheres S d−1 × R + around the point y = 0 on the defect. The Hilbert space on the sphere S d−1 is associated to the defect CFT (the defect lives on an S d−3 × R + subspace in radial quantization coordinates) with the symmetry group SO(2)×SO(d−1, 1) of conformal transformations holding fixed the defect. We can then decompose operators into primaries and descendants and extract these operators by acting with the appropriate projection operators made out of Casimirs etc. This is a somewhat tedious procedure, especially for spinning operators. For the class of operators we will be interested in there is a quicker way.
Let us consider the lowest dimension defect operator, with dimension ∆ ℓ , at fixed spin ℓ. Then we can extract this operator via a limit where τ ℓ ≡ ∆ ℓ − ℓ and τ α = ∆ α − ℓ α define the twist of the defect and ambient space operators respectively. It is natural to normalize these operators such that Z α = 1, which means the overall coefficient in their two point functions cannot be independently set to 1. These operators will be the leading operators of interest to us when we take the lightcone limit since they have minimal twist at fixed SO(2) spin. For this reason they are also necessarily defect primaries.

We now argue for an analytic continuation in n. The most important thing that such a continuation should satisfy is locality of the bulk orbifold operators relative to the twist defect: that is, in Euclidean signature, moving a bulk operator by an angle 2π around the twist defect should return the original operator. This quantizes ℓ j ∈ Z, where values in Z/n are not allowed but would have been possible if we had not gauged the Z n symmetry. This is also important to ensure a consistent defect theory with well defined OPE coefficients for all values of n.
In particular this means that ℓ j will stay an integer under n-analytic continuation, ruling out things like ℓ j = ? pn for p ∈ Z. All of these requirements are actually in line with the n continuation advocated in [71,83] for computing holographic Renyi entropies. With this in mind an appropriate continuation would define, for real (and complex) values of n, the dimensions ∆ j (n) and the various defect OPE coefficients β j (n) and Z j J (n), which agree with the known values for integer n, are suitably well behaved for large n, and are analytic in n.
How might we extract the defect operator dimensions as a function of n? We can think of two ways to study this. The first method only applies to holographic theories, however it is useful to give intuition into this problem and will give a method to extract the spectrum of defect operators for all n. Versions of this problem have been studied previously in holographic defect theories [84] and it does not require much to adapt to the replica problem at hand [85]. Essentially the idea is to study fields propagating on the Hyperbolic black hole, which is dual to the twist defect Σ 0 n in a conformal frame where the defect lives at the boundary of H d−1 × S 1 . Then the spectrum of defect operator dimensions can be extracted by solving simple wave equations in this black hole subject to certain boundary conditions. Since this approach does not work for general CFTs we leave the details of this to Appendix B where we apply it to the specific problem at hand. See Figure 5 below for the most illuminating picture of the defect spectrum that results. This then gives us a check on the second method that works only for n ≈ 1 but now for general theories.
The second way to extract the spectrum is to imagine we have on hand the ambient space two point function in the presence of the flat Rindler defect Σ 0 n . Actually it is rather simple to n-analytically continue this two point function following [86] where the answer can be written in terms of a thermal correlator, at temperature 1/(2πn) for the CFT on H d−1 × S 1 . While this thermal correlator is not known in general, it is however computable for n = 1 and in an expansion about n = 1.
For example n = 1 gives back the CFT two point function on flat space to leading order. This two point function decomposes into correlators of defect operators living on a now imaginary defect lying along the Rindler cut. This decomposition can be understood as branching the operator representation of SO(d + 1, 1) into representations of the subgroup SO(2) × SO(d − 1, 1), thus these defect operators have simple conformal dimensions related to the CFT operator dimension, ∆(n = 1) = ∆ + Z ≥0 . It is then interesting to understand the leading (n − 1) correction to these operators. We expect a correction to the conformal dimension ∆(n) = ∆(1) + O(n − 1) and perhaps some mixing with other bulk operators. In addition to this, however, we will find that a new phenomenon can occur: completely new operators can arise that effectively decouple exactly at n = 1 and which were not visible at leading order. These only arise from spinning operators with spin ≥ 2, and the displacement operator is an example which arises from the stress tensor. These new operators will play an essential role moving forward.

Example: scalar two point function
Let us go through a simple example using scalar orbifold operators built out of the CFT scalar operator φ. Following [86] 9 and the procedure outlined above we consider the two point function and give an n analytic continuation: where the later correlator lives on the branched replica manifold (i.e. n copies of the CFT with a d − 1 dimensional branch cut/plane along the Rindler cut ∂A 0 .) The λ integral encircles n poles on each replica at λ = 1. The complex λ plane reflects the branching structure of M n along a fixed y slice (i.e. there is a branch cut starting at λ = 0 which we take to run along the negative real axis). The correlator additionally has branch cuts which for y = 0 start at λ = z/w and λ =w/z due to lightcone singularities. These properties are totally general, and the simplest way to understand them is to conformally map M 0 n to H d−1 × S 1 : the S 1 factor has length θ ≡ θ + 2πn. The correlator maps to a thermal correlator for the CFT on the spatial manifold H d−1 and the j sum over replicas is a sum over θ = 2πj which we turn into a contour integral over the complex s = −iθ strip for −2πn < Ims < 0 . The final form in (3.17) follows from setting λ = e −s and passing back to the flat conformal frame. Right now n should still be an integer. However we can pick the C contour so that both the operators in (3.17) stay on a single replica -that is λ should itself remain on a single replica. The n analytic continuation is then manifest since the two point function is well defined on M n for non integer n due to its relation to a thermal correlator on H d−1 and the contour no longer depends on n being integer. For example we can pick C to wrap the replica branch cut around λ = 0 → −∞ and the one remaining pole at λ = 1 (see Figure 4). A more detailed explanation for why this is the correct n analytic continuation can be found in [86].
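The step that trades the sum over replicas for a contour integral rests on an elementary residue identity, sketched below. Here $F$ stands for the thermal correlator viewed as a function of $s$, assumed analytic inside the contour, and $\tilde F$ is the same function written in terms of $\lambda$; these names, the kernel, and the overall normalization are assumptions for illustration and may differ from the measure used in (3.17).

```latex
\begin{equation}
\sum_{j=0}^{n-1} F(s_j)
= \oint_{C} \frac{ds}{2\pi i}\, \frac{F(s)}{e^{s}-1}
= \oint_{C} \frac{d\lambda}{2\pi i}\, \frac{\tilde{F}(\lambda)}{\lambda-1},
\qquad s_j = -2\pi i j, \quad \lambda = e^{-s} .
\end{equation}
```

The kernel has unit residue at each $s_j$, which in the $\lambda$ variable all map to $\lambda = 1$ on successive sheets, so a contour $C$ encircling these points counterclockwise reproduces the sum. Deforming $C$ to wrap the replica branch cut leaves an expression with no explicit reference to integer $n$, which is the property used for the continuation.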
With this n-continued correlator in hand we can in principle extract the spectrum of defect operators that couple to this operator φ. Of course we do not in general know this correlator and so we must work with n ≈ 1. The leading order answer at n = 1 is then unsurprisingly just the flat space CFT correlator: and the next correction is: where at this order the ∂ n pulls down the half modular Hamiltonian and the first term above comes from the one remaining pole at λ = 1 (see the bottom part of Figure 4). The shift ∂ n → (∂ n − 1) can be achieved via an (n − 1) correction to the overall coefficient of the n = 1 two point function which has a trivial effect on the defect spectrum.
A very important property of (3.20) is that if we move the operator φ(w, w̄, 0) around the would-be defect we get back to the original correlator. This was one of our requirements for an n analytic continuation. This happens here because of a conspiracy between the two terms of (3.20). In the second term we must deform the λ contour as we move the operator around. This leaves a contribution from the double pole at λ = 1 giving a commutator −2π [H A , φ(z, z̄)] φ(w, w̄), which cancels against the same commutator coming from the first term in (3.20), which arises as we move the operator past the modular Hamiltonian insertion.
We can expand the expressions above at small w, w̄, z, z̄ and extract from this the defect operator spectrum. To directly extract the defect primary operators we would need to study the defect channel conformal blocks, which can be found in [67,89]. This is rather complicated and we are not interested in the complete spectrum of defect operators. Instead we will focus on the operators relevant for the lightcone limit, where we set w̄ → 0 and z → 0. We can smoothly take this limit on (3.19) and this fact, along with the defect expansion given in (3.14), tells us that there must be a set of defect operators with SO(2) "spin" ℓ and ∆ ℓ = ∆ + ℓ. Furthermore these operators are necessarily primaries. These are found by expanding the remaining correlator: thus ℓ ≥ 0 for these operators. We could have also extracted these operators using the projection (3.16) since they are the lowest dimension operators with fixed ℓ. Also note that for ℓ < 0 we would take the limit w → 0 and z̄ → 0 and reproduce a similar set of operators, now with ∆ ℓ = ∆ + |ℓ|.
At the next order in the (n − 1) expansion, attempting to directly set w̄ → 0 and z → 0 fails due to divergences that arise. This failure results in new log terms that can be understood as giving rise to anomalous defect dimensions. These logs are quite intricate and involve a delicate interplay between the two terms in (3.20). The first term can be computed using the expression for the stress tensor OPE block of two auxiliary timelike separated scalars, which turns out to be just the half modular Hamiltonian operator for the double cone region between the two operators [90,91]. This region can be conformally mapped to the Rindler wedge. Thus the first term is just the stress tensor conformal block for the two auxiliary operators and the two φ operators. The details of this computation and the lightcone expansion are left to Appendix B. We find: where P (x) and Q(x) have power series expansions in their arguments and while P is rather complicated we can write explicitly: We note some consequences of this result. Most importantly there are no new operators that arise at order (n − 1). This is because P φ and Q φ have a regular Taylor series so, comparing to (3.21), P φ only contributes to an order (n − 1) shift in the two point function of these operators (∝ c ∆ g φ + O(n − 1)) and Q φ results in a shift in the scaling dimensions: where explicitly the anomalous dimensions are: and (∆) ℓ ≡ Γ(∆ + ℓ)/Γ(∆) is the Pochhammer symbol. These shifts are always (n − 1) times a negative number if ∆ > h − 1 satisfies the unitarity bound. 10 While the details of this (n − 1) shift in the defect operator spectrum are not important for our final goal, we went through this exercise to make sure we have a good understanding of the important defect operator spectrum. We have also checked that this small (n − 1) correction is in agreement with the equivalent holographic technique for computing ∆(n), summarized in Figure 9 of the Appendix.
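For readers evaluating expressions of this type numerically, the Pochhammer symbol defined through the ratio of Gamma functions coincides, for non-negative integer spin, with a rising factorial. A minimal sketch follows; the function names are hypothetical, and positive real arguments are assumed so that the Gamma functions are finite.

```python
import math

def pochhammer(x, ell):
    """(x)_ell = Gamma(x + ell) / Gamma(x), for x > 0 and integer ell >= 0."""
    return math.gamma(x + ell) / math.gamma(x)

def pochhammer_product(x, ell):
    """The same quantity written as the rising factorial x (x+1) ... (x+ell-1)."""
    result = 1.0
    for k in range(ell):
        result *= x + k
    return result
```

The product form avoids overflow of the Gamma function when the dimension or the spin is large, which can matter when tabulating quantities as a function of spin.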

Stress tensor and the appearance of the displacement operator
We will now analyze the defect operators that appear in the stress tensor channel. Again motivated by the lightcone limit, we will limit ourselves to the correlator of T −− and T ++ , where the ± directions are transverse to the defect. These operators will give rise to defect operators with the lowest twist. The sum over replicas can be written: Note the extra factor of λ 2 relative to (3.17), which arises from the rotation/boost applied to T ++ . The contour is chosen as before to allow for an n continuation. The n = 1 answer is just the CFT correlator: from which we again find operators in the lightcone limit w̄, z → 0 labelled by their spin ℓ and with: as well as the conjugate operators with opposite charges ℓ → −ℓ but the same scaling dimensions. The shift in the spin arises because T −− already transforms with charge two under SO(2) rotations, so ℓ α = 2 in (3.14). If we had set w, z̄ → 0 we would have found a set of operators with scaling dimensions ∆ ℓ = d − (ℓ − 2) for ℓ = 2, 1, . . . , −∞, which are now however not the same as the conjugate operators we found above; indeed they do not even have minimal twist, their twist being d + 2. This happens because of the asymmetry in the lightcone limit for spinning operators.
We will need to extract the two point function of these twist d − 2 operators: One might have imagined that the (n − 1) corrections work as above for the scalar case - the two point functions and scaling dimensions will shift by small amounts. There is however one new phenomenon that arises from the λ integral in (3.26). The (n − 1) correction can be written as: We analyze the integral by rescaling λ → λy 2 /(wz) and then expanding the (λ−(wz/y 2 )) −2 term: Note that the second and higher terms in the expansion in brackets have divergent λ integrals, indicating an issue with the uniformity of this expansion as a function of λ - a more careful limiting procedure will resolve this divergence into a ln(wz) term which then must be added to the logs arising in the modular Hamiltonian term of (3.30). As with the scalar case we expect an overall ln(w w̄ z z̄) which results in a shift of the defect dimensions as in (3.22). More interestingly, however, the first term in (3.31) is finite and can be seen to give rise to a new operator with ℓ = 1 and Not surprisingly, this is the displacement operator [85,93]. We have then re-derived the results of [88,93] for the coefficient of the two point function of the displacement operator as n → 1: The fact that g 1 = O(n − 1) has interesting consequences for the defect OPE that we are interested in, and we will explore this in the following section. For now we simply note that this displacement operator has the same twist d − 2 as the other operators we singled out (3.28).
The O(n−1) corrections to g and ∆ for ≥ 2 that we could attempt to find using the correlators in (3.30) at this order are not important since they are only subleading effects to the defect operators we already found at n = 1. Thus we will not bother to track down the anomalous dimensions etc. 11 In the Appendix B we computed the defect operator spectrum arising from the stress tensor directly using a holographic model. This gives a more complete picture of the spectrum of defect primaries that are scalars under SO(d − 2) rotations as a function of n. Since our general CFT computations agree with the holographic ones close to n = 1 we expect we have not missed any of the important defect operators. The picture we have of the defect spectrum is usefully summarized in Figure 5.

Higher spin versions of the displacement operator
Using the same techniques, we can also compute the spectrum of displacement operators coming from a higher-spin operator J µ 1 µ 2 ...µ J with conformal dimension ∆ J and spin J > 2 (we consider only symmetric traceless representations of SO(d)). The minimal twist operators arise from J +...+ and J −...− , which we focus on. Note that these are not conserved currents and ∆ J > d + J − 2.
Analogous to the case of the stress tensor, the associated displacement operators emerge at order O(n − 1) and can be computed by extracting the power-law terms y −2 ∆ at order O(n − 1) in the higher-spin two-point function: where the . . . in the first line include terms that correspond to "normal" defect operators that survive at n = 1, as well as their corrections at higher orders in (n − 1); c J is the coefficient of the CFT J J two-point function. We conclude that for a higher-spin operator with spin J there emerges a family of J − 1 displacement operators D ℓ labelled by their SO(2) spin ℓ with 1 ≤ ℓ ≤ J − 1.
We are not completely sure how to interpret these new defect operators. One might conjecture that in holographic theories they represent new very massive fields living on the RT surface. However in the holographic case one works in a large N theory and there will be double trace operators of higher spin that should equally well give rise to double trace displacement operators. We leave speculation about these to the discussion section. Either way the particular displacement operator D J−1 coming from an ambient operator J −...− of lowest twist at fixed even J plays a definitive role when we examine a higher spin version of the QNEC, as we will discuss in Section 7.

Summary
In summary we will concentrate on the following defect operators coming from a limit of the bulk stress tensor close to the defect: where all the operators we are interested in are primaries and have the lowest dimension/twist with a fixed . We have used (3.16) and taken the limit |z|, |w|→ 0 inside the contour integral which is allowed for twist d − 2 operators. One can check that (3.34) reproduces the results in this section as long as one considers only the leading term that arises in the (n − 1) expansion for a fixed . Since all these operators have the same twist they will contribute equally in the lightcone limit of interest. The defect operators coming from ambient operators with the lowest twist will dominate in this limit. While this could be a scalar operator with (d − 2)/2 < τ < d − 2 we will see how this contribution exactly cancels in the sum rule that we derive in Section 6.3.

Evaluation of the modular Hamiltonian
We would like to use what we just learnt about the defect operator spectrum as n → 1 in order to evaluate (3.11) which we reproduce here: The operators that dominate in the lightcone limit can be found via the residue projections in (3.34). Since these operators are primary G ij is diagonal and can be inverted easily. We take i → T and j → T − such that we can set where g was given in (3.32) and (3.29) which we also reproduce here: To evaluate (4.1) we just need to compute the "one point functions" Σ n O i (0) and the "three point function" terms Σ 0

One point functions
The one point functions are the only piece of data in the defect OPE that know about the state ψ and the details of the entangling cut A. We will relate the n → 1 limit to the entanglement entropy of ψ reduced to A as well as the one point functions of T µν in the state ψ. These are the two ingredients that go into the QNEC. Let's begin with ≥ 2. In this case the limit n → 1 is trivial and we simply get local operators without the defect: where the (−1) comes from the awkward minus sign relating the holomorphic coordinates to lightcone coordinates w = −u.
We now move to ℓ = 1 where we find the physics of the displacement operator. See [94] for a similar discussion based on [95,96]. We would like to evaluate: where T here is the orbifold stress tensor, which includes the sum over replicated stress tensors. Note that the analytic continuation in n is trivial since there is only a single T operator insertion on each replica. Since the n = 1 limit gives the stress tensor in the state ψ, we do not find any 1/w pole that gives a non-zero answer. Hence this one point function must be ∝ (n − 1). At this next order we pull down the half modular Hamiltonian and it is certainly possible that T −− is singular in the presence of H ψ A . We extract an answer if there is a pole H ψ A T −− ∝ 1/w then: Let us back up a little now and instead compute the answer at finite n. We can find the 1/w pole in this case as follows. The displacement operator is related to the nonconservation of the stress tensor in the presence of the twist defect. We have: where, recall, y are the d − 2 coordinates along the defect and the transverse coordinates to the defect are w, w̄ (at least close to y = 0). The displacement operator in turn is related to the shape deformation of the orbifold partition function with the twist defect, that is, the shape deformation of the Renyi entropy: 12 Now the Ward identity in (4.6) implies the necessary 1/w pole. That is we must have: gives the desired delta function. Thus we have: where we define the functional derivative to absorb the measure factor √ h. The limit n → 1 of the Renyi entropies gives the EE, and so we find the desired behavior: where we could have extracted the displacement operator directly from the n = 1 limit from the pole of T −− in the presence of the modular Hamiltonian: taking the limit n → 1 on (4.8).
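The step from a 1/w pole to a delta function rests on a standard distributional identity. As a reminder, and assuming the convention in which the delta function integrates to one against the flat measure on the transverse plane (the normalization used in the text may differ by a constant factor):

```latex
\begin{equation*}
\partial_{\bar w}\,\frac{1}{w} \;=\; \pi\,\delta^{2}(w,\bar w),
\qquad
\partial_{\bar w} \equiv \tfrac{1}{2}\left(\partial_x + i\,\partial_y\right),
\quad w = x + i y,
\quad \int dx\,dy\;\delta^{2}(w,\bar w) = 1 .
\end{equation*}
% A term c/w in a one point function is therefore annihilated by \partial_{\bar w}
% away from w = 0, but contributes \pi c\,\delta^{2}(w,\bar w) at the defect.
```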
Note that P − depends non-locally on the shape of the entangling surface. So for example functional derivatives of this with respect to x − (y) are non-zero also when y = 0. It is also non-linear in the state |ψ ψ|. This should be contrasted with the T −− contribution which is localized to the null generator at y = 0 and is the expectation value of a linear operator on the Hilbert space.
There is a potential issue here relating to divergences that might naturally appear in S EE . These would show up in the language of the displacement operator as divergences in the one point function Σ n T 1 . Thus we need to specify a regulator to define S EE as well as T 1 . The regulator will depend on a small parameter with units of length. Actually many natural regulators (the brick wall for example) can be designed to preserve the boost invariance around the undeformed cut ∂A 0 such that δS ( ) EE (A 0 )/δx − = 0 for the vacuum, so one might not expect additional divergences for excited states. 13 Of course additionally divergences might show up since A is not locally flat -an effect we will ignore until Section 7.2. If we do not find a regulator where this divergence goes away for the flat cut, then actually the dOPE method forces a resolution. In this case we would find that the one point function of the displacement operator on the flat cut is non-zero: This means that T ( ) 1 is no longer a primary operator. This fact would then lead to some non-trivial mixing when we make the defect OPE argument. For example the displacement operator would now mix with the defect unit operator: We can deal with this mixing simply by removing the one point function: The end result is that the displacement operator one point function must involve a vacuum subtraction of the entropy for the flat cut ∂A 0 : where we have made explicit the state dependence. This vacuum subtraction will be very important later when we deal with the local geometric terms that we are ignoring for now.

Three point functions
It turns out the three point function term in (4.1) can be treated in a similar way to the ambient space two point correlator which was used above to extract the defect spectrum and two point function coefficients g ℓ . The operators O B OĀ are either always on the same replica or displaced by one replica. We need to then sum over the different replicas. We treat the case where the two operators O A are on the same replica: and then do a simple continuation, as discussed in Section 3, to find the other case. We must additionally insert T − which also involves a symmetrization n−1 k =0 T (k ) ++ over each replica. As a result of the Z n symmetry around the defect there is a non-trivial sum left over labeled by j = k − k the separation in replicas: Actually, as we discovered before, the limit z → 0 is not well defined because of the appearance of logs. These will not afflict the final answer for the full modular Hamiltonian but for now we work with a finite but small z. The analytic continuation of the term in brackets can be treated as we did in Section 3.3.1, turning the sum into a contour integral dλ. Since the n = 1 term vanishes in the full modular Hamiltonian we can concentrate on the (n − 1) piece. For the term in brackets we have: where in the first term we insert H 0 A along the region A 0 just before the operator O B insertion in the Euclidean clockwise ordering sense. In the second term we have rescaled λ → λ/z relative to the equivalent expression in (3.20). The specified λ contour is such that T ++ naturally stays away from the O operator insertions with T ++ again inserted just before O B in the clockwise ordering. Note that the second term depends on the CFT 3 point function T OO which is fixed by conformal invariance, but the first modular Hamiltonian term is now determined by a four point function T T OO and is not fixed by conformal symmetries.
13 Entanglement entropy has potentially state dependent divergences which would invalidate this argument [97]. However these state dependent divergences also afflict the definition of the stress tensor, via improvement terms, in such a way that the state dependence cancels between the various terms we find in the lightcone expansion. This should be clear in the final answer, at least for null cuts of the Rindler horizon since then the QNEC quantity is related to the relative entropy which does not suffer from state dependent divergences.
Fortunately when we compute the full modular Hamiltonian this term turns into K 0 A OT O and since K 0 A is a conserved charge this correlator is now related via a Ward identity to the universal 3 point function. More specifically we may apply the continuation OĀ( ) directly to (4.18) and subtract. The resulting expression has a finite z → 0 limit, allowing us to make the projection onto the defect operators: where the first term in (4.18) becomes: In fact this term will only be non-zero for ≥ 2 due to the lack of a 1/z that is necessary for a non-zero answer for = 1. Thus the displacement operator gets no contribution from this modular Hamiltonian term. For ≥ 2 the commutator K 0 A , OĀ simply moves the local operator around a little and so summing over C (1) in the defect OPE gives the same contribution as the stress tensor exchange OO → T → ψψ of: evaluated in the lightcone limit. This is the subject of [19]. Indeed this allows us to include another term we have thus far ignored -the unit operator contribution which will exactly reproduce the unit operator exchange in the correlator of (4.21). So at this point we reassess our goal. We can remove the C (1) term in the three point function by computing the following vacuum subtracted modular Hamiltonian instead: where we have used K A |ψ = 0. The second term C (2) that remains from (4.19) is The contour C(Ā) encircles the operator OĀ in the clockwise direction, straddling the branch cut that runs out to λ → ∞. This particular contour arrises because we are computing the full modular Hamiltonian. Tracking the motion of OĀ( ) we find we have to deform the λ contour in (4.18) so that T ++ avoids OĀ. This results in C(Ā) when we subtract the two terms. 14 See Figure 6 for an illustration of this. 
If we switch the order of integration, which is justified because λ stays well away from the origin, we can now do the z̄ integral: We finally need the following CFT 3 point function: Here ∆u = u B − uĀ and ∆v = v B − vĀ and we have dropped various terms like u B v B ≪ 1 which are small in the lightcone limit. The λ integral over this 3 point function can be done and written in terms of a Hypergeometric function. We find that for ℓ ≥ 2 we can then rewrite C (2) /g as another Hypergeometric integral: That is, one can explicitly check that the two integrals (4.26) and (4.24) agree after dividing by g and using the definition of G −1 N ∝ c T that appears in the normalization N via: Here c ∆ sets the overall normalization of the OO two point function and is related to c T ∆ via the Ward identity. The relation between c T and G N is by definition the usual relation in holographic theories with R AdS = 1. We use the definition here for convenience; however we stress that we need not have G N small. Additionally for ℓ = 1 we find: where the 1/(n − 1) compensates the displacement operator one point function which is O(n − 1).
14 Note that λ is an anti-holomorphic coordinate so the ordering prescriptions are reversed and the deformation OĀ( ) involves a clockwise rotation in the λ plane.
Figure 6. The complex λ plane represented as a slice of pie with the edges identified. Note that λ is an anti-holomorphic coordinate so ordering is now anti-clockwise compared to the discussion in the text. The green curves are the contours we integrate over. The two figures on the left represent the two terms in (4.18) and on the right we have the two terms C (1,2) in (4.20) and (4.23) which arise after computing the full modular Hamiltonian using the operator deformation of (3.7).

Putting it together
We can now put everything together. The term with ℓ ≥ 2 can be easily summed using the integral representation (4.26) and: Adding in the displacement operator we find for (4.22): While this result is only an intermediate step towards our final goal it is instructive to work with this a little. Imagine now taking a limit where u B , −uĀ → ∞. Of course we should always keep u B ≪ 1/|v B | etc. so we remain in the lightcone limit. After integrating by parts we discover: Note that the object in brackets can be interpreted as the first order change in the CFT relative entropy for the region Ā (with the reference state being the vacuum) under a null deformation in the positive u direction. At this stage we could try to constrain the sign of the shape deformation of R, which is in turn related to matrix elements of the shape deformation of the full modular Hamiltonian for the state ψ. If we consider changes of the subregion A under inclusion then, since in this case δ shape K is a negative semi-definite operator, one might argue that δ shape R is negative. 15 Indeed such deformations applied to (4.32) will lead to something proportional to the QNEC quantity; however we cannot claim δ shape R has a definite sign since we are subtracting the action of K 0 and this subtraction will change under the deformation. We note in passing that our original goal was to derive the QNEC in this way, but this method ultimately failed for the reasons explained. However, once we had the result (4.32) in hand it was not hard to come up with an argument that does work, involving modular flow. We turn to this now.

Warm up
In this section we warm up with a simple example of modular flow, using the results of the previous section. Consider: we would like to evaluate this in the lightcone limit. 16 We will do this using perturbation theory, where the small parameter is |uv| ≪ 1 for both u = {u B , uĀ} and v = {v B , vĀ}. We expect we can apply this here because we know that the action of K A within correlators of operators which are close to lightlike separated (i.e. in the lightcone limit) is well approximated by K 0 A plus corrections in powers of uv. This is the content of (4.31) where the leading correction is suppressed ∼ (uv) τ /2 for τ = d − 2 the twist of the stress tensor. Since the modular flow generated by K 0 A at zeroth order does not take us outside of the lightcone limit, we might expect that we can continue to apply (4.31) at the higher orders in the action of K A which are necessary to generate modular flow. In this section we will use this method to compute (5.1).
The method we will use here is actually not completely justified since in reality (4.31) is a statement about matrix elements of K A in a subset of states O |ψ for operators lightlike separated from the defect. 17 In computing higher order powers (K A ) m , however, we might transition outside of the lightcone limit and find matrix elements not well approximated by K 0 A . To address this issue we need a method to directly compute higher order powers of K A in these states. There is a very efficient method to compute these higher order terms, by essentially directly computing (5.1) using the replica trick. Roughly speaking we can compute correlators involving ρ p A O B ρ −p A for integer p and analytically continue this p → is/2π. 18 This method syncs well with the defect OPE computation. We present this in Appendix D and find exactly the same result as the perturbative method that we present now, thus providing a complete justification of these results (and also likely a justification of the method).
16 Note we continue to label OB as if it is in the region D(B), despite the B region playing no role here. If the reader likes, take B = A in this warm up section.
17 There is a useful analogy here to the quantum error correction language of bulk reconstruction advocated in [37,41,42]. These "lightcone" states are like the code subspace where the action of modular flow is well approximated by K 0 A . This is the analogous content of the JLMS [37] result that the bulk modular Hamiltonian is the boundary modular Hamiltonian when acting on low energy states in the gravitational dual. However more work is required to show that this bulk/boundary equivalence applies to modular flow; see footnote 5 for a justification [38]. For example this does not work for the quantum error correction codes discussed in [98] where the relation between bulk and boundary modular Hamiltonians requires a double sided projection Πc onto the code subspace: K bulk = ΠcK bdry Πc . Roughly speaking in our case we want a version involving only a half sided projection K bulk = ΠcK bdry = K bdry Πc although a weaker version should work too. We thank Aitor Lewkowycz, Don Marolf and Xi Dong for discussion on this.
18 We thank Aitor Lewkowycz for suggesting this to us.
The method here starts by using the second expression in (5.1), writing K A = K 0 A + (K A − K 0 A ) and then expand in (K A − K 0 A ). We work with the second expression in (5.1) rather than the first because the perturbative series arranges itself differently for these two expressions -the end result is the same, but we choose the most efficient route. Expanding using time dependent perturbation theory: which is still a local operator just at a different point in spacetime. We write this point as: and similarly for the modular flow of x B (t − s) etc.
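The expansion in (K A − K 0 A ) is the usual first-order (Duhamel) formula of time dependent perturbation theory. As a sketch of the identity being used, with the shorthand δK ≡ K A − K 0 A introduced here only for illustration:

```latex
\begin{equation*}
e^{is\left(K^0_A+\delta K\right)} \;=\; e^{isK^0_A}
 \;+\; i\int_0^s dt\; e^{i(s-t)K^0_A}\,\delta K\, e^{itK^0_A}
 \;+\; O\!\left(\delta K^2\right).
\end{equation*}
% Check: differentiating the right-hand side with respect to s gives
% i K^0_A (\dots) + i\,\delta K\, e^{isK^0_A}, which agrees with
% i (K^0_A + \delta K)\, e^{is(K^0_A+\delta K)} up to terms of order \delta K^2.
```

Conjugating a local operator with this expression and keeping terms linear in δK produces the commutator structure that the text then organizes in terms of R.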
In order to arrive at this result, for terms already at the order O(K A −K 0 A ), we are free to make the following manipulations: e iK 0 A t |ψ ≈ e iK A t |ψ = |ψ . We have made such a replacement several times in the second term of (5.2) with the goal of writing the answer in terms of the commutator in R defined in (4.31). We will need this commutator for operators that are no longer at the location of O B , OĀ so let us define instead: where the A in R(..; A) refers to the fact that we are using the A modular Hamiltonian. In the next section we will need the B version of R. The lightcone limit of R can be succinctly written in terms of the following function: such that: We will also need the expansion of ψ| O 2 O 1 |ψ in the lightcone limit, which was computed in [19]. This can also be written simply in terms of F : Note that for this computation we will always take u 1 < 0 < u 2 . Now that everything is written in terms of F , we note that it satisfies the following properties: where x 2 (t) is the modular flow/boost of the coordinate x 2 with respect to the action of K 0 etc. These identities will be useful for the various manipulations we make below. Putting all the above definitions together we can now write the correlator we are interested in (5.1) as: It turns out we can greatly simplify this latter t integral: where λ = e t and uĀ(t) = λuĀ. Now we would like to exchange the λ and u integrals. In order to do this we have to make the lower limit of integration λ independent. So we insert a step function H(u − λuĀ) (keeping in mind that uĀ < 0) so we can extend the lower range of integration to uĀ(s) = e s uĀ. Additionally using the fact that: we can then easily do the λ integral: Adding this term to the P term in (5.10) we see there are various cancellations and we have: which can also be compactly written as: is the jump discontinuity in the function A s (u) at u = 0. There are several checks on this result. 
Firstly when s = 0 we trivially go back to the result of [19] given in (5.8). Secondly we know via Tomita-Takesaki theory that this expression should be analytic in the strip −π/2 < Im s < π/2. One can check this is indeed the case, and relies intricately on the region of u integration in (5.15) staying on the opposite side to the action of modular flow in the coordinates of F . It is interesting that the integral is contained within uĀ < u < u A despite the fact that zeroth order modular flow moves the operators further away (for example at intermediate steps above this is not the case). This is linked to the fact that K A |ψ = 0 which implies that we can do the computation above in several different ways (say starting from the first expression in (5.1) or ψ| e −isK A O B e isK A OĀ |ψ ) and consistency of these different computations tells us that the final u integral region must be contained within uĀ < u < u A . We do not have a deeper understanding of this.
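The exchange of the λ and u integrals used in the derivation above follows the usual pattern of trading a λ dependent limit for a step function. Schematically, for a generic integrand g(λ, u) and a generic upper limit u max > uĀ (both placeholders, not the specific quantities entering (5.10)), and assuming s > 0 so that λ = e t runs from 1 to e s :

```latex
\begin{equation*}
\int_1^{e^s}\! d\lambda \int_{\lambda u_{\bar A}}^{u_{\max}}\! du\; g(\lambda,u)
 \;=\; \int_{e^s u_{\bar A}}^{u_{\max}}\! du \int_1^{e^s}\! d\lambda\;
   H\!\left(u-\lambda u_{\bar A}\right) g(\lambda,u) .
\end{equation*}
% Since u_{\bar A} < 0, the step function H(u - \lambda u_{\bar A}) equals 1
% exactly when \lambda > u/u_{\bar A}, so the inner \lambda integral effectively
% runs from \max(1, u/u_{\bar A}) to e^s.
```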
Finally later when we bound the function f (s) we will need to consider single modular flow but for the case where the two operators are initially inserted in the same region. This is a mild extension of the above computation, although now we must be more careful with operator ordering and branch cuts. The answer is the naive generalization of (5.14) where for the operator ordering shown we should take −π < Im s < 0 in order to pick the correct branch cut prescription when doing the u integral. Here: and we take the case uĀ, u Ā < 0, u Ā − uĀ > 0 and v Ā − vĀ < 0 (so the two operators are initially space-like separated before the action of modular flow). Other cases can be arranged via analytic continuation of this answer.

Double modular flow
In this section we would like to use the above method to compute the more complicated double flow correlator (see Figure 7): The operators O s B,Ā are inserted at: The superscript on the coordinates is to emphasize that we have shifted the operator insertions by a small s-dependent amount, as illustrated in Figure 7. We have done this for future convenience where this will lead to several cancellations that are important for the final bound that we derive. These were found by trial and error; however a complete understanding of these can be found in later work [52] where they naturally arise from relative modular flow. See for example (3.25) of that paper, where the shifts are encoded in the V factors. In the next section we will discuss general properties of f (s) in the strip −π/2 < Im s < π/2. Here we would like to compute f (s) in the lightcone limit where we take s large but not too large. For now we keep s fixed and O(1). Either way, taking v B,Ā small, the dominant contribution will be the lowest twist defect operators which for now we take to be the stress tensor and the displacement operator. We will use the same method as in the simple example above. We now have the complication of two modular flows to keep track of, however we can use the same perturbative method: writing K A,B = K 0 A,B + (K A,B − K 0 A,B ) and expanding. Just as for single modular flow this method needs proper justification. Again we go through an alternative method in Appendix E that gives exactly the same answer as the perturbative approach. The method involves using a Cauchy-Schwarz bound to write (5.18) at the order we wish to compute it, but now only using single modular flow. We have an independent (not perturbative) argument that our computations of single modular flow are under control and so the results of this section are justified.
In the perturbative approach we expand starting from: The leading order term (after replacing K A → K 0 A etc.) is constrained by the algebra of modular inclusions. We will use the following notation to denote the zeroth order modular A e isK 0 B and this is a local operator at: 19 We will use similar notation to describe other zeroth order modular flows below. Expanding we find: We can write this in terms of the correlators used in the warm up section R(x 2 , x 1 ; A), R(x 2 , x 1 ; B) and P(x 2 , x 1 ) which are in turn related to the block function F . In particular note that all denominators in P, R are the same for all the terms above: Factorizing out this denominator we find: In a slight generalization to the previous section we have defined: (5.24) relevant for the action of K B − K 0 B , · . Very similar manipulations to the previous section follow, in particular we switch the order of integration from dλ ↔ du where λ = e t . We also need some slightly more general identities for F relating to boosts around the u = δx − surface: Turning the crank we find the answer: where we have the piecewise defined function: This is quite a remarkable result. Again the stress tensor is only integrated between the positions of the original operator insertions, and cancellations occur in the computation to ensure this. Although our intermediate steps involve T −− inserted in a larger u window, we note that since we are re-summing local operators at the location of the defects to find this integral, the computation is actually never sensitive to the expectation of T −− outside of this window. Because of all the definitions that go into the above result, the reader might get lost in these expressions so we write these out more fully: 20 and the functions are: (5.34) We thus have the promised e s growing term multiplying the QNEC quantity.
19 Note the ordering of the argument in OĀ(sA, −sB) is important and the subscript tells us which modular Hamiltonian to evolve with, K 0
Later we will find it necessary to place the operators symmetrically about the two entanglement cuts with uĀ = −u B + δx − , then: where ∆u = u B − uĀ > δx − and ∆v = v B − vĀ < 0. This was the main goal of this section. The s-dependence we introduced in the insertion positions u s B,Ā has the desired effect of eliminating the would-be s-dependence in the term (−∆u) h . This s-dependence was present in v1 of this paper, and leads to issues later. In the following section we will study more general properties of f (s) in the complex strip and use this to show that the quantity Q − above is positive.

General properties of f (s)

Analyticity in the s-strip
We start by noting that in this section in order to apply various theorems from the algebraic approach to QFT we need to take the operators O s B,Ā to be non-distributional bounded operators. This can be achieved by firstly smearing the local operators over a small neighborhood δ d in spacetime keeping this region within the respective domains of dependence D(B) or D(Ā). Secondly to make the operators bounded we could apply a projection from the spectral decomposition of the operator or we could also simply use fermionic operators. As long as the lengths scales that these procedures introduce (e.g. δ) are much smaller than the various length scales in our final setup we expect the details of this procedure will not matter and in our final results we can replace the operators again with local bosonic operators.
Let us first define two functions which are un-normalized versions of f : We can write the functions as inner products of two states: (6.3) The two functions g ± are well defined for {g − : −π ≤ Im s ≤ 0} and {g + : 0 ≤ Im s ≤ π} as long as x s B ∈ D(B), x sĀ ∈ D(Ā); this is satisfied for Re s > s 0 = max ln . These are semi-strips which we simply refer to as strips, inside of which the functions are bounded using the Cauchy-Schwarz inequality by the norms of these states: The bound is finite in their respective s-strips thanks to Tomita-Takesaki theory (see Section 2.4). This also establishes analyticity since Tomita-Takesaki theory tells us that the maps taking R → H CF T or H CF T : have analytic continuations into the strip {−π ≤ Im s ≤ 0, Re s > s 0 } for the vector and dual vector in g − (s) of (6.3). This is similarly true for the vectors and dual vectors in g + (s) but now for the strip {0 ≤ Im s ≤ π, Re s > s 0 }.
The fact that the operators are inserted in an s dependent manner leads to a slight subtlety here since for complex s the operators (or smeared versions thereof) are no longer clearly part of the algebra. This is not a problem since at somewhat large s the shifts are small and can be achieved with a simple Taylor series expansion in δx − e −s . We can then make the argument for each term in this expansion. Another justification of this should be achievable with relative modular flow, since this gives rise to these operator shifts [52], and has a controlled analytic structure. See also Section 7.2 for further discussion of this point. Now notice that g + (s) = g − (s) for {Im s = 0, Re s > s 0 } since the modular flowed operators e isK A O sĀ e −isK A and e isK B O s B e −isK B commute with each other. This is due to the inclusion property B ⊂ A and Ā ⊂ B̄. Note that for real s the small s-dependent shifts are real so indeed we can associate the operators to their respective algebras.
Therefore we have $g_\pm(s)$ as holomorphic functions agreeing along the real $s$ axis, where they are continuous. By the edge of the wedge theorem, they must be complex analytic continuations of each other. Thus we can define $g(s)$, holomorphic in the full strip $\{-\pi \le \mathrm{Im}\, s \le \pi,\ \mathrm{Re}\, s > s_0\}$. We will then use $g(s)$ in our definition of $f(s)$ after normalizing appropriately.
To make contact with some of the theorems used when studying half-sided modular inclusions we can prove analyticity in a slightly different way. Consider the structure theorem, proven in various ways [29,31,33,35], which states that has an analytic extension to the complex strip $\{0 \le \mathrm{Im}\, s \le \pi\}$ as a holomorphic function with values in the space of bounded operators on the QFT Hilbert space. Furthermore, in this strip the operator norm is bounded, $\|V_-(s)\| \le 1$. Similar statements hold for the operator: however now $\|V_+(s)\| \le 1$ in the strip $\{-\pi \le \mathrm{Im}\, s \le 0\}$. We will not go over the details of the proof except to note that $B \subset A$ implies that $V_+(s - i\pi) = J_A V_+(s) J_B$, so the operator $V_+$ is unitary on the top and bottom of the strip, $\mathrm{Im}\, s = 0, -\pi$, and the bound on the norm in the interior follows roughly from the maximum modulus principle. Note that in our definitions of $V_\pm$ the $s$-strips are reversed compared to where we defined $g_\pm$. We can thus use the results above to extend the definition of $g_\pm(s)$ to the full region $\{-\pi \le \mathrm{Im}\, s \le \pi,\ \mathrm{Re}\, s > s_0\}$, which then must satisfy $g_+(s) = g_-(s) \equiv g(s)$ everywhere in this strip, again via the edge of the wedge theorem. For example now we can set: $\mathrm{Re}\, s > s_0$ (6.10) We can see, in a specialized example, how the bound on the norm of $V_\pm$ arises. Take the modular flow for Rindler cuts in vacuum, i.e. those relevant for defining the denominator in $f(s)$. In this case the modular Hamiltonians $K_{A,B} \to K^0_{A,B}$ satisfy the Poincare algebra and we can simply compute $V^0_\pm$: where $U$ is a null translation in the $x^-$ direction. This translation is generated by a positive null momentum operator $P_-$, with $U(a) = e^{iaP_-}$. Here $a = \mp\delta x^-(1 - e^{-s})$, with $\mathrm{Im}\, a > 0$ in the appropriate $s$ strip. This situation will generalize to cases that satisfy the constraint of half-sided modular inclusions, $e^{isK^0_A} \mathcal{A}_B e^{-isK^0_A} \subset \mathcal{A}_B$ for $s > 0$ (see Section 2.2 and Appendix A).
This only applies to vacuum states for special cuts, and we emphasize this is not a general result for the ψ modular Hamiltonians. In fact later we will give evidence that the growing Q − term only arises in the case where the algebra does not apply. However the case for half-sided modular inclusions will always be used to normalize g. So we give a complete definition of f : with f (s) = g(s)/g 0 (s). Here the region of analyticity for g 0 (s) is at least that of g(s), although we now have to consider the possibility that g 0 (s) has a zero. For Rindler modular Hamiltonians in a CFT we can just compute this quantity: where this function is s-independent and has no zeros. So we conclude that f (s) is analytic in the complex s-strip of interest.
For example the small corrections we computed in Section 5.2 for the lightcone limit of f (s), which were in general complicated functions of s (5.30), demonstrate this analyticity in a non-trivial way. Although it should be noted that it is important for the QNEC proof that we can make the analyticity argument for general s including very large s where we move outside of the lightcone limit and where we do not have an explicit expression for f (s).
We end by noting that the operator bound on V is not useful for us since the norm of the state O A |ψ is dependent on the details of the smearing of the operator, so this will be a very weak bound scaling with some inverse power of the small distance scale δ over which we smear the operator. We now move onto proving a more refined bound on the boundaries of the reduced strip {−π/2 ≤ Im s ≤ π/2, Re s > s 0 }.
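The norm bound in the Rindler example of this subsection can be made explicit. The following is a short check, assuming only that $P_-$ has non-negative spectrum (with the vacuum at $p = 0$) and writing $s = t + i\theta$ with real $t$, $\theta$; the symbols $t$, $\theta$ and $p$ are notation introduced here for illustration:

```latex
\begin{aligned}
\| U(a) \| &= \left\| e^{i a P_-} \right\|
  = \sup_{p \ge 0} \left| e^{i a p} \right|
  = \sup_{p \ge 0} e^{-p\, \mathrm{Im}\, a} = 1
  \qquad (\mathrm{Im}\, a \ge 0), \\
\mathrm{Im}\left( 1 - e^{-s} \right) &= e^{-t} \sin\theta ,
\qquad\text{so}\qquad
\mathrm{Im}\, a = \mp\, \delta x^- \, e^{-t} \sin\theta
\quad\text{for}\quad a = \mp\, \delta x^- \left( 1 - e^{-s} \right).
\end{aligned}
```

Since $\sin\theta$ has a definite sign on each of $0 < \theta < \pi$ and $-\pi < \theta < 0$, the sign of $\mathrm{Im}\, a$ is fixed throughout each strip and is opposite for the two choices of sign in $a$. For a given sign convention for $\delta x^-$, this is consistent with $V^0_\pm$ being bounded by one in opposite strips.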

Cauchy-Schwarz bound
We now study constraints on $f(s)$ along the lines $\{\mathrm{Im}\, s = \pm\pi/2,\ \mathrm{Re}\, s > s_0\}$. Our considerations here are in line with, and thus motivated by, the derivation of the chaos [47] and causality [48] bounds. Starting with $\mathrm{Im}\, s = \pi/2$ we use $g_+$ defined in (6.2). We use the Cauchy-Schwarz inequality to show: $s = t + i\pi/2$, $t \in \mathbb{R}$, $t > s_0$ (6.15) The denominator is simply given by (6.14), which is $s$-independent. The numerator terms in (6.15) are determined by two point functions, for example: where we have taken $\mathcal{O}_{\bar{A},B}$ to be a Hermitian operator, and thus $\mathcal{O}^s_{\bar{A}}$ By examining the norms in (6.16) let us introduce two normalized single modular flowed correlators: These are designed to be analytic in $s$ as well as to reproduce the norms in (6.16) for $s = t + i\pi/2$. The denominators in $h_{\bar{A},B}(s)$ are also $s$-independent, and agree with one another as well as with that of $f(s)$ if we choose: So we pick $\{u_{\bar{A},B}, v_{\bar{A},B}\}$ to be these values from now on. It is very important that the denominators match, so that the leading terms for the Cauchy-Schwarz bound below in the light-cone limit all cancel. The functions $h_{\bar{A},B}(s)$ can be computed using the single modular flowed correlators discussed in Section 5.1, and have the same analyticity properties as $f(s)$; they are real and positive along $\{\mathrm{Im}\, s = \pi/2,\ \mathrm{Re}\, s > s_0\}$ because they are proportional to norms of states there. Now define The Cauchy-Schwarz bound therefore translates into: By starting with the definition of $g(s)$ using $g_-(s)$, and using that $h_{\bar{A},B}(t - i\pi/2) = h_{\bar{A},B}(t + i\pi/2)$ for $t \in \mathbb{R}$, $t > s_0$, one can show that the same is also true for $\mathrm{Im}\, s = -\pi/2$, $\mathrm{Re}\, s > s_0$. We thus conclude that $\mathrm{Re}\, F(s) < 0$, $\mathrm{Im}\, s = \pm\pi/2$, $\mathrm{Re}\, s > s_0$ (6.21)

The sum rule
In order to make contact with the ANEC sum rule derived in [19] let's map the strip −π/2 < Im s < π/2 to the upper half plane via: and define note that z 2 → η was used in [19].
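As a rough numerical illustration of such a map, the sketch below assumes the form $\sigma = i\, c\, e^{-s}$, where $c$ is a hypothetical positive constant standing in for whatever ratio of lengths fixes the normalization; the actual map and normalization used in the text may differ. Under this assumption the strip $-\pi/2 < \mathrm{Im}\, s < \pi/2$ lands in the upper half $\sigma$-plane, lines of constant $\mathrm{Re}\, s = -\ln r$ become semi-circles of radius $c\, r$, and the edges $\mathrm{Im}\, s = \pm\pi/2$ become the positive and negative real axis.

```python
import cmath
import math


def strip_to_sigma(s: complex, c: float = 1.0) -> complex:
    """Assumed map sigma = i * c * exp(-s); c is a hypothetical scale, not taken from the text."""
    return 1j * c * cmath.exp(-s)


def sample_strip_line(re_s: float, n: int = 9):
    """Interior points of the segment Re s = re_s, -pi/2 < Im s < pi/2."""
    return [
        complex(re_s, -math.pi / 2 + math.pi * (k + 1) / (n + 1))
        for k in range(n)
    ]


if __name__ == "__main__":
    r = 0.1
    for s in sample_strip_line(-math.log(r)):
        sigma = strip_to_sigma(s, c=2.0)
        # |sigma| should be constant along the line; Im sigma should be positive.
        print(f"Im s = {s.imag:+.3f} -> |sigma| = {abs(sigma):.4f}, Im sigma = {sigma.imag:.4f}")
```

Large $\mathrm{Re}\, s$ (late modular time) corresponds to small $|\sigma|$ in this parametrization, which is why the semi-circle radii shrink as $t$ grows.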
To proceed, let us consider a contour $\Gamma$ in the $s$-strip where $F(s)$ is analytic, consisting of $\mathrm{Im}\, s = \pm\pi/2$ and $\mathrm{Re}\, s = -\ln r_1$, $\mathrm{Re}\, s = -\ln r_2$, where $s_0 < -\ln r_1 < -\ln r_2$. This is then mapped to semi-circles of radii R 1,2 = r 1,2 ∆u−δx − connected by straight line segments in the $\sigma$-plane (Figure 8).
By analyticity, $\oint_\Gamma d\sigma\, F(\sigma) = 0$, we have that: where $\int_{R\ \text{semi-circle}}$ denotes an anti-clockwise integral over the semi-circle of radius $R$. In general, $F(s)$ is some complicated function we know nothing about other than its analytic properties. However, when the light-cone expansion is valid (i.e. moderate $t$), we can approximate the double-modular flowed term $f(s)$ by (5.30), and the same-side single modular flowed terms $h_{\bar{A},B}(s)$ using (5.16). Let us write the leading order light-cone approximation thus obtained as $\tilde{F}(s)$. After some tedious algebra one can show that: This fact is somewhat analogous to the statement that the double discontinuity defined for a 4 point CFT correlation function in [99] vanishes when applied to a single conformal block in the s-channel. The expressions in (5.30) are analogous to the light cone limit of these blocks. We emphasize that (6.25) is a non-trivial result, for which it is important that we made the appropriate small $s$-dependent shifts on the operator locations. This fact will also be essential to our proof of the QNEC bound, as we will expand upon below. In light of (6.24), this implies that for any $R_1$, $R_2$ where the light cone expansion is valid, we have: independent of the radii $R_1$, $R_2$. Suppose we expand $\tilde{F}(\sigma) = \sum_\ell Q_\ell\, \sigma^\ell$ and use that the semi-circle integral of a single power is $\int_{R\ \text{semi-circle}} d\sigma\, \sigma^\ell = i\pi$ for $\ell = -1$; $-2R^{\ell+1}/(\ell+1)$ for $\ell$ even; and $0$ for $\ell \neq -1$ and $\ell$ odd (6.28) The $R$ independence of (6.26) implies that we must have $Q_{2n} = 0$ for $n \in \mathbb{Z}$, thus $\tilde{F}(\sigma)$ only contains terms of odd integer power, of which only the simple pole $Q_{-1}\sigma^{-1}$ survives the semi-circle integral: The residue of the $\sigma$ pole in the light cone limit is: At this point let us also comment on the reason for the $s$-dependence in the operator insertions $x^s_{B,\bar{A}}$. Had we chosen to proceed with $s$-independent insertions, say simply placing $\mathcal{O}_{B,\bar{A}}$ at $u_{B,\bar{A}}$, $v_{B,\bar{A}}$ (see Figure 7), which was how we initially attempted the proof, then the corresponding $\tilde{F}(\sigma)$ thus computed would be contaminated by even powers of $\sigma$.
Doing the semi-circle integral will yield corrections to (6.29) which are suppressed by powers of $R$ but might be leading in $z$. In order to suppress these corrections we would need to be in the large $t$ (small $R$) limit compared to the other small parameter $z^\#$ controlling the light cone limit, at which point the light-cone approximation $F(s) \approx \tilde{F}(s)$ can no longer be trusted. In particular the order of limits is very important: we must take $z$ small before we take $s$ large. We can only achieve this, while projecting onto the term of interest, once (6.25) is satisfied. The $s$-dependence in the positions of the operator insertions $x^s_{\bar{A},B}$ is introduced for this purpose.
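The role of the individual powers of $\sigma$ in the semi-circle integrals can be checked numerically. The sketch below is only an elementary contour-integration check, not part of the proof: it integrates $\sigma^\ell$ over the anti-clockwise upper semi-circle of radius $R$ with a midpoint rule and compares against the closed form $i\pi$ for $\ell = -1$, $-2R^{\ell+1}/(\ell+1)$ for even $\ell$, and $0$ for odd $\ell \neq -1$. The function names and the step count are illustrative choices.

```python
import cmath
import math


def semicircle_integral(power: int, radius: float, steps: int = 20000) -> complex:
    """Midpoint-rule integral of sigma**power over the anti-clockwise upper semi-circle."""
    total = 0j
    dphi = math.pi / steps
    for k in range(steps):
        phi = (k + 0.5) * dphi
        sigma = radius * cmath.exp(1j * phi)
        total += sigma**power * (1j * sigma) * dphi  # d(sigma) = i * sigma * d(phi)
    return total


def closed_form(power: int, radius: float) -> complex:
    """Exact value of the same integral for integer powers."""
    if power == -1:
        return 1j * math.pi
    if power % 2 == 0:
        return complex(-2.0 * radius ** (power + 1) / (power + 1))
    return 0j


if __name__ == "__main__":
    for power in range(-3, 4):
        numeric = semicircle_integral(power, 0.5)
        exact = closed_form(power, 0.5)
        print(f"power {power:+d}: |numeric - exact| = {abs(numeric - exact):.2e}")
```

Only the $\ell = -1$ term gives a nonzero result that does not depend on the radius. The even powers give radius-dependent contributions, which is why they must be absent for the difference of two semi-circle integrals to be independent of $R_1$ and $R_2$.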
As an example of the power of this statement we can now give the reason that low twist scalar contributions can be ignored. While such contributions might dominate the light cone limit they are killed by the above contour integrals. More explicitly this is because they don't have a σ −1 term but they will still satisfy the constraint (6.25). So for example if we had not proceeded with the s-dependent shifts of the operator locations then indeed such low twist contributions would have given the leading answer. Now consider the contour C consisting of the semi-circle {|σ|= R, Im σ ≤ 0} and the straight-line segment {−R ≤ σ ≤ R, σ ∈ R}, inside a region where F (σ) is analytic and the light-cone approximation F (s) ≈F (s) is valid. Taking the real part of C dσF (σ) = 0 (6.31) Applying the Cauchy-Schwarz bound we obtained before, which in the σ-plane says: we can extract the QNEC quantity as a positive sum rule: This ends our proof of the QNEC.

Mixed states
Here we address the issue of mixed states. Until now we have only considered pure states $|\psi\rangle$, and since entanglement entropy is non-linear in the state we cannot simply extrapolate the QNEC for mixed states from the QNEC for pure states (this should be contrasted with the ANEC, where this was possible). Indeed we will see an interesting effect relating to mixed states when interpreting our results through the lens of a putative gravitational dual theory: the entangling surface splits into two surfaces, one for $A$ and one for $\bar{A}$. This is also true in holographic theories if we use the quantum extremal surface [51] in our computation of the quantum corrected entanglement entropy. So our results to follow give further, theory independent, evidence for the importance of the quantum extremal surface (see [94] for a derivation in holographic models). Our methods rely on the existence of a modular operator $K_A$ associated to a sub-region. While we could define $-\ln\rho_A + \ln\rho_{\bar{A}}$ for any mixed state $\rho_\psi$, this is not the correct generalization of $K_A$. For example it has very different properties than what we might hope; most notably there is no sense in which this guess annihilates the state. It is also, up to a sign, the same operator for $A$ and $\bar{A}$, which would mean our subsequent arguments, if they had been possible, would not distinguish $S_A$ from $S_{\bar{A}}$. The correct thing to do, it turns out, is to instead look for a purification of $\rho_\psi$ in an enlarged Hilbert space $H_A \otimes H_{\bar{A}} \otimes H_C$. Then the correct modular operator is $2\pi K_A = (-\ln\rho_A + \ln\rho_{\bar{A}+C})$ (7.1) which now manifestly differs from minus: $2\pi K_{\bar{A}} = (-\ln\rho_{\bar{A}} + \ln\rho_{A+C})$ (7.2) Finding a purification is always possible, and at worst we must take this new Hilbert space to be a double of the original CFT Hilbert space. Let us embrace this "worst case" scenario and map $\rho_\psi$ to a pure state in the thermofield double Hilbert space $H_{CFT} \otimes H_{CFT}$.
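The statement that the purified modular operator annihilates the state, while the naive guess does not, can be seen from a Schmidt decomposition. The following is a minimal sketch for a pure state on $H_A \otimes H_{\bar{A}} \otimes H_C$, writing $\rho_{\bar{A}C}$ for the reduced state on the complement of $A$; the Schmidt coefficients $p_i > 0$ and the orthonormal vectors $|i\rangle_A$, $|i\rangle_{\bar{A}C}$ are notation introduced here:

```latex
\begin{aligned}
|\psi\rangle &= \sum_i \sqrt{p_i}\; |i\rangle_A \otimes |i\rangle_{\bar{A}C}
\quad\Longrightarrow\quad
\rho_A = \sum_i p_i\, |i\rangle\langle i|_A , \qquad
\rho_{\bar{A}C} = \sum_i p_i\, |i\rangle\langle i|_{\bar{A}C} , \\
2\pi K_A\, |\psi\rangle
 &= \left( -\ln\rho_A \otimes 1 + 1 \otimes \ln\rho_{\bar{A}C} \right) |\psi\rangle
  = \sum_i \sqrt{p_i}\, \left( -\ln p_i + \ln p_i \right) |i\rangle_A \otimes |i\rangle_{\bar{A}C}
  = 0 .
\end{aligned}
```

The cancellation relies on $\rho_A$ and $\rho_{\bar{A}C}$ sharing the same non-zero spectrum, which holds because the global state is pure. For a mixed $\rho_\psi$ on $H_A \otimes H_{\bar{A}}$ the spectra of $\rho_A$ and $\rho_{\bar{A}}$ generally differ, so no analogous cancellation is available for $-\ln\rho_A + \ln\rho_{\bar{A}}$.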
To illustrate this we consider a mixed sum over primary states: where $|\alpha\rangle = \psi_\alpha(0)|\Omega\rangle$ is a real scalar primary operator insertion, working in radial quantization about the origin, and $\langle\alpha| = \lim_{x\to\infty} |x|^{2\Delta_\alpha} \langle\Omega|\psi_\alpha(x)$. This density matrix is a state on the $S^{d-1}$ CFT Hilbert space and we would like to consider tracing over some spatial sub-region $\bar{A} \subset S^{d-1}$. We can take the purification to be: We now aim to compute matrix elements of the modular operator $K_A$ using the replica trick, relating the answer to the $n$ analytic continuation of a genuine correlator in the $Z_n$ orbifold theory. For example matrix elements of $\ln\rho_A$ can be computed via the $n \to 1$ limit of: where: (7.6) and the symmetrization only acts on the pairwise operators. Note that the operators that create the state, in the mixed case, are inserted in a correlated way at $0$ and $\infty$ on a fixed replica. This is just like the bi-local $\mathcal{O}_B \mathcal{O}_{\bar{A}}$ operator insertion that we had to deal with previously. We can understand how these specific bi-local operators arise from the purification perspective, since tracing out CFT gives: thus correlating operators on the same replica. When computing matrix elements of $\ln\rho_{\bar{A}+CFT}$ for the state $|\psi\rangle$ it turns out the replica trick gives exactly the same prescription we have been working with for pure states: (7.8) that is, all we have to do is take $\mathcal{O}_{\bar{A}}$ and move it in a clockwise direction around the $\Sigma_n$ defect. This follows after two steps. First, the trace over the purification now results in: which correlates operators $\delta_{\alpha_k,\beta_{k+1}}$ on shifted replicas. The second step comes after we trace over $A$ in the above expression and write the answer as a correlation function on a branched manifold $M_n$. In order to produce the same branch cut structure as the replicated manifold $M_n$ used for the computation of $Z_n(A)$, we must deform the cuts, which for $M_n$ initially lie along $\bar{A}$, back to $A$.
Since the branch cut is now moved across the state operator insertions (say at $0$), this effectively undoes the shift in the correlation of $\psi_\alpha$ operators, thus giving back the same bi-local operators (7.6) as appearing in $Z_n(A)$. The only difference comes when we include the $\mathcal{O}_{B,\bar{A}}$ operator insertions, one of which shifts replicas as we deform the branch cut. Hence (7.8).
The end result is satisfying and gives the same prescription for deforming the $\mathcal{O}_{B,\bar{A}}$ as we worked with before. So our proof continues to hold for mixed states. Note that the different correlation between state operator insertions means that we will find a different answer for $K_A$ versus $K_{\bar{A}}$.
There is another formalism for dealing with the various bi-local operators and oddities that we have defined in this paper. We sketch the picture here, and leave details for future discussion. Again the idea is that we want to write the replica trick computation using objects which are intrinsic to the orbifold theory. The orbifold theory is a discrete gauging of the $Z_n$ cyclic permutation symmetry for the $CFT^n$ theory. Gauging this symmetry results in new non-local operators that live naturally in this theory. For example the entanglement region $A$, where one identifies the different replicas, can originally be thought of as a co-dimension 1 operator inserted along $A$ and ending on $\partial A$. After gauging we remove the co-dimension 1 operator, leaving a co-dimension 2 twist operator $\Sigma_n$ living at $\partial A$. In particular the position of the branch cut on the replicated manifold becomes irrelevant.
This twist operator carries a $Z_n$ charge, corresponding to a $(d-2)$-form generalized global symmetry [100], which arises when we gauge the original (0-form) replica symmetry. The twist operators are somewhat analogous to flux tubes, and we can measure the charge of the flux tube by encircling the twist operator with a Wilson loop for the discrete gauge symmetry, $W_q(C)$. These are labeled by an integer $q = 0, \ldots, n-1$ such that: $\langle W_q(C)\, \Sigma_n(\partial A) \rangle_{CFT^n/Z_n} = e^{iq2\pi/n} \langle \Sigma_n(\partial A) \rangle_{CFT^n/Z_n}$ (7.10) Here $C$ circles $\partial A$ (on the un-replicated/un-branched space). We can define local operators that are charged under the gauge symmetry via: (7.11) and $k$, $l$, $q$ etc. are defined modulo $n$. These will only make sense in the orbifold theory (for $q \neq 0$) if we attach them to Wilson lines. Thus we propose to define our bi-local operators as: where $C(x, x')$ is some open curve between the points $x$, $x'$. The curve $C$ was previously implicitly defined by our convention of which point is on which replica for the branched covering (i.e. where we choose our branch cuts on $M_n$ before gauging). This definition also works for the bi-local probe operators: The two different prescriptions for $\ln\rho_A$ and $\ln\rho_{\bar{A}+CFT}$ correspond to different choices of the curve $C(x_B, x_{\bar{A}})$, passing to the left or right of the twist operator at $\partial A$.
Let us illustrate this with the following example. Consider a more standard replica trick computation: that of computing thermal Renyi entropies $S_n(A)$. Actually the well known methods for computing this are not obviously intrinsic to the orbifold theory, for the same reasons discussed above. The computation of $S_n(A)$ would be expected to be governed by a twist correlator on a space $S^1 \times \mathbb{R}^{d-1}$. However, since the twist operator is co-dimension 2 in the orbifold theory and does not have a branch cut or a co-dimension 1 object ending on it, this correlator is not sensitive to the difference between $A$ and $\bar{A}$. Since $S_n(A) \neq S_n(\bar{A})$ for mixed states, this would give the wrong answer. The issue arises because the sums over the states in the thermal density matrix on each replica are independent, and in particular do not involve a sum over the action $g$ of the replica symmetry, $\sum_{k=0}^{n-1} \mathrm{Tr}_{CFT^n}\, g^k e^{-\beta H_{CFT^n}}$, necessary to project to the symmetric states after gauging $Z_n$. It turns out we can remove the projection onto symmetric states by introducing a sum over Wilson lines $W_q(S^1)$ wrapping the thermal circle. That is: where $C$ wraps $S^1$. The difference between the $A$ and $\bar{A}$ Renyi entropies corresponds to picking the curve $C$ to either intersect $A$ or $\bar{A}$ respectively. In our computation, where now $\rho_\psi$ corresponds to the thermal state, we should pick both the curve $C$ as well as an open curve between the local operators $\mathcal{O}_{B,\bar{A}}$ on which to place these Wilson lines.
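The mechanism by which a sum over Wilson lines can undo the projection onto symmetric states is discrete Fourier orthogonality, $\frac{1}{n}\sum_{q=0}^{n-1} e^{2\pi i q k/n} = \delta_{k,0}$ for $k$ modulo $n$. The toy sketch below illustrates this on a hypothetical finite spectrum of (energy, $Z_n$ charge) pairs. It assumes, purely for illustration, that a Wilson line $W_q$ wrapping the thermal circle weights the holonomy sector $g^k$ by the phase $e^{2\pi i q k/n}$; the spectrum, the function names and this weighting are not taken from the text.

```python
import cmath
import math


def twisted_trace(levels, k, n, beta):
    """Tr g^k exp(-beta H) for a toy spectrum given as (energy, Z_n charge) pairs."""
    return sum(
        math.exp(-beta * energy) * cmath.exp(2j * math.pi * k * charge / n)
        for energy, charge in levels
    )


def gauged_partition_function(levels, n, beta):
    """(1/n) sum_k Tr g^k exp(-beta H): keeps only Z_n-invariant states."""
    return sum(twisted_trace(levels, k, n, beta) for k in range(n)) / n


def wilson_summed_partition_function(levels, n, beta):
    """Sum over q with an assumed phase exp(2 pi i q k / n) in each holonomy sector k."""
    total = 0j
    for q in range(n):
        for k in range(n):
            total += cmath.exp(2j * math.pi * q * k / n) * twisted_trace(levels, k, n, beta)
    return total / n


if __name__ == "__main__":
    toy_levels = [(0.0, 0), (1.0, 1), (1.5, 2), (2.0, 0)]  # hypothetical spectrum
    print("projected :", gauged_partition_function(toy_levels, 3, 0.7))
    print("Wilson sum:", wilson_summed_partition_function(toy_levels, 3, 0.7))
```

In this toy model the sum over $q$ selects the $k = 0$ sector, so the Wilson-line sum reproduces the unprojected trace $\mathrm{Tr}\, e^{-\beta H}$, while the gauged trace alone counts only the charge-zero states.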

Local geometric terms
We now address the existence of local geometric terms that appear in the expansion of $f(s)$ in the lightcone limit. These terms might actually contaminate the expression we have derived so far, such that the leading correction to 1 will not be given by the QNEC quantity. The holographic QNEC proof [43] suggests that these new terms do not appear if we specify that locally the entangling surface is stationary under deformations in the $x^-$ direction; that is, requiring that the extrinsic curvature $K^+_{ab}(y = 0) = 0$, and enough $y$ derivatives thereof, vanish about the point of interest ($y = 0$) should kill these terms. Of course it would be nice to derive this condition in our general proof of the QNEC without resorting to holography. We aim to do this here.
Recall that we are picking a coordinate system close to y ≈ 0 such that the entangling surface ∂A is defined by the equation: The extrinsic curvatures are K A± ab (y) = ∂ y a ∂ y b X ± A (y). In our final setup for the QNEC we also have the other region B which we can take to be defined close to y = 0 via: Without loss of generality we can set ∂ y X ± A (0) = 0 which implies that ∂ y X + B (0) = 0 otherwise we violate the nesting condition D(B) ⊂ D(A). For full generality we will leave ∂ y X − B (0) non-zero. At second order in the y expansion the nesting condition means that: is a positive semi-definite matrix. This then implies that ∂ 2 y is the sum of the eigenvalues of the matrix (7.17). In our language the origin of the sought after divergent terms is from a more careful study of the dOPE replacement when computing the modular Hamiltonian via the replica trick: where previously we assumed β i can be calculated by making this same replacement on a flat defect in vacuum ∂A 0 in the presence of a defect operator O j . One might worry that this does not correctly capture the shape dependence of the surface ∂A, since the flat defect is a rather brutal replacement. We initially did not worry about this since shape deformations away from the flat defect are achieved via the displacement operator and its derivatives which would appear in this expansion via the defect operator insertions O j .
However we now show that this replacement does not capture correctly the local geometric terms defined at the point y = 0 -that is the extrinsic curvature and a finite number of y derivatives thereof. So while it does capture the non-local shape dependence -i.e. the behavior of ∂A far from y = 0 we need to work harder to account for the local shape dependence.
The way to fix this problem is to realize that β i is still sensitive to local geometric quantities at the point y = 0. Thus we propose to think of the β i ≡ β i (K ± (0), ∂K ± (0), . . .) as a function of K ± and its derivatives at that point. Note the extrinsic curvature is the only geometric quantity of interest in Minkowski space, however in curved space we would have dependence on the various local curvature invariants. We leave a complete study of this to future work.
Computing the β i is now more involved. One must do the replacement on a curved defect that has the same extrinsic curvature and derivatives thereof locally at y = 0. Note that the replacement is still made in vacuum which then will make it possible to compute β i .
The case with $K^+ = 0$ and $K^- \neq 0$
Let us address the simplest case where $X^+_A \approx 0$ $^{21}$ around the point $y = 0$, so we are free to replace the entangling cut with one lying along a null cut of the Rindler horizon: $u = \tilde{X}^-_A(y)$, where we choose this function such that $\tilde{X}^-_A \approx X^-_A$ around $y = 0$. Actually in this case the replacement yields a surface where we know the exact modular Hamiltonian: We discuss this in Appendix A. This was also proven in [34] using similar methods. At the order of interest the only effect of $\tilde{X}^-_A(y)$ that we care about is on the OPE coefficient for the defect unit operator $\beta_1$. After taking the $n \to 1$ limit on (7.18), this gives rise to a shift in the leading order term for the modular Hamiltonian matrix elements: $^{22}$ Thus when we compute the function $R$ defined in (4.31), modular flow and all the related quantities, we must replace the leading terms with modular flow using this new zeroth order modular Hamiltonian, $K^0\{\tilde{X}^-_A\}$, and also $K^0\{\tilde{X}^-_B\}$ appropriate for the other surface $\partial B$. This necessitates two changes to our functions $f(s) = g(s)/g_0(s)$ and the subtraction terms $h_{A,B}(s)$ in (6.17). First, since the new leading order term is different it makes sense to redefine $f(s)$, where we pick the denominator $g_0(s)$ to involve vacuum modular flow for the new null deformed cuts $\tilde{X}^-_{A,B}$. We should similarly do this for the $h_{A,B}(s)$ denominators. This procedure then removes the potentially offending terms due to the extrinsic curvature $K^-$. It means that in the light cone expansion the leading contribution to each of these functions is 1 and the next contribution is $O(z^{d-2})$.
21 We take the notation $\approx 0$ in this section to denote that $X^+_A$ and a finite number of its derivatives at $y = 0$ vanish. The exact number we would require depends on the spacetime dimensionality, since we only must ensure these terms are sub-leading compared to the QNEC term appearing in $f(s)$.
For example the new denominator of f (s) is: where we have used the more general algebra that these modular operators (2.16) satisfy.
Here U is a generalized null translation operator, which roughly speaking, translates each null generator along the Rindler horizon by a non-uniform amount. It can be written as an exponentiation of the ANEC operator: After we do this replacement with the null deformed modular Hamiltonians we encounter a new issue. We have now potentially contaminated the Cauchy-Schwarz (CS) bound in (6.20) in the light cone limit at some lower order in the z expansion. It turns out to fix this we need the leading contributions to the denominators of (6.17) to match the denominators in (6.15). Note that the numerators were designed to match after computing the CS bound. The denominators matched in the previous calculation after imposing the relations in (6.18) for the location of the operator insertions. Now that these denominator terms involve non-geometric modular flow (albeit still for the vacuum state) this is seemingly much harder to arrange.
So the second change to f (s) we must make is as follows. In fact we can fix the above issue by applying the small s-dependent shifts to the operator insertion by using vacuum modular flow itself, rather than just moving the operators by hand. That is consider where U a was defined just above and we allow for a complex. These are exactly the shifts predicted in [52]. Note that for uniform entangling cuts ∂A, ∂B these flows are geometric and the shifts given above are the same as the shifts in (5.19) (up to simple translations by δx − /2 that we could absorb into u B , uĀ.) For non-uniform cuts these new operators become non-local, although they are approximately local which is sufficient for our computations in f (s) etc. One can then check, using the algebra of half-sided modular inclusions discussed in Appendix A, that with this replacement the denominators in (6.17) and (6.15) are all s independent (as they were previously.) We finally need them to all be equal. This was achieved previously via (6.18) but here we must again use vacuum modular flows: Note that U 1 = U (1−e −s )/2 | s=iπ is a generalized translation that provably sends OĀ to an operator in the algebra A B and then the conjugation operators J 0 B sends this to an operator O B in A B . This relation then replaces (6.18) for this more general case, and it is possible to check that our new choice reduces to (6.18) for uniform cuts (again up to some simple δx − /2 shifts.) One can also now check that the denominators in f (s) and h A,B (s) agree and are s independent.
We can run the QNEC proof using the new definitions of these operators (7.24-7.25) and with the appropriate replacement of the vacuum modular flow by the vacuum modular flow for the deformed null cuts. Note that the issues we deal with above only afflict the leading terms in the light-cone limit. They give sub-leading corrections to the $z^{d-2}$ terms where the QNEC lives, and so these terms will agree with previous considerations.
We finally need to revisit analyticity of $f(s)$ and $F(s)$ in (6.19). We have already seen that the denominators, appropriately defined, are $s$ independent, so we may drop them. We are left to worry about the dependence on $s$ in the numerator, especially through the operators $U_a$. Consider the following numerator for $0 \le \mathrm{Im}\, s \le \pi$: The basic question is: can we give an analytic continuation of into the strip $0 \le \mathrm{Im}\, s \le \pi$, and similarly for the bra in (7.26)? If we can, then the resulting function has the desired analyticity. A similar discussion is necessary for $-\pi \le \mathrm{Im}\, s \le 0$ with a different function; however, we will not spell this out. We addressed the issues of analyticity previously in Section 6.1, where we had to worry about the $s$ dependence in the operator locations spoiling analyticity. We can give a similar discussion here. At large $s$ we can expand: where $P$ was defined in (7.23), and for large $s$ this expansion should converge. For sufficiently local operators these commutators are well defined and result in an operator that is still inside the algebra $\mathcal{A}_A$. Thus we can analytically continue these terms into the strip using Tomita-Takesaki theory applied to the state dependent flow $e^{isK_B}$. Likely there is a more rigorous proof of this using ideas similar to [52].
We also need to consider the numerator of h A (s) and h B (s). There is a similar discussion for these. For example the numerator of h B (s) can now be written as: and we can apply a similar expansion as in (7.28) to convince ourselves of analyticity for large s.
In conclusion all elements of the QNEC proof (computability, CS bound and analyticity) work for two entangling cuts that become null cuts of the same Rindler horizon close to y = 0. This seems to be a general lesson -when we can replace the defect locally with one where we know the modular Hamiltonian as a local integral over the stress tensor, then the QNEC result applies.
Note that the appropriate quantities in the QNEC $P_-$ should now be defined using the subtracted EE for the null cut $\partial\tilde{A}$ determined by $\tilde{X}^-$:
The case with $K^+ \neq 0$
We now turn to $X^+ \neq 0$, where we will discover terms that render the QNEC inapplicable. We consider only the case where also $X^- \approx 0$, leaving potential cross terms between the two extrinsic curvatures to future work. There is no obstruction to studying these cross terms; they just require a little more work. We now consider a new replacement surface: which agrees with the exact cut, $\tilde{X}^+_A(y) \approx X^+_A(y)$, for a finite number of $y$ derivatives around $y = 0$. We additionally require that $\tilde{X}^+_A(y)$ smoothly match onto a flat defect with $\tilde{X}^+_A(y) = 0$ for larger |y|> y . We still place a local defect operator $O_j$ elsewhere on the defect along $\partial A$, but in the region where $\tilde{X}^+_A(y) = 0$. Since we make this replacement in vacuum we can now use the results in [18,88] to compute shape deformations of the flat defect. This should allow us to express $\beta_i$ as an expansion in extrinsic curvatures and their derivatives. We should also find that the non-local dependence on $\tilde{X}^+_A$, which is arbitrarily chosen, drops out.
We illustrate this with an example, where we account for the linear in $K^+$ terms, and derivatives thereof, appearing in $\beta_1$, the defect identity operator coefficient. As $n \to 1$ the shape deformed defect will now change the three point function term which we computed around (4.21): (7.32) where we have expanded the shape deformation to linear order using the displacement operator. Recall that this is the defect operator D + ≡ −2π T −1 . We know that the displacement operator inserted in a correlator like this simply gives $2\pi(n-1) \times$ the half ANEC integral as we send $n \to 1$ [85]. The contribution to the full modular Hamiltonian can be found after applying $(-\partial_n)/(2\pi)$ and extending the half ANEC integral to the full ANEC integral: and where we have used lim n→1 Σ n 1 = 1 and G 11 = 1. In order to compute this ANEC integral we will use the following three point function: (7.34) where we have kept higher order terms $v_B u_B$ and $v_{\bar{A}} u_{\bar{A}}$ that we dropped previously in (4.25). These will be important to keep here to regulate some divergences that arise as we send these to zero. After some computation we can write the answer as: and where $c_{d-1} = (4/\pi)^{h-1/2}\, \Gamma(h + 1/2)/(d - 1)$. We see that this will contribute local geometric terms if we expand as $z^2 \to 0$: Note that the first term will not contribute because we have chosen $X^+_A(0) = 0$. When we integrate this against the profile we find the expected extrinsic curvature dependence: (7.37) where $K^+_A$ is the trace of the extrinsic curvature. However this story is not yet complete: the answer we have so far depends on the full function $\tilde{X}^+_A(y)$, which was arbitrarily chosen except for its local behavior around $y = 0$. This is because of the last term in (7.37), which has a non-local dependence on $\tilde{X}^+_A(y)$. This term is taken care of by the EE subtraction in (4.15), which is now a subtraction using the deformed replacement cut determined by the profile $\tilde{X}^+_A(y)$.
We call this term $S_{EE}(\tilde{A}, |\Omega\rangle) = S^\Omega_{EE}\{\tilde{X}^+_A\}$ and note that any reasonable regulator for EE will yield a term which cancels the non-local term in (7.37).
To make this clear, let us collect all the terms that appear in R (4.31). We can write this suggestively as: where we are suppressing the stress tensor dependence since it is the same as before. Here we have defined the "Ryu-Takayanagi" profile:

(7.39) This profile has several interesting features. First, it only depends on $X^+_A$ locally at the point $y = 0$. As expected one can show the non-local part cancels appropriately. This is because if we expand $\frac{\delta}{\delta x^-(y)} S^\Omega_{EE}\{\tilde{X}^+_A\}$ in $\tilde{X}^+_A$ we will always find a universal cutoff independent piece, which is the second order shape deformation of the vacuum entanglement entropy (sometimes called the "entanglement density" [101]). This was studied in depth in [88], where one can check that the CFT entanglement density exactly cancels the term in (7.37) that was troubling.
Secondly the cutoff dependence used to define S EE should be absent as we remove the cutoff. The natural regulator for entanglement in our computation is a vacuum subtraction for which we don't expect state dependent divergences to arise as explained in Footnote 13. This is to be expected since we are ultimately computing a UV finite quantity (R and f (s)).
Lastly this is exactly the profile of the RT surface that one finds by linearizing the surface equations of motion near the boundary of AdS in a holographic CFT. We linearize near the boundary but allow for a totally general bulk. The linearization is necessary because we only include linear terms in X + A in our analysis. The linear in X + A terms are important for the leading terms in the z expansion however at higher orders in the z expansion we expect to see non-linear dependence on X + A . This dependence is in principle computable using this approach and we expect agreement with the non-linear RT functional, perhaps supplemented with the appropriate higher derivative corrections. We leave checking this for future studies.
We now go back and compute f (s) with this new R (7.38). Tracking through the computation in Section 5 the new terms can be grouped in with the displacement operator terms as they transform in the same way under modular flow. The result has the same form as in (7.39) but where we use a slightly different definition of the z coordinate due to the intervening modular flow: For large s this becomes: Using this z coordinate we have for large s: where we recall that the lightcone expansion is controlled by z ≪ 1. We can now understand why the QNEC might not be satisfied for surfaces with non-vanishing extrinsic curvatures K + . It is because the leading terms that we might constrain using the sum rule actually trivially satisfy the positivity constraint. These terms are the extrinsic curvature terms: for which the sum rule is non-negative by the nesting condition (7.17). Here K + A,B (0) is the trace of the extrinsic curvature. At this point if we demand K + A (0) − K + B (0) = 0 then we might still succeed in proving the QNEC (with the appropriately subtracted S EE 's) in d ≤ 3, since the next leading term is the Q − term.
However we should be cautious here because the leading extrinsic curvature terms in (7.43) actually compete with 1 (that is, z 2 /∆v = (δx − − ∆u)/4 ∼ O(1)), so in some sense they spoil the perturbative expansion altogether. Thus we should only trust this analysis for small K + A,B . Note that even though there is a cancellation for this term if we demand K + A (0) = K + B (0), this cancellation may not be enough to save the breakdown in the perturbative expansion. We hope to clear up this question in the future.
Actually the correct thing to do for the cases where K + A,B (0) are non-zero is to make the dOPE replacement using a spherical defect where the traces of the extrinsic curvature are designed to agree locally with those of ∂A, ∂B. For a CFT the spherical defects have known modular Hamiltonians [76,102] and there is an obvious path to follow to proving a so-called conformal QNEC (see [43] for the original discussion of this in the holographic proof of the QNEC). We leave the conformal QNEC case to future work.
Either way we have succeeded in proving the necessity of certain local conditions the entangling surface must satisfy in order to claim a QNEC. Most conservatively we should demand that K + A,B and enough derivatives vanish at y = 0. Such local conditions have been extensively studied very recently for the curved space QNEC and QFC [45,[103][104][105] and it is obviously interesting to extend our work to that case.

From CFT to QFT
Most of our arguments have in one way or another relied on working with a CFT. We would like to extend this proof to a relativistic QFT found via a relevant deformation +λΦ of the CFT. Since we are working in the lightcone limit these relevant deformations do not play a very important role -this makes sense from the holographic point of view, since the important physics occurs near the boundary of the dual spacetime and in the UV of the QFT, which is then controlled by the CFT fixed point. In our computation we expect the light cone OPE limit is also essentially controlled by the CFT. Thus for example the stress tensor and displacement contribution to R and thus ultimately f (s) will be the same as before. However we have the same issue as in the previous subsection where there might be more leading terms in the z expansion of f (s) due to λ dependent effects. The analogous effect in AdS/CFT arises via a Fefferman-Graham expansion of the metric and the RT surface [43], which are also sensitive to λ.
Again we expect this to arise in our computations because the coupling λ may appear in the dOPE coefficient for the unit operator β 1 (λ, λ 2 , . . .). Only polynomial powers should appear and the analogy with the extrinsic curvature terms is strengthened by taking spacetime dependent couplings λ(x) such that β 1 can depend locally on λ(0) and its spacetime derivatives. We can deal with these new terms using the same idea as above for dealing with K − terms. We can simply use the vacuum modular flow for the deformed theory. This still works because these still have known modular Hamiltonians that are constrained by the theory of half-sided modular inclusions. We can then run the same argument for the QNEC in this case (see the previous subsection on the K − extrinsic curvature contributions for all the details.) For surfaces where K + is not ≈ 0 then the above argument does not work and we would need to combine the relevant deformation with the X + shape deformation in perturbation theory. This should be doable, and we basically expect to reproduce any terms one might expect to see in a Fefferman-Graham expansion in this way. We again leave the complete discussion to the future, where likely it would be nice to have a more systematic way to study all of these effects at one time (relevant operator deformations, extrinsic curvature deformations and even metric deformations.)

Higher spin versions of the QNEC
It is easy to extend the derivation of QNEC to the case of the higher-spin symmetric traceless operator J −...− of conformal dimension ∆ J and even spin J, where the twist τ J = ∆ J − J is the minimum among operators of the same spin. Previously we found that in this case there is a family of displacement operators D_ℓ, 1 ≤ ℓ ≤ J − 1, emerging at order O (n − 1). Again we can compute R of (5.5) which we can then use to compute f (s). We leave the details of these functions to Appendix F. The result for f is analogous to (5.26) in a somewhat obvious extension, see (F.8).
In the limit e^s ≫ 1, the null integral is dominated by the interval 0 < u < δx − , and we have: where the new coupling is: Note that G T = −4πG N ∆ O /d for the stress tensor. The sign of G J for the other operators is ambiguous if they don't correspond to some conserved currents since we can send J → −J . However, as we will see, G J Q J does have a definite sign. The new ingredients above are the one point functions of the higher spin displacement operators. Again these are only non-zero at order (n − 1) where we bring down a factor of the (half) modular Hamiltonian, and for this reason they correspond to some object in the QFT which is non-linear in the state. More specifically the various displacement operators appear as singular terms when we take J −...− close to the modular Hamiltonian: After an application of the first law of entanglement one can interpret these as the variational response of the EE to a deformation with respect to the higher spin field +µ ν 1 ...ν J J ν 1 ...ν J and picking a particular profile for the µ close to the entangling surface. However since J is not a conserved current it is hard to make a precise statement here. Note that in the large s limit (7.44) we only find a contribution from the highest spin displacement operator ℓ = J − 1.
To extract a sum rule we place the operators symmetrically (v B = −vĀ and u B = −uĀ + δx − ) and we use the definitions for σ, z in (6.22) and (6.23): We can now obtain a sum rule by extracting the higher-order pole σ 1−J using a new version of the projection (6.29). The analytic properties of F (s) then force out the following constraint: This is now a higher spin version of the QNEC. If we integrate this up, by taking δx − to infinity, we recover the higher spin version of the ANEC first studied in [19]. The QNEC is a more local version and indeed gives a local bound on the expectation value of a higher spin field in any state: where x − λ (y) parameterizes a small null deformation of the entangling surface A λ and with dx − (0)/dλ > 0. It would be interesting to give a gravitational/stringy interpretation/analog of this bound. It would also be interesting to study this in free theories extending the proof of the QNEC for free QFTs [24]. 23

Discussion
In this paper we have found a way to reconstruct the Ryu-Takayanagi entangling surface in a putative dual gravitational theory in any interacting QFT. The reconstruction happens near the boundary of the space where from the outset one might have expected to make progress using an OPE argument. We found that the correct argument involves working with entanglement in the Replica trick and studying the spectrum of defect operators localized on the d − 2 entangling surface twist defect. An essential ingredient included the introduction of probe operators (any operators are ok) which can be made to probe the boundary spacetime in a precise way. With this setup we studied the modular Hamiltonian evaluated between matrix elements of the probe operator, which we then bootstrapped into a study of modular flow. The profile of the RT surface appeared due to a shift in the action of the modular Hamiltonian on the probe operators. Analyticity of modular flow was related to causality which we then used to constrain the sign of this shift thus proving the quantum null energy condition, which was the original goal of this study.
The exact surface that we reconstructed should likely be compared to the quantum corrected extremal surface advocated in [51] which was recently proven, in the context of theories with a gravity dual, to compute the entanglement entropy of the dual QFT in [94]. One piece of evidence for this comes from studying mixed states, where the entropy of complementary regions is not the same. This means there will be two entangling surfaces, one for A and one forĀ, which is indeed what we found.
We should remark that our proof of the QNEC should be considered as a proof strategy that can be tailored to various situations depending on the details of the entangling surface, the space that the QFT lives on, and any potential relevant operator deformations involved. In this paper we worked exclusively in flat space allowing for uniform relevant deformations, although it is possible to generalize to curved space etc. We expect the main difference in this case comes from studying the local-geometric contributions to the defect OPE coefficients β i . We sketched how this works when the entangling surface has extrinsic curvature in Section 7.
Our strongest statements could be made for entangling surfaces that approach nonuniform null cuts of a Rindler horizon, however we managed to also include some leading order effects due to non-stationarity when the extrinsic curvature K + did not vanish, although these terms disrupted the QNEC proof in a controllable manner. Understanding exactly when local geometric terms might disrupt a statement of the QNEC is an important avenue for future study, especially in curved space. Recently there have been several approaches to studying this problem [45,103]. One is to study holographic theories and examine the causality of EWN when the boundary theory lives on a curved metric. The second is to assume the Quantum Focusing Condition and check what constraints must be imposed in order to derive a QNEC in the semi-classical limit. Lastly the condition that the QNEC itself must be a UV finite quantity in order for the bound to make sense, puts similar constraints on the background about which one might prove the QNEC. Since we now have a general proof strategy we should be able to find the general set of conditions for any interacting QFT. Our expectation however is that our results will be in line with those already known due to the holographic proof and so one might not expect to learn anything new here. It is still worth pursuing of course.
We now mention some other more speculative avenues for future pursuit.

Beyond the lightcone limit
It would be interesting to push this computation beyond the lightcone limit. In order to have some control we would need to work with a QFT with a large-N limit and a gap in anomalous dimensions to the single trace higher spin fields [106]. This would involve moving to very large modular flow, still maintaining s ≪ ln N, where we move into a controlled Regge-like regime where we expect to reproduce the bulk physics of a theory with a gravitational dual [47,[107][108][109]. It is not clear what physics we should look for; presumably it should be related to the causal structure of the entanglement wedges but now deep in the bulk. For example this might be a way to give a proof of the entanglement wedge nesting from a purely boundary point of view and potentially beyond the semi-classical limit. There are several challenges here. For example one needs to control the Regge limit at the same time as potentially higher order corrections to modular perturbation theory. We think that modular perturbation theory can likely be controlled by working, as we did in Appendix D, with (double) modular flow directly in the replica trick. It would also be important to figure out the role of double trace operators and their "displacement operator" contributions, for which we have very little understanding right now. There have been many recent advances in studying this limit for the ANEC version of this problem [99,[110][111][112] and likely we should make use of this new technology.

Meaning of higher spin displacement operators
It is natural to wonder about the physics of the higher spin displacement operators. Recall that these only arise out of (symmetric traceless) operators with spin ≥ 2. Their origin suggests they should be interpreted as new fields living on the RT surface, possibly related to higher spin fields or stringy states -although the double trace versions muddy this possible interpretation. Ignoring the double trace operators for now, one might speculate that these correspond to new modes on the RT surface which typically have a large mass in theories with gravitational duals for the usual reasons that we expect a gap in anomalous dimensions to the stringy states. Yet they could be important for understanding EE more generally, for example in CFTs dual to Vasiliev gravity [113]. They also might have some relation to higher spin entanglement studied in 2d CFT using the various 3d versions of higher spin theories [114,115].
For the double trace versions of the displacement operators it is natural to speculate that these are related to the bulk entanglement contribution to the boundary entropy [116], although the details of this are not clear to us right now.

A new regulator for entanglement entropy
We have several expressions now that relate the modular evolution of probe operators to the EE of the underlying state ψ. We might then invert these relations to give an independent definition of EE. EE typically suffers from UV divergences and is hence not a good continuum object -this is related to the fact that the algebra of a region in QFT is a type III von Neumann algebra which does not have a trace. Thus it is important to find a natural UV regulator for this problem. We actually have in hand a continuum definition where the usual/expected divergences would be controlled using the kinematics of the lightcone limit. That is, while the modular flow correlators are UV finite, the limit we consider has diverging terms parameterized by z 2 = −∆v(∆u − δx − )/4 as ∆v → 0. Thus the divergences in EE due to local correlations would be the same or similar to those found using the RT functional in holographic theories. There are a few caveats here -firstly we would only ever be able to extract the null shape deformation of EE and often, as we argued above, this is UV finite anyway. In the presence of extrinsic curvature K + , however, this is no longer true and thus in this situation this new regulator would be useful. One possible proposal is: where we have sent the region B far away by taking δx − → ∞ holding u B − δx − fixed and z, σ were defined in (6.23) and (6.22) and R satisfies z R 1. Note we could have used single modular flow here instead. We use double modular flow so we can use the more developed formulas for that case. Note that the important thing here is to carefully pick the entangling region A 0 which is used in the denominator for f , which should have a computable modular flow and should come close to the entangling surface ∂A.
The various local geometric terms that we discovered in Section 7.2 would then be dropped since they are anyway local to the entangling surface and so could be removed in another regulator using appropriate counter terms. This is to be compared and contrasted with the mutual information regulator of [117]. There are several questions that arise now relating to the properties of this putative EE. We know that this quantity is constrained by the QNEC -but does this imply that it satisfies strong subadditivity, and other constraints obeyed by the usual EE? Also can we give a useful definition along these lines 24 for a non-relativistic QM system, that in this case reduces to the usual definition of EE?
where the space-time region D(B 0 ) also determines the algebra of operators which thus also has this inclusion property. Under this condition it has been shown that the modular Hamiltonians satisfy an algebra, which is the same as the obvious algebra that would have applied if B 0 was also a uniform Rindler cut. That is (2.12) which we reproduce here: Some of the ingredients that go into a proof of this were sketched in Section 2.2. We will just take this as an input. The other input will be the perturbative results of [18] which showed that to first order in the shape deformation X − B one can show that: This result was originally proven for arbitrary non-timelike shape deformations (which then includes an additional − dx + X + T ++ term), and in this case we expect the higher order corrections to be non-trivial. However it was reasonable to guess that for null deformations the perturbative series truncated. Here we show this by applying the algebra (A.2). We note that the geometric action in (A.1) implies that we can write this algebra as a differential equation: This is a trivial operator/matrix differential equation (i.e. take matrix elements of both sides) with solution: Taking λ small we can fix M via (A.3) and the truncation of (A.3) follows.
Finally we need to show that this algebra (suitably generalized in (2.15)) also applies when both region A 0 and B 0 correspond to non-uniform null cuts of the same Rindler horizon. We simply compute: where here K 0 is the undeformed Rindler modular Hamiltonian we previously called K 0 A . The first line (A.6), after using (A.2), gives us the sought after algebra that we quoted in (2.15). So we just have to show that the second line of (A.6) vanishes. Here the ANEC operator is: Two such operators commute when y 1 ≠ y 2 since the null generators are always spacelike separated. When they lie on top of each other they commute because they are the same operator. This argument is not really justified since there could be singularities that invalidate these statements. See [34] for many different approaches to deriving this algebra in the more general case.

B More details on defect operator spectrum
In this appendix, we compute the scaling dimensions of some defect operators that appear in the dOPE of scalars and of the stress tensor. In holographic CFTs, we compute them for arbitrary values of the Renyi index n. Further, in the n → 1 limit, we provide, up to O(n − 1), the values of these scaling dimensions for the "minimal twist" defect operators in the dOPE of scalars, valid for any CFT. We do this analytic computation first, then we turn to the holographic case.

B.1 Analytic considerations
We would like to compute more explicitly the leading (n − 1) shift in the ambient space scalar two point function. From this we can extract the shifts in the conformal dimensions of the defect primaries. We need to analyze the two terms (3.20). As explained in the main text the first term is actually the stress tensor conformal block with 4 external scalars. We can thus look up the answer. We can also just do the integral over the stress tensor which defines the modular Hamiltonian. Either way it is possible to reduce this block to a single integral representation which looks very similar to the second term in (3.20). Combining these we have: The fact that this result (B.1) is single valued as one moves one operator around the entangling surface in Euclidean (z → e^{2πi} z, z̄ → e^{−2πi} z̄) becomes the statement that the residue at λ = 1 vanishes, which can be easily checked. We need to analyze the limit z,w → 0. After setting this to zero we find a log divergence coming from the lower λ ≈ 0 integral. This divergence should thus be cut off at λ ≈ zw/X 2 ≈ zw/y 2 as the lightcone limit was not uniform in λ. The coefficient of the log is easily calculable: There are other log terms coming from the upper limit of the integral λ → −∞, but these always give ln(wz/y 2 ) terms; there are several sources of such terms. We don't actually need to do the computation however, since we know the coefficient of the ln(wz/y 2 ) term should be the same as derived from (B.3) so that they sum up to a single valued function on the Euclidean section ∝ ln(w w̄ z z̄). There will also be non-log terms which we do not keep track of. We can combine these into the claimed result in the main text (3.22) where the term multiplying the log in (B.3) becomes (3.23).
In d = 4, we find the following O(n − 1) correction to the scaling dimension Δ̂ of the defect operator with transverse spin l that appears in the dOPE of an ambient scalar of dimension ∆: We compare this result with that of the holographic computation outlined in the next section of this Appendix, in Fig. 9.

B.2 Holographic computation: scalars
We consider a probe scalar field in the hyperbolic black hole background [76] given by While we can perform this computation in arbitrary fixed dimension, we restrict to the case of the 5d hyperbolic black hole B n , which is dual to the twist defect in a 4d holographic CFT. In this case we have where τ ∼ τ + 2πn and n is the Renyi index of the dual defect CFT. Consider the following ansatz for a scalar field of mass µ in this background: φ(r, τ, ρ, y) = ρ^{Δ̂} e^{ilτ} ψ(r), where Δ̂ ∈ R has the interpretation of a scaling dimension and l ∈ Z is the transverse SO(2) spin. Even for non-integer n we still take l ∈ Z, which is justified since we want this operator to be single valued upon shifting one replica τ → τ + 2π. This might seem strange since the ansatz is not compatible with the thermal periodicity of τ for non-integer n, however this is the correct procedure for analytic continuation as pointed out in [116]. Roughly speaking we can think of this as studying φ in a bulk spacetime defined via the quotient with respect to the replica symmetry B n /Z n . This spacetime has a conical deficit at r = r h but is well defined for any value of n and the φ fluctuations on top of this spacetime are now single valued. The analytic continuation procedure also fixes a unique boundary condition at the deficit as we will argue below. Plugging the ansatz into the Klein-Gordon equation this then becomes an ODE in r given by Near the horizon at r = r h , solutions behave as ψ(r) ∼ (r − r h )^{± ln/2}. We must pick the solution that behaves as (r − r h )^{+ ln/2} so that the scalar field is regular near the horizon. Normalizable modes have the property ψ(r) → 0 as r → ∞. Only for specific values of Δ̂ will the solutions be both regular near the horizon and normalizable. These values correspond to the scaling dimensions of defect primary operators that appear in the dOPE of the scalar operator of dimension ∆ = d/2 + √((d/2)^2 + µ^2).
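The near-horizon exponent quoted above can be checked with a short worked estimate. This is only a sketch under stated assumptions that are not spelled out in the surviving text: the background is taken to have the standard form $ds^2 \supset f(r)\,d\tau^2 + dr^2/f(r)$ with a simple zero of $f$ at $r_h$, the surface gravity $\kappa \equiv f'(r_h)$ is fixed by smoothness at the identification $\tau \sim \tau + 2\pi n$, and we take $l \ge 0$. The symbol $\kappa$ is introduced here for illustration only.

```latex
% Assumption: f(r) \approx \kappa (r - r_h) near the horizon, with
% period(\tau) = 4\pi/\kappa = 2\pi n  \Rightarrow  \kappa = 2/n .
% Keeping only the terms of the radial equation that are singular at r_h:
\begin{align}
  \partial_r\!\big( f\, \partial_r \psi \big) - \frac{l^2}{f}\,\psi &\approx 0 ,
  \qquad f \approx \kappa\,(r - r_h) , \\
  \psi \sim (r - r_h)^{\alpha}
  \;\;&\Rightarrow\;\;
  \kappa\,\alpha^2 - \frac{l^2}{\kappa} = 0
  \;\;\Rightarrow\;\;
  \alpha = \pm \frac{l}{\kappa} = \pm \frac{l\,n}{2} .
\end{align}
```

At $n = 1$ the regular branch gives $\psi \sim (r - r_h)^{l/2}$, which is the expected behavior $\varrho^{\,l} e^{il\tau}$ of a smooth mode in terms of the proper radial distance $\varrho \propto \sqrt{r - r_h}$.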
The numerical procedure resulting from the above discussion is to solve the differential equation (B.8) by specifying boundary conditions at r = r h . We then check if the solution has the property ψ(r) → 0 as r → ∞. Proceeding this way, we see that the scaling dimensions of the defect operators at n = 1 are given by Δ̂ = ∆ + l + 2k, k = 0, 1, 2, . . . (B.9) These scaling dimensions at n = 1 can also be obtained by expanding the usual CFT 2 point function of scalars in the absence of a defect (using the appropriate conformal blocks since the holographic method singles out primary operators). The scaling dimensions will receive O(n − 1) corrections, and above we were able to compute these O(n − 1) corrections for the twist defect in arbitrary CFTs for a subset of defect operators which satisfy k = 0, the "minimal twist" operators. The numerical procedure also gives us the values of Δ̂ for arbitrary values of n, and we have plotted Δ̂ vs n in a few examples in Figure 10.

B.3 Holographic computation: stress tensor
The computation in this section is a generalization of [85] (see also [118,119]). [85] focused on the displacement operator, which is a defect operator that appears in the dOPE of stress tensor for n = 1 and has a protected scaling dimension of ∆ disp = d − 1. Following [85], where the functions X τ (r) and X ρ (r) satisfy We started with 8 first order degrees of freedom, namely k τ τ , k τ ρ , k ρρ , k yy and their first derivatives in r. Three of the components of Einstein's equations are first order constraint equations. Further there are 3 gauge transformations. This leaves us with 2 residual first order degrees of freedom. In fact, in this case, we can construct a gauge invariant linear combination, where B = Δ̂ 2 (1−3r 2 +f )+2r 2 l 2 , so that σ and σ′ are the two gauge invariant, independent first order degrees of freedom we seek. Thus, σ obeys a second order differential equation, which we can write (schematically) as, This procedure for finding (B.20) was inspired by the similar fluctuation computation appearing in [120,121]. The normalizability condition is that the r → ∞ behavior of the fluctuation must be pure gauge [85]. In terms of σ, this reduces to the condition lim r→∞ σ = 0. (B.21) Near the horizon, solutions to (B.20) behave as σ(r) ∼ (r − r h )^{± ln/2}. Demanding regularity near the horizon leads us to the choice σ ∼ (r − r h )^{+ ln/2}. The numerical prescription then involves solving the differential equation (B.20) with regular boundary condition at the horizon. We then seek values of (Δ̂, l) such that the solution satisfies the normalizability condition (B.21). These values correspond to the scaling dimension and transverse spin of defect operators that appear in the dOPE of the stress tensor (see Figure 5 in the main text and Figure 11).

C Multi-replica operators
Figure 11. Plots of scaling dimension Δ̂ of defect operators as a function of Renyi index n in the different channels at d = 3. The displacement operator is shown in pink and has a protected scaling dimension. The lowest dimension operator for fixed l is "minimal twist" except for the l = 0 channel.
In this Appendix we explain why we left out the more general multi-replica operators (3.12) when we extracted the defect operator spectrum by moving ambient space orbifold operators close to the defect. If multi-replica operators were to give rise to new defect operators we would discover them in the multi-replica two point functions in the presence of Σ 0 n . However when we bring the multi-replica operators close to the defect, on the branched manifold M n the operators on different replicas in some sense are coming close to each other -this is completely clear in d = 2 where the covering map w = z^n is a conformal transformation which removes the twist defect, leaving the replica operators approaching the origin at the positions z = w^{1/n} e^{2πik/n} for k = 0, . . . , n − 1 with |w| → 0. We can then simply replace these operators by another local operator using the regular CFT OPE such that the multi-replica two point function is a sum over single replica two point functions. Hence nothing new.
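The d = 2 statement above can be made concrete with a small numerical illustration. This is only a sketch: the function name `replica_images` and the sample values of w and n are hypothetical choices made here, not taken from the paper. It lists the n preimages of a point w under the covering map w = z^n and shows that they all sit at the same small radius |w|^{1/n}, so they approach each other (and the origin) as |w| → 0.

```python
import cmath
import math


def replica_images(w, n):
    """Return the n preimages z_k of w under the covering map w = z**n.

    The k-th image is |w|**(1/n) * exp(i*(arg(w) + 2*pi*k)/n), k = 0..n-1.
    """
    modulus, phase = cmath.polar(w)
    root_modulus = modulus ** (1.0 / n)
    return [
        cmath.rect(root_modulus, (phase + 2.0 * math.pi * k) / n)
        for k in range(n)
    ]


# Illustrative values only: a point close to the defect, three replicas.
images = replica_images(1e-6 + 0j, 3)
for k, z in enumerate(images):
    print(k, z, abs(z))
```

All images share the modulus |w|^{1/n} and are equally spaced in angle, which is the sense in which the replica copies of an operator cluster together as the defect is approached.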
In dimensions d > 2 this is quite a bit more subtle because one cannot remove the defect with a conformal transformation -the best one can do is map to the space H d−1 × S 1 with the twist defect operator being sent to the boundary of this space. Now the cluster of operators coming from a single multi-replica operator are located at the same point close to the boundary of hyperbolic space and distributed around the thermal circle at θ k = 2πk for k = 0, . . . , n − 1 and where θ ≡ θ + 2πn. The limit towards the defect will then mean the two clusters of multi-replica operators will be separated very far from each other on the hyperbolic factor. In this limit one might hope to again use the ambient space OPE to replace a cluster of operators by a sum over local operators at some fixed θ, say θ = 0, and claim victory. However we then might worry about convergence of this OPE since these operators have separation of order the curvature scale of H d−1 . This does not cause a huge problem -we can consider deforming each cluster of operators to lie around the thermal circle at θ 0 < θ 1 < . . . < θ n−1 ≪ 1 such that the OPE does converge. This implies that the defect operators we have already discovered (from single replica operators) will give the full answer for these values of θ k . This allows us to compute the answer in a slightly different way: instead of making use of the ambient space OPE we can now use the defect OPE. We directly replace the clusters 25 with the known defect operators, in particular we can make use of the same defect OPE we used in Section 3.2 to re-write the two point correlator schematically as: where i, j sum over the known defect operators. This answer must agree with the ambient space OPE of the paragraph above.
However note that the bulk to defect correlators in (C.1) can be extracted from the appropriate projections onto the known defect operator spectrum acting on one operator inside a n plus 1 point correlation function in the presence of Σ 0 n (or rather on the branched covering M n ). These correlators, and hence their projection, are well defined without requiring a convergent OPE sum. Additionally there is no obstruction to using the expression (C.1) for any values of θ k which can now be continued to the required values of θ k = 2πk without worrying about convergence. We have effectively re-summed the OPE (with unknown but potentially knowable functions) which then implies that this is the full answer and we have not missed any defect operators in writing (C.1). Thus we again conclude that no new operators arise.

D Modular flow from the replica trick
In this Appendix we would like to compute the modular flow correlator: h(s) = ⟨ψ| O B e^{iKs} OĀ |ψ⟩ (D.1) using a different method to the main text. In the main text (Section 5) we used a perturbative expansion, which was at the time not fully justified. Here we will find the same results using a more complete method, thus closing the gap on our computation of single modular flow. The idea is to compute h(s) directly in the replica trick in combination with the defect OPE. Consider: which is well defined for n, p ∈ Z and 0 ≤ p < n. If we can find an analytic continuation from these integers to real n and complex p maintaining the thermal periodicity 0 ≤ Re p < n, 26 then we can send n → 1 and p → is/(2π) to recover: h(s) = lim n→1 Z n,p | p=is/(2π) (D.3) Of course the trick is finding the (natural) analytic continuation which agrees on the integers and has nice properties for large n, p. Indeed this latter requirement makes it clear that we should search for an analytic continuation first in n and only then in p. This is because for fixed integer n we only have a finite number of p = 0, . . . , n − 1's to work with and there is certainly no unique analytic continuation of a function from a finite number of values. For example once we have Z n,p for n ∈ R >0 and p ∈ Z with 0 < p < n then we can consider Z p (m) ≡ Z m+p,p defined for p ∈ Z and p = 0, 1, . . . , ∞ at fixed m > 0. We then seek an analytic continuation of this Z p (m) to complex p. 27 Keeping this in mind, we proceed. We can imagine computing Z n,p for integral values of n, p using the same defect OPE method we used in the main text. We can write this as an orbifold correlator: Z n,p = ⟨Σ n O B ( p ) OĀ⟩ CFT n /Z n (D.4) where the notation ( p ) means to move the operator O B around the defect p times. Again we could write this using an attached Wilson line that circles the defect p times. Since this operator is still local to the defect as we zoom out we expect the defect OPE method to still apply. We replace the bi-local operator with a sum over defect operators Σ i β i (p, n) O i . The only difference from our computations in Section 4 comes from the three point function terms which are used to compute the OPE coefficients β i (p, n). That is: where we have pushed the integration contour to surround the four branch cuts arising when T ++ hits the (two) lightcones of each operator O (at a fixed y separation). The contour C X encircles the X operator branch cuts that lie on different replicas. These are approximately at the locations: λ = −y 2 /uĀ, λ = vĀ z z̄/y 2 and λ = −e −i2πp y 2 /u B , λ = e −i2πp v B z z̄/y 2 .
25 In the orbifold theory the clusters at general θ k should now be thought of as non-local operators with appropriate Wilson lines attached. We will suppress this detail here as it does not affect the defect OPE argument, as was the case for O B OĀ.
At this point we would like to argue for an analytic continuation. The three point function is well defined for any $n$, and can be thought of as a three point function of the CFT on hyperbolic space $\times\, S^1$ with inverse temperature $n$. Thus, holding $p$ fixed and integer, we have achieved the first required step. Note that we do not know this three point function in general except at $n = 1$. We could however compute it, for example, using a holographic CFT, but we will only need the answer at $n = 1$. Now it should be possible to continue $p$. Firstly, note that it is important that the $\lambda$ integral is not moved around by the analytic continuation. This is so that the $p \to is$ continuation has the desired analyticity properties, which would not be the case if $\lambda$ could be forced to move onto the pole at $\lambda = z$. This means we should continue the various contour integrals wrapping the branch cuts, $C_B$ and $C_{\bar A}$, differently for each term. The $C_{\bar A}$ contour integral should be left as is, while for the contour $C_B$ we should apply a $2\pi p$ rotation to move this integral to the "first sheet" (in other words, we are simply relabeling the replicas using a shift by $p$). This operation effectively moves $\mathcal{O}_{\bar A}$ to the $p$'th sheet. Note that for integer $p$ none of these operations have an effect on the answer. Now continuing $p$ to non-integer values is simple: we rotate either $\mathcal{O}_{\bar A}$ or $\mathcal{O}_B$ by an amount $e^{\mp 2\pi i p} \to e^{\pm s}$ respectively. It is important to note that we have arranged things so that we only rotate the operator that is not surrounded by the $\lambda$ contour integral.
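As a quick check of the phases in this continuation, the substitution $p = is/(2\pi)$ can be carried out explicitly. The following is only a sketch of the arithmetic; it uses no input beyond the rotation factors $e^{\mp 2\pi i p}$ quoted above.

```latex
\begin{align}
  e^{-2\pi i p}\Big|_{p = is/(2\pi)}
    &= \exp\!\left(-2\pi i \cdot \frac{i s}{2\pi}\right) = e^{s}, \\
  e^{+2\pi i p}\Big|_{p = is/(2\pi)}
    &= \exp\!\left(+2\pi i \cdot \frac{i s}{2\pi}\right) = e^{-s}, \\
  e^{\mp 2\pi i p}\Big|_{p \in \mathbb{Z}}
    &= 1 .
\end{align}
```

The last line restates that for integer $p$ the rotation acts trivially, while for real $s$ the continued factor is a real rescaling, consistent with reading the continued rotation as a boost by $s$.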
At this stage we can send $n \to 1$ and plug in the flat space CFT three point function. Using this we can then check that the branch cut contribution for small $\lambda \sim z\bar z$ vanishes as we send $z \to 0$. Thus we are left with two terms: If we take a derivative $\partial_s$ and set $s = 0$ then we re-derive the three point function results from Section 4 (this should be compared with $C^{(1)} + C^{(2)}$ from that section). Taking higher derivatives gives all the nested commutators. Working at finite $s$ it is now a simple task to use the integral representations (4.26) and (4.29) to sum everything up to (5.14), which was the result of the perturbative computation that we had wished to put on firmer ground.
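The statement that derivatives in $s$ generate nested commutators can be made explicit with a short formal expansion of (D.1). This is a sketch which assumes that $K$ is the full modular Hamiltonian of $\psi$, so that $K|\psi\rangle = 0$; that property is not restated in this appendix and is taken here as an assumption, and convergence of the series is not addressed.

```latex
\begin{align}
h(s) &= \langle\psi|\, \mathcal{O}_B\, e^{iKs}\, \mathcal{O}_{\bar A}\, |\psi\rangle
      = \sum_{k=0}^{\infty} \frac{(is)^k}{k!}\,
        \langle\psi|\, \mathcal{O}_B\, K^k\, \mathcal{O}_{\bar A}\, |\psi\rangle , \\
K^k\, \mathcal{O}_{\bar A}\, |\psi\rangle
     &= \underbrace{[K,[K,\cdots[K}_{k\ \text{times}},
        \mathcal{O}_{\bar A}]\cdots]]\, |\psi\rangle
        \qquad \text{(using } K|\psi\rangle = 0\text{)}, \\
\partial_s h(s)\big|_{s=0}
     &= i\, \langle\psi|\, \mathcal{O}_B\, [K, \mathcal{O}_{\bar A}]\, |\psi\rangle .
\end{align}
```

The second line follows by induction: each additional $K$ either commutes past the nested commutator or acts on $|\psi\rangle$, and the latter term vanishes.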

E Double modular flow from single modular flow
We start by noting that our formulas for single modular flow imply a very strong statement, (E.1), which holds up to the order that we work at in the light cone limit (i.e. up to and including terms $(uv)^{\tau/2}$, for $\tau = d - 2$ the twist of the stress tensor). More explicitly, we can show that $\eta_A \ll (uv)^{\tau/2}$.$^{28}$ Recall that $\mathcal{O}_{\bar A}(s)$ is simply the boosted operator, acting with $K^0_A$.
This result can be shown by expanding out the norm, resulting in four terms which can be computed using our expression for single modular flow (5.16) (when the two operators act in the same wedge). At the order we compute, all the terms cancel. We should take $\mathrm{Im}\, s < 0$ in order to avoid the two operators coming close, which would necessitate smearing. This does not mean that $e^{iK_A s}\, \mathcal{O}_{\bar A} |\psi\rangle \overset{?}{=} \mathcal{O}_{\bar A}(s) |\psi\rangle$, which is clearly false because the norms of these states individually are different. However, this difference in norms might only be seen in (E.1) at higher orders in the light cone expansion (at least before $O((uv)^{\tau})$), which can be explicitly shown using a Cauchy-Schwarz argument.
Let us now define the states $|a\rangle$, $|\alpha\rangle$, $|b\rangle$ and $|\beta\rangle$ as in (E.2)-(E.3), where we use the notation established in Section 5.2 for modular flow with respect to $K^0_A$ and $K^0_B$. In addition to $\eta_A = \big\| |a\rangle - |\alpha\rangle \big\|^2 \ll (uv)^{\tau/2}$ we also have that $\eta_B \equiv \big\| |b\rangle - |\beta\rangle \big\|^2 \ll (uv)^{\tau/2}$, still maintaining $\mathrm{Im}\, s < 0$. Applying Cauchy-Schwarz gives the bound (E.4). But since we only want to compute $f(s) \propto \langle b|a\rangle$ up to order $O((uv)^{\tau/2})$, we can ignore the $(\eta_A \eta_B)^{1/2}$ term and simply set
$$\langle b|a\rangle \approx \langle b|\alpha\rangle + \langle \beta|a\rangle - \langle \beta|\alpha\rangle \tag{E.5}$$
This formula relates double modular flow on the left hand side to single modular flow on the right hand side, so the right hand side can be computed using (5.14). Indeed, combining all the terms we find exactly the same answer as the perturbative method for double modular flow applied in Section 5. We can roughly understand these results as telling us (E.6), where the action of $\Phi |\psi\rangle$ is in the code subspace (see Footnote 17 for the context of this discussion). Schematically, $\Phi |\psi\rangle = \int_{LC} du'\, dv'\, K(u', v')\, \mathcal{O}(u', v')\, |\psi\rangle$ for $LC$ satisfying $-v' \ll 1$ and $u'$ not too large. This kernel should be constrained to give the known answer for the perturbative correction $\langle\psi|\, \mathcal{O}_B\, \Phi\, |\psi\rangle \sim (uv)^{\tau/2}$. We are agnostic to the action of $\Phi_\perp$, except that it should have no overlap with the various light cone states $\langle\psi|\, \mathcal{O}_B$ and the small parameter should satisfy $\epsilon_\perp^2 \ll (uv)^{\tau/2}$. This is a weaker form of the half-sided projection onto the code subspace than was advocated in Footnote 17; however, it is sufficient to sketch why the perturbative expansion methods in Section 5 worked. If the action of $K_A$ were to move us out of the light cone limit, via $\Phi_\perp$, then in computing lightcone matrix elements of $(K_A)^2$ we would need the second action of $K_A$ to move us back, which again can only happen via $\Phi_\perp$. So we get two factors of $\epsilon_\perp$, giving $\epsilon_\perp^2 \ll (uv)^{\tau/2}$, much smaller than the terms we are interested in. This works at higher order in $K_A$ also.
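The step from Cauchy-Schwarz to (E.5) can be spelled out for generic states. This is a sketch that uses only the definitions $\eta_A = \| |a\rangle - |\alpha\rangle \|^2$ and $\eta_B = \| |b\rangle - |\beta\rangle \|^2$ quoted above; it does not depend on the explicit form of the four states.

```latex
\begin{align}
\langle b - \beta \,|\, a - \alpha \rangle
  &= \langle b|a\rangle - \langle b|\alpha\rangle
     - \langle \beta|a\rangle + \langle \beta|\alpha\rangle , \\
\big| \langle b - \beta \,|\, a - \alpha \rangle \big|
  &\le \big\| |b\rangle - |\beta\rangle \big\| \,
       \big\| |a\rangle - |\alpha\rangle \big\|
   = (\eta_A\, \eta_B)^{1/2} , \\
\Rightarrow \quad
\langle b|a\rangle
  &= \langle b|\alpha\rangle + \langle \beta|a\rangle - \langle \beta|\alpha\rangle
     + O\!\big( (\eta_A\, \eta_B)^{1/2} \big) .
\end{align}
```

Since $\eta_A$ and $\eta_B$ are each parametrically smaller than $(uv)^{\tau/2}$, so is their geometric mean, which is why the remainder can be dropped at the order of interest.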

F Higher spin QNEC
In our proof of the QNEC, we computed the lightcone limit of the quantity $R(x_2, x_1; A)$ defined in (5.5), which we then used to compute the single modular flow correlator (5.1) and the double modular flow correlator $f(s)$ in (5.18). Specifically, we computed the contribution to these correlators coming from stress-tensor exchange in the OPE of the probe operators $\mathcal{O}$. As discussed in Section 7.4, it is straightforward to generalize our methods to derive a higher spin version of the QNEC for the symmetric traceless operator $\mathcal{J}_{-\cdots-}$ of conformal dimension $\Delta_J$, even spin $J$, and minimal twist $\tau_J = \Delta_J - J$ among operators of the same spin. This requires computing the contribution to the same set of correlators now coming from $\mathcal{J}$-exchange. In this appendix, we provide some of the details of these computations.
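As a consistency check on the twist defined above, the stress tensor case can be recovered directly. The values $\Delta_T = d$ and $J = 2$ are the standard ones for a conserved stress tensor in a $d$-dimensional CFT; they are supplied here as input and are not taken from this appendix.

```latex
\begin{align}
\tau_J &= \Delta_J - J , \\
\Delta_T = d,\quad J = 2
  \quad &\Rightarrow \quad \tau_T = d - 2 .
\end{align}
```

This matches the value $\tau = d - 2$ quoted for the stress tensor in Appendix E.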
The lightcone limit of $R$ due to $\mathcal{J}$-exchange is written succinctly in terms of the following function: This generalizes the function $F$ defined in (5.6). In particular, for the stress tensor $G_T = -4\pi G_N \Delta_O/d$. Using the notation and conventions of Section 5.1, the $\mathcal{J}$-contribution to $R$ is: This generalizes the stress-tensor contribution found in (5.7). Note that there are now contributions from a family of displacement operators, as discussed around (7.46). With the result for $R$ in hand, we can proceed to the single flow correlator. The stress tensor contribution was given in (5.14). The general $\mathcal{J}$-contribution looks similar: With these ingredients in hand, the next step is to consider the function $F(s)$ in (6.19). Note that $F(s)$ is defined as the double flow correlator $f(s)$ with two single flow correlators subtracted out. Let $\tilde F_J(s)$ denote the lightcone contribution to $F(s)$ coming from $\mathcal{J}$-exchange. The expression for $\tilde F_J(s)$ is obtained by using (F.8) for $f(s)$ and (F.4) for each single flow correlator, evaluated for the appropriate operator coordinates. Rather than writing out the full expression for $\tilde F_J(s)$ here, we will instead discuss an important property of the result.
In particular, as in the case of the stress tensor (see (6.25)), a crucial property is (F.10). This fact, along with the general analytic properties of $F(s)$, is what allows us to extract a higher spin QNEC constraint, as detailed in Section 7.4. The property (F.10) is a consequence of several precise cancellations, as we now discuss. First note that $\tilde F_J(s)$ has an 'integral part' involving piecewise integration in the complex $u$ plane and also a 'displacement part' due to the displacement operator contributions. Let us focus on the integral part first. Fig. 12 is useful for visualizing qualitatively how the cancellations happen. The top figure shows the piecewise nature of the integral part of $\tilde F_J(t + i\pi/2)$: the black double arrows are the contribution from the double flow correlator $f(s)$, the blue open arrows are the contribution from the single flow correlator $h_{\bar A}(s)$, and the red solid arrows are the contribution from the single flow correlator $h_B(s)$. A cancellation happens along the two legs of the figure that have contributions from both double and single flow.$^{29}$ The result is the bottom diagram in Fig. 12. In particular, the net contributions are such that the integral along a leg below the real axis is equal to the negative complex conjugate of the integral along the mirror image leg above the real axis. We depict this by using the same arrows in the upper and lower half-planes. Verifying this property explicitly in the expression for $\tilde F_J(t + i\pi/2)$ requires using reflection positivity, (F.11). Finally, when we sum all the legs, it follows that $\tilde F_J(t + i\pi/2)$ is purely imaginary.$^{30}$ The last step in understanding (F.10) is to note that the left hand side picks out the real part of $\tilde F_J(t + i\pi/2)$, which we have just determined to vanish. This follows because $\tilde F_J(s)$ is real for real $s$ (a consequence of the probe operators $\mathcal{O}$ commuting for modular flow by real $s$), so by the Schwarz reflection principle $\tilde F_J(s^*) = (\tilde F_J(s))^*$. This completes our discussion of the integral part of $\tilde F_J(t + i\pi/2)$.

29 One must use the fact that $F_J(x_2, x_1; u) = (-1)^{J-1} F_J(x_1, x_2; u)$.

30 The contribution along $-\delta x_2^- \le u \le \delta x_2^-$ is purely imaginary because of an overall factor of $e^{s(J-1)}$.

Now we turn to the displacement part of $\tilde F_J(s)$. For arbitrary $s$ and even spin $J$, these contributions combine to give (F.12). From this expression, one can verify (F.10) by keeping track of the overall $s$-dependence.
The first observation is that the $\partial_u^n F_J(x_2, x_1; u)$ appearing in this formula have the following structure for even and odd numbers of derivatives (we are suppressing all $s$-independent prefactors):