All the entropies on the light-cone

We determine the explicit universal form of the entanglement and Renyi entropies, for regions with arbitrary boundary on a null plane or the light-cone. All the entropies are shown to saturate the strong subadditive inequality. This Renyi Markov property implies that the vacuum behaves like a product state. For the null plane, our analysis applies to general quantum field theories, while on the cone it is restricted to conformal field theories. In this case, the construction of the entropies is related to dilaton effective actions in two less dimensions. In particular, the universal logarithmic term in the entanglement entropy arises from a Wess-Zumino anomaly action. We also consider these properties in theories with holographic duals, for which we construct the minimal area surfaces for arbitrary shapes on the light-cone. We recover the Markov property and the universal form of the entropy, and argue that these properties continue to hold upon including stringy and quantum corrections. We end with some remarks on the recently proved entropic $a$-theorem in four spacetime dimensions.


Introduction
Quantum information theory provides powerful techniques to understand nonperturbative aspects of quantum field theory (QFT). One useful approach has been to apply information-theoretic inequalities, such as strong subadditivity or monotonicity of the relative entropy, to QFT. These inequalities give insights into causality and unitarity constraints in relativistic theories, which are often hard to recognize from local observables. Some examples include energy conditions in QFT [1][2][3][4][5][6][7], and proofs of the irreversibility of renormalization group (RG) flows in various dimensions [8][9][10][11][12][13]. Recently, it has become clear that these results can be extended and generalized by taking the null limit. Here one considers the reduced density matrix $\rho_X$ for a region $X$ whose boundary $\gamma$ lies on a null plane or on the light-cone. See Figs. 1 and 2. For these regions, Ref. [11] obtained the modular Hamiltonian, which turns out to be local and given by the Rindler result, ray by ray. See also [14,15]. This surprising result is a consequence of the special geometry and symmetries on the null plane. As a consequence, the entanglement entropy (EE) for general QFTs saturates the strong subadditive (SSA) inequality on the null plane,

$$ S(A) + S(B) - S(A \cap B) - S(A \cup B) = 0 . \qquad (1.1) $$

This is called the Markov property, in analogy with the classical case. For a conformal field theory (CFT), the null plane can be mapped to the light-cone, and then (1.1) holds on the null cone as well. With this result for CFTs, we showed in [12] that for RG flows between UV and IR fixed points, the change $\Delta S(r) = S(r) - S_{\rm CFT_{UV}}(r)$ in the EE for a sphere obeys

$$ r\, \Delta S''(r) - (d-3)\, \Delta S'(r) \le 0 . \qquad (1.2) $$

This leads to a new proof of the a-theorem in four spacetime dimensions, and it also reproduces the proof of [9] for the c-theorem in two dimensions and the F-theorem in three dimensions. In this way, a single formula unifies all known results for the irreversibility of the RG in Lorentz invariant QFTs in d ≤ 4. See also [13] for related work.
In the present work, we will analyze in detail the explicit form of the entanglement and Renyi entropies for regions with arbitrary boundaries γ on the null plane (for general QFTs) and on the light-cone (for CFTs). In Sec. 2 we will provide simple geometric arguments showing that the EE and all Renyi entropies are in fact independent of γ on the null plane. This is a very strong result, and it implies that all Renyi entropies also satisfy the Markov property (1.1). This infinite set of equations for the reduced density matrix essentially says that the vacuum state behaves like a product state over the null plane. In this sense, the result is opposite in spirit to the Reeh-Schlieder theorem, which forbids such products over spatial regions.
The situation is much richer for regions with boundary on the light-cone, and we study this in Sec. 3. Using Lorentz invariance and the Markov property, we determine the universal explicit form for all the entropies as a function of γ. This generalizes the result for the EE of a sphere to arbitrary boundaries. We obtain a local functional that is an integral over the angular coordinates of the light-cone. We interpret this as an effective action for a dilaton log γ(y) in d − 2 dimensions. In particular, we argue that the universal logarithmic term for the sphere EE generalizes to the Wess-Zumino anomaly action for the dilaton.
In the second part of the paper (Sec. 4) we study these questions from the point of view of AdS/CFT. The EE for the boundary theory becomes the area of the extremal Ryu-Takayanagi surface in the gravitational theory. We construct the extremal surfaces corresponding to regions of the boundary QFT on the null plane and the light-cone. This geometric problem turns out to have various special features: the surfaces are described by linear differential equations (bulk Laplacians), and they themselves lie on the bulk null plane or cone. We verify that the Markov property holds holographically. For the null cone, we evaluate the holographic EE explicitly, and check that it agrees with a special case of the general form predicted for CFTs in Sec. 3. These results are extended to include 1/N and 't Hooft coupling corrections.
Armed with these additional insights, in Sec. 5 we revisit the proof of the a-theorem of [9], checking and expanding on the arguments in that work. In the process, we uncover a new positivity constraint for a nonlocal term in the EE. Lastly, in Sec. 6 we discuss implications of our results and various future directions.
Note added: while we were preparing the manuscript for submission, the work [17] appeared, which also studies extremal surfaces with boundaries on the null plane and cone in holographic theories. Some of the results in Sec. 4 (specifically, our formulas (4.9) and (4.18)) overlap with that reference.

Markov property for Renyi entropies
In [18] we showed that modular Hamiltonians $H_X$ for regions $X$ with boundary on a null plane $x^- = x^1 - x^0 = 0$ are given by

$$ H_X = 2\pi \int d^{d-2}y \int_{\gamma(y)}^{\infty} dx^+ \, \big(x^+ - \gamma(y)\big)\, T_{++}(x^+, y) , \qquad (2.1) $$

up to an additive constant. Here $y$ denotes the transverse coordinates $(x^2, \ldots, x^{d-1})$, and $x^+ = \gamma(y)$ parametrizes the boundary of $X$ on the null plane. This is simply the Rindler result, ray by ray. It leads to the operator equation

$$ H_A + H_B - H_{A \cap B} - H_{A \cup B} = 0 , \qquad (2.2) $$

which in turn implies the Markov property for the entanglement entropy,

$$ S(A) + S(B) - S(A \cap B) - S(A \cup B) = 0 . \qquad (2.3) $$

In this section we will prove a much stronger statement, namely that all vacuum Renyi entropies of regions with boundary on the null plane also satisfy the Markov property. Our analysis on the null plane will be valid for any QFT. Hence, for conformal field theories (CFT), after a conformal transformation, the Markov property also holds for Renyi entropies of regions with boundary on the null cone. This gives an infinite set of equations for the vacuum reduced density matrix, placing strong constraints on quantum entanglement in QFTs.
We will argue that these properties for the entropies arise simply from geometrical considerations. In fact, our arguments also extend to other quantities such as free energies with insertions of (d − 2) dimensional surface operators. In the future, it would be interesting to understand the implications of our formulas for surface operators in gauge theories.

Proof of the Markov property
Let us first describe the setup in more detail. We work in d-dimensional Minkowski space with signature (−, +, . . . , +), and introduce null coordinates $x^\pm = x^1 \pm x^0$ (2.4). Consider a null plane $x^- = 0$ with orthogonal coordinates $x^+$ and $y^a = (x^2, \ldots, x^{d-1}) \in \mathbb{R}^{d-2}$.
The metric on the plane is $ds^2 = (dy^a)^2 + 0\, dx^+ dx^-$ (2.5). We take a (d − 2)-dimensional surface $x^+ = \gamma(y)$ on the null plane, crossing all null rays (see Fig. 1). We wish to compute the vacuum Renyi entanglement entropy $S_n$ of a QFT in a region with boundary γ(y). Since the entanglement entropy does not depend on the Cauchy surface but on the whole causal region, it is equivalent to say that it is a functional of the boundary γ(y). We assume a Lorentz invariant regularization of the entropies, with short distance cutoff $\epsilon$. A Lorentz invariant cutoff can be produced using the mutual information, or mutual Renyi entropies; see Appendix A. In a theory with mass scales, $S_n$ can also depend on other dimensionful parameters. Since we are working with the vacuum state, we can only use the geometry of γ, the cutoff $\epsilon$, and some constants of the theory to construct $S_n(\gamma)$. In particular, we can expand in terms of functionals of the form (2.6), where dσ is a volume element along γ and f is a function of the distances between points and the dimensionful parameters.

The simplest argument is as follows. These functionals should be Lorentz invariant. In particular, a boost rescales the coordinate $x^+ \to \lambda x^+$, so we have $S_n(\lambda\gamma) = S_n(\gamma)$ (2.7) for any λ > 0. Taking the limit λ → 0, and focusing on bounded curves, the entropy of γ must then be the same as that of a surface arbitrarily near the plane $x^+ = 0$. Therefore, $S_n$ must be independent of γ.

Another way to establish this is to realize that the degenerate metric (2.5) gives an infinite set of isometries for the null plane, $y = y'$, $x^+ = h(y', x'^+)$ (2.8). That is, we can deform the $x^+$ coordinate in a way dependent on y, and get the same metric. These are of course not isometries of the full Minkowski space. Any two surfaces γ can be deformed into one another by these isometries. Hence they have identical (flat) intrinsic geometry, and they are also identically embedded in the null plane.
These isometries imply that the functional (2.6) will be the same for all γ. Nothing changes if we consider using derivatives of γ of any order to form the functional of γ. More explicitly, multiple gradients of γ are tensors that can be expanded in terms of the orthogonal vectors $k = (1, 1, 0, \ldots, 0)$ and $\hat{y}_a$, and the same holds for the distance vectors between any two points along γ. Once these tensors are contracted, the components proportional to k do not contribute because $k^2 = 0$ and $k \cdot \hat{y}_a = 0$. Hence the remaining contribution is the same as that of a planar γ, and is therefore independent of the shape of γ.
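The geometric statement above can be checked numerically in a minimal sketch. It assumes the normalization $x^+ = x^1 + x^0$, so that points on the null plane have $x^0 = x^1 = x^+/2$; the function names and the sample shape `wiggly` are hypothetical and chosen only for illustration. The sketch verifies that the Minkowski interval between any two points on a surface $x^+ = \gamma(y)$ depends only on the transverse separation, and hence is the same for a flat and for a deformed γ.

```python
import math
import random

def point_on_null_plane(x_plus, y):
    """Cartesian point (x0, x1, y...) on the null plane x^- = x1 - x0 = 0.

    Assumes x^+ = x1 + x0, so that x0 = x1 = x_plus / 2.
    """
    return [x_plus / 2.0, x_plus / 2.0] + list(y)

def interval(p, q):
    """Minkowski interval between p and q with signature (-, +, ..., +)."""
    d = [a - b for a, b in zip(p, q)]
    return -d[0] ** 2 + sum(c * c for c in d[1:])

def pairwise_intervals(gamma, ys):
    """Intervals between all pairs of points on the surface x^+ = gamma(y)."""
    pts = [point_on_null_plane(gamma(y), y) for y in ys]
    return [interval(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]

def flat(y):
    """Planar boundary x^+ = 0."""
    return 0.0

def wiggly(y):
    """Hypothetical deformed boundary, used only as an example."""
    return 1.5 + math.sin(3.0 * y[0]) * math.cos(y[1])

rng = random.Random(0)
sample_ys = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(6)]

flat_intervals = pairwise_intervals(flat, sample_ys)
wiggly_intervals = pairwise_intervals(wiggly, sample_ys)
max_difference = max(abs(a - b) for a, b in zip(flat_intervals, wiggly_intervals))
print(max_difference)
```

Any functional built only from these intervals, such as the form referred to as (2.6), therefore cannot distinguish the two shapes.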
Another aspect of this impossibility of distinguishing different γ with a geometric functional is that we cannot form nontrivial invariants from the extrinsic curvatures of γ. There are two null vectors normal to γ, $k = (1, 1, 0, \ldots, 0)$ and $q$, with $q^2 = 0$, normalized so that $k \cdot q = 1$. Since k is constant along γ, the corresponding extrinsic curvature vanishes. There is an ambiguity $k \to \lambda k$, $q \to q/\lambda$ in the representation of the surface in terms of the orthogonal null vectors. Then, in order to produce an invariant we have to use products of curvatures for q and k, which are also zero.
We conclude that all functionals we can construct should give the same value of $S_n$ for any γ. The Markov property for $S_n$ then follows trivially; that is, the combination

$$ S_n(A) + S_n(B) - S_n(A \cap B) - S_n(A \cup B) = 0 \qquad (2.9) $$

vanishes because all the entropies are equal. This result for the independence of $S_n$ on γ did not assume any unitary symmetry of the vacuum corresponding to the deformations (2.8) of the null plane. However, in addition to Lorentz boosts, such unitary symmetries deforming the null plane along the null rays and keeping the vacuum invariant do indeed exist for the special case $x^+ = x'^+ + \gamma'(y')$. These are given by the modular translations corresponding to other arbitrary regions γ with boundary in the null plane [18]. They act as isometries on the plane but do not have local action on field operators outside the plane. Therefore, the transformations between different surfaces γ can indeed be implemented by unitaries keeping the vacuum invariant.
This geometric argument implies that the equality of the entropies for all γ extends to other quantities such as partition functions with insertions of d − 2 dimensional surface operators. But this does not apply to lower dimensional operators which are not equivalent under the isometries of the null plane.
The argument above needed a Lorentz invariant cutoff. Once this requirement is dropped, the equality of all entropies for different γ no longer holds: we could, for example, change the cutoff around γ and γ′ independently. However, the Markov property is a regularization independent statement. The reason is that the divergences in the entropies are local and extensive on the boundary of the region; hence in any other regularization they must also cancel locally in the combination (2.9).
In conclusion, a Lorentz invariant geometric functional of d − 2 surfaces with minimal continuity properties must be constant on regions with boundary on a null plane. If this functional is either finite or has local extensive divergences along γ, it must be Markovian on the null plane, and this is a cutoff independent statement. This property then persists on the null cone for a conformally invariant functional (that is, a functional that is conformally invariant for any cutoff independent combination).
We will next illustrate this with a model having extensive mutual information. We will also see directly this structure for the holographic entanglement entropy in Sec. 4.

An example: extensive mutual information model
A simple example is given by the EMI (extensive mutual information) model for the entropy [19]. For a spatial surface $A$ with complement $\bar{A}$ in a given Cauchy surface, this model gives the functional (2.10), where η is the normalized vector orthogonal to the Cauchy surface. A small distance cutoff is assumed between $A$ and $\bar{A}$. The interest of this expression is that it gives a simple example of a conformally invariant, positive, and strongly subadditive functional on causal regions. It can also be thought of as the free energy in the presence of surface operators which are exponentials of free fields [20]. The integrand is a conserved current in both indices, which guarantees that S is independent of the Cauchy surface. In fact this expression is equivalent to one that depends only on the boundary of A, Eq. (2.11), where again a small cutoff is assumed at coincident points. With a distance cutoff in (2.11), a quick look at the argument above confirms that S is independent of the region on the null plane. Markovianity on the cone can be seen directly from (2.10), choosing the null cone as a Cauchy surface. Then the Markov combination (2.3) reduces to the (finite) double integral of the integrand in (2.10) over the non-overlapping regions $A \cap \bar{B}$ and $B \cap \bar{A}$ of the null cone. It is easy to check explicitly that the double integral over patches of the same null cone vanishes identically, while it is always positive for other null patches or spatial regions. This vanishing gives the Markov property for this functional.

Universal form of CFT entropies on the light-cone
In this section we study the vacuum reduced density matrix for regions whose boundary lies on the light-cone. We will determine the universal form of the entanglement and Renyi entropies for general CFTs. The conformal transformation between the plane and the cone, working in the metric with signature (− + ... +), is given by (3.1). This maps the past light-cone of the origin $x^\mu = 0$ into (part of) the null plane $X^- = X^1 - X^0 = 0$. The origin $X^\mu = 0$ is mapped into the point (−R, −R, 0), and the surface $X^\pm = 0$ is mapped to the circle $x^0 = -R$, $r = R$. The points on the null cone along the line through $x^1 = -x^0 = R$ correspond to infinity in the coordinates X. We will then consider a surface $r^- = 2\gamma(y)$ (3.2) on the past light-cone $r^+ = 0$, with the coordinates defined in (3.3). This curve parametrizes the boundary of the Cauchy surface. The restriction of the Minkowski metric to $r^+ = 0$, $r^- = 2\gamma(y)$ gives a (d − 2)-dimensional sphere with a radius that depends on the angular position along the curve:

$$ ds^2 = \gamma(y)^2\, g_{ab}(y)\, dy^a dy^b . \qquad (3.4) $$

Here

$$ g_{ab}(y)\, dy^a dy^b = \frac{4}{(1+y^2)^2}\, (dy^a)^2 \qquad (3.5) $$

describes a sphere $S^{d-2}$ of unit radius in conformally flat coordinates. We argued in the previous section that the entropies for a Cauchy surface with boundary on the null plane and Lorentz invariant regularization are independent of the boundary shape. After a conformal transformation to the light-cone, this means that all the dependence on γ has to arise from the short-distance cutoff $\epsilon$ on the light-cone. (We will see explicit examples of this in holographic theories in Sec. 4.) Up to an overall constant, this is local and extensive, and hence the entanglement and Renyi entropies should be given by local functionals of $\gamma/\epsilon$, its derivatives, and geometric quantities built from $g_{ab}$,

$$ S_n = \int d^{d-2}y\, \sqrt{g}\; L_n(\gamma/\epsilon,\, g_{ab},\, \partial \ldots) + F_n . \qquad (3.6) $$

Equivalently, the Markov property on the null plane is regularization invariant and hence preserved by the conformal transformations for a CFT.
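As a quick sanity check of the conformally flat metric (3.5), the following minimal sketch integrates $\sqrt{g} = (2/(1+y^2))^{n}$ over $\mathbb{R}^{n}$, with $n = d-2$, and compares the result with the standard volume of the unit $n$-sphere, $2\pi^{(n+1)/2}/\Gamma((n+1)/2)$. The function names, the substitution $|y| = \tan t$ and the midpoint rule are choices made here for illustration, not part of the original analysis.

```python
import math

def unit_sphere_volume_exact(n):
    """Volume of the unit n-sphere S^n: 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2.0 * math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0)

def unit_sphere_volume_from_metric(n, steps=20000):
    """Integrate sqrt(g) = (2 / (1 + y^2))^n over R^n in polar coordinates.

    The radial integral over r = |y| in (0, infinity) is mapped to
    t in (0, pi/2) through r = tan(t) and evaluated with the midpoint rule.
    """
    angular = unit_sphere_volume_exact(n - 1)  # volume of the angular S^(n-1)
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        r = math.tan(t)
        jacobian = 1.0 + r * r  # dr = (1 + r^2) dt
        total += r ** (n - 1) * (2.0 / (1.0 + r * r)) ** n * jacobian * h
    return angular * total

for n in (1, 2, 3, 4):
    print(n, unit_sphere_volume_from_metric(n), unit_sphere_volume_exact(n))
```

For $n = 2$ (that is, $d = 4$) both numbers equal $4\pi$, the area of the unit $S^2$.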
The Markov property on the null cone implies that the entropy is a local functional plus possibly a constant F n independent of γ.
Our goal is to determine the general form of L n allowed by Lorentz invariance. We will find that this is related to a dilaton effective action on S d−2 . Our analysis will reveal how the EE for spheres generalizes to an arbitrary boundary γ(y) on the light-cone. The main results are given in (3.20) and (3.29). The divergent terms are automatically Markovian, and we will find the form of the universal finite contributions.
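The converse direction of the statement above, namely that a local functional plus a constant automatically satisfies the Markov property, can be illustrated with a toy discretization. This is only a sketch under simplifying assumptions: the angular integral is replaced by a sum over cells, the density contains no derivative terms (which would require care where two curves cross), and the density `toy_density`, the cell volumes and the constant are hypothetical values. The boundaries of the union and intersection of two regions are taken to be the pointwise maximum and minimum of the two curves.

```python
import math
import random

def local_entropy(gamma, cell_volumes, density, constant):
    """Toy local functional: sum over cells of volume * density(gamma) + constant."""
    return sum(v * density(g) for v, g in zip(cell_volumes, gamma)) + constant

def markov_combination(gamma_a, gamma_b, cell_volumes, density, constant):
    """S(A) + S(B) - S(union) - S(intersection) for two boundary curves."""
    upper = [max(a, b) for a, b in zip(gamma_a, gamma_b)]
    lower = [min(a, b) for a, b in zip(gamma_a, gamma_b)]

    def s(gamma):
        return local_entropy(gamma, cell_volumes, density, constant)

    return s(gamma_a) + s(gamma_b) - s(upper) - s(lower)

def toy_density(g):
    """Hypothetical density: an area-like term plus a logarithmic term."""
    return 3.0 * g ** 2 - 0.7 * math.log(g)

rng = random.Random(1)
n_cells = 50
cell_volumes = [rng.uniform(0.5, 1.5) for _ in range(n_cells)]
gamma_a = [rng.uniform(1.0, 5.0) for _ in range(n_cells)]
gamma_b = [rng.uniform(1.0, 5.0) for _ in range(n_cells)]

combination = markov_combination(gamma_a, gamma_b, cell_volumes, toy_density, 0.4)
print(combination)
```

The combination vanishes up to rounding because, cell by cell, the pair (max, min) is just a reordering of the pair of original values, and the constant cancels as F + F − F − F.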

Lorentz transformations on the light-cone
In order to impose Lorentz invariance, we need to determine how Lorentz transformations act on the subspace $r^+ = 0$, $r^- = 2\gamma(y)$. The pull-back metric is (3.4), which describes an $S^{d-2}$ with varying radius γ(y). It is known that Lorentz transformations reduce to conformal transformations on $S^{d-2}$; this becomes clear in the embedding space formalism, where conformal transformations are represented as linear transformations on a null cone of a projective space in two more dimensions. We will now review how this comes about; see e.g. [21,22]. It is useful to parametrize the null cone $C$ as in (3.8), where $\lambda \in \mathbb{R}$ and $y^a \in \mathbb{R}^{d-2}$. The coordinate $\hat{x}^\mu$ gives the Poincaré section $\hat{x}^0 + \hat{x}^d = 1$ of the null cone $\eta_{\mu\nu}\hat{x}^\mu \hat{x}^\nu = 0$; λ describes 'radial' motion on the cone. See also [23]. The conformal factor ω(y) can be arbitrary, but here we will fix it to

$$ \omega(y) = \frac{2}{1+y^2} . \qquad (3.9) $$

The pull-back of the Minkowski metric to $C$ then reads

$$ ds^2\big|_C = \lambda^2\, \omega(y)^2\, (dy^a)^2 , \qquad (3.10) $$

which, recalling (3.5), describes a sphere in conformally flat coordinates. In particular, we are interested in a sphere of varying radius γ(y), and this is obtained for $\lambda = \gamma(y)$ (3.11). The main advantage of these coordinates is that there is a simple relation between Lorentz transformations on $x^\mu$ and conformal transformations on $(\lambda, y^a)$. In more detail, the Lorentz generators $J_{\mu\nu}$ induce SO(d − 2) rotations, translations, special conformal transformations and dilatations on $C$, as given in (3.12). In this way, the Lorentz algebra SO(d − 1, 1) gives rise to the conformal algebra for Euclidean $\mathbb{R}^{d-2}$. The coordinates transform as $(\lambda, y) \to (\lambda', y')$ with

$$ \frac{\partial y'^a}{\partial y^c} \frac{\partial y'^b}{\partial y^d}\, \delta_{ab} = e^{2A(y)}\, \delta_{cd} , \qquad \lambda' = e^{-A(y)}\, \lambda . \qquad (3.13) $$

Note that while the embedding space $\mathbb{R}^{d-1,1}$ for CFTs is just an artifact, in our setup it is the physical space where the QFT lives.
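The action of a boost on the cone can be made concrete with a small numerical sketch. The parametrization used here, $x^\mu = \lambda\,\omega(y)\,\hat{x}^\mu(y)$ with $\hat{x} = \big((1+y^2)/2,\; y^a,\; (1-y^2)/2\big)$, is an assumption chosen to be consistent with the Poincaré section and the conformal factor (3.9); it is written for the future cone in d = 4, and the time orientation does not affect the check. All function names are hypothetical. The sketch verifies that a boost mixing $x^0$ with the last spatial direction acts as the dilatation $y \to e^{\sigma} y$, and that the induced length element $\lambda\,\omega(y)\,|dy|$ is left invariant.

```python
import math

def omega(y):
    """Conformal factor (3.9): omega(y) = 2 / (1 + y^2)."""
    return 2.0 / (1.0 + sum(c * c for c in y))

def cone_point(lam, y):
    """Assumed parametrization x = lam * omega(y) * xhat(y) of the null cone."""
    y2 = sum(c * c for c in y)
    w = omega(y)
    return ([lam * w * (1.0 + y2) / 2.0]
            + [lam * w * c for c in y]
            + [lam * w * (1.0 - y2) / 2.0])

def boost(x, sigma):
    """Boost in the plane of x^0 and the last spatial coordinate."""
    plus = math.exp(-sigma) * (x[0] + x[-1])
    minus = math.exp(sigma) * (x[0] - x[-1])
    return [(plus + minus) / 2.0] + list(x[1:-1]) + [(plus - minus) / 2.0]

def cone_coordinates(x):
    """Invert the parametrization: returns (lam, y) for a point on the cone."""
    plus = x[0] + x[-1]          # equals lam * omega(y)
    y = [c / plus for c in x[1:-1]]
    lam = x[0]                   # since omega(y) * (1 + y^2) / 2 = 1
    return lam, y

def minkowski_norm(x):
    """x . x with signature (-, +, ..., +)."""
    return -x[0] ** 2 + sum(c * c for c in x[1:])

lam, y, sigma = 1.3, [0.4, -0.7], 0.5
x_boosted = boost(cone_point(lam, y), sigma)
lam_new, y_new = cone_coordinates(x_boosted)

print(y_new, [math.exp(sigma) * c for c in y])
print(lam_new * omega(y_new) * math.exp(sigma), lam * omega(y))
```

In this convention the combination $\lambda\,\omega(y)$ rescales by $e^{-\sigma}$ while $|dy|$ rescales by $e^{\sigma}$, which is the dilaton-like behavior summarized in (3.13).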

Entropies on the null cone
Our goal now is to determine the general form of (3.6) consistent with Lorentz invariance. We can think of $S_n$ as an "action" for a Euclidean theory that lives on $S^{d-2}$, with a scalar degree of freedom γ(y). As reviewed in Sec. 3.1, Lorentz transformations act as conformal transformations on $S^{d-2}$, so we will keep the metric $g_{ab}$ explicit to account for conformal rescalings, which act as $g_{ab} \to e^{2A(y)} g_{ab}$. Furthermore, from (3.13), $\phi(y) = \log(\gamma(y)/\epsilon)$ transforms additively as a dilaton field. In this way, the problem of finding the entropies $S_n$ is equivalent to that of constructing a conformally invariant local action in d − 2 dimensions with a dilaton field $\phi(y) = \log(\gamma(y)/\epsilon)$. It is interesting to note that dilaton techniques have appeared in the recent proof of the a-theorem in [24]; see also [25][26][27][28]. There, the dilaton is introduced by hand in order to match Weyl anomalies; in our context φ(y) is physical, as it arises from the varying radius of $S^{d-2}$ on the light-cone. These results on the dilaton effective action will be useful for our goal, especially the d-dimensional analysis in [29].

Odd d
Let us begin with the simpler case of odd space-time dimension d. The 'action' functional for the entropy $S_n(\gamma)$ can be constructed simply as a derivative expansion in terms of local geometric invariants built from the metric

$$ \hat{g}_{ab} = \frac{\gamma(y)^2}{\epsilon^2}\, g_{ab} , \qquad (3.14) $$

with $g_{ab}$ the metric of the unit-radius $S^{d-2}$. Since this is the metric induced by the Minkowski metric on γ (in units of the cutoff), it is clear that these geometric terms are Lorentz invariant. We note that the Riemann tensor can be written in terms of $\hat{R}_{ab}$ and $\hat{R}$ because $\hat{g}_{ab}$ is conformally flat (the Weyl tensor vanishes). In addition we could construct invariants using the extrinsic curvatures of γ. We show in Appendix B that the extrinsic curvatures on the null cone again give combinations of the intrinsic metric and the Ricci tensor. Thus the most general effective action is constructed in terms of powers of $\hat{g}_{ab}$, the Ricci tensor, the Ricci scalar and covariant derivatives. The first few terms are given in (3.15). The constant coefficients $\beta_j$ depend on the specific theory and on n.
In this expression, conformal invariance for the dilaton (namely, Lorentz invariance for the d-dimensional QFT) is manifest.
To gain intuition, let us write explicitly the terms with zero and two derivatives. The first term is the familiar area term. Performing a field redefinition, the second term becomes, for d ≥ 5, the action for a conformally coupled scalar, where $\xi = \frac{d_\perp - 2}{4(d_\perp - 1)}$ and the Ricci scalar is $R = d_\perp(d_\perp - 1)$ for the unit-radius sphere (see footnote 9). The area term proportional to $\gamma^{d_\perp}$ is then simply a conformal potential $V(\varphi) \sim \varphi^{2d_\perp/(d_\perp - 2)}$. The next terms in the 'effective action' for the entanglement entropy S are higher derivative generalizations of this conformal Laplacian; we will return to this point below.
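The numbers entering the conformally coupled scalar can be tabulated with exact rational arithmetic. The sketch below only evaluates the quantities quoted in the text, $\xi = (d_\perp - 2)/(4(d_\perp - 1))$, $R = d_\perp(d_\perp - 1)$ and the potential exponent $2d_\perp/(d_\perp - 2)$; the function names are hypothetical, and the restriction to $d_\perp \ge 3$ mirrors the condition d ≥ 5.

```python
from fractions import Fraction

def conformal_coupling(d_perp):
    """xi = (d_perp - 2) / (4 (d_perp - 1)) for a conformally coupled scalar."""
    return Fraction(d_perp - 2, 4 * (d_perp - 1))

def unit_sphere_ricci_scalar(d_perp):
    """Ricci scalar R = d_perp (d_perp - 1) of the unit sphere S^(d_perp)."""
    return d_perp * (d_perp - 1)

def curvature_mass_term(d_perp):
    """Product xi * R, the coefficient of the curvature coupling on the unit sphere."""
    return conformal_coupling(d_perp) * unit_sphere_ricci_scalar(d_perp)

def potential_exponent(d_perp):
    """Exponent 2 d_perp / (d_perp - 2) of the conformal potential V(phi)."""
    return Fraction(2 * d_perp, d_perp - 2)

for d_perp in range(3, 8):
    print(d_perp, conformal_coupling(d_perp),
          curvature_mass_term(d_perp), potential_exponent(d_perp))
```

The product simplifies to $\xi R = d_\perp(d_\perp - 2)/4$, which follows directly from the two expressions above.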
Note that the overall constant F n is trivially consistent with the Markov property (2.9). However, it is not possible to write it as a local geometric invariant. In this sense it is analogous to the anomaly contributions for even d to be discussed below. For entanglement over spheres, this is the familiar constant term F that measures the free energy of the theory over the euclidean sphere.
Putting these results together, and replacing $d_\perp \to d - 2$, the universal form of the EE for regions with boundary on the null cone and in odd space-time dimensions is given by (3.20). Let us compare this with the EE for a CFT on a sphere, Eq. (3.7). We recognize in (3.20) the area terms and all the subleading contributions, generalized to an arbitrary varying curve γ(y). Some of the $\beta_k$ are fixed in terms of the entropy of the sphere. For instance, the coefficient of $(\nabla \log\gamma)^2$ in the first subleading term $(\gamma/\epsilon)^{d-4}$ is uniquely fixed by the corresponding term in the sphere EE. This is a consequence of Lorentz invariance. At higher orders, there are more geometric invariants allowed, such as the terms with $\beta_4$, $\beta'_4$ in (3.15). In this case, the sphere coefficient $\alpha_{d-2k}$ fixes only an overall combination of the $\beta_i$, and the entropy for the boundary γ(y) contains more information about the specific theory. The term of order $\gamma^{d-2-2k}$ is essentially a higher-derivative version of the conformal Laplacian on the sphere containing 2k derivatives. We will discuss below a compact expression for such operators.

Footnote 9: On the other hand, this term vanishes for d = 2, 3 and is proportional to the volume of $S^{d-2}$ in d = 4.

Even d
For d even this is not the full story: there must be an additional contribution that comes from the Euler a-anomaly. Indeed, recall that for a sphere of constant radius γ at fixed time, we should recover the universal logarithmic contribution (3.21). We want to find a Lorentz invariant local functional that reduces to (3.21) for constant γ(y). At first, this appears to be challenging in our approach because, as we saw in (3.15), there are no local invariants we can form with geometric quantities from $\hat{g}_{ab}$ that give rise to such a term. We propose that the generalization of (3.21) to arbitrary γ(y) is a Wess-Zumino term for the Weyl anomaly on $S^{d-2}$. To explain how this comes about, let us first review the simplest case of the Weyl anomaly in 2d CFTs. The stress-tensor on a manifold with metric $g_{ab}$ has a trace anomaly (3.22), where R is the scalar curvature of $g_{ab}$. This implies that, under a Weyl rescaling $\delta g_{ab} = 2\delta\sigma\, g_{ab}$, the effective action $W = -\log Z$ changes as in (3.23). A local functional whose variation gives (3.23) can be obtained by introducing a dilaton field τ, which transforms as $\tau \to \tau + \sigma(y)$ under $g_{ab} \to e^{2\sigma(y)} g_{ab}$. The result is the Wess-Zumino action $S_{WZ}$ of (3.24) [33], with overall coefficient $c/24\pi$. Here the dilaton derivative term cancels the Weyl transformation of the Ricci scalar. We note that, while this is a local functional of $g_{ab}$ and τ, it is not a local functional constructed from the Weyl-invariant metric $\hat{g}_{ab} = e^{-2\tau} g_{ab}$. Let us return now to the EE calculation for d = 4. We seek a local Lorentz-invariant functional that reduces to (3.21) for constant γ. We found that Lorentz transformations act as conformal transformations on the $S^2$ null-cone sphere, and that $\log(\gamma/\epsilon)$ transforms as a dilaton field. We then recognize (3.21) as the first term of the WZ action (3.24) evaluated on $S^2$.
In order to preserve Lorentz invariance, we expect that the contribution to the EE for a curve γ(y) should then generalize to (3.25), with the overall normalization fixed by (3.21) and the Euler characteristic $\frac{1}{4\pi}\int d^2y \sqrt{g}\, R = 2$. Note that the coefficient of $\log(\epsilon)$ is topological and hence is the same for all γ. In particular, this means there is no type B anomaly contribution to this logarithmic coefficient. This can be seen as a consequence of the particular geometry of the cone in Solodukhin's formula [34] for the coefficient of $\log(\epsilon)$ in generic regions in d = 4. See Appendix B.
This is a local functional and hence satisfies the Markov property. But, as in the discussion of the Weyl anomaly, it is not a local functional of the metric $\hat{g}_{ab} = \frac{\gamma(y)^2}{\epsilon^2} g_{ab}$ introduced in (3.14). It is Lorentz invariant, as can be seen by writing it as the bilocal functional (3.26) [35,36], with $\hat{\nabla}^2_y \hat{G}(y, y') = \frac{1}{\sqrt{\hat{g}}}\, \delta^2(y, y')$ the Green's function for $\hat{g}_{ab}$, and $\hat{R}$ its curvature scalar. Using (3.27) and integrating by parts, (3.26) reduces to (3.25), up to a term quadratic in R that is independent of γ. This discussion extends to arbitrary dimensions $d_\perp$, where the Weyl anomaly is proportional to the Euler density $E_{d_\perp}$ (plus conformally invariant terms that vanish in our case). The Wess-Zumino action can be computed systematically by integrating the Euler density [25,33], as in (3.28), and is proportional to the Euler characteristic of the sphere. The contribution from t = 0 reproduces (3.21), and this is how the overall normalization is fixed. The full integral gives a conformally invariant action with derivatives of the schematic form $\int_y \log\gamma\, (\nabla^2)^{d_\perp/2} \log\gamma$. Explicit expressions in various even dimensions may be found in [24,26,27,29,32]. In summary, the entanglement entropy for an arbitrary curve γ(y) in a CFT in even d dimensions is given by (3.29). The last term is the WZ action on $S^{d-2}$ with a dilaton $\log(\gamma/\epsilon)$, and it generalizes the universal logarithmic term of the EE on a sphere. In this case, $A_n = A$ is just the Euler anomaly.
For comparison with holographic results below, let us give some explicit examples. For d = 4, using the curvature of $S^2$, R = 2, we get (3.30) from (3.25). Next, for d = 6, the WZ action (3.28) becomes (3.31) [24], where $\phi = \log(\gamma/\epsilon)$. Performing the calculation for a sphere, one obtains (3.32).

An alternative approach
We now present an alternative construction of the effective action. This approach is somewhat simpler, and makes it clear how Lorentz invariance of the d-dimensional theory is used. First, we write the metric over the varying-radius $S^{d-2}$ as a dilaton factor times the flat space metric, $ds^2 = e^{-2\tau(y)} (dy^a)^2$ (3.33). See the discussion around (3.11). We then require a local effective action, invariant under rotations and translations on $\mathbb{R}^{d-2}$, and under scale transformations $y \to e^{\sigma} y$, $\tau \to \tau + \sigma$.
Following the construction of the dilaton effective action in [29], this can be organized in terms of differential operators $W_k$, defined in (3.34), which contain 2k derivatives and transform covariantly under scale transformations, Eq. (3.35). Hence, the basic scale-invariant objects are $d^{d_\perp}y\, W_k$ and $e^{d_\perp \tau} W_r$, and the most general local effective action is (3.36), with $\alpha_{n k r}$ some arbitrary coefficients. The term proportional to $\alpha_{n k r}$ contains $2k + 2\sum_i n_i r_i$ derivatives. An explicit evaluation of the first few contributions in (3.36) recovers the terms analyzed in Sec. 3.2. This approach has the advantage of unifying odd and even d; in particular, the Wess-Zumino term arises from the limit $k \to d_\perp/2$, Eq. (3.37). This is the reason for the normalization in (3.34). For instance, after integration by parts, one obtains (3.38), which agrees with (3.25).

Holographic analysis
In this section we analyze the entanglement entropy for regions with arbitrary boundaries on the null plane and, for CFTs, with arbitrary boundaries on the null cone, in theories with holographic duals. Via the HRT formula [38,39], this translates into finding extremal surfaces anchored at boundary curves γ(y) in the null surfaces in asymptotically AdS space. This geometric problem turns out to have many special and interesting features, which are not present in the case of generic space-like boundary curves. In particular, we will find that the extremal surface is determined by a linear second order differential equation. We will check that the Markov property holds, and regain the general expressions of the previous section for EE in a null cone for CFTs. We will also show that these results hold when adding corrections for finite N or finite 't Hooft coupling λ.

Regions with boundary on a null plane
The metric for an asymptotically AdS space with Lorentz symmetry, corresponding to the vacuum state in a holographic theory, is given by (4.1), with $\lim_{z\to 0} f(z) = 1$. Here $z \in (0, \infty)$ and $y^i \in (-\infty, \infty)$. We want to find an extremal surface in the bulk whose boundary is the (d − 2)-dimensional surface on the AdS boundary given by (4.2). The minimal surface has d − 1 dimensions and we parametrize it with the coordinates $\alpha^i \equiv (z, y)$. The induced metric on this surface is (4.3). We have to minimize the area (4.4). We have two equations of motion, one for $x^+$ and one for $x^-$, and the Lagrangian depends only on the derivatives of these fields. The equation of motion for $x^+$ contains only terms proportional to derivatives of $x^-$, and hence can be solved by taking $x^- = 0$ (4.5), consistently with the boundary condition. This simplifies the equation of motion coming from the variation of $x^-$, since we only need to keep the terms linear in $\partial_i x^-$ in (4.4). The result is (4.6). This equation determines the minimal surface. Surprisingly, it is a linear equation for the shape $x^+$. One reason for this is that if $x^+$ is a solution, the rescaled $\lambda x^+$ also has to be a solution, since it arises from boosting. It is the same as the equation for a massless scalar in the bulk metric (4.1).
Since we have obtained a minimal surface that lies completely on the $x^- = 0$ plane in the bulk, the area of this surface has to be computed with the induced metric, which is completely independent of the shape of $x^+(z, y)$. Hence, once we fix a cutoff $z = \epsilon$ and integrate the volume of this (z, y) plane for all y and $z > \epsilon$, the area is independent of γ(y). This works for general f(z), i.e., it captures fixed points (f = 1) and also holographic RG flows. This verifies our arguments in Sec. 2, and leads to the Markov property of the vacuum state in holographic theories. In fact, the area is the same for any surface on the $x^- = 0$ plane, but only the solution of (4.6) is extremal. For pure AdS, we can give an explicit solution for the extremal surface. When f = 1, (4.6) simplifies; by Fourier transforming in y and choosing the solution regular at infinity, we get the complete solution of the problem. See also [17]. Eq. (4.8) was also derived in a different context in [2].
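The pure AdS solution can be illustrated with a short numerical check. The sketch assumes that for f = 1 the equation for the shape takes the form of the massless scalar equation in Poincaré AdS for a field depending only on (z, y), namely $\partial_z^2 x^+ - \frac{d-1}{z}\partial_z x^+ + \nabla_y^2 x^+ = 0$, so that each Fourier mode $e^{i k\cdot y}$ has a radial profile proportional to $(kz)^{d/2} K_{d/2}(kz)$. For d = 3 and d = 5 the Bessel function has half-integer order and the profile reduces to elementary closed forms, which are the only cases implemented here. The function names are hypothetical, and the residual is evaluated with centered finite differences.

```python
import math

def radial_mode(d, x):
    """Profile proportional to x^(d/2) K_(d/2)(x), normalized to 1 at x = 0.

    Only d = 3 and d = 5 are implemented, using the elementary closed forms
    of the half-integer modified Bessel functions.
    """
    if d == 3:
        return (1.0 + x) * math.exp(-x)
    if d == 5:
        return (1.0 + x + x * x / 3.0) * math.exp(-x)
    raise ValueError("only d = 3 and d = 5 are implemented in this sketch")

def radial_residual(d, k, z, h=1e-4):
    """Finite-difference residual of u'' - (d - 1)/z u' - k^2 u for u(z) = radial_mode(d, k z)."""
    def u(zz):
        return radial_mode(d, k * zz)

    second = (u(z + h) - 2.0 * u(z) + u(z - h)) / (h * h)
    first = (u(z + h) - u(z - h)) / (2.0 * h)
    return second - (d - 1) / z * first - k * k * u(z)

for d in (3, 5):
    for k in (0.5, 2.0):
        for z in (0.3, 1.0, 2.5):
            print(d, k, z, radial_residual(d, k, z))
```

The profiles decay exponentially at large z, which is the regularity condition mentioned in the text, and they approach 1 at the boundary z → 0, so that the mode amplitude is set by the Fourier transform of γ(y).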

Regions with boundary on a null cone
Next, we consider the entropy of CFTs for regions with boundary on the null cone. One idea would be to obtain the extremal surface and areas by mapping the null plane to the null cone, and then compute the entropy using the metric and a cutoff of fixed z on the cone. We will more simply redo the calculation on the cone directly. We focus here on smooth curves γ(Ω), and later in Sec. 4.3 comment on the effects of cusps. For pure AdS there is a conformal transformation from the null plane to the null cone at the boundary that extends as an isometry on the bulk, respecting minimal surfaces and their areas. Hence, the only differences in the computation of the areas in the planar case and the cone can come from the position of the cutoff. The isometry of AdS corresponding to (3.1) is given by extending this conformal transformation to one in a Minkowski space with one more spatial coordinates z, and Z respectively. These are just the two bulk coordinates. We have exactly the same formula (3.1) but where the vectors have now d + 1 coordinates, and x d+1 = z, X d+1 = Z. The AdS metric is invariant under this transformation. The surface X 0 = 0, X 1 = 0, which corresponds to the minimal surface of Rindler space, is mapped to the spherical cup which is the minimal surface corresponding to the sphere. The surface t + | x| = 0, which is the past light-cone in the bulk of the upper tip of the cone, is mapped into the plane X − = 0. Then, the minimal surfaces we are interested in will lie on this null cone on the bulk.
To follow the geometric ideas for the Markov property on the original AdS space, we will use null coordinates $\tilde r^\pm = \tilde r \pm t$ together with $\tilde\Omega$, where $\tilde\Omega$ are angular coordinates on the half-sphere $t = \text{const}$, $\tilde r = \text{const}$. On the surface $\tilde r^+ = 0$, each constant $\tilde\Omega$ describes a null line in the bulk having the origin as its future end-point. We will write $z = \tilde r \sin\theta$, $\theta \in (0, \pi/2)$, (4.12) with $\theta = \pi/2$ corresponding to the point of the sphere furthest from the AdS boundary, and $\theta = 0$ to the boundary. In the resulting form of the AdS metric, $\Omega$ are angular coordinates on a $(d-2)$-dimensional sphere, describing the usual polar coordinates on the boundary of AdS.
On the surface $\tilde r^+ = 0$, the induced metric is independent of the remaining coordinate $\tilde r^- = 2\tilde r = -2t$. This shows that, if we naively forget about the cutoff, all possible minimal surfaces have the same induced metric and (divergent) area. If we impose a cutoff at a small $\theta$ independent of $\Omega$, we again get the same result for all minimal surfaces, reproducing the previous result for the plane. However, we want to impose a covariant cutoff at fixed $z$ instead. All the dependence on the shape of $\gamma$ will come from this cutoff.

Extremal surface and covariant cutoff
Let us compute the equations for the minimal surface, and check that it lies on $\tilde r^+ = 0$. Writing the $d-1$ coordinates for the sphere described by $\tilde\Omega$ as $\alpha^i$ and the sphere metric as $g_{ij}$, we have to extremize the area action with respect to variations of $\tilde r^\pm(\tilde\Omega)$. The equation of motion for $\tilde r^-$ is satisfied, along with the boundary conditions, by setting $\tilde r^+ = 0$. The equation of motion for $\tilde r^+$ gives Eq. (4.17). The same equation holds for $\tilde r$, since it is just $\tilde r^-/2$. Notice that the equation for $(\tilde r^-)^{-1}$ is linear, as was the case for $x^+$ for boundaries on the null plane. This is because these two variables are linearly related by the conformal transformation that carries the null plane into the null cone. The boundary curve now is of the form $r = \gamma(\Omega)$, where $r = \sqrt{(x^1)^2 + \ldots + (x^{d-1})^2}$. The minimal surface takes the form $\tilde r^+ = 0$, $\tilde r(\theta, \Omega)$, with $\tilde r(0, \Omega) = r(\Omega) = \gamma(\Omega)$. It lies on the bulk light-cone, as illustrated in Fig. 3.
The solution to (4.17) that is regular in the interior, $\theta \to \pi/2$, is given by (4.18),$^{13}$ where $I$ is some multi-index for the eigenfunctions of fixed degree $n$. The prefactor in (4.18) is chosen to cancel the value of the hypergeometric function at $\theta = 0$, and $a_{nI}$ are the coefficients of the expansion of $\gamma^{-1}$ in spherical harmonics. We want to impose a standard Lorentz invariant cutoff at $z = \tilde r(\theta, \Omega) \sin\theta = \epsilon$. (4.21) Let us denote the solution to this equation by $\theta = \beta(\Omega)$; it will depend on the cutoff $\epsilon$ and on the curve $\gamma(\Omega)$. The minimal area then takes the form of a local action for the entropy, as in the QFT calculation. Also, as anticipated, all the dependence on $\gamma(\Omega)$ arises through the cutoff $\beta$. Since $\beta \sim O(\epsilon)$, we expand in small $\beta$, obtaining (4.23), with the quantities appearing there defined in (4.24).
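How the cutoff condition (4.21) fixes $\beta$ can be illustrated with a minimal numerical sketch. It assumes a toy profile $\tilde r^{-1} = \gamma^{-1}(1 + c\,\theta^2)$ at a fixed angle $\Omega$, where the coefficient `c` is hypothetical and merely stands in for the $\theta^2$ term of the true solution (4.18). The exact root is compared with the two-term series $\beta \approx u + (c + 1/6)\,u^3$, $u = \epsilon/\gamma$, which follows from expanding $\sin\beta$.

```python
import math

def cutoff_angle(eps, gamma, c, tol=1e-14):
    """Solve r(theta) * sin(theta) = eps by bisection.

    Toy profile (an assumption for illustration): 1/r = (1 + c*theta^2) / gamma.
    """
    def f(theta):
        return math.sin(theta) - eps * (1.0 + c * theta * theta) / gamma

    lo, hi = 0.0, math.pi / 2  # f(lo) < 0 and f(hi) > 0 for small eps
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cutoff_angle_series(eps, gamma, c):
    """Two-term expansion of the root in powers of u = eps / gamma."""
    u = eps / gamma
    return u + (c + 1.0 / 6.0) * u**3

eps, gamma, c = 0.01, 2.0, 0.7      # hypothetical sample values
beta_exact = cutoff_angle(eps, gamma, c)
beta_series = cutoff_angle_series(eps, gamma, c)
beta_sphere = cutoff_angle(eps, gamma, 0.0)  # c = 0: constant profile, i.e. a sphere
```

For `c = 0` the condition reduces to $\sin\beta = \epsilon/\gamma$, so all shape dependence of $\beta$ beyond the leading $\epsilon/\gamma$ enters through the $\theta$-dependence of the profile.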
In order to evaluate this expression, we need to solve for $\beta$ in powers of $\epsilon$. Besides the constant term, (4.18) contains a series that starts at order $\theta^2$ and one that starts at order $\theta^d$. The series in $\theta^2$ can be rewritten in terms of derivatives of $\gamma(\Omega)^{-1}$ by use of (4.19), which gives (4.26). This can also be verified by solving (4.17) in powers of $\theta^2$. In contrast, the series that starts at order $\theta^d$ does not appear to have a local expansion in derivatives of $\gamma^{-1}$. This series is fixed by requiring regularity in the interior, $\theta \to \pi/2$, which is the condition that fixed (4.18). Such terms end up modifying the EE at order $\epsilon^2$, and hence vanish in the limit in which the UV regulator is taken to zero. We will neglect them in what follows. Plugging (4.26) into (4.21) leads to the power-series solution (4.27). We now use (4.23) and (4.27) to study the extremal surface area in a derivative expansion. For general $d$, the result is (4.28), together with (4.29).
This is the same for any curve γ(Ω) on the cone, and agrees (as it should) with the holographic result for the sphere [40]. 14 In particular, for d = 3 (4.28) becomes Note from (4.28) that the term of order is a total derivative ∇ 2 Ω (γ −1 ) in d = 3. For d = 5, after integration by parts As in (3.19), the last two terms give the kinetic term for a conformally coupled scalar field, and the first term is a classically conformally invariant potential.

Even d
For even $d$, the expression (4.28) explains the origin of the universal logarithmic terms for $d = 2n$. It also gives rise to the correct WZ terms, although it is not obvious how to rewrite the previous expressions with hypergeometric functions as (3.28). Let us check this for $d = 4, 6$.

14 By a slight abuse of notation, we keep the sign (−1) as part of $F$, in agreement with our convention in (3.20). However, the standard notation for $F$ does not include the sign, as in (3.7).
For $d = 4$, the result is (4.33), where the second and third terms combine to give the two-dimensional WZ action (3.25). For $d = 6$, the result is (4.34). It is not hard to verify that this result is a linear combination of the WZ action (3.30) and the two invariant terms obtained from $\hat R^2$ and $\hat R_{ab}^2$ in (3.15). This is a nontrivial check, given that the four terms in the last line of (4.34) are reproduced by the QFT formula, which has only three independent contributions at this order.

Comments on cusps
The holographic formula for the entropy contains terms depending on derivatives of γ. Here we want to comment on the interpretation of these terms when γ is not smooth. We will only treat the case of a cusp, that is, the case of a jump in derivatives, and for simplicity will keep the discussion centered in low dimensions d = 3, 4.
For a smooth surface, $\nabla^2_\Omega(\tilde r^{-1})$ is finite as $\theta \to 0$; then we found in (4.26) that $\partial_\theta(\tilde r(0, \Omega)^{-1}) = 0$ and our previous results apply. However, this need not be true near a cusp. Before getting to the cusps, let us assume that there is some power-law singularity, with exponent $\nu$, as we approach the boundary. Solving the equation of motion for small $\theta$ then shows that negative powers of $\theta$ from $\nabla^2_\Omega \tilde r^{-1}$ will indeed modify the expansion (4.26). We will now see that $\nu = 1$ at codimension-one cusps.
For simplicity, let us focus on $d = 3$, and consider a cusp at $\phi = \phi_0$ with local angle $\alpha$. Then, close to the cusp, $\gamma''(\phi) \sim \delta(\phi - \phi_0) \tan\alpha$. At finite $\theta$, this delta function is smoothed; we should recover an approximant of the delta function as $\theta \to 0$. Dimensional analysis fixes its form, Eq. (4.37), valid for small $\theta$ and near the cusp. Indeed, it is not hard to check that this expression reproduces the delta function in the limit. Plugging (4.37) into the minimal area equation and expanding for small $\theta$, we find (4.39). This can also be checked by computing the Fourier coefficients and performing the full sum (4.18). For instance, the calculation can be done explicitly for a cusp of the form $\sin|\phi|$. The same will happen for $d \ge 4$ as long as the cusp has codimension one, with $\phi$ above playing the role of the local normal coordinate. Indeed, for a cusp at $\phi_0$ that locally looks like $|\phi - \phi_0|$, the Laplacian produces a delta function; this is just the familiar fact that $|\phi - \phi_0|$ is the one-dimensional Green's function. This also says that contributions from cusps of higher codimension will be smaller. Indeed, to get a delta function from $\nabla^2_\Omega \gamma^{-1}$ at codimension $n$, we need $\gamma^{-1} \sim 1/|\vec x - \vec x_0|^{n-2}$. However, we are considering curves without such divergences, and so all the cusp contributions will have $\nu < 1$, with $\nu = 1$ for codimension-one cusps only.
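The statement that a smoothed cusp yields an approximant of the delta function can be illustrated numerically. The sketch below does not use the profile (4.37); instead it assumes a hypothetical smoothing of $|\phi|$, namely $\sqrt{\phi^2 + \theta^2}$, whose second derivative is $\theta^2/(\phi^2 + \theta^2)^{3/2}$. As $\theta \to 0$ the peak height grows like $1/\theta$ while the integrated weight tends to 2, the coefficient of $\delta(\phi)$ in $\partial_\phi^2 |\phi|$.

```python
import math

def smoothed_abs_second_derivative(phi, theta):
    """d^2/dphi^2 of sqrt(phi^2 + theta^2), a hypothetical smoothing of |phi|."""
    return theta**2 / (phi**2 + theta**2) ** 1.5

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def delta_weight(theta, half_width=1.0):
    """Integrated weight of the smoothed second derivative over [-half_width, half_width]."""
    return integrate(lambda p: smoothed_abs_second_derivative(p, theta),
                     -half_width, half_width)

thetas = (0.1, 0.01, 0.001)
weights = {theta: delta_weight(theta) for theta in thetas}
peaks = {theta: smoothed_abs_second_derivative(0.0, theta) for theta in thetas}
```

The $1/\theta$ growth of the peak is the analogue, in this toy smoothing, of the $\nu = 1$ behavior discussed for codimension-one cusps.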
We conclude that the area integral is not affected by null cusps, since (4.39) modifies the expansion of β(Ω) on a measure zero set of points (the cusps). Therefore the formula (4.28) for the entropy has to be integrated on each side of the cusp where the regular expansion in θ works, without any further cusp contribution. In consequence, the Markov property continues to hold when there are cusps.
However, we cannot eliminate boundary terms in the integration by parts when there is a cusp. For example, the finite term with a Laplacian in $d = 4$ can be treated in the following way when there are cusps. We integrate over the smooth patches $P_i$ to get (4.40), where the scalar products are taken with the sphere metric, and $\eta$ in the last term is the outward pointing unit normal to the boundary $\partial P_i$ on the sphere. The first term has a discontinuous but bounded integrand on the boundary (the position of the cusp).
It is interesting to see that, written in this way, the contributions of the local integrand cancel locally in the SSA relation, while the second (boundary) term cancels because it contributes with opposite signs to the intersection and the union. This is because these have locally the same $(\nabla_\Omega r)/r$ at the points of the boundary of the patch, but opposite $\eta$.

Higher derivative gravity theories
In the remainder of this section, we will extend the previous results to include stringy and quantum effects.
Higher derivative gravity theories in the bulk around an AdS solution represent different CFTs incorporating $1/\lambda$ corrections, with $\lambda$ the 't Hooft coupling. A general form of the EE functional corresponding to higher derivative Lagrangians was discussed in [41,42]. The result is a geometric functional computed on the generalized Ryu-Takayanagi surface $\Sigma$, including curvature and extrinsic curvature corrections. Here we want to briefly discuss why the main results of the preceding sections are expected to remain unchanged for these models.
For a gravity action that is a function of the curvature tensor, the generalized entropy functional has two types of terms. The first is Wald's entropy formula (4.41), where the vectors $n^{(a)}$, $a = 1, 2$, are two normalized vectors normal to the codimension-two surface, and $\varepsilon_{ab}$ is the usual two-dimensional Levi-Civita tensor. In what follows we find it convenient to choose $n^{(a)}$ as two null vectors orthogonal to the surface, normalized by $n^{(1)} \cdot n^{(2)} = 1$.
The second type of terms involves the extrinsic curvatures of the surface and is proportional to (4.43). Here $\eta$ is the projector onto the vector space normal to the surface. The extrinsic curvature is given by (4.45), where $P$ is the projector onto the tangent space of the surface. The bulk metric is pure AdS, corresponding to the vacuum of the CFT. In AdS the curvature tensor is proportional to a combination of products of the metric tensor. In consequence, Wald's term (4.41) is proportional to the area functional.
Let us consider a surface Σ that lies on the bulk null coner + = 0. In that case we can choose n (1) to be the Killing null vector parallel to the cone. Then we have As the extrinsic curvature tensor (4.45) is symmetric in µ, ν the contribution of the derivative of n (1) vanishes. In consequence only one term remains in the extrinsic curvature (4.45) and the integrand in (4.43) vanishes as well. In addition, we have here a situation analogous to the one of surfaces γ in a null plane discussed in Sec. 2. The areas of any two surfaces lying on this null cone in AdS are equal since only the projection of the surface orthogonal to n (1) contributes, and there is an isometry that shows that these projections are equal along the direction of the null ray. Then, on the null cone in the bulk, all surfaces give the same value of the functional. The equations that fix the position of Σ in the general case follow by extremizing the entropy functional [43]. For surfaces on the null cone, the variations of the entropy functional for variations of position also contained in the null cone, vanish. Hence, analogously to the case of Einstein gravity treated above, one of the equations of motion is solved precisely by placing Σ on the null cone, and this is compatible with the boundary conditions. The other equation of motion will fix the shape of the surface on the cone itself. On the cone, the functional is just proportional to the area, but this need not be the case for deformations that take the surface outside the cone. Hence, we expect the differential equation forr − to get modified by the higher derivative terms in the Lagrangian. However, this equation should still be linear. This is because, as we have explained in section 4.1, boost invariance will lead to a linear equation for regions on the null plane on the boundary, and a conformal transformation will give a linear equation for (r − ) −1 .
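The vanishing of the $n^{(1)}$ contribution to the extrinsic curvature can be spelled out in one line. The sketch below assumes that (4.45) has the standard form of a tangential projection of $\nabla n^{(a)}$; this explicit form is an assumption made here, since only the symmetry of (4.45) in $\mu, \nu$ is stated in the text.

```latex
% Assumption: K^{(a)}_{\mu\nu} = P_\mu{}^\alpha P_\nu{}^\beta \nabla_\alpha n^{(a)}_\beta ,
% which is symmetric in (\mu,\nu) for a surface-orthogonal normal.
\begin{align}
  \nabla_\alpha n^{(1)}_\beta + \nabla_\beta n^{(1)}_\alpha &= 0
     \qquad \text{(Killing equation for } n^{(1)}\text{)} , \\
  K^{(1)}_{\mu\nu}
   = P_\mu{}^\alpha P_\nu{}^\beta\, \nabla_{(\alpha} n^{(1)}_{\beta)} &= 0 .
\end{align}
% Only K^{(2)} survives, so a term built from the product K^{(1)} K^{(2)} vanishes.
```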
In any case, once the surface is determined, the Markov property follows from the fact that the functional on the cone reduces to a term proportional to the area, and the area on the cone is independent of shape. Then, the result can only be affected by the position of the cutoff. Again, we will have a local expression for the entropy as a function of γ, with the same types of terms found in Sec. 3. The only change can be in the coefficients of the independent terms, in particular the value of the anomaly. This can be calibrated by computing the entropy of the sphere. See for example [44].

1/N corrections
According to [45], 1/N corrections to the entanglement entropy in the large N limit come from quantum corrections in the bulk. One has to add to the holographic entropy the entanglement entropy of quantum fields living in the bulk across the Ryu-Takayanagi surface.
For the regions on the light-cone we are considering, the entangling surfaces all lie on the bulk light-cone $\tilde r^+ = 0$ in AdS. Then, we can apply an argument analogous to the one in Sec. 2 for the null plane in Minkowski space. The bulk EE has to be a functional of surfaces on the light-cone, and this light-cone is mapped into itself by isometries of AdS which correspond to conformal symmetries of the boundary theory. For example, we can take a surface $\gamma$ on the boundary, and a sphere $\gamma'$ on the light-cone which does not cut $\gamma$. The modular flow corresponding to $\gamma'$ will move $\gamma$ towards $\gamma'$ as much as we want. In the bulk, this corresponds to an isometry that squeezes the entangling surface of $\gamma$ towards the entangling surface of the sphere $\gamma'$ (which is a sphere in the bulk) as much as we want. This symmetry keeps the vacuum invariant and respects a covariant cutoff in the bulk. Hence it will keep the bulk EE invariant. We conclude that quantum corrections in the bulk, except for terms coming from the UV cutoff of the boundary theory, will be the same for all regions on the light-cone, and will not spoil the Markov property. We expect the same structure of the entropy as in Sec. 3, with some corrections to the coefficients of the independent possible terms.

Revisiting the entropic proof of the a-theorem
In the previous sections we obtained the explicit form of the CFT entropy on the null cone and worked out the holographic case. In this section we will use this information to check the arguments leading to a proof of the a-theorem in d = 4 in [12]. These followed the lower dimensional cases (d = 2, 3) treated in [8,9], where the strong subadditive property of the entropy was used for spheres (intervals or circles in d = 2 and d = 3 respectively) on the light-cone to show the monotonicity of the c and F quantities. In particular, the result (3.29) for the entropy for arbitrary regions on the null cone will allow us to see explicitly why the Markov property has to be invoked as a key ingredient in d = 4, as opposed to the d = 2 and d = 3 cases. However, from the outset we can say that the Markov property plays an important hidden role even in dimensions lower than d = 4. This is because if the strong subadditive inequality can teach us something non-trivial about the RG running, it must be the case that this inequality saturates for a CFT, where no relevant RG running is taking place. This shows the precise reason of the geometric setup of these theorems involving regions on the null cone. This is basically the only case where the Markov property holds for a CFT. 15 Let us first review the arguments in [9]. We start with a boosted sphere of radius √ rR lying on the null cone between the time slices at time |t| = r and |t| = R > r. We then take a large number N of rotated copies of this sphere, as equally distributed on the unit sphere of directions as possible. 16 From strong subadditivity we get the inequality in the limit of large N S( In this expressionS(l) are the entropies of "wiggly" spheres that come about in the process of intersecting and joining boosted spheres in the SSA inequality -see Fig. 4. 
The wiggly spheres have an approximate radius $l \in (r, R)$, and lie around the surface of equal time $|t| = l$; the deviations from the perfect sphere of radius $l$ at $|t| = l$ form the wiggles, which lie on the null cone and have a typical width $\sim l/N^{1/(d-2)}$ that tends to zero for large $N$. $\beta(l)$ is the density of wiggly spheres as the number of boosted spheres $N \to \infty$, divided by $N$.$^{17}$ Its explicit form depends on the dimension $d$. In a sense these wiggly regions tend to spheres of radius $l$ for large $N$, but we have to work out how exactly the entropies behave in this limit. Note that even if the amplitude of the wiggles decreases with $N$, this is not the case for their slope, which remains a fixed function of $l$ in the limit $N \to \infty$.
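The radius $\sqrt{rR}$ quoted for the boosted sphere can be checked directly from the geometry. The sketch assumes the cone $|t| = |\vec x|$ and uses that the two extreme points of the boosted sphere, at $|t| = r$ and $|t| = R$, are antipodal on it; the labels $p_r$, $p_R$ and the unit vector $\hat n$ are introduced here only for illustration.

```latex
\begin{align}
 p_r &= \left(|t| = r,\ \vec x = -r\,\hat n\right) , \qquad
 p_R = \left(|t| = R,\ \vec x = R\,\hat n\right) , \\
 (p_R - p_r)^2 &= (R + r)^2 - (R - r)^2 = 4\, rR
 \quad\Longrightarrow\quad \text{diameter} = 2\sqrt{rR} .
\end{align}
```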
At this point three different questions arise, which have to be understood in order to extract useful information for the monotonicity theorems from (5.1). The first question is whether this inequality contains cutoff independent information, that is, whether the divergent terms cancel between the two sides of the inequality. Since divergences are local on the boundary of the regions, this can be rephrased as whether the new features on the wiggly spheres, coming for example from the locus of intersections of two or more spheres, give rise to new unbalanced divergent terms. The second question is whether, in case the inequality contains information about finite quantities, this can be extracted in a useful way; in other words, whether the wiggly sphere entropies can be related to sphere entropies. The third and last question is whether the inequality will teach us something about the central charges at the fixed points of the RG. We will discuss these three questions in turn.

16 It is not possible to distribute them in a regular fashion for $d > 3$. The details of this distribution on the unit sphere of directions turn out to be irrelevant as long as a uniform distribution is approached for large $N$.

17 Strictly speaking the integral in (5.1) is a sum over $N$ wiggly sphere entropies divided by $N$. The notation with an integral and a density of wiggly spheres of the same radius is a convenience here, which will make sense for later expressions when we take the limit $N \to \infty$ and more information about the entropies of the wiggly spheres is introduced.

The inequality is UV finite
Unbalanced divergences in the inequality in principle could appear due to the cusps formed at the intersection and union of smooth spheres. We want to present a slightly different geometrical setup which bypasses this issue about divergent terms in any dimension.
The idea is to slightly deform the spheres of radius √ rR on the left hand side of the inequality along the null cone and around the points of intersection with other rotated spheres such that all intersections and unions are now smooth (we can choose infinitely many smooth derivatives). See Fig. 5. In this case there are no cusps and it is clear that the divergent terms cancel in any regularization. The price we pay is that now we do not have perfect spheres on the left hand side of the inequality, and they are replaced by wiggly spheres of approximate radius √ rR. The inequality now reads whereS(l) is the entropy of a wiggly sphere of approximate radius l and again the integral on the right hand side is a shortcut for a sum over N terms. In the present case this is not a big price to pay since we already have to deal with the wiggly spheres on the right hand side. The size of the new wiggles used to smooth out the cusps can be made arbitrarily small. While this approach sidesteps the issue of divergences arising at the cusps, in [12] we argued that the divergences cancel out from (5.1), even in presence of cusps. We argued in two steps, assuming a covariant cutoff. 18 For completeness, in the rest of this section we will review and discuss these arguments. 1) First, since (5.1) was obtained by a series of SSA inequalities, the Markov property requires that the divergences cancel for a CFT. Let us see how this comes about. The new divergences on the new local features of the intersections and unions are given by integrals of local geometric terms on the defects of the surface. An essential point is that these defects live on a null cone. The leading divergence is proportional to the defect dimensions, and we also have new terms for all subleading integer powers corresponding to integration of the defect curvatures along the defect. For a CFT the dimensions of these terms are compensated by negative integer powers of the cutoff (or a logarithm if the power is zero).
Let us focus on $d = 4$. We have linear terms growing as $L/\epsilon$ from the intersection of two spheres in a curve of size $L$ and, from the same defect, a term proportional to $\log(L/\epsilon)$ due to the integral of the curvature of the intersection curve along the defect. From the vertex at the intersection of three spheres we should also get a logarithmic term. Now, the argument is that the coefficients of these contributions are either zero or have opposite signs for the contributions of the defect to the union and to the intersection that gave rise to it in the SSA inequality. Let us first consider the leading divergences, where no curvature terms are present. Here the contribution is the same as for the same type of defect on a null plane rather than a cone. The defect will not contribute, because there is no geometric quantity depending on the defect "angles" on which the entropy can depend, and which would make the defect contribution different from that of the plane without defect. This is just a manifestation of the argument in Sec. 2 about functionals on a null plane being independent of $\gamma$. In other terms, boosting these geometries while keeping the null plane and the location of the defect invariant, one can squash the planes and make them as similar to a single plane without defect as we want. To be more explicit, take for example the case of the vertex in $d = 4$. The vertex defines three spatial lines with unit tangents $t_1$, $t_2$ and $t_3$. However, these tangents live in a three-dimensional null plane. Therefore they can all be written as linear combinations of a spatial vector living in a two-dimensional plane orthogonal to the null vector $k$, and $k$ itself: $t_i = v_i + \alpha_i k$, with $v_i^2 = 1$, $v_i \cdot k = 0$. In any invariant formed by the three vectors all contributions from the component along $k$ will vanish, and then the invariant will be the same as the one formed by three lines in a single two-dimensional plane, which of course does not define a real vertex.
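The last step can be checked explicitly for the scalar products. The sketch below picks a concrete null vector $k$ and sample values for the angles and for the coefficients $\alpha_i$ (all of them arbitrary choices made here for illustration), and compares the Gram matrix of the $t_i$ with that of the $v_i$ in four-dimensional Minkowski space.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components ordered (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

# Null vector k generating the null plane (sample choice).
k = (1.0, 1.0, 0.0, 0.0)

# Unit spatial vectors v_i in the y-z plane, orthogonal to k (sample angles).
angles = (0.3, 1.7, 4.0)
v = [(0.0, 0.0, math.cos(a), math.sin(a)) for a in angles]

# Tangents t_i = v_i + alpha_i * k with arbitrary sample coefficients alpha_i.
alphas = (0.5, -2.0, 3.25)
t = [tuple(vc + al * kc for vc, kc in zip(vec, k)) for vec, al in zip(v, alphas)]

# Gram matrices of pairwise scalar products.
gram_t = [[minkowski_dot(a, b) for b in t] for a in t]
gram_v = [[minkowski_dot(a, b) for b in v] for a in v]
```

Since $k^2 = 0$ and $v_i \cdot k = 0$, the two matrices agree for any choice of $\alpha_i$, so invariants built from scalar products cannot distinguish the vertex from three coplanar lines.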
Hence we conclude that these terms have zero coefficient and do not appear in the entropy. The holographic examples in Sec. 4 also illustrate this. For $d = 3$ and $d = 4$ we showed there is no $\log\epsilon$ (resp. no $1/\epsilon$) contribution from the cusps.
In $d = 4$ we also have the possibility of a curvature term on the intersection of two spheres. This can sense the form of the null cone and in this way bypass the arguments in Sec. 2. In writing the contribution of the curvature term we are allowed to use the gradient operator $\nabla_\mu$ on the vector $k$, for example, to produce local invariants. However, these gradients are defined on the defect only, and then the indices of the derivatives have to be contracted with one of the defect directions. This defect is locally formed by the intersection of two spatial planes inside the same null hyperplane with null vector $k$. Each spatial plane has another null vector $q_i$ that defines it, such that $q_i^2 = 0$, $q_i \cdot k = 1$. There is an ambiguity in this representation of the planes in the scale of $k$, as we can freely rescale $k \to \lambda k$, $q_i \to (1/\lambda) q_i$. Then, in order to produce the integrand of the contribution we have to write an invariant using the same number of vectors $q_i$ as of $k$. There is only one nontrivial invariant with the right dimensions. It requires a choice of ordering of the two vectors $q_1$, $q_2$, which can be assigned, for example, by choosing first the one to the right of the direction of integration along the intersection. This orientation changes sign when we compute the contributions of this defect to the intersection and the union of the two spheres, and hence the full log contribution of these defects to the SSA inequality vanishes. In our general analysis in Sec. 3, and the holographic case in Sec. 4, we have in fact learned a bit more. We have shown that the total coefficient of the log term is a topological invariant, and it is always the same for any shape on the null cone. This is given by an integral of the intrinsic curvature of the surface, giving the Euler number (the only nonvanishing term in Solodukhin's formula [34] in this case). Hence, the $\log\epsilon$ contribution clearly cancels from SSA.
To see how this fits with the previous argument, suppose we have a normalized contribution $\log\epsilon$ for any shape, and we are doing the SSA of two spheres of radius $\sqrt{rR}$. The logarithmic coefficient for the intersection, and likewise for the union, is then the sum of a smooth part and a cusp part, where the smooth part comes from integration of the constant intrinsic curvature of the spheres and is proportional to the total solid angle. Summing the two resulting equations and using $(\text{area}_\cap + \text{area}_\cup)/(4\pi rR) = 2$ we get $\text{cusp}_\cap = -\text{cusp}_\cup$, which coincides with the previous argument.
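The bookkeeping behind this statement can be sketched as follows, assuming the normalization in which the total coefficient of $\log\epsilon$ equals 1 for every shape. The explicit form of the two equations is a reconstruction from the surrounding text, not a quotation.

```latex
\begin{align}
 1 &= \frac{\mathrm{area}_\cap}{4\pi rR} + \mathrm{cusp}_\cap , &
 1 &= \frac{\mathrm{area}_\cup}{4\pi rR} + \mathrm{cusp}_\cup , \\
 2 &= \underbrace{\frac{\mathrm{area}_\cap + \mathrm{area}_\cup}{4\pi rR}}_{=\,2}
      + \mathrm{cusp}_\cap + \mathrm{cusp}_\cup
 && \Longrightarrow\quad \mathrm{cusp}_\cap = -\,\mathrm{cusp}_\cup .
\end{align}
% The smooth pieces of the intersection and of the union together make up
% the two full spheres of radius \sqrt{rR}, hence the total area 2 (4 \pi r R).
```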
2) The previous argument shows that the inequality is free from divergences for a CFT. If we add a relevant deformation, other divergent terms can appear with different powers of $\epsilon$, where some cutoff powers are replaced by powers of the coupling constant. However, the important point is that these terms are again local on the boundary and must have the same geometric structure as for a CFT, being integrals of local geometric tensors on the boundary. That is, the only change is the replacement of powers of the cutoff by coupling constants. Then, the previous argument still gives an inequality free of divergences.

Converting wiggly spheres into spheres
We would like to convert wiggly spheres into spheres in (5.1) or (5.4). It turns out that this is correct for $d = 2$ (since there are no wiggly intervals) and for $d = 3$, where terms produced by the wiggles go to zero for large $N$. This is not the case for $d = 4$, and the naive replacement of wiggly spheres by spheres just violates the Markov property. Let us see this in more detail. For a CFT in $d = 4$ the entropy for a sphere has the form $S(l) = c\, \frac{l^2}{\epsilon^2} - 4a \log(l/\epsilon)$.
If we attempt to plug this formula into the Markov equation, assuming wiggly spheres can be replaced by spheres, we find that the equation is not satisfied. The area term does indeed cancel, since $\int_r^R dl\, \beta(l)\, l^2 = rR$ (5.10), and the constant $\log\epsilon$ term cancels as well due to (5.3). However, this is not the case for the $\log(l)$ term.
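These statements can be checked numerically with the $d = 4$ density quoted in the text, $\beta = rR/(l^2(R-r))$. The sketch below is an illustration with sample radii chosen here: it verifies that $\beta$ integrates to one, that the area term cancels, and that the $\log(l)$ term leaves a nonzero remainder when wiggly spheres are naively replaced by spheres.

```python
import math

def beta_d4(l, r, R):
    """Density of wiggly spheres for d = 4, as quoted in the text."""
    return r * R / (l**2 * (R - r))

def integrate(f, a, b, n=100_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

r, R = 1.0, 2.0  # sample radii

# Normalization of the density: responsible for the cancellation of the constant log(eps) term.
norm = integrate(lambda l: beta_d4(l, r, R), r, R)

# Area term: the l^2 moment must equal the squared radius rR of the boosted sphere.
area_moment = integrate(lambda l: beta_d4(l, r, R) * l**2, r, R)

# Log term: mismatch between log(sqrt(rR)) and the beta-weighted average of log(l).
log_moment = integrate(lambda l: beta_d4(l, r, R) * math.log(l), r, R)
log_mismatch = math.log(math.sqrt(r * R)) - log_moment
```

For these radii the mismatch equals $\tfrac{3}{2}\log 2 - 1 \approx 0.04$, so the $\log(l)$ term indeed does not cancel on its own.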
The issue here is that there is a nontrivial contribution to the wiggly sphere entropy from the finite term in (4.33) that comes together with the logarithmic term; this contribution, however, cancels for spheres at constant t on the right hand side of (5.9). This invalidates the replacement of wiggly spheres by spheres. We will now see that taking this difference into account correctly restores the Markov equality.
With $l = \sqrt{x^2 + y^2 + z^2}$, and $\theta$ the usual polar angle, the boosted sphere of radius $\sqrt{rR}$ is described by a curve $l(\theta)$ on the null cone.
Evaluating the finite term of (4.33) on this curve, we get a constant integrand (except for higher order terms in $1/N$) on the surface of the wiggly sphere of approximate radius $l$.$^{19}$ Taking this term into account, the Markov equation for the finite terms is now satisfied, once we replace $\beta = \frac{rR}{l^2 (R-r)}$, corresponding to $d = 4$. Note that the cancellation happens in each SSA equality, but in terms of the wiggly spheres it happens "nonlocally", and involves the whole range $l \in (r, R)$.
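The role of the finite gradient term can be illustrated with a short numerical sketch. Two ingredients are assumptions made here rather than formulas quoted from the text: the boosted sphere is taken to be the curve $l(\theta) = 2rR/\big((R+r) - (R-r)\cos\theta\big)$, which runs from $l = r$ to $l = R$ with $1/l$ linear in $\cos\theta$, and the $d = 4$ universal term is taken to have the Liouville-type form $\int d\Omega\, \big[\log\gamma + \tfrac12 (\nabla_\Omega \log\gamma)^2\big]$, up to an overall normalization. With these assumptions the functional takes the same value on the boosted sphere as on a round sphere of radius $\sqrt{rR}$, while the $\log$ term alone does not.

```python
import math

def boosted_sphere_radius(theta, r, R):
    """Assumed curve l(theta) of the boosted sphere on the null cone."""
    return 2.0 * r * R / ((R + r) - (R - r) * math.cos(theta))

def sphere_integral(l_of_theta, gradient_weight, n=20_000, h=1e-6):
    """Integrate log(l) + gradient_weight * (d log l / d theta)^2 over the unit
    two-sphere, for an axially symmetric profile l(theta)."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dtheta
        # Central finite difference for d(log l)/d(theta).
        dlog = (math.log(l_of_theta(th + h)) - math.log(l_of_theta(th - h))) / (2.0 * h)
        total += (math.log(l_of_theta(th)) + gradient_weight * dlog**2) * math.sin(th) * dtheta
    return 2.0 * math.pi * total

r, R = 1.0, 3.0  # sample radii

def curve(th):
    return boosted_sphere_radius(th, r, R)

round_sphere = 4.0 * math.pi * math.log(math.sqrt(r * R))
with_gradient_term = sphere_integral(curve, 0.5)
log_term_only = sphere_integral(curve, 0.0)
```

The relative weight 1/2 of the gradient term is the value singled out by this boost invariance; other weights do not reproduce the round-sphere result.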
Therefore, a finite term coming from the wiggles obstructs replacing the wiggly spheres by spheres. The idea of [12] was to take advantage of the Markov property of a CFT to subtract from the inequality for the entropies $S$ of the deformed theory the equation corresponding to the entropies $S_0$ of the UV CFT. This can be done at no cost since the SSA combination of $S_0$ vanishes exactly. We have shown that, in addition, the divergent terms coming from massive deformations are also Markovian and cancel in the SSA inequality; we can subtract them as well, without spoiling the inequality. Then, in any dimension, we safely replace $S(l) \to \Delta S(l) = S(l) - S_0(l) - \text{massive divergent terms}$ (5.14) in (5.4). Now the finite terms of the wiggles coming from the UV fixed point disappear in the subtraction, and we are free to replace subtracted wiggly spheres by subtracted spheres, taking the limit $N \to \infty$ and getting the inequality (5.15). We still have to check that, for the deformed theory, there are no finite terms induced by a mass parameter that give a contribution from the wiggles surviving in the limit of small wiggles. In fact, the difference in the EE between a wiggly and a non-wiggly sphere is controlled by the UV. These terms should be proportional to some mass scale, or to the squared coupling constant $g^2$ of the deformation of the theory at the UV, which must be compensated by powers of $r$ and positive powers of the distance scale set by the size of the wiggles. In consequence, they do not contribute in the large $N$ limit. In more detail, a local term should be of the same form as the ones encountered for CFTs, but with a power of the cutoff replaced by one of a mass parameter. These contributions are divergent except for some non-generic perturbation dimensions. In any case a local term is always Markovian and can be subtracted as well. If the term induced by the deformation is nonlocal,$^{20}$ then the change from the wiggly sphere to the sphere is suppressed by powers of the wiggle size, and does not contribute in the limit. We have computed these wiggly massive corrections holographically in Appendix C. The result agrees with these expectations.

19 The boundary terms in (4.40) cancel automatically in the sum over wiggly spheres.
Note that for d = 3 the formula (4.30) gives no contribution for the wiggles, and we can safely replace wiggly circles by circles without subtracting the CFT entropies. But this is not the case in higher dimensions.

Irreversibility theorems
We then have (5.15) for spheres in any dimension, where the UV CFT entropy along with other possible divergent contributions have been subtracted. These inequalities are equivalent to the differential ones obtained by taking the limit $r \to R$, namely $r\,\Delta S''(r) - (d-3)\,\Delta S'(r) \le 0$ (5.16). Writing the entropy as a function of the area $a$ rather than the radius, we get the compact expression $\Delta S''(a) \le 0$ (5.17), valid in any dimension. Thus the constraint for $\Delta S$ is that it must be concave as a function of the area. For completeness, let us briefly review here the results of [12]. With our definition of $\Delta S$, in which the UV CFT terms and other possible divergent terms have been subtracted from the entropy, in the UV limit of small $r$ all local geometric terms vanish and we get the leading "nonlocal" term (5.18) (see e.g. [48][49][50] for the structure of the entropy of spheres at fixed points). Concavity, Eq. (5.17), implies two relations between the short and long distance expansions for $\Delta S(a)$: 1) the slope of the $\Delta S(a)$ curve is bigger at the UV than at the IR; 2) given that $\Delta S(0) = 0$, the height at the origin of the tangent line at the IR has to be positive.
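The equivalence between concavity in the area and a second-order differential inequality in the radius is a one-line application of the chain rule. The sketch below assumes $a \propto r^{d-2}$ and sets the proportionality constant to one, which does not affect the sign.

```latex
\begin{align}
 a &= r^{d-2} , \qquad
 \frac{d\,\Delta S}{dr} = (d-2)\, r^{d-3}\, \frac{d\,\Delta S}{da} , \\
 \frac{d^2 \Delta S}{dr^2}
   &= (d-2)^2\, r^{2d-6}\, \frac{d^2 \Delta S}{da^2}
    + (d-2)(d-3)\, r^{d-4}\, \frac{d\,\Delta S}{da} , \\
 r\, \Delta S''(r) - (d-3)\, \Delta S'(r)
   &= (d-2)^2\, r^{2d-5}\, \frac{d^2 \Delta S}{da^2} \ \le\ 0
 \quad\Longleftrightarrow\quad \frac{d^2 \Delta S}{da^2} \le 0 .
\end{align}
```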
The first requirement, comparing (5.18) and (5.20), and provided ∆ < (d + 2)/2, gives rise to the "area theorem", that is, the decrease along the RG of the coefficient of the area term, $\Delta\mu_{d-2} \le 0$ (5.20). In d = 2 the area coefficient is dimensionless and (5.20) coincides with the c-theorem. The area theorem was obtained in [11] using monotonicity of the relative entropy. The second requirement gives for d = 3 the F-theorem, and for d = 4 the a-theorem, $\Delta A \le 0$. The inequality does not constrain the sign of the subleading terms, in particular the universal terms, for d > 4. In addition to these constraints, which come from comparing the UV and IR expansions, we have to check (5.17) on the UV and IR expansions themselves. At the IR we get again (5.22) and (5.23) for d ≥ 4. For d = 3 we get information on the sign of the first subleading correction to the constant, $\Delta S^{d=3}_{IR} = \Delta\mu_1\, r - \Delta F - \frac{k}{r^{\alpha}} + \ldots$ (5.24), where the last term is purely infrared in origin and α is related to the leading irrelevant dimension of the operator driving the theory to the IR [49]. We get k > 0 from (5.17). This coincides with holographic calculations [50], and free field theory calculations [51]. At the UV we get that the sign of the coefficient $c_0$ in (5.18) is the same as that of ∆ − (d + 2)/2. This also agrees with holographic calculations [49]. Notice that while the inequality (5.17) saturates at the UV, it does not saturate at the IR for d ≥ 4. The SSA inequality always saturates at the IR for regions that are smooth enough (with IR size curvatures), but this does not allow us to derive (5.17), precisely because we are not allowed to convert wiggly spheres into spheres for these large wiggles.
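The second requirement and the sign of k can be spelled out with the standard tangent-line property of concave functions. The following is a sketch for d = 3, where a ∝ r so that concavity in a is concavity in r. It assumes that only the two leading IR terms matter for the intercept at large r, and that the correction term in (5.24) has the form $-k\,r^{-\alpha}$ with α > 0, which is our reading of that equation.

```latex
\begin{align}
&\text{concavity of } f \equiv \Delta S:\qquad f(0) \le f(a) - a\,f'(a) \quad \text{for all } a > 0, \\
&\Delta S(0) = 0,\quad \Delta S_{IR} \simeq \Delta\mu_1\, r - \Delta F
 \;\Rightarrow\; \Delta S - r\,\Delta S' \simeq -\Delta F \ge 0
 \;\Rightarrow\; \Delta F \le 0, \\
&\frac{d^2}{dr^2}\left(-\frac{k}{r^{\alpha}}\right) = -\frac{k\,\alpha(\alpha+1)}{r^{\alpha+2}} \le 0
 \;\Rightarrow\; k \ge 0 \qquad (\alpha > 0).
\end{align}
```

The first line is the statement that the tangent line at any point lies above the curve; evaluated at the origin it gives the positivity of the intercept used in the text.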

Final remarks
We have found that the Markov property for EE on the plane, and on the light-cone for CFTs, has an origin that is essentially geometric. Because of that, this property extends to other quantities, e.g. the Renyi entropies; it does not depend on other specific properties, such as the SSA inequality, that the EE has and the Renyi entropies generally do not have. The Markov property together with Lorentz invariance determines the general form of the entropies on the light-cone for a CFT, which turns out to be related to dilaton effective actions in two fewer dimensions. The universal part is completely fixed by the coefficient A of the conformal anomaly in even dimensions and is given by the Wess-Zumino anomaly action. For odd dimensions the universal part is just a constant F for any region on the light-cone.
Beyond cases that are conformal transformations of the null plane in Minkowski space for CFTs, we expect that the Markov property also holds for any QFT on a spacetime having a bifurcate Killing horizon, when the state is invariant under the Killing symmetry. This is because the Killing symmetry will squash all regions to the bifurcation surface and keep a covariant cutoff invariant, leading to constant entropies on the horizon. This includes, for example, an arbitrary QFT in de Sitter space in the de Sitter invariant state, for regions on the cosmological horizon, and a QFT in the Hartle-Hawking state, for regions on the horizon of a stationary black hole.
The Markov property for the Renyi entropies extends the constraints on the density matrix beyond Markovianity. For finite systems, the Markov property for all Renyi entropies in subsystems A, B, C, $S_n(AB) + S_n(BC) = S_n(B) + S_n(ABC)$, is only possible if the global state is of the form $\rho_{ABC} = \rho_{AB_1} \otimes \rho_{B_2 C}$, with $B_1$ and $B_2$ two subsystems partitioning B. Hence, $\rho_{AC} = \rho_A \otimes \rho_C$ is a product. This suggests that the vacuum state is roughly a product over different null pencils in vacuum QFT, though this is not quite correct mathematically for a theory in d > 2 with an interacting UV fixed point. In this case, the algebras corresponding to finite regions on the light-cone (that do not generate a domain of dependence containing spacetime volume) actually have no degrees of freedom. In any case, where this identification makes sense, namely for free theories and CFTs in d = 2, one can check that the structure of the vacuum is in fact a product state, rather than a more general Markovian state in which classical correlations are allowed between A and C. For free theories this is described in [15], while for a CFT in d = 2 the vacuum is a product across the two null directions.
The present investigation started in the course of attempting to generalize the entropic proofs of the c and F theorems to d = 4. In this sense it is intriguing that we have found that the entropies on the null cone are classified by dilaton effective actions, which are fundamental in the proof by Komargodski and Schwimmer of the a-theorem [24]. However, in the present case, the dilaton lives in d − 2 dimensions rather than d dimensions. This connection was also noticed by Solodukhin in [16]. Another difference is that our non-dynamical dilaton does not necessarily obey unitarity constraints. It would be interesting to investigate whether this connection could be the basis for extending the irreversibility theorems to dimensions higher than d = 4.
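The easy direction of the finite-system statement above can be checked numerically: a state of the form $\rho_{AB_1} \otimes \rho_{B_2 C}$ saturates the Renyi Markov relation for every n, while a generic state does not. The following is a minimal sketch, assuming four qubits ordered as (A, B1, B2, C) and randomly drawn density matrices; the helper names (`random_density_matrix`, `partial_trace`, `renyi_entropy`, `markov_defect`) are hypothetical and introduced only for this illustration.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (an illustrative choice of ensemble)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def partial_trace(rho, dims, keep):
    """Reduce rho, defined on factors with local dimensions `dims`,
    to the factors listed in `keep` (ascending order)."""
    n = len(dims)
    t = rho.reshape(tuple(dims) + tuple(dims))
    # Trace out unwanted factors from the last to the first, so that the
    # remaining axis positions stay valid after each contraction.
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def renyi_entropy(rho, n):
    """Renyi entropy S_n = log(Tr rho^n) / (1 - n); n = 1 is von Neumann."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-14]
    if n == 1:
        return float(-np.sum(p * np.log(p)))
    return float(np.log(np.sum(p ** n)) / (1.0 - n))

def markov_defect(rho, dims, n):
    """S_n(AB) + S_n(BC) - S_n(B) - S_n(ABC), with A = factor 0,
    B = factors (1, 2) and C = factor 3."""
    s = lambda keep: renyi_entropy(partial_trace(rho, dims, keep), n)
    return s([0, 1, 2]) + s([1, 2, 3]) - s([1, 2]) - s([0, 1, 2, 3])

rng = np.random.default_rng(0)
dims = (2, 2, 2, 2)  # A, B1, B2, C
rho_product = np.kron(random_density_matrix(4, rng), random_density_matrix(4, rng))
rho_generic = random_density_matrix(16, rng)

for n in (0.5, 1, 2, 3):
    print(n, markov_defect(rho_product, dims, n), markov_defect(rho_generic, dims, n))
```

The defect vanishes for the product form because Renyi entropies are additive on tensor products. The converse, that saturation for all n forces this form, is the statement made in the text and is not tested by this sketch.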
We have checked that the general expressions for the entropy on the cone hold holographically. It is surprising that exact holographic expressions can be found for the entropy of such a large class of regions, though we can understand the origin of this simplification from more general principles. We have discussed how this simplification also extends to $\lambda^{-1}$ and $N^{-1}$ corrections. Holographically, the origin of all the simplifications is the fact that the entangling surface lies on a maximally symmetric null cone in the bulk.
It would be interesting to obtain the expected form of the Renyi entropies on the cone from a direct calculation of the holographic Renyi entropies. In this case we would have to deal with a (in principle complicated) Schwinger-Keldysh representation with Lorentzian conical defects in the bulk [52], because we cannot use the Euclidean representation [53,54] for generic regions living on the null cone. Our best guess is that the bulk manifold should still be locally AdS, in such a way that the defects can be located on a fixed bulk null cone. If this is the case, the Markov property and the expected expansion of the Renyi entropies would hold for the same reasons discussed in this paper for the entropy.

A Lorentz invariant regularization using mutual information
In this Appendix we review the Lorentz invariant regularization of EE provided by the mutual information for any QFT in any dimension. This is discussed in detail for d = 3 in [47]. We are restricting attention to smooth entangling surfaces, which is all we need in this paper.
Consider a smooth entangling surface γ. We take a spatial unit vector η normal to γ, and a function ε(x) on γ, which is a smoothly varying short distance on the surface. We will later take the limit ε(x) → 0, and impose that in this limit the derivatives of ε(x) approach zero at the same rate as ε(x). We can construct two spatial surfaces, $\gamma^+$ and $\gamma^-$, one on each side of γ, by using the elements of the "framing" (η, ε). The idea is to use the mutual information $I(\gamma^+, \gamma^-)$ as a regularization of the entropy. More precisely, we take the definition (A.3). For the Renyi entropies we use analogously the mutual Renyi entropies $I_n(\gamma^+, \gamma^-) = S_n(\gamma^+) + S_n(\gamma^-) - S_n(\gamma^+ \cup \gamma^-)$. The 1/2 factor in (A.3) takes into account that the mutual information for complementary regions in a global pure state is twice the entropy. An important point is that the mutual information is regularization independent, that is, taking the continuum limit of any regularization for the entropies on the right hand side of (A.3) should give the same finite result. Hence, $S_{\rm reg}$ is a quantity that belongs to the continuum theory, and in particular is Lorentz invariant in vacuum. The particular symmetric framing on both sides of γ in (A.3) gives the same regularized entropy for complementary regions, as expected for the entropy of global pure states. However, $S_{\rm reg}$ depends on the framing, which includes the vector field η, and it is not a function of the entangling surface γ alone. In order to get rid of this unwanted framing dependence we note that, as we are taking the ε → 0 limit, we only retain non-positive powers of ε. The dependence on η can only show up in the divergent terms. As these are produced by ultralocal entanglement between regions arbitrarily close to both sides of γ, these contributions can be written as integrals of local geometrical terms along γ. Now we can just subtract these terms to eliminate the frame dependence. The result is finite, Lorentz invariant, and completely defined by the theory itself.
It can be thought of as a "minimally subtracted" entropy. While $S_{\rm reg}(\gamma)$ does not have the property of being positive for arbitrary regions, it does retain some other important properties of entropy. The symmetry between complementary regions is one of these properties, and the other is strong subadditivity. This is shown as follows.
Figure 6: Strong subadditivity of the regularized entropies of two surfaces $\gamma_A$ and $\gamma_B$ with smooth intersection and union. The framings of $\gamma_A$ and $\gamma_B$ can be chosen such that they are compatible, i.e., they can be split along the black line in the middle, and reconnected to form the framings of $\gamma_{A\cap B}$ and $\gamma_{A\cup B}$.
First we take two regions $\gamma_A$ and $\gamma_B$ with smooth intersection $\gamma_A \cap \gamma_B$ and union $\gamma_A \cup \gamma_B$. Then we take compatible framings, as in Fig. 6. We expect that the thin strip terms exactly cancel in the combination (A.5). This is because these strip entropies should be taken as expansions in inverse powers of ε, and these expansions should be local and extensive along the strips. Thinking in terms of the Renyi entropies, this should be a property of the operator product expansion of surface twist operators. The cancellation (A.5) gives rise to the strong subadditivity of the regularized entropies just because the entropies themselves are strongly subadditive. In a sense, since the entropies are strongly subadditive, subtracting the frame dependent terms cannot change this fact, because divergent terms are always Markovian for smooth enough surfaces. For holographic theories $S_{\rm reg}$ is just the entropy with the usual Lorentz invariant cutoff and the divergent terms subtracted.
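The origin of the 1/2 factor in (A.3) can be illustrated in a finite-dimensional toy model: for a global pure state on two complementary subsystems, the mutual information equals twice the entanglement entropy. The sketch below assumes a random pure state on a bipartite Hilbert space of dimensions 3 and 5. It is only an analogy for the continuum regularization, and the function names are hypothetical.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Entropy -Tr(rho log rho), discarding numerically zero eigenvalues."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-14]
    return float(-np.sum(p * np.log(p)))

def reduced_states(psi, dim_a, dim_b):
    """Reduced density matrices of a pure state psi on H_A (x) H_B."""
    m = psi.reshape(dim_a, dim_b)        # m[i, k] = amplitude of |i>_A |k>_B
    return m @ m.conj().T, m.T @ m.conj()

rng = np.random.default_rng(1)
dim_a, dim_b = 3, 5
psi = rng.normal(size=dim_a * dim_b) + 1j * rng.normal(size=dim_a * dim_b)
psi /= np.linalg.norm(psi)

rho_a, rho_b = reduced_states(psi, dim_a, dim_b)
s_a = von_neumann_entropy(rho_a)
s_b = von_neumann_entropy(rho_b)
s_ab = von_neumann_entropy(np.outer(psi, psi.conj()))  # vanishes for a pure state

# Mutual information I(A, B) = S(A) + S(B) - S(AB)
mutual_information = s_a + s_b - s_ab
print(s_a, mutual_information / 2)
```

Because S(AB) = 0 and S(A) = S(B) for a pure state, half the mutual information reproduces the entropy, which is the counting behind the normalization of the regularized entropy.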

B Extrinsic curvatures on the null cone
In this Appendix we argue that the extrinsic curvatures on the null cone do not give rise to additional geometric invariants besides those studied in Sec. 3.
We have a surface $r = \gamma(\Omega)$ on the null cone $r^+ = 0$. Let us define $n^{(1)} = \hat r - \hat t$ as the null vector parallel (and orthogonal) to the cone. Let $q = \frac{1}{2}(\hat r + \hat t)$, with $q^2 = 0$, $q \cdot n^{(1)} = 1$. The orthogonal vector space to γ is spanned by $n^{(1)}$ and another null vector $n^{(2)}$, normalized such that $n^{(1)} \cdot n^{(2)} = 1$. The extrinsic curvatures corresponding to $n^{(i)}$ are defined with the projector $P^{\alpha}{}_{\beta} = g^{\alpha}{}_{\beta} - n^{(1)\alpha} n^{(2)}_{\beta} - n^{(2)\alpha} n^{(1)}_{\beta}$ onto the subspace parallel to γ. In evaluating them we use that the derivatives of $\hat t$ are zero, and hence the gradient of q is one half that of $n^{(1)}$; the second derivatives of γ are finally projected onto the parallel subspace. We have that $\nabla^{\rm int}_\mu \gamma = \nabla_\mu \gamma + (\nabla\gamma)^2\, n^{(1)}_\mu$ because this vector is parallel to the surface. Hence $(\nabla_\mu \nabla_\nu \gamma)^{\rm int} = \nabla^{\rm int}_\mu \nabla^{\rm int}_\nu \gamma - (\nabla\gamma)^2\, g^{\rm int}_{\mu\nu}/\gamma$. Using angular coordinates for the surface we have the intrinsic metric $ds^2 = \gamma(\Omega)^2\, d\Omega^2$. The extrinsic curvatures can then be written with all covariant derivatives and contractions taken with respect to the metric $g_{\mu\nu}$ of the unit sphere. On the other hand, the Ricci tensor and Ricci scalar of the intrinsic metric follow from the formulae for conformal transformations, using that for the unit sphere $R_{\mu\nu} = (d_\perp - 1)\, g_{\mu\nu}$ and $R = d_\perp (d_\perp - 1)$. Therefore, from (B.4) and (B.8) we conclude that, using the extrinsic curvatures of γ, we cannot form invariants in addition to the ones formed with the intrinsic geometry of γ on the null cone. For example, the invariant multiplying the type B anomaly coefficient in Solodukhin's formula [34] for the universal logarithmic term of the entanglement entropy in d = 4 vanishes. Hence only the A anomaly contributes on the cone.

C EE for wiggly spheres in holographic RG flows
We are going to compute holographically the terms induced by a mass parameter in the difference between the entropy of a sphere of radius R and that of a wiggly sphere centered around the same radius. We work in d = 4 for concreteness. As a model for the wiggly sphere we consider (C.1). We are looking for the limit of small wiggle size, l → ∞, a → 0, and, as in the proof of the a-theorem, we take the size of the wiggles to be of the order of their width, $a \sim l^{-1}$. The result is independent of m. We choose m = 0. The solution for the extremal surface for the UV CFT is given by (4.18), $(r(\theta,\Omega))^{-1} = R^{-1}\left[1 + a\,Y_{l0}(\Omega)\,\frac{\sqrt{\pi}\,\Gamma(3+l)}{2^{2+l}\,\Gamma\!\left(\frac{3+2l}{2}\right)}\,(\cos\theta)^{l}\;{}_2F_1\!\left(\frac{l-1}{2},\frac{l}{2},\frac{3}{2}+l,\cos^2\theta\right)\right]$ (C.2). The function of l and θ multiplying $aY_{l0}$ has value 1 for θ = 0 and decays exponentially fast at large l for fixed θ > 0. It is not exponentially suppressed only for $\theta \lesssim l^{-1}$. This means that the deformation of the minimal surface due to the wiggles decays exponentially fast towards the interior of AdS and, for small wiggle width, is only relevant near the AdS boundary. Its contribution is therefore dominated (except for terms exponentially small in the inverse wiggle size) by the UV fixed point. Hence, in a holographic calculation we can just use the UV perturbed AdS metric to compute the effect of the mass deformation on the wiggles.
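The statement that the function multiplying $aY_{l0}$ in (C.2) equals 1 at θ = 0 can be checked in closed form. The following is a short sketch using the Gauss summation theorem and the Legendre duplication formula, with the hypergeometric parameters read off from (C.2).

```latex
\begin{align}
{}_2F_1(a,b;c;1) &= \frac{\Gamma(c)\,\Gamma(c-a-b)}{\Gamma(c-a)\,\Gamma(c-b)},
\qquad a = \tfrac{l-1}{2},\quad b = \tfrac{l}{2},\quad c = l+\tfrac{3}{2},\quad c-a-b = 2, \\
{}_2F_1\!\left(\tfrac{l-1}{2},\tfrac{l}{2};\,l+\tfrac{3}{2};\,1\right)
 &= \frac{\Gamma\!\left(l+\tfrac{3}{2}\right)}{\Gamma\!\left(\tfrac{l}{2}+2\right)\Gamma\!\left(\tfrac{l}{2}+\tfrac{3}{2}\right)}
  = \frac{2^{\,l+2}\,\Gamma\!\left(l+\tfrac{3}{2}\right)}{\sqrt{\pi}\,\Gamma(l+3)}, \\
\frac{\sqrt{\pi}\,\Gamma(3+l)}{2^{\,2+l}\,\Gamma\!\left(\tfrac{3+2l}{2}\right)}
 \times \frac{2^{\,l+2}\,\Gamma\!\left(l+\tfrac{3}{2}\right)}{\sqrt{\pi}\,\Gamma(l+3)} &= 1 .
\end{align}
```

The second line uses $\Gamma(z)\,\Gamma(z+\tfrac12) = 2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ with $z = \tfrac{l}{2}+\tfrac{3}{2}$, so the prefactor in (C.2) is exactly the normalization that sets the profile to 1 at the boundary.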
Near the boundary the metric is deformed to leading order as in (C.3), where g is proportional to the coupling constant and α = d − ∆, with ∆ the scaling dimension of the operator producing the RG flow. In terms of the r, θ coordinates, the change in the metric is $\delta ds^2 = -g^2\,(r\sin\theta)^{2\alpha-2}\left(\frac{dr^-}{2}\sin\theta + \frac{dr^+}{2}\sin\theta + r\cos\theta\,d\theta\right)^2$ (C.4). The variation of the area due to the variation of the metric is given by (C.5), where $h_{\mu\nu}$ is the induced metric on the surface, and the computation is over the unperturbed surface. Then we get for the difference of entropies between wiggly and normal spheres, to leading order in $g^2$, $\Delta A = \delta A_{\rm wiggly} - \delta A_{\rm sphere} = -\frac{g^2}{2}\int d\Omega\, d\theta\, \cos(\theta)^4 \sin(\theta)^{2\alpha-3}\, \Delta(r^{2\alpha})$ (C.6). The factor $\Delta(r^{2\alpha})$ decays exponentially towards the bulk and makes the perturbative expansion in the metric deformation valid. Using (C.2), and expanding for small wiggle size to second order to get a nontrivial angular integral, we get $\Delta A = \alpha\,(\alpha-1)\,a^2 g^2 R^{2\alpha}$ (C.7). The integrand is proportional to $\theta^{2\alpha-3}$ for small θ. Then the integral diverges for ∆ ≥ 3, which is the onset of massive divergent area terms in d = 4. The divergences give rise to local terms that are Markovian and can be subtracted. For ∆ < 3, we get a finite integral with the following behavior for large l: $\Delta A \sim a^2\, l^{-2(3-\Delta)}\, g^2 R^{2\alpha}$, ∆ < 3 (C.8). This clearly vanishes in the limit of small wiggle size and width. For 4 > ∆ > 3 we have, once the divergence for θ → 0 has been subtracted, the same result (C.8). Since we are taking the limit of small wiggles with fixed slope, $a \sim l^{-1}$, this term also vanishes in the limit of small wiggles. These terms represent the change of the nonlocal term (5.18) due to the wiggles.
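The power counting behind (C.8), and the vanishing of these terms in the limit, can be made explicit. The following is a sketch assuming that, at large l, the function multiplying $aY_{l0}$ in (C.2) depends on θ only through the combination lθ, written here as a hypothetical profile $f(l\theta)$, and using α = 4 − ∆ in d = 4.

```latex
\begin{align}
\int d\theta\; \theta^{2\alpha-3}\, f(l\theta)^2
 &= l^{-(2\alpha-2)} \int du\; u^{2\alpha-3}\, f(u)^2
 \;\propto\; l^{-2(3-\Delta)}, \qquad u = l\theta, \\
\Delta A &\sim a^2\, l^{-2(3-\Delta)}\, g^2 R^{2\alpha}
 \;\xrightarrow{\;a \,\sim\, l^{-1}\;}\; l^{-2(4-\Delta)}\, g^2 R^{2\alpha}
 \;\to\; 0 \qquad (l \to \infty,\ \Delta < 4).
\end{align}
```

The rescaled integral converges at small u only for α > 1, that is ∆ < 3; for 3 < ∆ < 4 the same scaling is understood to hold after the local divergent piece is subtracted, as stated in the text. In both ranges the fixed-slope condition $a \sim l^{-1}$ supplies the extra factor $l^{-2}$ that makes the term vanish.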