Entanglement, Holography and Causal Diamonds

We argue that the degrees of freedom in a d-dimensional CFT can be re-organized in an insightful way by studying observables on the moduli space of causal diamonds (or equivalently, the space of pairs of timelike separated points). This 2d-dimensional space naturally captures some of the fundamental nonlocality and causal structure inherent in the entanglement of CFT states. For any primary CFT operator, we construct an observable on this space, which is defined by smearing the associated one-point function over causal diamonds. Known examples of such quantities are the entanglement entropy of vacuum excitations and its higher spin generalizations. We show that in holographic CFTs, these observables are given by suitably defined integrals of dual bulk fields over the corresponding Ryu-Takayanagi minimal surfaces. Furthermore, we explain connections to the operator product expansion and the first law of entanglement entropy from this unifying point of view. We demonstrate that for small perturbations of the vacuum, our observables obey linear two-derivative equations of motion on the space of causal diamonds. In two dimensions, the latter is given by a product of two copies of a two-dimensional de Sitter space. For a class of universal states, we show that the entanglement entropy and its spin-three generalization obey nonlinear equations of motion with local interactions on this moduli space, which can be identified with Liouville and Toda equations, respectively. This suggests the possibility of extending the definition of our new observables beyond the linear level more generally and in such a way that they give rise to new dynamically interacting theories on the moduli space of causal diamonds. Various challenges one has to face in order to implement this idea are discussed.


Introduction
It has now been a decade since Ryu and Takayanagi [1,2] discovered an elegant geometric prescription to evaluate entanglement entropy in gauge/gravity duality. In particular, the entanglement entropy between a (spatial) region $V$ and its complement $\bar V$ in the boundary theory is computed as
$$ S(V) = \underset{v\sim V}{\mathrm{ext}}\; \frac{A(v)}{4 G_N} \,. \tag{1.1} $$
That is, one determines the extremal value of the Bekenstein-Hawking formula evaluated on bulk surfaces $v$ which are homologous to the boundary region $V$. In the subsequent years, holographic entanglement entropy has proven to be a remarkably fruitful topic of study. In particular, it provides a useful diagnostic with which to examine the boundary theory. For example, it was shown to be an effective probe to study thermalization in quantum quenches, e.g., [3][4][5][6] or to distinguish different phases of the boundary theory, e.g., [7][8][9]. In fact, such holographic studies have even revealed new universal properties that extend beyond holography and hold for generic CFTs, e.g., [10][11][12][13]. However, holographic entanglement entropy has also begun to provide new insights into the nature of quantum gravity in the bulk. As first elucidated in [14,15], the Ryu-Takayanagi prescription indicates the essential role which entanglement plays in creating the connectivity of the bulk geometry or, more generally, in the emergence of the holographic geometry. In fact, this has led to a new prescription to reconstruct the bulk geometry in terms of a new boundary observable known as 'differential entropy', which provides a novel way of sampling the entanglement throughout the boundary state [16][17][18][19].
The distinguished role of extremal surfaces in describing entanglement entropy has led to several other important insights. There is by now significant evidence that the bulk region which can be described by a particular boundary causal domain is not determined by causality alone, as one might have naively thought, but rather it corresponds to the so-called 'entanglement wedge,' which in general extends deeper into the bulk, e.g., [20][21][22][23]. That is, the entanglement wedge is the bulk region comprising the points which are spacelike-separated from extremal surfaces attached to the boundary region and connected to the corresponding boundary causal domain [22]. This entanglement wedge reconstruction in turn led to the insight that local bulk operators must have simultaneous but different approximate descriptions in various spatial subregions of the boundary theory, which resulted in intriguing connections to quantum error correction [24][25][26]. We also note that while it is not at all clear that a suitable factorization of the full quantum gravity Hilbert space corresponding to the inside and outside of an arbitrary spatial domain exists (there certainly is no obvious choice of tensor subfactors on the CFT Hilbert space), the RT prescription does provide a natural choice for such a factorization for extremal surfaces, and entanglement wedge reconstruction supports this point of view. It is therefore conceivable that a reorganization of the degrees of freedom which crucially relies on extremal surfaces will shed some light on the (non)locality of the degrees of freedom of quantum gravity, and this was in fact one of the original motivations for this work.
One interesting result that was brought to light by holographic studies of the relative entropy [27] was the 'first law of entanglement'. The relative entropy is again a general diagnostic that allows one to compare different states reduced to the same entangling geometry [28,29]. For 'nearby' states, the leading variation of the relative entropy yields a result reminiscent of the first law of thermodynamics, i.e.,
$$ \delta S_{EE} = \delta\langle H_m\rangle \,, \tag{1.2} $$
where $H_m$ is the modular or entanglement Hamiltonian for the given reference state $\rho_0$, i.e., $H_m = -\log\rho_0$. While the latter is a useful device at a formal level [30], in generic situations the modular Hamiltonian is a nonlocal operator, i.e., $H_m$ cannot be expressed as a local expression constructed from fields within the region of interest. However, a notable exception to this general rule arises in considering a spherical region in the vacuum state of a CFT, and in this case the first law (1.2) becomes
$$ \delta S_{EE} = \delta\langle H_m\rangle = 2\pi \int_B d^{d-1}x'\; \frac{R^2 - |\vec{x}-\vec{x}\,'|^2}{2R}\; \langle T_{tt}(\vec{x}\,')\rangle \,. \tag{1.3} $$
Here $B$ denotes a ball of radius $R$ centred at $\vec{x}$ on a fixed time slice, while $T_{tt}$ is the energy density in the excited state being compared to the vacuum. Examining this expression holographically, the energy density is determined by the asymptotic behaviour of the metric near the AdS boundary, e.g., [31]. In contrast, through Eq. (1.1), the variation of the entanglement entropy is determined by variations of the geometry deep in the bulk spacetime. Hence Eq. (1.3) imposes a nonlocal constraint on perturbations of the AdS geometry which are dual to excitations of the boundary CFT. However, if one examines this constraint for all balls of all sizes and all positions, as well as on all time slices, this can be re-expressed in terms of a local constraint on the bulk geometry [32][33][34], namely, perturbations of the AdS vacuum geometry must satisfy the linearized Einstein equations!
In terms of the boundary theory, the holographic results above point towards the utility of considering the entanglement entropy as a functional on the space of all entangling surfaces (or at least a broad class of such geometries) to characterize various excited states of a given quantum field theory. In this regard, one intriguing observation [35] is that the perturbations of the entanglement entropy of any CFT naturally live on an auxiliary de Sitter geometry. In particular, the functional $\delta S_{EE}(R, \vec{x})$, defined by Eq. (1.3), satisfies the Klein-Gordon equation
$$ \left(\nabla^2_{\rm dS} - m^2\right) \delta S_{EE} = 0 \tag{1.4} $$
in the following de Sitter (dS) geometry:
$$ ds^2_{\rm dS} = \frac{L^2}{R^2}\left(-dR^2 + d\vec{x}^{\,2}\right) . \tag{1.5} $$
Note that the radius $R$ of the spheres plays the role of time in dS space. The mass above is given by
$$ m^2 L^2 = -d \,, \tag{1.6} $$
where $d$ is the spacetime dimension of the CFT.¹ In CFTs with higher spin symmetries, one can extend this construction using the corresponding conserved currents to produce additional scalars, which also propagate on the dS geometry according to a Klein-Gordon equation with an appropriate mass [35] (see section 3.2 below). The proposal of [35] was that this new dS geometry may provide the foundation on which to construct an alternative 'holographic' description of any CFT. That is, it may be possible to reorganize any CFT in terms of a local theory of interacting fields propagating in the auxiliary spacetime. We stress that here the CFT under consideration need not be holographic in the conventional sense of the AdS/CFT correspondence, and hence there is no requirement of a large central charge or strong coupling. Of course, the discussion in [35] only provided some preliminary steps towards establishing this new holographic dictionary and such a program faces a number of serious challenges. For example, the dS scale only appears as an overall factor of $L^2$ in Eq. (1.4) and so remains an undetermined constant.
Of course, our experience from the AdS/CFT correspondence suggests that L would be determined in terms of CFT data through the gravitational dynamics of the holographic geometry and so here one faces the question of understanding whether the new auxiliary geometry is actually dynamical.
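The statement that $\delta S_{EE}$ obeys a Klein-Gordon equation on de Sitter space can be checked numerically for the kernel of the first law. The following is a minimal sketch, assuming the kernel $(R^2 - |\vec{x}-\vec{x}\,'|^2)/(2R)$, the metric $ds^2 = (L^2/R^2)(-dR^2 + d\vec{x}^{\,2})$ and the mass $m^2L^2 = -d$; the function names and the finite-difference scheme are illustrative choices, not part of the original construction.

```python
def first_law_kernel(R, x, x_prime):
    """Kernel (R^2 - |x - x'|^2) / (2R) multiplying <T_tt(x')> in the first law."""
    s = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return (R * R - s) / (2.0 * R)


def ds_box(f, R, x, L=1.0, h=1e-3):
    """Finite-difference d'Alembertian on dS_d with ds^2 = (L/R)^2 (-dR^2 + dx^2).

    Uses  L^2 box f = -R^d d_R( R^(2-d) d_R f ) + R^2 sum_i d_i^2 f,
    with d = len(x) + 1 the CFT spacetime dimension.
    """
    d = len(x) + 1

    def flux(r):  # R^(2-d) * d_R f, evaluated at radius r
        return r ** (2 - d) * (f(r + h, x) - f(r - h, x)) / (2 * h)

    time_part = -(R ** d) * (flux(R + h) - flux(R - h)) / (2 * h)
    space_part = 0.0
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        space_part += (f(R, up) - 2 * f(R, x) + f(R, dn)) / (h * h)
    return (time_part + R * R * space_part) / (L * L)


def kg_residual(d, R=1.3, L=1.0):
    """Return (box - m^2) K at a sample point, with m^2 L^2 = -d (assumed)."""
    x = [0.2 * (i + 1) for i in range(d - 1)]         # centre of the ball
    x_prime = [-0.1 * (i + 1) for i in range(d - 1)]  # insertion point of T_tt
    f = lambda r, pos: first_law_kernel(r, pos, x_prime)
    m2 = -d / (L * L)
    return ds_box(f, R, x, L) - m2 * f(R, x)
```

Since the kernel solves the equation for every insertion point $\vec{x}\,'$, linearity implies the same for the smeared quantity; `kg_residual(d)` should therefore vanish up to discretization error for each dimension tried.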
Another challenge would be to produce a holographic description of the time dependence of quantities in the CFT, since the above construction was firmly rooted on a fixed time slice. A natural extension is to consider all spherical regions throughout the d-dimensional spacetime of the CFT, i.e., all of the ball-shaped regions of all sizes and at all positions on all time slices. As described in [35], this extended perspective yields an auxiliary geometry which is SO(2, d)/[SO(1, d − 1) × SO(1, 1)] and the perturbations $\delta S_{EE}$ can be seen to obey a wave equation on this coset. Further, it was noted that this auxiliary space is 2d-dimensional and has multiple timelike directions.
This new expanded auxiliary geometry is the starting point for the present paper. As we will describe, in the context where we are considering all spheres throughout the spacetime, it is more natural to think in terms of the causal diamonds, where each causal diamond is the domain of dependence of a spherical region. Following [36], our nomenclature will be to refer to the moduli space of all causal diamonds as generalized kinematic space, since it is a natural generalization of the kinematic space introduced there, i.e., the space of ordered intervals on a time slice in d = 2. Our focus will be to construct interesting nonlocal CFT observables on causal diamonds, similar to the perturbation $\delta S_{EE}$ in Eq. (1.3).² Our objective will be two-fold: The first is to examine if these new observables and the generalized kinematic space provide a natural forum to construct a complete description of the underlying CFT. The second is to investigate how the new perspective of the nonlocal observables interfaces with the standard holographic description given by the AdS/CFT correspondence.
The remainder of the paper is organized as follows: Section 2 contains a detailed discussion of the geometry of the moduli space of causal diamonds. In section 3 we define linearized observables associated with arbitrary CFT primaries. These observables are local fields obeying two-derivative equations of motion on the space of causal diamonds and they explain and generalize various known statements about the first law of entanglement entropy, the OPE expansion of twist operators, and the holographic Ryu-Takayanagi prescription. From section 4 onwards, we focus on d = 2 and the question of extending the previous framework to nonlinearly interacting fields on the space of causal diamonds. Section 4 is concerned with a certain universal class of states, for which the entanglement entropy satisfies a nonlinear equation with local interactions on the moduli space. Section 5 generalizes this discussion to higher spin theories. In particular, we construct a framework where the entanglement and its spin-three generalization are described by two nonlinearly interacting fields on the space of causal diamonds. Some challenges for the definition of more general nonlinearly interacting fields are discussed in section 6. In section 7, we conclude with a discussion of open questions and future directions for this program of describing general CFTs in terms of nonlocal observables on the moduli space of causal diamonds, and also formulating the AdS/CFT correspondence within this framework for holographic CFTs. Appendix A discusses various geometric details and generalizations. Some of our conventions are fixed in appendix B. Appendix C contains explicit computations to verify the AdS/CFT version of our generalized first law.
Note: While this work was in progress, the preprint [38] by Czech, Lamprou, McCandlish, Mosk and Sully appeared on the arXiv, which explores ideas very similar to the ones presented here.

The geometry of causal diamonds in Minkowski space
In this section, we examine the geometry of the generalized kinematic space introduced in [35]. We begin by deriving the natural metric on this moduli space of all causal diamonds in a d-dimensional CFT. As noted above, this 2d-dimensional metric will turn out to have multiple time directions, and in particular, has signature (d, d). We will also discuss how to intuit this signature geometrically in terms of containment relations between causal diamonds.
Figure 1. A causal diamond (in d = 3 dimensions) and our basic coordinates. Specifying the timelike separated pair of points $(x^\mu, y^\mu)$ is equivalent to specifying a spacelike (d − 2)-sphere which consists of all points $w^\mu$ null separated from both $x^\mu$ and $y^\mu$, i.e., satisfying Eq. (2.2). The alternative parametrization in terms of $c^\mu = \frac{1}{2}(y^\mu + x^\mu)$ and $\ell^\mu = \frac{1}{2}(y^\mu - x^\mu)$ will prove convenient in section 2.2.

Metric on the space of causal diamonds
Spheres are destined to play a special role in CFTs, as the conformal group SO(2, d) in d dimensions maps them into each other. The past and future development of the region enclosed by a (d − 2)-sphere form a causal diamond and hence the space of all (d − 2)-spheres is the same as the space of all causal diamonds.³ Therefore a generic (d − 2)-sphere can be parametrized in terms of the positions of the tips of the corresponding causal diamond. That is, given these positions, $x^\mu$ and $y^\mu$, the (d − 2)-sphere is the intersection of the past light-cone of the future tip and the future light-cone of the past tip, as shown in Figure 1. Of course, these points are necessarily timelike separated,⁴ i.e.,
$$ (x-y)^2 < 0 \,. \tag{2.1} $$
The corresponding sphere comprising the intersection of the light-cones illustrated in the figure can be defined as the set of points $w^\mu$ which are null-separated from both $x^\mu$ and $y^\mu$:
$$ (w-x)^2 = 0 \,, \qquad (w-y)^2 = 0 \,. \tag{2.2} $$
The generalized kinematic space is the moduli space of all causal diamonds. The easiest way to construct the metric on this space is to start with a (d + 2)-dimensional embedding space parametrized by coordinates
$$ X^b = (X^-, X^\mu, X^d) \tag{2.3} $$
with $\mu = 0, \ldots, d-1$. Further, this embedding space has a flat metric with signature (2, d):
$$ ds^2 = -(dX^-)^2 + \eta_{\mu\nu}\, dX^\mu dX^\nu + (dX^d)^2 \,, \tag{2.4} $$
where $\eta_{\mu\nu} = {\rm diag}(-1, +1, \ldots, +1)$ is the usual d-dimensional Minkowski metric. This geometry is invariant under the Lorentz group SO(2, d), which matches the conformal group acting on a d-dimensional CFT. As a warm-up, let us discuss the familiar example of anti-de Sitter space in this language. The (d + 1)-dimensional anti-de Sitter (AdS) space with curvature radius $R_{\rm AdS}$ corresponds to a hyperboloid defined as
$$ \langle X, X\rangle = -R_{\rm AdS}^2 \,, \tag{2.5} $$
where $\langle\cdot\,,\cdot\rangle$ denotes the inner product with respect to the metric (2.4). It can be thought of as the set of all points in the embedding space that can be reached by acting with SO(2, d) transformations on a unit timelike vector, e.g., on the vector (1, 0, . . . , 0).
Since any timelike vector in (2.4) is preserved by an SO(1, d) subgroup of the conformal group, (d + 1)-dimensional anti-de Sitter space is the coset space SO(2, d)/SO(1, d). The metric on this coset is induced by the embedding space metric (2.4). For example, the Poincaré patch AdS metric
$$ ds^2 = \frac{R_{\rm AdS}^2}{z^2}\left(dz^2 + \eta_{\mu\nu}\, dw^\mu dw^\nu\right) \tag{2.6} $$
is obtained from the metric (2.4) upon using the parametrization (2.7) of the AdS hyperboloid (2.5) in terms of the coordinates $(z, w^\mu)$. The asymptotic boundary of AdS space is reached by taking the limit z → 0. In the context of the AdS/CFT correspondence, SO(2, d) transformations leaving the embedding geometry (2.4) invariant become the conformal transformations acting on the boundary theory. This highlights the advantage of the embedding space approach: the SO(2, d) transformations act linearly on the points (2.3) in the embedding space.
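As an illustration of how a hyperboloid parametrization induces the Poincaré patch metric, the sketch below uses one common textbook choice of Poincaré coordinates. This particular parametrization is an assumption made for the example and may differ from the conventions of Eq. (2.7) by a relabelling or rescaling; the function names are likewise hypothetical.

```python
def inner(X, Y):
    """Embedding-space inner product with signature (2, d).

    Components are ordered as X = (X^-, X^0, X^1, ..., X^d);
    X^- and X^0 are the two timelike directions.
    """
    return -X[0] * Y[0] - X[1] * Y[1] + sum(a * b for a, b in zip(X[2:], Y[2:]))


def embed(z, w, R_ads=1.0):
    """Assumed Poincare parametrization (z, w^mu) -> X^b of the hyperboloid."""
    w2 = -w[0] ** 2 + sum(c * c for c in w[1:])       # eta_{mu nu} w^mu w^nu
    X_minus = (z * z + w2 + R_ads ** 2) / (2 * z)
    X_d = (z * z + w2 - R_ads ** 2) / (2 * z)
    return [X_minus] + [R_ads * c / z for c in w] + [X_d]


def induced_ds2(z, w, dz, dw, R_ads=1.0, eps=1e-5):
    """<dX, dX> along the direction (dz, dw), via a central difference."""
    Xp = embed(z + eps * dz, [c + eps * dc for c, dc in zip(w, dw)], R_ads)
    Xm = embed(z - eps * dz, [c - eps * dc for c, dc in zip(w, dw)], R_ads)
    dX = [(a - b) / (2 * eps) for a, b in zip(Xp, Xm)]
    return inner(dX, dX)


def poincare_ds2(z, dz, dw, R_ads=1.0):
    """Poincare patch line element (R_AdS / z)^2 (dz^2 + eta dw dw)."""
    dw2 = -dw[0] ** 2 + sum(c * c for c in dw[1:])
    return (R_ads / z) ** 2 * (dz * dz + dw2)
```

For any sample point, `inner(X, X)` should equal $-R_{\rm AdS}^2$ and `induced_ds2` should agree with `poincare_ds2` up to the finite-difference error.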
In the following, we will phrase our discussion in terms of the geometry of the CFT background being defined by the boundary of the AdS hyperboloid (2.5) because we feel that it is an intuitive picture familiar to most readers. However, with only minor changes, the entire discussion can be phrased in terms of the embedding space formalism, e.g., [39][40][41], which can be used to consider any CFT and makes no reference to the AdS/CFT correspondence. Hence we stress that the geometry of the generalized kinematic space that emerges below applies for general d-dimensional CFTs.
Figure 2. Anti-de Sitter hyperboloid in flat embedding space $\mathbb{R}^{2,d}$ is indicated in blue. The timelike embedding coordinates are $X^-$ and $X^0$. The remaining directions (including the d − 1 suppressed dimensions $X^{2,\ldots,d}$ at each point) are spacelike. The green d-plane is orthogonal to the timelike vector $T^b$ and to the spacelike vector $S^b$ (the latter being hidden in the suppressed dimensions). The intersection of the d-plane with AdS$_{d+1}$ yields the green minimal surface. Its boundary as the hyperboloid approaches the red lightcone defines a (d − 2)-sphere in the CFT.
We now turn to the moduli space of causal diamonds in a CFT, which we construct using the language of cosets, in a similar manner to that introduced above in discussing the AdS geometry (2.5). In order to describe a sphere in a CFT, we choose a unit timelike vector $T^b$ and an orthogonal unit spacelike vector $S^b$, both of which are anchored at the origin of the (d + 2)-dimensional embedding space. That is, we choose two vectors satisfying
$$ \langle T, T\rangle = -1 \,, \qquad \langle S, S\rangle = +1 \,, \qquad \langle T, S\rangle = 0 \,. \tag{2.8} $$
The sphere is now specified by considering asymptotic points in the AdS boundary that are orthogonal to both of these unit vectors, i.e.,
$$ \langle T, X\rangle\big|_{z\to 0} = 0 \qquad {\rm and} \qquad \langle S, X\rangle\big|_{z\to 0} = 0 \,. \tag{2.9} $$
To explicitly illustrate this construction of a sphere in the CFT, let us consider the coordinates (2.7) yielding the Poincaré patch metric (2.6). A convenient choice of the unit vectors is then given in Eq. (2.10), together with the surfaces in the asymptotic geometry that are picked out by the orthogonality constraints (2.9), i.e., $T^b$ selects a particular time slice in the boundary metric while $S^b$ selects a timelike hyperboloid. The intersection of these two surfaces then yields the unit (d − 2)-sphere $\delta_{ij} w^i w^j = 1$ (on the time slice $w^0 = 0$). Now a particular choice of the unit vectors, $T^b$ and $S^b$, picks out a particular sphere in the boundary geometry. Acting with SO(2, d) transformations, we can then reach all of the other spheres throughout the d-dimensional spacetime where the CFT lives. To determine the coset describing the space of all spheres, we must first find the symmetries preserved by any particular choice of the unit vectors. Given two unit vectors satisfying Eq. (2.8), we have defined a timelike two-plane in the embedding space. Hence the SO(2, d) symmetry is broken to the SO(1, d − 1) transformations acting in the d-dimensional hyperplane orthogonal to this (T, S)-plane, as well as the SO(1, 1) transformations acting in the two-plane. Thus, in analogy with the AdS coset construction above, the natural coset describing the moduli space of spheres, or alternatively of causal diamonds, in d-dimensional CFTs is
$$ \mathcal{M}^{(d)}_{\diamondsuit} \equiv \frac{SO(2,d)}{SO(1,d-1)\times SO(1,1)} \,. \tag{2.11} $$
Of course, this is precisely the auxiliary geometry already described in [35]. The interpretation of the stabilizer group, which preserves a given sphere in the CFT, is as follows: The SO(1, d − 1) factor of the stabilizer group is the subgroup of SO(2, d) comprising (d − 1)(d − 2)/2 rotations and d − 1 spatial special conformal transformations leaving a given sphere invariant. While it is obvious that the former transformations preserve spheres centred at the origin, it can also be verified that the latter do so as well. Further, let us note that these transformations also leave invariant the time slice in which the sphere is defined. The additional SO(1, 1) represents a combination of special conformal transformations and translations, both of which involve the timelike direction, and leads to a modular flow generated by the conformal Killing vector $K^\mu$ (see appendix A.2). The latter was constructed precisely in such a way as to preserve a given spherical surface.
We can also perform a simple cross-check at the level of counting dimensions. The moduli space of causal diamonds can be parametrized by a set of 2d coordinates: $x^\mu$ and $y^\mu$, i.e., the positions of the tips of the causal diamonds. Now, the number of generators of the isometry group SO(2, d) is (d + 2)(d + 1)/2, whereas for the stabilizer group SO(1, d − 1) × SO(1, 1) we have d(d − 1)/2 + 1 generators. The difference between the two numbers matches the dimensionality of the space of causal diamonds, i.e., 2d, as it must.
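The counting can be spelled out in a few lines. This is only a sketch of the arithmetic above, using the standard fact that SO(p, q) has n(n − 1)/2 generators with n = p + q; the function names are illustrative.

```python
def dim_so(p, q):
    """Number of generators of SO(p, q): n(n - 1)/2 with n = p + q."""
    n = p + q
    return n * (n - 1) // 2


def dim_diamond_space(d):
    """dim SO(2, d) minus dim [SO(1, d-1) x SO(1, 1)]."""
    return dim_so(2, d) - (dim_so(1, d - 1) + dim_so(1, 1))
```

For every CFT dimension d ≥ 2, `dim_diamond_space(d)` should return 2d, the number of coordinates needed to specify the two tips of a diamond.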
In the context of the AdS/CFT correspondence, we can remove the asymptotic limit from the orthogonality constraints (2.9), i.e., consider $\langle T, X\rangle = 0$ and $\langle S, X\rangle = 0$. These constraints now specify not only the sphere on a constant time slice of the AdS boundary (at z = 0), but the entire minimal surface anchored to this sphere. With the simple example of $T^b$ and $S^b$ given in Eq. (2.10), these constraints yield the unit hemisphere $z^2 + \delta_{ij} w^i w^j = 1$ on the time slice $w^0 = 0$. Of course, using the Ryu-Takayanagi prescription (1.1), the area of this surface computes the entanglement entropy of the region enclosed by the (asymptotic) sphere in the vacuum of the boundary CFT.
Let us now move to the object of prime interest for us, which is the metric on the coset $\mathcal{M}^{(d)}_{\diamondsuit}$ induced by the flat geometry of the (d + 2)-dimensional embedding space. Towards this end, we parameterize motions in this generalized kinematic space by variations of the unit vectors $T^b$ and $S^b$. These are naturally contracted with the embedding space metric (2.4) and so the most general SO(1, d − 1)-invariant metric can be written as
$$ ds^2_{\diamondsuit} = \alpha_{TT}\, \langle dT, dT\rangle + \alpha_{SS}\, \langle dS, dS\rangle + \alpha_{TS}\, \langle dT, dS\rangle \,, \tag{2.12} $$
where $\alpha_{TT}$, $\alpha_{SS}$ and $\alpha_{TS}$ are constant coefficients. Also requiring invariance under SO(1, 1) transformations, i.e., under boosts in the (T, S)-plane, requires that we set $\alpha_{TS} = 0$ and $\alpha_{SS} = -\alpha_{TT} \equiv L^2$ (only the relative sign of $\alpha_{SS}$ and $\alpha_{TT}$ is determined by boost invariance, but we choose $\alpha_{SS} > 0$ here for later convenience). This then yields
$$ ds^2_{\diamondsuit} = L^2 \left( -\langle dT, dT\rangle + \langle dS, dS\rangle \right) . \tag{2.13} $$
Next, we must impose the conditions (2.8) and (2.9) in the above expression to fix the metric (up to an overall prefactor) in terms of geometric data in the CFT. This calculation is straightforward but somewhat tedious, and we refer the interested reader to Appendix A.1 for the details. Our final result for the metric on the coset $\mathcal{M}^{(d)}_{\diamondsuit}$ given in (2.11) becomes
$$ ds^2_{\diamondsuit} = \frac{4L^2}{(x-y)^2} \left( -\eta_{\mu\nu} + \frac{2\,(x_\mu - y_\mu)(x_\nu - y_\nu)}{(x-y)^2} \right) dx^\mu\, dy^\nu \,, \tag{2.14} $$
where $x^\mu$ and $y^\mu$ denote the past and future tips of the corresponding causal diamond, as illustrated in Figure 1. This metric is the main result of the present section and the starting point for our investigations of the generalized kinematic space in the subsequent sections. Some comments are now in order: First, it is straightforward to verify that this metric (2.14) is invariant under the full conformal group. Second, the pairs $(x^\mu, y^\mu)$ appear as pairs of null coordinates in the metric (2.14). As a result, this metric on the coset has the highly unusual signature (d, d). Third, it is amusing to notice that while AdS geometrizes scale transformations, the coset geometrizes yet another d − 1 additional conformal transformations.
Let us now discuss two special cases for which the general result (2.14) simplifies:
Example 1: Fixed time slice. The first example concerns the moduli space of spheres lying on a given constant time slice, which we can always take to be t = 0. That is, we choose $y^0 = -x^0 = R$ and $\vec{x} = \vec{y}$, and then we are considering spheres on the t = 0 slice with radius $R$ and with $\vec{x}$ giving the spatial position of their centres. Constraining the coordinates $x^\mu$ and $y^\mu$ in this way, the coset metric (2.14) reduces to
$$ ds^2_{\diamondsuit}\Big|_{t=0} = \frac{L^2}{R^2}\left(-dR^2 + d\vec{x}^{\,2}\right) . \tag{2.15} $$
That is, we have recovered precisely the d-dimensional de Sitter space appearing in Eq. (1.5) as a submanifold of the full coset $\mathcal{M}^{(d)}_{\diamondsuit}$.
Figure 3. Lightcone coordinates for two-dimensional causal diamonds. The coordinates $(\xi, \bar\xi)$ will provide a useful parametrization of the given diamond in section 3.7. Changing the endpoints corresponds to moving in the moduli space of causal diamonds parametrized by $(u, \bar u, v, \bar v)$; thereby $u$ is constant if $x^\mu$ moves along the line $\xi = u$, and so forth.
Example 2: CFT in two dimensions. A second special case of interest is the restriction to d = 2. The metric on the coset in two dimensions has the structure of a direct product of two copies of two-dimensional de Sitter space. One can see this explicitly by introducing right- and left-moving light-cone coordinates, e.g., we replace the Minkowski coordinates $(t, x)$ by
$$ \xi = x - t \,, \qquad \bar\xi = x + t \,. \tag{2.16} $$
Then we may specify the two-dimensional causal diamonds, defined by $(x^\mu, y^\mu)$ above, in terms of the positions of their four null boundaries (see Figure 3),
$$ (\xi, \bar\xi)\big|_{x^\mu} = (u, \bar u) \,, \qquad (\xi, \bar\xi)\big|_{y^\mu} = (v, \bar v) \,. \tag{2.17} $$
Finally, re-expressing the coset metric (2.14) in terms of these coordinates yields
$$ ds^2_{\diamondsuit} = \frac{1}{2}\left\{ ds^2_{{\rm dS}_2} + ds^2_{\overline{{\rm dS}}_2} \right\} = 2L^2 \left\{ \frac{du\, dv}{(u-v)^2} + \frac{d\bar u\, d\bar v}{(\bar u - \bar v)^2} \right\} . \tag{2.18} $$
Notice that the first copy of the de Sitter metric is only a function of the right-moving coordinates, whereas the second copy depends only on the left-moving coordinates. We chose the normalization on the right-hand side of Eq. (2.18) in such a way that $L$ is the curvature scale in each de Sitter component and, upon restricting to a time slice (i.e., $\bar u = v \equiv x - R$ and $\bar v = u \equiv x + R$), Eq. (2.15) emerges. In this way we can heuristically think of each of the two copies of dS$_2$ in (2.18) as a copy of the geometry in Eq. (2.15). The product structure found in the moduli space metric here has its origins in the fact that for two dimensions the conformal group itself decomposes into a direct product, i.e., SO(2, 2) ≃ SO(2, 1) × SO(2, 1), where the two factors act separately on the right- and left-moving coordinates. Hence the moduli space (2.11) of intervals in d = 2 CFTs becomes
$$ \mathcal{M}^{(2)}_{\diamondsuit} = \frac{SO(2,1)}{SO(1,1)} \times \frac{SO(2,1)}{SO(1,1)} \,, \tag{2.19} $$
where we recognize that each of the factors corresponds to a two-dimensional de Sitter space.
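Both special cases can be cross-checked numerically. The sketch below assumes that the coset line element takes the form $ds^2 = \frac{4L^2}{(x-y)^2}\big(-\eta_{\mu\nu} + 2 (x-y)_\mu (x-y)_\nu/(x-y)^2\big)dx^\mu dy^\nu$ and that the light-cone coordinates are $\xi = x - t$, $\bar\xi = x + t$; the helper names are hypothetical.

```python
def mink(a, b):
    """Minkowski inner product with eta = diag(-1, +1, ..., +1)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))


def ds2_diamond(x, y, dx, dy, L=1.0):
    """Assumed coset line element for tips x (past), y (future)."""
    s = [p - q for p, q in zip(x, y)]
    s2 = mink(s, s)
    if s2 >= 0:
        raise ValueError("tips must be timelike separated")
    return 4 * L * L / s2 * (-mink(dx, dy) + 2 * mink(s, dx) * mink(s, dy) / s2)


def ds2_time_slice(R, dR, dxs, L=1.0):
    """de Sitter line element (L/R)^2 (-dR^2 + dx^2) on a fixed time slice."""
    return (L / R) ** 2 * (-dR * dR + sum(c * c for c in dxs))


def ds2_lightcone_2d(u, ub, v, vb, du, dub, dv, dvb, L=1.0):
    """Product form 2 L^2 [du dv/(u-v)^2 + dub dvb/(ub-vb)^2] in d = 2."""
    return 2 * L * L * (du * dv / (u - v) ** 2 + dub * dvb / (ub - vb) ** 2)


def check_time_slice(R=0.7, L=1.3):
    """Example 1: tips at (-R, xs) and (+R, xs), varied within the slice."""
    xs, dR, dxs = [0.4, -1.1], 0.03, [0.01, -0.02]
    x, y = [-R] + xs, [R] + xs
    dx, dy = [-dR] + dxs, [dR] + dxs
    return ds2_diamond(x, y, dx, dy, L), ds2_time_slice(R, dR, dxs, L)


def check_2d(L=1.3):
    """Example 2: a generic diamond in d = 2 with generic variations."""
    x, y = [-0.5, 0.2], [0.9, 0.6]            # (t, x) components of the tips
    dx, dy = [0.01, -0.03], [0.02, 0.05]
    lc = lambda p: (p[1] - p[0], p[1] + p[0])  # (xi, xi-bar)
    (u, ub), (v, vb) = lc(x), lc(y)
    (du, dub), (dv, dvb) = lc(dx), lc(dy)
    return (ds2_diamond(x, y, dx, dy, L),
            ds2_lightcone_2d(u, ub, v, vb, du, dub, dv, dvb, L))
```

Each check returns a pair of numbers that should coincide up to rounding, since the line element is an exact quadratic form in the differentials and no finite differencing is involved.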

The causal structure on the space of causal diamonds
Given the metric (2.14) on the moduli space of causal diamonds, we are in the position to study the causal structure of this space. The essential feature of this causal structure comes from the fact that the space possesses d spacelike and d timelike directions. We start by writing the metric (2.14) in terms of the coordinates
$$ c^\mu = \frac{1}{2}\,(y^\mu + x^\mu) \,, \qquad \ell^\mu = \frac{1}{2}\,(y^\mu - x^\mu) \,. \tag{2.20} $$
Here, $c^\mu$ denotes the position of the centre of the causal diamond or, equivalently, the centre of the corresponding sphere. Similarly, $\ell^\mu$ denotes the vector from the centre to the future tip of the causal diamond (see Figure 1). The metric (2.14) then becomes
$$ ds^2_{\diamondsuit} = -\frac{L^2}{\ell^2} \left( \eta_{\mu\nu} - \frac{2}{\ell^2}\, \ell_\mu \ell_\nu \right) \left( dc^\mu dc^\nu - d\ell^\mu d\ell^\nu \right) . \tag{2.21} $$
First, we note that $\ell^2 < 0$ from Eq. (2.1), i.e., the tips of the causal diamond are timelike separated. Further, we observe that the tensor $\eta_{\mu\nu} - \frac{2}{\ell^2}\ell_\mu\ell_\nu$ is positive definite, again because $\ell^\mu$ is a timelike vector. This is easily verified by picking a frame where, say, $\ell^\mu \propto \delta^\mu_0$. In such a frame, the metric (2.21) reduces to
$$ ds^2_{\diamondsuit} = -\frac{L^2}{\ell^2}\, \delta_{\mu\nu} \left( dc^\mu dc^\nu - d\ell^\mu d\ell^\nu \right) . \tag{2.22} $$
Therefore, the sign of $ds^2_{\diamondsuit}$ is determined solely by the last factor in Eq. (2.21) containing the differentials. In particular, we can now see that the $c^\mu$ are the d spacelike directions in the space of causal diamonds, while the $\ell^\mu$ are the d timelike directions. To make this precise, consider two infinitesimally close causal diamonds specified by their coordinates $\diamondsuit_1 = (c^\mu, \ell^\mu)$ and $\diamondsuit_2 = (c^\mu + dc^\mu, \ell^\mu + d\ell^\mu)$. We say that their separation is spacelike, timelike or null if $ds^2_{\diamondsuit}(c^\mu, \ell^\mu)$ is positive, negative or zero, respectively. From this, it is now easy to intuit the timelike, spacelike and null directions in the moduli space of causal diamonds as follows:
(a) Moving the centre $c^\mu$ of a causal diamond by an infinitesimal amount $dc^\mu$ in any of the d directions of the background Minkowski spacetime of the CFT corresponds to moving in a spacelike direction in the coset space. Geometrically, this corresponds to translating the diamond without deforming it.
(b) Moving any of the 'relative' coordinates $\ell^\mu$ by some $d\ell^\mu$ corresponds to a timelike displacement in the coset space. In the diamond picture, this corresponds to stretching the diamond in one of d independent ways while holding the centre of the diamond fixed.
(c) Null movements correspond heuristically to deforming the diamond by the 'same' amount as it is translated in spacetime, as quantified by the condition $ds^2_{\diamondsuit} = 0$.
These cases are illustrated in Figure 4 for infinitesimal displacements. It is noteworthy that moving the centre of a causal diamond in the time direction, i.e., with $dc^0$, produces a spacelike displacement in the kinematic space. We return to discuss this point in section 7.
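The three cases (a) to (c) can be made concrete with a small classifier. This is a sketch under the assumption that the line element in centre and relative coordinates is $ds^2 = -\frac{L^2}{\ell^2}\big(\eta_{\mu\nu} - \frac{2}{\ell^2}\ell_\mu\ell_\nu\big)\big(dc^\mu dc^\nu - d\ell^\mu d\ell^\nu\big)$; the function names and the tolerance are illustrative.

```python
def mink(a, b):
    """Minkowski inner product with eta = diag(-1, +1, ..., +1)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))


def ds2_centre_form(ell, dc, dell, L=1.0):
    """Assumed line element in (c, ell) coordinates; ell must be timelike."""
    l2 = mink(ell, ell)
    if l2 >= 0:
        raise ValueError("ell must be timelike")

    def P(a):  # positive-definite form (eta - 2 ell ell / ell^2)(a, a)
        return mink(a, a) - 2.0 * mink(ell, a) ** 2 / l2

    return -(L * L / l2) * (P(dc) - P(dell))


def classify(ell, dc, dell, L=1.0, tol=1e-14):
    """Label a displacement (dc, dell) as spacelike, timelike or null."""
    val = ds2_centre_form(ell, dc, dell, L)
    if val > tol:
        return "spacelike"
    if val < -tol:
        return "timelike"
    return "null"
```

For instance, with `ell = [1.0, 0.0, 0.0]` a pure shift of the centre in the time direction, `classify(ell, [0.1, 0, 0], [0, 0, 0])`, comes out spacelike, which is the noteworthy feature mentioned above, while equal shifts `dc == dell` come out null.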
Let us now give a slightly different perspective on the measure of distances on this moduli space. Consider two causal diamonds, specified by the coordinates of their tips, $\diamondsuit_1 = (x_1^\mu, y_1^\mu)$ and $\diamondsuit_2 = (x_2^\mu, y_2^\mu)$. The conformal symmetry ensures that there exists a natural conformally invariant measure of distance, namely, the cross ratio
$$ r(x_1, y_1; x_2, y_2) \equiv \frac{(y_1 - x_2)^2\, (y_2 - x_1)^2}{(x_1 - y_1)^2\, (x_2 - y_2)^2} \,. \tag{2.23} $$
As we will show, the cross ratio paves the way to understanding the global causal structure of the moduli space of diamonds. First, however, we relate this expression to the previous discussion. Hence we translate it to the 'centre of mass' coordinates and consider the two causal diamonds with $\diamondsuit_1 = (c^\mu, \ell^\mu)$ and $\diamondsuit_2 = (c^\mu + \Delta c^\mu, \ell^\mu + \Delta\ell^\mu)$. Then the invariant cross ratio reads
$$ r = \frac{(2\ell + \Delta\ell - \Delta c)^2\, (2\ell + \Delta\ell + \Delta c)^2}{16\, \ell^2\, (\ell + \Delta\ell)^2} $$
$$ \;\;= 1 + \frac{1}{2\ell^2} \left( \eta_{\mu\nu} - \frac{2}{\ell^2}\, \ell_\mu \ell_\nu \right) \left( \Delta c^\mu \Delta c^\nu - \Delta\ell^\mu \Delta\ell^\nu \right) + \cdots \,. \tag{2.24} $$
In the second line, we are expanding the cross ratio for infinitesimal displacements and the ellipsis indicates terms of cubic order in $\Delta c^\mu$ and $\Delta\ell^\mu$. Comparing to Eq. (2.21), we see that causal diamonds that are very nearby satisfy
$$ r \simeq 1 - \frac{ds^2_{\diamondsuit}}{2L^2} \,. \tag{2.25} $$
That is, for infinitesimal displacements, the cross ratio encodes the invariant line element (2.21) of the generalized kinematic space. Further, we observe that Eq. (2.25) shows that timelike, spacelike and null displacements in this moduli space correspond, respectively, to r > 1, r < 1 and r = 1. Two other observations about the cross ratio in Eq. (2.24): We note that the centre of mass coordinates $c^\mu$ are Killing coordinates of the metric (2.21), i.e., the metric is independent of these coordinates. However, this feature also extends to finite separations, as is apparent from the first line of Eq. (2.24). That is, the position $c^\mu$ of the reference diamond $\diamondsuit_1$ is irrelevant for the distance to $\diamondsuit_2$ and only the relative $\Delta c^\mu$ appears in this expression. Similarly, $dc^\mu = d\ell^\mu$ yields a null displacement in Eq. (2.21), but two diamonds separated by finite displacements with $\Delta c^\mu = \Delta\ell^\mu$ are also null separated, i.e., it is straightforward to show that the first line of Eq. (2.24) yields r = 1 in this situation.
Geometrically, $\Delta c^\mu = \Delta\ell^\mu$ corresponds to two diamonds whose past tips coincide (and similarly, $\Delta c^\mu = -\Delta\ell^\mu$ corresponds to diamonds whose future tips coincide).
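The properties of the cross ratio quoted above are easy to test numerically. The sketch below assumes the form $r = (y_1 - x_2)^2 (y_2 - x_1)^2 / [(x_1 - y_1)^2 (x_2 - y_2)^2]$ and the quadratic expansion $r \simeq 1 + \frac{1}{2\ell^2}\big(\eta_{\mu\nu} - \frac{2}{\ell^2}\ell_\mu\ell_\nu\big)\big(\Delta c^\mu\Delta c^\nu - \Delta\ell^\mu\Delta\ell^\nu\big)$; all function names are hypothetical.

```python
def mink(a, b):
    """Minkowski inner product with eta = diag(-1, +1, ..., +1)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))


def sq(a, b):
    """Invariant interval (a - b)^2."""
    s = [p - q for p, q in zip(a, b)]
    return mink(s, s)


def cross_ratio(x1, y1, x2, y2):
    """Assumed conformal cross ratio of two diamonds given by their tips."""
    return sq(y1, x2) * sq(y2, x1) / (sq(x1, y1) * sq(x2, y2))


def tips(c, ell):
    """Past and future tips (x, y) = (c - ell, c + ell)."""
    x = [a - b for a, b in zip(c, ell)]
    y = [a + b for a, b in zip(c, ell)]
    return x, y


def cross_ratio_shifted(c, ell, dc, dell):
    """Cross ratio between (c, ell) and (c + dc, ell + dell)."""
    x1, y1 = tips(c, ell)
    x2, y2 = tips([a + b for a, b in zip(c, dc)],
                  [a + b for a, b in zip(ell, dell)])
    return cross_ratio(x1, y1, x2, y2)


def quadratic_approx(ell, dc, dell):
    """Second-order expansion of r around coincident diamonds."""
    l2 = mink(ell, ell)

    def P(a):
        return mink(a, a) - 2.0 * mink(ell, a) ** 2 / l2

    return 1.0 + (P(dc) - P(dell)) / (2.0 * l2)
```

Three properties can be read off: the result does not depend on the reference position `c`, equal finite shifts `dc == dell` (coincident past tips) give r = 1, and for small displacements `cross_ratio_shifted` agrees with `quadratic_approx` up to cubic corrections.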
We can go further and define an invariant 'geodesic distance' function between two diamonds $\diamondsuit_1 = (x_1^\mu, y_1^\mu)$ and $\diamondsuit_2 = (x_2^\mu, y_2^\mu)$ in terms of the cross ratio as
$$ d(\diamondsuit_1, \diamondsuit_2) = \begin{cases} L \cos^{-1}\!\left(2\sqrt{r} - 1\right) & {\rm if}\ 0 \le r < 1 \ \ \text{(spacelike)} \,, \\ 0 & {\rm if}\ r = 1 \ \ \text{(null)} \,, \\ -L \cosh^{-1}\!\left(2\sqrt{r} - 1\right) & {\rm if}\ r > 1 \ \ \text{(timelike)} \,. \end{cases} \tag{2.26} $$
As we will show in examples, this distance function computes the geodesic distance between finitely separated diamonds, within the range of validity specified above. Note then that the corresponding cross ratio is greater than, less than or equal to 1 if two diamonds may be connected by a timelike, spacelike or null geodesic. However, the converse need not be true, i.e., even if the cross ratio is positive, there may not be a geodesic connecting the corresponding diamonds (see further discussion below). Further, note that as r → ∞, the corresponding causal diamonds become infinitely timelike separated. However, there is a maximal spacelike separation that can be achieved by following geodesics through the coset, i.e., at r = 0, we find $d_{\rm max} = \pi L$. Equipped with the distance function (2.26), let us briefly comment on the structure of the cross ratio (2.23). We have the following interesting cases in general:
• $(x_1 - y_1)^2 \to 0$ or $(x_2 - y_2)^2 \to 0$: if one of the diamonds' volumes shrinks to zero,⁵ the cross ratio and the distance function both diverge, in particular, $d(\diamondsuit_1, \diamondsuit_2) \to -\infty$. This is just the statement that zero-volume diamonds lie at the timelike infinity of the coset space $\mathcal{M}^{(d)}_{\diamondsuit}$.
• $y_1 \to y_2$ or $x_1 \to x_2$: if either the past or future tips of two diamonds coincide, the cross ratio becomes one and the invariant distance $d(\diamondsuit_1, \diamondsuit_2)$ vanishes, i.e., the diamonds become null separated.
x 1 y 1 x 2 y 2 Figure 5. Illustration of lightcone singularities in the moduli space of causal diamonds. We compare the big blue reference diamond ♦ 1 with the small blue diamond ♦ 2 . If the red (green) tip of ♦ 2 leaves the red (green) shaded lightcone region, the geodesic distance d(♦ 1 , ♦ 2 ) becomes infinite, i.e., the diamonds are no longer geodesically connected. An example of this happening would be by moving the tip x 2 along the arrow towards the lightcone of y 1 .
• (y 1 − x 2 ) 2 → 0 or (y 2 − x 1 ) 2 → 0: if the future (past) tip of one causal diamond approaches the lightcone of the past (future) tip of the other diamond (as illustrated in Figure 5), the cross ratio vanishes and the corresponding separation again reaches the maximal value d max = πL.
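The limiting cases listed above can be made concrete with a small sketch of the distance function. It assumes the form $d = L\cos^{-1}(2\sqrt{r}-1)$ for $0 \le r \le 1$ and $d = -L\cosh^{-1}(2\sqrt{r}-1)$ for $r > 1$, where the negative sign on the timelike branch is chosen to match the statement $d \to -\infty$ for shrinking diamonds. The function name `geodesic_distance` is hypothetical.

```python
import math

def geodesic_distance(r, L=1.0):
    """Distance as a function of the cross ratio r (assumed form, see lead-in).

    Spacelike branch: 0 <= r <= 1, value in [0, pi * L].
    Timelike branch:  r > 1, value negative by convention.
    """
    if r < 0.0:
        raise ValueError("the distance function is not defined for r < 0")
    arg = 2.0 * math.sqrt(r) - 1.0
    if r <= 1.0:
        return L * math.acos(arg)
    return -L * math.acosh(arg)

d_max = geodesic_distance(0.0, L=2.0)   # maximal spacelike separation, pi * L
d_null = geodesic_distance(1.0)         # coincident tips: null separation
d_timelike = geodesic_distance(81.0 / 64.0)
```

As a cross-check, two diamonds with a common centre whose sizes differ by a factor $\sqrt{\lambda}$ have $r = (1+\sqrt{\lambda})^4/(16\lambda)$; for $\lambda = 4$ this gives $r = 81/64$, and the function returns $-\tfrac{L}{2}\log 4$.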
Let us comment further on the domain of validity of our geodesic distance function. As defined in Eq. (2.26), this function is well-defined for r ≥ 0. However, as commented above, merely having r ≥ 0 does not ensure that the corresponding causal diamonds are connected by a geodesic. Further, certain pairs of causal diamonds will also yield r < 0. Examining Eq. (2.23), we see that both factors in the denominator are negative by construction, i.e., the tips of each causal diamond must be timelike separated, and hence the sign of r is determined by the numerator.
Let us begin by considering two nearby diamonds, $\diamondsuit_1$ and $\diamondsuit_2$. Both $(y_1-x_2)^2 < 0$ and $(y_2-x_1)^2 < 0$, so that the cross ratio is positive. As indicated by Eq. (2.24), we will have $r \approx 1$ in this situation. If we deform the second diamond away from $\diamondsuit_1$ in a spacelike direction (not necessarily along a geodesic), the cross ratio will decrease. As described above, if the future (past) tip of $\diamondsuit_2$ reaches the lightcone of the past (future) tip of $\diamondsuit_1$, the cross ratio vanishes and the distance function (2.26) reaches its maximal value $d_{max} = \pi L$ (see Figure 5). If we continue deforming in the same direction, one of the factors in the numerator becomes positive and r becomes negative, e.g., pushing the future tip of $\diamondsuit_2$ out of causal contact with the past tip of $\diamondsuit_1$ gives $(y_2-x_1)^2 > 0$. In this range of r, the distance function (2.26) is not defined and there is no geodesic connecting the corresponding causal diamonds. Hence the submanifold of configurations where r (first) vanishes defines the 'maximum' range which the geodesics originating at $\diamondsuit_1$ can reach in the kinematic space. Note that generically if $\diamondsuit_2$ lies on this boundary where $r = 0$, then the two diamonds will not be connected by a geodesic. However, there are exceptional configurations with a vanishing cross ratio which are connected. These are 'antipodal' points in the kinematic space, which are in fact connected by multiple geodesics (see further discussion below). As noted above, this configuration yields the maximal spacelike separation that can be reached along a geodesic, i.e., $d_{max} = \pi L$.
One can further deform $\diamondsuit_1$ and $\diamondsuit_2$ so that the two diamonds are completely out of causal contact with each other, i.e., both $(y_1-x_2)^2 > 0$ and $(y_2-x_1)^2 > 0$. In this case, the cross ratio passes through zero again to reach positive values. However, even though Eq. (2.26) is well defined for these diamonds, there will still be no geodesic connecting them. Figure 6 shows some more examples of the causal structure on the moduli space of (two-dimensional) causal diamonds. In particular, note the cases (a) and (b) of that figure, which illustrate two statements that are generally true (in any number of dimensions):
(i) If one of two causal diamonds is contained within the other, then they are timelike separated.
(ii) If two causal diamonds touch in at least one corner, then they are null separated.
Let us now return to the two examples which we identified as being of particular interest in section 2.1:

Example 1: Fixed time slice. If we compare diamonds $\diamondsuit_{1,2}$ on a given time slice, we know from our previous discussion that we are restricting to a submanifold with the geometry of d-dimensional de Sitter space. Taking the time slice to be $t = 0$, we have $c_1^0 = c_2^0 = 0$ and $\ell_1^i = \ell_2^i = 0$. Using the same coordinates as before, $x^i \equiv c^i$ and $R \equiv \ell^0 > 0$, the cross ratio simplifies as

$$ r_{dS_d} = \left(\frac{(R_1+R_2)^2 - (\vec x_1-\vec x_2)^2}{4R_1R_2}\right)^{2}. \tag{2.27} $$

We observe the following causal relations between spatial spheres lying on a common time slice (see footnote 6):

• $r_{dS_d} > 1$ if and only if $(\vec x_1-\vec x_2)^2 < (R_1-R_2)^2$, i.e., one sphere is contained within the other.

• $r_{dS_d} < 1$ if and only if $(\vec x_1-\vec x_2)^2 > (R_1-R_2)^2$, i.e., the spheres overlap but neither is fully contained within the other.

• $r_{dS_d} = 1$ if and only if $(\vec x_1-\vec x_2)^2 = (R_1-R_2)^2$, i.e., the spheres tangentially touch in at least one point.

• Note that $r_{dS_d} \to 0$ as $(\vec x_1-\vec x_2)^2 \to (R_1+R_2)^2$, which corresponds to the point where the two spheres become disjoint.
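The causal relations in this list can be sketched as a small classifier. The code assumes the fixed-time-slice cross ratio $r = [((R_1+R_2)^2 - |\vec x_1 - \vec x_2|^2)/(4R_1R_2)]^2$, which follows from placing the tips of each diamond at $t = \mp R$ above and below the centre of its sphere. The function names and returned labels are illustrative only.

```python
def separation_squared(x1, x2):
    return sum((a - b) ** 2 for a, b in zip(x1, x2))

def sphere_cross_ratio(x1, R1, x2, R2):
    """Cross ratio of two spheres (centre x, radius R) on a common time slice."""
    sep2 = separation_squared(x1, x2)
    return (((R1 + R2) ** 2 - sep2) / (4.0 * R1 * R2)) ** 2

def classify_spheres(x1, R1, x2, R2, tol=1e-12):
    """Causal relation of the two spheres as points of de Sitter space."""
    if separation_squared(x1, x2) > (R1 + R2) ** 2 + tol:
        # The formula gives r > 0 here, but no geodesic connects the spheres.
        return "disjoint"
    r = sphere_cross_ratio(x1, R1, x2, R2)
    if abs(r - 1.0) <= tol:
        return "null"        # spheres touch tangentially from the inside
    return "timelike" if r > 1.0 else "spacelike"

nested = classify_spheres([0.0, 0.0], 1.0, [0.0, 0.0], 2.0)
overlapping = classify_spheres([0.0, 0.0], 1.0, [1.5, 0.0], 1.0)
tangent = classify_spheres([0.0, 0.0], 1.0, [1.0, 0.0], 2.0)
disjoint = classify_spheres([0.0, 0.0], 1.0, [4.0, 0.0], 2.0)
r_externally_tangent = sphere_cross_ratio([0.0, 0.0], 1.0, [3.0, 0.0], 2.0)
```

The explicit `disjoint` branch reflects the assumption that the spheres at least touch, since for larger separations the squared expression becomes positive again without the spheres being geodesically connected.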
It is straightforward to show that this de Sitter geometry is a 'completely geodesic' submanifold of the full kinematic space (2.11). That is, all of the geodesics within $dS_d$ are also geodesics of $M^{(d)}_\diamondsuit$. To provide some intuition for our previous discussion, Figure 7 illustrates representative geodesics emanating from a particular point in the dS geometry (see footnote 7). We observe here that the cross ratio (2.27) never becomes negative for spheres restricted to a fixed time slice; however, it does reach zero, as noted above, just as the spheres become disjoint. As illustrated in the figure, the boundary where $r = 0$ corresponds to the past and future null cones emerging from the antipodal point to $\diamondsuit_1$. Hence there are 'shadow regions' in the dS space which cannot be reached along a single geodesic originating from this reference point. Note, however, that there is an infinite family of spacelike geodesics that extend from $\diamondsuit_1$ to this antipodal point.

Footnote 6: We assume here that $(\vec x_1-\vec x_2)^2 \le (R_1+R_2)^2$, for otherwise the spheres would not be geodesically connected (see further discussion in the following).

Footnote 7: The planar coordinates used in Eq. (2.15) and above actually only cover half of the de Sitter geometry. The surface $R = \infty$ would correspond to a diagonal running across the Penrose diagram in Figure 7. The figure and our discussion here assume a suitable continuation of the cross ratio to the entire geometry. Let us add here that the additional $Z_2$ identification discussed in footnote 3 would here identify points by an inversion in the square in Figure 7, as well as an inversion on the corresponding $S^{d-2}$ at each point on the diagram, to produce elliptic de Sitter space. With regards to the minimal geodesic distances, this identification would essentially remove the right half of the square, e.g., there would no longer be any shadow regions.

Example 2: CFT in two dimensions. In our previous discussion, we showed that for $d = 2$, the coset factorizes into $dS_2 \times dS_2$, with the metric as in Eq. (2.18). The cross ratio r also factorizes when written in the $\{u, v, \bar u, \bar v\}$ coordinates:

$$ r = r_{dS_2}(u_1,v_1;u_2,v_2)\;\, r_{dS_2}(\bar u_1,\bar v_1;\bar u_2,\bar v_2)\,, \tag{2.28} $$

where the conformally invariant cross ratio for two points on the $dS_2$ factor is given by

$$ r_{dS_2}(u_1,v_1;u_2,v_2) = \frac{(v_1-u_2)\,(v_2-u_1)}{(u_1-v_1)\,(u_2-v_2)}\,, \tag{2.29} $$

and similarly with bars. Using this factorization of the cross ratio, one can then compute the geodesic distance on $dS_2 \times dS_2$ using Eq. (2.26).

We close this section with two explicit examples of simple geodesics on the full kinematic space. Consider first a reference diamond $\diamondsuit_1 = (c_1^\mu, \ell_1^\mu)$. We wish to compare it with the family of diamonds $\diamondsuit(\lambda) = (c_1^\mu, \sqrt{\lambda}\,\ell_1^\mu)$ for $0 < \lambda < \infty$. One can verify that $\lambda$ parameterizes a timelike geodesic in the space of causal diamonds. As $\lambda \to 0$, the diamond shrinks to zero size and approaches a locus in the asymptotic past. Similarly, $\lambda \to \infty$ follows a geodesic to future asymptotia. The geodesic distance in this case can be computed explicitly:

$$ d(\diamondsuit_1,\diamondsuit(\lambda)) = -L\,\cosh^{-1}\!\left(\frac{1+\lambda}{2\sqrt{\lambda}}\right) = -\frac{L}{2}\,\big|\log\lambda\big|\,. \tag{2.30} $$

A second simple example corresponds to a class of null geodesics $\diamondsuit(\lambda) = (c^\mu(\lambda), \ell^\mu(\lambda))$ with $\partial_\lambda c^\mu = \pm\partial_\lambda \ell^\mu$, where $\lambda$ denotes the affine parameter along the geodesic. Here we begin by noting that because the centre of mass coordinates are Killing coordinates for the metric (2.21), the following are conserved quantities along any geodesic in the kinematic space:

$$ P_\mu = \frac{L^2}{\ell^2}\left(-\eta_{\mu\nu}+\frac{2}{\ell^2}\,\ell_\mu\ell_\nu\right)\partial_\lambda c^\nu\,. \tag{2.31} $$

Further, the full geodesic equations for $\ell^\mu(\lambda)$ simplify greatly upon substituting $\partial_\lambda c^\mu = \pm\partial_\lambda \ell^\mu$, and one finds that they are solved by Eq. (2.32), which consistently maintains the desired equality between $\partial_\lambda c^\mu$ and $\pm\partial_\lambda \ell^\mu$. As noted above, $\Delta c^\mu = \pm\Delta\ell^\mu$ corresponds to two diamonds whose past/future tips coincide, and so these geodesics correspond to a simple monotonic trajectory through a family of causal diamonds where one tip remains fixed. A simple example is given by choosing $\ell^\mu = \delta^\mu_0\,R(\lambda)$ and $c^\mu = \delta^\mu_0\,t(\lambda)$, which yields Eq. (2.33), where $R_1$ is a constant determining the radius of the corresponding sphere at $\lambda = 1$.
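The null geodesics with one fixed tip can be explored numerically. The sketch below rests on two assumptions that are not taken verbatim from the text: (i) the metric (2.21) has the form $ds^2 = \frac{L^2}{\ell^2}\big(-\eta_{\mu\nu} + \frac{2}{\ell^2}\ell_\mu\ell_\nu\big)(dc^\mu dc^\nu - d\ell^\mu d\ell^\nu)$, and (ii) the candidate solution is $\ell^\mu = W^\mu/W^2$ with $W^\mu = a^\mu\lambda + b^\mu$, which is what conservation of the momentum conjugate to $c^\mu$ suggests once $\partial_\lambda c^\mu = \pm\partial_\lambda\ell^\mu$ is imposed. The code checks that $G_{\mu\nu}(\ell)\,\partial_\lambda\ell^\nu$ is constant along this curve; all names are hypothetical.

```python
L = 1.0                  # curvature scale of the moduli-space metric
ETA = [-1.0, 1.0, 1.0]   # diagonal Minkowski metric, here d = 3

def mdot(a, b):
    return sum(e * p * q for e, p, q in zip(ETA, a, b))

def ell_of(lam, a, b):
    """Candidate solution ell^mu = W^mu / W^2 with W^mu = a^mu * lam + b^mu."""
    w = [ai * lam + bi for ai, bi in zip(a, b)]
    w2 = mdot(w, w)
    return [wi / w2 for wi in w]

def momentum(lam, a, b, h=1e-5):
    """P_mu = G_{mu nu}(ell) d(ell^nu)/d(lam), using a central difference."""
    ell = ell_of(lam, a, b)
    up, down = ell_of(lam + h, a, b), ell_of(lam - h, a, b)
    dell = [(p - m) / (2.0 * h) for p, m in zip(up, down)]
    l2 = mdot(ell, ell)
    contraction = mdot(ell, dell)
    ell_low = [e * x for e, x in zip(ETA, ell)]
    dell_low = [e * x for e, x in zip(ETA, dell)]
    return [(L ** 2 / l2) * (-dv + 2.0 * el * contraction / l2)
            for dv, el in zip(dell_low, ell_low)]

def expected_momentum(a):
    """Analytic value -L^2 * eta_{mu nu} a^nu for the candidate solution."""
    return [-L ** 2 * e * ai for e, ai in zip(ETA, a)]

a, b = [-0.5, 0.1, 0.0], [-1.0, 0.0, 0.2]   # W stays timelike, ell future-directed
samples = [momentum(lam, a, b) for lam in (0.5, 1.0, 2.0)]

# Purely radial case: ell^mu = (R, 0, 0) with R = R1 / lam.
R1 = 3.0
radius_at_two = ell_of(2.0, [-1.0 / R1, 0.0, 0.0], [0.0, 0.0, 0.0])[0]
```

In the purely radial case the curve reduces to $R(\lambda) = R_1/\lambda$ with $t(\lambda) = t_0 \pm R(\lambda)$, so that one tip stays fixed. This is consistent with the description in the text, although the parametrization used there may differ by an affine redefinition of $\lambda$.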

Observables in a linearized approximation
As discussed in the introduction, we are interested in constructing new nonlocal observables $S_O(x,y)$ associated with a (local) primary operator $O$ in the CFT and with a causal diamond with past and future tips, x and y. Our motivation in the present section is to construct extensions of the first law of entanglement for spherical regions in the CFT vacuum. Again, as shown in Eq. (1.3), the perturbation in the entanglement entropy is given by the expectation value of a local operator, the energy density, integrated over the region enclosed by the sphere. This result was used in [35] to show that such first order perturbations obey a free wave equation on the corresponding kinematic space, i.e., d-dimensional de Sitter space. Moreover, a generalization of the first law was constructed for a conserved higher spin current, which yields an analogous charge $Q^{(s)}$ defined on the spherical region which also obeys a free wave equation on de Sitter space. Here, we would like to extend these results characterizing small excitations of the vacuum to arbitrary scalar primaries (footnote 8). We propose that a natural generalization of the first law to arbitrary primaries takes the following form (footnote 9):

$$ Q(O;x,y) \equiv C_O \int_{D(x,y)} d^d\xi \left(\frac{(y-\xi)^2\,(\xi-x)^2}{-(y-x)^2}\right)^{\frac{1}{2}(\Delta_O-d)} \langle O(\xi)\rangle\,, \tag{3.1} $$

where the integral is over the causal diamond $D(x,y)$ with past and future endpoints x and y, and $\Delta_O$ is the scaling dimension of the primary operator $O$. The constant $C_O$ is a normalization constant for which there is no canonical choice at the linearized level. Note that the integral above diverges for $\Delta_O \le d-2$; however, a universal finite term can still be extracted in this range. We return to this point in section 7.
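In the following, we will show that the quantity Q(O) has the following four properties:

1. Q(O) obeys a two-derivative wave equation on the moduli space of causal diamonds.

2. When O is a conserved current, Q(O) reduces to the first-law-like expressions constructed previously in [35].

3. Q(O) is connected to the operator product expansion: it captures the contribution of O and its conformal descendants to the OPE of two scalar operators of equal dimension.

4. In the case where the CFT has a holographic dual in the standard sense, Q(O) has a very simple bulk description. If $\phi$ is the bulk scalar that corresponds to O, we define

$$ Q_{holo}(O;x,y) \equiv \frac{C_{blk}}{8\pi G_N}\int_{\tilde B(x,y)} d^{d-1}u\,\sqrt{h}\;\phi(u)\,, $$

where $\tilde B(x,y)$ is the minimal surface whose boundary $\partial\tilde B(x,y)$ matches the maximal sphere at the waist of the causal diamond in the boundary CFT, i.e., the intersection of the past light-cone of y with the future light-cone of x. We will show that $Q_{holo}(O;x,y) = Q(O;x,y)$ with an appropriate choice of the normalization constant $C_{blk}$, which is determined by $C_O$ and standard AdS/CFT parameters, as we show explicitly in Appendix C.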
We stress that the first three properties above do not rely on an underlying holographic construction and hence apply for generic CFTs. It is only point 4, which directly connects to the AdS/CFT construction and so hints at the interesting new perspective which these nonlocal observables may provide for holography. Below, we will provide a more detailed explanation of each of these points and then discuss various other aspects of Q(O).
However, before proceeding, we want to highlight that Eq. (3.1) can be compactly re-expressed in terms of the conformal Killing vector $K^\mu$ which preserves the causal diamond (see appendix A.2 and Figure 8). In particular, using $K^\mu$, Eq. (3.1) becomes (see footnote 10)

$$ Q(O;x,y) = C_O \int_{D(x,y)} d^d\xi \left(\frac{|K|}{2\pi}\right)^{\Delta_O-d} \langle O(\xi)\rangle\,, $$

where the factors of $2\pi$ arise from a standard choice of normalization for the vector. Of course, these factors could easily be absorbed by redefining the constant $C_O$.
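The equivalence of the two forms of the kernel can be verified pointwise. The sketch below assumes the commonly used expression for the conformal Killing vector of a diamond with tips x and y, $K^\mu = -\frac{2\pi}{(y-x)^2}\big[(y-\xi)^2 (x-\xi)^\mu - (x-\xi)^2 (y-\xi)^\mu\big]$, which should be compared against appendix A.2 before being relied upon. Function names are hypothetical.

```python
import math

def mdot(a, b):
    """Minkowski inner product with signature (-,+,...,+)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def msq(v):
    return mdot(v, v)

def sub(a, b):
    return [p - q for p, q in zip(a, b)]

def killing_vector(xi, x, y):
    """Assumed conformal Killing vector preserving the diamond with tips x, y."""
    y_xi, x_xi = sub(y, xi), sub(x, xi)
    prefactor = -2.0 * math.pi / msq(sub(y, x))
    return [prefactor * (msq(y_xi) * p - msq(x_xi) * q)
            for p, q in zip(x_xi, y_xi)]

def kernel_base(xi, x, y):
    """(y - xi)^2 (xi - x)^2 / (-(y - x)^2), positive inside the diamond."""
    return msq(sub(y, xi)) * msq(sub(xi, x)) / (-msq(sub(y, x)))

def smearing_kernel(xi, x, y, delta, d):
    """Kernel of the generalized first law for a scalar of dimension delta."""
    return kernel_base(xi, x, y) ** ((delta - d) / 2.0)

x_tip, y_tip = [-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]   # unit sphere at t = 0
xi = [0.2, 0.3, -0.1]                               # a point inside the diamond

K = killing_vector(xi, x_tip, y_tip)
norm_ratio_squared = -msq(K) / (2.0 * math.pi) ** 2  # (|K| / 2 pi)^2
K_at_tip = killing_vector(x_tip, x_tip, y_tip)
```

With this normalization, $(|K|/2\pi)^2$ coincides with the base of the kernel at every point of the diamond, the vector is future-directed in the interior, and it vanishes at the tips.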

Dynamics on the space of causal diamonds
To show that Q(O) obeys a wave equation on the moduli space of causal diamonds is fairly straightforward. If we denote the generators of the conformal group by $L_i$, then

$$ \big(L_i(x)+L_i(y)\big)\,Q(O;x,y) = Q([L_i,O];x,y)\,, \tag{3.4} $$

where $L_i(x)$ is the first order differential operator for the purely geometric action of the conformal group on the point $x^\mu$, and similarly for $L_i(y)$ (see footnote 11). Eq. (3.4) holds because the kernel that appears in Eq. (3.1) can be interpreted formally as a three-point function of two primary operators of dimension zero and one primary operator of dimension $d-\Delta_O$. Such a three-point function is conformally invariant, and as a result the action of $L_i(x)+L_i(y)$ on the kernel can be converted into the action of $-L_i(\xi)$ (with a contribution from the non-trivial operator dimension of the third operator). A partial integration then yields Eq. (3.4). In fact, we could conversely have derived Eq. (3.1) by insisting that it obeys the intertwining property (3.4), and we can make this property more transparent by rewriting Eq. (3.1) using the shadow operator formalism [42] as

$$ Q(O;x,y) \propto \int d^d\xi\; \langle Y(x)\,Y(y)\,\tilde O(\xi)\rangle\,\langle O(\xi)\rangle\,, \tag{3.5} $$

where $Y$ represents a formal non-trivial primary operator of conformal dimension zero and $\tilde O$ denotes the operator of dimension $d-\Delta_O$ appearing in the three-point function above.

The action of the quadratic Casimir of the conformal group, $C_2 \equiv C^{ij}L_iL_j$, on Q(O) is obtained by applying Eq. (3.4) twice. The left hand side of the equation then becomes

$$ C^{ij}\,\big(L_i(x)+L_i(y)\big)\big(L_j(x)+L_j(y)\big)\,Q(O;x,y)\,. \tag{3.6} $$

Because $L_i(x)+L_i(y)$ represents the action of the conformal group on the moduli space, which is parametrized by pairs of (timelike separated) points (x, y), these are also the Killing vectors on this space, and the Casimir operator $C^{ij}(L_i(x)+L_i(y))(L_j(x)+L_j(y))$ is the massless Klein-Gordon operator for the metric (2.14). On the right hand side, the Casimir acts on the primary O, which returns O multiplied by the corresponding eigenvalue (see footnote 12), and therefore Q(O) obeys the following wave equation:

$$ \big(\nabla^2_\diamondsuit - m_O^2\big)\,Q(O;x,y) = 0\,, \qquad m_O^2L^2 = \Delta_O\,(d-\Delta_O)\,, \tag{3.8} $$

where $\nabla^2_\diamondsuit$ is the Klein-Gordon operator on the metric (2.14). We conclude that the Casimir is represented by the Klein-Gordon operator on the space of causal diamonds $M^{(d)}_\diamondsuit$. This can also be explicitly verified by acting with the Lorentz representation of $C_2$ on Eq. (3.4). For our conventions and normalizations in this regard, see appendix B.

Footnote 10: Recall from appendix A.2 that $K^\mu$ is a timelike vector and hence our notation is $|K| = \sqrt{-\eta_{\mu\nu}K^\mu K^\nu}$.

Footnote 11: That is, $L_i(x)$ and $L_i(y)$ are given by the expressions in Eq. (B.2) with $\Delta_O = 0$.

Operators with spin and conserved currents
Our construction can be easily generalized to the case where the primary operator is a traceless symmetric tensor of rank $\ell$ and scaling dimension $\Delta_O$. In this case, conformal invariance again provides a natural candidate for a 'first law'-like expression, which takes the form of Eq. (3.9), in which the tensor indices are contracted with a vector $s^\mu$ built from $K^\mu$, the conformal Killing vector introduced above (see appendix A.2, Figure 8 and footnote 13). Using this vector as in Eq. (3.15), the above generalization can be written in the compact form of Eq. (3.11). The expression in Eq. (3.9) follows from the shadow field formalism developed in [43] and the explicit result for the three-point function of two scalars and one higher spin field, e.g., in [44]. From conformal symmetry arguments (or alternatively from explicit calculation, see appendix B), it follows again that the expression in Eq. (3.9) satisfies a 'spinning' wave equation on the space of causal diamonds:

$$ \big(\nabla^2_\diamondsuit - m_O^2\big)\,Q(O;x,y) = 0\,, \qquad m_O^2L^2 = -\big[\Delta_O(\Delta_O-d) + \ell\,(\ell+d-2)\big]\,. \tag{3.12} $$

Figure 8. Domain of dependence $D(x,y)$ in the CFT$_2$ (red shaded) of a sphere $B(x,y)$, and the associated causal wedge in pure AdS$_3$ (blue shaded). The geodesic (blue) is the bulk minimal surface $\tilde B(x,y)$. Green arrows indicate the timelike Killing flow generated by $K^\mu$, which becomes null at the boundary of the domain of dependence (and vanishes at $x^\mu$ and $y^\mu$, see also Figure 10).

To illustrate the definition (3.9), we turn to the second point in our list of properties above and show that it reproduces the known first laws [35] when the operator is a conserved current. If the traceless symmetric tensor corresponds to a conserved current, then (see footnote 14)

$$ \Delta_O = d-2+\ell\,. \tag{3.13} $$

Hence Eq. (3.11) can be written as an integral over the causal diamond of $K_\mu\langle J^\mu\rangle/|K|^2$, where we have introduced the conserved current $J^\mu \equiv K_{\mu_2}\cdots K_{\mu_\ell}\,O^{\mu\mu_2\cdots\mu_\ell}(\xi)$ (see footnote 15). Now suppose we foliate the causal diamond by slices that are everywhere orthogonal to the vector field $K^\mu$, and we also introduce a flow parameter in the direction of $K^\mu$, which we will call $\lambda$. It is then clear that we can re-express the measure $d^d\xi$ as the product of $d\lambda$, the norm $|K|$ and the induced volume element on the constant $\lambda$ slices. As a result, Q(O) becomes an integral over $\lambda$ of the flux $n_\mu\langle J^\mu\rangle$ through each slice, with $n^\mu = K^\mu/|K|$ being the timelike unit normal to the constant $\lambda$ slices. However, because $J^\mu$ is a conserved current, the integral over a slice of $n_\mu J^\mu$ does not depend on the slice. Hence the $\lambda$ integral factorizes and we are left with $\int d\lambda$ times the flux through $B(x,y)$, where $B(x,y)$ is a constant $\lambda$ slice, e.g., the spherical region for which $D(x,y)$ is the domain of dependence. Note that the factor $\int d\lambda$ is in fact divergent, but it can be absorbed into the normalization constant $C_O$.
Hence upon a redefinition of the normalization constant, the final result can be written as

$$ Q(O;x,y) = C_O \int_{B(x,y)} d^{d-1}\Sigma\;\, n_\mu\,\langle J^\mu\rangle\,. \tag{3.17} $$

Observe that, in fact, current conservation allows the (d − 1)-dimensional surface defining the range of the remaining integral to be chosen as any Cauchy surface within the causal diamond $D(x,y)$, i.e., it need not be a constant $\lambda$ slice. Hence, as claimed in the second point on our list above, we have recovered precisely the first law for conserved currents proposed previously in [35]. In particular, the covariant version of the standard first law for entanglement entropy is immediately recovered with the choice $\ell = 2$, i.e., $O_{\mu_1\mu_2} = T_{\mu_1\mu_2}$ (see footnote 16).

Footnote 13: Note that $s^\mu$ and $K^\mu$ are both future-directed vectors within the causal diamond.

Footnote 14: Note that substituting Eq. (3.13) into Eq. (3.12) yields a mass-squared which differs by a factor of two from the mass-squared reported in [35]. However, as described above Eq. (2.15), restricting to the submanifold of spheres on a fixed time slice requires 'equating' the coordinates for the two tips of the causal diamond. This has the effect of reducing the mass. Effectively one has $\nabla^2_\diamondsuit \sim 2\,\nabla^2_{dS}$ on this restricted moduli space studied in [35]. In two dimensions the space $M^{(2)}_\diamondsuit$ actually factorizes into two copies of $dS_2$ as in (2.18). In this case one can make the above statement precise by noting that $\nabla^2_\diamondsuit = 2\,(\nabla^2_{dS_2} + \bar\nabla^2_{dS_2})$, with the two $dS_2$ spaces contributing $m^2_{dS_2}L^2 = -\ell(\ell-1)$ and $\bar m^2_{dS_2} = 0$, respectively (c.f., Eq. (3.39) and the discussion there).

Footnote 15: Current conservation follows here because $O_{\mu_1\ldots\mu_\ell}(\xi)$ is both traceless and conserved and because $K^\mu$ is a conformal Killing vector [35] (see appendix A.2).

Footnote 16: Note that another case which deserves special attention is $\ell = 1$ and $\Delta_O = d-1$, which corresponds to an ordinary conserved current, i.e., $J_\mu = O_\mu$. Naïvely, the above arguments would suggest that the corresponding operator (3.17) also satisfies the wave equation (3.12) on the moduli space. However, an implicit assumption in the derivation of the wave equation is that the current vanishes on the sphere $\partial B(x,y)$ and this is ensured in Eq. …
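The factor-of-two comparison with [35] can be reproduced with a few lines of arithmetic. The sketch assumes that the mass in the wave equation on the moduli space is minus the quadratic Casimir in units of L, i.e., $m^2L^2 = -[\Delta(\Delta-d) + \ell(\ell+d-2)]$. This normalization is inferred from the consistency checks quoted in the footnote and is not taken verbatim from the text. Function names are hypothetical.

```python
def casimir(delta, spin, d):
    """Quadratic Casimir of a symmetric traceless primary (standard form)."""
    return delta * (delta - d) + spin * (spin + d - 2)

def mass_squared_L2(delta, spin, d):
    """Assumed mass on the moduli space of causal diamonds, in units of L."""
    return -casimir(delta, spin, d)

def conserved_current_dimension(spin, d):
    """Scaling dimension of a conserved spin-`spin` current."""
    return d - 2 + spin

def holomorphic_split(delta, spin):
    """Two-dimensional weights (h, hbar) with delta = h + hbar, spin = h - hbar."""
    return (delta + spin) / 2.0, (delta - spin) / 2.0

# Stress tensor in d = 4: twice the de Sitter value m^2 L^2 = -d of [35].
stress_tensor_mass = mass_squared_L2(conserved_current_dimension(2, 4), 2, 4)

# d = 2, spin-3 current: the Casimir splits as 2 h (h - 1) + 2 hbar (hbar - 1).
h, hbar = holomorphic_split(conserved_current_dimension(3, 2), 3)
split_casimir = 2.0 * h * (h - 1.0) + 2.0 * hbar * (hbar - 1.0)
```

For conserved currents the assumed formula reduces to $m^2L^2 = -2(\ell-1)(d+\ell-2)$, which vanishes for $\ell = 1$ and gives $-2d$ for the stress tensor.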

Connection to the OPE
The third point in the list of features of Eq. (3.1) is the connection to the operator product expansion. In general, the OPE of two operators takes the form

$$ A(x)\,B(y) = \sum_i \frac{C_{ABO_i}}{|x-y|^{\Delta_A+\Delta_B-\Delta_i}}\,\Big(O_i(y) + \text{descendants}\Big)\,, $$

where on the right the sum is over all primary operators $O_i$ and their conformal descendants.
In two dimensions, where the conformal group is infinite-dimensional, we will take the sum to be over all quasi-primary operators and their descendants under the global conformal group only. In principle, there is an infinite sum over conformal descendants on the right hand side, but this infinite sum can be repackaged as an integral of $O_i$ smeared against a suitable kernel,

$$ A(x)\,B(y) = \sum_i C_{ABO_i} \int d^d\xi\; I_{ABO_i}(x,y,\xi)\; O_i(\xi)\,. $$

The kernel $I_{ABO_i}(x,y,\xi)$ that appears here is completely fixed by conformal invariance. One can in principle construct it by working out the relevant conformal Ward identities and solving them. If one does this, one recognizes that the Ward identities look exactly like those of a three-point function. In fact, this should not come as a surprise, as the shadow field identity (3.5) indeed implies that $I$ is proportional to a three-point function,

$$ I_{ABO_i}(x,y,\xi) \propto \langle A(x)\,B(y)\,\tilde O_i(\xi)\rangle\,, $$

where $\tilde O_i$ has dimension $d-\Delta_i$. This three-point function (for scalar operators) equals, up to normalization,

$$ \frac{1}{|x-y|^{\Delta_A+\Delta_B-(d-\Delta_i)}\;|x-\xi|^{\Delta_A+(d-\Delta_i)-\Delta_B}\;|y-\xi|^{\Delta_B+(d-\Delta_i)-\Delta_A}}\,. \tag{3.21} $$

For the quantity Q(O) which appears in the first law, we imagine that there should not be any special operators located at either x or y, and indeed we recover the form of the first law in Eq. (3.1) by taking $\Delta_A = \Delta_B = 0$. Of course, in an actual conformal field theory, there is only one operator with vanishing dimension, the identity operator, for which the three-point function above actually vanishes. One should therefore view this as a somewhat formal argument intended to explain the constraints imposed by conformal invariance alone.
Nevertheless, we notice from Eq. (3.21) that $I_{ABO_i}$ also reproduces the kernel in Eq. (3.1), up to a factor depending only on $x-y$, as long as $\Delta_A = \Delta_B$. We can therefore use either Eq. (3.1), or its bulk counterpart (3.24), to compute the contribution of a particular operator and all its conformal descendants to the OPE of two equal dimension scalar operators.
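The reduction of the three-point structure to the first-law kernel for $\Delta_A = \Delta_B$ can be checked numerically. The sketch below works with Euclidean distances to avoid sign and branch issues, and it uses the standard position dependence of a scalar three-point function with the third operator of dimension $d - \Delta_O$. Both choices are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def three_point_structure(x, y, xi, dim_a, dim_b, dim_c):
    """Position dependence of <A(x) B(y) C(xi)> for scalar primaries."""
    return (dist(x, y) ** (-(dim_a + dim_b - dim_c))
            * dist(x, xi) ** (-(dim_a + dim_c - dim_b))
            * dist(y, xi) ** (-(dim_b + dim_c - dim_a)))

def first_law_kernel(x, y, xi, delta_o, d):
    """Euclidean analogue of the kernel in the generalized first law."""
    base = (dist(y, xi) * dist(xi, x) / dist(x, y)) ** 2
    return base ** ((delta_o - d) / 2.0)

d, delta_o, delta_ext = 3, 1.7, 0.9   # equal external dimensions
x, y = [0.0, 0.0, 0.0], [1.0, 2.0, -0.5]
points = ([0.3, 0.4, 0.1], [-1.0, 0.7, 2.0])

ratios = [three_point_structure(x, y, xi, delta_ext, delta_ext, d - delta_o)
          / first_law_kernel(x, y, xi, delta_o, d) for xi in points]
expected_ratio = dist(x, y) ** (-2.0 * delta_ext)
```

The ratio of the two expressions is independent of $\xi$ and equals $|x-y|^{-2\Delta_A}$, which is the sense in which the three-point structure reproduces the kernel up to an overall factor.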
For example, consider a four-point function of four scalar operators with $\Delta_A = \Delta_B$ and $\Delta_C = \Delta_D$. We can ask what the contribution to this four-point function is when a particular operator O runs in the intermediate (AB)-(CD) channel, also known as a conformal block. Up to an overall normalization, we find that this conformal block equals the two-point function

$$ \langle Q(O;x_1,y_1)\;Q(O;x_2,y_2)\rangle\,. $$

We can now evaluate this two-point function using (3.1) and relate it to the integral of $\langle O(\xi_1)\,O(\xi_2)\rangle$ over two causal diamonds $D(x_1,y_1)$ and $D(x_2,y_2)$ on the boundary. In the context of the AdS/CFT correspondence, a Euclidean version of this argument underlies the geodesic Witten diagram prescription of [45]. Alternatively, we can use the bulk representation (3.24), which leads immediately to an expression involving a double integral over two minimal surfaces connected by a bulk-bulk propagator, reminiscent of the result in [45]. Finally, the same quantity admits yet another interpretation as the two-point function of Q(O) on the moduli space of causal diamonds. Notice that with all of the above we are working in Lorentzian signature (or mixed signature in the case of the moduli space of causal diamonds) and one has to be careful to precisely define the types of correlators and Green's functions that appear. There is also a close relation to the 'splines' introduced in [46].

Holographic description
So far our discussion did not assume any special features of the CFT. We now turn to point 4 on our list, which refers to the special case of holographic CFTs. In particular, we will be considering CFTs with a dual description in terms of weakly coupled gravity. In this setting, the scalar operator O in the boundary theory will be dual to a scalar field $\phi$ in the bulk, and we wish to show that the following simple bulk expression provides an alternative definition of Q(O):

$$ Q_{holo}(O;x,y) \equiv \frac{C_{blk}}{8\pi G_N}\int_{\tilde B(x,y)} d^{d-1}u\,\sqrt{h}\;\phi(u)\,. \tag{3.24} $$

Here, as discussed above, $\tilde B(x,y)$ is the extremal surface reaching the asymptotic AdS boundary at the maximal sphere that bounds the causal diamond (see Figure 8). Further, the measure $\sqrt{h}\,d^{d-1}u$ is simply the induced volume element on $\tilde B$. Now our claim, which we demonstrate below, is that

$$ Q_{holo}(O;x,y) = Q(O;x,y) \tag{3.25} $$

with an appropriate choice of the normalization constant $C_{blk}$. Note that $C_{blk}$ is fixed by standard AdS/CFT techniques once the normalization $C_O$ in (3.1) is given. In Appendix C, we explicitly compute $C_{blk}$ as a function of the CFT normalization $C_O$, the dimension d and the weight $\Delta_O$ (see Eq. (C.8) for the result). Note that it is natural to include an inverse factor of $8\pi G_N$ in the definition of $Q_{holo}(O)$, as this factor ensures that our new observable is dimensionless (see footnote 17), just as with its counterpart (3.1) in the boundary theory. The above holographic relation (3.25) is in line with the general philosophy that minimal surfaces should play a prominent role in the construction of these new boundary observables $Q(O;x,y)$, as is the case for holographic entanglement entropy.

Footnote 17: Recall that $8\pi G_N = \ell_P^{\,d-1}$ and we are assuming the usual 'supergravity' convention where the bulk scalar $\phi(u)$ is a dimensionless field.
To show the equality of Eqs. (3.1) and (3.24), we can argue as follows: If we apply the conformal generator $L_i(x)+L_i(y)$ to the above expression, this has the effect of an infinitesimal displacement of $\tilde B(x,y)$ in the direction of the Killing vector field $L_i$. The field $\phi$ at this displaced location differs from the original value by an amount $L_i\phi$, but the rest of the integrand remains unchanged because $L_i$ is a Killing vector field. Therefore

$$ \big(L_i(x)+L_i(y)\big)\,Q_{holo}(O;x,y) = \frac{C_{blk}}{8\pi G_N}\int_{\tilde B(x,y)} d^{d-1}u\,\sqrt{h}\;L_i\phi(u)\,. \tag{3.26} $$

In case this equation appears somewhat confusing, a simple one-dimensional version which illustrates the idea is

$$ \big(\partial_a+\partial_b\big)\int_a^b dx\,f(x) = \int_a^b dx\;\partial_x f(x)\,. $$

We see that the bulk description of Q(O) enjoys a similar intertwining property as in Eq. (3.4).
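The idea behind the intertwining property can be illustrated in one dimension: translating both endpoints of an interval changes the integral of a function by the integral of its derivative. The following sketch checks $(\partial_a + \partial_b)\int_a^b f = \int_a^b f'$ numerically with a simple midpoint rule. The test function and all names are arbitrary illustrative choices.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def shift_both_endpoints(f, a, b, eps=1e-4):
    """Central-difference version of (d/da + d/db) acting on the integral."""
    moved_up = integrate(f, a + eps, b + eps)
    moved_down = integrate(f, a - eps, b - eps)
    return (moved_up - moved_down) / (2.0 * eps)

def f(x):
    return math.sin(2.0 * x) + x * x

def f_prime(x):
    return 2.0 * math.cos(2.0 * x) + 2.0 * x

a, b = 0.3, 1.7
lhs = shift_both_endpoints(f, a, b)   # generator acting on the endpoints
rhs = integrate(f_prime, a, b)        # generator acting on the integrand
exact = f(b) - f(a)
```

Both `lhs` and `rhs` approximate $f(b) - f(a)$. In the bulk argument the two endpoints are replaced by the tips of the diamond and the derivative by the action of a conformal generator.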
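Applying the quadratic Casimir $C_2$ requires iterating Eq. (3.26) twice and we find

$$ C^{ij}\big(L_i(x)+L_i(y)\big)\big(L_j(x)+L_j(y)\big)\,Q_{holo}(O;x,y) = \frac{C_{blk}}{8\pi G_N}\int_{\tilde B(x,y)} d^{d-1}u\,\sqrt{h}\;C^{ij}L_iL_j\,\phi(u)\,. \tag{3.27} $$

Now we can use the fact that in our conventions $C^{ij}L_iL_j$ is proportional to the d'Alembertian acting in the AdS spacetime, e.g., [47] (see footnote 18). In particular, since $\phi(u)$ obeys a free massive field equation, the Casimir acts on $\phi$ as multiplication by a constant fixed by the mass, where we use the standard relation between the conformal dimension and the mass of the dual field, $m^2 = \Delta_O(\Delta_O-d)$ in units of the AdS curvature radius. Hence we find that the holographic bulk expression in Eq. (3.24) yields the same Casimir eigenvalue as found for the boundary expression in section 3.1, and the same wave equation (3.8) on kinematic space follows.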
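The standard relation between the bulk mass and the boundary scaling dimension used here can be written as a short helper. The sketch assumes a scalar field in AdS$_{d+1}$ with curvature radius `L_ads`, for which $m^2 L_{AdS}^2 = \Delta(\Delta - d)$. The function names are hypothetical.

```python
import math

def mass_squared(delta, d, L_ads=1.0):
    """Bulk mass-squared of the scalar dual to an operator of dimension delta."""
    return delta * (delta - d) / L_ads ** 2

def scaling_dimensions(m2, d, L_ads=1.0):
    """Both roots of delta * (delta - d) = m2 * L_ads^2, smaller one first."""
    discriminant = d * d / 4.0 + m2 * L_ads ** 2
    if discriminant < 0.0:
        raise ValueError("mass-squared lies below the Breitenlohner-Freedman bound")
    root = math.sqrt(discriminant)
    return d / 2.0 - root, d / 2.0 + root

m2_example = mass_squared(3.0, 4)             # operator of dimension 3 in d = 4
delta_minus, delta_plus = scaling_dimensions(m2_example, 4)
```

The larger root is the dimension conventionally assigned to the dual operator; the smaller root corresponds to the alternative quantization, which is admissible only in a restricted mass range.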
Hence we have shown that for scalar operators, Q(O) defined in Eq. (3.1) for the boundary theory and $Q_{holo}(O)$ defined in Eq. (3.24) for the bulk theory obey the same wave equation on kinematic space. If we could in addition show that both quantities obey the same boundary conditions for these equations, this would be sufficient to establish their equivalence up to an overall normalization. However, instead of studying the boundary conditions, there is a more direct argument to show the equivalence of Eqs. (3.1) and (3.24).
Inside the bulk causal domain attached to the boundary causal diamond, often referred to as the bulk causal wedge, we can reconstruct the value of the field using a bulk-boundary propagator which only involves the expectation value of the corresponding operator inside the causal diamond [48,49]. In other words, there exists a bulk-boundary propagator such that

$$ \phi(u) = \int_{D(x,y)} d^d\xi\; G_{b\partial}(u,\xi)\,\langle O(\xi)\rangle $$

for any u inside the causal wedge associated with $D(x,y)$. Inserting this expression into Eq. (3.24), we find

$$ Q_{holo}(O;x,y) = \frac{C_{blk}}{8\pi G_N}\int_{D(x,y)} d^d\xi \left[\int_{\tilde B(x,y)} d^{d-1}u\,\sqrt{h}\;G_{b\partial}(u,\xi)\right]\langle O(\xi)\rangle\,. $$

The integral between brackets does not depend on the values of the field and we denote the result of this integral by $H(x,y,\xi)$, resulting in

$$ Q_{holo}(O;x,y) = \frac{C_{blk}}{8\pi G_N}\int_{D(x,y)} d^d\xi\; H(x,y,\xi)\,\langle O(\xi)\rangle\,. \tag{3.31} $$

This already takes the form of the first law and all that is left to do is to show that the kernel $H(x,y,\xi)$ agrees with that appearing in Eq. (3.1). This can be seen as follows: The bulk-boundary propagator $G_{b\partial}(u,\xi)$ is invariant under the isometries of AdS, i.e., it is annihilated by the combined action of each generator $L_i$ on its bulk argument u and on its boundary argument $\xi$. Combining the above with Eq. (3.26) shows that Eq. (3.31) also obeys the intertwining property (3.4), and as discussed below Eq. (3.4), this uniquely fixes $H(x,y,\xi)$ up to an overall constant and hence it must agree with the kernel appearing in the first law (3.1). One potential subtlety in the above analysis is that causal wedge reconstruction strictly speaking only applies to the interior of the causal wedge, and to extend it to the boundary of the causal wedge requires us to make an assumption that the field is continuous there. At the linearized level one could contemplate that there exist solutions of the $\phi$ field equations with support outside and on the boundary of the causal wedge only. For example, one could assume that $\langle O\rangle$ is zero everywhere inside the causal diamond and then discontinuously jumps to a finite value at the boundary of the causal diamond.

Footnote 18: See appendix B for details.
While one cannot, strictly speaking, exclude such field configurations, they will tend to produce strange effects at higher orders and can for example produce a singular energy-momentum tensor leading to a large back-reaction. For simplicity, we will in this paper simply ignore this issue and restrict to continuous field configurations. An explicit calculation demonstrating the desired equality (3.25) for d = 2 with smooth configurations is given in appendix C.1.
It is interesting to observe that the bulk-boundary propagator for causal wedges is usually written in momentum space and behaves in such a way that a direct Fourier transform to position space is ill-defined [49][50][51]. However, after integrating the bulk operator over a bulk minimal surface, we apparently obtain the rather simple expression (3.1) where the expectation value of the operator is smeared with a perfectly well-behaved kernel. It would be good to have a better understanding of the origin of this simplification, but one could certainly say that this observation provides further evidence that the Q(O; x, y) are natural objects to study in the CFT.
Vector fields: It is instructive to see what the bulk description is for an example of a non-scalar field. We briefly describe the result for a (massless or massive) vector field, leaving the case of higher spin fields as an interesting exercise for the reader. Given a bulk vector field $A_M$, we can always construct the (d − 1)-form $*F$, with $F = dA$ the field strength. The bulk expression in this case turns out to be

$$ Q_{holo}(O;x,y) \propto \int_{\tilde B(x,y)} {*F}\,. \tag{3.33} $$

To show that this agrees with the first laws (3.9) and (3.16) for massive and massless vector fields, we can use the same group-theoretic argument that we used for scalar fields above. Interestingly, while the CFT counterpart (3.9) diverges in the massless limit and becomes an infinite factor times (3.16), the bulk expression (3.33) remains finite in the massless limit.
For massless vectors, the field equation reads $d{*F} = 0$, and therefore we can continuously deform the bulk minimal surface $\tilde B(x,y)$ without changing the value of $Q_{holo}(O)$. In particular, we can deform it all the way up to a spatial slice in the asymptotic AdS boundary. Then using the asymptotic behavior of a massless vector field in AdS directly, we find that Eq. (3.33) agrees with Eq. (3.17) for a spin-one current. Further, this argument can be applied for a vector field version of the derivation of the linearized Einstein equations from entanglement entropy [33], described in the introduction. For massive vectors, $d{*F} \sim m^2\,{*A}$ and this simple argument no longer applies. We return to these observations in the closing discussion section.
We finally notice that in writing Eq. (3.33) we assumed the bulk action for the gauge field to be of Maxwell type. In 2+1 dimensions it is also possible to have a topological theory with only a Chern-Simons term instead. In that case, the bulk description should be replaced by a Wilson loop of the gauge field.

Euclidean signature
We can repeat much of the above logic in Euclidean signature, but there are some significant modifications. In this case, one might consider two distinct possibilities: The first would be the moduli space of pairs of (spacelike separated) points, which becomes $SO(1,d+1)/(SO(d)\times SO(1,1))$. The second distinct case would be the moduli space of (d − 2)-dimensional spheres, which becomes $SO(1,d+1)/(SO(1,d-1)\times SO(2))$. In either case, there is still a natural metric on the moduli space given by a suitable Wick rotation of Eq. (2.14).
Similarly, as described in appendix A.2, the conformal Killing vector $K^\mu$ may be analytically continued to produce a conformal Killing vector of $R^d$ which has fixed points either on a pair of points or on a (d − 2)-sphere. This provides us with two extensions of our new observables to Euclidean space through Eqs. (3.15) and (3.11). However, the causal diamonds are lost in Euclidean signature and so there is no natural finite domain with which to associate these observables. As a result, our analogous Euclidean expressions would now involve an integral over the entire Euclidean boundary. Of course, there is no obvious reason that such an integral should converge. However, we conjecture that it is possible to extract a universal finite term when the integrals are suitably regulated. This issue would not arise in the case of a conserved current, where the integral in Eq. (3.17) is reduced to a Cauchy surface spanning the causal diamond. In Euclidean signature, when considering the (d − 2)-spheres, this integral naturally continues to an integral over any (d − 1)-dimensional surface whose boundary is the corresponding sphere. In the case of pairs of points, the natural domain would be a closed (d − 1)-dimensional surface enclosing one of the points (see footnote 19). The connection with the resummation of a local operator and its conformal descendants in the OPE remains valid given a pair of points. In fact, the shadow operator formalism [42,43] was originally developed in Euclidean signature. A similar discussion might be developed for the case of spheres; however, it would involve the OPE limit of (d − 2)-dimensional surface operators, e.g., see discussions in [52,53].
In a holographic context, if we consider a sphere in the boundary theory, this again naturally defines a preferred extremal surface in the bulk. Hence the discussion of the holographic description of Q holo (O) is essentially unchanged from that given in section 3.4. On the other hand, given two spacelike separated points on the boundary, we must turn to a new class of natural minimal surfaces, namely, the geodesic connecting the two boundary points. Of course, integrating over this bulk surface in Eq. (3.24) provides a natural construction of a bulk observable which is again entirely geometric in nature. The arguments we gave in Lorentzian signature, which crucially relied on conformal invariance, can be repeated in Euclidean signature (at least formally) to show that the boundary and bulk descriptions of Q(O) agree in either case, and that it still obeys a wave equation on the corresponding moduli space. With the case of pairs of points, one makes direct contact with the geodesic Witten diagram prescription of [45], which was also derived in Euclidean signature, as well as with the 'splines' introduced in [46]. Further, in this case, for higher-spin symmetric traceless tensor fields, the natural bulk quantity to consider is the contraction of the rank-j tensor field with j copies of the unit tangent vector along the geodesic. This object is again quite distinct from its spherical or Lorentzian counterparts, for which we only worked out the vector field case; we return to this point in section 7.
Of course, one could also consider the moduli space of spacelike separated pairs of points in Lorentzian signature and we discuss this possibility at length in appendix A.3. In this case, the coset geometry is SO(2, d)/(SO(1, d − 1) × SO(1, 1)), which matches that for the moduli space of spheres in Eq. (2.11). In the appendix, by extending our considerations from Minkowski space to R × S d−1 geometry, we show that the two moduli spaces are in fact identical. That is, the moduli space of spacelike pairs of points is the same geometric object as the moduli space of spheres or timelike pairs of points. It is interesting that in the R × S d−1 geometry, a pair of spacelike separated points defines a region of finite volume, namely that enclosed by the past and future lightcones of both points. Further, the conformal Killing vector K µ is naturally extended to generate a flow on this region (with fixed points on the spacelike pair). Hence in this context, it would be natural to define nonlocal observables using Eqs. (3.15) and (3.11) where the integration would now run over this new volume. In a holographic CFT, these observables would naturally have a gravity description analogous to Eq. (3.24) except the bulk integral would run over the geodesic connecting the spacelike separated pair of points on the boundary. The latter construction would again connect directly to the discussion of geodesic Witten diagrams [45]. Of course, it would be interesting to fully explore the implications of this equivalence of these two moduli spaces in Lorentzian signature. It would also be good to develop a better conceptual understanding of the peculiar differences between the various Lorentzian and Euclidean versions of Q(O).
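As a simple consistency check on the coset descriptions above, one can count dimensions: each moduli space should be 2d-dimensional, since a pair of points in d dimensions carries 2d coordinates. The following is a minimal Python sketch of this counting. It assumes the stabilizer subgroups SO(1, d − 1) × SO(1, 1) for Lorentzian diamonds, SO(d) × SO(1, 1) for Euclidean pairs of points, and SO(1, d − 1) × SO(2) for Euclidean spheres; the function names are illustrative and do not come from the text.

```python
def so_dim(p, q):
    """Dimension of SO(p, q); it depends only on n = p + q."""
    n = p + q
    return n * (n - 1) // 2


def coset_dim(group, stabilizer_factors):
    """Dimension of G/H for G = SO(p, q) and H a product of SO factors."""
    return so_dim(*group) - sum(so_dim(*h) for h in stabilizer_factors)


def moduli_space_dims(d):
    """Dimensions of the three moduli spaces, under the assumed stabilizers."""
    return {
        # SO(2, d) / (SO(1, d-1) x SO(1, 1)): Lorentzian causal diamonds
        "lorentzian_diamonds": coset_dim((2, d), [(1, d - 1), (1, 1)]),
        # SO(1, d+1) / (SO(d) x SO(1, 1)): Euclidean pairs of points
        "euclidean_point_pairs": coset_dim((1, d + 1), [(d, 0), (1, 1)]),
        # SO(1, d+1) / (SO(1, d-1) x SO(2)): Euclidean (d-2)-spheres
        "euclidean_spheres": coset_dim((1, d + 1), [(1, d - 1), (2, 0)]),
    }


if __name__ == "__main__":
    for d in range(2, 7):
        print(d, moduli_space_dims(d))
```

The count is only a necessary condition: it confirms the dimension of each coset but says nothing about signature or global structure.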

Other fields
We have so far discussed scalar fields and some aspects of higher spin fields described by symmetric traceless tensors. There are clearly many other types of fields one could contemplate studying carrying different representations of the Lorentz group such as fermions or antisymmetric tensors. It would be interesting to study such fields as well, and to examine whether the natural generalization of Q(O) remains a scalar on the space of causal diamonds or whether it can become a quantity which carries nontrivial quantum numbers under the local SO(d, d) Lorentz group on the generalized kinematic space.
A natural starting point to explore such generalizations would be to put different fiducial operators A, B at the tips of the causal diamond and to write down a first law with a kernel of the form (3.21) for some operator O which appears in the OPE of A and B. The corresponding Q(O), perhaps better denoted by Q A,B (O), will then obey a modified field equation which can be obtained by repeating the logic around Eq. (3.6). However, now L i (x) and L i (y) are no longer purely geometric but also involve an internal piece due to the non-scalar nature of A and/or B. It is, however, less obvious how to generalize the bulk description (3.24) to this case, or whether Q A,B (O) can be extended in any natural way beyond this linearized approximation. In Euclidean signature, the bulk geodesic connecting A and B could be understood as the leading classical trajectory for a scalar particle connecting A and B. However, the discussion of geodesic Witten diagrams [45] suggests that the integral along the geodesic should be weighted by a measure depending on the difference in the conformal weights of the operators A and B. Further, if A and B carry non-trivial Lorentz representations, one should presumably use classical trajectories for particles transforming precisely under those representations. Once again, there are many interesting directions to explore and we have presumably only uncovered the tip of the iceberg.

Two dimensions
In the remainder of the paper, our examination will focus primarily on two-dimensional CFTs. Hence to set the stage for the subsequent sections, we will explicitly illustrate ideas appearing in the previous discussion of our nonlocal observables for two dimensions. We will also show that certain straightforward generalizations and simplifications emerge for d = 2. The latter seem to be closely related to the fact that light-cone coordinates (or complex coordinates in Euclidean section) provide a preferred framework in which to describe two-dimensional CFTs, e.g., conserved currents naturally split into independent left- or right-moving components. The preferential role of null coordinates for d = 2 was also reflected in the discussion of the geometry of kinematic space in section 2.1. In particular, we found that in this case, the metric on the moduli space of causal diamonds factorized into two copies of the metric on two-dimensional de Sitter space when using null coordinates; see the discussion below Eq. (2.16).
Hence to begin, consider a general quasi-primary operator O with conformal weights (h, h̄) in a two-dimensional CFT. Now we adopt the null coordinates introduced in Eqs. (2.16) and (2.17) and then we may define the following observable: for h̄ = 0, with an analogous expression for h = 0. Note we had to redefine the normalization constant above since the integral over ξ̄ yields a divergent result in the limit h̄ → 0, i.e., Note that observables of the form given in Eq. (3.35) are completely independent of ū and v̄, i.e., they are completely independent of the positions of the top-right and lower-left boundaries of the causal diamond in Figure 9. Since O(ξ) involves only right-moving modes, the nonlocal observable in Eq. (3.35) is only sensitive to expanding (or contracting) the causal diamond in the ξ direction. In terms of the kinematic space, the result is completely independent of the position of the causal diamond in the second dS 2 factor in Eq. (2.18), i.e., the factor involving ū and v̄.
When the operator O in Eq. (3.35) is the right-moving component of the stress tensor T ξξ with h = 2, we have Note that the above observable is completely independent of that constructed with the left-moving component T ξ̄ξ̄ , i.e., independent of Q(T ξ̄ξ̄ ; v̄, ū). The first law of entanglement (1.3) (with d = 2) actually corresponds to the sum of these two expressions along with the appropriate choice of the normalization factor, i.e., we use T tt = T ξξ + T ξ̄ξ̄ and C O = 2π. Hence we observe that the entanglement entropy, at least in weakly excited states, naturally splits into right- and left-moving contributions. We will return to this important point in the next section.
Wave equations: Following [35], it is immediate to see that Eq. (3.34) satisfies not just one but two wave equations on the kinematic space. This follows because the right- and left-moving parts of the integration kernel are both bulk-boundary propagators in two-dimensional de Sitter space, i.e., in the separate dS 2 factors in Eq. (2.18). As a result, we find where the d'Alembertians ∇ 2 dS 2 and ∇̄ 2 dS 2 only act on the copy of de Sitter space defined in Eq. (2.18) involving the right- and left-moving coordinates, respectively.
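The propagator property of the right-moving kernel can be illustrated numerically. The sketch below assumes that the kernel has the form ((v − ξ)(ξ − u)/(v − u))^(h−1); this form is an assumption made here for illustration, since the explicit expression is not reproduced in the surrounding text. For a dS 2 metric proportional to du dv/(v − u)^2, the d'Alembertian is proportional to (v − u)^2 ∂ u ∂ v , so the convention-independent statement checked here is (v − u)^2 ∂ u ∂ v K = −h(h − 1) K. All function names are hypothetical.

```python
def kernel(u, v, xi, h):
    """Assumed right-moving smearing kernel ((v - xi)(xi - u)/(v - u))**(h - 1)."""
    return ((v - xi) * (xi - u) / (v - u)) ** (h - 1)


def mixed_derivative(func, u, v, eps=1e-4):
    """Central finite-difference estimate of d^2 func / (du dv)."""
    return (
        func(u + eps, v + eps)
        - func(u + eps, v - eps)
        - func(u - eps, v + eps)
        + func(u - eps, v - eps)
    ) / (4.0 * eps**2)


def wave_equation_residual(u, v, xi, h):
    """(v - u)^2 d_u d_v K + h (h - 1) K, which should vanish for u < xi < v."""
    lhs = (v - u) ** 2 * mixed_derivative(lambda a, b: kernel(a, b, xi, h), u, v)
    return lhs + h * (h - 1) * kernel(u, v, xi, h)


if __name__ == "__main__":
    for h in (1, 2, 2.5, 3):
        print(h, wave_equation_residual(0.0, 2.0, 0.7, h))
```

For h = 2 the eigenvalue is −2, which is the same coefficient that multiplies the linear term when the nonlinear entanglement equation of section 4 is expanded around the vacuum.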
Note that the wave equation (3.8) considered above for general dimensions corresponds to the sum of the two equations in Eq. (3.38). 20 However, we see that Eq. (3.8) is supplemented here by a second equation of the form Of course, we recover precisely the mass in Eq. (3.8) upon substituting ∆ O = h + h̄ and ℓ = h − h̄.
Note that here we can see the reason why m 2 O L 2 appears to be twice the mass that one would expect from the analysis on a fixed time-slice (as in [35]): the mass on each copy of dS2 in (3.38) contributes twice (i.e., with a factor of 2 in Eq. (3.39)) to the full mass m 2 O L 2 .
We might note that in the case of a spinless operator (with h = h̄), this second equation The importance of this constraint equation was emphasized in [38] and we return to this point in section 6.1.
Relation to OPE: We wish to return to the arguments of section 3.3 in the special case of two dimensions, where we can make the necessary calculations more explicit. Let us consider the OPE of two general operators A(u, ū) and B(v, v̄) inserted at the tips of the causal diamond, e.g., see Figure 9. It is natural to write the analog of the resummation ansatz (3.19) for the OPE blocks in a factorized form for two dimensions: As in general dimensions, the kernels I ABO i and Ī ABO i are fixed by the global part of the conformal group. However, we can easily carry out this exercise in the case of d = 2. Recall that for a quasi-primary O(z, z̄) we have where L k is the k-th generator of the Virasoro algebra and k = −1, 0, 1 (for a primary this holds for all k ∈ Z). Commuting L k through the OPE (3.41) therefore yields The solution to the three differential equations found by setting k = −1, 0, 1 in the above equation is where the symmetry analysis does not fix the normalization. Eq. (3.44) then provides the right-moving contribution. One can repeat the same argument for the left-moving factor and find the kernel Ī ABO i (ū, v̄, ξ̄), which has an analogous form. Let us add that having found a unique consistent solution of the Ward identities then justifies the validity of our factorized ansatz in Eq. (3.41). Again, above we are only considering global conformal blocks; however, it would be interesting to extend this analysis to a resummation of entire Virasoro blocks. The holographic description of such full conformal blocks has already been studied in [54].

Interacting fields on d = 2 moduli space
In the previous section, our considerations focused on new nonlocal observables Q(O) whose construction was motivated as a generalization of the first law of entanglement (1.3). A natural question then is whether for each observable Q(O), we can go beyond the linearized approximation implicit in the first law. That is, whether there is some nonlinear quantity equivalent to the full S EE in the first law, which can be defined for finite excitations away from the vacuum state. Of course, one would hope that such nonlinear quantities would still share at least some of the nice features of the corresponding observable Q(O), such as local 2-derivative equations of motion in the auxiliary space M (d) ♦ or a natural appearance in the OPE. We start this line of investigation here by considering the full entanglement entropy S EE and asking whether there is a nonlinear extension of the wave equation (3.12) for this quantity.

Vacuum excitations
When we apply the entanglement first law (1.3) as a diagnostic to characterize CFT states, i.e., we apply the first law to all possible spheres, implicitly we are considering excited states which have a very small expectation value of the energy-momentum tensor everywhere. So our next step is to consider states where the expectation value ⟨T µν⟩ may be finite. In particular, we will focus on two-dimensional CFTs, where an infinite number of excited states characterized only by the expectation value of the energy-momentum tensor can be obtained simply by acting with a local conformal transformation on the CFT vacuum on a plane. The expectation value of the energy-momentum tensor for any of these states is given by the Schwarzian derivative, i.e., with a local conformal transformation 21 we obtain right- and left-moving components of the stress tensor Such functions f (z) and f̄ (z̄) may not exist globally, but we will ignore global issues in what follows, in particular the fact that minimal surfaces can change discontinuously as, e.g., in the BTZ black hole. Our discussion will therefore only be valid for sufficiently short intervals or for excited states which are connected to the ground state via a diffeomorphism.
Now let us consider the entanglement entropy in the above states. 22 First, we recall that the entanglement entropy can be evaluated using the replica trick, e.g., [52,55,56,57]: one begins with the Rényi entropies S n = (1/(1 − n)) log Tr ρ n A , which involve the reduced density matrix ρ A on the spatial domain A. Here Tr ρ n A can be evaluated as a path integral on an n-fold cover of the original geometry on which the CFT lives. However, an alternative description of this quantity comes in terms of twist fields, σ n , which act in an n-fold replicated version of the CFT and implement twisted boundary conditions connecting the copies of the CFT along the entangling surface, i.e., the boundary of A. In general dimensions then, the σ n are codimension-two surface operators with support on the boundary of A. However, two dimensions are special since A will consist of a union of disjoint intervals and the twist fields are local primary operators inserted at the endpoints of each interval. In particular, in two-dimensional CFTs, the twist operators σ ±n are local primaries with conformal weights [55,56] h n = h̄ n = (c/24)(n − 1/n). Now we are interested in the n → 1 limit in which the Rényi entropy reduces to the entanglement entropy, i.e., S EE = lim n→1 S n .
21 For simplicity, we will continue to phrase our discussion here in terms of the null coordinates introduced in Eq. (2.16). This is not entirely consistent with certain points in the following discussion where a Euclidean signature is implicit, e.g., the path integral representation of Tr ρ n A . However, one can easily Wick rotate to complex coordinates in Euclidean signature, i.e., z = x + i τ and z̄ = x − i τ with τ = it.
22 Further discussion of twist operators is presented in section 7.
In particular, the above discussion lets us relate the entanglement entropy of a single interval or of a single causal diamond to the two-point correlator of a pair of twist operators where in the Minkowski vacuum of the CFT, the desired correlation function takes the form [55] where a n is a numerical coefficient, with a 1 = 1. Since the twist operators are primaries in a replicated CFT, this two-point function transforms in a standard way under the conformal transformations (4.1). Hence using Eqs. (4.4) and (4.5), the final result for the entanglement entropy in any of the above states (4.2) is given by where we introduced the short distance cut-off δ. Further, (u, ū) and (v, v̄) denote the tips of the causal diamond, as in figure 9, after the conformal transformation (4.1). As a simple example, if we choose f (z) = z and f̄ (z̄) = z̄, we recover the standard result for a single interval where, however, the length of the interval is specified here by the distance between the past and future tips of the corresponding causal diamond. A number of comments are in order here: First, Eq. (4.7) appeared already in [58] and subsequently this approach was applied in [55] to obtain the entanglement entropy of a single interval both in a circular spatial domain and in a thermal state (on an infinite line). Second, for holographic CFTs, Eq. (4.7) can be derived in full generality by using the Ryu-Takayanagi prescription and AdS 3 gravity; see the discussion in the next section, as well as [59]. Finally, we observe that the full entanglement entropy (4.7) is the sum of independent contributions from the right- and left-moving modes That is, we interpret S R (f ; u, v) as the contribution to the entanglement entropy for the right-moving modes in the state generated by the conformal transformation w = f (z) since it is only sensitive to variations in the width of the causal diamond in the ξ direction, as illustrated in figure 9.
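The replica-trick logic for a single interval in the vacuum can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the twist weights are taken to be h n = (c/24)(n − 1/n), the trace Tr ρ n A is modeled by the pure power law (ℓ/δ)^(−4 h n ) with the coefficient a n set to one, and all function names are hypothetical.

```python
import math


def twist_weight(n, c):
    """Assumed twist-operator weight h_n = (c / 24) (n - 1 / n)."""
    return c / 24.0 * (n - 1.0 / n)


def renyi_entropy(n, c, ell, delta):
    """S_n = log(Tr rho^n) / (1 - n) with Tr rho^n = (ell / delta)**(-4 h_n)."""
    log_trace = -4.0 * twist_weight(n, c) * math.log(ell / delta)
    return log_trace / (1.0 - n)


def vacuum_entanglement_entropy(c, ell, delta):
    """Standard single-interval vacuum result (c / 3) log(ell / delta)."""
    return c / 3.0 * math.log(ell / delta)


if __name__ == "__main__":
    c, ell, delta = 12.0, 5.0, 0.01
    for n in (2.0, 1.1, 1.001, 1.000001):
        print(n, renyi_entropy(n, c, ell, delta))
    print("n -> 1 limit:", vacuum_entanglement_entropy(c, ell, delta))
```

In this model S n = (c/6)(1 + 1/n) log(ℓ/δ), so the n → 1 limit reproduces the familiar (c/3) log(ℓ/δ), while n = 2 gives three quarters of that value.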
Of course, S L (f̄ ; ū, v̄) is the analogous contribution from the left-movers. Recall that the same split appeared for δS EE in Eq. (3.37). There it resulted since 'holomorphicity' was a general feature of the observables constructed with conserved currents, e.g., the stress tensor, as in Eq. (3.35). With our analysis here, we see that this splitting survives at the nonlinear level for the entanglement entropy. Furthermore, the right-moving contribution S R (f ) turns out to solve the Liouville equation in the form (4.11), and similarly for S L (f̄ ). To relate this result to our previous wave equations, we turn back to our motivation which was to find a nonlinear generalization of δS EE , or rather of Q(T zz ; v, u) in Eq. (3.37). Having right- and left-moving contributions to the entanglement entropy, which are still sensibly defined in Eq. (4.10) for states with finite energy densities, it is natural to define That is, we consider the finite difference between the right-moving contribution to the entanglement entropy for intervals in the state generated by w = f (z) and that in the original vacuum state. Of course, there is a completely analogous difference ∆S L for the left-moving contribution. Note that these differences are UV finite and independent of the short distance cut-off δ. Further, for an infinitesimal conformal transformation (4.1), e.g., f (z) = z + ε g(z) and using Eq. (4.2), one can confirm that to leading order in ε, with C O = 2π. That is, to leading order, the finite difference observable (4.12) matches the linearized observable (3.36) which yields the right-moving contributions to δS EE in the first law (3.37). Quite remarkably, Eq. (4.11) yields a local equation of motion for both ∆S R and ∆S L on, respectively, the right- and left-moving de Sitter factors in the kinematic space for two-dimensional CFTs, e.g., see Eq. (2.18), where the nonlinear potential is given by Hence if we linearize the wave equation (4.14) for, e.g., ∆S R , then the term −2s in the above expression corresponds precisely to the mass term (with L 2 = 1) for h = 2 and h̄ = 0 needed to reproduce Eq. (3.38). Of course, we also recover the desired linearized wave equation for ∆S L . Implicitly, we also have the equations which remain unchanged from their linearized counterparts. Further, we might note that the higher order interactions in Eq. (4.15) are suppressed by inverse powers of the central charge at large c. Finally, notice that we have written the wave equations (4.14) as though they resulted from the variation of an action. In particular then, the underlying potential would be V (s) = −(c/6) s − (1/2)(c/6)^2 exp(−12s/c), which has a single unstable extremum at s = 0. Our results here demonstrate that at least for the universal family of states characterized by (4.2), the spatiotemporal organization of the entanglement entropy, or more precisely of its right- and left-moving contributions over the vacuum entanglement entropy, is governed by the Lorentzian structure of the moduli space of causal diamonds (2.18) and obeys a nonlinear and local propagation law on this space.
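The Liouville property and its reduction to a local equation for the difference ∆S R can be checked numerically. The sketch below assumes the standard form S R (f ; u, v) = (c/12) log[(f (v) − f (u))^2 /(δ^2 f ′(u) f ′(v))] for the right-moving contribution; this form, the sample map f (z) = z + 0.3 z^3, and all function names are assumptions introduced here for illustration. In these conventions the two statements tested are ∂ u ∂ v S R = (c/(6δ^2)) exp(−12 S R /c) and (v − u)^2 ∂ u ∂ v ∆S R = (c/6)(exp(−12 ∆S R /c) − 1); overall signs in the text's equations depend on the metric conventions of Eq. (2.18).

```python
import math


def mixed_derivative(func, u, v, eps=1e-4):
    """Central finite-difference estimate of d^2 func / (du dv)."""
    return (
        func(u + eps, v + eps)
        - func(u + eps, v - eps)
        - func(u - eps, v + eps)
        + func(u - eps, v - eps)
    ) / (4.0 * eps**2)


def make_entropy(f, fprime, c, delta):
    """Assumed right-moving entropy for the state generated by w = f(z)."""
    def s_right(u, v):
        ratio = (f(v) - f(u)) ** 2 / (delta**2 * fprime(u) * fprime(v))
        return c / 12.0 * math.log(ratio)
    return s_right


def liouville_residual(f, fprime, c, delta, u, v):
    """Relative mismatch in d_u d_v S = (c / (6 delta^2)) exp(-12 S / c)."""
    s = make_entropy(f, fprime, c, delta)
    lhs = mixed_derivative(s, u, v)
    rhs = c / (6.0 * delta**2) * math.exp(-12.0 * s(u, v) / c)
    return (lhs - rhs) / rhs


def difference_residual(f, fprime, c, delta, u, v):
    """Mismatch in (v-u)^2 d_u d_v dS = (c/6)(exp(-12 dS / c) - 1), dS = S(f) - S(vacuum)."""
    s = make_entropy(f, fprime, c, delta)
    s_vac = make_entropy(lambda z: z, lambda z: 1.0, c, delta)
    ds = lambda a, b: s(a, b) - s_vac(a, b)
    lhs = (v - u) ** 2 * mixed_derivative(ds, u, v)
    rhs = c / 6.0 * (math.exp(-12.0 * ds(u, v) / c) - 1.0)
    return lhs - rhs


def cubic(z):
    """Sample monotonic map (hypothetical choice)."""
    return z + 0.3 * z**3


def cubic_prime(z):
    return 1.0 + 0.9 * z**2


if __name__ == "__main__":
    print(liouville_residual(cubic, cubic_prime, 24.0, 1e-2, 0.2, 1.5))
    print(difference_residual(cubic, cubic_prime, 24.0, 1e-2, 0.2, 1.5))
```

Expanding the right-hand side of the second relation to first order gives −2 ∆S R , which matches the h = 2 mass term, and the cut-off δ drops out of the difference, as stated above.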

Beyond vacuum excitations
In Eq. (4.12), the vacuum state was chosen as the reference state in defining ∆S R and ∆S L . However, this choice was arbitrary and we could easily consider the difference where the reference state is that generated from the vacuum by the conformal transformation w = f 0 (z). In this case, the corresponding wave equation derived from Eq. (4.11) becomes with the same potential as in Eq. (4.15). Inspecting the derivative term, we recognize this as the Laplacian on two-dimensional de Sitter space in transformed coordinates. That is, if we begin with dS 2 with null coordinates (u 0 , v 0 ), e.g., see Eq. (2.18), then the coordinate transformation u 0 = f 0 (u) and v 0 = f 0 (v) yields 23 Of course, the analogous discussion applies for the left-moving contribution with w̄ = f̄ 0 (z̄) defining the reference state. Hence, with a new choice of reference state, one recovers precisely the same nonlinear wave equations and the only change is a coordinate transformation on the corresponding dS 2 geometry. We should note that we are only examining the local geometry of the moduli space here and we have not concerned ourselves with any subtleties that may arise at the global level.
The above results connect directly to the recent work of [60]. They examined the linearized propagation of entanglement excitations in various nontrivial states, e.g., for finite temperature or for finite spatial periodicity. Of course, both of these states in the CFT can be generated with an appropriate conformal transformation [55], e.g., w = exp(2πz/β) generates the thermal state from the flat space vacuum. 24 Hence these states fall into the universal class of states studied above and as described above, the auxiliary geometry describing the appropriate moduli space for these two examples will again be the direct product of two de Sitter spaces. Upon restricting to a fixed time slice as in [60], one then finds that δS EE propagates on the diagonal dS 2 .
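As an illustration of how the thermal state fits into this universal class, the sketch below evaluates the right-moving contribution for the map w = exp(2πz/β) and compares the sum of right- and left-moving pieces with the well-known thermal result (c/3) log[(β/(πδ)) sinh(πℓ/β)] for an interval of length ℓ. It assumes the form S R = (c/12) log[(f (v) − f (u))^2 /(δ^2 f ′(u) f ′(v))] and applies the map directly to real null coordinates; both choices, and the function names, are assumptions made for this example.

```python
import math


def right_moving_entropy(f, fprime, u, v, c, delta):
    """Assumed form (c/12) log[(f(v) - f(u))^2 / (delta^2 f'(u) f'(v))]."""
    ratio = (f(v) - f(u)) ** 2 / (delta**2 * fprime(u) * fprime(v))
    return c / 12.0 * math.log(ratio)


def thermal_interval_entropy(ell, beta, c, delta):
    """Right- plus left-moving pieces for w = exp(2 pi z / beta), interval length ell."""
    k = 2.0 * math.pi / beta
    f = lambda z: math.exp(k * z)
    fprime = lambda z: k * math.exp(k * z)
    # For an interval on a fixed time slice both null widths equal ell,
    # so the left-movers contribute the same amount as the right-movers.
    return 2.0 * right_moving_entropy(f, fprime, 0.0, ell, c, delta)


def known_thermal_entropy(ell, beta, c, delta):
    """(c/3) log[(beta / (pi delta)) sinh(pi ell / beta)]."""
    return c / 3.0 * math.log(beta / (math.pi * delta) * math.sinh(math.pi * ell / beta))


def vacuum_interval_entropy(ell, c, delta):
    """(c/3) log(ell / delta), the zero-temperature limit."""
    return c / 3.0 * math.log(ell / delta)


if __name__ == "__main__":
    print(thermal_interval_entropy(1.3, 2.0, 12.0, 1e-3))
    print(known_thermal_entropy(1.3, 2.0, 12.0, 1e-3))
```

The agreement is exact because (e^{kv} − e^{ku})^2 /(k^2 e^{k(u+v)}) equals (2/k)^2 sinh^2 (k(v − u)/2) with k = 2π/β.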
A natural interpretation emerging from the above discussion is that S R (f ) and S L (f̄ ) are proportional to conformal factors in the corresponding de Sitter factors of the kinematic space metric (2.18), i.e., This expression for the metric then directly connects the independent diffeomorphisms on each of the two-dimensional de Sitter spaces with the conformal transformations (4.1) in the CFT, as discussed above. In this interpretation, the short distance cut-off δ sets the curvature scale for each of the dS 2 geometries: where R dS 2 is the Ricci scalar for each of the de Sitter geometries. Hence this approach makes a definite choice of the curvature of the moduli space, i.e., L = δ. Evaluating the above constant curvature condition on the right-moving part of the ansatz (4.20) yields precisely the Liouville equation (4.11). While this approach is very peculiar to two dimensions, the interpretation of the wave equation on the moduli space of causal diamonds as a constant curvature condition might well be more general. Indeed, as we discuss in section 7, a field equation identical to the one obeyed by δS EE can be derived in arbitrary dimensions by observing that M
24 In this case, we are thinking of a conformal transformation of Euclidean space.

More interacting fields on d = 2 moduli space: higher spin case
In the previous section, we have seen that entanglement entropy obeys nontrivial and local field equations on the moduli space which has the dS 2 × dS 2 geometry. In fact, the 'fields' on the moduli space were right- and left-moving contributions to the entanglement entropy, e.g., as defined in Eq. (4.12). These results explicitly applied for a universal family of states, which were generated by a conformal transformation acting on the flat space vacuum or alternatively by exciting the vacuum state by the action of the stress tensor alone. Moreover, the field equations (4.14), as well as (4.16), are independent of the precise reference state. Rather, all dependence on the latter is encoded in a choice of coordinates, or implicitly in the form of the boundary conditions, on dS 2 × dS 2 . In section 3, we introduced the idea that in more general states where new operators (other than the stress tensor) acquire an expectation value, one would have to expand the discussion to consider new fields on the moduli space to account for these operators. In this section, we wish to provide an explicit example of this generalization in the context of d = 2 CFTs with higher spin symmetries. In particular, we will consider a theory with a conserved spin-three current. At the linearized level, we have seen that there are new nonlocal observables (3.35) associated with the right- and left-moving components of this current and that these satisfy linearized wave equations (3.38) on the moduli space. In the following, we will demonstrate that the latter extend to nonlinear equations where the fields corresponding to the spin-three current and to the entanglement entropy develop local interactions with each other on dS 2 × dS 2 . More specifically, we will consider a theory with only spin-two and spin-three fields, which can be described by an SL(3, R) × SL(3, R) Chern-Simons theory (for a review with an emphasis on black hole solutions, see e.g., [61]). The most general solution of the field equations is described by a flat gauge field subject to suitable boundary conditions, the latter encoding the expectation values of the stress tensor and spin-three currents in the dual CFT. In keeping with the general philosophy of this paper, we would like to probe such backgrounds both with ordinary entanglement entropy as well as with a spin-three generalization thereof. The existence and definition of such a generalization of entanglement entropy was proposed in [62] and some additional features were discussed in [63]. The spin-three entanglement entropy of [62] can be viewed as a generalization of the expressions for ordinary entanglement entropy in higher spin theories in terms of Wilson lines originally proposed in [64,65].
These two proposals were shown to be equivalent in [66] and were tested against CFT computations in [67,68], for more recent work see [69] and references therein. In this section we will consider theories holographically dual to classical Chern-Simons theory, i.e., we assume a large central charge (equivalently, a large Chern-Simons level k).
We will be interested in computing the entanglement entropy in nontrivial states in Lorentzian signature and in particular we will not be turning on any chemical potentials or any sources for the higher spin currents. In the presence of such sources there are different types of boundary conditions depending on whether one takes a Lagrangian or Hamiltonian point of view [70], but we will not have to worry about this issue. Thanks to this, the proposals of [64,65] and [62] can be phrased as follows. In Chern-Simons theory one has two gauge fields A and Ā, one for each copy of the gauge group. Entanglement entropy for the interval with endpoints P, Q is computed by constructing the open Wilson loop W(P, Q) from P to Q for A, the open Wilson loop W̄(Q, P ) from Q to P for Ā, and then evaluating S R (P, Q) = c R log Tr R ( W̄(Q, P ) W(P, Q) ) (5.1) with a suitable normalization and in a suitable representation R. Depending on the choice of representation R, different types of entanglement can be computed, as we will see below.
Standard entanglement entropy is for example obtained by taking R to be the fundamental representation for pure gravity constructed with SL(2, R) × SL(2, R), and the adjoint representation for the spin-three theory based on SL(3, R) × SL(3, R).

Evaluation of Wilson loops
The boundary condition on the Chern-Simons gauge fields A, Ā states that these should be gauge transforms of a suitable two-dimensional gauge field with a purely radial gauge transformation. It is more convenient to write everything purely in terms of 2d data and write the radial dependence explicitly. So from now on the open Wilson loops and gauge fields will be 2d and not 3d, and S R (P, Q) becomes Here Λ 0 is the diagonal element of an sl(2, R) subalgebra of the gauge group. This sl(2, R) subalgebra (given by a choice of an embedding of sl(2, R) in the gauge group) is what sets the boundary conditions for Chern-Simons theory and is also what determines the precise nature of the higher-spin symmetry of the dual CFT. Each inequivalent choice of sl(2, R) embedding describes a different higher spin theory. There is a preferred sl(2, R) embedding in sl(N, R), the so-called principal embedding, for which the fundamental representation of sl(N, R) is an irreducible representation of sl(2, R). These give rise to the standard W N -algebras with one generator of each spin 2, . . . , N, and for N = 3 this is the case that we will study. To regulate the divergent quantity S R (P, Q), we need to pick a fixed and large value of ρ. The two-dimensional gauge fields are of the following form, where, as discussed above, we assume no sources have been turned on: Here, Λ − , Λ + together with Λ 0 are the three generators of the sl(2, R) subalgebra, and U (x + ) is short-hand notation for Since A and Ā are flat, they are locally pure gauge 25 , A = g −1 dg and Ā = ḡ −1 dḡ, and We will now assume that when we decompose the representation R in eigenvectors of Λ 0 there are unique eigenvectors with a smallest and largest eigenvalue, and we will refer to these highest and lowest weight vectors as |µ + ⟩ and |µ − ⟩. This assumption will hold in the cases that we will consider, and if it holds the dominant contribution to S R (P, Q) comes from the matrix elements where we pick up the largest powers of e ρ , which are precisely the matrix elements between the highest and lowest weights. The dominant contribution is then where ρ is a fixed and large cutoff, we find that S R (P, Q) can be written as Hence we see that the result splits into separate right- and left-moving contributions. Our experience in the previous section suggests that these separate terms will form an interesting basis to develop a local field theory on the moduli space. Hence from now on, we will restrict to the right-moving contribution to S R (P, Q), which we denote by S R (P, Q) and which depends only on the right-moving coordinates P + and Q + and therefore naturally defines a function on one of the two-dimensional de Sitter spaces.
25 This need not be the case globally, because the gauge field can have a non-trivial monodromy around the spatial circle. For example, this is the case for the higher spin analogues of conical defect and black hole geometries. Therefore, our analysis is strictly speaking restricted to sufficiently small intervals on the boundary if the spatial geometry is a circle, but should be valid for arbitrarily large intervals in the planar case. It would be interesting to explore global aspects and possible consequences for the field equations on dS 2 × dS 2 in more detail.

Pure gravity example
It is instructive to see how this works in the pure gravity case. There the two-dimensional gauge field is of the form where T (x + ) is the right-moving component of the stress tensor. Now we can write A = g −1 ∂ + g with which indeed has the property that and where, not surprisingly, T is expressed as a Schwarzian derivative In the pure gravity case, the embedding of sl(2, R) in sl(2, R) is given by the identity map and in particular where the highest and lowest weight states above correspond to R being the fundamental representation. To compute S fun (P, Q) in this case we therefore only need the 12-matrix element of g −1 (Q)g(P ) and we get which is indeed in precise agreement with the results (4.10) obtained in the previous section, when we identify c fun = c/6, as well as ε fun = δ.
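Two standard properties of the Schwarzian derivative underlie the pure gravity discussion: it vanishes on Möbius maps and it is unchanged when a map is post-composed with a Möbius transformation, which is the SL(2, R) redundancy in the choice of f . The following minimal Python sketch checks both, using the definition {f ; x} = f ′′′/f ′ − (3/2)(f ′′/f ′)^2 and exact derivatives propagated by the chain rule. The helper names and the sample map exp(a x) are illustrative choices, and the normalization relating the Schwarzian to T is deliberately left out since it is convention dependent.

```python
import math


def schwarzian(d1, d2, d3):
    """{f; x} = f'''/f' - (3/2) (f''/f')^2 from the first three derivatives."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2


def exp_jet(a, x):
    """Value and first three derivatives of f(x) = exp(a x)."""
    f0 = math.exp(a * x)
    return f0, a * f0, a**2 * f0, a**3 * f0


def moebius_compose(jet, a, b, c, d):
    """Jet of g(f(x)) for g(y) = (a y + b) / (c y + d), via the chain rule."""
    f0, f1, f2, f3 = jet
    det = a * d - b * c
    den = c * f0 + d
    g1 = det / den**2                 # g'(y)
    g2 = -2.0 * c * det / den**3      # g''(y)
    g3 = 6.0 * c**2 * det / den**4    # g'''(y)
    return (
        (a * f0 + b) / den,
        g1 * f1,
        g2 * f1**2 + g1 * f2,
        g3 * f1**3 + 3.0 * g2 * f1 * f2 + g1 * f3,
    )


if __name__ == "__main__":
    jet = exp_jet(0.7, 0.4)
    print(schwarzian(*jet[1:]))                              # equals -a^2 / 2
    print(schwarzian(*moebius_compose(jet, 2.0, 1.0, 1.0, 3.0)[1:]))
```

For f (x) = exp(a x) the Schwarzian is the constant −a^2 /2, so the exponential map w = exp(2πz/β) gives a uniform stress tensor, consistent with its role in generating the thermal state.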

Spin-three entanglement entropy
We would now like to consider the spin-three case with the principal embedding, in which case the right-moving two-dimensional gauge field takes the form where T (x + ) and W (x + ) are the right-moving components of the stress tensor and the spin-three current, respectively. Once again, we need to find a g which obeys g −1 dg = A. Such a g can be parametrized in terms of two functions γ 1 and γ 2 , but the equations are quite a bit more cumbersome compared to the pure gravity case. To write g, it is convenient to first define and to parametrize g as One can explicitly show that with this choice of g where T 1 , T 2 and W are lengthy expressions in terms of γ 1 , γ 2 and θ, which can be viewed as generalizations of the Schwarzian derivative to the spin-three case. For a suitable choice of θ in terms of γ 1 and γ 2 , which one can algebraically determine, one gets T 1 = T 2 . We will however not need the explicit form of θ in what follows.
As an aside, we notice that there is an interesting action of SL(3, R) on γ 1 and γ 2 , which follows from g → g, and which leaves T and W invariant. It takes the form which is a direct generalization of the standard SL(2, R) action in the pure gravity case and which presumably plays some sort of role in 'W-geometry'.
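The pure gravity analogue of this invariance can be illustrated concretely. The sketch below assumes that T is proportional to the Schwarzian derivative {f; x} = f‴/f′ − (3/2)(f″/f′)² of a single function f, and that the standard SL(2, R) action is the fractional linear map f → (a f + b)/(c f + d); both are standard facts but are not spelled out in the text above. The function is represented by its Taylor coefficients at a point (the helper names and the sample coefficients are hypothetical), and exact rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction

ORDER = 4  # keep Taylor coefficients c0..c3; enough for a third derivative


def series_mul(a, b):
    """Product of two power series, truncated at ORDER terms."""
    out = [Fraction(0)] * ORDER
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < ORDER:
                out[i + j] += ai * bj
    return out


def series_inv(a):
    """Reciprocal of a truncated power series with a[0] != 0."""
    inv = [Fraction(1) / a[0]] + [Fraction(0)] * (ORDER - 1)
    for n in range(1, ORDER):
        s = sum(a[k] * inv[n - k] for k in range(1, n + 1))
        inv[n] = -s / a[0]
    return inv


def mobius(coeffs, a, b, c, d):
    """Taylor coefficients of (a f + b) / (c f + d), given those of f."""
    num = [a * x for x in coeffs]
    num[0] += b
    den = [c * x for x in coeffs]
    den[0] += d
    return series_mul(num, series_inv(den))


def schwarzian(coeffs):
    """{f; x} = f'''/f' - (3/2) (f''/f')^2 from Taylor coefficients at a point."""
    f1, f2, f3 = coeffs[1], 2 * coeffs[2], 6 * coeffs[3]
    return f3 / f1 - Fraction(3, 2) * (f2 / f1) ** 2


if __name__ == "__main__":
    f = [Fraction(1), Fraction(2), Fraction(-1, 3), Fraction(5, 7)]
    print(schwarzian(f), schwarzian(mobius(f, 2, 1, 1, 3)))
```

The two printed values coincide for any a, b, c, d with ad − bc ≠ 0 (and c f + d nonvanishing at the expansion point), which is the SL(2, R) statement that the SL(3, R) action on γ 1 and γ 2 is said to generalize.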
With the explicit form of g at hand we can now evaluate S R (P, Q) for various choices of representations R. From [64,65] we know that ordinary entanglement entropy is obtained by taking R to be the adjoint representation. It is, however, a priori less clear which representation one should take in order to get the spin-three generalization of entanglement entropy. We claim that the right quantity (up to an overall normalization) is obtained by taking a linear combination of the fundamental and adjoint representations where we put the normalization constants c fun = c adj . To see that Eq. (5.19) is the right quantity one can consider its expansion around the vacuum to first order in T (x + ) and W (x + ) , and verify that it produces an observable of precisely the form in Eq. (3.35) with h = 3 -see also below. Alternatively, one can translate the original proposal of [62] (see also appendix B of [63]) into the present language and also arrive at Eq. (5.19). The relation of these papers to Eq. (5.19) can be summarized as follows: the highest weight of the fundamental representation minus one half the highest weight of the adjoint representation is proportional to the sl(3, R) generator which is precisely the generator used in the construction of [62]. We notice that if we decompose the adjoint representation of SL(3, R) with respect to the SL(2, R) subgroup, we obtain a three- and a five-dimensional representation which contain T (x + ) and W (x + ) as lowest weight respectively. The Cartan generator of the five-dimensional representation is precisely U 0 . This suggests that to obtain a higher spin entropy in more general cases we should take a linear combination of S R with various representations R in such a way that the corresponding highest weight is proportional to a Cartan generator which is part of the same SL(2, R) representation as a particular higher-spin generator.
We are thus led to consider the following two quantities which we refer to as the spin-two and spin-three 'entanglement entropies', respectively. That is, S (2) EE is proportional to the ordinary entanglement entropy while S (3) EE defines a new nonlinear observable related to the spin-three current, i.e., it vanishes in states where W (x + ) = 0. We also note that the short distance regulators cancel out in our definition of S (3) EE since ε adj = 2 ε fun -see the definition of ε R above Eq. (5.6). Hence our spin-three entanglement entropy is a completely UV finite observable.
Note that both expressions (5.21) carry an overall factor of the normalization constant c fun since as above, we set c fun = c adj . However, we have not fixed the precise normalization of S (2) EE and S (3) EE since we are mostly interested in the question whether these obey local field equations on dS 2 or not. It is in principle straightforward, by examining the first laws for S (2) EE and S (3) EE and using the known relation between the generators in Eq. (5.14) and the spin-two and spin-three generators with their canonical normalization to determine the precise normalization factors.

de Sitter field equations for higher spin entanglement entropy
We can now explicitly compute S (2) EE and S (3) EE in the most general spin-two and spin-three background. To simplify the final answers, we will denote u = P + and v = Q + , and define and with these we find that Interestingly, these quantities obey the following local field equations, EE with S R . In passing, we observe that these equations (5.24) are identical to the so-called Toda equations for SL(3, R) (for a summary of some aspects of Toda theory and further references see, e.g., [71]) and reserve further comments for later.
To see the de Sitter geometry of the kinematic space emerge in the field equations, we follow our previous approach in Eq. (4.12) and consider the following difference with L 2 = 1. It is therefore indeed true that spin-two and spin-three entanglement entropy obey local interacting field equations on dS 2 . Note that at the linearized level, these equations yield the expected masses -see Eq. (3.38) -for the nonlocal observables (3.35) associated with conserved currents with h = 2 and 3. Further notice that these equations can be formulated in terms of extremizing the following action of an interacting field theory in dS 2 This action can again be related to the SL(3, R) Toda equations. The Toda equations (5.24) are widely believed to have the same relation to W 3 -gravity as Liouville theory has to ordinary two-dimensional gravity (see, e.g., [72]). It therefore appears that we have found the field equations of some higher spin theory of gravity on de Sitter space. Since SL(3, R) Toda theory is intimately related to the W 3 -algebra, and in fact has the W 3 -algebra as its symmetry, one suspects that there should be a more direct argument to explain the appearance of Toda equations here, and it would be interesting to explore this further.
In defining ∆S (2) EE and ∆S (3) EE above, we only considered subtracting the vacuum entanglement entropies. However, we could consider subtracting the result of other reference states as in section 4.2. In particular, if we choose states with γ 2 = (γ 1 ) 2 , these are all states where the spin-three current vanishes and Eq. (5.26) becomes As above, this matches the right-moving contribution to the entanglement entropy in Eq. (4.10) with γ 1 (z) playing the role of the 'holomorphic' function f (z). Hence we could carry out the same analysis as in section 4.2, choosing any of these states as the reference state, e.g., as in Eq. (4.12). The results would be essentially the same, i.e., the conformal transformation in the CFT would produce a coordinate transformation on the moduli space with u 0 = γ 1,0 (u) and v 0 = γ 1,0 (v). 27 Of course, W (x + ) = 0 for all of these reference states. An interesting new direction to explore, however, would be to consider using a general reference state from the class defined by Eqs. (5.22) and (5.23). That is, a reference state where γ 1 (z) and γ 2 (z) are completely independent functions and so the spin-three current has a nonvanishing expectation value. This may well reveal some 'higher spin' structure in the geometry of the moduli space. Let us mention once more that the results derived in the spin-three case are only valid for large c and large Chern-Simons level k, and one expects these results to receive 1/c corrections in the full quantum theory.

First law from Wilson loops
To conclude this discussion of higher-spin CFTs, we briefly describe the form of the 'first law' in this formalism for a general SL(N, R)×SL(N, R) theory. Given the form of S R (P, Q) in Eq. (5.7), we can vary it by varying g, and with a bit of algebra, the general variation becomes However, since we have g −1 ∂g = Λ + + U from Eq. (5.3), the variation δ(g −1 (z)∂g(z)) is just δU . In other words, This indeed has the form of some sort of kernel integrated against the local perturbation δU . The usual first law is obtained by perturbing the global AdS 3 background, which we get by taking as background g −1 ∂g = Λ + , so g = exp(zΛ + ). Then From this one can see that the various components of δU will be multiplied by (z − Q) a (P − z) b /(P − Q) c for suitable powers a, b, c and with further work one can show that

Dynamics and interactions: future challenges
As we have seen in the previous section, there exist examples where nonlinear and local interactions in the space of causal diamonds occur naturally. This nonlinear dynamics was found to describe the scale dependence of entanglement entropy (and its spin-three generalization) in states with general spin-two and spin-three excitations. A very interesting question is whether this extends to higher dimensions (see also the discussion section), and whether other degrees of freedom can be included, i.e., other fields on the moduli space associated with our new observables (3.1). One might be tempted to look for local field equations just as in the spin-two and spin-three example. However, even in that case there was an issue with the apparent locality: to write local equations we had to decompose entanglement entropy into left- and right-moving contributions, each of which obeyed a local field equation on a single dS 2 . It is not possible to capture this in terms of a single local field equation obeyed by the sum of the left- and right-moving contributions.
When moving to scalar primaries as in Eq. (3.1), the situation becomes more complicated, since for these scalars in d = 2, no simple separation in terms of left- and right-moving modes exists, while scalars do obey the constraint (3.40). As is familiar from several different examples such as exceptional and doubled field theory (for reviews of the latter see, e.g., [73][74][75][76]), constructing interacting theories for constrained fields can be quite difficult. For example, it is in general not true that the product of two fields that obey the constraints still continues to satisfy them. Sometimes it is possible to modify the constraints in perturbation theory, but then there is the potential issue that the theory becomes over-constrained. Keeping the constraints unaltered, one may have to introduce explicit projection operators acting on products of fields, projecting the product back into the subspace of fields which obey the constraints, thereby introducing nonlocalities into the theory. While we have not been able to find a compelling systematic framework to incorporate interactions, we believe this is an important open problem with possibly many new applications, and as a prelude we describe below a preliminary attempt at including interactions, which clearly demonstrates the sorts of issues one runs into.

Constraints
Even at the linearized level, there is already a challenge since we have identified a single wave equation in the moduli space M SL(2, R) × SL(2, R). Each of these factors has an independent quadratic Casimir, which in turn produce two independent wave equations on the moduli space, as shown in Eq. (3.38). The sum of these equations (3.39) matches the wave equation (3.8) which we constructed for general dimensions, while their difference (3.40) can be regarded as a supplemental constraint.
Unfortunately, the conformal group is irreducible in higher dimensions and so the same structure does not appear in general. However, it was proposed in [38] that one can identify constraints by examining SO(2, 2) subgroups of the full SO(2, d) group. The reasoning will become apparent in section 6.2, where we consider the left- and right-moving Casimirs acting on the holographic version of our observables given in Eq. (3.24). In either case, the action of the Casimir acting on Q holo (O) will produce the AdS d'Alembertian acting on the bulk scalar field, using Eqs. (6.17) and (6.20), and hence their difference vanishes. This calculation is then easily lifted to higher dimensions by considering AdS 3 submanifolds within the full AdS d+1 bulk geometry. The form and action remain unchanged for the quadratic Casimirs of the left- and right-moving SL(2, R) factors in the SO(2, 2) group acting on the AdS 3 slice and hence their difference again vanishes when acting on Eq. (3.24). Hence in this holographic framework, one is able to identify additional operators which annihilate the nonlocal observables (3.24). We observe that implicitly a key ingredient here was the intertwining property (3.26) which carries the action of the conformal generators on the boundary observable to the scalar field appearing inside the bulk integral. It then turns out that the difference of the 'Casimirs' is trivial when acting on the scalar.
The latter observation allows us to extend this construction of constraints to the nonlocal observables (3.1) for general CFTs. Here again the conformal generators satisfy an intertwining property (3.4). Hence the idea is to find (combinations of) generators which are trivial in the representation acting on a scalar primary -see Eq. (B.2). Motivated by the holographic discussion, we can identify a large number of such trivial operators, which can be elegantly written in pure CFT language as The operators identified with SO(2, 2) subgroups then emerge from the four-planes spanned by X − , X µ , X ν , X d , where µ, ν = 0, · · · , d − 1 correspond to the spacetime directions of the CFT. The corresponding operators can be written in terms of the conformal generators as: Again acting on any scalar primary, we have Σ µν |O(x)⟩ = 0. That is, substituting for the generators in Eq. (6.2) with the expressions in Eq. (B.2), one finds that the above combination of generators simply vanishes, i.e., Σ µν = 0. As we will see below, when we substitute the representation of the generators acting on functions on the moduli space, these operators are nontrivial and hence Σ µν Q(O) = 0 becomes a nontrivial constraint on the nonlocal observables.
With k = 1, we can consider the four-planes spanned by X − ± X d , X µ , X ν , X ρ in the embedding space (in the notation of Eq. (2.4)), for which we find constraints of the form One can readily verify that these operators vanish identically on scalar primaries, using the explicit representation given in (B.2). The final case (i.e., k = 0) comes from considering a four-plane spanned by X µ , X ν , X ρ , X σ , for which we obtain Again, given the expressions (6.2-6.4), the identities Γ abcd |O(x) = 0 may not look terribly familiar, however, they follow from conformal invariance and one can readily confirm that they hold for any scalar primary in any CFT by substituting for the conformal generators using Eq. (B.2).
Having identified the family (6.1) of trivial operators acting on scalar primaries, we again make use of the intertwining property (3.4) satisfied by the conformal generators to write It is also interesting to consider this constraint in the center of mass coordinates of Eq. (2.20), with which the same operator takes the form As a consistency check, we note that in two dimensions there is only one non-trivial constraint and that it reduces to the spinless constraint equation, i.e., Eq. (3.40) with h =h: where we are using the null coordinates defined in Eq. (2.17).
As a further confirmation of these conclusions, consider inserting a point-like source for the operator O at a point ξ µ in R 1,d−1 , which is timelike separated from the causal diamond ♦ = (x µ , y µ ), i.e., timelike separated from both x µ and y µ . This source generates an expectation value ⟨O⟩ inside the causal diamond and it follows, for example, from the shadow field representation (c.f., Eq. (3.5)) that in this case the x µ and y µ dependencies are captured by -see also Eq. (C.18) for the two-dimensional version of this formula. Now one can verify that Σ µν (x, y) indeed yields zero when acting on this expression, for all values of ∆ O and all choices of ξ µ .
As commented above, Eq. (6.6) produces (d + 2)(d 3 − d)/24 additional constraints in higher dimensional CFTs. However, not all of these constraint equations are independent. In particular, there are relations which show that the Σ µν constraints are sufficient to ensure all constraints of the form (6.6) will be satisfied. One can show this by using simple but tedious algebra to express all the combinations in Eq. (6.1) in terms of Σ µν as follows: (6.12) where these relations are to be understood to hold with the Γ abcd (x, y) expressed as in Eq. (6.7), i.e., the operators are represented as acting on functions f (x, y) on the moduli space. Therefore the Σ µν alone form a sufficient set of constraints. Hence we expect that these together with the field equation and initial data on a codimension-d surface determine the value of the physical solutions (corresponding to the nonlocal observables) everywhere on the space of causal diamonds.
However, let us observe that the number of constraints is still larger than what one might have naïvely expected: Σ µν has d(d − 1)/2 components, 28 while we would a priori expect d − 1 independent constraint equations. That is, we might expect that the total number of equations, i.e., the wave equation and the constraints combined, would equal d, the number of timelike directions. Hence, we conjecture that further analysis will show that the sufficient set of constraints can be further reduced to Σ 0i , which would give the desired number of equations.
It is relatively straightforward to establish that there are no algebraic relations amongst the Σ µν , i.e., relations of a form similar to those given in Eq. (6.12). 29 However, one can still consider differential relations between the constraints. For example, one can show that for all sets of six indices. Note here we are saying that these combinations of operators vanish when acting on any function on the moduli space. 30 Hence Eq. (6.13) yields (d+2 choose 6) relations amongst the Γ abcd , and implicitly then, amongst the Σ µν through Eq. (6.12). For example, with d = 4, Eq. (6.13) provides 1 additional relation, whereas our discussion above suggested we should be able to find 3 extra relations. We have preliminary results on a set of further relations between the Σ µν operators, which may allow us to reduce them to a set of (d − 1) independent constraints. However, the full structure is intricate and we hope to report on these issues elsewhere -see also further discussion in section 7.
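The counting quoted in this subsection can be tabulated with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the dictionary keys are hypothetical labels, and reading (d + 2)(d³ − d)/24 as the binomial coefficient "d + 2 choose 4" (one operator per choice of four embedding-space directions) is an interpretation consistent with the four-plane construction rather than a statement taken from the text.

```python
from math import comb


def constraint_counts(d):
    """Tabulate the constraint counts quoted in the text for a d-dimensional CFT."""
    sigma = d * (d - 1) // 2
    return {
        # number of constraints of the form (6.6), as quoted in the text
        "gamma_constraints": (d + 2) * (d**3 - d) // 24,
        # the same number written as a binomial coefficient (interpretation)
        "gamma_as_binomial": comb(d + 2, 4),
        # number of components of the antisymmetric Sigma_{mu nu}
        "sigma_components": sigma,
        # number of independent constraints expected a priori
        "expected_independent": d - 1,
        # relations still needed to reduce Sigma_{mu nu} to d - 1 constraints
        "extra_relations_needed": sigma - (d - 1),
        # relations supplied by the six-index identity, Eq. (6.13)
        "six_index_relations": comb(d + 2, 6),
    }


if __name__ == "__main__":
    for d in (2, 3, 4, 5, 6):
        print(d, constraint_counts(d))
```

For d = 4 this reproduces the numbers in the text (one relation from the six-index identity against three needed), and for d = 2 it gives the single constraint noted earlier.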
Notice that there is an interesting similarity between the constraints that appear here and those that feature in doubled and exceptional field theories, e.g., [73][74][75][76], in that both are expressed in terms of a set of second order differential operators. It would be interesting to explore whether the techniques developed in the context of these theories could be of relevance for understanding dynamics on the moduli space of causal diamonds as well. We expect that there will be further relations, however, we defer a more detailed analysis of the constraints, which is presumably essential in order to properly formulate interactions, to future work.

Holographic dynamics in AdS 3
To illustrate some of the issues one encounters while attempting to generalize Q(O; x, y) to nonlinear order, we consider a possible nonlinear generalization of the decoupled dS 2 ×dS 2 wave equations, Eqs. (3.38), for operators which are not conserved currents. We will seek guidance in holography, that is, we want to define Q holo (O; x, y) holographically by insisting that it obeys a local wave equation on the space of causal diamonds even if the corresponding bulk probe scalar φ interacts nonlinearly in AdS 3 .

28 We note that [38] also found d(d − 1)/2 constraint equations, however, their constraints have a slightly different form from that given in Eq. (6.8), e.g., their constraints would be independent of the c µ coordinates which appear in Eq. (6.9).

29 One approach is to consider these operators on a specific submanifold of the moduli space where their explicit form simplifies, e.g., the submanifold µ = R δ µ 0 .
We assume that the equation of motion for the scalar φ reads (6.14) Eventually we will specialize to and we will work perturbatively in the bulk coupling constant ζ. We use the standard holographic result where we now work with the explicit representation of C 2 on AdS 3 of the form with AdS 3 isometry generators where ξ = x − t and ξ̄ = x + t. This immediately yields Note that we used 'right-moving' generators L n above. One can similarly define 'left-moving' generators L̄ n by exchanging ξ and ξ̄ in their definition, which would lead to the same Laplacian on AdS 3 . Using these results, Eq. (6.17) reads Hence it is clear that the dynamics of the AdS 3 scalar field φ is intimately linked to the dynamics of Q holo (O; x, y). If φ satisfies a linear wave equation, so will Q holo (O; x, y).
However, if we assume that φ interacts nonlinearly as in (6.15), we find the following identity on the moduli space M (d) ♦ : where we used ∇ 2 ♦ = 1 L 2 C 2 on the space of causal diamonds. It is clear that the last term in this expression, ∫ γ dκ φ 2 , is not a local functional of Q holo (O; x, y). One may for example notice that while local functionals of Q holo (O; x, y) will no longer obey the constraints, the additional quadratic term in (6.22) still does because it is the integral of a scalar quantity over a minimal surface. As a result, the equation of motion in the space of causal diamonds becomes nonlocal. In the next two subsections we will examine possible remedies. First, we will study whether there are any quadratic interaction terms that can be consistently added to Eqs. (3.38). Subsequently, we will look for natural quadratic modifications of the holographic definition of Q holo (O; x, y) given by Eq. (3.24) which will induce simple nonlinear dynamics in dS 2 ×dS 2 .

Allowed quadratic local interaction terms on the space of causal diamonds
The simplest possible solution to the nonlocality encountered in Eq. (6.22) would be a nonlinear modification of the dS wave equations. We will now give an argument that there is no straightforward and consistent nonlinear extension of the field equations at the quadratic 2-derivative level.
To quadratic order in Q(O; x, y), we can try to supplement Eqs. (3.38) by the following general set of local (i.e., at most 2-derivative) interaction terms To avoid such trivial solutions of (6.23) we can for example choose but for simplicity we keep our notation for α 1 and α 2 unaffected. To examine whether (6.23) is a consistent set of equations for Q(O) (1) , we notice that a necessary condition is the "integrability" condition Expanding the commutator and using the field equations (6.23) converts this into To test it, we consider several sample forms 31 of ⟨O⟩ and evaluate the integrability condition using Q(O) (0) obtained from Eq. (3.34). Perhaps unsurprisingly, we need to make all α i and ᾱ j vanish in order to satisfy the integrability condition (6.3) for generic ⟨O⟩. As a result, the only consistent 2-derivative set of local equations for Q(O) up to quadratic order in the amplitude are the free wave equations (3.38) or the trivial modifications obtained from them using the field redefinition (6.24).

Quadratic modifications of the holographic definition of Q(O)
Let us now sketch an attempt to extend our definition of Q holo (O; x, y) beyond the linearized approximation, and in particular let us specialize to the case of quadratic interactions as in Eq. (6.15) and work perturbatively in the coupling ζ. To this end, we start with the most general ansatz for a charge Q holo (O; x, y) which is quadratic in the bulk field φ, contains two derivatives acting on φ and a double integral over the bulk geodesic. Since points in the space of causal diamonds are represented by minimal surfaces, having a double integral over the same bulk surface has a chance of corresponding to local interactions in the space of causal diamonds. We will show that these requirements are insufficient.
Before writing the ansatz, we need a convenient parametrization of the geodesic γ. On a constant time slice (t = 0) this is a semi-circle of the form (6.28) The AdS 3 isometry generators are then given by L n and L̄ n , c.f., Eq. (6.19). We are going to substitute x = −r tanh κ , z = r / cosh κ , (6.29) so that r = R describes a geodesic centered at x = x 0 , and the latter is affinely parametrized by κ ∈ (−∞, ∞). The AdS 3 metric in this coordinate system is

31 Note that we do not insist on ⟨O⟩ being the expectation value in an actual CFT state. However, quite remarkably, for ⟨O⟩ coming from insertion(s) of O at a point outside but causally affecting the causal diamond, one can fulfill the integrability condition by fixing only one of the interaction terms leaving the rest arbitrary. Precisely in these cases the expectation value ⟨O⟩, but also Q holo (O; x, y), are holomorphically factorized, which was a crucial feature in the derivation of the Liouville equation for entanglement entropy.
One might speculate that holomorphic factorization will be an important ingredient in understanding the role of interactions in two dimensions.
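The geodesic parametrization of Eq. (6.29) can be checked numerically. The sketch below reads the substitution as x = x 0 − r tanh κ, z = r / cosh κ, which is the form required for the curve to lie on the semi-circle (x − x 0 )² + z² = r²; this reading, the Poincaré form ds² = (dx² + dz²)/z² of the t = 0 slice with unit AdS radius, and the function names are assumptions made for the illustration. It confirms that the curve is a semi-circle and that κ is an affine (unit-speed) parameter.

```python
import math


def geodesic_point(r, kappa, x0=0.0):
    """Point (x, z) on the t = 0 geodesic of radius r centered at x0 (assumed form)."""
    x = x0 - r * math.tanh(kappa)
    z = r / math.cosh(kappa)
    return x, z


def line_element_rate(r, kappa, h=1e-5):
    """ds/dkappa in the metric ds^2 = (dx^2 + dz^2) / z^2, via central differences."""
    x_plus, z_plus = geodesic_point(r, kappa + h)
    x_minus, z_minus = geodesic_point(r, kappa - h)
    _, z = geodesic_point(r, kappa)
    dx = (x_plus - x_minus) / (2 * h)
    dz = (z_plus - z_minus) / (2 * h)
    return math.hypot(dx, dz) / z


if __name__ == "__main__":
    radius = 2.5
    for kappa in (-2.0, 0.0, 1.3):
        x, z = geodesic_point(radius, kappa)
        print(kappa, x * x + z * z - radius**2, line_element_rate(radius, kappa))
```

A rate of one for every κ is what "affinely parametrized" means here; analytically, dx² + dz² = r² dκ² / cosh² κ = z² dκ².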
In these variables, the symmetry generators (6.19) read (6.32) (6.33) and the barred generators are obtained by sending t → −t. The AdS 3 wave equation now reads The last observation needed before we can write our ansatz for the nonlinear Q holo (O; x, y) is that the generators simplify when evaluated on the geodesic t = 0 and r = R: From this it is clear that, while the combination L − | γ ≡ (R L −1 − R −1 L 1 )| γ = −∂ κ parametrizes derivatives along the geodesic, there are two independent derivative operators, which act as derivatives orthonormal to the geodesic: on the one hand, we have simply L 0 | γ , on the other hand We can now write the general two-derivative ansatz for a quadratic charge Q holo (O; x, y) as the linearized solution known from (3.24) plus the following double integral: where K i ≡ K i (γ(κ), γ(κ′)) is a bilocal kernel along the two integrals over the geodesic, and we also abbreviate φ ≡ φ(γ(κ)) and φ′ ≡ φ(γ(κ′)). The idea behind this ansatz is that it makes manifest some of the desired symmetry properties. At the same time the ansatz is completely general (within our assumptions) for the following reason: we have distributed the two orthonormal transverse derivatives over the φ's in all possible ways. Further, we have not used any L − generators as 'outermost' derivatives because they would reduce to pure κ-derivatives along the geodesic, as noted above -but ∂ κ can always be integrated by parts and absorbed into the definition of the kernel K i . Finally, we used the equations of motion (see below) to remove some other combinations (such as φ L 0 L 0 φ′) that one could have written in (6.36). We note that in a time-independent setup, one can show that the kernels K i≥5 provide nothing new and can be absorbed into K i≤4 .
Given the ansatz (6.36), the goal is to determine the kernels K i such that the quantity thus defined satisfies a nonlinear wave equation of the form where we expect as before m 2 O L 2 = −m 2 AdS R 2 AdS and an analogous identity relating ζ O to ζ. To evaluate the left hand side of (6.37) explicitly, we write 32 The round brackets make it clear that the full Casimir written above factorizes into a holomorphic and an anti-holomorphic part. If we write L n and L̄ n in terms of AdS 3 derivative operators, then the two parts act as the same operator (i.e., each of them is proportional to the AdS 3 Klein-Gordon operator). For our present purposes we can therefore replace all L̄ n in (6.38) by L n . The operator (6.38) is then written in a form that makes it easy to act on the ansatz (6.36) and manipulate the resulting expression purely by using group theoretic commutators between the L n , and the equations of motion of φ which can now be stated as Commuting the L n through the ansatz and demanding a result of the form of the right hand side of Eq. (6.37) yields a set of differential equations for the kernels K i . We find that these differential equations have no non-trivial solution. An ansatz of the form (6.36) is therefore not consistent with the nonlinear dynamics described by (6.37). It will be a very interesting future problem to investigate this issue more closely. What nonlinear form of Q holo (O; x, y) does satisfy nonlinear dynamical equations on the space of causal diamonds? One can start by including higher derivative terms in the ansatz (6.36). One quickly finds that only an infinite number of derivatives leads to a consistent set of differential equations for the kernels. The resulting solution is hence highly nonlocal. We believe that all these facts might be a hint that a nonlocal completion of the generalized first law might suffer from similar nonlocal behavior as does the general modular Hamiltonian in the familiar case of entanglement entropy.
We hope that the space of causal diamonds might provide a useful new perspective for reorganizing (or perhaps resumming) such objects in an illuminating way.
Another approach that would be interesting to explore in this context involves integrals not just over bulk minimal surfaces, but over codimension one spatial slices connecting the minimal surface B̃ to the boundary interval B (c.f., Figure 8). This approach has recently been taken in [77] to compute perturbations of entanglement entropy at second order in perturbation theory around the vacuum state. It would be interesting to use the second order results of [77] as a starting point to learn about the general structure of higher order interactions also for scalar primaries: if there is interesting dynamics on the space of causal diamonds then the second order expansion of entanglement entropy should tell us about the three-point function between δS EE and two other operators Q(O; x, y), and higher order terms about higher-point functions involving at least one δS EE . If δS EE somehow couples universally to the other degrees of freedom Q(O; x, y), just like gravity couples universally to all fields in AdS, this should allow us to construct essentially the full interacting theory on the space of causal diamonds, and, by working backwards, also tell us what the right nonlinear extension of Q(O; x, y) should be. It is tempting to speculate that the integrals over spatial slices which appear in the second order expansion of entanglement entropy need to be upgraded to integrals over the entire bulk causal wedge to describe the nonlinear extension of Q(O; x, y). We leave the investigation of these interesting possibilities for future work.

32 See appendix B for details.

Discussion
With the goal of extending the 'holographic' structure presented in [35] to a dynamical framework, in this paper, we extended our discussion to consider all spherical regions throughout the d-dimensional spacetime of the CFT, rather than focusing on those in a fixed time slice. In this context, it is also useful to think in terms of the causal diamonds associated with each of the spheres. Then one readily shows that the moduli space of all causal diamonds is described by the coset geometry M ♦ . In sections 4 and 5, we showed that in two-dimensional CFTs, these linear wave equations could be extended to nonlinear equations with local interactions, at least for particular observables evaluated in a certain universal class of states. Hence these CFT observables can be described in terms of local dynamics on the moduli space of causal diamonds. Another nice feature of our new observables is that for holographic CFTs, they have a simple bulk description (3.24) involving an integral of the dual field over the extremal bulk surface reaching the asymptotic AdS boundary on the sphere in the boundary theory. In many earlier works, e.g., [16-19, 36, 78], these extremal Ryu-Takayanagi surfaces were found to serve as useful probes of the bulk geometry. Here, we are beginning to see that they also provide interesting probes of the configuration of the matter fields in the bulk. While we have presented a number of compelling results in this paper, the program of describing general CFTs in terms of nonlocal observables on the moduli space of causal diamonds, and also formulating holography in this framework for holographic CFTs, still faces a number of technical challenges.
Two features of the moduli space, which seem rather surprising at first, are that this new space is 2d-dimensional and has signature (d, d). Of course, recognizing that the space of causal diamonds is equivalent to the space of timelike separated pairs of points makes clear that the dimension of the moduli space must be 2d, i.e., twice the number of coordinates needed to specify a single point. However, this represents quite a departure from the framework studied in [35], which had a character more akin to standard holography. In particular, for spheres on a fixed time slice, there was a single 'holographic' direction associated with the size of the spheres.
Too Many Times: On the other hand, coming to grips with the (d, d) signature of the moduli space presents a greater challenge. In particular, as noted in section 6.1, the wave equations (3.8) and (3.12) that we have identified are quite unconventional since they involve d timelike directions, i.e., the $\ell^\mu$ directions in Eq. (2.21). A related comment is that a natural set of initial conditions would come from the value of the observables on infinitesimal causal diamonds, i.e., from the submanifold where $\ell^\mu \to 0$. From the discussion of section 2.2, i.e., Eq. (2.24), this submanifold lies at timelike infinity of the moduli space; however, by definition, it is a codimension-d surface. Hence it seems clear that the wave equation by itself is insufficient to produce a full solution. Rather, it must be supplemented by additional constraint equations, as discussed in part in section 6.1.
The case of d = 2 is special and in fact two independent (conventional) wave equations emerged very naturally, as shown in Eq. (3.38). The sum of these equations (3.39) matches the wave equation identified for general dimensions, and hence their difference (3.40) can be regarded as a supplemental constraint. These two equations (3.38) appeared because the moduli space factorized into the product of two de Sitter geometries for d = 2 CFTs, as discussed below Eq. (2.18). However, an alternative perspective is that, as shown in section 3.1, the wave equation results from acting on the new observables with the quadratic Casimir of the conformal group. In this regard, d = 2 is special because the conformal group factorizes as SO(2, 2) ≃ SL(2, R) × SL(2, R), and hence each factor produces an independent quadratic Casimir. The sum of the two Casimirs yields that for the full group and hence generates the expected wave equation, while their difference yields the constraint equation (3.40).
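The factorization invoked above can be illustrated at the level of the Lie algebra. The following is a minimal Python sketch, not taken from the paper: it assumes the vector (defining) representation $(J_{ab})^c{}_d = i(\delta^c_a \eta_{bd} - \delta^c_b \eta_{ad})$ with $\eta = \mathrm{diag}(-1,-1,1,1)$, and the particular self-dual and anti-self-dual combinations `A` and `B` are our own illustrative choice of basis. It checks only that the two sets of three generators commute with each other, which is the algebraic content of the splitting into two independent factors.

```python
# Illustrative check (assumed conventions, not the paper's): so(2,2) splits into
# two mutually commuting three-dimensional subalgebras.
ETA = [-1, -1, 1, 1]  # embedding-space metric diag(-1, -1, 1, 1) for d = 2
N = 4


def J(a, b):
    """Vector-representation generator (J_ab)^c_d = i(delta^c_a eta_bd - delta^c_b eta_ad)."""
    m = [[0j] * N for _ in range(N)]
    m[a][b] += 1j * ETA[b]
    m[b][a] -= 1j * ETA[a]
    return m


def combine(m, n, sign):
    """Return m + sign * n, entry by entry."""
    return [[m[i][j] + sign * n[i][j] for j in range(N)] for i in range(N)]


def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def commutator(m, n):
    return combine(matmul(m, n), matmul(n, m), -1)


def is_zero(m, tol=1e-12):
    return all(abs(v) < tol for row in m for v in row)


# Hypothetical basis choice: self-dual (A) and anti-self-dual (B) combinations.
A = [combine(J(0, 1), J(2, 3), +1), combine(J(0, 2), J(1, 3), +1), combine(J(0, 3), J(1, 2), -1)]
B = [combine(J(0, 1), J(2, 3), -1), combine(J(0, 2), J(1, 3), -1), combine(J(0, 3), J(1, 2), +1)]

# Every generator in A commutes with every generator in B ...
factors_commute = all(is_zero(commutator(a, b)) for a in A for b in B)
# ... while each set is non-abelian on its own.
a_is_nonabelian = not is_zero(commutator(A[0], A[1]))
```

Each commuting factor then carries its own quadratic Casimir, which is the structure the text uses to obtain two independent wave equations in d = 2.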
In higher dimensions, the conformal group does not factorize in this way; however, as proposed in [38], one can still focus on SO(2, 2) subgroups (as well as other subgroups) of the full SO(2, d) group. This approach gives rise to an elaborate system of constraints (6.2-6.4), as discussed in section 6.1. However, as discussed there, there are various (algebraic and differential) relations amongst the $\Gamma_{abcd}(x, y)$ operators constructed there, e.g., see Eqs. (6.12) and (6.13). While we were unable to prove it, a natural conjecture is that the $\Sigma_{0i}$ form a sufficient set of constraints to identify the physical solutions of the wave equation (3.8). It is interesting to note then that the operators appearing in these constraints are second-order differential operators in the timelike directions on the moduli space, i.e., $\ell^\mu$. This becomes clearer if we consider Eq. (6.9) on the submanifold where $\ell^\mu = R\,\delta^\mu_0$, which yields Eq. (7.1), where k is only summed over k = 1, ..., d − 1. For comparison purposes, we also consider the $\Sigma_{ij}$ constraints in Eq. (7.2). Here we see these latter constraints are only first order in 'time' derivatives. While the appropriate initial value problem is not entirely clear, our proposal above was that initial data would be specified on the codimension-d submanifold where $\ell^\mu \to 0$. Then, combining the constraints $\Sigma_{0i}(x, y)\, f(x, y) = 0$ with Eq. (3.8), we have d second-order wave equations which would propagate the physical solutions out across the moduli space (footnote 33). In this context, we can regard the constraints $\Sigma_{ij}(x, y)\, f(x, y) = 0$ as imposing constraints on the initial data, i.e., the values of f and its first 'time' derivatives on the codimension-d initial value surface. Further, the additional constraints discussed towards the end of section 6.1 would verify that the $\Sigma_{ij}$ constraints are consistent with the propagation produced by the $\Sigma_{0i}$ equations.
This intriguing structure is then reminiscent of the constraint equations appearing in gauge theories or gravity and it may be hinting that there is a hidden gauge symmetry underlying the present equations. However, the full structure of the constraints and the associated initial value problem is intricate and remains to be understood. We hope to return to these issues in future work.
Let us note in this context that one might anticipate some simplifications when we restrict attention to stationary configurations. In particular, it seems that in such a case, we would only need to consider spherical domains on a fixed time slice. The moduli space of such balls (footnote 34) again reduces to the d-dimensional de Sitter geometry studied in [35], as discussed in section 2. Hence one may expect that the problem reduces to solving the standard Lorentzian wave equation on this geometry. However, it turns out that when evaluated on a stationary configuration, our nonlocal observables (3.1) do not satisfy the naïve wave equation on the dS_d space. In the case of holographic CFTs, there is a simple intuition for this fact: stationary bulk solutions also do not obey the Euclidean AdS_d equations of motion. That is, performing the time integral in the first law with stationary sources yields a stationary kernel which is not appropriate for free wave propagation on Euclidean AdS_d. It would be interesting to fully investigate this in a more general context.
Time Evolution: One of our motivations here was to extend the construction of [35], which focused on fixed time slices, to a new framework which could describe the time dynamics of the CFT. Hence we must observe that describing the time evolution of the CFT remains to be understood in the current framework. As discussed in section 2.2, according to the metric (2.21), the displacements $dc^\mu$ of the centre of the causal diamond are all spacelike while the displacements $d\ell^\mu$ deforming the causal diamond are all timelike.

Footnote 33: Note that the distinction between different types of constraints, e.g., as in Eqs. (7.1) and (7.2), should be made covariantly with the projection operators $P^{\parallel}_{\mu\nu} = \ell_\mu \ell_\nu / \ell^2$ and $P^{\perp}_{\mu\nu} = \eta_{\mu\nu} - P^{\parallel}_{\mu\nu}$. That is, the two classes of constraints would be replaced by $\Sigma_{0i} \to P^{\parallel\,\sigma}{}_{\mu}\, P^{\perp\,\rho}{}_{\nu}\, \Sigma_{\sigma\rho}$ and $\Sigma_{ij} \to P^{\perp\,\sigma}{}_{\mu}\, P^{\perp\,\rho}{}_{\nu}\, \Sigma_{\sigma\rho}$. This covariant description reinforces the idea that all d(d − 1)/2 components of $\Sigma_{\mu\nu}$ play a role in describing the physical observables on the moduli space.

Footnote 34: Equivalently, we can consider the space of maximally symmetric minimal surfaces of codimension one in Euclidean AdS_d.
But in particular then, translating the causal diamonds in time corresponds to a spacelike motion on the moduli space! This 'unusual' feature becomes readily evident in Eq. (2.22). However, we can gain some insight into the time evolution as follows: Choose a fixed time foliation of the original flat spacetime and consider spheres restricted to these time slices. Following the discussion of section 2.1, this amounts to choosing the coordinates for the tip and the tail of the corresponding causal diamonds as $x^0 = t + R$, $y^0 = t - R$ and $\vec{x} = \vec{y}$. With this restriction, the metric (2.14) reduces to the form given in Eq. (7.3). Hence we have identified a submanifold of the full coset with the geometry of (d + 1)-dimensional de Sitter space (footnote 35). However, this submanifold clearly exposes the somewhat surprising feature noted above, namely, the sphere radius R plays the role of time while the CFT time t appears as a spacelike coordinate. Hence this key issue remains an open question for this new moduli space approach, i.e., how to construct a natural description of the real-time dynamics of the underlying CFT using this framework.
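The restriction to fixed time slices described above can be checked numerically. The following Python sketch is illustrative only: it assumes that the metric on the space of causal diamonds takes the form $ds^2 = h_{\mu\nu}\, dx^\mu dy^\nu$ with $h_{\mu\nu} = \frac{4L^2}{(x-y)^2}\left(-\eta_{\mu\nu} + \frac{2 (x-y)_\mu (x-y)_\nu}{(x-y)^2}\right)$, with L a curvature scale, and it compares the restricted line element with the planar de Sitter form $\frac{L^2}{R^2}\left(-dR^2 + dt^2 + d\vec{x}^{\,2}\right)$. Both explicit forms, as well as all function names, are assumptions introduced here and should be compared against Eqs. (2.14) and (7.3) of the paper.

```python
def minkowski_dot(a, b):
    """Mostly-plus Minkowski inner product; index 0 is the time component."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))


def diamond_line_element(x, y, dx, dy, L=1.0):
    """h_{mu nu} dx^mu dy^nu for the ASSUMED metric on the space of diamonds."""
    s = [p - q for p, q in zip(x, y)]  # separation x - y of the two tips
    s2 = minkowski_dot(s, s)           # negative for timelike separation
    bracket = -minkowski_dot(dx, dy) + 2.0 * minkowski_dot(s, dx) * minkowski_dot(s, dy) / s2
    return 4.0 * L**2 / s2 * bracket


def restricted_line_element(t, R, xs, dt, dR, dxs, L=1.0):
    """Evaluate the line element on x^0 = t + R, y^0 = t - R, equal spatial parts."""
    x = [t + R] + list(xs)
    y = [t - R] + list(xs)
    dx = [dt + dR] + list(dxs)
    dy = [dt - dR] + list(dxs)
    return diamond_line_element(x, y, dx, dy, L)


def de_sitter_line_element(R, dt, dR, dxs, L=1.0):
    """Planar de Sitter line element with R as time and (t, xs) as space."""
    return L**2 / R**2 * (-dR**2 + dt**2 + sum(v * v for v in dxs))
```

With these assumptions, a pure shift in t gives a positive line element and a pure change in R gives a negative one, in line with the statement that R plays the role of time on this submanifold while t is spacelike.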
Kähler-like structure: In a two-dimensional CFT, we saw that right- and left-moving contributions (4.10) to the entanglement entropy had an interesting interpretation as conformal factors for the two de Sitter factors of the moduli space in Eq. (4.20). With this interpretation, the Liouville equations (4.11) were equivalent to demanding a positive constant curvature for the conformally rescaled de Sitter metrics. One problem with this interpretation, however, is that the sum $S_{EE} = S_R(f) + S_L(f)$ itself does not appear in the geometry and we need to be able to split it into right- and left-moving components to write Eq. (4.20). As a consequence, it is not easy to generalize this structure to higher dimensions.
Interestingly, it is possible to identify a different mode in the metric on the moduli space such that demanding constant scalar curvature gives rise to a field equation which is identical to the field equation obeyed by $\delta S_{EE}$, suggesting a close relation between the two. To write down this mode, we first point out that the metric on the space of causal diamonds (2.14) can be obtained from the Kähler-like structure given in Eq. (7.5). This is reminiscent of the findings of Refs. [36,79], where the role of the potential V was played by the entanglement entropy in a two-dimensional CFT. It is clear though that in higher dimensions this direct association is no longer true, although one might still try to express Eq. (7.4) in terms of the entanglement entropy through its leading divergent term, i.e., the area law contribution.

Footnote 35: One might note that the above de Sitter geometry is a somewhat unusual choice in the context of the present paper because it is not a totally geodesic submanifold. This can be seen since we can regard the new submanifold (7.3) as a coset itself: dS_{d+1} = SO(1, d + 1)/SO(1, d). However, the SO(1, d + 1) isometries of the submanifold do not form a subgroup of SO(2, d), the isometries of the full coset. As a result, as is easily verified, our nonlocal operators do not satisfy a simple wave equation on the dS_{d+1} geometry.
To make the connection with Kähler geometry more transparent, we will temporarily relabel $y^\mu$ as $x^{\bar\mu}$ and define $g_{\mu\bar\nu} \equiv \frac{1}{2} h_{\mu\nu}$ and $g_{\mu\nu} \equiv 0$. The Ricci tensor for this type of Kähler metric takes a simple form, where we used a specific property of the metric (2.14). The moduli space is therefore a space of constant scalar curvature. Another property of Kähler metrics is the simplicity of the scalar Laplacian. Let us now try to look for variations of the Kähler potential that do not change the value of the scalar curvature and see whether these variations obey an interesting equation. By explicitly varying the scalar curvature $R = 2 g^{\mu\bar\nu} R_{\mu\bar\nu}$, we find an expression which simplifies upon using Eq. (7.7). Using the explicit form of $g_{\alpha\bar\beta}$ in terms of $\delta V$, we can finally write the requirement $\delta R = 0$ as a differential equation for $\delta V$. If we were to take $\delta V = \delta S_{EE}$, this equation would indeed be satisfied. It would be interesting to explore this intriguing potential connection between $\delta S_{EE}$ and $\delta V$ further. If correct, one could speculate that it might even be valid at the nonlinear level, and that constant scalar curvature on the moduli space of causal diamonds yields the full nonlinear equation for entanglement entropy valid in generic gravitational backgrounds but in the absence of other sources. It is also intriguing to notice that for spacelike separated points, V itself is proportional to the geodesic distance between the two points, so that the constant curvature condition may have a natural meaning in that case as well. To test these ideas, one could for example check whether they apply to entanglement entropy in explicitly known non-trivial gravitational backgrounds such as black holes. We hope to return to these issues at some point in the future.

Generalized twist operators:
One open question is to provide a nonlinear generalization of the observables introduced in section 3. Motivated by considerations of entanglement entropy, we are drawn to consider twist operators with regard to this issue. Recall that, as was briefly reviewed in section 4, the entanglement entropy, as well as the Rényi entropies, can be evaluated in terms of twist operators in an n-fold replicated version of the CFT (see also [52, 55-57]). Further, in higher dimensional CFTs, i.e., for d ≥ 3, the twist operators $\sigma_n$ are codimension-two surface operators with support on the entangling surface. In [52,80], it was argued that an effective twist operator $\tilde\sigma_n$ is defined if one considers correlation functions where the twist operator only interacts with other operators which are all from a single copy of the replicated CFT. In particular, one finds $\tilde\sigma_n = e^{-(n-1) H_m}$ (7.12), where $H_m$ is the modular Hamiltonian. This expression should apply for general geometries but, of course, the special case of a spherical entangling surface (in the CFT vacuum) is of interest here, where $H_m$ is given by the local expression in Eq. (1.3). This expression is particularly useful to investigate the limit n → 1. In particular, it demonstrates that the modular Hamiltonian is the only nontrivial contribution in the OPE limit of the twist operator which survives in the n → 1 limit. Ref. [81] suggested augmenting the twist operators with (the exponential of) a charge term which had the form of one of our new observables (3.17) with a spin-one conserved current. A similar extension [62] involving higher spin observables (3.35) was considered in the context of two-dimensional CFTs of the form discussed in section 5. Given these considerations, it is tempting to generalize Eq. (7.12) to a family of 'generalized twist operators' based on our nonlocal observables, e.g., as in Eq. (7.14). We have included a numerical coefficient µ so that the linearized observable would emerge in a 'first law'-like expression with the limit µ → 0 (footnote 36). However, it is not immediately clear whether one can meaningfully construct the power series in µ implicit in the above definition of $\tilde\sigma(O)$. We hope to return to study this question and other issues for this possible nonlinear generalization of our nonlocal observables in the future.

… to the appearance of a logarithmic divergence whose coefficient would yield the universal contribution. These considerations would then put these universal contributions on the same footing as the constant F in the F-theorem [10,11,82,83]. However, there are subtleties defining F using entanglement entropy [84] and so, as in that case, one might ask if a more robust definition of Q(O) for the cases where Eq. …

It is clear from the discussion above that our studies here have left open a variety of interesting questions and we hope to continue to study these in future research.
second section, we discuss the moduli space for pairs of spacelike separated points, which arises naturally in a number of instances, e.g., two dimensions. Finally, in the last section, we elaborate on the form and properties of the conformal Killing vector which can be constructed to preserve the form of any given causal diamond.

A.1 Derivation of metric on the space of causal diamonds
In the following, we present further details in the derivation of the metric (2.14) on the moduli space of causal diamonds. Our approach is to continue working in the embedding space introduced in section 2.1, make a general ansatz compatible with the required symmetries, and subsequently impose conditions which fix the free parameters.
We remind the reader that the metric needs to be of the form (2.13), which we reproduce here for convenience: where the vectors T b and S b still need to be fully determined, subject to the conditions in Eqs. (2.8) and (2.9), i.e., The form of the metric (A.1) was derived in section 2.1 by demanding SO(1, d−1)×SO(1, 1) invariance. Let us start with the observation that for any metric of the form where a is a Killing coordinate, i.e., none of the metric components depends on a, one obtains its SO(1, 1) coset by taking where m i are the coordinates on the final coset. In order to obtain the metric on the space of causal diamonds, we thus need to parametrize T b and S b in terms of the corresponding m i -coordinates, which in our case are simply x µ and y µ specifying the tips of a causal diamond. We then need to evaluate Eq. (A.1). The corresponding Killing coordinate will be that associated with the SO(1, 1) boost and this will allow us to use Eq. (A.5) to explicitly write out the desired coset metric.
As it turns out, the parametrization of $T^b$ and $S^b$ given in Eq. (A.6) does the job for us. In order to demonstrate this, let us start with the conditions (A.3), which by taking their two independent linear combinations can be recast as

$C^{(0)}_x - 2\, w \cdot x + C^{(2)}_x\, w^2 = 0$ and $C^{(0)}_y - 2\, w \cdot y + C^{(2)}_y\, w^2 = 0$, (A.7)

where the coefficients $C^{(0)}_{x,y}$ and $C^{(2)}_{x,y}$ are determined by $T^b$ and $S^b$. Clearly, neither $T^b$ nor $S^b$ can depend on $w^\mu$. As a result, demanding conditions (A.3) amounts to solving a set of 4 independent equations: $C^{(0)}_x = x^2$, $C^{(2)}_x = 1$, $C^{(0)}_y = y^2$ and $C^{(2)}_y = 1$.
We will regenerate the missing parameter by evaluating the metric (A.1) by performing a boost in the (T, S)-plane,

Figure 10. Flow lines of the conformal Killing vector $K^\mu$. The causal diamond ♦($x^\mu$, $y^\mu$) is shaded blue.

A.2 Conformal Killing Vectors
Given a causal diamond in Minkowski space, which is defined by the positions of the future and past tips ($y^\mu$, $x^\mu$), there is a conformal Killing vector $K^\mu$ which preserves the diamond, given in Eq. (A.13) (footnote 37). From this expression, one can easily see that the vector vanishes at $w^\mu = x^\mu$ and $w^\mu = y^\mu$, and when both $(y - w)^2 = 0$ and $(x - w)^2 = 0$, i.e., when Eq. (2.2) is satisfied. Hence the tips of the causal diamond and also the maximal sphere at the waist of the causal diamond are fixed points of the flow defined by K. Further, one sees that K is null on the boundaries of the causal diamond, i.e., when either $(y - w)^2 = 0$ or $(x - w)^2 = 0$. Finally, one can also observe that within the rest of the causal diamond K is timelike and future directed. Figure 10 illustrates the Killing flow both inside and outside of the causal diamond for a cross-section of the diamond. Working with standard 'Cartesian' coordinates $w^\mu = (t, \vec{x})$ in Minkowski space, if one chooses the frame where $y^\mu = (R, \vec{x}_0)$ and $x^\mu = (-R, \vec{x}_0)$ then the conformal Killing vector takes a recognizable form, e.g., [33]. Given this expression, one sees that the perturbation of the entanglement entropy in Eq. (1.3) can be written in a covariant form, where the integration runs over $|\vec{x} - \vec{x}_0|^2 \le R^2$ on the t = 0 time slice. However, in this form, we can regard the integrand as a conserved current, which allows us to move the surface of integration to any Cauchy surface spanning the associated causal diamond. That is, if we define $J^\mu \equiv T^{\mu\nu} K_\nu$, it follows that $\nabla_\mu J^\mu = 0$ because the stress tensor is conserved and traceless, i.e., $\nabla_\mu T^{\mu\nu} = 0 = T^\mu{}_\mu$, and because K is a conformal Killing vector, i.e., $\nabla_\mu K_\nu + \nabla_\nu K_\mu = \frac{2}{d} (\nabla \cdot K)\, \eta_{\mu\nu}$. Of course, similar statements apply for the higher spin observables constructed in section 3.2. We might note that in two dimensions, using the null coordinates introduced in Eqs. (2.16) and (2.17), the conformal Killing vector takes a particularly simple form. This allows us to re-express the observables (3.34) for d = 2 CFTs.

In the context of the AdS/CFT correspondence, the conformal Killing vector (A.13) extends to a proper Killing vector of the AdS geometry as follows: We describe the AdS geometry with Poincaré coordinates, where we have introduced a (d + 1)-dimensional vector notation, e.g., we denote the bulk coordinates as $W^M = (w^\mu, z)$. Hence we indicate the tips of the causal diamond in the boundary with $Y^M = (y^\mu, 0)$ and $X^M = (x^\mu, 0)$. With this notation, the bulk Killing vector is written in terms of inner products of the form $(Y - X)^2 = G_{MN} (Y - X)^M (Y - X)^N$. With this expression, one can easily verify that the tips of the causal diamond in the boundary are fixed points of the Killing flow, as is the extremal surface where $(Y - W)^2 = 0$ and $(X - W)^2 = 0$. Further, one can see that the Killing vector becomes null on the boundaries of the causal wedge in the bulk.

Footnote 37: As usual, our notation here is that $(y - x)^2 = \eta_{\mu\nu} (y - x)^\mu (y - x)^\nu$.
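The properties of the boundary conformal Killing vector listed above (fixed points at the tips and at the waist, null on the boundary of the diamond, timelike and future directed inside) can be checked numerically. The Python sketch below assumes the explicit form $K^\mu = -\frac{2\pi}{(y-x)^2}\left[(y-w)^2 (x^\mu - w^\mu) - (x-w)^2 (y^\mu - w^\mu)\right]$, which is our assumption for Eq. (A.13), and compares it with the familiar expression $K = \frac{\pi}{R}\left[(R^2 - t^2 - |\vec{x}|^2)\,\partial_t - 2 t\, x^i \partial_i\right]$ for a diamond centred at the origin. The function names are hypothetical.

```python
import math


def minkowski_dot(a, b):
    """Mostly-plus Minkowski inner product; index 0 is the time component."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))


def killing_vector(w, x, y):
    """ASSUMED form of the conformal Killing vector preserving the diamond (x, y)."""
    yx = [p - q for p, q in zip(y, x)]
    yw = [p - q for p, q in zip(y, w)]
    xw = [p - q for p, q in zip(x, w)]
    pref = -2.0 * math.pi / minkowski_dot(yx, yx)
    yw2 = minkowski_dot(yw, yw)
    xw2 = minkowski_dot(xw, xw)
    return [pref * (yw2 * a - xw2 * b) for a, b in zip(xw, yw)]


def standard_form(w, R):
    """Familiar form for tips y = (R, 0, ...), x = (-R, 0, ...)."""
    t, xs = w[0], w[1:]
    r2 = sum(v * v for v in xs)
    return [math.pi / R * (R**2 - t**2 - r2)] + [-2.0 * math.pi * t * v / R for v in xs]


R = 2.0
past_tip = [-R, 0.0, 0.0]
future_tip = [R, 0.0, 0.0]

k_at_tip = killing_vector(past_tip, past_tip, future_tip)           # fixed point
k_at_waist = killing_vector([0.0, R, 0.0], past_tip, future_tip)    # fixed point
k_on_boundary = killing_vector([1.0, 1.0, 0.0], past_tip, future_tip)  # (y - w)^2 = 0
k_at_centre = killing_vector([0.0, 0.0, 0.0], past_tip, future_tip)
```

Under the stated assumption, the two expressions agree at generic points, which supports identifying the covariant form with the vector that generates the modular flow of a spherical region.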
One can also consider the analytic continuation of Eq. (A.13) to Euclidean signature, which follows by simply replacing the Lorentzian inner product there by $(y - x)^2 = \delta_{\mu\nu} (y - x)^\mu (y - x)^\nu$. As discussed in section 3.5, there are two distinct moduli spaces to consider in Euclidean signature and the associated conformal Killing vectors arise from different choices of the vectors $x^\mu$ and $y^\mu$. If we choose real vectors, then $x^\mu$ and $y^\mu$ now define a pair of spacelike points and these points are the only fixed points of the flow defined by $K^\mu$ (footnote 38).

Footnote 38: That is, in Euclidean signature, the only solution of $(x - w)^2 = 0$ is $w^\mu = x^\mu$ and hence we cannot simultaneously solve $(y - w)^2 = 0$ and $(x - w)^2 = 0$. Note that if we were considering spacelike separated points but in Lorentzian signature, the simultaneous solution of these two equations would define a spacelike hyperbola; see the following section.
The second distinct moduli space in Euclidean space is the space of all (d − 2)-dimensional spheres, which is described by the coset SO(1, d + 1)/(SO(1, d − 1) × SO(2)). In this case, the associated conformal Killing vector results from choosing 'complex' vectors $x^\mu$ and $y^\mu$. In particular, using the notation of Eq. (2.20), we choose these vectors to be complex conjugates of one another, $x^\mu = (y^*)^\mu$, built from the centre $c^\mu$, the radius R and an arbitrary unit vector $n^\mu$ in $\mathbb{R}^d$. In the resulting conformal Killing vector, we have introduced an extra overall factor of i to produce a real vector. Since $w^\mu$ corresponds to real positions, we cannot satisfy the equations $w^\mu = y^\mu$ or $w^\mu = (y^*)^\mu$. On the other hand, the equations $(y - w)^2 = 0$ and $(y^* - w)^2 = 0$ can be simultaneously solved. That is, the flow of the new vector $K^\mu$ has fixed points on a (d − 2)-sphere of radius R centred at $w^\mu = c^\mu$ and lying in the (d − 1)-dimensional hyperplane defined by n · (c − w) = 0. Hence this new Killing vector generates the SO(2) symmetry in the coset describing the moduli space of (d − 2)-dimensional spheres in $\mathbb{R}^d$.

A.3 Moduli space of spacelike separated pairs of points
Here we would like to consider the analog of our generalized kinematic space (2.11) for pairs of spacelike separated points in a d-dimensional CFT (with Lorentzian signature). Recall that $M^{(d)}_\Diamond$ was the moduli space of all causal diamonds, or equivalently of all spheres, or equivalently of all timelike separated pairs of points. Considering the space of spacelike separated points arises naturally in a number of instances, e.g., upon analytically continuing to a Euclidean signature, as discussed briefly in section 3.5. In fact, in two dimensions, a causal diamond can be defined either in terms of a pair of timelike separated points or a pair of spacelike separated points (footnote 39). Hence it seems that d = 2 is a special case where the two moduli spaces are equivalent, i.e., the space of timelike separated pairs of points is the same geometric object as the space of spacelike separated pairs of points. Our final conclusion here is that in fact this equivalence extends to CFTs in arbitrary dimensions! To understand this new moduli space, we begin by considering the intersection of the lightcones from a pair of spacelike separated points. As illustrated in Figure 11, the intersection of the lightcones defines a spacelike hyperbola lying in a fixed timelike hyperplane (of codimension one). Hence in analogy to the previous discussion of kinematic space, we may say that the moduli space of pairs of spacelike separated points is equivalent to the moduli space of spacelike hyperbolae. There is no obvious analog of the causal diamonds since for spacelike separated points, the two lightcones do not enclose a finite-volume region anywhere, as can be seen in the figure.

Figure 11. Illustration of the one-to-one correspondence between spacelike separated points and the space of spacelike hyperbolae: the intersection of the lightcones of two spacelike separated points forms a spacelike hyperbola (dashed maroon curve) which lies in a timelike codimension-one hyperplane (shaded in yellow).
Next we would like to understand the coset structure of this moduli space by turning to the embedding space introduced in section 2.1. However, it is easiest to think in terms of a construction of the moduli space of spacelike hyperbolae in a d-dimensional CFT. A bit of thought shows that such a hyperbola will be described by choosing a pair of orthogonal unit vectors, $T^b$ and $S^b$, satisfying precisely the same conditions given in Eqs. (2.8) and (2.9). This construction is again easily illustrated with the Poincaré patch coordinates (2.7), where a convenient choice of the unit vectors can be made such that the orthogonality constraints (2.9) pick out simple surfaces in the asymptotic geometry, i.e., $S^b$ selects a particular timelike codimension-one hyperplane in the boundary while $T^b$ selects a spacelike hyperboloid. The intersection of these two surfaces then yields the (codimension-two) hyperbola. Now following the discussion of section 2.1, a particular pair of unit vectors, $T^b$ and $S^b$, specifies a particular hyperbola in the boundary geometry. We sweep out the rest of the moduli space by acting with SO(2, d) transformations, i.e., Lorentz transformations in the embedding space. The coset structure of the resulting moduli space of hyperbolae is then determined by the symmetries preserved by any particular choice of the unit vectors. Since the constraints on the present unit vectors are precisely the same as in section 2.1, these symmetries are also the same and hence we arrive at the same coset as given in Eq. (2.11), namely, SO(2, d)/(SO(1, d − 1) × SO(1, 1)) (A.25).
At first sight, this result may seem rather counterintuitive. Spacelike and timelike separated pairs of points are by definition very different kinds of objects in Minkowski space and yet we found that in a d-dimensional CFT, the moduli spaces of such pairs are described by the same coset structure irrespective of whether the separation is spacelike or timelike. Further, in the language of the embedding space, the two spaces are being described by precisely the same family of orthogonal unit vectors, i.e., pairs satisfying Eqs. (2.8) and (2.9). Of course, this indicates that not only do we have two moduli spaces described by the same coset geometry (A.25) but that in fact we are considering one and the same moduli space from two different perspectives! In order to develop a better understanding of this counterintuitive result, consider the following: The first point to note is that our intuition about spacelike and timelike separated pairs of points is firmly rooted in flat Minkowski space. However, recall that in the embedding space, the Poincaré patch coordinates (2.7) only cover a portion of the AdS hyperboloid (2.5) and some SO(2, d) transformations will take us out of this region, i.e., pairs of points may be mapped beyond the corresponding Minkowski space in the asymptotic boundary. Hence it is more appropriate to think of working in global coordinates for the AdS geometry or transforming the CFT to the 'cylindrical' background $\mathbb{R} \times S^{d-1}$ (with $\mathbb{R}$ being the time direction) (footnote 40). In the latter geometry, there are limits to how far apart the pairs of points can be (footnote 41). In particular, for spacelike separated points, the maximum separation is $\pi R_{sph}$, where $R_{sph}$ is the radius of curvature of the $S^{d-1}$, i.e., maximally separated pairs are antipodal pairs on the (d − 1)-sphere (see Figure 12). Similarly, the maximal separation for a timelike pair is $2\pi R_{sph}$.
For example, if the two points lie at the same pole on the sphere, then with this maximal time separation, the lightcones from these two points intersect at a point on the opposite pole and hence the corresponding sphere has the maximal angular size, i.e., the sphere's proper size has actually shrunk to zero but the 'enclosed' ball covers the entire $S^{d-1}$. In fact, as illustrated in the figure, the null cones of these two maximally (timelike) separated points actually coincide (footnote 42). This leads to the observation that because of the compact structure of the $S^{d-1}$, when we choose any single point in the $\mathbb{R} \times S^{d-1}$, by following the past and future null cones, we actually specify two families of preferred points. The first family consists of points lying at the same pole of the sphere at $t = 2\pi n R_{sph}$, where n is any integer (and we have assumed the initial point lies at t = 0, i.e., n = 0). The second family consists of points on the opposite pole lying at $t = 2\pi (n + \frac{1}{2}) R_{sph}$, where n is again any integer.

Footnote 40: With this transformation, we are actually extending the original Minkowski space to a geometry where the conformal group acts properly everywhere.

Footnote 41: As in flat space, we measure the separation between points in $\mathbb{R} \times S^{d-1}$ as the (minimal) proper distance along geodesics connecting the points.

Figure 12. Illustration of the CFT on the cylindrical background $\mathbb{R} \times S^1$. The point $z^\mu$ is the antipodal point from the point $x^\mu$ on the constant time slice containing this point and $z^\mu$ has the maximal spacelike geodesic distance $\pi R_{sph}$ from $x^\mu$. Blue lines are past and future lightcones of $x^\mu$. The point $\tilde{x}^\mu$ corresponds to the position where the future lightcone of $x^\mu$ first self-intersects. The sphere S (indicated by black points) can be described as the intersection of the past lightcone of $y^\mu$ either with the future lightcone of $x^\mu$, or alternatively with the past lightcone of the antipodal point $\tilde{x}^\mu$. In the former case, S is characterized by a pair of timelike separated points, in the latter case by a pair of spacelike separated points.
This insight then allows us to understand the equivalence of the two spaces discussed above in very concrete terms. Consider the two timelike separated points designated x and y shown in Figure 12. The future lightcone of x and the past lightcone of y intersect on the sphere designated S. However, now consider the point $\tilde{x}$ where the future lightcone of x (first) converges to a point on the opposite pole of the sphere. The pair $\tilde{x}$ and y is now a spacelike separated pair of points. The past and future lightcones from these two points intersect at the spheres S and $\tilde{S}$, respectively. Now, in an appropriate conformal frame, where $\tilde{x}$ and y are spacelike separated points in flat Minkowski space, these two spheres become the two branches of the corresponding spacelike hyperbola discussed above (footnote 43). However, the key point here is that in the $\mathbb{R} \times S^{d-1}$ conformal frame, we can specify spheres either in terms of the intersection of the past and future lightcones of a pair of timelike separated points or in terms of the intersection of the past lightcones from two spacelike separated points. Hence we recognize that the moduli spaces of spacelike and timelike pairs in fact provide two different perspectives of the same geometric object! Given that the moduli spaces of spacelike and timelike pairs (on $\mathbb{R} \times S^{d-1}$) are the same, it is interesting that the discussion in section 2.2 implies that the limit in which a timelike separated pair approaches a null separated pair of points is a limit that takes us to timelike infinity in the moduli space (see footnote 5).

Footnote 42: In the embedding space, the two points considered here are actually coincident points on the boundary of the AdS hyperboloid (2.5). It is only when we consider the universal cover of the AdS hyperboloid (as we do implicitly here) that the points are separated. In particular, if we had been more precise we should have replaced the SO(2, d) group in the numerator of (A.25) by a suitable infinite cover in this case.
This is a consistency check in that it shows that there is no trajectory on the moduli space that carries one between timelike separation to spacelike separation. Of course, it would be interesting to further explore the implications of this equivalence.

B.1 General definitions
Spinless case: Given the conformal symmetry generators $L_i(x)$, we define the second Casimir as the object $C_2 \equiv C^{ij} L_i(x) L_j(x)$ (where i, j = 1, ..., (d + 1)(d + 2)/2), which acts on scalar primaries O in the CFT with dimension $\Delta_O$ as in Eq. (B.1). In this appendix we discuss various realizations of $C_2$ on objects which carry a representation of the conformal group: fields in AdS_{d+1} and functions on the moduli space of causal diamonds. The conformal algebra in d dimensions is isomorphic to the algebra of SO(2, d), the Lorentz group of the embedding space (2.4). We write the action of the SO(2, d) generators on primaries as in Eq. (B.2) (see, e.g., [85]). These satisfy the algebra

$[J_{ab}, J_{cd}] = i\, \eta^{(2,d)}_{bc} J_{ad} - i\, \eta^{(2,d)}_{ac} J_{bd} - i\, \eta^{(2,d)}_{bd} J_{ac} + i\, \eta^{(2,d)}_{ad} J_{bc}$, (B.4)

where $\eta^{(2,d)}_{ab} = \mathrm{diag}(-1, -1, 1, \ldots, 1)$ is the embedding space metric. In terms of these Lorentz generators, we can represent the action of the Casimir on operators by $C_2 \equiv -\frac{1}{2} J_{ab} J^{ab}$, which acts as a differential operator whose eigenfunctions are the primary states. Since SO(2, d) acts on the AdS_{d+1} hyperboloid in embedding space as standard Lorentz transformations, the above generators can also be represented as isometry generators of AdS_{d+1}. This representation is given in embedding space coordinates by $J_{ab} = i (X_a \partial_b - X_b \partial_a)$. In particular, the AdS_{d+1} Laplacian is represented by the quadratic combination of these generators appearing in $C_2$. Similarly, the action of the Casimir is represented on the moduli space of causal diamonds. Using the explicit representation (B.2), it is straightforward to verify that the second Casimir, as a differential operator acting on the space of causal diamonds, is related to the scalar Laplacian on the same space when acting on f(x, y), where f(x, y) is any function on the space of diamonds ♦ = ($x^\mu$, $y^\mu$), and $\nabla^2_\Diamond$ is the Laplacian on the moduli space of diamonds (2.14).
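The sign conventions of the algebra (B.4) can be cross-checked in a finite-dimensional representation. The sketch below is an illustration under stated assumptions rather than the paper's construction: it uses the (d + 2)-dimensional vector representation $(J_{ab})^c{}_d = i(\delta^c_a \eta_{bd} - \delta^c_b \eta_{ad})$, which satisfies the same commutation relations as the differential operators $J_{ab} = i(X_a \partial_b - X_b \partial_a)$, and verifies (B.4) for all index choices. The helper names are hypothetical.

```python
def eta_diag(d):
    """Diagonal of the embedding-space metric diag(-1, -1, 1, ..., 1) for SO(2, d)."""
    return [-1, -1] + [1] * d


def generator(a, b, d):
    """Vector-representation matrix (J_ab)^c_d = i(delta^c_a eta_bd - delta^c_b eta_ad)."""
    n = d + 2
    e = eta_diag(d)
    m = [[0j] * n for _ in range(n)]
    m[a][b] += 1j * e[b]
    m[b][a] -= 1j * e[a]
    return m


def matmul(m, n):
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)] for i in range(size)]


def satisfies_b4(d, tol=1e-12):
    """Check [J_ab, J_cd] against the right-hand side of (B.4) for every a, b, c, d."""
    n = d + 2
    e = eta_diag(d)

    def eta(i, j):
        return e[i] if i == j else 0

    for a in range(n):
        for b in range(n):
            for c in range(n):
                for f in range(n):
                    jab, jcd = generator(a, b, d), generator(c, f, d)
                    lhs_1, lhs_2 = matmul(jab, jcd), matmul(jcd, jab)
                    jad, jbd = generator(a, f, d), generator(b, f, d)
                    jac, jbc = generator(a, c, d), generator(b, c, d)
                    for i in range(n):
                        for j in range(n):
                            lhs = lhs_1[i][j] - lhs_2[i][j]
                            rhs = 1j * (eta(b, c) * jad[i][j] - eta(a, c) * jbd[i][j]
                                        - eta(b, f) * jac[i][j] + eta(a, f) * jbc[i][j])
                            if abs(lhs - rhs) > tol:
                                return False
    return True
```

For example, `satisfies_b4(2)` checks the algebra of SO(2, 2) relevant for two-dimensional CFTs, and `satisfies_b4(3)` the case d = 3.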

B.2 Two-dimensional case
Let us briefly make the statements of the previous subsection more explicit in the case of two-dimensional CFTs (and AdS$_3$, respectively). In this case, we can work in right- and left-moving coordinates, $ds^2_{\mathrm{CFT}_2} = -dt^2 + dx^2 = d\xi\, d\bar\xi$. The above discussion concerned the action of conformal generators on CFT states. There is an analogous set of identities for AdS$_3$ isometry generators. We work in Poincaré coordinates

Using the general definitions of section B.1, we then find the following isometry generators in AdS$_3$: (B.18)

and similarly for $\bar L_n$ with $\xi$ and $\bar\xi$ interchanged. We then have that the combinations appearing in the Casimir $C_2$, and its left- and right-moving parts defined in (B.16), all correspond to the scalar Laplacian on AdS$_3$:

C Relative normalization of CFT and bulk quantities
In this appendix we demonstrate how to fix the relative normalization between $Q(O)$ as defined in Eq. (3.1) and its holographic counterpart $Q_{\rm holo}(O)$ in Eq. (3.24). Our strategy will be to exploit the fact that the normalization can be determined in the limit of very small diamonds, or equivalently with $\langle O \rangle$ = constant. For simplicity, we assume the centre of the diamond is located at $\tfrac{1}{2}(x^\mu + y^\mu) = 0$, and we work on a time slice such that $\tfrac{1}{2}(y^\mu - x^\mu) = R\,\delta^\mu_0$. Consider first the field theory observable $Q(O; x, y)$ in the limit $x \to y$, i.e., for a constant expectation value $\langle O \rangle$ throughout the causal diamond: (C.1)

To evaluate the integral, it is useful to parameterize the causal diamond as follows: where $\omega \in S^{d-2}$ is a unit vector that parameterizes the spacelike spherical slices. The full range $\zeta, \bar\zeta \in [-1, 1]$ would cover the diamond twice. Considering the symmetries of the integrand in (C.1), we can effectively integrate over the range $\zeta \in [-1, 1]$ and $\bar\zeta \in [0, 1]$: where we binomially expanded the measure factor $(\zeta + \bar\zeta)^{d-2}$ to perform a term-by-term integration. The final line can be simplified slightly by substituting $\Omega_{d-2} = 2\pi^{(d-1)/2} / \Gamma\big(\tfrac{d-1}{2}\big)$ for the volume of a unit $(d-2)$-sphere; however, the present form is convenient for our comparison below.

Next, we compute $Q_{\rm holo}(O)$ as defined in Eq. (3.24) using standard holographic techniques. In particular, we will work in Poincaré coordinates If one considers the dual field $\phi(u)$ in a linearized approximation in this background, the asymptotic behaviour takes the following form: Here $\lambda(\xi)$ is the coupling to the operator in the boundary CFT and we set it to zero in the following.$^{44}$ In keeping with the previous calculation, we also assume that $\langle O \rangle$ is constant, at least within the boundary region of interest. The boundary sphere in the previous calculation was chosen to be $t = 0$ and $r = R$. The corresponding extremal surface in the bulk is the hemisphere $t = 0$ and $z^2 + r^2 = R^2$.
We can parameterize this bulk surface with $z = R \sin\lambda$ and $r = R \cos\lambda$, where $0 \le \lambda \le \pi/2$. Then, keeping only the leading term in the asymptotic expansion of the bulk scalar, the computation of the observable $Q_{\rm holo}(O)$ reads as follows: where we have substituted $\ell_P^{d-1} = 8\pi G_N$ and applied Eq. (C.6) in the second line. We can now equate the two results (C.3) and (C.7) and thus fix the relative normalization: (C.8)
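The geometric ingredients of this comparison can be sketched in a few lines. The example below evaluates the volume of the unit $(d-2)$-sphere, $\Omega_{d-2} = 2\pi^{(d-1)/2}/\Gamma\big(\tfrac{d-1}{2}\big)$, and the hemisphere parametrization $z = R\sin\lambda$, $r = R\cos\lambda$. The area density is an assumption of this sketch: it is the induced measure $\cos^{d-2}\lambda / \sin^{d-1}\lambda$ that follows from a Poincaré metric $ds^2 = (dz^2 + dr^2 + r^2 d\Omega^2_{d-2} - dt^2)/z^2$ with unit curvature radius. The function names are hypothetical.

```python
import math


def unit_sphere_volume(d):
    """Volume of the unit (d-2)-sphere, 2 pi^((d-1)/2) / Gamma((d-1)/2)."""
    if d < 2:
        raise ValueError("need d >= 2")
    return 2.0 * math.pi ** ((d - 1) / 2) / math.gamma((d - 1) / 2)


def hemisphere_point(R, lam):
    """Point (z, r) on the bulk hemisphere z^2 + r^2 = R^2 at angle lam."""
    return R * math.sin(lam), R * math.cos(lam)


def area_density(lam, d):
    """Induced area density per unit d(lam) dOmega on the hemisphere.

    Assumes the unit-radius Poincare metric quoted in the lead-in, for which
    the induced metric is (d(lam)^2 + cos^2(lam) dOmega^2) / sin^2(lam).
    """
    return math.cos(lam) ** (d - 2) / math.sin(lam) ** (d - 1)


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, unit_sphere_volume(d))
    z, r = hemisphere_point(2.0, 0.3)
    print(z * z + r * r)
```

For $d = 2$ the density reduces to $1/\sin\lambda$, and the radius $R$ drops out of the induced measure, as expected from the scale invariance of the Poincaré metric.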

C.1 Holographic computation for a free scalar in AdS$_3$
We expect that the generalized first law (3.34) provides the leading order contribution to a set of novel physical quantities in CFTs, in the same way that the entanglement first law provides the leading order perturbation of the vacuum entanglement entropy for excited states. In the present section we want to corroborate this proposal by providing the holographic dual of $\delta S_O$ in a class of CFTs which admit a semi-classical gravity description. In section 3.4, we argued that $Q(O) = Q_{\rm holo}(O)$ with an appropriate choice of the bulk normalization constant $C_{\rm blk}$. The latter was fixed above by comparing the two expressions in a situation where $\langle O \rangle$ was a constant. In the following, we explicitly demonstrate the equivalence of the boundary and bulk expressions for a more nontrivial field configuration. To do so, we focus on AdS$_3$ with a free probe scalar field $\phi$ dual to a primary operator $O$ with $h = \bar h = \Delta_O/2$ in a two-dimensional holographic CFT. In this case, the 'sphere' of interest becomes an interval of length $2R$, which, for simplicity, we assume is centred at the origin on the $t = 0$ time slice. Further, Eq. (3.34) becomes … $\langle O(\xi, \bar\xi) \rangle$. (C.9)

$^{44}$ Eqs. (C.5) and (C.6) present a standard set of holographic conventions, e.g., see [86], although perhaps not a unique one. Further, we note that the choice $\lambda = 0$ means that we are only studying excitations of the CFT ground state here. It would be interesting to extend the discussion in this paper to holographic RG flows where the boundary theory is deformed away from a conformal fixed point.
The holographic expression in Eq. (3.24) reduces to an integral of the bulk scalar over the spatial geodesic $\gamma$ connecting the endpoints of the interval in the boundary theory: where we have used $8\pi G_N = \ell_P$ for $d = 2$ and substituted for the normalization constant $C_{\rm blk}$ using Eq. (C.8).
For our explicit computation, we pick a simple linearized perturbation by putting a delta-function source at a point $(\xi_0, \bar\xi_0)$ on the boundary. The linearized solution is given by the usual bulk-boundary propagator

$$\phi(z, \xi, \bar\xi) = \alpha \left( \frac{z}{z^2 + (\xi - \xi_0)(\bar\xi - \bar\xi_0)} \right)^{\Delta_O} . \tag{C.11}$$

Here, $\alpha$ is an arbitrary constant measuring the strength of the source and we are using Poincaré coordinates on AdS$_3$,

$$ds^2 = \frac{1}{z^2} \left( dz^2 + d\xi\, d\bar\xi \right) , \tag{C.12}$$

where the curvature radius is set to unity and $\xi, \bar\xi$ denote the null coordinates introduced in Eq. (2.16), i.e., $\xi = x - t$ and $\bar\xi = x + t$. For simplicity, we will assume that the source is spacelike separated from the interval, i.e., $(\xi - \xi_0)(\bar\xi - \bar\xi_0) > 0$ for any point $\xi = \bar\xi = x \in [-R, R]$ in the interval. The bulk geodesic spanning the boundary interval above may be parametrized by $x = R \cos\lambda$ and $z = R \sin\lambda$. (C.13) The line element along the geodesic is $d\lambda / \sin\lambda$ and then Eq. (C.10) yields where, in a slight abuse of notation, we have defined $|R \pm \xi_0|^2 \equiv (R \pm \xi_0)(R \pm \bar\xi_0)$ in the second line. The integral there can be found, e.g., in [87]. Note that the final result can be split into right- and left-moving factors, which was not at all clear from the initial expression.$^{45}$

$^{45}$ In the limit $\xi_0^2 \gg R^2$, the expectation value is essentially constant across the interval; see Eq. (C.16). Hence in this limit, the leading contribution above can be matched with that in Eq. (C.7) with $d = 2$. Note that in this case, $\Omega_0 = 2$.

Let us now turn to the boundary computation. First we should extract the expectation value $\langle O \rangle$ from our linearized solution (C.11) for the bulk scalar. As we take $z \to 0$ in Eq. (C.11), we immediately recognize the behavior of a normalizable mode (C.15) Now applying Eq. (C.6) with $d = 2$, we find Since this profile factorizes into right- and left-moving contributions, upon substitution into Eq. (C.9), we also find a factorized answer: (C.17) where, as above, we are using $|f(\xi)|^2 = f(\xi) f(\bar\xi)$ in the notation of complex coordinates.
This integral can also be performed, e.g., see Eq. (3.199) in [87], and one finds$^{46}$ which provides a perfect agreement with the holographic result in Eq. (C.14). Since this is a linearized calculation, the agreement (3.25) readily extends to arbitrary field configurations that are generated by the insertion of sources that are spacelike separated from the interval of interest. Of course, Eqs. (C.14) and (C.18) show that there are singularities that appear when the sources cross the lightcones of the endpoints of the interval, i.e., when the sources move into causal contact with the interval. It would be interesting to investigate whether $Q(O) = Q_{\rm holo}(O)$ still applies in the latter situation. Following the general arguments in section 3.4, this is intimately related to the question of better understanding causal wedge reconstruction in the bulk.
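The linearized bulk solution used in this subsection can be sanity-checked numerically. The sketch below assumes the standard bulk-boundary propagator $\phi = \alpha\,\big(z / (z^2 + (\xi - \xi_0)(\bar\xi - \bar\xi_0))\big)^{\Delta_O}$ with $\xi = x - t$, $\bar\xi = x + t$, and the unit-radius Poincaré metric $ds^2 = (dz^2 + d\xi\, d\bar\xi)/z^2$. For this metric the scalar Laplacian is $z^2(\partial_z^2 - \partial_t^2 + \partial_x^2) - z\,\partial_z$, and the propagator should satisfy $\nabla^2 \phi = \Delta_O(\Delta_O - 2)\,\phi$. The function names, the finite-difference step and the sample point are assumptions made for illustration only.

```python
import math


def propagator(z, t, x, delta, alpha=1.0, t0=0.0, x0=0.0):
    """Assumed bulk-boundary propagator with a source at (t0, x0)."""
    xi = (x - x0) - (t - t0)       # right-moving null coordinate
    xibar = (x - x0) + (t - t0)    # left-moving null coordinate
    return alpha * (z / (z * z + xi * xibar)) ** delta


def first_derivative(f, u, h):
    """Central finite difference for f'(u)."""
    return (f(u + h) - f(u - h)) / (2.0 * h)


def second_derivative(f, u, h):
    """Central finite difference for f''(u)."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)


def ads3_laplacian(phi, z, t, x, h=1e-3):
    """Scalar Laplacian z^2 (d_z^2 - d_t^2 + d_x^2) - z d_z on Poincare AdS3."""
    d2z = second_derivative(lambda u: phi(u, t, x), z, h)
    d2t = second_derivative(lambda u: phi(z, u, x), t, h)
    d2x = second_derivative(lambda u: phi(z, t, u), x, h)
    d1z = first_derivative(lambda u: phi(u, t, x), z, h)
    return z * z * (d2z - d2t + d2x) - z * d1z


def kg_residual(z, t, x, delta):
    """Laplacian of phi minus delta (delta - 2) phi; should vanish up to O(h^2)."""
    phi = lambda zz, tt, xx: propagator(zz, tt, xx, delta)
    return ads3_laplacian(phi, z, t, x) - delta * (delta - 2.0) * phi(z, t, x)


def boundary_coefficient(t, x, delta, z=1e-6):
    """z^(-delta) phi at small z, approximating the normalizable-mode coefficient."""
    return propagator(z, t, x, delta) / z ** delta


if __name__ == "__main__":
    print(kg_residual(z=0.7, t=0.2, x=1.5, delta=1.5))
    print(boundary_coefficient(t=0.2, x=1.5, delta=1.5))
```

At a spacelike separated sample point the residual is at the level of the finite-difference error, and the small-$z$ coefficient approaches $\alpha\,[(\xi - \xi_0)(\bar\xi - \bar\xi_0)]^{-\Delta_O}$, which factorizes into right- and left-moving parts as stated in the text.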