State-Dependent Divergences in the Entanglement Entropy

We show that the entanglement entropy in certain quantum field theories contains state-dependent divergences. Both perturbative and holographic examples are exhibited. However, quantities such as the relative entropy and the generalized entropy of black holes remain finite, due to cancellation of divergences. We classify all possible state-dependent divergences that can appear in both perturbatively renormalizable and holographic covariant $d\le 6$ quantum field theories.

1 Introduction

There has been much recent interest in the entanglement entropy of quantum field theories (QFTs). Given a QFT, a state in the theory, and a choice of region $R$, the entanglement entropy is formally defined as

$$S_R = -\mathrm{tr}(\rho_R \ln \rho_R), \qquad (1.1)$$

where $\rho_R$ is the restriction of the state to the region $R$. This quantity is sensitive to all degrees of freedom in $R$ (in fact, it is invariant under unitary transformations $\rho_R \to U \rho_R U^{-1}$) and obeys a set of interesting inequalities [1]. It is related to c-theorems in various dimensions [2,3,4,5,6], and in certain cases it provides information about topological phases which cannot be obtained from local order parameters [7,8,9]. It is also of interest in holographic theories, where there is a simple geometric description in terms of the area of an extremal surface in the dual bulk theory [10,11,12].

However, in any unitary theory with local degrees of freedom, $S_R$ is divergent. The leading order divergence is proportional to the area $A$, but there may also be subleading divergences. This makes Eq. (1.1) a purely formal expression until a UV regulator is specified, e.g. a lattice$^1$, brick wall [16], Pauli-Villars [17], heat kernel (reviewed in [18]), mutual information regulator [2,19], etc. In general, the value of $S_R$ will depend on the choice of regulator, making the interpretation of $S_R$ somewhat subtle. Nevertheless, one simplifying feature is that the divergences are always local integrals of quantities defined on the boundary entangling surface $\partial R$. For example, in a $d$-dimensional scale-invariant theory with a regulator at an energy scale $\Lambda$, one often finds power laws and/or logarithmic divergences:$^2$

$$S_R = \sum_{n<d-2} \int_{\partial R} X^{(n)}\, dA\, \Lambda^{d-2-n} + \int_{\partial R} X^{(d-2)}\, dA\, \ln(\Lambda) + \text{finite}, \qquad (1.2)$$

where $X^{(n)}$ are local integrands of dimension $n \le d-2$. In cases where each $X$ is constructed out of geometrical structures such as intrinsic and extrinsic curvatures, these dimensions are integers.$^3$ But more generally $X$ can also depend on scale-dependent parameters in the theory such as masses, or even, as we shall argue below, on expectation values of quantum fields. If these sources or fields have anomalous dimensions, then $n$ will generally be noninteger.
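As a finite-dimensional illustration of Eq. (1.1), the following minimal sketch evaluates the entropy from the eigenvalues of a reduced density matrix. It assumes a toy two-qubit pure state $\cos\theta\,|00\rangle + \sin\theta\,|11\rangle$ (not a system considered in the text), and the function names are hypothetical. Because only the spectrum of $\rho_R$ enters, the invariance under $\rho_R \to U \rho_R U^{-1}$ is manifest; no divergence appears here, since the toy system has no local field-theoretic degrees of freedom.

```python
import math

def von_neumann_entropy(spectrum):
    """S = -sum p ln p over the eigenvalues p of a density matrix (0 ln 0 := 0)."""
    return -sum(p * math.log(p) for p in spectrum if p > 0.0)

def reduced_spectrum(theta):
    """Eigenvalues of rho_R for the toy state cos(theta)|00> + sin(theta)|11>,
    after tracing out the second qubit (illustrative assumption)."""
    return [math.cos(theta) ** 2, math.sin(theta) ** 2]

# theta = 0 is a product state; theta = pi/4 is maximally entangled.
product_state = von_neumann_entropy(reduced_spectrum(0.0))
bell_state = von_neumann_entropy(reduced_spectrum(math.pi / 4))
print(product_state, bell_state, math.log(2))
```

The maximally entangled case reproduces $\ln 2$ up to floating-point rounding, and the product state gives zero.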
There are several distinct strategies for dealing with this divergence, depending on the needs of the particular application:

1. Specialize: Accept that each different regulator defines a distinct quantity, so just pick one choice of regulator and stick to it. (This is a somewhat narrow viewpoint because it makes it difficult to compare different calculations, but it may be fine within a particular project.)

2. Universalize: Identify "universal" aspects of $S_R$ which are the same for every good regulator (although they may still depend on the theory or state). This includes the coefficients of log divergences, or the finite piece modulo a local counterterm [10,11,21,20]. Other sets of universal quantities related to entanglement are the mutual information $I_{A,B} = S_A + S_B - S_{AB}$ between two regions $A$ and $B$ separated by a finite spatial separation [21], and the relative entropy of two states $S(\rho\,|\,\sigma) = \mathrm{tr}(\rho(\ln \rho - \ln \sigma))$ [22]. These quantities are typically finite.

3. Renormalize: In a gravitational theory, black holes have finite entropy $S_{BH} = A/4G$ + subleading corrections. Divergences in the QFT state outside the horizon can be handled by absorbing them into renormalizations of the gravitational parameters, e.g. the area law divergence corresponds to a shift of $1/G$, so that the total "generalized entropy" $S_{gen} = S_{BH} + S_{ent}$ remains finite. (See the Appendix of [23] for a review and references.) Although this approach was originally designed for black holes and other causal horizons, at least semiclassically one can assign an entropy to more general choices of entangling surface $\partial R$ as well [24,25,26].

4. Subtract: If the divergences do not depend on the state, then one can subtract off the entropy of some reference state $\sigma$, e.g. the vacuum state, as done in [27,28,29,30,31]. (This is obviously not useful if you are only interested in the structure of vacuum entanglement entropy!)
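The two "universal" quantities named in strategy 2 can be illustrated in a toy setting. The sketch below assumes commuting (diagonal) density matrices, for which the relative entropy reduces to a sum over probabilities; the joint distribution and function names are hypothetical. It checks numerically the standard identity that the mutual information equals the relative entropy between the joint state and the product of its marginals, and shows the relative entropy becoming infinite when the second state assigns zero probability where the first does not.

```python
import math

def entropy(p):
    """Entropy -sum p ln p of a diagonal state with eigenvalues p."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def relative_entropy(p, q):
    """S(p|q) = sum p (ln p - ln q), valid for commuting (diagonal) states."""
    total = 0.0
    for x, y in zip(p, q):
        if x > 0.0:
            if y == 0.0:
                return math.inf  # q vanishes on the support of p
            total += x * (math.log(x) - math.log(y))
    return total

# Hypothetical joint distribution over two bits, flattened as [p00, p01, p10, p11].
joint = [0.4, 0.1, 0.1, 0.4]
p_a = [joint[0] + joint[1], joint[2] + joint[3]]
p_b = [joint[0] + joint[2], joint[1] + joint[3]]
product = [pa * pb for pa in p_a for pb in p_b]

mutual_information = entropy(p_a) + entropy(p_b) - entropy(joint)
as_relative_entropy = relative_entropy(joint, product)
print(mutual_information, as_relative_entropy)
```

Both numbers agree to rounding error; in field theory the analogous combinations are where the divergences of the individual entropies cancel.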
The purpose of this article is to identify situations in which the divergences depend on the state via the expectation value $\langle O \rangle$ of some operator $O$. In such examples, the vacuum subtraction approach to controlling divergences will fail, although the other three approaches remain valid.
One might have thought that any two reasonable states (having the same UV vacuum structure) should differ by only a finite entropy in any region, but this turns out not to be the case. Physically reasonable states do indeed have finite relative entropy and/or generalized entropy, but in each of these cases there is an additional term in the expression (the modular Hamiltonian $\mathrm{tr}(\rho \ln \sigma)$ or a "Wald-like" surface term [32,33,34,35] respectively). In the simplest examples, the divergences which cancel between the two terms are state independent so that $\Delta S_R = -\mathrm{tr}(\rho \ln \rho) + \mathrm{tr}(\sigma \ln \sigma)$ is well defined. But in more complicated examples, $\Delta S_R$ has a divergence, which is nevertheless cancelled by the remaining part of the relative/generalized entropy, as discussed below.$^4$

In order to keep the analysis under control, we will focus our attention on perturbatively renormalizable or holographic theories with spacetime dimension $d \le 6$. Thus we allow the theory to have marginal or relevant couplings, as long as these couplings are covariant. In other words, if we start with the CFT that describes physics in the UV, then the only sources we may add are those given by the spacetime metric and constant scalars. We will classify all state-dependent divergences that can appear in such theories. In the CFT case, we argue that our divergences are generic by using bottom-up holographic examples defined by positing a dual asymptotically AdS bulk description. While there are no state-dependent divergences in the simplest possible cases (free theory and source-free holographic CFTs), going beyond these assumptions allows us to find examples.
A subtlety, however, is that not all terms superficially allowed by dimensional analysis can appear in the entanglement entropy. An important consistency principle comes from the "replica trick"$^5$, which is a relationship between the entropy and the effective action. Consider QFT states which are defined by some Euclidean path integral on a manifold $M$. Given a choice of entangling surface $\partial R$, we can define an $n$-sheeted replica manifold $M^{(n)}$ by introducing a conical singularity at $\partial R$ with total angle $2\pi n$, so that away from $\partial R$ the manifold is copied exactly $n$ times. Let the partition function on this manifold be $Z^{(n)}$. Assuming we can analytically continue to real-valued $n$, the entropy is given by:

$$S_R = (1 - n\,\partial_n) \ln Z^{(n)} \Big|_{n=1}. \qquad (1.3)$$

This tells us that local divergences in the entropy must be associated with the counterterms in the effective action of the QFT (at least if the UV regulator respects Eq. (1.3)) [37]. We will make free use of such constraints in what follows. In order for a term in the action to contribute, it must involve the Riemann curvature tensor, so that there is a nontrivial contribution coming from the tip of the conical singularity. (Away from the conical singularity, local divergences drop out due to being linear in $n$.) A further consequence is that the dependence of $S_R$ on the extrinsic curvature $K_{ab}$ of $\partial R$ is determined relative to other terms in the entropy functional [38,39,40].

4 Note that the finiteness of relative entropy occurs only for physically reasonable QFT states. Even in a system as simple as a single harmonic oscillator and taking $\sigma$ to be the thermal state, there exist normalizable states $\rho$ whose probability falls off sufficiently slowly with energy to make the relative entropy $S(\rho\,|\,\sigma)$ diverge. The relative entropy can also be infinite if there exist projections on which $\sigma$ has 0 probability, although when $\sigma$ is a vacuum state restricted to a compact region, the Reeh-Schlieder property tells us that no such projection operators exist.

5 Reviewed in [36]; cf. section A.1 of [23] for additional references.
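The replica relation can be checked numerically in a finite-dimensional toy model. The sketch below assumes the standard form $S = (1 - n\,\partial_n)\ln Z^{(n)}|_{n=1}$ with $Z^{(n)} = \mathrm{tr}(\tilde\rho^{\,n})$ for an unnormalized density matrix $\tilde\rho$; the weights, the finite-difference step, and the function names are illustrative choices, not taken from the text. It also confirms that adding a term linear in $n$ to $\ln Z^{(n)}$ leaves the entropy unchanged, which is the mechanism by which local divergences away from the conical singularity drop out.

```python
import math

def log_z(n, weights):
    """ln Z(n) = ln tr(rho_tilde^n) for an unnormalized diagonal rho_tilde."""
    return math.log(sum(w ** n for w in weights))

def replica_entropy(log_z_fn, eps=1e-5):
    """Evaluate (1 - n d/dn) ln Z(n) at n = 1 with a central finite difference."""
    derivative = (log_z_fn(1.0 + eps) - log_z_fn(1.0 - eps)) / (2.0 * eps)
    return log_z_fn(1.0) - derivative

weights = [3.0, 1.0, 0.5]  # hypothetical unnormalized eigenvalues
norm = sum(weights)
direct = -sum((w / norm) * math.log(w / norm) for w in weights)

from_replica = replica_entropy(lambda n: log_z(n, weights))
# A contribution to ln Z linear in n (here 7 n) should not affect the entropy.
with_linear_term = replica_entropy(lambda n: log_z(n, weights) + 7.0 * n)
print(direct, from_replica, with_linear_term)
```

All three values agree to within the finite-difference error.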
We begin with perturbations around free theories in section 2. Here our classification is performed simultaneously with the construction of examples exhibiting state-dependent divergences. Section 3 then proceeds to classify possible state-dependent divergences at leading order in large N for marginal or relevant deformations of d ≤ 6 holographic conformal theories; i.e., those with dual descriptions in terms of the classical gravitational dynamics of asymptotically AdS bulk spacetimes. In this case we save the construction of examples for separate treatment in section 3.1. These examples are constructed in a bottom-up manner on the gravitational side of the duality. In section 4 we explain how quantities such as the mutual information, generalized entropy, and relative entropy can remain finite even when the entanglement entropy has state-dependent divergences. We close with some final discussion in section 5.

Perturbatively Renormalizable Theories
Implementing the replica trick perturbatively involves evaluating Feynman diagrams on the replica manifold $M^{(n)}$. Assuming the spacetime metric to be smooth, obtaining a state-dependent divergence from a path integral requires a Feynman diagram with three properties:

a) it contains at least one loop (to make it divergent);
b) it has external legs ending on the entangling surface (to make it state-dependent);
c) it renormalizes a term in the action involving curvature (in order to contribute to the replica trick).

In the case of free field theory, since there are no nontrivial vertices, a connected Feynman diagram cannot have both loops and loose ends. So state-dependent divergences are forbidden in regular states of a free theory.$^6$ On the other hand, it is easy to generate such diagrams in interacting theories. For example, in a $\phi^4$ theory with $d = 4$, heat kernel methods give a logarithmically divergent counterterm in the action proportional to the integral of $\phi^2 R$. Such curvature couplings are well known to contribute an entropy term proportional to the integral of $\phi^2$, here with logarithmically divergent coefficient. A similar result may also be obtained by noting the mass-dependence of the logarithmic entropy divergence found in [44], and that linearizing the $\phi^4$ term about states with non-zero expectation values of $\phi^2$ generally shifts the effective mass by an amount that depends on the choice of such a state.$^7$ See Fig. 1 for an explanation of this state-dependent divergence in terms of Feynman diagrams.

6 Cooperman and Luty [41] claim to have found states with state-dependent divergences in free field theories. However, these states were constructed by a path integral on a Euclidean manifold $M'$ which differed from (the Wick rotation of) the manifold $M$ on which the states were defined to live, and in particular where $M'$ and $M$ do not match smoothly. States generated from this construction are in general not guaranteed to be regular states; for example, they need not obey the Hadamard condition [42,43].

7 We thank Vladimir Rosenhaus for suggesting this point of view.

Figure 1: A Euclidean spacetime manifold which is shaped like a cone of angle $\beta = 2\pi n$ near the entangling surface. (In the case where there is rotational symmetry, one can easily analytically continue to noninteger $n$ values, and for ease of visualization we are showing the case where $n$ is slightly less than 1 so that a small angle is cut out of the plane normal to $R$; however similar diagrams exist for integer $n$.) Here the central dot is $\partial R$ and we are suppressing all but 2 dimensions. Above are shown position-space Feynman diagrams which provide contributions to $\ln Z$. a) A divergent diagram which contributes to the entanglement entropy $S_R$, which is state-independent because there are no external lines. Its counterterm is a purely geometric functional of the boundary $\partial R$. b) In a nonminimally-coupled theory with a $\phi^2 R$ term, there is a contact interaction where the conical singularity couples directly to $\phi^2$. This diagram provides a state-dependent finite contribution to the boundary Wald entropy term $S_{\partial R}$. (There is an associated state-independent divergence in the Wald entropy if the two external lines are contracted into a loop.) c) Here at last is a state-dependent divergence, which involves a single quartic interaction and two external lines. This is a divergence in $S_R$, but the counterterm involves contracting the loop to a point to produce diagram (b), resulting in a nontrivial RG flow of the nonminimal coupling term. Thus the renormalized sum $S_{gen} = S_R + S_{\partial R}$ is finite.

Classification of Divergences
We now classify all possible state-dependent divergences in perturbatively (superficially) renormalizable field theories in every dimension. We emphasize that we treat these theories in a perturbative manner, so we allow unstable theories, and theories with Landau poles. All divergences are either power laws or log divergences.$^8$ We allow covariant (constant scalar) sources only, though the analysis does not significantly change if one also allows coupling to a background gauge field.

As in the example above, state-dependent divergences correspond to renormalizations of nonminimal coupling terms such as $\xi\phi^2 R$. However, we do not consider the effects of bare nonminimal coupling terms added directly to the action, i.e. we do not calculate state-dependent divergences in the entropy that are themselves proportional to powers of $\xi$. These terms produce a nontrivial Wald entropy term associated with the black hole entropy, arising from the conical singularity in $R$ at the entangling surface, e.g. in this case a term proportional to $-\xi\langle\phi^2\rangle$. Such terms in the Wald entropy may have divergences which renormalize other terms in the Wald entropy [45], but are not normally considered to have a statistical interpretation from the perspective of the field theory. For example, on a flat spacetime, $-\mathrm{tr}(\rho \ln \rho)$ should be the same regardless of the value of $\xi$. On a curved spacetime, the statistical entropy may be affected, but one would expect this dependence to have the same form as from the corresponding position dependent mass term $m^2\phi^2$ where $m^2 = R$.$^9$

We start by considering all renormalizable scalar field theories in the range $2 \le d \le 6$ and analyzing the state-dependent divergences that may appear in the entanglement entropy. Our results are summarized in the following chart: The first column DIM indicates the spacetime dimension. (As is well known, there are no perturbatively renormalizable interactions in $d > 6$.)
The MARGINAL column indicates the number of powers of $\phi$ in the action which are necessary to make a marginally renormalizable interaction term, with the weight of the term being of course equal to the dimension. In order to count as an interaction term, it must depend on more than two powers of the field $\phi$. For $d = 2$, the marginal interactions depend on derivatives of the scalar field, and take the form of a nonlinear sigma model interaction. In $d = 5$ no marginal term is possible. These marginal interactions can be used to construct the most divergent possible Feynman diagrams that appear in the effective action.$^{10}$

The BEST ODD column indicates the highest weight renormalizable term with an odd number of scalar fields in it. This is important because no Feynman diagram can have an odd number of external legs, unless it contains at least one vertex with an odd number of legs. Thus, in order to obtain a $\langle\phi\rangle$-dependent divergence in $d = 3, 4, 5$, we must include a relevant vertex in the Feynman diagram. We have written the relevant source term which turns on the interaction as $\lambda$, and put the dimension of $\lambda$ in parentheses next to the interaction term.

In $d = 2, 6$ there are already terms with an odd number of scalar fields among the marginal couplings; in this case we have copied the term from the previous column into this one. In the case $d = 2$, we require $f$ to have a piece which is odd under $\phi \to -\phi$ in order to get an odd state-dependent divergence.
STATE DEPEND indicates which expectation values of the scalar field may appear as coefficients in an entropy divergence. (In the case d = 2, the state-dependent divergence is written as g(φ) to emphasize that this is not the same function f (φ) which appears in the action.) The quantity in parentheses represents the weight of the scalar object listed in the entropy, including any relevant source terms. In order for there to be a divergence, one needs to find a divergent term in the effective action proportional to the quantity indicated times R, the Ricci scalar. Upon performing the replica trick, the R drops out, subtracting 2 from the weight. Hence in order to be a divergence, the weight of the terms listed must be no more than d − 2.
Finally, DIVERGE indicates the maximum degree of divergence of the expressions in the previous column. This is calculated by subtracting the weight of the state-dependent divergence from d − 2.
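The power counting behind the MARGINAL and DIVERGE columns can be sketched as follows. This is a minimal illustration, assuming the canonical scalar weight $(d-2)/2$ and the rule stated above that the degree of divergence is $(d-2)$ minus the weight of the state-dependent coefficient; the function names are hypothetical and the sketch does not reproduce the chart itself.

```python
from fractions import Fraction

def scalar_weight(d):
    """Canonical weight of a scalar field in d spacetime dimensions."""
    return Fraction(d - 2, 2)

def marginal_power(d):
    """Number k of undifferentiated fields with k * weight = d, or None if no such integer."""
    w = scalar_weight(d)
    if w == 0:
        return None  # d = 2: the field is weightless; marginal terms involve derivatives
    k = Fraction(d) / w
    return int(k) if k.denominator == 1 else None

def degree_of_divergence(d, weight):
    """(d - 2) - weight: positive is a power law, 0 a logarithm, negative no divergence."""
    return Fraction(d - 2) - weight

for d in range(3, 7):
    print(d, marginal_power(d))

# <phi^2> in d = 4 has weight 2: a logarithm, as in the phi^4 example of section 2.
print(degree_of_divergence(4, 2 * scalar_weight(4)))
# <phi> in d = 6 has weight 2: a quadratic divergence.
print(degree_of_divergence(6, scalar_weight(6)))
```

The loop finds marginal powers 6, 4 and 3 in $d = 3, 4, 6$ and none in $d = 5$, in line with the statement that no marginal term is possible there.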
In the case of $d = 6$, in addition to the quadratic $\langle\phi\rangle$-divergence, it is also possible to have a log divergence, by replacing the X in the expression Xϕ with any of five possible weight 2 items, listed below the chart. Here $g^{ab}_\perp$ is the inverse metric normal to the entangling surface, and $i, j$ are indices restricted to the entangling surface. These terms come from the renormalization of the following terms in the action respectively: and we have used the work of [38,39,40] to determine the extrinsic curvature dependence of the entropy for the final two cases. A sixth candidate term $\nabla^2\phi R$, which produces a $\nabla^2\phi$ divergence, is not listed either here or above, since it can be related to the other terms by means of the field equation for $\phi$ (or equivalently, by a field redefinition of the action). A seventh candidate term $\nabla_a\nabla_b\phi\, G^{ab}$ is a total derivative as a result of the Bianchi identity, and hence should not contribute.$^{11}$

For $d \le 4$, there also exist perturbatively renormalizable interactions involving spinor or vector fields. These add additional possible marginal and relevant interactions. However, it turns out that they do not add new kinds of state-dependent divergences, for the following reasons: In $d = 2$ any state-dependence in the entropy must be weight 0. The only possible terms with weight 0 are functions of the scalar field $\phi$, which are already allowed in pure scalar field theories. Allowing additional marginal terms in the action may change the coefficients of the state-dependent divergences, but not the allowed kinds of divergences.
11 Its Wald entropy [32], obtained by differentiating with respect to the Riemann curvature, is proportional to $g^{ij}\nabla_i\nabla_j\phi$, with $i, j$ restricted to the four-dimensional entangling surface. However, due to the ambiguities in the Noether charge approach, this formula is only valid for stationary entangling surfaces [33,34]. Presumably once the extrinsic curvature terms are properly calculated à la [38,39,40], one finds that it is actually $g^{ij} D_i D_j \phi$, where $D_a$ is the covariant derivative intrinsic to the geometry of the entangling surface. This is a total derivative, and thus does not contribute to the entropy of a compact entangling surface.
Since a spinor bilinear ψ 1 ψ 2 is weight d − 1, it cannot appear as a coefficient of a divergence. So we can only consider diagrams where spinors are internal lines. Such interactions do not introduce any qualitatively new kinds of state dependent divergences. Spinor interactions may add new kinds of relevant source terms, but they do not change the set of field operators which can appear in state-dependent divergences.
One might have thought that in d = 3, 4, the Yukawa couplings might help by producing diagrams with an odd number of external ϕ lines. But in d = 3, the Yukawa coupling has dimension 2½, so it is no better than ϕ 5 , while a marginal ϕ 2 ψ 2 coupling produces an even number of scalars. While in d = 4, the Yukawa couplings (and all other marginal couplings) preserve an accidental Z 2 symmetry that counts the number of scalars plus lefthanded fermions, mod 2; this prevents any new kinds of state-dependent divergences from existing. 12 A minimally coupled gauge boson also does not change anything of significance. In d = 4, on dimensional grounds one might have expected divergences proportional to either the electric flux F ab ab or the magnetic flux * F ab ab , in C-violating theories such as the Standard Model. But in fact these terms are ruled out by covariance, since there is no way to contract one copy of F ab with the Riemann tensor to build a viable term in the action. Furthermore, F ab has dimension d/2, and thus cannot appear directly in perturbatively renormalizable interactions with scalars or spinors.
A massive vector boson V a is also not useful, since the longitudinal modes of V a also have weight d/2 for purposes of renormalization theory. In d = 3, 4, the only new perturbatively renormalizable terms are the Proca mass V a V a (which is not useful), and the mixed propagator V a ∇ a ϕ (which can be removed by a field redefinition).
Interactions involving higher spin fields are necessarily nonrenormalizable in d > 2, so we conclude that the table above gives a complete list of the possible state-dependent divergences.

Holographic Theories
Having classified all state-dependent entropy divergences that can arise in perturbation theory around a free fixed point, it is natural to ask about more general theories where the interactions may be strong. Consider, for example, covariant relevant and marginal deformations of unitary conformal field theories with asymptotically AdS gravity duals. We will work at the level of classical bulk physics, or equivalently at leading order in a limit where an appropriate integer N labelling the holographic field theory has been taken to be large.
As discussed in the introduction, entropy divergences are constrained only by their connection to action divergences via (1.3). In the holographic context, this point was recently emphasized by [46] in connection with the Lewkowycz-Maldacena argument [47] for the Ryu-Takayanagi (RT) [10] and covariant Hubeny-Rangamani-Takayanagi (HRT) [12] entropies. This in particular means that the degree of divergence will be $d - \Delta$, where $\Delta \le d$ is the weight of the corresponding term in the action. From (1.3), action terms that are algebraic in the metric give no contribution to the entropy. As a result, we will show below that nontrivial contributions arise only from terms containing two or more derivatives of the metric. With this restriction, we will see that all possible action counterterms take the form (3.1).

For our purposes, the key feature of large $N$ holographic theories is that they admit a description via a weakly-coupled bulk path integral, inside which the dimensions of composite operators are given by simply adding the dimensions of their component 'elementary' operators and sources. In this context, let us use the term 'operator' below to refer only to objects for which the dependence on the background metric is at most algebraic when the elementary operators are held fixed. With this understanding, boundary terms in the bulk gravitational action contribute to (1.3) only when they involve explicit dependence on derivatives of the metric. Since we in any case integrate over all values of the elementary operators, there is no effect from any implicit dependence of these operators on the background metric.$^{13}$

The only other property of holographic theories used below is that while unitarity generally requires the dimension of scalar operators $O$ to satisfy only $\Delta_O \ge (d-2)/2$, in ghost-free holographic theories they in fact satisfy the strict inequality $\Delta_O > (d-2)/2$ [48]. This is to be expected as unitarity generally allows $\Delta_O = (d-2)/2$ only when the correlators of $O$ are those of a free field.

Now, since we require at least one derivative of the metric, the operator in our action term can have dimension at most $d-1$. But the unitarity bounds (see e.g. [49] and references therein) require any operator with dimension $d-1$ or less to be a scalar, a derivative of a scalar, an antisymmetric tensor, or a conserved vector (satisfying $\nabla_a j^a = 0$). We may ignore the antisymmetric tensors as they cannot be combined with derivatives or powers of the Riemann tensor to build a covariant term. And since we forbid vector sources, any conserved vector operator can appear only through $\nabla_a j^a = 0$.
This leaves us with terms that involve only scalar operators O. Integrating by parts allows us to remove derivatives from O, so it suffices to consider only terms given by multiplying such a scalar O by a scalar Φ built from the metric. The possible such terms are then classified by scalars Φ of weight d − ∆ O or less 14 .
Since $\Delta_O > (d-2)/2$, the weight of $\Phi$ is strictly less than $(d+2)/2 \le 4$, so the scalar $\Phi$ can contain at most three derivatives. This in particular forbids divergences associated with terms that the table in section 2.1 would describe as having $\ln \Lambda$ divergences for $d = 6$. Covariance requires derivatives to occur in pairs, so only one pair can be present. In this case the only allowed action counterterm that can affect the entropy is just (3.1) as claimed above.
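The derivative counting in this paragraph can be made explicit with a short sketch. It assumes only the two inputs stated in the text, namely that $\Phi$ has weight at most $d - \Delta_O$ and that $\Delta_O > (d-2)/2$ strictly; the function names are hypothetical.

```python
import math
from fractions import Fraction

def max_metric_derivatives(d):
    """Largest integer weight strictly below d - (d - 2)/2 = (d + 2)/2."""
    bound = Fraction(d + 2, 2)  # strict upper bound on the weight of Phi
    return math.ceil(bound) - 1

def max_derivative_pairs(d):
    """Covariance pairs up derivatives, so only complete pairs count."""
    return max_metric_derivatives(d) // 2

for d in range(3, 7):
    print(d, max_metric_derivatives(d), max_derivative_pairs(d))
```

For $d = 6$ this gives at most three derivatives and hence a single pair, which is the counting used above to exclude the $d = 6$ logarithmic terms of section 2.1.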

State-dependent RT divergences
We suggested above that action divergences of the form (3.1) should be generic in leading-order large $N$ holographic theories when $\Delta_O \le d - 2$ in the presence of low dimension sources, and that they should be accompanied by state-dependent divergences in the RT and HRT entropies. Although the literature contains statements [50] that such divergences do not in fact occur, we now describe bottom-up examples where they do, and which support our claim that they are generic. We will analyze the entropy divergences directly, though a similar computation may of course be performed at the level of the action. As above, we will work at the level of classical bulk physics, or equivalently at leading order in a limit where an appropriate integer $N$ labelling the holographic field theory has been taken to be large.
It is sufficient to take the bulk dual to consist of $(d+1)$-dimensional Einstein-Hilbert gravity with negative cosmological constant coupled to two scalar fields $\phi$, $\chi$. We will study solutions locally asymptotic to AdS$_{d+1}$, suppressing discussion of any possible compact factor $X$ in the bulk spacetime (though in a top-down model the scalars $\phi$, $\chi$ may in fact arise from Kaluza-Klein reduction on $X$). We take the bulk scalar action to be of the standard second-derivative form with scalar potential where . . . indicates terms of at least third order in $\phi$. We assume that $g(\chi)$ admits a power series expansion about $\chi = 0$, whose first non-trivial terms are $g(\chi) = \alpha\chi^n + \alpha\chi^{\tilde n} + \dots$ [48], we choose $\Delta_O > (d-2)/2$. To preserve the asymptotically AdS boundary conditions we require $\Delta_s \ge 0$. We will not need to turn on the source for $O$, and it turns out that we will be interested only in $\Delta_s < \frac{d+2}{4} \le d/2$, where the last inequality uses $d \ge 2$. For simplicity, we also require $\tilde n\Delta_s > \Delta_O$. Our bulk scalars then admit asymptotic expansions in terms of a Fefferman-Graham radial coordinate $z$ (see below). In the above, $+ \dots$ represents terms of higher order in $z$ than the last term explicitly displayed and $P_{x,y}$, $Q_{x,y}$ are scalar polynomials in $s$ and its derivatives in the QFT spacetime (i.e., derivatives along the boundary directions from the bulk point of view) which are homogeneous of order $x$ in derivatives and order $y$ in $s$. When $(n + 2(n-1)m)\Delta_s + 2k = \Delta_O$, the $P_{n+2m,k}(\nabla, s)$ term will also contain a factor of $\ln z$, but any other logs must be a part of the higher order terms indicated by $+ \dots$. See e.g. [51,52] for the details of an analogous computation and e.g. [53] (as well as the earlier work [54,55,56,57]) for a discussion of the normalization of the coefficient of $O$ in (3.5). For our purposes, the important coefficients and polynomials are and .
(3.7) The RT and HRT conjectures state that the (divergent) entropy of the field theory restricted to a region of its spacetime is given by the area of an appropriate bulk extremal (d − 1) surface anchored to the asymptotically locally AdS boundary on some (d − 2) surface. Such divergences are dictated by the asymptotic expansion of the bulk metric g AB , which is usefully expressed in the Fefferman-Graham gauge in terms of bulk coordinates X A = (z, x a ) as with lim z→0 g ab (z) giving the metric g (0) ab of the spacetime with coordinates x a carrying the d-dimensional holographic QFT. Indeed, [58] shows that HRT divergences are determined by the terms in (3.8) of order z d−2 or larger as z → 0. 15 This is particularly clear in contexts with sufficient symmetry to guarantee the extremal surface to be described by fixing the values of two of the x a coordinates. The divergences are then given immediately by integrating the associated induced metric over z and the remaining x a , so that terms of order z (d−2) in (3.8) induce logarithmic divergences. By our general arguments above, at least for 2 ≤ d ≤ 6 such highly symmetric cases are sufficient to determine the coefficient of all allowed divergences.
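The statement that terms of order $z^{d-2}$ induce logarithmic divergences follows from elementary power counting, sketched below. The sketch assumes the standard Fefferman-Graham form $ds^2 = (dz^2 + g_{ab}(z)\,dx^a dx^b)/z^2$ (an assumption, since the displayed equation is not reproduced here), so that the area element of a surface extending along $z$ and $d-2$ boundary directions scales as $z^{-(d-1)}$; a term of relative order $z^p$ in $g_{ab}(z)$ then contributes $\int_\epsilon dz\, z^{p-(d-1)}$. The function name and the choice of upper limit are hypothetical.

```python
import math

def cutoff_integral(p, d, eps, z_max=1.0):
    """Closed form of the integral of z**(p - (d - 1)) from eps to z_max."""
    exponent = p - (d - 1)
    if exponent == -1:
        return math.log(z_max / eps)  # p = d - 2: logarithmic in the cutoff
    return (z_max ** (exponent + 1) - eps ** (exponent + 1)) / (exponent + 1)

d = 4
for eps in (1e-2, 1e-4):
    leading = cutoff_integral(0, d, eps)        # power-law (area-law) growth
    marginal = cutoff_integral(d - 2, d, eps)   # logarithmic growth
    finite = cutoff_integral(d, d, eps)         # stays bounded as eps -> 0
    print(eps, leading, marginal, finite)
```

Terms with $p < d-2$ grow as $\epsilon^{-(d-2-p)}$, the $p = d-2$ term grows as $\ln(1/\epsilon)$, and higher orders remain finite as the cutoff is removed.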
The expansion of $g_{ab}(z)$ may be computed by iteratively solving the $(A, B) = (a, b)$ components of the bulk Einstein equation, where $G_{AB}$ is the bulk Einstein tensor and, since we have set the AdS length scale to 1, the factor $\Lambda = -\frac{d(d-1)}{2}$ is the bulk cosmological constant. On the right-hand side, $T^{\text{bulk matter}}_{AB}$ is the bulk matter stress tensor in terms of the bulk covariant derivative $D_A$. As described in [59], when one disentangles the iterative equation, it turns out to be the trace-reversed bulk stress tensor whose $ab$ components feed directly into the recursion relation for coefficients in the expansion of $g_{ab}(z)$. For $T^{\text{bulk matter}}_{AB} = 0$ the resulting expansion takes a familiar form that at order $z^{d-2}$ or less involves only even powers of $z$ and no logarithms, with coefficients dictated by the field theory's metric $g^{(0)}_{ab}$, (3.12) with $K$ and $W$ representing respectively the contributions from the terms involving $\partial_z$ and those algebraic in bulk fields. We find (3.14) where $\gamma$, $\sigma$ were given in (3.6), (3.7). For $\beta \neq 0$ this term yields a state-dependent divergence of order $(d-2) - n\Delta_s - \Delta_O$. Here order zero represents a logarithm and a negative result indicates no divergence. Since $\Delta_O$ can be close to $(d-2)/2$ and $\Delta_s$ can be close to zero, our class of models leaves ample room for non-negative degrees of divergence. It is clear that for $\Delta_s \neq 0$ there is generically no cancellation between the $K$ and $W$ terms in (3.12), and that the leading divergence is unchanged by adding additional terms to $V$, including $\phi$-independent terms proportional to $\chi^{\hat n}$ for $\hat n > 2$.
In contrast, our results (3.14) vanish quadratically near $\Delta_s = 0$. But since it becomes increasingly difficult to ignore additional terms in this regime, it remains an open question whether state-dependent divergences are allowed for vanishing $\Delta_s$. It would thus be interesting to explore the $\Delta_s = 0$ case further. This is particularly so as the prototypical holographic theory of 4d $\mathcal{N} = 4$ SU(N) super-Yang-Mills (dual to type IIB supergravity compactified to AdS$_5 \times S^5$) has the property [60] that the lowest dimension operator has $\Delta_O = 2$, so that any sources in a term of dimension $d - 2 = 2$ or less must have $\Delta_s = 0$. An analogous statement holds for eleven-dimensional supergravity compactified to AdS$_4 \times S^7$ (where $d = 3$ and the lightest scalar operator has $\Delta_O = 1$), though we have not surveyed more general top-down models of holographic theories.

Finiteness of Various Quantities
Although the entanglement entropy may have state-dependent divergences, there are several closely related quantities in which all divergences, state-dependent ones included, are expected to cancel: the mutual information, the generalized entropy, and the relative entropy.

Mutual Information
State-dependent divergences cannot afflict computations of the mutual information $I(A:B) = S(A) + S(B) - S(A \cup B)$ when the regions $A$ and $B$ are separated by a finite proper spatial distance. This is because all divergences cancel between the various boundary regions [21]. Note that the mutual information is a special case of the relative entropy: $I(A:B) = S(\rho_{A \cup B} \,|\, \rho_A \otimes \rho_B)$.

Generalized Entropy
On a similar note, even in the presence of our state-dependent divergences, coupling our quantum field theory to gravity yields a finite generalized entropy $S_{\rm gen} = S_{\rm BH} + S_{\rm outside}$, where $S_{\rm BH}$ is the entropy of the black hole, including any state-dependent counterterms. By the replica trick, this follows directly from finiteness of the renormalized partition function. (The renormalization procedure has been extensively studied in the literature; see the Appendix of [23] for a review and citations.) In cases where a QFT state makes a small gravitational perturbation to a Killing horizon, the generalized entropy on the causal horizon is given by $S_{\rm gen} = C - S(\rho \,|\, \sigma)$, where $\sigma$ is the associated Hartle-Hawking state and $C$ is an additive constant [29,61,62].
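The cancellation in the mutual information can be made explicit schematically. The sketch below assumes only that the divergent part of each entropy is a local integral over the corresponding entangling surface, as in Eq. (1.2); the symbols $S_{\rm div}$ and $X$ are illustrative shorthand, with $X$ allowed to depend on the state.

```latex
\begin{align}
S_{\rm div}(R) &= \int_{\partial R} X \, dA , \\
\partial(A \cup B) &= \partial A \cup \partial B
  \quad \text{(for $A$ and $B$ separated by a finite distance)} , \\
I_{\rm div}(A:B) &= \int_{\partial A} X \, dA + \int_{\partial B} X \, dA
  - \int_{\partial A \cup \partial B} X \, dA = 0 .
\end{align}
```

Because the integrand is local, it takes the same value on $\partial A$ whether that surface is viewed as the boundary of $A$ or as part of the boundary of $A \cup B$, so the cancellation does not care whether $X$ is state-dependent.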

Relative Entropy
The fact that the relative entropy $S(\rho \,|\, \sigma)$ is finite (for well-behaved states) should also be confirmable by replica trick calculations; we will now show this in cases where, for simplicity, the states $\rho$ and $\sigma$ both come from a path integral which is rotationally symmetric around the entangling surface $\partial R$. To get a divergence that depends on the state, we assume that some scalar $\Phi$ (e.g. $\phi$ or $\phi^2$) associated with the state-dependence differs between the two states at the entangling surface (due to some rotationally symmetric source or boundary condition), so that $S(\rho) - S(\sigma)$ is infinite.
Let us now consider the path integral formed by gluing together $r$ consecutive copies of the path integral used to define $\rho$, with $s$ consecutive copies of the path integral used to define $\sigma$, for a total angle deficit at the origin of $2\pi(1 - r - s)$. This path integral defines the partition function $Z(r, s) = \mathrm{tr}(\rho^r \sigma^s)$, where $\rho$ and $\sigma$ are not yet taken to be normalized. Since the whole setup is rotationally symmetric, we can allow $r$ and $s$ to take noninteger values and still retain a geometrical description.
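The expectation value in the state $\rho$ of the modular Hamiltonian $K = -\ln \sigma$ (with $\sigma$ normalized) is now given by
$$\langle K \rangle = -\partial_s \ln Z(r,s)\big|_{(1,0)} + \ln Z(0,1), \tag{4.1}$$
while the entropy (after normalization of $\rho$) is given by
$$S(\rho) = -\partial_r \ln Z(r,s)\big|_{(1,0)} + \ln Z(1,0). \tag{4.2}$$
Hence the relative entropy is
$$S(\rho \,|\, \sigma) = \langle K \rangle - S(\rho) = \partial_r \ln Z(r,s)\big|_{(1,0)} - \partial_s \ln Z(r,s)\big|_{(1,0)} + \ln Z(0,1) - \ln Z(1,0), \tag{4.3}$$
where the first two terms require differentiating with respect to a small conical angle deficit, while the last two terms are evaluated on the original smooth spacetime.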
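As a sanity check on the replica expression for the relative entropy, the following minimal sketch compares it with the direct definition for a pair of unnormalized $2 \times 2$ density matrices. It assumes the standard convention $K = -\ln\sigma$ and the formula $S(\rho\,|\,\sigma) = (\partial_r - \partial_s)\ln Z|_{(1,0)} + \ln Z(0,1) - \ln Z(1,0)$. The finite-dimensional toy model, the function names, and the parametrization by eigenvalues `p`, `q` and a relative rotation angle `theta` are illustrative assumptions, not part of the field-theory argument.

```python
import math


def overlaps(theta):
    """|<u_i|v_j>|^2 for two orthonormal 2d bases related by a rotation."""
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return [[c2, s2], [s2, c2]]


def Z(r, s, p, q, O):
    """Z(r, s) = tr(rho^r sigma^s) for unnormalized rho, sigma with
    positive eigenvalues p, q and eigenbasis overlaps O."""
    return sum(p[i] ** r * q[j] ** s * O[i][j]
               for i in range(2) for j in range(2))


def relative_entropy_replica(p, q, O, h=1e-5):
    """(d_r - d_s) ln Z at (1, 0) plus ln Z(0, 1) - ln Z(1, 0),
    with the derivatives taken by central finite differences."""
    def ln_z(r, s):
        return math.log(Z(r, s, p, q, O))
    d_r = (ln_z(1 + h, 0) - ln_z(1 - h, 0)) / (2 * h)
    d_s = (ln_z(1, h) - ln_z(1, -h)) / (2 * h)
    return d_r - d_s + ln_z(0, 1) - ln_z(1, 0)


def relative_entropy_direct(p, q, O):
    """tr(rho ln rho) - tr(rho ln sigma) for the normalized states."""
    ph = [x / sum(p) for x in p]
    qh = [x / sum(q) for x in q]
    t1 = sum(x * math.log(x) for x in ph)
    t2 = sum(ph[i] * math.log(qh[j]) * O[i][j]
             for i in range(2) for j in range(2))
    return t1 - t2


O = overlaps(0.4)
p, q = [3.0, 1.0], [2.0, 5.0]
print(relative_entropy_replica(p, q, O), relative_entropy_direct(p, q, O))
```

The two printed numbers should agree up to finite-difference error. In this toy model noninteger $r$ and $s$ pose no difficulty, mirroring the role played by rotational symmetry in the path integral.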
Let us assume that, in order to properly define the modular Hamiltonian $K$ above, all bulk divergences of $\ln Z$ not associated with the conical angle deficit have already been renormalized by absorption into bulk counterterms. We therefore restrict attention to the divergences which multiply the conical angle deficit $1 - r - s$, appearing in the first two terms. For example, these might correspond to state-dependent divergences which are absorbed by a nonminimal $\Phi R_{\rm sing}$ term, where $R_{\rm sing}$ is the singular part of the curvature associated with the conical singularity. It seems reasonable to suppose that the scalar quantity $\Phi$ at the singularity should itself be a smooth function $\Phi(r, s)$ of $r$ and $s$ near $(r, s) = (1, 0)$. Then, at first order, the state-dependent divergence in the effective action is given by (4.4), where the $\partial_r$ and $\partial_s$ derivatives in the Taylor expansion cannot act on $\Phi$ because $R_{\rm sing} = 0$ at $(r, s) = (1, 0)$. But combining the derivative terms from (4.3) with (4.4), the contributions proportional to $\Phi$ cancel between the $\partial_r$ and $\partial_s$ terms. Therefore there are no state-dependent divergences in the relative entropy between the two states $\rho$ and $\sigma$ defined by the path integrals above. The underlying reason is that the relative entropy (4.3) can be evaluated using only the nonsingular partition function at $r + s = 1$. However, state-dependent divergences may still be present if we consider $\Delta S$ or $\Delta K$ on their own. More generally, we may consider states $\rho$ and $\sigma$ defined by non-rotationally symmetric path integrals. Here we lose the geometrical interpretation, but we expect that, for purposes of analyzing the UV divergence structure near the entangling surface, a formally similar argument will still hold.
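The cancellation argued above can be sketched explicitly. The form below is an assumption made for illustration: it takes the state-dependent divergent part of $\ln Z$ to be the conical deficit times $\Phi(r, s)$ (understood as integrated over the entangling surface), with a hypothetical cutoff-dependent coefficient $c_\Lambda$ whose normalization is not fixed by the text.

```latex
\begin{align}
\ln Z_{\rm div}(r,s) &= c_\Lambda \, (1 - r - s) \, \Phi(r,s) , \\
\partial_r \ln Z_{\rm div}\big|_{(1,0)} &= -c_\Lambda \, \Phi(1,0)
  = \partial_s \ln Z_{\rm div}\big|_{(1,0)} , \\
(\partial_r - \partial_s) \ln Z_{\rm div}\big|_{(1,0)} &= 0 .
\end{align}
```

The terms in which a derivative acts on $\Phi$ are multiplied by $1 - r - s$, which vanishes at $(1, 0)$. Each derivative taken alone retains a piece proportional to $c_\Lambda \Phi(1,0)$, in line with the remark that $\Delta S$ or $\Delta K$ may individually diverge.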

Discussion
By working in both perturbative and holographic contexts, we have shown that the von Neumann entropy of a field theory in a region of spacetime can display a variety of state-dependent divergences. Each such divergence is associated with divergences in the (bare) partition function involving the Ricci or Riemann curvature of the background spacetime, with the possible such terms classified by the low-dimension scalar operators present in the theory. We have argued that the coefficients of such divergences are generically nonzero, but it remains possible that state-dependent divergences are forbidden in theories with exact conformal symmetry. We also remind the reader that our holographic examples were constructed by simply postulating a certain bulk Lagrangian. It thus remains to be shown that the required structure actually arises in models with known field theory duals.
The state-dependent logarithmic divergences are particularly interesting because, as usual, any compensating logarithmic counter-term in the action requires a choice of scale. As a result, such terms constitute a new type of conformal anomaly and provide a corresponding state-dependent contribution to the trace of the stress tensor.
Furthermore, a priori, there is no preferred choice of the scale in this counter-term. In curved space, we note that the choice of scale directly affects the correlation functions of the theory. But since the term is proportional to some power of Ricci or Riemann curvature, this is not so in flat spacetime. In that case, taking the correlators to define the QFT means that the choice of scale can have no physical effect. Nevertheless, changing the scale will shift the renormalized entropy $S_{\rm ren}$ by some finite amount.
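To illustrate the size of this shift, suppose (as an assumption, in the notation of Eq. (1.2)) that the logarithmic divergence is removed by subtracting it relative to an arbitrary reference scale $\mu$, with the power-law terms subtracted as well. The symbol $\mu$ and this particular subtraction scheme are illustrative choices.

```latex
\begin{align}
S_{\rm ren}(\mu) &= S_R - \sum_{n<d} \int_{\partial R} X^{(n)} \, dA \, \Lambda^{d-2-n}
  - \int_{\partial R} X^{(d)} \, dA \, \ln(\Lambda/\mu) , \\
S_{\rm ren}(\mu') - S_{\rm ren}(\mu) &= \ln(\mu'/\mu) \int_{\partial R} X^{(d)} \, dA .
\end{align}
```

The difference is finite and, when $X^{(d)}$ is state-dependent, so is the shift. This is why no single member of the family can be singled out without introducing a preferred scale.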
This suggests that quantum field theories generally admit only families of finite quantities that could be called the renormalized von Neumann entropy, with no preferred member of the family. The same issue has been raised and discussed many times before in the context of possible finite (i.e., non-divergent) curvature couplings (see e.g. [37,63,64,18,65,66,67,68,44,69,70,71]). In that context one might hope (as in some of the above references) that either minimal coupling or some other prescription will give rise to a preferred definition of von Neumann entropy, but in cases with a logarithmic divergence any such prescription must entail the introduction of a new preferred scale.
On the other hand, we have argued that certain universal quantities like relative entropy S(ρ | σ) should remain finite and independent of the above ambiguity. When σ is the Hartle-Hawking state associated with a bifurcate Killing horizon, this could also be derived by noting that the computations of boundary terms are essentially classical and then using the argument of [72] to show cancellation between "energy" and "entropy" contributions.
As a final comment, we note that more general state-dependent entropy divergences will typically arise in effective low-energy field theories, as these generally feature couplings or background fields with negative mass dimensions. In such contexts, the action counter-terms may contain arbitrary operators multiplied by any number of derivatives and powers of the curvature, leading to correspondingly complicated state-dependent divergences in the entropy. However, if the theory flows to a UV fixed point, these extra divergences will be regulated by the short-distance physics.