The second law of black hole mechanics in effective field theory

We investigate the second law of black hole mechanics in gravitational theories with higher derivative terms in the action. Wall has described a method for defining an entropy that satisfies the second law to linear order in perturbations around a stationary black hole. We show that this can be extended to define an entropy that satisfies the second law to quadratic order in perturbations, provided that one treats the higher derivative terms in the sense of effective field theory. We also address some outstanding issues with Wall’s method, in particular its gauge invariance and its relation to the Iyer-Wald entropy.


Introduction and overview

Introduction
The laws of black hole mechanics are a set of classical laws governing the behaviour of black holes. When combined with Hawking's discovery of black hole radiation, these laws provide compelling evidence for the interpretation of black holes as thermodynamic objects. These laws were first proved for Einstein gravity minimally coupled to matter. It is natural to ask whether they are also valid in extensions of Einstein gravity, for example in the presence of higher-derivative terms in the action that are expected from an effective field theory (EFT) perspective.
The first law of black hole mechanics concerns linear perturbations of a stationary (i.e. time-independent) black hole. Wald has shown that a version of the first law holds for any diffeomorphism invariant theory of gravity coupled to matter [1]. In particular, this leads to a definition of the entropy (the Wald entropy) of a stationary black hole in such a theory. It provides a fully satisfactory definition of the entropy of an equilibrium black hole in this very large class of theories.
The Wald entropy is defined unambiguously only for a stationary black hole. Iyer and Wald have made a proposal for the entropy of a dynamical (i.e. non-stationary) black hole solution of a general diffeomorphism-invariant theory [2]. The procedure is based on classifying possible terms according to their "boost weight", which determines how they transform under a constant rescaling of affine parameter of the horizon generators. The Iyer-Wald entropy is built from quantities of zero boost weight. Iyer and Wald showed that their definition is independent of any choice of coordinates or basis. However, they left open the question of whether or not it satisfies a second law.
Jacobson, Kang and Myers (JKM) investigated black hole entropy for the class of theories for which the gravitational Lagrangian is a function of the Ricci scalar, so-called f(R) theories [3]. Such theories can be transformed into a conventional scalar-tensor theory using a field redefinition. Using this, JKM were able to define an entropy that satisfies a second law. Their entropy is proportional to the integral of f′(R) over a horizon cross-section. For a stationary black hole, the JKM entropy coincides with the Wald entropy. However, for a dynamical black hole, the JKM entropy differs from the Iyer-Wald entropy.
In general, one expects the Lagrangian of a gravitational EFT to include scalars built from contractions of the Riemann tensor and its derivatives. For such a theory, Wall has sketched a procedure for constructing an entropy that satisfies the second law to linear order in perturbations around a stationary black hole [4]. As we shall explain, the Wall procedure supplements the Iyer-Wald entropy with terms that are linear in quantities with positive boost weight. The simplest example of such a term is the integral over a horizon cross-section of K K̄, where K and K̄ are the expansions of the outgoing and ingoing null geodesic congruences orthogonal to the cross-section. We shall refer to the result of Wall's construction as the Iyer-Wald-Wall (IWW) entropy. For a stationary black hole it reduces to the Wald entropy.
To linear order, the second law does not imply an entropy increase, but only that the entropy does not change in time. To see why, assume that there exists a linear perturbation that leads to an entropy increase over time. Now multiply this perturbation by minus one. The result is a linear perturbation that decreases the entropy over time, in violation of the second law. Thus, to linear order, Wall's result implies that the IWW entropy can neither decrease nor increase but must remain constant in time.
For f(R) theories, the JKM and IWW entropies agree if f(R) is quadratic in R but not if cubic or higher order terms are present. This suggests that, except for special theories (e.g. 2-derivative Einstein gravity), the IWW entropy is unlikely to satisfy a second law beyond the linearized approximation. For f(R) theories, the JKM entropy involves adding terms to the IWW entropy that are of quadratic, or higher, order in quantities with positive boost weight. So, except for special theories, it seems likely that such higher order terms will need to be included if the second law is to be extended beyond the linearized approximation.
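The sign-flip argument for the linearized second law can be written out in two lines. This is only a sketch: the symbol h for the linearized perturbation and the notation δṠ[h] are introduced here for illustration and are not notation used elsewhere in the paper.

```latex
% Assumption: \delta\dot S[h] is the first-order change of dS/dv produced by
% a linearized perturbation h, so it is linear in h.
\begin{align}
  \delta\dot S[-h] &= -\,\delta\dot S[h]
      && \text{(linearity in } h\text{)} \\
  \delta\dot S[h] \ge 0 \ \text{ and } \ \delta\dot S[-h] \ge 0
      \;&\Longrightarrow\; \delta\dot S[h] = 0
      && \text{(second law applied to } h \text{ and to } -h\text{)}
\end{align}
```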
In this paper we will establish a second law beyond the linearized approximation for a large class of theories. Our approach will be to treat these theories as EFTs, ordering terms in the equations of motion according to how many derivatives they contain. The lowest order terms are the 2-derivative terms familiar from conventional Einstein theory. These are then supplemented by terms with 4 derivatives, then 6 derivatives and so on. We assume that the coupling constants multiplying these terms are all powers of some fundamental (UV) length scale ℓ. We restrict attention to solutions lying within the regime of validity of EFT. Roughly this means that if a solution varies over a length (or time) scale L then ℓ/L ≪ 1. One then expects terms with many derivatives to be less important than terms with few derivatives.
Consider truncating such a theory by retaining only terms with N, or fewer, derivatives. Rather than attempting to prove a second law that holds exactly in the truncated theory, our approach will be to prove a second law that holds to the same expected accuracy as the theory itself, i.e., up to neglect of terms with more than N derivatives. We will show that, for any N, one can define an entropy that satisfies the second law to quadratic order in perturbations around a stationary black hole, in the sense that any violation of the second law will be of the same order (in ℓ/L) as the terms with more than N derivatives that have been neglected in truncating the theory. By increasing N one improves the result, so the better one knows the EFT, the better the accuracy to which the second law is satisfied.
Our entropy is defined by adding new terms to the IWW entropy. The new terms are of quadratic (or higher) order in quantities with positive boost weight. The simplest example is the integral over a horizon cross-section of (K K̄)². By counting derivatives, one can see that the new terms are required only once N ≥ 6. In particular, if N = 4 then our result implies that the IWW entropy satisfies the second law to quadratic order in perturbations, in the EFT sense described above.
An important question concerning the entropy is its gauge-invariance. The Iyer-Wald entropy of a horizon cross-section is independent of any choice of coordinates: it depends only on the local geometry of the cross-section. Wall's procedure, which our approach extends, is based on a fixed Gaussian null coordinate system. Such a coordinate system has gauge freedom corresponding to a rescaling of the affine parameter along each horizon generator. The terms generated by Wall's procedure are not manifestly invariant if this rescaling differs from generator to generator. Nevertheless, we will prove below that the IWW entropy is invariant to linear order in perturbations around a stationary black hole. Thus the IWW entropy is gauge invariant to the same extent that it satisfies the second law, namely to linear order in perturbations. In fact, with quite a lot more work, we shall prove that the freedom to adjust higher order terms in the IWW entropy can be used to bring it to a form that is gauge invariant in a fully nonlinear sense. It is natural to guess that our extension of the IWW entropy is gauge invariant at least to quadratic order in perturbations, in the sense of EFT. We have not yet been able to prove this and we will discuss this issue further below. This is not an issue for N = 4 since our entropy reduces to the IWW entropy in this case.
In the rest of this section we shall explain our main results in more precise terms but omitting the proofs. Then section 2 will prove various technical results used in later sections. Section 3 will present proofs of our results concerning the IWW entropy. Section 4 presents our construction of the improved entropy that satisfies the second law to quadratic order.

Summary of conventions
The metric signature is (− + + · · · +), and the conventions for the curvature tensors are as in Wald's text [5]. Bold face letters denote differential forms, with the conventions for ∧ and d as in [5]. The spacetime dimension is n > 2. Lower case Greek letters µ, ν, . . . are spacetime coordinate indices, upper case Roman letters A, B, . . . are coordinate indices on an (n − 2) dimensional spacelike cut C associated with the Gaussian null coordinates defined in section 1.3. (µ_1 . . . µ_s) resp. [µ_1 . . . µ_a] denotes a total symmetrization resp. anti-symmetrization with combinatorial factors included to make these operations projections onto the space of symmetric resp. anti-symmetric tensors.
The notation L[Ψ] (square brackets) indicates that L is a local functional of some fields Ψ, i.e. at each point x, L|_x depends on finitely many derivatives ∂_{µ_1} . . . ∂_{µ_d} Ψ|_x. We generally set 16πG = 1 and ℓ denotes a UV length scale.

Gaussian null coordinates
Consider a spacetime of dimension n containing a smooth null hypersurface N that is ruled by affinely parameterized null geodesics with future-directed tangent vector l^µ. We assume that the generators are future-complete, i.e., they extend to infinite affine parameter to the future. This hypersurface might be the event horizon of a black hole but some of our analysis applies more generally. Note that the smoothness assumption excludes important physical situations such as black hole mergers. However, it includes spacetimes describing a black hole "settling down to equilibrium", which is the physical situation we have in mind.
Assume that every null geodesic generator of N intersects a spacelike cross section C precisely once. We can introduce Gaussian Null Coordinates (GNCs) in a neighbourhood of N as follows. On N we let v be an affine parameter along each null generator such that v = 0 on C and such that l^µ ∂_µ v = 1. Then we transport C by affine time v into cross sections C(v), thereby obtaining a null foliation of N. On each C(v), a (past-directed) null vector field n = n^µ ∂_µ is next defined uniquely by demanding that it is orthogonal to C(v) and g_µν l^µ n^ν = 1. We consider the affinely parameterized null geodesics tangent to n and call the affine parameter r, with r = 0 on N. Finally, we choose on C a coordinate chart x^A, A = 1, . . . , n − 2. Then we transport this first along the geodesics tangent to l along N and then along n off N at each fixed value of v. It can be shown that the metric and vector fields n, l take the form

$$g = 2\,dv\,dr - r^2\alpha\,dv^2 - 2r\beta_A\,dv\,dx^A + \mu_{AB}\,dx^A dx^B, \qquad l = \partial_v, \qquad n = \partial_r \tag{1}$$

in the GNCs (v, r, x^A), where α, β_A and µ_AB are, at least for small r, smooth functions of the coordinates. N is the surface r = 0 and C is the surface v = r = 0. We denote the inverse of µ_AB as µ^AB, and A, B, . . . indices will be raised and lowered with µ^AB and µ_AB. The covariant derivative on C(v) defined by µ_AB is denoted D_A. We also define

$$K_{AB} \equiv \tfrac12\,\partial_v \mu_{AB}, \qquad \bar K_{AB} \equiv \tfrac12\,\partial_r \mu_{AB}, \qquad K \equiv \mu^{AB}K_{AB}, \qquad \bar K \equiv \mu^{AB}\bar K_{AB}.$$

On N, K is the expansion of the null geodesics tangent to l. Similarly, K̄ is the expansion of the (past-directed) null geodesics tangent to n. The tracefree parts of K_AB and K̄_AB define the shear of these families of geodesics.
The definition of GNCs is not unique. If one fixes the initial cut C then one still has the freedom to rescale the affine parameter along each generator of N: if a(x^A) > 0 then v′ ≡ v/a is also an affine parameter along the generators. The corresponding tangent vector to these generators is l′ = al.
This change of affine parameter leads to a new set of GNCs (v′, r′, x′^A). If a is constant then we have simply

$$v' = v/a, \qquad r' = a\,r, \qquad x'^A = x^A.$$

If a quantity X transforms homogeneously under such a change of coordinates then we define its boost weight b by

$$X' = a^{b}\,X$$

(see Section 2.1 for a precise definition); for example α, β_A and µ_AB have boost weight zero, K_AB has boost weight +1 and K̄_AB has boost weight −1. A quantity D_{A_1} . . . D_{A_m} ∂_v^p ∂_r^q ψ with ψ ∈ {α, β_A, µ_AB} has boost weight p − q. The boost weight of a tensor component T^{µν...}_{ρσ...} is given by the sum of +1 for each subscript v or superscript r index, and −1 for each superscript v or subscript r index, e.g. R_vv has boost weight +2.
If a is not constant then the transformation between the GNCs is highly non-trivial away from N. This implies that many quantities (e.g. α, β_A, ∂_v ∂_r µ_AB) transform inhomogeneously, even on N, with terms involving the derivative of a.
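As a quick check of the boost-weight assignments quoted above, one can work through a constant rescaling explicitly. The sketch below assumes the convention X′ = a^b X for a quantity of boost weight b, the constant rescaling v′ = v/a, r′ = a r with x^A unchanged, and the definitions K_AB = (1/2)∂_v µ_AB and K̄_AB = (1/2)∂_r µ_AB; these are stated as assumptions because the corresponding displayed equations are not reproduced in this text.

```latex
% Assumptions: a > 0 constant; v' = v/a, r' = a r, x'^A = x^A;
% K_{AB} = (1/2)\partial_v\mu_{AB}, \bar K_{AB} = (1/2)\partial_r\mu_{AB};
% boost weight b defined by X' = a^b X.
\begin{align}
  \partial_{v'} &= a\,\partial_v, \qquad \partial_{r'} = a^{-1}\,\partial_r, \\
  K'_{AB} &= \tfrac12\,\partial_{v'}\mu_{AB} = a\,K_{AB}
      && (b = +1), \\
  \bar K'_{AB} &= \tfrac12\,\partial_{r'}\mu_{AB} = a^{-1}\,\bar K_{AB}
      && (b = -1), \\
  R_{v'v'} &= \frac{\partial x^\mu}{\partial v'}\frac{\partial x^\nu}{\partial v'}\,R_{\mu\nu}
            = a^{2}\,R_{vv}
      && (b = +2), \\
  (K\bar K)' &= (a\,K)(a^{-1}\bar K) = K\bar K
      && (b = 0).
\end{align}
```

The last line illustrates why a product such as K K̄ can appear in an entropy density of boost weight zero even though each factor separately has non-zero boost weight.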

Second law
We now review a simple proof of the second law in conventional GR. The proof is simple because it makes various strong assumptions, specifically that the horizon is smooth, the horizon generators are future-complete, and that the black hole "settles down to equilibrium" at late time. One can of course prove the second law in conventional GR under much weaker assumptions than these [6].
Let H be the future event horizon of a black hole and take N = H to define GNCs. The area of C(v) is

$$A(v) = \int_{C(v)} d^{n-2}x\,\sqrt{\mu}.$$

Recall Raychaudhuri's equation:

$$\partial_v K = -K_{AB}K^{AB} - R_{vv}.$$

We now assume that the expansion of the generators of H vanishes at late time, i.e., K → 0 as v → ∞ on H. This would be the case if the black hole is "settling down to equilibrium". We can now write

$$\dot A(v) = \int_{C(v)} d^{n-2}x\,\sqrt{\mu}\,K = \int_{C(v)} d^{n-2}x\,\sqrt{\mu(v,x)}\int_v^\infty dv'\,\big(K_{AB}K^{AB} + R_{vv}\big)(v',x) \tag{8}$$

where we used Raychaudhuri's equation, together with K → 0 as v → ∞, in the second step. If the spacetime satisfies the null convergence condition (R_µν V^µ V^ν ≥ 0 for any null V^µ) then the RHS is manifestly non-negative (as R_vv = R_µν l^µ l^ν ≥ 0) and so we have Ȧ(v) ≥ 0, i.e., A(v) is a non-decreasing function.
If we consider a theory consisting of conventional 2-derivative GR coupled to matter satisfying the null energy condition then spacetimes satisfying the Einstein equation will obey the null convergence condition and so the second law holds in such a theory. However, we will be interested in more general theories in which higher derivatives are present in the Lagrangian. In this case there is no reason to expect the null convergence condition to be satisfied by solutions of the equations of motion and so the above argument no longer applies.
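The step from the null energy condition to the null convergence condition can be made explicit. The following sketch assumes the normalization 16πG = 1 used in this paper, so that the 2-derivative Einstein equation with cosmological constant reads G_µν + Λ g_µν = (1/2) T_µν; V^µ denotes an arbitrary null vector.

```latex
% Assumptions: 16\pi G = 1, Einstein equation
% G_{\mu\nu} + \Lambda g_{\mu\nu} = \tfrac12 T_{\mu\nu}, and V^\mu null.
\begin{align}
  \big(R_{\mu\nu} - \tfrac12 R\,g_{\mu\nu} + \Lambda\,g_{\mu\nu}\big)V^\mu V^\nu
     &= R_{\mu\nu}V^\mu V^\nu
     && (g_{\mu\nu}V^\mu V^\nu = 0), \\
  R_{\mu\nu}V^\mu V^\nu
     &= \tfrac12\,T_{\mu\nu}V^\mu V^\nu \;\ge\; 0
     && \text{(null energy condition)}.
\end{align}
```

Both the trace term and the cosmological constant drop out on contraction with a null vector, which is why only the null energy condition is needed.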

Stationary black holes
Much of this paper will concern perturbations of stationary black holes. We will now describe the class of stationary black holes to be considered. In conventional GR coupled to various types of matter fields it is known that the event horizon of a stationary black hole must be a Killing horizon. This result has not been extended to theories of the type that we will be considering. We will simply assume that the theory admits a family F of stationary black hole solutions for which the event horizon is a Killing horizon. Moreover, we will assume that this is a bifurcate Killing horizon. This implies that the zeroth law of black hole mechanics is satisfied, i.e., that the surface gravity is constant on the horizon (conversely, the zeroth law implies that the spacetime can be extended to contain a bifurcation surface [8]).
These assumptions ensure that if H is the future event horizon of a black hole in F then all positive boost-weight quantities vanish on H. Furthermore, all non-zero boost-weight quantities vanish on the bifurcation surface.

Wall's procedure
Wall [4] has sketched an approach that, for any diffeomorphism invariant theory of gravity, produces an entropy S which satisfies the second law to linear order, i.e., δṠ = 0 for perturbations of a black hole in the family F. Wall's approach has been discussed in more detail in [11,12], where it has been reformulated in terms of an entropy current. To explain this, let √(−g) E^µν = δI/δg_µν, where I is the action, so the Einstein equation of our theory is E_µν = 0. Then, for a general null hypersurface N the entropy current is a vector field
where s v and s A are functions of the GNC components of the metric, and their derivatives, with boost-weights 0 and 1 respectively. It satisfies the identity where the ellipsis denotes terms that are of quadratic or higher order in quantities of positive boost weight. This is an off-shell identity, i.e., it holds independently of the equations of motion. To obtain the linearized second law, we consider this equation linearized around a member of F , taking N to be the event horizon H. Since positive boost weight quantities vanish on the event horizon of the unperturbed spacetime we obtain On-shell this becomes The quantity inside square brackets vanishes for the background solution and can be expected to decay as v → ∞ if the perturbed spacetime settles down to a stationary black hole belonging to F . Hence integrating the above equation w.r.t. v gives Equation (11) shows that if this condition holds on C then it holds on all of H. This equation can be regarded as a gauge condition on the metric perturbation. It arises because we have chosen our coordinates such that the perturbation does not change the location of H, i.e., the horizon of the perturbed black hole remains at r = 0. For conventional GR (where (12) is simply δK = 0) this was explained in [13].
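For orientation, the special case of conventional vacuum GR, where the linearized condition reduces to δK = 0 as noted above, can be sketched directly from Raychaudhuri's equation. The sketch assumes K_AB = (1/2)∂_v µ_AB, a background in which K_AB vanishes on H, and that the perturbed horizon remains at r = 0; it is meant only as an illustration of the general argument, not as a replacement for it.

```latex
% Assumptions: vacuum GR (with or without \Lambda), affinely parametrized
% generators, K_{AB} = 0 on H in the background, horizon kept at r = 0.
\begin{align}
  \partial_v K &= -K_{AB}K^{AB} - R_{vv}
      && \text{(Raychaudhuri)} \\
  \partial_v\,\delta K &= -\,\delta R_{vv}
      && \text{(linearized; } K_{AB}K^{AB} \text{ is second order)} \\
  \partial_v\,\delta K &= 0
      && \text{(on-shell, } \delta R_{vv} = 0 \text{ on } H\text{)} \\
  \delta K &= 0
      && (\delta K \to 0 \text{ as } v \to \infty) \\
  \delta\dot A &= \int_{C(v)} d^{n-2}x\,\sqrt{\mu}\;\delta K = 0
      && (\text{background } K = 0).
\end{align}
```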
Integrating (12) Thus this definition of entropy satisfies the second law to linear order in perturbations. Wall considered coupling the original theory to a matter source so that the Einstein equation becomes E_µν = (1/2)T_µν (units: 16πG = 1) where T_µν is the energy-momentum tensor of the matter. If one treats T_µν as a term of linear order then equation (11) has (1/2)T_µν on the RHS and if the matter obeys the null energy condition T_vv ≥ 0 then δṠ_IWW ≥ 0 so one can have a genuine increase in entropy driven by the matter source. However, we shall not include such a matter source below for several reasons: (i) if the matter fields are treated using the linearized approximation, as with the gravitational field, then T_µν is of quadratic, not linear, order; (ii) in EFT one does not expect a clear division of the Lagrangian into a "gravitational part" and a "matter part"; instead there will be higher derivative terms mixing matter and gravitational fields; (iii) higher derivative matter terms will not satisfy the null energy condition. We take the view that one should treat the gravitational and matter fields on an equal footing.
Section 3 of this paper will address some outstanding issues concerning Wall's approach. First, one needs to make sure that this definition of entropy also satisfies the first law of black hole mechanics, which relates δS evaluated on the bifurcation surface of the black hole to the perturbations in mass and angular momentum. Wall argues that this must be true as follows. His result for s^v can be divided into terms built entirely from quantities with vanishing boost weight and a part that is at least quadratic in quantities with non-vanishing boost weight. Wall claims that the integral of the former part is the same as the entropy S_IW defined by Iyer and Wald. From this claim it follows that δS_IWW = δS_IW on the bifurcation surface of a member of F and since S_IW is known to satisfy the first law, so must S_IWW. However, a proof of this claim is lacking. We shall present one in section 3.2.
Second, Wall's procedure is carried out using a particular set of GNCs and produces an expression for s^v that depends on the metric components, and their derivatives, w.r.t. those GNCs. These quantities transform in a complicated way under transformations between GNCs with non-constant a(x^A). Therefore it is very unclear whether the result of Wall's procedure must be gauge-invariant. (This has also been noted in [14].) The issue of gauge invariance is important if we wish to compare the entropy of two cuts C, C′ of H with C′ strictly to the future of C. The linearized second law described above lets us do this for the special case where C′ = C(v). So to compare the IWW entropy of C′ and C we must choose our GNCs such that C′ is a constant v cut of H. This can always be achieved by rescaling the affine parameters of the generators of H, i.e., by a choice of the function a(x^A) defined in section 1.3. But we do not want the definition of the entropy of C to depend on our choice of C′ (or vice versa). Hence we must demonstrate that this definition is gauge-invariant under changes of GNCs involving non-constant a(x^A).
In section 3.3, we shall give a short proof that the IWW entropy is gauge-invariant at the linearized level. However, in the class of examples studied by Wall [4] (a Lagrangian that is a function of the Riemann tensor) one can see that the result is actually gauge-invariant at the fully nonlinear level and one can ask whether this is true generally. In his analysis, Wall implicitly makes use of the fact that his procedure only determines terms in S_IWW that are of up to linear order in quantities with positive boost weight, since terms of quadratic (e.g. K² K̄²) or higher order vanish when linearized around a member of F. Wall makes a specific choice of these higher order terms in his explicit examples. The question is whether it is always possible to choose these higher order terms to render S_IWW fully gauge-invariant. We shall show in section 3.3 that the answer is yes.

Second law in EFT
We can now explain our main result, derived in section 4.1. For simplicity, we shall consider pure gravity, without matter fields. Consider an EFT for a diffeomorphism invariant theory of gravity. We assume that the Lagrangian is a formal sum of terms with increasing numbers of derivatives of the fields, multiplied by suitable powers of some UV length scale ℓ. A term with k + 2 derivatives will be multiplied by ℓ^k. For pure gravity (assuming parity symmetry if n is odd) only terms with even numbers of derivatives can occur. The terms with 2 or fewer derivatives are assumed to take the standard Einstein-Hilbert form, with a possible cosmological constant.
Validity of EFT requires that terms with increasing numbers of derivatives are increasingly less important. This is the case if we restrict ourselves to spacetimes varying over some length/time scale L satisfying ℓ/L ≪ 1. This L should be a lower bound for the size of the final black hole equilibrium state and any length/time scales associated with the perturbation away from equilibrium. More precisely, we assume that there exists L > 0 with ℓ/L ≪ 1 such that, in a neighbourhood of the event horizon, w.r.t. some set of GNCs, any quantity X_k involving k derivatives of the metric obeys a uniform bound of the form |X_k| ≤ C_k/L^k for some dimensionless constants C_k depending on the initial data. We assume that the cosmological constant satisfies |Λ|L² ≤ 1.
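The power counting behind this assumption can be sketched schematically. Here ∂^{k+2} g is shorthand, introduced only for this illustration, for any term with k + 2 derivatives of the metric, and its size is estimated with the uniform bound |X_k| ≤ C_k/L^k, dropping the dimensionless constants.

```latex
% Assumption: "\sim" means "bounded by, up to dimensionless constants C_k";
% \partial^{k+2} g is schematic notation for a term with k+2 derivatives.
\begin{align}
  \ell^{k}\,\partial^{k+2} g \;&\sim\; \frac{\ell^{k}}{L^{k+2}}, \\
  \frac{\ell^{k}\,\partial^{k+2} g}{\partial^{2} g}
     \;&\sim\; \frac{\ell^{k}/L^{k+2}}{1/L^{2}}
     \;=\; \Big(\frac{\ell}{L}\Big)^{k} \;\ll\; 1
     \qquad (k \ge 2).
\end{align}
```

Each additional pair of derivatives is therefore suppressed by a further factor of (ℓ/L)² relative to the 2-derivative Einstein terms.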
In practice, an EFT Lagrangian will not be known to all orders. We assume that only the terms with N or fewer derivatives are known explicitly. The EFT equation of motion is then

$$E_{\mu\nu} = O(\ell^N) \tag{14}$$

where E_µν = −Λg_µν − G_µν + . . . denotes the terms with up to N derivatives and the RHS represents the effects of the terms with N + 2 or more derivatives. We should really write this as O(ℓ^N/L^{N+2}) but we shall mostly suppress the L-dependence below.
Consider first the case N = 2, i.e., conventional GR viewed as an EFT. Without matter, the vv component of (14) gives R_vv = O(ℓ²) on H. Hence equation (8) is

$$\dot A(v) = \int_{C(v)} d^{n-2}x\,\sqrt{\mu(v,x)}\int_v^\infty dv'\,K_{AB}K^{AB}(v',x) + O(\ell^2).$$

The RHS may become negative but only by a small O(ℓ²) amount. From an EFT perspective the second law no longer holds exactly, but only to the same O(ℓ²) accuracy as the theory itself. If Ȧ(v_0) < 0 then the above equation implies (reinstating L) where || · ||_{L²(v_0)} denotes the norm defined by the above integral on the portion v ≥ v_0 of H. Hence if Ȧ(v_0) < 0 then the expansion and shear of the generators of H must be small for all times v ≥ v_0. This suggests that the black hole is close to equilibrium. Thus the process of relaxation to equilibrium seems the most likely situation in which the higher derivative EFT terms could cause a (small) decrease in horizon area.
Our main result is to show that, for general N, the following EFT generalization of (9) (evaluated on-shell) holds on a null hypersurface N: Here S^A = s^A, and S^v is defined by adding new terms to the expression for s^v arising from E_µν. These new terms are quadratic (or higher order) in terms with positive boost weight so they do not affect any of the results described in the previous section, i.e., such terms are not fixed by Wall's procedure. On the RHS, X_AB (which is symmetric) and Y^A are both O(ℓ²) and have boost weight 1, 2 respectively, and Y^A is a sum of terms that each contain at least two factors of positive boost weight. In contrast with (9), the above equation holds on-shell, i.e., its derivation makes use of most of the components of the Einstein equation (14). The O(ℓ^N) term arises not only from the O(ℓ^N) corrections on the RHS of (14) but also from various steps in the derivation. In other words, even if we worked with the exact equation E_µν = 0 we would still generate an O(ℓ^N) term. (In this regard the N = 2 case is exceptional.) To obtain a second law we take N to be the horizon H of a black hole and use S^v to define an entropy S as in (13): Equation (16) implies where again we assume that our black hole settles down to a member of F at late time, which implies that all quantities with positive boost weight vanish at late time, hence the (boost-weight 1) quantity in square brackets on the LHS of (16) vanishes as v → ∞. In contrast with the N = 2 case, we do not know anything about the sign of D_A Y^A on the RHS of (18) in a fully nonlinear situation. However, we can make progress by resorting to second order perturbation theory about a member of F.
Positive boost weight quantities vanish on the horizon of a member of F, which implies that S^v and s^v agree at zeroth and first order in perturbation theory. Since Y^A is at least quadratic in positive boost weight quantities we have We are working to quadratic order so on the RHS D_A and √µ are evaluated in the background spacetime, where they are independent of v. Hence we can interchange the order of integration and use the divergence theorem to see that the RHS vanishes. This leaves where we used the fact that K_AB and X_AB have boost weight one and hence vanish on N in the background spacetime. Since the first term on the RHS is positive definite, this shows that, to quadratic order in perturbations, Ṡ(v) can become negative only by an amount O(ℓ^N), i.e., of the same size as the unknown effects caused by our ignorance of higher order EFT terms. In particular, the better we know the EFT, the larger N is, and the smaller the amount by which Ṡ(v) can become negative.
Since our equations of motion involve N or fewer derivatives, it follows that S^v and S^A involve N − 2 or fewer derivatives. These quantities are sums of monomials where each monomial is at most quadratic in terms with positive boost weight. It is natural to group such monomials into those with zero, one or two (or more) factors with positive boost weight. The first set of terms generates the Iyer-Wald entropy, including the second set of terms gives the Iyer-Wald-Wall entropy, and including the final set of terms gives our generalization of this. A term with non-zero boost weight b costs at least |b| derivatives, so a term quadratic in positive boost weight quantities, with overall boost weight zero, must contain at least 4 derivatives and therefore occurs only for N ≥ 6. Hence for N = 4 we have S^v = s^v and S^A = s^A so in this case our result implies that, without further modification, the Iyer-Wald-Wall entropy satisfies the second law to quadratic order in perturbations around a member of F.
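The derivative counting that leads to N ≥ 6 can be illustrated with the simplest new term, (K K̄)². The sketch assumes that K and K̄ each contain one derivative of the metric, and uses the statement above that S^v contains at most N − 2 derivatives; the symbol #∂ (number of derivatives) is introduced here for illustration only.

```latex
% Assumptions: K and \bar K each contain one derivative of the metric
% (boost weights +1 and -1); S^v contains at most N-2 derivatives.
\begin{align}
  \#\partial\big[(K\bar K)^2\big]
     &= 2\,\#\partial[K] + 2\,\#\partial[\bar K] = 4, \\
  4 \;\le\; N - 2 \;&\Longleftrightarrow\; N \;\ge\; 6 .
\end{align}
```

More generally, two factors of positive boost weight cost at least two derivatives, and the compensating factors of negative boost weight cost at least two more, so no such term fits into S^v when N = 4.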
Equation (16) holds for a purely gravitational EFT. We shall make a few remarks about the generalization to include matter fields. With matter fields there is a greater variety of higher derivative terms that can occur on the RHS of (16). However, there are also further terms on the RHS of (16) arising from the energy-momentum tensor of the 2-derivative part of the matter Lagrangian. If this 2-derivative Lagrangian satisfies the null energy condition then these terms have a "good sign" (i.e. they are negative definite). The hope is that this can be used to help control the behaviour of higher derivative terms involving the matter fields, by completing the square in the same way that we did with the K_AB K^AB term in vacuum gravity. For example, a minimally coupled 2-derivative scalar field contributes −(1/2)(∂_v Φ)² to the RHS of (16). So one can try to use this to control higher derivative terms involving ∂_v Φ by completing the square. This will lead to an extra term of the form −(1/2)(∂_v Φ + P)² on the RHS of (16). (We will study an example of this in section 1.9.) This will give an extra term [δ(∂_v Φ + P)]² inside the integral of (20). Since this term has a good sign, the above argument that the second law holds to quadratic order, in the sense of EFT, is still valid. In section 4.2 we shall sketch a proof that this can be done for any scalar-tensor EFT.

Example: vacuum gravity
Field redefinitions can be used to simplify the Lagrangian of an EFT. For example, in vacuum gravity, a field redefinition can be used to bring the terms with up to 4 derivatives to the "Einstein-Gauss-Bonnet" form: where k is a dimensionless constant and L_GB is the Euler density associated with the Gauss-Bonnet invariant: This term is topological in n = 4 dimensions but non-trivial in higher dimensions. This theory has second order equations of motion and admits a well-posed initial value problem [17,18] as long as it remains within the regime of validity of EFT.
Since we have N = 4 here, we have S^v = s^v and S^A = s^A as explained above. Using previous results for s^v and s^A [4,11,12] gives where R[µ] is the Ricci scalar of µ_AB. In this case S^v involves only quantities of zero boost weight, so our entropy is simply the Iyer-Wald entropy S_IW. Hence, for Einstein-Gauss-Bonnet theory, our result implies that the Iyer-Wald entropy satisfies the second law to quadratic order, in the EFT sense explained above, i.e., δ²Ṡ_IW is positive up to O(ℓ⁴) terms. We can also demonstrate how to obtain (16) for this theory. A calculation gives the off-shell identity where the terms on the first line are the usual terms arising from the Einstein-Hilbert Lagrangian, and the terms on the second line arise from L_GB. These involve and where R[µ]_AB is the Ricci tensor of µ_AB. These expressions involve a new derivative operator 𝒟_A. Acting on a quantity of boost weight b this is defined as 𝒟_A = D_A − bβ_A/2 (b = 1 in the above expressions). As we shall explain below, this is a connection on the normal bundle of the horizon cross-section. Note that each term in Y^A contains a factor of K_BC so we can complete the square on K_AB, generating an O(ℓ⁴) "error" term. So in this case, we have the off-shell result that −E_vv can be written as the difference between the LHS and RHS of (16) up to O(ℓ⁴) terms. On-shell, the equation of motion gives E_vv = O(ℓ⁴) so (16) holds as claimed. We have used only the vv-component of the Einstein equation. This is atypical: in general, the derivation of (16) makes use of multiple components of the Einstein equation.
In this example, the entropy is simply the IW entropy, which is manifestly gauge-invariant, in agreement with our general result. However, note that S^A is not gauge-invariant. This is because it is written in terms of D_A rather than 𝒟_A. So the entropy current is not gauge-invariant in this example (see also [14] for discussion of this point).

Example: scalar-tensor EFT
As a second example, we will consider the EFT of gravity coupled to a scalar field in n = 4 dimensions. As above, field redefinitions can be used to simplify the Lagrangian [19]. In particular, assuming a parity symmetry, terms with up to 4 derivatives can be written where X ≡ −(1/2) g^{µν} ∂_µΦ ∂_νΦ, and V, α, β are arbitrary functions. (The coupling functions α, β should not be confused with the GNC metric components α, β_A.) The coupling functions α, β are dimensionless. This theory has second order equations of motion and admits a well-posed initial value problem [17,18] as long as it remains within the regime of validity of EFT.
The Einstein equation is E µν = O(ℓ 4 ) where the RHS arises from the unknown EFT terms with 6 or more derivatives and the LHS is A calculation gives the off-shell result where and Y A , W AB , W are O(ℓ 2 ) quantities that we shall not write out explicitly. We can complete the square on K AB as in our previous example, generating an O(ℓ 4 ) error term. We can also complete the square on ∂ v Φ, again generating an O(ℓ 4 ) error term. So on-shell we obtain an equation of the form (16) with an extra term of the form −(1/2)(∂ v Φ + P ) 2 on the RHS for some O(ℓ 2 ) quantity P . This term has a good sign and so the second law holds to quadratic order in perturbations, in the sense of EFT, as explained above. As in our previous example, only the vv component of the Einstein equation is used above, and S v involves only quantities of zero boost weight, so for this theory our entropy is simply the Iyer-Wald entropy S IW , which is manifestly gauge-invariant.
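The "completing the square" step used in both examples is elementary, and it may help to see why the error is of higher order. In the sketch below, P is the O(ℓ²) quantity named in the text, while Z_AB is a hypothetical symmetric O(ℓ²) tensor standing in for whatever multiplies K_AB linearly; its explicit form is not given in the text.

```latex
% P = O(\ell^2) as in the text; Z_{AB} = O(\ell^2) is a hypothetical placeholder.
\begin{align}
(\partial_v\Phi)^2 + 2P\,\partial_v\Phi
  &= (\partial_v\Phi + P)^2 - P^2,
  & P^2 &= O(\ell^4), \\
K_{AB}K^{AB} + 2Z^{AB}K_{AB}
  &= (K_{AB} + Z_{AB})(K^{AB} + Z^{AB}) - Z_{AB}Z^{AB},
  & Z_{AB}Z^{AB} &= O(\ell^4).
\end{align}
```

In each case the cross term is absorbed into a perfect square, and the remainder is a product of two O(ℓ²) factors, which is of the same order as the unknown EFT terms that are already being neglected.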

Example: Ricci squared gravity
The above examples were atypical because the IWW entropy coincides with the IW entropy, and because equations (24) and (29) hold off-shell. As a more typical example consider a theory of vacuum gravity with For n = 4 dimensions this includes the most general (non-topological) terms with N = 4. These terms could be eliminated via a field redefinition but we choose not to do so here.
Since we have N = 4 we must have S^v = s^v and S^A = s^A as in our previous examples. Using previous results for s^v, s^A [4,11,12] gives The k_2 term in S^v agrees with the JKM entropy [3]. In this example, the IWW entropy differs from the IW entropy because of the K K̄ term and similar terms contained in R_rv and R. On C, R_rv ≡ R_µν ℓ^µ n^ν and K K̄ are both invariant under a change of GNCs. (This is explained in more detail in section 2.1.) Hence the entropy is gauge invariant, in agreement with our general result. Once again, the entropy current is not gauge-invariant if k_1 ≠ 0. We shall not write out equation (16) explicitly. However, we emphasise that, for this example, it is necessary to use several components of the Einstein equation to obtain (16). In particular the vA component is used to eliminate a term on the RHS quadratic in ∂_v β_A.

Discussion
We shall conclude this overview with a discussion of some important open issues, namely the gauge invariance and uniqueness of the entropy. Section 3 of this paper establishes that the IWW entropy can be defined in a gauge-invariant manner. In more detail, what this means is that if we pick a cross-section C of the horizon and define GNCs based on C then the resulting definition of the IWW entropy of C is the same for any choice of these GNCs. Clearly it is important to determine whether our improved entropy (based on S v ) is also gauge-invariant in this sense. In the examples discussed above, i.e. EFTs with up to 4 derivatives, our improved entropy is the same as the IWW entropy and therefore the entropy is gauge-invariant in these examples. But it is unclear whether this remains true if we include terms with 6 or more derivatives.
To understand why this is important, let C, C ′ be two cross-sections of H with C ′ lying entirely to the future of C. We can choose GNCs based on C and normalize the affine parameter along the horizon generators such that C ′ is given by v = v 0 for some v 0 > 0. The entropy defined using our approach will then satisfy a second law: S[C ′ ] ≥ S[C] (to quadratic order, in the sense of EFT). However, if the entropy is not gauge-invariant then the definition of S[C] depends on the choice of GNCs, and hence on C ′ , which is clearly unsatisfactory.
Our entropy satisfies the second law only to quadratic order in perturbations, in the sense of EFT (i.e. modulo higher derivative terms), so maybe a proof of gauge invariance would also only hold to quadratic order, in the sense of EFT. We leave the construction of such a proof to future work.
Now we turn to uniqueness of the entropy current. The definition of entropy in nonequilibrium thermodynamics is known to suffer from ambiguities. An example is a relativistic viscous fluid. The fluid equations of motion can be viewed as an expansion in increasing numbers of derivatives of the fields and one can define an entropy current also in terms of an expansion in increasing numbers of derivatives of the fields. The aim is to find a definition of the entropy current that satisfies a second law on-shell. It has been shown that such an entropy current is not unique: there are multi-parameter families of entropy currents that satisfy the second law [20,21,22]. Something similar can be seen for black hole entropy in the perturbative context we are considering. For example, consider conventional vacuum GR. The IWW entropy is simply given by the horizon area: s^v = 1. Now consider s̃^v = 1 + cℓ² K K̄. Recall that K = 0 on H in the background, and that, on-shell, linear perturbations satisfy (12), which here reduces to δK = 0. Hence we have δ(√µ s̃^v) = δ(√µ s^v) on-shell. So, to linear order, s^v and s̃^v define the same entropy on-shell. However, off-shell they differ at linear order. If we restrict to linear perturbations then the only sense in which s^v is preferred over s̃^v appears to be that the former satisfies the off-shell equation (9).
Going beyond linear order we must adopt the EFT perspective. For n = 4 vacuum gravity, field redefinitions can be used to eliminate 4-derivative terms from the equations of motion, i.e., the equation of motion is G_µν + Λg_µν = O(ℓ⁴). Our "improved" entropy S^v differs from s^v by terms with at least 4 derivatives, which appear at higher order in EFT than the order to which we are working. Hence we have S^v = s^v = 1 here. Now we could use s̃^v as the starting point in our algorithm for improving the entropy and we would then obtain S̃^v = s̃^v = 1 + cℓ² K K̄. On-shell S^v and S̃^v agree at linear order but they differ at quadratic order. Since c is arbitrary, we therefore have a non-unique entropy current.
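The comparison of the two entropy densities in vacuum GR can be made explicit with a short variation. This is a sketch in which s̃^v = 1 + cℓ² K K̄ denotes the modified density, K and K̄ denote the traces of K_AB and K̄_AB, and the only inputs are K = 0 in the background and the linearised on-shell condition δK = 0 quoted in the text.

```latex
\begin{align}
% First variation: only the term with \delta K survives, because K = 0 in the background.
\delta\big(\sqrt{\mu}\,\tilde s^{\,v}\big) - \delta\big(\sqrt{\mu}\,s^{v}\big)
  &= c\,\ell^2\,\delta\big(\sqrt{\mu}\,K\bar K\big)
   = c\,\ell^2\sqrt{\mu}\,\bar K\,\delta K , \\
% Second variation, again using K = 0 in the background:
\delta^2\big(\sqrt{\mu}\,K\bar K\big)
  &= \sqrt{\mu}\,\bar K\,\delta^2 K + 2\,\delta\big(\sqrt{\mu}\,\bar K\big)\,\delta K
  \;\overset{\delta K = 0}{=}\; \sqrt{\mu}\,\bar K\,\delta^2 K .
\end{align}
```

The first line vanishes on-shell but not off-shell, which is the linear-order statement in the text. The second line need not vanish on-shell, since δ²K is not constrained by the linearised equation, so the two densities can differ at quadratic order.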
Generalising this example, for a general theory we denote the term in square brackets on the RHS of (9) as ∂ · s. Equation (12) says that on-shell we have δ(∂ · s) = 0. Now consider where L, L A are arbitrary linear operators built from ∂ v , D A and the metric components.
On-shell we have δ(√µ s̃^v) = δ(√µ s^v) as above, so the entropies agree to linear order. But, as we saw above, S̃^v and S^v will differ at quadratic order. Clearly there is a lot of freedom in the choice of L and L A so there seems to be a lot of freedom in defining an entropy that satisfies a second law in the perturbative sense we have discussed. This non-uniqueness in the entropy can arise from field redefinitions. In general, a redefinition of the metric would change the location of H but we can restrict to field redefinitions that are trivial on-shell to avoid having to deal with this. An example is n = 4 vacuum gravity viewed as an EFT: as mentioned above we can eliminate 4-derivative terms from the equations of motion to obtain the equations arising from the Lagrangian L = R + O(ℓ⁴) (ignoring Λ for simplicity). Starting from this Lagrangian we could perform a field redefinition of the form g_µν → g_µν + a_1 ℓ² R_µν + a_2 ℓ² R g_µν to obtain a new Lagrangian L′ of the form (31). This field redefinition is trivial on-shell (as the original equation of motion is R_µν = O(ℓ⁴)). The IWW entropy resulting from L′ is given by (32), which differs off-shell from that of L, although they agree on-shell to linear order. Going beyond linear order, the expressions for S^v for the two Lagrangians differ on-shell by a multiple of K K̄, so at quadratic order the entropy is different before and after the field redefinition.
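The effect of this field redefinition on the Lagrangian can be checked to first order in ℓ². The sketch below assumes that (31) has the form L = R + ℓ²(k_1 R² + k_2 R_µν R^µν), which is an assumption about an equation not reproduced here; it drops total derivatives and ignores Λ, as in the text.

```latex
\begin{align}
% First-order variation of the Einstein-Hilbert density under g -> g + \delta g:
\delta\big(\sqrt{-g}\,R\big)
  &= -\sqrt{-g}\,G^{\mu\nu}\,\delta g_{\mu\nu} + (\text{total derivative}),
  \qquad \delta g_{\mu\nu} = \ell^2\big(a_1 R_{\mu\nu} + a_2 R\,g_{\mu\nu}\big), \\
% Contractions of the Einstein tensor needed below:
G^{\mu\nu}R_{\mu\nu} &= R_{\mu\nu}R^{\mu\nu} - \tfrac{1}{2}R^2 ,
  \qquad G^{\mu\nu}g_{\mu\nu} = \big(1 - \tfrac{n}{2}\big)R = -R \quad (n = 4), \\
% Resulting Lagrangian:
L' &= R + \ell^2\Big[\big(\tfrac{1}{2}a_1 + a_2\big)R^2 - a_1\,R_{\mu\nu}R^{\mu\nu}\Big] + O(\ell^4).
\end{align}
```

Under the assumed form of (31) this corresponds to k_1 = a_1/2 + a_2 and k_2 = −a_1, so any pair (k_1, k_2) can be reached by a suitable choice of (a_1, a_2).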

More on Gaussian Null Coordinates (GNCs)
In a spacetime (M, g), consider a smooth co-dimension 1 null surface N that is ruled by affinely parameterized null geodesics with tangent l = l µ ∂ µ such that every null geodesic intersects a spacelike cross section C precisely once. Associated with this structure one can construct GNCs (1) as described in Section 1.3.
The tensors β A dx A and µ AB dx A dx B appearing in the Gaussian null form of g (1) have an invariant geometric meaning on the cross section C (or more generally on each fixed leaf C(v) of the foliation) of N . The meaning of µ AB is obvious: it is the induced metric on C, with Levi-Civita connection D. To understand the role of β A , note that the foliation defines a split where the normal bundle (T C C) ⊥ is spanned by the null vectors n = ∂ r , l = ∂ v . Now β = β A dx A can be regarded as a connection 1-form on the 2-dimensional Lorentzian vector bundle (T C C) ⊥ . Said differently, on C we can reduce the SO(n − 1, 1) principal fibre bundle F g M| C of orthonormal frames defined from the metric g to the product of the SO(n − 2) principal fibre bundle F µ C associated with orthonormal frames of µ AB dx A dx B and the (trivial) SO(1, 1) principal fibre bundle P of pairs of null directions (rather than vectors) in (T C C) ⊥ . The group SO(1, 1) may be identified with R + and it acts in this parameterization on a pair of null vectors n, l by the local rescaling al, a −1 n, where a > 0 is a smooth function on C identified with a local gauge transformation. The representation of R + on a line R given by π b : a → a^b for a given boost weight b gives rise to a line bundle P ⋉ π b R over C via the associated vector bundle construction, with corresponding covariant derivative operator 𝒟 = D − (1/2) bβ. If a given tensor field on M is decomposed into its tangential components along C, its components along l and along n (corresponding to the coordinate indices A, v, r in GNCs), then b corresponds exactly to the boost weight of such a component as described briefly in Section 1.3 and more precisely in Section 2.2 below.
We shall consider theories including matter fields, assumed to be either scalar fields or abelian p-form fields. In the latter case, we shall fix the gauge as follows. A p-form field By a suitable choice of Λ, we can always achieve that n · A = 0 in some neighborhood of N and l · A = 0 on N . For a 1-form field this means that in such a gauge, so the differential dr does not appear and the other differentials occur in a boost invariant combination.
As mentioned in section 1.3, there is freedom in the choice of GNCs on N . Understanding this freedom will be important in our discussions of gauge-invariance below. 13 The freedom present in the choice of GNCs is: (1) we may start from a different cross section, C ′ ; (2) we may choose a different set of coordinates x ′A on C ′ ; (3) we may choose a different affine parameterization. Making such a change implies that the relationship between the GNCs is where a is positive and gives the change of affine parameter and where b corresponds to a change of the reference cross section C ′ defined by v ′ = 0. The relationship between (x µ ) = (v, r, x C ) and (x ′µ ) = (v ′ , r ′ , x ′C ) away from N is in general complicated. Let us, for later purposes, consider only a change of the affine parameter leaving C as it is and keep the coordinates on C as they are (invariance under a change of coordinates on C will be manifest below). Thus, we take b = 0 and x ′C = x C on the cut C for simplicity, so that on N , the change of GNCs is We will determine the relationship between (x µ ) = (v, r, x C ) and (x ′µ ) = (v ′ , r ′ , x ′C ) away from N order by order in r ′ in the following way. First note that and then imposing the defining relations for n ′ on N , g(n ′ , n ′ ) = g(n ′ , ∂/∂x ′C ) = 0, g(l ′ , n ′ ) = 1 gives that In other words, since n ′ = ∂/∂r ′ in the primed GNCs, Eqs. (40), (41) are the initial conditions for the geodesic equation for n ′ , which since n ′µ = ∂x µ /∂r ′ , is Here, Γ µ σρ is the Christoffel symbol of the metric in the coordinates (x α ) = (r, v, x C ) (see Appendix A). By integrating the geodesic equations we can determine In general the GNC components of the metric in (r ′ , v ′ , x ′C ) coordinates are rather complicated functions of the original metric components. However, some expressions simplify on N or C. For example using (41) one obtains, on N (i.e. for r = r ′ = 0) subject to the same identification. 
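The order-by-order construction described above can be summarised in a short sketch. The geodesic equation is written for the curves of constant (v′, x′^C), and the Taylor expansion in r′ follows directly from it. The explicit initial data on N, i.e. the content of the relations (40) and (41), are not reproduced here and are simply denoted by their values at r′ = 0.

```latex
\begin{align}
% Affinely parameterised geodesic equation for n'^\mu = \partial x^\mu / \partial r':
\frac{\partial^2 x^\mu}{\partial r'^2}
  + \Gamma^{\mu}_{\sigma\rho}\,
    \frac{\partial x^\sigma}{\partial r'}\,\frac{\partial x^\rho}{\partial r'} &= 0,
  \qquad
  \frac{\partial x^\mu}{\partial r'}\bigg|_{r'=0} = n'^\mu\big|_{\mathcal N}, \\
% Expansion off N, order by order in r':
x^\mu(r') &= x^\mu\big|_{r'=0} + r'\,n'^\mu\big|_{\mathcal N}
  - \tfrac{1}{2}\,r'^2\,\Gamma^{\mu}_{\sigma\rho}\,n'^\sigma n'^\rho\big|_{\mathcal N}
  + O(r'^3).
\end{align}
```

Higher orders follow by differentiating the geodesic equation with respect to r′ and evaluating on N, which is why the resulting change of coordinates depends on the metric as well as on the function a.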
The formulae for the transformation of K̄ AB and its r-derivatives, or the Riemann tensor R ABCD [µ] of µ AB , are complicated on N . However, they simplify on C, i.e., for v = v ′ = r = r ′ = 0: A quick way to determine the transformation law of a given tensor in GNCs of the form is to transform to one of the covariant bases of monomials described in lemmas 2.2, 2.3 below.
We can think of the above change of GNCs as a diffeomorphism f of some open neighborhood of N by assigning to a point p with a given set (x µ ) of GNCs a new point p ′ = f (p) with the same values (x ′µ ) in the unique GNC system related by (v, It is important to note that, although the restriction of f to N only refers to the function a > 0 on C but not to a particular metric, the diffeomorphism f depends on both a and the metric g off of N , because its construction involves solving for the transverse geodesics relative to the metric g. Thus, we should write f [a, g] to indicate properly the dependencies on a and g. Then we have the cocycle condition where • indicates the composition of diffeomorphisms and g ′ = f [a, g] * g. This relationship can be proven by noting that both sides trivially have the same action on N which together with the preservation of the Gaussian null form uniquely determines the diffeomorphisms off of N on both sides.

Local covariant tensors and GNCs
Consider a tensor field such as t α 1 ...αr β 1 ...βs [g] that is constructed locally and covariantly out of the metric. By the Thomas replacement lemma [2] it is built from contractions of g_µν , g^µν , ∇ (α 1 · · · ∇ α k ) R µναβ (and possibly the volume form). We may evaluate such a tensor in GNCs, laboriously computing the curvature tensors and covariant derivatives in these coordinates. This will lead to, in general complicated, expressions involving µ AB , β A , α, r and their derivatives, see Appendix A for the Riemann and Ricci components, for example. Here we would like to make certain general statements about the resulting expressions. These general features arise from the functional property t α 1 ...αr , taking f to be a diffeomorphism preserving the Gaussian null form, as described in the previous section.
Recall that a general change of coordinates preserving the Gaussian null form is given on N by (36). We first specialize this equation to a = 1, b = 0. Then it is easy to see that, ..βs , and view this as a tensor relative to the indices chosen as α i = A i , β j = B j (i.e. those not chosen as r, v), then the resulting tensor is on C a local covariant functional of µ AB , β A , α and r. In other words, it is a functional which is built out of contractions (with the inverse metric , multiplied by non-negative powers of r. The latter will be absent on N (as r = 0). It is convenient to introduce the following terminology (here ǫ[µ] denotes the volume form induced by µ AB on C(v)): . A "primitive monomial" is a (possibly contracted) product of primitive factors.
On N , any component of the Riemann tensor (or its derivatives) is a sum of primitive monomials.
If we specialize the change of GNCs defined by (37) to the case when x C = x ′C , b = 0 and a > 0 is constant then globally we have v = av ′ , r = a −1 r ′ , x C = x ′C . Recall that we say that a quantity X has boost weight b if it transforms as in (4) under such a change of GNCs. Recall also the simple rule described in section 1.3 for computing the boost weight of a tensor component: A given GNC component of t α 1 ...αr β 1 ...βs is a sum of monomials built from µ AB -contractions of , multiplied by non-negative powers of r. The boost weight of the various terms appearing in such expressions is given as follows: The boost weight of each monomial in the expression for a given tensor component is equal to the boost weight of that tensor component. For example, the boost weight of each monomial in the Riemann component R rAvB is zero, as can be seen explicitly by the expressions in Appendix A. Likewise, the boost weight of each monomial in the Ricci component R rr is −2, etc.
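The index-counting rule for boost weights follows from the tensor transformation law under the constant rescaling v = a v′, r = a⁻¹ r′, x^C = x′^C. The sketch below assumes the convention that a quantity of boost weight b transforms as X′ = a^b X, which is taken here as the content of (4), an equation not reproduced in this section.

```latex
% Constant rescaling: v = a v', r = a^{-1} r', x^C = x'^C, with a > 0.
\begin{align}
T'_{v'} &= \frac{\partial v}{\partial v'}\,T_v = a\,T_v , &
T'_{r'} &= \frac{\partial r}{\partial r'}\,T_r = a^{-1}\,T_r , \\
T'^{\,v'} &= \frac{\partial v'}{\partial v}\,T^{v} = a^{-1}\,T^{v} , &
T'^{\,r'} &= \frac{\partial r'}{\partial r}\,T^{r} = a\,T^{r} .
\end{align}
% Hence, for a general component,
%   b = (# lower v + # upper r) - (# lower r + # upper v),
% with indices A contributing nothing.
% Examples quoted in the text: R_{rAvB} has b = 1 - 1 = 0, and R_{rr} has b = -2.
```

The same counting applies to derivatives: each ∂_v raises the boost weight by one and each ∂_r lowers it by one, which is consistent with K_AB and K̄_AB carrying weights +1 and −1.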
When matter fields with a tensorial character such as 1-forms A J are present, then we should first decompose them into GNC components and pick a suitable gauge as e.g. in (35) for a 1-form field. In such a gauge, A JA , φ J count as having boost weight zero and are treated on the same footing as the µ AB , β A , α or scalar fields Φ I , so in this case tensor components are expressed in terms of derivatives of ψ ∈ {Φ I , A JA , φ J , µ AB , α, β A } and non-negative powers of r, and similarly for higher form fields. D A should be replaced by an appropriate charged covariant derivative using A BJ dx B on any charged scalar fields. Quantities descending from a diffeomorphism and gauge covariant quantity remain gauge covariant under the restricted gauge transformations preserving the gauge (35), that is Λ J that do not depend on r, v. We will usually suppress discussion of matter fields below and only comment on them where they lead to important differences.
The notion of boost weight depends on the chosen cut C and it refers to a particular choice of GNCs. If we perform a non-trivial change of GNCs, corresponding to non-constant a, then the definition of boost weight w.r.t. the new GNCs will not agree with the definition w.r.t. the old GNCs. However, (for fixed C) the definitions will agree on N . More precisely, let ψ ′ ∈ {α ′ , β ′ A , µ ′ AB } be the quantities defined by the components of the metric g w.r.t. a second set of GNCs x ′µ . Then we have the following lemma.

Lemma 2.1 On N , the definition of boost weight is independent of the choice of GNCs. In other words, the expressions for
and powers of v ′ will only contain terms of boost weight p ′ − q ′ , with v ′ counting as boost weight −1 and a counting as boost weight 0.
Proof. On N , the quantities α, β A , µ AB and their derivatives can all be written as ∂ µ 1 . . . ∂ µ N g νρ for some choice of indices (e.g. β A = −∂ r g vA on N ). If we consider the corresponding quantity ∂ µ ′ 1 . . . ∂ µ ′ N g ν ′ ρ ′ in another set of GNCs then this will be given by some contraction of We claim that, on N , each component of such an expression has a definite boost weight determined by the same rule described above for tensor components, i.e., by counting the number of up and down indices of each type. (For example J r v ′ A ′ | N will have boost weight 1 + 1 + 0 = 2.) Hence, when contracted with ∂ µ 1 . . . ∂ µ N g νρ , the additivity of boost weight under products will ensure that the result will have a definite boost weight. An example of this can be seen in (45) where both the LHS and RHS have boost weight zero (since we assign a boost weight zero).
To see why each component of J µ The result is then immediate from (37) (and r ′ = 0 on N ). For example ( A a, which has boost weight zero, in agreement with the above rules applied to the LHS. Next assume exactly one of the ν ′ i is r ′ . For p = 1 the result is immediate from (40), where each term on the RHS has the correct boost weight to match the above rules applied to the LHS. Taking derivatives w.r.t. v ′ and x A ′ respects these rules and so this result extends to any p. If exactly two of the ν ′ i are r ′ then, for p = 2, the result follows by evaluating (42) on N and using the fact that the components of the Christoffel symbols have definite boost weight. The result for p > 2 follows by taking v ′ and x A ′ derivatives of (42). If exactly three of the ν ′ i are r ′ then one takes another r ′ derivative of (42) and evaluates on N and so on. Q.E.D.
Since we will be dealing with equations involving higher derivatives, we will sometimes need to keep track of the number of derivatives associated with a given quantity. We are mostly interested in doing this on N . In the above proof we saw that, on N , the quantities of interest can all be expressed in terms of partial derivatives of the metric tensor: α involves 2 r-derivatives and β A involves 1 r-derivative. We make the following definition to count derivatives: A quantity with boost weight b involves at least |b| derivatives and so its dimension is bounded below by |b|. The dimension of a component of the kth covariant derivative of the Riemann tensor is k + 2.
The primitive factors of definition 2.1 are employed in Wall's method. However, in various places we will find it more convenient to express quantities on N in terms of a different set of quantities. These are described in lemmas 2.2 and 2.3.

Lemma 2.2 On N , any of the expressions D
Note that, by the Bianchi identities, these components are not all independent. When j ≥ 1, an independent set is obtained e.g. by choosing
Proof: First consider the case of ψ = α. To express D (A 1 · · · D A k ) ∂ p v ∂ q r α in the desired form, we start by writing the Ricci component R vr in terms of GNC components (see Appendix A) where the ellipsis denotes terms that do not depend on α. Evaluating this equation at r = 0 uniquely determines α| r=0 in terms of β A , D A β B and quantities of the form Next, we act on the above equation with ∂ r and use where the ellipsis denotes terms that vanish at r = 0. By virtue of the identities of Appendix A, this can be used to determine ∂ r α| r=0 in terms of quantities of the following form: Proceeding inductively, we assume that we can express ∂ q−1 r α| r=0 in terms of the following quantities: We have already shown that this is true for q = 1 and q = 2. Now we act on (49) with ∂ q r and evaluate at r = 0. Then we determine ∂ q r α| r=0 in terms of primitive factors of the form One can then rewrite ∂ q r R rv in terms of GNC components ∇ α 1 · · · ∇ α j R µν with j ≤ q and terms of the form ∂ q̃ r Γ µ rν with q̃ < q. It follows from the formulae of Appendix A that the terms involving Christoffel symbols can be written in terms of primitive factors of the form ∂ q ′ r β A with q ′ ≤ q, ∂ q ′′ r µ AB with q ′′ ≤ q + 1 and ∂ q ′′′ r α with q ′′′ ≤ q − 1. However, the terms of the form ∂ q ′′′ r α with q ′′′ ≤ q − 1 can be eliminated in favour of derivatives of β A , µ AB and covariant derivatives of R µν , as specified above. This shows that if the induction hypothesis holds then ∂ q r α| r=0 can be expressed with the quantities Since this statement is true for q = 1, it follows that it is true for general q.
By taking v or x A derivatives of this result, we determine as a sum of products of factors depending only on Each monomial in this sum must have boost weight p − q. In the rest of this argument it is understood that whenever r α| r=0 appears, we will rewrite it in terms of the above quantities.
The next step is to show that we can use similar methods to eliminate certain derivatives of β A . First we consider the following expression for R rA (see Appendix A) where the first ellipsis on the RHS denotes terms quadratic in K̄ AB and the second ellipsis denotes terms linear in D A K̄ BC (contracted in some way with µ AB in both cases). Evaluating this equation at r = 0 determines ∂ r β A | r=0 in terms of β A , µ AB , ∂ r µ AB , D C ∂ r µ AB and R rA . Acting with ∂ q−1 r for q ≥ 1 and evaluating at r = 0 lets us determine, inductively (similarly to the ψ = α case above), Next we consider R vA evaluated at r = 0: The last one of these can then be written in terms of GNC components ∇ α 1 . . . ∇ α j R µν with j ≤ p − 1 and ∂ p̃ v Γ µ vν | r=0 with p̃ < p. Employing the expressions of Appendix A relating Christoffel symbols to primitive factors then gives a formula for ∂ p v β A | r=0 in terms of β A , terms of the form Now we return to (51): taking a v-derivative of this equation and using our result for Similarly, taking multiple v-derivatives of (51) and employing an induction on the number of v-derivatives gives us an expression for ∂ p v ∂ r β A | r=0 containing the following quantities: β A , primitive factors with ψ = µ AB and components of the form ∇ α 1 . . . ∇ α j R µν with j ≤ p. Furthermore, taking v-derivatives of our previous expressions for ∂ q r β| r=0 and proceeding inductively as before we obtain an expression for From now on, it is assumed that every occurrence of D A 1 . . . D Am ∂ p v ∂ q r β B | r=0 is expressed with the above quantities.
Consider finally a factor of the form We wish to argue that this factor can be written as a sum of monomials whose factors are of the form specified in the statement of the lemma. To this end, we consider the expression for the Ricci component R AB in terms of GNCs (given in more detail in Appendix A) where the ellipsis now stands for terms that vanish at r = 0. This equation allows us to express ∂ v K̄ AB | r=0 in terms of R AB , R[µ] AB , K AB , K̄ AB , β A and D A β B . Next, we act with ∂ r on equation (53) and use Here the ellipsis stands for terms that are of the required form and terms involving ∂ v K̄ AB that can be written in the required form as described in the previous paragraph. As for the term involving α, we have shown previously that it can be expressed (at r = 0) in terms of R rv , µ AB , K AB , K̄ AB , β A , D A β B and ∂ v K̄ AB . Combining this with our result on ∂ v K̄ AB yields a formula for α in terms of Regarding the terms involving ∂ r β A and ∂ r D A β B in (54), we have a prescription to write these (at r = 0) in terms of β A , D A β B , µ AB , K̄ AB , D A 1 K̄ AB , D A 1 D A 2 K̄ AB and covariant components involving the Ricci tensor. Therefore, equation (54) provides an expression for ∂ v ∂ r K̄ AB | r=0 as a sum of monomials that are products of factors of the following quantities: Proceeding inductively, one can fix ∂ v ∂ q r K̄ AB | r=0 (by taking multiple derivatives of (53) w.r.t. 
r and using previous results) in terms of the quantities Similarly, an inductive argument establishes that taking multiple v-derivatives of our expressions for ∂ v ∂ q rK AB | r=0 allows us to write ∂ p v ∂ q r µ AB | r=0 using only the factors specified below: Taking D A derivatives of this result then lets us write Combining this with our previous results on primitive factors of α and β, we can determine as a sum of monomials with each monomial being a product of factors To obtain the same dependencies as in the statement of the lemma, as well as uniqueness, we first note that (on N ) the covariant quantities ∇ α 1 . . . ∇ α j R vv with α j ∈ {v, A} and ∇ α 1 . . . ∇ α j R rr with α j ∈ {r, A} can be determined by all the other factors (i)-(iv) listed above. To see this, we consider the identity (see Appendix A) Hence, R rr is clearly expressible in terms ofK AB and ∂ rKAB . Taking derivatives of this result w.r.t. r and x A , one can inductively express ∇ α 1 . . . ∇ α j R rr with α j ∈ {r, A} in terms of factors of the form D A 1 . . . D A j ∂ p ′ v K AB and other factors listed in (i)-(iv). A very similar argument establishes that ∇ α 1 . . . ∇ α j R vv with α j ∈ {v, A} is also redundant in the list of factors above. Furthermore, we may express all the quantities listed in (i)-(iv) with totally symmetrized derivatives such as D (A 1 . . . D A j ) . The reason for this is that an anti-symmetrization of a quantity D A 1 . . . D A j Ψ over a subset of the indices A 1 , . . . A j is expressible via the Riemann tensor R[µ] ABCD and fewer than j derivatives of the quantity Ψ (using the Bianchi identities). Similarly, the quantities ∇ α 1 . . . ∇ α j R µν can be expressed with the totally symmetrized derivatives ∇ (α 1 . . . ∇ αm) R µν and (covariant derivatives of) the Riemann tensor associated with ∇. 
To eliminate the dependency on R µνρσ , one can use the formulae of Appendix A and the procedure described above to write the components of the Riemann tensor (at r = 0) in AB and Ricci components R µν . Then one can argue that any derivative of the Riemann tensor can be expressed in terms of the quantities listed in the statement of the lemma by using induction on the number of derivatives and by making use of identities of the form where the ellipsis on the RHS stands for terms involving fewer than j derivatives of the curvature tensor. Finally, the Bianchi identities give us a freedom when expressing a primitive factor in terms of the factors in the statement of the lemma. It can be seen that this is all the remaining freedom that one has using induction, because the leading derivative terms of the Bianchi identities cancel and these are the only possible linear dependencies among the leading derivative terms used in our induction argument. Q.E.D.
Proof: By lemma 2.2, we only need to eliminate First we consider the evolution equation for the expansion and shear in the r-direction, which is which lets us eliminate ∂ r K̄ AB in terms of K̄ AB , plus a GNC component of the Riemann tensor. Next, we consider, on N , the Riemann component which lets us write D A β B in terms of D (A β B) plus terms containing K AB , K̄ AB or a GNC component of the Riemann tensor. The evolution equation for the expansion and shear in the v-direction is on N given by which lets us eliminate ∂ v K AB in terms of K AB , plus a GNC component of the Riemann tensor. Then we consider the Gauss-Codacci equation on N , This lets us eliminate D [A K B]C , D [A K̄ B]C in favor of terms containing K AB , K̄ AB , β A or a GNC component of the Riemann tensor. We now continue this process by taking suitable derivatives of the above equations, using the non-zero Christoffel symbols (on N ) given in Appendix A.
First we show inductively that ∂ N̄ r K̄ AB can be expressed with only K̄ AB and covariant components of the form ∇ (α 1 · · · ∇ α j ) R µνσρ with j ≤ N̄ − 1. We have already shown above that this is true for N̄ = 1. Now assume that ∂ q̄ r K̄ AB with q̄ ≤ N̄ − 1 can be expressed using K̄ AB and GNC components ∇ (α 1 · · · ∇ α j ) R µνσρ with j ≤ q̄ − 1. Now act N̄ times with ∂ r on (57), noting that (57) holds off N . We can convert any partial derivatives of Riemann tensor components to components of covariant derivatives of the Riemann tensor. This lets us eliminate ∂ N̄ r K̄ AB in terms of ∂ q r K̄ AB with 0 ≤ q ≤ N̄ − 1, ∂ q ′ r Γ A rB with 0 ≤ q ′ ≤ N̄ − 1, plus GNC components of the Riemann tensor. Note that other Christoffel symbols cannot appear in this expression since Γ µ rr and Γ v rµ vanish (on and off N ). Furthermore, Γ A rB = K̄ A B on and off N . By the induction hypothesis we can eliminate the terms ∂ q r K̄ AB in favor of K̄ AB and covariant derivatives of the Riemann tensor, closing the induction loop.
A very similar inductive argument lets us write ∂ N v K AB in terms of K AB and GNC components ∇ (α 1 · · · ∇ α j ) R µνσρ with j ≤ N − 1. This argument is based on acting with ∂ v multiple times on (59) and noting that Γ r vµ = Γ µ vv = 0 and Γ A vB = K A B on N . Next, we express the factors in terms of the quantities listed in the statement of the lemma by using induction on j. These expressions have been obtained for j = 1 above. Now suppose we have the corresponding expressions for . We next use an identity of the form where the ellipsis in the second line stands for a sum of monomials containing factors of the form Then we consider ∇ A 1 · · · ∇ A j−1 R ABvC and write it in terms of primitive factors. This lets us eliminate Writing ∇ A 1 · · · ∇ A j −1 R ABCD in terms of primitive factors yields an expression relating with k, l, m ≤ j − 1 and covariant components. By employing the induction hypothesis, we get an expression for D (A 1 · · · D A j−1 ) R ABCD [µ] in terms of the quantities listed in the statement of the lemma.
Regarding $D_{(A_1}\cdots D_{A_j)}\beta_B$, we first use an identity similar to (62): where again the ellipsis in the second line stands for a sum of monomials containing factors The terms in the first line of the RHS of this equation can be eliminated by writing $\nabla_{(A_1}\cdots\nabla_{A_{l-1})}R_{rvAB}$ in terms of primitive factors. The terms in the ellipsis can be dealt with by invoking the induction assumption. This yields an expression for $D_{(A_1}\cdots D_{A_j)}\beta_B$ in terms of the desired quantities, closing the induction loop.

Finally, consider the case of $D_{(A_1}\cdots D_{A_j)}\partial_r^{\bar N}\bar K_{AB}$ with $\bar N\ge 1$. Let us take derivatives w.r.t. $x^A$ of our previous expression relating $\partial_r^{\bar N}\bar K_{AB}$ to $\bar K_{AB}$ and covariant components involving the Riemann tensor. This gives expressions relating ABCD and covariant components. Using previous substitutions yields the desired expression for $D_{(A_1}\cdots D_{A_j)}\partial_r^{\bar N}\bar K_{AB}$. A similar argument establishes that $D_{(A_1}\cdots D_{A_j)}\partial_v^{N}K_{AB}$ is also expressible in the desired form. Q.E.D.

Covariance of GNC tensors and BRST-formalism
We now consider how quantities on our cut $C$ transform under a non-trivial change of GNCs, i.e., a change with non-constant $a(x^A)>0$. It follows from the results of section 2.1 that under such a change, on $C$ we have the transformation laws (64) [Footnote 16: The first two transformations hold on all of $\mathcal N$, not just on $C$.]
where $b$ is the boost weight of the GNC component of the covariant tensor $\nabla_{(\alpha_1}\cdots\nabla_{\alpha_j)}R_{\mu\nu\sigma\rho}$, see eq. (48). A useful way to think about the replacements (64) on the cut $C$ is that $\beta=\beta_A dx^A$ is a gauge potential and that $K_{AB}$, $\bar K_{AB}$ and the GNC components of $\nabla_{(\alpha_1}\cdots\nabla_{\alpha_j)}R_{\mu\nu\sigma\rho}$ are charged under the gauge group $\mathbb R_+$, with charges $+1$, $-1$ and $b$ respectively. The function $a$ corresponds to a finite gauge transformation associated with the gauge group $\mathbb R_+$. It is therefore useful to define the gauge covariant derivative $\mathcal D_A$, with $b$ the boost weight (= charge) of the quantity that it is acting on. See the start of section 2.1 for the geometrical interpretation of $\mathcal D$ as a connection on the normal bundle of $C$.
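As a consistency check on this gauge-theory picture, the following sketch writes out a finite transformation and a covariant derivative compatible with it. The explicit shift of $\beta_A$ and the formula for $\mathcal D_A$ are assumptions made here for illustration: they are chosen to agree with the charges $+1$, $-1$, $b$ stated above and with the parameterization $a=e^{t\Lambda/2}$ used later, and their normalization should be checked against (64). The symbol $T$ denotes a generic GNC component of boost weight $b$ and is not notation from the text.

```latex
% Assumed finite transformations on C (a reconstruction, not a quotation of (64)).
\begin{align*}
  T &\to a^{b}\,T, &
  \beta_A &\to \beta_A + 2\,D_A\log a .
\end{align*}
% Candidate gauge covariant derivative on a quantity of charge b:
\begin{equation*}
  \mathcal{D}_A T := D_A T - \tfrac{b}{2}\,\beta_A\,T .
\end{equation*}
% Check: the inhomogeneous terms cancel, so \mathcal{D}_A T again has charge b.
\begin{align*}
  \mathcal{D}_A T
  &\to D_A\!\left(a^{b}T\right)
     - \tfrac{b}{2}\left(\beta_A + 2\,D_A\log a\right)a^{b}T \\
  &= a^{b}\left[D_A T + b\,(D_A\log a)\,T
     - \tfrac{b}{2}\,\beta_A T - b\,(D_A\log a)\,T\right]
   = a^{b}\,\mathcal{D}_A T .
\end{align*}
```

With these assumptions, a product of covariant derivatives of charged quantities with total charge zero is invariant under non-constant $a$, which is the property needed of the entropy integrand.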
Our eventual aim is to write an expression for an entropy as an integral of a boost-weight zero n − 2 form over the cut C. For this entropy to be invariant under a change of GNCs, the n − 2 form should be gauge-invariant up to addition of an exact form. The following lemma characterizes the most general structure of such a form:

Lemma 2.4 Let w m be a local m-form on C of boost weight 0 built from contractions of
• GNC components of the covariant tensors $\nabla_{(\alpha_1}\cdots\nabla_{\alpha_j)}R[g]_{\mu\nu\sigma\rho}$

(By the Bianchi identities, not all these Riemann components are independent, see lemma 2.3.)

To prove lemma 2.4 and other similar lemmas which we shall need later on, we now reformulate the gauge transformations (64) using the BRST method (see e.g. [24]). First, we write $a=e^{t\Lambda/2}$, and then obtain the transformation, on $\mathcal N$, of any quantity under an infinitesimal gauge transformation by differentiating its transformation law w.r.t. $t$ and evaluating at $t=0$. This gives (in general very complicated) formulas for the infinitesimal change of any monomial $D_{(A_1}\cdots D_{A_k)}\partial_v^{p}\partial_r^{q}\psi$ under an infinitesimal change of GNCs on $\mathcal N$. On $C$, these formulas simplify if we pass from this basis of primitive monomials to one of the bases provided by lemmas 2.2 or 2.3. Following the usual BRST approach, we now declare $\Lambda$ to be an anti-commuting field (of boost weight 0), and the infinitesimal version of the transformations (64) becomes a "BRST transformation", which we denote by $\gamma$ and which is given in (67), where the last transformation is imposed as usual to ensure that $\gamma^2=0$ and where $b$ is the boost weight of the corresponding GNC component. The action of $\gamma$ is extended to a product via the graded Leibniz rule. The degree in $\Lambda$ of a monomial in the fields and their derivatives is referred to as the ghost number, and $\Lambda$ as the ghost field. Consistent with the last two equations in (67), we declare that the boost weight of $\Lambda$ is zero, so that $\gamma$ does not change the boost weight of any quantity. We furthermore follow the custom of declaring the action of the exterior differential $d$ on a monomial of ghost number $g$ to be $(-1)^g$ times the usual definition; with this convention we have, for example, $d\gamma=-\gamma d$.
For constant $\Lambda$, the action of $\gamma$ on a quantity of boost weight $b$ is to multiply that quantity by $b\Lambda/2$. However, in general $\Lambda$ is not constant. The transformation law assumed in lemma 2.4 may be stated as saying that $\gamma w_m=dw_{m-1}$, where $w_m$ has ghost number 0 and $w_{m-1}$ has ghost number 1. A form such that $\gamma w_m=0$ is called BRST-closed ($\gamma$-closed) and a form of the type $\gamma w_m$ is called BRST-exact.
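For orientation, the infinitesimal transformations can be sketched as follows. This is a reconstruction from the statements above (multiplication by $b\Lambda/2$ for constant $\Lambda$, the parameterization $a=e^{t\Lambda/2}$, and a final transformation that makes $\gamma$ nilpotent), not a quotation of (67). The symbol $T^{(b)}$ for a generic component of boost weight $b$ is introduced only for illustration, and the form $\gamma\beta_A=D_A\Lambda$ assumes the shift $\beta_A\to\beta_A+2D_A\log a$.

```latex
% Assumed BRST transformations (illustrative reconstruction).
\begin{align*}
  \gamma\,T^{(b)} &= \tfrac{b}{2}\,\Lambda\,T^{(b)}, &
  \gamma\,\beta_A &= D_A\Lambda, &
  \gamma\,\Lambda &= 0 .
\end{align*}
% Nilpotency, using the graded Leibniz rule and \Lambda\Lambda = 0:
\begin{align*}
  \gamma^{2}T^{(b)}
  &= \tfrac{b}{2}\left[(\gamma\Lambda)\,T^{(b)} - \Lambda\,\gamma T^{(b)}\right]
   = -\left(\tfrac{b}{2}\right)^{2}\Lambda\,\Lambda\,T^{(b)} = 0, \\
  \gamma^{2}\beta_A &= D_A(\gamma\Lambda) = 0 .
\end{align*}
% Example: a boost-weight-zero product is BRST-closed.
\begin{equation*}
  \gamma\!\left(K_{AB}\bar K^{AB}\right)
  = \tfrac{1}{2}\Lambda\,K_{AB}\bar K^{AB} - \tfrac{1}{2}\Lambda\,K_{AB}\bar K^{AB} = 0 .
\end{equation*}
```

The example illustrates why terms with undifferentiated $\Lambda$ drop out of $\gamma w_m$ whenever $w_m$ has total boost weight 0, so that only derivatives of $\Lambda$, which enter through $\beta_A$, can obstruct gauge invariance.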

Lemma 2.5
Let $w_m$ be a local $m$-form on $C$ of boost weight 0 and ghost number $g>0$ such that $\gamma w_m=dw_{m-1}$. Then $w_m$ is a sum of the following:
• A $\gamma$-exact local form of boost weight 0.
• A $d$-exact local form of boost weight 0.

Proof of lemmas 2.4 and 2.5:
The proofs of lemmas 2.4 and 2.5 are quite similar and interrelated, so we give them together. We start with the special case $\gamma w_m=0$, where $w_m$ is a local form of ghost number $g=0$. In $w_m$ we make the replacements as in lemma 2.3. Then we further replace the derivatives $D_A$ by gauge covariant derivatives $\mathcal D_A$, at the expense of additional terms of the form $D_{(A_1}\cdots D_{A_j}\beta_{B)}$. Now we impose $\gamma w_m=0$. Since the factors $D_{(A_1}\cdots D_{A_j}\beta_{B)}$ are the only factors which transform with derivatives of $\Lambda$, and since the terms with undifferentiated $\Lambda$ drop out because $w_m$ has boost weight 0, it follows that $\sum_j \big(\partial w_m/\partial D_{(A_1}\cdots D_{A_{j-1}}\beta_{A_j)}\big)\,D_{(A_1}\cdots D_{A_j)}\Lambda=0$. At any point in $C$ the terms $D_{(A_1}\cdots D_{A_j)}\Lambda$ can be chosen to be linearly independent of each other, so we learn that $\partial w_m/\partial D_{(A_1}\cdots D_{A_{j-1}}\beta_{A_j)}=0$ for all $j$. Thus there is no dependence on $\beta_A$ in the chosen basis of monomials, and we have demonstrated lemma 2.4 in our special case $\gamma w_m=0$.

We remain in the same special case but now go to ghost number $g>0$. In the chosen basis of monomials we may view $D_{(A_1}\cdots D_{A_{j-1}}\beta_{A_j)}$ and $D_{(A_1}\cdots D_{A_j)}\Lambda$ as "contractible pairs" in the terminology of [24], meaning that $\gamma D_{(A_1}\cdots D_{A_{j-1}}\beta_{A_j)}=D_{(A_1}\cdots D_{A_j)}\Lambda$, $\gamma D_{(A_1}\cdots D_{A_j)}\Lambda=0$, and the rest of the BRST transformations are independent of these variables. By a standard result [24], Appendix 2.B, we know that $w_m$ is, up to a $\gamma$-exact local form, equal to a local form that has no explicit dependence on $\beta_A$ in the chosen basis of monomials and depends on $\Lambda$ only in undifferentiated form. The proof of this works as follows: First define to be the number operator for the contractible pair, and let where $\gamma$-exact means an $m$-form in the image of the local $m$-forms and where $I_m$ has boost weight zero and is gauge invariant, i.e. it is a local $m$-form constructed from $\mathcal D_{(A_1}\cdots\mathcal D_{A_j}K_{A)B}$, $\mathcal D_{(A_1}\cdots\mathcal D_{A_j}\bar K_{A)B}$ and GNC components of $\nabla_{(\alpha_1}\cdots\nabla_{\alpha_j)}R_{\mu\nu\sigma\rho}$. Since $\Lambda^g=0$ for $g>1$, it follows that when $g>1$, $w_m$ is $\gamma$-exact. When $g=1$, we have $w_m=\Lambda I_m$ plus a $\gamma$-exact form.
Thus, we have demonstrated lemma 2.5 in our special case γw m = 0.
We now turn to the general case of lemmas 2.4 and 2.5, where we only assume that $\gamma w_m=dw_{m-1}$ for some local $(m-1)$-form $w_{m-1}$. This case will be analyzed using the standard technique of descent equations. To construct the descent equations, we need to appeal to the algebraic Poincaré lemma [25]:

Lemma 2.6 (Algebraic Poincaré Lemma) Let $\xi([\Psi,\Phi],x)$ be an $m$-form on a manifold $C$ that is locally constructed out of smooth fields $\Psi$ and "background" fields $\Phi$ such that $d\xi([\Psi,\Phi],x)=0$ for all $\Psi$ in a neighborhood of a special configuration $\Psi_0$. Assume that all $\Psi$ in that neighborhood can be joined to $\Psi_0$ by a differentiable path $\Psi_\lambda$, $\lambda\in[0,1]$, of smooth configurations such that $\Psi_\lambda|_x$ depends on finitely many derivatives $\partial^k\Psi|_x$ for any $x\in C$ and such that $\Psi_{\lambda=1}=\Psi$. Then $\xi([\Psi,\Phi]$

Proof: [25] We sketch the proof here since we later want to see how $\eta$ preserves certain structures of $\xi$. We will suppress the dependence on the background fields $\Phi$ in this proof. We fix an auxiliary covariant derivative $D_A$ on $C$ (coordinate indices $A,B,\ldots$). Consider the linearization of $\xi$, where $k$ is the maximum number of derivatives of $\Psi$ that $\xi$ depends on. Then we let $\gamma_k$ be defined as in (73). The arguments given in [25] show that $\delta\xi([\Psi],x)-d\gamma_k([\Psi,\delta\Psi],x)$ is a closed local functional depending on up to $(k-1)$ derivatives of $\delta\Psi$, and so verifies the assumptions of the theorem because $\delta\Psi$ is an arbitrary variation. The argument is then iterated to determine $\gamma_{k-1}([\Psi,\delta\Psi],x)$ and so on, and we then set $\gamma([\Psi,\delta\Psi],x)=\sum_{j\le k}\gamma_j([\Psi,\delta\Psi],x)$. Finally, we set (using a dot to denote a derivative w.r.t. $\lambda$) which is local and satisfies the desired equation. Q.E.D.

Remark:
In the applications below, the space of $\Psi$ will consist of certain monomials as in lemmas 2.2 and 2.3, and will have the structure of a vector space. Thus, the interpolating family can be chosen to be simply $\Psi_\lambda=\lambda\Psi$ with $\Psi_0=0$. Furthermore, the forms $\xi$ in question will be at least quadratic in positive boost weight monomials. It is clear from the above proof that $\eta$ will then also be at least quadratic in positive boost weight quantities.
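With the linear path of this remark, the last step of the proof can be made explicit. The following is a sketch under the assumption that the conclusion of the lemma has the standard homotopy form $\xi[\Psi]=\xi[\Psi_0]+d\eta[\Psi]$, with $\gamma([\Psi,\delta\Psi],x)$ the local form built in the proof so that $\delta\xi=d\gamma$ (this $\gamma$ is the form from the proof of lemma 2.6, not the BRST operator).

```latex
% Homotopy formula along a path \Psi_\lambda with \delta\xi = d\gamma (assumed form).
\begin{align*}
  \xi([\Psi],x) - \xi([\Psi_0],x)
  &= \int_0^1 \frac{d}{d\lambda}\,\xi([\Psi_\lambda],x)\,d\lambda
   = d\int_0^1 \gamma([\Psi_\lambda,\dot\Psi_\lambda],x)\,d\lambda .
\end{align*}
% Linear path of the remark: \Psi_\lambda = \lambda\Psi, \Psi_0 = 0, \dot\Psi_\lambda = \Psi.
\begin{equation*}
  \eta([\Psi],x) := \int_0^1 \gamma([\lambda\Psi,\Psi],x)\,d\lambda .
\end{equation*}
```

Since $\gamma$ is linear in its second argument and inherits its dependence on the first argument from $\delta\xi$, a form $\xi$ that is at least quadratic in $\Psi$ gives an integrand with at least one factor of $\lambda\Psi$ and one factor of $\Psi$, which is consistent with the statement that $\eta$ is again at least quadratic.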
We now return to the proof of lemmas 2.4 and 2.5 in the general case. Applying $\gamma$ to $\gamma w_m=dw_{m-1}$ and using $\gamma^2=0$ and $d\gamma+\gamma d=0$, we learn that $d(\gamma w_{m-1})=0$. Since $\gamma w_{m-1}$ depends at least linearly on $\Lambda$, we can apply the algebraic Poincaré lemma with $\Psi=\Lambda$, $\Psi_0=0$ (viewing the other fields as background fields) to learn that $\gamma w_{m-1}=dw_{m-2}$, where $w_{m-2}$ is again local. Iterating this process we get the descent equations where the ghost number increases by 1 at each step. The last equation does not involve a boundary term. Such an equation must exist for some $k$; in the worst case, i.e. the longest descent, we have $k=0$. The important point is now that (as $w_k$ has positive ghost number) we can apply our previous results to $\gamma w_k=0$. We learn $w_k=\gamma b_k+\Lambda I_k$, where the last term $\Lambda I_k$ is present only if $w_k$ has ghost number 1. Thus, if $w_m$ had ghost number 0 (lemma 2.4), the last term is present when $k=m-1$, whereas if $w_m$ had ghost number $g>0$ (lemma 2.5), the last term is present only when $g=1$ and $k=m$. So long as the last term $\Lambda I_k$ is not present, we can consider $\tilde w_{k+1}:=w_{k+1}+db_k$, $\tilde w_j:=w_j$ for $j>k+1$, and then the new ladder $\{\tilde w_m,\ldots,\tilde w_{k+1}\}$ will satisfy by construction a descent ending at $\gamma\tilde w_{k+1}=0$, i.e. we have shortened the descent. We continue with this until we get a nontrivial term $\Lambda I_k$.
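The ladder described above can be written out as follows. This is a sketch assembled from the steps in the text, together with a short check that the redefinition shortens the descent when $w_k=\gamma b_k$; it uses only $\gamma^2=0$ and $d\gamma=-\gamma d$.

```latex
% Descent equations; the ghost number increases by 1 at each step down.
\begin{align*}
  \gamma w_m     &= d w_{m-1}, \\
  \gamma w_{m-1} &= d w_{m-2}, \\
                 &\;\;\vdots \\
  \gamma w_{k+1} &= d w_{k}, \\
  \gamma w_{k}   &= 0 .
\end{align*}
% Shortening step when w_k = \gamma b_k (no \Lambda I_k term):
\begin{align*}
  \gamma w_{k+1} &= d\gamma b_k = -\gamma\,d b_k
  \quad\Longrightarrow\quad
  \gamma\tilde w_{k+1} = 0, \qquad \tilde w_{k+1} := w_{k+1} + d b_k, \\
  \gamma w_{k+2} &= d w_{k+1} = d\!\left(\tilde w_{k+1} - d b_k\right) = d\tilde w_{k+1} .
\end{align*}
```

The second line of the check shows that the rungs above $\tilde w_{k+1}$ are unaffected by the redefinition, since $d^2=0$.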

Lemma 2.5:
Since $g\ge 1$, the ghost number of $\tilde w_j$ is greater than 1 for $j<m$, so we can shorten the descent until we reach $\gamma\tilde w_m=0$. Our previous result gives $\tilde w_m=\gamma b_m+\Lambda^g I_m$ and hence $w_m=\gamma b_m-db_{m-1}+\Lambda^g I_m$, establishing the result (as $\Lambda^g=0$ for $g>1$).

Lemma 2.4:
In this case we have $g=0$ and we can shorten the descent until we reach This gives The left side depends on $\Lambda$ only in differentiated form, by the definition of $\gamma$ and because all forms have boost weight 0, whereas the right side depends on $\Lambda$ only in undifferentiated form. Therefore, since the equation must hold for all $\Lambda$, and since D

Although it is not essential for our proof of gauge invariance of the IWW entropy, it is interesting to note that we can be more explicit about the form of $I_{m-1}$ in Lemma 2.4. In the following Lemma note that $d\beta$ (the curvature of the connection on the normal bundle of $C$) can be expressed in terms of the quantities listed in Lemma 2.4 via equation (58).

Lemma 2.7
The gauge-invariant forms $I_{m-1}$, $I_m$ of Lemma 2.4 can be defined so that $I_{m-1}$ is written in terms of $d\beta$ and characteristic classes built from the curvature of $\mu_{AB}$ as follows:

where $R[\mu]^A{}_B=R[\mu]_{CD}{}^A{}_B\,dx^C\wedge dx^D$ and $p_i$ is an invariant symmetric polynomial of the Lie algebra $so(n-2)$. Such terms can only appear when $m$ is odd.
Proof. Given our metric, corresponding to $\psi=(\alpha,\beta_A,\mu_{AB})$ in its GNC system, let $\psi_0$ be such that $\alpha_0=(\beta_0)_A=0$ and $(\mu_0)_{AB}(v,r,x^A)=\mu_{AB}(0,0,x^A)$. We can construct a 1-parameter family $\psi_\lambda$ with $\psi_1=\psi$ by setting $\alpha_\lambda=\lambda\alpha$, where $J_{m-2}$ is a local form such that $dJ_{m-2}$ is gauge invariant. The latter condition gives $\gamma\,dJ_{m-2}=0$, hence $d\gamma J_{m-2}=0$, but $\gamma J_{m-2}$ has ghost number 1, so we can use the algebraic Poincaré lemma, interpolating $\Lambda$ to 0 (treating other fields as background fields), and deduce that $\gamma J_{m-2}$ is exact. We can now apply Lemma 2.4 to deduce $J_{m-2}=\beta\wedge I_{m-3}+I_{m-2}-db_{m-3}$, where $I_{m-3}$ is gauge-invariant and closed, and $I_{m-2}$ is gauge-invariant. This gives We now repeat the process starting from $I_{m-3}$ to obtain and hence Proceeding inductively we reach On the RHS, $I_k[\psi_0]$ is an identically closed $k$-form constructed locally and covariantly only from $(\mu_0)_{AB}\equiv\mu_{AB}|_C$: these are the characteristic classes built from $R[\mu]^A{}_B=R[\mu]_{CD}{}^A{}_B\,dx^C\wedge dx^D$ [see for example eq. 6.32 of [27], replacing spacetime by $(C,\mu_{AB})$]. The result now follows upon substituting the above equation into our formula for $w_m$ in Lemma 2.4 and setting I ′

Covariant phase space formalism
We shall use the covariant phase space formalism of Iyer and Wald [2], and so we summarize the main results of that formalism that we shall need. Consider a Lagrangian $n$-form $\mathbf L$ depending locally and covariantly on some fields $\Psi_i=(g_{\mu\nu},\Phi_I,A_{J\mu})$ including a Lorentzian metric $g$. When the fields include 1-form fields $A_J$, we also require that $\mathbf L$ is gauge invariant. Local and covariant means by definition that in any coordinate system and at any point $x$, $\mathbf L|_x$ is expressible in terms of a finite number of derivatives $\partial_{\alpha_1}\ldots\partial_{\alpha_d}\Psi|_x$, and that, for every diffeomorphism $f$, $f^*\mathbf L[\Psi]=\mathbf L[f^*\Psi]$. By the Thomas replacement lemma [2], $\mathbf L$ can be written as $L\epsilon$, with $\epsilon$ the positively oriented volume form defined from $g_{\mu\nu}$, where $L$ can be expressed entirely in terms of $g_{\mu\nu}$, $g^{\mu\nu}$, $\nabla_{\alpha_1}\cdots\nabla_{\alpha_k}R_{\mu\nu\alpha\beta}$, $\epsilon_{\alpha_1\ldots\alpha_n}$, $\nabla_{\alpha_1}\cdots\nabla_{\alpha_k}\Phi_I$, and $\nabla_{\alpha_1}\cdots\nabla_{\alpha_k}F_{J\,\mu_1\ldots\mu_{p+1}}$ when $p$-form fields with curvatures $F_J$ are present. However, for most parts of this paper we will not consider matter fields and will treat (local and covariant) Lagrangians depending only on the metric. Note that anti-symmetrized combinations of covariant derivatives can be rewritten in terms of curvature tensors, so without loss of generality we will usually consider symmetrized derivatives as the variable fields in our Lagrangians and other functionals. There are further dependencies among the derivatives of the curvature components arising from the Bianchi identities, which are usually understood to be eliminated by choosing a suitable linearly independent set, see below.
In such a case, the only equation of motion is the Einstein equation $E^{\mu\nu}=0$, with $E^{\mu\nu}$ defined in (83). $E^{\mu\nu}$ specifies the classical dynamics, so any classical physical quantity should be specified only in terms of this dynamical content plus potentially some global quantities of topological nature. The symplectic potential $(n-1)$-form $\theta[\Psi;\delta\Psi]$ is defined by (84). We use the general notation $\delta\Psi$ for the variation along a 1-parameter family of dynamical fields. More generally, we write $\delta_1\Psi$, $\delta_2\Psi$, $\delta_1\delta_2\Psi$ for families depending on multiple parameters. The definition of $\theta[\Psi;\delta\Psi]$ leaves some ambiguities because we may add a $d$-exact covariant term or a topological term to the Lagrangian, $\mathbf L\to\mathbf L+d\mu+\mathbf L_{\rm top}$, without changing the equations of motion, $E^{\mu\nu}_g$. Furthermore, we may add a local and covariant $d$-exact term to $\theta$ itself, $\theta\to\theta+dY$. Altogether, these ambiguities result in the freedom to modify $\theta\to\theta+\delta\mu+\theta_{\rm top}+dY$ without changing the local dynamical content of the theory.
The symplectic $(n-1)$-form is, for a purely gravitational theory, The Noether current is again for a purely gravitational theory, and the above ambiguities of $\theta$ and $\mathbf L$ propagate into ambiguities of $J_X$ and of $\omega$. One can establish [2] the existence of a non-unique Noether charge $(n-2)$-form $Q_X$ satisfying where $C_X$, which we call the constraint, does not depend on derivatives of $X$, and $C_X=0$ when the equations of motion are fulfilled. For a purely gravitational theory as considered here, the constraint is [12] ( To see this, note that equation (84) and the (off-shell) Bianchi identity $\nabla_\mu E^{\mu\nu}=0$ give using Cartan's magic formula for the Lie derivative. Hence $d(J_X[g]+C_X[g])=0$. Since this is true for any $X$ and any $g_{\mu\nu}$, the algebraic Poincaré lemma 2.6 yields the existence of a corresponding local $Q_X$. Note also that the expression (89) is unaffected by the above ambiguity to change $\mathbf L\to\mathbf L+d\mu+\mathbf L_{\rm top}$ and the corresponding ambiguity $\theta\to\theta+\delta\mu+\theta_{\rm top}+dY$, even though $Q_X$ itself is affected by this ambiguity.
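The closure statement $d(J_X+C_X)=0$ can be checked in a few lines. The following is a sketch under assumed conventions that are standard in the Iyer-Wald formalism but are not quoted from the text: the variation of the Lagrangian is normalized as $\delta\mathbf L=\epsilon\,E^{\mu\nu}\delta g_{\mu\nu}+d\theta$, the Noether current is $J_X=\theta[g;\mathcal L_X g]-X\cdot\mathbf L$, and the constraint is taken to be $(C_X)_{\mu_1\ldots\mu_{n-1}}=2X^\nu E_\nu{}^\rho\,\epsilon_{\rho\mu_1\ldots\mu_{n-1}}$. Overall signs and factors depend on these choices.

```latex
% Assumed conventions (not quoted from the text):
%   \delta\mathbf{L} = \epsilon\,E^{\mu\nu}\delta g_{\mu\nu} + d\theta[g;\delta g],
%   J_X = \theta[g;\mathcal{L}_X g] - X\cdot\mathbf{L},
%   (C_X)_{\mu_1\ldots\mu_{n-1}} = 2 X^{\nu}E_{\nu}{}^{\rho}\epsilon_{\rho\mu_1\ldots\mu_{n-1}} .
\begin{align*}
  dJ_X
  &= d\theta[g;\mathcal{L}_X g] - d(X\cdot\mathbf{L}) \\
  &= \left(\mathcal{L}_X\mathbf{L} - \epsilon\,E^{\mu\nu}\mathcal{L}_X g_{\mu\nu}\right)
     - \left(\mathcal{L}_X\mathbf{L} - X\cdot d\mathbf{L}\right)
     && \text{(covariance of } \mathbf{L}\text{, Cartan's formula)} \\
  &= -2\,\epsilon\,E^{\mu\nu}\nabla_{\mu}X_{\nu}
     && (d\mathbf{L}=0,\ \mathcal{L}_X g_{\mu\nu}=2\nabla_{(\mu}X_{\nu)}) \\
  &= -2\,\epsilon\,\nabla_{\mu}\!\left(E^{\mu\nu}X_{\nu}\right)
     && (\nabla_{\mu}E^{\mu\nu}=0) \\
  &= -\,dC_X
     && (d(V\cdot\epsilon) = (\nabla_{\rho}V^{\rho})\,\epsilon) .
\end{align*}
```

With these conventions $J_X+C_X$ is closed for every $X$ and every metric, which is the input needed for the algebraic Poincaré lemma, and $C_X$ involves no derivatives of $X$.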
A general result from [2] is that, up to the ambiguities affecting $\theta$ and $\mathbf L$ just described, $Q_X$ may be written as The ambiguities affecting $\theta$ and $\mathbf L$ result in the change , which represents the total ambiguity in defining the Noether charge. Another useful general identity, which holds for general 1-parameter families of fields which are not necessarily solutions, is obtained starting from using Cartan's magic formula $\mathcal L_X\theta=d(X\cdot\theta)+X\cdot d\theta$ and the definition of $\omega$ in the last step. Now take a $C^1$ codimension-1 surface $S$ in spacetime, take $X$ tangent to $S$, and pull back the above equation for forms to $S$. Since the pull-back of $X\cdot(E_g\,\delta g)$ vanishes, we get: It is important to note that this is an off-shell identity: neither $g$ nor $\delta g$ has to satisfy any equations.
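The chain of steps described in this paragraph can be sketched as follows. The conventions are assumptions, taken to match the standard Iyer-Wald definitions: $\delta\mathbf L=\epsilon\,E^{\mu\nu}\delta g_{\mu\nu}+d\theta$, $\omega[g;\delta_1g,\delta_2g]=\delta_1\theta[g;\delta_2g]-\delta_2\theta[g;\delta_1g]$, $J_X=\theta[g;\mathcal L_Xg]-X\cdot\mathbf L$, and $J_X=dQ_X-C_X$ as implied by $d(J_X+C_X)=0$. The signs in the final line depend on these choices and should be compared with (94).

```latex
% Variation of the Noether current along a 1-parameter family (X field-independent).
\begin{align*}
  \delta J_X
  &= \delta\theta[g;\mathcal{L}_X g] - X\cdot\delta\mathbf{L} \\
  &= \delta\theta[g;\mathcal{L}_X g]
     - X\cdot\!\left(\epsilon\,E^{\mu\nu}\delta g_{\mu\nu}\right)
     - X\cdot d\theta[g;\delta g] \\
  &= \delta\theta[g;\mathcal{L}_X g] - \mathcal{L}_X\theta[g;\delta g]
     + d\!\left(X\cdot\theta[g;\delta g]\right)
     - X\cdot\!\left(\epsilon\,E^{\mu\nu}\delta g_{\mu\nu}\right) \\
  &= \omega[g;\delta g,\mathcal{L}_X g]
     + d\!\left(X\cdot\theta[g;\delta g]\right)
     - X\cdot\!\left(\epsilon\,E^{\mu\nu}\delta g_{\mu\nu}\right).
\end{align*}
% Insert J_X = dQ_X - C_X and pull back to S with X tangent to S,
% where the pull-back of X\cdot(n\text{-form}) vanishes:
\begin{equation*}
  \omega[g;\delta g,\mathcal{L}_X g]\big|_S
  = d\!\left(\delta Q_X - X\cdot\theta[g;\delta g]\right)\big|_S - \delta C_X\big|_S .
\end{equation*}
```

The pull-back of $X\cdot(\epsilon\,E^{\mu\nu}\delta g_{\mu\nu})$ vanishes because evaluating it on $n-1$ vectors tangent to $S$ amounts to evaluating an $n$-form on $n$ vectors lying in an $(n-1)$-dimensional space. No equation of motion is used at any step, in line with the off-shell character of the identity.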

Covariant phase space derivation of IWW entropy current
The aim of this section is to demonstrate the relation between the entropy determined by Wall's procedure and the Iyer-Wald entropy. We will apply the results of the previous section to the case where $S$ is a smooth null hypersurface $\mathcal N$ ruled by affinely parameterized null geodesics. We choose an arbitrary cut $C$ and introduce GNCs as described in section 1.3. We now define the vector field $K=v\partial_v-r\partial_r$ generating a boost in the coordinates $r$, $v$. We take $X=K$ and consider the constraint $C_K$, which we write The pull-back of $C_K$ to $\mathcal N$ is $2vE_{vv}\sqrt{\mu}\,dv\wedge d^{n-2}x$ in GNCs. To analyze the structure of $E_{vv}$ it is useful to introduce an equivalence relation on functionals on $\mathcal N$:

Definition 3.1 Two functionals $A$, $B$, each of which is a sum of primitive monomials (definition 2.1) with definite boost weight $p$, are said to be equivalent, $A\sim B$, if $A-B$ contains only monomials having at least two primitive factors with positive boost weight.
In the following, functionals of boost weight $p$ will be denoted with symbols such as $A^{(p)}$, $X^{(p)}$, $\ldots$. Consider a functional $A^{(2)}$ of boost weight $+2$ (later we will take this to be $E_{vv}$). One can use a "partial integration" argument to show that there exist $X^{(-k)}$, $W^{(1)A}$, $k\ge 0$, such that (i) $X^{(-k)}$ is a sum of primitive monomials for which each primitive factor has non-positive boost weight, such that (ii) $W^{(1)A}$ is a sum of primitive monomials for which precisely one primitive factor has positive boost weight, and such that (iii) Note that (i) implies that each primitive monomial in an $X^{(0)}$ term must be a product of primitive factors with zero boost weight. A derivation of the above result was sketched in [4] and explained in more detail in [12].
The key trick is now to define a 1-parameter family of metrics interpolating between a stationary "background" metric (similarly as in [2]) and some other metric. Let $g_{\mu\nu}(x^\alpha)$ be the given metric expressed in GNCs and read off $\psi\in\{\alpha,\beta_A,\mu_{AB}\}$. The background fields are defined by $\bar\psi$ where $\chi$ is smooth and of compact support and identically equal to 1 in a neighborhood of the origin, and $\{c_p\}$ is a sequence of positive numbers increasing sufficiently fast for the series to converge (footnote 17). From $\bar\psi$ we define a corresponding background metric $\bar g_{\mu\nu}$ in GNCs. This background metric has the properties:
• $\mathcal L_K\bar g=0$. In particular, the background metric $\bar g_{\mu\nu}$ is such that $\mathcal N$ is a Killing horizon with bifurcation surface $C$. Note that this background metric depends on the original choice of GNCs.
• In the background, any primitive factor with positive boost weight vanishes on $\mathcal N$ and any primitive factor with negative boost weight vanishes on $C$. This implies that when $X^{(k)}$ has positive boost weight $k$, then $X^{(k)}[\bar g]=0$ on $\mathcal N$, and when $X^{(-k)}$ has negative boost weight $-k$, then $X^{(-k)}[\bar g]=0$ on $C$.
• If $X^{(0)}$ is a sum of primitive monomials where each monomial involves only primitive factors of boost weight 0 then

Next, for $\lambda\in[0,1]$ (the interpolation parameter) and for an arbitrary smooth $\delta\psi$, we set and from $\psi(\lambda)$ we define a corresponding 1-parameter family of metrics $g_{\mu\nu}(\lambda,x^\alpha)$ in GNCs (footnote 18). This 1-parameter family has the properties:
• $g|_{\lambda=0}=\bar g$.
• If we choose $\delta\psi=\psi-\bar\psi$ then $g|_{\lambda=1}$ is the original metric $g$. Note that we do not make this choice at the beginning of the following argument.
We now take $A^{(2)}=E_{vv}$ in (97) and consider the first variation of this equation within our 1-parameter family. Using the properties just described, we get on $\mathcal N$ (schematically denoting by $Y^{(k)}$ the monomials because all other terms have an unvaried monomial with positive boost weight and such a monomial vanishes in the background $\bar g$. The terms with $k>0$ in curly brackets can be rewritten as ∂ We will now show that the terms with $k=0$ are related to the Noether charge using the restriction of eq. (94) (with $X=K$) to $\mathcal N$ (footnote 19) and evaluated on the background $\bar g$: as $\mathcal L_K g|_{\lambda=0}=0$ by construction.

We now assume that $\delta\psi$ has compact support, so $\frac{\partial}{\partial\lambda}g|_{\lambda=0}$ also has compact support. If we integrate the above equation over the $v>0$ portion of $\mathcal N$, use the structure (100), and perform integrations by parts in $v$, we obtain In deriving this equation it has been used that $X^{(-k)}(g|_{\lambda=0})=0$ for a negative boost weight $-k$ quantity on $C$, we have used that all terms with an explicit dependence on $v$ vanish at $C$, and that divergence terms (i.e. D A ∂ ∂λ (⋆Q Ar where $q^{(0)}$ is a sum of primitive monomials for which all primitive factors have boost weight 0 (footnote 20) and the ellipsis represents a sum of primitive monomials, each containing at least one primitive factor of strictly positive boost weight and at least one primitive factor of strictly negative boost weight (since the total boost weight is 0). The Iyer-Wald entropy is defined as the integral over a horizon cross-section of $2\pi\sqrt{\mu}\,q^{(0)}$ [2]. The terms in the ellipsis of (104) are $O(\lambda^2)$ on $C$ and therefore their $\lambda$-derivative vanishes on $C$ for $\lambda=0$, so $q^{(0)}$ is the only surviving term on $C$ after we take a variation off our stationary background $g|_{\lambda=0}$. Therefore we have derived By construction, the integrand here has the general form where $\bar D_A$ is the covariant derivative of $\bar\mu_{AB}=\mu_{AB}|_{\lambda=0}$.

The integral over $C$ of this quantity vanishes for completely arbitrary $\delta\psi=\frac{\partial}{\partial\lambda}\psi|_{\lambda=0}$, and in particular the restrictions $(\partial_v\partial_r)^p\delta\psi|_C$ can be chosen independently. Therefore we can write (still on $C$)

[Footnote 19: When considering the restriction of forms to $\mathcal N$ it is useful to work with the Hodge duals of the forms; for example we write $(Q_X)_{\mu_1\ldots\mu_{n-2}}=\frac12\epsilon_{\mu_1\ldots\mu_{n-2}\alpha\beta}(\star Q_X)^{\alpha\beta}$.]

[Footnote 20: By (91), we may determine $q^{(0)}$ by considering all contributions to $E_R^{rvrv}$ made up exclusively of boost weight 0 primitive factors [2].]
We will now argue that (107) holds not just on $C$ but on any constant-$v$ cross section $C(v)$ of the horizon. To this end we note that evaluating the integrals over $C(v)$ instead of $C$ is equivalent to evaluating them for the variation $T_v^*\delta\psi$ instead of $\delta\psi$, where $T_v$ is a translation by $v$. This is because, on $\mathcal N$, all boost weight zero quantities are invariant under $T_v$ in the background $\bar\psi$. But (107) is an identity that holds for arbitrary variations $\delta\psi$ of compact support, and in particular also for $T_v^*\delta\psi$. Hence we conclude that this equation holds on $C(v)$. Then we can take a derivative with respect to $v$ and note that this equation becomes ∂ ∂λ where Thus, in a GNC system, the first variation of $E_{vv}$ around a background $\bar\psi$ of the form described above can be written on $\mathcal N$ as where with $q^{(0)}$ as in (104), $X_{q,\psi}$ as in (97) for $A^{(2)}=E_{vv}$, and where with $V^{(1)}$ as in (110) and $W^{(1)}$ as in (97) for $A^{(2)}=E_{vv}$.

In deriving this result we assumed a perturbation of compact support. If we have instead a perturbation of non-compact support then, for any compact subset of $\mathcal N$, we can define a perturbation of compact support that agrees with the original perturbation on this subset, and therefore the above result holds for the original perturbation on this subset. Since the subset is arbitrary, it follows that this result holds on all of $\mathcal N$ even for perturbations of non-compact support. If $\mathcal N$ is a black hole event horizon then, when substituted into (13), the first term of (112) gives the Iyer-Wald (IW) entropy [2]. The other terms in (112) are those predicted by the argument of Wall [4]. We will therefore refer to (112) as the Iyer-Wald-Wall (IWW) entropy density. The need for the term in (111) involving $s^A$ was pointed out in [12].
We will now investigate $E_{vv}$ at the fully non-linear level. Consider a metric $g_{\mu\nu}(x^\alpha)$ (not necessarily a solution) in GNCs with corresponding $\psi\in\{\alpha,\beta_A,\mu_{AB}\}$, define $\bar\psi$ as the series in (98), choose $\delta\psi$ to be $\psi-\bar\psi$, and define $g_{\mu\nu}(\lambda,x^\alpha)$ to be the 1-parameter family of metrics in GNCs corresponding to $\psi(\lambda)$ as in (98). By construction, this family interpolates between the stationary metric $g|_{\lambda=0}$ and $g|_{\lambda=1}$, which is equal to the given metric $g$. Now let Using (97) and the expressions for $s^v$ and $s^A$ gives (recall definition 3.1) We expand $F$ to all orders in $\lambda$ around the background $\lambda=0$. The zeroth order term must vanish on $\mathcal N$ since all positive boost weight quantities vanish on $\mathcal N$ in the background. The linear term must vanish on $\mathcal N$ by (109) or (111). Hence $F$ contains only terms of quadratic or higher order in $\lambda$.
If we consider a typical monomial in $F$ then it is a product of factors with different boost weights. If we evaluate this on $C$ then the zero boost weight factors coincide with those of $\bar\psi$. For a non-zero boost weight factor we have Thus the power of $\lambda$ in any given monomial equals the number of factors in that monomial that have non-zero boost weight. Since non-zero boost weight arises from $v$ or $r$ derivatives, the number of such factors will be bounded if the total number of derivatives in each monomial of $E_{vv}$ is bounded above (as is the case if we consider an EFT in which we include terms in the Lagrangian only up to some fixed total number of derivatives). In such a case, the expansion in $\lambda$ terminates on $C$.
We will now show that $F\sim 0$. Note that the expression in square brackets in (115) involves no terms of boost weight higher than 1. Therefore expanding out the $v$-derivative gives $F\sim\sum_{n,q,\psi}$ where each $A^{(0)}$ is a sum of monomials within which each factor has zero boost weight. Quantities that are ignored in the $\sim$ relation involve monomials with at least two positive boost weight factors, and such terms are of quadratic order, or higher, in $\lambda$. Hence However, as we have explained above, the terms in $F$ linear in $\lambda$ must cancel for any $\psi$. In particular, if we fix $\bar\psi$ and then construct $\psi$ by specifying $\delta\psi$ then the linear terms above must cancel. Since this $\delta\psi$ is arbitrary, this implies that each $A^{(0)}[\bar\psi]=0$ on $C$. Our choice of $\bar\psi$ here was arbitrary so this must hold for any $\bar\psi$. Now given $\psi$, let $\psi_{v_0}=T_{v_0}^*\psi$ where $T_{v_0}$ is the diffeomorphism corresponding to translation in $v$ by $v_0$. We then have where $C(v)$ is a constant-$v$ cut of $\mathcal N$. But $v_0$ is arbitrary here, so it follows that each $A^{(0)}[\psi]$ vanishes on all of $\mathcal N$. This implies $F\sim 0$ as claimed.
We summarize the discussion so far in the following lemma:
• Each $s^v$, $s^A$, $F$ is a sum of primitive monomials.
• Each primitive monomial in $s^v$, $s^A$, $F$ has boost weight 0, 1, 2, respectively.
• Each primitive monomial in $F$ contains at least two primitive factors with strictly positive boost weight.

This justifies equation (9). The main novelty of our approach compared to earlier work [4,11,12] has been to demonstrate the connection to the Iyer-Wald entropy. Note that we could add extra terms to $s^v$ and $s^A$ that are of quadratic (or higher) order in primitive factors of positive boost weight without affecting the above results. This will be important below.

Transformation of IWW entropy under a change of GNCs
The above procedure for determining $s^v$ and $s^A$ worked with the GNC components of the metric. Even though GNCs are defined geometrically starting from a cross section $C$ of $\mathcal N$, their construction requires, apart from the chosen cut $C$, a choice of affine parameter for each null generator of $\mathcal N$. Under the remaining reparameterization freedom, $\alpha$, $\beta_A$, $\mu_{AB}$ and their derivatives in general transform in a very complicated way (e.g. equation (45)), so it is not clear that, for a given cut $C$, $s^v$ is fully covariant, i.e. a functional of $g_{\mu\nu}$ that is independent of arbitrary choices apart from the cut $C$. Thus, we must investigate whether, and how, $s^v$ transforms under a change of GNCs.
Consider a change of coordinates $x^\mu=x^\mu(x'^\alpha)$ in a neighborhood of $\mathcal N$ preserving the Gaussian null form, see (37). Thus, on $\mathcal N$, we have $(v,x^A)=(a(x'^C)v',x'^A)$, and off $\mathcal N$ the (in general very complicated) form of the coordinate transformation is described in sec. 2.1. The GNC components of $g$ in the primed coordinates are denoted by $\psi'\in\{\alpha',\beta'_A,\mu'_{AB}\}$. We want to understand the behavior of $s^v$ under such a transformation. If the multiplication factor $a$ in (37) is constant, then this behavior is trivially determined by the boost weight of each quantity, so we get $s^v[\psi]=s^v[\psi']$ ($a$ constant), because $s^v$ has boost weight 0. But for non-constant $a$, the transformation is not at all evident. In this subsection, we shall show (Proposition 1) that $s^v$ can be adjusted by terms quadratic in positive boost weight quantities, and a total divergence (which does not affect the total entropy), such that the modified $s^v$ is manifestly invariant under a change of GNCs.
To investigate this, we first define $l=g_{\mu\nu}l^\nu dx^\mu$, and define, uniquely, an $(n-1)$-form $\epsilon_l$ on $\mathcal N$ by demanding that $l\wedge\epsilon_l=\epsilon$ and $n\cdot\epsilon_l=0$. In our GNCs this gives $\epsilon_l=\sqrt{\mu}\,dv\wedge dx^1\wedge\ldots\wedge dx^{n-2}$. The constraint $(n-1)$-form $C_l$ is defined as in (96), giving $C_l=2E_{vv}\epsilon_l$. Similarly we define $F_l=F\epsilon_l$. We now define the entropy current as an $(n-2)$-form $s$ on $\mathcal N$ by In forms notation, (114) can be rewritten as In our second set of GNCs we have $l'^\mu=al^\mu$ on $\mathcal N$ and so the primed version of the above equation gives In this formula, $s'$ is the same functional as $s$ but evaluated on the transformed $\psi'\in\{\mu'_{AB},\beta'_A,\alpha'\}$ and in the transformed coordinates. $s'$ can be transformed back to the original coordinates in the usual way using the Jacobian of the coordinate transformation.
Since $C_l$ is defined by contracting $l$ with a local, covariant tensor, we have $C_{al}=aC_l$. We also have $\mathcal L_{al}ds'=d(al\cdot ds')=a\mathcal L_l ds'+da\wedge l\cdot ds'=a\mathcal L_l ds'$, where the final equality uses $l\cdot ds'\propto dx'^1\wedge\ldots\wedge dx'^{(n-2)}$ and $da=(\partial_{A'}a)dx^{A'}$. Combining these results gives We shall now show that this equation implies that the IWW entropy is gauge invariant.

In our first set of GNCs, let $\bar\psi$ arise from a "background" metric (not necessarily a solution) for which all positive boost weight primitive factors vanish on $\mathcal N$. This need not be a background constructed as in the previous subsection (e.g. negative boost weight quantities need not vanish on $C$). The most interesting case is when $\bar\psi$ corresponds to a stationary black hole solution with event horizon $\mathcal N$. Now we perturb this background: let $\psi(\lambda)$ be a 1-parameter family such that $\psi(\lambda=0)=\bar\psi$. Let the corresponding metric be $g_{\alpha\beta}(\lambda,x^\mu)$. Consider transforming this metric to the "primed" set of GNCs. These are defined by a function $a>0$ on the cut $C$. On $\mathcal N$ the coordinate transformation does not depend on $\lambda$. However, since the definition of GNCs depends on the metric, the change of coordinates does depend on $\lambda$ away from $\mathcal N$: $x^\mu=x^\mu(\lambda,x'^\alpha)$. Under this coordinate transformation, we obtain $g=g'_{\alpha\beta}(\lambda,x'^\mu)dx'^\alpha dx'^\beta$, corresponding to ψ ′ (λ, will in general be very complicated on $\mathcal N$ (with explicit factors of $v$ but not $r$), but it follows from Lemma 2.1 that a positive boost weight primitive factor always transforms to an expression only involving positive boost weight primitive factors times appropriate powers of $v$ (which has boost weight $-1$). Hence $\psi'(\lambda=0)$ has the property that positive boost weight primitive factors vanish on $\mathcal N$. As a consequence, because $F_l$ is at least quadratic in positive boost weight primitive factors, we obtain from (122) Now assume that the perturbation $\partial\psi(\lambda,x^\mu)/\partial\lambda|_{\lambda=0}$ has compact support (we relax this below).

Integrating up the above equation gives We now use the algebraic Poincaré Lemma (lemma 2.6) taking $\Psi_\lambda=\lambda(\partial_\lambda\psi|_{\lambda=0})$ and $\Phi=(\bar\psi,a)$. This gives where $b_{n-3}$ is a local form depending on $\partial_\lambda\psi|_{\lambda=0}$, $\bar\psi$ and $a$. Next we investigate $(s-s')|_{\lambda=0}$. Because all positive boost quantities built from ψ(λ = 0, ] (as $q^{(0)}$ only contains zero boost weight primitive factors), Then, because $E_R^{\mu\nu\sigma\rho}$ is a covariant tensor we easily see Combining (125) and (126) we find Integrating this equation over $C$ gives where the LHS refers to the IWW entropy defined w.r.t. the two different sets of GNCs.
Hence we have shown that the IWW entropy is gauge invariant to linear order in perturbations around our background metric. Note that this is an off-shell result: neither the background nor the perturbation need satisfy any equations of motion. In deriving this result we assumed that the perturbation has compact support. For a non-compactly supported perturbation we can pick a perturbation of compact support that agrees with the given perturbation in a neighbourhood of C and then apply the above result.
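The rescaling identity $\mathcal{L}_{al}\, ds' = a\mathcal{L}_l\, ds'$ used in the argument above can be spelled out with Cartan's formula. The following is a short worked version, assuming only what the text states: $ds'$ is a top-degree form on the $(n-1)$-dimensional surface $N$, $a$ depends only on the coordinates $x'^A$, and $l \cdot ds'$ is proportional to $dx'^1 \wedge \ldots \wedge dx'^{(n-2)}$.

```latex
\begin{align}
\mathcal{L}_{al}\, ds'
  &= d\big( (al)\cdot ds' \big) + (al)\cdot d(ds')
     && \text{(Cartan's formula)} \\
  &= d\big( a\,(l\cdot ds') \big)
     && (d^2 = 0) \\
  &= da \wedge (l\cdot ds') + a\, d(l\cdot ds')
     && \text{(Leibniz rule)} \\
  &= da \wedge (l\cdot ds') + a\,\mathcal{L}_l\, ds'
     && \text{(Cartan's formula and } d^2 = 0 \text{ again)} \\
  &= a\,\mathcal{L}_l\, ds' .
\end{align}
```

The last step holds because $da \wedge (l \cdot ds')$ would be an $(n-1)$-form built only from the $n-2$ one-forms $dx'^A$, and any such form vanishes.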
We shall now extend this to a fully nonlinear result, eliminating any reference to the background metric. We assume that we have some given metric $g_{\alpha\beta}$ and a set of GNCs, with corresponding quantities $\psi$. We now define a background metric $\bar\psi$ such that, on $C$, primitive factors of zero or negative boost weight agree with those of our original metric. We do this by setting

where $\chi$ is of compact support, with $\chi$ identically equal to 1 in a neighbourhood of 0, and where $c_p$ goes to infinity suitably rapidly so as to make the sums convergent. The corresponding metric $\bar g_{\alpha\beta}$ is our background metric. We now define the one-parameter family

where $\delta\psi$ is of compact support. We let $g_{\alpha\beta}(\lambda, x^\mu)$ denote the corresponding family of metrics in GNCs. We choose $\delta\psi$ such that $\delta\psi = \psi - \bar\psi$ in a neighbourhood of $C$ so, on $C$, all derivatives of $g_{\alpha\beta}(\lambda = 1, x^\mu)$ coincide with those of the original metric. Since we will in the end be interested in the behaviour of $s^v$ on the chosen $C$, we may therefore replace $g_{\alpha\beta}(x^\mu)$ with $g_{\alpha\beta}(\lambda = 1, x^\mu)$. We now apply (127) to $g_{\alpha\beta}(\lambda, x^\mu)$. By construction, the $O(\lambda^2)$ terms must on $C$ be at least quadratic in positive boost weight quantities. This is because each factor of $\lambda$ must be accompanied by $\delta\psi$, and on $C$ the only non-vanishing quantities linear in $\delta\psi$ are those with positive boost weight (e.g.
Similarly, on $C$ we can write terms involving $\bar\psi$ in terms of $\psi$ using the definition of $\bar\psi$ (e.g.

where the ellipses represent terms that, when expressed w.r.t. the first set of GNCs, are at least quadratic in positive boost weight primitive factors. These terms depend on $a$ in an extremely complicated way, but we will now show by a general argument using lemmas 2.2-2.5 that they can always be reabsorbed in a redefinition of $s^v$, thereby rendering this quantity invariant (up to a total derivative) under a change of GNCs. For this purpose, it is useful to pass to an infinitesimal version of the transformation (131), which in view of (47) is fully equivalent to the transformation law (131) under finite GNC coordinate transformations.
Consider an $a$ of the form $e^{t\Lambda/2}$ and take a derivative of the above equation with respect to $t$ at $t = 0$. Calling the corresponding infinitesimal transformation $\gamma$ (given in the basis of monomials of lemma 2.3 concretely by (67)), we see that, on $C$,

where $X^A$, $Y$ are local and $Y$ is linear in $\Lambda$ and at least quadratic in positive boost weight factors. Taking $\gamma$ of this equation, we find a consistency condition on $Y$, which is, on $C$,

We can use lemma 2.5 to characterize the solutions to this cohomological problem, identifying $Y\sqrt{\mu}\, d^{n-2}x =: w_{n-2}$, $-\gamma(\star X_A dx^A) =: w_{n-3}$. Lemma 2.5 made no assumption that $w_{n-2}$ be quadratic in positive boost weight factors, but its proof is constructive, and we can see from the proof that the solution provided by the lemma can be chosen to be at least quadratic in positive boost weight. Firstly, the forms in the descent equations constructed in the proof do not leave the space of local forms at least quadratic in positive boost weight factors and with overall boost weight zero. Indeed, the BRST operator $\gamma$ preserves this space, and applying the algebraic Poincaré lemma to construct the subsequent rungs in the descent equations, we do not leave this space either, as we can see from its proof. Secondly, the arguments used to analyze the bottom rung of the descent equations, which rely on the method of contractible pairs, can be carried out in that space, too, because the BRST operator $\gamma$, the exterior differential $d$ on $C$, and the operators $N$, $\rho$ [see (69), (70)] all commute with the boost weight number operator.$^{21}$ Thus, lemma 2.5 gives, on $C$,

for local forms $Z$, $U^A$, $V$ at least quadratic in positive boost weight and $V$ invariant under changes of GNCs, and all quantities have overall boost weight zero. Therefore, on $C$,

which must hold for all $\Lambda$, $\psi$.
The terms containing undifferentiated $\Lambda$ in the BRST transformation $\gamma$ (67) must cancel in the above expression because $Z$, $s^v$ have boost weight 0, so the explicit term $\Lambda V[\psi]$ with $\Lambda$ in undifferentiated form must be cancelled by the total divergence term. Thus, $V$ has to be a total divergence of a local quantity of boost weight 0 which is at least quadratic in positive boost weight monomials, $V = D_A W^A$. Therefore

which still satisfies (114) for a new $F[\psi]$ that is at least quadratic in positive boost weight, and which satisfies additionally

This already shows that our modified IWW entropy $\int_C s^v \sqrt{\mu}\, d^{n-2}x$ is gauge invariant.
We can say more about the structure of the redefined $s^v$ using Lemmas 2.4 and 2.7 for $m = n - 2$, setting $w_{n-2} = s^v \sqrt{\mu}\, d^{n-2}x$. The non-covariant terms $\beta \wedge I_{n-3}$ in Lemma 2.4 involving the characteristic classes $I_{n-3}$ as described in Lemma 2.7 (which can only appear if $n$ is odd) consist exclusively of monomials in the primitive basis such that each primitive factor has zero boost weight. Such terms are contained in the IW-part of the entropy, i.e. the term $q^{(0)}$ in (112), which in turn is given by that part of the manifestly covariant expression $E_R^{rvrv}$ consisting only of zero boost weight terms. Thus, non-covariant terms $\beta \wedge I_{n-3}$ must in fact be absent in $s^v$ and we have shown:

Proposition 1 (Invariance of modified IWW-entropy density) $s^v$ can be modified by terms at least quadratic in positive boost weight [so it still satisfies the properties claimed in lemma 3.1 and (114)] in such a way that it is a sum of:
• A local functional of boost weight 0 that is a contraction of the following factors:
• A total divergence $D_A \xi^A$, where $\xi^A$ is a local functional of boost weight 0.
In particular, on $C$, $s^v$ is invariant up to a total divergence under a change of the affine parameter in GNCs.
Remark: Proposition 1 uses that $L$ is locally and covariantly constructed out of the metric $g_{\mu\nu}$, meaning that it is a functional of $g_{\mu\nu}$, $\epsilon_{\mu_1 \ldots \mu_n}$ and covariant derivatives of the Riemann tensor (in a purely gravitational theory). Hence Proposition 1 does not cover the case of gravitational Lagrangians that are covariant only up to $d$-exact terms like Chern-Simons terms in odd dimensions $n$, such as $\mathrm{tr}(\Gamma \wedge d\Gamma + \tfrac{2}{3}\Gamma \wedge \Gamma \wedge \Gamma)$ in $n = 3$, where $\Gamma$ is the spin-connection associated with a 3-bein of $g_{\mu\nu}$. Such terms will result in terms of the type (78) in $s^v$. For example, from a gravitational Chern-Simons term in $L$ in $n = 5$, we can see from the formulas for Christoffel symbols in GNCs in Appendix A that we would get a Chern-Simons term of the form $\beta \wedge d\beta$ in $s^v \sqrt{\mu}\, d^3x$, consistent with Lemmas 2.4 and 2.7.
On the other hand, fully covariant topological terms in $L$ are covered by our analysis. For example, in any even dimension an Euler density in $L$, i.e. $e_n[g] := R[g]^{\mu_1\mu_2} \wedge \cdots \wedge R[g]^{\mu_{n-1}\mu_n} \epsilon_{\mu_1 \ldots \mu_n}$, will result in an Euler density term $e_{n-2}[\mu]$ in $s^v \sqrt{\mu}\, d^{n-2}x$, whereas a Chern class in $L$, i.e. $c_n[g] := R[g]^{\mu_1}{}_{\mu_2} \wedge \cdots \wedge R[g]^{\mu_{n-1}}{}_{\mu_1}$, will result in a term of the form $c_{n-4}[\mu] \wedge d\beta$ in $s^v \sqrt{\mu}\, d^{n-2}x$. (Recall that $d\beta$ can be eliminated in favour of Riemann components using (58).) All of these terms are built from boost weight zero quantities and are already present in the IW-part of $s^v$.

Improved entropy current

4.1 Vacuum gravity
For simplicity we shall consider vacuum gravity in this subsection, i.e., no matter fields. (In the next subsection we shall consider gravity coupled to a scalar field.) If the spacetime dimensionality is odd then we assume parity symmetry, again for simplicity. Relaxing this assumption should be straightforward. We assume the Lagrangian takes the form$^{22}$

with a single UV length scale $\ell$. $L_n$ is a local covariant scalar, depending only on the metric and its derivatives, and of dimension $n + 2$, i.e., it involves $n + 2$ derivatives of the metric. We define the dimension of $\ell$ to be $-1$ and the dimension of $\Lambda$ to be $+2$. Each derivative $\partial_\mu$ carries one index, and $g_{\mu\nu}$ has two indices, so non-zero $L_n$ is possible only for even $n$.$^{23}$ It will be convenient in this section to imagine that the $L_n$ are known for all $n$. It is then not hard to obtain results for which only finitely many of the $L_n$ are known, as we will discuss at the end of this section. The Einstein equation has the form

where again only even $n$ appears in the sum and $H_{n\,\mu\nu}$ has dimension $n + 2$.
Recall from section 1.7 that we are not interested in all solutions of (138). Instead, we only consider solutions that fall within the regime of validity of EFT. This means that we restrict attention to solutions that, in some set of GNCs near $N$, are slowly varying compared to the scale $\ell$, i.e., $\ell/L \ll 1$ where $L$ is the length scale of variation of the fields. In the case of a dynamical black hole, $L$ will be the minimum of the size of the black hole and any length/time scales associated with the dynamics. Writing this out formally we have

Definition 4.1 (Validity of EFT condition.) We consider a 1-parameter family of smooth solutions of (138) with parameter $L$, such that $N$ is a smooth null hypersurface for all members of the family. We assume that there exist GNCs defined near $N$ such that, near $N$, if $T_n$ is a quantity of dimension $n$ (in the sense of Def. 2.3) constructed from $\{\alpha, \beta_A, \mu_{AB}\}$ and their derivatives then there is a dimensionless constant $C_n$ (independent of $L$) such that $|T_n| \le C_n/L^n$. We also require $|\Lambda| L^2 \le 1$. Then EFT is valid for sufficiently small $\ell/L$ (i.e. large enough $L$).
For example, on the RHS of (138), $|H_{n\,\mu\nu}|$ is bounded above by $C_n/L^{n+2}$ so $\ell^n H_{n\,\mu\nu} = O(\ell^n/L^{n+2})$. As in section 1.7, from now on we shall not indicate the $L$-dependence explicitly, so we would write $\ell^n H_{n\,\mu\nu} = O(\ell^n)$. Factors of $L$ can be reinstated by dimensional analysis.
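As a rough illustration of this power counting (a sketch, not part of the original argument), one can compare the bound on the $n$-th correction with the bound on the two-derivative terms of the equation of motion, which have dimension 2 and are therefore bounded by a constant times $1/L^2$ under Definition 4.1. The constant $C_0$ below is a hypothetical label for that bound.

```latex
\begin{align}
|\ell^{n} H_{n\,\mu\nu}|
  &\le C_n\, \frac{\ell^{n}}{L^{n+2}}
   = \frac{C_n}{L^{2}} \left( \frac{\ell}{L} \right)^{n}, \\
\frac{C_n\, \ell^{n} / L^{n+2}}{C_0 / L^{2}}
  &= \frac{C_n}{C_0} \left( \frac{\ell}{L} \right)^{n} .
\end{align}
```

For instance, with $\ell/L = 10^{-2}$ the $n = 2$ bound is suppressed by $10^{-4}$ and the $n = 4$ bound by $10^{-8}$ relative to the two-derivative bound, up to the $L$-independent ratios $C_n/C_0$.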
Recall the definition of $F$ in equation (114). We can decompose this into a sum of terms arising from the various terms in (138)

where $F_n$ has dimension $n + 2$ and has the same general structure as described in lemma 3.1.
As above, only even $n$ can appear in the sum. This is because only even powers of $\ell$ appear in the Einstein equation. $F_0$ is the expression coming from the Einstein-Hilbert part of the Lagrangian and reads

In the EFT spirit, we might expect $F_0$, which is sign definite, to dominate the higher order terms in $\ell$. This would require in particular that if $K_{AB}$ vanishes, then so should $F_n$, $n > 0$. However, this is not true in general. Nevertheless, we can still make progress with this idea.
To do this we need to understand the structure of the terms $F_n$. We start by using the equation of motion (138) to obtain the following result:

Lemma 4.1 On-shell, $F$ can be written as in (139) where each $F_n$ is a polynomial in

and GNC components of covariant derivatives of the Ricci tensor. The idea now is to use the Einstein equation (138) to eliminate the terms involving the Ricci tensor. In doing this we replace each occurrence of a Ricci tensor either with a multiple of $\Lambda g_{\mu\nu}$ or with powers of $\ell^2$ times curvature terms of higher dimensions. Then we decompose the GNC components of the latter terms again as in lemma 2.2, eliminate any occurrence of a Ricci tensor again and so forth. At each step, any Ricci terms arise with explicit factors of $\ell^2$ and therefore get pushed to higher order in $\ell$. The end result is that, on-shell, $F$ can be written in terms of the quantities stated in the lemma, up to terms vanishing to infinite order in $\ell$. Finally we can express this on-shell result in the form (139) where each $F_n$ can be written in terms of the quantities just listed, i.e., explicit occurrences of Ricci tensors and their covariant derivatives have been eliminated.
Since $F_n$ has dimension $n + 2$, each term in $F_n$ contains at most $n + 2$ derivatives. Terms with fewer than $n + 2$ derivatives must have enough factors of $\Lambda$ to make the dimension up to $n + 2$. Q.E.D.
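A hypothetical monomial illustrates this counting. It is chosen for illustration only and is not claimed to appear in any particular theory. Counting one unit of dimension per derivative and two per factor of $\Lambda$, and using that $K_{AB}$ contains one $v$-derivative of $\mu_{AB}$:

```latex
\begin{align}
\dim\big[ \Lambda\, K_{AB} K^{AB} \big]
  &= \underbrace{2}_{\Lambda} + \underbrace{1 + 1}_{K_{AB} K^{AB}}
   = 4 = n + 2
  \quad \text{for } n = 2 .
\end{align}
```

Such a term contains only two derivatives, yet it has the right dimension to contribute at order $\ell^2$ because the factor of $\Lambda$ makes up the difference. It is also quadratic in positive boost weight quantities, as required of the terms in $F_n$.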
So far we assumed that our EFT is defined to all orders in $n$. If only terms with $n \le N - 2$ are known then the Einstein equation will contain unknown "error terms" of order $\ell^N$ (equation (14)). The off-shell expression for $F$ will have a similar form. The above argument still works and any terms of order $\ell^N$ or higher can be absorbed into the error term in $F$. The on-shell expression for $F$ will be a sum of terms $F_n$ with $n \le N - 2$, each of the form just described, and an error term of order $\ell^N$.
We will now argue that the on-shell structure of $F$ can be further rearranged in a "nice" way. The argument involves induction on the order in $\ell$:

Induction hypothesis. For each even $n \ge 0$, there are local tensors $X_{nAB}$, $\varsigma^v_n$ having boost weights $(1, 0)$ and local tensors $y^A_n$, $O_n$, each depending polynomially on the quantities listed in lemma 4.1, such that on-shell we have

such that $X_{nAB}$ is symmetric, $O_{n+2}$, $D_A y^A_j$, $\varsigma^v_n$ have the same general structure as $F$ described in lemma 3.1 (in particular they are at least quadratic in positive boost weight quantities) and

as well as $X_{nAB} = O(\ell^2)$. Furthermore $\varsigma^v_j$ and $y^A_j$ depend explicitly on $\ell$ only through an overall factor $\ell^j$.
Induction step ($n - 2 \to n$, $n \ge 2$). Consider $O_n$ written in terms of the quantities described in lemma 4.1. The induction hypothesis gives $O_n = O(\ell^n)$ and that $O_n$ is a sum of terms where each term is at least quadratic in terms of positive boost weight, and so must contain at least two factors of the form $D_{(A_1} \cdots D_{A_j)} \partial_v^{p-1} K_{AB}$ ($p \ge 1$) (as other possible factors listed in lemma 4.1 have non-positive boost weight). In terms of $\mu_{AB}$ this means that $O_n$ must take the schematic form

where $p, p' > 0$ in all terms, the coefficient $A_{n,k,p,k',p'}$ has boost weight $2 - p - p'$ and, aside from an overall factor of $\ell^n$, depends only on $\beta$, $D^k\beta$, $\Lambda$, $\mu$ and derivatives of $\mu$. $O_{n+2}$ satisfies (142). Here and in the following, tensor indices $A, B, C, \ldots$ are suppressed. The total number of derivatives in each term of the above sum is at most $n + 2$. Our aim is to show that the above expression for $O_n$ can be rearranged to ensure that the induction hypothesis is true for $n$. The next sequence of steps is designed to bring each term in the sum in (143) into the form $A_{n,k,p}\, D^k \partial_v^p \mu\, \partial_v \mu$, i.e. linear in $\partial_v \mu$ and at least quadratic in positive boost weight quantities, for a new $A_{n,k,p}$, possibly at the expense of further terms to be absorbed in $O_{n+2}$, or terms involving $y^A_n$, $\varsigma^v_n$ as in the induction hypothesis. This is done by moving, one by one, the derivatives $D^{k'}$ from the factor $D^{k'} \partial_v^{p'} \mu$ over to the other terms as if we were performing a partial integration, but keeping the "boundary terms". These boundary terms are then absorbed in $y^A_n$ as in the induction hypothesis. As a result,

with the summation indices subject to $p, p' > 0$. Next, we would like to bring $p' - 1$ of the $v$-derivatives on $\partial_v^{p'} \mu$ in the sum over to the other terms, leaving us finally with expressions that are linear in $\partial_v \mu$ and quadratic in positive boost weight.
This is done by a similar "partial integration", moving all boundary terms into a new quantity $\varsigma^v_n$, as in the induction hypothesis, so in particular quadratic in positive boost weight quantities.$^{24}$ We can do this by an induction on $p' + p$ on the terms in the sum, lowering at each step this counter by at least one until we reach terms of the form $A_{n,k,p}\, D^k \partial_v^p \mu\, \partial_v \mu$ or $A_{n,k,p}\, D^k \partial_v \mu\, \partial_v^p \mu$ (where $p > 0$ in each case). The latter term can then be rewritten in the same form as the former by the same "partial integration" w.r.t. $D$ argument that we just used, which generates further additions to $y^A_n$. To see how this induction works in more detail, note that the induction hypothesis is true for $p + p' \le 3$ (since then either $p = 1$ or $p' = 1$) so assume $p + p' > 3$. We claim that there exist numbers $a_j$ such that

Then the $p' + p - 3$ coefficients $a_j$ must satisfy the following $p' + p - 3$ linear equations

It is easy to check that such a matrix has a nonzero determinant. In fact, the determinant of an $N \times N$ matrix of this form is $N + 1$, which is found by induction on $N$: expanding the determinant along the first row of the matrix gives the simple recurrence relation $\det M_{N+1} = 2 \det M_N - \det M_{N-1}$. Therefore, $M_{p'+p-3}$ is invertible and the linear system above has a unique solution for the coefficients $a_j$, so we can indeed rewrite terms of the form $A_{n,k,p,p'}\, D^k \partial_v^p \mu\, \partial_v^{p'} \mu$ as in (145) (with the first and last terms in square brackets in (147) contributing to the ellipsis in (145)). The terms $a_j A_{n,k,p,p'}\, D^k \partial_v^j \mu\, \partial_v^{p+p'-2-j} \mu$ are absorbed in $\varsigma^v_n$. We repeat the procedure until we have obtained

where $p > 0$. Since $A_{n,k,p,k',p'}$ depends only on $D^k\beta$ and derivatives of $\mu$, the quantity $\varsigma^v_n$ has the same dependency on $\mu$ and $\beta$ as in the induction hypothesis.
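The determinant claim can be checked directly. The sketch below assumes that $M_N$ is the $N \times N$ tridiagonal matrix with 2 on the diagonal and 1 on the two neighbouring diagonals. This explicit form is an assumption: it is the pattern suggested by the quoted recurrence and by the Leibniz rule for two $v$-derivatives acting on a product of the kind absorbed in $\varsigma^v_n$, with $m = p + p' - 2$ and tensor indices suppressed.

```latex
\begin{align}
\partial_v^2 \big( \partial_v^{j}\mu\, \partial_v^{m-j}\mu \big)
  &= \partial_v^{j+2}\mu\, \partial_v^{m-j}\mu
   + 2\, \partial_v^{j+1}\mu\, \partial_v^{m-j+1}\mu
   + \partial_v^{j}\mu\, \partial_v^{m-j+2}\mu , \\
M_N &= \begin{pmatrix}
  2 & 1      &        &   \\
  1 & 2      & \ddots &   \\
    & \ddots & \ddots & 1 \\
    &        & 1      & 2
\end{pmatrix}, \\
\det M_1 &= 2, \qquad \det M_2 = 2 \cdot 2 - 1 \cdot 1 = 3, \\
\det M_{N+1} &= 2 \det M_N - \det M_{N-1}
  \;\Longrightarrow\; \det M_N = N + 1 .
\end{align}
```

The induction closes because $2(N+1) - N = N + 2$, so if $\det M_{N-1} = N$ and $\det M_N = N + 1$ then $\det M_{N+1} = N + 2$. In particular the determinant never vanishes, which is all the argument needs.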
The first term in (150) depends linearly on $K_{AB}$ so, reinstating indices, we can write it as $-2K^{AB} \Delta X_{AB}$ where $\Delta X_{AB}$ is symmetric in $AB$ and of order $\ell^n$ because all of the terms above, except $O_{n+2}$, depend on $\ell$ only via an overall factor of $\ell^n$. Recalling our original induction hypothesis we can now "recomplete the square" on $K_{AB}$ as follows:

where $X_n \equiv X_{n-2} + \Delta X$ and we used $X_{n-2} = O(\ell^2)$ (from our induction hypothesis). The error term above can be absorbed into $O_{n+2}$. This closes the induction loop.
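The algebra behind this step is the usual completion of a square. As a sketch with indices suppressed (a square denotes the contraction of a symmetric tensor with itself), write $X$ for $X_{n-2} = O(\ell^2)$ and $\Delta$ for the $O(\ell^n)$ symmetric tensor that multiplies $2K$. Whether $\Delta$ equals $+\Delta X$ or $-\Delta X$ depends on the sign conventions of the equations referred to above, which are not fixed here.

```latex
\begin{align}
(K + X)^2 + 2 K \Delta
  &= (K + X + \Delta)^2 - 2 X \Delta - \Delta^2 , \\
2 X \Delta &= O(\ell^{2})\, O(\ell^{n}) = O(\ell^{n+2}), \qquad
\Delta^2 = O(\ell^{2n}) .
\end{align}
```

Since $n \ge 2$ implies $2n \ge n + 2$, both remainder terms are of order $\ell^{n+2}$ or higher, which is why they can be moved into $O_{n+2}$.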
The inductive structure of the term $F$ can be combined with lemma 3.1 to get the following central result of this section, proposition 2. We set, for each given even $n \ge 2$

such that $S^v_n$, $S^A_n$, $X_{nAB}$, $Y^A_n$, $O_{n+2}$ have the same general structure as follows from lemma 3.1 and the induction hypothesis above. In particular, $O_{n+2} = O(\ell^{n+2})$, $X_{nAB} = O(\ell^2)$ is symmetric, $Y^A_n = O(\ell^2)$. As discussed in section 1.7, in practice the EFT Lagrangian will not be known to all orders; instead only the terms with $n \le N - 2$ will be known, for some $N \ge 2$. The equation of motion will take the form (14). In this case we could repeat all of the above arguments, now with lemma 4.1 only determining $F$ up to unknown terms of order $\ell^N$ and the above induction stopping when we reach $n = N - 2$. Equivalently we could simply appeal to the above results applied to the full EFT, including the unknown terms with $n \ge N$, and note that these unknown terms will only affect the final answer at $O(\ell^N)$. Either way, we obtain equation (16) by setting $n = N - 2$ in the above proposition (and dropping the $n$ index on $S^v$, $S^A$, $X_{AB}$ and $Y^A$).
We can now define the entropy by equation (17) and this will obey the second law to quadratic order, in the sense of EFT, as explained in section 1.7.

Scalar-tensor effective field theories
Now we briefly discuss how the previous construction (valid for vacuum gravity) is modified when we include the simplest form of matter: a real scalar field. We again exclude the case of parity violating theories in odd spacetime dimension. The Lagrangian for a scalar-tensor EFT takes the form

where the scalar potential $V(\phi)$ has dimension $+2$ and the scalar kinetic term is given by $X \equiv -\tfrac{1}{2}(\partial\phi)^2$. Here, we assumed again that there is a single UV length scale $\ell$ (of dimension $-1$) suppressing the terms $L_n$, which are therefore subleading corrections to Einstein's theory with a minimally coupled scalar. We further assume that $L_n$ are diffeomorphism covariant scalars (of dimension $n + 2$) that are locally constructed out of the metric $g$, the scalar field $\phi$ and derivatives of these fields. The fact that $L_n$ is a local covariant scalar implies that $n$ must be even. Moreover, in each monomial in $L_n$ the number of derivatives acting on the metric and the scalar field must be $n + 2$ in total.

ABCD, and covariant quantities of the form $\nabla_{(\alpha_1} \ldots \nabla_{\alpha_j)} \phi$, $\nabla_{(\alpha_1} \ldots \nabla_{\alpha_j)} R_{\mu\nu}$ so that the expression has polynomial dependence on each of these quantities. To prove this statement, one starts with the expression of $\phi$ in GNCs:

where the ellipsis stands for terms vanishing at $r = 0$. Taking derivatives of this identity and arguing inductively as in the proof of lemma 2.3 lets us eliminate all mixed $r$-$v$ derivatives of $\phi$, in favour of the quantities listed above, establishing our claim. The next step is to apply this result to $F$ (defined in equation (114)) and use the gravitational and scalar equations of motion (155)-(156) to eliminate the dependencies on $\nabla_{(\alpha_1} \ldots \nabla_{\alpha_j)} \phi$ and $\nabla_{(\alpha_1} \ldots \nabla_{\alpha_j)} R_{\mu\nu}$. This yields the scalar-tensor version of lemma 4.1: on-shell, $F$ can be written as

where each $F_n$ depends polynomially on $\mu_{AB}$,

To obtain (157), one can argue inductively in $n$. At $n = 0$ (157) holds with $X_{0AB}$, $P_0$,
Next, we assume that (157) is true for any $n' \le n - 2$ and argue that it must hold for $n' = n$. This amounts to showing that $O_n$ can be brought to a form that has the same structure as the r.h.s. of (157). To do this, we first write $O_n$ in terms of the quantities listed at the end of the previous paragraph. By assumption, $O_n$ must be $O(\ell^n)$ and it must take the form (tensor indices are suppressed in the rest of the discussion) of a sum over $k, k', p, p'$ and $\psi, \psi' \in \{\mu, \phi\}$

where $p, p' > 0$ in all terms, i.e. $O_n$ must contain at least two factors with positive boost weight and these factors can only involve $\mu$ and $\phi$. Now one can proceed as in the case of vacuum gravity and perform a sequence of "integrations by parts" to produce terms that are linear either in $\partial_v \mu$ or in $\partial_v \phi$. This of course comes at the cost of boundary terms to be absorbed in $Y^A_n$, $S^v_n$ or terms that can be absorbed into $O_{n+2}$. (The latter class of terms may arise when a $v$-derivative is moved from one of the two positive boost weight factors to the coefficients $A_{n,k,p,k',p',\psi,\psi'}$. The reason for this is that these coefficients can depend on $D^m \beta$, $D^m \partial_r^q \mu$ or $D^m \partial_r^q \phi$. When $\partial_v$ is relocated to act on such terms, the resulting objects need to be expressed with the desired set of primitive factors and some covariant components involving $R_{\mu\nu}$ and $\phi$. However, the covariant components can be shifted to higher order in $\ell$ by using the gravitational or scalar equations of motion.) Note in particular that to determine the boundary terms to be added to $S^v_n$, one needs to use an identity of the form (cf. equation (145))

where $\psi, \psi' \in \{\mu, \phi\}$, and the ellipsis stands for terms of the form $A_{n,k,p,k',p',\psi,\psi'}\, D^{\tilde k} \partial_v^{\tilde p} \psi\, \partial_v^{\tilde p'} \psi'$ with either (1) $\tilde p + \tilde p' < p + p'$, $\tilde p, \tilde p' > 0$, (2) $\tilde p + \tilde p' = p + p'$, $\tilde p > 0$, $\tilde p' = 1$, or (3) $\tilde p + \tilde p' = p + p'$, $\tilde p' > 0$, $\tilde p = 1$. The coefficients $a_j$ in this identity are to be determined by solving the linear system (148) (for any $\psi, \psi' \in \{\mu, \phi\}$).
In the end, this procedure lets us write

The quantities $y^A_n$, $\varsigma^v_n$ can be absorbed into $Y^A_n$, $S^v_n$, and they have the same dependencies on $\mu$, $\phi$ and $\beta$ as required by the proposition. The terms in the first line of the r.h.s. of this equation can be dealt with by "completing the square" on $K_{AB}$ and $\partial_v \phi$, thereby producing terms with definite signs and an error term that is higher order in $\ell$ and to be absorbed into $O_{n+2}$, concluding the sketch of the proof of proposition 3.
The components of the Ricci tensor are given by the following expressions

A.1 Expressions on N
The nonzero Christoffel symbols on N are given by The Riemann components on N simplify as follows The components of the Ricci tensor on N are given by