Abstract
We review recent progress in massive gravity. We start by showing how different theories of massive gravity emerge from a higher-dimensional theory of general relativity, leading to the Dvali-Gabadadze-Porrati model (DGP), cascading gravity, and ghost-free massive gravity. We then explore their theoretical and phenomenological consistency, proving the absence of Boulware-Deser ghosts and reviewing the Vainshtein mechanism and the cosmological solutions in these models. Finally, we present alternative and related models of massive gravity such as new massive gravity, Lorentz-violating massive gravity and non-local massive gravity.
1 Introduction
For almost a century, the theory of general relativity (GR) has been known to describe the force of gravity with impeccable agreement with observations. Despite all the successes of GR, the search for alternatives has been an ongoing challenge since its formulation. Far from a purely academic exercise, the existence of consistent alternatives to describe the theory of gravitation is actually essential to test the theory of GR. Furthermore, the open questions that remain behind the puzzles at the interface between gravity/cosmology and particle physics such as the hierarchy problem, the old cosmological constant problem and the origin of the late-time acceleration of the Universe have pushed the search for alternatives to GR.
While it was not formulated in this language at the time, from a more modern particle physics perspective GR can be thought of as the unique theory of a massless spin-2 particle [287, 483, 175, 225, 76], and so in order to find alternatives to GR one should break one of the underlying assumptions behind this uniqueness theorem. Breaking Lorentz invariance and the notion of spin along with it is probably the most straightforward since non-Lorentz invariant theories include a great amount of additional freedom. This possibility has been explored at length in the literature; see for instance [398] for a review. Nevertheless, Lorentz invariance is observationally well constrained by both particle and astrophysics. Another possibility is to maintain Lorentz invariance and the notion of spin that goes with it but to consider gravity as being the representation of a higher spin. This idea has also been explored; see for instance [466, 52] for further details. In this review, we shall explore yet another alternative: maintaining the notion that gravity is propagated by a spin-2 particle but considering this particle to be massive. From the particle physics perspective, this extension seems most natural since we know that the particles that carry the electroweak forces have to acquire a mass through the Higgs mechanism.
Giving a mass to a spin-2 (and spin-1) field is an old idea and in this review we shall summarize the approach of Fierz and Pauli, which dates back to 1939 [226]. While the theory of a massive spin-2 field is in principle simple to derive, complications arise when we include interactions between this spin-2 particle and other particles as should be the case if the spin-2 field is to describe the graviton.
At the linear level, the theory of a massless spin-2 field enjoys a linearized diffeomorphism (diff) symmetry, just as a photon enjoys a U(1) gauge symmetry. But unlike for a photon, coupling the spin-2 field with external matter forces this symmetry to be realized in a different way nonlinearly. As a result, GR is a fully nonlinear theory, which enjoys nonlinear diffeomorphism invariance (also known as general covariance or coordinate invariance). Even though this symmetry is broken when dealing with a massive spin-2 field, the nonlinearities are inherited by the field. So, unlike a single isolated massive spin-2 field, a theory of massive gravity is always fully nonlinear (and as a consequence nonrenormalizable) just as for GR. The fully nonlinear equivalent to GR for massive gravity has been a much more challenging theory to obtain. In this review we will summarize a few different approaches to deriving consistent theories of massive gravity and will focus on recent progress. See Ref. [309] for an earlier review on massive gravity, as well as Refs. [134] and [336] for other reviews relating Galileons and massive gravity.
When dealing with a theory of massive gravity two elements have been known to be problematic since the seventies. First, a massive spin-2 field propagates five degrees of freedom no matter how small its mass. At first this seems to suggest that even in the massless limit, a theory of massive gravity could never resemble GR, i.e., a theory of a massless spin-2 field with only two propagating degrees of freedom. This subtlety is at the origin of the vDVZ discontinuity (van Dam-Veltman-Zakharov [465, 497]). The resolution behind that puzzle was provided by Vainshtein two years later and lies in the fact that the extra degree of freedom responsible for the vDVZ discontinuity gets screened by its own interactions, which dominate over the linear terms in the massless limit. This process is now relatively well understood [463] (see also Ref. [35] for a recent review). The Vainshtein mechanism also comes hand in hand with its own set of peculiarities like strong coupling and superluminalities, which we will discuss in this review.
A second element of concern in dealing with a theory of massive gravity is the realization that most nonlinear extensions of Fierz-Pauli massive gravity are plagued with a ghost, now known as the Boulware-Deser (BD) ghost [75]. The past decade has seen a revival of interest in massive gravity with the realization that this BD ghost could be avoided either in a model of soft massive gravity (not a single massive pole for the graviton but rather a resonance) as in the DGP (Dvali-Gabadadze-Porrati) model or its extensions [208, 209, 207], or in a three-dimensional model of massive gravity as in ‘new massive gravity’ (NMG) [66] or more recently in a specific ghost-free realization of massive gravity (also known as dRGT in the literature) [144].
With these developments several new possibilities have become a reality:

First, one can now more rigorously test massive gravity as an alternative to GR. We will summarize the different phenomenologies of these models and their theoretical as well as observational bounds throughout this review. Except in specific cases, the graviton mass is typically bounded to be a few times the Hubble parameter today, that is m ≲ 10^{−30} to 10^{−33} eV depending on the exact model. In all of these models, if the graviton had a mass much smaller than 10^{−33} eV, its effect would be unseen in the observable Universe and such a mass would thus be irrelevant. Fortunately there is still to date an open window of opportunity for the graviton mass to be within an interesting range and to provide potentially new observational signatures.

Second, these developments have opened up the door for theories of interacting metrics, a success long awaited. Massive gravity was first shown to be expressible on an arbitrary reference metric in [296]. It was then shown that the reference metric could have its own dynamics leading to the first consistent formulation of bigravity [293]. In bigravity two metrics are interacting and the mass spectrum is that of a massless spin-2 field interacting with a massive spin-2 field. It can, therefore, be seen as the theory of general relativity interacting (fully nonlinearly) with a massive spin-2 field. This is a remarkable new development in both field theory and gravity.

The formulation of massive gravity and bigravity in the vielbein language was shown to be both analytic and much more natural, and allowed for a general formulation of multi-gravity [314] where an arbitrary number of spin-2 fields may interact together.

Finally, still within the theoretical progress front, all of these successes provided full and definite proof for the absence of Boulware-Deser ghosts in these types of theories; see [295], which has then been translated into a multitude of other languages. This also opens the door for new types of theories that can propagate fewer degrees of freedom than naively thought.
Independent of this, developments in massive gravity, bigravity and multigravity have also opened up new theoretical avenues, which we will summarize, and these remain very much an active area of progress. On the phenomenological front, a genuine task force has been devoted to finding both exact and approximate solutions in these types of gravitational theories, including the ones relevant for black holes and for cosmology. We shall summarize these in the review.
This review is organized as follows: We start by setting the formalism for massive and massless spin-1 and spin-2 fields in Section 2 and emphasize the Stückelberg language both for the Proca and the Fierz-Pauli fields. In Part I we then derive consistent theories using a higher-dimensional framework, either using a braneworld scenario à la DGP in Section 4, or via a discretization^{Footnote 1} (or Kaluza-Klein reduction) of the extra dimension in Section 5. This second approach leads to the theory of ghost-free massive gravity (also known as dRGT) which we review in more depth in Part II. Its formulation is summarized in Section 6, before tackling other interesting aspects such as the fate of the BD ghost in Section 7, deriving its decoupling limit in Section 8, and various extensions in Section 9. The Vainshtein mechanism and other related aspects are discussed in Section 10. The phenomenology of ghost-free massive gravity is then reviewed in Part III including a discussion of solar-system tests, gravitational waves, weak lensing, pulsars, black holes and cosmology. We then conclude with other related theories of massive gravity in Part IV, including new massive gravity, Lorentz breaking theories of massive gravity and non-local versions.
Notations and conventions: Throughout this review, we work in units where the reduced Planck constant ℏ and the speed of light c are set to unity. The gravitational Newton constant is related to the Planck scale by \(8\pi {G_N} = M_{\rm Pl}^{-2}\). Unless specified otherwise, d represents the number of spacetime dimensions. We use the mainly + convention (− + ⋯ +) and space indices are denoted by i, j, ⋯ = 1, ⋯, d − 1 while 0 represents the time-like direction, x^{0} = t.
We also use the symmetric convention: \((a,b) = {1 \over 2}(ab + ba)\) and \([a,b] = {1 \over 2}(ab - ba)\). Throughout this review, square brackets of a tensor indicate the trace of the tensor, for instance \([{\mathbb X}] = {\mathbb X}_\mu ^\mu\), \([{{\mathbb X}^2}] = {\mathbb X}_\nu^\mu {\mathbb X}_\mu ^\nu\), etc. We also use the notation Π_{μν} = ∂_{μ}∂_{ν}π and \({\mathcal I} = \delta _\nu^\mu\). The symbols ε_{μναβ} and ε_{abcde} represent the Levi-Civita symbol in respectively four and five dimensions, ε_{0123} = ε_{01234} = 1 = ε^{0123}.
2 Massive and Interacting Fields
2.1 Proca field
2.1.1 Maxwell kinetic term
Before jumping into the subtleties of massive spin-2 fields and gravity in general, we start this review with massless and massive spin-1 fields as a warm up. Consider a Lorentz vector field living on a four-dimensional Minkowski manifold. We focus this discussion on four dimensions; the extension to d dimensions is straightforward. Restricting ourselves to Lorentz invariant and local actions for now, the kinetic term can be decomposed into three possible contributions:
where a_{1,2,3} are so far arbitrary dimensionless coefficients and the possible kinetic terms are given by
where in this section, indices are raised and lowered with respect to the flat Minkowski metric. The second and third contributions are equivalent up to a boundary term, so we set a_{3} = 0 without loss of generality.
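As a concrete reference, one way to write the decomposition just described is sketched below. The explicit index placement is an assumption on our part: it is a reconstruction chosen so that a_{1} multiplies the only term that survives for a transverse field, in agreement with the normalization a_{1} = −1/2 used later in this section.

```latex
% Sketch: general local, Lorentz-invariant kinetic term for a vector field A_\mu
\begin{align}
\mathcal{L}_{\rm kin}^{\text{spin-1}} &= a_1 \mathcal{L}_1 + a_2 \mathcal{L}_2 + a_3 \mathcal{L}_3 , \\
\mathcal{L}_1 &= \partial_\mu A^\nu\, \partial^\mu A_\nu , \qquad
\mathcal{L}_2 = \partial_\mu A^\mu\, \partial_\nu A^\nu , \qquad
\mathcal{L}_3 = \partial_\mu A^\nu\, \partial_\nu A^\mu , \\
% the two terms built from divergences differ by a total derivative
\mathcal{L}_2 - \mathcal{L}_3 &= \partial_\mu \left( A^\mu \partial_\nu A^\nu - A^\nu \partial_\nu A^\mu \right) .
\end{align}
```

With this ordering, the last identity is what makes one of the three coefficients redundant, so that a_{3} = 0 can be imposed without loss of generality.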
We now proceed to establish the behavior of the different degrees of freedom (dofs) present in this theory. A priori, a Lorentz vector field A_{μ} in four dimensions could have up to four dofs, which we can split as a transverse contribution \(A_\mu ^ \bot\) satisfying \({\partial ^\mu}A_\mu ^ \bot = 0\) bearing a priori three dofs and a longitudinal mode χ with \(A_\mu = A_\mu ^ \bot + \partial_\mu \chi\).
2.1.1.1 Helicity-0 mode
Focusing on the longitudinal (or helicity-0) mode χ, the kinetic term takes the form
where □ = η^{μν}∂_{μ}∂_{ν} represents the d’Alembertian in flat Minkowski space and the second equality holds after integrations by parts. We directly see that unless a_{1} = −a_{2}, the kinetic term for the field χ bears higher time (and space) derivatives. As a well known consequence of Ostrogradsky’s theorem [421], two dofs are actually hidden in χ with an opposite sign kinetic term. This can be seen by expressing the propagator □^{−2} as the sum of two propagators with opposite signs:
signaling that one of the modes always couples the wrong way to external sources. The mass m of this mode is arbitrarily low, which implies that the theory (2.1) with a_{3} = 0 and a_{1} + a_{2} ≠ 0 is always sick. Alternatively, one can see the appearance of the Ostrogradsky instability by introducing a Lagrange multiplier \(\tilde \chi (x)\), so that the kinetic action (2.5) for χ is equivalent to
after integrating out the Lagrange multiplier^{Footnote 2} \(\tilde \chi \equiv 2\square \chi\). We can now perform the change of variables χ = ϕ_{1} + ϕ_{2} and \(\tilde \chi = {\phi _1} - {\phi _2}\), giving the resulting Lagrangian for the two scalar fields ϕ_{1,2}
As a result, the two scalar fields ϕ_{1,2} always enter with opposite kinetic terms, signaling that one of them is always a ghost.^{Footnote 3} The only way to prevent this generic pathology is to make the specific choice a_{1} + a_{2} = 0, which corresponds to the well-known Maxwell kinetic term.
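The chain of steps in this subsection can be sketched explicitly as follows. This is a reconstruction under the assumption that the kinetic term is a_{1} ∂_{μ}A^{ν}∂^{μ}A_{ν} + a_{2} (∂_{μ}A^{μ})^{2}; the symbol ≃ denotes equality up to integrations by parts, and m is only a regulator mass introduced to split the propagator.

```latex
\begin{align}
% kinetic term evaluated on the longitudinal mode, A_\mu = \partial_\mu \chi
\mathcal{L}_{\rm kin}^{\chi} &= a_1\, \partial_\mu \partial_\nu \chi\, \partial^\mu \partial^\nu \chi + a_2\, (\Box \chi)^2
  \simeq (a_1 + a_2)\, \chi \Box^2 \chi , \\
% the fourth-order propagator as two second-order ones with opposite signs
\frac{1}{\Box^2} &= \lim_{m \to 0} \frac{1}{2 m^2} \left( \frac{1}{\Box - m^2} - \frac{1}{\Box + m^2} \right) , \\
% equivalent form with an auxiliary field; its equation of motion gives \tilde\chi = 2 \Box \chi
\mathcal{L}_{\rm kin}^{\chi} &= (a_1 + a_2) \left( \tilde\chi\, \Box \chi - \tfrac14 \tilde\chi^2 \right) , \\
% after \chi = \phi_1 + \phi_2 and \tilde\chi = \phi_1 - \phi_2
\mathcal{L}_{\rm kin}^{\chi} &\simeq (a_1 + a_2) \left( \phi_1 \Box \phi_1 - \phi_2 \Box \phi_2 - \tfrac14 (\phi_1 - \phi_2)^2 \right) .
\end{align}
```

The relative sign between the ϕ_{1}□ϕ_{1} and ϕ_{2}□ϕ_{2} terms is independent of the overall sign of a_{1} + a_{2}, which is why one of the two fields is a ghost unless the prefactor vanishes.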
2.1.1.2 Helicity-1 mode and gauge symmetry
Now that the form of the local and covariant kinetic term has been uniquely established by the requirement that no ghost rides on top of the helicity-0 mode, we focus on the remaining transverse mode \(A_\mu ^ \bot\),
which has the correct normalization if a_{1} = −1/2. As a result, the only possible local kinetic term for a spin-1 field is the Maxwell one:
with F_{μν} = ∂_{μ}A_{ν} − ∂_{ν}A_{μ}. Restricting ourselves to a massless spin-1 field (with no potential and other interactions), the resulting Maxwell theory satisfies the following U(1) gauge symmetry:
This gauge symmetry projects out two of the naive four degrees of freedom. This can be seen at the level of the Lagrangian directly, where the gauge symmetry (2.11) allows us to fix the gauge of our choice. For convenience, we perform a (3 + 1)-split and choose Coulomb gauge ∂_{i}A^{i} = 0, so that only two dofs are present in A_{i}, i.e., A_{i} contains no longitudinal mode, \({A_i} = A_i^t + {\partial _i}{A^l}\), with \({\partial ^i}A_i^t = 0\) and the Coulomb gauge sets the longitudinal mode A^{l} = 0. The time component A_{0} does not exhibit a kinetic term,
and appears instead as a Lagrange multiplier imposing the constraint
The Maxwell action has therefore only two propagating dofs in \(A_i^t\),
To summarize, the Maxwell kinetic term for a vector field and the fact that a massless vector field in four dimensions only propagates 2 dofs is not a choice but has been imposed upon us by the requirement that no ghost rides along with the helicity-0 mode. The resulting theory is enriched by a U(1) gauge symmetry which in turn freezes the helicity-0 mode when no mass term is present. We now ‘promote’ the theory to a massive vector field.
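For reference, the steps summarized above can be written out as in the following sketch. It is a reconstruction in the mostly-plus signature, assuming the gauge parameter is called ξ and that A_{0} vanishes at spatial infinity; ≃ again denotes equality up to total derivatives.

```latex
\begin{align}
% transverse mode and the resulting Maxwell term for a_1 = -a_2 = -1/2
\mathcal{L}_{\rm kin}^{A^\perp} &= a_1\, \partial_\mu A_\nu^\perp\, \partial^\mu A^{\perp\, \nu} , \\
\mathcal{L}_{\rm Maxwell} &= -\tfrac12 (\partial_\mu A_\nu)^2 + \tfrac12 (\partial_\mu A^\mu)^2 \simeq -\tfrac14 F_{\mu\nu}^2 , \\
% U(1) gauge symmetry
A_\mu &\to A_\mu + \partial_\mu \xi , \\
% (3+1)-split in Coulomb gauge \partial_i A^i = 0
-\tfrac14 F_{\mu\nu}^2 &= \tfrac12 (\partial_t A_i - \partial_i A_0)^2 - \tfrac14 F_{ij}^2
  \simeq \tfrac12 (\partial_t A_i^t)^2 - \tfrac12 (\partial_i A_j^t)^2 + \tfrac12 (\partial_i A_0)^2 , \\
% A_0 has no time derivatives and obeys a constraint
\partial_i \partial^i A_0 &= 0 \quad \Rightarrow \quad A_0 = 0 , \\
% two propagating dofs in A_i^t
\mathcal{L}_{\rm Maxwell} &\simeq -\tfrac12 (\partial_\mu A_i^t)^2 .
\end{align}
```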
2.1.2 Proca mass term
Starting with the Maxwell action, we consider a covariant mass term A_{μ}A^{μ} corresponding to the Proca action
and emphasize that the presence of a mass term does not change the fact that the kinetic term has been uniquely fixed by the requirement of the absence of ghosts. An immediate consequence of the Proca mass term is the breaking of the U(1) gauge symmetry (2.11), so that the Coulomb gauge can no longer be chosen and the longitudinal mode is now dynamical. To see this, let us use the previous decomposition \({A_\mu} = A_\mu ^ \bot + {\partial _\mu}\hat \chi\) and notice that the mass term now introduces a kinetic term for the helicity-0 mode \(\chi = m\hat \chi\)
A massive vector field thus propagates three dofs, namely two in the transverse modes \(A_\mu ^ \bot\) and one in the longitudinal mode χ. Physically, this can be understood by the fact that a massive vector field does not propagate along the lightcone, and the fluctuations along the line of propagation correspond to an additional physical dof.
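A short sketch of the corresponding action may help here. It assumes the standard normalization −½ m^{2} A_{μ}A^{μ} for the Proca mass term, which is our assumption rather than a quotation of the displayed equations.

```latex
\begin{align}
\mathcal{L}_{\rm Proca} &= -\tfrac14 F_{\mu\nu}^2 - \tfrac12 m^2 A_\mu A^\mu , \\
% insert A_\mu = A_\mu^\perp + \partial_\mu \hat\chi with \partial^\mu A_\mu^\perp = 0 and \chi = m \hat\chi;
% the cross term m^2 A^{\perp\mu} \partial_\mu \hat\chi is a total derivative
\mathcal{L}_{\rm Proca} &\simeq -\tfrac12 (\partial_\mu A_\nu^\perp)^2 - \tfrac12 m^2 (A_\mu^\perp)^2 - \tfrac12 (\partial_\mu \chi)^2 .
\end{align}
```

In this form the helicity-0 mode χ carries a canonically normalized kinetic term that does not disappear when m → 0, which is the origin of the mismatch in the number of dofs discussed next.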
Before moving to the Abelian Higgs mechanism, which provides a dynamical way to give a mass to bosons, we first comment on the discontinuity in number of dofs between the massive and massless case. When considering the Proca action (2.16) with the properly normalized fields \(A_\mu ^ \bot\) and χ, one does not recover the massless Maxwell action (2.9) or (2.10) when sending the boson mass m → 0. A priori, this seems to signal the presence of a discontinuity which would allow us to distinguish between for instance a massless photon and a massive one no matter how tiny the mass. In practice, however, the difference is physically indistinguishable so long as the photon couples to external sources in a way which respects the U(1) symmetry. Note however that quantum anomalies remain sensitive to the mass of the field so the discontinuity is still present at this level, see Refs. [197, 204].
To physically tell the difference between a massless vector field and a massive one with tiny mass, one has to probe the system, or in other words include interactions with external sources
The U(1) symmetry present in the massless case is preserved only if the external sources are conserved, ∂_{μ}J^{μ} = 0. Such a source produces a vector field which satisfies
in the massless case. The exchange amplitude between two conserved sources J_{μ} and J_{μ}′ mediated by a massless vector field is given by
On the other hand, if the vector field is massive, its response to the source J^{μ} is instead
In that case, one needs to consider both the transverse and the longitudinal modes of the vector field in the exchange amplitude between the two sources J_{μ} and J_{μ}′. Fortunately, a conserved source does not excite the longitudinal mode and the exchange amplitude is uniquely given by the transverse mode,
As a result, the exchange amplitude between two conserved sources is the same in the limit m → 0 no matter whether the vector field is intrinsically massive and propagates 3 dofs or if it is massless and only propagates 2 modes. It is, therefore, impossible to probe the difference between an exactly massless vector field and a massive one with arbitrarily small mass.
Notice that in the massive case no U(1) symmetry is present and the source need not be conserved. However, the previous argument remains unchanged so long as ∂_{μ}J^{μ} goes to zero in the massless limit at least as quickly as the mass itself. If this condition is violated, then the helicity-0 mode ought to be included in the exchange amplitude (2.21). In parallel, in the massless case the non-conserved source provides a new kinetic term for the longitudinal mode which then becomes dynamical.
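The comparison of the two exchange amplitudes can be made explicit with the following sketch. The sign and normalization of the source coupling, −A_{μ}J^{μ}, are assumptions made for illustration, and both sources are taken to be conserved.

```latex
\begin{align}
\mathcal{L}_{\rm sources} &= - A_\mu J^\mu , \\
% massless case (Lorenz gauge): response and exchange amplitude
\Box A_\mu^\perp &= J_\mu , \qquad
\mathcal{A}^{\rm massless}_{JJ'} = \int d^4x\, A_\mu^\perp J'^\mu = \int d^4x\, J'^\mu \frac{1}{\Box} J_\mu , \\
% massive case: the Proca equation implies \partial^\mu A_\mu = 0 for a conserved source
(\Box - m^2) A_\mu^\perp &= J_\mu , \qquad
\mathcal{A}^{\rm massive}_{JJ'} = \int d^4x\, J'^\mu \frac{1}{\Box - m^2} J_\mu .
\end{align}
```

The two amplitudes differ only through the m^{2} in the propagator, so they agree in the limit m → 0, with no leftover contribution from the helicity-0 mode.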
2.1.3 Abelian Higgs mechanism for electromagnetism
Associated with the absence of an intrinsic discontinuity in the massless limit is the existence of a Higgs mechanism for the vector field whereby the vector field acquires a mass dynamically. As we shall see later, the situation is different for gravity where no equivalent dynamical Higgs mechanism has been discovered to date. Nevertheless, the tools used to describe the Abelian Higgs mechanism and in particular the introduction of a Stückelberg field will prove useful in the gravitational case as well.
To describe the Abelian Higgs mechanism, we start with a vector field with associated Maxwell tensor F_{μν} and a complex scalar field ϕ with quartic potential
The covariant derivative, \({{\mathcal D}_\mu} = {\partial _\mu} - iq{A_\mu}\) ensures the existence of the U(1) symmetry, which in addition to (2.11) shifts the scalar field as
Splitting the complex scalar field ϕ into its norm and phase ϕ = φe^{iχ}, we see that the covariant derivative plays the role of the mass term for the vector field, when the scalar field acquires a nonvanishing vacuum expectation value (vev),
The Higgs field φ can be made arbitrarily massive by setting λ ≫ 1 in such a way that its dynamics may be neglected and the field can be treated as frozen at φ ≡ Φ_{0} = const. The resulting theory is that of a massive vector field,
where the phase χ of the complex scalar field plays the role of a Stückelberg field which restores the U(1) gauge symmetry in the massive case,
In this formalism, the U(1) gauge symmetry is restored at the price of introducing explicitly a Stückelberg field which transforms in such a way so as to make the mass term invariant. The symmetry ensures that the vector field A_{μ} propagates only 2 dofs, while the Stückelberg field χ propagates the third dof. While no equivalent to the Higgs mechanism exists for gravity, the same Stückelberg trick to restore the symmetry can be used in that case. Since in that context the broken symmetry is coordinate transformation invariance (full diffeomorphism invariance or covariance), four Stückelberg fields should in principle be included in the context of massive gravity, as we shall see below.
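The Abelian Higgs construction described in this subsection can be sketched as follows. The normalization of the scalar kinetic term and the precise form of the quartic potential are assumptions made for illustration; only the covariant derivative, the split ϕ = φe^{iχ} and the vev Φ_{0} are taken from the text.

```latex
\begin{align}
% vector field plus complex scalar with quartic potential
\mathcal{L}_{\rm AH} &= -\tfrac14 F_{\mu\nu}^2 - \tfrac12 (\mathcal{D}_\mu \phi)(\mathcal{D}^\mu \phi)^* - \lambda \left( \phi \phi^* - \Phi_0^2 \right)^2 , \\
% U(1) symmetry acting on both fields
A_\mu &\to A_\mu + \partial_\mu \xi , \qquad \phi \to \phi\, e^{i q \xi} , \\
% norm/phase split \phi = \varphi e^{i\chi}
\mathcal{L}_{\rm AH} &= -\tfrac14 F_{\mu\nu}^2 - \tfrac12 (\partial_\mu \varphi)^2 - \tfrac12 \varphi^2 \left( q A_\mu - \partial_\mu \chi \right)^2 - \lambda \left( \varphi^2 - \Phi_0^2 \right)^2 , \\
% frozen Higgs field \varphi = \Phi_0: massive vector with Stueckelberg field \chi
\mathcal{L} &= -\tfrac14 F_{\mu\nu}^2 - \tfrac12 \Phi_0^2 \left( q A_\mu - \partial_\mu \chi \right)^2 , \\
% restored gauge symmetry
A_\mu &\to A_\mu + \partial_\mu \xi , \qquad \chi \to \chi + q\, \xi .
\end{align}
```

With these conventions the vector mass is m = qΦ_{0}, and choosing the gauge χ = 0 reduces the last Lagrangian to a Proca-type mass term.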
2.1.4 Interacting spin-1 fields
Now that we have introduced the notion of a massless and a massive spin-1 field, let us look at N interacting spin-1 fields. We start with N free and massless gauge fields, \(A_\mu ^{(a)}\), with a = 1, ⋯, N, and respective Maxwell tensors \(F_{\mu \nu}^{(a)} = {\partial _\mu}A_\nu ^{(a)} - {\partial _\nu}A_\mu ^{(a)}\),
The theory is then manifestly Abelian and invariant under N copies of U(1), (i.e., the symmetry group is U(1)^{N} which is Abelian as opposed to U(N) which would correspond to a Yang-Mills theory and would not be Abelian).
However, in addition to these N gauge invariances, the kinetic term is invariant under global rotations in field space,
where \(O_b^a\) is a (global) rotation matrix. Now let us consider some interactions between these different fields. At the linear level (quadratic level in the action), the most general set of interactions is
where \({{\mathcal I}_{ab}}\) is an arbitrary symmetric matrix with constant coefficients. For an arbitrary rank-N matrix, all N copies of U(1) are broken, and the theory then propagates N additional helicity-0 modes, for a total of 3N independent polarizations in four spacetime dimensions. However, if the rank r of \({\mathcal I}\) is r < N, i.e., if some of the eigenvalues of \({\mathcal I}\) vanish, then there are N − r special directions in field space which receive no interactions, and the theory thus keeps N − r independent copies of U(1). The theory then propagates r massive spin-1 fields and N − r massless spin-1 fields, for a total of 3r + 2(N − r) = 2N + r independent polarizations in four dimensions.
We can see this statement more explicitly in the case of N spin-1 fields by diagonalizing the mass matrix \({\mathcal I}\). As mentioned previously, the kinetic term is invariant under field space rotations, (2.29), so one can use this freedom to work in a field representation where the mass matrix \({\mathcal I}\) is diagonal,
In this representation the gauge fields are the mass eigenstates and the mass spectrum is simply given by the eigenvalues of \({{\mathcal I}_{ab}}\).
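The structure of this subsection can be collected in the following sketch. The normalization of the interaction term and the labels m_{a}^{2} for the eigenvalues are assumptions introduced for illustration.

```latex
\begin{align}
% N free Maxwell fields and the global rotations that leave the kinetic term invariant
\mathcal{L}_{\rm kin} &= -\tfrac14 \sum_{a=1}^{N} \big( F^{(a)}_{\mu\nu} \big)^2 , \qquad
A^{(a)}_\mu \to O^a{}_b\, A^{(b)}_\mu , \\
% most general quadratic interaction
\mathcal{L}_{\rm int} &= -\tfrac12 \sum_{a,b} \mathcal{I}_{ab}\, A^{(a)}_\mu A^{(b)\, \mu} , \\
% mass eigenbasis reached by a field-space rotation
\mathcal{I}_{ab} &= {\rm diag}\big( m_1^2, \dots, m_N^2 \big) , \\
% counting in four dimensions for rank r: r massive (3 each) and N - r massless (2 each)
\#\, \text{polarizations} &= 3 r + 2 (N - r) = 2 N + r .
\end{align}
```

For r = N the count reduces to 3N, and for r = 0 it reduces to the 2N polarizations of N decoupled photons.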
2.2 Spin-2 field
As we have seen in the case of a vector field, as long as it is local and Lorentz-invariant, the kinetic term is uniquely fixed by the requirement that no ghost be present. Moving now to a spin-2 field, the same argument applies exactly and the Einstein-Hilbert term appears naturally as the unique kinetic term free of any ghost-like instability. This is possible thanks to a symmetry which projects out all unwanted dofs, namely diffeomorphism invariance (linear diffs at the linearized level, and nonlinear diffs/general covariance at the nonlinear level).
2.2.1 Einstein-Hilbert kinetic term
We consider a symmetric Lorentz tensor field h_{μν}. The kinetic term can be decomposed into four possible local contributions (assuming Lorentz invariance and ignoring terms which are equivalent upon integration by parts):
where b_{1,2,3,4} are dimensionless coefficients which are to be determined in the same way as for the vector field. We split the 10 components of the symmetric tensor field h_{μν} into a transverse tensor \(h_{\mu \nu}^T\) (which carries 6 components) and a vector field χ_{μ} (which carries 4 components),
Just as in the case of the spin-1 field, an arbitrary kinetic term of the form (2.32) with untuned coefficients b_{i} would contain higher derivatives for χ_{μ} which in turn would imply a ghost. As we shall see below, avoiding a ghost within the kinetic term automatically leads to gauge invariance. After substitution of h_{μν} in terms of \(h_{\mu \nu}^T\) and χ_{μ}, the potentially dangerous parts are
Preventing these higher derivative terms from arising sets
or in other words, the unique (local and Lorentz-invariant) kinetic term one can write for a spin-2 field is the Einstein-Hilbert term
where \(\hat {\mathcal E}\) is the Lichnerowicz operator
and we have set b_{1} = −1/4 to follow standard conventions. As a result, the kinetic term for the tensor field is invariant under the following gauge transformation,
We emphasize that the form of the kinetic term and its gauge invariance is independent of whether or not the tensor field has a mass (as long as we restrict ourselves to a local and Lorentz-invariant kinetic term). However, just as in the case of a massive vector field, this gauge invariance cannot be maintained by a mass term or any other self-interacting potential. So only in the massless case does this symmetry remain exact. Out of the 10 components of a tensor field, the gauge symmetry removes 2 × 4 = 8 of them, leaving a massless tensor field with only two propagating dofs as is well known from the propagation of gravitational waves in four dimensions.
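For orientation, the construction of this subsection can be sketched as below. The parametrization of the four b_{i} terms is an assumption (a reconstruction that reproduces the linearized Einstein-Hilbert term for b_{1} = −1/4), the Lichnerowicz operator is written \(\hat{\mathcal E}\), and the gauge parameter ξ_{μ} is a label introduced here.

```latex
\begin{align}
% general local, Lorentz-invariant kinetic term for h_{\mu\nu}
\mathcal{L}_{\rm kin}^{\text{spin-2}} &= \tfrac12\, \partial^\alpha h^{\mu\nu} \left( b_1 \partial_\alpha h_{\mu\nu}
  + 2 b_2\, \partial_{(\mu} h_{\nu)\alpha} + b_3\, \partial_\alpha h\, \eta_{\mu\nu} + 2 b_4\, \partial_{(\mu} h\, \eta_{\nu)\alpha} \right) , \\
% split into transverse tensor and vector
h_{\mu\nu} &= h_{\mu\nu}^T + 2\, \partial_{(\mu} \chi_{\nu)} , \\
% tuning that removes all higher derivatives of \chi_\mu
b_4 &= - b_3 = - b_2 = b_1 , \\
% resulting Einstein-Hilbert kinetic term for b_1 = -1/4
\mathcal{L}_{\rm kin}^{\text{spin-2}} &\simeq -\tfrac14\, h^{\mu\nu}\, \hat{\mathcal E}^{\alpha\beta}_{\mu\nu}\, h_{\alpha\beta} , \\
\hat{\mathcal E}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} &= -\tfrac12 \left( \Box h_{\mu\nu} - 2\, \partial_{(\mu} \partial_\alpha h^\alpha_{\nu)} + \partial_\mu \partial_\nu h
  - \eta_{\mu\nu} \left( \Box h - \partial_\alpha \partial_\beta h^{\alpha\beta} \right) \right) , \\
% linearized diffeomorphism invariance
h_{\mu\nu} &\to h_{\mu\nu} + \partial_{(\mu} \xi_{\nu)} .
\end{align}
```

Expanding the fourth line gives the relative weights −1 : 2 : −2 : 1 for (∂_{α}h_{μν})^{2}, (∂_{μ}h^{μν})^{2}, ∂_{μ}h^{μν}∂_{ν}h and (∂_{α}h)^{2}, which is the familiar linearized Einstein-Hilbert combination.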
In d ≥ 3 spacetime dimensions, gravitational waves have d(d + 1)/2 − 2d = d(d − 3)/2 independent polarizations. This means that in three dimensions there are no gravitational waves and in five dimensions they have five independent polarizations.
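As a quick check of this counting, the formula can be evaluated in the three cases mentioned in the text (symmetric tensor components minus twice the number of gauge parameters):

```latex
\begin{align}
d = 3 &: \quad \tfrac{3 \cdot 4}{2} - 2 \cdot 3 = 6 - 6 = 0 , \\
d = 4 &: \quad \tfrac{4 \cdot 5}{2} - 2 \cdot 4 = 10 - 8 = 2 , \\
d = 5 &: \quad \tfrac{5 \cdot 6}{2} - 2 \cdot 5 = 15 - 10 = 5 .
\end{align}
```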
2.2.2 Fierz-Pauli mass term
As seen in Section 2.2.1, for a local and Lorentz-invariant theory, the linearized kinetic term is uniquely fixed by the requirement that longitudinal modes propagate no ghost, which in turn prevents that operator from exciting these modes altogether. Just as in the case of a massive spin-1 field, we shall see in what follows that the longitudinal modes can nevertheless be excited when including a mass term. In what follows we restrict ourselves to linear considerations and spare any nonlinearity discussions for Parts I and II. See also [327] for an analysis of the linearized Fierz-Pauli theory using Bardeen variables.
In the case of a spin-2 field h_{μν}, we are a priori free to choose between two possible mass terms \(h_{\mu \nu}^2\) and h^{2}, so that the generic mass term can be written as a combination of both,
where A is a dimensionless parameter. Just as in the case of the kinetic term, the stability of the theory constrains very strongly the phase space and we shall see that only for A = 1 is the theory stable at that order. The presence of this mass term breaks diffeomorphism invariance. Restoring it requires the introduction of four Stückelberg fields χ_{μ} which transform under linear diffeomorphisms in such a way as to make the mass term invariant, just as in the Abelian Higgs mechanism for electromagnetism [174]. Including the four linearized Stückelberg fields, the resulting mass term
is invariant under the simultaneous transformations:
This mass term then provides a kinetic term for the Stückelberg fields
which is precisely of the same form as the kinetic term considered for a spin-1 field (2.1) in Section 2.1.1 with a_{3} = 0 and a_{2} = −Aa_{1}. Now the same logic as in Section 2.1.1 applies and singling out the longitudinal component of these Stückelberg fields it follows that the only combination which does not involve higher derivatives is a_{2} = −a_{1} or in other words A = 1. As a result, the only possible mass term one can consider which is free from an Ostrogradsky instability is the Fierz-Pauli mass term
In unitary gauge, i.e., in the gauge where the Stückelberg fields χ^{a} are set to zero, the Fierz-Pauli mass term simply reduces to
where once again the indices are raised and lowered with respect to the Minkowski metric.
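The mass terms discussed in this subsection can be sketched as follows. The overall normalization −m^{2}/8 and the gauge parameter ξ_{μ} are assumptions chosen to match the kinetic term normalized with b_{1} = −1/4; the symmetrization convention is the one stated in the notations.

```latex
\begin{align}
% generic mass term
\mathcal{L}_{\rm mass} &= -\tfrac18 m^2 \left( h_{\mu\nu}^2 - A\, h^2 \right) , \\
% with the four linearized Stueckelberg fields
\mathcal{L}_{\rm mass} &= -\tfrac18 m^2 \left( \left( h_{\mu\nu} + 2\, \partial_{(\mu} \chi_{\nu)} \right)^2 - A \left( h + 2\, \partial_\alpha \chi^\alpha \right)^2 \right) , \\
% simultaneous transformations leaving h_{\mu\nu} + 2 \partial_{(\mu} \chi_{\nu)} invariant
h_{\mu\nu} &\to h_{\mu\nu} + \partial_{(\mu} \xi_{\nu)} , \qquad \chi_\mu \to \chi_\mu - \tfrac12 \xi_\mu , \\
% Fierz-Pauli choice A = 1, in unitary gauge \chi_\mu = 0
\mathcal{L}_{\rm FP\, mass} &= -\tfrac18 m^2 \left( h_{\mu\nu}^2 - h^2 \right) .
\end{align}
```

For A = 1 the terms quadratic in χ_{μ} combine, up to total derivatives, into a Maxwell-like structure proportional to (∂_{μ}χ_{ν} − ∂_{ν}χ_{μ})^{2}, so no higher derivatives of the longitudinal component appear.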
2.2.2.1 Propagating degrees of freedom
To identify the propagating degrees of freedom we may split χ^{a} further into a transverse and a longitudinal mode,
(where the normalization with negative powers of m has been introduced for further convenience).
In terms of h_{μν} and the Stückelberg fields A_{μ} and π the linearized Fierz-Pauli action is
with F_{μν} = ∂_{μ}A_{ν} − ∂_{ν}A_{μ} and Π_{μν} = ∂_{μ}∂_{ν}π and all the indices are raised and lowered with respect to the Minkowski metric.
Terms on the first line represent the kinetic terms for the different fields while the second line represents the mass terms and mixing.
We see that the kinetic term for the field π is hidden in the mixing with h_{μν}. To make the field content explicit, we may diagonalize this mixing by shifting \({h_{\mu \nu}} = {\tilde h_{\mu \nu}} + \pi {\eta _{\mu \nu}}\) and the linearized Fierz-Pauli action is
This decomposition allows us to identify the different degrees of freedom present in massive gravity (at least at the linear level): h^{μν} represents the helicity-2 mode as already present in GR and propagates 2 dofs, A_{μ} represents the helicity-1 mode and propagates 2 dofs, and finally π represents the helicity-0 mode and propagates 1 dof, leading to a total of five dofs as is to be expected for a massive spin-2 field in four dimensions.
The degrees of freedom have not yet been split into their mass eigenstates but on doing so one can easily check that all the degrees of freedom have the same positive mass square m^{2}.
Most of the phenomenology and theoretical consistency of massive gravity is related to the dynamics of the helicity-0 mode. The coupling to matter occurs via the coupling \({h_{\mu \nu}}{T^{\mu \nu}} = {\tilde h_{\mu \nu}}{T^{\mu \nu}} + \pi T\), where T is the trace of the external stress-energy tensor. We see that the helicity-0 mode couples directly to conserved sources (unlike in the case of the Proca field) but the helicity-1 mode does not. In most of what follows we will thus be able to ignore the helicity-1 mode.
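The decomposition used in this subsection can be written out as in the sketch below. It is our own reconstruction in four dimensions, assuming the mass term −(m^{2}/8)(h_{μν}^{2} − h^{2}) with the replacement h_{μν} → h_{μν} + 2∂_{(μ}χ_{ν)}, and the Lichnerowicz operator \(\hat{\mathcal E}\) normalized as in the first line; the numerical coefficients follow from these assumptions and ≃ denotes equality up to total derivatives.

```latex
\begin{align}
% assumed normalization of the Lichnerowicz operator
\hat{\mathcal E}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} &= -\tfrac12 \left( \Box h_{\mu\nu} - 2\, \partial_{(\mu} \partial_\alpha h^\alpha_{\nu)} + \partial_\mu \partial_\nu h
  - \eta_{\mu\nu} \left( \Box h - \partial_\alpha \partial_\beta h^{\alpha\beta} \right) \right) , \\
% split of the Stueckelberg fields
\chi^a &= \frac{1}{m} A^a + \frac{1}{m^2}\, \eta^{ab} \partial_b \pi , \\
% first line: kinetic terms and kinetic mixing; second line: mass terms and mixing
\mathcal{L}_{\rm FP} &\simeq -\tfrac14\, h^{\mu\nu} \hat{\mathcal E}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta}
  - \tfrac12\, h_{\mu\nu} \left( \Pi^{\mu\nu} - [\Pi]\, \eta^{\mu\nu} \right) - \tfrac18 F_{\mu\nu}^2 \nonumber \\
&\quad - \tfrac18 m^2 \left( h_{\mu\nu}^2 - h^2 \right) - \tfrac12 m \left( h^{\mu\nu} - h\, \eta^{\mu\nu} \right) \partial_{(\mu} A_{\nu)} , \\
% identity behind the diagonalization h_{\mu\nu} = \tilde h_{\mu\nu} + \pi \eta_{\mu\nu}
\hat{\mathcal E}^{\alpha\beta}_{\mu\nu} \left( \pi\, \eta_{\alpha\beta} \right) &= - \left( \Pi_{\mu\nu} - [\Pi]\, \eta_{\mu\nu} \right) , \\
% diagonalized action
\mathcal{L}_{\rm FP} &\simeq -\tfrac14\, \tilde h^{\mu\nu} \hat{\mathcal E}^{\alpha\beta}_{\mu\nu} \tilde h_{\alpha\beta}
  - \tfrac34 (\partial \pi)^2 - \tfrac18 F_{\mu\nu}^2 \nonumber \\
&\quad - \tfrac18 m^2 \left( \tilde h_{\mu\nu}^2 - \tilde h^2 \right) + \tfrac32 m^2 \pi^2 + \tfrac34 m^2 \pi\, \tilde h
  - \tfrac12 m \left( \tilde h^{\mu\nu} - \tilde h\, \eta^{\mu\nu} \right) \partial_{(\mu} A_{\nu)} + \tfrac32 m\, \pi\, \partial_\alpha A^\alpha .
\end{align}
```

The π kinetic term −¾(∂π)^{2} arises entirely from the shift of h_{μν}: the pure ΠΠ and AΠ terms generated by the mass term are total derivatives.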
2.2.2.2 Higgs mechanism for gravity
As we shall see in Section 9.1, the graviton mass can also be promoted to a scalar function of one or many other fields (for instance of a different scalar field), m = m(ψ). We can thus wonder whether a dynamical Higgs mechanism for gravity can be considered where the field(s) ψ start in a phase for which the graviton mass vanishes, m(ψ) = 0, and dynamically evolve to acquire a nonvanishing vev for which m(ψ) ≠ 0. Following the same logic as the Abelian Higgs mechanism for electromagnetism, this strategy can only work if the number of dofs in the massless phase m = 0 is the same as that in the massive case m ≠ 0. Simply promoting the mass to a function of an external field is thus not sufficient since the graviton helicity-0 and helicity-1 modes would otherwise be infinitely strongly coupled as m → 0.
To date no candidate has been proposed for which the graviton mass could dynamically evolve from a vanishing value to a finite one without falling into such strong coupling issues. This does not imply that a Higgs mechanism for gravity does not exist, only that none has been found as yet. For instance on AdS, there could be a Higgs mechanism as proposed in [431], where the mass term comes from integrating out some conformal fields with slightly unusual (but not unphysical) ‘transparent’ boundary conditions. This mechanism is specific to AdS and to the existence of a time-like boundary and would not apply on Minkowski or dS.
2.2.3 Van Dam-Veltman-Zakharov discontinuity
As in the case of spin-1, the massive spin-2 field propagates more dofs than the massless one. Nevertheless, these new excitations bear no observational signatures for the spin-1 field when considering an arbitrarily small mass, as seen in Section 2.1.2. The main reason is that the helicity-0 polarization of the photon couples only to the divergence of external sources, which vanishes for conserved sources. As a result no external sources directly excite the helicity-0 mode of a massive spin-1 field. For the spin-2 field, on the other hand, the situation is different as the helicity-0 mode can now couple to the trace of the stress-energy tensor and so generic sources will excite not only the two helicity-2 polarizations of the graviton but also a third helicity-0 polarization, which could in principle have dramatic consequences. To see this more explicitly, let us compute the gravitational exchange amplitude between two sources T^{μν} and T′^{μν} in both the massive and massless gravitational cases.
In the massless case, the theory is diffeomorphism invariant. When considering coupling to external sources, of the form h_{μν}T^{μν}, we thus need to ensure that the symmetry be preserved, which implies that the stress-energy tensor T^{μν} should be conserved, ∂_{μ}T^{μν} = 0. When computing the gravitational exchange amplitude between two sources we thus restrict ourselves to conserved ones. In the massive case, there is a priori no reason to restrict ourselves to conserved sources, so long as their divergences cancel in the massless limit m → 0.
2.2.3.1 Massive spin-2 field
Let us start with the massive case, and consider the response to a conserved external source T^{μν},
The linearized Einstein equation is then
To solve this modified linearized Einstein equation for h_{μν} we consider the trace and the divergence separately,
As is already apparent at this level, the massless limit m → 0 is not smooth, which is at the origin of the vDVZ discontinuity (for instance we see immediately that for a conserved source the linearized Ricci scalar vanishes, ∂_{μ}∂_{ν}h^{μν} − □h = 0; see Refs. [465, 497]. This linearized vDVZ discontinuity was recently pointed out again in [193]). As has been known for many decades, this discontinuity (or the fact that the Ricci scalar vanishes) is an artefact of the linearized theory and is resolved by the Vainshtein mechanism [463] as we shall see later.
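As a sketch of this step, assume the equation of motion is normalized as \(\hat\varepsilon_{\mu\nu}^{\alpha\beta}h_{\alpha\beta} + \frac{1}{2}m^2(h_{\mu\nu} - h\eta_{\mu\nu}) = T_{\mu\nu}/M_{\rm Pl}\), with a transverse kinetic operator whose trace is □h − ∂_{μ}∂_{ν}h^{μν}. These normalizations are assumptions made for illustration, and overall numerical factors depend on them. For a conserved source, the divergence and the trace of this equation then give

```latex
\begin{align}
\partial^\mu:\quad & \tfrac{1}{2} m^2 \left(\partial^\mu h_{\mu\nu} - \partial_\nu h\right)
   = \frac{1}{M_{\rm Pl}}\,\partial^\mu T_{\mu\nu} = 0
   \;\Rightarrow\; \partial_\mu \partial_\nu h^{\mu\nu} - \Box h = 0 , \\
\eta^{\mu\nu}:\quad & \left(\Box h - \partial_\mu \partial_\nu h^{\mu\nu}\right) - \tfrac{3}{2} m^2 h
   = \frac{T}{M_{\rm Pl}}
   \;\Rightarrow\; h = -\frac{2}{3\, m^2 M_{\rm Pl}}\, T .
\end{align}
```

Under these assumptions the trace is fixed algebraically by the source with a coefficient scaling as 1/m², which is one way to see that the limit m → 0 cannot be taken smoothly at the linear level.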
Plugging these expressions back into the modified Einstein equation, we get
with
The propagator for a massive spin-2 field is thus given by
where \(f_{\mu\nu\alpha\beta}^{\rm massive}\) is the polarization tensor,
In Fourier space we have
The amplitude exchanged between two sources T_{μν} and T′_{μν} via a massive spin-2 field is thus given by
As mentioned previously, to compare this result with the massless case, the sources ought to be conserved in the massless limit, \(\partial_\mu T^{\mu}_{\nu}, \partial_\mu T'^{\mu}_{\nu} \to 0\) as m → 0. The gravitational exchange amplitude in the massless limit is thus given by
We now compare this result with the amplitude exchanged by a purely massless graviton.
2.2.3.2 Massless spin-2 field
In the massless case, the equation of motion (2.50) reduces to the linearized Einstein equation
where diffeomorphism invariance requires the stress-energy to be conserved, \(\partial_\mu T^{\mu}_{\nu} = 0\). In this case the transverse part of this equation is trivially satisfied (as a consequence of the Bianchi identity which follows from symmetry). Since the theory is invariant under diffeomorphism transformations (2.38), one can choose any convenient gauge, for instance de Donder (or harmonic) gauge
In de Donder gauge, the Einstein equation then reduces to
The propagator for a massless spin-2 field is thus given by
where \(f_{\mu\nu\alpha\beta}^{(0)}\) is the polarization tensor,
The amplitude exchanged between two sources T_{μν} and T′_{μν} via a genuinely massless spin-2 field is thus given by
and differs from the result (2.60) in the small mass limit. This difference between the massless limit of the massive propagator and the massless propagator (and gravitational exchange amplitude) is a well-known fact and was first pointed out by van Dam, Veltman and Zakharov in 1970 [465, 497]. The resolution to this ‘problem’ lies within the Vainshtein mechanism [463]. In 1972, Vainshtein showed that a theory of massive gravity becomes strongly coupled at a low energy scale when the graviton mass is small. As a result, the linear theory is no longer appropriate to describe the theory in the limit of small mass and one should keep track of the nonlinear interactions (very much as we do when approaching the Schwarzschild radius in GR). We shall see in Section 10.1 how a special set of interactions dominate in the massless limit and are responsible for the screening of the extra degrees of freedom present in massive gravity.
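The size of the discontinuity can be illustrated numerically. The sketch below assumes that, for conserved sources, the two amplitudes differ only through the trace coefficient, 1/3 in the massless limit of the massive theory and 1/2 for a genuinely massless graviton, and it drops the common propagator factor. The mostly-plus signature, the helper names and the unit-normalized sources are assumptions made for illustration.

```python
# Sketch: tensor structure T_{mu nu} T'^{mu nu} - c T T' of the exchange amplitude,
# with c = 1/3 (massive theory, m -> 0) or c = 1/2 (massless). Signature (-,+,+,+).

ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # diagonal, equal to its inverse

def trace(t):
    """Trace eta^{mu nu} T_{mu nu} of a tensor given with lower indices."""
    return sum(ETA[m][m] * t[m][m] for m in range(4))

def contract(t1, t2):
    """Full contraction T1_{mu nu} T2^{mu nu}, indices raised with the diagonal eta."""
    return sum(ETA[m][m] * ETA[n][n] * t1[m][n] * t2[m][n]
               for m in range(4) for n in range(4))

def amplitude_structure(t1, t2, trace_coeff):
    """Amplitude numerator, without the common propagator factor."""
    return contract(t1, t2) - trace_coeff * trace(t1) * trace(t2)

def static_mass(mass):
    """Static point mass: only T_00 is non-zero."""
    t = [[0.0] * 4 for _ in range(4)]
    t[0][0] = mass
    return t

def light_ray(energy):
    """Traceless null source moving along x: T_{mu nu} = E k_mu k_nu, k_mu = (-1, 1, 0, 0)."""
    t = [[0.0] * 4 for _ in range(4)]
    t[0][0] = t[1][1] = energy
    t[0][1] = t[1][0] = -energy
    return t

star, planet, light = static_mass(1.0), static_mass(1.0), light_ray(1.0)

# Ratio of (massive, m -> 0) to (massless) amplitudes for each pair of sources.
matter_ratio = amplitude_structure(star, planet, 1 / 3) / amplitude_structure(star, planet, 1 / 2)
light_ratio = amplitude_structure(star, light, 1 / 3) / amplitude_structure(star, light, 1 / 2)
print(matter_ratio, light_ratio)
```

With these assumptions the ratio is 4/3 between two static masses but 1 between a static mass and light, so rescaling Newton's constant to match one observable necessarily spoils the other at the linear level.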
Another ‘non-GR’ effect was also recently pointed out in Ref. [280] where a linear analysis showed that massive gravity predicts different spin-orientations for spinning objects.
2.3 From linearized diffeomorphism to full diffeomorphism invariance
When considering the massless and non-interactive spin-2 field in Section 2.2.1, the linear gauge invariance (2.38) is exact. However, if this field is to be probed and communicates with the rest of the world, the gauge symmetry is forced to include nonlinear terms which in turn forces the kinetic term to become fully nonlinear. The result is the well-known fully covariant Einstein-Hilbert term \(M_{\rm Pl}^2\sqrt{-g}R\), where R is the scalar curvature associated with the metric g_{μν} = η_{μν} + h_{μν}/M_{Pl}.
To see this explicitly, let us start with the linearized theory and couple it to an external source \(T_0^{\mu\nu}\), via the coupling
This coupling preserves diffeomorphism invariance if the source is conserved, \(\partial_\mu T_0^{\mu\nu} = 0\). To be more explicit, let us consider a massless scalar field φ which satisfies the Klein-Gordon equation □φ = 0. A natural choice for the stress-energy tensor T^{μν} is then
so that the Klein-Gordon equation automatically guarantees the conservation of the stress-energy tensor on-shell at the linear level and linearized diffeomorphism invariance. However, the very coupling between the scalar field and the spin-2 field affects the Klein-Gordon equation in such a way that beyond the linear order, the stress-energy tensor given in (2.68) fails to be conserved. When considering the coupling (2.67), the Klein-Gordon equation receives corrections of the order of h_{μν}/M_{Pl}
implying a failure of conservation of \(T_0^{\mu\nu}\) at the same order,
The resolution is of course to include nonlinear corrections in h/M_{Pl} in the coupling with external matter,
and promote diffeomorphism invariance to a nonlinearly realized gauge symmetry, symbolically,
so this gauge invariance is automatically satisfied on-shell order by order in h/M_{Pl}, i.e., the scalar field (or general matter field) equations of motion automatically imply the appropriate relation for the stress-energy tensor to all orders in h/M_{Pl}. The resulting symmetry is the well-known fully nonlinear coordinate transformation invariance (or full diffeomorphism invariance or covariance^{Footnote 4}), which requires the stress-energy tensor to be covariantly conserved. To satisfy this symmetry, the kinetic term (2.36) should then be promoted to a fully nonlinear contribution,
Just as the linearized version \(h^{\mu\nu}\hat\varepsilon_{\mu\nu}^{\alpha\beta}h_{\alpha\beta}\) was unique, the nonlinear realization \(\sqrt{-g}R\) is also unique.^{Footnote 5} As a result, any theory of an interacting spin-2 field is necessarily fully nonlinear and leads to the theory of gravity where nonlinear diffeomorphism invariance (or covariance) plays the role of the local gauge symmetry that projects out four out of the potential six degrees of freedom of the graviton and prevents the excitation of any ghost by the kinetic term.
The situation is very different from that of a spin-1 field as seen earlier, where coupling with other fields can be implemented at the linear order without affecting the U(1) gauge symmetry. The difference is that in the case of a U(1) symmetry, there is a unique nonlinear completion of that symmetry, i.e., the unique nonlinear completion of a U(1) is nothing else but a U(1). Thus any nonlinear Lagrangian which preserves the full U(1) symmetry will be a consistent interacting theory. On the other hand, for spin-2 fields, there are two, and only two, ways to nonlinearly complete linear diffs: one as linear diffs in the full theory and the other as full nonlinear diffs. While it is possible to write self-interactions which preserve linear diffs, there are no interactions between matter and h_{μν} which preserve linear diffs. Thus any theory of gravity must exhibit full nonlinear diffs, and this is in this sense what leads us to GR.
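The on-shell conservation used earlier in this section can be checked in one line. The sketch below assumes the standard stress-energy tensor of a free massless scalar, \(T_0^{\mu\nu} = \partial^\mu\varphi\,\partial^\nu\varphi - \frac{1}{2}\eta^{\mu\nu}(\partial\varphi)^2\); the overall normalization is an assumption and does not affect the conclusion.

```latex
\begin{align}
\partial_\mu T_0^{\mu\nu}
  &= \Box\varphi\,\partial^\nu\varphi
     + \partial^\mu\varphi\,\partial_\mu\partial^\nu\varphi
     - \partial^\nu\partial_\alpha\varphi\,\partial^\alpha\varphi \\
  &= \Box\varphi\,\partial^\nu\varphi .
\end{align}
```

The divergence therefore vanishes exactly when □φ = 0, and any correction of order h/M_{Pl} to the Klein-Gordon equation feeds directly into a non-zero divergence at the same order.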
2.4 Nonlinear Stückelberg decomposition
2.4.1 On the need for a reference metric
We have introduced the spin-2 field h_{μν} as the perturbation about flat spacetime. When considering the theory of a field of given spin it is only natural to work with Minkowski as our spacetime metric, since the notion of spin follows from that of Poincaré invariance. Now when extending the theory nonlinearly, we may also extend the theory about a different reference metric. When dealing with a reference metric different from Minkowski, one loses the interpretation of the field as massive spin-2, but one can still get a consistent theory. One could also wonder whether it is possible to write a theory of massive gravity without the use of a reference metric at all. This interesting question was investigated in [75], where it was shown that the only consistent alternative is to consider a function of the metric determinant. However, as shown in [75], the consistent function of the determinant is the cosmological constant and does not provide a mass for the graviton.
2.4.2 Nonlinear Stückelberg
Full diffeomorphism invariance (or covariance) indicates that the theory should be built out of scalar objects constructed out of the metric g_{μν} and other tensors. However, as explained previously a theory of massive gravity requires the notion of a reference metric^{Footnote 6} f_{μν} (which may be Minkowski f_{μν} = η_{μν}) and at the linearized level, the mass for gravity was not built out of the full metric g_{μν}, but rather out of the fluctuation h_{μν} about this reference metric which does not transform as a tensor under general coordinate transformations. As a result the mass term breaks covariance.
This result is already transparent at the linear level where the mass term (2.39) breaks linearized diffeomorphism invariance. Nevertheless, that gauge symmetry can always be ‘formally’ restored using the Stückelberg trick, which amounts to replacing the reference metric (so far we have been working with the flat Minkowski metric as the reference) by
and transforming χ_{μ} under linearized diffeomorphism in such a way that the combination h_{μν} − 2∂_{(μ}χ_{ν)} remains invariant. Now that the symmetry is nonlinearly realized and replaced by general covariance, this Stückelberg trick should also be promoted to a fully covariant realization.
Following the same Stückelberg trick nonlinearly, one can ‘formally restore’ covariance by including four Stückelberg fields ϕ^{a} (a = 0, 1, 2, 3) and promoting the reference metric f_{μν}, which may or may not be Minkowski, to a tensor [446, 27],
As we can see from this last expression, \(\tilde f_{\mu\nu}\) transforms as a tensor under coordinate transformations as long as each of the four fields ϕ^{a} transforms as a scalar. We may now construct the theory of massive gravity as a scalar Lagrangian of the tensors \(\tilde f_{\mu\nu}\) and g_{μν}. In unitary gauge, where the Stückelberg fields are ϕ^{a} = x^{a}, we simply recover \(\tilde f_{\mu\nu} = f_{\mu\nu}\).
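The unitary-gauge statement can be made concrete with a few lines of code. The sketch below assumes the Stückelberg-restored metric takes the form \(\tilde f_{\mu\nu} = \partial_\mu\phi^a\,\partial_\nu\phi^b f_{ab}\) and evaluates it at a single point from the Jacobian ∂ϕ^{a}/∂x^{μ}; the function name and the sample field configurations are illustrative assumptions.

```python
# Sketch: f~_{mu nu} = (d phi^a / d x^mu) (d phi^b / d x^nu) f_ab at one spacetime point.

def stueckelberg_metric(jac, f):
    """Return f~_{mu nu}, where jac[a][mu] = d phi^a / d x^mu and f[a][b] = f_ab."""
    n = len(f)
    return [[sum(jac[a][mu] * jac[b][nu] * f[a][b]
                 for a in range(n) for b in range(n))
             for nu in range(n)] for mu in range(n)]

# Flat reference metric, signature (-,+,+,+).
ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

# Unitary gauge: phi^a = x^a, so the Jacobian is the identity matrix.
IDENTITY = [[1.0 if a == mu else 0.0 for mu in range(4)] for a in range(4)]
f_unitary = stueckelberg_metric(IDENTITY, ETA)

# A non-trivial configuration, phi^a = (t, 2x, y, z): the xx component is rescaled.
STRETCH = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
f_stretched = stueckelberg_metric(STRETCH, ETA)

print(f_unitary == ETA, f_stretched[1][1])
```

In unitary gauge the function returns the reference metric unchanged, while any other configuration of the four fields changes the components of \(\tilde f_{\mu\nu}\) in the way a tensor built from scalars should.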
This Stückelberg trick for massive gravity dates already from Green and Thorn [267] and from Siegel [446], introduced then within the context of open string theory. In the same way as the massless graviton naturally emerges in the closed string sector, open strings also have spin-2 excitations but whose lowest energy state is massive at tree level (they only become massless once quantum corrections are considered). Thus at the classical level, open strings contain a description of massive excitations of a spin-2 field, where gauge invariance is restored thanks to the same Stückelberg fields as introduced in this section. In open string theory, these Stückelberg fields naturally arise from the ghost coordinates. When constructing the nonlinear theory of massive gravity from extra dimensions, we shall see that in that context the Stückelberg fields naturally arise as the shift from the extra dimension.
For later convenience, it will be useful to construct the following tensor quantity,
in unitary gauge, \({\mathbb X} = g^{-1}f\).
2.4.3 Alternative Stückelberg trick
An alternative way to Stückelberize the reference metric is to express it as
As nicely explained in Ref. [14], both matrices \(X^{\mu}_{\nu}\) and \(Y^{a}_{b}\) have the same eigenvalues, so one can choose either one of them in the definition of the massive gravity Lagrangian without any distinction. The formulation in terms of Y rather than X was originally used in Ref. [94], although unsuccessfully as the potential proposed there exhibits the BD ghost instability (see for instance Ref. [60]).
2.4.4 Helicity decomposition
If we now focus on the flat reference metric, f_{μν} = η_{μν}, further split the Stückelberg fields as \(\phi^a = x^a - \frac{1}{M_{\rm Pl}}\chi^a\) and identify the index a with a Lorentz index,^{Footnote 7} we obtain the nonlinear generalization of the Stückelberg trick used in Section 2.2.2
where in the second equality we have used the split of χ^{a} performed in (2.46) in terms of the helicity-0 and -1 modes, and all indices are raised and lowered with respect to η_{μν}.
In other words, the fluctuations about flat spacetime are promoted to the tensor H_{μν}
with
The fields χ^{a} are introduced to restore the gauge invariance (full diffeomorphism invariance). We can now always set a gauge where h_{μν} is transverse and traceless at the linearized level and A_{μ} is transverse. In this gauge the quantities h_{μν}, A_{μ} and π represent the helicity decomposition of the metric: h_{μν} is the helicity-2 part of the graviton, A_{μ} the helicity-1 part and π the helicity-0 part. The fact that these quantities continue to correctly identify the physical degrees of freedom nonlinearly in the limit M_{Pl} → ∞ is nontrivial and has been derived in [143].
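For the helicity-0 mode alone, the nonlinear structure generated by this decomposition can be checked numerically. The sketch below assumes ϕ^{a} = x^{a} − η^{aμ}∂_{μ}π/(M_{Pl}m²) and verifies at a single point that the restored metric equals η_{μν} − 2Π_{μν}/(M_{Pl}m²) + Π_{μα}η^{αβ}Π_{βν}/(M_{Pl}m²)², with Π_{μν} = ∂_{μ}∂_{ν}π. The numerical values of Π_{μν} and of the scale M_{Pl}m² are arbitrary test inputs, and all function names are hypothetical.

```python
# Sketch: helicity-0 Stueckelberg expansion of the flat reference metric at one point.

ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # equal to its own inverse

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    """Transpose of a 4x4 matrix."""
    return [[a[j][i] for j in range(4)] for i in range(4)]

def helicity0_jacobian(pi_dd, scale):
    """J^a_mu = delta^a_mu - eta^{ab} Pi_{b mu} / scale, with scale standing for M_Pl m^2."""
    eta_pi = matmul(ETA, pi_dd)
    return [[(1.0 if a == mu else 0.0) - eta_pi[a][mu] / scale for mu in range(4)]
            for a in range(4)]

def restored_metric(jac):
    """f~_{mu nu} = J^a_mu eta_ab J^b_nu, i.e. the matrix product J^T eta J."""
    return matmul(transpose(jac), matmul(ETA, jac))

def expanded_metric(pi_dd, scale):
    """eta - 2 Pi / scale + Pi eta^{-1} Pi / scale^2, all indices down."""
    pi_sq = matmul(pi_dd, matmul(ETA, pi_dd))
    return [[ETA[m][n] - 2.0 * pi_dd[m][n] / scale + pi_sq[m][n] / scale ** 2
             for n in range(4)] for m in range(4)]

# Arbitrary symmetric test values for Pi_{mu nu} (illustrative only).
PI = [[0.3, 0.1, 0.0, -0.2],
      [0.1, -0.4, 0.5, 0.0],
      [0.0, 0.5, 0.2, 0.1],
      [-0.2, 0.0, 0.1, 0.6]]
SCALE = 2.0

lhs = restored_metric(helicity0_jacobian(PI, SCALE))
rhs = expanded_metric(PI, SCALE)
max_diff = max(abs(lhs[m][n] - rhs[m][n]) for m in range(4) for n in range(4))
print(max_diff)
```

The quadratic term in Π is what makes the nonlinear mass terms discussed next contain higher powers of ∂²π.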
2.4.5 Nonlinear Fierz-Pauli
The most straightforward nonlinear extension of the Fierz-Pauli mass term is as follows
this mass term is then invariant under nonlinear coordinate transformations. This nonlinear formulation was used for instance in [27]. Alternatively, one may also generalize the Fierz-Pauli mass nonlinearly as follows [75]
A priori, the linear Fierz-Pauli action for massive gravity can be extended nonlinearly in an arbitrary number of ways. However, as we shall see below, most of these generalizations generate a ghost nonlinearly, known as the Boulware-Deser (BD) ghost. In Part II, we shall see that the extension of the Fierz-Pauli mass term to a nonlinear theory free of the BD ghost is unique (up to two constant parameters).
2.5 BoulwareDeser ghost
The easiest way to see the appearance of a ghost at the nonlinear level is to follow the Stückelberg trick nonlinearly and observe the appearance of an Ostrogradsky instability [111, 173], although the original formulation was performed in unitary gauge in [75] in the ADM language (Arnowitt, Deser and Misner, see Ref. [29]). In this section we shall focus on the flat reference metric, f_{μν} = η_{μν}.
Focusing solely on the helicity-0 mode π to start with, the tensor \({\mathbb X}^{\mu}_{\nu}\) defined in (2.76) is expressed as
where at this level all indices are raised and lowered with respect to the flat reference metric η_{μν}. Then the Fierz-Pauli mass term (2.83) reads
Upon integration by parts, we notice that the quadratic term in (2.86) is a total derivative, which is another way to see the special structure of the Fierz-Pauli mass term. Unfortunately this special fact does not propagate to higher order and the cubic and quartic interactions are genuine higher order operators which lead to equations of motion with quartic and cubic derivatives. In other words these higher order operators ([Π^{3}] − [Π][Π^{2}]) and ([Π^{4}] − [Π^{2}]^{2}) propagate an additional degree of freedom which, by Ostrogradsky’s theorem, always enters as a ghost. While at the linear level these operators might be irrelevant, their existence implies that one can always find an appropriate background configuration π = π_{0} + δπ, such that the ghost is manifest
with Z^{μναβ} = 3∂^{μ}∂^{α}π_{0}η^{νβ} − □π_{0}η^{μα}η^{νβ} − 2∂^{μ}∂^{ν}π_{0}η^{αβ} + ⋯. This implies that nonlinearly (or around a nontrivial background), the Fierz-Pauli mass term propagates an additional degree of freedom which is a ghost, namely the BD ghost. The mass of this ghost depends on the background configuration π_{0},
As we shall see below, the resolution of the vDVZ discontinuity lies in the Vainshtein mechanism for which the field takes a large vacuum expectation value, ∂^{2}π_{0} ≫ M_{Pl}m^{2}, which in the present context would lead to a ghost with an extremely low mass, \(m_{{\rm{ghost}}}^2 \lesssim {m^2}\).
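The statement above that the quadratic term is a total derivative can be verified directly. The sketch assumes the notation Π_{μν} = ∂_{μ}∂_{ν}π, with square brackets denoting the trace taken with η^{μν}, so that the quadratic Fierz-Pauli structure is [Π^{2}] − [Π]^{2}.

```latex
\begin{align}
[\Pi^2] - [\Pi]^2
  &= \partial_\mu\partial_\nu\pi\,\partial^\mu\partial^\nu\pi - (\Box\pi)^2 \\
  &= \partial_\mu\!\left(\partial_\nu\pi\,\partial^\mu\partial^\nu\pi
     - \partial^\mu\pi\,\Box\pi\right) .
\end{align}
```

Expanding the second line, the two third-derivative terms cancel against each other. The combinations [Π^{3}] − [Π][Π^{2}] and [Π^{4}] − [Π^{2}]^{2} admit no such rewriting, consistent with their being genuine higher-derivative operators.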
Choosing another nonlinear extension for the Fierz-Pauli mass term as in (2.84) does not seem to help much,
where we have integrated by parts on the second line, and we recover exactly the same type of higher derivatives already at the cubic level, so the BD ghost is also present in (2.84).
Alternatively the mass term was also generalized to include curvature invariants as in Ref. [69]. This theory was shown to be ghost-free at the linear level on FLRW but not yet nonlinearly.
2.5.1 Function of the Fierz-Pauli mass term
As an extension of the Fierz-Pauli mass term, one could instead write a more general function of it, as considered in Ref. [75]
however, one can easily see that, if a mass term is actually present, i.e., F′ ≠ 0, there is no analytic choice of the function F which would circumvent the nonlinear propagation of the BD ghost. Expanding F into a Taylor expansion, we see for instance that the only choice to prevent the cubic higher-derivative interactions in π, [Π^{3}] − [Π][Π^{2}], is F′(0) = 0, which removes the mass term at the same time. If F(0) ≠ 0 but F′(0) = 0, the theory is massless about the specific reference metric, but infinitely strongly coupled about other backgrounds.
Instead, to prevent the presence of the BD ghost fully nonlinearly (or equivalently about any background), one should construct the mass term (or rather potential term) in such a way that all the higher derivative operators involving the helicity-0 mode (∂^{2}π)^{n} are total derivatives. This is precisely what is achieved in the “ghost-free” model of massive gravity presented in Part II. In Part I, which follows, we shall use higher dimensional GR to get some insight and intuition on how to construct a consistent theory of massive gravity.
3 Part I Massive Gravity from Extra Dimensions
4 Higher-Dimensional Scenarios
As seen in Section 2.5, the ‘most natural’ nonlinear extension of the Fierz-Pauli mass term bears a ghost. Constructing consistent theories of massive gravity has actually been a challenging task for years, and higher-dimensional scenarios can provide excellent frameworks for explicit realizations of massive gravity. The main motivation behind relying on higher dimensional gravity is twofold:

The five-dimensional theory is explicitly covariant.

A massless spin-2 field in five dimensions has five degrees of freedom, which corresponds to the correct number of dofs for a massive spin-2 field in four dimensions without the pathological BD ghost.
While string theory and other higher dimensional theories give rise naturally to massive gravitons, they usually include a massless zero-mode. Furthermore, in the simplest models, as soon as the first massive mode is relevant so is an infinite tower of massive (Kaluza-Klein) modes and one is never in a regime where a single massive graviton dominates, or at least this was the situation until the Dvali-Gabadadze-Porrati model (DGP) [208, 209, 207] provided the first explicit model of (soft) massive gravity, based on a higher-dimensional braneworld model.
In the DGP model the graviton has a soft mass in the sense that its propagator does not have a simple pole at fixed value m, but rather admits a resonance. Considering the Källén-Lehmann spectral representation [331, 374], the spectral density function ρ(μ^{2}) in DGP is of the form
and so DGP corresponds to a theory of massive gravity with a resonance with width Δm ∼ m_{0} about m = 0.
In a Kaluza-Klein decomposition of a flat extra dimension we have, on the other hand, an infinite tower of massive modes with spectral density function
We shall see in the section on deconstruction (5) how one can truncate this infinite tower by performing a discretization in real space rather than in momentum space à la Kaluza-Klein, so as to obtain a theory of a single massive graviton
or a theory of multigravity (with N interacting gravitons),
In this language, bigravity is the special case of multigravity where N = 2. These different spectral representations, together with the cascading gravity extension of DGP are represented in Figure 1.
Recently, another higher dimensional embedding of bigravity was proposed in Ref. [495]. Rather than performing a discretization of the extra dimension, the idea behind this model is to consider a two-brane DGP model, where the radion or separation between these branes is stabilized via a Goldberger-Wise stabilization mechanism [255] where the brane and the bulk include a specific potential for the radion. At low energy the mass spectrum can be truncated to a massless mode and a massive mode, reproducing a bigravity theory. However, the stabilization mechanism involves a relatively low scale and the correspondence breaks down above it. Nevertheless, this provides a first proof of principle for how to embed such a model in a higher-dimensional picture without discretization and could be useful to tackle some of the open questions of massive gravity.
In what follows we review how five-dimensional gravity is a useful starting point in order to generate consistent four-dimensional theories of massive gravity, either for soft massive gravity à la DGP and its extensions, or for hard massive gravity following a deconstruction framework.
The DGP model has played the role of a precursor for many developments in modified and massive gravity and it is beyond the scope of this review to summarize all of them. In this review we briefly summarize the DGP model and some key aspects of its phenomenology, and refer the reader to other reviews (see for instance [232, 390, 234]) for more details on the subject.
In this section, A, B, C ⋯ = 0, …, 4 represent five-dimensional spacetime indices and μ, ν, α ⋯ = 0, …, 3 label four-dimensional spacetime indices. y = x^{4} represents the fifth additional dimension, {x^{A}} = {x^{μ}, y}. The five-dimensional metric is given by ^{(5)}g_{AB}(x, y) while the four-dimensional metric is given by g_{μν}(x). The five-dimensional scalar curvature is ^{(5)}R[G] while R = R[g] is the four-dimensional scalar curvature. We use the same notation for the Einstein tensor where ^{(5)}G_{AB} is the five-dimensional one and G_{μν} represents the four-dimensional one built out of g_{μν}.
When working in the Einstein-Cartan formalism of gravity, \({\mathbb A},{\mathbb B},{\mathbb C}\), ⋯ label five-dimensional Lorentz indices and a, b, c, ⋯ label the four-dimensional ones.
5 The Dvali-Gabadadze-Porrati Model
The idea behind the DGP model [209, 208, 207] is to start with a four-dimensional braneworld in an infinite-size extra dimension. A priori gravity would then be fully five-dimensional, with respective Planck scale M_{5}, but the matter fields localized on the brane could lead to an induced curvature term on the brane with respective Planck scale M_{Pl}. See [22] for a potential embedding of this model within string theory.
At small distances the induced curvature dominates and gravity behaves as in four dimensions, while at large distances the leakage of gravity within the extra dimension weakens the force of gravity. The DGP model is thus a model of modified gravity in the infrared, and as we shall see, the graviton effectively acquires a soft mass, or resonance.
5.1 Gravity induced on a brane
We start with the five-dimensional action for the DGP model [209, 208, 207] with a brane localized at y = 0,
where ψ_{i} represent matter field species confined to the brane with stress-energy tensor T_{μν}. This brane is considered to be an orbifold brane enjoying a ℤ_{2}-orbifold symmetry (so that the physics at y < 0 is the mirror copy of that at y > 0). We choose the convention where we consider −∞ < y < ∞, which is why we have a factor of \(M_5^3/4\) rather than the \(M_5^3/2\) we would have had if we had only considered one side of the brane, for instance y ≥ 0.
The five-dimensional Einstein equations of motion are then given by
with
The Israel matching condition on the brane [323] can be obtained by integrating this equation over \(\int_{-\varepsilon}^{\varepsilon}{\rm d}y\) and taking the limit ε → 0, so that the jump in the extrinsic curvature across the brane is related to the Einstein tensor and stress-energy tensor of the matter field confined on the brane.
5.1.1 Perturbations about flat spacetime
In DGP the four-dimensional graviton is effectively massive. To see this explicitly, we look at perturbations about flat spacetime
Since at this level we are dealing with five-dimensional GR, we are free to set the five-dimensional gauge of our choice and choose five-dimensional de Donder gauge (a discussion about the brane-bending mode will follow)
In this gauge the five-dimensional Einstein tensor is simply
where \({\square_5} = \square + \partial _y^2\) is the five-dimensional d’Alembertian and □ is the four-dimensional one.
Since there is no source along the μy or yy directions (^{(5)}T_{μy} = 0 = ^{(5)}T_{yy}), we can immediately infer that
up to a homogeneous mode which in this setup we set to zero. This does not properly account for the brane-bending mode but for the sake of this analysis it will give the correct expression for the metric fluctuation h_{μν}. We will see in Section 4.2 how to keep track of the brane-bending mode which is partly encoded in h_{yy}.
Using these relations in the five-dimensional de Donder gauge, we deduce the relation for the purely four-dimensional part of the metric perturbation,
Using these relations in the projected Einstein equation, we get
where \(h \equiv h_\alpha^\alpha = \eta^{\mu\nu}h_{\mu\nu}\) is the four-dimensional trace of the perturbations.
Solving this equation with the requirement that h_{μν} → 0 as y → ±∞, we infer the following profile for the perturbations along the extra dimension
where the □ should really be thought of in Fourier space, and h_{μν}(x) is set from the boundary conditions on the brane. Integrating the Einstein equation across the brane, from −ε to +ε, we get
yielding the modified linearized Einstein equation on the brane
where all the metric perturbations are the ones localized at y = 0 and the constant mass scale m_{0} is given by
Interestingly, we see the special Fierz-Pauli combination h_{μν} − hη_{μν} appearing naturally from the five-dimensional nature of the theory. At this level, this corresponds to a linearized theory of massive gravity with a scale-dependent effective mass \(m^2(\square) = m_0\sqrt{-\square}\), which can be thought of in Fourier space as m^{2}(k) = m_{0}k. We could now follow the same procedure as derived in Section 2.2.3 and obtain the expression for the sourced metric fluctuation on the brane
where T = η^{μν}T_{μν} is the trace of the four-dimensional stress-energy tensor localized on the brane. This yields the following gravitational exchange amplitude between two conserved sources T_{μν} and T′_{μν},
where the polarization tensor \(f_{\mu\nu\alpha\beta}^{\rm massive}\) is the same as that given for Fierz-Pauli in (2.57) in terms of m_{0}. In particular the polarization tensor includes the standard factor of −(1/3)Tη_{μν} as opposed to −(1/2)Tη_{μν} as would be the case in GR. This is again the manifestation of the vDVZ discontinuity which is cured by the Vainshtein mechanism as for Fierz-Pauli massive gravity. See [165] for the explicit realization of the Vainshtein mechanism in DGP, which is where it was first shown to work explicitly.
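The origin of the scale-dependent mass can be sketched in Fourier space along the brane directions. The sketch assumes that away from the brane each mode obeys (∂_{y}² − k²)h_{μν} = 0, with k² the four-dimensional momentum squared taken positive, and that only the solution decaying as y → ±∞ is kept.

```latex
\begin{align}
\left(\partial_y^2 - k^2\right) h_{\mu\nu}(k,y) = 0
  \quad &\Rightarrow \quad h_{\mu\nu}(k,y) = e^{-k|y|}\, h_{\mu\nu}(k) , \\
\lim_{\varepsilon\to 0}\Big[\partial_y h_{\mu\nu}\Big]_{-\varepsilon}^{+\varepsilon}
  &= -2k\, h_{\mu\nu}(k) .
\end{align}
```

The jump in the y-derivative across the brane is thus linear in k. This is how a term proportional to \(\sqrt{-\square}\,(h_{\mu\nu} - h\eta_{\mu\nu})\) arises in the equation on the brane, and why the effective mass scales as m^{2}(k) = m_{0}k rather than being a constant.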
5.1.2 Spectral representation
In Fourier space the propagator for the graviton in DGP is given by
with the massive polarization tensor f^{massive} defined in (2.58) and
which can be written in the Källén-Lehmann spectral representation as a sum of free propagators with mass μ,
with the spectral density ρ (μ^{2})
which is represented in Figure 1. As already emphasized, the graviton in DGP cannot be thought of as a single massive mode, but rather as a resonance peaked about μ = 0.
We see that the spectral density is positive for any μ^{2} > 0, confirming the fact that about the normal (flat) branch of DGP there is no ghost.
Notice as well that in the massless limit m_{0} → 0, we see appearing a representation of the Dirac delta function,
and so the massless mode is singled out in the massless limit of DGP (with the different tensor structure given by \(f_{\mu\nu\alpha\beta}^{\rm massive} \ne f_{\mu\nu\alpha\beta}^{(0)}\), which is the origin of the vDVZ discontinuity; see Section 2.2.3).
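The spectral representation can be checked numerically. The sketch below assumes that the scalar part of the propagator is 1/(k² + m_{0}k), as suggested by m^{2}(k) = m_{0}k, and that the spectral density takes the form ρ(μ²) = m_{0}/(πμ(μ² + m_{0}²)); this explicit form, the function names and the sample values of k and m_{0} are assumptions made for illustration.

```python
import math

def dgp_spectral_density(mu, m0):
    """Assumed DGP spectral density rho(mu^2) = m0 / (pi * mu * (mu^2 + m0^2))."""
    return m0 / (math.pi * mu * (mu ** 2 + m0 ** 2))

def spectral_integral(k, m0, n=2000):
    """Integrate rho(mu^2) / (k^2 + mu^2) over mu^2 from 0 to infinity.

    The substitution mu = m0 * tan(theta) maps the range to theta in (0, pi/2),
    where the integrand is smooth and the midpoint rule converges quickly.
    """
    step = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * step
        mu = m0 * math.tan(theta)
        # d(mu^2)/d(theta) = 2 * mu * m0 / cos(theta)^2
        jacobian = 2.0 * mu * m0 / math.cos(theta) ** 2
        total += dgp_spectral_density(mu, m0) / (k ** 2 + mu ** 2) * jacobian * step
    return total

def dgp_propagator(k, m0):
    """Scalar part of the Euclidean DGP propagator, 1 / (k^2 + m0 * k)."""
    return 1.0 / (k ** 2 + m0 * k)

k, m0 = 1.0, 0.1
print(spectral_integral(k, m0), dgp_propagator(k, m0))
```

Under these assumptions the weighted sum of free massive propagators reproduces the resonance-type propagator, and the positivity of ρ for all μ² > 0 is manifest from its explicit form.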
5.2 Brane-bending mode
5.2.1 Five-dimensional gauge-fixing
In Section 4.1.1 we have remained vague about the gauge-fixing and the implications for the brane position. The brane-bending mode is actually important to keep track of in DGP and we shall do that properly in what follows by keeping all the modes.
We work in the five-dimensional ADM split with the lapse \(N = 1/\sqrt{g^{yy}} = 1 + \frac{1}{2}h_{yy}\), the shift N_{μ} = g_{μy} and the four-dimensional part of the metric, g_{μν}(x,y) = η_{μν} + h_{μν}(x,y). The five-dimensional Einstein-Hilbert term is then expressed as
where square brackets correspond to the trace of a tensor with respect to the four-dimensional metric g_{μν} and K_{μν} is the extrinsic curvature
and D_{μ} is the covariant derivative with respect to g_{μν}.
First notice that the five-dimensional de Donder gauge choice (4.5) can be made using the five-dimensional gauge fixing term
where we keep the same notation as previously, h = η^{μν}h_{μν} is the four-dimensional trace.
After fixing the de Donder gauge (4.5), we can make the additional gauge transformation x^{A} → x^{A} + ξ^{A}, and remain in de Donder gauge provided ξ^{A} satisfies linearly □_{5}ξ^{A} = 0. This residual gauge freedom can be used to further fix the gauge on the brane (see [389] for more details; we only summarize their derivation here).
5.2.2 Four-dimensional Gauge-fixing
Keeping the brane at the fixed position y = 0 imposes ξ^{y} = 0 since we need ξ^{y}(y = 0) = 0 and ξ^{y} should be bounded as y → ∞ (the situation is slightly different in the self-accelerating branch and this mode can lead to a ghost, see Section 4.4 as well as [361, 98]).
Using the bulk profile \(h_{AB}(x,y) = e^{-\sqrt{-\square}\,\vert y\vert}h_{AB}(x)\) and integrating over the extra dimension, we obtain the contribution from the bulk on the brane (including the contribution from the gauge-fixing term) in terms of the gauge invariant quantity
Notice again a factor of 2 difference from [389] which arises from the fact that we integrate from y = −∞ to y = +∞ imposing a ℤ_{2}-mirror symmetry at y = 0, rather than considering only one side of the brane as in [389]. Both conventions are perfectly reasonable.
The integrated bulk action (4.27) is invariant under the residual linearized gauge symmetry
which keeps both \(\tilde h_{\mu\nu}\) and h_{yy} invariant. The residual gauge symmetry can be used to set the gauge on the brane, and at this level from (4.27) we can see that the most convenient gauge-fixing term is [389]
with again \(m_0 = M_5^3/M_{\rm Pl}^2\), so that the induced Lagrangian on the brane (including the contribution from the residual gauge-fixing term) is
Combining the five-dimensional action from the bulk (4.27) with that on the boundary (4.31), we end up with the linearized action on the four-dimensional DGP brane [389]
As shown earlier, we recover the theory of a massive graviton in four dimensions, with a soft mass \(m^2(\square) = m_0\sqrt{-\square}\). This analysis has allowed us to keep track of the physical origin of all the modes, including the brane-bending mode, which is especially relevant when deriving the decoupling limit as we shall see below.
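To make the meaning of the soft mass more concrete, the following sketch passes to Euclidean momentum space, where \(\sqrt{-\square} \to k\), and assumes that the scalar part of the propagator takes the schematic form \(1/(k^2 + m^2(k))\). The tensor structure and overall normalization are omitted, so this is an illustration rather than the full propagator:

\[
m^2(k) = m_0\, k, \qquad
\tilde G(k) \sim \frac{1}{k^2 + m_0 k} \simeq
\begin{cases}
1/k^2 & \text{for } k \gg m_0, \\
1/(m_0 k) & \text{for } k \ll m_0 .
\end{cases}
\]

Under this assumption the crossover at k ∼ m_{0} separates a four-dimensional regime at short distances from a five-dimensional one at large distances, which is the sense in which the mass is soft rather than hard.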
The kinetic mixing between these different modes can be diagonalized by performing the change of variables [389]
so we see that the mode π is directly related to h_{yy}. In the case of Section 4.1.1, we had set h_{yy} = 0 and the field π is then related to the brane-bending mode. In either case we see that the extrinsic curvature K_{μν} carries part of this mode.
Omitting the mass terms and other relevant operators, the action is diagonalized in terms of the different graviton modes at the linearized level: h′_{μν} (which encodes the helicity-2 mode), N′_{μ} (which is part of the helicity-1 mode) and π (the helicity-0 mode),
5.2.3 Decoupling limit
We will be discussing the meaning of ‘decoupling limits’ in more depth in the context of multi-gravity and ghost-free massive gravity in Section 8. The main idea behind the decoupling limit is to separate the physics of the different modes. Here we are interested in following the interactions of the helicity-0 mode without the complications from the standard helicity-2 interactions that already arise in GR. For this purpose we can take the limit M_{Pl} → ∞ while simultaneously sending \(m_0 = M_5^3/M_{\rm Pl}^2 \to 0\), keeping the scale \(\Lambda = (m_0^2 M_{\rm Pl})^{1/3}\) fixed. This is the scale at which the first interactions arise in DGP.
In DGP the decoupling limit should be taken by considering the full five-dimensional theory, as was performed in [389]. The four-dimensional Einstein-Hilbert term does not give rise to any operators below the Planck scale, so in order to look for the irrelevant operators that come in at the lowest possible scale, it is sufficient to focus on the boundary term from the five-dimensional action. It includes operators of the form
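As a rough order-of-magnitude illustration of this scale, the sketch below uses assumed input values that are not taken from the text: a crossover scale of the order of the present Hubble rate, m_{0} ∼ H_{0} ∼ 10^{−33} eV, and M_{Pl} ∼ 10^{27} eV.

\[
\Lambda^3 = m_0^2 M_{\rm Pl} \sim (10^{-33}\,{\rm eV})^2 \times 10^{27}\,{\rm eV} = 10^{-39}\,{\rm eV}^3
\quad\Longrightarrow\quad
\Lambda \sim 10^{-13}\,{\rm eV}, \qquad \Lambda^{-1} \sim 10^{3}\,{\rm km}.
\]

With these inputs the interaction scale lies many orders of magnitude below M_{Pl}, which is why the decoupling limit keeps these interactions while discarding the usual helicity-2 interactions of GR.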
with integer powers n, k, ℓ ≥ 0 and n + k + ℓ ≥ 3 since we are dealing with interactions. The scale at which such an operator arises is
and it is easy to see that the lowest possible scale is \(\Lambda_3 = (M_{\rm Pl} m_0^2)^{1/3}\), which arises for n = 0, k = 0 and ℓ = 3; it is thus a cubic interaction in the helicity-0 mode π which involves four derivatives. Since it is only a cubic interaction, we can scan all the possible ways π enters at the cubic level in the five-dimensional Einstein-Hilbert action. The relevant pieces are the ones from the extrinsic curvature in (4.22), and in particular the combination N([K]^{2} − [K^{2}]), with
Integrating \(m_0 M_{\rm Pl}^2 N([K]^2 - [K^2])\) along the extra dimension, we obtain the cubic contribution in π on the brane (using the relations (4.34) and (4.35))
So the decoupling limit of DGP arises at the scale Λ_{3} and reduces to a cubic Galileon for the helicity-0 mode with no interactions for the helicity-2 and helicity-1 modes,
5.3 Phenomenology of DGP
The phenomenology of DGP is extremely rich and has led to many developments. In what follows we review one of the most important implications of DGP for cosmology, which is the existence of self-accelerating solutions. The cosmology and phenomenology of DGP were first derived in [159, 163] (see also [388, 385, 387, 386]).
5.3.1 Friedmann equation in de Sitter
To get some intuition on how cosmology gets modified in DGP, we first look at de Sitter-like solutions and then infer the full Friedmann equation in an FLRW geometry. We thus start with five-dimensional Minkowski in de Sitter slicing (this can be easily generalized to FLRW slicing),
where \(\gamma_{\mu\nu}^{\rm (dS)}\) is the four-dimensional de Sitter metric with constant Hubble parameter H, \(\gamma_{\mu\nu}^{\rm (dS)}\,{\rm d}x^\mu\,{\rm d}x^\nu = -{\rm d}t^2 + a^2(t)\,{\rm d}x^2\), and the scale factor is given by a(t) = exp(Ht). The metric (4.43) is indeed Minkowski in de Sitter slicing if the warp factor b(y) is given by
and the modulus of y has been imposed by the ℤ_{2}-orbifold symmetry. As we shall see, the branch ϵ = +1 corresponds to the self-accelerating branch of DGP and ϵ = −1 is the stable, normal branch of DGP.
We can now derive the Friedmann equation on the brane by integrating over the 00-component of the Einstein equation (4.2) with the source (4.3), considering some energy density T_{00} = ρ. The four-dimensional Einstein tensor gives the standard contribution G_{00} = 3H^{2} on the brane and so we obtain the modified Friedmann equation
with ^{(5)}G_{00} = 3(H^{2} − b ″(y)/b (y)), so
leading to the modified Friedmann equation,
where the five-dimensional nature of the theory is encoded in the new term −ϵm_{0}H (this new contribution can be seen to arise from the helicity-0 mode of the graviton and could have been derived using the decoupling limit of DGP). For reasons which will become clear in what follows, the choice ϵ = −1 corresponds to the stable branch of DGP while the other choice ϵ = +1 corresponds to the self-accelerating branch of DGP. As is already clear from the higher-dimensional perspective, when ϵ = +1, the warp factor grows in the bulk (unless we think of the junction conditions the other way around), which is already signaling a pathology for that branch of solutions.
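The steps leading to the term −ϵm_{0}H can be sketched as follows. This is a reconstruction under stated assumptions, not a quotation of the original equations: the warp factor is taken to be \(b(y) = e^{\epsilon H \vert y\vert}\), and the bulk and brane Einstein tensors are assumed to enter the 00-equation with coefficients \(M_5^3/2\) and \(M_{\rm Pl}^2\) respectively.

\[
\begin{aligned}
b'(y) &= \epsilon H\,{\rm sgn}(y)\, b(y), \qquad
\frac{b''(y)}{b(y)} = H^2 + 2\epsilon H\,\delta(y), \\
{}^{(5)}G_{00} &= 3\left(H^2 - \frac{b''(y)}{b(y)}\right) = -6\,\epsilon H\,\delta(y), \\
\frac{M_5^3}{2}\int {\rm d}y\;{}^{(5)}G_{00} + 3 M_{\rm Pl}^2 H^2 &= \rho
\quad\Longrightarrow\quad
H^2 - \epsilon\, m_0 H = \frac{\rho}{3 M_{\rm Pl}^2}, \qquad m_0 = \frac{M_5^3}{M_{\rm Pl}^2}.
\end{aligned}
\]

The only bulk contribution comes from the delta function generated by the kink of \(\vert y\vert\) at the brane, which is why the correction is linear in H and carries the sign ϵ of the chosen branch.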
5.3.2 General Friedmann equation
This modified Friedmann equation has been derived assuming a constant H, which is only consistent if the energy density is constant (i.e., a cosmological constant). We can now derive the generalization of this Friedmann equation for non-constant H. This amounts to accounting for Ḣ and other derivative corrections which might have been omitted in deriving this equation by assuming that H was constant. But the Friedmann equation corresponds to the Hamiltonian constraint equation, and higher derivatives (e.g., Ḣ ⊃ ä and higher derivatives of H) would imply that this equation is no longer a constraint. This loss of constraint would imply that the theory admits a new degree of freedom about generic backgrounds, namely the BD ghost (see the discussion of Section 7).
However, in DGP we know that the BD ghost is absent (this is ensured by the five-dimensional nature of the theory: in five dimensions we start with five dofs, and there is thus no sixth BD mode). So the Friedmann equation cannot include any derivatives of H, and the Friedmann equation obtained assuming a constant H is actually exact in FLRW even if H is not constant. So the constraint (4.47) is the exact Friedmann equation in DGP for any energy density ρ on the brane.
The same trick can be used for massive gravity and bigravity and the Friedmann equations (12.51), (12.52) and (12.54) are indeed free of any derivatives of the Hubble parameter.
5.3.3 Observational viability of DGP
Independently of the ghost issue in the self-accelerating branch of the model, there has been a vast amount of investigation on the observational viability of both the self-accelerating branch and the normal (stable) branch of DGP. This is first because many of these observations can apply equally well to the stable branch of DGP (modulo a minus sign in some of the cases), and second and foremost because DGP represents an excellent archetype in which ideas of modified gravity can be tested.
Observational tests of DGP fall into the following two main categories:

Tests of the Friedmann equation. This test was performed mainly using Supernovae, but also using Baryonic Acoustic Oscillations and the CMB so as to fix the background history of the Universe [162, 217, 221, 286, 391, 23, 405, 481, 304, 382, 462]. Current observations seem to slightly disfavor the additional term in the Friedmann equation of DGP, even in the normal branch where the late-time acceleration of the Universe is due to a cosmological constant as in ΛCDM. These put bounds on the graviton mass in DGP of the order of m_{0} ≲ 10^{−1} H_{0}, where H_{0} is the Hubble parameter today (see Ref. [492] for the latest bounds at the time of writing, including data from Planck). Effectively this means that in order for DGP to be consistent with observations, the graviton mass can have no effect on the late-time acceleration of the Universe.

Tests of an extra fifth force, either within the solar system, or during structure formation (see for instance [362, 260, 452, 451, 222, 482], Refs. [453, 337, 442] for N-body simulations, as well as Refs. [17, 441] using weak lensing).
Evading fifth-force experiments will be discussed in more detail within the context of the Vainshtein mechanism in Section 10.1 and thereafter, and we save the discussion for that section. See Refs. [388, 385, 387, 386, 444] for a five-dimensional study dedicated to DGP. The study of cosmological perturbations within the context of DGP was also performed in depth, for instance in [367, 92].
5.4 Self-acceleration branch
The cosmology of DGP has led to a major conceptual breakthrough, namely the realization that the Universe could be ‘self-accelerating’. This occurs when choosing the ϵ = +1 branch of DGP: the Friedmann equation in the vacuum then reduces to [159, 163]
which admits a non-trivial solution H = m_{0} in the absence of any cosmological constant or vacuum energy. In itself this would not solve the old cosmological constant problem, as the vacuum energy ought to be set to zero on its own, but it can lead to a model of ‘dark gravity’ where the amount of acceleration is governed by the scale m_{0}, which is stable against quantum corrections.
This realization has opened a new field of study in its own right. It is beyond the scope of this review on massive gravity to summarize all the interesting developments that arose in the past decade, and we simply focus on a few elements, namely the presence of a ghost in this self-accelerating branch as well as a few cosmological observations.
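As a short worked check of the two branches, assume that the modified Friedmann equation takes the form \(H^2 - \epsilon m_0 H = \rho/(3M_{\rm Pl}^2)\); the normalization of the right-hand side is an assumption made here. The expanding root is then

\[
H = \frac{1}{2}\left(\epsilon\, m_0 + \sqrt{m_0^2 + \frac{4\rho}{3 M_{\rm Pl}^2}}\,\right)
\;\xrightarrow{\;\rho \to 0\;}\;
\begin{cases}
m_0 & \text{for } \epsilon = +1, \\
0 & \text{for } \epsilon = -1 ,
\end{cases}
\]

so in this sketch only the ϵ = +1 branch keeps accelerating once the brane is emptied, at a rate set entirely by m_{0}.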
5.4.1 Ghost
The existence of a ghost on the self-accelerating branch of DGP was first pointed out in the decoupling limit [389, 411], where the helicity-0 mode of the graviton is shown to enter with the wrong-sign kinetic term in this branch of solutions. We emphasize that the issue of the ghost in the self-accelerating branch of DGP is completely unrelated to the sixth BD ghost in some theories of massive gravity. In DGP there are five dofs, one of which is a ghost. The analysis was then generalized in the fully fledged five-dimensional theory by K. Koyama in [360] (see also [263, 361] and [98]).
When perturbing about Minkowski, it was shown that the graviton has an effective mass \(m^2 = m_0\sqrt{-\square}\). When perturbing on top of the self-accelerating solution, a similar analysis can be performed and one can show that in the vacuum the graviton has an effective mass at precisely the Higuchi bound, \(m_{\rm eff}^2 = 2H^2\) (see Ref. [307]). When matter or a cosmological constant is included on the brane, the graviton mass shifts either inside the forbidden Higuchi region \(0 < m_{\rm eff}^2 < 2H^2\), or outside \(m_{\rm eff}^2 > 2H^2\). We summarize the three cases following [360, 98]:

In [307] it was shown that when the effective mass is within the forbidden Higuchi region, the helicity-0 mode of the graviton has the wrong-sign kinetic term and is a ghost.

Outside this forbidden region, when \(m_{\rm eff}^2 > 2H^2\), the zero-mode of the graviton is healthy but there exists a new normalizable brane-bending mode in the self-accelerating branch^{Footnote 8} which is a genuine degree of freedom. For \(m_{\rm eff}^2 > 2H^2\) the brane-bending mode was shown to be a ghost.

Finally, at the critical mass \(m_{\rm eff}^2 = 2H^2\) (which happens when neither matter nor a cosmological constant is present on the brane), the brane-bending mode takes the role of the helicity-0 mode of the graviton, so that the graviton still has five degrees of freedom, and this mode was shown to be a ghost as well.
In summary, independently of the matter content of the brane, so long as the graviton is massive, \(m_{\rm eff}^2 > 0\), the self-accelerating branch of DGP exhibits a ghost. See also [210] for an exact non-perturbative argument studying domain walls in DGP. In the self-accelerating branch of DGP, domain walls bear a negative gravitational mass. This non-perturbative solution can also be used as an argument for the instability of that branch.
5.4.2 Evading the ghost?
Different ways to remove the ghosts were discussed for instance in [325], where a second brane was included. In this scenario it was then shown that the graviton could be made stable, but at the cost of including a new spin-0 mode (that appears as the mode describing the distance between the branes).
Alternatively, it was pointed out in [233] that if the sign of the extrinsic curvature was flipped, the self-accelerating solution on the brane would be stable.
Finally, a stable self-acceleration was also shown to occur in the massless case \(m_{\rm eff}^2 = 0\) by relying on Gauss-Bonnet terms in the bulk and a self-sourced AdS_{5} solution [156]. The five-dimensional theory is then similar to that of DGP (4.1) but with the addition of a five-dimensional Gauss-Bonnet term \({\mathcal R}_{\rm GB}^2\) in the bulk and the wrong-sign five-dimensional Einstein-Hilbert term,
The idea is not so dissimilar to that of new massive gravity (see Section 13): here the wrong-sign kinetic term in five dimensions is balanced by the Gauss-Bonnet term in such a way that the graviton has the correct-sign kinetic term on the self-sourced AdS_{5} solution. The length scale ℓ is related to this AdS length scale, and the self-accelerating branch admits a stable (ghost-free) de Sitter solution with H ∼ ℓ^{−1}.
We do not discuss this model any further in what follows since the graviton admits a zero (massless) mode. It is feasible that this model can be understood as a bi-gravity theory where the massive mode is a resonance. It would also be interesting to see how this model fits in with the Galileon theories [412], which admit stable self-accelerating solutions.
In what follows, we go back to the standard DGP model, be it the self-accelerating branch (ϵ = +1) or the normal branch (ϵ = −1).
5.5 Degravitation
One of the main motivations behind modifying gravity in the infrared is to tackle the old cosmological constant problem. The idea behind ‘degravitation’ [211, 212, 26, 216] is that if gravity is modified in the IR, then a cosmological constant (or the vacuum energy) could have a smaller impact on the geometry. In these models, we would live with a large vacuum energy (be it at the TeV scale or at the Planck scale) but only observe a small amount of late-time acceleration due to the modification of gravity. In order for a theory of modified gravity to potentially tackle the old cosmological constant problem via degravitation, it needs to have the two following properties:

1.
First, gravity must be weaker in the infrared and effectively massive [216] so that the effect of IR sources can be degravitated.

2.
Second, there must exist some (nearly) static attractor solutions towards which the system can evolve at late times for arbitrary values of the vacuum energy or cosmological constant.
5.5.1 Flat solution with a cosmological constant
The first requirement is present in DGP, but as was shown in [216], in DGP gravity is not ‘sufficiently weak’ in the IR to allow degravitation solutions. Nevertheless, it was shown in [164] that the normal branch of DGP satisfies the second requirement for any negative value of the cosmological constant. In these solutions the five-dimensional spacetime is not Lorentz invariant, but in a way which would not (at this background level) be observed when confined on the four-dimensional brane.
For positive values of the cosmological constant, DGP does not admit a (nearly) static solution. This can be understood at the level of the decoupling limit using the arguments of [216] and generalized for other mass operators.
Inspired by the form of the graviton mass in DGP, \(m^2(\square) = m_0\sqrt{-\square}\), we can generalize the form of the graviton mass to
with α a positive dimensionless constant. α = 1 corresponds to a modification of the kinetic term. As shown in [153], any such modification leads to ghosts, so we do not consider this case here. α > 1 corresponds to a UV modification of gravity, and so we focus on α < 1.
In the decoupling limit the helicity-2 mode decouples from the helicity-0 mode, which behaves (symbolically) as follows [216]
where T is the trace of the stress-energy tensor of external matter fields. At the linearized level, matter couples to the metric \(g_{\mu\nu} = \eta_{\mu\nu} + {1 \over M_{\rm Pl}}(h'_{\mu\nu} + \pi\,\eta_{\mu\nu})\). We now check under which conditions we can still recover a nearly static metric in the presence of a cosmological constant T_{μν} = −Λ_{CC}g_{μν}. In the linearized limit of GR this leads to the profile for the helicity-2 mode (which in that case corresponds to a linearized de Sitter solution)
One way we can obtain a static solution in this extended theory of massive gravity at the linear level is by ensuring that the solution for π cancels out that of h′_{μν} so that the metric g_{μν} remains flat. \(\pi = +\frac{\Lambda_{\rm CC}}{6M_{\rm Pl}}\eta_{\mu\nu}x^\mu x^\nu\) is actually the solution of (4.51) when only the term \(3\square\pi\) contributes and all the other operators vanish for π ∝ x_{μ}x^{μ}. This is the case if α < 1/2, as shown in [216]. This explains why in the case of DGP, which corresponds to the borderline scenario α = 1/2, one can never fully degravitate a cosmological constant.
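The quoted profile for π can be checked directly. The sketch below assumes that the linear part of (4.51) reads \(3\square\pi = -T/M_{\rm Pl}\); this is an assumption about the form of that equation, and signs or normalizations may differ in the original.

\[
\begin{aligned}
T &= \eta^{\mu\nu}T_{\mu\nu} = -4\Lambda_{\rm CC}, \qquad
\square\left(\eta_{\mu\nu}x^\mu x^\nu\right) = 2\,\eta^{\mu}{}_{\mu} = 8, \\
3\square\pi &= 3 \cdot \frac{\Lambda_{\rm CC}}{6 M_{\rm Pl}} \cdot 8 = \frac{4\Lambda_{\rm CC}}{M_{\rm Pl}} = -\frac{T}{M_{\rm Pl}} .
\end{aligned}
\]

Since matter couples to the combination \(h'_{\mu\nu} + \pi\,\eta_{\mu\nu}\), the metric stays flat at this order precisely when \(h'_{\mu\nu} = -\pi\,\eta_{\mu\nu}\), which is the cancellation described in the text.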
5.5.2 Extensions
This realization has motivated the search for theories of massive gravity with 0 ≤ α < 1/2, and especially the extension of DGP to higher dimensions where the parameter α can get as close to zero as required. This is the main motivation behind higher-dimensional DGP [359, 240] and cascading gravity [135, 148, 132, 149], as we review in what follows. (In [433] it was also shown how a regularized version of higher-dimensional DGP could be free of the strong coupling and ghost issues.)
Note that α ≡ 0 corresponds to a hard mass gravity. Within the context of DGP, such a model with an ‘auxiliary’ extra dimension was proposed in [235, 133], where one considers a finite-size large extra dimension which breaks five-dimensional Lorentz invariance. The five-dimensional action is motivated by five-dimensional gravity with scalar curvature in the ADM decomposition ^{(5)}R = R[g] + [K]^{2} − [K^{2}], but discarding the contribution from the four-dimensional curvature R[g]. As in DGP, the four-dimensional curvature still appears induced on the brane
where ℓ is the size of the auxiliary extra dimension, g_{μν} is a four-dimensional metric, and we set the lapse to one (the shift can be kept and will contribute to the four-dimensional Stückelberg field which restores four-dimensional invariance, but at this level it is easier to work in the gauge where the shift is set to zero and reintroduce the Stückelberg fields directly in four dimensions). Imposing the Dirichlet conditions g_{μν}(x, y = 0) = f_{μν}, we are left with a theory of massive gravity at y = 0, with reference metric f_{μν} and hard mass m_{0}. Here again the special structure ([K]^{2} − [K^{2}]) inherited (or rather inspired) from five-dimensional gravity ensures the Fierz-Pauli structure and the absence of ghost at the linearized level. Up to cubic order in perturbations it was shown in [138] that the theory is free of ghost and its decoupling limit is that of a Galileon.
Furthermore, it was shown in [133] that it satisfies both requirements presented above to potentially help degravitate a cosmological constant. Unfortunately, at higher orders this model is plagued with the BD ghost [291] unless the boundary conditions are chosen appropriately [59]. For this reason we will not review this model any further in what follows and focus instead on the ghost-free theory of massive gravity derived in [137, 144].
5.5.3 Cascading gravity
5.5.3.1 Deficit angle
It is well known that a tension on a cosmic string does not cause the cosmic string to inflate but rather creates a deficit angle in the two spatial dimensions orthogonal to the string. Similarly, if we consider a four-dimensional brane embedded in six-dimensional gravity, then a tension on the brane leads to the following flat geometry
where the two extra dimensions are expressed in polar coordinates {r, θ} and Δθ is a constant which parameterizes the deficit angle in this canonical geometry. This deficit angle is related to the tension on the brane Λ_{CC} and the six-dimensional Planck scale (assuming six-dimensional gravity)
For a positive tension Λ_{CC} > 0, this creates a positive deficit angle and since Δθ cannot be larger than 2π, the maximal tension on the brane is \(M_6^4\). For a negative tension, on the other hand, there is no such bound as it creates a surplus of angle, see Figure 2.
This interesting feature has led to many potential ways to tackle the cosmological constant by considering our Universe to live in a 3 + 1-dimensional brane embedded in two or more large extra dimensions. (See Refs. [4, 3, 408, 414, 80, 470, 458, 459, 86, 82, 247, 333, 471, 81, 426, 409, 373, 85, 460, 155] for the supersymmetric large-extra-dimension scenario as an alternative way to tackle the cosmological constant problem.) Extending DGP to more than one extra dimension could thus provide a natural way to tackle the cosmological constant problem.
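The bound on the tension follows in one line if the deficit angle is assumed to depend linearly on the tension as \(\Delta\theta = 2\pi\Lambda_{\rm CC}/M_6^4\). This normalization is an assumption, chosen so that it reproduces the maximal tension quoted in the text:

\[
\Delta\theta = 2\pi\,\frac{\Lambda_{\rm CC}}{M_6^4} \le 2\pi
\quad\Longleftrightarrow\quad
\Lambda_{\rm CC} \le M_6^4 ,
\qquad\text{while}\qquad
\Lambda_{\rm CC} < 0 \;\Rightarrow\; \Delta\theta < 0 .
\]

A negative tension gives a negative deficit angle, that is, an excess angle, which is not bounded in the same way.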
5.5.3.2 Spectral representation
Furthermore, in n extra dimensions the gravitational potential is diluted as V(r) ∼ r^{−1−n}. If the propagator has a Källén-Lehmann spectral representation with spectral density π(μ^{2}), the Newtonian potential has the following spectral representation
In a higher-dimensional DGP scenario, the gravitational potential behaves higher-dimensionally at large distance, V(r) ∼ r^{−(1+n)}, which implies π(μ^{2}) ∼ μ^{n−2} in the IR, as depicted in Figure 1.
Working back in terms of the spectral representation of the propagator as given in (4.19), this means that the propagator goes to 1/k in the IR as μ → 0 when n = 1 (as we know from DGP), while it goes to a constant for n > 1. So for more than one extra dimension, the theory tends towards that of a hard mass graviton in the far IR, which corresponds to α → 0 in the parametrization of (4.50). Following the arguments of [216] such a theory should thus be a good candidate to tackle the cosmological constant problem.
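The scalings quoted above can be checked with a short computation. The sketch below writes the spectral density as ρ(μ^{2}), to keep it distinct from the helicity-0 field, and assumes the standard forms \(V(r) = \int_0^\infty \rho(\mu^2)\,\frac{e^{-\mu r}}{r}\,{\rm d}\mu^2\) and \(\tilde G(k) = \int_0^\infty \frac{\rho(\mu^2)}{k^2 + \mu^2}\,{\rm d}\mu^2\), with overall normalizations omitted:

\[
\begin{aligned}
\rho(\mu^2) \sim \mu^{\,n-2}: \qquad
V(r) &\sim \frac{2}{r}\int_0^\infty \mu^{\,n-1} e^{-\mu r}\,{\rm d}\mu = \frac{2\,\Gamma(n)}{r^{\,1+n}}, \\
n = 1: \qquad
\tilde G(k) &\sim 2\int_0^\infty \frac{{\rm d}\mu}{k^2 + \mu^2} = \frac{\pi}{k} .
\end{aligned}
\]

The first line reproduces the higher-dimensional falloff of the potential, and the second the 1/k behavior of the DGP propagator in the IR, where π in the last expression is the number and not the field.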
5.5.3.3 A brane on a brane
Both the spectral representation and the fact that codimension-two (and higher) branes can accommodate a cosmological constant while remaining flat have made the field of higher-codimension branes particularly interesting.
However, as shown in [240] and [135, 148, 132, 149], the straightforward extension of DGP to two large extra dimensions leads to ghost issues (a sixth mode with the wrong-sign kinetic term, see also [290, 70]) as well as divergence problems (see Refs. [256, 131, 130, 423, 422, 355, 83]).
To avoid these issues, one can consider simply applying the DGP procedure step by step and consider a 4 + 1-dimensional DGP brane embedded in six dimensions. Our Universe would then be on a 3 + 1-dimensional DGP brane embedded in the 4 + 1 one (note we only consider one side of the brane here, which explains the factor of 2 difference compared with (4.1)),
This model has two crossover scales: \(m_5 = M_5^3/M_{\rm Pl}^2\), which characterizes the scale at which one crosses from the four-dimensional to the five-dimensional regime, and \(m_6 = M_6^4/M_5^3\), yielding the crossing from a five-dimensional to a six-dimensional behavior. Of course we could also have a simultaneous crossing if m_{5} = m_{6}. In what follows we focus on the case where M_{Pl} > M_{5} > M_{6}.
Performing the same linearized analysis as in Section 4.1.1, we can see that the four-dimensional theory of gravity is effectively massive with the soft mass in Fourier space
We see that the 4 + 1-dimensional brane plays the role of a regulator (a divergence occurs in the limit m_{5} → 0).
In this six-dimensional model, there are effectively two new scalar degrees of freedom (arising from the extra dimensions). We can ensure that both of them have the correct-sign kinetic term by

Either smoothing out the brane [240, 148] (this means that one should really consider a six-dimensional curvature on both the smoothed 4 + 1 and on the 3 + 1-dimensional branes, which is something one would naturally expect^{Footnote 9}).

Or by including some tension on the 3 + 1 brane (which is also something natural since the setup is designed to degravitate a large cosmological constant on that brane). This was shown to be ghost free in the decoupling limit in [135] and in the full theory in [150].
As already mentioned, in models with two large extra dimensions there is a maximal value of the cosmological constant that can be considered, which is related to the six-dimensional Planck scale. Since that scale is in turn related to the effective mass of the graviton and since observations set that scale to be relatively small, the model can only take care of a relatively small cosmological constant. Nevertheless, it still provides a proof of principle on how to evade Weinberg’s no-go theorem [484].
The extension of cascading gravity to more than two extra dimensions was considered in [149]. It was shown in that case how the 3 + 1 brane remains flat for arbitrary values of the cosmological constant on that brane (within the regime of validity of the weak-field approximation). See Figure 3 for a picture of how the scalar potential adapts itself along the extra dimensions to accommodate a cosmological constant on the brane.
6 Deconstruction
As for DGP and its extensions, to get some insight on how to construct a four-dimensional theory of a single massive graviton, we can start with five-dimensional general relativity. This time, we consider the extra dimension to be compactified and of finite size R, with periodic boundary conditions. It is then natural to perform a Kaluza-Klein decomposition and to obtain a tower of Kaluza-Klein graviton modes in four dimensions. The zero mode is then massless and the higher modes are all massive with mass separation m = 1/R. Since the graviton mass is constant in this formalism, we omit the subscript 0 in the rest of this review.
Rather than starting directly with a Kaluza-Klein decomposition (discretization in Fourier space), we perform instead a discretization in real space, known as “deconstruction” of five-dimensional gravity [24, 25, 170, 168, 28, 443, 340]. The deconstruction framework helps make the connection with massive gravity more explicit. However, we can also obtain multi-gravity out of it, which is then completely equivalent to the Kaluza-Klein decomposition (after a non-linear field redefinition).
The idea behind deconstruction is simply to ‘replace’ the continuous fifth dimension y by a series of N sites y_{j} separated by a distance ℓ = R/N, so that the five-dimensional metric is replaced by a set of interacting metrics depending only on x.
In what follows, we review the procedure derived in [152] to recover four-dimensional ghost-free massive gravity as well as bi- and multi-gravity out of five-dimensional GR. The procedure works in any dimension and we only focus on deconstructing five-dimensional GR for the sake of concreteness.
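The replacement described above can be sketched with the following dictionary, written for a generic field f(x, y) that serves only as a placeholder. The symbol m_{N} = ℓ^{−1} = N/R denotes the inverse lattice spacing, and the forward-difference form of the derivative is one possible choice, assumed here for illustration:

\[
y \hookrightarrow y_j, \qquad
f(x,y) \hookrightarrow f_j(x) = f(x, y_j), \qquad
\partial_y f(x,y) \hookrightarrow m_N\left(f_{j+1}(x) - f_j(x)\right), \qquad
\int_0^R {\rm d}y \hookrightarrow \frac{1}{m_N}\sum_{j=1}^{N} .
\]

With periodic boundary conditions the sites close into a ring, f_{N+1} ≡ f_{1}, and every y-derivative in the five-dimensional action turns into an interaction between fields on neighboring sites.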
6.1 Formalism
6.1.1 Metric versus Einstein-Cartan formulation of GR
Before going further, let us first describe five-dimensional general relativity in its Einstein-Cartan formulation, where we introduce a set of vielbein \(e_A^{\rm a}\), so that the relation between the metric and the vielbein is simply,
where, as mentioned previously, the capital Latin letters label five-dimensional spacetime indices, while the letters a, b, c, … label five-dimensional Lorentz indices.
Under the torsionless condition, de + ω ∧ e = 0, the antisymmetric spin connection ω is uniquely determined in terms of the vielbeins
with \(O^{\rm ab}{}_{\rm c} = 2 e^{{\rm a}A} e^{{\rm b}B}\,\partial_{[A} e_{B]{\rm c}}\). In the Einstein-Cartan formulation of GR, we introduce a 2-form Riemann curvature,
and up to boundary terms, the Einstein-Hilbert action is then given in the respective metric and vielbein languages by (here in five dimensions for definiteness),
where R^{(5)}[g] is the scalar curvature built out of the five-dimensional metric g_{AB} and M_{5} is the five-dimensional Planck scale.
The counting of the degrees of freedom in both languages is of course equivalent and goes as follows: In d spacetime dimensions, the metric has d(d + 1)/2 independent components. Covariance removes 2d of them,^{Footnote 10} which leads to \({\mathcal N}_d = d(d-3)/2\) independent degrees of freedom. In four dimensions, we recover the usual \({\mathcal N}_4 = 2\) independent polarizations for gravitational waves. In five dimensions, this leads to \({\mathcal N}_5 = 5\) degrees of freedom, which is the same number of degrees of freedom as a massive spin-2 field in four dimensions. This is as expected from the Kaluza-Klein philosophy (massless bosons in d + 1 dimensions have the same number of degrees of freedom as massive bosons in d dimensions; this counting does not directly apply to fermions).
In the Einstein-Cartan formulation, the counting goes as follows: The vielbein has d^{2} independent components. Covariance removes 2d of them, and the additional global Lorentz invariance removes an additional d(d − 1)/2, leading once again to a total of \({\mathcal N}_d = d(d-3)/2\) independent degrees of freedom.
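The two countings can be verified explicitly for the dimensions of interest; the arithmetic below uses only the numbers stated in the text:

\[
\begin{aligned}
\text{metric:}\quad & \frac{d(d+1)}{2} - 2d = \frac{d(d-3)}{2}, &
\text{vielbein:}\quad & d^2 - 2d - \frac{d(d-1)}{2} = \frac{d(d-3)}{2}, \\
d = 4:\quad & 10 - 8 = 2, & & 16 - 8 - 6 = 2, \\
d = 5:\quad & 15 - 10 = 5, & & 25 - 10 - 10 = 5 .
\end{aligned}
\]

The five-dimensional result matches the 2s + 1 = 5 polarizations of a massive spin-2 field in four dimensions.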
In GR one usually considers the metric and the vielbein formulation as being fully equivalent. However, this perspective is true only in the bosonic sector. The limitations of the metric formulation become manifest when coupling gravity to fermions. For such couplings one requires the vielbein formulation of GR. For instance, in four spacetime dimensions, the covariant action for a Dirac fermion ψ at the quadratic order is given by (see Ref. [392]),
where the γ^{a}’s are the Dirac matrices and D represents the covariant derivative, \(D\psi = {\rm d}\psi - {1 \over 8}\omega^{ab}[\gamma_a,\gamma_b]\psi\).
In the bosonic sector, one can convert the covariant action of bosonic fields (e.g., of scalar, vector fields, etc.) between the vielbein and the metric language without much confusion; however, this is not possible for the covariant Dirac action, or other half-integer-spin fields. For these types of matter fields, the Einstein-Cartan formulation of GR is more fundamental than its metric formulation. When in doubt, one should always start with the vielbein formulation. This is especially important in the case of deconstruction, where a discretization in the metric language is not equivalent to a discretization in the vielbein variables. The same holds for the Kaluza-Klein decomposition, a point which might have been under-appreciated in the past.
6.1.2 Gaugefixing
The discretization process breaks covariance and so before starting this procedure it is wise to fix the gauge (failure to do so leads to spurious degrees of freedom which then become ghosts in the four-dimensional description). We thus start in five spacetime dimensions by setting the gauge
meaning that the lapse is set to unity and the shift to zero. Notice that one could in principle only set the lapse to unity and keep the shift present throughout the discretization. From a four-dimensional point of view, the shift would then ‘morally’ play the role of the Stückelberg fields, although only after a cumbersome field redefinition. So for the sake of clarity and simplicity, in what follows we first gauge-fix the shift and then, once the four-dimensional theory is obtained, restore gauge invariance by use of the Stückelberg trick presented previously.
In vielbein language, we fix the fivedimensional coordinate system and use four Lorentz transformations to set
and use the remaining six Lorentz transformations to set
In this gauge, the fivedimensional EinsteinHilbert term (5.4), (5.5) is given by
where R[g] is the four-dimensional curvature built out of the four-dimensional metric g_{μν}, R^{ab} is the 2-form curvature built out of the four-dimensional vielbein \(e_\mu ^a\) and its associated connection \({\omega ^{ab}} = \omega _\mu ^{ab}d{x^\mu},\,{R^{ab}} = d{\omega ^{ab}} + {\omega ^a}_c\wedge{\omega ^{cb}}\), and \(K^\mu_{\ \nu} = {g^{\mu \alpha}}{K_{\alpha \nu}}\) is the extrinsic curvature,
6.1.3 Discretization in the vielbein
One could in principle go ahead and perform the discretization directly at the level of the metric, but this would not lead to a consistent truncated theory of massive gravity.^{Footnote 11} As explained previously, the vielbein is more fundamental than the metric itself, and in what follows we discretize the theory keeping the vielbein as the fundamental object.
The gauge choice (5.9) then implies
where the arrow ↪ represents the deconstruction of five-dimensional gravity. We have also introduced the ‘truncation scale’, m_{N} = Nm = ℓ^{−1} = NR^{−1}, i.e., the scale of the highest mode in the discretized theory. After discretization, we see the Deser-van Nieuwenhuizen [187] condition appearing in Eq. (5.17), which corresponds to the symmetric vielbein condition. This is a sufficient condition to allow for a formulation back into the metric language [410, 314, 172]. Note, however, that as mentioned in [152], we have not assumed that this symmetric vielbein condition was true; we simply derived it from the discretization procedure in the five-dimensional gauge choice \(\omega _y^{ab} = 0\). In terms of the extrinsic curvature, this implies
This can be written back in the metric language as follows
where the square root in the extrinsic curvature appears after converting back to the metric language. The square root exists as long as the metrics g_{j} and g_{j+1} have the same signature and \(g_j^{-1}{g_{j + 1}}\) has positive eigenvalues, so if both metrics were diagonal the ‘time’ direction associated with each metric would be the same, which is a meaningful requirement.
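The existence condition for the square root can be illustrated numerically. The sketch below assumes the standard dRGT convention \({\mathcal K}^\mu_{\ \nu}[g,f] = \delta^\mu_{\ \nu} - (\sqrt{g^{-1}f})^\mu_{\ \nu}\) for the building block, and assumes \(g^{-1}f\) is diagonalizable; the function names are illustrative only.

```python
import numpy as np


def sqrt_mixed(g, f):
    """Principal square root of g^{-1} f, requiring real positive eigenvalues."""
    X = np.linalg.solve(g, f)            # X^mu_nu = g^{mu alpha} f_{alpha nu}
    evals, evecs = np.linalg.eig(X)
    if np.any(np.abs(evals.imag) > 1e-12) or np.any(evals.real <= 0):
        raise ValueError("g^{-1} f must have real positive eigenvalues")
    # Rebuild the matrix with the square roots of the eigenvalues.
    root = evecs @ np.diag(np.sqrt(evals.real)) @ np.linalg.inv(evecs)
    return root.real


def K_tensor(g, f):
    """K[g, f] = 1 - sqrt(g^{-1} f) (assumed sign convention)."""
    return np.eye(g.shape[0]) - sqrt_mixed(g, f)


eta = np.diag([-1.0, 1.0, 1.0, 1.0])
K_same = K_tensor(eta, eta)              # identical metrics: K vanishes
K_scaled = K_tensor(eta, 4.0 * eta)      # f = 4 g: sqrt is 2 * identity
```

With a Minkowski g and a Euclidean f the two ‘time’ directions differ, \(g^{-1}f\) acquires a negative eigenvalue, and `sqrt_mixed` raises an error, in line with the requirement stated above.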
From the metric language, we thus see that the discretization procedure amounts to converting the extrinsic curvature to an interaction between neighboring sites through the building block \({\mathcal K}^\mu_{\ \nu} [{g_j},{g_{j + 1}}]\)
6.2 Ghostfree massive gravity
6.2.1 Simplest discretization
In this subsection we focus on deriving a consistent theory of massive gravity from the discretization procedure (5.19, 5.20). For this, we consider a discretization with only two sites j = 1, 2 and focus on the four-dimensional action induced on one site (say site 1), rather than the sum of both sites. This picture is analogous in spirit to a braneworld picture where we induce the action at one point along the extra dimension. It gives the theory of a single dynamical metric, expressed in terms of a reference metric which corresponds to the fixed metric on the other site. We emphasize that this picture corresponds to a trick to build a consistent theory of massive gravity, and would otherwise be more artificial than its multigravity extension. However, as we shall see later, massive gravity can be seen as a perfectly consistent limit of multi (or bi)gravity where the massless spin-2 field (and other fields in the multi-case) decouples, and is thus perfectly acceptable.
To simplify the notation for this two-site case, we write the vielbein on both sites as e_{1} = e, e_{2} = f, and similarly for the metrics g_{1,μν} = g_{μν} and g_{2,μν} = f_{μν}. Out of the five-dimensional action for GR, we obtain the theory of massive gravity in four dimensions (on site 1),
with
with the mass term in the vielbein language
or the mass term building block in the metric language,
and we introduced the fourdimensional Planck scale, \(M_{{\rm{Pl}}}^2 = M_5^3\int {dy}\), where in this case we limit the integral about one site.
The theory of massive gravity (5.22), or equivalently (5.23) is one special example of a ghostfree theory of massive gravity (i.e., for which the BD ghost is absent). In terms of the ‘Stückelbergized’ tensor \({\mathbb X}\) introduced in Eq. (2.76), we see that
or in other words,
and the mass term can be written as
This is also a generalization of the Fierz-Pauli mass term, albeit more complicated at first sight than the ones considered in (2.83) or (2.84), but, as we shall see, one which remains free of the BD ghost, as is proven in depth in Section 7. We emphasize that the idea of the approach is not to give a proof of the absence of ghost (which is provided later) but rather to provide an intuitive argument of why the mass term takes its very peculiar structure.
6.2.2 Generalized mass term
This mass term is not the unique acceptable generalization of FierzPauli gravity and by considering more general discretization procedures we can generate the entire 2parameter family of acceptable potentials for gravity which will also be shown to be free of ghost in Section 7.
Rather than considering the straightforward discretization e(x, y) ↪ e_{j}(x), we could consider the average value on one site, weighted with an arbitrary weight r,
The mass term at one site is then generalized to
and the most general action for massive gravity with reference vielbein is thus^{Footnote 12}
with
for any r, s ∈ ℝ.
In particular for the twosite case, this generates the twoparameter family of mass terms
with c_{0} = (1 − s)(1 − r), c_{1} = (−2 + 3s + 3r − 4rs), c_{2} = (1 − 3s − 3r + 6rs), c_{3} = (r + s − 4rs) and c_{4} = rs. This corresponds to the most general potential which, by construction, includes no cosmological constant nor tadpole. One can also always include a cosmological constant for such models, which would naturally arise from a cosmological constant in the fivedimensional picture.
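The coefficients c_{n} quoted above can be recovered by expanding the product of two differences (f − e) and two weighted averages, (1 − r)e + rf and (1 − s)e + sf. The sketch below assumes that, once contracted with the antisymmetric symbol, the vielbein factors can be treated as commuting variables, so that a formal variable t stands in for f/e; the function name is illustrative.

```python
import numpy as np


def mass_term_coefficients(r, s):
    """Coefficients c_0..c_4 multiplying e^(4-n) f^n, lowest power of f first."""
    diff_squared = np.array([1.0, -2.0, 1.0])   # (t - 1)^2 = 1 - 2 t + t^2
    average_r = np.array([1.0 - r, r])          # (1 - r) + r t
    average_s = np.array([1.0 - s, s])          # (1 - s) + s t
    # np.convolve multiplies polynomials given as coefficient arrays.
    return np.convolve(np.convolve(diff_squared, average_r), average_s)


r, s = 0.3, 0.7
c = mass_term_coefficients(r, s)
expected = np.array([
    (1 - s) * (1 - r),
    -2 + 3 * s + 3 * r - 4 * r * s,
    1 - 3 * s - 3 * r + 6 * r * s,
    r + s - 4 * r * s,
    r * s,
])
assert np.allclose(c, expected)
# The coefficients sum to zero: the potential vanishes when e = f.
assert abs(c.sum()) < 1e-12
```

The vanishing sum of the c_{n} is another way to see that no cosmological constant is generated when the two sites coincide.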
We see that in the vielbein language, the expression for the mass term is extremely natural and simple. In fact this form was guessed at already for special cases in Ref. [410] and even earlier in [502]. However, the crucial analysis on the absence of ghosts and the reason for these terms was incorrect in both of these presentations. Subsequently, after the development of the consistent metric formulation, the generic form of the mass terms was given in Refs. [95]^{Footnote 13} and [314].
In the metric language, this corresponds to the following Lagrangian for dRGT massive gravity [144], or its generalization to arbitrary reference metric [296]
where the two parameters are related to the two discretization parameters r, s as
and for any tensor Q, we define the scalar ℒ_{n} symbolically as
for any n = 0, ⋯, d, where d is the number of spacetime dimensions. ε is the Levi-Civita antisymmetric symbol, so for instance in four dimensions, \({{\mathcal L}_2}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu' \nu' \alpha \beta}}Q_\mu ^{\mu'}Q_\nu ^{\nu'} = 2!({[Q]^2} - [{Q^2}])\), so we recover the mass term expressed in (5.28). Their explicit form is given in what follows in the relations (6.11)–(6.13) or (6.16)–(6.18).
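The contraction of two Levi-Civita symbols can be evaluated by brute force to check the trace expressions. The sketch below works in four dimensions and uses the plain (Euclidean) symbol, so any overall sign coming from the Lorentzian signature is ignored; this is an assumption consistent with the ‘symbolic’ definition above. The helper names are illustrative.

```python
import itertools
import numpy as np


def levi_civita(d):
    """Totally antisymmetric symbol with eps[0, 1, ..., d-1] = +1."""
    eps = np.zeros((d,) * d)
    for perm in itertools.permutations(range(d)):
        inversions = sum(
            1 for i in range(d) for j in range(i + 1, d) if perm[i] > perm[j]
        )
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps


def L_n(Q, n):
    """eps eps contracted with n factors of Q and (4 - n) Kronecker deltas."""
    eps = levi_civita(4)
    factors = [Q] * n + [np.eye(4)] * (4 - n)
    return np.einsum("abcd,ijkl,ia,jb,kc,ld->", eps, eps, *factors)


rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 4))
tr = np.trace
assert np.isclose(L_n(Q, 0), 24.0)                               # 4!
assert np.isclose(L_n(Q, 1), 6.0 * tr(Q))                        # 3! [Q]
assert np.isclose(L_n(Q, 2), 2.0 * (tr(Q) ** 2 - tr(Q @ Q)))     # 2!([Q]^2 - [Q^2])
assert np.isclose(L_n(Q, 4), 24.0 * np.linalg.det(Q))            # 4! det Q
```

The last check uses the fact that the fully saturated contraction is proportional to the determinant, which is why the series of scalars terminates at n = d.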
This procedure is easily generalizable to any number of dimensions, and massive gravity in d dimensions has (d − 2) free parameters which are related to the (d − 2) discretization parameters.
6.3 Multigravity
In Section 5.2, we showed how to obtain massive gravity from considering the fivedimensional EinsteinHilbert action on one site.^{Footnote 14} Instead in this section, we integrate over the whole of the extra dimension, which corresponds to summing over all the sites after discretization. Following the procedure of [152], we consider N = 2M + 1 sites to start which leads to multigravity [314], and then focus on the twosite case leading to bigravity [293].
Starting with the fivedimensional action (5.12) and applying the discretization procedure (5.31) with \({\mathcal A}_{r,\,s}^{abcd}\) given in (5.33), we get
with \(M_4^2 = M_5^3R = M_5^3/m,\,\alpha _2^{(j)} = -1/2\), and in this deconstruction framework we obtain no cosmological constant nor tadpole, \(\alpha _0^{(j)} = \alpha _1^{(j)} = 0\) at any site j (but we keep them for generality). In the mass Lagrangian, we use a shorthand notation for the tensor \({\mathcal K}^\mu_{\ \nu} [{g_j},{g_{j + 1}}]\). This is a special case of multigravity presented in [314] (see also [417] for other ‘topologies’ in the way the multiple gravitons interact), where each metric only interacts with two other metrics, i.e., with its closest neighbors, leading to 2N free parameters. For any fixed j, one has \(\alpha _3^{(j)} = {r_j} + {s_j}\) and \(\alpha _4^{(j)} = {r_j}{s_j}\).
To see the mass spectrum of this multigravity theory, we perform a Fourier decomposition, which is what one would obtain (after a field redefinition) by performing a KK decomposition rather than a real space discretization. KK decomposition and deconstruction are thus perfectly equivalent (after a nonlinear — but benign^{Footnote 15} — field redefinition). We define the discrete Fourier transform of the vielbein variables,
with the inverse map,
In terms of the Fourier transform variables, the multigravity action then reads at the linear level
with \(M_{{\rm{Pl}}}^{-1}{{\tilde h}_{\mu \nu,n}} = {{\tilde e}^a}_{\mu, n}\tilde e_{\nu,n}^b{\eta _{ab}} - {\eta _{\mu \nu}}\) and M_{Pl} represents the four-dimensional Planck scale, \({M_{{\rm{Pl}}}} = {M_4}\sqrt N\). The reality condition on the vielbein imposes ẽ_{n} = ẽ*_{−n} and similarly for \({\tilde h_n}\). The mass spectrum is then
The counting of the degrees of freedom in multigravity goes as follows: the theory contains 2M massive spin-2 fields with five degrees of freedom each and one massless spin-2 field with two degrees of freedom, corresponding to a total of 10M + 2 degrees of freedom. In the continuum limit, we also need to account for the zero mode of the lapse and the shift, which have been gauge fixed in five dimensions (see Ref. [443] for a nice discussion of this point). This leads to three additional degrees of freedom, summing up to a total of 10M + 5 = 5N degrees of freedom, each a function of the four coordinates x^{μ}.
6.4 Bigravity
Let us end this section with the special case of bigravity. Bigravity can also be derived from the deconstruction paradigm, just as massive gravity and multigravity, but the idea has been investigated for many years (see for instance [436, 324]). Like massive gravity, bigravity was for a long time thought to host a BD ghost parasite, but a ghost-free realization was recently proposed by Hassan and Rosen [293] and bigravity is thus experiencing renewed interest. This extension is nothing other than the ghost-free massive gravity Lagrangian for a dynamical reference metric with the addition of an Einstein-Hilbert term for the now dynamical reference metric.
6.4.1 Bigravity from deconstruction
Let us consider a two-site discretization with periodic boundary conditions, j = 1, 2, 3 with quantities at the site j = 3 being identified with those at the site j = 1. Similarly, as in Section 5.2 we denote by \({g_{\mu \nu}} = e_\mu ^ae_\nu ^b{\eta _{ab}}\) and by \({f_{\mu \nu}} = f_\mu ^af_\nu ^b{\eta _{ab}}\) the metrics and vielbeins at the respective locations y_{1} and y_{2}.
Then applying the discretization procedure highlighted in Eqs. (5.14, 5.15, 5.18, 5.19 and 5.20) and summing over the extra dimension, we obtain the bigravity action
where \({\mathcal K}[g,f]\) is given in (5.25) and we use the notation M_{g} = M_{Pl}. We can equally well write the mass terms in terms of \({\mathcal K}[f,g]\) rather than \({\mathcal K}[g,f]\) as performed in (6.21).
Notice that the most naive discretization procedure would lead to M_{g} = M_{Pl} = M_{f}, but these can be generalized either ‘by hand’ by changing the weight of each site during the discretization, or by considering a nontrivial configuration along the extra dimension (for instance warping along the extra dimension^{Footnote 16}), or most simply by performing a conformal rescaling of the metric at each site.
Here, \({{\mathcal L}_0}[{\mathcal K}[g,f]]\) corresponds to a cosmological constant for the metric g_{μν}, and the special combination \(\sum\nolimits_{n = 0}^4 {{{(-1)}^n}C_4^n{{\mathcal L}_n}[{\mathcal K}[g,f]]}\), where the \(C_n^m\) are the binomial coefficients, is the cosmological constant for the metric f_{μν}, so only ℒ_{2,3,4} correspond to genuine interactions between the two metrics.
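The statement about the alternating binomial combination can be checked numerically. The sketch below assumes the normalization \({\mathcal L}_n[{\mathcal K}] = n!(4 - n)!\,e_n({\mathcal K})\) in four dimensions, with e_{n} the elementary symmetric polynomials of the eigenvalues (this follows from the ε-contraction definition), and the convention \({\mathcal K} = 1 - \sqrt{g^{-1}f}\). Under these assumptions the combination equals \(4!\det(1 - {\mathcal K}) = 4!\sqrt{\det f/\det g}\), so that multiplying by \(\sqrt{-g}\) gives \(4!\sqrt{-f}\). Function names are illustrative.

```python
import math
import numpy as np


def L_n_from_char_poly(K, n):
    """L_n[K] = n! (4 - n)! e_n(K), read off the characteristic polynomial."""
    char_poly = np.poly(K)               # [1, -e_1, e_2, -e_3, e_4]
    e_n = (-1) ** n * char_poly[n]
    return math.factorial(n) * math.factorial(4 - n) * e_n


def f_cosmological_combination(K):
    """sum_n (-1)^n C(4, n) L_n[K], claimed to be 4! det(1 - K)."""
    return sum(
        (-1) ** n * math.comb(4, n) * L_n_from_char_poly(K, n) for n in range(5)
    )


rng = np.random.default_rng(1)
K = rng.normal(size=(4, 4))
lhs = f_cosmological_combination(K)
rhs = 24.0 * np.linalg.det(np.eye(4) - K)
assert np.isclose(lhs, rhs)
```

Since the result depends on g only through the overall determinant factor, the combination carries no genuine interaction between the two metrics.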
In the deconstruction framework, we naturally obtain α_{2} = 1 and no tadpole nor cosmological constant for either metric.
6.4.2 Mass eigenstates
In this formulation of bigravity, both metrics g and f carry a superposition of the massless and the massive spin-2 field. As already emphasized, the notion of mass (and of spin) only makes sense for a field living in Minkowski, and so to analyze the mass spectrum, we expand both metrics about flat spacetime,
The general mass spectrum about different backgrounds is richer and provided in [300]. Here we only focus on a background which preserves Lorentz invariance (in principle we could also include other maximally symmetric backgrounds which have the same amount of symmetry as Minkowski).
Working about Minkowski, then to quadratic order in h, the action for bigravity reads (for
where all indices are raised and lowered with respect to the flat Minkowski metric and the Lichnerowicz operator \(\hat \varepsilon _{\mu \nu}^{\alpha \beta}\) was defined in (2.37). We see appearing the Fierz-Pauli mass term combination \(h_{\mu \nu}^2 - {h^2}\) introduced in (2.44) for the massive field with the effective mass M_{eff} defined as [293]
The massive field h is given by
while the other combination represents the massless field ℓ_{μν},
so that in terms of the light and heavy spin2 fields (or more precisely in terms of the two mass eigenstates h and ℓ), the quadratic action for bigravity reproduces that of a massless spin2 field ℓ and a FierzPauli massive spin2 field h with mass m_{eff},
As explained in [293], in the case where there is a large hierarchy between the two Planck scales M_{Pl} and M_{f}, the massive particle is always the one that enters at the lower Planck mass and the massless one the one that has the larger Planck scale. For instance if M_{f} ≫ M_{Pl}, the massless particle is mainly given by δf_{μν} and the massive one mainly by δg_{μν}. This means that in the limit M_{f} → ∞ while keeping M_{Pl} fixed, we recover the theory of massive gravity and a fully decoupled massless graviton as will be explained in Section 8.2.
6.5 Coupling to matter
So far we have only focused on an empty five-dimensional bulk with no matter. It is natural, though, to consider matter fields living in five dimensions, χ(x, y) with Lagrangian (in the gauge choice (5.7))
in addition to arbitrary potentials (we focus on the case of a scalar field for simplicity, but the same philosophy can be applied to higherspin species be it bosons or fermions). Then applying the same discretization scheme used for gravity, every matter field then comes in N copies
for j = 1, ⋯, N and each field χ^{(j)} is coupled to the associated vielbein e^{(j)} or metric \(g_{\mu \nu}^{(j)} = e_\mu ^{(j)}{}^ae_\nu ^{(j)}{}^b{\eta _{ab}}\) at the same site. In the discretization procedure, the gradient along the extra dimension yields a mixing (interaction) between fields located on neighboring sites,
(assuming again periodic boundary conditions, χ^{(N +1)} = χ^{(1)}). The discretization procedure could be also performed using a more complicated definition of the derivative along y involving more than two sites, which leads to further interactions between the different fields.
In the two-site derivative formulation, the action for matter is then
The coupling to gauge fields or fermions can be derived in the same way, and the vielbein formalism makes it natural to extend the action (5.6) to five dimensions and apply the discretization procedure. Interestingly, in the case of fermions, the fields on neighboring sites would not directly couple to one another, but they would couple to both the vielbein e^{(j)} at the same site and the one e^{(j −1)} on the neighboring site.
Notice, however, that the current full proofs for the absence of the BD ghost do not include such couplings between matter fields living on different metrics (or vielbeins), nor matter fields coupling directly to more than one metric (vielbein).
6.6 No new kinetic interactions
In GR, diffeomorphism invariance uniquely fixes the kinetic term to be the EinsteinHilbert one
(see, for instance, Refs. [287, 483, 175, 225, 76] for the uniqueness of GR for the theory of a massless spin2 field).
In more than four dimensions, the GR action can be supplemented by additional Lovelock invariants [383] which respect diffeomorphism invariance and are expressed in terms of higher powers of the Riemann curvature but lead to second order equations of motion. In four dimensions there is only one non-trivial additional Lovelock invariant, corresponding to the Gauss-Bonnet term, but it is topological and thus does not affect the theory, unless other degrees of freedom such as a scalar field are included.
So, when dealing with the theory of a single massless spin-2 field in four dimensions the only allowed kinetic term is the well-known Einstein-Hilbert one. Now when it comes to the theory of a massive spin-2 field, diffeomorphism invariance is broken and so in addition to the allowed potential terms described in (6.9)–(6.13), one could consider other kinetic terms which break diffeomorphism invariance.
This possibility was explored in Refs. [231, 310, 230] where it was shown that in four dimensions, the following derivative interaction \({\mathcal L}_3^{{\rm{(der)}}}\) is ghost-free at leading order (i.e., there are no higher derivatives for the Stückelberg fields when introducing the Stückelberg fields associated with linear diffeomorphism),
So this new derivative interaction would be allowed for a theory of a massive spin2 field which does not couple to matter. Note that this interaction can only be considered if the spin2 field is massive in the first place, so this interaction can only be present if the FierzPauli mass term (2.44) is already present in the theory.
Now let us turn to a theory of gravity. In that case, we have seen that the coupling to matter forces linear diffeomorphisms to be extended to fully nonlinear diffeomorphism. So to be viable in a theory of massive gravity, the derivative interaction (5.57) should enjoy a ghostfree nonlinear completion (the absence of ghost nonlinearly can be checked for instance by restoring nonlinear diffeomorphism using the nonlinear Stückelberg decomposition (2.80) in terms of the helicity1 and 0 modes given in (2.46), or by performing an ADM analysis as will be performed for the mass term in Section 7.) It is easy to check that by itself \({\mathcal L}_3^{{\rm{(der)}}}\) has a ghost at quartic order and so other nonlinear interactions should be included for this term to have any chance of being ghostfree.
Within the deconstruction paradigm, the nonlinear completion of \({\mathcal L}_3^{{\rm{(der)}}}\) could have a natural interpretation as arising from the five-dimensional Gauss-Bonnet term after discretization. Exploring this avenue would indeed lead to a new kinetic interaction of the form \(\sqrt {-g}\, {{\mathcal K}_{\mu \nu}}{{\mathcal K}_{\alpha \beta}}{}^*{R^{\mu \nu \alpha \beta}}\), where *R is the dual Riemann tensor [339, 153]. However, a simple ADM analysis shows that such a term propagates more than five degrees of freedom and thus has an Ostrogradsky ghost (similarly to the BD ghost). As a result this new kinetic interaction (5.57) does not have a natural realization from a five-dimensional point of view (at least in its metric formulation; see Ref. [153] for more details).
We can push the analysis even further and show that no matter what the higher order interactions are, as soon as \({\mathcal L}_3^{{\rm{(der)}}}\) is present it will always lead to a ghost and so such an interaction is never acceptable [153].
As a result, the EinsteinHilbert kinetic term is the only allowed kinetic term in Lorentzinvariant (massive) gravity.
This result shows how special and unique the EinsteinHilbert term is. Even without imposing diffeomorphism invariance, the stability of the theory fixes the kinetic term to be nothing else than the EinsteinHilbert term and thus forces diffeomorphism invariance at the level of the kinetic term. Even without requiring coordinate transformation invariance, the Riemann curvature remains the building block of the kinetic structure of the theory, just as in GR.
Before summarizing the derivation of massive gravity from higher dimensional deconstruction/Kaluza-Klein decomposition, we briefly comment on other ‘apparent’ modifications of the kinetic structure like in f(R) gravity (see for instance Refs. [89, 354, 46] for f(R) massive gravity and their implications to cosmology).
Such kinetic terms à la f(R) are also possible without a mass term for the graviton. In that case diffeomorphism invariance allows us to perform a change of frame. In the Einstein frame, f(R) gravity is seen to correspond to a theory of gravity with a scalar field, and the same result will hold in f(R) massive gravity (in that case the scalar field couples non-trivially to the Stückelberg fields). As a result f(R) is not a genuine modification of the kinetic term but rather a standard Einstein-Hilbert term with the addition of a new scalar degree of freedom, which is not a degree of freedom of the graviton but rather an independent scalar degree of freedom which couples non-minimally to matter (see Ref. [128] for a review on f(R) gravity).
7 Part II Ghostfree Massive Gravity
8 Massive, Bi and MultiGravity Formulation: A Summary
The previous ‘deconstruction’ framework gave an intuitive argument for the emergence of a potential of the form (6.3) (or (6.1) in the vielbein language) and its bi- and multi-metric generalizations. In deconstruction or Kaluza-Klein decomposition a certain type of interaction arises naturally and we have seen that the whole spectrum of allowed potentials (or interactions) could be generated by extending the deconstruction procedure to a more general notion of derivative or by involving the mixing of more sites in the definition of the derivative along the extra dimensions. We here summarize the most general formulation for the theories of massive gravity about a generic reference metric, bigravity and multigravity and provide a dictionary between the different languages used in the literature.
The general action for ghostfree (or dRGT) massive gravity [144] in the vielbein language is [95, 314] (see however Footnote 13 with respect to Ref. [95], see also Refs. [502, 410] for earlier work)
with
or in the metric language [144],
In what follows we will use the notation for the overall potential of massive gravity
so that
where ℒ_{GR}[g ] is the standard GR EinsteinHilbert Lagrangian for the dynamical metric g_{μν} and f_{μν} is the reference metric and for bigravity,
where both g_{μν} and f_{μν} are then dynamical metrics.
Both massive gravity and bigravity break one copy of diff invariance and so the Stückelberg fields can be introduced in exactly the same way in both cases \({\mathcal U}[g,f] \to {\mathcal U}[g,\tilde f]\) where the Stückelbergized metric \({\tilde f_{\mu \nu}}\) was introduced in (2.75) (or alternatively \({\mathcal U}[g,f] \to {\mathcal U}[\tilde g,f]\)). Thus bigravity is by no means an alternative to introducing the Stückelberg fields as is sometimes stated.
In these formulations, ℒ_{0} (or the term proportional to c_{0}) corresponds to a cosmological constant, ℒ_{1} to a tadpole, ℒ_{2} to the mass term and ℒ_{3,4} to allowed higher order interactions. The presence of the tadpole ℒ_{1} would imply a non-zero vev. The presence of the potentials ℒ_{3,4} without ℒ_{2} would lead to infinitely strongly coupled degrees of freedom and would thus be pathological. We recall that \({\mathcal K}[g,f]\) is given in terms of the metrics g and f as
and the Lagrangians ℒ_{n} are defined as follows in arbitrary dimensions d [144]
with ℒ_{0}[Q] = d! and ℒ_{1}[Q] = (d − 1)![Q], or equivalently in four dimensions [292]
We have introduced the constant \({{\mathcal L}_0}\) (\({{\mathcal L}_0} = 4!\) and \(\sqrt {-g}\,{{\mathcal L}_0}\) is nothing other than the cosmological constant) and the tadpole ℒ_{1} for completeness. Notice however that not all these five Lagrangians are independent and the tadpole can always be re-expressed in terms of a cosmological constant and the other potential terms.
Alternatively, we may express these scalars as follows [144]
These are easily generalizable to any number of dimensions, and in d dimensions we find d such independent scalars.
The multigravity action is a generalization to multiple interacting spin2 fields with the same form for the interactions, and bigravity is the special case of two metrics (N = 2), [314]
or
8.1 Inverse argument
We could have written this set of interactions in terms of \({\mathcal K}[f,g]\) rather than \({\mathcal K}[g,f]\),
with
Interestingly, the absence of tadpole and cosmological constant for, say, the metric g implies α_{0} = α_{1} = 0, which in turn implies the absence of tadpole and cosmological constant for the other metric f, α̃_{0} = α̃_{1} = 0, and thus α̃_{2} = α_{2} = 1.
8.2 Alternative variables
Alternatively, another fully equivalent convention has also been used in the literature [292] in terms of \({\mathbb X}_{\,\,\,v}^\mu = {g^{\mu \alpha}}{f_{\alpha v}}\) defined in (2.76),
which is equivalent to (6.4) with ℒ_{0} = 4! and
or the inverse relation,
so that in order to avoid a tadpole and a cosmological constant we need to set for instance β_{4} = − (24β_{0} + 24β_{1} + 12β_{2} + 4β_{3}) and β_{3} = −6(4β_{0} + 3β_{1} + β_{2}).
8.3 Expansion about the reference metric
In the vielbein language the mass term is extremely simple, as can be seen in Eq. (6.1) with \({\mathcal A}\) defined in (2.60). Back to the metric language, this means that the mass term takes a remarkably simple form when writing the dynamical metric g_{μν} in terms of the reference metric and a difference \({\tilde h_{\mu \nu}} = 2{h_{\mu \nu}} + h_{\mu \nu}^2\) as
where f^{αβ} = (f^{−1})^{αβ}. The mass term is then expressed as
where the \({\tilde {\mathcal L}_n}\) have the same expression as the ℒ_{n} in (6.9)–(6.13), so \({\tilde {\mathcal L}_n}\) is genuinely n^{th} order in h_{μν}. The expression (6.27) is thus at most quartic order in h_{μν} but is valid to all orders in h_{μν} (there is no assumption that h be small). In other words, the mass term (6.27) is not an expansion in h_{μν} truncated to a finite (quartic) order, but rather a fully equivalent way to rewrite the mass Lagrangian in terms of the variable h_{μν} rather than g_{μν}. Of course the kinetic term is intrinsically nonlinear and includes an infinite expansion in h_{μν}. A generalization of such parameterizations is provided in [300].
The relation between the coefficients κ_{n} and α_{n} is given by
The quadratic expansion about a background different from the reference metric was derived in Ref. [278]. Notice however that even though the mass term may not appear as having an exact FierzPauli structure as shown in [278], it still has the correct structure to avoid any BD ghost, about any background [295, 294, 300, 297].
9 Evading the BD Ghost in Massive Gravity
The deconstruction framework gave an intuitive approach on how to construct a theory of massive gravity or multiple interacting ‘gravitons’. This led to the ghost-free dRGT theory of massive gravity and its bi- and multigravity extensions in a natural way. However, these developments were only possible a posteriori.
The deconstruction framework was proposed earlier (see Refs. [24, 25, 168, 28, 443, 168, 170]) directly in the metric language and despite starting from a perfectly healthy five-dimensional theory of GR, the discretization in the metric language leads to the standard BD issue (this also holds in a KK decomposition when truncating the KK tower at some finite energy scale). Knowing that massive gravity (or multigravity) can be naturally derived from a healthy five-dimensional theory of GR is thus not a sufficient argument for the absence of the BD ghost, and a great amount of effort was devoted to that proof, which is by now known in a multitude of different forms and languages.
Within this review, one cannot do justice to all the independent proofs that have been formulated by now in the literature. We thus focus on a few of them, namely the Hamiltonian analysis in the ADM language and the analysis in the Stückelberg language. One of the proofs in the vielbein formalism will be used in the multigravity case, and thus we do not emphasize that proof in the context of massive gravity, although it is perfectly applicable (and actually very elegant) in that case. Finally, after deriving the decoupling limit in Section 8.3, we also briefly review how it can be used to prove the absence of ghost more generically.
We note that even though the original argument on how the BD ghost could be circumvented in the full nonlinear theory was presented in [137] and [144], the absence of BD ghost in “ghost-free massive gravity” or dRGT has been the subject of many discussions [12, 13, 345, 342, 95, 341, 344, 96] (see also [350, 351, 349, 348, 352] for related discussions in bigravity). By now the confusion has been clarified; see for instance [295, 294, 400, 346, 343, 297, 15, 259] for thorough proofs addressing all the issues raised in the previous literature. (See also [347] for the proof of the absence of ghosts in other closely related models).
9.1 ADM formulation
9.1.1 ADM formalism for GR
Before going onto the subtleties associated with massive gravity, let us briefly summarize how the counting of the number of degrees of freedom can be performed in the ADM language using the Hamiltonian for GR. Using an ADM decomposition (where this time, we single out the time, rather than the extra dimension as was performed in Part I),
with the lapse N, the shift N^{i} and the 3-dimensional space metric γ_{ij}. In this section indices are raised and lowered with respect to γ_{ij} and dots represent derivatives with respect to t. In terms of these variables, the action density for GR is
where ^{(3)}R is the three-dimensional scalar curvature built out of γ (no time derivatives in ^{(3)}R) and K_{ij} is the three-dimensional extrinsic curvature,
The GR action can thus be expressed in a way which has no double or higher time derivatives and only first time-derivatives squared of γ_{ij}. This means that neither the shift nor the lapse is truly dynamical and they do not have any associated conjugate momenta. The conjugate momentum associated with γ is,
We can now construct the Hamiltonian density for GR in terms of the 12 phase space variables (γ_{ij} and p^{ij} carry 6 components each),
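So we see that in GR, both the shift and the lapse play the role of Lagrange multipliers. The lapse and each of the three components of the shift thus impose a first-class constraint, and each constraint removes 2 phase space degrees of freedom. The counting of the number of degrees of freedom in phase space thus goes as follows:
\[(2 \times 6) - 2\ \text{(lapse constraint)} - 2 \times 3\ \text{(shift constraints)} = 4 = 2 \times 2,\]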
corresponding to a total of 4 degrees of freedom in phase space, or 2 independent degrees of freedom in field space. This is the very well-known and established result that in four dimensions GR propagates 2 physical degrees of freedom, or gravitational waves have two polarizations.
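For definiteness, the ingredients of this counting can be written out explicitly. The following is only a sketch in one common set of conventions: the mostly-plus signature, the overall normalization \(M_{\rm Pl}^2/2\), the neglect of total derivatives and the labels \(\mathcal{R}_0\), \(\mathcal{R}_i\) for the constraint densities are assumptions made here, and signs and factors may differ from those used elsewhere in this review.

```latex
% Requires amsmath. Sketch under the stated assumptions (signature -+++, total derivatives dropped).
\begin{align}
 \mathrm{d}s^2 &= -N^2\,\mathrm{d}t^2
   + \gamma_{ij}\,(\mathrm{d}x^i + N^i\,\mathrm{d}t)(\mathrm{d}x^j + N^j\,\mathrm{d}t), \\
 K_{ij} &= \frac{1}{2N}\left(\dot\gamma_{ij} - \nabla_i N_j - \nabla_j N_i\right), \\
 \mathcal{L}_{\rm GR} &= \frac{M_{\rm Pl}^2}{2}\,N\sqrt{\gamma}\,
   \left({}^{(3)}\!R + K_{ij}K^{ij} - K^2\right), \\
 p^{ij} &= \frac{\partial \mathcal{L}_{\rm GR}}{\partial \dot\gamma_{ij}}
   = \frac{M_{\rm Pl}^2}{2}\sqrt{\gamma}\left(K^{ij} - K\gamma^{ij}\right), \\
 \mathcal{H}_{\rm GR} &= p^{ij}\dot\gamma_{ij} - \mathcal{L}_{\rm GR}
   = N\,\mathcal{R}_0(\gamma,p) + N^i\,\mathcal{R}_i(\gamma,p), \\
 \mathcal{R}_0 &= \frac{2}{M_{\rm Pl}^2\sqrt{\gamma}}\left(p_{ij}p^{ij} - \tfrac12\,p^2\right)
   - \frac{M_{\rm Pl}^2}{2}\sqrt{\gamma}\;{}^{(3)}\!R,
 \qquad
 \mathcal{R}_i = -2\,\gamma_{ij}\nabla_k p^{jk}.
\end{align}
% Phase-space counting: 12 variables, 1 lapse + 3 shift first-class constraints,
% each removing 2 phase-space degrees of freedom.
\begin{equation}
 (2\times 6) - 2\times(1+3) = 4 = 2\times 2 .
\end{equation}
```

Since the lapse and the shift enter \(\mathcal{H}_{\rm GR}\) only linearly, varying with respect to them imposes \(\mathcal{R}_0 = 0\) and \(\mathcal{R}_i = 0\) on \(\{\gamma_{ij}, p^{ij}\}\), which is the property that a generic mass term spoils.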
This result is fully generalizable to any number of dimensions, and in d spacetime dimensions, gravitational waves carry d (d − 3)/2 polarizations. We now move to the case of massive gravity.
9.1.2 ADM counting in massive gravity
We now amend the GR Lagrangian with a potential \({\mathcal U}\). As already explained, this can only be performed by breaking covariance (with the exception of a cosmological constant). This potential could be a priori an arbitrary function of the metric, but contains no derivatives and so does not affect the definition of the conjugate momenta p^{ij}. This translates directly into a potential at the level of the Hamiltonian density,
where the overall potential for ghost-free massive gravity is given in (6.4).
If \({\mathcal U}\) depends nonlinearly on the shift or the lapse then these are no longer directly Lagrange multipliers (if they enter nonlinearly, they still appear in their own equations of motion, which then determine the lapse and the shift themselves rather than imposing a constraint on the metric). As a result, for an arbitrary potential one is left with (2 × 6) degrees of freedom in the three-dimensional metric and its conjugate momentum, and no constraint is present to reduce the phase space. This leads to 6 degrees of freedom in field space: the two usual transverse polarizations for the graviton (as we have in GR), in addition to two ‘vector’ polarizations and two ‘scalar’ polarizations.
These 6 polarizations correspond to the five healthy massive spin-2 field degrees of freedom in addition to the sixth BD ghost, as explained in Section 2.5 (see also Section 7.2).
This counting also generalizes to an arbitrary number of dimensions: in d spacetime dimensions, a massive spin-2 field should propagate the same number of degrees of freedom as a massless spin-2 field in d + 1 dimensions, that is (d + 1)(d − 2)/2 polarizations. However, an arbitrary potential would allow for d (d − 1)/2 independent degrees of freedom, which is one too many, the extra excitation always corresponding to one BD ghost degree of freedom in an arbitrary number of dimensions.
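The counting in d dimensions can be summarized as follows. This is a bookkeeping sketch which assumes, as in the four-dimensional case, d first-class constraints for GR and, for a healthy massive spin-2 field, one pair of second-class constraints; \(N_{\rm dof}\) is a label introduced here for the number of field-space degrees of freedom, and the tabulated values simply evaluate the formulas quoted in the text.

```latex
% Requires amsmath. The spatial metric has d(d-1)/2 components, so phase space has d(d-1) variables.
\begin{align}
 \text{GR:}\quad
   & N_{\rm dof} = \tfrac12\left[d(d-1) - 2d\right] = \tfrac12\,d(d-3), \\
 \text{generic potential (no constraint):}\quad
   & N_{\rm dof} = \tfrac12\,d(d-1), \\
 \text{one pair of second-class constraints:}\quad
   & N_{\rm dof} = \tfrac12\left[d(d-1) - 2\right] = \tfrac12\,(d+1)(d-2).
\end{align}
\begin{tabular}{c|ccc}
 $d$ & GR & massive spin-2 & generic potential \\ \hline
 3 & 0 & 2 & 3 \\
 4 & 2 & 5 & 6 \\
 5 & 5 & 9 & 10
\end{tabular}
```

In every dimension the last two columns differ by exactly one, which is the BD ghost.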
The only way this counting can be wrong is if the equations of motion for the shift and the lapse cannot be inverted for the shift and the lapse themselves, and thus at least one of the equations of motion from the shift or the lapse imposes a constraint on the three-dimensional metric γ_{ij}. This loophole was first presented in [138] and an example was provided in [137]. It was then used in [144] to explain how the ‘no-go’ on the presence of a ghost in massive gravity could be circumvented. Finally, this argument was carried through fully nonlinearly in [295] (see also [342] for the analysis in 1 + 1 dimensions as presented in [144]).
9.1.3 Eliminating the BD ghost
9.1.3.1 Linear Fierz-Pauli massive gravity
Fierz-Pauli massive gravity is special in that at the linear level (quadratic in the Hamiltonian), the lapse remains linear, so it still acts as a Lagrange multiplier generating a primary second-class constraint. We define the metric fluctuation as h_{μν} = M_{Pl}(g_{μν} − η_{μν}) (where for simplicity and definiteness we take Minkowski as the reference metric f_{μν} = η_{μν}, although most of what follows can easily be generalized to an arbitrary reference metric f_{μν}). Expanding the lapse as N = 1 + δN, we have h_{00} = δN + γ_{ij}N^{i}N^{j} and h_{0i} = γ_{ij}N^{j}. In the ADM decomposition, the Fierz-Pauli mass term is then (see Eq. (2.45))
and is linear in the lapse. This is sufficient to deduce that it will keep imposing a constraint on the three-dimensional phase space variables {γ_{ij}, p^{ij}} and remove at least half of the unwanted BD ghost. The shift, on the other hand, is nonlinear already in the Fierz-Pauli theory, so its equations of motion impose a relation for the shift itself rather than a constraint on the three-dimensional metric. As a result the Fierz-Pauli theory (at that order) propagates three more degrees of freedom than GR, which gives the usual five degrees of freedom of a massive spin-2 field. Nonlinearly, however, the Fierz-Pauli mass term involves a nonlinear term in the lapse in such a way that the constraint associated with it disappears, and Fierz-Pauli massive gravity has a ghost at the nonlinear level, as pointed out in [75]. This is in complete agreement with the discussion in Section 2.5, and is a complementary way to see the issue.
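The linearity in the lapse can be traced to a simple cancellation. The following sketch assumes a mostly-plus Minkowski metric, indices contracted with \(\eta_{\mu\nu}\), and ignores the overall normalization of the mass term; the constant \(a\) is a hypothetical parameter introduced here only to contrast the Fierz-Pauli tuning with a generic quadratic mass term.

```latex
% Requires amsmath. Conventions assumed: eta = diag(-1,+1,+1,+1), h = eta^{mu nu} h_{mu nu}.
\begin{align}
 h_{\mu\nu}h^{\mu\nu} &= h_{00}^2 - 2\,h_{0i}h_{0i} + h_{ij}h_{ij}, \\
 h^2 &= \left(-h_{00} + h_{ii}\right)^2 = h_{00}^2 - 2\,h_{00}h_{ii} + h_{ii}^2, \\
 h_{\mu\nu}h^{\mu\nu} - a\,h^2 &= (1-a)\,h_{00}^2 + 2a\,h_{00}h_{ii}
   - 2\,h_{0i}h_{0i} + h_{ij}h_{ij} - a\,h_{ii}^2 .
\end{align}
% For the Fierz-Pauli tuning a = 1 the h_{00}^2 term drops out:
\begin{equation}
 h_{\mu\nu}h^{\mu\nu} - h^2 = 2\,h_{00}h_{ii} - 2\,h_{0i}h_{0i} + h_{ij}h_{ij} - h_{ii}^2 .
\end{equation}
```

Since \(h_{00}\) is linear in δN at leading order and \(h_{0i}\) is linear in the shift, the Fierz-Pauli combination is linear in δN and quadratic in the shift at this order, whereas any \(a \neq 1\) leaves a term quadratic in δN and the constraint is lost.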
In Ref. [111], the most general potential was considered up to quartic order in the h_{μν}, and it was shown that there is no choice of such potential (apart from a pure cosmological constant) which would prevent the lapse from entering nonlinearly. While this result is definitely correct, it does not however imply the absence of a constraint generated by the set of shift and lapse N^{μ} = {N, N^{i}}. Indeed there is no reason to believe that the lapse should necessarily be the quantity that generates the constraint necessary to remove the BD ghost. Rather it can be any combination of the lapse and the shift.
9.1.3.2 Example on how to evade the BD ghost nonlinearly
As an instructive example presented in [137], consider the following Hamiltonian,
with the following example for the potential
In this example neither the lapse nor the shift enter linearly, and one might worry about the loss of the constraint needed to project out the BD ghost. However, upon solving for the shift and substituting back into the Hamiltonian (this is possible since the shift is not dynamical), we get
and the lapse now appears as a Lagrange multiplier generating a constraint, even though it was not linear in (7.10). This could have been seen more easily, without the need to explicitly integrate out the shift, by computing the Hessian
In the example (7.10), one has
The Hessian cannot be inverted, which means that the equations of motion cannot be solved for all of the shift and lapse components. Instead, one of these equations ought to be read as a constraint on the three-dimensional phase space variables, which corresponds to the primary second-class constraint. Note that this constraint is not associated with a symmetry in this case and while the Hamiltonian is then pure constraint in this toy example, it will not be in general.
Finally, one could also have deduced the existence of a constraint by performing the linear change of variable
in terms of which the Hamiltonian is then explicitly linear in the lapse,
and generates a constraint that can be read for {n_{i}, γ_{ij}, p^{ij}}.
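To make the example concrete, consider the following sketch. The specific potential below, with \(V(\gamma,p)\) an arbitrary function and \(\mathcal{C}_0\), \(\mathcal{C}_i\) standing for lapse- and shift-independent constraint densities, is an assumed illustrative choice which has all the properties described in the text; it should not be read as a verbatim quotation of (7.9)-(7.15).

```latex
% Requires amsmath. Assumed toy potential; V, C_0, C_i depend only on (gamma, p).
\begin{align}
 \mathcal{H} &= N\,\mathcal{C}_0 + N^i\,\mathcal{C}_i + m^2\,\mathcal{U},
 \qquad
 \mathcal{U} = V(\gamma,p)\,\frac{\gamma_{ij}N^iN^j}{2N}, \\
 % shift equation of motion, solved algebraically for the shift:
 0 &= \frac{\partial\mathcal{H}}{\partial N^i}
    = \mathcal{C}_i + m^2 V\,\frac{N_i}{N}
 \;\Longrightarrow\;
 N_i = -\frac{N\,\mathcal{C}_i}{m^2 V}, \\
 % substituting back: linear in the lapse
 \mathcal{H} &= N\left(\mathcal{C}_0 - \frac{\gamma^{ij}\mathcal{C}_i\mathcal{C}_j}{2m^2V}\right), \\
 % Hessian with respect to N^mu = (N, N^i):
 L_{\mu\nu} &= \frac{\partial^2\mathcal{H}}{\partial N^\mu\,\partial N^\nu}
 = \frac{m^2V}{N^3}
 \begin{pmatrix} N_kN^k & -N\,N_j \\ -N\,N_i & N^2\gamma_{ij} \end{pmatrix},
 \qquad
 L_{\mu\nu}N^\nu = 0 \;\Longrightarrow\; \det L_{\mu\nu} = 0, \\
 % linear change of variable:
 n^i &= \frac{N^i}{N}
 \;\Longrightarrow\;
 \mathcal{H} = N\left(\mathcal{C}_0 + n^i\,\mathcal{C}_i
   + \tfrac12\,m^2V\,\gamma_{ij}n^in^j\right).
\end{align}
```

In this sketch the null vector of the Hessian is \(N^\mu\) itself, which reflects the fact that the assumed \(\mathcal{U}\) is homogeneous of degree one in \((N, N^i)\); the three routes (integrating out the shift, the degenerate Hessian, and the change of variable) all exhibit the same constraint.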
9.1.3.3 Condition to evade the ghost
To summarize, the condition to eliminate (at least half of) the BD ghost is that the determinant of the Hessian L_{μν} in (7.13) vanishes, as explained in [144]. This was shown to be the case in the ghost-free theory of massive gravity (6.3) [(6.1)] exactly in some cases and up to quartic order, and then fully nonlinearly in [295]. We summarize the derivation in the general case in what follows.
Ultimately, this means that in massive gravity we should be able to find a new shift n^{i} related to the original one as follows: N^{i} = f_{0}(γ, n) + Nf_{1}(γ, n), such that the Hamiltonian takes the following factorizable form
In this form, the equation of motion for the shift is manifestly independent of the lapse, and integrating over the shift n^{i} manifestly keeps the Hamiltonian linear in the lapse and yields the constraint \({{\mathcal C}_1}(\gamma, p){\mathcal F}(\gamma, p,{n^i}(\gamma)) + {{\mathcal C}_2}(\gamma, p) = 0\). However, such a field redefinition has not (yet) been found. Instead, the new shift n^{i} found below does the next best thing (which is entirely sufficient) of (a) keeping the Hamiltonian linear in the lapse and (b) keeping its own equation of motion independent of the lapse, which is sufficient to infer the presence of a primary constraint.
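One structure with the properties just described can be sketched as follows. The lapse-independent functions \(\mathcal{A}_1\), \(\mathcal{A}_2\) are hypothetical labels introduced here, and the form below is an assumption chosen to reproduce the stated constraint rather than a quotation of the original expression.

```latex
% Requires amsmath. Assumed factorizable form; A_1, A_2, C_1, C_2 depend only on (gamma, p).
\begin{align}
 \mathcal{H} &= \left(\mathcal{A}_1 + N\,\mathcal{C}_1\right)\mathcal{F}(\gamma,p,n^i)
   + \left(\mathcal{A}_2 + N\,\mathcal{C}_2\right), \\
 % shift equation: the lapse-dependent prefactor drops out
 \frac{\partial\mathcal{H}}{\partial n^i}
   &= \left(\mathcal{A}_1 + N\,\mathcal{C}_1\right)\frac{\partial\mathcal{F}}{\partial n^i} = 0
 \;\Longrightarrow\;
 \frac{\partial\mathcal{F}}{\partial n^i} = 0, \\
 % lapse equation, evaluated on the solution of the shift equation
 \frac{\partial\mathcal{H}}{\partial N}
   &= \mathcal{C}_1\,\mathcal{F} + \mathcal{C}_2 = 0 .
\end{align}
```

The shift equation fixes \(n^i\) without any reference to N, so substituting its solution back cannot generate higher powers of the lapse.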
9.1.3.4 Primary constraint
We now proceed by deriving the primary second-class constraint present in ghost-free (dRGT) massive gravity. The proof works equally well for any reference metric at no extra cost, and so we consider a general reference metric in its own ADM decomposition, while keeping the dynamical metric in its original ADM form (since we work in unitary gauge, we may not simplify the metric further),
and denote again by p^{ij} the conjugate momentum associated with γ_{ij}. \({\bar f_{ij}}\) is not dynamical in massive gravity so there is no conjugate momentum associated with it. The bars on the reference metric are there to denote that these quantities are parameters of the theory and not dynamical variables, although the proof for a dynamical reference metric and multi-gravity works equally well; this is performed in Section 7.4.
Proceeding similarly as in the previous example, we perform a change of variables similar to (7.15) (only more complicated, but which remains linear in the lapse when expressed in terms of n^{i}) [295, 296]
where the matrix \(D_j^i\) satisfies the following relation
with
In what follows we use the definition
with
The field redefinition naturally involves a square root through the expression of the matrix D in (7.21), which should come as no surprise from the square root structure of the potential term. For the potential to be writable in the metric language, the square root in the definition of the tensor \({\mathcal K}_{\,\,\,\nu}^\mu\) should exist, which in turn implies that the square root in the definition of \(D_j^i\) in (7.21) must also exist. While complicated, the important point to notice is that this field redefinition remains linear in the lapse (and so does not spoil the standard constraints of GR).
The Hamiltonian for massive gravity is then
where \({\mathcal U}\) includes the new contributions from the mass term. \({\mathcal U}(\gamma, {N^i},N)\) is neither linear in the lapse N, nor in the shift N^{i}. There is actually no choice of potential \({\mathcal U}\) which would keep it linear in the lapse beyond cubic order [111]. However, as we shall see, when expressed in terms of the redefined shift n^{i}, the nonlinearities in the shift absorb all the original nonlinearities in the lapse, and \({\mathcal U}(\gamma, {n^i}, N)\) is linear in the lapse. In itself this is not sufficient to prove the presence of a constraint, as the integration over the shift n^{i} could in turn lead to higher powers of the lapse in the Hamiltonian,
with
where the β’s are expressed in terms of the α’s as in (6.28). For the purpose of this analysis it is easier to work with that notation.
The structure of the potential is such that the equations of motion with respect to the shift are independent of the lapse and impose the following relations in terms of \({\bar n_i} = {n^j}\,{\bar f_{ij}}\),
which entirely fixes the three shifts n^{i} in terms of γ_{ij} and p^{ij} as well as the reference metric \({\bar f_{ij}}\) (note that \({\bar{\mathcal N}^i}\) entirely disappears from these equations of motion).
The two requirements defined previously are thus satisfied: (a) the Hamiltonian is linear in the lapse and (b) the equations of motion with respect to the shift n^{i} are independent of the lapse, which is sufficient to infer the presence of a primary constraint. This primary constraint is derived by varying with respect to the lapse and evaluating the shift on the constraint surface (7.29),
where the symbol “≈” means on the constraint surface. The existence of this primary constraint is sufficient to infer the absence of the BD ghost. If we were dealing with a generic system (which could allow for some spontaneous parity violation), it could still be in principle that there are no secondary constraints associated with \({{\mathcal C}_0} = 0\) and the theory propagates 5.5 physical degrees of freedom (11 dofs in phase space). However, physically this never happens, since the theory of gravity we are dealing with preserves parity and is Lorentz invariant. Indeed, to have 5.5 physical degrees of freedom, one of the variables should have an equation of motion which is linear in time derivatives. Lorentz invariance then implies that it must also be linear in space derivatives, which would then violate parity. However, this is only an intuitive argument and the real proof is presented below. Indeed, ghost-free massive gravity admits a secondary constraint, which was explicitly found in [294].
9.1.3.5 Secondary constraint
Let us imagine we start with initial conditions that satisfy the constraints of the system, in particular the modified Hamiltonian constraint (7.30). As the system evolves the constraint (7.30) needs to remain satisfied. This means that the modified Hamiltonian constraint ought to be independent of time, or in other words it should commute with the Hamiltonian. This requirement generates a secondary constraint,
with \({H_{{\rm{mGR,1}}}} = \int {{{\rm{d}}^{\rm{3}}}x{{\mathcal H}_{{\rm{mGR,1}}}}}\) and
Finding the precise form of this secondary constraint requires a very careful analysis of the Poisson bracket algebra of this system. This formidable task led to some confusion at first (see Ref. [345]) but the constraint was then successfully derived in [294] (see also [258, 259] and [343]). Deriving the whole set of Poisson brackets is beyond the scope of this review and we simply give the expression for the secondary constraint,
where unless specified otherwise, all indices are raised and lowered with respect to the dynamical metric γ_{ij}, and the covariant derivatives are also taken with respect to the same metric. We also define
The important point to notice is that the secondary constraint (7.33) only depends on the phase space variables γ_{ij}, p^{ij} and not on the lapse N. Thus it constrains the phase space variables rather than the lapse and provides a genuine secondary constraint in addition to the primary one (7.30) (indeed one can check that \({{\mathcal C}_2}{\vert_{{{\mathcal C}_0} = 0}} \ne 0\)).
Finally, we should also check that this secondary constraint is also maintained in time. This was performed in [294] by inspecting the condition
This condition should be satisfied without further constraining the phase space variables, which would otherwise imply that fewer than five degrees of freedom are propagating. Since five fully fledged dofs are propagating at the linearized level, the same must happen nonlinearly.^{Footnote 17} Rather than a constraint on {γ_{ij}, p^{ij}}, (7.36) must be solved for the lapse. This is only possible if both of the following conditions are satisfied
As shown in [294], since these quantities do not vanish at the linear level (the constraints reduce to the Fierz-Pauli ones in that case), we can deduce that they cannot vanish nonlinearly and thus the condition (7.36) fixes the expression for the lapse rather than constraining further the phase space dofs. Thus there is no tertiary constraint on the phase space.
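The logic of this constraint analysis can be summarized schematically. The sketch below assumes that the Hamiltonian density splits as \(N\,\mathcal{C}_0 + \mathcal{H}_1\), with \(\mathcal{H}_1\) a label used here for the lapse-independent part, and that \(\{\mathcal{C}_0(x), \mathcal{C}_0(y)\} \approx 0\); it only illustrates the structure of the argument and does not reproduce the explicit expressions of [294].

```latex
% Requires amsmath. Schematic Dirac chain under the stated assumptions.
\begin{align}
 H &= \int \mathrm{d}^3x\,\left(N\,\mathcal{C}_0 + \mathcal{H}_1\right),
 \qquad H_1 = \int \mathrm{d}^3x\,\mathcal{H}_1, \\
 % preservation of the primary constraint gives the secondary one
 \mathcal{C}_2(x) &\equiv \frac{\mathrm{d}\,\mathcal{C}_0(x)}{\mathrm{d}t}
   = \{\mathcal{C}_0(x), H\} \approx \{\mathcal{C}_0(x), H_1\} \approx 0, \\
 % preservation of the secondary constraint is an equation for the lapse
 \frac{\mathrm{d}\,\mathcal{C}_2(x)}{\mathrm{d}t}
   &= \{\mathcal{C}_2(x), H_1\}
    + \int \mathrm{d}^3y\; N(y)\,\{\mathcal{C}_2(x), \mathcal{C}_0(y)\} \approx 0, \\
 % solvable for N provided neither bracket vanishes on the constraint surface
 \{\mathcal{C}_2(x), H_1\} &\not\approx 0,
 \qquad
 \{\mathcal{C}_2(x), \mathcal{C}_0(y)\} \not\approx 0 .
\end{align}
% Phase-space counting with the two second-class constraints C_0 and C_2:
\begin{equation}
 (2\times 6) - 2 = 10 = 2\times 5 .
\end{equation}
```

The chain terminates at this step because the last condition determines the lapse instead of restricting \(\{\gamma_{ij}, p^{ij}\}\) any further.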
To conclude, we have shown in this section that ghost-free (or dRGT) massive gravity is indeed free from the BD ghost and the theory propagates five physical dofs about generic backgrounds. We now present the proof in other languages, but stress that the proof developed in this section is sufficient to infer the absence of the BD ghost.
9.1.3.6 Secondary constraints in bi- and multi-gravity
In bi- or multi-gravity, where all the metrics are dynamical, the Hamiltonian is pure constraint (every term is linear in one of the lapses, as can be seen explicitly already from (7.25) and (7.26)).
In this case, the evolution equations of the primary constraints can always be solved for their respective Lagrange multipliers (the lapses), which can always be set to zero. Setting the lapses to zero would be unphysical in a theory of gravity, and instead one should take a ‘bifurcation’ of the Dirac constraint analysis as explained in [48]: rather than solving for the Lagrange multipliers, we can choose to use the evolution equations of some of the primary constraints to provide additional secondary constraints.
Choosing this bifurcation leads to statements which are then continuous with the massive gravity case and one recovers the correct number of degrees of freedom. See Ref. [48] for an enlightening discussion.
9.2 Absence of ghost in the Stückelberg language
9.2.1 Physical degrees of freedom
Another way to see the absence of ghost in massive gravity is to work directly in the Stückelberg language for massive spin-2 fields introduced in Section 2.4. If the four scalar fields ϕ^{a} were dynamical, the theory would propagate six degrees of freedom (the two usual helicity-2 modes, whose dynamics is encoded in the standard Einstein-Hilbert term, and the four Stückelberg fields). To remove the sixth mode, corresponding to the BD ghost, one needs to check that not all four Stückelberg fields are dynamical but only three of them. See also [14] for a theory of two Stückelberg fields.
Stated more precisely, in the Stückelberg language beyond the DL, if ℰ^{a} is the equation of motion with respect to the field ϕ^{a}, the correct requirement for the absence of ghost is that the Hessian defined as
be non-invertible, so that the dynamics of not all four Stückelberg fields may be derived from it. This is the case if
as first explained in Ref. [145]. This condition was successfully shown to arise in a number of situations for the ghost-free theory of massive gravity with potential given in (6.3) or equivalently in (6.1) in Ref. [145] and then more generically in Ref. [297].^{Footnote 18} For illustrative purposes, we start by showing how this constraint arises in a simple two-dimensional realization of ghost-free massive gravity before deriving the more general proof.
9.2.2 Two-dimensional case
Consider massive gravity on a two-dimensional spacetime, ds^{2} = − N^{2} dt^{2} + γ (dx + N_{x} dt)^{2}, with the two Stückelberg fields ϕ^{0,1} [145]. In this case the graviton potential can only have one independent nontrivial term (excluding the tadpole),
In light-cone coordinates,
the potential is thus
The Hessian of this Lagrangian with respect to the two Stückelberg fields ϕ^{±} is then
and is clearly non-invertible, which shows that not both Stückelberg fields are dynamical. In this special case, the Hamiltonian is actually pure constraint as shown in [145], and there are no propagating degrees of freedom. This is as expected for a massive spin-two field in two dimensions.
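The degeneracy of this Hessian follows from the square-root structure alone, as the following sketch shows. It assumes that the Lagrangian is proportional to \(\sqrt{XY}\), where X is linear in \(\dot\phi^-\) and Y is linear in \(\dot\phi^+\); the symbols X, Y and the velocity coefficients a, b are labels introduced here, and the overall (velocity-independent) prefactor is dropped.

```latex
% Requires amsmath. Assumptions: L ~ sqrt(X Y),  dX/d(dot phi^-) = a,  dY/d(dot phi^+) = b,
% with a and b independent of the velocities.
\begin{align}
 \mathcal{A}_{ab} &= \frac{\partial^2\sqrt{XY}}{\partial\dot\phi^a\,\partial\dot\phi^b}
 = \frac{1}{4\,(XY)^{3/2}}
 \begin{pmatrix} -a^2\,Y^2 & ab\,XY \\ ab\,XY & -b^2\,X^2 \end{pmatrix},
 \qquad a,b \in \{-,+\}, \\
 \det\mathcal{A} &= \frac{a^2b^2}{16\,(XY)^{3}}\left(X^2Y^2 - X^2Y^2\right) = 0, \\
 % null direction:
 \mathcal{A}\begin{pmatrix} b\,X \\ a\,Y \end{pmatrix} &= 0 .
\end{align}
```

The null direction identifies the combination of the two Stückelberg velocities that never appears quadratically, so at most one combination of the two fields can carry dynamics.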
As shown in Refs. [144, 145] the square root can be traded for an auxiliary non-dynamical variable \(\lambda _{\,\,\,\,\nu}^\mu\). In this two-dimensional example, the mass term (7.43) can be rewritten with the help of an auxiliary non-dynamical variable λ as
A similar trick will be used in the full proof.
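The identity behind this trick is elementary and can be sketched as follows. The normalization chosen below is an assumption made here for illustration and may differ from the one used in the original expressions; the positive branch of the square root and the invertibility of λ are also assumed.

```latex
% Requires amsmath. Scalar case: extremize over the auxiliary variable lambda.
\begin{align}
 S(\lambda) &= \frac12\left(\lambda + \frac{X}{\lambda}\right),
 \qquad
 \frac{\partial S}{\partial\lambda} = \frac12\left(1 - \frac{X}{\lambda^2}\right) = 0
 \;\Longrightarrow\; \lambda = \sqrt{X},
 \qquad S\big|_{\lambda=\sqrt{X}} = \sqrt{X}. \\
 % Matrix case, with lambda a matrix and Y the matrix under the square root:
 S(\lambda) &= \frac12\,\mathrm{Tr}\left(\lambda + \lambda^{-1}Y\right),
 \qquad
 \delta S = \frac12\,\mathrm{Tr}\left[\delta\lambda\left(1 - \lambda^{-1}Y\lambda^{-1}\right)\right] = 0
 \;\Longrightarrow\; Y = \lambda^2,
 \qquad S\big|_{\lambda=\sqrt{Y}} = \mathrm{Tr}\sqrt{Y}.
\end{align}
```

The benefit is that the velocities of the Stückelberg fields enter \(S(\lambda)\) polynomially through X (or Y), at the price of one additional algebraic equation for λ.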
9.2.3 Full proof
The full proof in the minimal model (corresponding to α_{2} = 1 and α_{3} = −2/3 and α_{4} = 1/6 in (6.3) or β_{2} = β_{3} = 0 in the alternative formulation (6.23)), was derived in Ref. [297]. We briefly review the essence of the argument, although the full technical derivation is beyond the scope of this review and we refer the reader to Refs. [297] and [15] for a fully-fledged derivation.
Using a set of auxiliary variables \(\lambda _b^a\) (with λ_{ab} = λ_{ba}, so these auxiliary variables contain ten elements in four dimensions) as explained previously, we can rewrite the potential term in the minimal model as [79, 342],
where the matrix Y has been defined in (2.77) and is equivalent to X used previously. Upon integration over the auxiliary variable λ we recover the square-root structure as mentioned in Ref. [144]. We now perform an ADM decomposition as in (7.1) which implies the ADM decomposition of the matrix Y,
with
Since the matrix V uses a projection along the 3 spatial directions it is genuinely a rank-3 matrix rather than rank 4. This implies that det V = 0. Notice that we consider an arbitrary reference metric f, as the proof does not depend on it and can be done for any f at no extra cost [297]. The canonical momentum conjugate to ϕ^{a} is given by
with
In terms of these conjugate momenta, the equations of motion with respect to λ^{ab} then impose the relation (after multiplying with the matrix^{Footnote 19} α λ on both sides),
with the matrix C_{ab} defined as
Since det V = 0, as mentioned previously, the equation of motion (7.52) is only consistent if we also have det C = 0. This is the first constraint found in [297] which is already sufficient to remove (half) the BD ghost,
which is the primary constraint on a subset of physical phase space variables {γ_{ij}, p_{a}}, (by construction det f ≠ 0). The secondary constraint is then derived by commuting \({{\mathcal C}_1}\) with the Hamiltonian. Following the derivation of [297], we get on the constraint surface
where π^{ij} is the conjugate momentum associated with γ_{ij}, and ∇^{(f)} is the covariant derivative associated with f.
9.2.4 Stückelberg method on arbitrary backgrounds
When working about different non-Minkowski backgrounds, one can instead generalize the definition of the helicity-0 mode as was performed in [400]. The essence of the argument is to perform a rotation in field space so that the fluctuations of the Stückelberg fields about a curved background form a vector field in the new basis, and one can then employ the standard treatment for a vector field. See also [10] for another study of the Stückelberg fields in an FLRW background.
Recently, a covariant Stückelberg analysis valid about any background was performed in Ref. [369] using the BRST formalism. Interestingly, this method also allows one to derive the decoupling limit of massive gravity about any background.
In what follows, we review the approach derived in [400] which provides yet another independent argument for the absence of ghost in full generality. The proofs presented in Sections 7.1 and 7.2 work to all orders about a trivial background while in [400], the proof is performed about a generic (curved) background, and the analysis can thus stop at quadratic order in the fluctuations. Both types of analysis are equivalent so long as the fields are analytic, which is the case if one wishes to remain within the regime of validity of the theory.
Consider a generic background metric: in unitary gauge (i.e., in the coordinate system {x} where the Stückelberg background fields are given by \({\phi ^a}(x) = {x^\mu}\delta _\mu ^a\)), the background metric is given by \(g_{\mu \nu}^{{\rm{bg}}} = e_\mu ^a(x)e_\nu ^b(x){\eta _{ab}}\), and the background Stückelberg fields are given by \(\phi _{{\rm{bg}}}^a(x) = {x^a} - A_{{\rm{bg}}}^a(x)\).
We now add fluctuations about that background,
with \({A^a} = A_{{\rm{bg}}}^a + {a^a}\).
9.2.4.1 Flat background metric
First, note that if we consider a flat background metric to start with, then at zeroth order in h, the ghost-free potential is of the form [400], (this can also be seen from [238, 419])
with F_{ab} = ∂_{a}A_{b} − ∂_{b}A_{a}. This means that for a symmetric Stückelberg background configuration, i.e., if the matrix \({\partial _\mu}\phi _{{\rm{bg}}}^a\) is symmetric, then \(F_{ab}^{{\rm{bg}}} = 0\), and at quadratic order in the fluctuation a, the action has a U(1)-symmetry. This symmetry is lost nonlinearly, but is still relevant when looking at quadratic fluctuations about arbitrary backgrounds. Now using the split about the background, \({A^a} = A_{{\rm{bg}}}^a + {a^a}\), this means that up to quadratic order in the fluctuations a^{a}, the action at zeroth order in the metric fluctuation is of the form [400]
with f_{μν} = ∂_{μ}a_{ν} − ∂_{ν}a_{μ} and \({\bar B^{\mu \alpha \nu \beta}}\) is a set of constant coefficients which depends on \(A_{{\rm{bg}}}^a\). This quadratic action has an accidental U(1)-symmetry which is responsible for projecting out one of the four dofs naively present in the four Stückelberg fluctuations a^{a}. Had we considered any other potential term, the U(1) symmetry would have been generically lost and all four Stückelberg fields would have been dynamical.
9.2.4.2 Non-symmetric background Stückelberg
If the background configuration is not symmetric, then at every point one needs to perform first an internal Lorentz transformation Λ(x) in the Stückelberg field space, so as to align them with the coordinate basis and recover a symmetric configuration for the background Stückelberg fields. In this new Lorentz frame, the Stückelberg fluctuation is \({\tilde a^\mu} = \Lambda _{\,\,\nu}^\mu (x){a^\nu}\). As a result, to quadratic order in the Stückelberg fluctuation the part of the ghost-free potential which is independent of the metric fluctuation and its curvature goes symbolically as (7.60) with f replaced by \(f \to \tilde f + (\partial \Lambda){\Lambda ^{-1}}\tilde a\) (with \({\tilde f_{\mu \nu}} = {\partial _\mu}{\tilde a_\nu} - {\partial _\nu}{\tilde a_\mu}\)). Interestingly, the Lorentz boost (∂ Λ)Λ^{−1} now plays the role of a mass term for what looks like a gauge field ã. This mass term breaks the U(1) symmetry, but there is still no kinetic term for ã_{0}, very much as in a Proca theory. This part of the potential is thus manifestly ghost-free (in the sense that it provides a dynamics for only three of the four Stückelberg fields, independently of the background).
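The Proca analogy invoked here can be made explicit with the standard massive vector field. In the sketch below the mass M is a generic parameter introduced for illustration (in the text its role is played by the background-dependent term built from (∂Λ)Λ^{−1}), and a flat mostly-plus metric is assumed.

```latex
% Requires amsmath. Standard Proca theory; signature (-,+,+,+) assumed.
\begin{align}
 \mathcal{L} &= -\tfrac14\,f_{\mu\nu}f^{\mu\nu} - \tfrac12\,M^2 a_\mu a^\mu
 = \tfrac12\left(\dot a_i - \partial_i a_0\right)^2 - \tfrac14\,f_{ij}f_{ij}
   + \tfrac12\,M^2\left(a_0^2 - a_i a_i\right), \\
 % conjugate momenta: no velocity of a_0 appears
 \pi^i &= \frac{\partial\mathcal{L}}{\partial\dot a_i} = \dot a_i - \partial_i a_0,
 \qquad
 \pi^0 = \frac{\partial\mathcal{L}}{\partial\dot a_0} = 0, \\
 % the a_0 equation is algebraic in a_0
 0 &= \partial_i\pi^i + M^2 a_0
 \;\Longrightarrow\; a_0 = -\frac{\partial_i\pi^i}{M^2}, \\
 % counting: 8 phase-space variables, 2 second-class constraints
 8 - 2 &= 6 = 2\times 3 .
\end{align}
```

For M = 0 the two constraints become first-class and the count is instead 8 − 4 = 4, i.e., two polarizations; in either case \(a_0\) never acquires dynamics, which is the property used in the argument above.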
Next, we consider the mixing with metric fluctuation h while still assuming zero curvature. At linear order in h, the ghost-free potential (6.3) goes as follows
where the tensors \(X_{\mu \nu}^{(n)}\) are similar to the ones found in the decoupling limit, but now expressed in terms of the symmetric full four Stückelberg fields rather than just π, i.e., replacing Π_{μν} by ∂_{μ}A_{ν} + ∂_{ν}A_{μ} in the respective expressions (8.29), (8.30) and (8.31) for \(X_{\mu \nu}^{(1,2,3)}\). Starting with the symmetric configuration for the Stückelberg fields, then since we are working at the quadratic level in perturbations, one of the A_{μ} in the \(X_{\mu \nu}^{(n)}\) is taken to be the fluctuation a_{μ}, while the others are taken to be the background field \(A_\mu ^{{\rm{bg}}}\). As a result in the first terms in hX in (7.61) ∂_{0}a_{0} cannot come at the same time as h_{00} or h_{0i}, and we can thus integrate by parts the time derivative acting on any a_{0}, leading to a harmless first time derivative on h_{ij}, and no time evolution for a_{0}.
As for the second type of term in (7.61), since F = 0 on the background field \(A_\mu ^{{\rm{bg}}}\), the second type of terms is forced to be proportional to f_{μν} and cannot involve any ∂_{0} a_{0} at all. As a result a_{0} is not dynamical, which ensures that the theory is free from the BD ghost.
This part of the argument generalizes easily for non-symmetric background Stückelberg configurations, and the same replacement \(f \to \tilde f + (\partial \Lambda){\Lambda ^{-1}}\tilde a\) still ensures that ã_{0} acquires no dynamics from (7.61).
9.2.4.3 Background curvature
Finally, to complete the argument, we consider the effect of background curvature, for which \(g_{\mu \nu}^{{\rm{bg}}} \ne {\eta _{\mu \nu}}\) with \(g_{\mu \nu}^{{\rm{bg}}} = e_\mu ^a(x)e_\nu ^b(x){\eta _{ab}}\). The spacetime curvature is another source of ‘misalignment’ between the coordinates and the Stückelberg fields. To rectify this misalignment, we could go two ways: either perform a local change of coordinate so as to align the background metric \(g_{\mu \nu}^{{\rm{bg}}}\) with the flat reference metric η_{μν} (i.e., going to a local inertial frame), or the other way around, i.e., express the flat reference metric in terms of the curved background metric, \({\eta _{ab}} = e_a^\mu e_b^\nu g_{\mu \nu}^{{\rm{bg}}}\), in terms of the inverse vielbein, \(e_a^\mu \equiv ({e^{-1}})_a^\mu\). Then the building block of ghost-free massive gravity is the matrix \({\mathbb X}\), defined previously as
As a result, the whole formalism derived previously is directly applicable with the only subtlety that the Stückelberg fields ϕ^{a} should be replaced by their ‘vielbein-dependent’ counterparts, i.e., \({\partial _\mu }{A_\nu } \to g_{\mu \nu }^{{\rm{bg}}} - g_{\nu \alpha }^{{\rm{bg}}}e_{\,\,\,a}^\alpha {\partial _\mu }{\phi ^a}\). In terms of the Stückelberg field fluctuation a^{a}, this implies the replacement \({a^a} \to {\bar a_\mu} = g_{\mu \nu}^{{\rm{bg}}}e_{\,\,\,\,\,a}^\nu{a^a}\), and symbolically, \(f \to \bar f + (\partial \Sigma){\Sigma ^{-1}}\bar a\), with Σ = ge. The situation is thus the same as when we were dealing with a non-symmetric Stückelberg background configuration: after integration by parts (which might involve harmless curvature contributions), the potential can be written in a way which never involves any time derivative on ā_{0}. As a result, ā_{μ} plays the role of an effective Proca vector field which only propagates three degrees of freedom, and this about any curved background metric. The beauty of this argument lies in the correct identification of the proper degrees of freedom when dealing with a curved background metric.
9.3 Absence of ghost in the vielbein formulation
Finally, we can also prove the absence of ghost for dRGT in the vielbein formalism, either directly at the level of the Lagrangian in some special cases as shown in [171] or in full generality in the Hamiltonian formalism, as shown in [314]. The latter proof also works in all generality for a multi-gravity theory and will thus be presented in more depth in what follows, but we first focus on a special case presented in Ref. [171].
Let us start with massive gravity in the vielbein formalism (6.1). As was the case in Part II, we work with the symmetric vielbein condition, \(e_\mu ^af_\nu ^b{\eta _{ab}} = e_\nu ^af_\mu ^b{\eta _{ab}}\). For simplicity we specialize further to the case where \(f_\mu ^a = \delta _\mu ^a\), so that the symmetric vielbein condition imposes e^{aμ} = e^{μa}. Under this condition, the vielbein contains as many independent components as the metric. The symmetric vielbein condition ensures that one is able to reformulate the theory in a metric language. In d spacetime dimensions, there are a priori d (d + 1)/2 independent components in the symmetric vielbein.
Varying the action (6.1) with respect to the vielbein leads to the modified Einstein equation,
with G_{a} = ε_{abcd}ω^{bc} ∧ e^{d}. From the Bianchi identity, \({\mathcal D}{G_a} = {\rm{d}}{G_a} - \omega _a^{\,\,b}{G_b} = 0\), we infer the d constraints
leading to d (d − 1)/2 independent components in the vielbein. This is still one too many components, unless an additional constraint is found. The idea behind the proof in Ref. [171] is then to use the Bianchi identities to infer an additional constraint of the form,
where m^{a} is an appropriate one-form which depends on the specific coefficients of the theory. Such a constraint is present at the linear level for Fierz-Pauli massive gravity, and it was further shown in Ref. [171] that special choices of coefficients for the theory lead to remarkably simple analogous relations fully nonlinearly. To give an example, we consider all the coefficients c_{n} to vanish but c_{1} ≠ 0. In that case the Bianchi identity (7.65) implies
where, similarly to (5.2), the torsionless connection is given in terms of the vielbein as
with \({o^{ab}}_c = 2{e^{a\mu}}{e^{b\nu}}{\partial _{\left[ \mu \right.}}{e_{\left. \nu \right]}}_c\). The Bianchi identity (7.67) then implies \(e_a^{\,\,\,\,b}{\partial _{\left[ b \right.}}e_{\left. a \right]}^a = 0\), so that we obtain an extra constraint of the form (7.66) with m^{a} = e^{a}. Ref. [171] derived similar constraints for other parameters of the theory.
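The component counting used in this argument can be laid out step by step. This is a bookkeeping sketch of the numbers quoted in the text, assuming that the symmetric vielbein condition, the d Bianchi constraints and the single additional constraint are all independent.

```latex
% Requires amsmath. Component counting in d spacetime dimensions.
\begin{align}
 \text{generic vielbein:}\quad & d^2, \\
 \text{symmetric vielbein condition:}\quad & d^2 - \tfrac12\,d(d-1) = \tfrac12\,d(d+1), \\
 \text{$d$ constraints from the Bianchi identity:}\quad & \tfrac12\,d(d+1) - d = \tfrac12\,d(d-1), \\
 \text{one additional constraint:}\quad & \tfrac12\,d(d-1) - 1 = \tfrac12\,(d+1)(d-2).
\end{align}
% For d = 4:  16 -> 10 -> 6 -> 5.
```

In four dimensions the last step takes the count from six to the five components of a massive spin-2 field, which is why a single extra constraint of the form (7.66) is what needs to be established.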
9.4 Absence of ghosts in multi-gravity
We now turn to the proof for the absence of ghost in multi-gravity and follow the vielbein formulation of Ref. [314]. In this subsection we use the notation that uppercase Latin indices represent d-dimensional Lorentz indices, A, B, ⋯ = 0, ⋯, d − 1, while lowercase Latin indices represent the (d − 1)-dimensional Lorentz indices along the space directions after ADM decomposition, a, b, ⋯ = 1, ⋯, d − 1. Greek indices represent d-dimensional spacetime indices μ, ν, ⋯ = 0, ⋯, d − 1, while the ‘middle’ of the Latin alphabet indices i, j, ⋯ represent pure space indices i, j, ⋯ = 1, ⋯, d − 1. Finally, capital indices label the metric and span over I, J, K, ⋯ = 1, ⋯, N.
Let us start with N non-interacting spin-2 fields. The theory has then N copies of coordinate transformation invariance (the coordinate system associated with each metric can be changed separately), as well as N copies of Lorentz invariance. At this level, for each vielbein e_{(J)}, J = 1, ⋯, N, we may use part of the Lorentz freedom to work in the upper triangular form for the vielbein,
leading to the standard ADM decomposition for the metric,
with the three-dimensional metric \(\gamma_{(J)ij} = e_{(J)}{}^{a}_{\;i}\,e_{(J)}{}^{b}_{\;j}\,{\delta _{ab}}\). Starting with noninteracting fields, we simply take copies of the GR action,
and the Hamiltonian in terms of the vielbein variables then takes the form (7.6)
where \({\pi _{(J)}}_a^{\,\,i}\) is the conjugate momentum associated with the vielbein \({e_{(J)}}_{\,\,i}^a\) and the constraints \({{\mathcal C}_{(J)0,i}} = {{\mathcal C}_{0,i}}({e_{(J)}},{\pi _{(J)}})\) are the ones mentioned previously in (7.6) (now expressed in the vielbein variables) and are related to diffeomorphism invariance. In the vielbein language there are an additional d(d − 1)/2 primary constraints for each vielbein field
related to the residual local Lorentz symmetry still present after fixing the upper triangular form for the vielbeins.
Now rather than using part of the N Lorentz frames to set all the vielbeins in the upper triangular form (7.69), we only use one Lorentz boost to set one of the vielbeins in that form, say e_{(1)}, and ‘unboost’ the N − 1 other frames, so that for any of the other vielbeins one has
where p_{(J)a} is the boost that would bring that vielbein into the upper triangular form.
We now consider arbitrary interactions between the N fields of the form (6.1),
where for concreteness we assume d ≤ N; otherwise the formalism is exactly the same (there is some redundancy in this formulation, i.e., some interactions are repeated, but this has no consequence for the argument). Since the vielbeins \({e_{(J)}}_0^A\) are linear in their respective shifts and lapse \({N_{(J)}},N_{(J)}^i\) and the vielbeins \({e_{(J)}}_i^A\) do not depend on any shift or lapse, it is easy to see that the general set of interactions (7.77) leads to a Hamiltonian which is also linear in every shift and lapse,
Indeed the wedge structure of (6.1) or (7.77) ensures that there is one and only one vielbein with timelike index \({e_{(J)}}_0^A\) for every term \({\varepsilon _{{a_1} \ldots {a_d}}}e_{({J_1})}^{{a_1}}\wedge \ldots \wedge e_{({J_d})}^{{a_d}}\).
Notice that for the interactions, the terms \({\mathcal C}_{(J)0,i}^{{\rm{int}}}\) can depend on all the N vielbeins e_{(J′)} and all the N − 1 ‘boosts’ p_{(J′)} (as mentioned previously, part of one Lorentz frame is set so that p_{(1)} = 0 and e_{(1)} is in the upper triangular form). Following the procedure of [314], we can now solve for the N − 1 remaining boosts by using (N − 1) of the N shift equations of motion
Now assuming that all N vielbeins are interacting,^{Footnote 20} (i.e., there is no vielbein e_{(J)} which does not appear at least once in the interactions (7.77) which mix different vielbeins), the shift equations (7.79) will involve all the N − 1 boosts and can be solved for them without spoiling the linearity in any of the N lapses N_{(J)}. As a result, the N − 1 lapses N_{(J)} for J = 2, ⋯, N are Lagrange multipliers for (N − 1) first-class constraints. The lapse N_{(1)} for the first vielbein combines with the remaining shift \(N_{(1)}^i\) to generate the one remaining copy of diffeomorphism invariance.
We now have all the ingredients to count the number of dofs in phase space: We start with d^{2} components in each of the N vielbeins \(e_{(J)i}^a\) and associated conjugate momenta, that is a total of 2 × d^{2} × N phase space variables. We then have 2 × d(d − 1)/2 × N constraints^{Footnote 21} associated with the \(\lambda _{(J)}^{ab}\). There is one copy of diffeomorphism invariance removing 2 × (d + 1) phase space dofs (with Lagrange multipliers N_{(1)} and \(N_{(1)}^i\)) and (N − 1) additional first-class constraints with Lagrange multipliers N_{(J ≥ 2)} removing 2 × (N − 1) dofs. As a result we end up with
which is the correct counting in (d + 1) spacetime dimensions, and the theory is thus free of any BD ghost.
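The bookkeeping above can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original proof: the function names are hypothetical, and d is taken to be the number of spatial dimensions (so that each spatial vielbein has d² components and spacetime has d + 1 dimensions), which is how the counting is phrased in the text. The result is compared with the phase-space count expected for one massless and N − 1 massive spin-2 fields, assuming the standard numbers of polarizations D(D − 3)/2 and (D + 1)(D − 2)/2 in D spacetime dimensions.

```python
def counted_phase_space_dofs(d, n_fields):
    """Phase-space dofs left after removing the constraints listed in the text.

    d        -- number of spatial dimensions (assumption: spacetime has d + 1)
    n_fields -- number N of interacting vielbeins
    """
    variables = 2 * d * d * n_fields              # vielbeins and conjugate momenta
    lorentz = 2 * (d * (d - 1) // 2) * n_fields   # constraints tied to lambda_(J)^{ab}
    diffeo = 2 * (d + 1)                          # one copy of diffeomorphisms
    extra = 2 * (n_fields - 1)                    # lapses N_(J >= 2)
    return variables - lorentz - diffeo - extra


def expected_phase_space_dofs(d, n_fields):
    """One massless plus (N - 1) massive gravitons in D = d + 1 dimensions."""
    D = d + 1
    massless = D * (D - 3) // 2
    massive = (D + 1) * (D - 2) // 2
    return 2 * (massless + (n_fields - 1) * massive)


if __name__ == "__main__":
    for n_fields in (1, 2, 3):
        print(n_fields,
              counted_phase_space_dofs(3, n_fields),
              expected_phase_space_dofs(3, n_fields))
```

For d = 3 and N = 2 (bigravity in four spacetime dimensions) both functions return 14, i.e., 2 + 5 = 7 degrees of freedom in field space.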
10 Decoupling Limits
10.1 Scaling versus decoupling
Before moving to the decoupling of massive gravity and bigravity, let us make a brief interlude concerning the correct identification of degrees of freedom. The Stückelberg trick used previously to identify the correct degrees of freedom works in all generality, but care must be used when taking a “decoupling limit” (i.e., scaling limit) as will be done in Section 8.2.
Imagine the following gauge field theory
i.e., the Proca mass term without any kinetic Maxwell term for the gauge field. Since there are no dynamics in this theory, there are no degrees of freedom. Nevertheless, one could still proceed and use the same split \({A_\mu} = A_\mu^\bot + {\partial_\mu}\chi/m\) as performed previously,
so as to introduce what appears to be a kinetic term for the mode χ. At this level the theory is still invariant under χ → χ + mξ and \(A_\mu^\bot \to A_\mu^\bot - {\partial_\mu}\xi\), and so while there appears to be a dynamical degree of freedom χ, the symmetry makes that degree of freedom unphysical, so that (8.2) still propagates no physical degree of freedom.
Now consider the m → 0 scaling limit of (8.2) while keeping \(A_\mu ^ \bot\) and χ finite. In that scaling limit, the theory reduces to
i.e., one degree of freedom with no symmetry, which implies that the theory (8.3) propagates one degree of freedom. This is correct and thus means that (8.3) is not a consistent decoupling limit of (8.2) since the number of degrees of freedom is different already at the linear level. In the rest of this review, we will call a decoupling limit a specific type of scaling limit which preserves the same number of physical propagating degrees of freedom in the linear theory. As suggested by the name, a decoupling limit is a special kind of limit in which some of the degrees of freedom of the original theory might decouple from the rest, but the total number of degrees of freedom remains identical. For the theory (8.2), this means that the scaling ought to be taken not with \(A_\mu ^ \bot\) fixed but rather with \(\tilde A_\mu^\bot = m A_\mu^\bot\) fixed, since it is the combination \(m A_\mu^\bot\) that appears in (8.2). This is indeed a consistent rescaling which leads to finite contributions in the limit m → 0,
which clearly propagates no degrees of freedom.
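The steps of this example can be written out explicitly. The following is a reconstruction from the definitions quoted in the text (the Proca mass term and the split \(A_\mu = A_\mu^\bot + \partial_\mu\chi/m\)); the overall normalization −1/2 is an assumption, and the rescaled field is chosen such that \(A_\mu^\bot = \tilde A_\mu^\bot/m\), which is the choice that keeps every term finite.

```latex
\begin{align}
\mathcal{L} &= -\tfrac{1}{2}\, m^2 A_\mu A^\mu
  && \text{(mass term only, no dofs)} \\
&= -\tfrac{1}{2}\, m^2 A^\bot_\mu A^{\bot\mu}
   - m\, A^{\bot\mu}\partial_\mu\chi
   - \tfrac{1}{2}\, (\partial\chi)^2
  && \text{(after } A_\mu = A^\bot_\mu + \partial_\mu\chi/m\text{)} \\
&\xrightarrow{\; m\to 0,\ A^\bot \text{ fixed}\;}\;
   -\tfrac{1}{2}\, (\partial\chi)^2
  && \text{(one dof: not a decoupling limit)} \\
&\xrightarrow{\; m\to 0,\ \tilde A^\bot = m A^\bot \text{ fixed}\;}\;
   -\tfrac{1}{2}\, \tilde A^\bot_\mu \tilde A^{\bot\mu}
   - \tilde A^{\bot\mu}\partial_\mu\chi
   - \tfrac{1}{2}\, (\partial\chi)^2
  && \text{(no dofs)}
\end{align}
```

The last line is the perfect square \(-\tfrac{1}{2}(\tilde A^\bot_\mu + \partial_\mu\chi)^2\): it is still invariant under \(\chi \to \chi + \tilde\xi\), \(\tilde A^\bot_\mu \to \tilde A^\bot_\mu - \partial_\mu\tilde\xi\) with \(\tilde\xi = m\xi\), so χ remains pure gauge and the counting of the original theory is preserved.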
This procedure is true in all generality: a decoupling limit is a special scaling limit where all the fields in the original theory are scaled with the highest possible power of the scale in such a way that the decoupling limit is finite.
A decoupling limit of a theory never changes the number of physical degrees of freedom of a theory. At best it ‘decouples’ some of them in such a way that they are inaccessible from another sector.
Before looking at the massive gravity limit of bigravity and other decoupling limits of massive and bigravity, let us start by describing the different scaling limits that can be taken. We start with a bigravity theory where the two spin-2 fields have respective Planck scales M_{g} and M_{f} and the interactions between the two metrics arise at the scale m. In order to stick to the relevant points we perform the analysis in four dimensions, but the following arguments extend trivially to arbitrary dimensions.

Noninteracting Limit: The most natural question to ask is what happens in the limit where the interactions between the two fields are ‘switched off’, i.e., when sending the scale m → 0 (the limit m → 0 is studied more carefully in Sections 8.3 and 8.4). In that case, if the two Planck scales M_{g,f} remain fixed as m → 0, we recover two massless noninteracting spin-2 fields (each carrying two helicity-2 modes), in addition to a decoupled sector containing a helicity-0 mode and a helicity-1 mode. In bigravity matter fields couple only to one metric, and this remains the case in the limit m → 0, so that the two massless spin-2 fields live in two fully decoupled sectors even when matter is included.

Massive Gravity: Alternatively, we may look at the limit where one of the spin-2 fields (say f_{μν}) decouples. This can be studied by sending its respective Planck scale to infinity. The resulting limit corresponds to a massive spin-2 field (carrying five dofs) and a decoupled massless spin-2 field carrying two dofs. This is nothing other than the massive gravity limit of bigravity (which includes a fully decoupled massless sector).
If one considers matter coupling to the metric f_{μν} which scales in such a way that a nontrivial solution for f_{μν} survives in the M_{f} → ∞ limit, \({f_{\mu\nu}} \to {\bar f_{\mu\nu}}\), we then obtain a massive gravity sector on an arbitrary nondynamical reference metric \({\bar f_{\mu\nu}}\). The dynamics of the massless spin-2 field fully decouples from that of the massive sector.

Other Decoupling Limits: Finally, one can look at combinations of the previous limits, and the resulting theory depends on how fast M_{f}, M_{g} → ∞ compared to how fast m → 0. For instance if one takes the limit M_{f}, M_{g} → ∞ and m → 0, while keeping both M_{g}/M_{f} and \(\Lambda _3^3 = {M_g}{m^2}\) fixed, then we obtain what is called the Λ_{3}-decoupling limit of bigravity (derived in Section 8.4), where the dynamics of the two helicity-2 modes (which are both massless in that limit), and that of the helicity-1 and -0 modes can be followed without keeping track of the standard nonlinearities of GR.
If on top of this Λ_{3}-decoupling limit one further takes M_{f} → ∞, then one of the massless spin-2 fields fully decouples (no communication between that field and the helicity-1 and -0 modes). If, on the other hand, we take the additional limit m → 0 on top of the Λ_{3}-decoupling limit, then the helicity-0 and -1 modes fully decouple from both helicity-2 modes.
In all of these decoupling limits, the number of dofs remains the same as in the original theory; some fields are simply decoupled from the rest of the standard gravitational sector. This prevents any communication between these decoupled fields and the gravitational sector, and so from the viewpoint of the gravitational sector it appears as if these decoupled fields did not exist.
It is worth stressing that all of these limits are perfectly sensible and lead to sensible theories (from a theoretical viewpoint). This is important since if one of these scaling limits led to a pathological theory, it would have severe consequences for the parent bigravity theory itself.
Similar decoupling limits could be taken in multigravity: out of N interacting spin-2 fields, we could obtain for instance N decoupled massless spin-2 fields and 3(N − 1) decoupled dofs in the helicity-0 and -1 modes.
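As a quick consistency check of this counting (a sketch in four dimensions, assuming two polarizations for a massless and five for a massive spin-2 field), the decoupled fields carry exactly as many degrees of freedom as one massless and N − 1 massive gravitons:

```latex
\begin{equation}
\underbrace{2N}_{N\ \text{massless spin-2 fields}}
+ \underbrace{3(N-1)}_{\text{helicity-0 and -1 modes}}
= 5N - 3
= \underbrace{2}_{\text{massless}} + \underbrace{5(N-1)}_{\text{massive}} .
\end{equation}
```

For bigravity (N = 2) this gives 7 = 2 + 5, in agreement with the statement that a decoupling limit never changes the number of degrees of freedom.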
In what follows we focus on the massive gravity limit of bigravity when M_{f} → ∞.
10.2 Massive gravity as a decoupling limit of bigravity
10.2.1 Minkowski reference metric
In the following two sections we review the decoupling arguments given previously in the literature, (see for instance [154]). We start with the theory of bigravity presented in Section 5.4 with the action (5.43)
with \({{\mathcal L}_m}(g,f) = \sum\nolimits_{n = 0}^4 {{\alpha _n}{{\mathcal L}_n}[{\mathcal K}(g,f)]}\) as defined in (6.3) and where \({\mathcal K}_\nu^\mu = \delta_\nu^\mu - \sqrt {{g^{\mu \alpha}}{f_{\alpha \nu}}}\). We also allow for the coupling to matter with different species ψ_{g,f} living on each metric.
We now consider matter fields ψ_{f} such that f_{μν} = η_{μν} is a solution to the equations of motion (so for instance there is no overall cosmological constant living on the metric f_{μν}). In that case we can write that metric f_{μν} as
We may now take the limit M_{f} → ∞ while keeping the scales M_{g} and m and all the fields χ, g, ψ_{f,g} fixed. We then recover massive gravity plus a completely decoupled massless spin-2 field χ_{μν}, and a fully decoupled matter sector ψ_{f} living on flat space
where the massive gravity Lagrangian ℒ_{MG} is given in (6.3). That massive gravity Lagrangian remains fully nonlinear in this limit and is expressed in terms of the full metric g_{μν} and the reference metric η_{μν}. While the metric f_{μν} is ‘frozen’ in this limit, we emphasize however that the massless spin-2 field χ_{μν} is itself not frozen: its dynamics is captured through the kinetic term \({\chi^{\mu\nu}}\hat \varepsilon_{\mu\nu}^{\alpha \beta}{\chi_{\alpha \beta}}\), but that spin-2 field decouples from its own matter sector ψ_{f} (although this can be accommodated by scaling the matter fields ψ_{f} accordingly in the limit M_{f} → ∞ so as to maintain some interactions).
At the level of the equations of motion, in the limit M_{f} → ∞ we obtain the massive gravity modified Einstein equation for g_{μν}, the free massless linearized Einstein equation for χ_{μν}, which fully decouples, and the equation of motion for all the matter fields ψ_{f} on flat spacetime (see also Ref. [44]).
10.2.2 (A)dS reference metric
To consider massive gravity with an (A)dS reference metric as a limit of bigravity, we include a cosmological constant for the metric f into (8.5)
There can also be in principle another cosmological constant living on top of the metric but this can be included into the potential \({\mathcal U}(g,f)\). The background field equations of motion are then given by
Taking now the limit M_{f} → ∞ while keeping the cosmological constant Λ_{f} fixed, the background solution for the metric f_{μν} is nothing other than dS (or AdS depending on the sign of Λ_{f}). So we can now express the metric f_{μν} as
where γ_{μν} is the dS metric with Hubble parameter \(H = \sqrt{{\Lambda _f}/3}\). Taking the limit M_{f} → ∞, we recover massive gravity on (A)dS plus a completely decoupled massless spin-2 field χ_{μν},
where once again the scales M_{Pl} and m are kept fixed in the limit M_{f} → ∞. γ_{μν} now plays the role of a nontrivial reference metric for massive gravity. This corresponds to a theory of massive gravity on a more general reference metric as presented in [296]. Here again the Lagrangian for massive gravity is given in (6.3) with now \({\mathcal K}_\nu^\mu (g) = \delta_\nu^\mu - \sqrt {{g^{\mu \alpha}}{\gamma _{\alpha \nu}}}\). The massive gravity action remains fully nonlinear in the limit M_{f} → ∞ and is expressed solely in terms of the full metric g_{μν} and the reference metric γ_{μν} while the excitations χ_{μν} for the massless graviton remain dynamical but fully decouple from the massive sector.
10.2.3 Arbitrary reference metric
As is already clear from the previous discussion, to recover massive gravity on a nontrivial reference metric as a limit of bigravity, one needs to scale the matter Lagrangian that couples to what will become the reference metric (say the metric f for definiteness) in such a way that the Riemann curvature of f remains finite in that decoupling limit. For a macroscopic description of the matter living on f this is in principle always possible. For instance one can consider a point source of mass M_{BH} living on the metric f. Then, taking the limit M_{f}, M_{BH} → ∞ while keeping the ratio M_{BH}/M_{f} fixed leads to a theory of massive gravity on a Schwarzschild reference metric and a decoupled massless graviton. However, some care needs to be taken to see how this works when the dynamics of the matter sourcing f is included.
As soon as the dynamics of the matter field is considered, one has to send the scale of that field to infinity so that it maintains some nonzero effect on f in the limit M_{f} → ∞ i.e.,
Nevertheless, this can be achieved in such a way that the fluctuations of the matter fields remain finite and decouple in the limit M_{f} → ∞. We note that this scaling is the key difference between the decoupling limit of bigravity on a Minkowski reference metric derived in section 8.2.1, where the matter fields scale such that \(\lim_{{M_f} \to \infty}{1 \over {M_f^2}}{T^{\mu\nu}} \to 0\), and the decoupling limit of bigravity on an arbitrary reference metric derived here.
As an example, suppose that the Lagrangian for the matter (for example a scalar field) sourcing the f metric is
where F(X) is an arbitrary dimensionless function of its argument. Then choosing χ to take the form
and rescaling \({V_0} = M_f^2{\bar V_0}\) and \(\lambda = {M_f}\bar\lambda\), then on taking the limit M_{f} → ∞ keeping \(\bar\chi\), \(\delta\chi\) and \(\bar\lambda\) fixed, since
we find that the background stress energy blows up in such a way that \({1 \over {M_f^2}}{T^{\mu\nu}}\) remains finite and nontrivial, and in addition the background equations of motion for \(\bar\chi\) remain well-defined and nontrivial in this limit,
This implies that even in the limit M_{f} → ∞, f_{μν} can remain consistently as a nontrivial sourced metric which is a solution of some dynamical equations sourced by matter. In addition the action for the fluctuations δχ asymptotes to a free theory which is coupled only to the fluctuations of f_{μν}, which are themselves completely decoupled from the fluctuations of the metric g and matter fields coupled to g.
As a result, massive gravity with an arbitrary reference metric can be seen as a consistent limit of bigravity in which the additional degrees of freedom in the metric f and the matter that sources the background decouple. Thus all solutions of massive gravity may be seen as M_{f} → ∞ decoupling limits of solutions of bigravity. This will be discussed in more depth in Section 8.4. For an arbitrary reference metric which can be locally written as a small departure about Minkowski the decoupling limit is derived in Eq. (8.81).
Having derived massive gravity as a consistent decoupling limit of bigravity, we could of course do the same for any multi-metric theory. For instance, out of N interacting fields, we could take a limit so as to decouple one of the metrics; we then obtain a theory of (N − 1) interacting fields, all of which are massive, and one decoupled massless spin-2 field.
10.3 Decoupling limit of massive gravity
We now turn to a different type of decoupling limit, whose aim is to disentangle the dofs present in massive gravity itself and analyze the ‘irrelevant interactions’ (in the usual EFT sense) that arise at the lowest possible scale. One could naively think that such interactions arise at the scale given by the graviton mass, but this is not so. In a generic theory of massive gravity with Fierz-Pauli at the linear level, the first irrelevant interactions typically arise at the scale Λ_{5} = (m^{4}M_{Pl})^{1/5}. For the setups we have in mind, m ≪ Λ_{5} ≪ M_{Pl}. But we shall see that interactions arising at such a low-energy scale are always pathological (reminiscent of the BD ghost [111, 173]), and in ghost-free massive gravity the first (irrelevant) interactions actually arise at the scale Λ_{3} = (m^{2}M_{Pl})^{1/3}.
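To get a feeling for this hierarchy of scales, the following sketch evaluates Λ_{5} = (m^{4}M_{Pl})^{1/5} and Λ_{3} = (m^{2}M_{Pl})^{1/3} numerically. The input values are illustrative assumptions that are not taken from the text: a graviton mass of the order of the present Hubble rate, m ≈ 10^{−33} eV, and a reduced Planck mass M_{Pl} ≈ 2.4 × 10^{27} eV. The function names are hypothetical.

```python
HBAR_C_EV_M = 1.973e-7  # hbar*c in eV*m, used to convert an energy into a length


def lambda5(m, m_pl):
    """Lambda_5 = (m^4 M_Pl)^(1/5), with all masses in the same units."""
    return (m**4 * m_pl) ** (1.0 / 5.0)


def lambda3(m, m_pl):
    """Lambda_3 = (m^2 M_Pl)^(1/3), with all masses in the same units."""
    return (m**2 * m_pl) ** (1.0 / 3.0)


if __name__ == "__main__":
    m = 1.0e-33      # eV, assumed graviton mass (order of the Hubble rate today)
    m_pl = 2.4e27    # eV, reduced Planck mass
    for name, scale in (("Lambda_5", lambda5(m, m_pl)),
                        ("Lambda_3", lambda3(m, m_pl))):
        length_km = HBAR_C_EV_M / scale / 1.0e3
        print(f"{name} = {scale:.2e} eV, 1/{name} = {length_km:.2e} km")
```

With these inputs the ordering m ≪ Λ_{5} ≪ Λ_{3} ≪ M_{Pl} holds, and Λ_{3}^{−1} comes out of the order of a thousand kilometres, while Λ_{5}^{−1} is many orders of magnitude larger.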
We start by deriving the decoupling limit in the absence of vectors (helicity-1 modes) and then include them in the following section 8.3.4. Since we are interested in the decoupling limit about flat spacetime, we look at the case where Minkowski is a vacuum solution to the equations of motion. This is the case in the absence of a cosmological constant and a tadpole and we thus focus on the case where α_{0} = α_{1} = 0 in (6.3).
10.3.1 Interaction scales
In GR, the interactions of the helicity-2 mode arise at the very high energy scale, namely the Planck scale. In massive gravity a new scale enters and we expect some interactions to arise at a lower energy scale given by a geometric combination of the Planck scale and the graviton mass. The potential term \(M_{{\rm{Pl}}}^2{m^2}\sqrt{-g}\,{{\mathcal L}_n}[{\mathcal K}[g,\eta ]]\) (6.3) includes generic interactions between the canonically normalized helicity-0 (π), helicity-1 (A_{μ}), and helicity-2 modes (h_{μν}) introduced in (2.48)
at the scale
and with j, k, ℓ ∈ ℕ, and j + 2k + ℓ > 2.
Clearly, the lowest interaction scale is Λ_{j=0,k=0,ℓ=3} ≡ Λ_{5} = (M_{Pl}m^{4})^{1/5}, which arises for an operator of the form (∂^{2}π)^{3}. If present, such an interaction leads to an Ostrogradsky instability, which is another manifestation of the BD ghost as identified in [173].
Even if that very interaction is absent there is actually an infinite set of dangerous interactions of the form (∂^{2}π)^{ℓ} which arise at the scale Λ_{j=0,k =0;ℓ≥3}, with
with Λ_{j=0,k =0,ℓ→∞} = Λ_{3}.
Any interaction with j > 0 or k > 0 automatically leads to a larger scale, so all the interactions arising at a scale between Λ_{5} (inclusive) and Λ_{3} are of the form (∂^{2}π)^{ℓ} and carry an Ostrogradsky instability. For DGP we have already seen that there are no interactions at a scale below Λ_{3}. In what follows we show that the same remains true for the ghost-free theory of massive gravity proposed in (6.3). To see this let us identify the interactions with j = k = 0 and arbitrary power ℓ for (∂^{2}π).
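The ordering of scales described in this subsection can be checked by tracking exponents. The sketch below assumes the standard canonical normalization h/M_{Pl}, A/(m M_{Pl}) and π/(m^{2}M_{Pl}), so that a term m^{2}M_{Pl}^{2} (h/M_{Pl})^{j} (∂A/(m M_{Pl}))^{2k} (∂^{2}π/(m^{2}M_{Pl}))^{ℓ} is suppressed by the scale Λ_{j,k,ℓ} = (m^{2k+2ℓ−2} M_{Pl}^{j+2k+ℓ−2})^{1/(j+4k+3ℓ−4)}. This expression is reconstructed from that normalization rather than quoted, and the function name is hypothetical. Writing Λ = m^{a} M_{Pl}^{b} with a + b = 1, a larger b means a higher scale since m ≪ M_{Pl}.

```python
from fractions import Fraction


def scale_exponents(j, k, l):
    """Return (a, b) such that Lambda_{j,k,l} = m**a * M_Pl**b.

    j, k, l count the helicity-2 fields, pairs of helicity-1 fields and
    helicity-0 fields. Only irrelevant operators (positive denominator)
    are handled in this sketch.
    """
    if j + 2 * k + l <= 2:
        raise ValueError("need j + 2k + l > 2 for an interaction")
    denom = j + 4 * k + 3 * l - 4
    if denom <= 0:
        raise ValueError("operator is not irrelevant; no suppression scale")
    a = Fraction(2 * k + 2 * l - 2, denom)
    b = Fraction(j + 2 * k + l - 2, denom)
    return a, b


if __name__ == "__main__":
    print(scale_exponents(0, 0, 3))   # (d^2 pi)^3, the scale Lambda_5
    print(scale_exponents(1, 0, 2))   # h (d^2 pi)^2
    print(scale_exponents(0, 1, 1))   # (dA)^2 (d^2 pi)
```

In this parametrization Λ_{5} corresponds to (a, b) = (4/5, 1/5) and Λ_{3} to (2/3, 1/3). For j = k = 0 the exponent b = (ℓ − 2)/(3ℓ − 4) grows with ℓ and approaches 1/3 from below, while any operator with j + k ≥ 1 has b ≥ 1/3, i.e., it arises at Λ_{3} or above.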
10.3.2 Operators below the scale Λ_{3}
We now express the potential term \(M_{{\rm{Pl}}}^2{m^2}\sqrt{-g}\,{{\mathcal L}_n}[{\mathcal K}]\) introduced in (6.3) using the metric in terms of the helicity-0 mode, where we recall that the quantity \({\mathcal K}\) is defined in (6.7) as \({\mathcal K}_\nu^\mu [g,\tilde f] = \delta_{\,\,\,\nu}^\mu - (\sqrt {{g^{-1}}\tilde f})_\nu^\mu\), where \(\tilde f\) is the ‘Stückelbergized’ reference metric given in (2.78). Since we are interested in interactions without the helicity-2 and -1 modes (j = k = 0), it is sufficient to follow the behaviour of the helicity-0 mode and so we have
with again Π_{μν} = ∂_{μ}∂_{ν}π and \(\Pi_{\mu\nu}^2: = {\eta ^{\alpha \beta}}{\Pi _{\mu \alpha}}{\Pi _{\nu\beta}}\).
As a result, we infer that up to the scale Λ_{3} (excluded), the potential in (6.3) is
where as mentioned earlier we focus on the case without a cosmological constant and tadpole, i.e., α_{0} = α_{1} = 0. All of these interactions are total derivatives. So even though the ghost-free theory of massive gravity does in principle involve some interactions with higher derivatives of the form (∂^{2}π)^{ℓ}, it does so in a very precise way so that all of these terms combine into a total derivative and are thus harmless.^{Footnote 22}
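The simplest instance of this statement can be verified by hand. Taking the quadratic combination [Π]^{2} − [Π^{2}] (which is ℒ_{2}[Π] up to normalization), with Π_{μν} = ∂_{μ}∂_{ν}π and indices raised with the Minkowski metric, one finds

```latex
\begin{align}
[\Pi]^2 - [\Pi^2]
&= (\Box\pi)^2 - (\partial_\mu\partial_\nu\pi)(\partial^\mu\partial^\nu\pi) \\
&= \partial_\mu\!\left(\partial^\mu\pi\,\Box\pi
   - \partial_\nu\pi\,\partial^\mu\partial^\nu\pi\right),
\end{align}
```

since the two terms with three derivatives acting on a single π, \(\partial^\mu\pi\,\partial_\mu\Box\pi\) and \(\partial_\nu\pi\,\partial^\nu\Box\pi\), cancel when the divergence is expanded. The cubic and quartic combinations work in the same way thanks to their antisymmetric structure.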
As a result the potential term proposed in Part II (and derived from the deconstruction framework) is free of any interactions of the form (∂^{2}π)^{ℓ}. This means that the BD ghost as identified in the Stückelberg language in [173] is absent in this theory. However, at this level, the BD ghost could still reappear through different operators at the scale Λ_{3} or higher.
10.3.3 Λ_{3}decoupling limit
Since there are no operators all the way up to the scale Λ_{3} (excluded), we can take the decoupling limit by sending M_{Pl} → ∞, m → 0 and maintaining the scale Λ_{3} fixed.
The operators that arise at the scale Λ_{3} are the ones of the form (8.18) with either j = 1, k = 0 and arbitrary ℓ ≥ 2 or with j = 0, k = 1 and arbitrary ℓ ≥ 1. The second case leads to vector interactions of the form (∂A)^{2}(∂^{2}π)^{ℓ} and will be studied in the next Section 8.3.4. For now we focus on the first kind of interactions of the form h(∂^{2}π)^{ℓ},
with [144] (see also refs. [137] and [143])
Using the fact that
we obtain
where the tensors \(X_{\mu\nu}^{(n)}\) are constructed out of Π_{μν}, symbolically, X^{(n)} ∼ Π^{(n)} but in such a way that they are transverse and that their resulting equations of motion never involve more than two derivatives on each field,
where we have included X^{(0)} and X^{(n ≥ 4)} for completeness (these become relevant for instance in the context of bigravity). The generalization of these tensors to arbitrary dimensions is straightforward and in d spacetime dimensions there are d such tensors, symbolically X^{(n)} = εε Π^{n}δ^{d−n−1} for n = 0, ⋯, d − 1.
Since we are dealing with the decoupling limit with M_{Pl} → ∞ the metric is flat, \({g_{\mu\nu}} = {\eta _{\mu\nu}} + M_{{\rm{Pl}}}^{-1}{h_{\mu\nu}} \to {\eta _{\mu\nu}}\), and all indices are raised and lowered with respect to the Minkowski metric. These tensors can be written more explicitly as follows
Note that they also satisfy the recursive relation
with \(X_{\mu\nu}^{(0)} = 3!{\eta _{\mu\nu}}\).
10.3.3.1 Decoupling limit
From the expression of these tensors in terms of the fully antisymmetric Levi-Civita tensors, it is clear that the tensors are transverse and that the equations of motion of \({h^{\mu\nu}}{\bar X_{\mu\nu}}\) with respect to both h and π never involve more than two derivatives. This decoupling limit is thus free of the Ostrogradsky instability which is the way the BD ghost would manifest itself in this language. This decoupling limit is actually free of any ghost-like instability and the whole theory is free of the BD ghost even beyond the decoupling limit as we shall see in depth in Section 7.
Not only does the potential term proposed in (6.3) remove any potential interactions of the form (∂^{2}π)^{ℓ} which could have arisen at an energy between Λ_{5} = (M_{Pl}m^{4})^{1/5} and Λ_{3}, but it also ensures that the interactions that arise at the scale Λ_{3} are healthy.
As already mentioned, in the decoupling limit M_{Pl} → ∞ the metric reduces to Minkowski and the standard Einstein-Hilbert term simply reduces to its linearized version. As a result, neglecting the vectors for now, the full Λ_{3}-decoupling limit of ghost-free massive gravity is given by
with \(\tilde\alpha_1 = \alpha_2/4\), \(\tilde\alpha_2 = (2\alpha_2 + 3\alpha_3)/8\) and \(\tilde\alpha_3 = (\alpha_3 + 4\alpha_4)/8\), and the correct normalization should be α_{2} = 1.
10.3.3.2 Unmixing and Galileons
As was already the case at the linearized level for the Fierz-Pauli theory (see Eqs. (2.47) and (2.48)) the kinetic term for the helicity-0 mode appears mixed with the helicity-2 mode. It is thus convenient to diagonalize these two modes by performing the following shift,
where the nonlinear term has been included to unmix the coupling \({h^{\mu\nu}}X_{\mu\nu}^{(2)}\), leading to the following decoupling limit [137]
where we introduced the Galileon Lagrangians \({\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]\) as defined in Ref. [412]
where the Lagrangians ℒ_{n}[Q] = εεQ^{n}δ^{4−n} for a tensor \(Q_{\,\,\,\,\nu}^\mu\) are defined in (6.9)–(6.13), or more explicitly in (6.14)–(6.18), leading to the explicit form for the Galileon Lagrangians
and the coefficients c_{n} are given in terms of the α_{n} as follows,
Setting α_{2} = 1, we indeed recover the same normalization of −3/4(∂π)^{2} for the helicity-0 mode found in (2.48).
10.3.3.3 X^{(3)}coupling
In general, the last coupling \({\tilde h^{\mu\nu}}X_{\mu\nu}^{(3)}\) between the helicity-2 and helicity-0 modes cannot be removed by a local field redefinition. The nonlocal field redefinition
where \(G_{\mu\nu\alpha \beta}^{{\rm{massless}}}\) is the propagator for a massless spin-2 field as defined in (2.64), fully diagonalizes the helicity-0 and -2 modes at the price of introducing nonlocal interactions for π.
Note however that these nonlocal interactions do not hide any new degrees of freedom. Furthermore, about some specific backgrounds, the field redefinition is local. Indeed, focusing on static and spherically symmetric configurations, we consider π = π_{0}(r) and \({\tilde h_{\mu\nu}}\) given by
so that
The standard kinetic term for ψ sets ψ′(r) = ϕ(r)/r as in GR and the X^{(3)} coupling can be absorbed via the field redefinition \(\phi \to \bar \phi - 2({\alpha _3} + 4{\alpha _4})\,{\pi_0'}(r)^3\,\Lambda _3^{-6}/r\), leading to the following new sextic interactions for π,
Interestingly, this new order-6 term satisfies all the relations of a Galileon interaction but cannot be expressed covariantly in a local way. See [61] for more details on spherically symmetric configurations with the X^{(3)}-coupling.
10.3.4 Vector interactions in the Λ_{3}decoupling limit
As can be seen from the relation (8.19), the scale associated with interactions mixing two helicity-1 fields with an arbitrary number of fields π (j = 0, k = 1 and arbitrary ℓ) is also Λ_{3}. So at that scale, there are actually an infinite number of interactions when including the mixing between the helicity-1 and -0 modes (however, as mentioned previously, since the vector field always appears quadratically it is always consistent to set it to zero, as was done previously).
The full decoupling limit including these interactions has been derived in Ref. [419], (see also Ref. [238]) using the vielbein formulation of massive gravity as in (6.1) and we review the formalism and the results in what follows.
In addition to the Stückelberg fields associated with local covariance, in the vielbein formulation one also needs to introduce 6 additional Stückelberg fields ω_{ab} associated with local Lorentz invariance, ω_{ab} = −ω_{ba}. These are nondynamical since they never appear with derivatives, and can thus be treated as auxiliary fields which can be integrated out. It is however useful to keep them in the decoupling limit action, so as to retain a closed-form expression. In terms of the Lorentz Stückelberg fields, the full decoupling limit of massive gravity in four dimensions at the scale Λ_{3} is then (before diagonalization) [419]
(the superscript (0) indicates that this decoupling limit is taken with Minkowski as a reference metric), with F_{ab} = ∂_{a}A_{b} − ∂_{b}A_{a} and the coefficients β_{n} are related to the α_{n} as in (6.28).
The auxiliary Lorentz Stückelberg fields carry all the nonlinear mixing between the helicity-0 and -1 modes,
In some special cases these sets of interactions can be resummed exactly, as was first performed in [139], (see also Refs. [364, 456]).
This decoupling limit includes nonlinear combinations of the second-derivative tensor Π_{μν} and the first-derivative Maxwell tensor F_{μν}. Nevertheless, the structure of the interactions is gauge invariant for A_{μ}, and there are no higher derivatives on A in the equation of motion for A, so the equations of motion for both the helicity-1 and -2 modes are manifestly second order and propagate the correct degrees of freedom. The situation is more subtle for the helicity-0 mode. Taking the equation of motion for that field would lead to higher derivatives on π itself as well as on the helicity-1 field. Since this theory has been proven to be ghost-free by different means (see Section 7), it must be that the higher derivatives in that equation are nothing else but the derivative of the equation of motion for the helicity-1 mode, similarly to what happens in Section 7.2.
When working beyond the decoupling limit, even the equation of motion with respect to the helicity-1 mode is no longer manifestly well-behaved, but the Stückelberg fields are then no longer the correct representation of the physical degrees of freedom. As we shall see below, the proper number of degrees of freedom is nonetheless maintained when working beyond the decoupling limit.
10.3.5 Beyond the decoupling limit
10.3.5.1 Physical degrees of freedom
In Section 8.3, we have introduced four Stückelberg fields ϕ^{a} which transform as scalar fields under coordinate transformation, so that the action of massive gravity is invariant under coordinate transformations. Furthermore, the action is also invariant under global Lorentz transformations in the field space,
In the DL, taking M_{Pl} → ∞, all fields are living on flat spacetime, so in that limit, there is an additional global Lorentz symmetry acting this time on the spacetime,
The internal and spacetime Lorentz symmetries are independent, (the internal one is always present while the spacetime one is only there in the DL). In the DL we can identify both groups and work in the representation of the single group, so that the action is invariant under,
The Stückelberg fields ϕ^{a} then behave as Lorentz vectors under this identified group, and π defined previously behaves as a Lorentz scalar. The helicity-0 mode of the graviton also behaves as a scalar in this limit, and π captures the behavior of the graviton helicity-0 mode. So in the DL, the right requirement for the absence of the BD ghost is indeed the requirement that the equations of motion for π remain at most second order in (time) derivatives, as was pointed out in [173] (see also [111]). However, beyond the DL, the helicity-0 mode of the graviton does not behave as a scalar field and neither does the π in the split of the Stückelberg fields. So beyond the DL there is no reason to anticipate that π captures a whole degree of freedom, and indeed it does not. Beyond the DL, the equation of motion for π will typically involve higher derivatives, but the correct requirement for the absence of ghosts is different, as explained in Section 7.2. One should instead go back to the original four scalar Stückelberg fields ϕ^{a} and check that out of these four fields only three are dynamical. This has been shown to be the case in Section 7.2. These three degrees of freedom, together with the two standard graviton polarizations, then give the correct five degrees of freedom and circumvent the BD ghost.
Recently, much progress has been made in deriving the decoupling limit about arbitrary backgrounds, see Ref. [369].
10.3.6 Decoupling limit on (Anti) de Sitter
10.3.6.1 Linearized theory and Higuchi bound
Before deriving the decoupling limit of massive gravity on (Anti) de Sitter, we first need to analyze the linearized theory so as to infer the proper canonical normalization of the propagating dofs and the proper scaling in the decoupling limit, as was done for massive gravity with a flat reference metric. For simplicity we focus on (3 + 1) dimensions here, and when relevant give the result in arbitrary dimensions. Linearized massive gravity on (A)dS was first derived in [307, 308]. Since we are concerned with the decoupling limit of ghost-free massive gravity, we follow in this section the procedure presented in [154]. We also focus on the dS case first before commenting on the extension to AdS.
At the linearized level about dS, ghost-free massive gravity reduces to the Fierz-Pauli action with \(g_{\mu\nu} = \gamma_{\mu\nu} + \tilde h_{\mu\nu} = \gamma_{\mu\nu} + h_{\mu\nu}/M_{\rm Pl}\), where γ_{μν} is the dS metric with constant Hubble parameter H_{0},
where H_{μν} is the tensor fluctuation as introduced in (2.80), although now considered about the dS metric,
with π_{μν} = ∇_{μ} ∇_{ν}π, ∇ being the covariant derivative with respect to the dS metric γ_{μν} and indices are raised and lowered with respect to this same metric. Similarly, \({\hat \varepsilon _{{\rm{dS}}}}\) is now the Lichnerowicz operator on de Sitter,
So at the linearized level and neglecting the vector fields, the helicity-0 and -2 modes of massive gravity on dS behave as
After integration by parts, [Π^{2}] = [Π]^{2} − 3H^{2}(∂π)^{2}. The helicity-2 and -0 modes are thus diagonalized as in flat spacetime by setting \(h_{\mu\nu} = \bar h_{\mu\nu} + \pi\gamma_{\mu\nu}\),
The most important difference from linearized massive gravity on Minkowski is that the properly canonically normalized helicity0 mode is now instead
For a standard coupling of the form \(\frac{1}{M_{\rm Pl}}\pi T\), where T is the trace of the stress-energy tensor, as we would infer from the coupling \(\frac{1}{M_{\rm Pl}}h_{\mu\nu}T^{\mu\nu}\) after the shift \(h_{\mu\nu} = \bar h_{\mu\nu} + \pi\gamma_{\mu\nu}\), this means that the properly normalized helicity-0 mode couples as
and that coupling vanishes in the massless limit. This might suggest that in the massless limit m → 0, the helicity-0 mode decouples, which would imply the absence of the standard vDVZ discontinuity on (Anti) de Sitter [358, 430], unlike what was found on Minkowski, see Section 2.2.3, which confirms the Newtonian approximation presented in [186].
While this observation is correct on AdS, in the dS case one cannot take the massless limit without simultaneously sending H → 0 at least at the same rate. As a result, it would be incorrect to deduce that the helicity-0 mode decouples in the massless limit of massive gravity on dS.
To be more precise, the linearized action (8.62) is free from ghosts and tachyons only if m ≡ 0, which corresponds to GR, or if m^{2} > 2H^{2}, which corresponds to the well-known Higuchi bound [307, 190]. In d spacetime dimensions, the Higuchi bound is m^{2} > (d − 2)H^{2}. In other words, on dS there is a forbidden range for the graviton mass: a theory with 0 < m^{2} < 2H^{2} or with m^{2} < 0 always excites at least one ghost degree of freedom. Notice that this ghost (which we shall refer to as the Higuchi ghost from now on) is distinct from the BD ghost, which corresponded to an additional sixth degree of freedom. Here the theory propagates five dofs (in four dimensions) and is thus free from the BD ghost (at least at this level), but at least one of the five dofs is a ghost. When 0 < m^{2} < 2H^{2}, the ghost is the helicity-0 mode, while for m^{2} < 0, the ghost is the helicity-1 mode (at quadratic order the helicity-1 mode comes in as \(-\frac{m^2}{4}F_{\mu\nu}^2\)). Furthermore, when m^{2} < 0, both the helicity-2 and -0 modes are also tachyonic, although this is arguably not necessarily a severe problem, especially not if the graviton mass is of the order of the Hubble parameter today, as it would take an amount of time comparable to the age of the Universe to see the effect of this tachyonic behavior. Finally, the case m^{2} = 2H^{2} (or m^{2} = (d − 2)H^{2} in d spacetime dimensions) represents the partially massless case where the helicity-0 mode disappears. As we shall see in Section 9.3, this is nothing other than a linear artefact and nonlinearly the helicity-0 mode always reappears, so the PM case is infinitely strongly coupled and always pathological.
A summary of the different bounds is provided below as well as in Figure 4:

m^{2} < 0: Helicity-1 modes are ghosts, helicity-2 and -0 modes are tachyonic, sick theory,

m^{2} = 0: General Relativity: two healthy (helicity-2) degrees of freedom, healthy theory,

0 < m^{2} < 2H^{2}: One “Higuchi ghost” (helicity-0 mode) and four healthy degrees of freedom (helicity-2 and -1 modes), sick theory,

m^{2} = 2H^{2}: Partially Massless Gravity: Four healthy degrees of freedom (helicity-2 and -1 modes), and one infinitely strongly coupled dof (helicity-0 mode), sick theory,

m^{2} > 2H^{2}: Massive Gravity on dS: Five healthy degrees of freedom, healthy theory.
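The summary above can be encoded as a small lookup, which may be convenient when scanning parameter choices. The following is only a sketch: the function name and return strings are hypothetical, it covers the dS case (H^{2} > 0) only, and applying the four-dimensional classification to general d through the bound m^{2} > (d − 2)H^{2} is an assumption based on the statements above. Exact comparison is used, so integer or rational inputs are assumed.

```python
def classify_ds_graviton_mass(m2, H2, d=4):
    """Classify linearized massive gravity on dS by m^2 versus the Higuchi bound.

    m2, H2: graviton mass squared and Hubble parameter squared (same units).
    d: number of spacetime dimensions; the bound used is m^2 > (d - 2) H^2.
    Hypothetical helper; exact comparisons assume integer or rational inputs.
    """
    if H2 <= 0:
        raise ValueError("this sketch covers de Sitter only (H^2 > 0)")
    bound = (d - 2) * H2
    if m2 < 0:
        return "sick: helicity-1 ghost, tachyonic helicity-2 and helicity-0"
    if m2 == 0:
        return "healthy: general relativity"
    if m2 < bound:
        return "sick: Higuchi ghost (helicity-0)"
    if m2 == bound:
        return "sick: partially massless, helicity-0 infinitely strongly coupled"
    return "healthy: massive gravity on dS"


# Scan a few values of m^2 in units where H^2 = 1 (four dimensions).
for m2 in (-1, 0, 1, 2, 3):
    print(m2, classify_ds_graviton_mass(m2, 1))
```

With H^{2} = 1 the forbidden window is 0 < m^{2} < 2, so only m^{2} = 0 and m^{2} = 3 are reported as healthy in this scan.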
10.3.6.2 Massless and decoupling limit

As one can see from Figure 4, in the case where H^{2} < 0 (corresponding to massive gravity on AdS), one can take the massless limit m → 0 while keeping the AdS length scale fixed in that limit. In that limit, the helicity-0 mode decouples from external matter sources and there is no vDVZ discontinuity. Notice however that the helicity-0 mode is nevertheless still strongly coupled at a low energy scale.
When considering the decoupling limit m → 0, M_{Pl} → ∞ of massive gravity on AdS, we have a choice in how we treat the scale H in that limit. Keeping the AdS length scale fixed in that limit could lead to an interesting phenomenology in its own right, but is yet to be explored in depth.

In the dS case, the Higuchi forbidden region prevents us from taking the massless limit while keeping the scale H fixed. As a result, the massless limit is only consistent if H → 0 simultaneously with m → 0, and we thus recover the vDVZ discontinuity at the linear level in that limit.
When considering the decoupling limit m → 0, M_{Pl} → ∞ of massive gravity on dS, we also have to send H → 0. If H/m → 0 in that limit, we then recover the same decoupling limit as for massive gravity on Minkowski, and all the results of Section 8.3 apply. The case of interest is thus when the ratio H/m remains fixed in the decoupling limit.
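The scaling just described can be made concrete numerically. The sketch below is illustrative only: it assumes the scale Λ_{3} = (M_{Pl}m^{2})^{1/3} is the one held fixed, uses arbitrary units, and the function name, the default ratio H/m = 0.5 and the sample masses are hypothetical choices (the ratio is picked so that m^{2} > 2H^{2} holds along the sequence).

```python
def decoupling_sequence(lambda3=1.0, ratio=0.5, masses=(1e-1, 1e-2, 1e-3)):
    """Return (m, M_Pl, H, Lambda_3) tuples along a decoupling-limit sequence.

    Hypothetical helper in arbitrary units:
    M_Pl is fixed by Lambda_3^3 = M_Pl * m^2, and H by the fixed ratio H/m.
    """
    rows = []
    for m in masses:
        m_pl = lambda3**3 / m**2            # M_Pl grows without bound as m -> 0
        hubble = ratio * m                  # H -> 0 at the same rate as m
        lam = (m_pl * m**2) ** (1.0 / 3.0)  # recomputed scale, stays at lambda3
        rows.append((m, m_pl, hubble, lam))
    return rows


for m, m_pl, hubble, lam in decoupling_sequence():
    print(f"m={m:.0e}  M_Pl={m_pl:.3e}  H={hubble:.1e}  Lambda_3={lam:.6f}")
```

Along the sequence m and H shrink together, M_{Pl} grows, and the recomputed Λ_{3} stays at its input value up to rounding.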
10.3.6.3 Decoupling limit
When taking the decoupling limit of massive gravity on dS, there are two additional contributions to take into account:

First, as mentioned in Section 8.3.5, care is needed to properly identify the helicity-0 mode on a curved background. In the case of (A)dS, the formalism was provided in Ref. [154] by embedding a d-dimensional de Sitter spacetime into a flat (d + 1)-dimensional spacetime where the standard Stückelberg trick could be applied. As a result, the ‘covariant’ fluctuation defined in (2.80) and used in (8.59) needs to be generalized to (see Ref. [154] for details)
$$\begin{aligned} \frac{1}{M_{\rm Pl}} H_{\mu\nu} ={}& \frac{1}{M_{\rm Pl}} h_{\mu\nu} + \frac{2}{\Lambda_3^3}\Pi_{\mu\nu} - \frac{1}{\Lambda_3^6}\Pi^2_{\mu\nu} \\ & + \frac{1}{\Lambda_3^3}\frac{H^2}{m^2}\left( (\partial\pi)^2\left(\gamma_{\mu\nu} - \frac{2}{\Lambda_3^3}\Pi_{\mu\nu}\right) - \frac{1}{\Lambda_3^6}\Pi_{\mu\alpha}\Pi_{\nu\beta}\,\partial^\alpha\pi\,\partial^\beta\pi \right) \\ & + H^2\frac{H^2}{m^2}\frac{(\partial\pi)^4}{\Lambda_3^3} + \cdots\,. \end{aligned}$$ (8.65)
Any corrections in the third line vanish in the decoupling limit and can thus be ignored, but the corrections of order H^{2} in the second line lead to new non-trivial contributions.

Second, as already encountered at the linearized level, what were total derivatives in Minkowski (for instance the combination [Π^{2}] − [Π]^{2}), now lead to new contributions on de Sitter. After integration by parts, m^{−2}([Π^{2}] − [Π]^{2}) = m^{−2} = 12H^{2}/m^{2}(∂π)^{2}. This was the origin of the new kinetic structure for massive gravity on de Sitter and will have further effects in the decoupling limit when considering similar contributions from ℒ_{3,4}(Π), where ℒ_{3,4} are defined in (6.12, 6.13) or more explicitly in (6.17, 6.18).
Taking these two effects into account, we obtain the full decoupling limit for massive gravity on de Sitter,
where \({\mathcal L}_{{\Lambda _3}}^{(0)}\) is the full Lagrangian obtained in the decoupling limit in Minkowski and given in (8.52), and \({\mathcal L}_{{\rm{(Gal)}}}^{(n)}\) are the Galileon Lagrangians as encountered previously. Notice that while the ratio H/m remains fixed, this decoupling limit is taken with H, m → 0, so all the fields in (8.66) live on a Minkowski metric. The constant coefficients λ_{n} depend on the free parameters of the ghost-free theory of massive gravity; for the theory (6.3) with α_{1} = 0 and α_{2} = 1, we have
At this point we may perform the same field redefinition (8.39) as in flat space and obtain the following semi-diagonalized decoupling limit,
where the contributions from the helicity-1 modes are the same as the ones provided in (8.52), and the new coefficients \(\tilde c_n = -c_n/4 + (H^2/m^2)\lambda_n\) cancel identically for m^{2} = 2H^{2}, α_{3} = −1 and α_{4} = −α_{3}/4 = 1/4, as pointed out in [154]; the same result holds for bigravity, as pointed out in [301]. Interestingly, for these specific parameters, the helicity-0 mode loses its kinetic term, any self-mixing, and any mixing with the helicity-2 mode. Nevertheless, the mixing between the helicity-1 and -0 modes presented in (8.52) is still alive. There is no choice of parameters that would remove the mixing with the helicity-1 mode and, as a result, the helicity-0 mode generically reappears through that mixing. The loss of its kinetic term implies that the field is infinitely strongly coupled on a configuration with zero vev for the helicity-1 mode, and the theory is thus ill-defined. This was confirmed in various independent studies, see Refs. [185, 147].
10.4 Λ_{3}-decoupling limit of bigravity
We now proceed to derive the Λ_{3}-decoupling limit of bigravity, and we will see how to recover the decoupling limit about any reference metric (including Minkowski and de Sitter) as special cases. As already seen in Section 8.3.4, the full DL is better formulated in the vielbein language, even though in that case Stückelberg fields ought to be introduced for both the broken diffs and the broken local Lorentz symmetry. Yet this is a small price to pay to keep the action in a much simpler form. We thus proceed in the rest of this section by deriving the Λ_{3}-decoupling limit of bigravity, starting from its vielbein formulation. We follow the derivation and formulation presented in [224]. As previously, we focus on (3 + 1) spacetime dimensions, although the whole formalism is trivially generalizable to arbitrary dimensions.
We start with the action (5.43) for bigravity, with the interaction
where the relation between the α’s and the β’s is given in (6.28).
We now introduce Stückelberg fields ϕ^{a} = x^{a} − χ^{a} for diffs and \(\Lambda _b^a\) for the local Lorentz symmetry. In the case of massive gravity, there was no ambiguity in how to perform this ‘Stückelbergization’, but in the case of bigravity, one can ‘Stückelbergize’ either the metric f_{μν} or the metric g_{μν}. In other words, the broken diffs and local Lorentz symmetries can be restored by performing either one of the two replacements in (8.69),
or alternatively
For now we stick to the first choice (8.71) but keep in mind that this freedom has deep consequences for the theory, and is at the origin of the duality presented in Section 10.7.
Since we are interested in the decoupling limit, we now perform the following splits, (see Ref. [419] for more details),
and perform the scaling or decoupling limit,
while keeping
Before performing any change of variables (any diagonalization), in addition to the quadratic kinetic terms for h, υ and A, there are three contributions to the decoupling limit of bigravity:

❶ Mixing of the helicity-0 mode with the helicity-1 mode A_{μ}, as derived in (8.52),

❷ Mixing of the helicity-0 mode with the helicity-2 mode \(h_\mu ^a\), as derived in (8.40),

❸ Mixing of the helicity-0 mode with the new helicity-2 mode \(\upsilon _\mu ^a\),
noticing that before field redefinitions, the helicity-0 mode does not self-interact (its self-interactions are constructed so as to be total derivatives).
As already explained in Section 8.3.6, the first contribution ❶ arising from the mixing between the helicity-0 and -1 modes is the same (in the decoupling limit) as what was obtained in Minkowski (and is independent of the coefficients β_{n} or α_{n}). This implies that it can be read off directly from the last three lines of (8.52). These contributions are the most complicated parts of the decoupling limit but remain unaffected by the dynamics of υ, i.e., unaffected by the bigravity nature of the theory. This statement simply follows from scaling considerations. In the decoupling limit there cannot be any mixing between the helicity-1 mode and either of the two helicity-2 modes. As a result, the helicity-1 modes only mix with themselves and the helicity-0 mode. Hence, in the scaling limit (8.74, 8.75) the helicity-1 mode decouples from the massless spin-2 field.
Furthermore, the first line of (8.52), which corresponds to the dynamics of \(h_\mu ^a\) and the helicity-0 mode, is also unaffected by the bigravity nature of the theory. Hence, the second contribution ❷ is also the same as previously derived. As a result, the only new ingredient in bigravity is the mixing ❸ between the helicity-0 mode and the second helicity-2 mode \(\upsilon _\mu ^a\), given by a mixing of the form h^{μν}X_{μν}.
Unsurprisingly, these new contributions have the same form as ❷, with three distinctions: First, the way the coefficients enter the expressions is modified ever so slightly (β_{1} → β_{1}/3 and β_{3} → 3β_{3}). Second, in the mass term the spacetime index for υ ought to be dressed with the Stückelberg field,
Finally, and most importantly, the helicity-2 field \(\upsilon _a^\mu\) (which enters in the mass term) is now a function of the ‘Stückelbergized’ coordinates ϕ^{a}, which in the decoupling limit means that for the mass term
These two effects do not need to be taken into account for the υ that enters in its standard curvature term as it is Lorentz and diff invariant.
Taking these three considerations into account, one obtains the decoupling limit for bigravity,
with \(\tilde\beta_n = \beta_n/(4 - n)!(n - 1)!\). Modulo the non-trivial dependence on the coordinate \(\tilde x = x + \partial \pi/\Lambda _3^3\), this is a remarkably simple decoupling limit for bigravity. Out of this decoupling limit we can re-derive all the DLs found previously very elegantly.
Notice as well the presence of a tadpole for υ if β_{1} ≠ 0. When this tadpole vanishes (as well as the one for h), one can further take the limit M_{f} → ∞ keeping all the other β’s fixed as well as Λ_{3}, and recover straight away the decoupling limit of massive gravity on Minkowski found in (8.52), with a free and fully decoupled massless spin2 field.
In the presence of a cosmological constant for both metrics (and thus a tadpole in this framework), we can also take the limit M_{f} → ∞ and recover straight away the decoupling limit of massive gravity on (A)dS, as obtained in (8.66).
This illustrates the strength of this generic decoupling limit for bigravity (8.78). In principle we could even go further and derive the decoupling limit of massive gravity on an arbitrary reference metric as performed in [224]. To obtain a general reference metric we first need to add an external source for υ_{μν} that generates a background \(\bar V_{\mu\nu} = (M_f/M_{\rm Pl})\,\bar U_{\mu\nu}\). The reference metric is thus expressed in the local inertial frame as
The fact that the metric looks like a perturbation away from Minkowski is related to the fact that the curvature needs to scale as m^{2} in the decoupling limit in order to avoid the issues previously mentioned in the discussion of Section 8.2.3.
We can then perform the scaling limit M_{f} → ∞, while keeping the β’s and the scale Λ_{3} = (M_{Pl}m^{2})^{1/3} fixed as well as the field υ_{μν} and the fixed tensor \(\bar U_{\mu\nu}\). The decoupling limit is then simply given by
where the helicity-2 field υ fully decouples from the rest of the massive gravity sector on the first line, which carries the other helicity-2 field as well as the helicity-1 and -0 modes. Notice that the general metric \(\bar U\) only affects the helicity-0 self-interactions, through the second term on the first line of (8.81) (just as observed for the decoupling limit on AdS). These new interactions are ghost-free and look like Galileons for conformally flat \(\bar U_{\mu\nu} = \lambda\,\eta_{\mu\nu}\), with λ constant, but not in general. In particular, the interactions found in (8.81) would not be the covariant Galileons found in [166, 161, 157] (nor the ones found in [237]) for a generic metric.
11 Extensions of Ghost-free Massive Gravity
Massive gravity can be seen as a theory of a spin-2 field with the following free parameters in addition to the standard parameters of GR (e.g., the cosmological constant, etc.),

Reference metric f_{ab},

Graviton mass m,

(d − 2) dimensionless parameters α_{n} (or the β’s).
As natural extensions of massive gravity one can make any of these parameters dynamical. As already seen, the reference metric can be made dynamical, leading to bigravity, which in addition to the massive spin-2 field carries a massless one as well.
Another natural extension is to promote the graviton mass m, or any of the free parameters α_{n} (or β_{n}), to a function of a new dynamical variable, say of an additional scalar field. In principle the mass and the parameters α_{n} can be thought of as potentials for an arbitrary number of scalar fields, m = m (ψ_{j}), α_{n} = α_{n} (ψ_{j}), and not necessarily the same fields for each one of them [320]. So long as these functions are pure potentials and hide no kinetic terms for any new degree of freedom, the constraint analysis performed in Section 7 will go relatively unaffected, and the theory remains free from the BD ghost. This was shown explicitly for the mass-varying theory [319, 315] (where the mass is promoted to a scalar function of a new single scalar field, m = m (ϕ), while the parameters α remain constant^{Footnote 23}), as well as for a general massive scalar-tensor theory [320], and for quasi-dilaton models, which allow for different couplings between the spin-2 and the scalar field, motivated by scale invariance. We review these models below in Sections 9.1 and 9.2.
Alternatively, rather than considering the parameters m and α_{n} as arbitrary, one may set them to special values of particular interest depending on the reference metric f_{μν}. Rather than ‘extensions’ per se, these are special cases in the parameter space. The first obvious one is m = 0 (for arbitrary reference metric and parameters), for which one recovers the theory of GR (so long as the spin-2 field couples to matter in a covariant way to start with). Alternatively, one may also sit on the Higuchi bound (see Section 8.3.6) with the parameters m^{2} = 2H^{2}, α_{3} = −1/3 and α_{4} = 1/12 in four dimensions. This corresponds to the partially massless theory of gravity, which at the moment is pathological in its simplest realization and will be reviewed below in Section 9.3.
The coupling of massive gravity to a DBI Galileon [157] was considered in [237, 461, 261], leading to a generalized Galileon theory which maintains a Galileon symmetry on curved backgrounds. This theory was shown to be free of any Ostrogradsky ghost in [19], the cosmology was recently studied in [315], and perturbations in [20].
Finally, as other extensions to massive gravity, one can also consider all the extensions applicable to GR. This includes the higher-order Lovelock invariants in dimensions greater than four, as well as promoting the Einstein-Hilbert kinetic term to a function f (R), which is equivalent to gravity with a scalar field. In the case of massive gravity this has been performed in [89] (see also [46, 354]), where the absence of BD ghost was proven via a constraint analysis, and the cosmology was explored (this was also discussed in Section 5.6 and see also Section 12.5). f (R) extensions to bigravity were also derived in [416, 415].
Trace-anomaly driven inflation in bigravity was also explored in Ref. [47]. Massless quantum effects can be taken into account by including the trace anomaly \({{\mathcal T}_A}\) given as [203]
where c_{1,2,3} are three constants depending on the field content (for instance the number of scalars, spinors, vectors, gravitons, etc.). When this trace anomaly is included in bigravity, de Sitter-like solutions were found which could represent a good model for anomaly-driven inflation.
11.1 Mass-varying
The idea behind mass-varying gravity is to promote the graviton mass to a potential for an external scalar field ψ, m → m (ψ), which has its own dynamics [319], so that in four dimensions, the dRGT action for massive gravity gets promoted to
and the tensors \({\mathcal K}\) are given in (6.7). This could also be performed for bigravity, where we would simply include the Einstein-Hilbert term for the metric f_{μν}. This formulation was then promoted not only to varying parameters α_{n} → α_{n} (ψ) but also to multiple fields ψ_{A}, with \(A = 1, \cdots, {\mathcal N}\) in [320],
The absence of the BD ghost in these theories was established in [319] and [320] in unitary gauge, in the ADM language, by means of a constraint analysis as formulated in Section 7.1. We recall that in the absence of the scalar field ψ, the primary second-class (Hamiltonian) constraint is given by
In the case of a mass-varying theory of gravity, the entire argument remains the same, with the simple addition of the scalar field contribution,
where p_{ψ} is the conjugate momentum associated with the scalar field ψ and
Then the time-evolution of this primary constraint leads to a secondary constraint, similarly to Section 7.1. The expression for this secondary constraint is the same as in (7.33) with a benign new contribution from the scalar field [319]
Then, as in the normal fixed-mass case, the tertiary constraint is a constraint for the lapse and the system of constraints truncates, leading to 5 + 1 physical degrees of freedom in four dimensions. The same logic goes through for generalized massive gravity as explained in [320].
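The counting quoted above (five graviton polarizations plus one scalar in four dimensions) can be cross-checked with a few lines of arithmetic. The general-d formulas below are the standard polarization counts for massless and massive spin-2 fields; they are not stated in this section and are included as an assumption for illustration, and the function names are hypothetical.

```python
def massless_graviton_dof(d):
    """Polarizations of a massless spin-2 field in d spacetime dimensions."""
    return d * (d - 3) // 2


def massive_graviton_dof(d):
    """Polarizations of a massive spin-2 field in d spacetime dimensions."""
    return (d + 1) * (d - 2) // 2


def mass_varying_dof(d, n_scalars=1):
    """Massive graviton plus the extra scalar field(s) of a mass-varying theory."""
    return massive_graviton_dof(d) + n_scalars


print(massless_graviton_dof(4))  # 2: the two helicity-2 modes of GR
print(massive_graviton_dof(4))   # 5: helicity-2 (2), helicity-1 (2), helicity-0 (1)
print(mass_varying_dof(4))       # 6: the "5 + 1" quoted in the text
```

A BD ghost, when present, would add one further degree of freedom on top of the massive count.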
One of the important aspects of a mass-varying theory of massive gravity is that it allows more flexibility for the graviton mass. In the past the mass could have been much larger and could have led to potentially interesting features, be it for inflation (see for instance Refs. [315, 378] and [282]), the Hartle-Hawking no-boundary proposal [498, 439, 499], or to avoid the Higuchi bound [307], and yet be compatible with current bounds on the graviton mass. If the graviton mass is an effective description from higher dimensions it is also quite natural to imagine that the graviton mass would depend on some moduli.
11.2 Quasi-dilaton
The Planck scale M_{Pl}, or Newton's constant, explicitly breaks scale invariance, but one can easily extend the theory of GR to a scale-invariant one, M_{Pl} → M_{Pl}e^{λ(x)}, by including a dilaton scalar field λ, which naturally arises from string theory or from extra-dimension compactification (see for instance [122], and see Refs. [429, 120, 248] for the role of a dilaton scalar field in cosmology).
When dealing with multi-gravity, one can extend the notion of conformal transformation to the global rescaling of the coordinate system of one metric with respect to that of another metric. In the case of massive gravity this amounts to considering the global rescaling of the reference coordinates with respect to the physical ones. As already seen, the reference metric can be promoted to a tensor with respect to transformations of the physical metric coordinates, by introducing four Stückelberg fields ϕ^{a}, f_{μν} → f_{ab} ∂_{μ} ϕ^{a}∂_{ν}ϕ^{b}. Thus the theory can be made invariant under global rescaling of the reference metric if the reference metric is promoted to a function of the quasi-dilaton scalar field σ,
This is the idea behind the quasi-dilaton theory of massive gravity proposed in Ref. [119]. The theoretical consistency of this model was explored in [119] and is reviewed below. The Vainshtein mechanism and the cosmology were also explored in [119, 118] as well as in Refs. [288, 243, 127] and we review the cosmology in Section 12.5. As we shall see in that section, one of the interests of quasi-dilaton massive gravity is the existence of spatially flat FLRW solutions, and particularly of self-accelerating solutions. Nevertheless, such solutions have been shown to be strongly coupled within the region of interest [118], but an extension of that model was proposed in [127] and shown to be free from such issues.
Recently, the decoupling limit of the original quasi-dilaton model was derived in [239]. Interestingly, a new self-accelerating solution was found in this model which admits no instability and all the modes are (sub)luminal for a given realistic set of parameters. The extension of this solution to the full theory (beyond the decoupling limit) should provide for a consistent self-accelerating solution which is guaranteed to be stable (or with a harmless instability time scale of the order of the age of the Universe at least).
11.2.1 Theory
As already mentioned, the idea behind quasi-dilaton massive gravity (QMG) is to extend massive gravity to a theory which admits a new global symmetry. This is possible via the introduction of a quasi-dilaton scalar field σ (x). The action for QMG is thus given by
where ψ represents the matter fields, g is the dynamical metric, unless specified otherwise all indices are raised and lowered with respect to g, and R represents the scalar curvature with respect to g. The Lagrangians ℒ_{n} were expressed in (6.9–6.13) or (6.14–6.18) and the tensor \(\tilde {\mathcal K}\) is given in terms of the Stückelberg fields as
In the case of the QMG presented in [119], there is no cosmological constant nor tadpole (α_{0} = α_{1} = 0) and α_{2} = 1. This is a very special case of the generalized theory of massive gravity presented in [320], and the proof for the absence of BD ghost thus goes through in the same way. Here again the presence of the scalar field brings only minor modifications to the Hamiltonian analysis in the ADM language as presented in Section 9.1, and so we do not reproduce the proof here. We simply note that the theory propagates six degrees of freedom in four dimensions and is manifestly free of any ghost on flat spacetime provided that ω > 1/6. The key ingredient compared to mass-varying gravity or generalized massive gravity is the presence of a global rescaling symmetry which is both a spacetime and internal transformation [119],
Notice that the matter action \({\rm d}^4x\sqrt{-g}\,{\mathcal L}(g,\psi)\) breaks this symmetry, which is why σ is called a ‘quasi-dilaton’.
An interesting feature of QMG is the fact that the decoupling limit leads to a bi-Galileon theory, one Galileon being the helicity-0 mode presented in Section 8.3, and the other Galileon being the quasi-dilaton σ. Just as in massive gravity, there are no irrelevant operators arising at energy scale below Λ_{3}, and at that scale the theory is given by
where the decoupling limit Lagrangian \({\mathcal L}_{{\Lambda _3}}^{(0)}\) in the absence of the quasi-dilaton is given in (8.52), and we recall that \(\alpha_2 = 1,\ \alpha_1 = 0,\ \Pi^\mu_{\ \nu} = \partial^\mu\partial_\nu\pi\), and the Lagrangians ℒ_{n} are expressed in (6.10)–(6.13) or (6.15)–(6.18). We see a bi-Galileon theory emerging for π and σ, and thus the decoupling limit is manifestly ghost-free. We could then apply a similar argument as in Section 7.2.4 to infer the absence of the BD ghost for the full theory based on this decoupling limit. Up to integration by parts, the Lagrangian (9.13) is invariant under both independent Galilean transformations π → π + c + υ_{μ}x^{μ} and \(\sigma \to \sigma + \tilde c + \tilde\upsilon_\mu x^\mu\).
One relevant feature of this decoupling limit is that it makes the study of the Vainshtein mechanism more explicit. As we shall see in what follows (see Section 10.1), the Galileon interactions are crucial for the Vainshtein mechanism to work.
Note that in (9.13), the interactions with the quasi-dilaton come in the combination ((4 − n)α_{n} − (n + 1)α_{n+1}), while in \({\mathcal L}_{{\Lambda _3}}^{(0)}\), the interactions between the helicity-0 and -2 modes come in the combination ((4 − n)α_{n} + (n + 1)α_{n+1}). This implies that in massive gravity, the interactions between the helicity-2 and -0 modes disappear in the special case where α_{n} = −(n + 1)/(4 − n)α_{n+1} (this corresponds to the minimal model), and the Vainshtein mechanism is no longer active for spherically symmetric sources (see Refs. [99, 56, 58, 57, 435]). In the case of QMG, the interactions with the quasi-dilaton survive in that specific case α_{3} = −4α_{4}, and a Vainshtein mechanism could still be feasible, although one might still need to consider non-asymptotically Minkowski configurations.
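The difference between the two combinations can be checked with exact rational arithmetic. In this sketch only the two combinations and the relation α_{3} = −4α_{4} are taken from the text; the function names and the sample value α_{4} = 1/8 are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction


def helicity2_mixing(n, alpha):
    """Combination (4 - n) alpha_n + (n + 1) alpha_{n+1} quoted in the text."""
    return (4 - n) * alpha[n] + (n + 1) * alpha[n + 1]


def quasidilaton_mixing(n, alpha):
    """Combination (4 - n) alpha_n - (n + 1) alpha_{n+1} quoted in the text."""
    return (4 - n) * alpha[n] - (n + 1) * alpha[n + 1]


# Hypothetical sample value; only the relation alpha_3 = -4 alpha_4 is from the text.
alpha4 = Fraction(1, 8)
alpha = {3: -4 * alpha4, 4: alpha4}

print(helicity2_mixing(3, alpha))     # 0: helicity-2/helicity-0 mixing drops out
print(quasidilaton_mixing(3, alpha))  # -1: quasi-dilaton coupling survives
```

For n = 3 the quasi-dilaton combination reduces to 2α_{3} on this branch, so it vanishes only if α_{3} itself does.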
The cosmology of QMG was first discussed in [119], where the existence of self-accelerating solutions was pointed out. This will be reviewed in the section on cosmology, see Section 12.5. We now turn to the extended version of QMG recently proposed in Ref. [127].
11.2.2 Extended quasi-dilaton
Keeping the same philosophy as the quasi-dilaton in mind, a simple yet powerful extension was proposed in Ref. [127] and then further extended in [126], leading to interesting phenomenology and stable self-accelerating solutions. The phenomenology of this model was then further explored in [45]. The stability of the extended quasi-dilaton theory of massive gravity was explored in [353] and was proven to be ghost-free in [406].
The key ingredient behind the extended quasi-dilaton theory of massive gravity (EMG) is to notice that the two most important properties of QMG, namely the absence of the BD ghost and the existence of a global scaling symmetry, are preserved if the covariantized reference metric is further generalized to include a disformal contribution of the form ∂_{μ}σ ∂_{ν}σ (such a contribution to the reference metric can arise naturally from the brane-bending mode in higher-dimensional braneworld models, see for instance [157]).
The action for EMG then takes the same form as in (9.10) with the tensor \(\tilde {\mathcal K}\) promoted to
with the tensor defined as
where α_{σ} is a new dimensionless coupling constant (as mentioned in [127], this coupling constant is expected to enjoy a non-renormalization theorem in the decoupling limit, and thus to receive quantum corrections which are always suppressed by at least \(m^2/\Lambda_3^2\)). Furthermore, this action can be generalized further by

Considering different coupling constants for the \(\tilde {\mathcal K}\)’s entering \({{\mathcal L}_2}[\tilde {\mathcal K}]\), \({{\mathcal L}_3}[\tilde {\mathcal K}]\) and \({{\mathcal L}_4}[\tilde {\mathcal K}]\).

Introducing what would be a cosmological constant for the metric \(\bar f\), namely a new term of the form \(\sqrt {- \bar f} {e^{4\sigma/{M_{{\rm{Pl}}}}}}\).

Including general shift-symmetric Horndeski Lagrangians for the quasidilaton.
With these further generalizations, one can obtain self-accelerating solutions similar to those of the original QMG. For these self-accelerating solutions, the coupling constant α_{σ} does not enter the background equations of motion but plays a crucial role in the stability of the scalar perturbations on top of these solutions. This is one of the benefits of the extended quasidilaton theory of massive gravity.
11.3 Partially massless
11.3.1 Motivations behind PM gravity
The multiple proofs of the absence of the BD ghost presented in Section 7 ensure that the ghost-free theory of massive gravity (or dRGT) does not propagate more than five physical degrees of freedom in the graviton. For a generic finite mass m the theory propagates exactly five degrees of freedom, as can be shown from a linear analysis about a generic background. Yet one can ask whether there exist special points in parameter space where some of the degrees of freedom decouple. General relativity, for which m = 0 (and the other parameters α_{n} are finite), is one such example. In the massless limit of massive gravity the two helicity-1 modes and the helicity-0 mode decouple from the helicity-2 mode, and we thus recover the theory of a massless spin-2 field corresponding to GR, together with three decoupled degrees of freedom. The decoupling of the helicity-0 mode occurs via the Vainshtein mechanism^{Footnote 24} as we shall see in Section 10.1.
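The counting quoted above can be made concrete with a short sketch. It uses the standard counting of polarizations for massless and massive spin-2 fields in d spacetime dimensions; the function names are hypothetical and chosen only for illustration.

```python
def massless_spin2_dof(d: int) -> int:
    """Polarizations of a massless spin-2 field in d spacetime dimensions: d(d-3)/2."""
    return d * (d - 3) // 2


def massive_spin2_dof(d: int) -> int:
    """Polarizations of a massive spin-2 field in d spacetime dimensions: (d+1)(d-2)/2."""
    return (d + 1) * (d - 2) // 2


def helicity_split(d: int) -> dict:
    """Split the massive multiplet into helicity-2, helicity-1 and helicity-0 pieces.

    In the massless limit the helicity-2 part is the massless graviton,
    the helicity-1 part is a massless vector (d-2 polarizations),
    and the helicity-0 part is a single scalar.
    """
    return {
        "helicity2": massless_spin2_dof(d),
        "helicity1": d - 2,
        "helicity0": 1,
    }


if __name__ == "__main__":
    for d in (4, 5, 6):
        split = helicity_split(d)
        # The three pieces must add up to the full massive multiplet.
        assert sum(split.values()) == massive_spin2_dof(d)
        print(d, massive_spin2_dof(d), split)
```

In four dimensions this gives 5 = 2 + 2 + 1, so removing the helicity-0 mode alone, as a partially massless theory would require, would leave four degrees of freedom.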
As seen in Section 8.3.6, when considering massive gravity with de Sitter as a reference metric, if the graviton mass is precisely m^{2} = 2H^{2}, the helicity-0 mode disappears at the linear level, as can be seen from the linearized Lagrangian (8.62). The same occurs in any dimension when the graviton mass is tied to the de Sitter curvature by the relation m^{2} = (d − 2)H^{2}. This special case is another point in parameter space where the helicity-0 mode could be decoupled, corresponding to a partially massless (PM) theory of gravity as first pointed out by Deser and Waldron [190, 189, 188] (see also [500] for partially massless higher spin, and [450] for related studies).
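As a brief worked check of the special mass value, the relation can be rewritten in terms of the cosmological constant of the de Sitter background. The sketch assumes the convention \(G_{\mu\nu} + \Lambda \gamma_{\mu\nu} = 0\) for a d-dimensional de Sitter metric \(\gamma_{\mu\nu}\) with Hubble rate H; other normalizations of Λ would rescale the result.

```latex
% d-dimensional de Sitter background (assumed conventions):
R_{\mu\nu} = (d-1)\, H^2\, \gamma_{\mu\nu},
\qquad
\Lambda = \frac{(d-1)(d-2)}{2}\, H^2 .

% Partially massless point expressed through H or through \Lambda:
m^2 = (d-2)\, H^2 = \frac{2\Lambda}{d-1}
\quad \xrightarrow{\; d = 4 \;} \quad
m^2 = 2 H^2 = \frac{2\Lambda}{3} .
```

Written this way, the graviton mass and the cosmological constant are not independent at the PM point, which is the feature that makes the PM limit interesting for the cosmological constant problem.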
The absence of the helicity-0 mode at the linearized level in PM gravity is tied to the existence of a new scalar gauge symmetry of the linearized theory when m^{2} = 2H^{2} (or (d − 2)H^{2} in arbitrary dimensions), which is responsible for making the helicity-0 mode unphysical. Indeed the action (8.62) is invariant under a special combination of a linearized diff and a conformal transformation [190, 189, 188],
If a nonlinear completion of PM gravity exists, then there must exist a nonlinear completion of this symmetry which eliminates the helicity-0 mode to all orders. The existence of such a symmetry would lead to several outstanding features:

It would protect the structure of the potential.

In the PM limit of massive gravity, the helicity-0 mode fully decouples from the helicity-2 mode and hence from external matter. As a consequence, unlike in the massless limit, no Vainshtein mechanism is needed to decouple the helicity-0 mode in the PM limit of massive gravity. Rather, the helicity-0 mode simply decouples without invoking any strong coupling effects and the theoretical and observational baggage that goes with them.

Last but not least, in PM gravity the symmetry underlying the theory is not diffeomorphism invariance but rather the one pointed out in (9.16). This means that in PM gravity, an arbitrary cosmological constant does not satisfy the symmetry (unlike in GR). Rather, the value of the cosmological constant is fixed by the gauge symmetry and is proportional to the graviton mass. As we shall see in Section 10.3, the graviton mass does not receive large quantum corrections (it is technically natural to set it to small values). So, if a PM theory of gravity existed, it would have the potential to tackle the cosmological constant problem.
Crucially, the breaking of covariance implies that matter is no longer covariantly conserved. Instead the failure of energy conservation is proportional to the graviton mass,
which in practice is extremely small.
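A minimal sketch of why a scalar gauge symmetry of this type modifies the conservation law is given here. It assumes the linearized transformation in the form commonly attributed to Deser and Waldron, a mostly-plus signature, a de Sitter background metric \(\gamma_{\mu\nu}\), and a matter coupling of the form \(h_{\mu\nu}T^{\mu\nu}\); the normalization and sign conventions are assumptions and may differ from those used in (9.16).

```latex
% Assumed form of the linearized PM transformation at m^2 = (d-2) H^2:
\delta h_{\mu\nu} = \nabla_\mu \nabla_\nu \xi + \frac{m^2}{d-2}\, \gamma_{\mu\nu}\, \xi .

% Requiring the matter coupling to be invariant and integrating by parts twice:
0 = \int \mathrm{d}^d x\, \sqrt{-\gamma}\; \delta h_{\mu\nu}\, T^{\mu\nu}
  = \int \mathrm{d}^d x\, \sqrt{-\gamma}\; \xi
    \left( \nabla_\mu \nabla_\nu T^{\mu\nu} + \frac{m^2}{d-2}\, T \right),
\qquad T \equiv \gamma_{\mu\nu} T^{\mu\nu} .

% Since \xi is arbitrary, the source must satisfy
\nabla_\mu \nabla_\nu T^{\mu\nu} = - \frac{m^2}{d-2}\, T .
```

Under these assumptions the source obeys a single scalar condition rather than the d conditions \(\nabla_\mu T^{\mu\nu} = 0\), and the departure from covariant conservation is controlled by m^{2}.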
It is worth emphasizing that if a PM theory of gravity existed, it would be distinct from the minimal model of massive gravity, where the nonlinear interactions between the helicity-0 and helicity-2 modes vanish in the decoupling limit but the helicity-0 mode is still fully present. PM gravity is also distinct from some specific branches of solutions found in cosmology (see Section 12) on top of which the helicity-0 mode disappears. If a PM theory of gravity existed, the helicity-0 mode would be fully absent from the whole theory and not only from some specific branches of solutions.
11.3.2 The search for a PM theory of gravity
11.3.2.1 A candidate for PM gravity:
The previous considerations provide strong motivation for finding a fully fledged theory of PM gravity (i.e., beyond the linearized theory), and there have been many studies seeking a nonlinear realization of the PM symmetry. So far all these studies share the assumption that the kinetic term for gravity is unchanged (i.e., they keep the standard Einstein-Hilbert action, with a potential generalization to the Lovelock invariants [298]).
Under this assumption, it was shown in [501, 330] that while the linearized theory admits the symmetry in any dimension, at the cubic level the PM symmetry only exists in d = 4 spacetime dimensions, which could make the theory even more attractive. It was also pointed out in [191] that in four dimensions the theory is conformally invariant. Interestingly, the restriction to four dimensions can be lifted in bigravity by including the Lovelock invariants [298].
From the analysis in Section 8.3.6 (see Ref. [154]) one can see that the helicity-0 mode entirely disappears from the decoupling limit of ghost-free massive gravity if one ignores the vectors and sets the parameters of the theory to m^{2} = 2H^{2}, α_{3} = −1 and α_{4} = 1/4 in four dimensions. The ghost-free theory of massive gravity with these parameters is thus a natural candidate for the PM theory of gravity. Following this analysis, it was also shown that bigravity with the same parameters for the interactions between the two metrics satisfies similar properties [301]. Furthermore, it was also shown in [147] that the potential has to follow the same structure as that of ghost-free massive gravity to have a chance of being an acceptable candidate for PM gravity. In bigravity the same parameters as for massive gravity were considered as also being the natural candidate [301], in addition of course to other parameters that vanish in the massive gravity limit (to make a fair comparison one needs to take the massive gravity limit of bigravity with care, as was shown in [301]).
11.3.2.2 Reappearance of the helicity-0 mode:
Unfortunately, when analyzing the interactions with the vector fields, it is clear from the decoupling limit (8.52) that the helicity-0 mode reappears nonlinearly through its couplings with the vector fields. These couplings never cancel, not even in four dimensions, for any choice of the parameters of the theory. So rather than being free from the helicity-0 mode, massive gravity with m^{2} = (d − 2)H^{2} has an infinitely strongly coupled helicity-0 mode and is thus a sick theory. The absence of the helicity-0 mode is simply an artefact of the linear theory.
As a result we can thus deduce that there is no theory of PM gravity. This result is consistent with many independent studies performed in the literature (see Refs. [185, 147, 181, 194]).
11.3.2.3 Relaxing the assumptions:

One assumption behind this result is the form of the kinetic term for the helicity-2 mode, which is kept to be the Einstein-Hilbert term as in GR. A few studies have considered a generalization of that kinetic term to diffeomorphism-breaking ones [231, 310]; however, further analyses [339, 153] have shown that such interactions always lead to ghosts nonperturbatively. See Section 5.6 for further details.

Another potential way out is to consider the embedding of PM within bigravity or multigravity. Since bigravity reduces to massive gravity plus a decoupled massless spin-2 field in some limit, it is unclear how bigravity could evade the results obtained in massive gravity, but this approach has been explored in [301, 298, 299, 184]. A perturbative relation between bigravity and conformal gravity was derived at the level of the equations of motion in Ref. [299] (contrary to what is claimed in [184]).

The other assumptions are locality and Lorentz invariance. It is well known that Lorentz-breaking theories of massive gravity can excite fewer than five degrees of freedom. This avenue is explored in Section 14.
To summarize, there is to date no known nonlinear PM symmetry which could project out the helicity-0 mode of the graviton while keeping the helicity-2 mode massive in a local and Lorentz-invariant way.
12 Massive Gravity Field Theory
12.1 Vainshtein mechanism
As seen earlier, in four dimensions a massive spin-2 field has five degrees of freedom, and there is no special PM case of gravity where the helicity-0 mode is unphysical while the graviton remains massive (or at least