
Massive Gravity

Abstract

We review recent progress in massive gravity. We start by showing how different theories of massive gravity emerge from a higher-dimensional theory of general relativity, leading to the Dvali-Gabadadze-Porrati model (DGP), cascading gravity, and ghost-free massive gravity. We then explore their theoretical and phenomenological consistency, proving the absence of Boulware-Deser ghosts and reviewing the Vainshtein mechanism and the cosmological solutions in these models. Finally, we present alternative and related models of massive gravity such as new massive gravity, Lorentz-violating massive gravity and non-local massive gravity.

Introduction

For almost a century, the theory of general relativity (GR) has been known to describe the force of gravity with impeccable agreement with observations. Despite all the successes of GR the search for alternatives has been an ongoing challenge since its formulation. Far from a purely academic exercise, the existence of consistent alternatives to describe the theory of gravitation is actually essential to test the theory of GR. Furthermore, the open questions that remain behind the puzzles at the interface between gravity/cosmology and particle physics such as the hierarchy problem, the old cosmological constant problem and the origin of the late-time acceleration of the Universe have pushed the search for alternatives to GR.

While it was not formulated in this language at the time, from a more modern particle physics perspective GR can be thought of as the unique theory of a massless spin-2 particle [287, 483, 175, 225, 76], and so in order to find alternatives to GR one should break one of the underlying assumptions behind this uniqueness theorem. Breaking Lorentz invariance and the notion of spin along with it is probably the most straightforward option since non-Lorentz invariant theories include a great amount of additional freedom. This possibility has been explored at length in the literature; see for instance [398] for a review. Nevertheless, Lorentz invariance is observationally well constrained by both particle physics and astrophysics. Another possibility is to maintain Lorentz invariance and the notion of spin that goes with it but to consider gravity as being the representation of a higher spin. This idea has also been explored; see for instance [466, 52] for further details. In this review, we shall explore yet another alternative: maintaining the notion that gravity is propagated by a spin-2 particle but considering this particle to be massive. From the particle physics perspective, this extension seems most natural since we know that the particles carrying the electroweak force have to acquire a mass through the Higgs mechanism.

Giving a mass to a spin-2 (and spin-1) field is an old idea and in this review we shall summarize the approach of Fierz and Pauli, which dates back to 1939 [226]. While the theory of a massive spin-2 field is in principle simple to derive, complications arise when we include interactions between this spin-2 particle and other particles as should be the case if the spin-2 field is to describe the graviton.

At the linear level, the theory of a massless spin-2 field enjoys a linearized diffeomorphism (diff) symmetry, just as a photon enjoys a U (1) gauge symmetry. But unlike for a photon, coupling the spin-2 field with external matter forces this symmetry to be realized in a different way non-linearly. As a result, GR is a fully non-linear theory, which enjoys non-linear diffeomorphism invariance (also known as general covariance or coordinate invariance). Even though this symmetry is broken when dealing with a massive spin-2 field, the non-linearities are inherited by the field. So, unlike a single isolated massive spin-2 field, a theory of massive gravity is always fully non-linear (and as a consequence non-renormalizable) just as for GR. The fully non-linear equivalent to GR for massive gravity has been a much more challenging theory to obtain. In this review we will summarize a few different approaches to deriving consistent theories of massive gravity and will focus on recent progress. See Ref. [309] for an earlier review on massive gravity, as well as Refs. [134] and [336] for other reviews relating Galileons and massive gravity.

When dealing with a theory of massive gravity two elements have been known to be problematic since the seventies. First, a massive spin-2 field propagates five degrees of freedom no matter how small its mass. At first this seems to suggest that even in the massless limit, a theory of massive gravity could never resemble GR, i.e., a theory of a massless spin-2 field with only two propagating degrees of freedom. This subtlety is at the origin of the vDVZ discontinuity (van Dam-Veltman-Zakharov [465, 497]). The resolution behind that puzzle was provided by Vainshtein two years later and lies in the fact that the extra degree of freedom responsible for the vDVZ discontinuity gets screened by its own interactions, which dominate over the linear terms in the massless limit. This process is now relatively well understood [463] (see also Ref. [35] for a recent review). The Vainshtein mechanism also comes hand in hand with its own set of peculiarities like strong coupling and superluminalities, which we will discuss in this review.

A second element of concern in dealing with a theory of massive gravity is the realization that most non-linear extensions of Fierz-Pauli massive gravity are plagued with a ghost, now known as the Boulware-Deser (BD) ghost [75]. The past decade has seen a revival of interest in massive gravity with the realization that this BD ghost could be avoided either in a model of soft massive gravity (not a single massive pole for the graviton but rather a resonance) as in the DGP (Dvali-Gabadadze-Porrati) model or its extensions [208, 209, 207], or in a three-dimensional model of massive gravity as in ‘new massive gravity’ (NMG) [66] or more recently in a specific ghost-free realization of massive gravity (also known as dRGT in the literature) [144].

With these developments several new possibilities have become a reality:

  • First, one can now more rigorously test massive gravity as an alternative to GR. We will summarize the different phenomenologies of these models and their theoretical as well as observational bounds through this review. Except in specific cases, the graviton mass is typically bounded to be a few times the Hubble parameter today, that is \(m \lesssim 10^{-30}\) to \(10^{-33}\) eV depending on the exact models. In all of these models, if the graviton had a mass much smaller than \(10^{-33}\) eV, its effect would be unseen in the observable Universe and such a mass would thus be irrelevant. Fortunately there is still to date an open window of opportunity for the graviton mass to be within an interesting range and to provide potentially new observational signatures.

  • Second, these developments have opened up the door for theories of interacting metrics, a success long awaited. Massive gravity was first shown to be expressible on an arbitrary reference metric in [296]. It was then shown that the reference metric could have its own dynamics leading to the first consistent formulation of bi-gravity [293]. In bi-gravity two metrics are interacting and the mass spectrum is that of a massless spin-2 field interacting with a massive spin-2 field. It can, therefore, be seen as the theory of general relativity interacting (fully non-linearly) with a massive spin-2 field. This is a remarkable new development in both field theory and gravity.

  • The formulations of massive gravity and bi-gravity in the vielbein language were shown to be both analytic and much more natural, and allowed for a general formulation of multi-gravity [314] in which an arbitrary number of spin-2 fields may interact.

  • Finally, still within the theoretical progress front, all of these successes provided full and definite proof for the absence of Boulware-Deser ghosts in these types of theories; see [295], which has then been translated into a multitude of other languages. This also opens the door for new types of theories that can propagate fewer degrees of freedom than naively thought.
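To make the mass scale quoted in the first item concrete, the Hubble rate today can be converted into an energy. The following is only a rough sketch: the round value \(H_0 \approx 70\ {\rm km\,s^{-1}\,Mpc^{-1}}\) is an assumption made for illustration and is not taken from this review.

```latex
\begin{align*}
H_0 &\approx 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}
     = \frac{7.0\times 10^{4}\ \mathrm{m\,s^{-1}}}{3.086\times 10^{22}\ \mathrm{m}}
     \approx 2.3\times 10^{-18}\ \mathrm{s^{-1}}, \\
\hbar H_0 &\approx \left(6.58\times 10^{-16}\ \mathrm{eV\,s}\right)
             \left(2.3\times 10^{-18}\ \mathrm{s^{-1}}\right)
     \approx 1.5\times 10^{-33}\ \mathrm{eV}.
\end{align*}
```

Under this assumption, a graviton mass of order \(10^{-33}\) eV corresponds to a Compton wavelength comparable to the present Hubble radius.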

Independent of this, developments in massive gravity, bi-gravity and multi-gravity have also opened up new theoretical avenues, which we will summarize, and these remain very much an active area of progress. On the phenomenological front, a genuine task force has been devoted to finding both exact and approximate solutions in these types of gravitational theories, including the ones relevant for black holes and for cosmology. We shall summarize these in the review.

This review is organized as follows: We start by setting the formalism for massive and massless spin-1 and -2 fields in Section 2 and emphasize the Stückelberg language both for the Proca and the Fierz-Pauli fields. In Part I we then derive consistent theories using a higher-dimensional framework, either using a braneworld scenario à la DGP in Section 4, or via a discretization [footnote 1] (or Kaluza-Klein reduction) of the extra dimension in Section 5. This second approach leads to the theory of ghost-free massive gravity (also known as dRGT) which we review in more depth in Part II. Its formulation is summarized in Section 6, before tackling other interesting aspects such as the fate of the BD ghost in Section 7, deriving its decoupling limit in Section 8, and various extensions in Section 9. The Vainshtein mechanism and other related aspects are discussed in Section 10. The phenomenology of ghost-free massive gravity is then reviewed in Part III including a discussion of solar-system tests, gravitational waves, weak lensing, pulsars, black holes and cosmology. We then conclude with other related theories of massive gravity in Part IV, including new massive gravity, Lorentz breaking theories of massive gravity and non-local versions.

Notations and conventions: Throughout this review, we work in units where the reduced Planck constant \(\hbar\) and the speed of light c are set to unity. The gravitational Newton constant is related to the Planck scale by \(8\pi {G_N} = M_{\rm{Pl}}^{- 2}\). Unless specified otherwise, d represents the number of spacetime dimensions. We use the mainly + convention (− + ⋯ +) and space indices are denoted by i, j, ⋯ = 1, ⋯, d − 1 while 0 represents the time-like direction, \(x^0 = t\).

We also use the symmetric convention: \((a,b) = {1 \over 2}(ab + ba)\) and \([a,b] = {1 \over 2}(ab - ba)\). Throughout this review, square brackets around a tensor indicate its trace, for instance \([{\mathbb X}] = {\mathbb X}_\mu ^\mu\), \([{{\mathbb X}^2}] = {\mathbb X}_\nu^\mu {\mathbb X}_\mu ^\nu\), etc. We also use the notation \(\Pi_{\mu\nu} = \partial_\mu \partial_\nu \pi\) and \({\mathcal I} = \delta^\mu_{\ \nu}\). The symbols \(\varepsilon_{\mu \nu\alpha \beta}\) and \(\varepsilon_{abcde}\) represent the Levi-Civita symbol in four and five dimensions respectively, with \(\varepsilon_{0123} = \varepsilon_{01234} = 1 = \varepsilon^{0123}\).

Massive and Interacting Fields

Proca field

Maxwell kinetic term

Before jumping into the subtleties of massive spin-2 fields and gravity in general, we start this review with massless and massive spin-1 fields as a warm-up. Consider a Lorentz vector field living on a four-dimensional Minkowski manifold. We restrict this discussion to four dimensions; the extension to d dimensions is straightforward. Restricting ourselves to Lorentz invariant and local actions for now, the kinetic term can be decomposed into three possible contributions:

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - 1} = {a_1}{{\mathcal L}_1} + {a_2}{{\mathcal L}_2} + {a_3}{{\mathcal L}_3}\,,$$
(2.1)

where a1,2,3 are so far arbitrary dimensionless coefficients and the possible kinetic terms are given by

$${{\mathcal L}_1} = {\partial _\mu}{A^\nu}{\partial ^\mu}{A_\nu}$$
(2.2)
$${{\mathcal L}_2} = {\partial _\mu}{A^\mu}{\partial _\nu}{A^\nu}$$
(2.3)
$${{\mathcal L}_3} = {\partial _\mu}{A^\nu}{\partial _\nu}{A^\mu}\,,$$
(2.4)

where in this section, indices are raised and lowered with respect to the flat Minkowski metric. The second and third contributions are equivalent up to a boundary term, so we set a3 = 0 without loss of generality.
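The boundary-term equivalence used to discard one of the three structures can be checked by integrating by parts twice, assuming the fields fall off fast enough at infinity for surface terms to vanish:

```latex
\begin{align*}
\int \mathrm{d}^4x\, \partial_\mu A^\nu\, \partial_\nu A^\mu
  &= -\int \mathrm{d}^4x\, A^\nu\, \partial_\mu \partial_\nu A^\mu
  && \text{(integrating } \partial_\mu \text{ by parts)} \\
  &= \int \mathrm{d}^4x\, \partial_\nu A^\nu\, \partial_\mu A^\mu
  && \text{(integrating } \partial_\nu \text{ by parts)} .
\end{align*}
```

The actions built from \({\mathcal L}_3\) and \({\mathcal L}_2\) therefore coincide, and only the combination \(a_2 + a_3\) is physically meaningful.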

We now proceed to establish the behavior of the different degrees of freedom (dofs) present in this theory. A priori, a Lorentz vector field \(A_\mu\) in four dimensions could have up to four dofs, which we can split as a transverse contribution \(A_\mu ^ \bot\) satisfying \({\partial ^\mu}A_\mu ^ \bot = 0\), bearing a priori three dofs, and a longitudinal mode χ with \(A_\mu = A_\mu^\bot + \partial_\mu \chi\).

Helicity-0 mode

Focusing on the longitudinal (or helicity-0) mode χ, the kinetic term takes the form

$${\mathcal L}_{\mathrm{kin}}^{\chi}=(a_1+a_2) \partial_\mu\partial_\nu \chi \partial^\mu\partial^\nu\chi= (a_1+a_2)(\square \chi)^2\,,$$
(2.5)

where \(\square = \eta^{\mu\nu}\partial_\mu\partial_\nu\) represents the d’Alembertian in flat Minkowski space and the second equality holds after integrations by parts. We directly see that unless a1 = −a2, the kinetic term for the field χ bears higher time (and space) derivatives. As a well-known consequence of Ostrogradsky’s theorem [421], two dofs are actually hidden in χ, with kinetic terms of opposite sign. This can be seen by expressing the propagator \(\square^{-2}\) as the sum of two propagators with opposite signs:

$$\frac{1}{\square^2}=\lim\limits_{m\rightarrow 0} \frac{1}{2m^2}\left(\frac{1}{\square-m^2}-\frac{1}{\square+m^2}\right)\,,$$
(2.6)

signaling that one of the modes always couples the wrong way to external sources. The mass m of this mode is arbitrarily low, which implies that the theory (2.1) with a3 = 0 and a1 + a2 ≠ 0 is always sick. Alternatively, one can see the appearance of the Ostrogradsky instability by introducing a Lagrange multiplier \(\tilde \chi(x)\), so that the kinetic action (2.5) for χ is equivalent to

$$\mathcal L_{\mathrm{kin}}^{\chi}=(a_1+a_2)\left(\tilde\chi \square \chi -\frac 14 \tilde\chi^2\right)\,,$$
(2.7)

after integrating out the Lagrange multiplier [footnote 2] \(\tilde \chi \equiv 2\square \chi\). We can now perform the change of variables χ = ϕ1 + ϕ2 and \(\tilde \chi = {\phi _1} - {\phi _2}\), giving the resulting Lagrangian for the two scalar fields ϕ1,2

$$\mathcal L_{\mathrm{kin}}^{\chi}=(a_1+a_2)\left(\phi_1 \square \phi_1-\phi_2 \square \phi_2-\frac 14 (\phi_1-\phi_2)^2\right)\,.$$
(2.8)

As a result, the two scalar fields ϕ1,2 always enter with opposite kinetic terms, signaling that one of them is always a ghost [footnote 3]. The only way to prevent this generic pathology is to make the specific choice a1 + a2 = 0, which corresponds to the well-known Maxwell kinetic term.
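The two steps between (2.5), (2.7) and (2.8) can be spelled out as follows. This is a sketch with the overall factor \((a_1 + a_2)\) dropped, where \(\simeq\) denotes equality up to total derivatives:

```latex
\begin{align*}
\frac{\delta \mathcal{L}}{\delta \tilde\chi} = 0
  \;&\Rightarrow\; \square\chi - \tfrac{1}{2}\tilde\chi = 0
  \;\Rightarrow\; \tilde\chi = 2\,\square\chi, \\
\Big(\tilde\chi\,\square\chi - \tfrac{1}{4}\tilde\chi^{2}\Big)\Big|_{\tilde\chi = 2\square\chi}
  &= 2(\square\chi)^2 - (\square\chi)^2 = (\square\chi)^2, \\
\tilde\chi\,\square\chi\,\Big|_{\chi=\phi_1+\phi_2,\ \tilde\chi=\phi_1-\phi_2}
  &= \phi_1\square\phi_1 - \phi_2\square\phi_2 + \phi_1\square\phi_2 - \phi_2\square\phi_1 \\
  &\simeq \phi_1\square\phi_1 - \phi_2\square\phi_2 .
\end{align*}
```

The cross terms cancel after two integrations by parts, which leaves the relative minus sign between the kinetic terms of \(\phi_1\) and \(\phi_2\).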

Helicity-1 mode and gauge symmetry

Now that the form of the local and covariant kinetic term has been uniquely established by the requirement that no ghost rides on top of the helicity-0 mode, we focus on the remaining transverse mode \(A_\mu ^ \bot\),

$${\mathcal L}_{{\rm{kin}}}^{{\rm{helicity}} - {\rm{1}}} = {a_1}\, {\left({{\partial _\mu}A_\nu ^ \bot} \right)^2}\, ,$$
(2.9)

which has the correct normalization if a1 = −1/2. As a result, the only possible local kinetic term for a spin-1 field is the Maxwell one:

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{1}}} = - {1 \over 4}F_{\mu \nu}^2$$
(2.10)

with \(F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu\). Restricting ourselves to a massless spin-1 field (with no potential and other interactions), the resulting Maxwell theory satisfies the following U(1) gauge symmetry:

$${A_\mu} \rightarrow {A_\mu} + {\partial _\mu}\xi .$$
(2.11)

This gauge symmetry projects out two of the naive four degrees of freedom. This can be seen at the level of the Lagrangian directly, where the gauge symmetry (2.11) allows us to fix the gauge of our choice. For convenience, we perform a (3 + 1)-split and choose Coulomb gauge \(\partial_i A^i = 0\), so that only two dofs are present in \(A_i\), i.e., \(A_i\) contains no longitudinal mode: writing \({A_i} = A_i^t + {\partial _i}{A^l}\) with \({\partial ^i}A_i^t = 0\), the Coulomb gauge sets the longitudinal mode \(A^l = 0\). The time-component \(A^0\) does not exhibit a kinetic term,

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{1}}} = {1 \over 2}{({\partial _t}{A_i})^2} + {1 \over 2}{({\partial _i}{A^0})^2} - {1 \over 2}{({\partial _i}{A_j})^2}\, ,$$
(2.12)

and appears instead as a Lagrange multiplier imposing the constraint

$${\partial _i}{\partial ^i}{A_0} \equiv 0\, .$$
(2.13)

The Maxwell action has therefore only two propagating dofs in \(A_i^t\),

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{1}}} = - {1 \over 2}{({\partial _\mu}A_i^t)^2}\, .$$
(2.14)

To summarize, the Maxwell kinetic term for a vector field and the fact that a massless vector field in four dimensions only propagates 2 dofs is not a choice but has been imposed upon us by the requirement that no ghost rides along with the helicity-0 mode. The resulting theory is enriched by a U (1) gauge symmetry which in turn freezes the helicity-0 mode when no mass term is present. We now ‘promote’ the theory to a massive vector field.

Proca mass term

Starting with the Maxwell action, we consider a covariant mass term \(A_\mu A^\mu\) corresponding to the Proca action

$${{\mathcal L}_{{\rm{Proca}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}{m^2}{A_\mu}{A^\mu}\, ,$$
(2.15)

and emphasize that the presence of a mass term does not change the fact that the kinetic term has been uniquely fixed by the requirement of the absence of ghosts. An immediate consequence of the Proca mass term is the breaking of the U(1) gauge symmetry (2.11), so that the Coulomb gauge can no longer be chosen and the longitudinal mode is now dynamical. To see this, let us use the previous decomposition \({A_\mu} = A_\mu ^ \bot + {\partial _\mu}\hat \chi\) and notice that the mass term now introduces a kinetic term for the helicity-0 mode \(\chi = m\hat \chi\),

$${{\mathcal L}_{{\rm{Proca}}}} = - {1 \over 2}{({\partial _\mu}A_\nu ^ \bot)^2} - {1 \over 2}{m^2}{(A_\mu ^ \bot)^2} - {1 \over 2}{({\partial _\mu}\chi)^2}\, .$$
(2.16)

A massive vector field thus propagates three dofs, namely two in the transverse modes \(A_\mu ^ \bot\) and one in the longitudinal mode χ. Physically, this can be understood by the fact that a massive vector field does not propagate along the light-cone, and the fluctuations along the line of propagation correspond to an additional physical dof.
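The passage from (2.15) to (2.16) can be made explicit. The sketch below uses the decomposition \(A_\mu = A_\mu^\bot + \partial_\mu\hat\chi\) with \(\partial^\mu A_\mu^\bot = 0\) and \(\chi = m\hat\chi\), and \(\simeq\) denotes equality up to total derivatives:

```latex
\begin{align*}
F_{\mu\nu} &= \partial_\mu A^\bot_\nu - \partial_\nu A^\bot_\mu
  && \text{since } \partial_\mu\partial_\nu\hat\chi - \partial_\nu\partial_\mu\hat\chi = 0, \\
-\tfrac14 F_{\mu\nu}^2 &\simeq -\tfrac12 (\partial_\mu A^\bot_\nu)^2
      + \tfrac12 (\partial^\mu A^\bot_\mu)^2
  = -\tfrac12 (\partial_\mu A^\bot_\nu)^2, \\
-\tfrac12 m^2 A_\mu A^\mu &= -\tfrac12 m^2 (A^\bot_\mu)^2
      - m^2 A^{\bot\mu}\partial_\mu\hat\chi
      - \tfrac12 m^2 (\partial_\mu\hat\chi)^2 \\
 &\simeq -\tfrac12 m^2 (A^\bot_\mu)^2 - \tfrac12 (\partial_\mu\chi)^2 .
\end{align*}
```

The cross term vanishes after integration by parts because \(A_\mu^\bot\) is transverse, and the rescaling \(\chi = m\hat\chi\) is what gives the helicity-0 mode a canonically normalized kinetic term.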

Before moving to the Abelian Higgs mechanism, which provides a dynamical way to give a mass to bosons, we first comment on the discontinuity in number of dofs between the massive and massless case. When considering the Proca action (2.16) with the properly normalized fields \(A_\mu ^ \bot\) and χ, one does not recover the massless Maxwell action (2.9) or (2.10) when sending the boson mass m → 0. A priori, this seems to signal the presence of a discontinuity which would allow us to distinguish between for instance a massless photon and a massive one no matter how tiny the mass. In practice, however, the difference is physically indistinguishable so long as the photon couples to external sources in a way which respects the U (1) symmetry. Note however that quantum anomalies remain sensitive to the mass of the field so the discontinuity is still present at this level, see Refs. [197, 204].

To physically tell the difference between a massless vector field and a massive one with tiny mass, one has to probe the system, or in other words include interactions with external sources

$${{\mathcal L}_{{\rm{sources}}}} = - {A_\mu}{J^\mu}\, .$$
(2.17)

The U(1) symmetry present in the massless case is preserved only if the external sources are conserved, \(\partial_\mu J^\mu = 0\). Such a source produces a vector field which satisfies

$$\square A_\mu^\bot=J_\mu$$
(2.18)

in the massless case. The exchange amplitude between two conserved sources \(J_\mu\) and \(J'_\mu\) mediated by a massless vector field is given by

$${\mathcal A}_{JJ'}^{{\rm{massless}}} = \int {{\rm{d}}^4}x\, A_\mu ^ \bot {J'}^\mu = \int {{\rm{d}}^4}x\, {J'}^\mu {1 \over {\square}}{J_\mu}\,.$$
(2.19)

On the other hand, if the vector field is massive, its response to the source Jμ is instead

$$(\square-m^2) A_\mu^\bot=J_\mu \quad \text{and}\quad \square \chi=0\,.$$
(2.20)

In that case, one needs to consider both the transverse and the longitudinal modes of the vector field in the exchange amplitude between the two sources \(J_\mu\) and \(J'_\mu\). Fortunately, a conserved source does not excite the longitudinal mode and the exchange amplitude is uniquely given by the transverse mode,

$${\mathcal A}_{JJ'}^{{\rm{massive}}} = \int {{\rm{d}}^4}x\, (A_\mu ^ \bot + {\partial _\mu}\chi)\,{J'}^\mu = \int {{\rm{d}}^4}x\, {J'}^\mu {1 \over {\square \, - \,{m^2}}}{J_\mu}\, .$$
(2.21)

As a result, the exchange amplitude between two conserved sources is the same in the limit m → 0 no matter whether the vector field is intrinsically massive and propagates 3 dofs or if it is massless and only propagates 2 dofs. It is, therefore, impossible to probe the difference between an exactly massless vector field and a massive one with arbitrarily small mass.

Notice that in the massive case no U(1) symmetry is present and the source need not be conserved. However, the previous argument remains unchanged so long as \(\partial_\mu J^\mu\) goes to zero in the massless limit at least as quickly as the mass itself. If this condition is violated, then the helicity-0 mode ought to be included in the exchange amplitude (2.21). In parallel, in the massless case the non-conserved source provides a new kinetic term for the longitudinal mode which then becomes dynamical.
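The role played by source conservation can be seen in one line. The sketch below uses the decomposition \(A_\mu = A_\mu^\bot + \partial_\mu\hat\chi\) with the canonically normalized mode \(\chi = m\hat\chi\) from the Proca discussion, and assumes surface terms vanish:

```latex
\begin{align*}
\int \mathrm{d}^4x\, \partial_\mu\hat\chi\, {J'}^\mu
  &= -\int \mathrm{d}^4x\, \hat\chi\, \partial_\mu {J'}^\mu
   = -\frac{1}{m}\int \mathrm{d}^4x\, \chi\, \partial_\mu {J'}^\mu , \\
\partial_\mu {J'}^\mu = 0 \;&\Rightarrow\; \text{the helicity-0 mode drops out, and }
  \lim_{m\to 0} \frac{1}{\square - m^2} = \frac{1}{\square} .
\end{align*}
```

For a non-conserved source the coupling of the normalized helicity-0 mode is controlled by the ratio \(\partial_\mu J^\mu / m\), so its fate in the massless limit depends on how fast \(\partial_\mu J^\mu\) vanishes relative to the mass.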

Abelian Higgs mechanism for electromagnetism

Associated with the absence of an intrinsic discontinuity in the massless limit is the existence of a Higgs mechanism for the vector field whereby the vector field acquires a mass dynamically. As we shall see later, the situation is different for gravity where no equivalent dynamical Higgs mechanism has been discovered to date. Nevertheless, the tools used to describe the Abelian Higgs mechanism and in particular the introduction of a Stückelberg field will prove useful in the gravitational case as well.

To describe the Abelian Higgs mechanism, we start with a vector field with associated Maxwell tensor Fμν and a complex scalar field ϕ with quartic potential

$${{\mathcal L}_{{\rm{AH}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}\left({{{\mathcal D}_\mu}\phi} \right){\left({{{\mathcal D}^\mu}\phi} \right)^{\ast}} - \lambda {\left({\phi {\phi ^{\ast}} - \Phi _0^2} \right)^2} .$$
(2.22)

The covariant derivative, \({{\mathcal D}_\mu} = {\partial _\mu} - iq{A_\mu}\) ensures the existence of the U (1) symmetry, which in addition to (2.11) shifts the scalar field as

$$\phi \rightarrow \phi {e^{iq\xi}}\,.$$
(2.23)

Splitting the complex scalar field ϕ into its norm and phase, \(\phi = \varphi e^{i\chi}\), we see that the covariant derivative plays the role of the mass term for the vector field when the scalar field acquires a non-vanishing vacuum expectation value (vev),

$${{\mathcal L}_{{\rm{AH}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}{\varphi ^2}{\left({q{A_\mu} - {\partial _\mu}\chi} \right)^2} - {1 \over 2}{({\partial _\mu}\varphi)^2} - \lambda {\left({{\varphi ^2} - \Phi _0^2} \right)^2}\, .$$
(2.24)

The Higgs field φ can be made arbitrarily massive by setting λ ≫ 1 in such a way that its dynamics may be neglected and the field can be treated as frozen at φ ≡ Φ0 = const. The resulting theory is that of a massive vector field,

$${{\mathcal L}_{{\rm{AH}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}\Phi _0^2{\left({q{A_\mu} - {\partial _\mu}\chi} \right)^2}\, ,$$
(2.25)

where the phase χ of the complex scalar field plays the role of a Stückelberg field which restores the U(1) gauge symmetry in the massive case,

$${A_\mu} \rightarrow {A_\mu} + \;{\partial _\mu}\xi (x)$$
(2.26)
$$\chi \rightarrow \chi + q \xi (x)\, .$$
(2.27)

In this formalism, the U(1) gauge symmetry is restored at the price of explicitly introducing a Stückelberg field which transforms in such a way as to make the mass term invariant. The symmetry ensures that the vector field Aμ propagates only 2 dofs, while the Stückelberg field χ propagates the third dof. While no equivalent to the Higgs mechanism exists for gravity, the same Stückelberg trick to restore the symmetry can be used in that case. Since in that context the broken symmetry is coordinate transformation invariance (full diffeomorphism invariance or covariance), four Stückelberg fields should in principle be included in the context of massive gravity, as we shall see below.
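The invariance of the mass term in (2.25) under (2.26) and (2.27), and the identification of the resulting vector mass, can be checked directly. In this short sketch, fixing \(\chi = 0\) (unitary gauge) is a choice made here for illustration:

```latex
\begin{align*}
q A_\mu - \partial_\mu\chi
  \;&\to\; q\,(A_\mu + \partial_\mu\xi) - \partial_\mu(\chi + q\,\xi)
  = q A_\mu - \partial_\mu\chi , \\
\mathcal{L}_{\rm AH}\Big|_{\chi = 0}
  &= -\tfrac14 F_{\mu\nu}^2 - \tfrac12\, q^2 \Phi_0^2\, A_\mu A^\mu
  \quad\Rightarrow\quad m^2 = q^2 \Phi_0^2 .
\end{align*}
```

Comparing with the Proca action (2.15), the vector field acquires a mass \(m = q\,\Phi_0\) set by the charge and the vev.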

Interacting spin-1 fields

Now that we have introduced the notion of a massless and a massive spin-1 field, let us look at N interacting spin-1 fields. We start with N free and massless gauge fields, \(A_\mu ^{(a)}\), with a = 1, ⋯, N, and respective Maxwell tensors \(F_{\mu \nu}^{(a)} = {\partial _\mu}A_\nu^{(a)} - {\partial _\nu}A_\mu ^{(a)}\),

$${\mathcal L}_{{\rm{kin}}}^{{\rm{N}}\, {\rm{spin}} - {\rm{1}}} = - {1 \over 4}\sum\limits_{a = 1}^N {{{\left({F_{\mu \nu}^{(a)}} \right)}^2}} .$$
(2.28)

The theory is then manifestly Abelian and invariant under N copies of U(1) (i.e., the symmetry group is \(U(1)^N\), which is Abelian, as opposed to U(N), which would correspond to a Yang-Mills theory and would not be Abelian).

However, in addition to these N gauge invariances, the kinetic term is invariant under global rotations in field space,

$$A_\mu ^{(a)} \rightarrow \tilde A_\mu ^{(a)} = O_{\;b}^aA_\mu ^{(b)},$$
(2.29)

where \(O_b^a\) is a (global) rotation matrix. Now let us consider some interactions between these different fields. At the linear level (quadratic level in the action), the most general set of interactions is

$${{\mathcal L}_{{\rm{int}}}} = - {1 \over 2}\sum\limits_{a,b} {{{\mathcal I}_{ab}}A_\mu ^{(a)}A_\nu ^{(b)}{\eta ^{\mu \nu}}} ,$$
(2.30)

where \({{\mathcal I}_{ab}}\) is an arbitrary symmetric matrix with constant coefficients. For an arbitrary rank-N matrix, all N copies of U(1) are broken, and the theory then propagates N additional helicity-0 modes, for a total of 3N independent polarizations in four spacetime dimensions. However, if the rank r of \({\mathcal I}\) is r < N, i.e., if some of the eigenvalues of \({\mathcal I}\) vanish, then there are N − r special directions in field space which receive no interactions, and the theory thus keeps N − r independent copies of U(1). The theory then propagates r massive spin-1 fields and N − r massless spin-1 fields, for a total of 3r + 2(N − r) = 2N + r independent polarizations in four dimensions.

We can see this statement more explicitly in the case of N spin-1 fields by diagonalizing the mass matrix \({\mathcal I}\). As mentioned previously, the kinetic term is invariant under field space rotations, (2.29), so one can use this freedom to work in a field representation where the mass matrix \({\mathcal I}\) is diagonal,

$${{\mathcal I}_{ab}} = {\rm{diag}}\left({m_1^2, \cdots ,m_N^2} \right).$$
(2.31)

In this representation the gauge fields are the mass eigenstates and the mass spectrum is simply given by the eigenvalues of \({{\mathcal I}_{ab}}\).
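As a concrete count, consider a hypothetical example (the values N = 3 and r = 2 are chosen here purely for illustration) in which one eigenvalue of the mass matrix vanishes. Each massive spin-1 field carries three polarizations and each massless one carries two:

```latex
\begin{align*}
\mathcal{I}_{ab} &= \mathrm{diag}\left(m_1^2,\, m_2^2,\, 0\right), \qquad N = 3,\quad r = 2, \\
\#\,\text{polarizations} &= \underbrace{3 \times 2}_{\text{massive}}
   + \underbrace{2 \times 1}_{\text{massless}} = 8 , \\
\text{in general:}\quad 3r + 2(N - r) &= 2N + r ,
\end{align*}
```

which interpolates between 2N polarizations for r = 0 (all fields massless) and 3N for r = N (all fields massive).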

Spin-2 field

As we have seen in the case of a vector field, as long as it is local and Lorentz-invariant, the kinetic term is uniquely fixed by the requirement that no ghost be present. Moving now to a spin-2 field, the same argument applies exactly and the Einstein-Hilbert term appears naturally as the unique kinetic term free of any ghost-like instability. This is possible thanks to a symmetry which projects out all unwanted dofs, namely diffeomorphism invariance (linear diffs at the linearized level, and non-linear diffs/general covariance at the non-linear level).

Einstein-Hilbert kinetic term

We consider a symmetric Lorentz tensor field hμν. The kinetic term can be decomposed into four possible local contributions (assuming Lorentz invariance and ignoring terms which are equivalent upon integration by parts):

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin - 2}}} = {1 \over 2}\;{\partial ^\alpha}{h^{\mu \nu}}\left({{b_1}{\partial _\alpha}{h_{\mu \nu}} + 2{b_2}{\partial _{(\mu}}{h_{\nu)\alpha}} + {b_3}{\partial _\alpha}h{\eta _{\mu \nu}} + 2{b_4}{\partial _{(\mu}}h{\eta _{\nu)\alpha}}} \right),$$
(2.32)

where \(b_{1,2,3,4}\) are dimensionless coefficients which are to be determined in the same way as for the vector field. We split the 10 components of the symmetric tensor field \(h_{\mu\nu}\) into a transverse tensor \(h_{\mu \nu}^T\) (which carries 6 components) and a vector field \(\chi_\mu\) (which carries 4 components),

$${h_{\mu \nu}} = h_{\mu \nu}^T + 2{\partial _{(\mu}}{\chi _{\nu)}}.$$
(2.33)

Just as in the case of the spin-1 field, an arbitrary kinetic term of the form (2.32) with untuned coefficients \(b_i\) would contain higher derivatives for \(\chi_\mu\) which in turn would imply a ghost. As we shall see below, avoiding a ghost within the kinetic term automatically leads to gauge-invariance. After substitution of \(h_{\mu\nu}\) in terms of \(h_{\mu \nu}^T\) and \(\chi_\mu\), the potentially dangerous parts are

$$\begin{aligned} {\mathcal L}_{{\rm{kin}}}^{{\rm{spin - 2}}} \supset\; & ({b_1} + {b_2})\,{\chi ^\mu}{\square ^2}{\chi _\mu} + ({b_1} + 3{b_2} + 2{b_3} + 4{b_4})\,{\chi ^\mu}\square {\partial _\mu}{\partial _\nu}{\chi ^\nu} \\ & - 2{h^{T\mu \nu}}\Big( ({b_2} + {b_4})\,{\partial _\mu}{\partial _\nu}{\partial _\alpha}{\chi ^\alpha} + ({b_1} + {b_2})\,{\partial _\mu}\square {\chi _\nu} + ({b_3} + {b_4})\,\square {\partial _\alpha}{\chi ^\alpha}\,{\eta _{\mu \nu}} \Big). \end{aligned}$$
(2.34)

Preventing these higher derivative terms from arising sets

$${b_4} = - {b_3} = - {b_2} = {b_1},$$
(2.35)

or in other words, the unique (local and Lorentz-invariant) kinetic term one can write for a spin-2 field is the Einstein-Hilbert term

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{2}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} = - {1 \over 4}{h^{T\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}h_{\alpha \beta}^T,$$
(2.36)

where \(\hat {\mathcal E}\) is the Lichnerowicz operator

$$\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} = - {1 \over 2}\left({\square {h_{\mu \nu}} - 2{\partial _{(\mu}}{\partial _\alpha}h_{\nu)}^\alpha + {\partial _\mu}{\partial _\nu}h - {\eta _{\mu \nu}}(\square h - {\partial _\alpha}{\partial _\beta}{h^{\alpha \beta}})} \right),$$
(2.37)

and we have set b1 = −1/4 to follow standard conventions. As a result, the kinetic term for the tensor field is invariant under the following gauge transformation,

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + {\partial _{(\mu}}{\xi _{\nu)}}.$$
(2.38)

We emphasize that the form of the kinetic term and its gauge invariance are independent of whether or not the tensor field has a mass (as long as we restrict ourselves to a local and Lorentz-invariant kinetic term). However, just as in the case of a massive vector field, this gauge invariance cannot be maintained by a mass term or any other self-interacting potential, so this symmetry remains exact only in the massless case. Out of the 10 components of a tensor field, the gauge symmetry removes 2 × 4 = 8 of them, leaving a massless tensor field with only two propagating dofs, as is well known from the propagation of gravitational waves in four dimensions.

In d ≥ 3 spacetime dimensions, gravitational waves have \(d(d+1)/2 - 2d = d(d-3)/2\) independent polarizations. This means that in three dimensions there are no gravitational waves and in five dimensions they have five independent polarizations.
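The counting can be tabulated for the lowest dimensions: a symmetric tensor has \(d(d+1)/2\) components, and the gauge symmetry removes \(2d\) of them.

```latex
\begin{align*}
d = 3:&\quad \tfrac{3\cdot 4}{2} - 2\cdot 3 = 6 - 6 = 0 , \\
d = 4:&\quad \tfrac{4\cdot 5}{2} - 2\cdot 4 = 10 - 8 = 2 , \\
d = 5:&\quad \tfrac{5\cdot 6}{2} - 2\cdot 5 = 15 - 10 = 5 .
\end{align*}
```

Each line agrees with the closed form \(d(d-3)/2\).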

Fierz-Pauli mass term

As seen in Section 2.2.1, for a local and Lorentz-invariant theory, the linearized kinetic term is uniquely fixed by the requirement that longitudinal modes propagate no ghost, which in turn prevents that operator from exciting these modes altogether. Just as in the case of a massive spin-1 field, we shall see in what follows that the longitudinal modes can nevertheless be excited when including a mass term. In what follows we restrict ourselves to linear considerations and spare any non-linearity discussions for Parts I and II. See also [327] for an analysis of the linearized Fierz-Pauli theory using Bardeen variables.

In the case of a spin-2 field hμν, we are a priori free to choose between two possible mass terms \(h_{\mu \nu}^2\) and \(h^2\), so that the generic mass term can be written as a combination of both,

$${{\mathcal L}_{{\rm{mass}}}} = - {1 \over 8}{m^2}\left({h_{\mu \nu}^2 - A{h^2}} \right),$$
(2.39)

where A is a dimensionless parameter. Just as in the case of the kinetic term, the stability of the theory constrains very strongly the phase space and we shall see that only for A = 1 is the theory stable at that order. The presence of this mass term breaks diffeomorphism invariance. Restoring it requires the introduction of four Stückelberg fields χμ which transform under linear diffeomorphisms in such a way as to make the mass term invariant, just as in the Abelian-Higgs mechanism for electromagnetism [174]. Including the four linearized Stückelberg fields, the resulting mass term

$${{\mathcal L}_{{\rm{mass}}}} = - {1 \over 8}{m^2}\left({{{({h_{\mu \nu}} + 2{\partial _{(\mu}}{\chi _{\nu)}})}^2} - A{{(h + 2{\partial _\alpha}{\chi ^\alpha})}^2}} \right),$$
(2.40)

is invariant under the simultaneous transformations:

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + {\partial _{(\mu}}{\xi _{\nu)}}\,,$$
(2.41)
$${\chi _\mu} \rightarrow {\chi _\mu} - {1 \over 2}{\xi _\mu}.$$
(2.42)

This mass term then provides a kinetic term for the Stückelberg fields

$${\mathcal L}_{{\rm{kin}}}^\chi = - {1 \over 2}{m^2}\left({{{({\partial _\mu}{\chi _\nu})}^2} - A{{({\partial _\alpha}{\chi ^\alpha})}^2}} \right),$$
(2.43)

which is precisely of the same form as the kinetic term considered for a spin-1 field (2.1) in Section 2.1.1 with a3 = 0 and a2 = Aa1. Now the same logic as in Section 2.1.1 applies and singling out the longitudinal component of these Stückelberg fields it follows that the only combination which does not involve higher derivatives is a2 = a1 or in other words A = 1. As a result, the only possible mass term one can consider which is free from an Ostrogradsky instability is the Fierz-Pauli mass term

$${{\mathcal L}_{{\rm{FP\,mass}}}} = - {1 \over 8}{m^2}\left({{{({h_{\mu \nu}} + 2{\partial _{(\mu}}{\chi _{\nu)}})}^2} - {{(h + 2{\partial _\alpha}{\chi ^\alpha})}^2}} \right).$$
(2.44)
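
The origin of the condition A = 1 can be made explicit with a short worked computation. As a simplifying sketch, we keep only the longitudinal part of the Stückelberg fields, \(\chi_\mu = \partial_\mu \pi\) (the normalization of π is an assumption made here for illustration and does not affect the argument), and insert it into (2.43):

$$\begin{array}{rl} {\mathcal L}_{{\rm kin}}^{\chi}\Big|_{\chi_\mu = \partial_\mu \pi} &= - {1 \over 2}{m^2}\left( (\partial_\mu \partial_\nu \pi)^2 - A(\square \pi)^2 \right) \\ &= - {1 \over 2}{m^2}(1 - A)(\square \pi)^2 + {\rm total\ derivatives}, \end{array}$$

where we used \((\partial_\mu \partial_\nu \pi)^2 = (\square \pi)^2 + \partial_\mu\left(\partial_\nu \pi\, \partial^\mu \partial^\nu \pi - \partial^\mu \pi\, \square \pi\right)\). For A ≠ 1 the resulting equation of motion, \((1 - A)\,\square^2 \pi = 0\), is fourth order in derivatives, which signals the Ostrogradsky instability. Only for A = 1 does the higher-derivative operator drop out.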

In unitary gauge, i.e., in the gauge where the Stückelberg fields \(\chi^a\) are set to zero, the Fierz-Pauli mass term simply reduces to

$${{\mathcal L}_{{\rm{FP\,mass}}}} = - {1 \over 8}{m^2}\left({h_{\mu \nu}^2 - {h^2}} \right),$$
(2.45)

where once again the indices are raised and lowered with respect to the Minkowski metric.

Propagating degrees of freedom

To identify the propagating degrees of freedom we may split the Stückelberg fields \(\chi^a\) further into a transverse and a longitudinal mode,

$${\chi ^a} = {1 \over m}{A^a} + {1 \over {{m^2}}}{\eta ^{ab}}{\partial _b}\pi ,$$
(2.46)

(where the normalization with negative powers of m has been introduced for further convenience).

In terms of hμν and the Stückelberg fields Aμ and π, the linearized Fierz-Pauli action is

$$\begin{array}{*{20}c} {{\mathcal L}_{{\rm{FP}}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {1 \over 2}{h^{\mu \nu}}({\Pi _{\mu \nu}} - [\Pi ]{\eta _{\mu \nu}}) - {1 \over 8}F_{\mu \nu}^2 \\- {1 \over 8}{m^2}(h_{\mu \nu}^2 - {h^2}) - {1 \over 2}m({h^{\mu \nu}} - h{\eta ^{\mu \nu}}){\partial _{(\mu}}{A_{\nu)}}, \end{array}$$
(2.47)

with \(F_{\mu \nu} = \partial_\mu A_\nu - \partial_\nu A_\mu\) and \(\Pi_{\mu \nu} = \partial_\mu \partial_\nu \pi\), and all the indices are raised and lowered with respect to the Minkowski metric.

Terms on the first line represent the kinetic terms for the different fields while the second line represents the mass terms and mixing.

We see that the kinetic term for the field π is hidden in the mixing with hμν. To make the field content explicit, we may diagonalize this mixing by shifting \({h_{\mu \nu}} = {\tilde h_{\mu \nu}} + \pi {\eta _{\mu \nu}}\), so that the linearized Fierz-Pauli action becomes

$$\begin{array}{*{20}c} {{\mathcal L_{{\rm{FP}}}} = - {1 \over 4}{{\tilde h}^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{{\tilde h}_{\alpha \beta}} - {3 \over 4}{{(\partial \pi)}^2} - {1 \over 8}F_{\mu \nu}^2\quad \quad \quad \quad} \\{- {1 \over 8}{m^2}(\tilde h_{\mu \nu}^2 - {{\tilde h}^2}) + {3 \over 2}{m^2}{\pi ^2} + {3 \over 2}{m^2}\pi \tilde h} \\{\quad - {1 \over 2}m({{\tilde h}^{\mu \nu}} - \tilde h{\eta ^{\mu \nu}}){\partial _{(\mu}}{A_{\nu)}} + 3m\pi {\partial _\alpha}{A^\alpha}.} \\ \end{array}$$
(2.48)
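
The coefficient of the π kinetic term can be checked with a short worked computation. This is a sketch that tracks only the derivative terms and uses the operator (2.37) in four dimensions. Acting on a pure-trace configuration, (2.37) gives

$$\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}(\pi {\eta _{\alpha \beta}}) = - {1 \over 2}\left( \square \pi \,{\eta _{\mu \nu}} - 2{\partial _\mu}{\partial _\nu}\pi + 4{\partial _\mu}{\partial _\nu}\pi - 3\square \pi \,{\eta _{\mu \nu}} \right) = - \left( {\Pi _{\mu \nu}} - [\Pi ]{\eta _{\mu \nu}} \right).$$

Substituting \({h_{\mu \nu}} = {\tilde h_{\mu \nu}} + \pi {\eta _{\mu \nu}}\) into the first two terms of (2.47), and using that \(\hat{\mathcal E}\) is symmetric under integration by parts, one finds

$$\begin{array}{rl} - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} &= - {1 \over 4}{\tilde h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{\tilde h_{\alpha \beta}} + {1 \over 2}{\tilde h^{\mu \nu}}({\Pi _{\mu \nu}} - [\Pi ]{\eta _{\mu \nu}}) - {3 \over 4}\pi \square \pi , \\ - {1 \over 2}{h^{\mu \nu}}({\Pi _{\mu \nu}} - [\Pi ]{\eta _{\mu \nu}}) &= - {1 \over 2}{\tilde h^{\mu \nu}}({\Pi _{\mu \nu}} - [\Pi ]{\eta _{\mu \nu}}) + {3 \over 2}\pi \square \pi . \end{array}$$

In the sum the mixing between \(\tilde h_{\mu \nu}\) and π cancels, leaving \({3 \over 4}\pi \square \pi = - {3 \over 4}(\partial \pi)^2\) up to a total derivative, in agreement with the first line of (2.48).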

This decomposition allows us to identify the different degrees of freedom present in massive gravity (at least at the linear level): hμν represents the helicity-2 mode as already present in GR and propagates 2 dofs, Aμ represents the helicity-1 mode and propagates 2 dofs, and finally π represents the helicity-0 mode and propagates 1 dof, leading to a total of five dofs as is to be expected for a massive spin-2 field in four dimensions.

The degrees of freedom have not yet been split into their mass eigenstates but on doing so one can easily check that all the degrees of freedom have the same positive mass square \(m^2\).

Most of the phenomenology and theoretical consistency of massive gravity is related to the dynamics of the helicity-0 mode. The coupling to matter occurs via the coupling \({h_{\mu \nu}}{T^{\mu \nu}} = {\tilde h_{\mu \nu}}{T^{\mu \nu}} + \pi T\), where T is the trace of the external stress-energy tensor. We see that the helicity-0 mode couples directly to conserved sources (unlike in the case of the Proca field) but the helicity-1 mode does not. In most of what follows we will thus be able to ignore the helicity-1 mode.

Higgs mechanism for gravity

As we shall see in Section 9.1, the graviton mass can also be promoted to a scalar function of one or many other fields (for instance of a different scalar field), m = m (ψ). We can thus wonder whether a dynamical Higgs mechanism for gravity can be considered where the field(s) ψ start in a phase for which the graviton mass vanishes, m (ψ) = 0, and dynamically evolve to acquire a non-vanishing vev for which m (ψ) ≠ 0. Following the same logic as the Abelian Higgs mechanism for electromagnetism, this strategy can only work if the number of dofs in the massless phase m = 0 is the same as that in the massive case m ≠ 0. Simply promoting the mass to a function of an external field is thus not sufficient since the graviton helicity-0 and -1 modes would otherwise be infinitely strongly coupled as m → 0.

To date no candidate has been proposed for which the graviton mass could dynamically evolve from a vanishing value to a finite one without falling into such strong coupling issues. This does not imply that a Higgs mechanism for gravity does not exist, only that none has yet been found. For instance on AdS, there could be a Higgs mechanism as proposed in [431], where the mass term comes from integrating out some conformal fields with slightly unusual (but not unphysical) ‘transparent’ boundary conditions. This mechanism is specific to AdS and to the existence of a time-like boundary and would not apply on Minkowski or dS.

Van Dam-Veltman-Zakharov discontinuity

As in the case of spin-1, the massive spin-2 field propagates more dofs than the massless one. Nevertheless, these new excitations bear no observational signatures for the spin-1 field when considering an arbitrarily small mass, as seen in Section 2.1.2. The main reason for that is that the helicity-0 polarization of the photon couples only to the divergence of external sources, which vanishes for conserved sources. As a result no external sources directly excite the helicity-0 mode of a massive spin-1 field. For the spin-2 field, on the other hand, the situation is different as the helicity-0 mode can now couple to the trace of the stress-energy tensor and so generic sources will excite not only the 2 helicity-2 polarizations of the graviton but also a third helicity-0 polarization, which could in principle have dramatic consequences. To see this more explicitly, let us compute the gravitational exchange amplitude between two sources Tμν and T′μν in both the massive and massless gravitational cases.

In the massless case, the theory is diffeomorphism invariant. When considering coupling to external sources, of the form hμνTμν, we thus need to ensure that the symmetry be preserved, which implies that the stress-energy tensor Tμν should be conserved, \(\partial_\mu T^{\mu \nu} = 0\). When computing the gravitational exchange amplitude between two sources we thus restrict ourselves to conserved ones. In the massive case, there is a priori no reason to restrict ourselves to conserved sources, so long as their divergences cancel in the massless limit m → 0.

Massive spin-2 field

Let us start with the massive case, and consider the response to a conserved external source Tμν,

$${\mathcal L} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {{{m^2}} \over 8}(h_{\mu \nu}^2 - {h^2}) + {1 \over {2{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}{T^{\mu \nu}}.$$
(2.49)

The linearized Einstein equation is then

$$\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {1 \over 2}{m^2}({h_{\mu \nu}} - h{\eta _{\mu \nu}}) = {1 \over {{M_{{\rm{Pl}}}}}}{T_{\mu \nu}}\,.$$
(2.50)

To solve this modified linearized Einstein equation for hμν we consider the trace and the divergence separately,

$$h = - {1 \over {3{m^2}{M_{{\rm{Pl}}}}}}\left({T + {2 \over {{m^2}}}{\partial _\alpha}{\partial _\beta}{T^{\alpha \beta}}} \right)$$
(2.51)
$${\partial _\mu}h_\nu ^\mu = {1 \over {{m^2}{M_{{\rm{Pl}}}}}}\left({{\partial _\mu}T_\nu ^\mu + {1 \over 3}{\partial _\nu}T + {2 \over {3{m^2}}}{\partial _\nu}{\partial _\alpha}{\partial _\beta}{T^{\alpha \beta}}} \right).$$
(2.52)

As is already apparent at this level, the massless limit m → 0 is not smooth, which is at the origin of the vDVZ discontinuity. (For instance, we see immediately that for a conserved source the linearized Ricci scalar vanishes, \(\partial_\mu \partial_\nu h^{\mu \nu} - \square h = 0\); see Refs. [465, 497]. This linearized vDVZ discontinuity was recently pointed out again in [193].) As has been known for many decades, this discontinuity (or the fact that the Ricci scalar vanishes) is an artefact of the linearized theory and is resolved by the Vainshtein mechanism [463] as we shall see later.

Plugging these expressions back into the modified Einstein equation, we get

$$\begin{array}{*{20}c} {\left({\square - {m^2}} \right){h_{\mu \nu}} = - {1 \over {{M_{{\rm{Pl}}}}}}\left[ {{T_{\mu \nu}} - {1 \over 3}T{\eta _{\mu \nu}} - {2 \over {{m^2}}}{\partial _{(\mu}}{\partial _\alpha}T_{\nu)}^\alpha + {1 \over {3{m^2}}}{\partial _\mu}{\partial _\nu}T} \right.\quad \quad \quad \quad \quad \quad \quad \quad} \\{\left. {+ {1 \over {3{m^2}}}{\partial _\alpha}{\partial _\beta}{T^{\alpha \beta}}{\eta _{\mu \nu}} + {2 \over {3{m^4}}}{\partial _\mu}{\partial _\nu}{\partial _\alpha}{\partial _\beta}{T^{\alpha \beta}}} \right]} \end{array}$$
(2.53)
$$= - {1 \over {{M_{{\rm{Pl}}}}}}\left[ {{{\tilde \eta}_{\mu (\alpha}}{{\tilde \eta}_{\nu \beta)}} - {1 \over 3}{{\tilde \eta}_{\mu \nu}}{{\tilde \eta}_{\alpha \beta}}} \right]{T^{\alpha \beta}},$$
(2.54)

with

$${{\tilde \eta}_{\mu \nu}} = {\eta _{\mu \nu}} - {1 \over {{m^2}}}{\partial _\mu}{\partial _\nu}.$$
(2.55)

The propagator for a massive spin-2 field is thus given by

$$G_{\mu \nu \alpha \beta}^{{\rm{massive}}}(x,x\prime) = {{f_{\mu \nu \alpha \beta}^{{\rm{massive}}}} \over {\square - {m^2}}},$$
(2.56)

where \(f_{\mu \nu \alpha \beta}^{{\rm{massive}}}\) is the polarization tensor,

$$f_{\mu \nu \alpha \beta}^{{\rm{massive}}} = {\tilde \eta _{\mu (\alpha}}{\tilde \eta _{\nu \beta)}} - {1 \over 3}{\tilde \eta _{\mu \nu}}{\tilde \eta _{\alpha \beta}}.$$
(2.57)

In Fourier space we have

$$\begin{array}{*{20}c} {f_{\mu \nu \alpha \beta}^{{\rm{massive}}}({p_\mu},m) = {2 \over {3{m^4}}}{p_\mu}{p_\nu}{p_\alpha}{p_\beta} + {\eta _{\mu (\alpha}}{\eta _{\nu \beta)}} - {1 \over 3}{\eta _{\mu \nu}}{\eta _{\alpha \beta}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {1 \over {{m^2}}}\left({{p_\alpha}{p_{(\mu}}{\eta _{\nu)\beta}} + {p_\beta}{p_{(\mu}}{\eta _{\nu)\alpha}} - {1 \over 3}{p_\mu}{p_\nu}{\eta _{\alpha \beta}} - {1 \over 3}{p_\alpha}{p_\beta}{\eta _{\mu \nu}}} \right).} \end{array}$$
(2.58)

The amplitude exchanged between two sources Tμν and T′μν via a massive spin-2 field is thus given by

$${\mathcal A}_{T{T{\prime}}}^{{\rm{massive}}} = \int {{d^4}x\;{h_{\mu \nu}}{T^{{\prime}\mu \nu}} =} \int {{d^4}x\;{T^{{\prime}\mu \nu}}{{f_{\mu \nu \alpha \beta}^{{\rm{massive}}}} \over {\square - {m^2}}}\;{T^{\alpha \beta}}} .$$
(2.59)

As mentioned previously, to compare this result with the massless case, the sources ought to be conserved in the massless limit, \({\partial _\mu}T_\nu ^\mu, {\partial _\mu}T_\nu ^{\prime \mu} \to 0\) as m → 0. The gravitational exchange amplitude in the massless limit is thus given by

$${\mathcal A}_{TT\prime}^{m \rightarrow 0} = \int {{d^4}x\;{{T\prime}^{\mu \nu}}{1 \over \square}\;\left({{T_{\mu \nu}} - {1 \over 3}T{\eta _{\mu \nu}}} \right)} .$$
(2.60)

We now compare this result with the amplitude exchanged by a purely massless graviton.

Massless spin-2 field

In the massless case, the equation of motion (2.50) reduces to the linearized Einstein equation

$$\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} = {1 \over {{M_{{\rm{Pl}}}}}}{T_{\mu \nu}},$$
(2.61)

where diffeomorphism invariance requires the stress-energy tensor to be conserved, \({\partial _\mu}T_\nu ^\mu = 0\). In this case the transverse part of this equation is trivially satisfied (as a consequence of the Bianchi identity which follows from symmetry). Since the theory is invariant under diffeomorphism transformations (2.38), we are free to choose a gauge, for instance de Donder (or harmonic) gauge

$${\partial _\mu}h_\nu ^\mu = {1 \over 2}{\partial _\nu}h.$$
(2.62)

In de Donder gauge, the Einstein equation then reduces to

$$\square {h_{\mu \nu}} = - {2 \over {{M_{{\rm{Pl}}}}}}\left({{T_{\mu \nu}} - {1 \over 2}T{\eta _{\mu \nu}}} \right).$$
(2.63)

The propagator for a massless spin-2 field is thus given by

$$G_{\mu \nu \alpha \beta}^{{\rm{massless}}} = {{f_{\mu \nu \alpha \beta}^{{\rm{massless}}}} \over \square},$$
(2.64)

where \(f_{\mu \nu \alpha \beta}^{{\rm{massless}}}\) is the polarization tensor,

$$f_{\mu \nu \alpha \beta}^{{\rm{massless}}} = {\eta _{\mu (\alpha}}{\eta _{\nu \beta)}} - {1 \over 2}{\eta _{\mu \nu}}{\eta _{\alpha \beta}}.$$
(2.65)

The amplitude exchanged between two sources Tμν and T′μν, via a genuinely massless spin-2 field is thus given by

$${\mathcal A}_{TT\prime}^{{\rm{massless}}} = - {2 \over {{M_{{\rm{Pl}}}}}}\int {{{\rm{d}}^4}x\;{{T\prime}^{\mu \nu}}{1 \over \square}\;\left({{T_{\mu \nu}} - {1 \over 2}T{\eta _{\mu \nu}}} \right)} ,$$
(2.66)

and differs from the result (2.60) in the small mass limit. This difference between the massless limit of the massive propagator and the massless propagator (and gravitational exchange amplitude) is a well-known fact and was first pointed out by van Dam, Veltman and Zakharov in 1970 [465, 497]. The resolution to this ‘problem’ lies within the Vainshtein mechanism [463]. In 1972, Vainshtein showed that a theory of massive gravity becomes strongly coupled at a low energy scale when the graviton mass is small. As a result, the linear theory is no longer appropriate to describe the theory in the limit of small mass and one should keep track of the non-linear interactions (very much as we do when approaching the Schwarzschild radius in GR). We shall see in Section 10.1 how a special set of interactions dominates in the massless limit and is responsible for the screening of the extra degrees of freedom present in massive gravity.
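
To see what the mismatch between the factors 1/3 and 1/2 means in practice, one can compare the tensor structures of (2.60) and (2.66) for simple sources. The following is an illustrative sketch rather than part of the original derivation: overall normalizations are dropped, the signature is taken to be mostly plus, and the sources are idealized either as static dust, \(T^{\mu \nu} = \rho \,\delta _0^\mu \delta _0^\nu\) (so that T = −ρ), or as radiation with a traceless stress-energy tensor.

$$\begin{array}{l|c|c} & {\rm massive},\ m \rightarrow 0 & {\rm massless} \\ \hline {\rm two\ dust\ sources\ } \rho ,\,\rho \prime & \left(1 - {1 \over 3}\right)\rho \rho \prime = {2 \over 3}\rho \rho \prime & \left(1 - {1 \over 2}\right)\rho \rho \prime = {1 \over 2}\rho \rho \prime \\ {\rm dust\ source\ and\ traceless\ probe\ } (T\prime = 0) & {T\prime}^{\mu \nu}{T_{\mu \nu}} & {T\prime}^{\mu \nu}{T_{\mu \nu}} \end{array}$$

Matching the Newtonian force between non-relativistic sources therefore requires rescaling the coupling of the massive theory by a factor 3/4 relative to the massless one, after which the interaction with a traceless (light-like) probe is 3/4 of the massless result. At the linear level this would show up as a 25% discrepancy in the bending of light, however small the graviton mass.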

Another ‘non-GR’ effect was also recently pointed out in Ref. [280] where a linear analysis showed that massive gravity predicts different spin-orientations for spinning objects.

From linearized diffeomorphism to full diffeomorphism invariance

When considering the massless and non-interacting spin-2 field in Section 2.2.1, the linear gauge invariance (2.38) is exact. However, if this field is to be probed and to communicate with the rest of the world, the gauge symmetry is forced to include non-linear terms, which in turn forces the kinetic term to become fully non-linear. The result is the well-known fully covariant Einstein-Hilbert term \(M_{{\rm{Pl}}}^2\sqrt {- g}\,R\), where R is the scalar curvature associated with the metric gμν = ημν + hμν/MPl.

To see this explicitly, let us start with the linearized theory and couple it to an external source \(T_0^{\mu \nu}\), via the coupling

$${\mathcal L}_{{\rm{matter}}}^{{\rm{linear}}} = {1 \over {2{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}T_0^{\mu \nu}.$$
(2.67)

This coupling preserves diffeomorphism invariance if the source is conserved, \({\partial _\mu}T_0^{\mu \nu} = 0\). To be more explicit, let us consider a massless scalar field φ which satisfies the Klein-Gordon equation □φ = 0. A natural choice for the stress-energy tensor Tμν is then

$$T_0^{\mu \nu} = {\partial ^\mu}\varphi {\partial ^\nu}\varphi - {1 \over 2}{(\partial \varphi)^2}{\eta ^{\mu \nu}},$$
(2.68)

so that the Klein-Gordon equation automatically guarantees the conservation of the stress-energy tensor on-shell at the linear level and linearized diffeomorphism invariance. However, the very coupling between the scalar field and the spin-2 field affects the Klein-Gordon equation in such a way that beyond the linear order, the stress-energy tensor given in (2.68) fails to be conserved. When considering the coupling (2.67), the Klein-Gordon equation receives corrections of the order of hμν/MPl,

$$\square \varphi = {1 \over {{M_{{\rm{Pl}}}}}}\left({{\partial ^\alpha}({h_{\alpha \beta}}{\partial ^\beta}\varphi) - {1 \over 2}{\partial _\alpha}(h_\beta ^\beta {\partial ^\alpha}\varphi)} \right),$$
(2.69)

implying a failure of conservation of \(T_0^{\mu \nu}\) at the same order,

$${\partial _\mu}T_0^{\mu \nu} = {{{\partial ^\nu}\varphi} \over {{M_{{\rm{Pl}}}}}}\left({{\partial ^\alpha}({h_{\alpha \beta}}{\partial ^\beta}\varphi) - {1 \over 2}{\partial _\alpha}(h_\beta ^\beta {\partial ^\alpha}\varphi)} \right).$$
(2.70)
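
Both (2.69) and (2.70) can be verified directly. The following worked computation is added as a check and assumes the canonically normalized scalar Lagrangian \(- {1 \over 2}(\partial \varphi)^2\) supplemented by the coupling (2.67). Taking the divergence of (2.68),

$${\partial _\mu}T_0^{\mu \nu} = \square \varphi \,{\partial ^\nu}\varphi + {\partial ^\mu}\varphi \,{\partial _\mu}{\partial ^\nu}\varphi - {\partial ^\alpha}\varphi \,{\partial ^\nu}{\partial _\alpha}\varphi = {\partial ^\nu}\varphi \,\square \varphi ,$$

so the source is conserved whenever □φ = 0, and (2.70) follows from (2.69). For (2.69) itself, writing \({h_{\mu \nu}}T_0^{\mu \nu} = {h^{\mu \nu}}{\partial _\mu}\varphi {\partial _\nu}\varphi - {1 \over 2}h{(\partial \varphi)^2}\) and varying with respect to φ gives

$$\delta _\varphi \left( - {1 \over 2}{(\partial \varphi)^2} + {1 \over {2{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}T_0^{\mu \nu} \right) = \delta \varphi \left[ \square \varphi - {1 \over {{M_{{\rm{Pl}}}}}}\left( {\partial ^\alpha}({h_{\alpha \beta}}{\partial ^\beta}\varphi) - {1 \over 2}{\partial _\alpha}(h\,{\partial ^\alpha}\varphi) \right) \right] + {\rm total\ derivatives},$$

whose vanishing reproduces the corrected Klein-Gordon equation.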

The resolution is of course to include non-linear corrections in h/MPl in the coupling with external matter,

$${{\mathcal L}_{{\rm{matter}}}} = {1 \over {2{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}T_0^{\mu \nu} + {1 \over {2M_{{\rm{Pl}}}^2}}{h_{\mu \nu}}{h_{\alpha \beta}}T_1^{\mu \nu \alpha \beta} + \cdots ,$$
(2.71)

and promote diffeomorphism invariance to a non-linearly realized gauge symmetry, symbolically,

$$h \rightarrow h + \partial \xi + {1 \over {{M_{{\rm{Pl}}}}}}\partial (h\xi) + \cdots ,$$
(2.72)

so this gauge invariance is automatically satisfied on-shell order by order in h/MPl, i.e., the scalar field (or general matter field) equations of motion automatically imply the appropriate relation for the stress-energy tensor to all orders in h/MPl. The resulting symmetry is the well-known fully non-linear coordinate transformation invariance (or full diffeomorphism invariance or covariance; see Footnote 4), which requires the stress-energy tensor to be covariantly conserved. To satisfy this symmetry, the kinetic term (2.36) should then be promoted to a fully non-linear contribution,

$${\mathcal L}_{{\rm{kin}}\,{\rm{linear}}}^{{\rm{spin}} - {\rm{2}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}}\quad \rightarrow \quad {\mathcal L}_{{\rm{kin}}\;{\rm{covariant}}}^{{\rm{spin}} - {\rm{2}}} = {{M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} R[g].$$
(2.73)

Just as the linearized version \({h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}}\) was unique, the non-linear realization \(\sqrt {- g} R\) is also unique (see Footnote 5). As a result, any theory of an interacting spin-2 field is necessarily fully non-linear and leads to the theory of gravity where non-linear diffeomorphism invariance (or covariance) plays the role of the local gauge symmetry that projects out four out of the potential six degrees of freedom of the graviton and prevents the excitation of any ghost by the kinetic term.

The situation is very different from that of a spin-1 field as seen earlier, where coupling with other fields can be implemented at the linear order without affecting the U (1) gauge symmetry. The difference is that in the case of a U (1) symmetry, there is a unique nonlinear completion of that symmetry, i.e., the unique nonlinear completion of a U (1) is nothing else but a U (1). Thus any nonlinear Lagrangian which preserves the full U (1) symmetry will be a consistent interacting theory. On the other hand, for spin-2 fields, there are two, and only two, ways to nonlinearly complete linear diffs: one as linear diffs in the full theory and the other as full non-linear diffs. While it is possible to write self-interactions which preserve linear diffs, there are no interactions between matter and hμν which preserve linear diffs. Thus any theory of gravity must exhibit full nonlinear diffs, and it is in this sense that one is led to GR.

Non-linear Stückelberg decomposition

On the need for a reference metric

We have introduced the spin-2 field hμν as the perturbation about flat spacetime. When considering the theory of a field of given spin it is only natural to work with Minkowski as our spacetime metric, since the notion of spin follows from that of Poincaré invariance. Now when extending the theory non-linearly, we may also extend the theory about a different reference metric. When dealing with a reference metric different from Minkowski, one loses the interpretation of the field as massive spin-2, but one can still get a consistent theory. One could also wonder whether it is possible to write a theory of massive gravity without the use of a reference metric at all. This interesting question was investigated in [75], where it was shown that the only consistent alternative is to consider a function of the metric determinant. However, as shown in [75], the consistent function of the determinant is the cosmological constant and does not provide a mass for the graviton.

Non-linear Stückelberg

Full diffeomorphism invariance (or covariance) indicates that the theory should be built out of scalar objects constructed out of the metric gμν and other tensors. However, as explained previously, a theory of massive gravity requires the notion of a reference metric fμν (which may be Minkowski, fμν = ημν; see Footnote 6), and at the linearized level the mass for gravity was not built out of the full metric gμν, but rather out of the fluctuation hμν about this reference metric, which does not transform as a tensor under general coordinate transformations. As a result the mass term breaks covariance.

This result is already transparent at the linear level where the mass term (2.39) breaks linearized diffeomorphism invariance. Nevertheless, that gauge symmetry can always be ‘formally’ restored using the Stückelberg trick, which amounts to replacing the reference metric (so far we have been working with the flat Minkowski metric as the reference) by

$${\eta _{\mu \nu}} \rightarrow ({\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}}}{\partial _{(\mu}}{\chi _{\nu)}}),$$
(2.74)

and transforming χμ under linearized diffeomorphism in such a way that the combination \({h_{\mu \nu}} + 2{\partial _{(\mu}}{\chi _{\nu)}}\) remains invariant. Now that the symmetry is non-linearly realized and replaced by general covariance, this Stückelberg trick should also be promoted to a fully covariant realization.

Following the same Stückelberg trick non-linearly, one can ‘formally restore’ covariance by including four Stückelberg fields ϕa (a = 0, 1, 2, 3) and promoting the reference metric fμν, which may or may not be Minkowski, to a tensor [446, 27],

$${f_{\mu \nu}} \rightarrow {\tilde f_{\mu \nu}} = {\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}{f_{ab}}$$
(2.75)

As we can see from this last expression, \({{\tilde f}_{\mu \nu}}\) transforms as a tensor under coordinate transformations as long as each of the four fields ϕa transforms as a scalar. We may now construct the theory of massive gravity as a scalar Lagrangian of the tensors \({{\tilde f}_{\mu \nu}}\) and gμν. In unitary gauge, where the Stückelberg fields are ϕa = xa, we simply recover \({{\tilde f}_{\mu \nu}} = {f_{\mu \nu}}\).

This Stückelberg trick for massive gravity dates back to Green and Thorn [267] and to Siegel [446], where it was introduced within the context of open string theory. In the same way as the massless graviton naturally emerges in the closed string sector, open strings also have spin-2 excitations but whose lowest energy state is massive at tree level (they only become massless once quantum corrections are considered). Thus at the classical level, open strings contain a description of massive excitations of a spin-2 field, where gauge invariance is restored thanks to the same Stückelberg fields as introduced in this section. In open string theory, these Stückelberg fields naturally arise from the ghost coordinates. When constructing the non-linear theory of massive gravity from extra dimensions, we shall see that in that context the Stückelberg fields naturally arise as the shift from the extra dimension.

For later convenience, it will be useful to construct the following tensor quantity,

$${\mathbb X}_{\;\nu}^\mu = {g^{\mu \alpha}}{\tilde f_{\alpha \nu}} = {\partial ^\mu}{\phi ^a}{\partial _\nu}{\phi ^b}{f_{ab}},$$
(2.76)

In unitary gauge, \({\mathbb X} = {g^{- 1}}f\).

Alternative Stückelberg trick

An alternative way to Stückelberize the reference metric is to express it as

$${g^{ac}}{f_{cb}} \rightarrow {\mathbb Y}_b^a = {g^{\mu \nu}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^c}{f_{cb}}.$$
(2.77)

As nicely explained in Ref. [14], both matrices \({\mathbb X}_{\;\nu}^\mu\) and \({\mathbb Y}_{\;b}^a\) have the same eigenvalues, so one can choose either one of them in the definition of the massive gravity Lagrangian without any distinction. The formulation in terms of \(\mathbb Y\) rather than \(\mathbb X\) was originally used in Ref. [94], although unsuccessfully, as the potential proposed there exhibits the BD ghost instability (see for instance Ref. [60]).
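
The equality of the eigenvalues can be seen with a short matrix argument, sketched here under the assumption that \(\Lambda _{\;\mu}^a \equiv {\partial _\mu}{\phi ^a}\) is treated as a square 4 × 4 matrix (the symbols Λ, P and Q are introduced only for this illustration). In matrix notation, (2.76) and (2.77) read

$${\mathbb X} = {g^{- 1}}{\Lambda ^T}f\,\Lambda = PQ,\qquad {\mathbb Y} = \Lambda \,{g^{- 1}}{\Lambda ^T}f = QP,\qquad P \equiv {g^{- 1}}{\Lambda ^T}f,\quad Q \equiv \Lambda .$$

For square matrices, PQ and QP share the same characteristic polynomial,

$$\det (PQ - \lambda \,{\mathbb I}) = \det (QP - \lambda \,{\mathbb I}),$$

hence the same eigenvalues, and in particular \([{{\mathbb X}^n}] = [{{\mathbb Y}^n}]\) for every power n, so any scalar potential built from traces of one matrix can equally be written in terms of the other.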

Helicity decomposition

If we now focus on the flat reference metric, fμν = ημν, further split the Stückelberg fields as \({\phi ^a} = {x^a} - {1 \over {{M_{{\rm{Pl}}}}}}{\chi ^a}\), and identify the index a with a Lorentz index (see Footnote 7), we obtain the non-linear generalization of the Stückelberg trick used in Section 2.2.2

$${\eta _{\mu \nu}} \rightarrow {\tilde f_{\mu \nu}} = {\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}}}{\partial _{(\mu}}{\chi _{\nu)}} + {1 \over {M_{{\rm{Pl}}}^2}}{\partial _\mu}{\chi ^a}{\partial _\nu}{\chi ^b}{\eta _{ab}}$$
(2.78)
$$\begin{array}{*{20}c} = {\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}m}}{\partial _{(\mu}}{A_{\nu)}} - {2 \over {{M_{{\rm{Pl}}}}{m^2}}}{\Pi _{\mu \nu}}{\quad \quad \quad \quad \quad \quad \quad} \\{{\rm{+}}{1 \over {M_{{\rm{Pl}}}^2{m^2}}}{\partial _\mu}{A^\alpha}{\partial _\nu}{A_\alpha} + {2 \over {M_{{\rm{Pl}}}^2{m^3}}}{\partial _\mu}{A^\alpha}{\Pi _{\nu \alpha}} + {1 \over {M_{{\rm{Pl}}}^2{m^4}}}\Pi _{\mu \nu}^2.} \end{array}$$
(2.79)

where in the second equality we have used the split of \(\chi^a\) performed in (2.46) in terms of the helicity-0 and -1 modes, and all indices are raised and lowered with respect to ημν.
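
The expansion leading to (2.78) and (2.79) can be spelled out as a worked step. With \({\phi ^a} = {x^a} - {\chi ^a}/{M_{{\rm{Pl}}}}\) one has \({\partial _\mu}{\phi ^a} = \delta _\mu ^a - {\partial _\mu}{\chi ^a}/{M_{{\rm{Pl}}}}\), so that

$${\tilde f_{\mu \nu}} = \left( \delta _\mu ^a - {1 \over {{M_{{\rm{Pl}}}}}}{\partial _\mu}{\chi ^a} \right)\left( \delta _\nu ^b - {1 \over {{M_{{\rm{Pl}}}}}}{\partial _\nu}{\chi ^b} \right){\eta _{ab}} = {\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}}}{\partial _{(\mu}}{\chi _{\nu)}} + {1 \over {M_{{\rm{Pl}}}^2}}{\partial _\mu}{\chi ^a}{\partial _\nu}{\chi ^b}{\eta _{ab}}.$$

Inserting \({\chi ^a} = {A^a}/m + {\partial ^a}\pi /{m^2}\) from (2.46), the quadratic piece becomes

$${\partial _\mu}{\chi ^a}{\partial _\nu}{\chi _a} = {1 \over {{m^2}}}{\partial _\mu}{A^\alpha}{\partial _\nu}{A_\alpha} + {1 \over {{m^3}}}\left( {\partial _\mu}{A^\alpha}{\Pi _{\nu \alpha}} + {\partial _\nu}{A^\alpha}{\Pi _{\mu \alpha}} \right) + {1 \over {{m^4}}}{\Pi _{\mu \alpha}}\Pi _{\;\nu}^\alpha ,$$

which reproduces the second line of (2.79) once the cross term there is read as symmetrized in μ and ν and \(\Pi _{\mu \nu}^2\) is understood as \({\Pi _{\mu \alpha}}\Pi _{\;\nu}^\alpha\).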

In other words, the fluctuations about flat spacetime are promoted to the tensor Hμν

$${h_{\mu \nu}} = {M_{{\rm{Pl}}}}\left({{g_{\mu \nu}} - {\eta _{\mu \nu}}} \right)\quad \rightarrow \quad {H_{\mu \nu}} = {M_{{\rm{Pl}}}}\left({{g_{\mu \nu}} - {{\tilde f}_{\mu \nu}}} \right)$$
(2.80)

with

$${H_{\mu \nu}} = {h_{\mu \nu}} + 2{\partial _{(\mu}}{\chi _{\nu)}} - {1 \over {{M_{{\rm{Pl}}}}}}{\eta _{ab}}{\partial _\mu}{\chi ^a}{\partial _\nu}{\chi ^b}$$
(2.81)
$$\begin{array}{*{20}c} {= {h_{\mu \nu}} + {2 \over m}{\partial _{(\mu}}{A_{\nu)}} + {2 \over {{m^2}}}{\Pi _{\mu \nu}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\{- {1 \over {{M_{{\rm{Pl}}}}{m^2}}}{\partial _\mu}{A^\alpha}{\partial _\nu}{A_\alpha} - {2 \over {{M_{{\rm{Pl}}}}{m^3}}}{\partial _\mu}{A^\alpha}{\Pi _{\nu \alpha}} - {1 \over {{M_{{\rm{Pl}}}}{m^4}}}\Pi _{\mu \nu}^2.} \end{array}$$
(2.82)

The fields \(\chi^a\) are introduced to restore the gauge invariance (full diffeomorphism invariance). We can now always set a gauge where hμν is transverse and traceless at the linearized level and Aμ is transverse. In this gauge the quantities hμν, Aμ and π represent the helicity decomposition of the metric: hμν is the helicity-2 part of the graviton, Aμ the helicity-1 part and π the helicity-0 part. The fact that these quantities continue to correctly identify the physical degrees of freedom non-linearly in the limit MPl → ∞ is non-trivial and has been derived in [143].

Non-linear Fierz-Pauli

The most straightforward non-linear extension of the Fierz-Pauli mass term is as follows

$$\mathcal L^{\mathrm{(nl1)}}_{\mathrm{FP}}=-m^2 {M_{{\rm{Pl}}}^2}\sqrt{-g} \left([(\mathbb{I}-\mathbb{X})^2]-[\mathbb{I}-\mathbb{X}]^2\right),$$
(2.83)

this mass term is then invariant under non-linear coordinate transformations. This non-linear formulation was used for instance in [27]. Alternatively, one may also generalize the Fierz-Pauli mass non-linearly as follows [75]

$$\mathcal L^{\mathrm{(nl2)}}_{\mathrm{FP}}=-m^2{M_{{\rm{Pl}}}^2}\sqrt{-g}\sqrt{\det \mathbb X} \left([(\mathbb{I}-\mathbb{X}^{-1})^2]-[\mathbb{I}-\mathbb{X}^{-1}]^2\right).$$
(2.84)

A priori, the linear Fierz-Pauli action for massive gravity can be extended non-linearly in an arbitrary number of ways. However, as we shall see below, most of these generalizations generate a ghost non-linearly, known as the Boulware-Deser (BD) ghost. In Part II, we shall see that the extension of the Fierz-Pauli to a non-linear theory free of the BD ghost is unique (up to two constant parameters).

Boulware-Deser ghost

The easiest way to see the appearance of a ghost at the non-linear level is to follow the Stückelberg trick non-linearly and observe the appearance of an Ostrogradsky instability [111, 173], although the original formulation was performed in unitary gauge in [75] in the ADM language (Arnowitt, Deser and Misner, see Ref. [29]). In this section we shall focus on the flat reference metric, fμν = ημν.

Focusing solely on the helicity-0 mode π to start with, the tensor \({\mathbb X}_{\;\nu}^\mu\) defined in (2.76) is expressed as

$$\mathbb X_{\;\nu}^\mu = \delta _{\;\nu}^\mu - {2 \over {{M_{{\rm{Pl}}}}{m^2}}}\Pi _{\;\nu}^\mu + {1 \over {M_{{\rm{Pl}}}^2{m^4}}}\Pi _\alpha ^\mu \Pi _\nu ^\alpha ,$$
(2.85)

where at this level all indices are raised and lowered with respect to the flat reference metric ημν. The Fierz-Pauli mass term (2.83) then reads

$$\mathcal L_{{\rm{FP}},\pi}^{({\rm{nl}}1)} = - {4 \over {{m^2}}}\left({[{\Pi ^2}] - {{[\Pi ]}^2}} \right) + {4 \over {{M_{{\rm{Pl}}}}{m^4}}}\left({[{\Pi ^3}] - [\Pi ][{\Pi ^2}]} \right) - {1 \over {M_{{\rm{Pl}}}^2{m^6}}}\left({[{\Pi ^4}] - {{[{\Pi ^2}]}^2}} \right).$$
(2.86)

Upon integration by parts, we notice that the quadratic term in (2.86) is a total derivative, which is another way to see the special structure of the Fierz-Pauli mass term. Unfortunately this special fact does not propagate to higher order, and the cubic and quartic interactions are genuine higher order operators which lead to equations of motion with fourth- and third-order derivatives. In other words these higher order operators \(([{\Pi ^3}] - [\Pi ][{\Pi ^2}])\) and \(([{\Pi ^4}] - {[{\Pi ^2}]^2})\) propagate an additional degree of freedom which, by Ostrogradsky’s theorem, always enters as a ghost. While at the linear level these operators might be irrelevant, their existence implies that one can always find an appropriate background configuration π = π0 + δπ, such that the ghost is manifest

$${\mathcal L}_{{\rm{FP}},\pi}^{({\rm{nl}}1)} = {4 \over {{M_{{\rm{Pl}}}}{m^4}}}{Z^{\mu \nu \alpha \beta}}{\partial _\mu}{\partial _\nu}\delta \pi {\partial _\alpha}{\partial _\beta}\delta \pi ,$$
(2.87)

with \({Z^{\mu \nu \alpha \beta}} = 3{\partial ^\mu}{\partial ^\alpha}{\pi _0}{\eta ^{\nu \beta}} - \square {\pi _0}{\eta ^{\mu \alpha}}{\eta ^{\nu \beta}} - 2{\partial ^\mu}{\partial ^\nu}{\pi _0}{\eta ^{\alpha \beta}} + \cdots\). This implies that non-linearly (or around a non-trivial background), the Fierz-Pauli mass term propagates an additional degree of freedom which is a ghost, namely the BD ghost. The mass of this ghost depends on the background configuration π0,

$$m_{{\rm{ghost}}}^2 \sim {{{M_{{\rm{Pl}}}}{m^4}} \over {{\partial ^2}{\pi _0}}}.$$
(2.88)

As we shall see below, the resolution of the vDVZ discontinuity lies in the Vainshtein mechanism for which the field takes a large vacuum expectation value, \(\partial^2\pi_0 \gtrsim M_{\rm{Pl}}m^2\), which in the present context would lead to a ghost with an extremely low mass, \(m_{{\rm{ghost}}}^2 \lesssim {m^2}\).
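Both statements above can be checked in two lines. This is only a sketch, assuming \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\) (the definition entering (2.76)) and a background of the Vainshtein size quoted above. The quadratic term of (2.86) is a total derivative because

$$[\Pi^2] - [\Pi]^2 = \partial_\mu\partial_\nu\pi\,\partial^\mu\partial^\nu\pi - (\square\pi)^2 = \partial_\mu\left(\partial_\nu\pi\,\partial^\mu\partial^\nu\pi - \partial^\mu\pi\,\square\pi\right),$$

while inserting the background value into (2.88) gives

$$\partial^2\pi_0 \gtrsim M_{\rm{Pl}}m^2 \quad \Rightarrow \quad m_{{\rm{ghost}}}^2 \sim {{M_{\rm{Pl}}m^4} \over {\partial^2\pi_0}} \lesssim m^2\, .$$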

Choosing another non-linear extension for the Fierz-Pauli mass term as in (2.84) does not seem to help much,

$$\begin{array}{rcl} {\mathcal L}_{{\rm{FP}},\pi}^{({\rm{nl}}2)} &=& - {4 \over {{m^2}}}\left({[{\Pi ^2}] - {{[\Pi ]}^2}} \right) - {4 \over {{M_{{\rm{Pl}}}}{m^4}}}\left({{{[\Pi ]}^3} - 4[\Pi ][{\Pi ^2}] + 3[{\Pi ^3}]} \right) + \cdots \\ &\rightarrow& {4 \over {{M_{{\rm{Pl}}}}{m^4}}}\left({[\Pi ][{\Pi ^2}] - [{\Pi ^3}]} \right) + \cdots \end{array}$$
(2.89)

where we have integrated by parts on the second line, and we recover exactly the same type of higher derivatives already at the cubic level, so the BD ghost is also present in (2.84).

Alternatively the mass term was also generalized to include curvature invariants as in Ref. [69]. This theory was shown to be ghost-free at the linear level on FLRW but not yet non-linearly.

Function of the Fierz-Pauli mass term

As an extension of the Fierz-Pauli mass term, one could instead write a more general function of it, as considered in Ref. [75]

$${{\mathcal L}_{F({\rm{FP}})}} = - {m^2}\sqrt {- g} F\left({{g^{\mu \nu}}{g^{\alpha \beta}}({H_{\mu \alpha}}{H_{\nu \beta}} - {H_{\mu \nu}}{H_{\alpha \beta}})} \right),$$
(2.90)

however, one can easily see that, if a mass term is actually present, i.e., F ′ ≠ 0, there is no analytic choice of the function F which would circumvent the non-linear propagation of the BD ghost. Expanding F into a Taylor expansion, we see for instance that the only choice to prevent the cubic higher-derivative interactions in π, \([\Pi^3] - [\Pi][\Pi^2]\), is F ′(0) = 0, which removes the mass term at the same time. If F (0) ≠ 0 but F ′(0) = 0, the theory is massless about the specific reference metric, but infinitely strongly coupled about other backgrounds.

Instead, to prevent the presence of the BD ghost fully non-linearly (or equivalently about any background), one should construct the mass term (or rather potential term) in such a way that all the higher-derivative operators involving the helicity-0 mode \((\partial^2\pi)^n\) are total derivatives. This is precisely what is achieved in the “ghost-free” model of massive gravity presented in Part II. In Part I, which follows, we shall use higher-dimensional GR to get some insight and intuition on how to construct a consistent theory of massive gravity.

Part I Massive Gravity from Extra Dimensions

Higher-Dimensional Scenarios

As seen in Section 2.5, the ‘most natural’ non-linear extension of the Fierz-Pauli mass term bears a ghost. Constructing consistent theories of massive gravity has actually been a challenging task for years, and higher-dimensional scenarios can provide excellent frameworks for explicit realizations of massive gravity. The main motivation behind relying on higher-dimensional gravity is twofold:

  • The five-dimensional theory is explicitly covariant.

  • A massless spin-2 field in five dimensions has five degrees of freedom which corresponds to the correct number of dofs for a massive spin-2 field in four dimensions without the pathological BD ghost.

While string theory and other higher-dimensional theories give rise naturally to massive gravitons, they usually include a massless zero-mode. Furthermore, in the simplest models, as soon as the first massive mode is relevant so is an infinite tower of massive (Kaluza-Klein) modes, and one is never in a regime where a single massive graviton dominates. At least this was the situation until the Dvali-Gabadadze-Porrati model (DGP) [208, 209, 207] provided the first explicit model of (soft) massive gravity, based on a higher-dimensional braneworld model.

In the DGP model the graviton has a soft mass in the sense that its propagator does not have a simple pole at fixed value m, but rather admits a resonance. Considering the Kallen-Lehmann spectral representation [331, 374], the spectral density function \(\rho(\mu^2)\) in DGP is of the form

$${\rho _{{\rm{DGP}}}}({\mu ^2})\sim {{{m_0}} \over {\pi \mu}}{1 \over {{\mu ^2} + m_0^2}}\, ,$$
(3.1)

and so DGP corresponds to a theory of massive gravity with a resonance of width \(\Delta m \sim m_0\) about \(m = 0\).

In a Kaluza-Klein decomposition of a flat extra dimension we have, on the other hand, an infinite tower of massive modes with spectral density function

$${\rho _{{\rm{KK}}}}({\mu ^2})\sim \sum\limits_{n = 0}^\infty \delta ({\mu ^2} - {(n{m_0})^2})\, .$$
(3.2)

We shall see in the section on deconstruction (Section 5) how one can truncate this infinite tower by performing a discretization in real space rather than in momentum space à la Kaluza-Klein, so as to obtain a theory of a single massive graviton

$${\rho _{{\rm{MG}}}}({\mu ^2})\sim \delta ({\mu ^2} - m_0^2)\, ,$$
(3.3)

or a theory of multi-gravity (with N-interacting gravitons),

$${\rho _{{\rm{multi - gravity}}}}({\mu ^2})\sim \sum\limits_{n = 0}^N \delta ({\mu ^2} - {(n{m_0})^2})\, .$$
(3.4)

In this language, bi-gravity is the special case of multi-gravity where N = 2. These different spectral representations, together with the cascading gravity extension of DGP are represented in Figure 1.

Figure 1

Spectral representation of different models. (a) DGP (b) higher-dimensional cascading gravity and (c) multi-gravity. Bi-gravity is the special case of multi-gravity with one massless mode and one massive mode. Massive gravity is the special case where only one massive mode couples to the rest of the standard model and the other modes decouple. (a) and (b) are models of soft massive gravity where the graviton mass can be thought of as a resonance.

Recently, another higher dimensional embedding of bi-gravity was proposed in Ref. [495]. Rather than performing a discretization of the extra dimension, the idea behind this model is to consider a two-brane DGP model, where the radion or separation between these branes is stabilized via a Goldberger-Wise stabilization mechanism [255] where the brane and the bulk include a specific potential for the radion. At low energy the mass spectrum can be truncated to a massless mode and a massive mode, reproducing a bi-gravity theory. However, the stabilization mechanism involves a relatively low scale and the correspondence breaks down above it. Nevertheless, this provides a first proof of principle for how to embed such a model in a higher-dimensional picture without discretization and could be useful to tackle some of the open questions of massive gravity.

In what follows we review how five-dimensional gravity is a useful starting point in order to generate consistent four-dimensional theories of massive gravity, either for soft-massive gravity à la DGP and its extensions, or for hard massive gravity following a deconstruction framework.

The DGP model has played the role of a precursor for many developments in modified and massive gravity and it is beyond the scope of this review to summarize all of them. In this review we briefly summarize the DGP model and some key aspects of its phenomenology, and refer the reader to other reviews (see for instance [232, 390, 234]) for more details on the subject.

In this section, A, B, C ⋯ = 0, …, 4 represent five-dimensional spacetime indices and μ, ν, α ⋯ = 0, …, 3 label four-dimensional spacetime indices. \(y = x^4\) represents the fifth additional dimension, \(\{x^A\} = \{x^\mu, y\}\). The five-dimensional metric is given by \({}^{(5)}g_{AB}(x, y)\) while the four-dimensional metric is given by \(g_{\mu\nu}(x)\). The five-dimensional scalar curvature is \({}^{(5)}R[G]\) while \(R = R[g]\) is the four-dimensional scalar curvature. We use the same notation for the Einstein tensor, where \({}^{(5)}G_{AB}\) is the five-dimensional one and \(G_{\mu\nu}\) represents the four-dimensional one built out of \(g_{\mu\nu}\).

When working in the Einstein-Cartan formalism of gravity, \({\mathbb A},{\mathbb B},{\mathbb C}\) label five-dimensional Lorentz indices and a,b,c = ⋯ label the four-dimensional ones.

The Dvali-Gabadadze-Porrati Model

The idea behind the DGP model [209, 208, 207] is to start with a four-dimensional braneworld in an infinite-size extra dimension. A priori gravity would then be fully five-dimensional, with respective Planck scale M5, but the matter fields localized on the brane could lead to an induced curvature term on the brane with respective Planck scale MPl. See [22] for a potential embedding of this model within string theory.

At small distances the induced curvature dominates and gravity behaves as in four dimensions, while at large distances the leakage of gravity within the extra dimension weakens the force of gravity. The DGP model is thus a model of modified gravity in the infrared, and as we shall see, the graviton effectively acquires a soft mass, or resonance.

Gravity induced on a brane

We start with the five-dimensional action for the DGP model [209, 208, 207] with a brane localized at y = 0,

$$S = \int {{{\rm{d}}^4}x\,{\rm{d}}y\left({{{M_5^3} \over 4}\sqrt {{- ^{(5)}}g} {\;^{(5)}}R + \delta (y)\left[ {\sqrt {- g} {{M_{{\rm{Pl}}}^2} \over 2}R[g] + {{\mathcal L}_m}(g,\,{\psi _i})} \right]} \right)} \, ,$$
(4.1)

where ψi represent matter field species confined to the brane with stress-energy tensor Tμν. This brane is considered to be an orbifold brane enjoying a ℤ2-orbifold symmetry (so that the physics at y < 0 is the mirror copy of that at y > 0). We choose the convention where we consider −∞ < y < ∞, which is why we have a factor of \(M_5^3/4\) rather than the \(M_5^3/2\) we would have had if we had only considered one side of the brane, for instance y ≥ 0.

The five-dimensional Einstein equations of motion are then given by

$$M_5^3{\;^{(5)}}{G_{AB}} = 2\delta (y){\,^{(5)}}{T_{AB}}$$
(4.2)

with

$$^{(5)}{T_{AB}} = \left({- M_{{\rm{Pl}}}^2{G_{\mu \nu}} + {T_{\mu \nu}}} \right)\delta _A^\mu \delta _B^\nu \, .$$
(4.3)

The Israel matching condition on the brane [323] can be obtained by integrating this equation across the brane, \(\int\nolimits_{- \varepsilon}^\varepsilon {{\rm{d}}y}\), and taking the limit ε → 0, so that the jump in the extrinsic curvature across the brane is related to the Einstein tensor and stress-energy tensor of the matter field confined on the brane.

Perturbations about flat spacetime

In DGP the four-dimensional graviton is effectively massive. To see this explicitly, we look at perturbations about flat spacetime

$${\rm{d}}s_5^2 = \left({{\eta _{AB}} + {h_{AB}}\left({x,y} \right)} \right){\rm{d}}{x^A}\,{\rm{d}}{x^B}\, .$$
(4.4)

Since at this level we are dealing with five-dimensional GR, we are free to set the five-dimensional gauge of our choice and choose five-dimensional de Donder gauge (a discussion about the brane-bending mode will follow)

$${\partial _A}h_B^A = {1 \over 2}{\partial _B}h_A^A\, .$$
(4.5)

In this gauge the five-dimensional Einstein tensor is simply

$$^{{\rm{(5)}}}{G_{AB}} = - {1 \over 2}{\square_5}\left({{h_{AB}} - {1 \over 2}h_{C}^C{\eta _{AB}}} \right)\, ,$$
(4.6)

where \({\square_5} = \square + \partial _y^2\) is the five-dimensional d’Alembertian and □ is the four-dimensional one.

Since there is no source along the μy or yy directions (\({}^{(5)}T_{\mu y} = 0 = {}^{(5)}T_{yy}\)), we can immediately infer that

$${\square_5}{h_{\mu y}} = 0\quad \Rightarrow \quad {h_{\mu y}} = 0$$
(4.7)
$${\square_5}\left({{h_{yy}} - h_{\mu}^\mu} \right) = 0\quad \Rightarrow \quad {h_{yy}} = h_{\mu}^\mu$$
(4.8)

up to a homogeneous mode which in this setup we set to zero. This does not properly account for the brane-bending mode, but for the sake of this analysis it will give the correct expression for the metric fluctuation \(h_{\mu\nu}\). We will see in Section 4.2 how to keep track of the brane-bending mode, which is partly encoded in \(h_{yy}\).

Using these relations in the five-dimensional de Donder gauge, we deduce the relation for the purely four-dimensional part of the metric perturbation,

$${\partial _\mu}h_{\nu}^\mu = {\partial _\nu}h_{\mu}^\mu \, .$$
(4.9)
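Spelling out this step, under the assumption made above that the homogeneous modes are set to zero: inserting \(h_{\mu y} = 0\) and \(h_{yy} = h^\mu_{\mu} \equiv h\) into the \(B = \nu\) component of (4.5) gives

$${\partial _A}h_\nu ^A = {\partial _\mu}h_\nu ^\mu + {\partial _y}h_\nu ^y = {\partial _\mu}h_\nu ^\mu \, ,\qquad {1 \over 2}{\partial _\nu}h_A^A = {1 \over 2}{\partial _\nu}\left({h + {h_{yy}}} \right) = {\partial _\nu}h\, ,$$

so the two sides of (4.5) reduce to (4.9). The \(B = y\) component then reads \(\partial_y h_{yy} = \partial_y h\), which is automatically satisfied by (4.8).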

Using these relations in the projected Einstein equation, we get

$${1 \over 2}M_5^3\left[{\square + \partial _y^2} \right]\left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right) = - \delta (y)\left({2{T_{\mu \nu}} + M_{{\rm{Pl}}}^2\left({{\square h_{\mu \nu}} - {\partial _\mu}{\partial _\nu}h} \right)} \right)\, ,$$
(4.10)

where \(h \equiv h_\alpha ^\alpha = {\eta ^{\mu \nu}}{h_{\mu \nu}}\) is the four-dimensional trace of the perturbations.

Solving this equation with the requirement that hμν → 0 as y → ±∞, we infer the following profile for the perturbations along the extra dimension

$${h_{\mu \nu}}(x,y) = {e^{- \vert y\vert \sqrt {-\square}}}{h_{\mu \nu}}(x)\, ,$$
(4.11)

where the □ should really be thought of in Fourier space, and hμν (x) is set from the boundary conditions on the brane. Integrating the Einstein equation across the brane, from −ε to +ε, we get

$${1 \over 2}\lim\limits_{\varepsilon \rightarrow 0} M_5^3\left[ {{\partial _y}\left({{h_{\mu \nu}}(x,y) - h(x,y){\eta _{\mu \nu}}} \right)} \right]_{- \varepsilon}^\varepsilon + M_{{\rm{Pl}}}^2\left({\square {h_{\mu \nu}}(x,0) - {\partial _\mu}{\partial _\nu}h(x,0)} \right) = - 2{T_{\mu \nu}}(x)\, ,$$
(4.12)

yielding the modified linearized Einstein equation on the brane

$$M_{{\rm{Pl}}}^2\left[ {(\square{h_{\mu \nu}} - {\partial _\mu}{\partial _\nu}h) - {m_0}\sqrt {-\square} \left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right)} \right] = - 2{T_{\mu \nu}}\, ,$$
(4.13)

where all the metric perturbations are the ones localized at y = 0 and the constant mass scale m0 is given by

$${m_0} = {{M_5^3} \over {M_{{\rm{Pl}}}^2}}\, .$$
(4.14)
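The step from the junction condition to (4.13) can be made explicit. As a sketch, treating \(\sqrt{-\square}\) as multiplication by the momentum \(k\) in Fourier space and using the same \(y\)-profile (4.11) for the trace \(h\), the bulk profile gives

$${\partial _y}{h_{\mu \nu}}(x,y) = - {\rm{sign}}(y)\sqrt {- \square} \,{h_{\mu \nu}}(x,y)\quad \Rightarrow \quad \left[ {{\partial _y}{h_{\mu \nu}}} \right]_{- \varepsilon}^\varepsilon \rightarrow - 2\sqrt {- \square} \,{h_{\mu \nu}}(x,0)\, ,$$

so that the bulk contribution to the integrated equation is

$${1 \over 2}M_5^3\left({- 2\sqrt {- \square}} \right)\left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right) = - M_{{\rm{Pl}}}^2\,{m_0}\sqrt {- \square} \left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right)\, ,$$

where (4.14) was used in the last equality. Adding the induced curvature term \(M_{\rm{Pl}}^2(\square h_{\mu\nu} - \partial_\mu\partial_\nu h)\) and equating to \(-2T_{\mu\nu}\) reproduces (4.13).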

Interestingly, we see the special Fierz-Pauli combination \(h_{\mu\nu} - h\eta_{\mu\nu}\) appearing naturally from the five-dimensional nature of the theory. At this level, this corresponds to a linearized theory of massive gravity with a scale-dependent effective mass \({m^2}(\square) = {m_0}\sqrt {- \square}\), which in Fourier space reads \(m^2(k) = m_0 k\). We could now follow the same procedure as derived in Section 2.2.3 and obtain the expression for the sourced metric fluctuation on the brane

$${h_{\mu \nu}} = - {2 \over {M_{{\rm{Pl}}}^2}}{1 \over {\square - {m_0}\sqrt {-\square}}}\left({{T_{\mu \nu}} - {1 \over 3}T{\eta _{\mu \nu}} + {1 \over {3{m_0}\sqrt {-\square}}}{\partial _\mu}{\partial _\nu}T} \right)\, ,$$
(4.15)

where \(T = \eta^{\mu\nu}T_{\mu\nu}\) is the trace of the four-dimensional stress-energy tensor localized on the brane. This yields the following gravitational exchange amplitude between two conserved sources \(T_{\mu\nu}\) and \(T'_{\mu\nu}\)

$${\mathcal A}_{TT\prime}^{{\rm{DGP}}} = \int {{{\rm{d}}^4}} x\;{h_{\mu \nu}}{T\prime^{\mu \nu}} = \int {{{\rm{d}}^4}} x\,{T\prime^{\mu \nu}}{{f_{\mu \nu \alpha \beta}^{{\rm{massive}}}} \over {\square - {m_0}\sqrt {-\square}}}\;{T^{\alpha \beta}}\, ,$$
(4.16)

where the polarization tensor \(f_{\mu \nu\alpha \beta}^{{\rm{massive}}}\) is the same as that given for Fierz-Pauli in (2.57) in terms of m0. In particular the polarization tensor includes the standard factor of \(-{1 \over 3}T\eta_{\mu\nu}\) as opposed to \(-{1 \over 2}T\eta_{\mu\nu}\) as would be the case in GR. This is again the manifestation of the vDVZ discontinuity, which is cured by the Vainshtein mechanism as for Fierz-Pauli massive gravity. See [165] for the explicit realization of the Vainshtein mechanism in DGP, which is where it was first shown to work explicitly.

Spectral representation

In Fourier space the propagator for the graviton in DGP is given by

$$\tilde G_{\mu \nu \alpha \beta}^{{\rm{massive}}}(k) = f_{\mu \nu \alpha \beta}^{{\rm{massive}}}(k,{m_0})\;\tilde {\mathcal G}(k)\, ,$$
(4.17)

with the massive polarization tensor fmassive defined in (2.58) and

$$\tilde {\mathcal G}(k) = {1 \over {{k^2} + {m_0}k}}\, ,$$
(4.18)

which can be written in the Kallen-Lehmann spectral representation as a sum of free propagators with mass μ,

$$\tilde {\mathcal G}(k) = \int\nolimits_0^\infty {{{\rho ({\mu ^2})} \over {{k^2} + {\mu ^2}}}} {\rm{d}}{\mu ^2}\, ,$$
(4.19)

with the spectral density \(\rho(\mu^2)\)

$$\rho ({\mu ^2}) = {1 \over \pi}{{{m_0}} \over \mu}{1 \over {{\mu ^2} + m_0^2}}\, ,$$
(4.20)

which is represented in Figure 1. As already emphasized, the graviton in DGP cannot be thought of as a single massive mode, but rather as a resonance peaked about μ = 0.

We see that the spectral density is positive for any \(\mu^2 > 0\), confirming the fact that about the normal (flat) branch of DGP there is no ghost.

Notice as well that in the massless limit m0 → 0, we see appearing a representation of the Dirac delta function,

$$\lim\limits_{{m_0} \rightarrow 0} {1 \over \pi}{{{m_0}} \over \mu}{1 \over {{\mu ^2} + m_0^2}} = \delta ({\mu ^2})\, ,$$
(4.21)

and so the massless mode is singled out in the massless limit of DGP (with the different tensor structure given by \(f_{\mu \nu\alpha \beta}^{{\rm{massive}}} \ne f_{\mu \nu\alpha \beta}^{(0)}\), which is the origin of the vDVZ discontinuity; see Section 2.2.3).
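The spectral representation can be checked numerically. The sketch below is not part of the original derivation: it evaluates the integral (4.19) with the density (4.20) by a midpoint rule, after the substitution μ = m0 tan θ (an illustrative choice that maps the integration range onto a finite interval), and compares it with the closed form (4.18). The function names are hypothetical. It also evaluates the spectral weight below a cutoff, which tends to one as m0 → 0, in line with the delta-function limit (4.21).

```python
import math

def propagator_from_spectral(k, m0, n=20000):
    """Integral (4.19) with the DGP density (4.20), via the midpoint rule.

    With d(mu^2) = 2 mu d(mu) and mu = m0 * tan(theta), the integral becomes
    (2/pi) * int_0^{pi/2} cos^2 / (m0^2 sin^2 + k^2 cos^2) d(theta).
    """
    step = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * step
        c2 = math.cos(theta) ** 2
        s2 = math.sin(theta) ** 2
        total += c2 / (m0**2 * s2 + k**2 * c2)
    return (2 / math.pi) * total * step

def propagator_closed_form(k, m0):
    """Closed-form DGP propagator factor (4.18)."""
    return 1.0 / (k**2 + m0 * k)

def spectral_weight_below(cutoff, m0):
    """Integral of rho(mu^2) d(mu^2) over 0 < mu < cutoff.

    Analytically this is (2/pi) * atan(cutoff / m0).
    """
    return (2 / math.pi) * math.atan(cutoff / m0)

if __name__ == "__main__":
    for k, m0 in [(1.0, 1.0), (0.3, 2.0), (5.0, 0.1)]:
        print(k, m0, propagator_from_spectral(k, m0), propagator_closed_form(k, m0))
    for m0 in (1.0, 1e-2, 1e-4):
        print(m0, spectral_weight_below(1.0, m0))
```

The total weight is one for any m0, so shrinking m0 only concentrates the density closer to μ = 0.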

Brane-bending mode

Five-dimensional gauge-fixing

In Section 4.1.1 we have remained vague about the gauge-fixing and the implications for the brane position. The brane-bending mode is actually important to keep track of in DGP and we shall do that properly in what follows by keeping all the modes.

We work in the five-dimensional ADM split with the lapse \(N = 1/\sqrt {{g^{yy}}} = 1 + {1 \over 2}{h_{yy}}\), the shift \(N_\mu = g_{\mu y}\) and the four-dimensional part of the metric, \(g_{\mu\nu}(x,y) = \eta_{\mu\nu} + h_{\mu\nu}(x,y)\). The five-dimensional Einstein-Hilbert term is then expressed as

$${\mathcal L}_{\rm{R}}^{(5)} = {{M_5^3} \over 4}\sqrt {- g} N\left({R[g] + {{[K]}^2} - [{K^2}]} \right)\, ,$$
(4.22)

where square brackets correspond to the trace of a tensor with respect to the four-dimensional metric gμν and Kμν is the extrinsic curvature

$${K_{\mu \nu}} = {1 \over {2N}}\left({{\partial _y}{g_{\mu \nu}} - {D_\mu}{N_\nu} - {D_\nu}{N_\mu}} \right)\, ,$$
(4.23)

and Dμ is the covariant derivative with respect to gμν.

First notice that the five-dimensional de Donder gauge choice (4.5) can be made using the five-dimensional gauge fixing term

$${\mathcal L}_{{\rm{Gauge - Fixing}}}^{(5)} = - {{M_5^3} \over 8}{\left({{\partial _A}h_{B}^A - {1 \over 2}{\partial _B}h_{A}^A} \right)^2}$$
(4.24)
$$\begin{array}{*{20}c} {= - {{M_5^3} \over 8}\left[ {{{\left({{\partial _\mu}{h^\mu}_\nu - {1 \over 2}{\partial _\nu}h + {\partial _y}{N_\nu} - {1 \over 2}{\partial _\nu}{h_{yy}}} \right)}^2}} \right.\quad} \\ {\left. {\quad \quad \quad \quad \quad + {{\left({{\partial _\mu}{N^\mu} + {1 \over 2}{\partial _y}{h_{yy}} - {1 \over 2}{\partial _y}h} \right)}^2}} \right],} \end{array}$$
(4.25)

where we keep the same notation as previously, h = ημνhμν is the four-dimensional trace.

After fixing the de Donder gauge (4.5), we can make the additional gauge transformation \(x^A \to x^A + \xi^A\) and remain in de Donder gauge provided \(\xi^A\) satisfies linearly \(\square_5 \xi^A = 0\). This residual gauge freedom can be used to further fix the gauge on the brane (see [389] for more details; we only summarize their derivation here).

Four-dimensional Gauge-fixing

Keeping the brane at the fixed position y = 0 imposes \(\xi^y = 0\), since we need \(\xi^y(y = 0) = 0\) and \(\xi^y\) should be bounded as y → ∞ (the situation is slightly different in the self-accelerating branch and this mode can lead to a ghost, see Section 4.4 as well as [361, 98]).

Using the bulk profile \({h_{AB}}(x,y) = {e^{- \sqrt {- \square} \vert y\vert}}{h_{AB}}(x)\) and integrating over the extra dimension, we obtain the contribution from the bulk on the brane (including the contribution from the gauge-fixing term) in terms of the gauge invariant quantity

$$\begin{array}{l} {\tilde h_{\mu \nu}} = {h_{\mu \nu}} + {2 \over {\sqrt {- \square}}}{\partial _{(\mu}}{N_{\nu)}} = - {2 \over {\sqrt {- \square}}}{K_{\mu \nu}}\\ S_{{\rm{bulk}}}^{{\rm{integrated}}} = {{M_5^3} \over 4}\int {{{\rm{d}}^4}} x\left[ {- {1 \over 2}{{\tilde h}^{\mu \nu}}\sqrt {- \square} \left({{{\tilde h}_{\mu \nu}} - {1 \over 2}\tilde h{\eta _{\mu \nu}}} \right) + {1 \over 2}{h_{yy}}\sqrt {- \square} \left({\tilde h - {1 \over 2}{h_{yy}}} \right)} \right]\, . \end{array}$$
(4.26)

Notice again a factor of 2 difference from [389] which arises from the fact that we integrate from y = −∞ to y = +∞ imposing a ℤ2-mirror symmetry at y = 0, rather than considering only one side of the brane as in [389]. Both conventions are perfectly reasonable.

The integrated bulk action (4.26) is invariant under the residual linearized gauge symmetry

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + 2{\partial _{(\mu}}{\xi _{\nu)}}$$
(4.27)
$${N_\mu} \rightarrow {N_\mu} - \sqrt {- \square} \,{\xi _\mu}$$
(4.28)
$${h_{yy}} \rightarrow {h_{yy}}$$
(4.29)

which keeps both \({\tilde h_{\mu \nu}}\) and \(h_{yy}\) invariant. The residual gauge symmetry can be used to set the gauge on the brane, and at this level from (4.26) we can see that the most convenient gauge fixing term is [389]

$${\mathcal L}_{{\rm{Residual}}\;{\rm{Gauge - Fixing}}}^{(4)} = - {{M_{{\rm{Pl}}}^2} \over 4}{\left({{\partial _\mu}{h^\mu}_\nu - {1 \over 2}{\partial _\nu}h + {m_0}{N_\nu}} \right)^2}\, ,$$
(4.30)

with again \({m_0} = M_5^3/M_{{\rm{Pl}}}^2\), so that the induced Lagrangian on the brane (including the contribution from the residual gauge fixing term) is

$${S_{{\rm{boundary}}}} = {{M_{{\rm{Pl}}}^2} \over 4}\int {{\rm{d}^4}} x\left[ {{1 \over 2}{h^{\mu \nu}}\square({h_{\mu \nu}} - {1 \over 2}h{\eta _{\mu \nu}}) - 2{m_0}{N_\mu}\left({{\partial _\alpha}{h^{\alpha \mu}} - {1 \over 2}{\partial ^\mu}h} \right) - m_0^2{N_\mu}{N^\mu}} \right].$$
(4.31)

Combining the five-dimensional action from the bulk (4.26) with that on the boundary (4.31) we end up with the linearized action on the four-dimensional DGP brane [389]

$$\begin{array}{rcl} S_{{\rm{DGP}}}^{({\rm{lin}})} &=& {{M_{{\rm{Pl}}}^2} \over 4}\int {{\rm{d}}^4}x\left[ {1 \over 2}{h^{\mu \nu}}\left[ {\square - {m_0}\sqrt {- \square}} \right]\left({{h_{\mu \nu}} - {1 \over 2}h{\eta _{\mu \nu}}} \right) - {m_0}{N^\mu}{\partial _\mu}{h_{yy}} \right. \\ && \left. - {m_0}{N^\mu}\left[ {\sqrt {- \square} + {m_0}} \right]{N_\mu} - {{{m_0}} \over 4}{h_{yy}}\sqrt {- \square} \,({h_{yy}} - 2h) \right]. \end{array}$$
(4.32)

As shown earlier we recover the theory of a massive graviton in four dimensions, with a soft mass \({m^2}(\square) = {m_0}\sqrt {- \square}\). This analysis has allowed us to keep track of the physical origin of all the modes including the brane-bending mode which is especially relevant when deriving the decoupling limit as we shall see below.

The kinetic mixing between these different modes can be diagonalized by performing the change of variables [389]

$${h_{\mu \nu}} = {1 \over {{M_{{\rm{Pl}}}}}}\left({{{h\prime}_{\mu \nu}} + \pi {\eta _{\mu \nu}}} \right)$$
(4.33)
$${N_\mu} = {1 \over {{M_{{\rm{Pl}}}}\sqrt {{m_0}}}}{N\prime_\mu} + {1 \over {{M_{{\rm{Pl}}}}{m_0}}}{\partial _\mu}\pi$$
(4.34)
$${h_{yy}} = - {{2\sqrt {- \square}} \over {{m_0}{M_{{\rm{Pl}}}}}}\pi \, ,$$
(4.35)

so we see that the mode π is directly related to hyy. In the case of Section 4.1.1, we had set hyy = 0 and the field π is then related to the brane bending mode. In either case we see that the extrinsic curvature Kμν carries part of this mode.

Omitting the mass terms and other relevant operators, the action is diagonalized in terms of the different graviton modes at the linearized level: \(h'_{\mu\nu}\) (which encodes the helicity-2 mode), \(N'_\mu\) (which is part of the helicity-1 mode) and π (helicity-0 mode),

$$S_{{\rm{DGP}}}^{({\rm{lin}})} = {1 \over 4}\int {{{\rm{d}}^4}} x\left[ {{1 \over 2}{{h\prime}^{\mu \nu}}\square ({{h\prime}_{\mu \nu}} - {1 \over 2}h\prime{\eta _{\mu \nu}}) - {{N\prime}^\mu}\sqrt {- \square} {{N\prime}_\mu} + 3\pi \square \pi} \right]\, .$$
(4.36)

Decoupling limit

We will be discussing the meaning of ‘decoupling limits’ in more depth in the context of multi-gravity and ghost-free massive gravity in Section 8. The main idea behind the decoupling limit is to separate the physics of the different modes. Here we are interested in following the interactions of the helicity-0 mode without the complications from the standard helicity-2 interactions that already arise in GR. For this purpose we can take the limit MPl → ∞ while simultaneously sending \({m_0} = M_5^3/M_{{\rm{Pl}}}^2 \to 0\) while keeping the scale \(\Lambda = {(m_0^2{M_{{\rm{Pl}}}})^{1/3}}\) fixed. This is the scale at which the first interactions arise in DGP.

In DGP the decoupling limit should be taken by considering the full five-dimensional theory, as was performed in [389]. The four-dimensional Einstein-Hilbert term does not give rise to any operators before the Planck scale, so in order to look for the irrelevant operators that come at the lowest possible scale, it is sufficient to focus on the boundary term from the five-dimensional action. It includes operators of the form

$${\mathcal L}_{{\rm{boundary}}}^{(5)} \supset {m_0}M_{{\rm{Pl}}}^2\partial {\left({{{{{h\prime}_{\mu \nu}}} \over {{M_{{\rm{Pl}}}}}}} \right)^n}{\left({{{{{N\prime}_\mu}} \over {\sqrt {{m_0}}{M_{{\rm{Pl}}}}}}} \right)^k}{\left({{{\partial \pi} \over {{m_0}{M_{{\rm{Pl}}}}}}} \right)^\ell}\, ,$$
(4.37)

with integer powers \(n, k, \ell \ge 0\) and \(n + k + \ell \ge 3\) since we are dealing with interactions. The scale at which such an operator arises is

$${\Lambda _{n,k,\ell}} = {\left({M_{{\rm{Pl}}}^{n + k + \ell - 2}m_0^{k/2 + \ell - 1}} \right)^{1/(n + 3k/2 + 2\ell - 3)}}$$
(4.38)

and it is easy to see that the lowest possible scale is \({\Lambda _3} = {({M_{{\rm{Pl}}}}m_0^2)^{1/3}}\), which arises for \(n = 0\), \(k = 0\) and \(\ell = 3\); it is thus a cubic interaction in the helicity-0 mode π which involves four derivatives. Since it is only a cubic interaction, we can scan all the possible ways π enters at the cubic level in the five-dimensional Einstein-Hilbert action. The relevant pieces are the ones from the extrinsic curvature in (4.22), and in particular the combination \(N([K]^2 - [K^2])\), with

$$N = 1 + {1 \over 2}{e^{- \sqrt {- \square} y}}{h_{yy}}$$
(4.39)
$${K_{\mu \nu}} = - {1 \over 2}(1 - {1 \over 2}{e^{- \sqrt {- \square} y}}{h_{yy}})({\partial _\mu}{N_\nu} + {\partial _\nu}{N_\mu})\, .$$
(4.40)

Integrating \({m_0}M_{{\rm{Pl}}}^2N({[K]^2} - [{K^2}])\) along the extra dimension, we obtain the cubic contribution in π on the brane (using the relations (4.34) and (4.35))

$${{\mathcal L}_{{\Lambda _3}}} = {1 \over {2\Lambda _3^3}}{(\partial \pi)^2}\square \pi \, .$$
(4.41)

So the decoupling limit of DGP arises at the scale Λ3 and reduces to a cubic Galileon for the helicity-0 mode with no interactions for the helicity-2 and -1 modes,

$$\begin{array}{rcl} {{\mathcal L}_{{\rm{DL}}\;{\rm{DGP}}}} &=& {1 \over 8}{{h\prime}^{\mu \nu}}\square \left({{{h\prime}_{\mu \nu}} - {1 \over 2}h\prime{\eta _{\mu \nu}}} \right) - {1 \over 4}{{N\prime}^\mu}\sqrt {- \square} {{N\prime}_\mu} \\ && + {3 \over 2}\pi \square \pi + {1 \over {2\Lambda _3^3}}{{(\partial \pi)}^2}\square \pi \, . \end{array}$$
(4.42)
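The claim that Λ3 is the lowest scale in the family (4.38) can be checked by brute force. The sketch below rests on one observation that is not spelled out in the text: the exponents of MPl and m0 in (4.38) always add up to one, so each scale can be written as MPl (m0/MPl)^b, and for m0 ≪ MPl the lowest scale is the one with the largest b. The function names and the search range are illustrative choices.

```python
from fractions import Fraction

def m0_exponent(n, k, l):
    """Exponent b in Lambda_{n,k,l} = M_Pl**(1 - b) * m0**b, read off from (4.38).

    Returns None when n + 3k/2 + 2l - 3 is not positive, in which case
    (4.38) does not define a finite suppression scale.
    """
    denom = Fraction(n) + Fraction(3 * k, 2) + 2 * l - 3
    if denom <= 0:
        return None
    return (Fraction(k, 2) + l - 1) / denom

def lowest_scale(max_power=8):
    """Scan n, k, l in [0, max_power] with n + k + l >= 3.

    Returns (b, (n, k, l)) for the operator with the largest b, i.e. the
    lowest scale when m0 << M_Pl.
    """
    best = None
    for n in range(max_power + 1):
        for k in range(max_power + 1):
            for l in range(max_power + 1):
                if n + k + l < 3:
                    continue
                b = m0_exponent(n, k, l)
                if b is None:
                    continue
                if best is None or b > best[0]:
                    best = (b, (n, k, l))
    return best

if __name__ == "__main__":
    print(lowest_scale())
```

Within the scanned range the maximum is reached only for n = 0, k = 0, ℓ = 3 with b = 2/3, which is the scale (MPl m0²)^{1/3} quoted above.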

Phenomenology of DGP

The phenomenology of DGP is extremely rich and has led to many developments. In what follows we review one of the most important implications of DGP for cosmology, which is the existence of self-accelerating solutions. The cosmology and phenomenology of DGP was first derived in [159, 163] (see also [388, 385, 387, 386]).

Friedmann equation in de Sitter

To get some intuition on how cosmology gets modified in DGP, we first look at de Sitter-like solutions and then infer the full Friedmann equation in a FLRW-geometry. We thus start with five-dimensional Minkowski in de Sitter slicing (this can be easily generalized to FLRW-slicing),

$${\rm{d}}s_5^2 = {b^2}(y)\,\left({{\rm{d}}{y^2} + \gamma _{\mu \nu}^{({\rm{dS}})}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}} \right)\, ,$$
(4.43)

where \(\gamma _{\mu \nu}^{{\rm{(dS)}}}\) is the four-dimensional de Sitter metric with constant Hubble parameter H, \(\gamma _{\mu \nu}^{{\rm{(dS)}}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu} = - {\rm{d}}{t^2} + {a^2}(t)\,{\rm{d}}{x^2}\), and the scale factor is given by a (t) = exp(Ht). The metric (4.43) is indeed Minkowski in de Sitter slicing if the warp factor b (y) is given by

$$b(y) = {e^{\epsilon H\vert y\vert}},\quad {\rm{with}}\quad \epsilon = \pm 1\, ,$$
(4.44)

and the modulus |y| has been imposed by the ℤ2-orbifold symmetry. As we shall see, the branch ϵ = +1 corresponds to the self-accelerating branch of DGP and ϵ = −1 is the stable, normal branch of DGP.

We can now derive the Friedmann equation on the brane by integrating over the 00-component of the Einstein equation (4.2) with the source (4.3) and considering some energy density \(T_{00} = \rho\). The four-dimensional Einstein tensor gives the standard contribution \(G_{00} = 3H^2\) on the brane and so we obtain the modified Friedmann equation

$${{M_5^3} \over 2}\left[ {\underset {\varepsilon \rightarrow 0} {\lim} \int\nolimits_{- \varepsilon}^\varepsilon {{}^{(5)}{G_{00}}} {\rm{d}}y} \right] + 3{\rm{M}}_{{\rm{Pl}}}^2{H^2} = \rho \, ,$$
(4.45)

with \({}^{(5)}G_{00} = 3\left(H^2 - b''(y)/b(y)\right)\), so

$$\underset {\varepsilon \rightarrow 0} {\lim} \int\nolimits_{- \varepsilon}^\varepsilon {{}^{(5)}{G_{00}}} {\rm{d}}y = - 6\epsilon H\, ,$$
(4.46)

leading to the modified Friedmann equation,

$${H^2} - \epsilon {m_0}H = {1 \over {3M_{{\rm{Pl}}}^2}}\rho \, ,$$
(4.47)

where the five-dimensional nature of the theory is encoded in the new term −ϵm0H (this new contribution can be seen to arise from the helicity-0 mode of the graviton and could have been derived using the decoupling limit of DGP). For reasons which will become clear in what follows, the choice ϵ = −1 corresponds to the stable branch of DGP while the other choice ϵ = +1 corresponds to the self-accelerating branch of DGP. As is already clear from the higher-dimensional perspective, when ϵ = +1, the warp factor grows in the bulk (unless we think of the junction conditions the other way around), which is already signaling towards a pathology for that branch of solution.
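The steps between (4.44) and (4.47) can be spelled out as follows. This is a sketch that assumes only the warp factor (4.44), the expression \({}^{(5)}G_{00} = 3(H^2 - b''/b)\), and the distributional identity \({{\rm{d}}^2}\vert y\vert /{\rm{d}}{y^2} = 2\delta (y)\):

$$b\prime(y) = \epsilon H\,{\rm{sign}}(y)\,b(y)\, ,\qquad {{b\prime\prime(y)} \over {b(y)}} = {H^2} + 2\epsilon H\delta (y)\, ,$$

$${}^{(5)}{G_{00}} = 3\left({{H^2} - {{b\prime\prime} \over b}} \right) = - 6\epsilon H\delta (y)\, .$$

The smooth \(H^2\) pieces cancel, as they must for a Minkowski bulk, and only the delta function survives the integration across the brane, which gives (4.46). Inserting this in (4.45) and dividing by \(3M_{\rm{Pl}}^2\),

$${{M_5^3} \over 2}\left({- 6\epsilon H} \right) + 3M_{{\rm{Pl}}}^2{H^2} = \rho \quad \Rightarrow \quad {H^2} - \epsilon {{M_5^3} \over {M_{{\rm{Pl}}}^2}}H = {\rho \over {3M_{{\rm{Pl}}}^2}}\, ,$$

which is (4.47) once \(m_0 = M_5^3/M_{\rm{Pl}}^2\) from (4.14) is used.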

General Friedmann equation

This modified Friedmann equation has been derived assuming a constant H, which is only consistent if the energy density is constant (i.e., a cosmological constant). We can now derive the generalization of this Friedmann equation for non-constant H. This amounts to accounting for \(\dot H\) and other derivative corrections which might have been omitted in deriving this equation by assuming that H was constant. But the Friedmann equation corresponds to the Hamiltonian constraint equation, and higher derivatives (e.g., \(\dot H \supset \ddot a\) and higher derivatives of H) would imply that this equation is no longer a constraint. This loss of constraint would imply that the theory admits a new degree of freedom about generic backgrounds, namely the BD ghost (see the discussion of Section 7).

However, in DGP we know that the BD ghost is absent (this is ensured by the five-dimensional nature of the theory, in five dimensions we start with five dofs, and there is thus no sixth BD mode). So the Friedmann equation cannot include any derivatives of H, and the Friedmann equation obtained assuming a constant H is actually exact in FLRW even if H is not constant. So the constraint (4.47) is the exact Friedmann equation in DGP for any energy density ρ on the brane.
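Since (4.47) is algebraic in H, it can be solved directly for any energy density. The sketch below is an illustration only: the function name, the choice of units and the restriction to the non-negative (expanding) root are assumptions made here, not part of the original text.

```python
import math

def hubble_rate(rho, m0, m_pl, epsilon):
    """Non-negative root H of (4.47): H^2 - epsilon*m0*H = rho / (3*M_Pl^2).

    epsilon = +1 is the self-accelerating branch, epsilon = -1 the normal
    branch. Any consistent system of units can be used for rho, m0 and m_pl.
    """
    if epsilon not in (+1, -1):
        raise ValueError("epsilon must be +1 or -1")
    source = rho / (3.0 * m_pl**2)
    # Quadratic formula, keeping the root with the '+' sign in front of the
    # square root, which is non-negative for rho >= 0.
    return 0.5 * (epsilon * m0 + math.sqrt(m0**2 + 4.0 * source))

if __name__ == "__main__":
    m0, m_pl = 0.5, 1.0
    for rho in (0.0, 0.1, 10.0):
        print(rho, hubble_rate(rho, m0, m_pl, +1), hubble_rate(rho, m0, m_pl, -1))
```

For ρ = 0 the ϵ = +1 branch returns H = m0 while the ϵ = −1 branch returns H = 0. For ρ ≫ 3 MPl² m0² both branches approach the standard result H² ≈ ρ/(3MPl²).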

The same trick can be used for massive gravity and bi-gravity and the Friedmann equations (12.51), (12.52) and (12.54) are indeed free of any derivatives of the Hubble parameter.

Observational viability of DGP

Independently of the ghost issue in the self-accelerating branch of the model, there has been a vast amount of investigation on the observational viability of both the self-accelerating branch and the normal (stable) branch of DGP. First because many of these observations can apply equally well to the stable branch of DGP (modulo a minus sign in some of the cases), and second and foremost because DGP represents an excellent archetype in which ideas of modified gravity can be tested.

Observational tests of DGP fall into the following two main categories:

  • Tests of the Friedmann equation. This test was performed mainly using Supernovae, but also using Baryonic Acoustic Oscillations and the CMB so as to fix the background history of the Universe [162, 217, 221, 286, 391, 23, 405, 481, 304, 382, 462]. Current observations seem to slightly disfavor the additional term in the Friedmann equation of DGP, even in the normal branch where the late-time acceleration of the Universe is due to a cosmological constant as in ΛCDM. These put bounds on the graviton mass in DGP of the order of \({m_0} \lesssim {10^{-1}}{H_0}\), where H0 is the Hubble parameter today (see Ref. [492] for the latest bounds at the time of writing, including data from Planck). Effectively this means that in order for DGP to be consistent with observations, the graviton mass can have no effect on the late-time acceleration of the Universe.

  • Tests of an extra fifth force, either within the solar system, or during structure formation (see for instance [362, 260, 452, 451, 222, 482], Refs. [453, 337, 442] for N-body simulations, as well as Refs. [17, 441] using weak lensing).

    Evading fifth force experiments will be discussed in more detail within the context of the Vainshtein mechanism in Section 10.1 and thereafter, and we save the discussion to that section. See Refs. [388, 385, 387, 386, 444] for a five-dimensional study dedicated to DGP. The study of cosmological perturbations within the context of DGP was also performed in depth for instance in [367, 92].

Self-acceleration branch

The cosmology of DGP has led to a major conceptual breakthrough, namely the realization that the Universe could be ‘self-accelerating’. When choosing the ϵ = +1 branch of DGP, the Friedmann equation in the vacuum reduces to [159, 163]

$${H^2} - {m_0}H = 0\, ,$$
(4.48)

which admits a non-trivial solution H = m0 in the absence of any cosmological constant or vacuum energy. In itself this would not solve the old cosmological constant problem as the vacuum energy ought to be set to zero on its own, but it can lead to a model of ‘dark gravity’ where the amount of acceleration is governed by the scale m0, which is stable against quantum corrections.
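The two branches can be compared numerically with a minimal sketch. It assumes a Friedmann equation of the form H² − ϵ m0 H = ρ/(3 M_Pl²); the right-hand side is an assumption here (the standard four-dimensional source term), since this section only displays the vacuum case (4.48). The function name and the default units M_Pl = 1 are hypothetical.

```python
import math


def dgp_hubble(rho, m0, eps, m_pl=1.0):
    """Non-negative root H of H**2 - eps*m0*H = rho / (3*m_pl**2).

    eps = +1 is the self-accelerating branch, eps = -1 the normal branch.
    The source term on the right-hand side is an assumed standard form.
    """
    if eps not in (+1, -1):
        raise ValueError("eps must be +1 or -1")
    if rho < 0:
        raise ValueError("this sketch only covers rho >= 0")
    source = rho / (3.0 * m_pl**2)
    # Quadratic formula, keeping the root that is non-negative for rho >= 0.
    return 0.5 * (eps * m0 + math.sqrt(m0**2 + 4.0 * source))
```

In the vacuum (ρ = 0) this returns H = m0 on the ϵ = +1 branch and H = 0 on the ϵ = −1 branch, which is the distinction between the self-accelerating and the normal branch.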

This realization has opened a new field of study in its own right. It is beyond the scope of this review on massive gravity to summarize all the interesting developments that arose in the past decade and we simply focus on a few elements namely the presence of a ghost in this self-accelerating branch as well as a few cosmological observations.

Ghost

The existence of a ghost on the self-accelerating branch of DGP was first pointed out in the decoupling limit [389, 411], where the helicity-0 mode of the graviton is shown to enter with the wrong sign kinetic term in this branch of solutions. We emphasize that the issue of the ghost in the self-accelerating branch of DGP is completely unrelated to the sixth BD ghost in some theories of massive gravity. In DGP there are five dofs, one of which is a ghost. The analysis was then generalized in the fully fledged five-dimensional theory by K. Koyama in [360] (see also [263, 361] and [98]).

When perturbing about Minkowski, it was shown that the graviton has an effective mass \({m^2} = {m_0}\sqrt {- \square}\). When perturbing on top of the self-accelerating solution a similar analysis can be performed and one can show that in the vacuum the graviton has an effective mass at precisely the Higuchi-bound, \(m_{{\rm{eff}}}^2 = 2{H^2}\) (see Ref. [307]). When matter or a cosmological constant is included on the brane, the graviton mass shifts either inside the forbidden Higuchi-region \(0 < m_{{\rm{eff}}}^2 < 2{H^2}\), or outside \(m_{{\rm{eff}}}^2 > 2{H^2}\). We summarize the three cases following [360, 98]:

  • In [307] it was shown that when the effective mass is within the forbidden Higuchi-region, the helicity-0 mode of graviton has the wrong sign kinetic term and is a ghost.

  • Outside this forbidden region, when \(m_{{\rm{eff}}}^2 > 2{H^2}\), the zero-mode of the graviton is healthy but there exists a new normalizable brane-bending mode in the self-accelerating branch (see Footnote 8) which is a genuine degree of freedom. For \(m_{{\rm{eff}}}^2 > 2{H^2}\) the brane-bending mode was shown to be a ghost.

  • Finally, at the critical mass \(m_{{\rm{eff}}}^2 = 2{H^2}\) (which happens when neither matter nor a cosmological constant is present on the brane), the brane-bending mode takes the role of the helicity-0 mode of the graviton, so that the graviton still has five degrees of freedom, and this mode was shown to be a ghost as well.
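The three cases can be condensed into a small classifier. This is only a bookkeeping sketch of the list above: the function name, the return strings and the tolerance used to decide when the mass sits "at" the Higuchi bound are all hypothetical choices.

```python
def ghost_mode_self_accelerating(m_eff_sq, hubble, rel_tol=1e-12):
    """Name the ghost-like mode on the self-accelerating branch.

    m_eff_sq is the effective graviton mass squared and hubble the
    Hubble rate; the comparison is against the Higuchi bound 2*H**2.
    """
    if m_eff_sq <= 0:
        raise ValueError("the cases listed apply to m_eff^2 > 0 only")
    bound = 2.0 * hubble**2
    if abs(m_eff_sq - bound) <= rel_tol * bound:
        # Critical mass: brane-bending mode plays the helicity-0 role.
        return "brane-bending mode acting as helicity-0 mode"
    if m_eff_sq < bound:
        # Inside the forbidden Higuchi region.
        return "helicity-0 mode"
    # Above the bound the helicity-0 mode is healthy.
    return "brane-bending mode"
```

Every branch returns a ghost, which is the content of the summary that follows.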

In summary, independently of the matter content of the brane, so long as the graviton is massive \(m_{{\rm{eff}}}^2 > 0\), the self-accelerating branch of DGP exhibits a ghost. See also [210] for an exact non-perturbative argument studying domain walls in DGP. In the self-accelerating branch of DGP domain walls bear a negative gravitational mass. This non-perturbative solution can also be used as an argument for the instability of that branch.

Evading the ghost?

Different ways to remove the ghosts were discussed for instance in [325] where a second brane was included. In this scenario it was then shown that the graviton could be made stable but at the cost of including a new spin-0 mode (that appears as the mode describing the distance between the branes).

Alternatively it was pointed out in [233] that if the sign of the extrinsic curvature was flipped, the self-accelerating solution on the brane would be stable.

Finally, a stable self-acceleration was also shown to occur in the massless case \(m_{{\rm{eff}}}^2 = 0\) by relying on Gauss-Bonnet terms in the bulk and a self-sourced AdS5 solution [156]. The five-dimensional theory is then similar to that of DGP (4.1) but with the addition of a five-dimensional Gauss-Bonnet term \({\mathcal R}_{{\rm{GB}}}^2\) in the bulk and the wrong sign five-dimensional Einstein-Hilbert term,

$$\begin{array}{*{20}c} {S = \int {{\rm{d}}^5}x\left[ {\sqrt {- {}^{(5)}g} \left( {- {{M_5^3} \over 4}\,{}^{(5)}R[{}^{(5)}g] - {{M_5^3{\ell ^2}} \over 4}\,{}^{(5)}R_{{\rm{GB}}}^2[{}^{(5)}g]} \right)} \right.\quad \quad \quad \quad} \\ {\left. { + \delta (y)\left[ {\sqrt {- g} {{M_{{\rm{Pl}}}^2} \over 2}R + {L_m}(g,{\psi _i})} \right]} \right]\, .} \end{array}$$
(4.49)

The idea is not so dissimilar from that of new massive gravity (see Section 13): here the wrong sign kinetic term in five dimensions is balanced by the Gauss-Bonnet term in such a way that the graviton has the correct sign kinetic term on the self-sourced AdS5 solution. The length scale ℓ is related to this AdS length scale, and the self-accelerating branch admits a stable (ghost-free) de Sitter solution with \(H \sim {\ell ^{-1}}\).

We do not discuss this model any further in what follows since the graviton admits a zero (massless) mode. It is feasible that this model can be understood as a bi-gravity theory where the massive mode is a resonance. It would also be interesting to see how this model fits in with the Galileon theories [412], which admit stable self-accelerating solutions.

In what follows, we go back to the standard DGP model be it the self-accelerating branch (ϵ = 1) or the normal branch (ϵ = − 1).

Degravitation

One of the main motivations behind modifying gravity in the infrared is to tackle the old cosmological constant problem. The idea behind ‘degravitation’ [211, 212, 26, 216] is that if gravity is modified in the IR, then a cosmological constant (or the vacuum energy) could have a smaller impact on the geometry. In these models, we would live with a large vacuum energy (be it at the TeV scale or at the Planck scale) but only observe a small amount of late-time acceleration due to the modification of gravity. In order for a theory of modified gravity to potentially tackle the old cosmological constant problem via degravitation it needs to have the two following properties:

  1. First, gravity must be weaker in the infrared and effectively massive [216] so that the effect of IR sources can be degravitated.

  2. Second, there must exist some (nearly) static attractor solutions towards which the system can evolve at late time for arbitrary values of the vacuum energy or cosmological constant.

Flat solution with a cosmological constant

The first requirement is present in DGP, but as was shown in [216] in DGP gravity is not ‘sufficiently weak’ in the IR to allow degravitation solutions. Nevertheless, it was shown in [164] that the normal branch of DGP satisfies the second requirement for any negative value of the cosmological constant. In these solutions the five-dimensional spacetime is not Lorentz invariant, but in a way which would not (at this background level) be observed when confined on the four-dimensional brane.

For positive values of the cosmological constant, DGP does not admit a (nearly) static solution. This can be understood at the level of the decoupling limit using the arguments of [216] and generalized for other mass operators.

Inspired by the form of the graviton in DGP, \({m^2}(\square) = {m_0}\sqrt {- \square}\), we can generalize the form of the graviton mass to

$${m^2}(\square) = m_0^2{\left({{{- \square} \over {m_0^2}}} \right)^\alpha}\, ,$$
(4.50)

with α a positive dimensionless constant. α = 1 corresponds to a modification of the kinetic term. As shown in [153], any such modification leads to ghosts, so we do not consider this case here. α > 1 corresponds to a UV modification of gravity, and so we focus on α < 1.

In the decoupling limit the helicity-2 decouples from the helicity-0 mode which behaves (symbolically) as follows [216]

$$3\square \pi - {1 \over {{M_{{\rm{Pl}}}}m_0^{4(1 - \alpha)}}}\square {\left({{\square ^{1 - \alpha}}\pi} \right)^2} + \cdots = - {1 \over {{M_{{\rm{Pl}}}}}}T\, ,$$
(4.51)

where T is the trace of the stress-energy tensor of external matter fields. At the linearized level, matter couples to the metric \({g_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}({h\prime_{\mu \nu}} + \pi {\eta _{\mu \nu}})\). We now check under which conditions we can still recover a nearly static metric in the presence of a cosmological constant Tμν = −ΛCCgμν. In the linearized limit of GR this leads to the profile for the helicity-2 mode (which in that case corresponds to a linearized de Sitter solution)

$${h\prime_{\mu \nu}} = - {{{\Lambda _{{\rm{CC}}}}} \over {6{M_{{\rm{Pl}}}}}}{\eta _{\rho \sigma}}{x^\rho}{x^\sigma}{\eta _{\mu \nu}}\, .$$
(4.52)

One way we can obtain a static solution in this extended theory of massive gravity at the linear level is by ensuring that the solution for π cancels out that of \({h\prime_{\mu \nu}}\) so that the metric gμν remains flat. The profile \(\pi = + {{{\Lambda _{{\rm{CC}}}}} \over {6{M_{{\rm{Pl}}}}}}{\eta _{\mu \nu}}{x^\mu}{x^\nu}\) is actually the solution of (4.51) when only the term 3□π contributes and all the other operators vanish for \(\pi \propto {x_\mu}{x^\mu}\). This is the case if α < 1/2 as shown in [216]. This explains why in the case of DGP, which corresponds to the borderline scenario α = 1/2, one can never fully degravitate a cosmological constant.
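The statement that this profile solves the linear part of (4.51) can be checked in one line. The sketch below assumes four flat dimensions, so that \(\square ({\eta _{\mu \nu}}{x^\mu}{x^\nu}) = 8\), and takes the trace of the source at leading order, \(T = {\eta ^{\mu \nu}}{T_{\mu \nu}} = - 4{\Lambda _{{\rm{CC}}}}\):

$$3\square \pi = 3\,{{{\Lambda _{{\rm{CC}}}}} \over {6{M_{{\rm{Pl}}}}}}\,\square \left( {{\eta _{\mu \nu}}{x^\mu}{x^\nu}} \right) = {{{\Lambda _{{\rm{CC}}}}} \over {2{M_{{\rm{Pl}}}}}} \times 8 = {{4{\Lambda _{{\rm{CC}}}}} \over {{M_{{\rm{Pl}}}}}} = - {1 \over {{M_{{\rm{Pl}}}}}}T\, .$$

With this profile, \({h\prime_{\mu \nu}} + \pi {\eta _{\mu \nu}} = 0\) when compared with (4.52), so the metric seen by matter stays flat at this order.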

Extensions

This realization has motivated the search for theories of massive gravity with 0 ≤ α < 1/2, and especially the extension of DGP to higher dimensions where the parameter α can get as close to zero as required. This is the main motivation behind higher dimensional DGP [359, 240] and cascading gravity [135, 148, 132, 149] as we review in what follows. (In [433] it was also shown how a regularized version of higher dimensional DGP could be free of the strong coupling and ghost issues.)

Note that α ≡ 0 corresponds to a hard mass gravity. Within the context of DGP, such a model with an ‘auxiliary’ extra dimension was proposed in [235, 133] where we consider a finite-size large extra dimension which breaks five-dimensional Lorentz invariance. The five-dimensional action is motivated by the five-dimensional gravity with scalar curvature in the ADM decomposition \({}^{(5)}R = R[g] + {[K]^2} - [{K^2}]\), but discarding the contribution from the four-dimensional curvature R[g]. Similarly to DGP, the four-dimensional curvature still appears induced on the brane

$$S = {{{\rm{M}}_{{\rm{Pl}}}^2} \over 2}\int\nolimits_0^\ell {\rm{d}} y\int {{{\rm{d}}^4}} x\sqrt {- g} \,\left({{m_0}\,\left({{{[K]}^2} - [{K^2}]} \right) + \delta (y)R[g]} \right)\, ,$$
(4.53)

where ℓ is the size of the auxiliary extra dimension, gμν is a four-dimensional metric, and we set the lapse to one (the shift can be kept and will contribute to the four-dimensional Stückelberg field which restores four-dimensional invariance, but at this level it is easier to work in the gauge where the shift is set to zero and reintroduce the Stückelberg fields directly in four dimensions). Imposing the Dirichlet conditions gμν (x, y = 0) = fμν, we are left with a theory of massive gravity at y = 0, with reference metric fμν and hard mass m0. Here again the special structure \(({[K]^2} - [{K^2}])\) inherited (or rather inspired) from five-dimensional gravity ensures the Fierz-Pauli structure and the absence of ghost at the linearized level. Up to cubic order in perturbations it was shown in [138] that the theory is free of ghost and its decoupling limit is that of a Galileon.

Furthermore, it was shown in [133] that it satisfies both requirements presented above to potentially help degravitate a cosmological constant. Unfortunately, at higher orders this model is plagued with the BD ghost [291] unless the boundary conditions are chosen appropriately [59]. For this reason we will not review this model any further in what follows and focus instead on the ghost-free theory of massive gravity derived in [137, 144].

Cascading gravity

Deficit angle

It is well known that a tension on a cosmic string does not cause the cosmic string to inflate but rather creates a deficit angle in the two spatial dimensions orthogonal to the string. Similarly, if we consider a four-dimensional brane embedded in six-dimensional gravity, then a tension on the brane leads to the following flat geometry

$${\rm{d}}s_6^2 = {\eta _{\mu \nu}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu}\, + \,{\rm{d}}{r^2}\, + \,{r^2}\,\left({1 - {{\Delta \theta} \over {2\pi}}} \right){\rm{d}}{\theta ^2}\, ,$$
(4.54)

where the two extra dimensions are expressed in polar coordinates {r, θ} and Δθ is a constant which parameterizes the deficit angle in this canonical geometry. This deficit angle is related to the tension on the brane ΛCC and the six-dimensional Planck scale (assuming six-dimensional gravity)

$$\Delta \theta = 2\pi {{{\Lambda _{{\rm{CC}}}}} \over {M_6^4}}\, .$$
(4.55)

For a positive tension ΛCC > 0, this creates a positive deficit angle and since Δθ cannot be larger than 2π, the maximal tension on the brane is \(M_6^4\). For a negative tension, on the other hand, there is no such bound as it creates a surplus of angle, see Figure 2.

Figure 2

Codimension-2 brane with positive (resp. negative) tension brane leading to a positive (resp. negative) deficit angle in the two extra dimensions.
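As a small numerical illustration of (4.55) and of the bound on the tension, the sketch below evaluates the deficit angle and rejects tensions above the maximal value. The function name is hypothetical, and tension and Planck scale are assumed to be given in the same (arbitrary) units.

```python
import math


def deficit_angle(tension, m6):
    """Deficit angle in radians, 2*pi*tension / m6**4, as in Eq. (4.55).

    A negative tension gives a negative value, i.e. a surplus angle.
    """
    delta = 2.0 * math.pi * tension / m6**4
    if delta > 2.0 * math.pi:
        # The deficit angle cannot exceed 2*pi, so tension <= m6**4.
        raise ValueError("tension exceeds the maximal value m6**4")
    return delta
```

For example, `deficit_angle(8.0, 2.0)` corresponds to half the maximal tension and removes half of the full angle.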

This interesting feature has led to many potential ways to tackle the cosmological constant by considering our Universe to live in a 3 + 1-dimensional brane embedded in two or more large extra dimensions. (See Refs. [4, 3, 408, 414, 80, 470, 458, 459, 86, 82, 247, 333, 471, 81, 426, 409, 373, 85, 460, 155] for the supersymmetric large-extra-dimension scenario as an alternative way to tackle the cosmological constant problem.) Extending DGP to more than one extra dimension could thus provide a natural way to tackle the cosmological constant problem.

Spectral representation

Furthermore, in n extra dimensions the gravitational potential is diluted as \(V(r) \sim {r^{-1-n}}\). If the propagator has a Källén-Lehmann spectral representation with spectral density \(\rho ({\mu ^2})\), the Newtonian potential has the following spectral representation

$$V(r) = \int\nolimits_0^\infty {{{\rho ({\mu ^2}){e^{- \mu r}}} \over r}} \,{\rm{d}}{\mu ^2}\, .$$
(4.56)

In a higher-dimensional DGP scenario, the gravitational potential behaves as in higher dimensions at large distance, \(V(r) \sim {r^{-(1+n)}}\), which implies \(\rho ({\mu ^2}) \sim {\mu ^{n-2}}\) in the IR as depicted in Figure 1.
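The link between the two scalings follows from (4.56) by direct integration. As a sketch, assume the power law \(\rho ({\mu ^2}) \sim {\mu ^{n-2}}\) holds over the whole integration range (it is only required in the IR, which is what controls the large-r behavior) and use \({\rm{d}}{\mu ^2} = 2\mu \,{\rm{d}}\mu\):

$$V(r) \sim {1 \over r}\int\nolimits_0^\infty {{\mu ^{n-2}}{e^{- \mu r}}\,{\rm{d}}{\mu ^2}} = {2 \over r}\int\nolimits_0^\infty {{\mu ^{n-1}}{e^{- \mu r}}\,{\rm{d}}\mu} = {{2\,\Gamma (n)} \over {{r^{1+n}}}}\, ,\qquad n > 0\, .$$

The overall normalization is not meaningful here; only the power of r is.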

Working back in terms of the spectral representation of the propagator as given in (4.19), this means that the propagator goes to 1/k in the IR as μ → 0 when n = 1 (as we know from DGP), while it goes to a constant for n > 1. So for more than one extra dimension, the theory tends towards that of a hard mass graviton in the far IR, which corresponds to α → 0 in the parametrization of (4.50). Following the arguments of [216] such a theory should thus be a good candidate to tackle the cosmological constant problem.

A brane on a brane

Both the spectral representation and the fact that codimension-two (and higher) branes can accommodate a cosmological constant while remaining flat have made the field of higher-codimension branes particularly interesting.

However, as shown in [240] and [135, 148, 132, 149], the straightforward extension of DGP to two large extra dimensions leads to ghost issues (a sixth mode with the wrong sign kinetic term, see also [290, 70]) as well as divergence problems (see Refs. [256, 131, 130, 423, 422, 355, 83]).

To avoid these issues, one can consider simply applying the DGP procedure step by step and consider a 4 + 1-dimensional DGP brane embedded in six dimensions. Our Universe would then be on a 3 + 1-dimensional DGP brane embedded in the 4 + 1 one (note we only consider one side of the brane here, which explains the factor of 2 difference compared with (4.1)),

$$\begin{array}{*{20}c} {S = {{M_6^4} \over 2}\int {{\rm{d}}^6}x\sqrt {- {g_6}} \,{}^{(6)}R + {{M_5^3} \over 2}\int {{\rm{d}}^5}x\sqrt {- {g_5}} \,{}^{(5)}R} \\ { + {{M_{{\rm{Pl}}}^2} \over 2}\int {{\rm{d}}^4}x\sqrt {- {g_4}} \,{}^{(4)}R + \int {{\rm{d}}^4}x\,{L_{{\rm{matter}}}}({g_4},\psi)\, .} \end{array}$$
(4.57)

This model has two cross-over scales: \({m_5} = M_5^3/M_{{\rm{Pl}}}^2\) which characterizes the scale at which one crosses from the four-dimensional to the five-dimensional regime, and \({m_6} = M_6^4/M_5^3\) yielding the crossing from a five-dimensional to a six-dimensional behavior. Of course we could also have a simultaneous crossing if m5 = m6. In what follows we focus on the case where \({M_{{\rm{Pl}}}} > {M_5} > {M_6}\).

Performing the same linearized analysis as in Section 4.1.1 we can see that the four-dimensional theory of gravity is effectively massive with the soft mass in Fourier space

$${m^2}(k) = {{\pi {m_5}} \over 4}{{\sqrt {m_6^2 - {k^2}}} \over {{\rm{arcth}}\sqrt {{{{m_6} - k} \over {{m_6} + k}}}}}\, .$$
(4.58)

We see that the 4 + 1-dimensional brane plays the role of a regulator (a divergence occurs in the limit m5 → 0).
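To get a feel for (4.58), the following minimal sketch evaluates the two cross-over scales and the soft mass for momenta below m6. It reads "arcth" as the inverse hyperbolic tangent and restricts itself to 0 < k < m6, where every square root is real; both choices, as well as the function names, are assumptions made for this illustration.

```python
import math


def crossover_scales(m_pl, m5_planck, m6_planck):
    """Return (m5, m6) = (M5**3 / MPl**2, M6**4 / M5**3)."""
    return m5_planck**3 / m_pl**2, m6_planck**4 / m5_planck**3


def soft_mass_squared(k, m5, m6):
    """Soft mass m^2(k) of Eq. (4.58), restricted to 0 < k < m6."""
    if not 0.0 < k < m6:
        raise ValueError("this sketch only covers 0 < k < m6")
    x = math.sqrt((m6 - k) / (m6 + k))
    # 'arcth' in Eq. (4.58) is read here as artanh (math.atanh).
    return 0.25 * math.pi * m5 * math.sqrt(m6**2 - k**2) / math.atanh(x)
```

Within this reading, expanding the formula near k → m6 (where artanh x ≈ x) gives m² → π m5 m6/2, and the mass decreases towards the IR as the inverse hyperbolic tangent grows.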

In this six-dimensional model, there are effectively two new scalar degrees of freedom (arising from the extra dimensions). We can ensure that both of them have the correct sign kinetic term by

  • Either smoothing out the brane [240, 148] (this means that one should really consider a six-dimensional curvature on both the smoothed 4 + 1 and on the 3 + 1-dimensional branes, which is something one would naturally expect; see Footnote 9).

  • Or by including some tension on the 3 + 1 brane (which is also something natural since the setup is designed to degravitate a large cosmological constant on that brane). This was shown to be ghost free in the decoupling limit in [135] and in the full theory in [150].

As already mentioned, in models with two large extra dimensions there is a maximal value of the cosmological constant that can be considered, which is related to the six-dimensional Planck scale. Since that scale is in turn related to the effective mass of the graviton and since observations set that scale to be relatively small, the model can only take care of a relatively small cosmological constant. Nevertheless, it still provides a proof of principle on how to evade Weinberg’s no-go theorem [484].

The extension of cascading gravity to more than two extra dimensions was considered in [149]. It was shown in that case how the 3 + 1 brane remains flat for arbitrary values of the cosmological constant on that brane (within the regime of validity of the weak-field approximation). See Figure 3 for a picture on how the scalar potential adapts itself along the extra dimensions to accommodate for a cosmological constant on the brane.

Figure 3

Seven-dimensional cascading scenario and solution for one of the metric potentials F on the (5 + 1)-dimensional brane in a seven-dimensional cascading gravity scenario with tension on the (3 + 1)-dimensional brane located at y = z = 0, in the case where \(M_6^4/M_5^3 = M_7^5/M_6^4 = {m_7}\). y and z represent the two extra dimensions on the (5 + 1)-dimensional brane. Image reproduced with permission from [149], copyright APS.

Deconstruction

As for DGP and its extensions, to get some insight on how to construct a four-dimensional theory of a single massive graviton, we can start with five-dimensional general relativity. This time, we consider the extra dimension to be compactified and of finite size R, with periodic boundary conditions. It is then natural to perform a Kaluza-Klein decomposition and to obtain a tower of Kaluza-Klein graviton modes in four dimensions. The zero mode is then massless and the higher modes are all massive with mass separation m = 1/R. Since the graviton mass is constant in this formalism we omit the subscript 0 in the rest of this review.

Rather than starting directly with a Kaluza-Klein decomposition (discretization in Fourier space), we perform instead a discretization in real space, known as “deconstruction” of five-dimensional gravity [24, 25, 170, 168, 28, 443, 340]. The deconstruction framework helps make the connection with massive gravity more explicit. However, we can also obtain multi-gravity out of it, which is then completely equivalent to the Kaluza-Klein decomposition (after a non-linear field redefinition).

The idea behind deconstruction is simply to ‘replace’ the continuous fifth dimension y by a series of N sites yj separated by a distance δy = R/N, so that the five-dimensional metric is replaced by a set of interacting metrics depending only on x.

In what follows, we review the procedure derived in [152] to recover four-dimensional ghost-free massive gravity as well as bi- and multi-gravity out of five-dimensional GR. The procedure works in any dimension and we focus on deconstructing five-dimensional GR only for the sake of concreteness.

Formalism

Metric versus Einstein-Cartan formulation of GR

Before going further, let us first describe five-dimensional general relativity in its Einstein-Cartan formulation, where we introduce a set of vielbein \(e_A^{\rm{a}}\), so that the relation between the metric and the vielbein is simply,

$${g_{AB}}(x,y) = e_A^a(x,y)e_B^b(x,y){\eta _{ab}}\, ,$$
(5.1)

where, as mentioned previously, the capital Latin letters label five-dimensional spacetime indices, while the letters a, b, c, … label five-dimensional Lorentz indices.

Under the torsionless condition, de+ωe = 0, the antisymmetric spin connection ω is uniquely determined in terms of the vielbeins

$$\omega _A^{ab} = {1 \over 2}e_A^c(O_{\, \, \, \, c}^{ab} - O_c^{\, \, ab} - O_{\, \, c}^{b\, \, a})\, ,$$
(5.2)

with \({O^{ab}}_c = 2{e^{aA}}{e^{bB}}{\partial _{[A}}{e_{B]c}}\). In the Einstein-Cartan formulation of GR, we introduce a 2-form Riemann curvature,

$${{\mathcal R}^{ab}} = {\rm{d}}{\omega ^{ab}} + \omega _{\, \, c}^a \wedge {\omega ^{cb}}\, ,$$
(5.3)

and up to boundary terms, the Einstein-Hilbert action is then given in the respective metric and the vielbein languages by (here in five dimensions for definiteness),

$$S_{{\rm{EH}}}^{(5)} = {{M_5^3} \over 2}\int \, {{\rm{d}}^4}x\, {\rm{d}}y\sqrt {- g} \, {R^{(5)}}[g]$$
(5.4)
$$={{M_5^3} \over {2\times 3!}}\int\varepsilon_{abcde} \, \mathcal{R}^{ab}\wedge e^{c} \wedge e^{d} \wedge e^{e}\,,$$
(5.5)

where R(5)[g ] is the scalar curvature built out of the five-dimensional metric gμν and M5 is the five-dimensional Planck scale.

The counting of the degrees of freedom in both languages is of course equivalent and goes as follows: In d spacetime dimensions, the metric has d (d + 1)/2 independent components. Covariance removes 2d of them (see Footnote 10), which leads to \({{\mathcal N}_d} = d(d - 3)/2\) independent degrees of freedom. In four dimensions, we recover the usual \({{\mathcal N}_4} = 2\) independent polarizations for gravitational waves. In five dimensions, this leads to \({{\mathcal N}_5} = 5\) degrees of freedom, which is the same number of degrees of freedom as a massive spin-2 field in four dimensions. This is as expected from the Kaluza-Klein philosophy (massless bosons in d + 1 dimensions have the same number of degrees of freedom as massive bosons in d dimensions; this counting does not directly apply to fermions).

In the Einstein-Cartan formulation, the counting goes as follows: The vielbein has d² independent components. Covariance removes 2d of them, and the additional global Lorentz invariance removes an additional d (d − 1)/2, leading once again to a total of \({{\mathcal N}_d} = d(d - 3)/2\) independent degrees of freedom.
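Both countings can be reproduced with a few lines of integer arithmetic. The sketch below simply transcribes the two paragraphs above; the function names are hypothetical.

```python
def dof_metric(d):
    """Metric counting: d(d+1)/2 components minus 2d from covariance."""
    return d * (d + 1) // 2 - 2 * d


def dof_vielbein(d):
    """Vielbein counting: d^2 components minus 2d from covariance
    minus d(d-1)/2 from Lorentz invariance."""
    return d * d - 2 * d - d * (d - 1) // 2


for d in (4, 5):
    # Both countings agree with d(d-3)/2.
    assert dof_metric(d) == dof_vielbein(d) == d * (d - 3) // 2
```

For d = 4 both functions return 2 and for d = 5 both return 5, matching the number of polarizations of a massive spin-2 field in four dimensions.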

In GR one usually considers the metric and the vielbein formulation as being fully equivalent. However, this perspective is true only in the bosonic sector. The limitations of the metric formulation become manifest when coupling gravity to fermions. For such couplings one requires the vielbein formulation of GR. For instance, in four spacetime dimensions, the covariant action for a Dirac fermion ψ at the quadratic order is given by (see Ref. [392]),

$$S_{\rm{Dirac}}=\int {{1} \over {3!}}\varepsilon_{abcd}\ e^a\wedge e^b \wedge e^c\ \left[ {{i} \over {2}}\bar \psi \gamma^d\, \overleftrightarrow{D}\, \psi-{{m} \over {4}}e^d \bar \psi \psi \right]\,,$$
(5.6)

where the γa’s are the Dirac matrices and D represents the covariant derivative, \(D\psi = d\psi - {1 \over 8}{\omega ^{ab}}[{\gamma _a},{\gamma _b}]\psi\).

In the bosonic sector, one can convert the covariant action of bosonic fields (e.g., of scalar, vector fields, etc.) between the vielbein and the metric language without much confusion; however, this is not possible for the covariant Dirac action, or other half-spin fields. For these types of matter fields, the Einstein-Cartan formulation of GR is more fundamental than its metric formulation. When in doubt, one should always start with the vielbein formulation. This is especially important in the case of deconstruction, where a discretization in the metric language is not equivalent to a discretization in the vielbein variables. The same holds for the Kaluza-Klein decomposition, a point which might have been under-appreciated in the past.

Gauge-fixing

The discretization process breaks covariance and so before starting this procedure it is wise to fix the gauge (failure to do so leads to spurious degrees of freedom which then become ghosts in the four-dimensional description). We thus start in five spacetime dimensions by setting the gauge

$${G_{AB}}(x,y)\, {\rm{d}}{x^A}\, {\rm{d}}{x^B} = {\rm{d}}{y^2} + {g_{\mu \nu}}(x,y)\, {\rm{d}}{x^\mu}\, {\rm{d}}{x^\nu}\, ,$$
(5.7)

meaning that the lapse is set to unity and the shift to zero. Notice that one could in principle only set the lapse to unity and keep the shift present throughout the discretization. From a four-dimensional point of view, the shift will then ‘morally’ play the role of the Stückelberg fields, but it does so only after a cumbersome field redefinition. So for the sake of clarity and simplicity, in what follows we first gauge-fix the shift and then, once the four-dimensional theory is obtained, restore gauge invariance by use of the Stückelberg trick presented previously.

In vielbein language, we fix the five-dimensional coordinate system and use four Lorentz transformations to set

$${e^a} = \left(\begin{array}{*{20}c} {e_\mu ^a\, {\rm{d}}{x^\mu}} \\ {{\rm{d}}y} \\ \end{array} \right)\, ,$$
(5.8)

and use the remaining six Lorentz transformations to set

$$\omega _y^{ab} = {e^{\mu [a}}{\partial _y}e_\mu ^{b]} = 0\, .$$
(5.9)

In this gauge, the five-dimensional Einstein-Hilbert term (5.4), (5.5) is given by

$$S_{{\rm{EH}}}^{(5)} = {{M_5^3} \over 2}\int \, {{\rm{d}}^4}x\, {\rm{d}}y\sqrt {- g} \, \left({R[g] + {{[K]}^2} - [{K^2}]} \right)$$
(5.10)
$$\begin{array}{*{20}c} {= {{M_5^3} \over 4}\int \left({{\varepsilon _{abcd}}\, {R^{ab}} \wedge {e^c} \wedge {e^d} - {K^a} \wedge {K^b} \wedge {e^c} \wedge {e^d}} \right.} \\{\left. {+ 2{K^a} \wedge {\partial _y}{e^b} \wedge {e^c} \wedge {e^d}} \right) \wedge {\rm{d}}y\, ,\quad \quad \quad \quad \quad} \end{array}$$
(5.11)

where R[g] is the four-dimensional curvature built out of the four-dimensional metric gμν, Rab is the 2-form curvature built out of the four-dimensional vielbein \(e_\mu ^a\) and its associated connection \({\omega ^{ab}} = \omega _\mu ^{ab}{\rm{d}}{x^\mu},\,{R^{ab}} = {\rm{d}}{\omega ^{ab}} + {\omega ^a}_c\wedge{\omega ^{cb}}\), and \(K_{\,\,\,\nu}^\mu = {g^{\mu \alpha}}{K_{\alpha \nu}}\) is the extrinsic curvature,

$${K_{\mu \nu}} = {1 \over 2}{\partial _y}{g_{\mu \nu}} = e_{(\mu}^a{\partial _y}e_{\nu)}^b\, {\eta _{ab}}$$
(5.12)
$$K_\mu ^a = {e^{\nu a}}{K_{\mu \nu}}\, .$$
(5.13)

Discretization in the vielbein

One could in principle go ahead and perform the discretization directly at the level of the metric, but this would not lead to a consistent truncated theory of massive gravity (see Footnote 11). As explained previously, the vielbein is more fundamental than the metric itself, and in what follows we discretize the theory keeping the vielbein as the fundamental object.

$$y \hookrightarrow y_j$$
(5.14)
$$e^a_\mu(x,y) \hookrightarrow {e_j}_\mu ^a(x)=e^a_\mu(x,y_j)$$
(5.15)
$$\partial_y e^a_\mu(x,y) \hookrightarrow m_N\left({e_{j+1}}_\mu ^a-{e_j}_\mu ^a\right).$$
(5.16)
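The replacement (5.16) is a forward finite difference between neighboring sites. The sketch below applies it to a single vielbein component sampled on N sites; it assumes the scale m_N equals N/R, the inverse site spacing for an extra dimension of size R, and the function names and the list-based layout are hypothetical.

```python
def truncation_scale(n_sites, radius):
    """Inverse site spacing N / R (assumed meaning of m_N)."""
    return n_sites / radius


def discretized_dy(e_sites, j, m_n):
    """Forward difference m_N * (e_{j+1} - e_j) standing in for d/dy."""
    return m_n * (e_sites[j + 1] - e_sites[j])


# Example: a component that is linear in y, e(y) = 1 + 2*y, on 4 sites
# with R = 1; the forward difference recovers the slope on every link.
radius, n_sites = 1.0, 4
m_n = truncation_scale(n_sites, radius)
e_sites = [1.0 + 2.0 * j * radius / n_sites for j in range(n_sites)]
slopes = [discretized_dy(e_sites, j, m_n) for j in range(n_sites - 1)]
```

For a profile that is not linear in y the forward difference is only accurate up to corrections suppressed by the site spacing, which is why m_N acts as the cutoff of the discretized theory.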

The gauge choice (5.9) then implies

$$\omega^{ab}_y=e^{\mu [ a}\partial_y e^{b]}_\mu =0 \quad \hookrightarrow \quad {e_{j + 1}}^{\mu [a}{e_j}_\mu ^{b]} = 0\,,$$
(5.17)

where the arrow ↪ represents the deconstruction of five-dimensional gravity. We have also introduced the ‘truncation scale’, \({m_N} = Nm = \delta {y^{-1}} = N{R^{-1}}\), i.e., the scale of the highest mode in the discretized theory. After discretization, we see the Deser-van Nieuwenhuizen [187] condition appearing in Eq. (5.17), which corresponds to the symmetric vielbein condition. This is a sufficient condition to allow for a formulation back into the metric language [410, 314, 172]. Note, however, that as mentioned in [152], we have not assumed that this symmetric vielbein condition was true, we simply derived it from the discretization procedure in the five-dimensional gauge choice \(\omega _y^{ab} = 0\). In terms of the extrinsic curvature, this implies

$$K^a_\mu \hookrightarrow m_N\left(e_{j+1}{}^a_\mu-e_{j}{}^a_\mu\right)\,.$$
(5.18)

This can be written back in the metric language as follows

$$g_{\mu\nu}(x,y) \hookrightarrow g_{j, \mu\nu}(x)=g_{\mu\nu}(x,y_j)$$
(5.19)
$$K^{\mu}_{\nu} \hookrightarrow -m_N \mathcal{K}^{\mu}_{\nu}[g_{j},\, g_{j+1}]\equiv-m_N\left(\delta^{\mu}_{\nu} -\left(\sqrt{g_{j}^{-1}\, g_{j+1}}\right)^{\mu}_{\nu}\right)\,,$$
(5.20)

where the square root in the extrinsic curvature appears after converting back to the metric language. The square root exists as long as the metrics gj and gj+1 have the same signature and \(g_j^{- 1}{g_{j + 1}}\) has positive eigenvalues, so that if both metrics were diagonal the ‘time’ direction associated with each metric would be the same, which is a meaningful requirement.

From the metric language, we thus see that the discretization procedure amounts to converting the extrinsic curvature to an interaction between neighboring sites through the building block \({\mathcal K}_\nu ^\mu [{g_j},{g_{j + 1}}]\).
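For intuition, the building block of (5.20) can be evaluated explicitly in the simplest situation where both metrics are diagonal, so that the matrix square root reduces to square roots of the diagonal entries. This restriction, the representation of a metric as a list of its diagonal entries, and the function names are assumptions of the sketch; a general pair of metrics would need a proper matrix square root.

```python
import math


def building_block_diag(g_j, g_next):
    """Diagonal entries of K = 1 - sqrt(g_j^{-1} g_next), diagonal metrics only."""
    ratios = [b / a for a, b in zip(g_j, g_next)]
    if any(r <= 0 for r in ratios):
        # Same-signature requirement: g_j^{-1} g_next needs positive eigenvalues.
        raise ValueError("g_j^{-1} g_next must have positive eigenvalues")
    return [1.0 - math.sqrt(r) for r in ratios]


def mass_term(k_diag):
    """The combination [K]^2 - [K^2], with [.] denoting the trace."""
    trace = sum(k_diag)
    trace_of_square = sum(k * k for k in k_diag)
    return trace**2 - trace_of_square


# Two conformally flat sites: g_j = eta and g_{j+1} = 4 * eta.
eta = [-1.0, 1.0, 1.0, 1.0]
k_diag = building_block_diag(eta, [4.0 * x for x in eta])
```

Identical metrics on the two sites give a vanishing building block, so the interaction only measures how the metric differs from one site to the next.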

Ghost-free massive gravity

Simplest discretization

In this subsection we focus on deriving a consistent theory of massive gravity from the discretization procedure (5.19, 5.20). For this, we consider a discretization with only two sites j = 1, 2 and only consider the four-dimensional action induced on one site (say site 1), rather than the sum over both sites. This picture is analogous in spirit to a braneworld picture where we induce the action at one point along the extra dimension. It gives the theory of a single dynamical metric, expressed in terms of a reference metric which corresponds to the fixed metric on the other site. We emphasize that this picture is a trick to build a consistent theory of massive gravity, and would otherwise be more artificial than its multi-gravity extension. However, as we shall see later, massive gravity can be seen as a perfectly consistent limit of multi- (or bi-)gravity where the massless spin-2 field (and other fields in the multi-gravity case) decouples, and is thus perfectly acceptable.

To simplify the notation for this two-site case, we write the vielbein on both sites as \(e_1 = e\), \(e_2 = f\), and similarly for the metrics \(g_{1,\mu\nu} = g_{\mu\nu}\) and \(g_{2,\mu\nu} = f_{\mu\nu}\). Out of the five-dimensional action for GR, we obtain the theory of massive gravity in four dimensions (on site 1),

$$S^{(5)}_{\rm{EH}} \hookrightarrow S^{(4)}_{\rm{mGR}}\,,$$
(5.21)

with

$$S_{{\rm{mGR}}}^{(4)} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \, \left({R[g] + {m^2}\left({{{[{\mathcal K}]}^2} - [{{\mathcal K}^2}]} \right)} \right)$$
(5.22)
$$= {{M_{{\rm{Pl}}}^2} \over 4}\int \, {\varepsilon _{abcd}}\left({{R^{ab}} \wedge {e^c} \wedge {e^d} + {m^2}{{\mathcal A}^{abcd}}(e,f)} \right)\, ,$$
(5.23)

with the mass term in the vielbein language

$${{\mathcal A}^{abcd}}(e,f) = ({f^a} - {e^a}) \wedge ({f^b} - {e^b}) \wedge {e^c} \wedge {e^d}\, ,$$
(5.24)

or the mass term building block in the metric language,

$${\mathcal K}_\nu ^\mu = \delta _\nu ^\mu - \left({\sqrt {{g^{- 1}}f}} \right)_\nu ^\mu \, .$$
(5.25)

Here we introduced the four-dimensional Planck scale, \(M_{{\rm{Pl}}}^2 = M_5^3\int {dy}\), where in this case the integral is restricted to the neighborhood of one site.

The theory of massive gravity (5.22), or equivalently (5.23) is one special example of a ghost-free theory of massive gravity (i.e., for which the BD ghost is absent). In terms of the ‘Stückelbergized’ tensor \({\mathbb X}\) introduced in Eq. (2.76), we see that

$${\mathcal K}_\nu ^\mu = \delta _\nu ^\mu - \left({\sqrt {\mathbb X}} \right)_\nu ^\mu \, ,$$
(5.26)

or in other words,

$${\mathbb X}_\nu ^\mu = \delta _\nu ^\mu - 2{\mathcal K}_\nu ^\mu + {\mathcal K}_\alpha ^\mu {\mathcal K}_\nu ^\alpha \, ,$$
(5.27)

and the mass term can be written as

$${{\mathcal L}_{{\rm{mass}}}} = - {{{m^2}M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} \left({[{{\mathcal K}^2}] - {{[{\mathcal K}]}^2}} \right)$$
(5.28)
$$=-{{m^{2}M_{\rm{Pl}}^2} \over {2}}\sqrt{-g} \left([(\mathbb{I}-\sqrt{\mathbb{X}})^2]-[\mathbb{I}-\sqrt{\mathbb{X}}]^2\right)\,.$$
(5.29)

This is also a generalization of the Fierz-Pauli mass term, albeit more complicated at first sight than the ones considered in (2.83) or (2.84). As we shall see, it is a generalization that remains free of the BD ghost, as is proven in depth in Section 7. We emphasize that the idea of this approach is not to give a proof of the absence of ghost (which is provided later) but rather to provide an intuitive argument for why the mass term takes its very peculiar structure.

Generalized mass term

This mass term is not the unique acceptable generalization of Fierz-Pauli gravity and by considering more general discretization procedures we can generate the entire 2-parameter family of acceptable potentials for gravity which will also be shown to be free of ghost in Section 7.

Rather than considering the straightforward discretization \(e(x,y) \hookrightarrow e_j(x)\), we could consider the average value on one site, weighted with an arbitrary weight r,

$$e(x,y)\hookrightarrow r e_j+(1-r)e_{j+1}\,.$$
(5.30)

The mass term at one site is then generalized to

$$K^a\wedge K^b \wedge e^c \wedge e^d \hookrightarrow m^2 \mathcal{A}^{abcd}_{r,s}(e_j,e_{j+1})\,,$$
(5.31)

and the most general action for massive gravity with reference vielbein is thus (see Footnote 12)

$${S_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 4}\int \, {\varepsilon _{abcd}}\left({{R^{ab}} \wedge {e^c} \wedge {e^d} + {m^2}{\mathcal A}_{r,s}^{abcd}(e,\, f)} \right)\, ,$$
(5.32)

with

$${\mathcal A}_{r,s}^{abcd}(e,\, f) = ({f^a} - {e^a}) \wedge ({f^b} - {e^b}) \wedge ((1 - r){e^c} + r{f^c}) \wedge ((1 - s){e^d} + s{f^d})\, ,$$

for any r, s ∈ ℝ.

In particular for the two-site case, this generates the two-parameter family of mass terms

$$\begin{array}{*{20}c} {{\mathcal A}_{r,s}^{abcd}(e,f) = {c_0}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {e^d} + {c_1}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {f^d}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\{+ {c_2}\,{e^a} \wedge {e^b} \wedge {f^c} \wedge {f^d} + {c_3}\, {e^a} \wedge {f^b} \wedge {f^c} \wedge {f^d} + {c_4} {f^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \end{array}$$
(5.33)
$$\equiv {{\mathcal A}_{1 - r,1 - s}}(f,e)\, ,$$
(5.34)

with c0 = (1 − s)(1 − r), c1 = (−2 + 3s + 3r − 4rs), c2 = (1 − 3s − 3r + 6rs), c3 = (r + s − 4rs) and c4 = rs. This corresponds to the most general potential which, by construction, includes no cosmological constant nor tadpole. One can also always include a cosmological constant for such models, which would naturally arise from a cosmological constant in the five-dimensional picture.
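The quoted coefficients can be checked by a direct expansion. The sketch below is illustrative only: it assumes that, once contracted with \(\varepsilon_{abcd}\), the vielbein one-forms can be reordered freely, so that e and f may be treated as commuting symbols. The helper names (`poly_mul`, `mass_term_coefficients`, `quoted_coefficients`) and the sample weights are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials in commuting symbols (e, f).

    A polynomial is a dict mapping (power of e, power of f) -> coefficient.
    """
    out = {}
    for (i1, j1), a in p.items():
        for (i2, j2), b in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + a * b
    return out

def mass_term_coefficients(r, s):
    """Expand (f-e)(f-e)((1-r)e + r f)((1-s)e + s f).

    Returns [c0, ..., c4], where c_n multiplies e^(4-n) f^n.
    """
    f_minus_e = {(1, 0): -1, (0, 1): 1}
    weighted_r = {(1, 0): 1 - r, (0, 1): r}
    weighted_s = {(1, 0): 1 - s, (0, 1): s}
    p = poly_mul(poly_mul(f_minus_e, f_minus_e),
                 poly_mul(weighted_r, weighted_s))
    return [p.get((4 - n, n), 0) for n in range(5)]

def quoted_coefficients(r, s):
    """Closed forms for c0..c4 as quoted in the text."""
    return [(1 - s) * (1 - r),
            -2 + 3 * s + 3 * r - 4 * r * s,
            1 - 3 * s - 3 * r + 6 * r * s,
            r + s - 4 * r * s,
            r * s]

r, s = Fraction(1, 3), Fraction(-2, 5)   # arbitrary sample weights
print(mass_term_coefficients(r, s) == quoted_coefficients(r, s))  # True
```

The same helper also exhibits the symmetry (5.34): reversing the list of coefficients is equivalent to replacing (r, s) by (1 − r, 1 − s).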

We see that in the vielbein language, the expression for the mass term is extremely natural and simple. In fact this form was already guessed for special cases in Ref. [410] and even earlier in [502]. However, the crucial analysis on the absence of ghosts and the reason for these terms was incorrect in both of these presentations. Subsequently, after the development of the consistent metric formulation, the generic form of the mass terms was given in Refs. [95] (see Footnote 13) and [314].

In the metric language, this corresponds to the following Lagrangian for dRGT massive gravity [144], or its generalization to an arbitrary reference metric [296],

$${{\mathcal L}_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left({R + {{{m^2}} \over 2}\left({{{\mathcal L}_2}[{\mathcal K}] + {\alpha _3}{{\mathcal L}_3}[{\mathcal K}] + {\alpha _4}{{\mathcal L}_4}[{\mathcal K}]} \right)} \right)\, ,$$
(5.35)

where the two parameters are related to the two discretization parameters r, s as

$${\alpha _3} = r + s,\quad {\rm{and}}\quad {\alpha _4} = rs\, ,$$
(5.36)

and for any tensor Q, we define the scalar \({\mathcal L}_n\) symbolically as

$${{\mathcal L}_n}[Q] = \varepsilon \varepsilon {Q^n}\, ,$$
(5.37)

for any n = 0, ⋯, d, where d is the number of spacetime dimensions. ε is the Levi-Civita antisymmetric symbol, so for instance in four dimensions, \({{\mathcal L}_2}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \beta}}Q_\mu ^{\mu \prime}Q_\nu^{\nu \prime} = 2!({[Q]^2} - [{Q^2}])\), so we recover the mass term expressed in (5.28). The explicit form of these scalars is given in what follows in the relations (6.11)–(6.13) or (6.16)–(6.18).
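As a sanity check of the symbolic definition (5.37), the following sketch evaluates the double Levi-Civita contraction by brute force in four dimensions and compares it with the trace expression for n = 2. It assumes ε is the permutation symbol (no metric factors), which is what gives \({\mathcal L}_0 = 4!\); the function names and the sample matrix are hypothetical.

```python
from itertools import permutations

def parity(perm):
    """Sign of a permutation given as a tuple of 0..d-1."""
    sign = 1
    perm = list(perm)
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def epsilon_contraction(Q, n, d=4):
    """eps^{...} eps_{...} with n index pairs joined by Q, the rest contracted."""
    perms = [(p, parity(p)) for p in permutations(range(d))]
    total = 0
    for up, s_up in perms:
        for low, s_low in perms:
            term = s_up * s_low
            for k in range(d):
                if k < n:
                    term *= Q[low[k]][up[k]]   # Q^{mu'}_{mu}
                elif low[k] != up[k]:          # remaining indices are identified
                    term = 0
                    break
            total += term
    return total

def trace_power(Q, m):
    """[Q^m]: trace of the m-th matrix power."""
    d = len(Q)
    P = [[int(i == j) for j in range(d)] for i in range(d)]
    for _ in range(m):
        P = [[sum(P[i][k] * Q[k][j] for k in range(d)) for j in range(d)]
             for i in range(d)]
    return sum(P[i][i] for i in range(d))

# Arbitrary integer sample matrix (hypothetical).
Q = [[1, 2, 0, -1],
     [0, 3, 1, 2],
     [4, 0, -2, 1],
     [1, 1, 0, 2]]
lhs = epsilon_contraction(Q, 2)
rhs = 2 * (trace_power(Q, 1) ** 2 - trace_power(Q, 2))
print(lhs == rhs)  # True
```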

This procedure is easily generalizable to any number of dimensions, and massive gravity in d dimensions has (d − 2) free parameters which are related to the (d − 2) discretization parameters.

Multi-gravity

In Section 5.2, we showed how to obtain massive gravity from considering the five-dimensional Einstein-Hilbert action on one site (see Footnote 14). Instead, in this section, we integrate over the whole of the extra dimension, which corresponds to summing over all the sites after discretization. Following the procedure of [152], we consider N = 2M + 1 sites to start, which leads to multi-gravity [314], and then focus on the two-site case leading to bi-gravity [293].

Starting with the five-dimensional action (5.12) and applying the discretization procedure (5.31) with \({\mathcal A}_{r,\,s}^{abcd}\) given in (5.33), we get

$$\begin{array}{*{20}c} {{S_{N\,{\rm{mGR}}}} = {{M_4^2} \over 4}\sum\limits_{j = 1}^N {\int}\, {\varepsilon _{abcd}}\left({{R^{ab}}[{e_j}] \wedge e_j^c \wedge e_j^d + m_N^2{\mathcal A}_{{r_j},{s_j}}^{abcd}({e_j},\,{e_{j + 1}})} \right)\quad \quad \quad} \\ {= {{M_4^2} \over 2}\sum\limits_{j = 1}^N {\int {{{\rm{d}}^4}}} x\sqrt {- {g_j}} \left({R[{g_j}] + {{m_N^2} \over 2}\sum\limits_{n = 0}^4 {\alpha _n^{(j)}} {{\mathcal L}_n}({{\mathcal K}_{j,j + 1}})} \right)\, ,} \end{array}$$
(5.38)

with \(M_4^2 = M_5^3R = M_5^3/m\), \(\alpha _2^{(j)} = - 1/2\), and in this deconstruction framework we obtain no cosmological constant nor tadpole, \(\alpha _0^{(j)} = \alpha _1^{(j)} = 0\) at any site j (but we keep them for generality). In the mass Lagrangian, we use the shorthand notation \({\mathcal K}_{j,j+1}\) for the tensor \({\mathcal K}^\mu_{\,\,\,\nu} [{g_j},{g_{j + 1}}]\). This is a special case of multi-gravity presented in [314] (see also [417] for other ‘topologies’ in the way the multiple gravitons interact), where each metric only interacts with two other metrics, i.e., with its closest neighbors, leading to 2N free parameters. For any fixed j, one has \(\alpha _3^{(j)} = {r_j} + {s_j}\) and \(\alpha _4^{(j)} = {r_j}{s_j}\), as in (5.36).

To see the mass spectrum of this multi-gravity theory, we perform a Fourier decomposition, which is what one would obtain (after a field redefinition) by performing a KK decomposition rather than a real space discretization. KK decomposition and deconstruction are thus perfectly equivalent (after a non-linear, but benign, field redefinition; see Footnote 15). We define the discrete Fourier transform of the vielbein variables,

$$\tilde e_{\mu ,n}^a = {1 \over {\sqrt N}}\sum\limits_{j = 1}^N {e_{\mu ,j}^a} {e^{i{{2\pi} \over N}nj}}\, ,$$
(5.39)

with the inverse map,

$$e_{\mu ,j}^a = {1 \over {\sqrt N}}\sum\limits_{n = - M}^M {\tilde e_{\mu ,n}^a} {e^{- i{{2\pi} \over N}nj}}\, .$$
(5.40)
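The transform pair can be checked numerically on a single real component. The sketch below assumes the standard kernel \(e^{\pm 2\pi i\,nj/N}\) with N = 2M + 1 sites and modes n = −M, …, M; the function names and the sample site values are hypothetical.

```python
import cmath

def dft(site_values, M):
    """Fourier modes for n = -M..M of values living on sites 1..N, N = 2M + 1."""
    N = 2 * M + 1
    return {n: sum(site_values[site - 1]
                   * cmath.exp(2j * cmath.pi * n * site / N)
                   for site in range(1, N + 1)) / N ** 0.5
            for n in range(-M, M + 1)}

def inverse_dft(modes, M):
    """Rebuild the site values from the Fourier modes."""
    N = 2 * M + 1
    return [sum(modes[n] * cmath.exp(-2j * cmath.pi * n * site / N)
                for n in range(-M, M + 1)) / N ** 0.5
            for site in range(1, N + 1)]

M = 2
sites = [1.0, 0.5, -0.3, 2.0, 0.7]        # one real component on N = 5 sites
modes = dft(sites, M)
rebuilt = inverse_dft(modes, M)

# Round trip: the inverse map recovers the site values up to rounding.
print(max(abs(a - b) for a, b in zip(sites, rebuilt)))
# Reality condition: the mode -n is the complex conjugate of the mode n.
print(max(abs(modes[-n] - modes[n].conjugate()) for n in range(M + 1)))
```

Both printed numbers should be at the level of floating-point rounding error.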

In terms of the Fourier transform variables, the multi-gravity action then reads at the linear level

$${\mathcal L} = \sum\limits_{n = - M}^M {\left[ {(\partial {{\tilde h}_n})(\partial {{\tilde h}_{- n}}) + m_n^2{{\tilde h}_n}{{\tilde h}_{- n}}} \right]} + {{\mathcal L}_{{\rm{int}}}}$$
(5.41)

with \(M_{{\rm{Pl}}}^{- 1}{{\tilde h}_{\mu \nu,n}} = {{\tilde e}^a}_{\mu, n}\tilde e_{\nu,n}^b{\eta _{ab}} - {\eta _{\mu \nu}}\), where \(M_{\rm Pl}\) represents the four-dimensional Planck scale, \({M_{{\rm{Pl}}}} = {M_4}\sqrt N\). The reality condition on the vielbein imposes \(\tilde e_n = \tilde e_{-n}^*\) and similarly for \({\tilde h_n}\). The mass spectrum is then

$${m_n} = {m_N}\sin \left({{n \over N}} \right) \approx nm\quad {\rm{for}}\quad n \ll N.$$
(5.42)

The counting of the degrees of freedom in multi-gravity goes as follows: the theory contains 2M massive spin-2 fields with five degrees of freedom each and one massless spin-2 field with two degrees of freedom, corresponding to a total of 10M + 2 degrees of freedom. In the continuum limit, we also need to account for the zero mode of the lapse and the shift, which have been gauge fixed in five dimensions (see Ref. [443] for a nice discussion of this point). This leads to three additional degrees of freedom, summing up to a total of 10M + 5 = 5N degrees of freedom of the four coordinates \(x^\mu\).
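The arithmetic of this counting can be spelled out in a few lines. The function name is hypothetical and the sketch only reproduces the bookkeeping described above for N = 2M + 1 sites.

```python
def multigravity_dof(M):
    """Degrees of freedom for N = 2M + 1 sites, following the counting in the text."""
    massive = 2 * M * 5   # 2M massive spin-2 fields, five polarizations each
    massless = 2          # one massless spin-2 field
    zero_modes = 3        # lapse/shift zero modes counted in the continuum limit
    return massive + massless + zero_modes

for M in range(1, 4):
    N = 2 * M + 1
    print(N, multigravity_dof(M), 5 * N)   # the last two columns agree
```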

Bi-gravity

Let us end this section with the special case of bi-gravity. Bi-gravity can also be derived from the deconstruction paradigm, just as massive gravity and multi-gravity, but the idea has been investigated for many years (see for instance [436, 324]). Like massive gravity, bi-gravity was for a long time thought to host a BD ghost, but a ghost-free realization was recently proposed by Hassan and Rosen [293] and bi-gravity is thus experiencing renewed interest. This extension is nothing other than the ghost-free massive gravity Lagrangian for a dynamical reference metric with the addition of an Einstein-Hilbert term for the now dynamical reference metric.

Bi-gravity from deconstruction

Let us consider a two-site discretization with periodic boundary conditions, j = 1, 2, 3, with quantities at the site j = 3 being identified with those at the site j = 1. As in Section 5.2, we denote by \({g_{\mu \nu}} = e_\mu ^ae_\nu^b{\eta _{ab}}\) and by \({f_{\mu \nu}} = f_\mu ^af_\nu^b{\eta _{ab}}\) the metrics and vielbeins at the respective locations \(y_1\) and \(y_2\).

Then applying the discretization procedure highlighted in Eqs. (5.14, 5.15, 5.18, 5.19 and 5.20) and summing over the extra dimension, we obtain the bi-gravity action

$$\begin{array}{*{20}c} S_{\rm{bi-gravity}}={{M_{\rm{Pl}}^2}\over {2}} \int {\rm{d}}^4x \sqrt{-g} R[g]+ {{M_f^2}\over {2}} \int {\rm{d}}^4x \sqrt{-f} R[f]\\ + {{M_{\rm{Pl}}^2} {m^{2}}\over{4}} \int {\rm{d}}^4 x \sqrt{-g} \sum\nolimits_{n=0}^4\alpha_n \mathcal{L}_n[\mathcal{K}[g,f]]\,, \end{array}$$
(5.43)

where \({\mathcal K}[g,f]\) is given in (5.25) and we use the notation \(M_g = M_{\rm Pl}\). We can equally well write the mass terms in terms of \({\mathcal K}[f,g]\) rather than \({\mathcal K}[g,f]\), as performed in (6.21).

Notice that the most naive discretization procedure would lead to \(M_g = M_{\rm Pl} = M_f\), but these can be generalized either ‘by hand’ by changing the weight of each site during the discretization, or by considering a non-trivial configuration along the extra dimension (for instance warping along the extra dimension; see Footnote 16), or most simply by performing a conformal rescaling of the metric at each site.

Here, \({{\mathcal L}_0}[{\mathcal K}[g,f]]\) corresponds to a cosmological constant for the metric \(g_{\mu\nu}\), and the special combination \(\sum\nolimits_{n = 0}^4 {{{(- 1)}^n}C_4^n{{\mathcal L}_n}[{\mathcal K}[g,f]]}\), where the \(C_n^m\) are the binomial coefficients, is the cosmological constant for the metric \(f_{\mu\nu}\), so only \({\mathcal L}_{2,3,4}\) correspond to genuine interactions between the two metrics.

In the deconstruction framework, we naturally obtain \(\alpha_2 = 1\) and no tadpole nor cosmological constant for either metric.

Mass eigenstates

In this formulation of bi-gravity, both metrics g and f carry a superposition of the massless and the massive spin-2 field. As already emphasized, the notion of mass (and of spin) only makes sense for a field living in Minkowski, and so to analyze the mass spectrum we expand both metrics about flat spacetime,

$${g_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}\delta {g_{\mu \nu}}$$
(5.44)
$${f_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_f}}}\delta {f_{\mu \nu}}\, .$$
(5.45)

The general mass spectrum about different backgrounds is richer and is provided in [300]. Here we only focus on a background which preserves Lorentz invariance (in principle we could also include other maximally symmetric backgrounds which have the same amount of symmetry as Minkowski).

Working about Minkowski, then to quadratic order in h, the action for bi-gravity reads (for

$$S_{{\rm{bi - gravity}}}^{(2)} = \int {{{\rm{d}}^4}} x\left[ {- {1 \over 4}\delta {g^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}\delta {g_{\alpha \beta}} - {1 \over 4}\delta {f^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}\delta {f_{\alpha \beta}} - {1 \over 8}m_{{\rm{eff}}}^2\left({h_{\mu \nu}^2 - {h^2}} \right)} \right]\, ,$$
(5.46)

where all indices are raised and lowered with respect to the flat Minkowski metric and the Lichnerowicz operator \(\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}\) was defined in (2.37). We see appearing the Fierz-Pauli mass term combination \(h_{\mu \nu}^2 - {h^2}\) introduced in (2.44) for the massive field, with the effective Planck scale \(M_{\rm eff}\) and the effective mass \(m_{\rm eff}\) defined as [293]

$$M_{\rm{eff}}^2=\left(M_{\rm{Pl}}^{-2}+M_f^{-2}\right)^{-1}$$
(5.47)
$$m_{{\rm{eff}}}^2 = {m^2}{{M_{{\rm{Pl}}}^2} \over {M_{{\rm{eff}}}^2}}\, .$$
(5.48)

The massive field h is given by

$${h_{\mu \nu}} = {M_{{\rm{eff}}}}\left({{1 \over {{M_{{\rm{Pl}}}}}}\delta {g_{\mu \nu}} - {1 \over {{M_f}}}\delta {f_{\mu \nu}}} \right) = {M_{{\rm{eff}}}}\left({{g_{\mu \nu}} - {f_{\mu \nu}}} \right)\, ,$$
(5.49)

while the other combination represents the massless field \(\ell_{\mu\nu}\),

$${\ell _{\mu \nu}} = {M_{{\rm{eff}}}}\left({{1 \over {{M_f}}}\delta {g_{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}\delta {f_{\mu \nu}}} \right)\, ,$$
(5.50)

so that in terms of the light and heavy spin-2 fields (or more precisely in terms of the two mass eigenstates h and ℓ), the quadratic action for bi-gravity reproduces that of a massless spin-2 field ℓ and a Fierz-Pauli massive spin-2 field h with mass \(m_{\rm eff}\),

$$\begin{array}{*{20}c} {S_{{\rm{bi - gravity}}}^{(2)} = \int {{\rm{d}}^4}x\left[ - {1 \over 4}{h^{\mu \nu}}\left[ \hat {\mathcal E} _{\mu \nu}^{\alpha \beta} + {1 \over 2}m_{{\rm{eff}}}^2\left(\delta _\mu ^\alpha \delta _\nu ^\beta - {\eta ^{\alpha \beta}}{\eta _{\mu \nu}} \right) \right]{h_{\alpha \beta}} \right.} \\ {\left. - {1 \over 4}{\ell ^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{\ell _{\alpha \beta}} \right].} \\ \end{array}$$
(5.51)

As explained in [293], in the case where there is a large hierarchy between the two Planck scales \(M_{\rm Pl}\) and \(M_f\), the massive particle is always the one that enters at the lower Planck mass and the massless one is the one that has the larger Planck scale. For instance, if \(M_f \gg M_{\rm Pl}\), the massless particle is mainly given by \(\delta f_{\mu\nu}\) and the massive one mainly by \(\delta g_{\mu\nu}\). This means that in the limit \(M_f \to \infty\) while keeping \(M_{\rm Pl}\) fixed, we recover the theory of massive gravity and a fully decoupled massless graviton, as will be explained in Section 8.2.
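The change of variables (5.49), (5.50) is a rotation in field space, which is why the two kinetic terms of (5.46) keep their form in (5.51). The sketch below checks this numerically and illustrates the hierarchy limit; the function names and the numerical values of the scales are hypothetical.

```python
def effective_planck_mass(M_pl, M_f):
    """M_eff from (5.47)."""
    return (M_pl ** -2 + M_f ** -2) ** -0.5

def effective_mass_squared(m, M_pl, M_f):
    """m_eff^2 from (5.48)."""
    return m ** 2 * M_pl ** 2 / effective_planck_mass(M_pl, M_f) ** 2

def mixing_matrix(M_pl, M_f):
    """Rows express (h, l) in terms of (delta g, delta f), as in (5.49), (5.50)."""
    M_eff = effective_planck_mass(M_pl, M_f)
    return [[M_eff / M_pl, -M_eff / M_f],
            [M_eff / M_f, M_eff / M_pl]]

def gram(R):
    """R R^T, which is the identity when R is orthogonal."""
    return [[sum(R[i][k] * R[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Orthogonality for generic scales (sample values in arbitrary units).
print(gram(mixing_matrix(1.0, 3.0)))

# Large hierarchy M_f >> M_Pl: h is mostly delta g and l is mostly delta f.
print(mixing_matrix(1.0, 1.0e6))
```

In the second printout the off-diagonal entries are of order \(M_{\rm Pl}/M_f\), consistent with the decoupling described above.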

Coupling to matter

So far we have only focused on an empty five-dimensional bulk with no matter. It is natural, though, to consider matter fields living in five dimensions, χ (x, y) with Lagrangian (in the gauge choice (5.7))

$${{\mathcal L}_{{\rm{matter}}}} = \sqrt {- g} \left({- {1 \over 2}{{({\partial _\mu}\chi)}^2} - {1 \over 2}{{({\partial _y}\chi)}^2} - V(\chi)} \right)\, ,$$
(5.52)

in addition to arbitrary potentials (we focus on the case of a scalar field for simplicity, but the same philosophy can be applied to higher-spin species, be they bosons or fermions). Applying the same discretization scheme used for gravity, every matter field then comes in N copies

$$\chi(x,y)\hookrightarrow \chi^{(j)}(x)=\chi(x,y_j)\,,$$
(5.53)

for j = 1, ⋯, N and each field \(\chi^{(j)}\) is coupled to the associated vielbein \(e^{(j)}\) or metric \(g_{\mu \nu}^{(j)} = e_\mu ^{(j)}{}^ae_\nu^{(j)}{}^b{\eta _{ab}}\) at the same site. In the discretization procedure, the gradient along the extra dimension yields a mixing (interaction) between fields located on neighboring sites,

$$\int {\rm{d}} y \, (\partial_y\chi)^2 \hookrightarrow R \sum\nolimits_{j=1}^N\, m^2 (\chi^{(j+1)}(x)- \chi^{(j)}(x))^2\,,$$
(5.54)

(assuming again periodic boundary conditions, χ(N +1) = χ(1)). The discretization procedure could be also performed using a more complicated definition of the derivative along y involving more than two sites, which leads to further interactions between the different fields.

In the two-site derivative formulation, the action for matter is then

$$\begin{array}{*{20}c} {S_{{\rm{matter}}}} \hookrightarrow {1 \over m}\int {{{\rm{d}}^4}} x\sum\limits_j {\sqrt {- {g^{(j)}}}} \left({- {1 \over 2}{g^{(j)\, \mu \nu}}{\partial _\mu}{\chi ^{(j)}}{\partial _\nu}{\chi ^{(j)}}} \right. \\ \left. {\quad \quad \quad \quad \quad \quad \quad - {1 \over 2}{m^2}{{({\chi ^{(j + 1)}} - {\chi ^{(j)}})}^2} - V({\chi ^{(j)}})} \right). \\ \end{array}$$
(5.55)
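To see what the nearest-neighbor mixing does to the spectrum, the following sketch evaluates the sum of squared differences on discrete cosine modes with periodic identification. It illustrates the standard lattice result that the uniform mode costs nothing (a zero mode) while the mode n picks up a factor \(4\sin^2(\pi n/N)\); the overall normalization in terms of m or \(m_N\) depends on conventions and is not fixed here. Function names and the choice N = 7 are hypothetical.

```python
import math

def gradient_energy(chi):
    """Sum over sites of (chi^(j+1) - chi^(j))^2 with periodic identification."""
    N = len(chi)
    return sum((chi[(j + 1) % N] - chi[j]) ** 2 for j in range(N))

def cosine_mode(n, N):
    """Real discrete Fourier mode of index n on N sites."""
    return [math.cos(2 * math.pi * n * j / N) for j in range(N)]

def mode_ratio(n, N):
    """Gradient energy of the mode n divided by its squared norm."""
    chi = cosine_mode(n, N)
    return gradient_energy(chi) / sum(x * x for x in chi)

N = 7
for n in range(4):
    # The two printed columns agree up to rounding.
    print(n, mode_ratio(n, N), 4 * math.sin(math.pi * n / N) ** 2)
```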

The coupling to gauge fields or fermions can be derived in the same way, and the vielbein formalism makes it natural to extend the action (5.6) to five dimensions and apply the discretization procedure. Interestingly, in the case of fermions, the fields on neighboring sites would not directly couple to one another, but they would couple to both the vielbein \(e^{(j)}\) at the same site and the one \(e^{(j-1)}\) on the neighboring site.

Notice, however, that the current full proofs for the absence of the BD ghost do not include such couplings between matter fields living on different metrics (or vielbeins), nor matter fields coupling directly to more than one metric (vielbein).

No new kinetic interactions

In GR, diffeomorphism invariance uniquely fixes the kinetic term to be the Einstein-Hilbert one

$${{\mathcal L}_{EH}} = \sqrt {- g} R \,,$$
(5.56)

(see, for instance, Refs. [287, 483, 175, 225, 76] for the uniqueness of GR for the theory of a massless spin-2 field).

In more than four dimensions, the GR action can be supplemented by additional Lovelock invariants [383] which respect diffeomorphism invariance and are expressed in terms of higher powers of the Riemann curvature but lead to second order equations of motion. In four dimensions there is only one non-trivial additional Lovelock invariant, corresponding to the Gauss-Bonnet term, but it is topological and thus does not affect the theory unless other degrees of freedom such as a scalar field are included.

So, when dealing with the theory of a single massless spin-2 field in four dimensions the only allowed kinetic term is the well-known Einstein-Hilbert one. Now when it comes to the theory of a massive spin-2 field, diffeomorphism invariance is broken and so in addition to the allowed potential terms described in (6.9)–(6.13), one could consider other kinetic terms which break diffeomorphism invariance.

This possibility was explored in Refs. [231, 310, 230] where it was shown that in four dimensions, the following derivative interaction \({\mathcal L}_3^{{\rm{(der)}}}\) is ghost-free at leading order (i.e., there are no higher derivatives for the Stückelberg fields when introducing the Stückelberg fields associated with linear diffeomorphism),

$${\mathcal L}_3^{({\rm{der}})} = {\varepsilon ^{\mu \nu \, \rho \sigma}}{\varepsilon ^{\mu \prime \nu \prime \rho \prime \sigma \prime}}{h_{\sigma \sigma \prime}}{\partial _\rho}{h_{\mu \mu \prime}}{\partial _{\rho \prime}}{h_{\nu \nu \prime}}\, .$$
(5.57)

So this new derivative interaction would be allowed for a theory of a massive spin-2 field which does not couple to matter. Note that this interaction can only be considered if the spin-2 field is massive in the first place, so this interaction can only be present if the Fierz-Pauli mass term (2.44) is already present in the theory.

Now let us turn to a theory of gravity. In that case, we have seen that the coupling to matter forces linear diffeomorphisms to be extended to fully non-linear diffeomorphism. So to be viable in a theory of massive gravity, the derivative interaction (5.57) should enjoy a ghost-free non-linear completion (the absence of ghost non-linearly can be checked for instance by restoring non-linear diffeomorphism using the non-linear Stückelberg decomposition (2.80) in terms of the helicity-1 and -0 modes given in (2.46), or by performing an ADM analysis as will be performed for the mass term in Section 7.) It is easy to check that by itself \({\mathcal L}_3^{{\rm{(der)}}}\) has a ghost at quartic order and so other non-linear interactions should be included for this term to have any chance of being ghost-free.

Within the deconstruction paradigm, the non-linear completion of \({\mathcal L}_3^{{\rm{(der)}}}\) could have a natural interpretation as arising from the five-dimensional Gauss-Bonnet term after discretization. Exploring this avenue would indeed lead to a new kinetic interaction of the form \(\sqrt {- g} {{\mathcal K}_{\mu \nu}}{{\mathcal K}_{\alpha \beta}}^*{R^{\mu \nu\alpha \beta}}\), where *R is the dual Riemann tensor [339, 153]. However, a simple ADM analysis shows that such a term propagates more than five degrees of freedom and thus has an Ostrogradsky ghost (similar to the BD ghost). As a result, the new kinetic interaction (5.57) does not have a natural realization from a five-dimensional point of view (at least in its metric formulation; see Ref. [153] for more details).

We can push the analysis even further and show that no matter what the higher order interactions are, as soon as \({\mathcal L}_3^{{\rm{(der)}}}\) is present it will always lead to a ghost and so such an interaction is never acceptable [153].

As a result, the Einstein-Hilbert kinetic term is the only allowed kinetic term in Lorentz-invariant (massive) gravity.

This result shows how special and unique the Einstein-Hilbert term is. Even without imposing diffeomorphism invariance, the stability of the theory fixes the kinetic term to be nothing else than the Einstein-Hilbert term and thus forces diffeomorphism invariance at the level of the kinetic term. Even without requiring coordinate transformation invariance, the Riemann curvature remains the building block of the kinetic structure of the theory, just as in GR.

Before summarizing the derivation of massive gravity from higher dimensional deconstruction/Kaluza-Klein decomposition, we briefly comment on other ‘apparent’ modifications of the kinetic structure like in f(R) gravity (see for instance Refs. [89, 354, 46] for f(R) massive gravity and their implications for cosmology).

Such kinetic terms à la f(R) are also possible without a mass term for the graviton. In that case diffeomorphism invariance allows us to perform a change of frame. In the Einstein frame, f(R) gravity is seen to correspond to a theory of gravity with a scalar field, and the same result will hold in f(R) massive gravity (in that case the scalar field couples non-trivially to the Stückelberg fields). As a result, f(R) is not a genuine modification of the kinetic term but rather a standard Einstein-Hilbert term with the addition of a new scalar degree of freedom, which is not a degree of freedom of the graviton but rather an independent scalar degree of freedom which couples non-minimally to matter (see Ref. [128] for a review on f(R) gravity).

Part II Ghost-free Massive Gravity

Massive, Bi- and Multi-Gravity Formulation: A Summary

The previous ‘deconstruction’ framework gave an intuitive argument for the emergence of a potential of the form (6.3) (or (6.1) in the vielbein language) and its bi- and multi-metric generalizations. In deconstruction or Kaluza-Klein decomposition a certain type of interaction arises naturally, and we have seen that the whole spectrum of allowed potentials (or interactions) could be generated by extending the deconstruction procedure to a more general notion of derivative or by involving the mixing of more sites in the definition of the derivative along the extra dimensions. We here summarize the most general formulation for the theories of massive gravity about a generic reference metric, bi-gravity and multi-gravity, and provide a dictionary between the different languages used in the literature.

The general action for ghost-free (or dRGT) massive gravity [144] in the vielbein language is [95, 314] (see however Footnote 13 with respect to Ref. [95], see also Refs. [502, 410] for earlier work)

$${S_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 4}\, \int {\left({{\varepsilon _{abcd}}{R^{ab}} \wedge {e^c} \wedge {e^d} + {m^2}{{\mathcal L}^{({\rm{mass}})}}(e,\, f)} \right),}$$
(6.1)

with

$$\begin{array}{*{20}c} {{{\mathcal L}^{({\rm{mass}})}}(e,f) = {\varepsilon _{abcd}}\left[ {{c_0}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {e^d} + {c_1}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {f^d}} \right.} \\ {\quad \quad \quad \quad + {c_2}\, {e^a} \wedge {e^b} \wedge {f^c} \wedge {f^d} + {c_3}\, {e^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \\ {\left. {+ {c_4}\, {f^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \right],\,\,\,\quad \quad \quad} \\ \end{array}$$
(6.2)

or in the metric language [144],

$${S_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}x\sqrt {- g} \left({R + {{{m^2}} \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]]} \right)\, .}$$
(6.3)

In what follows we will use the notation for the overall potential of massive gravity

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} \sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]] = - {{\mathcal L}^{({\rm{mass}})}}(e,f)\, ,$$
(6.4)

so that

$${{\mathcal L}_{{\rm{mGR}}}} = M_{{\rm{Pl}}}^2{{\mathcal L}_{{\rm{GR}}}}[g] - {m^2}\;{\mathcal U}[g,f]\, ,$$
(6.5)

where \({\mathcal L}_{\rm GR}[g]\) is the standard GR Einstein-Hilbert Lagrangian for the dynamical metric \(g_{\mu\nu}\) and \(f_{\mu\nu}\) is the reference metric, and for bi-gravity,

$${{\mathcal L}_{{\rm{bi - gravity}}}} = M_{{\rm{Pl}}}^2{{\mathcal L}_{{\rm{GR}}}}[g] + M_f^2{{\mathcal L}_{{\rm{GR}}}}[f] - {m^2}\;{\mathcal U}[g,f]\, ,$$
(6.6)

where both \(g_{\mu\nu}\) and \(f_{\mu\nu}\) are then dynamical metrics.

Both massive gravity and bi-gravity break one copy of diff invariance and so the Stückelberg fields can be introduced in exactly the same way in both cases, \({\mathcal U}[g,f] \to {\mathcal U}[g,\tilde f]\), where the Stückelbergized metric \({\tilde f_{\mu \nu}}\) was introduced in (2.75) (or alternatively \({\mathcal U}[g,f] \to {\mathcal U}[\tilde g,f]\)). Thus bi-gravity is by no means an alternative to introducing the Stückelberg fields, as is sometimes stated.

In these formulations, \({\mathcal L}_0\) (or the term proportional to \(c_0\)) corresponds to a cosmological constant, \({\mathcal L}_1\) to a tadpole, \({\mathcal L}_2\) to the mass term and \({\mathcal L}_{3,4}\) to allowed higher order interactions. The presence of the tadpole \({\mathcal L}_1\) would imply a non-zero vev. The presence of the potentials \({\mathcal L}_{3,4}\) without \({\mathcal L}_2\) would lead to infinitely strongly coupled degrees of freedom and would thus be pathological. We recall that \({\mathcal K}[g,f]\) is given in terms of the metrics g and f as

$${\mathcal K}_{\,\,\,\nu}^\mu [g,f] = \delta _{\,\,\,\nu}^\mu - \left({\sqrt {{g^{- 1}}f}} \right)_\nu ^\mu \, ,$$
(6.7)

and the Lagrangians \({\mathcal L}_n\) are defined as follows in arbitrary dimensions d [144]

$${{\mathcal L}_n}[Q] = - (d - n)!\sum\limits_{m = 1}^n {{{(- 1)}^m}} {{(n - 1)!} \over {(n - m)!(d - n + m)!}}[{Q^m}]{\mathcal L}_{n - m}[Q]\, ,$$
(6.8)

with \({\mathcal L}_0[Q] = d!\) and \({\mathcal L}_1[Q] = (d - 1)![Q]\), or equivalently in four dimensions [292]

$${{\mathcal L}_0}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \nu \alpha \beta}}$$
(6.9)
$${{\mathcal L}_1}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \alpha \beta}}\, Q_\mu ^{\mu \prime}$$
(6.10)
$${{\mathcal L}_2}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \beta}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime}$$
(6.11)
$${{\mathcal L}_3}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \prime \beta}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime}Q_\alpha ^{\alpha \prime}$$
(6.12)
$${{\mathcal L}_4}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \prime \beta \prime}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime}Q_\alpha ^{\alpha \prime}Q_\beta ^{\beta \prime}\, .$$
(6.13)

We have introduced the constant \({{\mathcal L}_0}\) (\({{\mathcal L}_0} = 4!\) and \(\sqrt {- g}\,{{\mathcal L}_0}\) is nothing other than the cosmological constant) and the tadpole \({\mathcal L}_1\) for completeness. Notice however that not all these five Lagrangians are independent and the tadpole can always be re-expressed in terms of a cosmological constant and the other potential terms.

Alternatively, we may express these scalars as follows [144]

$${{\mathcal L}_0}[Q] = 4!$$
(6.14)
$${{\mathcal L}_1}[Q] = 3!\, [Q]$$
(6.15)
$${{\mathcal L}_2}[Q] = 2!({[Q]^2} - [{Q^2}])$$
(6.16)
$${{\mathcal L}_3}[Q] = ({[Q]^3} - 3[Q][{Q^2}] + 2[{Q^3}])$$
(6.17)
$${{\mathcal L}_4}[Q] = ({[Q]^4} - 6{[Q]^2}[{Q^2}] + 3{[{Q^2}]^2} + 8[Q][{Q^3}] - 6[{Q^4}])\, .$$
(6.18)

These are easily generalizable to any number of dimensions, and in d dimensions we find d such independent scalars.
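The recursion (6.8) and the explicit expressions (6.14)–(6.18) can be cross-checked numerically. The sketch below assumes that the last factor in the recursion is the lower-order scalar \({\mathcal L}_{n-m}[Q]\) and that \({\mathcal L}_0 = d!\); it works with the power traces \([Q^m]\) of a matrix specified by its eigenvalues. Function names and the sample eigenvalues are hypothetical.

```python
from fractions import Fraction
from math import factorial

def power_traces(eigs, max_power):
    """[Q^m] for m = 0..max_power, for a matrix Q with the given eigenvalues."""
    return [sum(x ** m for x in eigs) for m in range(max_power + 1)]

def L_recursive(n, tr, d=4):
    """L_n[Q] from the recursion, with L_0 = d!; tr[m] holds [Q^m]."""
    if n == 0:
        return factorial(d)
    total = 0
    for m in range(1, n + 1):
        coeff = Fraction(factorial(n - 1),
                         factorial(n - m) * factorial(d - n + m))
        total += (-1) ** m * coeff * tr[m] * L_recursive(n - m, tr, d)
    return -factorial(d - n) * total

def L_explicit(n, tr):
    """Explicit four-dimensional expressions in terms of traces."""
    t1, t2, t3, t4 = tr[1], tr[2], tr[3], tr[4]
    return [24,
            6 * t1,
            2 * (t1 ** 2 - t2),
            t1 ** 3 - 3 * t1 * t2 + 2 * t3,
            t1 ** 4 - 6 * t1 ** 2 * t2 + 3 * t2 ** 2 + 8 * t1 * t3 - 6 * t4][n]

tr = power_traces([2, -1, 3, 5], 4)      # arbitrary sample eigenvalues
print(all(L_recursive(n, tr) == L_explicit(n, tr) for n in range(5)))  # True
```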

The multi-gravity action is a generalization to multiple interacting spin-2 fields with the same form for the interactions, and bi-gravity is the special case of two metrics (N = 2), [314]

$${S_N} = {{M_{{\rm{Pl}}}^2} \over 4}\sum\limits_{j = 1}^N {\int {\left({{\varepsilon _{abcd}}{R^{ab}}[{e_j}] \wedge e_j^c \wedge e_j^d + m_N^2{{\mathcal L}^{({\rm{mass}})}}({e_j},{e_{j + 1}})} \right)\,}} ,$$
(6.19)

or

$${S_N} = {{M_{{\rm{Pl}}}^2} \over 2}\sum\limits_{j = 1}^N {\int {{{\rm{d}}^4}x\sqrt {- {g_j}} \left({R[{g_j}] + {{m_N^2} \over 2}\sum\limits_{n = 0}^4 {\alpha _n^{(j)}} {{\mathcal L}_n}[{\mathcal K}[{g_j},{g_{j + 1}}]]} \right)\,}} .$$
(6.20)

Inverse argument

We could have written this set of interactions in terms of \({\mathcal K}[f,g]\) rather than \({\mathcal K}[g,f]\),

$$\begin{array}{*{20}c} {{\mathcal U} = {{M_{{\rm{Pl}}}^2{m^2}} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- g} \sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]]\quad} \\ {\, = {{M_{{\rm{Pl}}}^2{m^2}} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- f} \sum\limits_{n = 0}^4 {{{\tilde \alpha}_n}} {{\mathcal L}_n}[{\mathcal K}[f,g]]\, ,} \\ \end{array}$$
(6.21)

with

$$\left(\begin{array}{*{20}c} {{{\tilde \alpha}_0}} \\ {{{\tilde \alpha}_1}} \\ {{{\tilde \alpha}_2}} \\ {{{\tilde \alpha}_3}} \\ {{{\tilde \alpha}_4}} \\ \end{array} \right) = \left(\begin{array}{*{20}c} 1 & 0 & 0 & 0 & 0 \\ {- 4} & {- 1} & 0 & 0 & 0 \\ 6 & 3 & 1 & 0 & 0 \\ {- 4} & {- 3} & {- 2} & {- 1} & 0 \\ 1 & 1 & 1 & 1 & 1 \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right)$$
(6.22)

Interestingly, the absence of a tadpole and cosmological constant for, say, the metric g implies α0 = α1 = 0, which in turn implies the absence of a tadpole and cosmological constant for the other metric f, \({\tilde \alpha _0} = {\tilde \alpha _1} = 0\), and thus \({\tilde \alpha _2} = {\alpha _2} = 1\).
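The statement above can be checked directly from the matrix in (6.22). The following minimal sketch (helper names and the sample values of α3 and α4 are assumptions made for illustration) also verifies that applying the map twice returns the original coefficients, as expected when the roles of g and f are exchanged twice.

```python
# Matrix of (6.22): alpha_tilde = A . alpha
A = [
    [1, 0, 0, 0, 0],
    [-4, -1, 0, 0, 0],
    [6, 3, 1, 0, 0],
    [-4, -3, -2, -1, 0],
    [1, 1, 1, 1, 1],
]


def matvec(M, v):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(M, N):
    """Matrix-matrix product for plain nested lists."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]


identity = [[1 if i == j else 0 for j in range(5)] for i in range(5)]
A_squared = matmul(A, A)

# No tadpole or cosmological constant for g: alpha_0 = alpha_1 = 0, alpha_2 = 1.
# The values of alpha_3 and alpha_4 are arbitrary illustrative choices.
alpha = [0, 0, 1, 7, -3]
alpha_tilde = matvec(A, alpha)
```

With these inputs the first three entries of `alpha_tilde` are 0, 0 and 1, while the last two entries mix α2, α3 and α4.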

Alternative variables

Alternatively, another fully equivalent convention has also been used in the literature [292] in terms of \({\mathbb X}_{\,\,\,\nu}^\mu = {g^{\mu \alpha}}{f_{\alpha \nu}}\) defined in (2.76),

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} \sum\limits_{n = 0}^4 {{{{\beta _n}} \over {n!}}} {{\mathcal L}_n}[\sqrt {\mathbb{X} }]\, ,$$
(6.23)

which is equivalent to (6.4) with \({{\mathcal L}_0} = 4!\) and

$$\left(\begin{array}{*{20}c} {{\beta _0}} \\ {{\beta _1}} \\ {{\beta _2}} \\ {{\beta _3}} \\ {{\beta _4}} \\ \end{array} \right) = \left(\begin{array}{*{20}c} 1 & 1 & 1 & 1 & 1 \\ 0 & {- 1} & {- 2} & {- 3} & {- 4} \\ 0 & 0 & 2 & 6 & {12} \\ 0 & 0 & 0 & {- 6} & {- 24} \\ 0 & 0 & 0 & 0 & {24} \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right)\, ,$$
(6.24)

or the inverse relation,

$$\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right) = {1 \over {24}}\left(\begin{array}{*{20}c} {24} & {24} & {12} & 4 & 1 \\ 0 & {- 24} & {- 24} & {- 12} & {- 4} \\ 0 & 0 & {12} & {12} & 6 \\ 0 & 0 & 0 & {- 4} & {- 4} \\ 0 & 0 & 0 & 0 & 1 \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\beta _0}} \\ {{\beta _1}} \\ {{\beta _2}} \\ {{\beta _3}} \\ {{\beta _4}} \\ \end{array} \right)\, ,$$
(6.25)

so that in order to avoid a tadpole and a cosmological constant we need to set for instance β4 = − (24β0 + 24β1 + 12β2 + 4β3) and β3 = −6(4β0 + 3β1 + β2).
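The two matrices in (6.24) and (6.25) and the resulting conditions on β3 and β4 can be verified with exact rational arithmetic. This is a minimal sketch; the variable names and the sample values of β0, β1 and β2 are arbitrary assumptions.

```python
from fractions import Fraction

# Matrix of (6.24): beta = ALPHA_TO_BETA . alpha
ALPHA_TO_BETA = [
    [1, 1, 1, 1, 1],
    [0, -1, -2, -3, -4],
    [0, 0, 2, 6, 12],
    [0, 0, 0, -6, -24],
    [0, 0, 0, 0, 24],
]

# Matrix of (6.25), including the overall factor 1/24.
BETA_TO_ALPHA = [
    [Fraction(x, 24) for x in row]
    for row in [
        [24, 24, 12, 4, 1],
        [0, -24, -24, -12, -4],
        [0, 0, 12, 12, 6],
        [0, 0, 0, -4, -4],
        [0, 0, 0, 0, 1],
    ]
]


def matmul(M, N):
    """Matrix-matrix product for plain nested lists."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]


def matvec(M, v):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


identity = [[1 if i == j else 0 for j in range(5)] for i in range(5)]
product = matmul(ALPHA_TO_BETA, BETA_TO_ALPHA)

# Choose beta_0, beta_1, beta_2 freely, then impose the two conditions.
b0, b1, b2 = 1, 2, 3
b3 = -6 * (4 * b0 + 3 * b1 + b2)
b4 = -(24 * b0 + 24 * b1 + 12 * b2 + 4 * b3)
alpha = matvec(BETA_TO_ALPHA, [b0, b1, b2, b3, b4])
```

The product of the two matrices is the identity, and the resulting α0 and α1 vanish, so neither a cosmological constant nor a tadpole is present for this choice.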

Expansion about the reference metric

In the vielbein language the mass term is extremely simple, as can be seen in Eq. (6.1) with \({\mathcal A}\) defined in (2.60). Back to the metric language, this means that the mass term takes a remarkably simple form when writing the dynamical metric gμν in terms of the reference metric and a difference \({\tilde h_{\mu \nu}} = 2{h_{\mu \nu}} + h_{\mu \nu}^2\) as

$${g_{\mu \nu}} = {f_{\mu \nu}} + 2{h_{\mu \nu}} + {h_{\mu \alpha}}{h_{\nu \beta}}{f^{\alpha \beta}}\, ,$$
(6.26)

where \({f^{\alpha \beta}} = {({f^{- 1}})^{\alpha \beta}}\). The mass term is then expressed as

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- f} \sum\limits_{n = 0}^4 {{\kappa _n}} {{\mathcal L}_n}[{f^{\mu \alpha}}{h_{\alpha \nu}}]\, ,$$
(6.27)

where the \({\tilde {\mathcal L}_n} \equiv {{\mathcal L}_n}[{f^{\mu \alpha}}{h_{\alpha \nu}}]\) have the same expression as the \({{\mathcal L}_n}\) in (6.9)-(6.13), so \({\tilde {\mathcal L}_n}\) is genuinely nth order in hμν. The expression (6.27) is thus at most quartic order in hμν but is valid to all orders in hμν (there is no assumption that h be small). In other words, the mass term (6.27) is not an expansion in hμν truncated to a finite (quartic) order, but rather a fully equivalent way to rewrite the mass Lagrangian in terms of the variable hμν rather than gμν. Of course the kinetic term is intrinsically non-linear and includes an infinite expansion in hμν. A generalization of such parameterizations is provided in [300].

The relation between the coefficients κn and αn is given by

$$\left(\begin{array}{*{20}c} {{\kappa _0}} \\ {{\kappa _1}} \\ {{\kappa _2}} \\ {{\kappa _3}} \\ {{\kappa _4}} \\ \end{array} \right) = \left(\begin{array}{*{20}c} 1 & 0 & 0 & 0 & 0 \\ 4 & 1 & 0 & 0 & 0 \\ 6 & 3 & 1 & 0 & 0 \\ 4 & 3 & 2 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right)\, .$$
(6.28)
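The relation between the κn and the αn follows from (6.26): writing \(H = {f^{- 1}}h\), one has \({g^{- 1}}f = {(1 + H)^{- 2}}\), so that \({\mathcal K} = H{(1 + H)^{- 1}}\) and \(\sqrt {- g} = \sqrt {- f} \det (1 + H)\). The sketch below checks numerically, in terms of the eigenvalues of H, that this leads to the binomial pattern \({\kappa _m} = \sum\nolimits_n {\binom{4 - n}{m - n}} {\alpha _n}\). It assumes a diagonalizable H on the branch of the square root connected to the identity; the function names and sample numbers are illustrative only.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial, prod


def L(n, eigs):
    """L_n of a matrix with eigenvalues eigs, written as n! (4-n)! e_n."""
    e_n = sum(prod(c) for c in combinations(eigs, n))
    return factorial(n) * factorial(4 - n) * e_n


def kappa_from_alpha(alpha):
    """kappa_m = sum_n C(4-n, m-n) alpha_n (binomial pattern being tested)."""
    return [
        sum(comb(4 - n, m - n) * alpha[n] for n in range(m + 1))
        for m in range(5)
    ]


# Arbitrary sample coefficients and eigenvalues of H = f^{-1} h.
alpha = [Fraction(a) for a in (2, -1, 3, 5, -4)]
h_eigs = [Fraction(1, 2), Fraction(-1, 3), Fraction(2), Fraction(1, 5)]

k_eigs = [x / (1 + x) for x in h_eigs]    # eigenvalues of K = H (1+H)^{-1}
det_ratio = prod(1 + x for x in h_eigs)   # sqrt(-g) / sqrt(-f) = det(1+H)

# sqrt(-g) sum_n alpha_n L_n[K]  versus  sqrt(-f) sum_m kappa_m L_m[H]
lhs = det_ratio * sum(alpha[n] * L(n, k_eigs) for n in range(5))
kappa = kappa_from_alpha(alpha)
rhs = sum(kappa[m] * L(m, h_eigs) for m in range(5))
```

Exact rational arithmetic is used so that the two sides can be compared without rounding error.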

The quadratic expansion about a background different from the reference metric was derived in Ref. [278]. Notice however that even though the mass term may not appear as having an exact Fierz-Pauli structure as shown in [278], it still has the correct structure to avoid any BD ghost, about any background [295, 294, 300, 297].

Evading the BD Ghost in Massive Gravity

The deconstruction framework gave an intuitive approach on how to construct a theory of massive gravity or multiple interacting ‘gravitons’. This led to the ghost-free dRGT theory of massive gravity and its bi- and multi-gravity extensions in a natural way. However, these developments were only possible a posteriori.

The deconstruction framework was proposed earlier (see Refs. [24, 25, 168, 28, 443, 168, 170]) directly in the metric language and despite starting from a perfectly healthy five-dimensional theory of GR, the discretization in the metric language leads to the standard BD issue (this also holds in a KK decomposition when truncating the KK tower at some finite energy scale). Knowing that massive gravity (or multi-gravity) can be naturally derived from a healthy five-dimensional theory of GR is thus not a sufficient argument for the absence of the BD ghost, and a great amount of effort was devoted to that proof, which is by now known in a multitude of different forms and languages.

Within this review, one cannot do justice to all the independent proofs that have been formulated by now in the literature. We thus focus on a few of them: the Hamiltonian analysis in the ADM language, as well as the analysis in the Stückelberg language. One of the proofs in the vielbein formalism will be used in the multi-gravity case, and thus we do not emphasize that proof in the context of massive gravity, although it is perfectly applicable (and actually very elegant) in that case. Finally, after deriving the decoupling limit in Section 8.3, we also briefly review how it can be used to prove the absence of ghost more generically.

We note that even though the original argument on how the BD ghost could be circumvented in the full nonlinear theory was presented in [137] and [144], the absence of BD ghost in “ghost-free massive gravity” or dRGT has been the subject of many discussions [12, 13, 345, 342, 95, 341, 344, 96] (see also [350, 351, 349, 348, 352] for related discussions in bi-gravity). By now the confusion has been clarified; see for instance [295, 294, 400, 346, 343, 297, 15, 259] for thorough proofs addressing all the issues raised in the previous literature. (See also [347] for the proof of the absence of ghosts in other closely related models.)

ADM formulation

ADM formalism for GR

Before going on to the subtleties associated with massive gravity, let us briefly summarize how the counting of the number of degrees of freedom can be performed in the ADM language using the Hamiltonian for GR. Using an ADM decomposition (where this time, we single out the time, rather than the extra dimension as was performed in Part I),

$${\rm{d}}{s^2} = - {N^2}{\rm{d}}{t^2} + {\gamma _{ij}}({\rm{d}}{x^i} + {N^i}{\rm{d}}t)\;({\rm{d}}{x^j} + {N^j}{\rm{d}}t)\,,$$
(7.1)

with the lapse N, the shift \({N^i}\) and the 3-dimensional space metric γij. In this section indices are raised and lowered with respect to γij and dots represent derivatives with respect to t. In terms of these variables, the action density for GR is

$${{\mathcal L}_{{\rm{GR}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {\rm{d}} t\left({\sqrt {- g} R + {\partial _t}\left[ {\sqrt {- g} [k]} \right]} \right)$$
(7.2)
$$= {{M_{{\rm{Pl}}}^2} \over 2}\int {\rm{d}} tN\sqrt \gamma \left({{}^{(3)}R[\gamma ] + {{[k]}^2} - [{k^2}]} \right)\,,$$
(7.3)

where \({}^{(3)}R\) is the three-dimensional scalar curvature built out of γ (no time derivatives in \({}^{(3)}R\)) and \({k_{ij}}\) is the three-dimensional extrinsic curvature,

$${k_{ij}} = {1 \over {2N}}({\dot \gamma _{ij}} - {\nabla _{(i}}{N_{j)}})\,.$$
(7.4)

The GR action can thus be expressed in a way which has no double or higher time derivatives and only first time-derivatives squared of γij. This means that neither the shift nor the lapse is truly dynamical and they do not have any associated conjugate momenta. The conjugate momentum associated with γ is,

$${p^{ij}} = {{\partial \sqrt {- g} R} \over {\partial {{\dot \gamma}_{ij}}}}\,.$$
(7.5)

We can now construct the Hamiltonian density for GR in terms of the 12 phase space variables (γij and pij carry 6 components each),

$${{\mathcal H}_{{\rm{GR}}}} = N{{\mathcal R}_0}(\gamma ,p) + {N^i}{{\mathcal R}_i}(\gamma ,p)\,.$$
(7.6)

So we see that in GR, both the shift and the lapse play the role of Lagrange multipliers. Thus they propagate a first-class constraint each which removes 2 phase space degrees of freedom per constraint. The counting of the number of degrees of freedom in phase space thus goes as follows:

$$(2 \times 6) - 2\;{\rm{lapse\;constraints}} - 2 \times 3\;{\rm{shift\;constraints}} = 4 = 2 \times 2\,,$$
(7.7)

corresponding to a total of 4 degrees of freedom in phase space, or 2 independent degrees of freedom in field space. This is the very well-known and established result that in four dimensions GR propagates 2 physical degrees of freedom, or gravitational waves have two polarizations.

This result is fully generalizable to any number of dimensions, and in d spacetime dimensions, gravitational waves carry d (d − 3)/2 polarizations. We now move to the case of massive gravity.

ADM counting in massive gravity

We now amend the GR Lagrangian with a potential \({\mathcal U}\). As already explained, this can only be performed by breaking covariance (with the exception of a cosmological constant). This potential could be a priori an arbitrary function of the metric, but contains no derivatives and so does not affect the definition of the conjugate momenta pij. This translates directly into a potential at the level of the Hamiltonian density,

$${\mathcal H} = N{{\mathcal R}_0}(\gamma ,p) + {N^i}{{\mathcal R}_i}(\gamma ,p) + {m^2}{\mathcal U}({\gamma _{ij}},{N^i},N)\,,$$
(7.8)

where the overall potential for ghost-free massive gravity is given in (6.4).

If \({\mathcal U}\) depends non-linearly on the shift or the lapse then these are no longer directly Lagrange multipliers (if they are non-linear, they still appear at the level of the equations of motion, and so they do not propagate a constraint for the metric but rather for themselves). As a result for an arbitrary potential one is left with (2 × 6) degrees of freedom in the three-dimensional metric and its momentum conjugate and no constraint is present to reduce the phase space. This leads to 6 degrees of freedom in field space: the two usual transverse polarizations for the graviton (as we have in GR), in addition to two ‘vector’ polarizations and two ‘scalar’ polarizations.

These 6 polarizations correspond to the five healthy massive spin-2 field degrees of freedom in addition to the sixth BD ghost, as explained in Section 2.5 (see also Section 7.2).

This counting is also generalizable to an arbitrary number of dimensions: in d spacetime dimensions, a massive spin-2 field should propagate the same number of degrees of freedom as a massless spin-2 field in d + 1 dimensions, that is (d + 1)(d − 2)/2 polarizations. However, an arbitrary potential would allow for d (d − 1)/2 independent degrees of freedom, which is one too many, always corresponding to one BD ghost degree of freedom in an arbitrary number of dimensions.
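The three counting formulas quoted in this section can be tabulated in a few lines. The sketch below simply encodes d (d − 3)/2, (d + 1)(d − 2)/2 and d (d − 1)/2; the function names are illustrative choices, not notation from the text.

```python
def massless_spin2_polarizations(d):
    """Polarizations of a massless graviton in d spacetime dimensions."""
    return d * (d - 3) // 2


def massive_spin2_polarizations(d):
    """Healthy polarizations of a massive graviton in d spacetime dimensions."""
    return (d + 1) * (d - 2) // 2


def generic_potential_dofs(d):
    """Degrees of freedom for an arbitrary potential: components of gamma_ij."""
    return d * (d - 1) // 2


# Rows of (d, massless, massive, generic potential) for d = 3..11.
table = [
    (
        d,
        massless_spin2_polarizations(d),
        massive_spin2_polarizations(d),
        generic_potential_dofs(d),
    )
    for d in range(3, 12)
]

# The mismatch between a generic potential and a healthy massive graviton.
ghost_count = {d: generic - massive for d, _, massive, generic in table}
```

In d = 4 this gives 2, 5 and 6 respectively, and the mismatch between the last two is exactly one in every dimension, which is the BD ghost.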

The only way this counting can be wrong is if the constraints for the shift and the lapse cannot be inverted for the shift and the lapse themselves, and thus at least one of the equations of motion from the shift or the lapse imposes a constraint on the three-dimensional metric γij. This loophole was first presented in [138] and an example was provided in [137]. It was then used in [144] to explain how the ‘no-go’ on the presence of a ghost in massive gravity could be circumvented. Finally, this argument was then carried through fully non-linearly in [295] (see also [342] for the analysis in 1 + 1 dimensions as presented in [144]).

Eliminating the BD ghost

Linear Fierz-Pauli massive gravity

Fierz-Pauli massive gravity is special in that at the linear level (quadratic in the Hamiltonian), the lapse remains linear, so it still acts as a Lagrange multiplier generating a primary second-class constraint. We define the metric perturbation as \({h_{\mu \nu}} = {M_{{\rm{Pl}}}}({g_{\mu \nu}} - {\eta _{\mu \nu}})\), where for simplicity and definiteness we take Minkowski as the reference metric fμν = ημν, although most of what follows can easily be generalized to an arbitrary reference metric fμν. Expanding the lapse as N = 1 + δN, we have \({h_{00}} = \delta N + {\gamma _{ij}}{N^i}{N^j}\) and \({h_{0i}} = {\gamma _{ij}}{N^j}\). In the ADM decomposition, the Fierz-Pauli mass term is then (see Eq. (2.45))

$$\begin{array}{*{20}c} {{{\mathcal U}^{(2)}} = - {m^{- 2}}{{\mathcal L}_{{\rm{FP}}\,{\rm{mass}}}} = {1 \over 8}(h_{\mu \nu}^2 - {h^2})\quad \;\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {= {1 \over 8}(h_{ij}^2 - {{(h_i^i)}^2} - 2(N_i^2 - \delta Nh_i^i))\,,}\\ \end{array}$$
(7.9)

and is linear in the lapse. This is sufficient to deduce that it will keep imposing a constraint on the three-dimensional phase space variables {γij, pij} and remove at least half of the unwanted BD ghost. The shift, on the other hand, is non-linear already in the Fierz-Pauli theory, so its equations of motion impose a relation for the shift itself rather than a constraint for the three-dimensional metric. As a result the Fierz-Pauli theory (at that order) propagates three more degrees of freedom than GR, giving the usual five degrees of freedom of a massive spin-2 field. Non-linearly however the Fierz-Pauli mass term involves a non-linear term in the lapse in such a way that the constraint associated with it disappears and Fierz-Pauli massive gravity has a ghost at the non-linear level, as pointed out in [75]. This is in complete agreement with the discussion in Section 2.5, and is a complementary way to see the issue.

In Ref. [111], the most general potential was considered up to quartic order in the hμν, and it was shown that there is no choice of such potential (apart from a pure cosmological constant) which would prevent the lapse from entering non-linearly. While this result is definitely correct, it does not however imply the absence of a constraint generated by the set of shift and lapse Nμ = {N, Ni}. Indeed there is no reason to believe that the lapse should necessarily be the quantity that generates the constraint necessary to remove the BD ghost. Rather it can be any combination of the lapse and the shift.

Example on how to evade the BD ghost non-linearly

As an instructive example presented in [137], consider the following Hamiltonian,

$${\mathcal H} = N{\tilde{\mathcal C}_0}(\gamma ,p) + {N^i}{\tilde{\mathcal C}_i}(\gamma ,p) + {m^2}{\mathcal U},$$
(7.10)

with the following example for the potential

$${\mathcal U} = V(\gamma ,p){{{\gamma _{ij}}{N^i}{N^j}} \over {2N}}\,.$$
(7.11)

In this example neither the lapse nor the shift enters linearly, and one might worry about the loss of the constraint needed to project out the BD ghost. However, upon solving for the shift and substituting back into the Hamiltonian (this is possible since the lapse is not dynamical), we get

$${\mathcal H} = N\left({{{\tilde{\mathcal C}}_0}(\gamma ,p) - {{{\gamma ^{ij}}{{\tilde{\mathcal C}}_i}{{\tilde{\mathcal C}}_j}} \over {2{m^2}V(\gamma ,p)}}} \right)\,,$$
(7.12)

and the lapse now appears as a Lagrange multiplier generating a constraint, even though it was not linear in (7.10). This could have been seen more easily, without the need to explicitly integrate out the shift, by computing the Hessian

$${L_{\mu \nu}} = {{{\partial ^2}{\mathcal H}} \over {\partial {N^\mu}\partial {N^\nu}}} = {m^2}{{{\partial ^2}{\mathcal U}} \over {\partial {N^\mu}\partial {N^\nu}}}\,.$$
(7.13)

In the example (7.10), one has

$${L_{\mu \nu}} = {{{m^2}V(\gamma ,p)} \over {{N^3}}}\left({\begin{array}{*{20}c} {N_i^2} & {- N\,{N_i}} \\ {- N\,{N_j}} & {{N^2}\,{\gamma _{ij}}} \\ \end{array}} \right)\qquad \Rightarrow \qquad \det \;({L_{\mu \nu}}) = 0\,.$$
(7.14)

The Hessian cannot be inverted, which means that the equations of motion cannot be solved for all the shift and the lapse. Instead, one of these ought to be solved for the three-dimensional phase space variables which corresponds to the primary second-class constraint. Note that this constraint is not associated with a symmetry in this case and while the Hamiltonian is then pure constraint in this toy example, it will not be in general.
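The vanishing determinant in (7.14) can be traced to the fact that the potential (7.11) is homogeneous of degree one in \({N^\mu} = \{N,{N^i}\}\), so that \({N^\mu}\) itself is a null vector of the Hessian. The sketch below checks both statements with exact arithmetic. It assumes a flat spatial metric γij = δij and sets m² = V = 1 for simplicity; these choices, the sample values and the helper names are illustrative assumptions.

```python
from fractions import Fraction


def hessian(N, shift, V=1, m2=1):
    """4x4 matrix L_{mu nu} of (7.14), assuming gamma_ij = delta_ij."""
    pref = Fraction(m2 * V) / N**3
    shift_sq = sum(s * s for s in shift)
    rows = [[pref * shift_sq] + [-pref * N * s for s in shift]]
    for i, s_i in enumerate(shift):
        spatial = [pref * N**2 * (1 if i == j else 0) for j in range(3)]
        rows.append([-pref * N * s_i] + spatial)
    return rows


def det(M):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total


# Arbitrary sample values for the lapse and the shift.
N = Fraction(3, 2)
shift = [Fraction(1), Fraction(-2), Fraction(1, 3)]

L_mat = hessian(N, shift)
determinant = det(L_mat)

# Candidate null vector N^mu = (N, N^i); with gamma_ij = delta_ij, N_i = N^i.
null_vec = [N] + shift
image = [sum(a * b for a, b in zip(row, null_vec)) for row in L_mat]
```

The determinant vanishes and the Hessian annihilates \({N^\mu}\), while the spatial block on its own remains invertible, so only one combination of the lapse and shift equations fails to determine the lapse and shift.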

Finally, one could also have deduced the existence of a constraint by performing the linear change of variable

$${N_i} \rightarrow {n_i} = {{{N_i}} \over N}\,,$$
(7.15)

in terms of which the Hamiltonian is then explicitly linear in the lapse,

$${\mathcal H} = N\left({{{\tilde{\mathcal C}}_0}(\gamma ,p) + {n^i}{{\tilde{\mathcal C}}_i}(\gamma ,p) + {m^2}V(\gamma ,p){{{\gamma _{ij}}{n^i}{n^j}} \over 2}} \right)\,,$$
(7.16)

and generates a constraint that can be read for {ni, γij, pij}.
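The step from (7.16) back to (7.12) can be made explicit. As a short worked check (assuming \(V \ne 0\) and writing \({\gamma ^{ij}}\) for the inverse of \({\gamma _{ij}}\)), the equation of motion for \({n^i}\) factors out the lapse entirely:

$${{\partial {\mathcal H}} \over {\partial {n^i}}} = N\left({{{\tilde{\mathcal C}}_i} + {m^2}V{\gamma _{ij}}{n^j}} \right) = 0\qquad \Rightarrow \qquad {n^i} = - {{{\gamma ^{ij}}{{\tilde{\mathcal C}}_j}} \over {{m^2}V}}\,.$$

Substituting this solution into (7.16) gives

$${\mathcal H} = N\left({{{\tilde{\mathcal C}}_0} - {{{\gamma ^{ij}}{{\tilde{\mathcal C}}_i}{{\tilde{\mathcal C}}_j}} \over {{m^2}V}} + {{{\gamma ^{ij}}{{\tilde{\mathcal C}}_i}{{\tilde{\mathcal C}}_j}} \over {2{m^2}V}}} \right) = N\left({{{\tilde{\mathcal C}}_0} - {{{\gamma ^{ij}}{{\tilde{\mathcal C}}_i}{{\tilde{\mathcal C}}_j}} \over {2{m^2}V}}} \right)\,,$$

which reproduces (7.12). The solution for \({n^i}\) depends only on the phase space variables, so integrating it out cannot reintroduce a non-linear dependence on N.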

Condition to evade the ghost

To summarize, the condition to eliminate (at least half of) the BD ghost is that the determinant of the Hessian Lμν (7.13) vanishes, as explained in [144]. This was shown to be the case in the ghost-free theory of massive gravity (6.3) [(6.1)] exactly in some cases and up to quartic order, and then fully non-linearly in [295]. We summarize the derivation in the general case in what follows.

Ultimately, this means that in massive gravity we should be able to find a new shift ni related to the original one as follows Ni = f0(γ, n) + Nf1(γ, n), such that the Hamiltonian takes the following factorizable form

$${\mathcal H} = ({{\mathcal A}_1}(\gamma ,p) + N{{\mathcal C}_1}(\gamma ,p)){\mathcal F}(\gamma ,p,n) + ({{\mathcal A}_2}(\gamma ,p) + N{{\mathcal C}_2}(\gamma ,p))\,.$$
(7.17)

In this form, the equation of motion for the shift is manifestly independent of the lapse, and integrating over the shift ni manifestly keeps the Hamiltonian linear in the lapse, leading to the constraint \({{\mathcal C}_1}(\gamma, p){\mathcal F}(\gamma, p,{n^i}(\gamma)) + {{\mathcal C}_2}(\gamma, p) = 0\). However, such a field redefinition has not (yet) been found. Instead, the new shift ni found below does the next best thing (which is entirely sufficient): a. it keeps the Hamiltonian linear in the lapse, and b. it keeps its own equation of motion independent of the lapse, which is sufficient to infer the presence of a primary constraint.

Primary constraint

We now proceed by deriving the primary constraint present in ghost-free (dRGT) massive gravity. The proof works equally well for any reference metric at no extra cost, and so we consider a general reference metric in its own ADM decomposition, while keeping the dynamical metric in its original ADM form (since we work in unitary gauge, we may not simplify the metric further),

$${g_{\mu \nu}}{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - {N^2}\;{\rm{d}}{t^2} + {\gamma _{ij}}({\rm{d}}{x^i} + {N^i}{\rm{d}}t)\;({\rm{d}}{x^j} + {N^j}{\rm{d}}t)$$
(7.18)
$${f_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - {\bar {\mathcal N}^2}\;{\rm{d}}{t^2} + {\bar f_{ij}}({\rm{d}}{x^i} + {\bar {\mathcal N}^i}\;{\rm{d}}t)\;({\rm{d}}{x^j} + {\bar {\mathcal N}^j}\;{\rm{d}}t)\,,$$
(7.19)

and denote again by pij the conjugate momentum associated with γij. \({\bar f_{ij}}\) is not dynamical in massive gravity so there is no conjugate momentum associated with it. The bars on the reference metric are there to denote that these quantities are parameters of the theory and not dynamical variables, although the proof for a dynamical reference metric and multi-gravity works equally well; this is performed in Section 7.4.

Proceeding similarly to the previous example, we perform a change of variables similar to (7.15) (only more complicated, but which remains linear in the lapse when expressed in terms of ni) [295, 296]

$${N^i} \rightarrow {n^i}\quad {\rm{defined\ as}}\quad {N^i} - {\bar {\mathcal N} ^i} = (\bar {\mathcal N} \delta _j^i + N\;D_j^i)\;{n^j}\,,$$
(7.20)

where the matrix \(D_j^i\) satisfies the following relation

$$D_k^iD_j^k = ({P^{- 1}})_k^i{\gamma ^{k\ell}}{\bar f_{\ell j}}\,,$$
(7.21)

with

$$P_j^i = \delta _j^i + ({n^i}{\bar f_{j\ell}}{n^\ell} - {n^k}{\bar f_{k\ell}}{n^\ell}\delta _j^i)\,.$$
(7.22)

In what follows we use the definition

$$\tilde D_{\;j}^i = \kappa D_{\;j}^i\,,$$
(7.23)

with

$$\kappa = \sqrt {1 - {n^i}{n^j}{{\bar f}_{ij}}} \,.$$
(7.24)

The field redefinition naturally involves a square root through the expression of the matrix D in (7.21), which should come as no surprise from the square root structure of the potential term. For the potential to be writable in the metric language, the square root in the definition of the tensor \({\mathcal K}_{\,\,\,\nu}^\mu\) should exist, which in turn implies that the square root in the definition of \(D_j^i\) in (7.21) must also exist. While complicated, the important point to notice is that this field redefinition remains linear in the lapse (and so does not spoil the standard constraints of GR).
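As a simple consistency check of the definitions (7.20)-(7.24), offered here as a worked special case rather than as part of the original derivation, consider the configuration \({n^i} = 0\). Then

$$P_j^i = \delta _j^i\,,\qquad \kappa = 1\,,\qquad \tilde D_{\;j}^i = D_{\;j}^i\,,\qquad D_k^iD_j^k = {\gamma ^{ik}}{\bar f_{kj}}\,,\qquad {N^i} = {\bar {\mathcal N}^i}\,,$$

so that D reduces to the square root of \({\gamma ^{- 1}}\bar f\), the spatial counterpart of the square root entering \({\mathcal K}\). For general \({n^i}\), contracting (7.22) with \({n^j}\) gives

$$P_j^i{n^j} = {n^i} + {n^i}\,({n^j}{\bar f_{j\ell}}{n^\ell}) - ({n^k}{\bar f_{k\ell}}{n^\ell})\,{n^i} = {n^i}\,,$$

so \({n^i}\) is always an eigenvector of P with unit eigenvalue.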

The Hamiltonian for massive gravity is then

$$\begin{array}{*{20}c} {{{\mathcal H}_{{\rm{mGR}}}} = {{\mathcal H}_{{\rm{GR}}}} + {m^2}{\mathcal U}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= N\,{{\mathcal R}_0}(\gamma ,p) + \left({{{\bar {\mathcal N}}^i} + \left({\bar {\mathcal N}\delta _j^i + N\;D_j^i} \right){n^j}} \right)\,{{\mathcal R}_i}(\gamma ,p)} \\ {+ {m^2}\,{\mathcal U}(\gamma ,{N^i}(n),N)\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(7.25)

where \({\mathcal U}\) includes the new contributions from the mass term. \({\mathcal U}(\gamma, {N^i},N)\) is neither linear in the lapse N, nor in the shift Ni. There is actually no choice of potential \({\mathcal U}\) which would keep it linear in the lapse beyond cubic order [111]. However, as we shall see, when expressed in terms of the redefined shift ni, the non-linearities in the shift absorb all the original non-linearities in the lapse, and \({\mathcal U}(\gamma, {n^i},N)\) becomes linear in N. In itself this is not sufficient to prove the presence of a constraint, as the integration over the shift ni could in turn lead to higher-order dependence on the lapse in the Hamiltonian. Explicitly, the potential takes the form

$${\mathcal U}(\gamma ,{N^i}({n^j}),N) = N\,{{\mathcal U}_0}(\gamma ,{n^j}) + \bar {\mathcal N} \,{{\mathcal U}_1}(\gamma ,{n^j})\,,$$
(7.26)

with

$${{\mathcal U}_0} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt \gamma \sum\limits_{n = 0}^3 {{{(4 - n){\beta _n}} \over {n!}}} {{\mathcal L}_n}[\tilde D_{\,j}^i]$$
(7.27)
$$\begin{array}{*{20}c} {{{\mathcal U}_1} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt \gamma \left({3!{\beta _1}\kappa + 2{\beta _2}D_{\,j}^iP_{\,i}^j} \right.\quad \quad \quad \quad \quad \quad \quad} \\ {\left. {+ {\beta _3}\kappa \left[ {2D_{\;k}^{\left[ k \right.}{n^{\left. i \right]}}{{\bar f}_{ij}}D_{\,\ell}^j{n^\ell} + D_{\;i}^{\left[ i \right.}D_{\;j}^{\left. j \right]}} \right]} \right) - {{M_{{\rm{Pl}}}^2} \over 4}{\beta _4}\sqrt {\bar f} \,,} \\ \end{array}$$
(7.28)

where the β’s are expressed in terms of the α’s as in (6.24). For the purpose of this analysis it is easier to work with that notation.

The structure of the potential is so that the equations of motion with respect to the shift are independent of the lapse and impose the following relations in terms of \({\bar n_i} = {n^j}\,{\bar f_{ij}}\),

$${m^2}\sqrt \gamma \left[ {3!{\beta _1}{{\bar n}_i} + 4{\beta _2}\tilde D_{\,\left[ j \right.}^j{{\bar n}_{\left. i \right]}} + {\beta _3}\tilde D_{\;j}^{\left[ j \right.}\left({\tilde D_{\;k}^{\left. k \right]}{{\bar n}_i} - 2\tilde D_{\;i}^{\left. k \right]}{{\bar n}_k}} \right)} \right] = \kappa {{\mathcal R}_i}(\gamma ,p)\,,$$
(7.29)

which entirely fixes the three shifts ni in terms of γij and pij as well as the reference metric \({\bar f_{ij}}\) (note that \({\bar {\mathcal N}^i}\) entirely disappears from these equations of motion).

The two requirements defined previously are thus satisfied: a. The Hamiltonian is linear in the lapse and b. the equations of motion with respect to the shift ni are independent of the lapse, which is sufficient to infer the presence of a primary constraint. This primary constraint is derived by varying with respect to the lapse and evaluating the shift on the constraint surface (7.29),

$${{\mathcal C}_0} = {{\mathcal R}_0}(\gamma ,p) + D_{\,j}^i{n^j}{{\mathcal R}_i}(\gamma ,p) + {m^2}{{\mathcal U}_0}(\gamma ,n(\gamma ,p)) \approx 0\,,$$
(7.30)

where the symbol “≈” means on the constraint surface. The existence of this primary constraint is sufficient to infer the absence of BD ghost. If we were dealing with a generic system (which could allow for some spontaneous parity violation), it could still be in principle that there are no secondary constraints associated with \({{\mathcal C}_0} = 0\) and the theory propagates 5.5 physical degrees of freedom (11 dofs in phase space). However, physically this never happens, since the theory of gravity we are dealing with preserves parity and is Lorentz invariant. Indeed, to have 5.5 physical degrees of freedom, one of the variables should have an equation of motion which is linear in time derivatives. Lorentz invariance then implies that it must also be linear in space derivatives, which would then violate parity. However, this is only an intuitive argument and the real proof is presented below. Indeed, ghost-free massive gravity admits a secondary constraint which was explicitly found in [294].
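The split underlying (7.30) can be made explicit by collecting terms. As a short worked step, inserting (7.26) into (7.25) and grouping by powers of the lapse gives

$${{\mathcal H}_{{\rm{mGR}}}} = N\left[ {{{\mathcal R}_0} + D_{\,j}^i{n^j}{{\mathcal R}_i} + {m^2}{{\mathcal U}_0}} \right] + \left({{{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i}} \right){{\mathcal R}_i} + {m^2}\bar {\mathcal N} \,{{\mathcal U}_1}\,.$$

Once the shift is evaluated on the solution of (7.29), the bracket multiplying N is the quantity \({{\mathcal C}_0}\) of (7.30), and the lapse-independent remainder is the density \({{\mathcal H}_1}\) given in (7.32).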

Secondary constraint

Let us imagine we start with initial conditions that satisfy the constraints of the system, in particular the modified Hamiltonian constraint (7.30). As the system evolves the constraint (7.30) needs to remain satisfied. This means that the modified Hamiltonian constraint ought to be independent of time, or in other words it should commute with the Hamiltonian. This requirement generates a secondary constraint,

$${{\mathcal C}_2} \equiv {{\rm{d}} \over {{\rm{d}}t}}{{\mathcal C}_0} = \{{{\mathcal C}_0},{H_{{\rm{mGR}}}}\} \approx \{{{\mathcal C}_0},{H_1}\} \approx 0\,,$$
(7.31)

with \({H_{{\rm{mGR}}}} = \int {{{\rm{d}}^3}x\,{{\mathcal H}_{{\rm{mGR}}}}}\), \({H_1} = \int {{{\rm{d}}^3}x\,{{\mathcal H}_1}}\) and

$${{\mathcal H}_1} = \left({{{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i}(\gamma ,p)} \right){{\mathcal R}_i} + {m^2}\bar {\mathcal N} {{\mathcal U}_1}(\gamma ,n(\gamma ,p))\,.$$
(7.32)

Finding the precise form of this secondary constraint requires a very careful analysis of the Poisson bracket algebra of this system. This formidable task led to some confusion at first (see Ref. [345]), but the constraint was then successfully derived in [294] (see also [258, 259] and [343]). Deriving the whole set of Poisson brackets is beyond the scope of this review and we simply give the expression for the secondary constraint,

$$\begin{array}{*{20}c} {{{\mathcal C}_2} \equiv {{\mathcal C}_0}{\nabla _i}({{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i}) + {m^2}\bar {\mathcal N} ({\gamma _{ij}}p_\ell ^\ell - 2{p_{ij}}){\mathcal U}_1^{ij}\quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {+ 2{m^2}\bar {\mathcal N} \sqrt \gamma {\nabla _i}{\mathcal U}_{1\,j}^iD_{\;k}^j{n^k} + ({{\mathcal R}_j}D_{\;k}^i{n^k} - \sqrt \gamma \bar {\mathcal B} _j^{\;i}){\nabla _i}({{\bar {\mathcal N}}^j} + \bar {\mathcal N} {n^j})} \\ {+ \left({{\nabla _i}{{\mathcal R}_0} + {\nabla _i}{{\mathcal R}_j}D_{\;k}^j{n^k}} \right)({{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i})\;,\;\quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(7.33)

where unless specified otherwise, all indices are raised and lowered with respect to the dynamical metric γij, and the covariant derivatives are also taken with respect to the same metric. We also define

$${\mathcal U}_1^{ij} = {1 \over {\sqrt \gamma}}{{\partial {{\mathcal U}_1}} \over {\partial {\gamma _{ij}}}}$$
(7.34)
$$\begin{array}{*{20}c} {{{\bar {\mathcal B}}_{ij}} = - {{M_{{\rm{Pl}}}^2} \over 4}\left[ {({{\tilde D}^{- 1}})_{\;j}^k{{\bar f}_{ik}}\left({3{\beta _1}{{\mathcal L}_0}[\tilde D] + 2{\beta _2}{{\mathcal L}_1}[\tilde D] + {{{\beta _3}} \over 2}{{\mathcal L}_2}[\tilde D]} \right)} \right.} \\ {\left. {- {\beta _2}{{\bar f}_{ij}} + 2{\beta _3}{{\bar f}_{i\left[ k \right.}}\tilde D_{\left. {\,j} \right]}^k} \right]\;\,.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(7.35)

The important point to notice is that the secondary constraint (7.33) only depends on the phase space variables γij, pij and not on the lapse N. Thus it constrains the phase space variables rather than the lapse and provides a genuine secondary constraint in addition to the primary one (7.30) (indeed one can check that \({{\mathcal C}_2}{\vert _{{{\mathcal C}_0} = 0}} \ne 0\)).

Finally, we should also check that this secondary constraint is also maintained in time. This was performed in [294], by inspecting the condition

$${{\rm{d}} \over {{\rm{d}}t}}{{\mathcal C}_2} = \{{{\mathcal C}_2},{H_{{\rm{mGR}}}}\} \approx 0\,.$$
(7.36)

This condition should be satisfied without further constraining the phase space variables, which would otherwise imply that fewer than five degrees of freedom are propagating. Since five fully fledged dofs are propagating at the linearized level, the same must happen non-linearly (see Footnote 17). Rather than a constraint on {γij, pij}, (7.36) must be solved for the lapse. This is only possible if both of the following conditions are satisfied

$$\{{{\mathcal C}_2}(x),{{\mathcal H}_1}(y)\}\rlap{/}{\approx}0\quad {\rm{and}}\quad \{{{\mathcal C}_2}(x),{{\mathcal C}_0}(y)\}\rlap{/}{\approx}0\,.$$
(7.37)

As shown in [294], since these conditions do not vanish at the linear level (the constraints reduce to the Fierz-Pauli ones in that case), we can deduce that they cannot vanish non-linearly and thus the condition (7.36) fixes the expression for the lapse rather than constraining further the phase space dofs. Thus there is no tertiary constraint on the phase space.
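In the notation of (7.7), and treating the primary constraint (7.30) and the secondary constraint (7.33) as a pair of second-class constraints that each remove one phase space variable, the counting for ghost-free massive gravity reads

$$(2 \times 6) - 1\;{\rm{primary\;constraint}} - 1\;{\rm{secondary\;constraint}} = 10 = 2 \times 5\,,$$

that is, five degrees of freedom in field space. With the primary constraint alone one would instead be left with 11 phase space variables, which is the 5.5 degrees of freedom scenario discussed earlier.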

To conclude, we have shown in this section that ghost-free (or dRGT) massive gravity is indeed free from the BD ghost and the theory propagates five physical dofs about generic backgrounds. We now present the proof in other languages, but stress that the proof developed in this section is sufficient to infer the absence of BD ghost.

Secondary constraints in bi- and multi-gravity

In bi- or multi-gravity where all the metrics are dynamical the Hamiltonian is pure constraint (every term is linear in one of the lapses, as can be seen explicitly already from (7.25) and (7.26)).

In this case, the evolution equations of the primary constraints can always be solved for their respective Lagrange multipliers (lapses), which can always be set to zero. Setting the lapses to zero would be unphysical in a theory of gravity and instead one should take a ‘bifurcation’ of the Dirac constraint analysis as explained in [48]. Rather than solving for the Lagrange multipliers, we can choose to use the evolution equations of some of the primary constraints to provide additional secondary constraints.

Choosing this bifurcation leads to statements which are then continuous with the massive gravity case and one recovers the correct number of degrees of freedom. See Ref. [48] for an enlightening discussion.

Absence of ghost in the Stückelberg language

Physical degrees of freedom

Another way to see the absence of ghost in massive gravity is to work directly in the Stückelberg language for massive spin-2 fields introduced in Section 2.4. If the four scalar fields \(\phi^a\) were dynamical, the theory would propagate six degrees of freedom (the two usual helicity-2 modes, whose dynamics is encoded in the standard Einstein-Hilbert term, and the four Stückelberg fields). To remove the sixth mode, corresponding to the BD ghost, one needs to check that not all four Stückelberg fields are dynamical but only three of them. See also [14] for a theory of two Stückelberg fields.

Stated more precisely, in the Stückelberg language beyond the DL, if \({\mathcal E}_a\) is the equation of motion with respect to the field \(\phi^a\), the correct requirement for the absence of ghost is that the Hessian defined as

$${{\mathcal A}_{ab}} = - {{\delta {{\mathcal E}_a}} \over {\delta {{\ddot \phi}^b}}} = {{{\delta ^2}{\mathcal L}} \over {\delta {{\dot \phi}^a}\delta {{\dot \phi}^b}}}$$
(7.38)

be not invertible, so that the dynamics of not all four Stückelberg fields may be derived from it. This is the case if

$$\det \;({{\mathcal A}_{ab}}) = 0\,,$$
(7.39)

as first explained in Ref. [145]. This condition was successfully shown to arise in a number of situations for the ghost-free theory of massive gravity with potential given in (6.3) or equivalently in (6.1) in Ref. [145] and then more generically in Ref. [297].Footnote 18 For illustrative purposes, we start by showing how this constraint arises in a simple two-dimensional realization of ghost-free massive gravity before deriving the more general proof.

Two-dimensional case

Consider massive gravity on a two-dimensional space-time, \({\rm d}s^2 = -N^2\,{\rm d}t^2 + \gamma\,({\rm d}x + N_x\,{\rm d}t)^2\), with the two Stückelberg fields \(\phi^{0,1}\) [145]. In this case the graviton potential can only have one independent non-trivial term (excluding the tadpole),

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}N\sqrt \gamma ({{\mathcal L}_2}({\mathcal K}) + 1)\,.$$
(7.40)

In light-cone coordinates,

$${\phi ^ \pm} = {\phi ^0} \pm {\phi ^1}$$
(7.41)
$${{\mathcal D}_ \pm} = {1 \over {\sqrt \gamma}}{\partial _x} \pm {1 \over N}\left[ {{\partial _t} - {N_x}{\partial _x}} \right]\,,$$
(7.42)

the potential is thus

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}N\sqrt \gamma \sqrt {({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} \,.$$
(7.43)

The Hessian of this Lagrangian with respect to the two Stückelberg fields \(\phi^\pm\) is then

$$\begin{array}{*{20}c} {{{\mathcal A}_{ab}} = {{{\delta ^2}{{\mathcal L}_{{\rm{mGR}}}}} \over {\delta {{\dot \phi}^a}\delta {{\dot \phi}^b}}} = - {m^2}{{{\delta ^2}{\mathcal U}} \over {\delta {{\dot \phi}^a}\delta {{\dot \phi}^b}}}\;\quad \quad \quad \quad \quad \quad \quad} \\ {\propto \left({\begin{array}{*{20}c} {{{({{\mathcal D}_ -}{\phi ^ -})}^2}} & {- ({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} \\ {- ({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} & {{{({{\mathcal D}_ +}{\phi ^ +})}^2}} \\ \end{array}} \right)\,\;,} \\ \end{array}$$
(7.44)

and is clearly non-invertible, which shows that not both Stückelberg fields are dynamical. In this special case, the Hamiltonian is actually pure constraint as shown in [145], and there are no propagating degrees of freedom. This is as expected for a massive spin-two field in two dimensions.
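As a quick check of this statement, write \(X \equiv {\mathcal D}_-\phi^-\) and \(Y \equiv {\mathcal D}_+\phi^+\) (a shorthand introduced here only for illustration). The determinant of the matrix in (7.44) then vanishes identically,

$$\det \left({\begin{array}{*{20}c} {X^2} & {- XY} \\ {- XY} & {Y^2} \\ \end{array}} \right) = X^2 Y^2 - (XY)^2 = 0\,,$$

for any field configuration. The Hessian therefore has rank at most one, so the two conjugate momenta cannot be inverted for both velocities \(\dot \phi^\pm\), which signals a primary constraint.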

As shown in Refs. [144, 145] the square root can be traded for an auxiliary non-dynamical variable \(\lambda^\mu_{\;\nu}\). In this two-dimensional example, the mass term (7.43) can be rewritten with the help of an auxiliary non-dynamical variable λ as

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}N\sqrt \gamma \left({\lambda + {1 \over {2\lambda}}({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} \right)\,.$$
(7.45)

A similar trick will be used in the full proof.

Full proof

The full proof in the minimal model (corresponding to \(\alpha_2 = 1\), \(\alpha_3 = -2/3\) and \(\alpha_4 = 1/6\) in (6.3) or \(\beta_2 = \beta_3 = 0\) in the alternative formulation (6.23)) was derived in Ref. [297]. We briefly review the essence of the argument, although the full technical derivation is beyond the scope of this review, and we refer the reader to Refs. [297] and [15] for a fully-fledged derivation.

Using a set of auxiliary variables \(\lambda _b^a\) (with \(\lambda_{ab} = \lambda_{ba}\), so these auxiliary variables contain ten elements in four dimensions) as explained previously, we can rewrite the potential term in the minimal model as [79, 342],

$${\mathcal U} = {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} ([\lambda ] + [{\lambda ^{- 1}}\cdot Y])\,,$$
(7.46)

where the matrix Y has been defined in (2.77) and is equivalent to X used previously. Upon integration over the auxiliary variable λ we recover the square-root structure as mentioned in Ref. [144]. We now perform an ADM decomposition as in (7.1) which implies the ADM decomposition on the matrix Y,

$$Y_{\,b}^a = {g^{\mu \nu}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^c}{f_{cb}} = - {{\mathcal{D}}_t}{\phi ^a}{{\mathcal{D}}_t}{\phi ^c}{f_{cb}} + V_{\,b}^a\,,$$
(7.47)

with

$${{\mathcal D}_t} = {1 \over N}({\partial _t} - {N^i}{\partial _i})$$
(7.48)
$$V_{\;b}^a = {\gamma ^{ij}}{\partial _i}{\phi ^a}{\partial _j}{\phi ^c}{f_{cb}}\,.$$
(7.49)

Since the matrix V uses a projection along the 3 spatial directions it is genuinely a rank-3 matrix rather than rank 4. This implies that det V = 0. Notice that we consider an arbitrary reference metric f, as the proof does not depend on it and can be done for any f at no extra cost [297]. The canonical momentum conjugate to \(\phi^a\) is given by

$${p_a} = {1 \over 2}\tilde \alpha {({\lambda ^{- 1}})_{ab}}{{\mathcal{D}}_0}{\phi ^b}\,,$$
(7.50)

with

$$\tilde \alpha = 2M_{{\rm{Pl}}}^2{m^2}\sqrt \gamma \,.$$
(7.51)

In terms of these conjugate momenta, the equation of motion with respect to \(\lambda_{ab}\) then imposes the relation (after multiplying with the matrixFootnote 19 α λ on both sides),

$${\lambda ^{ac}}{C_{ab}}{\lambda ^{bd}} = {V^{ab}}\,,$$
(7.52)

with the matrix Cab defined as

$${C_{ab}} = {\tilde \alpha ^2}{f_{ab}} + {p_a}{p_b}\,.$$
(7.53)

Since det V = 0, as mentioned previously, the equation of motion (7.52) is only consistent if we also have det C = 0. This is the first constraint found in [297] which is already sufficient to remove (half) the BD ghost,

$${{\mathcal C}_1} \equiv {{\det C} \over {\det f}} = {\tilde \alpha ^2} + {({f^{- 1}})^{ab}}{p_a}{p_b} = 0\,,$$
(7.54)

which is the primary constraint on a subset of physical phase space variables \(\{\gamma_{ij}, p_a\}\) (by construction det f ≠ 0). The secondary constraint is then derived by commuting \({{\mathcal C}_1}\) with the Hamiltonian. Following the derivation of [297], we get on the constraint surface

$${{\mathcal C}_2} = {1 \over {{{\tilde \alpha}^2}N}}{{{\rm{d}}{{\mathcal C}_1}} \over {{\rm{d}}t}} = {1 \over {{{\tilde \alpha}^2}N}}\int {\rm{d}} y\,\{{{\mathcal C}_1}(y),\;H(x)\}$$
(7.55)
$$\begin{array}{*{20}c} {\propto - {\gamma ^{- 1/2}}{\gamma _{ij}}{\pi ^{ij}} - 2{{\tilde \alpha} \over \gamma}{{({\lambda ^{- 1}})}_{ab}}{\partial _i}{\phi ^a}{\gamma ^{ij}}\nabla _j^{(f)}{p^b}} \\ {\equiv 0\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(7.56)

where \(\pi^{ij}\) is the momentum conjugate associated with \(\gamma_{ij}\), and \(\nabla^{(f)}\) is the covariant derivative associated with f.
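The equivalence between det C = 0 and the condition (7.54) can be made explicit with the matrix determinant lemma, \(\det (M + u{v^T}) = \det M\,(1 + {v^T}{M^{- 1}}u)\). A short sketch, assuming \(\tilde \alpha \ne 0\) and working in four dimensions so that \(f_{ab}\) is a 4 × 4 matrix:

$$\det C = \det \left({{{\tilde \alpha}^2}{f_{ab}} + {p_a}{p_b}} \right) = {\tilde \alpha ^8}\det f\,\left({1 + {{\tilde \alpha}^{- 2}}{{({f^{- 1}})}^{ab}}{p_a}{p_b}} \right) = {\tilde \alpha ^6}\det f\,\left({{{\tilde \alpha}^2} + {{({f^{- 1}})}^{ab}}{p_a}{p_b}} \right)\,.$$

Since det f ≠ 0 and \(\tilde \alpha \ne 0\), the determinant of C vanishes if and only if the last bracket vanishes; the overall factor \({\tilde \alpha ^6}\) plays no role on the constraint surface.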

Stückelberg method on arbitrary backgrounds

When working about different non-Minkowski backgrounds, one can instead generalize the definition of the helicity-0 mode as was performed in [400]. The essence of the argument is to perform a rotation in field space so that the fluctuations of the Stückelberg fields about a curved background form a vector field in the new basis, and one can then employ the standard treatment for a vector field. See also [10] for another study of the Stückelberg fields in an FLRW background.

Recently, a covariant Stückelberg analysis valid about any background was performed in Ref. [369] using the BRST formalism. Interestingly, this method also allows one to derive the decoupling limit of massive gravity about any background.

In what follows, we review the approach derived in [400] which provides yet another independent argument for the absence of ghost in all generality. The proofs presented in Sections 7.1 and 7.2 work to all orders about a trivial background while in [400], the proof is performed about a generic (curved) background, and the analysis can thus stop at quadratic order in the fluctuations. Both types of analysis are equivalent so long as the fields are analytic, which is the case if one wishes to remain within the regime of validity of the theory.

Consider a generic background metric, which in unitary gauge (i.e., in the coordinate system {x} where the Stückelberg background fields are given by \({\phi ^a}(x) = {x^\mu}\delta _\mu ^a\)) is given by \(g_{\mu \nu}^{{\rm{bg}}} = e_\mu ^a(x)e_\nu ^b(x){\eta _{ab}}\), and the background Stückelberg fields are given by \(\phi _{{\rm{bg}}}^a(x) = {x^a} - A_{{\rm{bg}}}^a(x)\).

We now add fluctuations about that background,

$${\phi ^a} = \phi _{{\rm{bg}}}^a - {a^a} = {x^a} - {A^a}$$
(7.57)
$${g_{\mu \nu}} = g_{\mu \nu}^{{\rm{bg}}} + {h_{\mu \nu}},$$
(7.58)

with \({A^a} = A_{{\rm{bg}}}^a + {a^a}\).

Flat background metric

First, note that if we consider a flat background metric to start with, then at zeroth order in h, the ghost-free potential is of the form [400], (this can also be seen from [238, 419])

$${{\mathcal L}_A} = - {1 \over 4}FF(1 +\partial A + \cdots)\,,$$
(7.59)

with \(F_{ab} = \partial_a A_b - \partial_b A_a\). This means that for a symmetric Stückelberg background configuration, i.e., if the matrix \({\partial _\mu}\phi _{{\rm{bg}}}^a\) is symmetric, then \(F_{ab}^{{\rm{bg}}} = 0\), and at quadratic order in the fluctuation a, the action has a U(1)-symmetry. This symmetry is lost non-linearly, but is still relevant when looking at quadratic fluctuations about arbitrary backgrounds. Now using the split about the background, \({A^a} = A_{{\rm{bg}}}^a + {a^a}\), this means that up to quadratic order in the fluctuations \(a^a\), the action at zeroth order in the metric fluctuation is of the form [400]

$${\mathcal L}_a^{(2)} = {\bar B^{\mu \alpha \nu \beta}}{f_{\mu \nu}}{f_{\alpha \beta}},$$
(7.60)

with \(f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu\), and \({\bar B^{\mu \alpha \nu \beta}}\) is a set of constant coefficients which depends on \(A_{{\rm{bg}}}^a\). This quadratic action has an accidental U(1)-symmetry which is responsible for projecting out one of the four dofs naively present in the four Stückelberg fluctuations \(a^a\). Had we considered any other potential term, the U(1) symmetry would have been generically lost and all four Stückelberg fields would have been dynamical.

Non-symmetric background Stückelberg

If the background configuration is not symmetric, then at every point one needs to perform first an internal Lorentz transformation Λ(x) in the Stückelberg field space, so as to align them with the coordinate basis and recover a symmetric configuration for the background Stückelberg fields. In this new Lorentz frame, the Stückelberg fluctuation is \({\tilde a^\mu} = \Lambda^\mu_{\;\nu}(x){a^\nu}\). As a result, to quadratic order in the Stückelberg fluctuation the part of the ghost-free potential which is independent of the metric fluctuation and its curvature goes symbolically as (7.60) with f replaced by \(f \to \tilde f + (\partial \Lambda){\Lambda ^{- 1}}\tilde a\) (with \({\tilde f_{\mu \nu}} = {\partial _\mu}{\tilde a_\nu} - {\partial _\nu}{\tilde a_\mu}\)). Interestingly, the Lorentz boost \((\partial \Lambda){\Lambda ^{- 1}}\) now plays the role of a mass term for what looks like a gauge field \(\tilde a\). This mass term breaks the U(1) symmetry, but there is still no kinetic term for \({\tilde a_0}\), very much as in a Proca theory. This part of the potential is thus manifestly ghost-free (in the sense that it provides a dynamics for only three of the four Stückelberg fields, independently of the background).
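The Proca analogy invoked here can be made concrete. As an illustration only, take the standard flat-space Proca Lagrangian for a generic vector \(a_\mu\) with a constant mass \(M_a\) (a hypothetical stand-in for the position-dependent mass term above), in mostly-plus signature. Splitting space and time gives

$$- {1 \over 4}{f_{\mu \nu}}{f^{\mu \nu}} - {1 \over 2}M_a^2{a_\mu}{a^\mu} = {1 \over 2}{({\dot a_i} - {\partial _i}{a_0})^2} - {1 \over 4}{f_{ij}}{f_{ij}} + {1 \over 2}M_a^2\left({a_0^2 - {a_i}{a_i}} \right)\,.$$

No time derivative of \(a_0\) appears on the right-hand side, so \(a_0\) is an auxiliary variable and only the three spatial components \(a_i\) carry dynamics. The mass term removes the U(1) symmetry without introducing a kinetic term for \(a_0\), which is the structure the argument relies on.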

Next, we consider the mixing with metric fluctuation h while still assuming zero curvature. At linear order in h, the ghost-free potential, (6.3) goes as follows

$${\mathcal L}_{Ah}^{(2)} = {h^{\mu \nu}}\sum\limits_{n = 1}^3 {{c_n}} X_{\mu \nu}^{(n)} + hF(\partial A + \cdots)\,,$$
(7.61)

where the tensors \(X_{\mu \nu}^{(n)}\) are similar to the ones found in the decoupling limit, but now expressed in terms of the symmetric full four Stückelberg fields rather than just π, i.e., replacing \(\partial_\mu \partial_\nu \pi\) by \(\partial_\mu A_\nu + \partial_\nu A_\mu\) in the respective expressions (8.29), (8.30) and (8.31) for \(X_{\mu \nu}^{(1,2,3)}\). Starting with the symmetric configuration for the Stückelberg fields, then since we are working at the quadratic level in perturbations, one of the \(A_\mu\) in the \(X_{\mu \nu}^{(n)}\) is taken to be the fluctuation \(a_\mu\), while the others are taken to be the background field \(A_\mu ^{{\rm{bg}}}\). As a result, in the first terms in hX in (7.61), \(\partial_0 a_0\) cannot come at the same time as \(h_{00}\) or \(h_{0i}\), and we can thus integrate by parts the time derivative acting on any \(a_0\), leading to a harmless first time derivative on \(h_{ij}\), and no time evolution for \(a_0\).

As for the second type of term in (7.61), since F = 0 on the background field \(A_\mu ^{{\rm{bg}}}\), the second type of terms is forced to be proportional to \(f_{\mu\nu}\) and cannot involve any \(\partial_0 a_0\) at all. As a result \(a_0\) is not dynamical, which ensures that the theory is free from the BD ghost.

This part of the argument generalizes easily for non-symmetric background Stückelberg configurations, and the same replacement \(f \to \tilde f + (\partial \Lambda){\Lambda ^{- 1}}\tilde a\) still ensures that \({\tilde a_0}\) acquires no dynamics from (7.61).

Background curvature

Finally, to complete the argument, we consider the effect from background curvature, in which case \(g_{\mu \nu}^{{\rm{bg}}} \ne {\eta _{\mu \nu}}\) with \(g_{\mu \nu}^{{\rm{bg}}} = e_\mu ^a(x)e_\nu ^b(x){\eta _{ab}}\). The space-time curvature is another source of ‘misalignment’ between the coordinates and the Stückelberg fields. To rectify this misalignment, we could go two ways: either perform a local change of coordinate so as to align the background metric \(g_{\mu \nu}^{{\rm{bg}}}\) with the flat reference metric \(\eta_{\mu\nu}\) (i.e., going to a local inertial frame), or the other way around, i.e., express the flat reference metric in terms of the curved background metric, \({\eta _{ab}} = e_a^\mu e_b^\nu g_{\mu \nu}^{{\rm{bg}}}\), in terms of the inverse vielbein, \(e_a^\mu \equiv ({e^{- 1}})_a^\mu\). Then the building block of ghost-free massive gravity is the matrix \({\mathbb X}\), defined previously as

$${\mathbb X}_\nu ^\mu = ({g^{- 1}}\eta)_\nu ^\mu = {g^{\mu \gamma}}(e_{\,a}^\alpha {\partial _\gamma}{\phi ^a})(e_{\,b}^\beta {\partial _\nu}{\phi ^b})g_{\alpha \beta}^{{\rm{bg}}}\,.$$
(7.62)

As a result, the whole formalism derived previously is directly applicable with the only subtlety that the Stückelberg fields \(\phi^a\) should be replaced by their ‘vielbein-dependent’ counterparts, i.e., \({\partial _\mu}{A_\nu} \to g_{\mu \nu}^{{\rm{bg}}} - g_{\nu \alpha}^{{\rm{bg}}}e_{\,a}^\alpha {\partial _\mu}{\phi ^a}\). In terms of the Stückelberg field fluctuation \(a^a\), this implies the replacement \({a^a} \to {\bar a_\mu} = g_{\mu \nu}^{{\rm{bg}}}e_{\,a}^\nu {a^a}\), and symbolically, \(f \to \bar f + (\partial \Sigma){\Sigma ^{- 1}}\bar a\), with \(\Sigma = g^{{\rm{bg}}}e\). The situation is thus the same as when we were dealing with a non-symmetric Stückelberg background configuration: after integration by parts (which might involve harmless curvature contributions), the potential can be written in a way which never involves any time derivative on \({\bar a_0}\). As a result, \({\bar a_\mu}\) plays the role of an effective Proca vector field which only propagates three degrees of freedom, and this about any curved background metric. The beauty of this argument lies in the correct identification of the proper degrees of freedom when dealing with a curved background metric.

Absence of ghost in the vielbein formulation

Finally, we can also prove the absence of ghost for dRGT in the vielbein formalism, either directly at the level of the Lagrangian in some special cases as shown in [171] or in full generality in the Hamiltonian formalism, as shown in [314]. The latter proof also works in all generality for a multi-gravity theory and will thus be presented in more depth in what follows, but we first focus on a special case presented in Ref. [171].

Let us start with massive gravity in the vielbein formalism (6.1). As was the case in Part II, we work with the symmetric vielbein condition, \(e_\mu ^af_\nu ^b{\eta _{ab}} = e_\nu ^af_\mu ^b{\eta _{ab}}\). For simplicity we specialize further to the case where \(f_\mu ^a = \delta _\mu ^a\), so that the symmetric vielbein condition imposes \(e_{a\mu} = e_{\mu a}\). Under this condition, the vielbein contains as many independent components as the metric. The symmetric vielbein condition ensures that one is able to reformulate the theory in a metric language. In d spacetime dimensions, there are a priori d (d + 1)/2 independent components in the symmetric vielbein.

Varying the action (6.1) with respect to the vielbein leads to the modified Einstein equation,

$${G_a} = {t_a} = - {{{m^2}} \over 2}{\varepsilon _{abcd}}\left({4{c_0}\,{e^b} \wedge {e^c} \wedge {e^d} + 3{c_1}\,{e^b} \wedge {e^c} \wedge {f^d}} \right.$$
(7.63)
$$\left. {+ 2{c_2}\,{e^b} \wedge {f^c} \wedge {f^d} + {c_3}\,{f^b} \wedge {f^c} \wedge {f^d}} \right),$$
(7.64)

with \({G_a} = {\varepsilon _{abcd}}\,{\omega ^{bc}} \wedge {e^d}\). From the Bianchi identity, \({\mathcal D}{G_a} = {\rm{d}}{G_a} - \omega _{\;a}^b{G_b} = 0\), we infer the d constraints

$${\mathcal D}{t_a} = {\rm{d}}{t_a} - \omega _{\;a}^b{t_b} = 0\,,$$
(7.65)

leading to d (d − 1)/2 independent components in the vielbein. This is still one component too many, unless an additional constraint is found. The idea behind the proof in Ref. [171] is then to use the Bianchi identities to infer an additional constraint of the form,

$${m^a} \wedge {G_a} = {m^a} \wedge {t_a}\,,$$
(7.66)

where \(m^a\) is an appropriate one-form which depends on the specific coefficients of the theory. Such a constraint is present at the linear level for Fierz-Pauli massive gravity, and it was further shown in Ref. [171] that special choices of coefficients for the theory lead to remarkably simple analogous relations fully non-linearly. To give an example, we consider all the coefficients \(c_n\) to vanish except \(c_1 \ne 0\). In that case the Bianchi identity (7.65) implies

$${\mathcal D}{t_a} = 0\qquad \Rightarrow \qquad \omega _{\;cb}^b = 0\,,$$
(7.67)

where, similarly to (5.2), the torsionless connection is given in terms of the vielbein as

$$\omega _\mu ^{ab} = {1 \over 2}e_\mu ^c(o_{\;\;\;\;c}^{ab} - o_c^{\;ab} - o_{\;\;c}^{b\;\;a})\,,$$
(7.68)

with \({o^{ab}}_c = 2{e^{a\mu}}{e^{b\nu}}{\partial _{\left[ \mu \right.}}{e_{\left. \nu \right]}}_c\). The Bianchi identity (7.67) then implies \(e_a^{\,\,\,\,b}{\partial _{\left[ b \right.}}e_{\left. a \right]}^a = 0\), so that we obtain an extra constraint of the form (7.66) with \(m^a = e^a\). Ref. [171] derived similar constraints for other parameters of the theory.

Absence of ghosts in multi-gravity

We now turn to the proof for the absence of ghost in multi-gravity and follow the vielbein formulation of Ref. [314]. In this subsection we use the notation that uppercase Latin indices represent d-dimensional Lorentz indices, A, B, ⋯ = 0, ⋯, d − 1, while lowercase Latin indices represent the (d − 1)-dimensional Lorentz indices along the space directions after ADM decomposition, a, b, ⋯ = 1, ⋯, d − 1. Greek indices represent d-dimensional spacetime indices μ, ν, ⋯ = 0, ⋯, d − 1, while the ‘middle’ of the Latin alphabet indices i, j, ⋯ represent pure space indices i, j, ⋯ = 1, ⋯, d − 1. Finally, the capital indices I, J, K, ⋯ = 1, ⋯, N label the metrics.

Let us start with N non-interacting spin-2 fields. The theory then has N copies of coordinate transformation invariance (the coordinate system associated with each metric can be changed separately), as well as N copies of Lorentz invariance. At this level, for each vielbein \(e_{(J)}\), J = 1, ⋯, N, we may use part of the Lorentz freedom to work in the upper triangular form for the vielbein,

$${e_{(J)}}_{\;\mu}^A = \left({\begin{array}{*{20}c} {{N_{(J)}}} & {N_{(J)}^i{e_{(J)}}_{\;i}^a} \\ 0 & {{e_{(J)}}_{\;i}^a} \\ \end{array}} \right)\,,\qquad {e_{(J)}}_{\;A}^\mu = \left({\begin{array}{*{20}c} {{N_{(J)}}^{- 1}} & 0 \\ {- N_{(J)}^i{N_{(J)}}^{- 1}} & {{e_{(J)}}_{\;a}^i} \\ \end{array}} \right)\,,$$
(7.69)

leading to the standard ADM decomposition for the metric,

$$\begin{array}{*{20}c} {{g_{(J)\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = {e_{(J)}}_{\;\mu}^A{e_{(J)}}_{\;\nu}^B{\eta _{AB}}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= - {N_{(J)}}^2{\rm{d}}{t^2} + {\gamma _{(J)}}_{ij}\left({{\rm{d}}{x^i} + N_{(J)}^i{\rm{d}}t} \right)\;({\rm{d}}{x^j} + {N_{(J)}}^j{\rm{d}}t)\,,} \\ \end{array}$$
(7.70)

with the three-dimensional metric \({\gamma _{(J)ij}} = {e_{(J)}}_{\;i}^a\,{e_{(J)}}_{\;j}^b\,{\delta _{ab}}\). Starting with non-interacting fields, we simply take N copies of the GR action,

$${L_{N{\rm{GR}}}} = \int {\rm{d}} t\sum\limits_{J = 1}^N {\sqrt {- {g_{(J)}}}} {R_{(J)}}\,,$$
(7.71)

and the Hamiltonian in terms of the vielbein variables then takes the form (7.6)

$${{\mathcal H}_{N{\rm{GR}}}} = \int {\;{{\rm{d}}^d}} x\sum\limits_{J = 1}^N {\left({{\pi _{(J)}}_a^{\;i}\dot e_{(J)\;\;i}^{\;\;\;\;\;\;a} + {N_{(J)}}{{\mathcal C}_{(J)}}_0 + N_{(J)}^i{{\mathcal C}_{(J)}}_i - {1 \over 2}\lambda _{(J)}^{ab}{{\mathcal P}_{(J)}}_{ab}} \right)} \,,$$
(7.72)

where \({\pi _{(J)}}_a^{\,\,i}\) is the conjugate momentum associated with the vielbein \({e_{(J)}}_{\,\,i}^a\) and the constraints \({{\mathcal C}_{(J)0,i}} = {{\mathcal C}_{0,i}}({e_{(J)}},{\pi _{(J)}})\) are the ones mentioned previously in (7.6) (now expressed in the vielbein variables) and are related to diffeomorphism invariance. In the vielbein language there are an additional d (d − 1)/2 primary constraints for each vielbein field

$${{\mathcal P}_{(J)}}_{ab} = {e_{(J)}}_{\left[ {ai} \right.}\;{\pi _{(J)}}_{\left. b \right]}^{\;\;i},$$
(7.73)

related to the residual local Lorentz symmetry still present after fixing the upper triangular form for the vielbeins.

Now rather than setting part of the N Lorentz frames to be on the upper diagonal form for all the vielbeins (7.69), we only use one Lorentz boost to set one of the vielbeins in that form, say \(e_{(1)}\), and ‘unboost’ the N − 1 other frames, so that for any of the other vielbeins one has

$${e_{(J)}}_{\;\mu}^A = \left({\begin{array}{*{20}c} {{N_{(J)}}{{\tilde \gamma}_{(J)}} + N_{(J)}^i{e_{(J)}}_{\;i}^a{p_{(J)}}_a} & {{N_{(J)}}p_{(J)}^a + N_{(J)}^i{e_{(J)}}_{\;i}^b{S_{(J)}}_b^a} \\ {{e_{(J)}}_{\;i}^a{p_{(J)}}_a} & {{e_{(J)}}_{\;i}^b{S_{(J)}}_b^a} \\ \end{array}} \right)$$
(7.74)
$${S_{(J)}}_b^a = \delta _b^a + \tilde \gamma _{(J)}^{- 1}p_{(J)}^a{p_{(J)}}_b$$
(7.75)
$${\tilde \gamma _{(J)}} = \sqrt {1 + {p_{(J)}}_ap_{(J)}^a}$$
(7.76)

where \({p_{(J)}}_a\) is the boost that would bring that vielbein in the upper diagonal form.

We now consider arbitrary interactions between the N fields of the form (6.1),

$${L_{N\,{\rm{int}}}} = \sum\limits_{{J_1}, \cdots ,{J_d} = 1}^N {{\alpha _{{J_1}, \cdots ,{J_d}}}} {\varepsilon _{{a_1} \cdots {a_d}}}\,e_{({J_1})}^{{a_1}} \wedge \cdots \wedge e_{({J_d})}^{{a_d}}\,,$$
(7.77)

where for concreteness we assume d ≤ N, otherwise the formalism is exactly the same (there is some redundancy in this formulation, i.e., some interactions are repeated, but this has no consequence for the argument). Since the vielbeins \({e_{(J)}}_0^A\) are linear in their respective shifts and lapse \({N_{(J)}},N_{(J)}^i\) and the vielbeins \({e_{(J)}}_i^A\) do not depend on any shift or lapse, it is easy to see that the general set of interactions (7.77) leads to a Hamiltonian which is also linear in every shift and lapse,

$${{\mathcal H}_{N\,{\rm{int}}}} = \sum\limits_{J = 1}^N {\left({{N_{(J)}}{\mathcal C}_{(J)}^{{\rm{int}}}(e,p) + N_{(J)}^i{\mathcal C}_{(J)\,i}^{{\rm{int}}}(e,p)} \right)} \,.$$
(7.78)

Indeed the wedge structure of (6.1) or (7.77) ensures that there is one and only one vielbein with time-like index \({e_{(J)}}_0^A\) for every term \({\varepsilon _{{a_1} \ldots {a_d}}}e_{({J_1})}^{{a_1}}\wedge \ldots \wedge e_{({J_d})}^{{a_d}}\).

Notice that for the interactions, the terms \({\mathcal C}_{(J)0,i}^{{\rm{int}}}\) can depend on all the N vielbeins \(e_{(J')}\) and all the N − 1 ‘boosts’ \(p_{(J')}\) (as mentioned previously, part of one Lorentz frame is set so that \(p_{(1)} = 0\) and \(e_{(1)}\) is in the upper diagonal form). Following the procedure of [314], we can now solve for the N − 1 remaining boosts by using (N − 1) of the N shift equations of motion

$${{\mathcal C}_{(J)\,i}}(e,\pi) + {\mathcal C}_{(J)\,i}^{{\rm{int}}}(e,p) = 0\qquad \forall \;\;J = 1, \cdots ,N\,.$$
(7.79)

Now assuming that all N vielbeins are interacting,Footnote 20 (i.e., there is no vielbein \(e_{(J)}\) which does not appear at least once in the interactions (7.77) which mix different vielbeins), the shift equations (7.79) will involve all the N − 1 boosts and can be solved for them without spoiling the linearity in any of the N lapses \(N_{(J)}\). As a result, the N − 1 lapses \(N_{(J)}\) for J = 2, ⋯, N are Lagrange multipliers for (N − 1) first-class constraints. The lapse \(N_{(1)}\) for the first vielbein combines with the remaining shift \(N_{(1)}^i\) to generate the one remaining copy of diffeomorphism invariance.

We now have all the ingredients to count the number of dofs in phase space: We start with \(d^2\) components in each of the N vielbeins \(e_{(J)i}^a\) and associated conjugate momenta, that is a total of \(2 \times d^2 \times N\) phase space variables. We then have 2 × d (d − 1)/2 × N constraintsFootnote 21 associated with the \(\lambda _{(J)}^{ab}\). There is one copy of diffeomorphism invariance removing 2 × (d + 1) phase space dofs (with Lagrange multipliers \(N_{(1)}\) and \(N_{(1)}^i\)) and (N − 1) additional first-class constraints with Lagrange multipliers \(N_{(J \ge 2)}\) removing 2 × (N − 1) dofs. As a result we end up with

$$\begin{array}{*{20}c} {\left({2 \times {d^2} \times N} \right) - 2 \times {{d(d - 1)} \over 2} \times N - 2 \times (d + 1) - 2 \times (N - 1)} \\ {= \left({{d^2}N - 2N + d(N - 2)} \right){\rm{phase}}\;{\rm{space}}\;{\rm{dofs}}\;\quad \quad \quad \quad \quad \quad \;\;} \\ {= {1 \over 2}\left({{d^2}N - 2N + d(N - 2)} \right){\rm{field}}\;{\rm{space}}\;{\rm{dofs}}\quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(7.80)
$$\begin{array}{*{20}c} {= {1 \over 2}\left({{d^2} - d - 2} \right){\rm{dofs}}\;{\rm{for}}\;{\rm{a}}\;{\rm{massless}}\;{\rm{spin - 2}}\;{\rm{field}}\quad \quad \quad \quad \quad \quad \quad \;} \\ {+ {1 \over 2}\left({{d^2} + d - 2} \right) \times (N - 1)\;{\rm{dofs}}\;{\rm{for}}\;(N - 1)\;{\rm{massive}}\;{\rm{spin - 2}}\;{\rm{fields}}\,,} \\ \end{array}$$
(7.81)

which is the correct counting in (d + 1) spacetime dimensions, and the theory is thus free of any BD ghost.
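The bookkeeping above is easy to check numerically. The following is a minimal sketch, not part of the original derivation: the function name `multigravity_dof_count` and the returned keys are illustrative choices, and d denotes the number of spatial dimensions, as in the counting just described.

```python
def multigravity_dof_count(d: int, n: int) -> dict:
    """Tally the phase-space counting of (7.80)-(7.81) for n interacting vielbeins.

    d is the number of spatial dimensions (spacetime dimension d + 1),
    following the convention used in the counting in the text.
    """
    if d < 2 or n < 1:
        raise ValueError("expected d >= 2 spatial dimensions and n >= 1 vielbeins")

    variables = 2 * d * d * n              # spatial vielbeins and their momenta
    lorentz = 2 * (d * (d - 1) // 2) * n   # constraints enforced by lambda_(J)^{ab}
    diffeo = 2 * (d + 1)                   # multipliers N_(1) and N_(1)^i
    lapses = 2 * (n - 1)                   # multipliers N_(J) with J >= 2

    phase_space = variables - lorentz - diffeo - lapses
    massless = (d * d - d - 2) // 2        # one massless spin-2 field
    massive = (d * d + d - 2) // 2         # each massive spin-2 field
    return {
        "phase_space": phase_space,
        "field_space": phase_space // 2,
        "massless": massless,
        "massive_each": massive,
        "expected": massless + massive * (n - 1),
    }


if __name__ == "__main__":
    for d, n in [(3, 1), (3, 2), (3, 3)]:
        print(d, n, multigravity_dof_count(d, n))
```

For d = 3 and N = 2 (bi-gravity in four spacetime dimensions) this gives 14 phase-space dofs, i.e., 7 = 2 + 5 field-space dofs, and for N = 1 it reduces to the 2 dofs of GR.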

Decoupling Limits

Scaling versus decoupling

Before moving to the decoupling of massive gravity and bi-gravity, let us make a brief interlude concerning the correct identification of degrees of freedom. The Stückelberg trick used previously to identify the correct degrees of freedom works in all generality, but care must be used when taking a “decoupling limit” (i.e., scaling limit) as will be done in Section 8.2.

Imagine the following gauge field theory

$${\mathcal L} = - {1 \over 2}{m^2}{A_\mu}{A^\mu}\,,$$
(8.1)

i.e., the Proca mass term without any kinetic Maxwell term for the gauge field. Since there are no dynamics in this theory, there are no degrees of freedom. Nevertheless, one could still proceed and use the same split \({A_\mu} = A_\mu ^ \bot + {\partial _\mu}\chi/m\) as performed previously,

$${\mathcal L} = - {1 \over 2}{m^2}A_\mu ^ \bot {A^{\bot \,\mu}} + m({\partial _\mu}{A^{\bot \,\mu}})\chi - {1 \over 2}{(\partial \chi)^2}\,,$$
(8.2)

so as to introduce what appears to be a kinetic term for the mode χ. At this level the theory is still invariant under \(\chi \to \chi + m\xi\) and \(A_\mu ^ \bot \to A_\mu ^ \bot - {\partial _\mu}\xi\) and so while there appears to be a dynamical degree of freedom χ, the symmetry makes that degree of freedom unphysical, so that (8.2) still propagates no physical degree of freedom.
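For completeness, (8.2) follows from (8.1) by direct substitution of the split:

$$- {1 \over 2}{m^2}\left({A_\mu ^ \bot + {{{\partial _\mu}\chi} \over m}} \right)\left({{A^{\bot \,\mu}} + {{{\partial ^\mu}\chi} \over m}} \right) = - {1 \over 2}{m^2}A_\mu ^ \bot {A^{\bot \,\mu}} - m{A^{\bot \,\mu}}{\partial _\mu}\chi - {1 \over 2}{(\partial \chi)^2}\,,$$

where the cross term becomes \(m({\partial _\mu}{A^{\bot \,\mu}})\chi\) after integration by parts, up to a total derivative. The redundancy is manifest in this form: the combination \(A_\mu ^ \bot + {\partial _\mu}\chi/m\) is unchanged when χ is shifted by mξ and \(A_\mu ^ \bot\) by \(- {\partial _\mu}\xi\), for an arbitrary function ξ(x).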

Now consider the m → 0 scaling limit of (8.2) while keeping \(A_\mu ^ \bot\) and χ finite. In that scaling limit, the theory reduces to

$${{\mathcal L}_{m \rightarrow 0}} = - {1 \over 2}{(\partial \chi)^2}\,,$$
(8.3)

i.e., one degree of freedom with no symmetry which implies that the theory (8.3) propagates one degree of freedom. This is correct and thus means that (8.3) is not a consistent decoupling limit of (8.2) since the number of degrees of freedom is different already at the linear level. In the rest of this review, we will call a decoupling limit a specific type of scaling limit which preserves the same number of physical propagating degrees of freedom in the linear theory. As suggested by the name, a decoupling limit is a special kind of limit in which some of the degrees of freedom of the original theory might decouple from the rest, but the total number of degrees of freedom remains identical. For the theory (8.2), this means that the scaling ought to be taken not with \(A_\mu ^ \bot\) fixed but rather with \(\tilde A_\mu ^ \bot = m A_\mu ^ \bot\) fixed, so that every term in (8.2) remains finite. This is indeed a consistent rescaling which leads to finite contributions in the limit m → 0,

$${{\mathcal L}_{m \rightarrow 0}} = - {1 \over 2}\tilde A_\mu ^ \bot {\tilde A^{\bot \,\mu}} + ({\partial _\mu}{\tilde A^{\bot \,\mu}})\chi - {1 \over 2}{(\partial \chi)^2}\,,$$
(8.4)

which clearly propagates no degrees of freedom.

This procedure is true in all generality: a decoupling limit is a special scaling limit where all the fields in the original theory are scaled with the highest possible power of the scale in such a way that the decoupling limit is finite.

A decoupling limit of a theory never changes the number of physical degrees of freedom of a theory. At best it ‘decouples’ some of them in such a way that they are inaccessible from another sector.

Before looking at the massive gravity limit of bi-gravity and other decoupling limits of massive and bi-gravity, let us start by describing the different scaling limits that can be taken. We start with a bi-gravity theory where the two spin-2 fields have respective Planck scales \(M_g\) and \(M_f\) and the interactions between the two metrics arise at the scale m. In order to stick to the relevant points we perform the analysis in four dimensions, but the following arguments extend trivially to arbitrary dimensions.

  • Non-interacting Limit: The most natural question to ask is what happens in the limit where the interactions between the two fields are ‘switched off’, i.e., when sending the scale m → 0 (the limit m → 0 is studied more carefully in Sections 8.3 and 8.4). In that case if the two Planck scales \(M_{g,f}\) remain fixed as m → 0, we then recover two massless non-interacting spin-2 fields (each carrying 2 helicity-2 modes), in addition to a decoupled sector containing a helicity-0 mode and a helicity-1 mode. In bi-gravity matter fields couple only to one metric, and this remains the case in the limit m → 0, so that the two massless spin-2 fields live in two fully decoupled sectors even when matter is included.

  • Massive Gravity: Alternatively, we may look at the limit where one of the spin-2 fields (say \(f_{\mu\nu}\)) decouples. This can be studied by sending its respective Planck scale to infinity. The resulting limit corresponds to a massive spin-2 field (carrying five dofs) and a decoupled massless spin-2 field carrying 2 dofs. This is nothing other than the massive gravity limit of bi-gravity (which includes a fully decoupled massless sector).

    If one considers matter coupling to the metric fμν which scales in such a way that a non-trivial solution for fμν survives in the \(M_f \to \infty\) limit, \(f_{\mu\nu} \to \bar f_{\mu\nu}\), we then obtain a massive gravity sector on an arbitrary non-dynamical reference metric \(\bar f_{\mu\nu}\). The dynamics of the massless spin-2 field fully decouples from that of the massive sector.

  • Other Decoupling Limits: Finally, one can look at combinations of the previous limits, and the resulting theory depends on how fast Mf, Mg → ∞ compared to how fast m → 0. For instance if one takes the limit Mf, Mg → ∞ and m → 0, while keeping both Mg/Mf and \(\Lambda _3^3 = {M_g}{m^2}\) fixed, then we obtain what is called the Λ3-decoupling limit of bi-gravity (derived in Section 8.4), where the dynamics of the two helicity-2 modes (which are both massless in that limit), and that of the helicity-1 and -0 modes can be followed without keeping track of the standard non-linearities of GR.

    If on top of this Λ3-decoupling limit one further takes Mf → ∞, then one of the massless spin-2 fields fully decouples (no communication between that field and the helicity-1 and -0 modes). If, on the other hand, we take the additional limit m → 0 on top of the Λ3-decoupling limit, then the helicity-0 and -1 modes fully decouple from both helicity-2 modes.

In all of these decoupling limits, the number of dofs remains the same as in the original theory; some fields are simply decoupled from the rest of the standard gravitational sector. This prevents any communication between these decoupled fields and the gravitational sector, and so from the gravitational sector viewpoint it appears as if these decoupled fields did not exist.

It is worth stressing that all of these limits are perfectly sensible and lead to sensible theories (from a theoretical viewpoint). This is important since if one of these scaling limits led to a pathological theory, it would have severe consequences for the parent bi-gravity theory itself.

Similar decoupling limits could be taken in multi-gravity: out of N interacting spin-2 fields, we could obtain for instance N decoupled massless spin-2 fields and 3(N − 1) decoupled dofs in the helicity-0 and -1 modes.
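The statement that a decoupling limit never changes the number of dofs can be checked by simple counting. The sketch below assumes the standard four-dimensional counting (2 dofs for a massless spin-2 field, 5 for a massive one, split into helicity-2, -1 and -0 modes) and assumes that N interacting spin-2 fields describe one massless and N − 1 massive gravitons; the function names are illustrative only.

```python
# Dofs carried by each helicity sector of a spin-2 field in four dimensions.
HELICITY_2 = 2  # two tensor polarizations
HELICITY_1 = 2  # two vector polarizations
HELICITY_0 = 1  # one scalar polarization


def interacting_dofs(n_fields: int) -> int:
    """Assumed spectrum: one massless graviton plus (N - 1) massive gravitons."""
    massive = HELICITY_2 + HELICITY_1 + HELICITY_0  # 5 dofs per massive graviton
    return HELICITY_2 + (n_fields - 1) * massive


def decoupled_dofs(n_fields: int) -> int:
    """Non-interacting limit: N massless gravitons plus 3(N - 1) decoupled dofs."""
    return n_fields * HELICITY_2 + (n_fields - 1) * (HELICITY_1 + HELICITY_0)


for n in (2, 3, 5):
    # Both counts agree, e.g. 7 = 2 + 5 = 2 + 2 + 3 for bi-gravity (N = 2).
    print(n, interacting_dofs(n), decoupled_dofs(n))
```

For bi-gravity (N = 2) both counts give 7, in agreement with the two massless helicity-2 fields plus one helicity-1 and one helicity-0 mode described in the non-interacting limit above.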

In what follows we focus on the massive gravity limit of bi-gravity when Mf → ∞.

Massive gravity as a decoupling limit of bi-gravity

Minkowski reference metric

In the following two sections we review the decoupling arguments given previously in the literature, (see for instance [154]). We start with the theory of bi-gravity presented in Section 5.4 with the action (5.43)

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\rm{bi - gravity}}}} = {{M_g^2} \over 2}\sqrt {- g} R[g] + {{M_f^2} \over 2}\sqrt {- f} R[f] + {1 \over 4}{m^2}M_{{\rm{Pl}}}^2\sqrt {- g} \,{{\mathcal L}_m}(g,f)} \\ {+ \sqrt {- g} {\mathcal L}_g^{({\rm{matter}})}({g_{\mu \nu}},{\psi _g}) + \sqrt {- f} {\mathcal L}_f^{({\rm{matter}})}({f_{\mu \nu}},{\psi _f})\,,\quad \quad} \\ \end{array}$$
(8.5)

with \({{\mathcal L}_m}(g,f) = \sum\nolimits_{n = 0}^4 {{\alpha _n}{{\mathcal L}_n}[{\mathcal K}(g,f)]}\) as defined in (6.3) and where \({\mathcal K}_\nu^\mu = \delta _\nu^\mu - \sqrt {{g^{\mu \alpha}}{f_{\alpha \nu}}}\). We also allow for the coupling to matter with different species ψg,f living on each metric.

We now consider matter fields ψf such that fμν = ημν is a solution to the equations of motion (so for instance there is no overall cosmological constant living on the metric fμν). In that case we can write that metric fμν as

$${f_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_f}}}{\chi _{\mu \nu}}\,,$$
(8.6)

We may now take the limit Mf → ∞ while keeping the scales Mg and m and all the fields χ, g, ψf,g fixed. We then recover massive gravity plus a completely decoupled massless spin-2 field χμν, and a fully decoupled matter sector ψf living on flat space

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\rm{bi - gravity}}}}\overset {{M_f} \rightarrow \infty}{\rightarrow}{{\mathcal L}_{{\rm{MG}}}}(g,\eta) + \sqrt {- g} {\mathcal L}_g^{({\rm{matter}})}({g_{\mu \nu}},{\psi _g})\quad \quad \quad \quad \quad\;\;} \\ {+ {1 \over 2}{\chi ^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}} + {\mathcal L}_f^{({\rm{matter}})}({\eta _{\mu \nu}},{\psi _f})\,,} \\ \end{array}$$
(8.7)

where the massive gravity Lagrangian \({\mathcal L}_{\rm MG}\) is expressed in (6.3). That massive gravity Lagrangian remains fully non-linear in this limit and is expressed in terms of the full metric gμν and the reference metric ημν. While the metric fμν is ‘frozen’ in this limit, we emphasize however that the massless spin-2 field χμν is itself not frozen: its dynamics is captured through the kinetic term \({\chi ^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}}\), but that spin-2 field decouples from its own matter sector ψf (although this can be accommodated by scaling the matter fields ψf accordingly in the limit Mf → ∞ so as to maintain some interactions).

At the level of the equations of motion, in the limit Mf → ∞ we obtain the massive gravity modified Einstein equation for gμν, the free massless linearized Einstein equation for χμν, which fully decouples, and the equation of motion for all the matter fields ψf on flat spacetime (see also Ref. [44]).

(A)dS reference metric

To consider massive gravity with an (A)dS reference metric as a limit of bi-gravity, we include a cosmological constant for the metric f into (8.5)

$${{\mathcal L}_{{\rm{CC}},{\rm{f}}}} = - M_f^2\int {{{\rm{d}}^4}} x\sqrt {- f} {\Lambda _f}\,.$$
(8.8)

There can also be in principle another cosmological constant living on top of the metric gμν, but this can be included into the potential \({\mathcal U}(g,f)\). The background field equations of motion are then given by

$$M_f^2{G_{\mu \nu}}[f] + {{{m^2}M_{{\rm{Pl}}}^2} \over {4\sqrt {- f}}}\left({{\delta \over {\delta {f^{\mu \nu}}}}\sqrt {- g} \,{\mathcal U}(g,f)} \right) = {T_{\mu \nu}}({\psi _f}) - M_f^2{\Lambda _f}{f_{\mu \nu}}$$
(8.9)
$$M_{{\rm{Pl}}}^2{G_{\mu \nu}}[g] + {{{m^2}M_{{\rm{Pl}}}^2} \over {4\sqrt {- g}}}\left({{\delta \over {\delta {g^{\mu \nu}}}}\sqrt {- g} \,{\mathcal U}(g,f)} \right) = {T_{\mu \nu}}({\psi _g})\,.$$
(8.10)

Taking now the limit Mf → ∞ while keeping the cosmological constant Λf fixed, the background solution for the metric fμν is nothing other than dS (or AdS depending on the sign of Λf). So we can now express the metric fμν as

$${f_{\mu \nu}} = {\gamma _{\mu \nu}} + {1 \over {{M_f}}}{\chi _{\mu \nu}}\,,$$
(8.11)

where γμν is the dS metric with Hubble parameter \(H = \sqrt {{\Lambda _f}/3}\). Taking the limit Mf → ∞, we recover massive gravity on (A)dS plus a completely decoupled massless spin-2 field χμν,

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\rm{bi - gravity}}}} - M_f^2\int {{{\rm{d}}^4}} x\sqrt {- f} {\Lambda _f}\;\overset {{M_f} \rightarrow \infty}{\rightarrow} {{M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} R + {{{m^2}} \over 4}{\mathcal U}(g,\gamma)\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {1 \over 2}{\chi ^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}}\,,} \\ \end{array}$$
(8.12)

where once again the scales MPl and m are kept fixed in the limit Mf → ∞. γμν now plays the role of a non-trivial reference metric for massive gravity. This corresponds to a theory of massive gravity on a more general reference metric as presented in [296]. Here again the Lagrangian for massive gravity is given in (6.3) with now \({\mathcal K}_\nu^\mu (g) = \delta _\nu^\mu - \sqrt {{g^{\mu \alpha}}{\gamma _{\alpha \nu}}}\). The massive gravity action remains fully non-linear in the limit Mf → ∞ and is expressed solely in terms of the full metric gμν and the reference metric γμν, while the excitations χμν for the massless graviton remain dynamical but fully decouple from the massive sector.

Arbitrary reference metric

As is already clear from the previous discussion, to recover massive gravity on a non-trivial reference metric as a limit of bi-gravity, one needs to scale the matter Lagrangian that couples to what will become the reference metric (say the metric f for definiteness) in such a way that the Riemann curvature of f remains finite in that decoupling limit. For a macroscopic description of the matter living on f this is in principle always possible. For instance one can consider a point source of mass MBH living on the metric f. Then, taking the limit Mf, MBH → ∞ while keeping the ratio MBH/Mf fixed leads to a theory of massive gravity on a Schwarzschild reference metric and a decoupled massless graviton. However, some care needs to be taken to see how this works when the dynamics of the matter sourcing f is included.

As soon as the dynamics of the matter field is considered, one has to send the scale of that field to infinity so that it maintains some nonzero effect on f in the limit Mf → ∞ i.e.,

$$\underset {{M_f} \rightarrow \infty} {\lim} {1 \over {M_f^2}}{T^{\mu \nu}} = \underset {{M_f} \rightarrow \infty} {\lim} {1 \over {\sqrt {- f} M_f^2}}{{\delta \sqrt {- f} {\mathcal L}_f^{({\rm{matter}})}} \over {\delta {f_{\mu \nu}}}} \rightarrow {\rm{finite}}\,.$$
(8.13)

Nevertheless, this can be achieved in such a way that the fluctuations of the matter fields remain finite and decouple in the limit Mf → ∞. We note that this scaling is the key difference between the decoupling limit of bi-gravity on a Minkowski reference metric derived in Section 8.2.1, where the matter fields scale as \(\lim_{M_f \to \infty} {1 \over {M_f^2}}{T^{\mu \nu}} \to 0\), and the decoupling limit of bi-gravity on an arbitrary reference metric derived here.

As an example, suppose that the Lagrangian for the matter (for example a scalar field) sourcing the f metric is

$${\mathcal L}_f^{({\rm{matter}})} = \sqrt {- f} \left({- {1 \over 2}{f^{\mu \nu}}{\partial _\mu}\chi {\partial _\nu}\chi - {V_0}F\left({{\chi \over \lambda}} \right)} \right)$$
(8.14)

where F(X) is an arbitrary dimensionless function of its argument. Then choosing χ to take the form

$$\chi = {M_f}\bar \chi + \delta \chi \,,$$
(8.15)

and rescaling \({V_0} = M_f^2{\bar V_0}\) and \(\lambda = {M_f}\bar \lambda\), then on taking the limit Mf → ∞ keeping \(\bar \chi\), \(\delta \chi\) and \(\bar \lambda\) fixed, since

$${\mathcal L}_f^{({\rm{matter}})} \rightarrow M_f^2\sqrt {- f} \left({- {1 \over 2}{f^{\mu \nu}}{\partial _\mu}\bar \chi {\partial _\nu}\bar \chi - {{\bar V}_0}F\left({{{\bar \chi} \over {\bar \lambda}}} \right)} \right) + {\rm{fluctuations}}\,,$$
(8.16)

we find that the background stress energy blows up in such a way that \({1 \over {M_f^2}}{T^{\mu \nu}}\) remains finite and nontrivial, and in addition the background equations of motion for \(\bar \chi\) remain well-defined and nontrivial in this limit,

$${\square_f}\bar \chi = {{{{\bar V}_0}} \over {\bar \lambda}}F\prime\left({{{\bar \chi} \over {\bar \lambda}}} \right)\,.$$
(8.17)

This implies that even in the limit Mf → ∞, fμν can remain consistently as a nontrivial sourced metric which is a solution of some dynamical equations sourced by matter. In addition the action for the fluctuations δχ asymptotes to a free theory which is coupled only to the fluctuations of fμν, which are themselves completely decoupled from the fluctuations of the metric g and matter fields coupled to g.

As a result, massive gravity with an arbitrary reference metric can be seen as a consistent limit of bi-gravity in which the additional degrees of freedom in the metric f and in the matter that sources the background decouple. Thus all solutions of massive gravity may be seen as Mf → ∞ decoupling limits of solutions of bi-gravity. This will be discussed in more depth in Section 8.4. For an arbitrary reference metric which can be locally written as a small departure about Minkowski the decoupling limit is derived in Eq. (8.81).

Having derived massive gravity as a consistent decoupling limit of bi-gravity, we could of course do the same for any multi-metric theory. For instance, out of N interacting fields, we could take a limit so as to decouple one of the metrics; we then obtain a theory of (N − 1) interacting fields, all of which are massive, plus one decoupled massless spin-2 field.

Decoupling limit of massive gravity

We now turn to a different type of decoupling limit, whose aim is to disentangle the dofs present in massive gravity itself and analyze the ‘irrelevant interactions’ (in the usual EFT sense) that arise at the lowest possible scale. One could naively think that such interactions arise at the scale given by the graviton mass, but this is not so. In a generic theory of massive gravity with Fierz-Pauli at the linear level, the first irrelevant interactions typically arise at the scale \(\Lambda_5 = (m^4 M_{\rm Pl})^{1/5}\). For the setups we have in mind, \(m \ll \Lambda_5 \ll M_{\rm Pl}\). But we shall see that interactions arising at such a low-energy scale are always pathological (reminiscent of the BD ghost [111, 173]), and in ghost-free massive gravity the first (irrelevant) interactions actually arise at the scale \(\Lambda_3 = (m^2 M_{\rm Pl})^{1/3}\).

We start by deriving the decoupling limit in the absence of vectors (helicity-1 modes) and then include them in the following section 8.3.4. Since we are interested in the decoupling limit about flat spacetime, we look at the case where Minkowski is a vacuum solution to the equations of motion. This is the case in the absence of a cosmological constant and a tadpole and we thus focus on the case where α0 = α1 = 0 in (6.3).

Interaction scales

In GR, the interactions of the helicity-2 mode arise at the very high energy scale, namely the Planck scale. In massive gravity a new scale enters and we expect some interactions to arise at a lower energy scale given by a geometric combination of the Planck scale and the graviton mass. The potential term \(M_{{\rm{Pl}}}^2{m^2}\sqrt {- g} {{\mathcal L}_n}[{\mathcal K}[g,\eta ]]\) (6.3) includes generic interactions between the canonically normalized helicity-0 (π), helicity-1 (Aμ), and helicity-2 modes (hμν) introduced in (2.48)

$$\begin{array}{*{20}c} {{{\mathcal L}_{j,k,\ell}} = {m^2}M_{{\rm{Pl}}}^2{{\left({{h \over {M_{{\rm{Pl}}}}}} \right)}^j}{{\left({{{\partial A} \over {mM_{{\rm{Pl}}}}}} \right)}^{2k}}{{\left({{{{\partial ^2}\pi} \over {{m^2}M_{{\rm{Pl}}}}}} \right)}^\ell}} \\ {= \Lambda _{j,k,\ell}^{4 - (j + 4k + 3\ell)}\;{h^j}{{(\partial A)}^{2k}}{{({\partial ^2}\pi)}^\ell}\,,\quad} \\ \end{array}$$
(8.18)

at the scale

$${\Lambda _{j,k,\ell}} = {\left({{m^{2k + 2\ell - 2}}M_{{\rm{Pl}}}^{j + 2k + \ell - 2}} \right)^{1/(j + 4k + 3\ell - 4)}}\,,$$
(8.19)

and with j, k, ℓ ∈ ℕ, and j + 2k + ℓ > 2.

Clearly, the lowest interaction scale is \(\Lambda_{j=0,k=0,\ell=3} \equiv \Lambda_5 = (M_{\rm Pl} m^4)^{1/5}\), which arises for an operator of the form \((\partial^2\pi)^3\). If present, such an interaction leads to an Ostrogradsky instability, which is another manifestation of the BD ghost as identified in [173].

Even if that very interaction is absent there is actually an infinite set of dangerous interactions of the form \((\partial^2\pi)^\ell\) which arise at the scale \(\Lambda_{j=0,k=0,\ell\geq 3}\), with

$${\Lambda _5} = {({M_{{\rm{Pl}}}}{m^4})^{1/5}} \leq ({\Lambda _{j = 0,k = 0,\ell \geq 3}}) < {\Lambda _3} = {({M_{{\rm{Pl}}}}{m^2})^{1/3}}\,.$$
(8.20)

with \(\Lambda_{j=0,k=0,\ell\to\infty} = \Lambda_3\).

Any interaction with j > 0 or k > 0 automatically leads to a larger scale, so all the interactions arising at a scale between Λ5 (inclusive) and Λ3 are of the form \((\partial^2\pi)^\ell\) and carry an Ostrogradsky instability. For DGP we have already seen that there are no interactions at a scale below Λ3. In what follows we show that the same remains true for the ghost-free theory of massive gravity proposed in (6.3). To see this let us identify the interactions with j = k = 0 and arbitrary power ℓ for \((\partial^2\pi)\).
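The hierarchy of scales in (8.19) and (8.20) can be made explicit with exact rational exponents. In the sketch below the two exponents in (8.19) add up to one, so every scale can be written as Λ = MPl (m/MPl)^a with a single exponent a; since m ≪ MPl, a larger a means a lower scale. The helper name `mass_exponent` and the restriction to operators of dimension larger than four are choices made for this illustration only.

```python
from fractions import Fraction


def mass_exponent(j: int, k: int, l: int) -> Fraction:
    """Exponent a in Lambda_{j,k,l} = M_Pl * (m / M_Pl)**a, read off from Eq. (8.19).

    The m and M_Pl exponents in (8.19) sum to 1, so a alone fixes the scale.
    """
    dim = j + 4 * k + 3 * l  # mass dimension of h^j (dA)^(2k) (d^2 pi)^l
    if j + 2 * k + l <= 2 or dim <= 4:
        raise ValueError("not an irrelevant interaction beyond quadratic order")
    return Fraction(2 * k + 2 * l - 2, dim - 4)


LAMBDA_5 = Fraction(4, 5)  # Lambda_5 = (m^4 M_Pl)^(1/5)
LAMBDA_3 = Fraction(2, 3)  # Lambda_3 = (m^2 M_Pl)^(1/3)

# Pure helicity-0 operators: the exponent starts at 4/5 and decreases towards 2/3.
print([str(mass_exponent(0, 0, l)) for l in range(3, 8)])

# Operators with one h or two A fields sit exactly at Lambda_3.
print(mass_exponent(1, 0, 2), mass_exponent(0, 1, 1))
```

In this parametrization the inequality (8.20) reads 2/3 < a ≤ 4/5 for j = k = 0 and ℓ ≥ 3, while any operator with j > 0 or k > 0 has a ≤ 2/3.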

Operators below the scale Λ3

We now express the potential term \(M_{{\rm{Pl}}}^2{m^2}\sqrt {- g} {{\mathcal L}_n}[{\mathcal K}]\) introduced in (6.3) using the metric in terms of the helicity-0 mode, where we recall that the quantity \({\mathcal K}\) is defined in (6.7) as \({\mathcal K}_\nu^\mu [g,\tilde f] = \delta _{\,\,\,\,\,\nu}^\mu - (\sqrt {{g^{- 1}}\tilde f})_\nu^\mu\), where \(\tilde f\) is the ‘Stückelbergized’ reference metric given in (2.78). Since we are interested in interactions without the helicity-2 and -1 modes (j = k = 0), it is sufficient to follow the behaviour of the helicity-0 mode and so we have

$$\left. {\begin{array}{*{20}c} {{{\left. {{{\tilde f}_{\mu \nu}}} \right\vert}_{h = A = 0}} = {\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}{m^2}}}{\Pi _{\mu \nu}} + {1 \over {M_{{\rm{Pl}}}^2{m^4}}}\Pi _{\mu \nu}^2}\quad \\ {{{\left. {{g^{\mu \nu}}} \right\vert}_{h = 0}} = {\eta ^{\mu \nu}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}} \right\}\quad \quad \Rightarrow \quad {\mathcal K}_{\;\;\nu}^\mu {\vert _{h = A = 0}} = {{\Pi _\nu ^\mu} \over {{M_{{\rm{Pl}}}}{m^2}}}\,,$$
(8.21)

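with again \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\) and \(\Pi _{\mu \nu}^2: = {\eta ^{\alpha \beta}}{\Pi _{\mu \alpha}}{\Pi _{\nu\beta}}\). The square root is immediate here because the Stückelbergized reference metric is a perfect square, \(\tilde f_{\mu\nu}|_{h=A=0} = (\eta_{\mu\alpha} - \Pi_{\mu\alpha}/M_{\rm Pl}m^2)\,\eta^{\alpha\beta}\,(\eta_{\beta\nu} - \Pi_{\beta\nu}/M_{\rm Pl}m^2)\).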

As a result, we infer that up to the scale Λ3 (excluded), the potential in (6.3) is

$${\left. {{{\mathcal L}_{{\rm{mass}}}} = {{{m^2}M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} \sum\limits_{n = 2}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,\tilde f]]} \right\vert _{h = A = 0}}$$
(8.22)
$$\begin{array}{*{20}c} {= {{{m^2}M_{{\rm{Pl}}}^2} \over 4}\sum\limits_{n = 2}^4 {{\alpha _n}} {{\mathcal L}_n}\left[ {{{{\Pi _{\mu \nu}}} \over {{M_{{\rm{Pl}}}}{m^2}}}} \right] \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad\;} \\ {= {1 \over 4}{\epsilon ^{\mu \nu \alpha \beta}}{\epsilon _{\mu {\prime} \nu {\prime} \alpha {\prime} \beta {\prime}}}\left({{{{\alpha _2}} \over {{m^2}}}\delta _\mu ^{\mu {\prime}}\delta _\nu ^{\nu {\prime}} + {{{\alpha _3}} \over {{M_{{\rm{Pl}}}}{m^4}}}\delta _\mu ^{\mu {\prime}}\Pi _\nu ^{\nu {\prime}} + {{{\alpha _4}} \over {M_{{\rm{Pl}}}^2{m^6}}}\Pi _\mu ^{\mu {\prime}}\Pi _\nu ^{\nu {\prime}}} \right)\Pi _\alpha ^{\alpha {\prime}}\Pi _\beta ^{\beta {\prime}}\,,} \\ \end{array}$$
(8.23)

where, as mentioned earlier, we focus on the case without a cosmological constant and tadpole, i.e., α0 = α1 = 0. All of these interactions are total derivatives. So even though the ghost-free theory of massive gravity does in principle involve some interactions with higher derivatives of the form \((\partial^2\pi)^\ell\), it does so in a very precise way so that all of these terms combine to give a total derivative and are thus harmless (see Footnote 22).

As a result the potential term proposed in Part II (and derived from the deconstruction framework) is free of any interactions of the form \((\partial^2\pi)^\ell\). This means that the BD ghost as identified in the Stückelberg language in [173] is absent in this theory. However, at this level, the BD ghost could still reappear through different operators at the scale Λ3 or higher.

Λ3-decoupling limit

Since there are no operators all the way up to the scale Λ3 (excluded), we can take the decoupling limit by sending MPl → ∞, m → 0 and maintaining the scale Λ3 fixed.

The operators that arise at the scale Λ3 are the ones of the form (8.18) with either j = 1, k = 0 and arbitrary ℓ ≥ 2 or with j = 0, k = 1 and arbitrary ℓ ≥ 1. The second case leads to vector interactions of the form \((\partial A)^2(\partial^2\pi)^\ell\) and will be studied in Section 8.3.4. For now we focus on the first kind of interactions, of the form \(h\,(\partial^2\pi)^\ell\),

$${\mathcal L}_{{\rm{mass}}}^{{\rm{dec}}} = {h^{\mu \nu}}{\bar X_{\mu \nu}}\,,$$
(8.24)

with [144] (see also refs. [137] and [143])

$$\begin{array}{*{20}c} {{{\left. {{{\bar X}_{\mu \nu}} = {\delta \over {\delta {h_{\mu \nu}}}}{{\mathcal L}_{{\rm{mass}}}}} \right\vert}_{h = A = 0}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {{{\left. {= {{M_{{\rm{Pl}}}^2{m^2}} \over 4}{\delta \over {\delta {h_{\mu \nu}}}}\left({\sqrt {- g} \sum\limits_{n = 2}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,\tilde f]]} \right)} \right\vert}_{h = A = 0}}\,.} \\ \end{array}$$
(8.25)

Using the fact that

$${\left. {{{\delta {{\mathcal K}^n}} \over {\delta {h^{\mu \nu}}}}} \right\vert _{h = A = 0}} = {n \over 2}(\Pi _{\mu \nu}^{n - 1} - \Pi _{\mu \nu}^n)\,,$$
(8.26)

we obtain

$${\bar X_{\mu \nu}} = {{\Lambda _3^3} \over 8}\sum\limits_{n = 2}^4 {{\alpha _n}} \left({{{4 - n} \over {\Lambda _3^{3n}}}X_{\mu \nu}^{(n)}[\Pi ] + {n \over {\Lambda _3^{3(n - 1)}}}X_{\mu \nu}^{(n - 1)}[\Pi ]} \right)\,,$$
(8.27)

where the tensors \(X_{\mu \nu}^{(n)}\) are constructed out of Πμν, symbolically \(X^{(n)} \sim \Pi^{n}\), but in such a way that they are transverse and that their resulting equations of motion never involve more than two derivatives on each field,

$${X^{(0)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu \alpha \beta}}$$
(8.28)
$${X^{(1)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu {\prime} \alpha \beta}}\;Q_\nu ^{\nu {\prime}}$$
(8.29)
$${X^{(2)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu {\prime} \alpha {\prime} \beta}}\;Q_\nu ^{\nu {\prime}}\;Q_\alpha ^{\alpha {\prime}}$$
(8.30)
$${X^{(3)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu {\prime} \alpha {\prime} \beta {\prime}}}\;Q_\nu ^{\nu {\prime}}\;Q_\alpha ^{\alpha {\prime}}Q_\beta ^{\beta {\prime}}$$
(8.31)
$${X^{(n \geq 4)}}_{\,\mu \prime}^\mu [Q] = 0\,,$$
(8.32)

where we have included X(0) and X(n ≥ 4) for completeness (these become relevant for instance in the context of bi-gravity). The generalization of these tensors to arbitrary dimensions is straightforward and in d-spacetime dimensions there are d such tensors, symbolically \(X^{(n)} = \varepsilon\varepsilon\,\Pi^n\delta^{d-n-1}\) for n = 0, …, d − 1.

Since we are dealing with the decoupling limit with MPl → ∞ the metric is flat, \({g_{\mu \nu}} = {\eta _{\mu \nu}} + M_{{\rm{Pl}}}^{- 1}{h_{\mu \nu}} \to {\eta _{\mu \nu}}\), and all indices are raised and lowered with respect to the Minkowski metric. These tensors can be written more explicitly as follows

$$X_{\mu \nu}^{(0)}[Q] = 3!{\eta _{\mu \nu}}$$
(8.33)
$$X_{\mu \nu}^{(1)}[Q] = 2!([Q]{\eta _{\mu \nu}} - {Q_{\mu \nu}})$$
(8.34)
$$X_{\mu \nu}^{(2)}[Q] = ({[Q]^2} - [{Q^2}]){\eta _{\mu \nu}} - 2([Q]{Q_{\mu \nu}} - Q_{\mu \nu}^2)$$
(8.35)
$$\begin{array}{*{20}c} {X_{\mu \nu}^{(3)}[Q] = ({{[Q]}^3} - 3[Q][{Q^2}] + 2[{Q^3}]){\eta _{\mu \nu}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {- 3({{[Q]}^2}{Q_{\mu \nu}} - 2[Q]Q_{\mu \nu}^2 - [{Q^2}]{Q_{\mu \nu}} + 2Q_{\mu \nu}^3){.}} \\ \end{array}$$
(8.36)

Note that they also satisfy the recursive relation

$$X_{\mu \nu}^{(n)} = {1 \over {4 - n}}(- n\Pi _{\,\mu}^\alpha \delta _\nu ^\beta + {\Pi ^{\alpha \beta}}{\eta _{\mu \nu}})\;X_{\alpha \beta}^{(n - 1)},$$
(8.37)

with \(X_{\mu \nu}^{(0)} = 3!{\eta _{\mu \nu}}\).
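The three presentations of the tensors X(n) given above (the Levi-Civita form (8.28) to (8.31), the explicit form (8.33) to (8.36) and the recursion (8.37)) can be cross-checked on a sample matrix. The following is a minimal sketch under a simplifying assumption: it works in Euclidean signature, so that η is replaced by the identity and index positions are immaterial, and it uses an arbitrary symmetric integer matrix `Q` in place of Πμν. All function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

DIM = 4
IDENTITY = [[int(i == j) for j in range(DIM)] for i in range(DIM)]


def parity(p):
    """Sign of the permutation p, counted through its inversions."""
    inversions = sum(p[a] > p[b] for a in range(DIM) for b in range(a + 1, DIM))
    return -1 if inversions % 2 else 1


PERMS = [(p, parity(p)) for p in permutations(range(DIM))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def trace(a):
    return sum(a[i][i] for i in range(DIM))


def x_epsilon(n, q):
    """Eqs. (8.28)-(8.31): two epsilon symbols contracted with n copies of q."""
    x = [[0] * DIM for _ in range(DIM)]
    for p, sign_p in PERMS:
        for r, sign_r in PERMS:
            if p[n + 1:] != r[n + 1:]:  # remaining slots are contracted pairwise
                continue
            term = sign_p * sign_r
            for slot in range(1, n + 1):
                term *= q[r[slot]][p[slot]]
            x[p[0]][r[0]] += term
    return x


def x_explicit(n, q):
    """Eqs. (8.33)-(8.36) with eta replaced by the identity matrix."""
    q2 = matmul(q, q)
    q3 = matmul(q2, q)
    t1, t2, t3 = trace(q), trace(q2), trace(q3)
    forms = {
        0: lambda i, j: 6 * IDENTITY[i][j],
        1: lambda i, j: 2 * (t1 * IDENTITY[i][j] - q[i][j]),
        2: lambda i, j: (t1**2 - t2) * IDENTITY[i][j] - 2 * (t1 * q[i][j] - q2[i][j]),
        3: lambda i, j: (t1**3 - 3 * t1 * t2 + 2 * t3) * IDENTITY[i][j]
        - 3 * (t1**2 * q[i][j] - 2 * t1 * q2[i][j] - t2 * q[i][j] + 2 * q3[i][j]),
    }
    return [[forms[n](i, j) for j in range(DIM)] for i in range(DIM)]


def x_recursive(n, q):
    """Recursion (8.37), seeded with X^(0) = 3! * identity."""
    x = [[Fraction(6 * IDENTITY[i][j]) for j in range(DIM)] for i in range(DIM)]
    for step in range(1, n + 1):
        qx = matmul(q, x)
        tr = trace(qx)
        x = [[(-step * qx[i][j] + tr * IDENTITY[i][j]) / (DIM - step)
              for j in range(DIM)] for i in range(DIM)]
    return x


Q = [[1, 2, 0, -1], [2, 0, 3, 1], [0, 3, -2, 4], [-1, 1, 4, 5]]  # sample symmetric matrix
for n in range(4):
    print(n, x_epsilon(n, Q) == x_explicit(n, Q) == x_recursive(n, Q))
```

The arithmetic is exact (integers and `Fraction`), so agreement of the three forms on a generic matrix is a meaningful consistency check of the coefficients quoted in (8.33) to (8.37), although it is of course not a proof.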

Decoupling limit

From the expression of these tensors in terms of the fully antisymmetric Levi-Civita tensors, it is clear that the tensors \(X_{\mu \nu}^{(n)}\) are transverse and that the equations of motion of \({h^{\mu \nu}}{\bar X_{\mu \nu}}\) with respect to both h and π never involve more than two derivatives. This decoupling limit is thus free of the Ostrogradsky instability, which is the way the BD ghost would manifest itself in this language. This decoupling limit is actually free of any ghost-like instability and the whole theory is free of the BD ghost even beyond the decoupling limit, as we shall see in depth in Section 7.

Not only does the potential term proposed in (6.3) remove any potential interactions of the form \((\partial^2\pi)^\ell\) which could have arisen at an energy between \(\Lambda_5 = (M_{\rm Pl}m^4)^{1/5}\) and Λ3, but it also ensures that the interactions that arise at the scale Λ3 are healthy.

As already mentioned, in the decoupling limit MPl → ∞ the metric reduces to Minkowski and the standard Einstein-Hilbert term simply reduces to its linearized version. As a result, neglecting the vectors for now, the full Λ3-decoupling limit of ghost-free massive gravity is given by

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\Lambda _3}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {1 \over 8}{h^{\mu \nu}}\left({2{\alpha _2}X_{\mu \nu}^{(1)} + {{2{\alpha _2} + 3{\alpha _3}} \over {\Lambda _3^3}}X_{\mu \nu}^{(2)} + {{{\alpha _3} + 4{\alpha _4}} \over {\Lambda _3^6}}X_{\mu \nu}^{(3)}} \right)} \\ {= - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {h^{\mu \nu}}\sum\limits_{n = 1}^3 {{{{a_n}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(8.38)

with a1 = α2/4, a2 = (2α2 + 3α3)/8 and a3 = (α3 + 4α4)/8, and the correct normalization should be α2 = 1.
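The coefficients an multiplying \(h^{\mu\nu}X^{(n)}_{\mu\nu}/\Lambda_3^{3(n-1)}\) in the second line of (8.38) follow from collecting terms in (8.27). The short sketch below performs that bookkeeping with exact fractions; the function name and the sample values of αn are chosen for illustration only.

```python
from fractions import Fraction


def decoupling_coefficients(alpha2, alpha3, alpha4):
    """Collect the coefficients a_1, a_2, a_3 of h X^(n) implied by Eq. (8.27)."""
    alpha = {2: Fraction(alpha2), 3: Fraction(alpha3), 4: Fraction(alpha4)}
    a = {n: Fraction(0) for n in range(1, 5)}
    for n, alpha_n in alpha.items():
        a[n] += Fraction(4 - n, 8) * alpha_n  # term (4 - n) X^(n) / Lambda_3^(3n)
        a[n - 1] += Fraction(n, 8) * alpha_n  # term n X^(n-1) / Lambda_3^(3(n-1))
    # a[4] vanishes identically, consistent with X^(4) = 0 in four dimensions.
    return a[1], a[2], a[3]


a1, a2, a3 = decoupling_coefficients(1, 2, 3)  # arbitrary sample values
print(a1, a2, a3)
```

For generic αn this reproduces α2/4, (2α2 + 3α3)/8 and (α3 + 4α4)/8, matching the first line of (8.38).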

Unmixing and Galileons

As was already the case at the linearized level for the Fierz-Pauli theory (see Eqs. (2.47) and (2.48)) the kinetic term for the helicity-0 mode appears mixed with the helicity-2 mode. It is thus convenient to diagonalize these two modes by performing the following shift,

$${h_{\mu \nu}} = {\tilde h_{\mu \nu}} + {\alpha _2}\pi {\eta _{\mu \nu}} - {{2{\alpha _2} + 3{\alpha _3}} \over {2\Lambda _3^3}}{\partial _\mu}\pi {\partial _\nu}\pi \,,$$
(8.39)

where the non-linear term has been included to unmix the coupling \({h^{\mu \nu}}X_{\mu \nu}^{(2)}\), leading to the following decoupling limit [137]

$${{\mathcal L}_{{\Lambda _3}}} = - {1 \over 4}\left[ {{{\tilde h}^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{{\tilde h}_{\alpha \beta}} + \sum\limits_{n = 2}^5 {{{{c_n}} \over {\Lambda _3^{3(n - 2)}}}} {\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ] - {{2({\alpha _3} + 4{\alpha _4})} \over {\Lambda _3^6}}{{\tilde h}^{\mu \nu}}X_{\mu \nu}^{(3)}} \right]\,,$$
(8.40)

where we introduced the Galileon Lagrangians \({\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]\) as defined in Ref. [412]

$${\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ] = {1 \over {(6 - n)!}}{(\partial \pi)^2}{{\mathcal L}_{n - 2}}[\Pi ]$$
(8.41)
$$= - {2 \over {n(5 - n)!}}\pi {{\mathcal L}_{n - 1}}[\Pi ]\,,$$
(8.42)

where the Lagrangians \({\mathcal L}_n[Q] = \varepsilon\varepsilon\,Q^n\delta^{4-n}\) for a tensor \(Q_{\,\,\,\,\nu}^\mu\) are defined in (6.9) to (6.13), or more explicitly in (6.14) to (6.18), leading to the explicit form for the Galileon Lagrangians

$${\mathcal L}_{({\rm{Gal}})}^{(2)}[\pi ] = {(\partial \pi)^2}$$
(8.43)
$${\mathcal L}_{({\rm{Gal}})}^{(3)}[\pi ] = {(\partial \pi)^2}[\Pi ]$$
(8.44)
$${\mathcal L}_{({\rm{Gal}})}^{(4)}[\pi ] = {(\partial \pi)^2}({[\Pi ]^2} - [{\Pi ^2}])$$
(8.45)
$${\mathcal L}_{({\rm{Gal}})}^{(5)}[\pi ] = {(\partial \pi)^2}({[\Pi ]^3} - 3[\Pi ][{\Pi ^2}] + 2[{\Pi ^3}])\,,$$
(8.46)

and the coefficients cn are given in terms of the αn as follows,

$$\begin{array}{*{20}c} {{c_2} = 3\alpha _2^2\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} & {\quad {c_3} = {3 \over 2}{\alpha _2}(2{\alpha _2} + 3{\alpha _3})\,,\quad \quad \quad} \\ {{c_4} = {1 \over 4}(4\alpha _2^2 + 9\alpha _3^2 + 16{\alpha _2}({\alpha _3} + {\alpha _4}))\,,} & {\quad {c_5} = {5 \over 8}(2{\alpha _2} + 3{\alpha _3})({\alpha _3} + 4{\alpha _4})\,.} \\ \end{array}$$
(8.47)

Setting α2 = 1, we indeed recover the same normalization of \(-\frac{3}{4}(\partial\pi)^2\) for the helicity-0 mode found in (2.48).

X(3)-coupling

In general, the last coupling \({\tilde h^{\mu \nu}}X_{\mu \nu}^{(3)}\) between the helicity-2 and helicity-0 modes cannot be removed by a local field redefinition. The non-local field redefinition

$${\tilde h_{\mu \nu}} \rightarrow {\tilde h_{\mu \nu}} + G_{\mu \nu \alpha \beta}^{{\rm{massless}}}\,{X^{(3)\,\alpha \beta}}\,,$$
(8.48)

where \(G_{\mu \nu\alpha \beta}^{{\rm{massless}}}\) is the propagator for a massless spin-2 field as defined in (2.64), fully diagonalizes the helicity-0 and -2 modes at the price of introducing non-local interactions for π.

Note however that these non-local interactions do not hide any new degrees of freedom. Furthermore, about some specific backgrounds, the field redefinition is local. Indeed, focusing on static and spherically symmetric configurations, we consider π = π0(r) and \({\tilde h_{\mu \nu}}\) given by

$${\tilde h_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - \psi (r)\;{\rm{d}}{t^2} + \phi (r)\;{\rm{d}}{r^2}\,,$$
(8.49)

so that

$${\tilde h^{\mu \nu}}X_{\mu \nu}^{(3)} = - \psi \prime (r){\pi\prime_0}{(r)^3}\,.$$
(8.50)

The standard kinetic term for ψ sets ψ ′(r) = ϕ (r)/r as in GR and the X(3) coupling can be absorbed via the field redefinition, \(\phi \to \bar \phi - 2({\alpha _3} + 4{\alpha _4}){{\pi \prime_0}}{(r)^3}/r\Lambda _3^{- 6}\), leading to the following new sextic interactions for π,

$${\tilde h^{\mu \nu}}X_{\mu \nu}^{(3)} \rightarrow - {1 \over {{r^2}}}{\pi\prime_0}{(r)^6}\,,$$
(8.51)

interestingly this new order-6 term satisfies all the relations of a Galileon interaction but cannot be expressed covariantly in a local way. See [61] for more details on spherically symmetric configurations with the X(3)-coupling.

Vector interactions in the Λ3-decoupling limit

As can be seen from the relation (8.19), the scale associated with interactions mixing two helicity-1 fields with an arbitrary number of fields π (j = 0, k = 1 and arbitrary ℓ) is also Λ3. So at that scale, there are actually an infinite number of interactions when including the mixing between the helicity-1 and -0 modes (however, as mentioned previously, since the vector field always appears quadratically it is always consistent to set it to zero, as was done above).

The full decoupling limit including these interactions has been derived in Ref. [419], (see also Ref. [238]) using the vielbein formulation of massive gravity as in (6.1) and we review the formalism and the results in what follows.

In addition to the Stückelberg fields associated with local covariance, in the vielbein formulation one also needs to introduce 6 additional Stückelberg fields ωab associated to local Lorentz invariance, ωab = − ωba. These are non-dynamical since they never appear with derivatives, and can thus be treated as auxiliary fields which can be integrated out. It is however useful to keep them in the decoupling limit action, so as to retain a closed-form expression. In terms of the Lorentz Stückelberg fields, the full decoupling limit of massive gravity in four dimensions at the scale Λ3 is then (before diagonalization) [419]

$$\begin{array}{*{20}c} {{\mathcal L}_{{\Lambda _3}}^{(0)} = - {1 \over 4}{h^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {1 \over 2}{h^{\mu \nu}}\sum\limits_{n = 1}^3 {{{{\alpha _n}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {+ {{3{\beta _1}} \over 8}\delta _{abcd}^{\alpha \beta \gamma \delta}\delta _\alpha ^a\left({\delta _\beta ^bF_{\;\gamma}^c\omega _{\;\delta}^d + 2[\omega _{\;\beta}^b\omega _{\;\gamma}^c + {1 \over 2}\delta _\beta ^b\omega _{\;\mu}^c\omega _{\;\gamma}^\mu ](\delta + \Pi)_\delta ^d} \right)\quad \quad} \\ {+ {{{\beta _2}} \over 8}\delta _{abcd}^{\alpha \beta \gamma \delta}(\delta + \Pi)_\alpha ^a\left({2\delta _\beta ^bF_{\;\gamma}^c\omega _{\;\delta}^d + [\omega _{\;\beta}^b\omega _{\;\gamma}^c + \delta _\beta ^b\omega _{\;\mu}^c\omega _{\;\gamma}^\mu ](\delta + \Pi)_\delta ^d} \right)\quad} \\ {+ {{{\beta _3}} \over {48}}\delta _{abcd}^{\alpha \beta \gamma \delta}(\delta + \Pi)_\alpha ^a(\delta + \Pi)_\beta ^b\left({3F_{\;\gamma}^c\omega _{\;\delta}^d + \omega _{\;\mu}^c\omega _{\;\gamma}^\mu (\delta + \Pi)_\delta ^d} \right),\quad \quad \quad} \\ \end{array}$$
(8.52)

(the superscript (0) indicates that this decoupling limit is taken with Minkowski as a reference metric), with \(F_{ab} = \partial_a A_b - \partial_b A_a\), and the coefficients βn are related to the αn as in (6.28).

The auxiliary Lorentz Stückelberg fields carry all the non-linear mixing between the helicity-0 and -1 modes,

$${\omega _{ab}} = \int\nolimits_0^\infty {\rm{d}} u\,{e^{- 2u}}{e^{- u\Pi _a^{a\prime}}}{F_{a\prime b\prime}}{e^{- u\Pi _b^{b\prime}}}$$
(8.53)
$$= \sum\limits_{n,m} {{{(n + m)!} \over {{2^{1 + n + m}}n!m!}}} {(- 1)^{n + m}}{({\Pi ^n}\,F\,{\Pi ^m})_{ab}}\,.$$
(8.54)

In some special cases these sets of interactions can be resummed exactly, as was first performed in [139] (see also Refs. [364, 456]).
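The equivalence between the integral representation (8.53) and the double sum (8.54) can be checked numerically. The sketch below is only an illustration under stated assumptions: Π and F are treated as plain 4×4 matrices with Euclidean index contractions, the sample entries are arbitrary, and Π is taken small enough for the series to converge. It uses the fact that the integral in (8.53), when it converges, solves the linear matrix equation (1 + Π)ω + ω(1 + Π) = F, and verifies that the truncated series satisfies that equation.

```python
from math import factorial

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def omega_series(Pi, F, order):
    """Double sum (8.54), truncated to terms with n + m <= order."""
    dim = len(Pi)
    powers = [identity(dim)]
    for _ in range(order):
        powers.append(matmul(powers[-1], Pi))
    omega = [[0.0] * dim for _ in range(dim)]
    for n in range(order + 1):
        for m in range(order + 1 - n):
            k = n + m
            coeff = (-1) ** k * factorial(k) / (2 ** (k + 1) * factorial(n) * factorial(m))
            term = matmul(matmul(powers[n], F), powers[m])
            omega = matadd(omega, scale(coeff, term))
    return omega

def sylvester_residual(Pi, F, omega):
    """Largest entry of (1 + Pi) omega + omega (1 + Pi) - F."""
    A = matadd(identity(len(Pi)), Pi)
    lhs = matadd(matmul(A, omega), matmul(omega, A))
    return max(abs(l - f) for rl, rf in zip(lhs, F) for l, f in zip(rl, rf))

# Arbitrary sample data (assumption): small symmetric Pi, antisymmetric F.
Pi = [[0.10, 0.02, 0.00, 0.01],
      [0.02, -0.05, 0.03, 0.00],
      [0.00, 0.03, 0.08, -0.02],
      [0.01, 0.00, -0.02, 0.04]]
F = [[0.0, 1.0, -0.5, 0.2],
     [-1.0, 0.0, 0.3, 0.7],
     [0.5, -0.3, 0.0, -0.4],
     [-0.2, -0.7, 0.4, 0.0]]

omega = omega_series(Pi, F, order=30)
residual = sylvester_residual(Pi, F, omega)
print("max residual of (1+Pi) w + w (1+Pi) - F:", residual)
```

At lowest order the sum reduces to ω ≈ F/2, and for symmetric Π and antisymmetric F the result stays antisymmetric, as required of ωab.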

This decoupling limit includes non-linear combinations of the second-derivative tensor Πμν and the first-derivative Maxwell tensor Fμν. Nevertheless, the structure of the interactions is gauge invariant for Aμ, and there are no higher derivatives in the equation of motion for A, so the equations of motion for both the helicity-1 and -2 modes are manifestly second order and propagate the correct degrees of freedom. The situation is more subtle for the helicity-0 mode. Taking the equation of motion for that field would lead to higher derivatives on π itself as well as on the helicity-1 field. Since this theory has been proven to be ghost-free by different means (see Section 7), it must be that the higher derivatives in that equation are nothing else but the derivative of the equation of motion for the helicity-1 mode, similarly to what happens in Section 7.2.

When working beyond the decoupling limit, even the equation of motion with respect to the helicity-1 mode is no longer manifestly well-behaved, but the Stückelberg fields are then no longer the correct representation of the physical degrees of freedom. As we shall see below, the proper number of degrees of freedom is nonetheless maintained when working beyond the decoupling limit.

Beyond the decoupling limit

Physical degrees of freedom

In Section 8.3, we have introduced four Stückelberg fields ϕa which transform as scalar fields under coordinate transformation, so that the action of massive gravity is invariant under coordinate transformations. Furthermore, the action is also invariant under global Lorentz transformations in the field space,

$${x^\mu} \rightarrow {x^\mu}\,,\qquad {g_{\mu \nu}} \rightarrow {g_{\mu \nu}}\,,\quad {\rm{and}}\quad {\phi ^a} \rightarrow \tilde \Lambda _{\,b}^a{\phi ^b}\,.$$
(8.55)

In the DL, taking MPl → ∞, all fields are living on flat space-time, so in that limit, there is an additional global Lorentz symmetry acting this time on the space-time,

$${x^\mu} \rightarrow \bar \Lambda _\nu ^\mu \,{x^\nu}\,,\qquad {h_{\mu \nu}} \rightarrow \bar \Lambda _{\,\mu}^\alpha \bar \Lambda _{\,\nu}^\beta {h_{\alpha \beta}}\,,\quad {\rm{and}}\quad {\phi ^a} \rightarrow {\phi ^a}\,.$$
(8.56)

The internal and space-time Lorentz symmetries are independent (the internal one is always present while the space-time one is only there in the DL). In the DL we can identify both groups and work in the representation of the single group, so that the action is invariant under

$${x^\mu} \rightarrow \Lambda _\nu ^\mu \,{x^\nu}\,,\qquad {h_{\mu \nu}} \rightarrow \Lambda _{\,\mu}^\alpha \Lambda _{\,\nu}^\beta {h_{\alpha \beta}},\quad {\rm{and}}\quad {\phi ^a} \rightarrow \Lambda _{\,b}^a{\phi ^b}\,.$$
(8.57)

The Stückelberg fields ϕa then behave as Lorentz vectors under this identified group, and π defined previously behaves as a Lorentz scalar. The helicity-0 mode of the graviton also behaves as a scalar in this limit, and π captures the behavior of the graviton helicity-0 mode. So in the DL, the right requirement for the absence of the BD ghost is indeed the requirement that the equations of motion for π remain at most second order (in time) in derivatives, as was pointed out in [173] (see also [111]). However, beyond the DL, the helicity-0 mode of the graviton does not behave as a scalar field and neither does the π in the split of the Stückelberg fields. So beyond the DL there is no reason to anticipate that π captures a whole degree of freedom, and indeed it does not. Beyond the DL, the equation of motion for π will typically involve higher derivatives, but the correct requirement for the absence of ghost is different, as explained in Section 7.2. One should instead go back to the original four scalar Stückelberg fields ϕa and check that out of these four fields only three are dynamical. This has been shown to be the case in Section 7.2. These three degrees of freedom, together with the two standard graviton polarizations, then give the correct five degrees of freedom and circumvent the BD ghost.

Recently, much progress has been made in deriving the decoupling limit about arbitrary backgrounds, see Ref. [369].

Decoupling limit on (Anti) de Sitter

Linearized theory and Higuchi bound

Before deriving the decoupling limit of massive gravity on (Anti) de Sitter, we first need to analyze the linearized theory so as to infer the proper canonical normalization of the propagating dofs and the proper scaling in the decoupling limit, similarly to what was done for massive gravity with a flat reference metric. For simplicity we focus on (3 + 1) dimensions here, and when relevant give the result in arbitrary dimensions. Linearized massive gravity on (A)dS was first derived in [307, 308]. Since we are concerned with the decoupling limit of ghost-free massive gravity, we follow in this section the procedure presented in [154]. We also focus on the dS case first before commenting on the extension to AdS.

At the linearized level about dS, ghost-free massive gravity reduces to the Fierz-Pauli action with \({g_{\mu \nu}} = {\gamma _{\mu \nu}} + {\tilde h_{\mu \nu}} = {\gamma _{\mu \nu}} + {h_{\mu \nu}}/{M_{{\rm{Pl}}}}\), where γμν is the dS metric with constant Hubble parameter H0,

$${\mathcal L}_{{\rm{MG}},\,{\rm{dS}}}^{(2)} = - {1 \over 4}{h^{\mu \nu}}({{\mathcal{\hat E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{h_{\alpha \beta}} - {{{m^2}} \over 8}{\gamma ^{\mu \nu}}{\gamma ^{\alpha \beta}}({H_{\mu \alpha}}{H_{\nu \beta}} - {H_{\mu \nu}}{H_{\alpha \beta}})\,,$$
(8.58)

where Hμν is the tensor fluctuation as introduced in (2.80), although now considered about the dS metric,

$$\begin{array}{*{20}c} {{H_{\mu \nu}} = {h_{\mu \nu}} + 2{{{\nabla _{(\mu}}{A_{\nu)}}} \over m} + 2{{{\Pi _{\mu \nu}}} \over {{m^2}}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {- {1 \over {{M_{{\rm{Pl}}}}}}\left[ {{{{\nabla _\mu}{A_\alpha}} \over m} + {{{\Pi _{\mu \alpha}}} \over {{m^2}}}} \right]\left[ {{{{\nabla _\nu}{A_\beta}} \over m} + {{{\Pi _{\nu \beta}}} \over {{m^2}}}} \right]{\gamma ^{\alpha \beta}}\,,} \\ \end{array}$$
(8.59)

with \(\Pi_{\mu\nu} = \nabla_\mu \nabla_\nu \pi\), ∇ being the covariant derivative with respect to the dS metric γμν, and indices raised and lowered with respect to this same metric. Similarly, \({\hat{\mathcal E}_{{\rm{dS}}}}\) is now the Lichnerowicz operator on de Sitter,

$$\begin{array}{*{20}c} {({{\hat {\mathcal E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{h_{\alpha \beta}} = - {1 \over 2}\left[ {{\square h_{\mu \nu}} - 2{\nabla _{(\mu}}{\nabla _\alpha}h_{\nu)}^\alpha + {\nabla _\mu}{\nabla _\nu}h\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \right.} \\ {\left. {- {\gamma _{\mu \nu}}(\square h - {\nabla _\alpha}{\nabla _\beta}{h^{\alpha \beta}}) + 6H_0^2\left({{h_{\mu \nu}} - {1 \over 2}h{\gamma _{\mu \nu}}} \right)} \right]\,.} \\ \end{array}$$
(8.60)

So at the linearized level and neglecting the vector fields, the helicity-0 and -2 modes of massive gravity on dS behave as

$$\begin{array}{*{20}c} {{\mathcal L}_{{\rm{MG}},\,{\rm{dS}}}^{(2)} = - {1 \over 4}{h^{\mu \nu}}({{\hat {\mathcal E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{h_{\alpha \beta}} - {{{m^2}} \over 8}(h_{\mu \nu}^2 - {h^2}) - {1 \over 8}F_{\mu \nu}^2\quad \quad \quad \;} \\ {- {1 \over 2}{h^{\mu \nu}}({\Pi _{\mu \nu}} - [\Pi ]{\gamma _{\mu \nu}}) - {1 \over {2{m^2}}}([{\Pi ^2}] - {{[\Pi ]}^2})\,.} \\ \end{array}$$
(8.61)

After integration by parts, [Π²] = [Π]² − 3H²(∂π)². The helicity-2 and -0 modes are thus diagonalized as in flat space-time by setting \({h_{\mu \nu}} = {\bar h_{\mu \nu}} + \pi {\gamma _{\mu \nu}}\),

$$\begin{array}{*{20}c} {{\mathcal L}_{{\rm{MG}},\,{\rm{dS}}}^{(2)} = - {1 \over 4}{{\bar h}^{\mu \nu}}({{\hat {\mathcal E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{{\bar h}_{\alpha \beta}} - {{{m^2}} \over 8}(\bar h_{\mu \nu}^2 - {{\bar h}^2}) - {1 \over 8}F_{\mu \nu}^2\quad \quad \;} \\ {- {3 \over 4}\left({1 - 2{{\left({{H \over m}} \right)}^2}} \right)\left({{{(\partial \pi)}^2} - {m^2}\bar h\pi - 2{m^2}{\pi ^2}} \right)\,.} \\ \end{array}$$
(8.62)

The most important difference from linearized massive gravity on Minkowski is that the properly canonically normalized helicity-0 mode is now instead

$$\phi = \sqrt {1 - 2{{{H^2}} \over {{m^2}}}} \;\pi \,.$$
(8.63)

For a standard coupling of the form \({1 \over {{M_{{\rm{Pl}}}}}}\pi T\), where T is the trace of the stress-energy tensor, as we would infer from the coupling \({1 \over {{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}{T^{\mu \nu}}\) after the shift \({h_{\mu \nu}} = {\bar h_{\mu \nu}} + \pi {\gamma _{\mu \nu}}\), this means that the properly normalized helicity-0 mode couples as

$${\mathcal L}_{{\rm{helicity - 0}}}^{{\rm{matter}}} = {{{m^2}} \over {{M_{{\rm{Pl}}}}\sqrt {{m^2} - 2{H^2}}}}\phi T\,,$$
(8.64)

and that coupling vanishes in the massless limit. This might suggest that in the massless limit m → 0, the helicity-0 mode decouples, which would imply the absence of the standard vDVZ discontinuity on (Anti) de Sitter [358, 430], unlike what was found on Minkowski, see Section 2.2.3, which confirms the Newtonian approximation presented in [186].

While this observation is correct on AdS, in the dS case one cannot take the massless limit without simultaneously sending H → 0 at least at the same rate. As a result, it would be incorrect to deduce that the helicity-0 mode decouples in the massless limit of massive gravity on dS.

To be more precise, the linearized action (8.62) is free from ghosts and tachyons only if m ≡ 0, which corresponds to GR, or if m² > 2H², which corresponds to the well-known Higuchi bound [307, 190]. In d spacetime dimensions, the Higuchi bound is m² > (d − 2)H². In other words, on dS there is a forbidden range for the graviton mass: a theory with 0 < m² < 2H² or with m² < 0 always excites at least one ghost degree of freedom. Notice that this ghost (which we shall refer to as the Higuchi ghost from now on) is distinct from the BD ghost, which corresponded to an additional sixth degree of freedom. Here the theory propagates five dofs (in four dimensions) and is thus free from the BD ghost (at least at this level), but at least one of the five dofs is a ghost. When 0 < m² < 2H², the ghost is the helicity-0 mode, while for m² < 0, the ghost is the helicity-1 mode (at quadratic order the helicity-1 mode comes in as \(- {{{m^2}} \over 4}F_{\mu \nu}^2\)). Furthermore, when m² < 0, both the helicity-2 and -0 modes are also tachyonic, although this is arguably not necessarily a severe problem, especially not if the graviton mass is of the order of the Hubble parameter today, as it would take an amount of time comparable to the age of the Universe to see the effect of this tachyonic behavior. Finally, the case m² = 2H² (or m² = (d − 2)H² in d spacetime dimensions) represents the partially massless case where the helicity-0 mode disappears. As we shall see in Section 9.3, this is nothing other than a linear artefact and non-linearly the helicity-0 mode always reappears, so the PM case is infinitely strongly coupled and always pathological.

A summary of the different bounds is provided below as well as in Figure 4:

  • m² < 0: Helicity-1 modes are ghosts, helicity-2 and -0 modes are tachyonic, sick theory,

  • m² = 0: General Relativity: two healthy (helicity-2) degrees of freedom, healthy theory,

  • 0 < m² < 2H²: One “Higuchi ghost” (helicity-0 mode) and four healthy degrees of freedom (helicity-2 and -1 modes), sick theory,

  • m² = 2H²: Partially Massless Gravity: four healthy degrees of freedom (helicity-2 and -1 modes), and one infinitely strongly coupled dof (helicity-0 mode), sick theory,

  • m² > 2H²: Massive Gravity on dS: five healthy degrees of freedom, healthy theory.
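The five cases listed above can be encoded in a short lookup, shown here as a minimal sketch. The function names and return strings are hypothetical; the d-dimensional bound m² > (d − 2)H² is the one quoted in the text, and the kinetic prefactor 1 − 2H²/m² is the four-dimensional one read off from (8.62). Exact inputs (integers or fractions) are assumed so that the equality tests are meaningful, and only the dS case H² ≥ 0 is covered.

```python
def classify_ds_mass(m2, H2, d=4):
    """Classify linearized massive gravity on dS following the summary list.

    m2, H2: graviton mass squared and Hubble rate squared, in the same units.
    d: number of spacetime dimensions (Higuchi bound: m^2 > (d - 2) H^2).
    """
    if H2 < 0:
        raise ValueError("this sketch only covers de Sitter, H^2 >= 0")
    bound = (d - 2) * H2
    if m2 < 0:
        return "sick: helicity-1 ghost, tachyonic helicity-2 and -0"
    if m2 == 0:
        return "healthy: general relativity"
    if m2 < bound:
        return "sick: Higuchi ghost (helicity-0)"
    if m2 == bound:
        return "sick: partially massless, helicity-0 infinitely strongly coupled"
    return "healthy: massive gravity on dS"

def helicity0_kinetic_factor(m2, H2):
    """Prefactor 1 - 2 H^2 / m^2 of the helicity-0 kinetic term in (8.62), d = 4.

    Undefined for m2 == 0 (raises ZeroDivisionError)."""
    return 1 - 2 * H2 / m2

for m2 in (-1, 0, 1, 2, 3):
    print(m2, classify_ds_mass(m2, 1))
```

The sign of `helicity0_kinetic_factor` flips exactly at m² = 2H², which is where the helicity-0 mode changes from ghost to healthy in four dimensions.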

Figure 4

Degrees of freedom for massive gravity on a maximally symmetric reference metric. The only theoretically allowed regions are the upper left green region and the line m = 0 corresponding to GR.

Massless and decoupling limit
  • As one can see from Figure 4, in the case where H² < 0 (corresponding to massive gravity on AdS), one can take the massless limit m → 0 while keeping the AdS length scale fixed in that limit. In that limit, the helicity-0 mode decouples from external matter sources and there is no vDVZ discontinuity. Notice however that the helicity-0 mode is nevertheless still strongly coupled at a low energy scale.

    When considering the decoupling limit m → 0, MPl → ∞ of massive gravity on AdS, we have the choice on how we treat the scale H in that limit. Keeping the AdS length scale fixed in that limit could lead to an interesting phenomenology in its own right, but is yet to be explored in depth.

  • In the dS case, the Higuchi forbidden region prevents us from taking the massless limit while keeping the scale H fixed. As a result, the massless limit is only consistent if H → 0 simultaneously as m → 0 and we thus recover the vDVZ discontinuity at the linear level in that limit.

    When considering the decoupling limit m → 0, MPl → ∞ of massive gravity on dS, we also have to send H → 0. If H/m → 0 in that limit, we then recover the same decoupling limit as for massive gravity on Minkowski, and all the results of Section 8.3 apply. The case of interest is thus when the ratio H/m remains fixed in the decoupling limit.

Decoupling limit

When taking the decoupling limit of massive gravity on dS, there are two additional contributions to take into account:

  • First, as mentioned in Section 8.3.5, care needs to be applied to properly identify the helicity-0 mode on a curved background. In the case of (A)dS, the formalism was provided in Ref. [154] by embedding a d-dimensional de Sitter spacetime into a flat (d + 1)-dimensional spacetime where the standard Stückelberg trick could be applied. As a result the ‘covariant’ fluctuation defined in (2.80) and used in (8.59) needs to be generalized to (see Ref. [154] for details)

    $$\begin{array}{*{20}c} {{1 \over {{M_{{\rm{Pl}}}}}}{H_{\mu \nu}} = {1 \over {{M_{{\rm{Pl}}}}}}{h_{\mu \nu}} + {2 \over {\Lambda _3^3}}{\Pi _{\mu \nu}} - {1 \over {\Lambda _3^6}}\Pi _{\mu \nu}^2\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {1 \over {\Lambda _3^3}}{{{H^2}} \over {{m^2}}}\left({{{(\partial \pi)}^2}({\gamma _{\mu \nu}} - {2 \over {\Lambda _3^3}}{\Pi _{\mu \nu}}) - {1 \over {\Lambda _3^6}}{\Pi _{\mu \alpha}}{\Pi _{\nu \beta}}{\partial ^\alpha}\pi {\partial ^\beta}\pi} \right)} \\ {+ {H^2}{{{H^2}} \over {{m^2}}}{{{{(\partial \pi)}^4}} \over {\Lambda _3^3}} + \cdots \,.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
    (8.65)

    Any corrections in the third line vanish in the decoupling limit and can thus be ignored, but the corrections of order H2 in the second line lead to new non-trivial contributions.

  • Second, as already encountered at the linearized level, what were total derivatives in Minkowski (for instance the combination [Π²] − [Π]²) now lead to new contributions on de Sitter. After integration by parts, m⁻²([Π²] − [Π]²) = −3(H²/m²)(∂π)², as used in going from (8.61) to (8.62). This was the origin of the new kinetic structure for massive gravity on de Sitter and will have further effects in the decoupling limit when considering similar contributions from \({\mathcal L}_{3,4}(\Pi)\), where \({\mathcal L}_{3,4}\) are defined in (6.12, 6.13) or more explicitly in (6.17, 6.18).

Taking these two effects into account, we obtain the full decoupling limit for massive gravity on de Sitter,

$${\mathcal L}_{{\Lambda _3}}^{({\rm{dS}})} = {\mathcal L}_{{\Lambda _3}}^{(0)} + {{{H^2}} \over {{m^2}}}\sum\limits_{n = 2}^5 {{{{\lambda _n}} \over {\Lambda _3^{3(n - 1)}}}} {\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]\,,$$
(8.66)

where \({\mathcal L}_{{\Lambda _3}}^{(0)}\) is the full Lagrangian obtained in the decoupling limit in Minkowski and given in (8.52), and \({\mathcal L}_{{\rm{(Gal)}}}^{(n)}\) are the Galileon Lagrangians as encountered previously. Notice that while the ratio H/m remains fixed, this decoupling limit is taken with H, m → 0, so all the fields in (8.66) live on a Minkowski metric. The constant coefficients λn depend on the free parameters of the ghost-free theory of massive gravity; for the theory (6.3) with α1 = 0 and α2 = 1, we have

$${\lambda _2} = {3 \over 2}\,,\quad {\lambda _3} = {3 \over 4}(1 + 2{\alpha _3})\,,\quad {\lambda _4} = {1 \over 4}(- 1 + 6{\alpha _4})\,,\quad {\lambda _5} = - {3 \over {16}}({\alpha _3} + 4{\alpha _4})\,.$$
(8.67)
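As a quick illustration of (8.67), the coefficients λn can be evaluated exactly for given α3 and α4. This is a minimal sketch: the function name is hypothetical, and only the expressions quoted in (8.67) are used (for the theory with α1 = 0 and α2 = 1).

```python
from fractions import Fraction

def galileon_lambdas(alpha3, alpha4):
    """Coefficients lambda_2..lambda_5 of (8.67), as exact fractions."""
    a3, a4 = Fraction(alpha3), Fraction(alpha4)
    return {
        2: Fraction(3, 2),
        3: Fraction(3, 4) * (1 + 2 * a3),
        4: Fraction(1, 4) * (-1 + 6 * a4),
        5: Fraction(-3, 16) * (a3 + 4 * a4),
    }

# Example: alpha3 = -1, alpha4 = 1/4, for which alpha3 + 4 alpha4 = 0.
lam = galileon_lambdas(-1, Fraction(1, 4))
print(lam[3], lam[4], lam[5])  # -3/4 1/8 0
```

The quintic coefficient λ5 vanishes whenever α3 + 4α4 = 0, independently of H/m.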

At this point we may perform the same field redefinition (8.39) as in flat space and obtain the following semi-diagonalized decoupling limit,

$${\mathcal L}_{{\Lambda _3}}^{({\rm{dS}})} = - {1 \over 4}{h^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {{{\alpha _3} + 4{\alpha _4}} \over {8\Lambda _3^9}}{h^{\mu \nu}}X_{\mu \nu}^{(3)} + \sum\limits_{n = 2}^5 {{{{{\tilde c}_n}} \over {\Lambda _3^{3(n - 2)}}}} {\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]\,,$$
(8.68)

where the contributions from the helicity-1 modes are the same as the ones provided in (8.52), and the new coefficients \({\tilde c_n} = - {c_n}/4 + ({H^2}/{m^2}){\lambda _n}\) cancel identically for m² = 2H², α3 = −1 and α4 = −α3/4 = 1/4, as pointed out in [154]; the same result holds for bi-gravity, as pointed out in [301]. Interestingly, for these specific parameters, the helicity-0 mode loses its kinetic term, any self-mixing, and any mixing with the helicity-2 mode. Nevertheless, the mixing between the helicity-1 and -0 modes presented in (8.52) is still present. There is no choice of parameters which would allow one to remove the mixing with the helicity-1 mode and, as a result, the helicity-0 mode generically reappears through that mixing. The loss of its kinetic term implies that the field is infinitely strongly coupled on a configuration with zero vev for the helicity-1 mode, and the theory is thus ill-defined. This was confirmed in various independent studies, see Refs. [185, 147].

Λ3-decoupling limit of bi-gravity

We now proceed to derive the Λ3-decoupling limit of bi-gravity, and we will see how to recover the decoupling limit about any reference metric (including Minkowski and de Sitter) as special cases. As already seen in Section 8.3.4, the full DL is better formulated in the vielbein language, even though in that case Stückelberg fields ought to be introduced for both the broken diffeomorphisms and the broken local Lorentz symmetry. Yet this is a small price to pay to keep the action in a much simpler form. We thus proceed in the rest of this section by deriving the Λ3-decoupling limit of bi-gravity, starting from its vielbein formulation. We follow the derivation and formulation presented in [224]. As previously, we focus on (3 + 1)-spacetime dimensions, although the whole formalism is trivially generalizable to arbitrary dimensions.

We start with the action (5.43) for bi-gravity, with the interaction

$${{\mathcal L}_{g,f}} = {{M_{{\rm{Pl}}}^2{m^2}} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- g} \sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]]$$
(8.69)
$$\begin{array}{*{20}c} {= - {{M_{{\rm{Pl}}}^2{m^2}} \over 2}{\varepsilon _{abcd}}\int {\left[ {{{{\beta _0}} \over {4!}}{e^a} \wedge {e^b} \wedge {e^c} \wedge {e^d} + {{{\beta _1}} \over {3!}}{f^a} \wedge {e^b} \wedge {e^c} \wedge {e^d}} \right.} \quad \quad \quad \quad \quad \quad \quad} \\ {\left. {+ {{{\beta _2}} \over {2!2!}}{f^a} \wedge {f^b} \wedge {e^c} \wedge {e^d} + {{{\beta _3}} \over {3!}}{f^a} \wedge {f^b} \wedge {f^c} \wedge {e^d} + {{{\beta _4}} \over {4!}}{f^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \right]\,,} \\ \end{array}$$
(8.70)

where the relation between the α’s and the β’s is given in (6.28).

We now introduce Stückelberg fields ϕa = xa − χa for diffs and \(\Lambda _b^a\) for the local Lorentz transformations. In the case of massive gravity, there was no ambiguity in how to perform this ‘Stückelbergization’, but in the case of bi-gravity, one can ‘Stückelbergize’ either the metric fμν or the metric gμν. In other words, the broken diffs and local Lorentz symmetries can be restored by performing either one of the two replacements in (8.69),

$$f_\mu ^a \rightarrow \tilde f_\mu ^a = {\Lambda ^a}_bf_c^b(\phi (x))\,{\partial _\mu}{\phi ^c}\,.$$
(8.71)

or alternatively

$$e_\mu ^a \rightarrow \tilde e_\mu ^a = {\Lambda ^a}_be_c^b(\phi (x))\,{\partial _\mu}{\phi ^c}\,.$$
(8.72)

For now we stick to the first choice (8.71) but keep in mind that this freedom has deep consequences for the theory, and is at the origin of the duality presented in Section 10.7.

Since we are interested in the decoupling limit, we now perform the following splits, (see Ref. [419] for more details),

$$\begin{array}{*{20}c} {e_\mu ^a = \bar e_\mu ^a + {1 \over {2{M_{{\rm{Pl}}}}}}h_\mu ^a\,,\qquad f_\mu ^a = \bar e_\mu ^a + {1 \over {2{M_f}}}v_\mu ^a} \\ {{\Lambda ^a}_b = {e^{{{\hat \omega}^a}_{\;\;b}}} = {\delta ^a}_b + {{\hat \omega}^a}_{\;\;b} + {1 \over 2}{{\hat \omega}^a}_{\;\;c}{{\hat \omega}^c}_{\;\;b} + \cdots \quad \quad \quad} \\ {\hat \omega _{\;\;\;b}^a = {{\omega _{\;\;\;b}^a} \over {m{M_{{\rm{Pl}}}}}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {{\partial _\mu}{\phi ^a} = {\partial _\mu}\left({{x^a} + {{{A^a}} \over {m{M_{{\rm{Pl}}}}}} + {{{\partial ^a}\pi} \over {\Lambda _3^3}}} \right)\quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(8.73)

and perform the scaling or decoupling limit,

$${M_{{\rm{Pl}}}} \rightarrow \infty \,,\quad {M_f} \rightarrow \infty \,,\quad m \rightarrow 0$$
(8.74)

while keeping

$$\begin{array}{*{20}c} {{\Lambda _3} = {{({m^2}{M_{{\rm{Pl}}}})}^{{1 \over 3}}} \rightarrow {\rm{constant}}\,,\quad \,{M_{{\rm{Pl}}}}/{M_f} \rightarrow {\rm{constant}}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {{\rm{and}}\,\quad {\beta _n} \rightarrow {\rm{constant}}\,.} \\ \end{array}$$
(8.75)
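To get a feeling for the scaling (8.74, 8.75), the following sketch evaluates Λ3 = (m² MPl)^(1/3) and shows how MPl must grow when m is lowered at fixed Λ3. The numerical inputs are assumptions made for illustration only and are not taken from the text: a reduced Planck mass of about 2.435 × 10^27 eV and a graviton mass of order the present Hubble rate, about 1.5 × 10^-33 eV.

```python
HBAR_C_EV_M = 1.973269804e-7   # hbar * c in eV * m, used to convert energy to length
M_PL_EV = 2.435e27             # reduced Planck mass in eV (assumed convention)

def lambda3(m_ev, m_pl_ev=M_PL_EV):
    """Strong coupling scale Lambda_3 = (m^2 M_Pl)^(1/3), in eV."""
    return (m_ev ** 2 * m_pl_ev) ** (1.0 / 3.0)

def planck_mass_at_fixed_lambda3(m_ev, lambda3_ev):
    """M_Pl required to keep Lambda_3 fixed for a given m, as in (8.75)."""
    return lambda3_ev ** 3 / m_ev ** 2

m_today = 1.5e-33                       # eV, illustrative value of order H_0
L3 = lambda3(m_today)                   # eV
length_km = HBAR_C_EV_M / L3 / 1.0e3    # associated length scale in km

# Lowering m by a factor 10 at fixed Lambda_3 raises M_Pl by a factor 100.
ratio = planck_mass_at_fixed_lambda3(m_today / 10.0, L3) / M_PL_EV

print("Lambda_3 [eV]:", L3)
print("1/Lambda_3 [km]:", length_km)
print("M_Pl rescaling for m -> m/10:", ratio)
```

With these inputs the length scale associated with Λ3 comes out of order a thousand kilometres, and the decoupling limit corresponds to following the curve MPl = Λ3³/m² as m → 0.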

Before performing any change of variables (any diagonalization), in addition to the quadratic kinetic terms for h, v and A, there are three contributions to the decoupling limit of bi-gravity:

  1. Mixing of the helicity-0 mode with the helicity-1 mode Aμ, as derived in (8.52),

  2. Mixing of the helicity-0 mode with the helicity-2 mode \(h_\mu ^a\), as derived in (8.40),

  3. Mixing of the helicity-0 mode with the new helicity-2 mode \(v_\mu ^a\),

noticing that before field redefinitions, the helicity-0 mode does not self-interact (its self-interactions are constructed so as to be total derivatives).

As already explained in Section 8.3.6, the first contribution ❶ arising from the mixing between the helicity-0 and -1 modes is the same (in the decoupling limit) as what was obtained in Minkowski (and is independent of the coefficients βn or αn). This implies that they can be read off directly from the last three lines of (8.52). These contributions are the most complicated parts of the decoupling limit but remain unaffected by the dynamics of v, i.e., unaffected by the bi-gravity nature of the theory. This statement simply follows from scaling considerations. In the decoupling limit there cannot be any mixing between the helicity-1 mode and either of the two helicity-2 modes. As a result, the helicity-1 modes only mix with themselves and the helicity-0 mode. Hence, in the scaling limit (8.74, 8.75) the helicity-1 mode decouples from the massless spin-2 field.

Furthermore, the first line of (8.52), which corresponds to the dynamics of \(h_\mu ^a\) and the helicity-0 mode, is also unaffected by the bi-gravity nature of the theory. Hence, the second contribution ❷ is also the same as previously derived. As a result, the only new ingredient in bi-gravity is the mixing ❸ between the helicity-0 mode and the second helicity-2 mode \(v_\mu ^a\), given by a mixing of the form hμνXμν.

Unsurprisingly, these new contributions have the same form as ❷, with three distinctions: First, the way the coefficients enter in the expressions gets modified ever so slightly (β1 → β1/3 and β3 → 3β3). Second, in the mass term the space-time index of \(v_\mu ^a\) ought to be dressed with the Stückelberg field,

$$v_\mu ^a \rightarrow v_b^a{\partial _\mu}{\phi ^b} = v_b^a(\delta _\mu ^b + \Pi _\mu ^b/\Lambda _3^3)\,.$$
(8.76)

Finally, and most importantly, the helicity-2 field \(v_a^\mu\) (which enters in the mass term) is now a function of the ‘Stückelbergized’ coordinates ϕa, which in the decoupling limit means that for the mass term

$$v_b^a = v_b^a[{x^\mu} + {\partial ^\mu}\pi /\Lambda _3^3] \equiv v_b^a[\tilde x]\,.$$
(8.77)

These two effects do not need to be taken into account for the v that enters in its standard curvature term, as it is Lorentz and diffeomorphism invariant.

Taking these three considerations into account, one obtains the decoupling limit for bi-gravity,

$$\begin{array}{*{20}c} {{\mathcal L}_{{\Lambda _3}}^{({\rm{bi - gravity}})} = {\mathcal L}_{{\Lambda _3}}^{(0)} - {1 \over 4}{v^{\mu \nu}}[x]\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{v_{\alpha \beta}}[x]\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {- {1 \over 2}{{{M_{{\rm{Pl}}}}} \over {{M_f}}}{v^{\mu \beta}}[\tilde x]\left({\delta _\beta ^\nu + {{\Pi _\beta ^\nu} \over {\Lambda _3^3}}} \right)\sum\limits_{n = 0}^3 {{{{{\tilde \beta}_{n + 1}}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}[\Pi ]\,,} \\ \end{array}$$
(8.78)

with \({\tilde \beta _n} = {\beta _n}/[(4 - n)!\,(n - 1)!]\). Modulo the non-trivial dependence on the coordinate \(\tilde x = x + \partial \pi/\Lambda _3^3\), this is a remarkably simple decoupling limit for bi-gravity. Out of this decoupling limit we can re-derive all the decoupling limits found previously very elegantly.
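For concreteness, the combinatorial weights relating the rescaled coefficients entering (8.78) to the βn can be tabulated. The sketch below assumes that both factorials sit in the denominator, i.e. that the rescaled coefficient is βn divided by (4 − n)! (n − 1)!; this grouping is an interpretation of the expression above, and the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def beta_tilde_weight(n):
    """Weight w_n such that beta_tilde_n = w_n * beta_n, for n = 1..4.

    Assumes beta_tilde_n = beta_n / ((4 - n)! (n - 1)!)."""
    if not 1 <= n <= 4:
        raise ValueError("n must be between 1 and 4")
    return Fraction(1, factorial(4 - n) * factorial(n - 1))

weights = {n: beta_tilde_weight(n) for n in range(1, 5)}
print(weights)  # weights 1/6, 1/2, 1/2, 1/6 for n = 1, 2, 3, 4
```

Under this reading the weights are symmetric under n → 5 − n, and the sum over n = 0, …, 3 in (8.78) involves the four rescaled coefficients with index n + 1 = 1, …, 4.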

Notice as well the presence of a tadpole for v if β1 ≠ 0. When this tadpole vanishes (as well as the one for h), one can further take the limit Mf → ∞ keeping all the other β’s fixed as well as Λ3, and recover straight away the decoupling limit of massive gravity on Minkowski found in (8.52), with a free and fully decoupled massless spin-2 field.

In the presence of a cosmological constant for both metrics (and thus a tadpole in this framework), we can also take the limit Mf → ∞ and recover straight away the decoupling limit of massive gravity on (A)dS, as obtained in (8.66).

This illustrates the strength of this generic decoupling limit for bi-gravity (8.78). In principle we could even go further and derive the decoupling limit of massive gravity on an arbitrary reference metric as performed in [224]. To obtain a general reference metric we first need to add an external source for \(v_{\mu \nu}\) that generates a background for \({\bar V_{\mu \nu}} = ({M_f}/{M_{{\rm{Pl}}}}){\bar U_{\mu \nu}}\). The reference metric is thus expressed in the local inertial frame as

$${f_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_f}}}{\bar V_{\mu \nu}} + {1 \over {4M_f^2}}{\bar V_{\mu \alpha}}{\bar V_{\beta \nu}}{\eta ^{\alpha \beta}} + {1 \over {{M_f}}}{v_{\mu \nu}} + {\mathcal O}(M_f^{- 2})$$
(8.79)
$$= {\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}{\bar U_{\mu \nu}} + {1 \over {{M_f}}}{v_{\mu \nu}} + {\mathcal O}{({M_{{\rm{Pl}}}},{M_f})^{- 2}}\,.$$
(8.80)

The fact that the metric looks like a perturbation away from Minkowski is related to the fact that the curvature needs to scale as m2 in the decoupling limit in order to avoid the issues previously mentioned in the discussion of Section 8.2.3.

We can then perform the scaling limit Mf → ∞, while keeping the β’s and the scale \(\Lambda_3 = (M_{\rm Pl} m^2)^{1/3}\) fixed as well as the field \(v_{\mu \nu}\) and the fixed tensor \({\bar U_{\mu \nu}}\). The decoupling limit is then simply given by

$$\begin{array}{*{20}c} {{\mathcal L}_{{\Lambda _3}}^{({\rm{\bar U}})} = {\mathcal L}_{{\Lambda _3}}^{(0)} - {1 \over 2}{{\bar U}^{\mu \beta}}[\tilde x]\left({\delta _\beta ^\nu + {{\Pi _\beta ^\nu} \over {\Lambda _3^3}}} \right)\sum\limits_{n = 0}^3 {{{{{\tilde \beta}_{n + 1}}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}[\Pi ]} \\ {- {1 \over 4}{v^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{v_{\alpha \beta}}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(8.81)

where the helicity-2 field v fully decouples from the rest of the massive gravity sector on the first line, which carries the other helicity-2 field as well as the helicity-1 and -0 modes. Notice that the general metric \(\bar U\) has only an effect on the helicity-0 self-interactions, through the second term on the first line of (8.81) (just as observed for the decoupling limit on AdS). These new interactions are ghost-free and look like Galileons for conformally flat \({\bar U_{\mu \nu}} = \lambda {\eta _{\mu \nu}}\), with λ constant, but not in general. In particular, the interactions found in (8.81) would not be the covariant Galileons found in [166, 161, 157] (nor the ones found in [237]) for a generic metric.

Extensions of Ghost-free Massive Gravity

Massive gravity can be seen as a theory of a spin-2 field with the following free parameters in addition to the standard parameters of GR (e.g., the cosmological constant, etc…),

  • Reference metric fab,

  • Graviton mass m,

  • (d − 2) dimensionless parameters αn (or the β’s).

As natural extensions of massive gravity one can make any of these parameters dynamical. As already seen, the reference metric can be made dynamical, leading to bi-gravity, which in addition to the massive spin-2 field carries a massless one as well.

Another natural extension is to promote the graviton mass m, or any of the free parameters αn (or βn), to a function of a new dynamical variable, say an additional scalar field. In principle the mass and the parameters α can be thought of as potentials for an arbitrary number of scalar fields, m = m(ψj), αn = αn(ψj), and not necessarily the same fields for each one of them [320]. So long as these functions are pure potentials and hide no kinetic terms for any new degree of freedom, the constraint analysis performed in Section 7 will go relatively unaffected, and the theory remains free from the BD ghost. This was shown explicitly for the mass-varying theory [319, 315] (where the mass is promoted to a scalar function of a new single scalar field, m = m(ϕ), while the parameters α remain constant; see Footnote 23), as well as for a general massive scalar-tensor theory [320], and for the quasi-dilaton, which allows for different couplings between the spin-2 and the scalar field, motivated by scale invariance. We review these models below in Sections 9.1 and 9.2.

Alternatively, rather than considering the parameters m and αn as arbitrary, one may set them to values of special interest depending on the reference metric fμν. Rather than an ‘extension’ per se, these are special cases in the parameter space. The first obvious one is m = 0 (for arbitrary reference metric and parameters), for which one recovers the theory of GR (so long as the spin-2 field couples to matter in a covariant way to start with). Alternatively, one may also sit on the Higuchi bound (see Section 8.3.6), with the parameters m² = 2H², α3 = −1/3 and α4 = 1/12 in four dimensions. This corresponds to the partially massless theory of gravity, which at the moment is pathological in its simplest realization and will be reviewed below in Section 9.3.

The coupling of massive gravity to a DBI Galileon [157] was considered in [237, 461, 261], leading to a generalized Galileon theory which maintains a Galileon symmetry on curved backgrounds. This theory was shown to be free of any Ostrogradsky ghost in [19]; the cosmology was recently studied in [315] and perturbations in [20].

Finally, as other extensions to massive gravity, one can also consider all the extensions applicable to GR. This includes the higher order Lovelock invariants in dimensions greater than four, as well as promoting the Einstein-Hilbert kinetic term to a function f (R), which is equivalent to gravity with a scalar field. In the case of massive gravity this has been performed in [89] (see also [46, 354]), where the absence of BD ghost was proven via a constraint analysis, and the cosmology was explored (this was also discussed in Section 5.6 and see also Section 12.5). f (R) extensions to bi-gravity were also derived in [416, 415].

Trace-anomaly driven inflation in bi-gravity was also explored in Ref. [47]. Massless quantum effects can be taken into account by including the trace anomaly \({{\mathcal T}_A}\) given as [203]

$${\mathcal{T}_A} = {c_1}({1 \over 3}{R^2} - 2R_{\mu \nu}^2 + R_{\mu \nu \alpha \beta}^2 + {2 \over 3}\square R) + {c_2}({R^2} - 4R_{\mu \nu}^2 + R_{\mu \nu \alpha \beta}^2) + {c_3}\square R,$$
(9.1)

where c1,2,3 are three constants depending on the field content (for instance the number of scalars, spinors, vectors, gravitons, etc.). When this trace anomaly is included in bi-gravity, de Sitter-like solutions were found which could represent a good model for anomaly-driven inflation.
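As a reading aid (not part of the original presentation), the two curvature combinations in (9.1) can be recognized as standard four-dimensional invariants. Writing \(C_{\mu\nu\alpha\beta}\) for the Weyl tensor and \(\mathcal{G}\) for the Gauss-Bonnet (Euler) density, both symbols being notation introduced here, the anomaly takes the compact form

$$\begin{aligned} {\mathcal{T}_A} &= {c_1}\left(C_{\mu \nu \alpha \beta}^2 + {2 \over 3}\square R\right) + {c_2}\,\mathcal{G} + {c_3}\square R, \\ C_{\mu \nu \alpha \beta}^2 &= R_{\mu \nu \alpha \beta}^2 - 2R_{\mu \nu}^2 + {1 \over 3}{R^2}, \qquad \mathcal{G} = {R^2} - 4R_{\mu \nu}^2 + R_{\mu \nu \alpha \beta}^2 . \end{aligned}$$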

Mass-varying

The idea behind mass-varying gravity is to promote the graviton mass to a potential for an external scalar field ψ, \(m \to m(\psi)\), which has its own dynamics [319], so that in four dimensions, the dRGT action for massive gravity gets promoted to

$$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{Mass - Varying}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left({R + {{{m^2}(\psi)} \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} {\mathcal{L}_n}[\mathcal{K}]\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \right.} \\ {\left. {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\psi {\partial _\nu}\psi - W(\psi)} \right),} \end{array}$$
(9.2)

and the tensors \({\mathcal K}\) are given in (6.7). This could also be performed for bi-gravity, where we would simply include the Einstein-Hilbert term for the metric fμν. This formulation was then promoted not only to varying parameters \(\alpha_n \to \alpha_n(\psi)\) but also to multiple fields \(\psi_A\), with \(A = 1, \cdots, {\mathcal N}\), in [320],

$$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{Generalized}}\;{\rm{MG}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {\Omega ({\psi _A})R + {1 \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} ({\psi _A}){\mathcal{L}_n}[\mathcal{K}]} \right.} \\ {\left. {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}{\psi _A}{\partial _\nu}{\psi ^A} - W({\psi _A})} \right].} \end{array}$$
(9.3)

The proof of the absence of the BD ghost in these theories was performed in [319] and [320] in unitary gauge, in the ADM language, by means of a constraint analysis as formulated in Section 7.1. We recall that in the absence of the scalar field ψ, the primary second-class (Hamiltonian) constraint is given by

$${\mathcal{C}_0} = {\mathcal{R}_0}(\gamma ,p) + {D^i}_j{n^j}{\mathcal{R}_i}(\gamma ,p) + {m^2}{\mathcal{U}_0}(\gamma ,n(\gamma ,p)) \approx 0.$$
(9.4)

In the case of a mass-varying theory of gravity, the entire argument remains the same, with the simple addition of the scalar field contribution,

$$\begin{array}{*{20}c} {\mathcal{C}_0^{{\rm{mass - varying}}} = {{\tilde{\mathcal{R}}}_0}(\gamma ,p,\psi ,{p_\psi}) + {D^i}_j{n^j}{{\tilde{\mathcal{R}}}_i}(\gamma ,p,\psi ,{p_\psi}) + {m^2}(\psi){\mathcal{U}_0}(\gamma ,n(\gamma ,p))} \\ {\approx 0,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \end{array}$$
(9.5)

where pψ is the conjugate momentum associated with the scalar field ψ and

$${\tilde{\mathcal{R}}_0}(\gamma ,p,\psi ,{p_\psi}) = {\mathcal{R}_0}(\gamma ,p) + {1 \over 2}\sqrt \gamma {\partial _i}\psi {\partial ^i}\psi + {1 \over {2\sqrt \gamma}}p_\psi ^2$$
(9.6)
$${\tilde{\mathcal{R}}_i}(\gamma ,p,\psi ,{p_\psi}) = {\mathcal{R}_i}(\gamma ,p) + {p_\psi}{\partial _i}\psi .$$
(9.7)

The time-evolution of this primary constraint then leads to a secondary constraint, similarly to Section 7.1. The expression for this secondary constraint is the same as in (7.33) with a benign new contribution from the scalar field [319]

$$\begin{array}{*{20}c} {{{\tilde{\mathcal{C}}}_2} = {\mathcal{C}_2} + {{\partial {m^2}(\psi)} \over {\partial \psi}}\left[ {{\mathcal{U}_0}{\partial _i}\psi (\bar{\mathcal{N}} {n^i} + {{\bar{\mathcal{N}}}^i}) + {{\bar{\mathcal{N}}} \over {\sqrt \gamma}}{\mathcal{U}_1}{p_\psi} + \bar{\mathcal{N}} {\partial _i}\psi {D^i}_k{n^k}} \right]} \\ {\approx 0.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \end{array}$$
(9.8)

Then, as in the normal fixed-mass case, the tertiary constraint is a constraint for the lapse and the system of constraints truncates, leading to 5+1 physical degrees of freedom in four dimensions. The same logic goes through for generalized massive gravity as explained in [320].
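The counting behind the "5+1" can be made explicit with a short bookkeeping sketch. It assumes, as in Section 7.1, that the shift is solved for through its own equations, that the lapse only enforces the constraint, and that the primary and secondary constraints form a second-class pair which removes two phase-space dimensions:

$$\underbrace{6 + 6}_{({\gamma _{ij}},\,{p^{ij}})} + \underbrace{1 + 1}_{(\psi ,\,{p_\psi})} - \underbrace{2}_{{\mathcal{C}_0},\;{{\tilde{\mathcal{C}}}_2}} = 12 = 2 \times (5 + 1).$$

Without the scalar field the same counting gives 12 − 2 = 10 = 2 × 5, i.e., the five degrees of freedom of a massive graviton.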

One of the important aspects of a mass-varying theory of massive gravity is that it allows more flexibility for the graviton mass. In the past the mass could have been much larger and could have led to potentially interesting features, be it for inflation (see for instance Refs. [315, 378] and [282]), the Hartle-Hawking no-boundary proposal [498, 439, 499], or to avoid the Higuchi bound [307], and yet be compatible with current bounds on the graviton mass. If the graviton mass is an effective description from higher dimensions it is also quite natural to imagine that the graviton mass would depend on some moduli.

Quasi-dilaton

The Planck scale MPl, or Newton constant, explicitly breaks scale invariance, but one can easily extend the theory of GR to a scale-invariant one, \(M_{\rm Pl} \to M_{\rm Pl}\, e^{\lambda(x)}\), by including a dilaton scalar field λ which naturally arises from string theory or from extra dimension compactification (see for instance [122] and see Refs. [429, 120, 248] for the role of a dilaton scalar field on cosmology).

When dealing with multi-gravity, one can extend the notion of conformal transformation to the global rescaling of the coordinate system of one metric with respect to that of another metric. In the case of massive gravity this amounts to considering the global rescaling of the reference coordinates with respect to the physical ones. As already seen, the reference metric can be promoted to a tensor with respect to transformations of the physical metric coordinates, by introducing four Stückelberg fields \(\phi^a\), \(f_{\mu\nu} \to f_{ab}\,\partial_\mu \phi^a \partial_\nu \phi^b\). Thus the theory can be made invariant under global rescaling of the reference metric if the reference metric is promoted to a function of the quasi-dilaton scalar field σ,

$${f_{ab}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b} \rightarrow {e^{2\sigma /{M_{{\rm{Pl}}}}}}{f_{ab}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}.$$
(9.9)

This is the idea behind the quasi-dilaton theory of massive gravity proposed in Ref. [119]. The theoretical consistency of this model was explored in [119] and is reviewed below. The Vainshtein mechanism and the cosmology were also explored in [119, 118] as well as in Refs. [288, 243, 127] and we review the cosmology in Section 12.5. As we shall see in that section, one of the interests of quasi-dilaton massive gravity is the existence of spatially flat FLRW solutions, and particularly of self-accelerating solutions. Nevertheless, such solutions have been shown to be strongly coupled within the region of interest [118], but an extension of that model was proposed in [127] and shown to be free from such issues.

Recently, the decoupling limit of the original quasi-dilaton model was derived in [239]. Interestingly, a new self-accelerating solution was found in this model which admits no instability and all the modes are (sub)luminal for a given realistic set of parameters. The extension of this solution to the full theory (beyond the decoupling limit) should provide for a consistent self-accelerating solution which is guaranteed to be stable (or with a harmless instability time scale of the order of the age of the Universe at least).

Theory

As already mentioned, the idea behind quasi-dilaton massive gravity (QMG) is to extend massive gravity to a theory which admits a new global symmetry. This is possible via the introduction of a quasi-dilaton scalar field σ (x). The action for QMG is thus given by

$$\begin{array}{*{20}c} {{S_{{\rm{QMG}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {R - {\omega \over {2M_{{\rm{Pl}}}^2}}{{(\partial \sigma)}^2} + {{{m^2}} \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} {\mathcal{L}_n}[\tilde{\mathcal{K}}[g,\eta ]]} \right]} \\ {+ \int {{{\rm{d}}^4}} x\sqrt {- g} {\mathcal{L}_{{\rm{matter}}}}(g,\psi),} \end{array}$$
(9.10)

where ψ represents the matter fields, g is the dynamical metric, and unless specified otherwise all indices are raised and lowered with respect to g, and R represents the scalar curvature with respect to g. The Lagrangians \(\mathcal{L}_n\) were expressed in (6.9)–(6.13) or (6.14)–(6.18) and the tensor \(\tilde{\mathcal K}\) is given in terms of the Stückelberg fields as

$${\tilde{\mathcal{K}}^\mu}_\nu [g,\eta ] = {\delta ^\mu}_\nu - {e^{\sigma /{M_{{\rm{Pl}}}}}}\sqrt {{g^{\mu \alpha}}{\partial _\alpha}{\phi ^a}{\partial _\nu}{\phi ^b}{\eta _{ab}}} .$$
(9.11)

In the case of the QMG presented in [119], there is no cosmological constant nor tadpole (α0 = α1 = 0) and α2 = 1. This is a very special case of the generalized theory of massive gravity presented in [320], and the proof for the absence of BD ghost thus goes through in the same way. Here again the presence of the scalar field brings only minor modifications to the Hamiltonian analysis in the ADM language as presented in Section 9.1, and so we do not reproduce the proof here. We simply note that the theory propagates six degrees of freedom in four dimensions and is manifestly free of any ghost on flat space time provided that ω > 1/6. The key ingredient compared to mass-varying gravity or generalized massive gravity is the presence of a global rescaling symmetry which is both a space-time and internal transformation [119],

$${x^\mu} \rightarrow {e^\xi}{x^\mu},\quad {g_{\mu \nu}} \rightarrow {e^{- 2\xi}}{g_{\mu \nu}},\quad \sigma \rightarrow \sigma - {M_{{\rm{Pl}}}}\xi ,\quad {\rm{and}}\quad {\phi ^a} \rightarrow {e^\xi}{\phi ^a}.$$
(9.12)

Notice that the matter action \(\int {{\rm{d}}^4}x\sqrt {- g}\, {\mathcal L}(g,\psi)\) breaks this symmetry, which is why σ is called a ‘quasi-dilaton’.

An interesting feature of QMG is the fact that the decoupling limit leads to a bi-Galileon theory, one Galileon being the helicity-0 mode presented in Section 8.3, and the other Galileon being the quasi-dilaton σ. Just as in massive gravity, there are no irrelevant operators arising at energy scale below Λ3, and at that scale the theory is given by

$$\mathcal{L}_{{\Lambda _3}}^{({\rm{QMG}})} = \mathcal{L}_{{\Lambda _3}}^{(0)} - {\omega \over 2}{(\partial \sigma)^2} + {1 \over 2}\sigma \sum\limits_{n = 1}^4 {{{(4 - n){\alpha _n} - (n + 1){\alpha _{n + 1}}} \over {\Lambda _3^{3(n - 1)}}}} {\mathcal{L}_n}[\Pi ],$$
(9.13)

where the decoupling limit Lagrangian \({\mathcal L}_{{\Lambda _3}}^{(0)}\) in the absence of the quasi-dilaton is given in (8.52) and we recall that \({\alpha _2} = 1,\,{\alpha _1} = 0,\,\Pi^\mu_{\ \nu} = {\partial ^\mu}{\partial _\nu}\pi\) and the Lagrangians \(\mathcal{L}_n\) are expressed in (6.10)–(6.13) or (6.15)–(6.18). We see a bi-Galileon theory emerging for π and σ, and thus the decoupling limit is manifestly ghost-free. We could then apply a similar argument as in Section 7.2.4 to infer the absence of BD ghost for the full theory based on this decoupling limit. Up to integration by parts, the Lagrangian (9.13) is invariant under both independent Galilean transformations \(\pi \to \pi + c + {\upsilon _\mu}{x^\mu}\) and \(\sigma \to \sigma + \tilde c + {\tilde \upsilon _\mu}{x^\mu}\).

One relevant aspect of this decoupling limit is that it makes the study of the Vainshtein mechanism more explicit. As we shall see in what follows (see Section 10.1), the Galileon interactions are crucial for the Vainshtein mechanism to work.

Note that in (9.13), the interactions with the quasi-dilaton come in the combination \((4 - n)\alpha_n - (n + 1)\alpha_{n+1}\), while in \({\mathcal L}_{{\Lambda _3}}^{(0)}\), the interactions between the helicity-0 and -2 modes come in the combination \((4 - n)\alpha_n + (n + 1)\alpha_{n+1}\). This implies that in massive gravity, the interactions between the helicity-2 and -0 modes disappear in the special case where \(\alpha_n = -\frac{n + 1}{4 - n}\,\alpha_{n+1}\) (this corresponds to the minimal model), and the Vainshtein mechanism is no longer active for spherically symmetric sources (see Refs. [99, 56, 58, 57, 435]). In the case of QMG, the interactions with the quasi-dilaton survive in that specific case \(\alpha_3 = -4\alpha_4\), and a Vainshtein mechanism could still be feasible, although one might still need to consider non-asymptotically Minkowski configurations.
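The two coefficient combinations can be tabulated side by side as a quick check. The sketch below is illustrative only: it takes α1 = 0 and α2 = 1 from the text, obtains α3 = −2/3 and α4 = 1/6 by applying the quoted relation for n = 2 and n = 3, and assumes α5 = 0 because the sums in the action stop at n = 4. The function name `decoupling_combinations` is hypothetical.

```python
from fractions import Fraction


def decoupling_combinations(alpha):
    """Return {n: (helicity_combo, quasi_dilaton_combo)} for n = 1..4.

    helicity_combo      = (4 - n) * alpha_n + (n + 1) * alpha_{n+1}
    quasi_dilaton_combo = (4 - n) * alpha_n - (n + 1) * alpha_{n+1}
    Missing entries (here alpha_5) are treated as zero, which is an assumption.
    """
    rows = {}
    for n in range(1, 5):
        a_n = alpha.get(n, Fraction(0))
        a_next = alpha.get(n + 1, Fraction(0))
        rows[n] = ((4 - n) * a_n + (n + 1) * a_next,
                   (4 - n) * a_n - (n + 1) * a_next)
    return rows


# Parameter choice discussed in the text: alpha_1 = 0, alpha_2 = 1, and
# alpha_n = -(n + 1)/(4 - n) * alpha_{n+1} imposed for n = 2 and n = 3.
alpha_special = {
    1: Fraction(0),
    2: Fraction(1),
    3: Fraction(-2, 3),
    4: Fraction(1, 6),
}

combos = decoupling_combinations(alpha_special)
for n, (helicity, quasi_dilaton) in combos.items():
    print(f"n={n}: helicity-0/-2 combo = {helicity}, quasi-dilaton combo = {quasi_dilaton}")
```

For this choice the helicity-0/helicity-2 combinations vanish at n = 2 and n = 3 while the quasi-dilaton combinations do not, which is the statement made in the paragraph above.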

The cosmology of QMG was first discussed in [119] where the existence of self-accelerating solutions was pointed out. This will be reviewed in the section on cosmology, see Section 12.5. We now turn to the extended version of QMG recently proposed in Ref. [127].

Extended quasi-dilaton

Keeping the same philosophy as the quasi-dilaton in mind, a simple yet powerful extension was proposed in Ref. [127] and then further extended in [126], leading to interesting phenomenology and stable self-accelerating solutions. The phenomenology of this model was then further explored in [45]. The stability of the extended quasi-dilaton theory of massive gravity was explored in [353] and the theory was proven to be ghost-free in [406].

The key ingredient behind the extended quasi-dilaton theory of massive gravity (EMG) is to notice that the two most important properties of QMG, namely the absence of BD ghost and the existence of a global scaling symmetry, are preserved if the covariantized reference metric is further generalized to include a disformal contribution of the form \(\partial_\mu \sigma\, \partial_\nu \sigma\) (such a contribution to the reference metric can arise naturally from the brane-bending mode in higher dimensional braneworld models, see for instance [157]).

The action for EMG then takes the same form as in (9.10) with the tensor \(\tilde {\mathcal K}\), promoted to

$$\tilde{\mathcal{K}} \rightarrow \bar{\mathcal{K}} = \mathbb{I}- {e^{\sigma /{M_{{\rm{Pl}}}}}}\sqrt {{g^{- 1}}\bar f} ,$$
(9.14)

with the tensor \({\bar f_{\mu \nu}}\) defined as

$${\bar f_{\mu \nu}} = {\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}{\eta _{ab}} - {{{\alpha _\sigma}} \over {{M_{{\rm{Pl}}}}\Lambda _3^3}}{e^{- 2\sigma /{M_{{\rm{Pl}}}}}}{\partial _\mu}\sigma {\partial _\nu}\sigma ,$$
(9.15)

where ασ is a new dimensionless coupling constant (as mentioned in [127], this coupling constant is expected to enjoy a non-renormalization theorem in the decoupling limit, and thus to receive quantum corrections which are always suppressed by at least \({m^2}/\Lambda _3^2\)). Furthermore, this action can be generalized further by

  • Considering different coupling constants for the \(\tilde {\mathcal K}\)’s entering in \({{\mathcal L}_2}[\tilde {\mathcal K}]\), \({{\mathcal L}_3}[\tilde {\mathcal K}]\) and \({{\mathcal L}_4}[\tilde {\mathcal K}]\).

  • Introducing what would be a cosmological constant for the metric \(\bar f\), namely a new term of the form \(\sqrt {- \bar f} {e^{4\sigma/{M_{{\rm{Pl}}}}}}\).

  • General shift-symmetric Horndeski Lagrangians for the quasi-dilaton.

With these further generalizations, one can obtain self-accelerating solutions similarly to the original QMG. For these self-accelerating solutions, the coupling constant ασ does not enter the background equations of motion but plays a crucial role for the stability of the scalar perturbations on top of these solutions. This is one of the benefits of this extended quasi-dilaton theory of massive gravity.

Partially massless

Motivations behind PM gravity

The multiple proofs for the absence of BD ghost presented in Section 7 ensure that the ghost-free theory of massive gravity (or dRGT) does not propagate more than five physical degrees of freedom in the graviton. For a generic finite mass m the theory propagates exactly five degrees of freedom, as can be shown from a linear analysis about a generic background. Yet, one can ask whether there exist special points in parameter space where some of the degrees of freedom decouple. General relativity, for which m = 0 (and the other parameters αn are finite), is one such example. In the massless limit of massive gravity the two helicity-1 modes and the helicity-0 mode decouple from the helicity-2 mode and we thus recover the theory of a massless spin-2 field corresponding to GR, and three decoupled degrees of freedom. The decoupling of the helicity-0 mode occurs via the Vainshtein mechanism, as we shall see in Section 10.1.

As seen in Section 8.3.6, when considering massive gravity on de Sitter as a reference metric, if the graviton mass is precisely \(m^2 = 2H^2\), the helicity-0 mode disappears linearly as can be seen from the linearized Lagrangian (8.62). The same occurs in any dimension when the graviton mass is tied to the de Sitter curvature by the relation \(m^2 = (d - 2)H^2\). This special case is another point in parameter space where the helicity-0 mode could be decoupled, corresponding to a partially massless (PM) theory of gravity as first pointed out by Deser and Waldron [190, 189, 188] (see also [500] for partially massless higher spin, and [450] for related studies).

The absence of the helicity-0 mode at the linearized level in PM is tied to the existence of a new scalar gauge symmetry at the linearized level when \(m^2 = 2H^2\) (or \((d - 2)H^2\) in arbitrary dimensions), which is responsible for making the helicity-0 mode unphysical. Indeed the action (8.62) is invariant under a special combination of a linearized diff and a conformal transformation [190, 189, 188],

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + {\nabla _\mu}{\nabla _\nu}\xi - (d - 2){H^2}\xi {\gamma _{\mu \nu}}.$$
(9.16)

If a non-linear completion of PM gravity exists, then there must exist a non-linear completion of this symmetry which eliminates the helicity-0 mode to all orders. The existence of such a symmetry would lead to several outstanding features:

  • It would protect the structure of the potential.

  • In the PM limit of massive gravity, the helicity-0 mode fully decouples from the helicity-2 mode and hence from external matter. As a consequence, there is no Vainshtein mechanism that decouples the helicity-0 mode in the PM limit of massive gravity, unlike in the massless limit. Rather, the helicity-0 mode simply decouples without invoking any strong coupling effects and the theoretical and observational baggage that goes with them.

  • Last but not least, in PM gravity the symmetry underlying the theory is not diffeomorphism invariance but rather the one pointed out in (9.16). This means that in PM gravity, an arbitrary cosmological constant does not satisfy the symmetry (unlike in GR). Rather, the value of the cosmological constant is fixed by the gauge symmetry and is proportional to the graviton mass. As we shall see in Section 10.3 the graviton mass does not receive large quantum corrections (it is technically natural to set it to small values). So, if a PM theory of gravity existed it would have the potential to tackle the cosmological constant problem.

Crucially, breaking of covariance implies that matter is no longer covariantly conserved. Instead the failure of energy conservation is proportional to the graviton mass,

$${\nabla _\mu}{\nabla _\nu}{T^{\mu \nu}} = - {{{m^2}} \over {d - 2}}T,$$
(9.17)

which in practice is extremely small.

It is worth emphasizing that if a PM theory of gravity existed, it would be distinct from the minimal model of massive gravity where the non-linear interactions between the helicity-0 and -2 modes vanish in the decoupling limit but the helicity-0 mode is still fully present. PM gravity is also distinct from some specific branches of solutions found in cosmology (see Section 12) on top of which the helicity-0 mode disappears. If a PM theory of gravity existed, the helicity-0 mode would be fully absent from the whole theory and not only from some specific branches of solutions.

The search for a PM theory of gravity

A candidate for PM gravity:

The previous considerations represent some strong motivations for finding a fully fledged theory of PM gravity (i.e., beyond the linearized theory) and there have been many studies aiming to find a non-linear realization of the PM symmetry. So far all these studies have in common that they keep the kinetic term for gravity unchanged (i.e., keeping the standard Einstein-Hilbert action, with a potential generalization to the Lovelock invariants [298]).

Under this assumption, it was shown in [501, 330], that while the linear level theory admits a symmetry in any dimensions, at the cubic level the PM symmetry only exists in d = 4 spacetime dimensions, which could make the theory even more attractive. It was also pointed out in [191] that in four dimensions the theory is conformally invariant. Interestingly, the restriction to four dimensions can be lifted in bi-gravity by including the Lovelock invariants [298].

From the analysis in Section 8.3.6 (see Ref. [154]) one can see that the helicity-0 mode entirely disappears from the decoupling limit of ghost-free massive gravity, if one ignores the vectors and sets the parameters of the theory to \(m^2 = 2H^2\), α3 = −1 and α4 = 1/4 in four dimensions. The ghost-free theory of massive gravity with these parameters is thus a natural candidate for the PM theory of gravity. Following this analysis, it was also shown that bi-gravity with the same parameters for the interactions between the two metrics satisfies similar properties [301]. Furthermore, it was also shown in [147] that the potential has to follow the same structure as that of ghost-free massive gravity to have a chance of being an acceptable candidate for PM gravity. In bi-gravity the same parameters as for massive gravity were considered as also being the natural candidate [301], in addition of course to other parameters that vanish in the massive gravity limit (to make a fair comparison one needs to take the massive gravity limit of bi-gravity with care, as was shown in [301]).

Re-appearance of the Helicity-0 mode:

Unfortunately, when analyzing the interactions with the vector fields, it is clear from the decoupling limit (8.52) that the helicity-0 mode reappears non-linearly through its couplings with the vector fields. These never cancel, not even in four dimensions, for any choice of the parameters of the theory. So rather than being free from the helicity-0 mode, massive gravity with \(m^2 = (d - 2)H^2\) has an infinitely strongly coupled helicity-0 mode and is thus a sick theory. The absence of the helicity-0 mode is simply an artefact of the linear theory.

As a result we can thus deduce that there is no theory of PM gravity. This result is consistent with many independent studies performed in the literature (see Refs. [185, 147, 181, 194]).

Relaxing the assumptions:

  • One assumption behind this result is the form of the kinetic term for the helicity-2 mode, which is kept to be the Einstein-Hilbert term as in GR. A few studies have considered a generalization of that kinetic term to diffeomorphism-breaking ones [231, 310]; however, further analyses [339, 153] have shown that such interactions always lead to ghosts non-perturbatively. See Section 5.6 for further details.

  • Another potential way out is to consider the embedding of PM within bi-gravity or multi-gravity. Since bi-gravity reduces to massive gravity and a decoupled massless spin-2 field in some limit, it is unclear how bi-gravity could evade the results obtained in massive gravity, but this approach has been explored in [301, 298, 299, 184]. A perturbative relation between bi-gravity and conformal gravity was derived at the level of the equations of motion in Ref. [299] (contrary to what was claimed in [184]).

  • The other assumptions are locality and Lorentz-invariance. It is well known that Lorentz-breaking theories of massive gravity can excite fewer than five degrees of freedom. This avenue is explored in Section 14.

To summarize, there is to date no known non-linear PM symmetry which could project out the helicity-0 mode of the graviton while keeping the helicity-2 mode massive in a local and Lorentz-invariant way.

Massive Gravity Field Theory

Vainshtein mechanism

As seen earlier, in four dimensions a massive spin-2 field has five degrees of freedom, and there is no special PM case of gravity where the helicity-0 mode is unphysical while the graviton remains massive (or at least there is to date no known such theory). The helicity-0 mode couples to matter already at the linear level and this additional coupling leads to an extra force which is at the origin of the vDVZ discontinuity seen in Section 2.2.3. In this section, we shall see how the non-linearities of the helicity-0 mode are responsible for a Vainshtein mechanism that screens the effect of this field in the vicinity of matter.

Since the Vainshtein mechanism relies strongly on non-linearities, explicit solutions are very hard to find. In most of the cases where the Vainshtein mechanism has been shown to work successfully, one assumes a static and spherically symmetric background source. Already in that case, consistent solutions which extrapolate from a well-behaved asymptotic behavior at infinity to a screened solution close to the source are difficult to obtain numerically [121] and were only recently unveiled [37, 39] in the case of non-linear Fierz-Pauli gravity.

This review on massive gravity cannot do justice to all the ongoing work dedicated to the study of the Vainshtein mechanism (also sometimes called ‘kinetic chameleon’ as it relies on the kinetic interactions for the helicity-0 mode). In what follows, we will give the general idea behind the Vainshtein mechanism starting from the decoupling limit of massive gravity and then show explicit solutions in the decoupling limit for static and spherically symmetric sources. Such an analysis is relevant for observational tests in the solar system as well as for other astrophysical tests (such as binary pulsar timing), which we shall explore in Section 11. We refer to the following review on the Vainshtein mechanism for further details, [35] as well as to the following work [160, 38, 99, 332, 36, 244, 40, 338, 321, 440, 316, 53, 376, 366, 407]. Recently, it was also shown that the Vainshtein mechanism works for bi-gravity, see Ref. [34].

We focus the rest of this section on the case of four space-time dimensions, although many of the results presented in what follows are well understood in arbitrary dimensions.

Effective coupling to matter

As already mentioned, the key ingredient behind the Vainshtein mechanism is the importance of interactions for the helicity-0 mode which we denote as π. From the decoupling limit analysis performed for massive gravity (see (8.52)) and bi-gravity (see (8.78)), we see that in some limit the helicity-0 mode π behaves as a scalar field, which enjoys a special global symmetry

$$\pi \rightarrow \pi + c + {v_\mu}{x^\mu},$$
(10.1)

and yet only carries two derivatives at the level of the equations of motion, (which as we have seen is another way to see the absence of BD ghost).

These types of interactions are very similar to the Galileon-type of interactions introduced by Nicolis, Rattazzi and Trincherini in Ref. [412] as a generalization of the decoupling limit of DGP. For simplicity we shall focus most of the discussion on the Vainshtein mechanism with Galileons as a special example, and then mention in Section 10.1.3 peculiarities that arise in the special case of massive gravity (see for instance Refs. [58, 57]).

We thus start with a cubic Galileon theory

$$\mathcal{L} = - {1 \over 2}{(\partial \pi)^2} - {1 \over {{\Lambda ^3}}}{(\partial \pi)^2}\square \pi + {1 \over {{M_{{\rm{Pl}}}}}}\pi T,$$
(10.2)

where \(T = T_\mu ^\mu\) is the trace of the stress-energy tensor of external sources, and Λ is the strong coupling scale of the theory. As seen earlier, in the case of massive gravity, \(\Lambda = \Lambda_3 = (m^2 M_{\rm Pl})^{1/3}\). This is actually precisely the way the helicity-0 mode enters in the decoupling limit of DGP [389] as seen in Section 4.2. It is in that very context that the Vainshtein mechanism was first shown to work explicitly [165].

The essence of the Vainshtein mechanism is that close to a source, the Galileon interactions dominate over the linear piece. We make use of this fact by splitting the source into a background contribution T0 and a perturbation δT. The background source T0 leads to a background profile π0 for the field, and the response to the fluctuation δT on top of this background is given by ϕ, so that the total field is expressed as

$$\pi = {\pi _0} + \phi .$$
(10.3)

For a sufficiently large source (or, as we shall see below, if T0 represents a static point-like source, then sufficiently close to the source), the non-linearities dominate and symbolically \(\partial^2 \pi_0 \gg \Lambda^3\).

We now follow the perturbations in the action (10.2) and notice that the background configuration π0 leads to a modified effective metric for the perturbations,

$${\mathcal{L}^{(2)}} = - {1 \over 2}{Z^{\mu \nu}}({\pi _0}){\partial _\mu}\phi {\partial _\nu}\phi + {1 \over {{M_{{\rm{Pl}}}}}}\phi \delta T,$$
(10.4)

up to second order in perturbations, with the new effective metric

$${Z^{\mu \nu}} = {\eta ^{\mu \nu}} + {2 \over {{\Lambda ^3}}}{X^{(1)\mu \nu}}({\Pi _0}),$$
(10.5)

where the tensor \(X^{(1)}\) is the same as that defined for massive gravity in (8.29) or in (8.34), so symbolically Z is of the form \(Z \sim 1 + {{{\partial ^2}{\pi _0}} \over {{\Lambda ^3}}}\). One can generalize the initial action (10.2) to an arbitrary set of Galileon interactions

$$\mathcal{L} = \pi \sum\limits_{n = 1}^4 {{{{c_{n + 1}}} \over {{\Lambda ^{3(n - 1)}}}}} {\mathcal{L}_n}[\Pi ],$$
(10.6)

with again \(\Pi_{\mu\nu} = \partial_\mu \partial_\nu \pi\) and where the scalars \(\mathcal{L}_n\) have been defined in (6.10)–(6.13). The effective metric would then be of the form

$${Z^{\mu \nu}}({\pi _0}) = \sum\limits_{n = 1}^4 {{{n(n + 1){c_n}} \over {{\Lambda ^{3(n - 1)}}}}} {X^{(n - 1)\mu \nu}}({\Pi _0}),$$
(10.7)

where all the tensors \(X_{\mu \nu}^{(n)}\) are defined in (8.28)–(8.32). Notice that \(\partial_\mu Z^{\mu\nu} = 0\) identically. For sufficiently large sources, the components of Z are large, symbolically, \(Z \sim (\partial^2 \pi_0/\Lambda^3)^n \gg 1\) for n ≥ 1.

Canonically normalizing the fluctuations in (10.4), we have symbolically,

$$\hat \phi = \sqrt Z \phi ,$$
(10.8)

assuming \(Z^{\mu\nu} \sim Z\,\eta^{\mu\nu}\), which is not generally the case. Nevertheless, this symbolic scaling is sufficient to get the essence of the idea. For a more explicit canonical normalization in specific configurations see Ref. [412]. As nicely explained in that reference, if \(Z^{\mu\nu}\) is conformally flat, one should not only scale the field \(\phi \to \hat \phi\) but also the space-like coordinates \(x \to \hat x\) so as to obtain a standard canonically normalized field in the new system, \(\int {{\rm{d}}^4}\hat x \left(- {1 \over 2}{({\partial _{\hat x}}\hat \phi)^2}\right)\). For now we stick to the simple normalization (10.8) as it is sufficient to see the essence of the Vainshtein mechanism. In terms of the canonically normalized field \(\hat \phi\), the perturbed action (10.4) is then

$${\mathcal{L}^{(2)}} = - {1 \over 2}{(\partial \hat \phi)^2} + {1 \over {{M_{{\rm{Pl}}}}\sqrt {Z({\pi _0})}}}\hat \phi \delta T,$$
(10.9)

which means that the coupling of the fluctuations to matter is medium dependent and can arise at a scale very different from the Planck scale. In particular, for a large background configuration, \(\partial^2 \pi_0 \gg \Lambda^3\) and \(Z(\pi_0) \gg 1\), so the effective coupling scale to external matter is

$${M_{{\rm{eff}}}} = {M_{{\rm{Pl}}}}\sqrt Z \gg {M_{{\rm{Pl}}}},$$
(10.10)

and the coupling to matter is thus very suppressed. In massive gravity Λ is related to the graviton mass, \(\Lambda \sim m^{2/3}\), and so the effective coupling scale \(M_{\rm eff} \to \infty\) as \(m \to 0\), which shows how the helicity-0 mode characterized by π decouples in the massless limit.

We now first review how the Vainshtein mechanism works more explicitly in a static and spherically symmetric configuration before applying it to other systems. Note that the Vainshtein mechanism relies on irrelevant operators. In a standard EFT this cannot be performed without going beyond the regime of validity of the EFT. In the context of Galileons and other very specific derivative theories, one can reorganize the EFT so that the operators considered can be large and yet remain within the regime of validity of the reorganized EFT. This will be discussed in more depth in what follows.

Static and spherically symmetric configurations in Galileons

Suppression of the force

We now consider a point-like source

$${T_0} = - M{\delta ^{(3)}}(r) = - M{{\delta (r)} \over {4\pi {r^2}}},$$
(10.11)

where M is the mass of the source localized at r = 0. Since the source is static and spherically symmetric, we can focus on configurations which respect the same symmetry, π0 = π0(r). The background configuration for the field π0(r) in the case of the cubic Galileon (10.2) satisfies the equation of motion [411]

$${1 \over {{r^2}}}{\partial _r}\left[ {{r^3}\left({{{\pi _0{\prime} (r)} \over r} + {1 \over {{\Lambda ^3}}}{{\left({{{\pi _0{\prime} (r)} \over r}} \right)}^2}} \right)} \right] = {M \over {4\pi {M_{{\rm{Pl}}}}}}{{\delta (r)} \over {{r^2}}},$$
(10.12)

and so integrating both sides of the equation, we obtain an algebraic equation for π0 (r),

$${{\pi _0{\prime} (r)} \over r} + {1 \over {{\Lambda ^3}}}{\left({{{\pi _0{\prime} (r)} \over r}} \right)^2} = {M \over {{M_{{\rm{Pl}}}}}}{1 \over {4\pi {r^3}}}.$$
(10.13)

We can define the Vainshtein or strong coupling radius r* as

$${r_{\ast}} = {1 \over \Lambda}{\left({{M \over {4\pi {M_{{\rm{Pl}}}}}}} \right)^{1/3}},$$
(10.14)

so that at large distances compared to that Vainshtein radius the linear term in (10.12) dominates while the interactions dominate at distances shorter than r*,

$$\begin{array}{*{20}c} {{\rm{for}}\ r \gg {r_{\ast}},\quad \pi _0{\prime} (r)\sim{M \over {4\pi {M_{{\rm{Pl}}}}}}{1 \over {{r^2}}}\quad \quad} \\ {{\rm{for}}\ r \ll {r_{\ast}},\quad \pi _0{\prime} (r)\sim{M \over {4\pi {M_{{\rm{Pl}}}}}}{1 \over {r_{\ast}^{3/2}{r^{1/2}}}}.} \end{array}$$
(10.15)
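The two regimes in (10.15) can be checked numerically. The sketch below is only an illustration: it works in units where Λ = 1, writes y = π0′(r)/(rΛ³) and x = r/r*, and uses the fact that (10.13) together with (10.14) becomes the quadratic y + y² = x⁻³. The function names are hypothetical and chosen for this example.

```python
import math

def pi0_prime_over_r(x):
    """Return y = pi0'(r) / (r * Lambda^3) as a function of x = r / r_star.

    y is the positive root of y + y**2 = x**-3, i.e. Eq. (10.13) rewritten
    with the Vainshtein radius (10.14).
    """
    s = 4.0 * x**-3
    # (sqrt(1 + s) - 1) / 2 written in rationalized form, which avoids
    # catastrophic cancellation when s is tiny (far from the source).
    return s / (2.0 * (math.sqrt(1.0 + s) + 1.0))

def linear_regime(x):
    """Large-distance behaviour: the linear term dominates, y ~ x**-3."""
    return x**-3

def screened_regime(x):
    """Short-distance behaviour: the interaction dominates, y ~ x**-1.5."""
    return x**-1.5

for x in (1e-4, 1e-2, 1.0, 1e2, 1e4):
    y = pi0_prime_over_r(x)
    print(f"r/r* = {x:7.0e}   exact = {y:.4e}   "
          f"linear = {linear_regime(x):.4e}   screened = {screened_regime(x):.4e}")
```

Multiplying y by rΛ³ recovers π0′(r), so y ∼ x⁻³ corresponds to π0′ ∝ 1/r² and y ∼ x⁻³ᐟ² to π0′ ∝ 1/√r, as in (10.15).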

So, at large distances r ≫ r*, one recovers a Newtonian inverse-square law for the force mediated by π, and that field mediates a force which is just as strong as standard gravity (i.e., as the force mediated by the usual helicity-2 modes of the graviton). On shorter distance scales, i.e., close to the localized source, the force mediated by the new field π is much smaller than the standard gravitational one,

$${{F_{r \ll {r_{\ast}}}^{(\pi)}} \over {{F_{{\rm{Newt}}}}}} \sim {\left({{r \over {{r_{\ast}}}}} \right)^{3/2}} \ll 1\quad {\rm{for}}\quad r \ll {r_ \star}.$$
(10.16)

In the case of the quartic Galileon (which typically arises in massive gravity), the force is even more suppressed and goes as

$${{F_{r \ll {r_{\ast}}}^{({\rm{quartic}}\;\;\pi)}} \over {{F_{{\rm{Newt}}}}}} \sim {\left({{r \over {{r_{\ast}}}}} \right)^2} \ll 1\quad {\rm{for}}\quad r \ll {r_ \star}.$$
(10.17)

For a graviton mass of the order of the Hubble parameter today, i.e., \(\Lambda \sim {(1000\;{\rm{km}})^{- 1}}\), then taking into account the mass of the Sun, the force at the position of the Earth is suppressed by 12 orders of magnitude compared to the standard Newtonian force in the case of the cubic Galileon and by 16 orders of magnitude in the quartic Galileon. This means that the extra force mediated by π is utterly negligible compared to the standard force of gravity and deviations from GR are extremely small.
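As a rough order-of-magnitude sketch of this estimate, one can evaluate (10.14), (10.16) and (10.17) for the Sun-Earth system. The numerical inputs below (solar mass, reduced Planck mass, Earth-Sun distance) are assumptions supplied for the illustration and are not taken from the text; all O(1) factors are dropped, so the result should only be expected to land within roughly an order of magnitude of the quoted figures.

```python
import math

# Assumed inputs (not from the text), in SI-like units.
M_SUN_KG = 1.989e30          # mass of the Sun
M_PL_REDUCED_KG = 4.341e-9   # reduced Planck mass, M_Pl^2 = 1/(8 pi G)
LAMBDA_INV_KM = 1.0e3        # Lambda ~ (1000 km)^-1, as stated in the text
EARTH_SUN_KM = 1.496e8       # Earth-Sun distance

def vainshtein_radius_km(mass_kg):
    """Eq. (10.14): r_* = Lambda^-1 * (M / (4 pi M_Pl))^(1/3)."""
    return LAMBDA_INV_KM * (mass_kg / (4.0 * math.pi * M_PL_REDUCED_KG)) ** (1.0 / 3.0)

r_star_km = vainshtein_radius_km(M_SUN_KG)
ratio = EARTH_SUN_KM / r_star_km

cubic_suppression = ratio ** 1.5    # Eq. (10.16)
quartic_suppression = ratio ** 2.0  # Eq. (10.17)

print(f"r_* (Sun)           ~ {r_star_km:.2e} km")
print(f"cubic suppression   ~ 10^{math.log10(cubic_suppression):.1f}")
print(f"quartic suppression ~ 10^{math.log10(quartic_suppression):.1f}")
```

The exact exponents depend on the convention used for MPl, on the precise value taken for Λ, and on the numerical prefactors omitted in (10.16) and (10.17).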

Considering the Earth-Moon system, the force mediated by π at the surface of the Moon is suppressed by 13 orders of magnitude compared to the Newtonian one in the cubic Galileon. While small, this is still not far off from the possible detectability from the lunar laser ranging space experiment [488], as will be discussed further in what follows. Note that in the quartic Galileon, that force is suppressed instead by 17 orders of magnitude and is there again entirely negligible.

When applying this naive estimate (10.16) to the Hulse-Taylor system for instance, we would infer a suppression of 15 orders of magnitude compared to the standard GR results. As we shall see in what follows this estimate breaks down when the time evolution is not negligible. These points will be discussed in the phenomenology Section 11, but before considering these aspects we review in what follows different aspects of massive gravity from a field theory perspective, emphasizing the regime of validity of the theory as well as the quantum corrections that arise in such a theory and the emergence of superluminal propagation.

Perturbations

We now consider perturbations riding on top of this background configuration for the Galileon field, \(\pi = {\pi _0}(r) + \phi ({x^\mu})\). As already derived in Section 10.1.1, the perturbations ϕ see the effective space-dependent metric given in (10.7). Focusing on the cubic Galileon for concreteness, the background solution for π0 is given by (10.13). In that case the effective metric is

$${Z^{\mu \nu}} = {\eta ^{\mu \nu}} + {4 \over {{\Lambda ^3}}}\left(\square{{\pi _0}{\eta ^{\mu \nu}} - {\partial ^\mu}{\partial ^\nu}{\pi _0}} \right)$$
(10.18)
$$\begin{array}{*{20}c} {{Z_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - \left({1 + {4 \over {{\Lambda ^3}}}\left({{{2\pi _0{\prime} (r)} \over r} + \pi _0{\prime}{\prime}(r)} \right)} \right)\;{\rm{d}}{t^2}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ \left({1 + {{8\pi _0{\prime} (r)} \over {r{\Lambda ^3}}}} \right)\;{\rm{d}}{r^2} + \left({1 + {4 \over {{\Lambda ^3}}}\left({{{\pi _0{\prime} (r)} \over r} + \pi _0{\prime}{\prime}(r)} \right)} \right){r^2}\;{\rm{d}}\Omega _2^2,} \\ \end{array}$$
(10.19)

so that close to the source, for r ≪ r*,

$${Z_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = 6{\left({{{{r_{\ast}}} \over r}} \right)^{3/2}}\left({- {\rm{d}}{t^2} + {4 \over 3}\;{\rm{d}}{r^2} + {1 \over 3}{r^2}\;{\rm{d}}\Omega _2^2} \right) + O{({r_{\ast}}/r)^0}.$$
(10.20)

A few comments are in order:

  • First, we recover \(Z \sim {({r_*}/r)^{3/2}} \gg 1\) for r ≪ r*, as follows from inserting the short-distance profile (10.15) into (10.19); this is responsible for the redressing of the strong coupling scale as we shall see in (10.24). On the non-trivial background the new strong coupling scale is \({\Lambda _*} \sim \sqrt Z \Lambda \gg \Lambda\) for r ≪ r*. Similarly, on top of this background the coupling to external matter no longer occurs at the Planck scale but rather at the scale \(\sqrt Z {M_{{\rm{Pl}}}} \sim {10^7}{M_{{\rm{Pl}}}}\).

  • Second, we see that within the regime of validity of the classical calculation, the modes propagating along the radial direction do so with a superluminal phase and group velocity \(c_r^2 = 4/3 > 1\) and the modes propagating in the orthoradial direction do so with a subluminal phase and group velocity \(c_\Omega ^2 = 1/3\). This result occurs in any Galileon and multi-Galileon theory which exhibits the Vainshtein mechanism [412, 129, 246]. The subluminal velocity is not of great concern, not even for Cerenkov radiation since the coupling to other fields is so strongly suppressed, but the superluminal velocity has been the source of many questions [1]. It is definitely one of the biggest issues arising in these kinds of theories; see Section 10.6.
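The velocities quoted in the second comment are ratios of the components of (10.19), so the overall normalization of the effective metric drops out. The following sketch evaluates those ratios on the exact solution of (10.13), in units Λ = 1 and with x = r/r*, y = π0′/(rΛ³). The function names are hypothetical, and x·dy/dx is obtained by implicit differentiation of y + y² = x⁻³.

```python
import math

def y_of_x(x):
    """Positive root of y + y**2 = x**-3 (Eq. (10.13) with x = r / r_star)."""
    s = 4.0 * x**-3
    return s / (2.0 * (math.sqrt(1.0 + s) + 1.0))

def sound_speeds_squared(x):
    """Return (c_r^2, c_Omega^2) read off from the components of Eq. (10.19).

    With pi0' = r * y (Lambda = 1), one has pi0'' = y + x * dy/dx, so that
      2 pi0'/r + pi0'' = 3 y + x dy/dx   (time component)
      pi0'/r + pi0''   = 2 y + x dy/dx   (angular component)
    """
    y = y_of_x(x)
    x_dy_dx = -3.0 * (y + y * y) / (1.0 + 2.0 * y)
    z_tt = 1.0 + 4.0 * (3.0 * y + x_dy_dx)
    z_rr = 1.0 + 8.0 * y
    z_ang = 1.0 + 4.0 * (2.0 * y + x_dy_dx)
    return z_rr / z_tt, z_ang / z_tt

for x in (1e-6, 1e-3, 1.0, 1e3):
    c_r2, c_ang2 = sound_speeds_squared(x)
    print(f"r/r* = {x:7.0e}   c_r^2 = {c_r2:.6f}   c_Omega^2 = {c_ang2:.6f}")
```

Deep inside the Vainshtein radius the ratios approach 4/3 and 1/3, while far outside both tend to 1, as expected for a free field on flat space.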

Before discussing the biggest concerns of the theory, namely the superluminalities and the low strong-coupling scale, we briefly present some subtleties that arise when considering static and spherically symmetric solutions in massive gravity as opposed to a generic Galileon theory.

Static and spherically symmetric configurations in massive gravity

The Vainshtein mechanism was discussed directly in the context of massive gravity (rather than the larger Galileon family) in Refs. [363, 365, 99, 440] and more recently in [58, 455, 57]. See also Refs. [478, 105, 61, 413, 277, 160, 38, 37, 39] for other spherically symmetric solutions in massive gravity.

While the decoupling limit of massive gravity resembles that of a Galileon, it presents a few particularities which affect the precise realization of the Vainshtein mechanism:

  • First, if the parameters of the ghost-free theory of massive gravity are such that α3 + 4α4 ≠ 0, there is a mixing \({h^{\mu \nu}}X_{\mu \nu}^{(3)}\) between the helicity-0 and -2 modes of the graviton that cannot be removed by a local field redefinition (unless we work on special types of backgrounds). The effects of this coupling were explored in [99, 57] and it was shown that the theory does not exhibit any stable static and spherically symmetric configuration in the presence of a localized point-like matter source. So in order to be phenomenologically viable, the theory of massive gravity needs to be tuned with α3 + 4α4 = 0. Since these parameters do not get renormalized this is a tuning and not a fine-tuning.

  • When α3 + 4α4 = 0 and the previous mixing \({h^{\mu \nu}}X_{\mu \nu}^{(3)}\) is absent, the decoupling limit of massive gravity resembles a specific quartic Galileon, where the coefficient of the cubic Galileon is related to the quartic coefficient (and if one vanishes so does the other one),

    $$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{Helicity - 0}}}} = - {3 \over 4}{{(\partial \pi)}^2} + {{3\alpha} \over {4\Lambda _3^3}}\mathcal{L}_{({\rm{Gal}})}^{(3)}[\pi ] - {1 \over 4}{{\left({{\alpha \over {\Lambda _3^3}}} \right)}^2}\mathcal{L}_{({\rm{Gal}})}^{(4)}[\pi ]} \\ {+ {1 \over {{M_{{\rm{Pl}}}}}}\left({\pi T + {\alpha \over {\Lambda _3^3}}{\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}} \right),\quad \quad} \\ \end{array}$$
    (10.21)

    where we have set α2 = 1 and the Galileon Lagrangians \({\mathcal L}_{({\rm{Gal}})}^{(3,4)}[\pi ]\) are given in (8.44) and (8.45). Note that in this decoupling limit the graviton mass always enters in the combination \(\alpha/\Lambda _3^3\), with \(\alpha = 1 + {3 \over 2}{\alpha _3}\). As a result this decoupling limit can never be used to directly probe the graviton mass itself but rather the combination \(\alpha/\Lambda _3^3\) [57]. Beyond the decoupling limit however the theory breaks the degeneracy between α and m.

    Not only is the cubic Galileon always present when the quartic Galileon is there, but one cannot prevent the new coupling to matter \({\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\) which is typically absent in other Galileon theories.

The effect of the coupling \({\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\) was explored in [58]. First it was shown that this coupling contributes to the definition of the kinetic term of π and can lead to a ghost unless α > 0, so this restricts further the allowed region of parameter space for massive gravity. Furthermore, even when α > 0, none of the static spherically symmetric solutions which asymptote to π → 0 at infinity (asymptotically flat solutions) extrapolate to a Vainshtein solution close to the source. Instead the Vainshtein solution near the source extrapolates to a cosmological solution at infinity which is independent of the source,

$${\pi _0}(r) \rightarrow {{3 + \sqrt 3} \over 4}{{\Lambda _3^3} \over \alpha}{r^2}\quad {\rm{for}}\quad r \gg {r_{\ast}}$$
(10.22)
$${\pi _0}(r) \rightarrow {\left({{{\Lambda _3^3} \over \alpha}} \right)^{2/3}}{\left({{M \over {4\pi {M_{{\rm{Pl}}}}}}} \right)^{1/3}}r\quad {\rm{for}}\quad r \ll {r_{\ast}}.$$
(10.23)

If π were a scalar field in its own right such an asymptotic condition would not be acceptable. However, in massive gravity π is the helicity-0 mode of the graviton and its effect always enters through the Stückelberg combination \({\partial _\mu}{\partial _\nu}\pi\), which goes to a constant at infinity. Furthermore, this result is only derived in the decoupling limit, but in the fully fledged theory of massive gravity, the graviton mass kicks in at the distance scale \({m^{- 1}}\) and suppresses any effect at these scales.

Interestingly, when performing the perturbation analysis on this solution, the modes along all directions are subluminal, unlike what was found for the Galileon in (10.20). It is as yet unclear whether this is an accident of this specific solution or if this is something generic in consistent solutions of massive gravity.

Validity of the EFT

The Vainshtein mechanism presented previously relies crucially on interactions which are important at a low energy scale Λ ≪ MPl. These interactions are operators of dimension larger than four, for instance the cubic Galileon \({(\partial \pi)^2}\square\pi\) is a dimension-7 operator and the quartic Galileon is a dimension-10 operator. The same can be seen directly within massive gravity. In the decoupling limit (8.38), the terms \({h^{\mu \nu}}X_{\mu \nu}^{(2,3)}\) are respectively dimension-7 and dimension-10 operators. These operators are thus irrelevant from a traditional EFT viewpoint and the theory is hence not renormalizable.

This comes as no surprise, since gravity itself is not renormalizable and there is thus no reason to expect massive gravity nor its decoupling limit to be renormalizable. However, for the Vainshtein mechanism to be successful in massive gravity, we are required to work within a regime where these operators dominate over the marginal ones (i.e., over the standard kinetic term \({(\partial \pi)^2}\) in the strongly coupled region where \({\partial ^2}\pi \gg {\Lambda ^3}\)). It is, therefore, natural to wonder whether or not one can ever use the effective field description within the strong coupling region without going outside the regime of validity of the theory.

The answer to this question relies on two essential features:

  1. First, as we shall see in what follows, the Galileon interactions or the interactions that arise in the decoupling limit of massive gravity and which are essential for the Vainshtein mechanism do not get renormalized within the decoupling limit (they enjoy a non-renormalization theorem which we review in what follows).

  2. The non-renormalization theorem together with the shift and Galileon symmetry implies that only higher operators of the form \({({\partial ^\ell}\pi)^m}\), with ℓ, m ≥ 2, are generated by quantum corrections. These operators differ from the Galileon operators in that they always generate terms with more than two derivatives on the field at the level of the equation of motion (or they always have two or more derivatives per field at the level of the action).

This means that there exists a regime of interest for the theory, for which the operators generated by quantum corrections are irrelevant (unimportant compared to the Galileon interactions). Within the strong coupling region, the field itself can take large values, π ∼ Λ, ∂π ∼ Λ², ∂²π ∼ Λ³, and one can still rely on the Galileon interactions and take no other operator into account so long as any further derivative of the field is suppressed, \({\partial ^n}\pi \ll {\Lambda ^{n + 1}}\) for any n ≥ 3.

This is similar to the situation in DBI scalar field models, where the field operator itself and its velocity are considered to be large, π ∼ Λ and ∂π ∼ Λ², but the field acceleration and any higher derivatives are suppressed, \({\partial ^n}\pi \ll {\Lambda ^{n + 1}}\) for n ≥ 2 (see [157]). In other words, the effective field expansion should be reorganized so that operators which do not give equations of motion with more than two derivatives (i.e., Galileon interactions) are considered to be large and ought to be treated as the relevant operators, while all other interactions (which lead to terms in the equations of motion with more than two derivatives) are treated as irrelevant corrections in the effective field theory language.

Finally, as mentioned previously, the Vainshtein mechanism itself changes the canonical scale and thus the scale at which the fluctuations become strongly coupled. On top of a background configuration, interactions do not arise at the scale Λ but rather at the rescaled strong coupling scale \({\Lambda _*} = \sqrt Z \Lambda\), where Z is expressed in (10.7). In the strong coupling region, Z ≫ 1 and so Λ* ≫ Λ. The higher interactions for fluctuations on top of the background configuration are hence much smaller than expected and their quantum corrections are therefore suppressed.

When taking the cubic Galileon and considering the strong coupling effect from a static and spherically symmetric source then

$${\Lambda _{\ast}} \sim \sqrt Z \Lambda \sim \sqrt {{{\pi _0{\prime} (r)} \over {r{\Lambda ^3}}}} \Lambda ,$$
(10.24)

where the profile for the cubic Galileon in the strong coupling region is given in (10.15). If the source is considered to be the Earth, then at the surface of the Earth this gives

$${\Lambda _{\ast}}\sim{\left({{M \over {{M_{{\rm{Pl}}}}}}{1 \over {{{(r\Lambda)}^3}}}} \right)^{1/4}}\Lambda \sim{10^7}\Lambda \sim{\rm{c}}{{\rm{m}}^{- 1}},$$
(10.25)

taking \(\Lambda \sim {(1000\;{\rm{km}})^{- 1}}\), which would be the scale Λ3 in massive gravity for a graviton mass of the order of the Hubble parameter today. In the quartic Galileon this enhancement in the strong coupling scale does not work as well in the purely static and spherically symmetric case [88]; however, considering a more realistic scenario and taking the smallest breaking of the spherical symmetry into account (for instance the Earth dipole) leads to a comparable result of a few cm [57]. Notice that this is the redressed strong coupling scale when taking into consideration only the effect of the Earth. When getting to these smaller distance scales, all the other matter sources surrounding whichever experiment or scattering process need to be accounted for and this pushes the redressed strong coupling scale even higher [57].
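The size of the enhancement at the surface of the Earth can be sketched numerically by combining (10.24) with the short-distance solution of (10.13), for which π0′/(rΛ³) ≈ [M/(4πMPl)]^{1/2}(rΛ)^{-3/2}. The Earth mass and radius and the reduced Planck mass used below are assumed inputs, not values from the text, and O(1) factors are ignored.

```python
import math

# Assumed inputs (not from the text).
M_EARTH_KG = 5.972e24         # mass of the Earth
M_PL_REDUCED_KG = 4.341e-9    # reduced Planck mass, M_Pl^2 = 1/(8 pi G)
EARTH_RADIUS_KM = 6.371e3     # radius of the Earth
LAMBDA_INV_KM = 1.0e3         # Lambda ~ (1000 km)^-1, as in the text

def redressing_factor(mass_kg, radius_km):
    """Estimate Lambda_* / Lambda = sqrt(Z) from Eq. (10.24).

    Deep inside the Vainshtein radius, Z ~ pi0'/(r Lambda^3)
    ~ sqrt( M / (4 pi M_Pl) / (r Lambda)^3 ), so sqrt(Z) is a fourth root.
    """
    source = mass_kg / (4.0 * math.pi * M_PL_REDUCED_KG)
    r_lambda = radius_km / LAMBDA_INV_KM
    return (source / r_lambda**3) ** 0.25

enhancement = redressing_factor(M_EARTH_KG, EARTH_RADIUS_KM)
redressed_length_cm = LAMBDA_INV_KM * 1.0e5 / enhancement  # 1 km = 1e5 cm

print(f"Lambda_* / Lambda ~ {enhancement:.1e}")
print(f"1 / Lambda_*      ~ {redressed_length_cm:.1f} cm")
```

With these inputs the redressed scale corresponds to a length of a few centimetres, in line with the estimate quoted above.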

Non-renormalization

The non-renormalization theorem mentioned above states that within a Galileon theory the Galileon operators themselves do not get renormalized. This was originally understood within the context of the cubic Galileon in the procedure established in [411] and is easily generalizable to all the Galileons [412]. In what follows, we review the essence of the non-renormalization theorem within the context of massive gravity as derived in [140].

Let us start with the decoupling limit of massive gravity (8.38) in the absence of vector modes (the Vainshtein mechanism presented previously does not rely on these modes and it is thus consistent for the purpose of this discussion to ignore them). This decoupling limit is a very special scalar-tensor theory on flat spacetime

$${\mathcal{L}_{{\Lambda _3}}} = - {1 \over 4}{h^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {1 \over 4}{h^{\mu \nu}}\sum\limits_{n = 1}^3 {{{{c_n}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)},$$
(10.26)

where the coefficients cn are given in (8.47) and the tensors \(X_{\mu \nu}^{(n)}\) are given in (8.29)–(8.31) or (8.33)–(8.36). The theory described by (10.26) (including the two interactions \(h{X^{(2,3)}}\)) enjoys two kinds of symmetries: a gauge symmetry for hμν (linearized diffeomorphism), \({h_{\mu \nu}} \to {h_{\mu \nu}} + {\partial _{(\mu}}{\xi _{\nu)}}\), and a global shift and Galilean symmetry for π, \(\pi \to \pi + c + {v_\mu}{x^\mu}\). Notice that unlike in a pure Galileon theory, here the global symmetry for π is an exact symmetry of the Lagrangian (not a symmetry up to boundary terms). This means that the quantum corrections generated by this theory ought to preserve the same kinds of symmetries.

The non-renormalization theorem follows simply from the antisymmetric structure of the interactions (8.30) and (8.31). Let us consider the contributions of the vertices

$${V_2} = {h^{\mu \nu}}X_{\mu \nu}^{(2)} = {h^{\mu \nu}}{\varepsilon _\mu}^{\alpha \beta \gamma}{\varepsilon _\nu}^{{\alpha {\prime}}{\beta {\prime}}}{}_\gamma {\partial _\alpha}{\partial _{{\alpha {\prime}}}}\pi {\partial _\beta}{\partial _{{\beta {\prime}}}}\pi$$
(10.27)
$${V_3} = {h^{\mu \nu}}X_{\mu \nu}^{(3)} = {h^{\mu \nu}}{\varepsilon _\mu}^{\alpha \beta \gamma}{\varepsilon _\nu}^{{\alpha {\prime}}{\beta {\prime}}{\gamma {\prime}}}{\partial _\alpha}{\partial _{{\alpha {\prime}}}}\pi {\partial _\beta}{\partial _{{\beta {\prime}}}}\pi {\partial _\gamma}{\partial _{{\gamma {\prime}}}}\pi$$
(10.28)

to an arbitrary diagram. If all the external legs of this diagram are π fields then it follows immediately that the contribution of the process goes as \({({\partial ^2}\pi)^n}\) or with more derivatives and is thus not an operator which was originally present in (10.26). So let us consider the case where a vertex (say V3) contributes to the diagram with a spin-2 external leg of momentum pμ. The contribution from that vertex to the whole diagram is given by

$$\begin{array}{*{20}c} {i{\mathcal{M}_{{V_3}}} \propto i\int {{{{{\rm{d}}^4}k} \over {{{(2\pi)}^4}}}} {{{{\rm{d}}^4}q} \over {{{(2\pi)}^4}}}{\mathcal{G}_k}{\mathcal{G}_q}{\mathcal{G}_{p - k - q}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \times \left[ {{\epsilon ^{{\ast}\mu \nu}}{\varepsilon _\mu}^{\alpha \beta \gamma}{\varepsilon _\nu}^{{\alpha {\prime}}{\beta {\prime}}{\gamma {\prime}}}{k_\alpha}{k_{{\alpha {\prime}}}}{q_\beta}{q_{{\beta {\prime}}}}{{(p - k - q)}_\gamma}{{(p - k - q)}_{{\gamma {\prime}}}}} \right]\quad \\ \propto i{\epsilon ^{{\ast}\mu \nu}}{\varepsilon _\mu}^{\alpha \beta \gamma}{\varepsilon _\nu}^{{\alpha {\prime}}{\beta {\prime}}{\gamma {\prime}}}{p_\gamma}{p_{{\gamma {\prime}}}}\int {{{{{\rm{d}}^4}k} \over {{{(2\pi)}^4}}}} {{{{\rm{d}}^4}q} \over {{{(2\pi)}^4}}}{\mathcal{G}_k}{\mathcal{G}_q}{\mathcal{G}_{p - k - q}}{k_\alpha}{k_{{\alpha {\prime}}}}{q_\beta}{q_{{\beta {\prime}}}}, \\ \end{array}$$
(10.29)

where \({\epsilon ^{{\ast}\mu \nu}}\) is the polarization of the spin-2 external leg and \({{\mathcal G}_k}\) is the Feynman propagator for the π-particle, \({{\mathcal G}_k} = i{({k^2} - i\varepsilon)^{- 1}}\). The second line of (10.29) follows because the terms in \({(p - k - q)_\gamma}\) proportional to kγ or qγ vanish when contracted with the antisymmetric tensor against kα or qβ. This contribution is quadratic in the momentum of the external spin-2 field, \({p_\gamma}{p_{\gamma {\prime}}}\), which means that in position space it has to involve at least two derivatives in hμν (there could be more derivatives arising from the integral over the propagator \({{\mathcal G}_{p - k - q}}\) inside the loops). The same result holds when inserting a V2 vertex as explained in [140]. As a result any diagram in this theory can only generate terms of the form \(({\partial ^2}h){({\partial ^2}\pi)^m}\), or terms with even more derivatives. Consequently, the operators presented in (10.26) or in the decoupling limit of massive gravity are not renormalized. This means that within the decoupling limit the scale Λ does not get renormalized, and it can be set to an arbitrarily small value (compared to the Planck scale) without running issues. The same holds for the other parameters c2 and c3.
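The step from the first to the second line of (10.29) rests only on the antisymmetry of the Levi-Civita symbol, which can be verified by brute force. The sketch below is an illustration with hypothetical helper names; it checks, for integer-valued test momenta, that contracting one ε factor with k, q and p − k − q gives the same result as contracting it with k, q and p. Index positions are ignored since only antisymmetry matters for the identity.

```python
import itertools
import random

def levi_civita(indices):
    """Sign of the permutation `indices` of (0, 1, 2, 3); 0 if an index repeats."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    for i in range(len(indices)):
        for j in range(i + 1, len(indices)):
            if indices[i] > indices[j]:
                sign = -sign  # each inversion flips the parity
    return sign

def contract(a, b, c):
    """Return v[mu] = sum over alpha, beta, gamma of eps[mu,alpha,beta,gamma] a b c."""
    out = [0, 0, 0, 0]
    for mu, al, be, ga in itertools.permutations(range(4)):
        out[mu] += levi_civita((mu, al, be, ga)) * a[al] * b[be] * c[ga]
    return out

def random_momentum(rng):
    """Integer components keep the comparison exact (no rounding error)."""
    return [rng.randint(-9, 9) for _ in range(4)]

rng = random.Random(0)
k, q, p = random_momentum(rng), random_momentum(rng), random_momentum(rng)
internal = [p[i] - k[i] - q[i] for i in range(4)]

with_internal_momentum = contract(k, q, internal)
with_external_momentum = contract(k, q, p)
print(with_internal_momentum == with_external_momentum)
```

Since each of the two ε factors in the vertex reduces in this way, the vertex necessarily carries two powers of the external momentum p, which is the origin of the two derivatives acting on hμν.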

When working beyond the decoupling limit, we expect operators of the form \({h^2}{({\partial ^2}\pi)^n}\) to spoil this non-renormalization theorem. However, these operators are MPl suppressed, and so they lead to quantum corrections which are themselves MPl suppressed. This means that the quantum corrections to the graviton mass are suppressed as well [140]

$$\delta {m^2} \lesssim {m^2}{\left({{m \over {{M_{{\rm{Pl}}}}}}} \right)^{2/3}}.$$
(10.30)

This result is crucial for the theory. It implies that a small graviton mass is technically natural.
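To give a feeling for the size of the bound (10.30), the following sketch evaluates it for a graviton mass of the order of the Hubble parameter today. The numerical values for H0 and the reduced Planck mass are assumptions made for the illustration, not values from the text.

```python
import math

# Assumed inputs (not from the text), both expressed in eV.
GRAVITON_MASS_EV = 1.5e-33   # m ~ H_0, roughly 70 km/s/Mpc in natural units
M_PL_EV = 2.4e27             # reduced Planck mass, about 2.4e18 GeV

def relative_mass_correction(m_ev, m_pl_ev):
    """Upper bound on delta m^2 / m^2 from Eq. (10.30): (m / M_Pl)^(2/3)."""
    return (m_ev / m_pl_ev) ** (2.0 / 3.0)

relative_correction = relative_mass_correction(GRAVITON_MASS_EV, M_PL_EV)
print(f"delta m^2 / m^2 <~ 10^{math.log10(relative_correction):.1f}")
```

The relative correction is some forty orders of magnitude below unity for these inputs, which is the sense in which a small graviton mass is stable under quantum corrections.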

Quantum corrections beyond the decoupling limit

As already emphasized, the consistency of massive gravity relies crucially on a very specific set of allowed interactions summarized in Section 6. Unlike for GR, these interactions are not protected by any (known) symmetry and we thus expect quantum corrections to destabilize this structure. Depending on the scale at which these quantum corrections kick in, this could lead to a ghost at an unacceptably low scale.

Furthermore, as discussed previously, the mass of the graviton itself is subject to quantum corrections, and for the theory to be viable the graviton mass ought to be tuned to extremely small values. This tuning would be technically unnatural if the graviton mass received large quantum corrections.

We first summarize the results found so far in the literature before providing further details:

  1. Destabilization of the potential:

    At one-loop, matter fields do not destabilize the structure of the potential. Graviton loops on the other hand do lead to new operators which do not belong to the ghost-free family of interactions presented in (6.9)–(6.13); however, they are irrelevant below the Planck scale.

  2. Technically natural graviton mass:

    As already seen in (10.30), the quantum corrections for the graviton mass are suppressed by the graviton mass itself, \(\delta {m^2} \lesssim {m^2}{(m/{M_{{\rm{Pl}}}})^{2/3}}\). This result is confirmed at one-loop beyond the decoupling limit and as a result a small graviton mass is technically natural.

Matter loops

The essence of these arguments goes as follows: Consider a ‘covariant’ coupling to matter, \({{\mathcal L}_{{\rm{matter}}}}({g_{\mu \nu}},{\psi _i})\), for any species ψi be it a scalar, a vector, or a fermion (in which case the coupling has to be performed in the vielbein formulation of gravity, see (5.6)).

At one loop, virtual matter fields do not mix with the virtual graviton. As a result as far as matter loops are concerned, they are ‘unaware’ of the graviton mass, and only lead to quantum corrections which are already present in GR and respect diffeomorphism invariance. So the only potential term (i.e., operator with no derivatives on the metric fluctuation) they can lead to is the cosmological constant.

This result was confirmed at the level of the one-loop effective action in [146], where it was shown that a field of mass M leads to a running of the cosmological constant \(\delta {\Lambda _{{\rm{CC}}}} \sim {M^4}\). This result is of course well-known and is at the origin of the old cosmological constant problem [484]. The key element in the context of massive gravity is that this cosmological constant does not lead to any ghost and no new operators are generated from matter loops at the one-loop level (and this independently of the regularization scheme used, be it dimensional regularization, cutoff regularization, or other). At higher loops we expect virtual matter fields and gravitons to mix, and the effect on the structure of the potential still remains to be explored.

Graviton loops

When considering virtual gravitons running in the loops, the theory does receive quantum corrections which do not respect the ghost-free structure of the potential. These are of course suppressed by the Planck scale and the graviton mass and so in dimensional regularization, we generate new operators of the form (see Footnote 25)

$$\mathcal{L}_{{\rm{QC}}}^{({\rm{potential}})}\sim{{{m^4}} \over {M_{{\rm{Pl}}}^n}}\;{h^n},$$
(10.31)

with n ≥ 2, and where m is the graviton mass, and the contractions of h do not obey the structure presented in (6.9)–(6.13). In a normal effective field theory this is not an issue as such operators are clearly irrelevant below the Planck scale. However, for massive gravity, the situation is more subtle.

As seen in Section 10.1 (see also Section 10.2), massive gravity is phenomenologically viable only if it has an active Vainshtein mechanism which screens the effect of the helicity-0 mode in the vicinity of dense environments. This Vainshtein mechanism relies on having a large background for the helicity-0 mode, π = π0 + δπ with \({\partial ^2}{\pi _0} \gg \Lambda _3^3 = {m^2}{M_{{\rm{Pl}}}}\), which in unitary gauge implies h = h0 + δh, with h0 ≫ MPl.

To mimic this effect, we consider a given background for h = h0 ≫ MPl. Perturbing the new operators (10.31) about this background leads to a contribution at quadratic order for the perturbations δh which does not satisfy the Fierz-Pauli structure,

$$\mathcal{L}_{{\rm{QC}}}^{(2)}\sim{{{m^4}h_0^{n - 2}} \over {M_{{\rm{Pl}}}^n}}\;\delta {h^2}.$$
(10.32)

In terms of the helicity-0 mode π, considering \(\delta h \sim {\partial ^2}\pi/{m^2}\), this leads to higher derivative interactions

$$\mathcal{L}_{{\rm{QC}}}^{(2)}\sim{{h_0^{n - 2}} \over {M_{{\rm{Pl}}}^n}}\;\;{\left({{\partial ^2}\pi} \right)^2},$$
(10.33)

which revive the BD ghost at the scale \(m_{{\rm{ghost}}}^2 \sim h_0^2{({M_{{\rm{Pl}}}}/{h_0})^n}\). The mass of the ghost can be made arbitrarily small (smaller than Λ3) by taking n ≫ 1 and h0 ≫ MPl, as is needed for the Vainshtein mechanism. In itself this would be a disaster for the theory, as it means that precisely in the regime where we need the Vainshtein mechanism to work, a ghost appears at an arbitrarily small scale and we can no longer trust the theory.

The resolution to this issue lies within the Vainshtein mechanism itself and its implementation not only at the classical level as was done to estimate the mass of the ghost in (10.33) but also within the calculation of the quantum corrections themselves. To take the Vainshtein mechanism consistently into account one needs to consider the effective action redressed by the interactions themselves (as was performed at the classical level for instance in (10.9)).

This redressing was taken into account at the level of the one-loop effective action in Ref. [146] and it was shown that when resummed, the large background configuration has the effect of further suppressing the quantum corrections so that the mass of the ghost never reaches below the Planck scale even when h0 ≫ MPl. To be more precise, (10.33) is only one term in an infinite order expansion in h0. Resumming these terms leads rather to a contribution of the form (symbolically)

$$\mathcal{L}_{{\rm{QC}}}^{(2)}\sim{1 \over {1 + {{{h_0}} \over {{M_{{\rm{Pl}}}}}}}}{1 \over {M_{{\rm{Pl}}}^2}}\;\;{\left({{\partial ^2}\pi} \right)^2},$$
(10.34)

so that the effective scale at which this operator is relevant is well above the Planck scale when h0 ≫ MPl and is at the Planck scale when working in the weak-field regime h0 ≪ MPl. Notice that h0 ∼ −MPl corresponds to a physical singularity in massive gravity (see [56]), and the theory would break down at that point anyway, irrespective of the ghost.

As a result, at the one-loop level the quantum corrections destabilize the structure of the potential but in a way which is irrelevant below the Planck scale.
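The contrast between the naive estimate read off from (10.33) and the resummed form (10.34) can be made concrete with a small numerical sketch. It works in units MPl = 1, treats both expressions purely symbolically (as the text does), and uses hypothetical function names; the background value h0 = 10 is an arbitrary illustrative choice.

```python
def naive_ghost_mass_sq(h0, n):
    """Naive scale from Eq. (10.33): m_ghost^2 ~ h0^2 (M_Pl / h0)^n, with M_Pl = 1."""
    return h0**2 * (1.0 / h0)**n

def resummed_ghost_mass_sq(h0):
    """Scale suppressing (d^2 pi)^2 in the symbolic resummed form, Eq. (10.34):
    m_ghost^2 ~ M_Pl^2 (1 + h0 / M_Pl), with M_Pl = 1."""
    return 1.0 + h0

h0 = 10.0  # illustrative large background, h0 = 10 M_Pl
for n in (2, 4, 8, 16):
    print(f"n = {n:2d}   naive m_ghost^2 ~ {naive_ghost_mass_sq(h0, n):.1e}   "
          f"resummed m_ghost^2 ~ {resummed_ghost_mass_sq(h0):.1e}")
```

Term by term, the naive scale drops rapidly with n, whereas the resummed scale stays at or above the Planck scale for any positive h0.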

Strong coupling scale vs cutoff

Whether it is to compute the Vainshtein mechanism or quantum corrections to massive gravity, it is crucial to realize that the scale \({\Lambda _3} = {({m^2}{M_{{\rm{Pl}}}})^{1/3}}\) (denoted as Λ in what follows) is not necessarily the cutoff of the theory.
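For orientation, the scale (m²MPl)^{1/3} can be converted to a length for a graviton mass of the order of the Hubble parameter today. The inputs below (H0, the reduced Planck mass and ħc) are assumed values supplied for this sketch and are not taken from the text.

```python
# Assumed inputs (not from the text).
GRAVITON_MASS_EV = 1.5e-33   # m ~ H_0 in natural units
M_PL_EV = 2.4e27             # reduced Planck mass, about 2.4e18 GeV
HBAR_C_EV_M = 1.973e-7       # hbar * c in eV * metres

def lambda3_ev(m_ev, m_pl_ev):
    """Strong coupling scale Lambda_3 = (m^2 M_Pl)^(1/3), in eV."""
    return (m_ev**2 * m_pl_ev) ** (1.0 / 3.0)

lambda3 = lambda3_ev(GRAVITON_MASS_EV, M_PL_EV)
lambda3_inverse_km = HBAR_C_EV_M / lambda3 / 1.0e3

print(f"Lambda_3     ~ {lambda3:.2e} eV")
print(f"1 / Lambda_3 ~ {lambda3_inverse_km:.0f} km")
```

The resulting length is of order a thousand kilometres, which is the value Λ ∼ (1000 km)⁻¹ used in the estimates of this section.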

The cutoff of a theory corresponds to the scale at which the given theory breaks down and new physics is required to describe nature. For GR the cutoff is the Planck scale. For massive gravity the cutoff could potentially be below the Planck scale, but is likely well above the scale Λ, and the redressed scale Λ* computed in (10.24). Instead Λ (or Λ* on some backgrounds) is the strong-coupling scale of the theory.

When hitting the scale Λ or Λ* perturbativity breaks down (in the standard field representation of the theory), which means that in that representation loops ought to be taken into account to derive the correct physical results at these scales. However, it does not necessarily mean that new physics should be taken into account. The fact that tree-level calculations do not account for the full results in no way implies that the theory itself breaks down at these scales, only that perturbation theory breaks down.

Massive gravity is of course not the only theory whose strong coupling scale departs from its cutoff. See, for instance, Ref. [31] for other examples in chiral theory, or in gravity coupled to many species. To get more intuition on these types of theories and on the distinction between strong coupling scale and cutoff, consider a large number N ≫ 1 of scalar fields coupled to gravity. In that case the effective strong coupling scale seen by these scalars is \({M_{{\rm{eff}}}} = {M_{{\rm{Pl}}}}/\sqrt N \ll {M_{{\rm{Pl}}}}\), while the cutoff of the theory is still MPl (the scale at which new physics enters in GR is independent of the number of species living in GR).

The philosophy behind [31] is precisely analogous to the distinction between the strong coupling scale and the cutoff (onset of new physics) that arises in massive gravity. Summarizing the results of [31] would not do justice to their work; instead we quote the abstract and encourage the reader to refer to that article for further details:

“In effective field theories it is common to identify the onset of new physics with the violation of tree-level unitarity. However, we show that this is parametrically incorrect in the case of chiral perturbation theory, and is probably theoretically incorrect in general. In the chiral theory, we explore perturbative unitarity violation as a function of the number of colors and the number of flavors, holding the scale of the “new physics” (i.e., QCD) fixed. This demonstrates that the onset of new physics is parametrically uncorrelated with tree-unitarity violation. When the latter scale is lower than that of new physics, the effective theory must heal its unitarity violation itself, which is expected because the field theory satisfies the requirements of unitarity. (…) A similar example can be seen in the case of general relativity coupled to multiple matter fields, where iteration of the vacuum polarization diagram restores unitarity. We present arguments that suggest the correct identification should be connected to the onset of inelasticity rather than unitarity violation.” [31].

Superluminalities and (a)causality

Besides the presence of a low strong coupling scale in massive gravity (which is a requirement for the Vainshtein mechanism, and is thus not a feature that one should necessarily try to avoid), another point of concern is the possibility of superluminal propagation. This statement requires qualification, and to avoid any confusion we shall first review the distinction between phase velocity, group velocity, signal velocity and front velocity and their different implications. We follow the same description as in [399] and [77] and refer to these books and references therein for further details.

  1.

    Phase Velocity: For a wave of constant frequency, the phase velocity is the speed at which the peaks of the oscillations propagate. For a wave [77]

    $$f(t,x) = A\sin (\omega t - kx) = A\sin \left({\omega \left({t - {x \over {{v_{{\rm{phase}}}}}}} \right)} \right),$$
    (10.35)

    the phase velocity \({v_{{\rm{phase}}}}\) is given by

    $${v_{{\rm{phase}}}} = {\omega \over k}.$$
    (10.36)
  2.

    Group Velocity: If the amplitude of the signal varies, then the group velocity represents the speed at which the modulation or envelope of the signal propagates. In a medium where the phase velocity is constant and does not depend on frequency, the phase and the group velocity are the same. More generally, in a medium with dispersion relation ω(k), the group velocity is

    $${v_{{\rm{group}}}} = {{\partial \omega (k)} \over {\partial k}}.$$
    (10.37)

    We are familiar with the notion that the phase velocity can be larger than the speed of light c (in this review we use units where c = 1). Similarly, it has been known for almost a century now that

    “(…) the group velocity could exceed c in a spectral region of an anomalous dispersion” [399].

    While a source of concern at first, this is now well understood not to be in any conflict with the theory of general (or special) relativity and not to be the source of any acausality. The resolution lies in the fact that the group velocity does not represent the speed at which new information is transmitted. That speed is instead referred to as the front velocity, as we shall see below.

  3.

    Signal Velocity: The signal velocity “yields the arrival of the main signal, with intensities of the order of magnitude of the input signal” [77]. Nowadays it is common to define the signal velocity as the velocity of the part of the pulse that has reached at least half the maximum intensity. However, as mentioned in [399], this notion of speed is rather arbitrary, and some known physical systems can exhibit a signal velocity larger than c.

  4.

    Front Velocity: Physically, the front velocity represents the speed of the front of a disturbance, or in other words “Front velocity (…) correspond[s] to the speed at which the very first, extremely small (perhaps invisible) vibrations will occur.” [77]. The front velocity is thus the speed at which the very first piece of information of the first “forerunner” propagates once a front, or a “sudden discontinuous turn-on of a field”, is switched on [399].

“The front is defined as a surface beyond which, at a given instant in time the medium is completely at rest” [77],

$$f(t,x) = \theta (t)\sin (\omega t - kx),$$
(10.38)

where θ (t) is the Heaviside step function.

In practice, the front velocity is the large-k (high-frequency) limit of the phase velocity.
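
As a simple illustration (a standard textbook example, not taken from [77] or [399]), consider the relativistic dispersion relation of a free field of mass m, ω² = k² + m², where m is a symbol introduced here for the example only. The definitions above then give

$${v_{{\rm{phase}}}} = {\omega \over k} = \sqrt {1 + {{{m^2}} \over {{k^2}}}} > 1,\quad \quad {v_{{\rm{group}}}} = {{\partial \omega} \over {\partial k}} = {k \over {\sqrt {{k^2} + {m^2}}}} < 1,\quad \quad {v_{{\rm{front}}}} = \mathop {\lim}\limits_{k \rightarrow \infty} {\omega \over k} = 1.$$

The phase velocity is superluminal for every finite k, yet the front velocity, obtained from the large-k limit, is exactly luminal.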

The distinction between these four types of velocities is presented in Figure 5. They are important to keep in mind, and especially to distinguish from one another, when it comes to superluminal propagation. Superluminal phase, group and signal velocities have been observed and measured experimentally in different physical systems, and yet they cause no contradiction with special relativity, nor do they signal acausalities. See Ref. [318] for an enlightening discussion of the case of QED in curved spacetime.
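
The following Python sketch illustrates this point numerically for a toy dispersion relation, ω² = k² − μ² with k > μ, which is an assumption chosen purely for illustration and is not a model discussed in this review. The function names and the scale `MU` are hypothetical. In this toy example the group velocity exceeds 1 at every finite k, while the large-k limit of the phase velocity, used here as an estimate of the front velocity, stays at 1.

```python
import math

MU = 1.0  # hypothetical mass scale of the toy model, in units where c = 1


def omega_toy(k):
    """Toy dispersion relation omega(k) = sqrt(k^2 - MU^2), valid for k > MU."""
    return math.sqrt(k * k - MU * MU)


def phase_velocity(omega, k):
    """v_phase = omega / k."""
    return omega(k) / k


def group_velocity(omega, k, rel_step=1e-6):
    """v_group = d omega / d k, estimated with a central finite difference."""
    dk = rel_step * k
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


def front_velocity(omega, k_large=1e6):
    """Approximate v_front by the phase velocity at a very large wavenumber."""
    return phase_velocity(omega, k_large)


if __name__ == "__main__":
    for k in (1.5, 2.0, 5.0, 50.0):
        print(f"k = {k:5.1f}  "
              f"v_phase = {phase_velocity(omega_toy, k):.6f}  "
              f"v_group = {group_velocity(omega_toy, k):.6f}")
    print(f"v_front ~ {front_velocity(omega_toy):.6f}")
```

Any other dispersion relation can be passed to the same helper functions as a callable; the finite-difference step and the choice of `k_large` are crude and would need adjusting for relations that vary rapidly with k.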

Figure 5

Difference between phase, group, signal and front velocities. At t = δt, the phase and group velocities are represented on the left and given respectively by \({v_{{\rm{phase}}}} = \delta {x_P}/\delta t\) and \({v_{{\rm{group}}}} = \delta {x_G}/\delta t\) (in the limit δt → 0). The signal and front velocities are represented on the right: the signal velocity is given by \({v_{{\rm{signal}}}} = \delta {x_S}/\delta t\) (where \(\delta {x_S}\) is the point where at least half the intensity of the original signal is reached), and the front velocity by \({v_{{\rm{front}}}} = \delta {x_F}/\delta t\).