The $\Lambda_2$ limit of massive gravity

Lorentz-invariant massive gravity is usually associated with a strong coupling scale $\Lambda_3$. By including non-trivial effects from the Stueckelberg modes, we show that, about non-trivial vacua for these modes, one can push the strong coupling scale to higher values and evade the linear vDVZ-discontinuity. For generic parameters of the theory and generic vacua for the Stueckelberg fields, the $\Lambda_2$-decoupling limit of the theory is well-behaved and free of any ghost or gradient-like instabilities. We also discuss the implications for nonlinear sigma models with Lorentzian target spaces.


Introduction and Summary
As an effective field theory on Minkowski space, Lorentz-invariant massive gravity with generic interactions is strongly coupled and breaks perturbative unitarity at a scale $\Lambda_*$ with $\Lambda_* < \Lambda_3 = (M_{\rm Pl} m^2)^{1/3}$ [1]. When the graviton mass $m$ is taken to be of the order of the current Hubble scale, this is phenomenologically a very low scale. Moreover, all the interactions that arise strictly below the scale $\Lambda_3$ are associated with the nonlinear Boulware-Deser (BD) ghost [2][3][4]. This makes the Vainshtein mechanism [5] untrustworthy in all these massive gravity theories as a resolution of the linear vDVZ (van Dam-Veltman-Zakharov [6,7]) discontinuity. As a result, none of the theories of massive gravity with a strong coupling scale $\Lambda_* < \Lambda_3$ has a smooth massless limit to General Relativity within the regime of validity of its effective field theory.
Fortunately, all the interactions below $\Lambda_3$ can be eliminated by a unique graviton potential [8,9], and this coincides with the elimination of the BD ghost [9][10][11]. In ghost-free massive gravity [8,9] gravitational waves carry 5 modes, as expected for a massive spin-2 particle in four dimensions, and the Vainshtein mechanism operates in a much more controlled way [12]. See [13] for a recent review of massive gravity and [14] for an introduction to the Vainshtein mechanism.

$\Lambda_2$-limit of massive gravity
The scale $\Lambda_3 = (M_{\rm Pl} m^2)^{1/3}$ is usually considered to be the highest possible strong coupling scale in a Lorentz-invariant theory of massive gravity (bearing in mind that we consider $m \ll M_{\rm Pl}$). This usually comes from analyzing ghost-free massive gravity around the trivial Lorentz-invariant vacuum $g_{\mu\nu} = \eta_{\mu\nu}$, $\phi^A = x^A$, where the $\phi^A$ are the Stückelberg scalar fields that ensure that the theory of massive gravity is diffeomorphism invariant.
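To get a feel for the hierarchy between the two scales, the following order-of-magnitude estimate may help. It is only a sketch: it assumes the standard definition $\Lambda_2 = (M_{\rm Pl} m)^{1/2}$, a graviton mass $m \sim 10^{-33}$ eV (roughly the Hubble scale today) and the reduced Planck mass $M_{\rm Pl} \simeq 2.4\times 10^{27}$ eV. These numerical inputs are illustrative assumptions and are not quoted in the text.

```latex
% Order-of-magnitude estimate; inputs m ~ 1e-33 eV and M_Pl ~ 2.4e27 eV are assumptions.
\begin{align}
\Lambda_3 &= (M_{\rm Pl}\, m^2)^{1/3}
  \sim \left(2.4\times 10^{27}\,{\rm eV}\times 10^{-66}\,{\rm eV}^2\right)^{1/3}
  \sim 10^{-13}\,{\rm eV},
  & \Lambda_3^{-1} &\sim 10^{3}\,{\rm km}, \\
\Lambda_2 &= (M_{\rm Pl}\, m)^{1/2}
  \sim \left(2.4\times 10^{27}\,{\rm eV}\times 10^{-33}\,{\rm eV}\right)^{1/2}
  \sim 10^{-3}\,{\rm eV},
  & \Lambda_2^{-1} &\sim 0.1\,{\rm mm}, \\
\frac{\Lambda_2}{\Lambda_3} &= \left(\frac{M_{\rm Pl}}{m}\right)^{1/6} \sim 10^{10}.
\end{align}
```

Under these assumptions, raising the strong coupling scale from $\Lambda_3$ to $\Lambda_2$ extends the regime of validity by roughly ten orders of magnitude in energy.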
However, in ghost-free massive gravity (also known as the dRGT model [8,9]), about non-trivial vacua one can take the decoupling limit (1.2), which leads to the action (1.3), where $S_{\rm GFMG}$ is given by Eq. (2.14), the first term is the linear Einstein-Hilbert term, $T_{\mu\nu}$ is the stress-energy tensor of the matter fields, and the massive gravity nonlinear sigma model $\mathcal{L}_{\rm MG-NLS}[\phi^A]$ is defined in (1.4). The interesting properties of this nonlinear sigma model and its generalization have been discussed in [15] and will also be mentioned later in this paper.
We emphasize that the massive gravity nonlinear sigma model (1.4) does not amount to simply setting $g_{\mu\nu} := \eta_{\mu\nu}$ in ghost-free massive gravity, which would be an inconsistent procedure. Rather, we take a well-defined $\Lambda_2$-decoupling limit which preserves the total number of degrees of freedom along the flow $M_{\rm Pl} \to \infty$, and hence will automatically carry over desirable properties of ghost-free massive gravity (such as the absence of the BD ghost) to the decoupled theory. This fact alone is sufficient to guarantee that $\mathcal{L}_{\rm MG-NLS}[\phi^A]$ does not carry more than 3 propagating degrees of freedom (in $D = 4$ dimensions), while the full action (1.3) still carries all 5 propagating degrees of freedom. The very existence of such a decoupling limit relies on configurations for $\phi^A$ for which all 3 propagating degrees of freedom in $\mathcal{L}_{\rm MG-NLS}[\phi^A]$ are active.
In what follows we will first perform a full nonlinear Hamiltonian analysis for this massive gravity nonlinear sigma model. That is, we run the Dirac-Bergmann algorithm for the model, finding all the constraints and checking their consistency. We stress again that since we are taking a consistent decoupling limit, it is guaranteed that the number of degrees of freedom is not more than three, since $h_{\mu\nu}$ accounts for the additional two. For technical reasons, we will limit ourselves to the so-called minimal model, although our results hold in full generality for generic sets of parameters. As expected, this Hamiltonian analysis concludes that in four dimensions, 3 out of the 4 Stückelberg fields are dynamical degrees of freedom. In other words, both the vector and scalar modes in $\phi^A$ are dynamical. Interestingly, even though 'gravity' is entirely decoupled, the BD ghost mode is still eliminated. This is of course due to the matrix square root structure and the anti-symmetrization scheme of the ghost-free graviton potential [16], and was guaranteed by taking the decoupling limit.
Having proven that the nonlinear sigma model (1.4) includes 3 degrees of freedom, one can then search for backgrounds where the longitudinal mode is dynamical. In principle most vacua of the theory will excite all 3 DoFs, but the trivial one $\phi^A = x^A$ and any Lorentz-invariant generalization are special in that at linear order they exhibit an accidental $U(1)$ gauge symmetry. For the isolated nonlinear sigma model, the longitudinal mode is thus infinitely strongly coupled on these trivial vacua and their regime of validity is null. For massive gravity, however, the coupling to gravity breaks the accidental $U(1)$ and provides a kinetic term for all the relevant degrees of freedom. This implies that vacua where the Stückelberg fields preserve Lorentz invariance are acceptable vacua for massive gravity, with the strong coupling scale on these vacua lowered to $\Lambda_3$, but these vacua are not acceptable for the nonlinear sigma model. Instead, for the nonlinear sigma model and for massive gravity with a $\Lambda_2$-decoupling limit, one needs to consider non-trivial (weakly Lorentz-breaking) vacua for the Stückelberg fields. (Of course, for the nonlinear sigma model alone, $\Lambda_2$ is a free tunable dimensionful parameter.) Finding exact vacua may be generically challenging from a purely technical viewpoint. Plane waves are exact solutions which play the role of instructive toy models. More generic vacua can be constructed perturbatively, either by performing a small field expansion about the trivial vacuum or by performing a local expansion about a given point in spacetime. The latter expansion will prove convenient to establish the full stability of the DoFs and derive the corresponding strong coupling scale.
A non-trivial background $\bar\phi^a$ will necessarily introduce some characteristic energy scale $L^{-1}$ (it may of course introduce more scales, and when that happens the relevant energy scale for this discussion is the smallest one). When taking the decoupling limit (1.2) we keep the scale $L^{-1}$ fixed, and the resulting strong coupling scale ends up being $\Lambda_2$ dressed by some positive powers of $L^{-1}$. This scale $L^{-1}$ plays a similar role to the anti-de Sitter (AdS) curvature when considering massive gravity on AdS [17][18][19][20][21]. Note however that, unlike massive gravity on AdS, we will restrict this discussion to the case where the spacetime curvature vanishes (at least up to order $m^2$ corrections).

Absence of linear vDVZ-discontinuity
The previous $\Lambda_2$-decoupling limit of ghost-free massive gravity has another virtue, namely the absence of coupling between matter fields and the Stückelberg fields. Indeed in the decoupling limit (1.3), only the standard tensor modes $h_{\mu\nu}$ couple to matter, as in General Relativity, while the additional three degrees of freedom, and specifically the longitudinal mode, fully decouple. This immediately implies that already in the linear regime, i.e. already at distances large compared to $L$ and $\Lambda_2^{-1}$ but smaller than $m^{-1}$, the phenomenology of ghost-free massive gravity on these vacua is very close to General Relativity, without even needing to invoke any explicit Vainshtein mechanism (or in other words the non-trivial vacua already automatically implement the Vainshtein mechanism). Beyond this decoupling limit we expect corrections suppressed by positive powers of $\Lambda_*/M_{\rm Pl}$, and fifth forces will also be suppressed by a similar amount (see Ref. [22] for relevant discussions).
Footnote 2 (attached to the section title): We emphasize that for most theories of massive gravity, the vDVZ discontinuity arises from considering a linear theory beyond its regime of validity, and represents a failure of the linear theory, while the discontinuity is expected to be absent at the nonlinear level. In this manuscript we show that in a large class of non-trivial vacua, the absence of the discontinuity is already manifest at the linear level for ghost-free massive gravity.
The decoupling of the longitudinal mode also implies that the theory (i.e., ghost-free massive gravity with the Minkowski reference metric) is free from the standard vDVZ-discontinuity at the linearized level about these non-trivial vacua, similarly to massive gravity on AdS [17][18][19] (or on a general FLRW background [23,24]). A crucial distinction from massive gravity on AdS is that in our approach the gravitational (or geometric) sector is insensitive to the scale $L$ in the decoupling limit, and the background metric is Minkowski-like (or can be taken to be de Sitter or FLRW if the relevant cosmological constant or matter fields are included). For massive gravity on AdS, on the other hand, the gravitational sector is strongly sensitive to the AdS curvature scale $L$ even in the decoupling limit. There, reaching a limit where the metric is Minkowski requires sending $L^{-1} \to 0$ and therefore leads to an arbitrarily low strong coupling scale (see Fig. 1).
Our approach also differs from standard Lorentz-violating theories of massive gravity (see Ref. [25] for a classification), where the strong coupling scale can be $\Lambda_2$ (or even higher when considering Lorentz-breaking generalizations of the Einstein-Hilbert term [26]). Indeed, in these theories the Lagrangian manifestly breaks Lorentz invariance. In the model we consider here, the fundamental theory preserves Lorentz invariance, which is only broken spontaneously about the vacua we consider.

Nonlinear sigma models with Lorentzian target spaces
The potential of massive gravity can be seen as a non-standard nonlinear sigma model for the four Stückelberg fields $\phi^A$, mapping from the spacetime with metric $g_{\mu\nu}$ (or $\eta_{\mu\nu}$ in the absence of gravity) to the target space (the reference metric [27]).
For a standard nonlinear sigma model, a typical requirement is that the target space be Riemannian (its metric being positive definite) to avoid ghost DoFs (see e.g. [28][29][30]). From this point of view, it is not surprising that generically massive gravity is plagued by the BD ghost, as the internal space of the Stückelberg fields is Lorentzian (pseudo-Riemannian with signature $(-+\cdots+)$). Ghost-free massive gravity then acts as a unique and special case that evades the Riemannian requirement. For a symmetric target space, the Lorentzian nature translates to non-compactness of the associated symmetry group.
At the technical level, the BD ghost is eliminated in ghost-free massive gravity because of the existence of a second-class constraint [8]. Taking the decoupling limit (1.2) of ghost-free massive gravity, the nonlinear sigma model decouples from the gravitational tensor DoFs. Since a decoupling limit never changes the number of DoFs (if taken appropriately), the absence of the sixth BD mode in ghost-free massive gravity ensures the absence of ghost in the nonlinear sigma model. As a result, and as we mentioned above, the nonlinear sigma model that arises from massive gravity is free of the ghost associated with the negative direction of the target space. This is in contrast with the other known ways to avoid the Riemannian requirement on the target space, which all rely on invoking some gauge DoFs. This is for instance the case for the string Polyakov/Nambu-Goto action [31][32][33][34][35][36], or more generally for $p$-brane actions [37], where the target space is the spacetime itself, and thus Lorentzian. Another known mechanism is to invoke normal gauge fields that are auxiliary, that is, without a kinetic term for the gauge field. This mechanism is used in supergravity model building (see e.g. [38,39]). All these known exceptions with a Lorentzian target space do not compromise the spirit of the Riemannian requirement, in the sense that once the auxiliary gauge/diffeomorphism DoFs are fixed by making use of the auxiliary field equations of motion and gauge choices, the target space becomes manifestly Riemannian. On the other hand, the massive gravity nonlinear sigma model and its generalization rely on two second-class constraints to project out the would-be ghost associated with the negative direction.
Since the ghost-free graviton potential is unique, up to a few free parameters, it follows that the massive gravity nonlinear sigma model (1.4) in $D$ dimensions (with the sum starting from $n = 1$, the internal space metric $\eta_{AB}$ replaced by $f_{AB}(\phi)$ and the coefficients $\alpha_n$ generalized to be functions of the Stückelberg fields, $\alpha_n(\phi)$ [15,40]) is the only nonlinear sigma model where the target space is Lorentzian. We emphasize that the target space can be higher-dimensional than the spacetime (that is, $N > D$). The case of $N < D$ is more subtle and will be discussed in § 8. See [15] for a bi-gravity braneworld interpretation of this generalized nonlinear sigma model and more discussions on nonlinear sigma models with Lorentzian target spaces.
Outline.-The rest of the manuscript is organized as follows: We start by introducing ghost-free massive gravity and a generalization of the Nambu-Goto action in § 2, derive the value of the strong coupling scale about the trivial vacuum on Minkowski and AdS, and explain the origin of the vDVZ-discontinuity on Minkowski and its absence on AdS. We then perform the full nonlinear Hamiltonian analysis in § 3 for the massive gravity nonlinear sigma model and confirm the existence of two second-class constraints that remove the BD ghost associated with the negative direction of the target space. Motivated by this result we first provide in § 4 an explicit exact nonlinear example of a vacuum solution where all the DoFs are manifest. Although that vacuum turns out to be unstable, it serves as a useful explicit proof-of-principle. In § 5 we then derive more general classes of backgrounds by expanding the background itself and by adopting a local coordinate expansion. We find a family of stable vacua where all the DoFs are manifest and healthy. The related strong coupling scale on these stable vacua is established in § 6. These results are valid in dimensions larger than two. In two dimensions we show in § 7 that the $U(1)$ symmetry is preserved to all orders and the corresponding nonlinear sigma model hence propagates no DoFs. In § 8, we give a short summary of our main results.

Ghost-free massive gravity and nonlinear sigma model
In this section, we introduce the ghost-free graviton potential in a conceptually novel way: as a non-standard nonlinear sigma model with a Lorentzian target space. In this formulation, the importance of the scale $\Lambda_2$ is manifest.

Nambu-Goto action for non-compact space
We start by considering a theory of $N$ scalar fields $\phi^A$ living on a $D$-dimensional flat spacetime with metric $\eta_{\mu\nu}$. These $N$ scalar fields may be thought of as coordinates on a non-trivial target (field space, or internal) manifold specified by the metric $f_{AB}(\phi)$. This corresponds to a nonlinear sigma model whose action can typically be written as in Eq. (2.1). Nonlinear sigma models [41] are effective field theories for multiple fields $\phi^A$ with applications in various areas of physics (see, e.g., [28][29][30] and references therein for a review). The nonlinear sigma model of Eq. (2.1) is well-defined and free of ghost if the internal space metric $f_{AB}$ is positive definite, i.e., the target space has to be Riemannian (as opposed to pseudo-Riemannian). If the target space is symmetric, this means that the associated isometry group needs to be compact.
When considering a non-compact space, the internal space metric $f_{AB}$ typically has a negative eigenvalue and the sigma model (2.1) has a ghost. One possible way out is to ensure that the mode associated with the negative direction is in fact not dynamical, or is a gauge mode. This is indeed the resolution for the Polyakov action for a $p$-brane, where the spacetime metric $\eta_{\mu\nu}$ is promoted to an auxiliary field $g_{\mu\nu}(x)$ and diffeomorphism invariance ensures that the would-be ghost DoF associated with the negative direction of the internal space is a gauge mode. If the internal space has signature $(-+\cdots+)$, then naively the field $\phi^0(x)$ behaves as a ghost. But this action is invariant under diffeomorphisms and the naive ghost is merely a gauge degree of freedom. This is obvious in the 'static' gauge where $\phi^\mu = x^\mu$ for $\mu = 0, \ldots, p$, and the left-over target space is manifestly positive definite for the remaining $\phi^A$ with $A = p+1, \ldots, N-1$. An alternative way to see this is to write the auxiliary field metric $g_{\mu\nu}$ in ADM form [42], $g_{\mu\nu}\,dx^\mu dx^\nu = -N_0^2\,dt^2 + \gamma_{ij}\,(dx^i + N^i dt)(dx^j + N^j dt)$, and then the lapse $N_0$ plays the role of a Lagrange multiplier that imposes a first-class constraint projecting out the would-be ghost DoF. Here $p_A = \partial\mathcal{L}_{\rm Polyakov}/\partial\dot\phi^A$, and we have accounted for the entire dependence on the lapse in the Hamiltonian. Actually, for the $p = 1$ string case, we see that for this procedure to work it is essential that the internal space metric $f_{AB}$ be not sign definite, otherwise the constraint would fix more than one phase space variable. In addition to this Hamiltonian constraint, there are $D - 1$ additional first-class constraints generated by the shifts $N^i$, but only the Hamiltonian constraint is required to remove the would-be ghost in this Lorentzian space.
Since the metric $g_{\mu\nu}$ is not dynamical in this model and merely plays the role of a set of auxiliary variables, we can integrate it out without changing the number of DoFs, and we are then left with the well-known Nambu-Goto action for the $p$-brane, Eq. (2.4). The Nambu-Goto action still enjoys the same gauge symmetry, and static gauge can still be chosen to make the target space manifestly positive definite.
On the other hand, if the $D$-dimensional tensor $X^\mu{}_\nu$, defined as the matrix square root of $g^{\mu\alpha}\partial_\alpha\phi^A\partial_\nu\phi^B f_{AB}$ (see footnote 5), is diagonalizable, then the Nambu-Goto action may also be rewritten as in Eq. (2.6), where our anti-symmetrization convention includes the averaging factor $1/n!$ in front. In this language, the absence of ghost for this non-compact target space can be traced back to the relation (2.7), which signals that not all of the $N$ scalar fields $\phi^A$ are dynamical.
Footnote 5: The matrix square root is taken as the principal branch solution of the matrix equation $X^\mu{}_\alpha X^\alpha{}_\nu = g^{\mu\alpha}\partial_\alpha\phi^A\partial_\nu\phi^B f_{AB}$.
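As an illustration of the anti-symmetrization convention with the $1/n!$ averaging factor, the fully contracted anti-symmetrized products of $X$ reduce to the elementary symmetric polynomials of its eigenvalues. The sketch below uses the bracket notation $[X] \equiv {\rm Tr}\,X$, which is a notational assumption made here, and the last line holds on flat spacetime for the principal branch of the square root.

```latex
% [X] denotes the trace of X (notation assumed for this sketch).
\begin{align}
X^{\mu}{}_{[\mu]} &= [X], \\
X^{\mu}{}_{[\mu} X^{\nu}{}_{\nu]} &= \tfrac{1}{2}\left([X]^2 - [X^2]\right), \\
X^{\mu}{}_{[\mu} X^{\nu}{}_{\nu} X^{\rho}{}_{\rho]}
  &= \tfrac{1}{6}\left([X]^3 - 3[X][X^2] + 2[X^3]\right), \\
X^{\mu_1}{}_{[\mu_1} \cdots X^{\mu_D}{}_{\mu_D]} &= \det X
  = \sqrt{-\det\left(\partial_\mu\phi^A \partial_\nu\phi^B f_{AB}\right)} .
\end{align}
```

The final identity is the sense in which the product with $D$ factors of $X$ reproduces the Nambu-Goto density, while products with fewer factors give genuinely different Lagrangians.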

Generalization of Nambu-Goto
Inspired by the expression (2.6) for the Nambu-Goto action, it is now natural to extend it to Lagrangians $\tilde{\mathcal{L}}_n$ with $n \le D$, built from fewer factors of $X$. We may also consider a fully equivalent representation of the $\tilde{\mathcal{L}}_n$ by taking linear combinations of them, which defines the Lagrangians $\mathcal{L}_n$ of Eq. (2.9). So long as $N \ge D$, all of these Lagrangians for any $0 \le n \le D$ satisfy the same relation (2.7) as the Nambu-Goto action, which ensures the absence of ghost in any of these theories. While this generalization seems natural mathematically, at least at a superficial level, there is a crucial difference between the Nambu-Goto action and the generalized Lagrangians considered in (2.9): for the Nambu-Goto action, the rank of the matrix $H_{AB} = \partial^2\mathcal{L}_n/\partial\dot\phi^A\partial\dot\phi^B$ is $N - D$, while for the $\mathcal{L}_n$ the rank of the associated matrix $H_{AB}$ is $N - 1$. Also, as we have seen, the removal of the degrees of freedom for the Nambu-Goto action is associated with a gauge symmetry, while for the other $\mathcal{L}_n$ no symmetry is present and the removal of the ghost is related to second-class constraints. Nevertheless, for each one of these Lagrangians the vanishing of the determinant of the Hessian is what signals the absence of the would-be ghost for these $\Sigma$-models on the target space. Therefore, the generalized Nambu-Goto Lagrangian (which we will refer to as the massive gravity nonlinear sigma model, for reasons to become clear shortly) is given by Eq. (2.11), with the definitions of Eq. (2.12) and $N \ge D$.
Intriguingly, this generalization of the $p$-brane Nambu-Goto action exactly gives rise to the graviton potential of ghost-free massive gravity when $N = D$. To consider it in the context of a curved spacetime, we note that, instead of Eq. (2.6), the Nambu-Goto action can equivalently be cast in the form (2.13). The generalization of this action to terms with fewer factors of $X$ is exactly the ghost-free graviton potential. The difference again is that, while the Nambu-Goto term is diffeomorphism invariant, the terms with fewer factors of $X$ are not.
In what follows we will also consider embedding these models in a gravitational setup, i.e., coupling to the dynamical part of $g_{\mu\nu}$. This leads to ghost-free massive gravity in $D$ dimensions [8,9], Eqs. (2.14)-(2.16) (we shall consider only $N = D$ in the following, as our main interest is in the context of massive gravity), where the fields $\phi^A$ play the role of Stückelberg fields that restore diffeomorphism invariance.
In the gravitational setup, $\mathcal{L}_1$ is a tadpole term and $X^{\mu_1}{}_{[\mu_1} X^{\mu_2}{}_{\mu_2} \cdots X^{\mu_D}{}_{\mu_D]}$ acts as a cosmological constant, so we do not consider their contributions. Without loss of generality we may always set $\alpha_2 = 1$ and $\beta_1 = -1$. The constants $\alpha_n$ and $\beta_n$ are related via Eq. (2.17), so that the $\alpha_n$ with $2 < n \le D$ and the $\beta_n$ with $1 < n < D$ are two equivalent sets of free parameters of ghost-free massive gravity.

Linearized theory on Minkowski
"Σ-model".-Before considering the effects of gravity, we first focus on the "potential" term of massive gravity as a Lagrangian for the scalar fields $\phi^A$ in their own right, living on a flat Minkowski spacetime and decoupled from the gravitational sector. Note that a priori it is not certain that this "potential" scalar theory is actually continuously connected to massive gravity, which would require the existence of a decoupling limit of some sort. We will see that such a decoupling limit indeed exists. At any rate, for now, one may consider the "potential" action of massive gravity as a scalar field theory on its own. Let us, for instance, consider the Lagrangian $\mathcal{L}_2$ of Eq. (2.18). As will be shown in section 3, non-perturbatively this Lagrangian carries $D - 1$ degrees of freedom (the constraint that removes the ghost in ghost-free massive gravity remains active even in the absence of gravity). However, perturbatively about the trivial vacuum $\phi^A = x^\mu\delta^A_\mu$, the Lagrangian (2.18) only carries $D - 2$ rather than $D - 1$ DoFs. Indeed, at the linearized level, with $\phi^A = x^\mu\delta^A_\mu + V^A$, the Lagrangian $\mathcal{L}_2$ is a Maxwell theory for $V^A$ and enjoys a $U(1)$ gauge symmetry. In dimensions $N = D > 2$, that symmetry is an artifact of the linearized theory and does not survive at the nonlinear level.
This realization has a profound impact not only on the scalar theory (2.18), but also on massive gravity, as we shall see later. Indeed, for the scalar theory (2.18), the fact that one DoF fails to be dynamical on the trivial vacuum $\phi^a = x^a$ implies that this vacuum is infinitely strongly coupled and cannot be trusted (it has no regime of validity). This means that the theory (2.18) only makes sense if considered about different, non-trivial vacua which excite all $D - 1$ degrees of freedom.
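The emergence of Maxwell theory at quadratic order can be sketched as follows. The derivation assumes that $\mathcal{K}^\mu{}_\nu = \delta^\mu_\nu - X^\mu{}_\nu$ and that $\mathcal{L}_2 = \mathcal{K}^\mu{}_{[\mu}\mathcal{K}^\nu{}_{\nu]}$ up to the overall sign and normalization fixed in (2.18), which are not reproduced here; $S_{\mu\nu}$ and $F_{\mu\nu}$ are symbols introduced for this sketch, and $\simeq$ denotes equality up to total derivatives.

```latex
% Assumptions: K = 1 - X, L_2 = K^mu_[mu K^nu_nu], flat metric, phi^A = x^A + V^A.
\begin{align}
(X^2)^\mu{}_\nu &= \delta^\mu_\nu + \partial^\mu V_\nu + \partial_\nu V^\mu
                 + \partial^\mu V^A \partial_\nu V_A , \\
X^\mu{}_\nu &= \delta^\mu_\nu + S^\mu{}_\nu + \mathcal{O}(V^2), \qquad
S_{\mu\nu} \equiv \tfrac{1}{2}\left(\partial_\mu V_\nu + \partial_\nu V_\mu\right), \\
\mathcal{L}_2 &= \tfrac{1}{2}\left([S]^2 - [S^2]\right) + \mathcal{O}(V^3) \\
 &= \tfrac{1}{2}(\partial_\mu V^\mu)^2
  - \tfrac{1}{4}\,\partial_\mu V_\nu \partial^\mu V^\nu
  - \tfrac{1}{4}\,\partial_\mu V_\nu \partial^\nu V^\mu + \mathcal{O}(V^3) \\
 &\simeq -\tfrac{1}{4}\left(\partial_\mu V_\nu \partial^\mu V^\nu
  - \partial_\mu V_\nu \partial^\nu V^\mu\right)
  = -\tfrac{1}{8} F_{\mu\nu}F^{\mu\nu}, \qquad
  F_{\mu\nu} \equiv \partial_\mu V_\nu - \partial_\nu V_\mu .
\end{align}
```

The quadratic Lagrangian depends on $V_\mu$ only through $F_{\mu\nu}$, so it is invariant under $V_\mu \to V_\mu + \partial_\mu\xi$ and propagates $D - 2$ rather than $D - 1$ DoFs, in line with the counting quoted above.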
Implications for massive gravity.-In the context of massive gravity the situation is more positive for the vacuum $\phi^A = x^A$. Indeed the mixing with gravity breaks the $U(1)$ gauge symmetry and all $D - 1$ DoFs in the fields $\phi^A$ are dynamical. The trivial vacuum $\phi^\alpha = x^\alpha$ then has an interesting non-trivial regime of validity. In this case one of the DoFs in $\phi^\alpha$ only becomes dynamical (at the linearized level) through its mixing with gravity. This implies that, at the linear level, this DoF directly couples to matter with the same strength as gravity, which is at the origin of the linear vDVZ discontinuity.
To see this explicitly, let us start with the ghost-free massive gravity Lagrangian (2.14) and set the cosmological constant $\Lambda_c = 0$ so as to have Minkowski as a vacuum solution. When splitting the fields as $\phi^\alpha = x^\alpha + A^\alpha + \eta^{\alpha\beta}\partial_\beta\chi$ and the metric as $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}/M_{\rm Pl}$, at the linear level the only place where the kinetic term for $\chi$ enters is through its coupling with $h_{\mu\nu}$. Symbolically, this is given by Eq. (2.19), where $T_{\mu\nu}[\psi_i]$ is the stress-energy tensor of the external fields $\psi_i$ coupled to gravity. The mixing term can be taken care of by performing the field space rotation, symbolically $h_{\mu\nu} = \tilde h_{\mu\nu} + \hat\chi\,\eta_{\mu\nu}$, with $\Lambda_3^3 = m^2 M_{\rm Pl}$ and $\hat\chi$ the canonically normalized helicity-0 mode. At the linear level, the coupling $\hat\chi T$ between $\hat\chi$ and any non-conformal matter is insensitive to the graviton mass $m$ and does not vanish in the massless limit. This is of course at the origin of the well-known linear vDVZ-discontinuity, and its resolution lies in the nonlinear interactions, which become increasingly important in the small mass limit as pointed out by A. Vainshtein in [5]. In the context of nonlinear massive gravity the implementation of this Vainshtein mechanism was considered for instance in [12,14,43,44]. At the nonlinear level the theory involves interactions of the form $h(\partial^2\hat\chi)^{n+1}/\Lambda_3^{3n}$, which implies that the theory is strongly coupled at the scale $\Lambda_3$ [8,27].
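The power counting behind the scale $\Lambda_3$ can be made explicit with a short symbolic estimate. It assumes that the graviton potential enters the action with the overall factor $M_{\rm Pl}^2 m^2$ and that the canonical normalization is $\hat\chi = \Lambda_3^3\chi$; order-one coefficients and index structure are dropped, so this is only a sketch of the scaling.

```latex
% Symbolic power counting; O(1) coefficients and index contractions are omitted.
\begin{align}
M_{\rm Pl}^2 m^2 \,\frac{h}{M_{\rm Pl}}\,(\partial^2\chi)^{n+1}
 &= M_{\rm Pl} m^2 \, h \,\frac{(\partial^2\hat\chi)^{n+1}}{\Lambda_3^{3(n+1)}}
  = \frac{h\,(\partial^2\hat\chi)^{n+1}}{\Lambda_3^{3n}},
 \qquad \Lambda_3^3 = M_{\rm Pl} m^2 .
\end{align}
```

For $n = 0$ this is the kinetic mixing $h\,\partial^2\hat\chi$ removed by the field space rotation, while for $n \ge 1$ the operators are suppressed by powers of $\Lambda_3$ only, so they survive when $m \to 0$ and $M_{\rm Pl} \to \infty$ with $\Lambda_3$ held fixed.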

Linearized theory on AdS
"Σ-model".-When applied to AdS, the previous analysis has a rather different outcome. Consider again the Lagrangian $\mathcal{L}_2$ in (2.18) in its own right (i.e. separated from its gravitational context) in $N = D$ dimensions but on an AdS spacetime, so that the tensors $\mathcal{K}$ and $X$ are now built as in Eq. (2.21), where $\gamma_{\mu\nu}$ is the AdS metric with curvature $L^{-2}$, so that its associated Ricci tensor is proportional to $-L^{-2}\gamma_{\mu\nu}$. Then the AdS curvature is sufficient to break the $U(1)$ gauge symmetry already at the linear level. Indeed, at the linear level about the trivial vacuum with $g_{\mu\nu} = \gamma_{\mu\nu}$, the Lagrangian acquires a mass term for the perturbation $A_\mu$, where all the contractions and covariant derivatives are with respect to the AdS spacetime metric. The appearance of a mass term for $A_\mu$ on AdS implies that the theory enjoys no accidental $U(1)$ and the helicity-0 mode $\chi$ acquires a kinetic term, $A_\mu^2 \supset (\partial\chi)^2$. It follows that on AdS the trivial vacuum $\phi^a = x^a$ is a perfectly well-defined and acceptable vacuum for the sigma model (2.18) of $N = D$ fields, out of which $D - 1$ are dynamical. Naturally, this result holds true for any generalization of that model, $\mathcal{L}_2 + \sum_{n=3}^{D}\alpha_n\mathcal{L}_n$.
Implications for massive gravity on AdS.-This result propagates to the case of gravity, where it was shown that the linearized vDVZ discontinuity is absent on AdS [17][18][19][20][21]. Indeed, in the limit where the AdS curvature is larger than the graviton mass, $m \ll L^{-1}$, the canonically normalized field is now $\hat\chi = \Lambda_*^3\chi$, with $\Lambda_*$ given in (2.23), and the coupling between $\hat\chi$ and matter now takes the form (2.24), which makes the massless limit of the linearized theory well-defined already at the linear level about AdS. This massless limit seems to occur without the need for a Vainshtein mechanism, but we stress that:
1. The Vainshtein mechanism is actually (secretly) active through the AdS background, and this absence of discontinuity is in fact a direct implementation of the Vainshtein mechanism.
2. Strong coupling is still present in that theory. Indeed, the nonlinear theory includes interactions of the form $(\partial\hat\chi)^2(\partial^2\hat\chi)^{n}/\Lambda_*^{3n}$, implying that the theory is then strongly coupled at the scale $\Lambda_*$ as given in (2.23).
As shown in Fig. 1, taking the limit $m \to 0$ and $L^{-1} \to 0$ leads to the same scaling as if one had started straight from massive gravity on Minkowski and taken the massless limit. However, for a finite mass $m$ the strong coupling scale can be pushed higher if the AdS curvature is sufficiently large, $m \ll L^{-1}$, although this comes at the price of working about a non-Minkowski reference metric.
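The scalings described above can be illustrated with a rough power-counting sketch. It assumes that, for $m \ll L^{-1}$, the scale in (2.23) takes the form $\Lambda_*^3 = M_{\rm Pl}\, m/L$, as suggested by the mass term $M_{\rm Pl}^2 m^2 L^{-2} A_\mu^2$ for $A_\mu \supset \partial_\mu\chi$; this form, and the omission of all order-one coefficients, are assumptions of this sketch rather than statements taken from the text.

```latex
% Rough power counting on AdS for m << 1/L; Lambda_*^3 = M_Pl m / L is an assumption.
\begin{align}
M_{\rm Pl}^2 m^2 L^{-2} (\partial\chi)^2 &= (\partial\hat\chi)^2
  \quad\text{with}\quad \hat\chi = \Lambda_*^3\chi, \qquad
  \Lambda_*^3 = \frac{M_{\rm Pl}\, m}{L}, \\
M_{\rm Pl} m^2 \, h\,\partial^2\chi &= (mL)\, h\,\partial^2\hat\chi
  \quad\Longrightarrow\quad
  \text{coupling to matter} \sim (mL)\,\frac{\hat\chi\, T}{M_{\rm Pl}}
  \xrightarrow[\;m \to 0\;]{} 0, \\
\frac{\Lambda_*}{\Lambda_3} &= (mL)^{-1/3} \gg 1
  \quad\text{for}\quad m \ll L^{-1}, \qquad
  \Lambda_* \to \Lambda_3 \quad\text{as}\quad L^{-1} \to m .
\end{align}
```

Within these assumptions, the suppression factor $mL$ is what removes the linear discontinuity, and the same factor explains why the strong coupling scale falls back to the Minkowski value once the curvature scale $L^{-1}$ is lowered to the graviton mass.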
In what follows we will show how one can capture some of these features of massive gravity on AdS (namely the absence of linearized vDVZ-discontinuity and a higher strong coupling scale) while maintaining the reference metric nearly Minkowski. What we will consider instead is a non-trivial Lorentz-violating vacuum for the Stückelberg fields.

Nonlinear Hamiltonian analysis
In the rest of this manuscript, we focus on the case where $N = D$ and no longer distinguish between spacetime and target space indices. In this section we run the Dirac-Bergmann algorithm for the nonlinear theory (2.11). We will see rigorously that even when decoupling gravity, the BD ghost is eliminated, as argued above, and in general there is no gauge symmetry to further reduce the number of DoFs in the fields $\phi^\alpha$. Lorentz-invariant vacua are hence special as they re-introduce an accidental $U(1)$ symmetry at linear order, but that $U(1)$ is not a symmetry of the full sigma model and does not survive at higher order. Therefore in $D$ dimensions, $\phi^\alpha$ involves $D - 1$ dynamical DoFs.
For simplicity, and without loss of generality, we focus in this section on the minimal model given in (2.11). The general model yields the same result. To explicitly perform the Hamiltonian analysis, it is convenient to work with an equivalent form of the minimal Lagrangian, in which the auxiliary variable $\lambda_{\mu\nu}$ is a symmetric tensor with inverse $\lambda^{\mu\nu}$. See Appendix A.1 for the equivalence between this Lagrangian and $-{\rm Tr}X$. To derive the Hamiltonian, we perform an ADM-like split of the symmetric tensor $\lambda_{\mu\nu}$ into $\lambda_0$, $\mu_k$ and $\sigma_{ij}$, where Latin indices are for now lowered or raised with $\sigma_{ij}$ or its inverse $\sigma^{ij}$ respectively. The conjugate momenta for $\phi^\alpha$ and $\sigma_{ij}$ are defined in the standard way, where the Lorentz index $\alpha$ is lowered with $\eta_{\alpha\beta}$. After the Legendre transform, the Hamiltonian becomes quadratic in $\mu_k$ and linear in $\lambda_0$. Integrating out $\mu_k$, we obtain a Hamiltonian in which we have introduced a new set of Lagrange multipliers $\mu_{ij}$ to impose the relation (3.6), and in which $\mathcal{C}^{(1)}$ and $\mathcal{C}^{(1)}_{ij}$ are primary constraints. If one now further integrates out $\sigma_{ij}$, one can see that it is not possible to have any further constraints apart from the secondary constraint associated with $\mathcal{C}^{(1)}$. But to be prudent, we show this explicitly by keeping $\sigma_{ij}$.
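Since $\mathcal{C}^{(1)}$ and $\mathcal{C}^{(1)}_{ij}$ contain only conjugate momenta but not the fields themselves, it is clear that the primary constraints have vanishing Poisson brackets among themselves, $\{\mathcal{C}^{(1)}(x), \mathcal{C}^{(1)}(y)\} = \{\mathcal{C}^{(1)}(x), \mathcal{C}^{(1)}_{kl}(y)\} = \{\mathcal{C}^{(1)}_{ij}(x), \mathcal{C}^{(1)}_{kl}(y)\} = 0$ (3.13), and thus the time preservation of $\mathcal{C}^{(1)}$ and $\mathcal{C}^{(1)}_{ij}$ generates the secondary constraints $\mathcal{C}^{(2)}$ and $\mathcal{C}^{(2)}_{ij}$. Then we check whether the time preservation of $\mathcal{C}^{(2)}$ and $\mathcal{C}^{(2)}_{ij}$ gives rise to any tertiary constraints. Making use of the Poisson brackets between the primary and secondary constraints, the consistency equations $\dot{\mathcal{C}}^{(2)}(x) = 0$ and $\dot{\mathcal{C}}^{(2)}_{ij}(x) = 0$ lead to the system of equations ending in (3.21), where $\{\mathcal{C}^{(2)}(x), H_0(y)\} \neq 0$ and $\{\mathcal{C}^{(2)}_{ij}(x), H_0(y)\} \neq 0$ in general. This is a non-degenerate system of linear equations for the unknowns $\lambda_0$ and $\mu_{mn}$.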
One can indeed check that all $\lambda_0$ and $\mu_{mn}$ are determined by this system of linear equations. This is more easily performed in a specific number of dimensions. For example, in $D = 4$ dimensions, one can show that the rank of the system of linear equations is 7, which corresponds to the number of $\lambda_0$ and $\mu_{mn}$. Thus, all $\lambda_0$ and $\mu_{mn}$ are determined. The Dirac-Bergmann algorithm ends here and all constraints are second class. Counting the phase space DoFs, we have, in $D = 4$ dimensions, $(4 + 6) \times 2 - (6 + 1) - (6 + 1) = 6 = 3 \times 2$ (3.22), meaning that the number of physical DoFs is indeed 3. This result was proven for the minimal model $\mathcal{L}_1$, but by continuity it holds for a general theory of the form (2.11). We will re-confirm this result with a couple of different methods in the following.
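The counting in (3.22) can be written for general $D$, under the assumption that the constraint structure found above (one scalar constraint and one symmetric spatial-tensor constraint at both the primary and secondary level, all second class) persists in any dimension. The symbol $n_\sigma$ is introduced here only for this sketch.

```latex
% n_sigma = number of independent components of the symmetric (D-1)x(D-1) tensor sigma_ij.
\begin{align}
n_\sigma &= \tfrac{1}{2}D(D-1), \\
\underbrace{2\,(D + n_\sigma)}_{\text{phase space of }\phi^\alpha,\ \sigma_{ij}}
 - \underbrace{(n_\sigma + 1)}_{\text{primary}}
 - \underbrace{(n_\sigma + 1)}_{\text{secondary}}
 &= 2\,(D - 1), \\
D = 4:\qquad 2\,(4 + 6) - 7 - 7 &= 6 = 2 \times 3 .
\end{align}
```

The result, $D - 1$ physical DoFs, matches the count stated for $\phi^\alpha$ in $D$ dimensions, and it reduces to zero propagating vector-type redundancy only when the accidental $U(1)$ of the linearized theory is absent.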

Exact non-trivial vacuum solution
Having shown that the massive gravity nonlinear sigma model also carries the two constraints that remove the BD ghost, and thus has 3 DoFs on generic backgrounds in $D = 4$ dimensions, we shall now present an explicit example where this occurs. In order to remain independent of the precise matter content of the model, we work in vacuum. In this sense our approach is different from, say, massive gravity on AdS, which requires a negative cosmological constant to source the background configuration. For the sake of simplicity, we focus once again on the minimal model, although our conclusions remain the same for any linear combination of the Lagrangians $\mathcal{L}_n$.

Plane-waves
One of the difficulties in solving this equation for generic configurations of the fields $\phi^a$ lies in evaluating the square root that enters $X^\mu{}_\nu$. In what follows we will evaluate this square root by performing perturbative expansions about the trivial vacuum, but for now we may consider the particularly simple, yet instructive, example of plane waves (despite the terminology, these solutions need not exhibit oscillatory behavior, and the functions $F_I$ and $G_I$ are arbitrary). Take for instance the configuration (4.1), where we have used the notation $x^0 = t$, $x^1 = x$ and the index $I$ labels the orthogonal directions, $I = 2, \cdots, D-1$. This solves the vacuum equations of motion for arbitrary combinations of the Lagrangians $\mathcal{L}_n$ defined in (2.9) and for arbitrary analytic functions $F_I$ and $G_I$. Indeed, the tensor $\bar X^\mu{}_\nu$ associated with these plane-wave configurations (4.1) satisfies $\partial_\mu (\bar X^n)^\mu{}_\nu = 0$ and $\partial_\mu \mathrm{Tr}\,\bar X^n = 0$ for any power $n$. This implies that the background configuration (4.1) satisfies the equations of motion for the fields $\phi^\alpha$ for arbitrary combinations of the Lagrangians $\mathcal{L}_n$.
For instance, without loss of generality, we can set $G_I = 0$ for all $I = 2, \cdots, D-1$, $F_I = 0$ for all $I = 3, \cdots, D-1$, and write $F_2(t-x) = F(t-x)$. For simplicity we then work in $D = 3$ dimensions, where the background takes the form (4.2). While the square-root matrix $X^\mu{}_\nu$ has many branches, it is understood that one should choose the branch that connects to the identity matrix when $F(t-x) \to 0$. The matrix $\bar X^\mu{}_\nu$ associated with the non-trivial vacuum (4.2) is then given by (4.3), where a prime denotes a derivative with respect to the function's argument. One can check that this matrix satisfies $\mathrm{Tr}[\bar X^n] = 3$ for any power $n$, so that $\partial_\mu \mathrm{Tr}\,\bar X^n = 0$. Furthermore, one can explicitly check that $\partial_\mu \bar X^\mu{}_\nu = \partial_\mu (\bar X^2)^\mu{}_\nu = \partial_\mu (\bar X^3)^\mu{}_\nu = 0$, so (4.3) satisfies the vacuum equations of motion for an arbitrary combination of Lagrangians $\mathcal{L}_1 + \alpha_2 \mathcal{L}_2 + \alpha_3 \mathcal{L}_3$. This result is independent of the number of dimensions and remains valid for arbitrary configurations of the form (4.1).
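The properties quoted above can be checked numerically in a concrete case. The following sketch rests on several assumptions that are not spelled out in this section: that the $D = 3$ background is $\bar\phi^a = (t,\, x,\, y + F(t-x))$, that $(X^2)^\mu{}_\nu = \eta^{\mu\alpha}\partial_\alpha\phi^a\partial_\nu\phi^b\eta_{ab}$ with signature $(-,+,+)$, and that the branch connected to the identity is $X = 1 + \tfrac12(k^\mu e_\nu + e^\mu k_\nu) + \tfrac38 k^\mu k_\nu$, with $k_\mu = F'\,\partial_\mu(t-x)$ and $e_\mu = \partial_\mu y$. The helper names are hypothetical. Since $X$ depends on the coordinates only through $u = t - x$, the divergence $\partial_\mu (X^n)^\mu{}_\nu$ equals $d/du$ of $l_\mu (X^n)^\mu{}_\nu$ with $l_\mu = (1,-1,0)$, so it vanishes if that contraction is independent of $F'$.

```python
from fractions import Fraction

# Minkowski metric with signature (-,+,+); it is its own inverse.
ETA = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def x_squared(c):
    """eta^{-1} J^T eta J for the assumed background (t, x, y + F(t - x)), c = F'."""
    jac = [[1, 0, 0], [0, 1, 0], [c, -c, 1]]   # jac[a][mu] = d phi^a / d x^mu
    return matmul(ETA, matmul(transpose(jac), matmul(ETA, jac)))


def x_candidate(c):
    """Assumed root X = 1 + (k e + e k)/2 + (3/8) k k; equals the identity at c = 0."""
    h = Fraction(c) / 2
    q = Fraction(3, 8) * c * c
    return [[1 - q, q, -h], [-q, 1 + q, -h], [h, -h, 1]]


def check(c, max_power=3):
    """Verify X^2, Tr X^n = 3 and l_mu (X^n)^mu_nu = l_nu for n = 1..max_power."""
    x = x_candidate(c)
    if matmul(x, x) != x_squared(c):
        return False
    power = IDENTITY
    for _ in range(max_power):
        power = matmul(power, x)
        trace = sum(power[i][i] for i in range(3))
        # Contraction with l_mu = (1, -1, 0): row 0 minus row 1.
        contracted = [power[0][j] - power[1][j] for j in range(3)]
        if trace != 3 or contracted != [1, -1, 0]:
            return False
    return True


if __name__ == "__main__":
    print(all(check(Fraction(p, 7)) for p in range(-5, 6)))
```

Exact rational arithmetic is used so that the comparisons are not affected by rounding. The check covers sample values of $F'$ only; it is a consistency test of the assumed form, not a derivation of (4.3).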

Degrees of freedom
Having established that the plane-wave configurations (4.1) are exact vacuum solutions, we now evaluate the number of perturbative DoFs. To establish the number of DoFs on that vacuum, it is sufficient to look at fluctuations of the form $\phi^\alpha = \bar\phi^\alpha + \varepsilon V^\alpha$, where we have introduced a dimensionless parameter $\varepsilon$ to count the order in perturbations. Focusing on the minimal model $\mathcal{L}_1$, to quadratic order in $V$ (quadratic order in $\varepsilon$) we obtain a quadratic Lagrangian whose coefficients $F^{\mu\nu\rho\sigma}$ are functions of $F$.
The Hamiltonian analysis performed in § 3 confirms that this model has only $D-1$ DoFs. About the trivial vacuum $\bar\phi^\alpha = x^\alpha$ ($F \equiv 0$), $V^0$ is indeed an auxiliary variable. On more generic vacua, the auxiliary variable is instead a linear combination of the fields $V^\mu$, and to simplify the derivation we can perform a rotation in field space, $V^\mu = W^\mu + R^\mu{}_\nu W^\nu$, so that $W^0$ is identified as the appropriate auxiliary variable. In $D = 3$ dimensions, the appropriate rotation is given by (4.6), so that $\dot W^0$ disappears entirely from the resulting Lagrangian and there are only two conjugate momenta, $\pi_i$ with $i = 1, 2$. The resulting Hamiltonian (to quadratic order in $\varepsilon$) is expressed in terms of coefficients $A_n$, which are functions of the background configuration $F$ and are of $n$-th order in the remaining phase-space variables $W^i$, $\pi_i$. The exact expressions for $A_2$ and $A_1$ are given in (A.16) and (A.15) of appendix A.2 but are irrelevant to this discussion. $A_0$ is given in (A.14) and vanishes on the Lorentz-preserving vacuum where $F \equiv 0$. About this trivial vacuum, $W^0$ is a Lagrange multiplier that generates a first-class constraint associated with an accidental $U(1)$ symmetry. Here we see explicitly that this symmetry is broken on generic backgrounds: while $W^0$ is still an auxiliary variable, it no longer generates a constraint for the phase-space variables $W^i$, $\pi_i$. All the $D-1$ remaining DoFs are then dynamical, and the resulting Hamiltonian (after integrating out the auxiliary variable $W^0$) can be written explicitly in terms of $W^i$ and $\pi_i$. This provides an explicit example of a vacuum where all the expected DoFs are excited. Unfortunately, in this specific example $A_0 > 0$ and the resulting Hamiltonian is not bounded from below, so the background solution turns out to be unstable. However, it represents an explicit proof of principle that non-trivial vacua can excite all the dynamical DoFs without needing to resort to a mixing with the tensor (gravitational) fields.
In what follows we will show how to construct a more general class of stable vacua by considering solutions for the Stückelberg fields which are perturbative about the trivial one. We emphasize that looking for perturbative vacua is only used as an approximate tool to derive explicit vacua, but the theory also contains much more general classes of vacua.

General perturbative backgrounds
We now present a different way to derive an acceptable non-trivial vacuum by relying on a perturbative approach. This will allow us to derive the Hamiltonian for a large class of vacua, confirming the DoF counting result of the full Hamiltonian analysis in § 3 and 4, and determining the absence of ghosts and gradient instabilities for a subclass of these vacua.

Hamiltonian of fluctuations
As considered previously, we look at fluctuations $V^\mu$ about a non-trivial vacuum $\bar\phi^\mu$, $\phi^\mu = \bar\phi^\mu + \varepsilon V^\mu$, where as before $\varepsilon$ is a small dimensionless parameter that keeps track of the order in perturbations about the vacuum $\bar\phi^\mu$. For convenience and ease of presentation, the vacuum configuration itself is now treated perturbatively: we take the background to be perturbative in a second dimensionless parameter $\epsilon$, $\bar\phi^\mu = x^\mu + \epsilon \bar B^\mu$ (in what follows, 'barred' quantities involve only the background).
For concreteness, we focus on a specific Lagrangian $\mathcal{L}^{\alpha_2}_{\rm NLS}$ in what follows, built with $K^\mu{}_\nu = \delta^\mu{}_\nu - X^\mu{}_\nu$, so that $\mathcal{L}^{\alpha_2}_{\rm NLS}$ differs from the minimal model $\mathcal{L}_1$ in (2.11). Including higher $\alpha_n$ terms adds some computational complexity, but as we shall see below the $\alpha_2$ term is sufficient for our purposes. We look at the Hamiltonian for the fluctuations $V^\alpha$ living on top of the perturbed background $\bar\phi^\mu$, and therefore wish to compute the Hamiltonian to quadratic order in $\varepsilon$ and perturbatively in the background expansion parameter $\epsilon$. We will see that working up to second order in $\epsilon$ is sufficient for this analysis. The resulting quadratic Lagrangian for $V^\mu$ is given symbolically by (5.4), together with the accompanying definitions. As expected, to lowest order in $\epsilon$ we recover the Maxwell term for $V^\mu$, and the theory enjoys an accidental $U(1)$ symmetry. The exact expressions at linear and quadratic order in $\epsilon$ in arbitrary dimensions are given in Appendix A.3.
We now follow the same procedure as in the previous section, see Eq. (4.6), and perform a field-space rotation so as to identify the auxiliary variable $W^0$, fixing the elements $\bar T^i$ perturbatively in the background expansion parameter $\epsilon$ so that the resulting Lagrangian does not involve any $\dot W^0$ (after appropriate integrations by parts). This procedure can be performed in arbitrary dimensions; focusing for simplicity on $D = 3$ dimensions, we obtain expressions for the $\bar T^i$ in terms of $\bar F_{\mu\nu} \equiv 2\partial_{[\mu}\bar B_{\nu]}$ (5.8). After substituting $\bar T^i$ into Eq. (5.4), we can confirm that $W^0$ is manifestly an auxiliary variable. To pass to the Hamiltonian formulation, we therefore define the conjugate momenta $\pi_i = \partial\mathcal{L}/\partial\dot W^i$ and obtain the Hamiltonian (5.9), where $G_1$ and $G_2$ do not depend on $W^0$ and their exact expressions are not relevant to the discussion here. The Hamiltonian also involves a background-dependent coefficient $\bar A$ multiplying the term quadratic in $W^0$. One important point to notice is that this term quadratic in the auxiliary variable $W^0$ only enters at quadratic order in $\epsilon$. This means that up to first order in the background expansion (zeroth and first order in $\epsilon$), the variable $W^0$ still acts as a Lagrange multiplier which generates the accidental $U(1)$ symmetry and removes one additional DoF. Indeed, had we truncated the theory to first order in $\epsilon$, $W^0$ would act as a Lagrange multiplier that enforces a primary constraint $C^{(\epsilon)}_1$. On the other hand, when the $O(\epsilon^2)$ corrections are included, $\bar A$ does not vanish for the background chosen, and $W^0$ remains an auxiliary variable but ceases to be a Lagrange multiplier. To that order, integrating out $W^0$ gives the Hamiltonian (5.12). Therefore, we can see that all the $D-1$ DoFs are now activated. The reason why the Hamiltonian is non-analytic in $\epsilon$ after integrating out $W^0$ is simply that our background is itself a perturbation around the trivial background $\phi^\alpha = x^\alpha$, where there is an accidental gauge symmetry and only $D-2$ DoFs are active.
The non-analyticity in the Hamiltonian (5.12) reflects the fact that a DoF activated by a perturbative background is very weakly coupled, as we shall see more explicitly in what follows. It is straightforward to construct backgrounds for which $\bar A$ does not vanish and is positive, and we shall construct approximate solutions below.
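The mechanism behind this non-analyticity can be illustrated by integrating out an auxiliary variable from a generic quadratic form. In the sketch below, $a$, $b$ and $c$ are placeholder symbols rather than the paper's notation: $a$ stands for the coefficient of the term quadratic in $W^0$, $b$ for the coefficient of the linear term, and $c$ for the $W^0$-independent part.

```latex
% Generic sketch: a, b, c are placeholders, not the notation of Eqs. (5.9)-(5.12).
\begin{align}
  H(W^0) &= a\,(W^0)^2 + b\,W^0 + c , \\
  \frac{\partial H}{\partial W^0} = 0
    \quad &\Longrightarrow \quad W^0_* = -\frac{b}{2a} , \\
  H(W^0_*) &= c - \frac{b^2}{4a} .
\end{align}
```

If $a = 0$, the variable $W^0$ instead acts as a Lagrange multiplier enforcing $b = 0$. If $a$ is proportional to $\epsilon^2 \bar A$ and $b$ contains $\partial_i \pi^i$, the reduced Hamiltonian contains a term of order $(\partial_i\pi^i)^2/(\epsilon^2 \bar A)$, which is singular as $\epsilon \to 0$; the overall sign depends on the sign convention adopted for $a$.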

The longitudinal mode
In the last subsection, we derived the quadratic Hamiltonian for the field $W^i$ on a generic background $\bar B^\mu$. Around the trivial background $\bar B^\mu = 0$ (or $\bar\phi^\alpha = x^\alpha$), the longitudinal mode of $W^i$ is only a gauge mode. Around a generic background, however (at least once the $O(\epsilon^2)$ terms in the background expansion parameter $\epsilon$ are included), this mode becomes dynamical and there are in total $D-1$ DoFs. Since the leading order, $O(\epsilon^0)$, of the Hamiltonian (5.9) is just Maxwell theory, $D-2$ of these DoFs are the transverse modes of an Abelian gauge field and are thus free of ghost and gradient instabilities. Therefore, to study the linear stability of this theory, we only need to focus on the longitudinal mode.
From the Hamiltonian (5.12), we see that the leading contribution to the longitudinal momentum mode $\chi$ comes from the term $(\partial_i\pi^i)^2/(4\epsilon^2\bar A)$. We rescale it with $\epsilon$ so as to make the kinetic term $O(\epsilon^0)$, see Eq. (5.13). Note that this is not yet the canonical normalization for the kinetic term, as there is still a characteristic scale in $\bar A$. Up to $O(\epsilon)$, neither $G_1$ nor $G_2$ contributes to the longitudinal mode $\psi$. This is because, up to $O(\epsilon)$ in the Hamiltonian (5.9), there is still a gauge symmetry, enforced by the first-class constraint $C^{(\epsilon)}_1$, as mentioned above. To see this explicitly, note that, at order $O(\epsilon)$, the contributions to $G_1$ and $G_2$ that are independent of $\pi_i$ are given in (5.14). These expressions are clearly independent of the longitudinal mode, since $G_{ij}$ vanishes for the longitudinal mode $W_i \propto \partial_i\psi$. So the leading gradient terms, i.e., the $\psi^2$ terms, come from the next-order pieces in $G_1$ and $G_2$. Thus, to make the leading gradient terms $O(\epsilon^0)$, we define the longitudinal mode as in Eq. (5.16). After performing the rescalings of Eqs. (5.13) and (5.16), the leading contribution to the Hamiltonian for the longitudinal mode can be written schematically. The first term always enters squared, so we may define a new variable $\tilde\chi$, see (5.18), and regard $\tilde\chi$ as the new conjugate momentum.
Therefore, the leading Hamiltonian is given by (5.19). The linear stability of the longitudinal mode is guaranteed if one can find a background $\bar B^\mu$ such that $\bar A$ is positive and the gradient term for $\psi$ is positive definite, at least within a local patch of spacetime.

Local backgrounds free of ghost and gradient instabilities
For a smooth $\Lambda_2$-decoupling limit to be well-defined, it is essential that there are some stable background solutions in the massive gravity nonlinear sigma model. For the perturbative backgrounds being considered, we have concluded that the background is stable if the longitudinal mode is stable, that is, if $H^{L}_{\rm NLS}$ is bounded from below. While one requires an exact solution to be stable across the whole spacetime, it is not necessary for a perturbative background to be stable globally, as the perturbative background may only be a good approximation of the underlying exact solution within a coordinate patch. Thus, to facilitate the stability analysis, we expand a generic perturbative background within a local spacetime patch. Within this approach, it is easy to give explicit examples where ghost and gradient instabilities are both absent. Suppose $\bar B^\alpha$ has a characteristic length scale $L$. We can then at least expect that within the spacetime patch $x < L$, $\bar B^\alpha$ is smooth and analytic, and approximates the underlying exact solution sufficiently well. Thus, we Taylor expand $\bar B^\mu$ around the coordinate origin and substitute $\bar B^\mu$ into the Hamiltonian (5.19). In this expansion, the coefficients $\bar H^\mu{}_\rho$ and $\bar M^\mu{}_{\rho\sigma}$ are constant. To leading order, both in $\epsilon$ and in $x/L$, we obtain Eq. (5.21), in which $\bar F_{\mu\nu}$ and $\bar A$ are now constants. Since $\bar F_{\mu\nu}$ and $\bar A$ are just constants, we can move $\partial_i$ and $\sqrt{\nabla^2}$ around by integration by parts and rewrite Eq. (5.21) accordingly. The gradient terms in this expansion are rather simple. In fact, they are manifestly positive definite. Thus, there are no gradient instabilities for any perturbative background within a local patch of size $L$. To determine the consistency of a perturbative background, one only needs to check for ghost instabilities, which amounts to checking whether or not a perturbative background gives rise to a positive $\bar A$.
The equations of motion for $\phi^\mu$ in this approach are imposed to lowest, and sufficient, order in the expansion. In this section we have established the existence of stable vacua for the longitudinal mode. Since on these perturbative vacua the other DoFs simply behave as an Abelian gauge theory (with small corrections), these DoFs are free of ghost and gradient instabilities. Moreover, the longitudinal mode does not mix with the gauge modes to leading order. Thus, at least within our perturbative approach, there are backgrounds in the massive gravity nonlinear sigma model that are entirely free of ghost and gradient instabilities.
The $\Lambda_2$-decoupling limit

In § 2.3, we have seen that around the trivial background the longitudinal mode of ghost-free massive gravity only acquires a kinetic term via mixing with the tensor modes. Thus, around the trivial background, the theory is strongly coupled at the scale $\Lambda_3$. In the previous sections, we have shown that the massive gravity nonlinear sigma model (2.11) has $D-1$ DoFs and that there are non-trivial backgrounds where all of these $D-1$ DoFs are excited and are stable, at least perturbatively. This means that on these generic vacua, ghost-free massive gravity (i.e., the dRGT model) admits a $\Lambda_2$-decoupling limit, in which the tensor modes decouple and which leads to the massive gravity nonlinear sigma model. We emphasize that directly setting $g_{\mu\nu} = \eta_{\mu\nu}$ in ghost-free massive gravity would be an inconsistent procedure. Rather, the correct way to obtain the massive gravity nonlinear sigma model is through the $\Lambda_2$-decoupling limit. In this way, the healthy properties of ghost-free massive gravity are carried over to the resulting scaled theory, i.e., the massive gravity nonlinear sigma model. To prove that a smooth $\Lambda_2$-decoupling limit exists, we need to make sure that the would-be decoupled theory has the right DoFs and that there are backgrounds where these DoFs are well-behaved, which we have shown in the previous sections. In what follows we can therefore work in this $\Lambda_2$-decoupling limit and determine how the strong coupling scale is redressed by the scale $L^{-1}$.
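To get a feel for the separation between the two scales, one can evaluate them numerically. The sketch below assumes the standard convention $\Lambda_2 = (M_{\rm Pl} m)^{1/2}$ alongside $\Lambda_3 = (M_{\rm Pl} m^2)^{1/3}$ from the introduction, a reduced Planck mass of about $2.435 \times 10^{18}$ GeV, and a graviton mass of order the present Hubble rate, $m \approx 1.5 \times 10^{-33}$ eV. These numerical inputs are illustrative assumptions, not values taken from the text.

```python
# Illustrative inputs (assumptions, in eV and natural units).
M_PL_EV = 2.435e27        # reduced Planck mass, about 2.435e18 GeV
M_GRAVITON_EV = 1.5e-33   # graviton mass taken of order the current Hubble rate
HBAR_C_EV_M = 1.97327e-7  # hbar * c in eV * metre, to convert energies to lengths


def lambda_n(n: int, m_pl: float = M_PL_EV, m: float = M_GRAVITON_EV) -> float:
    """Scale Lambda_n = (M_Pl * m^(n-1))^(1/n), assuming the standard convention."""
    return (m_pl * m ** (n - 1)) ** (1.0 / n)


def length_in_metres(energy_ev: float) -> float:
    """Length scale 1/E expressed in metres."""
    return HBAR_C_EV_M / energy_ev


if __name__ == "__main__":
    for n in (3, 2):
        scale = lambda_n(n)
        print(f"Lambda_{n} = {scale:.3e} eV, 1/Lambda_{n} = {length_in_metres(scale):.3e} m")
```

With these inputs, $\Lambda_3$ comes out of order $10^{-13}$ eV (a length of roughly a thousand kilometres), while $\Lambda_2$ is of order $10^{-3}$ eV (roughly a tenth of a millimetre), so the two scales are separated by about ten orders of magnitude.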

Generic operators
In § 5, we have shown that there are healthy backgrounds that are a small deviation from the trivial one, $\bar\phi^\mu = x^\mu$. It may well be the case that there are healthy backgrounds far away from the trivial solution. These could in principle be written in a form where $\bar\phi^\alpha$ is an exact background and $\bar Q^\alpha{}_\rho \sim O(1)$ is assumed to have a characteristic length scale $L$. One might also consider $\bar Q^\alpha{}_\rho$ not to be $O(1)$, but that simply amounts to redefining the graviton mass $m$ and tuning the dimensionless parameters $\alpha_n$ (or $\beta_n$) away from $O(1)$.
Schematically, the spacetime derivative of the background is controlled by the scale $L^{-1}$, and the matrix square root goes like $X \sim \partial\bar\phi\,(1 + \partial V/\partial\bar\phi + (\partial V/\partial\bar\phi)^2 + \cdots) + O(h/M_{\rm Pl})$. Substituting these into the action (2.14), the quadratic kinetic terms around this background can be written schematically. In our dimensional analysis below, we shall neglect all $O(1)$ factors such as $f^{\mu\nu}{}_{\rho\sigma}(\partial\bar\phi)$, as well as the Lorentz indices, unless needed for the discussion.
As shown in the previous sections, one DoF in $V^\mu$ is not dynamical, so one can always perturbatively perform a field redefinition such that $W^0$ is manifestly an auxiliary variable and the $D-1$ components $W^i$ are dynamical. At linear order in $W^\mu$, this redefinition should reduce to a linear rotation similar to that of Eq. (5.6), but with $\bar T^i$ now depending on the generic background $\bar\phi$. As shown in the previous section, the kinetic terms after the field redefinition can be written schematically, with a characteristic scale $L^{-1}$ coming out of the background every time a derivative is shifted from $W^\mu$ to the background $\bar\phi$. As $W^0$ is an auxiliary field, one can integrate it out; its solution to leading order in perturbations in $W^\mu$ is given in Eq. (6.9). We will later include all possible nonlinear terms in $W^\mu$ for $W^0$. Integrating out $W^0$ at leading order then gives Eq. (6.10), where $W^i_\perp$ represents the $D-2$ transverse modes and $W^i_\parallel$ the longitudinal mode, which is absent on the trivial vacuum but not on generic ones. Note that in deriving Eq. (6.10) we have neglected the $L\partial W^i$ term of Eq. (6.9). This is because a derivative acting on $W^\mu$ is greater than $L^{-1}$ within $x \lesssim L$, so one can symbolically think of $L\partial$ as a large number. The canonical normalizations are then given by (6.11), and from these normalizations it is obvious that the lowest strong coupling scale should come from pure $W^\mu$ interactions, i.e., terms without $h$.

Although the model is fixed (up to a few parameters), we now have the freedom to choose the vacuum $\bar\phi$. This choice affects the normalization and hence the scale of the interactions. We shall first assume that all a priori conceivable terms exist, and then comment on specific classes of vacua where certain terms happen to cancel. Before canonical normalization and before integrating out $W^0$, a generic interaction for $W^\mu$ takes the form (6.12). Next, we integrate out $W^0$, including all possible nonlinear orders and using Eq. (6.9) for $W^0$. (Here $N$ is not to be confused with the dimension of the target space that appeared earlier.) Substituting $W^0$ into Eq. (6.12) yields a generic interaction term. Assuming that $M$ of the $W^i$ are the longitudinal mode $W^i_\parallel$ and the rest are the transverse modes $W^i_\perp$, the canonical normalization gives an operator labelled by integers $T, N, K, P, Q, M$ satisfying the relation (6.16). For operators with $QK - P - M \leq 0$, the corresponding operator is either relevant or has a strong coupling scale that is no smaller than $\Lambda_2$ (simply noting that $T - 2 + Q(N-K) > 0$).

Strong coupling scale
The operators that enter at the lowest energy scale satisfy $QK - P - M > 0$, which imposes a further condition on the integers. For these operators, the associated energy scale is a geometric mean of $\Lambda_2$ and $L^{-1}$ (the characteristic scale of the background), expressed in terms of the integers $m = QK - P - M > 0$ and $n = T + Q(N-K) - 2 > 0$. For the stable perturbative backgrounds we have identified with the local coordinate expansion, the existence of a valid effective field theory requires that $L$ is larger than $\Lambda_{2*}^{-1}$, which implies $L^{-1} < \Lambda_2$. It follows that the lowest interaction scale comes from a geometric mean in which $L^{-1}$ has as many powers as possible. That is, the lowest strong coupling scale corresponds to the greatest ratio $m/n = (QK - P - M)/(T + Q(N-K) - 2)$. Using the relation (6.16) as well as $K > 0$, it is clear that the greatest ratio corresponds to $N = K = 2$, $P = M = 0$, $Q = T$ with $T = 3$. This ratio comes from cubic terms in $W^0$. Since $W^0$ is an auxiliary field, the $\partial^3$ in front of $(W^0)^3$ should only contain spatial derivatives. Thus, if all a priori possible terms exist in the perturbative expansion of $W^\mu$ on some background $\bar\phi$, then the lowest strong coupling scale is set by these cubic terms. On the other hand, it is conceivable that for certain backgrounds some operators may not exist or may cancel out. In addition, some operators may be removable by field redefinitions. Around those backgrounds, $\Lambda_{2*}$ can potentially be raised. In summary, the precise value of the strong coupling scale depends on the detailed properties of the vacuum and its characteristic scale $L$, and should be analyzed on a case-by-case basis. The dressed scale $\Lambda_{2*}$ can nevertheless be parametrically larger than the standard $\Lambda_3$ scale one typically derives in massive gravity. Notice that when $L$ is so large that the resulting scale $\Lambda_{2*}$ becomes comparable to or smaller than $\Lambda_3$, the interactions with gravity can no longer be ignored and the correct strong coupling scale does not actually fall below $\Lambda_3$.

U (1) symmetry in 2D
The general results of the previous sections apply to dimensions greater than two. In D = 2 dimensions, the massive gravity nonlinear sigma model has an extra gauge DoF, on top of the constraints that eliminate the BD ghost. So there is no physical DoF in the 2D massive gravity nonlinear sigma model, if the internal space is of the same dimension as the spacetime.
In this section, we show explicitly the gauge transformation around an arbitrary background.
For the general massive gravity nonlinear sigma model in 2D, we adopt for simplicity a Euclidean signature for $\eta_{\mu\nu}$ and $\eta_{ab}$, as our goal is mainly to count the number of DoFs in the theory. Assuming $\bar A_\mu$ is a background solution of the equations of motion, we look for a small perturbation $V_\mu$ around it. The equations of motion for $\bar A_\mu$ are given in (7.3), and the quadratic Lagrangian for the perturbations $V_\mu$ on the vacuum $\bar A_\mu$ is evaluated with $\bar A_\nu$ satisfying these background equations of motion. By direct calculation, one can show that this Lagrangian is invariant under an infinitesimal gauge transformation with gauge parameter $\xi(t, x)$, once the on-shell conditions are imposed on $\bar A_\mu$. This implies that the $U(1)$ symmetry remains about any on-shell background of the theory. Since we have worked at quadratic order about an arbitrary background, our analysis is equivalent to working to all orders about the trivial background. The helicity-0 mode is hence fully absent from the theory, which propagates no physical degrees of freedom in $D = 2$ dimensions. The existence of this symmetry is very specific to $D = 2$ dimensions and, as we have seen, does not generalize to higher dimensions, where the $U(1)$ symmetry is broken in the full theory.

Discussions
In this paper, we have developed the $\Lambda_2$-decoupling limit of Lorentz-invariant massive gravity (specifically ghost-free massive gravity [8,9]). This is an approximate description of a large family of solutions of Lorentz-invariant massive gravity, all of which spontaneously break Lorentz invariance. Hence this excludes the usual Lorentz-invariant vacuum, which lies within the $\Lambda_3$ regime. Interestingly, the $\Lambda_2 \gg \Lambda_3$ regime is far closer in spirit to the decoupling limit of massive gravity on AdS, where the strong coupling scale is also parametrically higher. As in the case of massive gravity on AdS, the vDVZ-discontinuity is simply absent already at the linear level, and hence these backgrounds easily comply with existing tests of gravity.
Beyond the scheme of massive gravity, we have also shown an interesting connection between ghost-free massive gravity and the p-brane Nambu-Goto action, of which it can be viewed as a generalization. In particular, we have pointed out that the ghost-free graviton potential can be viewed as a non-standard nonlinear sigma model that uniquely evades the compactness requirement for the target space. This evasion differs from all the known examples, where some auxiliary gauge trick is utilized and the first-class constraints associated with the gauge symmetries explicitly project out the would-be ghost; the massive gravity nonlinear sigma model instead makes use of second-class constraints to project out the would-be ghost.
The uniqueness of ghost-free massive gravity, which is essentially due to the uniqueness of the matrix square root and anti-symmetrization scheme of the graviton potential, suggests that Lagrangian (2.11) is a unique generalization of the Nambu-Goto action that eliminates the ghost associated with the negative direction of the target space [15]. Without spoiling the spirit of this uniqueness, a further generalization is to promote the $\alpha_n$ parameters to functions of $\phi^A$, which also gives rise to a consistent nonlinear sigma model [15]. On the other hand, letting the target space have more than one negative direction, such as $(-,-,+,\cdots,+)$, is necessarily problematic [15]. Such a nonlinear sigma model has more than one ghost in the spectrum, but the unique matrix square root and anti-symmetrization scheme can only eliminate one ghost. (In the Nambu-Goto special case, having more than one negative direction is possible, as there is more than one diffeomorphism invariance if $D = p + 1 > 1$.) For most of this manuscript, we have restricted ourselves to an internal space which is at least as large as the spacetime dimension, $N \geq D$. The case $N < D$ has its own interest, and was for example applied to the description of realistic condensed matter systems using the AdS/CFT correspondence in [45]. However, the absence of the BD ghost for $N < D$ is more subtle. As shown in [46], in some cases of $N < D$, all the $N$ DoFs may propagate. We note that this happens whenever the lapse function squared of the reference metric, $-f_{00} + f_{0k}(f^{-1})^{kl}f_{l0}$, vanishes (here we have extended the target space metric $f_{AB}$ with zeros such that it formally has the same dimension as $g_{\mu\nu}$), which is when the unitary-gauge Hamiltonian proof of the ghost-freeness of massive gravity with a general reference metric [47] fails.
We have studied the massive gravity nonlinear sigma model by performing a nonlinear Hamiltonian analysis (Dirac-Bergmann algorithm), by finding an exact solution and examining perturbations on that solution, and by examining perturbations on a general perturbative background and determining its stability. Our study of the massive gravity nonlinear sigma model indicates that:
• Ghost-free massive gravity (i.e., the dRGT model) admits a smooth $\Lambda_2$-decoupling limit where the tensor modes are completely decoupled, and the whole matrix square root and anti-symmetrization structure is kept intact.
• Ghost-free massive gravity admits many non-trivial $\Lambda_2$-backgrounds that are stable, around which all the $D-1$ DoFs are propagating. These backgrounds need nonvanishing support from the vector modes, and spontaneously break Lorentz invariance with the strength of the graviton Compton length scale.
• There is no linear vDVZ-discontinuity around these $\Lambda_2$ backgrounds. Thus these backgrounds trivially pass local gravity tests, such as the solar system tests, for a Hubble-scale graviton Compton length. In some sense, the $\Lambda_2$ backgrounds are the ones with the Vainshtein mechanism already implemented.
• Around these $\Lambda_2$ backgrounds, the strong coupling scale is raised to $\Lambda_{2*}$, which is parametrically larger than $\Lambda_3$.
It has been shown that homogeneous and isotropic cosmological solutions, as well as static, spherically symmetric black holes, in ghost-free massive gravity are absent/unstable [43,44,48], and it has been argued that the "natural" cosmological solutions in ghost-free massive gravity are inhomogeneous/anisotropic and the "natural" black hole solutions are nonstatic/spherically symmetric, the deviations from the exact symmetries being typically of $O(m^2)$. In the $\Lambda_2$-decoupling limit, we are forced to break Lorentz symmetries in order to have stable backgrounds, and indeed we expect that it is the $\Lambda_2$-decoupling limit that is the most appropriate description of the generic inhomogeneous cosmologies in massive gravity. We remind the reader that this forced inhomogeneity is not in conflict with observations, since the scale of the inhomogeneity is set by $m^{-1}$, which can be made arbitrarily large and is usually taken to be at least of the order of the current Hubble horizon.
The existence of the $\Lambda_2$-decoupling limit corresponds to a description of backgrounds which in unitary gauge locally take a non-trivial form. They are physically different solutions from the Minkowski metric $\eta_{\mu\nu}$ even if the $O(m^2)$ corrections were excluded, and the differences show up in perturbations in the gravitational sector. If $m^{-1}$ is taken to be a cosmological scale (of the order of the observable Universe today), all these backgrounds have an approximately FRW geometry below the Hubble horizon, and can become inhomogeneous at scales larger than the current Hubble scale. We thus expect that the $\Lambda_2$ solutions describe a typical inhomogeneous cosmology, which may be approximately homogeneous out to the scale $m^{-1}$. Once again, these $\Lambda_2$ backgrounds have the virtue that there is no linear vDVZ-discontinuity, and hence it will be significantly easier to satisfy current tests of gravity, raising the possibility that it is these $\Lambda_2$ backgrounds that have the most direct connection with phenomenology.
We have shown that a $\Lambda_2$ background that is perturbatively away from the trivial $\Lambda_3$ background $\bar\phi^\alpha = x^\alpha$ is sufficient to excite the longitudinal mode. This suggests that one can continuously connect the trivial $\Lambda_3$ background with some non-trivial $\Lambda_2$ backgrounds. There may be some backgrounds such that in some local region (for instance around a star or black hole) the background is of the $\Lambda_2$ type, while asymptotically the background approaches the $\Lambda_3$ limit. How a particular background is chosen is determined by the initial and boundary conditions.

supported by a Department of Energy grant DE-SC0009946. AJT and SYZ are supported by Department of Energy Early Career Award DE-SC0010600.

A.1 Equivalent Lagrangians for the minimal model
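In the vielbein formulation of ghost-free massive gravity [49], the flat-space limit of the minimal model (in the $X^\mu{}_\nu$ formulation) is given by a Lagrangian in which $\Lambda^\mu{}_\nu$ is an auxiliary field satisfying $\Lambda^\mu{}_\nu \eta_{\mu\sigma} \Lambda^\sigma{}_\rho = \eta_{\nu\rho}$. Thus, the minimal model Lagrangian can be written in either of two equivalent forms, in which the fields $\lambda$ carrying the index pair $\rho\sigma$ are symmetric under the exchange of $\rho$ and $\sigma$. Since $\Lambda^\mu{}_\nu$ appears quadratically in either of the two Lagrangians, we can easily integrate it out in each case. Up to a global rescaling of $\lambda^{\alpha\beta}$, we obtain the Lagrangians (A.4) and (A.5), where $\tilde\lambda_{\rho\sigma}$ is the inverse of $\lambda^{\rho\sigma}$. In § 3, we take advantage of Lagrangian (A.4), as this form allows an ADM-like splitting for $\lambda^{\alpha\beta}$ in the full Hamiltonian analysis. This action also resembles the Polyakov action to some extent. Expressions similar to Lagrangian (A.5), with gravitons activated, have been utilized to re-confirm the absence of the BD ghost in ghost-free massive gravity [50,51]. Further integrating out $\lambda^{\rho\sigma}$, we arrive at $\mathcal{L}_{m1} = -\mathrm{tr}\sqrt{\eta^{-1}\,\partial\phi\,\eta\,\partial\phi^T} = -\mathrm{tr}\sqrt{\eta^{\mu\rho}\partial_\rho\phi^\alpha\partial_\nu\phi^\beta\eta_{\alpha\beta}}$ (A.6) and $\mathcal{L}_{m2} = -\mathrm{tr}\sqrt{\partial\phi^T\,\eta^{-1}\,\partial\phi\,\eta} = -\mathrm{tr}\sqrt{\partial_\rho\phi^\mu\eta^{\rho\alpha}\partial_\alpha\phi^\beta\eta_{\beta\nu}}$ (A.7).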

A.2 Plane-wave Hamiltonian
To count the DoFs about the non-trivial plane-wave vacuum configuration (4.1), we work in the Hamiltonian formalism. To provide an explicit derivation, we focus on the D = 3 dimensional case provided in Eq. (4.2) and without loss of generality, we consider solely the Lagrangian L 1 .
We consider linear fluctuations $V^\alpha$ about the vacuum configuration $\bar\phi^\alpha$, so that the fields $\phi^\alpha$ take the form $\phi^\alpha = \bar\phi^\alpha + V^\alpha$ (A.8). To quadratic order in fluctuations, we then obtain a quadratic Lagrangian whose coefficients $F^{\mu\nu\rho\sigma}$ are functions of $F$. Since the BD ghost is absent from this theory (as confirmed by the Hamiltonian analysis of § 3), some combination of the $V^\mu$ must play the role of a Lagrange multiplier. On arbitrary backgrounds the Lagrange multiplier is a linear combination of the $V^\mu$ and, to make the primary constraint manifest, we can rotate the fluctuations $V^\alpha$ in field space in such a way that $W^0$ becomes an auxiliary field. By requiring $\partial\mathcal{L}_1/\partial\dot W^0$ not to contain $\dot W^\mu$, we obtain in $D = 3$ dimensions a rotation which gets rid of $\dot W^0$ completely (after appropriate integrations by parts, while maintaining at most one time derivative per field). One can then define the conjugate momenta $\pi_i$ for $i = 1, 2$ and write the Hamiltonian in terms of coefficients $A_n$, which are functions of the background configuration $F$ and are of $n$-th order in the remaining phase-space variables $W^i$, $\pi_i$, A 0 = 128F 2 (F 2 + 8) 2 (3F 2 + 16) , (A.14) +2(F 2 + 8)(3F 2 + 16) F (∂ 2 π 2 F − 4∂ 2 π 1 + 4∂ 1 π 2 ) − 8(∂ 1 π 1 + ∂ 2 π 2 ) +F 4 (6∂ 2 π 2 + ∂ 2 W 1 F ) + 8π 2 (3F 2 + 8)F 2 F − 64π 1 (F 2 + 4)F F , (A.15) As soon as $A_0 \neq 0$, $W^0$ enters quadratically and it no longer imposes an additional first-class constraint. Rather, one can easily integrate it out to obtain a Hamiltonian for $W^i$ and $\pi_i$ alone.

Since we are interested in the stability of the fluctuations $V^\alpha$, it is sufficient to construct the Lagrangian and Hamiltonian at quadratic order in fluctuations, i.e., to second order in $\varepsilon$. Moreover, we treat the background $\bar\phi^\alpha$ perturbatively, and for the sake of this analysis it is sufficient to work to second order in the background expansion parameter $\epsilon$. To that order in perturbations, the explicit form of Lagrangian (5.4) is then given by