Massive Gravity from Double Copy

We consider the double copy of massive Yang-Mills theory in four dimensions, whose decoupling limit is a nonlinear sigma model. The latter may be regarded as the leading terms in the low energy effective theory of a heavy Higgs model, in which the Higgs has been integrated out. The obtained double copy effective field theory contains a massive spin-2, massive spin-1 and a massive spin-0 field, and we construct explicitly its interacting Lagrangian up to fourth order in fields. We find that up to this order, the spin-2 self interactions match those of the dRGT massive gravity theory, and that all the interactions are consistent with a $\Lambda_3= (m^2 M_{Pl})^{1/3}$ cutoff. We construct explicitly the $\Lambda_3$ decoupling limit of this theory and show that it is equivalent to a bi-Galileon extension of the standard $\Lambda_3$ massive gravity decoupling limit theory. Although it is known that the double copy of a nonlinear sigma model is a special Galileon, the decoupling limit of massive Yang-Mills theory is a more general Galileon theory. This demonstrates that the decoupling limit and double copy procedures do not commute and we clarify why this is the case in terms of the scaling of their kinematic factors.


Introduction
The Bern-Carrasco-Johansson (BCJ) double copy [1,2] is a relation between the scattering amplitudes of two different theories. The BCJ relation, or colour-kinematics duality, states that in a gauge theory one can always represent the kinematic factors of scattering amplitudes so that they satisfy relations analogous to those obeyed by the gauge group colour factors. Replacing the colour factors by kinematic factors in a given theory leads to new scattering amplitudes describing other theories.
The first and most important example is the relationship between Yang-Mills theory and gravity amplitudes [2]. The origin of this relation can be understood from the string theory point of view by considering how open and closed string amplitudes are related, and looking at the low energy effective field theories of the two string theories. This is encapsulated by the KLT relations [3]. However, the 'double copy' paradigm has been found to be more general. There are known extensions of double copy relations between two non-gravitational theories, for example between the non-linear sigma model and DBI or special Galileon theories [4][5][6][7][8][9][10][11][12], as well as extended gravitational relations such as that between super Yang-Mills and supergravity theories [13][14][15]. Recently the double copy paradigm was extended to gauge theories with massive matter fields [16][17][18][19].
The physical applications of double copy extend beyond calculations of scattering amplitudes in Minkowski spacetime. For example, double copy is used for UV considerations of effective field theories [20][21][22][23], efficient gravitational wave calculations [24][25][26][27][28][29] and relations between classical solutions in different theories (known as classical double copy). The double copy has also been shown to apply to scattering amplitudes around more general backgrounds [52,53].
In this paper we initiate the application of the double copy paradigm to the scattering amplitudes of massive Yang-Mills theory, i.e. the low energy effective field theory of Yang-Mills coupled to a heavy Higgs field (with the Higgs integrated out) which spontaneously breaks the gauge symmetry in such a way that all of the gauge bosons acquire the same mass. On the gauge theory side, the act of spontaneously breaking symmetries is well understood and is a major component of the standard model. The double copy of gauge theories with spontaneously broken gauge symmetries has been studied in [54][55][56]; however, the case where both copies of the gauge theory have completely broken gauge symmetry (i.e. with only massive gauge bosons) has not been explored. On the gravitational side, the broken gauge symmetries (by virtue of the mass for the bosons) imply, if the double copy procedure is still valid, broken diffeomorphism symmetries. The latter are in the purview of massive gravity theories$^{1}$, and so we may naturally expect massive gravity in some form to arise from the double copy procedure.
Since a massive spin-1 particle has 3 degrees of freedom in four dimensions, the double copy theory contains 9 propagating states, which decompose into a single massive spin-2 particle, a single massive spin-1 particle and a massive scalar. The interactions of massive spin-2 particles are well known to be highly constrained. Generic interactions are expected to lead to a breakdown of perturbative unitarity at the $\Lambda_5 = (m^4 M_{Pl})^{1/5}$ scale [58], where $m$ is the spin-2 mass. Special tunings can be made that raise this scale to the $\Lambda_3 = (m^2 M_{Pl})^{1/3}$ scale, which is the highest possible scale in four dimensions [59,60]. An explicit nonlinear effective theory exhibiting this scale is the so-called ghost-free massive gravity or de Rham-Gabadadze-Tolley (dRGT) model [60].
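The state counting and the ordering of the two strong coupling scales can be checked numerically. The following is a minimal sketch; the input values (a graviton mass of order the Hubble scale and a reduced Planck mass, both in eV) and the helper name `strong_coupling_scale` are illustrative assumptions, not values taken from the analysis here.

```python
def strong_coupling_scale(m, m_pl, n):
    """Return Lambda_n = (m**(n-1) * M_Pl)**(1/n), in the units of m and M_Pl."""
    return (m ** (n - 1) * m_pl) ** (1.0 / n)

# Hypothetical illustrative inputs, both in eV (assumptions, not from the text).
m = 1e-33      # graviton mass
m_pl = 2.4e27  # Planck mass

lambda_5 = strong_coupling_scale(m, m_pl, 5)  # generic massive spin-2 cutoff
lambda_3 = strong_coupling_scale(m, m_pl, 3)  # tuned (dRGT-like) cutoff

# State counting: (massive spin-1) x (massive spin-1) = spin-2 + spin-1 + spin-0.
massive_spin1_states = 3
double_copy_states = massive_spin1_states ** 2
spin_multiplets = {j: 2 * j + 1 for j in (2, 1, 0)}

print("Lambda_5 [eV]:", lambda_5)
print("Lambda_3 [eV]:", lambda_3)
print("states:", double_copy_states, "=", spin_multiplets)
```

For any $m \ll M_{Pl}$ the ratio $\Lambda_5/\Lambda_3 = (m/M_{Pl})^{2/15}$ is below one, which is why raising the cutoff from $\Lambda_5$ to $\Lambda_3$ is an improvement.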
Remarkably, we find that the double copy paradigm automatically leads to a theory in which the interactions of the massive spin-2 field are described by the dRGT massive gravity [60], at least to quartic order. In fact we will find that the free coefficients in the dRGT Lagrangian are fixed by the double copy prescription to this order. We further find that the interactions of the additional spin-1 and spin-0 states are also at the scale $\Lambda_3$, strongly suggesting that this is the controlling scale of the EFT at all orders. Since massive Yang-Mills is itself an EFT with the highest possible cutoff for a coloured spin-1 particle, namely $\Lambda = m/g$, we may regard this as a natural double copy relation between two highest cutoff effective theories.
This connection is emphasized when we recognize that the leading helicity-0 interactions of a massive graviton are dominated in the decoupling limit (defined by taking $m \to 0$ for fixed $\Lambda_3$) by the double copy of the leading helicity-0 interactions of the massive spin-1 gluon. Since the decoupling limit of massive Yang-Mills is a nonlinear sigma model, as encoded in the Goldstone equivalence theorem, we may reasonably expect that the interactions for the helicity-0 spin-2 states are determined by the double copy of the nonlinear sigma model$^{2}$. It is known that the double copy of a nonlinear sigma model is the special Galileon [5-7, 11, 12] and that the decoupling limits of massive gravity theories are also Galileon-like theories [58][59][60]62]. However, the latter are nevertheless more complicated and include in particular non-trivial vector-scalar interactions that survive even in the decoupling limit [62,63]. Even projecting onto the scalar sector, the massive gravity decoupling limit is not equivalent to a special Galileon, and so we find that the decoupling limit procedure does not commute with the double copy procedure.
The origin of this is that there are terms needed in the kinematic factors to satisfy colour-kinematics duality that are singular in the decoupling limit but nevertheless cancel out of the gauge amplitudes. However, when we construct the gravity amplitudes by squaring these kinematic factors, they no longer cancel and give additional non-zero contributions that are finite in the decoupling limit. To be precise, the kinematic factors which satisfy colour-kinematics duality, $n_s + n_t + n_u = 0$, take the form given in (1.1), where $\Sigma(s, t, u)$ (triple crossing symmetric) and $\hat{n}_i$ are finite as $m \to 0$. Here $\Sigma$ arises in a manner similar to the generalized gauge transformations in the massless case, a fact which is crucial to understanding why its contribution is finite. The explicit expressions for $\Sigma$ and $\hat{n}_i$ are given in Eqs. (E.12), (E.13), (E.14) and (E.15). Since in the massive case $s + t + u = 4m^2$, we have $\hat{n}_s + \hat{n}_t + \hat{n}_u = -m\Sigma$, and so in the limit $m \to 0$ the $\hat{n}_i$ by themselves satisfy colour-kinematics duality. The $1/m^3$ behaviour in $n_i$ comes from helicity $0, 0, 0, \pm 1$ interactions, since the polarization tensor for a massive helicity-0 gluon scales as $1/m$ but that for helicity-1 is finite as $m \to 0$. The term $\Sigma$ cancels out of the gauge theory amplitudes by virtue of the colour relation $c_s + c_t + c_u = 0$, demonstrating the natural decoupling limit scaling.
By contrast, when we square to construct the gravity amplitudes, $\Sigma$ survives as a contact term. For instance, the naive leading $1/m^6$ term enters in the gravity amplitudes in the combination (1.3), and hence it contributes at the $\Lambda_3$ scale. Specifically this will show up as a non-zero spin-2, helicity $0, 0, 0, \pm 2$ interaction. Similarly the naive $1/m^5$ term is suppressed by virtue of the kinematic relation $\hat{n}_s + \hat{n}_t + \hat{n}_u = -m\Sigma$, and we have in full, as an exact statement, (1.4). Since $\Sigma$ does not contribute to the gauge theory amplitudes, first taking the decoupling limit of them (giving a nonlinear sigma model) and then performing the double copy procedure (giving a special Galileon) will lead to a different result in which the $\Sigma\Sigma/\Lambda_3^6$ term is absent$^{3}$. The kinematic factors inferred from the decoupling limit, $\hat{n}_i(m = 0)$, will necessarily be finite in the decoupling limit, and these do not correspond to the decoupling limit of the above kinematic factors (1.1), which are singular. Indeed in the decoupling limit, the gauge theory kinematic factors come purely from helicity-0 gluons by the Goldstone equivalence theorem.
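The cancellation in the gauge amplitude and the survival of the contact term in the gravity amplitude can be traced in a few lines. The following is a sketch only: it assumes a specific form for the kinematic factors, chosen to be consistent with the scalings quoted above (a $1/m^3$ piece proportional to $\Sigma$ plus a $1/m^2$ piece built from finite factors written here as $\hat{n}_i$), and it suppresses the overall normalizations and sign conventions of the amplitudes. The channel variables $s_i \in \{s, t, u\}$ are introduced for compactness.

```latex
% Assumed form of the kinematic factors (a reconstruction, not copied from the source):
n_i = \frac{s_i - m^2}{m^3}\,\Sigma + \frac{1}{m^2}\,\hat{n}_i ,
\qquad \sum_i \hat{n}_i = -m\,\Sigma ,
\qquad s + t + u = 4m^2 .

% Colour-kinematics duality then holds exactly:
\sum_i n_i = \frac{(s+t+u) - 3m^2}{m^3}\,\Sigma - \frac{1}{m}\,\Sigma = 0 .

% Gauge theory: \Sigma multiplies c_s + c_t + c_u = 0 and drops out.
\sum_i \frac{c_i\, n_i}{s_i - m^2}
  = \frac{\Sigma}{m^3} \sum_i c_i + \frac{1}{m^2} \sum_i \frac{c_i\, \hat{n}_i}{s_i - m^2}
  = \frac{1}{m^2} \sum_i \frac{c_i\, \hat{n}_i}{s_i - m^2} .

% Gravity: the square leaves a contact term.
\sum_i \frac{n_i^2}{s_i - m^2}
  = \frac{\Sigma^2}{m^6} \sum_i (s_i - m^2)
  + \frac{2\Sigma}{m^5} \sum_i \hat{n}_i
  + \frac{1}{m^4} \sum_i \frac{\hat{n}_i^2}{s_i - m^2}
  = -\frac{\Sigma^2}{m^4} + \frac{1}{m^4} \sum_i \frac{\hat{n}_i^2}{s_i - m^2} .
```

Under this assumption the naive $1/m^6$ and $1/m^5$ terms combine into a single $\Sigma^2/m^4$ contact term. Once multiplied by the gravitational coupling $\kappa^2 \propto 1/M_{Pl}^2$, it scales as $\Sigma^2/(m^4 M_{Pl}^2) = \Sigma^2/\Lambda_3^6$, which stays finite in the decoupling limit.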
It is worth noting that if we give up strict colour-kinematics duality in the massive case, then an acceptable choice of kinematic factors that reproduces the gauge theory amplitudes is $\tilde{n}_i = \hat{n}_i/m^2$. However, they no longer sum to zero. Using these in a double copy prescription will give a gravity amplitude given by the second term on the RHS of (1.4), whose decoupling limit correctly reproduces the special Galileon. However, since $\sum_i \tilde{n}_i \neq 0$ we have no reason to trust that the double copy prescription is meaningful in this context. Indeed, there is no clear recipe to generalize this to higher-point amplitudes. It is for this reason that throughout this paper we assume that the colour-kinematics duality holds intact in the massive case in the same manner as in the massless case.
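These statements can be checked with exact rational arithmetic. The sketch below assumes the kinematic factors take the form $n_i = (s_i - m^2)\Sigma/m^3 + \hat{n}_i/m^2$ with $\sum_i \hat{n}_i = -m\Sigma$, which is one form consistent with the scalings described above. All numerical values are arbitrary placeholders, and overall normalizations of the amplitudes are dropped.

```python
from fractions import Fraction as F

# Arbitrary placeholder kinematics (assumptions for illustration only).
m = F(2)
s, t = F(7), F(-3)
u = 4 * m**2 - s - t                   # massive relation s + t + u = 4 m^2
sigma = F(5)                           # stands in for Sigma(s, t, u)
nhat_s, nhat_t = F(11), F(-4)
nhat_u = -m * sigma - nhat_s - nhat_t  # enforce sum(nhat) = -m * Sigma
c_s, c_t = F(3), F(13)
c_u = -c_s - c_t                       # colour Jacobi identity

channels = [s, t, u]
nhat = [nhat_s, nhat_t, nhat_u]
colour = [c_s, c_t, c_u]

# Duality-satisfying factors (assumed form) and the rescaled finite factors.
n = [(x - m**2) * sigma / m**3 + nh / m**2 for x, nh in zip(channels, nhat)]
n_tilde = [nh / m**2 for nh in nhat]

def gauge_amp(numerators):
    """Sum over channels of c_i n_i / (s_i - m^2), normalization dropped."""
    return sum(c * k / (x - m**2) for c, k, x in zip(colour, numerators, channels))

def gravity_amp(numerators):
    """Sum over channels of n_i^2 / (s_i - m^2), normalization dropped."""
    return sum(k * k / (x - m**2) for k, x in zip(numerators, channels))

print("sum n_i       =", sum(n))
print("sum n_tilde_i =", sum(n_tilde))
print("gauge amplitudes agree:", gauge_amp(n) == gauge_amp(n_tilde))
print("gravity difference    =", gravity_amp(n) - gravity_amp(n_tilde))
```

Both sets of factors give the same gauge amplitude, but only $n_i$ sums to zero, and the two squared amplitudes differ by exactly $-\Sigma^2/m^4$, the contact term discussed above.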
The paper is organised as follows: first we briefly introduce massive Yang-Mills in section 2 and dRGT massive gravity theories in section 3, then describe the double copy prescription and give the action obtained from squaring massive Yang-Mills in section 4. In particular we find that the colour-kinematics duality holds for 2-2 scattering amplitudes and the resulting theory has a $\Lambda_3 = (m^2 M_{Pl})^{1/3}$ cutoff scale, which is known to be the highest possible cutoff for massive spin-2 fields [58]. Having determined the gravity Lagrangian up to quartic order, we specify the decoupling limit in section 5 and clarify its inequivalence to a special Galileon. The precise quartic interactions are given in Appendix A and our conventions are given in Appendix B. Appendix C contains a brief explanation of why giving a mass to a two-form potential (which arises naturally in the massless double copy story) is equivalent to a massive spin-1 Proca theory, as we find the latter formulation more useful in constructing the interacting Lagrangian. In Appendix D we complement section 5 and give the explicit decoupling limit of the gravity amplitudes, while in Appendix E we do the same for the Yang-Mills amplitudes and clarify why the double copy procedure does not commute with the decoupling limit.

Note added
In preparing this work for submission we became aware of results obtained by Laura Johnson, Callum Jones and Shruti Paranjape which also reproduce the quartic double copy interactions [66].

Massive Yang-Mills
The action of massive Yang-Mills theory comes from the low energy effective action of Yang-Mills theory with a Higgs field in which the Higgs particles are integrated out. We consider the gauge symmetry to be broken in such a way that all of the gauge bosons acquire the same mass, $m$. Then the leading terms in the effective Lagrangian in unitary gauge take the form (2.1), where $g$ is the coupling constant. This is the simplest unitary gauge Lagrangian which can describe a massive coloured spin-1 particle. Since the resulting theory is not renormalizable, it should be understood as an effective theory, and to this Lagrangian we may add an infinite number of interactions. For instance, we may further consider a quartic interaction $\mathrm{tr}(A_\mu A^\mu)^2$. The structure of the effective Lagrangian is best understood by reintroducing Stückelberg fields (Goldstone modes) through a field replacement in which $\phi^a(x)$ are the Stückelberg fields, so that the gauge invariant form of the Lagrangian is given by (2.3), with gauge transformations generated by $T^a \xi^a(x)$, where $\xi^a(x)$ is the gauge transformation parameter. The unitary gauge Lagrangian is recovered by fixing the gauge $\phi^a = 0$.
The resulting effective theory has a cutoff of at most $\Lambda = m/g$, which is the Goldstone mode decay constant. Additional interactions in the effective action could further lower this scale, but for now we assume that $\Lambda$ is the controlling scale. Taking the decoupling limit $g \to 0$ for fixed $\Lambda$ results in a free massless spin-1 theory and an interacting nonlinear sigma model (2.4). This encodes straightforwardly the content of the 'Goldstone equivalence theorem' that the leading interactions for the helicity-0 modes of the massive spin-1 particle are determined by the effective theory for the Goldstones described by (2.4). From a classical perspective, the form of the Lagrangian (2.3) is clearly preferred due to its two-derivative nature, and it is for this reason that we will focus on the tree level amplitudes derived from this form in what follows. Were we to include additional unitary gauge interactions such as $\mathrm{tr}(A_\mu A^\mu)^2$, it is transparent in the Stückelberg formulation that these correspond to higher order operators, and they are expected to be suppressed by the scale $\Lambda$. In the decoupling limit, these extensions just correspond to the addition of further irrelevant operators to the nonlinear sigma model Lagrangian, which have been considered in the double copy context for example in [20][21][22][23].
These tree amplitudes are however most conveniently computed in unitary gauge (2.1). This is because the off-shell vertices for massive Yang-Mills are identical to their massless counterparts, and the only difference is that the massless propagator is replaced by the massive one, whose structure involves $\tilde{\eta}_{\mu\nu} = \eta_{\mu\nu} + p_\mu p_\nu/m^2$. Our goal is to follow as closely as possible the double copy paradigm for massless Yang-Mills theory [2], and with this in mind we express the tree level $n$-point scattering amplitudes of this theory in the form (2.6), where $c_i$ are colour factors, i.e. products of the structure constants of the gauge group, $n_i$ are the kinematic factors, $i$ labels distinct Feynman graphs and $\alpha_i$ labels all internal propagators in a given graph. The only difference between this and the standard double copy is the replacement of massless propagators $p_{\alpha_i}^2$ by massive ones, $p_{\alpha_i}^2 + m^2$. The resulting kinematic factors $n_i$ are not the same as those that arise in the massless case, since they absorb the information from the massive polarization structure encoded in $\tilde{\eta}_{\mu\nu}$, and furthermore the on-shell external momenta now satisfy $p_i^2 = -m^2$. Given this it is not automatic that the colour-kinematics duality still holds. We will nevertheless show that it continues to hold up to quartic order.
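The tensor $\eta_{\mu\nu} + p_\mu p_\nu/m^2$ appearing in the massive propagator is the sum over the three physical polarizations of an on-shell massive vector. A small check in exact arithmetic is sketched below. It assumes the mostly-plus metric implied by $p^2 = -m^2$, a momentum along the $z$ axis with the arbitrary values $m = 4$, $|\vec{p}| = 3$, $E = 5$, and a real linear polarization basis rather than the helicity basis of Appendix B.

```python
from fractions import Fraction as F

# Mostly-plus Minkowski metric, so that on-shell momenta satisfy p^2 = -m^2.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def dot(a, b):
    """Lorentz inner product of two vectors with upper indices."""
    return sum(ETA[i][i] * a[i] * b[i] for i in range(4))

# Arbitrary placeholder kinematics with E^2 = k^2 + m^2 (3-4-5 triangle).
m, k, E = F(4), F(3), F(5)
p = [E, F(0), F(0), k]

# Two transverse polarizations and the longitudinal one (linear basis).
eps = [
    [F(0), F(1), F(0), F(0)],
    [F(0), F(0), F(1), F(0)],
    [k / m, F(0), F(0), E / m],  # grows like E/m: the origin of the 1/m scaling
]

# Polarization sum versus eta^{mu nu} + p^mu p^nu / m^2.
pol_sum = [[sum(e[a] * e[b] for e in eps) for b in range(4)] for a in range(4)]
massive_numerator = [
    [ETA[a][b] + p[a] * p[b] / m**2 for b in range(4)] for a in range(4)
]

# Contracting with p_mu (index lowered with ETA) gives zero on shell.
p_lower = [ETA[a][a] * p[a] for a in range(4)]
contraction = [
    sum(p_lower[a] * massive_numerator[a][b] for a in range(4)) for b in range(4)
]

print("p.p =", dot(p, p))
print("polarization sum matches numerator:", pol_sum == massive_numerator)
print("p_mu contracted with numerator:", contraction)
```

The longitudinal vector has components of order $E/m$, which is the source of the $1/m$ enhancement of helicity-0 modes in the decoupling limit.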

Three-point Amplitude
In terms of polarization and momentum vectors the three-point on-shell vertex for massive Yang-Mills is exactly the same as that of massless Yang-Mills.$^{4}$ The difference is that now the on-shell momenta satisfy $p_i^2 = -m^2$ and there are 3 possible polarization states. Our conventions for these are given in Appendix B.

Four-point Amplitude
We express the four-point amplitude in the form given in Eq. (2.6) by defining the colour factors $c_s$, $c_t$, $c_u$ and the kinematic factors $n_s$, $n_t$, $n_u$, the latter given in (2.12), (2.13) and (2.14), where the Mandelstam variables are defined as standard, with all incoming momenta. These expressions for the kinematic factors are very similar to those obtained from massless Yang-Mills theory, but there are two differences: the relation between Mandelstam variables is now $s + t + u = 4m^2$ rather than $s + t + u = 0$, and the poles are now located at $s, t, u = m^2$. Because of that, the terms coming from the quartic Yang-Mills vertex now have to be multiplied by $s - m^2$, $t - m^2$ and $u - m^2$ in order to recast the amplitude into the form (2.6).
In general, the kinematic factors of a given scattering amplitude are not unique: they are not invariant under field redefinitions. However, in massless Yang-Mills theory, for any choice of kinematic factors of the four-point amplitude, the colour-kinematics duality, $c_s + c_t + c_u = 0 \Rightarrow n_s + n_t + n_u = 0$, is satisfied [67]. In our case of massive Yang-Mills theory, it is not immediately clear whether this is still true. However, explicit calculation shows that our colour and kinematic factors (directly calculated from the usual Feynman rules) in (2.12), (2.13) and (2.14) still obey $n_s + n_t + n_u \propto p_4 \cdot \epsilon_4 = 0$ and $c_s + c_t + c_u = 0$. The fact that this still holds for the massive theory can be understood by noticing that the only difference between massive and massless kinematic factors comes from the terms proportional to $m^2$ in (2.12), (2.13) and (2.14) (in fact we do not need to use the relation between $s$, $t$ and $u$ here). It is easy to see that these six terms add to zero; therefore the value of $n_s + n_t + n_u$ is the same for the massless and massive theories, and colour-kinematics duality for the four-point amplitude still holds in the massive case.
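The colour relation $c_s + c_t + c_u = 0$ is the Jacobi identity of the gauge algebra. As a minimal sketch, it can be verified by brute force for $SU(2)$, where the structure constants are the Levi-Civita symbol. The assignment of index orderings to the $s$, $t$ and $u$ channels below is an assumed convention and may differ from the one used for the colour factors here by relabelling or signs.

```python
from itertools import product

def f(a, b, c):
    """SU(2) structure constants: the Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def colour_factors(a1, a2, a3, a4):
    """Return (c_s, c_t, c_u) for external colour indices a1..a4.

    The channel labelling is an illustrative assumption.
    """
    c_s = sum(f(a1, a2, b) * f(b, a3, a4) for b in range(3))
    c_t = sum(f(a2, a3, b) * f(b, a1, a4) for b in range(3))
    c_u = sum(f(a3, a1, b) * f(b, a2, a4) for b in range(3))
    return c_s, c_t, c_u

# Check the Jacobi identity for every choice of external colour indices.
all_factors = [colour_factors(*idx) for idx in product(range(3), repeat=4)]
jacobi_holds = all(sum(cs) == 0 for cs in all_factors)
some_nonzero = any(cs[0] != 0 for cs in all_factors)

print("Jacobi identity holds for all index choices:", jacobi_holds)
print("colour factors are not all trivially zero:", some_nonzero)
```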

dRGT Massive Gravity
In the dRGT theory of massive gravity, the diffeomorphism symmetry is broken by the non-dynamical reference metric, $f_{\mu\nu}$, which appears in the action. It can be written in unitary gauge in terms of the variables $K^\mu{}_\nu$ [60]. This unusual square root metric structure is what is needed to build a $\Lambda_3$ effective theory, as it has a straightforward decoupling limit as we shall see in section 5. The full dRGT Lagrangian for a single spin-2 field can then be constructed in unitary gauge as in (3.2) [57], where we set $\kappa_0 = \kappa_1 = 0$ and $\kappa_2 = 1$, and the potential terms are defined with square brackets denoting traces. The two coefficients $\kappa_3$, $\kappa_4$ are the free parameters of the theory together with the graviton mass squared $m^2$. The potential terms can be written in terms of the flat space Levi-Civita tensor$^{5}$ as in (3.6). In this paper we will consider $f_{\mu\nu}$ to be the Minkowski metric, $\eta_{\mu\nu}$, as we shall be largely concerned with scattering amplitudes in Minkowski spacetime. The terms in (3.2) are the unique interactions which lead to second order equations of motion for all 5 propagating degrees of freedom. However, from the EFT perspective it is natural to view them as the leading terms in an EFT expansion, controlled by the scale $\Lambda_3$. Possible higher derivative operators will arise schematically through a function $F$, which denotes the sum of all diffeomorphism invariant scalar operators$^{6}$ constructed out of its arguments with dimensionless Wilson coefficients.
Just as for a massive Yang-Mills field, we can write this same Lagrangian in a manifestly covariant way via the introduction of Stückelberg fields. Since (3.2) is written in a manner in which it would be manifestly covariant if $K^\mu{}_\nu(f, g)$ itself transformed as a tensor, this tells us how to introduce Stückelberg fields. Since the only part of $K^\mu{}_\nu(f, g)$ that does not transform appropriately as a tensor is the reference metric $f_{\mu\nu} = \eta_{\mu\nu}$, it is sufficient to write this metric in an arbitrary coordinate system. The four diffeomorphism scalars $\Phi^A(x)$ may then be split as $\Phi^A(x) = x^A + \pi^A(x)$. The $\pi^A(x)$ are then the Stückelberg fields we need to reintroduce manifest diffeomorphism invariance and play the analogue of the $\phi^a(x)$ in $V(x)$ (2.3), so that unitary gauge is $\pi^A(x) = 0$. We will make explicit use of this decomposition in section 5. For the purposes of calculating scattering amplitudes it is sufficient to work with the unitary gauge Lagrangian.

$^{5}$ We use Euclidean conventions so that for flat spacetime $\varepsilon_{0123} = \varepsilon^{0123} = 1$, i.e. in the Lorentzian case $\varepsilon^{\mu\nu\alpha\beta} = -\eta^{\mu\mu'} \eta^{\nu\nu'} \eta^{\alpha\alpha'} \eta^{\beta\beta'} \varepsilon_{\mu'\nu'\alpha'\beta'}$. We are careful to use one of them with all indices up and the other with all indices down, together with the generalized Kronecker delta expressed as a determinant of a matrix built out of $\delta$'s.

$^{6}$ All breaking of diffeomorphism invariance can be captured by the tensor $K_{\mu\nu}$, hence all terms in the Lagrangian are diffeomorphism invariant when $K_{\mu\nu}$ itself is viewed to transform as a tensor.

Three-point Amplitude
The three-point amplitude in dRGT massive gravity is written in terms of $\Gamma_3$, the cubic vertex from the Einstein-Hilbert term plus the cubic potential term $U_3(K)$, with the coupling constant $\kappa = 2/M_{Pl}$. The first term of this vertex is already proportional to the square of the Yang-Mills three-point colour-stripped amplitude if we write the polarization tensors as products of two spin-1 polarization vectors, $\epsilon_{\mu\nu} = \epsilon_\mu \epsilon_\nu$. Therefore, in order for the double copy to work we need to choose $\kappa_3$ such that the second term vanishes, i.e. $\kappa_3 = -1$. We see that already at cubic level the double copy construction picks a particular one-parameter ($\kappa_4$) subset of theories from the 2-parameter family of massive gravity theories.

Degrees of Freedom
In the double copy construction the asymptotic states in the gravitational theory are identified with the tensor products of gauge theory asymptotic states, ignoring their colour indices. For example, the double copy of pure Yang-Mills theory gives the states in (4.1), i.e. we decompose the tensor product of two massless vector representations into irreducible representations of the Lorentz group: $h_{\mu\nu}$ is the graviton, $B_{\mu\nu}$ is a massless antisymmetric 2-form field and $\phi$ is a massless scalar field (dilaton). In four dimensions the massless $B_{\mu\nu}$ is dual to a pseudo-scalar, i.e. an axion. In terms of degrees of freedom we have $2 \times 2 = 2 + 1^{*} + 1$.
In the case of massive Yang-Mills, all the fields in (4.1) are massive: $h_{\mu\nu}$ is a massive spin-2 field, $B_{\mu\nu}$ is a massive 2-form field which is dual to a massive spin-1 field in four dimensions, and $\phi$ is a massive scalar field. In terms of degrees of freedom we now have $3 \times 3 = 5 + 3 + 1$. In this paper we will consider four dimensions and write the action obtained from the double copy of massive Yang-Mills in terms of massive spin-2 ($h_{\mu\nu}$), massive spin-1 ($A_\mu$) and massive spin-0 ($\phi$) fields. We see that there is an interesting physical difference between the field content of the double copy of massless and massive Yang-Mills theories: in the massless case the $B$ field is a spin-0 field, while in the massive case it is spin-1.

$^{7}$ Note that only polarization tensors for helicity $\pm 2$ can be written as $(\epsilon_i)_{\mu\nu} = (\epsilon_i)_\mu (\epsilon_i)_\nu$; for helicities $\pm 1, 0$ we need to sum over the products of different helicities weighted by Clebsch-Gordan coefficients, $\epsilon^{\lambda}_{\mu\nu} = \sum_{\lambda', \lambda''} C^{\lambda}_{\lambda' \lambda''}\, \epsilon^{\lambda'}_\mu \epsilon^{\lambda''}_\nu$.

Double Copy Construction of Scattering Amplitudes
In order to double copy massless Yang-Mills theory the representation for the amplitude in (2.6) must satisfy the colour-kinematics duality [1], i.e. whenever three of the colour factors, c i , c j and c k are related by the Jacobi identity, c i +c j +c k = 0, the corresponding kinematic factors must obey the same relation i.e. n i + n j + n k = 0. It is conjectured [1] that it is always possible to choose a representation for the amplitude for which kinematic factors satisfy this by choosing a gauge and performing field redefinitions. In the massive case that is not guaranteed to be true but we have checked that the kinematic factors of four-point amplitude calculated directly from (2.1) satisfy the colour-kinematics duality.
In the usual double copy procedure, once the correct representation for (2.6) is chosen, the colour factors can be replaced with kinematic factors in order to obtain an amplitude of a gravitational theory [2]. We follow the same procedure and conjecture that the resulting expression (4.2) gives an amplitude in a massive gravity theory, where $\tilde{n}_i$ are the kinematic factors of the second massive Yang-Mills theory. The products of Yang-Mills polarization vectors in $n_i$ and $\tilde{n}_i$, $\epsilon_\mu$ and $\tilde{\epsilon}_\nu$ respectively, are decomposed into polarization tensors of the fields in the gravitational theory. This corresponds to the decomposition of a tensor product of two vector representations of the little group (for massive particles in 4d it is $SO(3)$) into irreducible representations. Schematically, in this decomposition $j, k$ are little group indices, $(())$ denotes the symmetric traceless part corresponding to the graviton polarization, $\epsilon^{(h)}$, and the antisymmetric part, denoted as $[]$, corresponds to the spin-1 polarization in terms of the $B$ field, $\epsilon^{(B)}$. However, instead of working with the massive $B_{\mu\nu}$ field in this paper, we construct the action in terms of the vector field $A_\mu$ which is dual to $B_{\mu\nu}$. The dualization procedure is explained in Appendix C. We define a map between the $B$ field polarization tensor and the $A$ polarization vector, in which $p_\sigma$ is the four-momentum of the external state and a factor of $\sqrt{2}$ is required for the correct normalization. The trace part of the tensor product, given in (4.5), is the polarization tensor corresponding to the scalar, $\phi$. As we show in Appendix B.3 from explicit calculation in the helicity basis, we obtain an expression which up to a sign could equally have been fixed by the requirement that it is traceful, transverse and normalized.
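The little group decomposition can be illustrated at the level of index counting. The sketch below splits a $3 \times 3$ matrix built from two arbitrary 3-vectors (placeholders for the spatial parts of $\epsilon$ and $\tilde{\epsilon}$ in the rest frame, an assumption made for illustration) into its symmetric traceless, antisymmetric and trace parts. It does not implement the helicity-basis Clebsch-Gordan sums or the dualization to $A_\mu$.

```python
from fractions import Fraction as F

def decompose(M):
    """Split a 3x3 matrix into symmetric-traceless, antisymmetric and trace parts."""
    trace = sum(M[i][i] for i in range(3))
    delta = [[F(1) if i == j else F(0) for j in range(3)] for i in range(3)]
    sym_traceless = [
        [(M[i][j] + M[j][i]) / 2 - trace * delta[i][j] / 3 for j in range(3)]
        for i in range(3)
    ]
    antisym = [[(M[i][j] - M[j][i]) / 2 for j in range(3)] for i in range(3)]
    trace_part = [[trace * delta[i][j] / 3 for j in range(3)] for i in range(3)]
    return sym_traceless, antisym, trace_part

# Arbitrary placeholder vectors standing in for the two polarization vectors.
eps = [F(1), F(2), F(3)]
eps_tilde = [F(4), F(5), F(6)]
M = [[eps[j] * eps_tilde[k] for k in range(3)] for j in range(3)]

spin2, spin1, spin0 = decompose(M)
rebuilt = [
    [spin2[j][k] + spin1[j][k] + spin0[j][k] for k in range(3)] for j in range(3)
]

# Independent components: 5 (symmetric traceless) + 3 (antisymmetric) + 1 (trace).
counts = {"spin-2": 6 - 1, "spin-1": 3, "spin-0": 1}

print("decomposition reproduces the product:", rebuilt == M)
print("trace of spin-2 part:", sum(spin2[i][i] for i in range(3)))
print("component counts:", counts)
```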

Double Copy of Three-point Amplitudes
We apply (4.2) to three-point amplitudes explicitly, which relates the gravitational three-point amplitude to the product of two Yang-Mills three-point amplitudes with their structure constants, $f^{abc}$, stripped off. By substituting (2.7), (4.3), (4.6) and (4.7) we obtain the three-point vertices of the gravitational theory. As mentioned before, $M_{hhh}$ matches the three-graviton amplitude of massive gravity if we choose $\kappa_3 = -1$ (or $c_3 = 1/4$ using the parametrization of the theory as in [59,68]). The $M_{AAh}$ and $M_{\phi\phi h}$ amplitudes are different from those obtained from vector and scalar kinetic terms minimally coupled to gravity (for example, a minimally coupled scalar would give $M_{\phi\phi h} = -i\kappa\, \epsilon_{3\mu\nu}\, p_1^\mu p_2^\nu$). This is expected, since we know that theories containing a massive spin-2 field do not have diffeomorphism symmetry, and we allow couplings between our fields and the reference metric, which in this case is the Minkowski metric. In this way we evade the usual equivalence principle requirements for a massless spin-2 particle. As already mentioned, we see that $M_{hhh}$ matches the three-point amplitude of massive gravity with $\kappa_3 = -1$.

Double Copy of Four-point Amplitudes
We start with the $hh \to hh$ amplitude, which is calculated using (2.12), (2.13), (2.14), (4.2) and (4.3). By comparing it with the $hh \to hh$ amplitude calculated using the dRGT massive gravity action, $M_4^{\rm mGr}$, we find the relation (4.15), with the free coefficients in the massive gravity action chosen to be $\kappa_3 = -1$ and $\kappa_4 = 7/24$ ($c_3 = 1/4$ and $d_5 = -7/192$ using the parametrization of [59]). The second term on the right hand side of (4.15) corresponds to a scalar exchange with the three-point vertex given in (4.11).
Having fixed the spin-2 interactions, we then construct the scattering amplitudes for all other 2-2 scattering processes (for example $h\phi \to AA$) from the double copy prescription, and make an ansatz for the action which gives these amplitudes. A couple of general features emerge. We find that all 3- and 4-point amplitudes containing odd numbers of $A$ are zero, as one would expect since $A$ is a vector. Furthermore we find that none of the amplitudes scale with energy faster than $E^6$ at high energies. Since all of them have $\kappa^2 = 4/M_{Pl}^2$ in front (as can be seen from (4.2)), the lowest scale appearing in the resulting theory to this order is $\Lambda_3 = (M_{Pl} m^2)^{1/3}$, the well-known highest possible scale for a Lorentz invariant theory of massive gravity.
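The identification of the scale follows from dimensional analysis. As a sketch, assume that the only scales in a four-point amplitude are the energy $E$, the mass $m$ and the coupling $\kappa$, and recall that a four-point amplitude in four dimensions is dimensionless. The fastest allowed growth, $E^6$, must then be accompanied by $m^{-4}$:

```latex
\mathcal{M}_4 \;\sim\; \kappa^2\, \frac{E^6}{m^4}
  \;=\; \frac{4\,E^6}{M_{Pl}^2\, m^4}
  \;=\; 4 \left( \frac{E}{\Lambda_3} \right)^{6},
\qquad \Lambda_3^3 = M_{Pl}\, m^2 .
```

Perturbative unitarity is therefore lost at $E \sim \Lambda_3$. By the same counting, a term growing as $E^{10}$ would correspond to the lower scale $\Lambda_5 = (m^4 M_{Pl})^{1/5}$, which is why the absence of growth beyond $E^6$ matters.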
As already stated, from (4.14) and (4.15) we see that the self interactions of $h$ up to quartic order in $h$ can be described by the dRGT massive gravity action. Anticipating that the $n$-point scattering amplitudes are controlled by the scale $\Lambda_3$ to all orders, it is natural to write the interactions for all the fields in the dRGT form, taking particular care to choose combinations which are natural from the point of view of the decoupling limit effective theory, namely those that automatically lead to $\Lambda_3$ interactions to all orders. This process is somewhat laborious, and we quote only our final form for the action, which is given in (4.16), where $g_{\mu\nu} = \eta_{\mu\nu} + \kappa h_{\mu\nu}$ is the dynamical metric, $\eta_{\mu\nu}$ is the reference metric, and the crucial contact terms which fix the form of the 2-2 scattering amplitude are given in Appendix A. The indices are raised and lowered with $g$. The self interactions of the scalar, $\phi$, contain Galileon interactions (the cubic term in (4.16) and the quartic one in (A.1)), a $\phi^3$ term and two additional two- and four-derivative contact terms to this order. The action has been intentionally written in a manner which is diffeomorphism invariant in terms of $K$. The reference metric $\eta$ that breaks diffeomorphism invariance only enters through $K$, and in this sense $K$ is a 'spurion' field for the breaking of diffeomorphisms.
Since the S-matrix is invariant under field redefinitions, the cubic $\phi$ interactions are ambiguous, since we may for example use field redefinitions to trade the cubic Galileon term for a potential $\phi^3$ and vice versa without changing the on-shell vertex. A similar story holds for the $\phi K^2$ and $(\nabla\phi)^2 K$ terms. However, changing the off-shell structure in this way also changes the form of the quartic interactions. Anticipating that the decoupling limit is a Galileon-like theory (which is implicit in the $\Lambda_3$ scale), we have intentionally chosen to put the cubic interactions in a form for which the quartic interactions are also manifestly Galileon-like. In other words, the desire to have a Galileon-like decoupling limit theory gives us guidance in writing the nonlinear off-shell structure of the theory that goes beyond what is immediately inferred from the on-shell scattering amplitudes, even though the diffeomorphism symmetry is broken by the mass term. That is, the decoupling limit for the Stückelberg fields/Goldstone modes gives us an indication of the best way to structure the interacting Lagrangian, and this explains many of our choices of interactions in (4.16) and Appendix A. Although we have not calculated beyond four-point level, the implicit nonlinearly realized diffeomorphism symmetry present in the Stückelberg formulation fixes a set of interactions at all orders, as is familiar in effective theories with broken symmetries.

$\Lambda_3$ Decoupling Limit
Having successfully constructed the interaction Lagrangian for the double copy effective theory, at least to quartic order, it is useful to understand its decoupling limit. This will give us insight into the interactions that arise beyond 2-2 scattering and into the overall structure of the effective theory, and it will also allow us to understand better the connection between the massive Yang-Mills decoupling limit and that for the double copy massive gravity theory. We have intentionally written the interacting Lagrangian (4.16) in as covariant a form as possible, so that the decoupling limit is easily derived. Following the standard recipe (see for example [57] for a review), after writing the reference metric from which $K^\mu{}_\nu$ is constructed in terms of the Stückelberg fields, we further decompose these so that we may identify $V_A$ as the helicity-1 and $\pi$ as the helicity-0 modes of the spin-2 particle. Further, the massive spin-1 state $A_\mu$ is replaced by a Stückelberg combination in which $\chi$ is the original Stückelberg scalar, the helicity-0 state of the spin-1. The normalizations, which are standard, are chosen so that all the additional Stückelberg fields have a finite (and non-zero) kinetic term in the decoupling limit. The metric may be denoted $g_{\mu\nu} = \eta_{\mu\nu} + \kappa h_{\mu\nu}$. Remembering that $\kappa = 2/M_{Pl}$, the decoupling limit is defined by $m \to 0$, $\kappa \to 0$ in such a way that $\Lambda_3^3 = m^2 M_{Pl}$ is kept finite. The Lagrangian has been written in a judicious way to ensure that no term diverges in this limit.
Crucially, we have $\lim_{m\to 0,\,\Lambda_3\,\text{fixed}}$ which explains the emergence of the Galileon symmetry for $\pi$ in the decoupling limit, since $\Pi_{\mu\nu}$ is invariant under $\pi \to \pi + c + v_\mu x^\mu$, and our choice of $K$ as the building block. Hence for all terms in the Lagrangian for which the coefficients are finite in the $\Lambda_3$ limit, it is sufficient to replace $K_{\mu\nu}$ by $\Pi_{\mu\nu}$ and the metric $g_{\mu\nu}$ by $\eta_{\mu\nu}$. The decoupling limit Lagrangian is found to be (keeping track only of those terms which contribute to quartic order) where all indices are raised and lowered with $\eta_{\mu\nu}$. We have separated out the spin-2 and spin-1 helicity-1 contributions, which even in the case of standard massive gravity are particularly complicated [62], and they are schematically and the kinetic term coefficients K µναβ and K µναβ are tensors constructed from $\Pi_{\mu\nu}/\Lambda_3^3$ and $\Phi_{\mu\nu}/\Lambda_3^3$. Since $V_\mu$ and $A_\mu$ are not sourced, classically it is consistent to set them to zero. They would of course contribute in loop processes.
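The invariance of $\Pi_{\mu\nu}$ quoted above is a one-line check. The sketch below assumes $\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi$ with $c$ and $v_\mu$ constant.

```latex
% Galileon shift with constant c and constant v_\alpha:
\delta\pi = c + v_\alpha x^\alpha
\quad\Longrightarrow\quad
\delta\Pi_{\mu\nu}
= \partial_\mu\partial_\nu\!\left(c + v_\alpha x^\alpha\right)
= \partial_\mu v_\nu
= 0 .
```

Any term built only from $\Pi_{\mu\nu}$ and $\eta_{\mu\nu}$ is therefore invariant, while terms in which $\pi$ appears with fewer than two derivatives are invariant only up to total derivatives.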
The tensor $X_{\mu\nu}$, which is characteristic of the massive gravity decoupling limit, needs to be identically conserved to ensure that $h_{\mu\nu}$ preserves spin-2 gauge invariance (linear diffeomorphisms) $h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu$. This is the decoupling limit remnant of full diffeomorphism invariance. Explicitly its form is The tensor (5.7) is indeed identically conserved by virtue of the double $\varepsilon$ structure. The full decoupling limit action (5.5) is invariant under two separate Galileon symmetries $\pi \to \pi + v_\mu x^\mu$, $\phi \to \phi + u_\mu x^\mu$ and thus describes a bi-Galileon theory [69] coupled to a massless spin-2 field. Indeed it may be put in a more manifest bi-Galileon form by performing a 'demixing' transformation that removes the mixed $h\pi$ and $h\pi\pi$ terms, namely We may make use of the fact that up to total derivatives The resulting Lagrangian then takes the form (5.10) where $\tilde\Pi_{ab} = \partial_a \partial_b \tilde\pi$ and $\tilde\pi = \pi - \frac{1}{\sqrt{3}}\phi$. The term $L^{\rm int}_{\text{bi-Galileon}}$ contains standard cubic and quartic$^{8}$ bi-Galileon interactions: where we have used the shorthand $XYZW = \varepsilon^{abcd}\varepsilon_{ABCD} X^{A}{}_{a} Y^{B}{}_{b} Z^{C}{}_{c} W^{D}{}_{d}$ and the coefficients are given by (a 0 , a 1 , a 2 , a 3 The quartic interactions of the form $\tilde h \tilde\Pi^3$ cannot be removed with a local field redefinition, as is well known from the standard massive gravity case. This is as it should be, since it is precisely these interactions that describe the nonzero helicity $0, 0, 0, \pm 2$ amplitudes that arise from the $\Sigma\Sigma$ contact term in the decoupling limit, as described in equation (1.4), implicit in the full answer (D.3) and explicit in (D.4). Indeed the combination $\tilde\pi$ is exactly the combination which identifies the diagonalized parts of $\pi$ and $\phi$ that correspond to the spin-1 helicity-0 polarization tensor squared $\epsilon^\mu_0 \epsilon^\nu_0$.$^{9}$
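The role of the double $\varepsilon$ structure in the conservation of $X_{\mu\nu}$ can be illustrated with a short worked example. The sketch assumes that the tensor in (5.7) is built from terms of the schematic type shown below, with $\Pi_{\alpha\beta} = \partial_\alpha\partial_\beta\pi$; the labels $X^{(1)}$, $X^{(2)}$ and the omitted normalizations are illustrative and are not taken from the text.

```latex
% Schematic building blocks (normalizations omitted, labels hypothetical):
X^{(1)}_{\mu\nu} \propto
  \varepsilon_{\mu}{}^{\alpha\rho\sigma}\,
  \varepsilon_{\nu}{}^{\beta}{}_{\rho\sigma}\, \Pi_{\alpha\beta},
\qquad
X^{(2)}_{\mu\nu} \propto
  \varepsilon_{\mu}{}^{\alpha\rho\sigma}\,
  \varepsilon_{\nu}{}^{\beta\gamma}{}_{\sigma}\, \Pi_{\alpha\beta}\Pi_{\rho\gamma} .

% Divergence of the quadratic term:
\partial^{\mu} X^{(2)}_{\mu\nu} \propto
  \varepsilon_{\mu}{}^{\alpha\rho\sigma}\,
  \varepsilon_{\nu}{}^{\beta\gamma}{}_{\sigma}
  \left[ (\partial^{\mu}\partial_{\alpha}\partial_{\beta}\pi)\,\Pi_{\rho\gamma}
       + \Pi_{\alpha\beta}\,(\partial^{\mu}\partial_{\rho}\partial_{\gamma}\pi) \right]
  = 0 .
% In each term two symmetric derivative indices (\mu\alpha or \mu\rho)
% are contracted with the antisymmetric first \varepsilon, so each vanishes.
```

The same argument applies term by term at any order in $\Pi$ and to mixed $\Pi$, $\Phi$ terms of this form, which is why conservation holds identically rather than on-shell.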
As noted in the introduction, since the decoupling limit of massive Yang-Mills is a nonlinear sigma model and the double copy of the latter is the special Galileon, we might have expected the massive gravity theory to be that corresponding to a special Galileon. Interestingly, however, this was never possible since the decoupling limit of dRGT massive gravity never gives rise to a special Galileon. This is easily seen from the manner in which the Galileon interactions arise from mixing with $h_{\mu\nu}$. The decoupling limit of dRGT massive gravity for general $\kappa_3$ and $\kappa_4$ is (ignoring helicity-1 contributions) where Since the special Galileon in four dimensions is a pure quartic Galileon, we need that after performing the demixing there is no cubic Galileon term. This requires $(2 + 3\kappa_3) = 0$, which does not correspond to the value obtained from double copy. Even with this choice, we then have and so we only have a non-vanishing quartic Galileon term when there is also a non-zero $h\pi\pi\pi$ interaction, which cannot itself be removed with a field redefinition since it contributes to the $\pm 2, 0, 0, 0$ scattering amplitude. Furthermore higher order n-point amplitudes will receive contributions from intermediate graviton exchange which do not arise in the pure quartic Galileon theory. Hence the special Galileon does not, strictly speaking, arise in standard massive gravity in any form.

Discussion
In this paper we explored the possibility of constructing an interacting massive spin-2 theory, i.e. massive gravity, as a double copy of massive Yang-Mills theory. Our prescription for doing this is to demand that the kinematic factors for the massive theory, defined by normalizing by the massive scalar propagator (2.6), satisfy the same colour/kinematics duality as the massless case. This is a nontrivial requirement even at the level of 2-2 scattering; we nevertheless find that it remains intact to this order. Interestingly the ambiguity that arises in the massless case (e.g. the ability to shift $n_s$ to $n_s + \alpha s$ etc., the so-called generalized gauge transformations) is fixed in the massive case by the requirement that colour kinematics holds. Furthermore the manner in which it is fixed is such that the kinematic factors contain a term which is singular in the decoupling limit, but nevertheless leads to finite contributions to the gravity amplitudes. One consequence of this is that the decoupling limit and double copy procedures do not commute, a result which could not have been anticipated from the decoupling limit theories alone. Hence the by now well known relations between the scattering amplitudes of nonlinear sigma models, special Galileons etc. [5,7,11,12], which appear to be part of a large web of interconnected theories, are non-trivially lifted by the presence of a mass term. It is beyond the scope of this paper to consider the higher n-point amplitudes which are needed to check whether the double copy procedure remains intact at all orders. However, it was shown in [66] that at 5 points the double copy of the massive Yang-Mills amplitude gives spurious poles and therefore cannot be matched with an amplitude calculated from a local Lagrangian.
$^{9}$ To see this, note that at leading order in the decoupling limit Kµν ∼ 1 (− 3/2π + 1
The reason for such poles is that at 5 points the massive Yang-Mills kinematic factors, $n_i$, calculated directly from Feynman rules do not satisfy Jacobi identities, so they need to be shifted as $n_i \to n_i + \Delta_i$ in such a way that the amplitude remains unchanged. As it turns out, for generic theories such shifts are non-local, i.e. they have poles in the kinematic invariants $s_{ij}$. For example, in the massive Yang-Mills case these shifts contain the following polynomial of $s_{ij}$ in the denominator $320m^8 + 36m^6(9s_{12}\ldots$ which has zeros given by complicated expressions; these are the locations of unphysical poles when the shifted $n_i$'s are squared. In [66] the conditions on the spectrum of the theory for avoiding such poles are derived and it was shown that massive Yang-Mills theory does not satisfy them. Therefore, the double copy of massive Yang-Mills amplitudes can be matched with amplitudes calculated from the gravitational action in (4.16) only at three and four points. While this result is clearly negative in terms of the naive application of the conjecture (4.2), we do not regard this as necessarily terminal for several reasons. It is worth noting that in the present context there is more freedom than in the conventional story because both sides of the double copy are effective theories. We are free to add irrelevant operators suppressed by the scale $\Lambda$ or a parametrically lower scale on the Yang-Mills side, for the sole purpose of ensuring the colour/kinematics relation remains intact even when the spectral conditions of [66] are not satisfied. One reasonable conjecture is that this should be possible in such a manner as to ensure that the gravitational theory remains a $\Lambda_3$ theory, or a theory with a parametrically lower cutoff, to all orders.
Some support for this comes from the fact that if we only focus on those amplitudes that arise from helicity-0 modes of the spin-1, then the double copy procedure is known to work to all orders since it gives a special Galileon for which all interactions arise at the scale $\Lambda_3$. Since, as we have seen, the decoupling limit and double copy procedures do not commute, this does not constitute a proof.
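The statement in the Discussion that the generalized gauge freedom of the massless case is fixed in the massive case can be illustrated at four points. The sketch below assumes the amplitude is normalized by the massive scalar propagator as in (2.6), with overall sign and coupling factors dropped, that the colour factors obey $c_s + c_t + c_u = 0$, and that $s + t + u = 4m^2$ for four external states of mass $m$; $\alpha$ is an arbitrary function of the kinematics.

```latex
% Four-point amplitude (overall normalization is an assumption):
A_4 = \frac{c_s n_s}{s - m^2} + \frac{c_t n_t}{t - m^2} + \frac{c_u n_u}{u - m^2},
\qquad c_s + c_t + c_u = 0 .

% Generalized gauge transformation:
n_s \to n_s + \alpha\,(s - m^2), \quad
n_t \to n_t + \alpha\,(t - m^2), \quad
n_u \to n_u + \alpha\,(u - m^2) .

% The amplitude is unchanged:
\delta A_4 = \alpha\,(c_s + c_t + c_u) = 0 .

% The kinematic Jacobi sum is not:
\delta\,(n_s + n_t + n_u) = \alpha\,(s + t + u - 3m^2) = \alpha\, m^2 .
```

For $m = 0$ the shift preserves $n_s + n_t + n_u = 0$ for any $\alpha$, which is the usual ambiguity. For $m \neq 0$ the Jacobi sum changes by $\alpha m^2$, so demanding colour/kinematics duality selects a unique representative within this family of shifts.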
Perhaps more importantly though, the rules of the double copy paradigm for massive theories, even if they apply, are not well established and it may not be meaningful to impose the conjecture (4.2) strictly to all orders. Since nearly all cases in which the double copy has been well established are for massless theories, it is not unreasonable to suppose that something like equation (4.2) may only be true at leading order in an expansion in powers of $m$, i.e. at leading order in the decoupling limit. What we have been able to construct, as outlined in section 5, is a local $\Lambda_3$ effective theory whose interactions match with the double copy prescription with massive Yang-Mills up to 4-point order. The effective theory whose leading terms are given in (4.16) is consistent to all orders (from the low energy point of view) and we may use it to compute arbitrary local (analytic) n-point functions at a given order in an EFT expansion, even if these n-point functions do not precisely match those implied by the double copy conjecture (4.2). That is, we can infer the leading contributions to the higher point interactions in (4.16) from the decoupling limit rather than double copy. That this is possible despite having only computed up to 4-point order is due to the role played by the nonlinearly realized symmetry, from the underlying symmetry breaking in giving mass, and how it determines the leading interactions of the low energy theory, organizing the structure of the EFT. It remains possible that the effective theory outlined in (4.16), or some close relative to it, does have a role to play in an appropriately double copied spontaneously broken gauge theory.
It is also worth remembering that the de Rham-Gabadadze-Tolley model of massive gravity is really just one example of a large family of interacting effective theories which may contain any number of massive spin-2 and lower states, just as massive Yang-Mills is just one example of an effective theory for a spontaneously broken gauge symmetry. It would be interesting to explore extensions on both sides to understand to what extent the double copy paradigm may be preserved, either fully in the sense of a strict relation along the lines of (4.2), or at least partially in a weaker form. The present work serves as a starting point for such an analysis and for establishing the relation between the decoupling limits on each side. What is interesting to note is that the class of Galileon effective theories emerges ubiquitously in decoupling limits of these theories. Indeed the original Galileon model was first noted to arise in a diffeomorphism invariant five dimensional theory, the Dvali-Gabadadze-Porrati model [70], which is superficially quite different from ghost-free massive gravity. There the effective four dimensional graviton emerges as a resonance, i.e. may be viewed as a continuum of massive states. The intimate connection between the NLSM and the special Galileon, and the similar emergence of the bi-Galileon effective theory in our present discussion, suggest that there may be an analogous double copy prescription to (4.2) which could be applied to a soft massive theory in which the mass for the spin-1 field on the gauge side emerges from a resonance. In this manner the spurious pole issue identified for higher n-point functions in [66] may be resolved by satisfying constraints on the interactions of the continuum modes without needing to satisfy specific spectrum conditions, since the spectrum itself is continuous.
There are many interesting future directions that these results suggest. For example, is it possible to add irrelevant operators to (2.1) or new fields such that the double copy procedure works for all n-point amplitudes, or is it only possible if the spectral conditions of [66] are satisfied? If no extra fields or higher order operators can fix the spurious poles then maybe this suggests that there is some problem of constructing such a massive gravitational theory elsewhere. For example, as an anonymous referee pointed out, for Yang-Mills theory with a single adjoint fermion the colour kinematics duality cannot be satisfied in more than 10 dimensions because a gravitational theory with more than 8 gravitini and no supersymmetry cannot exist in 4 dimensions. Also, can we construct the double copy of the Yang-Mills action coupled to a Higgs field, which is the UV completion of massive Yang-Mills considered here, and what is the resulting gravitational theory?$^{10}$ Does this give us any insight into UV completing massive gravity theories? Does the procedure outlined hold at loop level in any way? Are there some simple extensions of the classical double copy relations? The latter would be highly nontrivial given the known complicated nonlinear dynamics of massive gravity theories exhibited by the Vainshtein mechanism. We leave these various considerations to future work.

A Contact terms
Below are the various contact terms needed in (4.16) to reproduce the desired quartic interactions. All terms are written in a covariant form, with the understanding that they enter the action with a $\sqrt{-g}$ prefactor.

B.1 Lie Algebra Generators of the Gauge Group
We use the following conventions for the generators, $T^a$: These are related to the usual generators (for example in [71]) as $T^a = \sqrt{2}\,t^a$. We define the structure constants, $f^{abc}$, as which again are larger by a factor of $\sqrt{2}$ than the structure constants in [71]. In terms of these $f^{abc}$ the field strength tensor $F^a_{\mu\nu}$ is written as:

B.2 Polarizations
The four-momenta in the centre of mass frame with scattering angle $\theta$ and three-momentum $p = \frac{1}{2}\sqrt{s - 4m^2}$ are defined as: We define the polarization vectors in the helicity basis as follows: where $(\epsilon^\mu_\lambda)^* = (-1)^\lambda \epsilon^\mu_{-\lambda}$. The polarization tensors for the spin-2 field with different helicities are constructed from the polarization vectors with appropriate Clebsch-Gordan (CG) coefficients as (we review the construction in detail in B.3), The polarization tensors satisfy the transverse, traceless and completeness relations

B.3 Construction of gravity states from mYM
As mentioned in section 4.1, from the tensor product of two massive spin-1 states we get a massive spin-2, a massive spin-1 and a massive spin-0 state on the gravity side. In this section we review how the gravity on-shell states are constructed from such a product, i.e. $|1, \lambda_1\rangle \otimes |1, \lambda_2\rangle$. The polarization tensor of the particle of spin $J$ with helicity $\lambda$ is given as, where $\lambda = \lambda' + \lambda''$. We start from the spin-0 state which is obtained from $|0, \lambda\rangle = |1, \lambda'\rangle \otimes |1, \lambda''\rangle$, with $\lambda = 0 = \lambda' + \lambda''$. This polarization state is obtained by considering the following: where $C^{0}_{\lambda'\lambda''}$ are the CG coefficients given in (B.15). By substituting (B.5), we can see that (B.10) can be expressed as: Hence the factor of $\frac{1}{\sqrt{3}}$ in (4.7) follows from the CG coefficient.
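The origin of the $\frac{1}{\sqrt{3}}$ can be made explicit with the standard Clebsch-Gordan decomposition of $1 \otimes 1 \to 0$. The sketch below assumes the Condon-Shortley phase convention, the phase relation $(\epsilon^\mu_\lambda)^* = (-1)^\lambda \epsilon^\mu_{-\lambda}$, and the standard completeness relation in mostly-plus signature with $p^2 = -m^2$; the overall sign may differ from the convention of (B.15), and the symbol $\epsilon^{(0)}_{\mu\nu}$ is introduced here for illustration.

```latex
% Spin-0 state from two spin-1 states (Condon-Shortley phases assumed):
|0,0\rangle = \frac{1}{\sqrt{3}}
  \Big( |1,+1\rangle |1,-1\rangle - |1,0\rangle |1,0\rangle + |1,-1\rangle |1,+1\rangle \Big) .

% Corresponding polarization tensor:
\epsilon^{(0)}_{\mu\nu}
  = \frac{1}{\sqrt{3}}
    \left( \epsilon^{+}_{\mu}\epsilon^{-}_{\nu}
         - \epsilon^{0}_{\mu}\epsilon^{0}_{\nu}
         + \epsilon^{-}_{\mu}\epsilon^{+}_{\nu} \right)
  = -\frac{1}{\sqrt{3}} \sum_{\lambda} \epsilon^{\lambda}_{\mu} \left(\epsilon^{\lambda}_{\nu}\right)^{*}
  = -\frac{1}{\sqrt{3}} \left( \eta_{\mu\nu} + \frac{p_\mu p_\nu}{m^2} \right) .

% Normalization check: the projector has trace 4 - 1 = 3
\epsilon^{(0)}_{\mu\nu} \left(\epsilon^{(0)\,\mu\nu}\right)^{*}
  = \frac{1}{3} \left( \eta_{\mu\nu} + \frac{p_\mu p_\nu}{m^2} \right)
                \left( \eta^{\mu\nu} + \frac{p^\mu p^\nu}{m^2} \right)
  = \frac{1}{3}\,(4 - 1) = 1 .
```

The factor $\frac{1}{\sqrt{3}}$ is thus fixed by normalizing the transverse projector, whose trace counts the three polarizations of the massive spin-1 state.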
To give an explicit example, the helicity $\lambda = +2$ is, In this paper we take the polarization states to be superpositions of different helicities rather than focusing on specific choices; for example, for the graviton polarisation we have, (B.14) Spin-2: $C^{2,2}_{++} = C^{2,-2}_{--} = 1$,

C Dualization of the massive B field in 4d
We follow the dualization procedure explained in [72]. The Stückelberg action of a free massive 2-form field, $B$, is where $\lambda$ is a 1-form Stückelberg field which is needed to restore the gauge symmetry, which acts on the fields as follows: The first step in the dualization procedure is to rewrite the action in terms of the field strengths $H = dB$ and $G = mB - d\lambda$. To do that we need to impose Bianchi identities, with Lagrange multipliers. We first do it for $G$: where $A$ is a 1-form Lagrange multiplier imposing (C.3). By integrating the last term by parts we can find the equation of motion for $G$ to be $G = -{*}dA$ (C.5). Substituting this back into the action and integrating the last term by parts we get Now we can replace $dB$ by $H$ and impose (C.2) with a scalar Lagrange multiplier, $\chi$. This gives the following Now again we integrate the last term by parts and find the equation of motion for $H$ to be Substituting this back into (C.7) gives the Stückelberg action for the massive spin-1 field, $A$, known as the Proca action: where $\chi$ is now the Stückelberg scalar field. From (C.8) we can see that in unitary gauge, $\chi = 0$, the relation between the $B$ and $A$ fields is $dB = -{*}mA$ which in coordinate basis can be written as: This means that the relationship between the polarization vector of $A$, $\epsilon^{(A)}$, and the polarization tensor of $B$, $\epsilon^{(B)}$, will be of the form: where the overall constant can be found by requiring $\epsilon^{(B)}$ to be normalised (i.e. consistent with (B.6)). This relation can be inverted by multiplying both sides by the $\varepsilon$ tensor and $p$, which using $p^2 = -m^2$ and imposing the normalisation condition gives (4.6).
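A quick consistency check on this dualization is the count of propagating degrees of freedom. The sketch below uses the standard counting for free $p$-forms in $d$ dimensions, $\binom{d-1}{p}$ for massive and $\binom{d-2}{p}$ for massless fields; the shorthand $N$ is introduced here only for illustration.

```latex
% Massive p-form in d dimensions: N_massive(p,d) = \binom{d-1}{p}
N_{\rm massive}(2,4) = \binom{3}{2} = 3 ,
\qquad
N_{\rm massive}(1,4) = \binom{3}{1} = 3 .

% Massless p-form in d dimensions: N_massless(p,d) = \binom{d-2}{p}
% Stueckelberg split of the massive 2-form B with 1-form \lambda:
3 = \underbrace{\binom{2}{2}}_{B:\ 1} + \underbrace{\binom{2}{1}}_{\lambda:\ 2} ,
% Stueckelberg split of the massive vector A with scalar \chi:
3 = \underbrace{\binom{2}{1}}_{A:\ 2} + \underbrace{\binom{2}{0}}_{\chi:\ 1} .
```

Both sides of the duality therefore carry three degrees of freedom in four dimensions, matching the three polarizations of a massive spin-1 particle, with the roles of the Stückelberg field and the gauge field exchanged.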

D Double Copy of the 4-Point Scattering Amplitude in the Decoupling Limit
We take the $\Lambda_3$ decoupling limit of the full scattering amplitude obtained from double copy, with external states given by arbitrary superpositions of the $h$ and $\phi$ fields defined as (setting the vectors to zero for simplicity): This gives the following amplitude: This amplitude simplifies considerably if we focus on scattering processes of the form $+2XXX$. We may easily see that $X$ can only be a scalar mode and this amplitude then takes the form The combination β T 5 + (1/√2) β S is precisely the combination of polarizations that picks out the helicity-0 squared term $\epsilon^\mu_0 \epsilon^\nu_0$. Since the helicity +2 mode has polarization tensor $\epsilon^\mu_+ \epsilon^\nu_+$ we recognize that $M_4(+2XXX)$ is the double copy of the $+1000$ massive Yang-Mills amplitude and comes specifically from the $\Sigma\Sigma$ contact term (1.4).

E Decoupling limit of massive Yang-Mills amplitude
In this section we derive the decoupling limit of the massive Yang-Mills amplitude, which is expected to be the amplitude of the NLSM, derive the kinematic factors, and double copy it to show that we recover the 4-point amplitude of a special Galileon. We also show that taking the decoupling limit and performing the double copy do not commute. From (2.6), the 4-point amplitude of massive Yang-Mills is expressed as: with the $n$'s given by (2.12), (2.13) and (2.14). By plugging the polarization vectors, which are arbitrary superpositions of all helicities given as: and the four-momenta into the $n$'s, they can be rearranged in the following form (as mentioned in (1.1)): (E.3) with $n_s + n_t + n_u = 0$ and $\hat n_s + \hat n_t + \hat n_u = -m\Sigma$. The explicit expressions for the $\hat n$'s and $\Sigma(s, t, u)$ are given in (E.12), (E.13), (E.14) and (E.15). The amplitude can be written as, and as mentioned in the introduction, the last term, which seems at first ill defined in the decoupling limit $m \to 0$, $\Lambda$ fixed, is zero by virtue of the Jacobi identity. Focusing on the non-zero term, the amplitude in the decoupling limit is as follows: (E.6) We see that only helicity-0 polarization states remain interacting in this decoupling limit. The kinematic factors of this amplitude are, Note that in this limit we have $s + t + u = 0$ and can see that the colour-kinematics duality is satisfied.
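As an illustration of how the double copy of the decoupling limit amplitude produces a special Galileon, the following sketch assumes that the NLSM kinematic factors take the standard four-point form $n_s \propto s(t-u)$ and cyclic, up to a common normalization; the actual expressions in this appendix may differ by conventions. It also assumes the massless double copy rule $M_4 \propto \sum_i n_i^2/s_i$.

```latex
% Assumed NLSM numerators (common normalization dropped):
n_s = s\,(t-u), \qquad n_t = t\,(u-s), \qquad n_u = u\,(s-t),
\qquad n_s + n_t + n_u = 0 .

% Double copy with massless propagators:
M_4 \;\propto\; \frac{n_s^2}{s} + \frac{n_t^2}{t} + \frac{n_u^2}{u}
  = s\,(t-u)^2 + t\,(u-s)^2 + u\,(s-t)^2 .

% Expand and use s + t + u = 0, e.g. s t^2 + t s^2 = s t (s + t) = - s t u :
s\,(t-u)^2 + t\,(u-s)^2 + u\,(s-t)^2
  = \left( -3\,stu \right) - 6\,stu
  = -9\, s t u .
```

The result is proportional to $stu$, the characteristic four-point amplitude of the quartic (special) Galileon. A quick numerical check with $s = t = 1$, $u = -2$ gives $9 + 9 + 0 = 18 = -9\,stu$.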
Using the kinematic factors of this amplitude we double copy it and obtain the following: limit and performing double copy do not commute.