Scattering Amplitudes For All Masses and Spins

We introduce a formalism for describing four-dimensional scattering amplitudes for particles of any mass and spin. This naturally extends the familiar spinor-helicity formalism for massless particles to one where these variables carry an extra SU(2) little group index for massive particles, with the amplitudes for spin S particles transforming as symmetric rank 2S tensors. We systematically characterise all possible three-particle amplitudes compatible with Poincare symmetry. Unitarity, in the form of consistent factorization, imposes algebraic conditions that can be used to construct all possible four-particle tree amplitudes. This also gives us a convenient basis in which to expand all possible four-particle amplitudes in terms of what can be called "spinning polynomials". Many general results of quantum field theory follow from the analysis of four-particle scattering, ranging from the set of all possible consistent theories for massless particles, to spin-statistics, and the Weinberg-Witten theorem. We also find a transparent understanding of why massive particles of sufficiently high spin cannot be "elementary". The Higgs and Super-Higgs mechanisms are naturally discovered as an infrared unification of many disparate helicity amplitudes into a smaller number of massive amplitudes, with a simple understanding of why this can't be extended to Higgsing for gravitons. We illustrate a number of applications of the formalism at one-loop, giving few-line computations of the electron (g-2) as well as the beta function and rational terms in QCD. "Off-shell" observables like correlation functions and form-factors can be thought of as scattering amplitudes with external "probe" particles of general mass and spin, so all these objects (amplitudes, form factors and correlators) can be studied from a common on-shell perspective.


Scattering Amplitudes in the Real World
Recent years have seen an explosion of progress in our understanding of scattering amplitudes in gauge theories and gravity. Infinite classes of amplitudes, whose computation would have seemed unthinkable even ten years ago, can now be derived with pen and paper on the back of an envelope using a set of ideas broadly referred to as "on-shell methods" [1,2]. This has enabled the determination of scattering amplitudes of direct interest to collider physics experiments, while at the same time opening up novel directions of theoretical research into the foundations of quantum field theory, amongst other things revealing surprising and deep connections of this basic physics with areas of mathematics ranging from algebraic geometry to combinatorics to number theory. Almost all of the major progress in this field has been in understanding scattering amplitudes for massless particles. There are seemingly good reasons for this, both technically and conceptually. Technically, almost all treatments of the subject, especially in four dimensions, involve the introduction of special variables (such as spinor-helicity, twistor or momentum-twistor variables) to trivialise the kinematical on-shell constraints for massless particles (see [3] for a comprehensive review). And conceptually, while it is clear that the conventional field-theoretic description of massless particles with spin, which involves the introduction of huge gauge redundancy, leaves ample room for improvement (provided by on-shell methods that directly describe particles, eliminating any reference to quantum fields and their attendant redundancies), the advantage of "on-shell physics" seems to disappear for the case of massive particles, where no gauge redundancies are needed.
As we will see, the technical issue about massless kinematics is just that (the transition to describing massive particles is a triviality), while the conceptual issue is not an obstacle but rather an invitation to understand both the physics of "infrared deformation" of massless theories (by the Higgs mechanism and confinement) and that of UV completion (such as with perturbative string theory) from a new on-shell perspective (see sec.6).
But before getting too far ahead of ourselves it suffices to remember that the only exactly massless particles we know of in the real world are photons and gravitons; even the spectacular success of on-shell methods applied to collider physics are for high energy gluon collisions, which are ultimately confined into massive hadrons at long distances. Even if we consider the weakly coupled scattering amplitudes for Standard Model particles above the QCD scale, almost all the particles are massive. If the amazing structures unearthed in the study of gauge and gravity scattering amplitudes are indeed an indication of a radical new way of thinking about quantum particle interactions in space-time, they must naturally extend beyond photons, gravitons and gluons to electrons, W, Z particles and top quarks as well.
Keeping this central motivation in mind, in this paper we initiate a systematic exploration of the physics of scattering amplitudes in four dimensions, for particles of general masses and spins. We proceed in sec.2 with an on-shell formalism where the amplitude is manifestly covariant under the massive SU(2) little group. This approach allows us to cleanly categorize all distinct three-particle couplings for a given set of helicities or masses and spins. When constructing four-point amplitudes, this formalism sharply pinpoints the tension between locality and consistent factorization, which in turn provides a portal into the difficulty of having massive higher-spin particles that are fundamental. As we will see, everything that is typically taught in introductory courses on QFT and the Standard Model, including classic computations of the electron (g−2) and the QCD β function (sec.7), can be transparently reproduced from an on-shell perspective directly following from the physics of Poincare invariance, locality and unitarity, without ever encountering quantum fields, Lagrangians, gauge and diff invariance, or Feynman rules.
There are a number of other motivations for developing this formalism. For instance, much of the remarkable progress in our understanding of the dynamics of supersymmetric gauge theories came from exploring their moduli spaces of vacua [4]. From this point of view the study of massless scattering amplitudes has been stuck on a desert island at the origin of moduli space; we should now be able to study how the S-matrix varies on moduli space in general supersymmetric theories, especially beginning with the Coulomb branch of N = 4 SYM in the planar limit (see [5] for early surveys).
Another motivation, alluded to above, is the physics of UV completion for gravity scattering amplitudes. It is easy to show on general grounds that any weakly coupled UV completion for gravity amplitudes must involve an infinite tower of particles with infinitely increasing spins (as of course seen in string theory) [6]. This raises the possibility that string theory might be derivable from the bottom-up, as the unique weakly-coupled UV completion of gravity. But it has become clear that consistency conditions for massless graviton scattering alone are not enough to uniquely fix amplitudes: deformations of the graviton scattering amplitudes compatible with all the standard rules have been identified (eq.(12.6) in [6]). This is not surprising, since the most extreme tension in this physics is the coexistence of gravitons with massive higher-spin particles. Indeed (as we will review in sec.3.2 from an on-shell perspective) the presence of gravity makes the existence of massless higher-spin particles impossible. We should therefore expect the strongest consistency conditions on perturbative UV completion to involve the scattering of massless gravitons and massive higher-spin particles, the study of which calls for a good general formalism for treating amplitudes for general mass and spin.
Finally, an understanding of amplitudes for general mass and spin removes the distinction between "on-shell" observables like scattering amplitudes and "off-shell" observables like correlation functions [7]. After all, loosely speaking, the way experimentalists actually measure correlation functions of some system is to weakly couple the system to massive detectors, and effectively measure the scattering amplitudes for the detectors thought of as massive particles with general mass and spin! More precisely, as we demonstrate in sec.8, to compute the correlation functions for (say) the stress tensor (in momentum space), we need only imagine weakly coupling a continuum of massive spin 2 particles to the system with a universal (and arbitrarily weak) coupling; the leading scattering amplitudes for these massive particles are then literally the correlation functions for the stress tensor in momentum space. This should allow us to explore both on- and off-shell physics in a uniform "on-shell" way.

The Little Group
Much of the non-trivial physics of scattering amplitudes traces back to the simple question "what is a particle?" and the attendant concept of Wigner's "little group" governing the kinematics of particle scattering. Let us review this standard story. Following Wigner (and Weinberg's exposition and notation) [8,9], we think of "particles" as irreducible unitary representations of the Poincare group. We diagonalize the translation operator by labelling particles with their momentum $p_\mu$; any other labels a particle state can carry are denoted by $\sigma$. In order to systematically label all one-particle states, we start with some reference momentum $k_\mu$ and the states $|k,\sigma\rangle$. Now, we can write any momentum $p$ as a specified Lorentz transformation $L(p;k)$ acting on $k$, i.e. $p_\mu = L_\mu{}^\nu(p;k)\,k_\nu$. Note that $L(p;k)$ is not unique, since there are clearly Lorentz transformations that leave $p$ invariant; these "little group" transformations will figure prominently in what follows. For now we simply emphasize that we pick some specific $L(p;k)$ for which $p = L(p;k)k$. We also assume that we have a unitary representation of the Lorentz group, i.e. for every Lorentz transformation $\Lambda$ there is an associated unitary operator $U(\Lambda)$ acting on the Hilbert space, such that $U(\Lambda_1\Lambda_2) = U(\Lambda_1)U(\Lambda_2)$. Then we simply define one-particle states $|p,\sigma\rangle$ as
$$|p,\sigma\rangle \equiv U(L(p;k))\,|k,\sigma\rangle . \qquad (2.1)$$
Note that the $\sigma$ index is the same on the left and the right; this is the sense in which we are defining $|p,\sigma\rangle$. Having made this definition, we can ask how $|p,\sigma\rangle$ transforms under a general Lorentz transformation:
$$U(\Lambda)|p,\sigma\rangle = U(\Lambda)\,U(L(p;k))|k,\sigma\rangle = U(L(\Lambda p;k))\,U\big(L^{-1}(\Lambda p;k)\,\Lambda\,L(p;k)\big)|k,\sigma\rangle .$$
The combination $W(\Lambda,p,k) = L^{-1}(\Lambda p;k)\,\Lambda\,L(p;k)$ maps $k$ to itself, so it is an element of the little group of $k$ and acts on the reference states as $U(W)|k,\sigma\rangle = D_{\sigma\sigma'}(W)|k,\sigma'\rangle$, where $D_{\sigma\sigma'}(W)$ is a representation of the little group. Therefore
$$U(\Lambda)|p,\sigma\rangle = D_{\sigma\sigma'}(W(\Lambda,p,k))\,|\Lambda p,\sigma'\rangle . \qquad (2.4)$$
We conclude that a particle is labeled by its momentum and transforms under some representation of the little group. Scattering amplitudes for $n$ particles are thus labeled by $(p_a,\sigma_a)$ for $a = 1,\cdots,n$. The Poincare invariance of the S-matrix (translation and Lorentz invariance) then tells us that
$$\mathcal{M}(p_a,\sigma_a) = \delta^D(p^\mu_{a_1} + \cdots + p^\mu_{a_n})\,M(p_a,\sigma_a), \qquad M^\Lambda(p_a,\sigma_a) = \prod_a D_{\sigma_a\sigma_a'}(W)\,M((\Lambda p)_a,\sigma_a') . \qquad (2.5)$$
In $D$ spacetime dimensions, the little group for massive particles is SO($D$−1). For massless particles the little group is the group of Euclidean symmetries in ($D$−2) dimensions, which is SO($D$−2) augmented by ($D$−2) translations. Finite-dimensional representations require choosing all states to have vanishing eigenvalues under these translations, and hence the little group is just SO($D$−2). So much for the basic kinematics of particle scattering amplitudes. It is when we come to dynamics, and in particular to the crucial question of guaranteeing that the physics of particle interactions is compatible with the most minimal notion of locality encoded in the principle of cluster decomposition, that a fateful decision is made to choose a particular description of particle scattering, introducing the idea of quantum fields. Beyond particles of spin zero (and their associated scalar fields), there is a basic kinematical awkwardness associated with introducing fields: fields are manifestly "off-shell", and transform as Lorentz tensors (or spinors), while particle states transform instead under the little group. The objects we compute directly with Feynman diagrams in quantum field theory, which are Lorentz tensors, have the wrong transformation properties to be called "amplitudes". This is why we introduce the idea of "polarisation vectors", which are meant to transform as bi-fundamentals under the Lorentz and little group, to convert "Feynman amplitudes" to the actual "scattering amplitudes".
For instance in the case of spin 1 particles, we introduce $\epsilon^\mu_\sigma(p)$, with the property that $\epsilon^\mu_\sigma(\Lambda p) = \Lambda^\mu{}_\nu\,\epsilon^\nu_{\sigma'}(p)\,D_{\sigma'\sigma}(W)$, so that $\epsilon^\mu_\sigma(p)\,M_\mu(p,\cdots)$ transforms properly. For massive particles, such polarization vectors certainly exist, though they have to satisfy constraints. For instance we must have $p_\mu\epsilon^\mu_\sigma = 0$ for massive spin 1, or for massive spin 1/2, we use a Dirac spinor $\Psi^A_\sigma$ with $(\Gamma^\mu p_\mu - m)^A{}_B\,\Psi^B = 0$. These constraints are an artifact of using fields as auxiliary objects to describe the interactions of the more fundamental particles. For massless particles with spin ≥ 1 the situation is worse, since "polarisation vectors" transforming as bi-fundamentals under the Lorentz and little groups don't exist. Say for massless particles in four dimensions, if we make some choice for the $\epsilon^\mu_\pm$ for photons of helicity ±1, we find that for Lorentz transformations with $\Lambda p = p$, $(\Lambda\epsilon_\pm)^\mu = e^{\pm i\theta}\epsilon^\mu_\pm + \alpha(\Lambda,p)\,p^\mu$. So polarisation vectors don't genuinely transform as vectors under Lorentz transformations; only the "gauge equivalence class" $\{\epsilon^\mu_\pm \,|\, \epsilon^\mu_\pm + \alpha p^\mu\}$ is invariant under Lorentz transformations. This infinite redundancy is hard-wired into the usual field-theoretic description of scattering amplitudes for gauge bosons and gravitons, and is largely responsible for the apparent enormous complexity of amplitudes in these theories, obscuring the remarkable simplicity and hidden infinite-dimensional symmetries actually found in the physics.
The modern on-shell approach to scattering amplitudes departs from the conventional approach to field theory already at this early kinematical stage, by directly working with objects that transform properly under the little group (and so at least kinematically deserve to be called "scattering amplitudes") from the get-go. Auxiliary objects such as "quantum fields" are never introduced and no polarization vectors are needed. It is maximally easy to do this in the D = 4 spacetime dimensions of our world, where the kinematics is as simple as possible. Here the little groups are SO(2) = U (1) for massless particles, and SO(3) = SU (2) for massive particles, which are the simplest and most familiar Lie groups.
In four dimensions, we label massless particles by their helicity $h$. Massive particles transform as some spin $S$ representation of SU(2). The conventional way of labelling spin states familiar from introductory quantum mechanics is by picking a spin axis $\hat z$ and giving the eigenvalue of $J_z$ in that direction. This is inconvenient for our purposes, since the introduction of the reference direction $\hat z$ breaks manifest rotational (not to speak of Lorentz) invariance. We will find it more convenient instead to label states of spin $S$ as a symmetric tensor of SU(2) with rank $2S$; this entirely elementary group theory is reviewed in appendix B. Let's illustrate the labelling of states by considering a four-particle amplitude where particles 1, 2 are massive with spin 1/2 and 2, and particles 3, 4 are massless with helicities +3/2 and −1. This would be represented as an object
$$M^{\{I_1\},\{J_1,J_2,J_3,J_4\},\{+\frac{3}{2}\},\{-1\}}$$
where $\{I_1\}$, $\{J_1,\ldots,J_4\}$ are the little group indices of particle 1 and 2 respectively, and the amplitude transforms as
$$M^{\{I_1\},\{J_1,\ldots,J_4\},\{+\frac{3}{2}\},\{-1\}} \to (W_1)^{I_1}{}_{I_1'}\,(W_2)^{J_1}{}_{J_1'}\cdots(W_2)^{J_4}{}_{J_4'}\;(w_3)^{3}\,(w_4)^{-2}\;M^{\{I_1'\},\{J_1',\ldots,J_4'\},\{+\frac{3}{2}\},\{-1\}} \qquad (2.7)$$
where the $W$ matrices are SU(2) transformations in the spin 1/2 representation and $w = e^{i\theta}$ is the massless little group phase factor for helicity +1/2.

Massless and Massive Spinor-Helicity Variables
Our next item of business is to find variables for the kinematics that hardwire these little group transformation laws; this will be simultaneously associated with convenient representations of the on-shell momenta. As usual we will use the $\sigma^\mu_{\alpha\dot\alpha}$ matrices to convert between four-momenta $p_\mu$ and the 2 × 2 matrix $p_{\alpha\dot\alpha} = p_\mu\sigma^\mu_{\alpha\dot\alpha}$.$^{1}$ Note that $\det p_{\alpha\dot\alpha} = m^2$, so that there is an obvious difference between massless and massive particles.
For massless particles, we have $\det p_{\alpha\dot\alpha} = 0$ and thus the matrix $p_{\alpha\dot\alpha}$ has rank 1. We can therefore write it as the direct product of two 2-vectors $\lambda,\tilde\lambda$ as [10]
$$p_{\alpha\dot\alpha} = \lambda_\alpha\tilde\lambda_{\dot\alpha} \qquad (2.8)$$
For general complex momenta the $\lambda_\alpha,\tilde\lambda_{\dot\alpha}$ are independent two-dimensional complex vectors. For real momenta in Minkowski space $p_{\alpha\dot\alpha}$ is Hermitian and so we have $\tilde\lambda_{\dot\alpha} = \pm(\lambda_\alpha)^*$ (with the sign determined by whether the energy is taken to be positive or negative). Often the introduction of these "spinor-helicity" variables is motivated by the desire to explicitly represent the (on-shell constrained) four-momentum $p_{\alpha\dot\alpha}$ by the unconstrained $\lambda_\alpha,\tilde\lambda_{\dot\alpha}$. But the spinor-helicity variables also have another conceptually important role to play: they are the objects that transform nicely under both the Lorentz and little groups. Thus while amplitudes for massless particles are not functions of momenta and polarization vectors (or better yet, are only redundantly represented in this way), they are directly functions of spinor-helicity variables.
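As a numerical illustration of eq. (2.8), the following minimal sketch builds $\lambda,\tilde\lambda$ for a real null momentum and checks that the rescaling $\lambda\to w^{-1}\lambda$, $\tilde\lambda\to w\tilde\lambda$ leaves the momentum unchanged. The helper names `momentum_matrix` and `massless_spinors`, and the explicit matrix convention $p_{\alpha\dot\alpha} = E\,\mathbb{1} + \vec p\cdot\vec\sigma$, are assumptions made for this example and are not taken from appendix A.

```python
import numpy as np

def momentum_matrix(p):
    """Return the 2x2 matrix p_{alpha alphadot} for p = (E, px, py, pz).

    Assumed convention (hypothetical, not from appendix A):
    p = E*1 + p.sigma, so that det(p) = E^2 - |p|^2 = m^2.
    """
    E, px, py, pz = p
    return np.array([[E + pz, px - 1j * py],
                     [px + 1j * py, E - pz]], dtype=complex)

def massless_spinors(p):
    """Spinors (lam, lam_tilde) with lam_a * lam_tilde_b = p_ab for real null p.

    Requires E + pz > 0; lam_tilde is the complex conjugate of lam
    (the positive-energy choice).
    """
    P = momentum_matrix(p)
    a = P[0, 0].real
    lam = np.array([np.sqrt(a), P[1, 0] / np.sqrt(a)])
    return lam, lam.conj()

p = np.array([5.0, 3.0, 0.0, 4.0])      # null: 5^2 = 3^2 + 0^2 + 4^2
P = momentum_matrix(p)
lam, lam_t = massless_spinors(p)

# Little group rescaling by a phase w = exp(i*theta).
w = np.exp(0.7j)
lam_w, lam_t_w = lam / w, w * lam_t

det_P = abs(np.linalg.det(P))                              # vanishes for rank 1
factorizes = np.allclose(np.outer(lam, lam_t), P)          # p = lam lam_tilde
invariant = np.allclose(np.outer(lam_w, lam_t_w), P)       # p unchanged by w
print(det_P, factorizes, invariant)
```

The phase $w$ drops out of the outer product, which is the numerical counterpart of the statement that $(\lambda,\tilde\lambda)$ is only defined up to the little group.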
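The relation to the little group is clearly suggested by the fact that it is impossible to uniquely associate a pair $\lambda_\alpha,\tilde\lambda_{\dot\alpha}$ with some $p_{\alpha\dot\alpha}$, since we can always rescale $\lambda_\alpha\to w^{-1}\lambda_\alpha$, $\tilde\lambda_{\dot\alpha}\to w\tilde\lambda_{\dot\alpha}$ keeping $p_{\alpha\dot\alpha}$ invariant. The connection can be made completely explicit by attempting to give some specific prescription for picking $\lambda_\alpha$, which leads us through an exercise completely parallel to our discussion of the little group. We first choose some reference massless momentum $k_{\alpha\dot\alpha}$ and also choose some fixed $\lambda_\alpha(k)$, $\tilde\lambda_{\dot\alpha}(k)$. For every other null momentum, we choose a Lorentz transformation $L(p;k)_\alpha{}^\beta$, $\tilde L(p;k)_{\dot\alpha}{}^{\dot\beta}$ such that $p_{\alpha\dot\alpha} = L(p;k)_\alpha{}^\beta\,\tilde L(p;k)_{\dot\alpha}{}^{\dot\beta}\,k_{\beta\dot\beta}$, and we then define $\lambda_\alpha(p) = L(p;k)_\alpha{}^\beta\lambda_\beta(k)$ and $\tilde\lambda_{\dot\alpha}(p) = \tilde L(p;k)_{\dot\alpha}{}^{\dot\beta}\tilde\lambda_{\dot\beta}(k)$.
$^{1}$ For our conventions of signature and spinor indices, see appendix A.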
For general complex momenta $w$ is simply a complex number and we have the action of GL(1); for real Lorentzian momenta we must have $w^{-1} = \pm(w)^*$, so $w = e^{i\theta}$ is a phase representing the U(1) little group. Most obviously, if we perform a Lorentz transformation $W$ for which $Wk = k$, we simply find $\lambda\to w^{-1}\lambda$. To be explicit, let
$$k_{\alpha\dot\alpha} = \begin{pmatrix} 2E & 0 \\ 0 & 0 \end{pmatrix}, \qquad \lambda_\alpha = \sqrt{2E}\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \tilde\lambda_{\dot\alpha} = \sqrt{2E}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
represent a massless momentum in the z direction. Then a rotation around the z axis (which leaves $k$ invariant) is
$$\Lambda_\alpha{}^\beta = \begin{pmatrix} e^{i\phi/2} & 0 \\ 0 & e^{-i\phi/2} \end{pmatrix}$$
under which obviously $\lambda_\alpha\to e^{i\phi/2}\lambda_\alpha$, $\tilde\lambda_{\dot\alpha}\to e^{-i\phi/2}\tilde\lambda_{\dot\alpha}$. To summarize, amplitudes for massless particles are Lorentz-invariant functions of $\lambda_\alpha,\tilde\lambda_{\dot\alpha}$ with the correct little-group helicity weights,
$$M(w^{-1}\lambda, w\tilde\lambda) = w^{2h}\,M(\lambda,\tilde\lambda) .$$
We now turn to the case of massive particles. There is no essential difference with the massless case; we simply have that $p_{\alpha\dot\alpha}$ has rank two instead of rank one, and so can be written as the sum of two rank one matrices as
$$p_{\alpha\dot\alpha} = \lambda_\alpha{}^I\,\tilde\lambda_{\dot\alpha I} \qquad (2.13)$$
where $I = 1, 2$. Note that
$$p^2 = m^2 \;\to\; \det\lambda\times\det\tilde\lambda = m^2 \qquad (2.14)$$
We can use this to set $\det\lambda = M$, $\det\tilde\lambda = \tilde M$ with $M\tilde M = m^2$. It is sometimes useful to keep the distinction between $M,\tilde M$, but for our purposes in this paper we will simply take $M = \tilde M = m$. Of course $\lambda^I,\tilde\lambda_I$ can't uniquely be associated with a given $p$; we can perform an SL(2) transformation $\lambda^I\to W^I{}_J\lambda^J$, $\tilde\lambda_I\to(W^{-1})^J{}_I\tilde\lambda_J$. Note that we could extend this SL(2) to a GL(2) if we also allowed (opposite) rephasings of the mass parameters $M,\tilde M$, but the choice $M = \tilde M = m$ does not allow this. This is not a disadvantage for our purposes, since the object $M/\tilde M$ transforms only under the GL(1) part of the GL(2) and can be used to uplift any SL(2) invariant into a GL(2) invariant if desired.
For real Lorentzian momenta, $W$ must lie in the SU(2) subgroup of SL(2), which gives us the action of the little group. We can make the connection explicit just as we did for the massless case, by defining $\lambda^I_\alpha,\tilde\lambda_{\dot\alpha I}$ for a reference momentum $k_{\alpha\dot\alpha}$ and boosting to define them for all momenta. A summary of this elementary kinematics is given in appendix B.
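The massive decomposition (2.13) and its SU(2) redundancy can be checked numerically in the same spirit. This is only a sketch under stated assumptions: the matrix convention $p_{\alpha\dot\alpha} = E\,\mathbb{1} + \vec p\cdot\vec\sigma$, the identification of $\tilde\lambda$ with the complex conjugate of $\lambda$ for real positive-energy momenta, and the use of a Cholesky factor as one convenient representative are all choices made for the example (the helper names `momentum_matrix`, `massive_spinors` and `su2` are hypothetical).

```python
import numpy as np

def momentum_matrix(p):
    """2x2 matrix p_{alpha alphadot} = E*1 + p.sigma (assumed convention)."""
    E, px, py, pz = p
    return np.array([[E + pz, px - 1j * py],
                     [px + 1j * py, E - pz]], dtype=complex)

def massive_spinors(p):
    """Return lam[alpha, I] with sum_I lam[alpha, I] * conj(lam[beta, I]) = p[alpha, beta].

    The Cholesky factor is one representative; any lam @ W with W in SU(2)
    describes the same momentum.
    """
    return np.linalg.cholesky(momentum_matrix(p))

def su2(theta, phi, chi):
    """A generic SU(2) matrix, parametrised by three angles."""
    a = np.cos(theta) * np.exp(1j * phi)
    b = np.sin(theta) * np.exp(1j * chi)
    return np.array([[a, -np.conj(b)], [b, np.conj(a)]])

p = np.array([5.0, 1.0, 2.0, 2.0])   # m^2 = 25 - (1 + 4 + 4) = 16
m = 4.0
P = momentum_matrix(p)
lam = massive_spinors(p)
W = su2(0.3, 0.5, -1.1)
lam_W = lam @ W                       # little group rotation on the index I

print(np.isclose(np.linalg.det(P).real, m**2))       # det p = m^2
print(np.allclose(lam @ lam.conj().T, P))            # p = lam^I lam_tilde_I
print(np.isclose(np.linalg.det(lam), m))             # det(lam) = m
print(np.allclose(lam_W @ lam_W.conj().T, P))        # SU(2) leaves p invariant
```

Since $W W^\dagger = 1$ and $\det W = 1$, both the momentum and the normalisation $\det\lambda = m$ survive the rotation, which is why only SU(2)-covariant combinations of $\lambda^I$ can appear in amplitudes.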
We conclude that amplitudes for massive particles are Lorentz-invariant functions of $\lambda^I,\tilde\lambda_I$ which are symmetric rank $2S$ tensors $\{I_1,\cdots,I_{2S}\}$ for spin $S$ particles. Note that we can obviously use $\epsilon_{IJ},\epsilon^{IJ}$ to raise and lower indices, so that we can e.g. write $p_{\alpha\dot\alpha} = \lambda^I_\alpha\tilde\lambda^J_{\dot\alpha}\epsilon_{IJ}$. Also note that clearly
$$p_{\alpha\dot\alpha}\tilde\lambda^{\dot\alpha I} = m\,\lambda^I_\alpha, \qquad p_{\alpha\dot\alpha}\lambda^{\alpha I} = -m\,\tilde\lambda^I_{\dot\alpha} \qquad (2.15)$$
If we combine $(\lambda^I_\alpha,\tilde\lambda^{\dot\alpha I})$ into a Dirac spinor $\Psi^I_A$, this is of course the Dirac equation. But there is no particular reason for doing this in our formalism: even the usual (good) reason for introducing Dirac spinors, making parity manifest in theories which have a parity symmetry, can be more easily accomplished without using Dirac spinors in our approach. We will thus not encounter any $\Gamma$ matrices in our discussion. Note also that using $(p_{\alpha\dot\alpha}/m)$ allows us to freely convert between $\lambda^I_\alpha$ and $\tilde\lambda^I_{\dot\alpha}$ variables. We will sometimes find it useful, especially in the context of the systematic classification of amplitude structures, to use this freedom in order to use e.g. only $\lambda^I_\alpha$ to describe a given massive particle. Then we can write the symmetric tensor as
$$M^{\{I_1\cdots I_{2S}\}} = \lambda^{I_1}_{\alpha_1}\cdots\lambda^{I_{2S}}_{\alpha_{2S}}\,M^{\{\alpha_1\cdots\alpha_{2S}\}}$$
where $M^{\{\alpha_1\cdots\alpha_{2S}\}}$ is totally symmetric in the $\alpha$ indices.$^{2}$ Let us illustrate our notation for writing amplitudes by returning to the example of a four-particle amplitude with (1, 2) being massive with spin (1/2, 2), and (3, 4) massless with helicity (+3/2) and (−1). Let's give examples of "legal" expressions for these amplitudes, that is, objects with the correct little group transformation properties. Two possible terms are
$$[\tilde\lambda_2^{J_1}3]\,[\tilde\lambda_2^{J_2}3]\,[\tilde\lambda_2^{J_3}3]\,\Big(\kappa\,\langle\lambda_1^{I_1}\lambda_2^{J_4}\rangle\,\langle 4|p_1p_2|4\rangle + \kappa\,\langle 4\lambda_1^{I_1}\rangle\langle 4\lambda_2^{J_4}\rangle\Big)$$
with the indices $\{J_1,\ldots,J_4\}$ symmetrized. It would clearly be notationally cumbersome to have our formulas littered with explicit SU(2) little group indices; fortunately it is also entirely unnecessary to do so. We will simply denote the massive spinor-helicity variables in BOLD, and suppress the SU(2) little group indices. Since these indices are completely symmetrized, putting them back in is completely trivial and unambiguous.
In this way, we re-write the above expressions as
$$[\mathbf{2}3]^3\,\Big(\kappa\,\langle\mathbf{1}\mathbf{2}\rangle\,\langle 4|p_1p_2|4\rangle + \kappa\,\langle 4\mathbf{1}\rangle\langle 4\mathbf{2}\rangle\Big) \qquad (2.18)$$
We stress again that there is no notion of the usual "helicity weight" little group for the massive particles; we can freely have expressions (as in the above) that from the viewpoint of massless amplitudes look like they are "illegally" combining terms with different helicity weight. As we will later see, this reflects a beautiful feature of this formalism, making it trivial to see how massive amplitudes decompose into the massless helicity amplitudes at very high energies.
We pause to note the relation between our discussion here and a route to massive spinor-helicity variables taken by a number of other authors [12]. This approach begins by noting that we can always represent $p_{\alpha\dot\alpha} = \lambda_\alpha\tilde\lambda_{\dot\alpha} - \frac{m^2}{\langle\lambda\eta\rangle[\tilde\lambda\tilde\eta]}\,\eta_\alpha\tilde\eta_{\dot\alpha}$, for some reference spinors $\eta,\tilde\eta$.$^{3}$ The states are then labelled by giving the spin in the direction picked out by the lightlike direction $\eta\tilde\eta$. Of course this corresponds to a particular choice for our $(\lambda^I_\alpha,\tilde\lambda^I_{\dot\alpha})$, but making this choice at the very outset obscures the Lorentz and little group transformation properties of the amplitude. Practically speaking, given some formula written in terms of the $\lambda,\tilde\lambda,\eta,\tilde\eta$, this makes it difficult to ascertain whether or not it is kinematically a legal expression for an amplitude, and thus the program of systematically classifying and constructing on-shell amplitudes is difficult to pursue in this formalism.
Let us further illustrate our notation by presenting some classic scattering amplitudes in these variables. We will simply state the results here and derive them from first principles later in the paper; here we are only illustrating the notation and its utility for understanding the physics. Consider for instance the result for tree-level Compton scattering $(\mathbf{1}\,2^-\,3^+\,\mathbf{4})$, where particles 2, 3 are photons of helicity (−, +) while 1, 4 are charged massive particles of spin 0, 1/2, 1. The amplitudes are given in eq. (2.19). Note the absence of γ matrices for the spin 1/2 case: the common complaint amongst students first doing these computations ("why are we dragging around four-component objects when the electron has only two spin degrees of freedom?") is entirely absent here. Similarly for the spin 1 case there are no polarization vectors. Indeed these expressions are the most compact representation for these amplitudes possible, directly in terms of the physical degrees of freedom of the actual particles, with no reference to fields as auxiliary objects.

The high-energy limit
It is very easy to relate the massive and massless spinor-helicity variables, and especially to take the high-energy limit of scattering amplitudes and see how massive amplitudes for particles with spin decompose into the different helicity components. To do so, we note that it is convenient to expand $\lambda^I_\alpha$ in a basis of two-dimensional vectors $\zeta^{\pm I}$ in the little-group space, with massless spinors $\lambda_\alpha$, $\eta_\alpha$ as the expansion coefficients (and similarly $\tilde\lambda_{\dot\alpha}$, $\tilde\eta_{\dot\alpha}$ for $\tilde\lambda_{\dot\alpha I}$). As explicitly given in the kinematics appendix C, in a given frame we naturally have $\zeta^{\pm I}$ as the eigenstates of spin 1/2 in the direction of the spatial momentum $\vec p$, and both $\eta,\tilde\eta$ are proportional to $m$ and vanish relative to $\lambda,\tilde\lambda$ at high energies. Said in a more Lorentz-invariant way, to take the high-energy limit we take
$$\eta_\alpha = m\,\hat\eta_\alpha, \qquad \tilde\eta_{\dot\alpha} = m\,\hat{\tilde\eta}_{\dot\alpha}; \qquad \text{with}\quad \langle\lambda\hat\eta\rangle = [\tilde\lambda\hat{\tilde\eta}] = 1 \qquad (2.22)$$
Note that any scattering amplitude naturally decomposes into different spin states in the spatial direction of motion, and the different helicity components in the high-energy limit are read off from this decomposition. As a simple exercise for taking the high-energy limit, let's consider the coupling of a massive vector to two massless scalars. We see that only the plus and minus helicity amplitudes survive, and as $\eta_3$ scales as $m$, the longitudinal mode is sub-leading in $m$.$^{4}$ Especially in the context of the rather degenerate kinematics of three-particle amplitudes, simply setting $\eta,\tilde\eta\to0$ can give rise to 0/0 ambiguities, and the proper definition of the high-energy limit we have specified should be used. But for more generic situations, and for any expression that is manifestly smooth as $m\to0$, we can simply set $\eta,\tilde\eta\to0$ to take the high-energy limit. There is an especially easy way of doing this with the "BOLD" notation we have introduced above, which shortcuts the need for any explicit expansion in terms of $\zeta^{\pm I}$ as we have indicated above. We simply unbold the characters!$^{5}$
Let us illustrate how this works for the case of Compton scattering of a charged spin one particle in eq. (2.19), and see how the massive amplitude decomposes into its helicity constituents. Expanding out the square of the numerator, we find a sum of terms belonging to different helicity sectors. Note that as helicity amplitudes "adding" the components in this way would be illegal, but this is exactly how we can pick out the different pieces of the massive amplitude, which unifies the different helicity amplitudes into a single object, in the high-energy limit! Note also that, quite nicely, the (0, 0) helicity components reproduce the high-energy limit of the scalar Compton amplitude, reflecting the fact that the longitudinal component of the charged massive spin 1 particle is just a charged scalar at high energies.

Massless Three-and Four-Particle Amplitudes
Having dispensed with kinematics, we now move on to determining dynamics. We will follow a familiar strategy, starting by determining the structure of all possible three-particle amplitudes. When there are many species $N_{s,m}$ of particles with identical mass and spin/helicity, we will label them with an index "a". We will always think of these as real particles, and assume that the "free propagation" does not change the $a$ index, i.e. that free propagation has an SO($N_{s,m}$) symmetry. This choice is hardwiring the most basic physics of unitarity. Note that it is trivial to have (non-unitary) Lagrangian theories that violate this rule; for instance we can have Grassmann scalar fields $\psi^a$ with free action $J_{ab}\,\partial_\mu\psi^a\partial^\mu\psi^b$ with antisymmetric $J_{ab}$. Here the free propagation is proportional to $J^{-1}_{ab}$, which vanishes for $a = b$, and the free theory has an Sp($N$) rather than SO($N$) symmetry.
$^{4}$ These results can also be obtained by converting the conventional polarization vector representation of the three-particle amplitude to the massive spinor-helicity basis. First, being a Lorentz vector and a symmetric tensor in SU(2), the on-shell form of the polarization vector is fixed (see also [13]). Contracting with the momenta then converts the polarization vector to pure chiral indices, $\epsilon_{\alpha\beta} = \epsilon_{\alpha\dot\alpha}\,p^{\dot\alpha}{}_{\beta}/m$. Taking the high-energy limit, one straightforwardly obtains the three helicity sectors in the chiral representation.
$^{5}$ This is analogous to the replacement of k → k in the massive spinor helicity formalism of [17].
Moving beyond three particles, the central constraint on higher-point tree amplitudes is unitarity, in the form of consistent factorization. When a massless or massive internal particle of spin $s$ and mass $M$ goes on-shell, $P^2\to M^2$, the amplitude must factorize as $M \to M_L\,M_R/(P^2 - M^2)$, with the helicities (or little group indices) of the exchanged particle summed over. We will impose this consistency condition at 4 points, where the amplitude must factorize onto a product of three-particle amplitudes.
As is by now well-known, these conditions are incredibly restrictive for massless particles. The kinematics of three-particle momentum conservation forces either $\lambda_1,\lambda_2,\lambda_3$ to be all proportional, or $\tilde\lambda_1,\tilde\lambda_2,\tilde\lambda_3$ to all be proportional. Thus the three-particle amplitudes must either be of the form $[12]^a[23]^b[31]^c$ or $\langle12\rangle^a\langle23\rangle^b\langle31\rangle^c$ in these two cases respectively, and the powers are fixed by the helicities of the three particles. The amplitudes are given by
$$M^{h_1h_2h_3} = \begin{cases} g\,\langle12\rangle^{h_3-h_1-h_2}\,\langle23\rangle^{h_1-h_2-h_3}\,\langle31\rangle^{h_2-h_3-h_1}, & h_1+h_2+h_3<0 \\ \tilde g\,[12]^{h_1+h_2-h_3}\,[23]^{h_2+h_3-h_1}\,[31]^{h_3+h_1-h_2}, & h_1+h_2+h_3>0 \end{cases}$$
Note that by symmetry alone we could use either of the two expressions regardless of the sign of $h_1+h_2+h_3$, but we also demand that the amplitudes have a smooth limit in Minkowski signature, where the brackets also go to zero. We see that, up to the overall couplings $g,\tilde g$, the three-particle amplitudes are entirely fixed by Poincare symmetry. We now move on to determining four-particle amplitudes from consistent factorization. The obvious strategy for doing this is to simply compute the residue in e.g. the s-channel by gluing together the three-particle amplitudes on the two sides of the channel, and then multiply this residue by 1/s. Summing over the channels should then give us an object that factors correctly in all the channels. This trivially works for $\phi^3$ theory, where the coupling is simply a constant $g$ and the residue in each channel is simply $g^2$. Then an object with the correct poles in all channels is $g^2(1/s + 1/t + 1/u)$. Of course in addition to this we may have contact terms with no poles at all, whose form is not fixed by the three-particle amplitudes. But we will only be concerning ourselves with the parts of the four-particle amplitudes that are forced to exist by consistent factorization given the three-particle amplitudes.
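The statement that the powers are fixed by the helicities amounts to a small linear system: requiring $[12]^a[23]^b[31]^c$ to carry little group weight $2h_i$ in each leg gives $a+c=2h_1$, $a+b=2h_2$, $b+c=2h_3$, and the angle-bracket form has the opposite exponents. The following sketch solves this system and applies the sign rule on $h_1+h_2+h_3$; the function name `three_point_exponents` and its return format are hypothetical, and the degenerate case $h_1+h_2+h_3=0$ with non-zero helicities is deliberately not handled.

```python
from fractions import Fraction

def three_point_exponents(h1, h2, h3):
    """Exponents (a, b, c) of <12>^a <23>^b <31>^c or [12]^a [23]^b [31]^c.

    Square brackets are used for h1 + h2 + h3 > 0 and angle brackets for
    h1 + h2 + h3 < 0, as required by a smooth Minkowski limit.
    """
    h1, h2, h3 = (Fraction(h) for h in (h1, h2, h3))
    total = h1 + h2 + h3
    # Solution of a + c = 2*h1, a + b = 2*h2, b + c = 2*h3.
    a, b, c = h1 + h2 - h3, h2 + h3 - h1, h3 + h1 - h2
    if total > 0:
        return "square", (a, b, c)
    if total < 0:
        return "angle", (-a, -b, -c)
    if (h1, h2, h3) == (0, 0, 0):
        return "constant", (Fraction(0),) * 3   # e.g. the phi^3 coupling
    raise ValueError("h1 + h2 + h3 = 0 with spin is not covered by this sketch")

# Spin-1 self-interaction (1^- 2^- 3^+): <12>^3 / (<23><31>)
print(three_point_exponents(-1, -1, 1))
# Yukawa coupling of two helicity -1/2 fermions to a scalar: <12>
print(three_point_exponents("-1/2", "-1/2", 0))
```

Negative exponents signal brackets in the denominator, which is the origin of the apparent poles in three-particle amplitudes of particles with spin $\geq 1$.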
Let's repeat this exercise for the slightly more interesting case of Yukawa theory, where the three-particle amplitude for fermions 1, 2 of helicity −1/2 to a scalar 3 is simply $y\langle12\rangle$. Let us compute the s-channel residue, where here and in what follows we will suppress the trivial coupling constant dependence. The residue in the u channel is the same with 2 and 3 swapped. Dividing each residue by its propagator and summing the two channels then gives the consistently factorizing amplitude, eq. (3.5).

Self-interactions
Let's now try a different example: consider a theory of a single self-interacting particle of spin $s$. The three-particle amplitude for $(1^{-s}\,2^{-s}\,3^{+s})$ is $\left(\frac{\langle12\rangle^3}{\langle13\rangle\langle23\rangle}\right)^{s}$. Note a remarkable feature of this expression, which we did not encounter in either the $\phi^3$ or Yukawa theory cases: already the three-particle amplitude appears to have poles! Thus in a sense, these amplitudes are not as "local" as we might have expected. Now of course this peculiarity goes unnoticed in the usual Minkowski space, since the three-particle amplitude vanishes in the Lorentzian limit. It is not a coincidence that this subtle sort of "non-locality" appears for precisely the same theories that, in a conventional Lagrangian description, must introduce gauge redundancies for consistency. But returning to our problem of determining four-particle amplitudes by imposing consistent factorization, this feature introduces an important obstruction. The strategy of computing the residue in the s-channel, multiplying by 1/s, then summing over channels, is no longer guaranteed to work; as we will see, because of the poles in the three-particle amplitudes, the residue in the s channel will itself have poles in the other channels, making it non-trivial to find an object that consistently factorizes in all channels. Indeed, while we can define massless three-particle amplitudes for any helicities, it will be impossible to find consistent four-point amplitudes for all but the familiar interacting theories of massless spin 0, 1/2, 1, 3/2 and 2 particles. This exercise has been carried out systematically in [18]; here we highlight some aspects of this story before moving on to carrying out the similar analysis with massive particles.
Let us return to the theory of self-interacting massless particles of spin s; we will consider the four-particle amplitude $(1^{-s}\,2^{+s}\,3^{-s}\,4^{+s})$. The residue in the s-channel, reached when For s ≥ 1, we encounter the challenge alluded to above: the residue in one channel itself has a pole in another channel. Let us start with s = 1. Given the structure of the residues, any consistent amplitude must have the form Note that as s → 0, we have t = −u, e.g. the residue in s is A/t + C/u = (A − C)/t. In this way, we find that matching the residues in s, t, u demands that (A − C) = 1, (B − A) = −1, (B − C) = 1, which is impossible, since the combination (A − C) + (B − A) − (B − C) vanishes identically while the required values sum to −1. We conclude that it is impossible to have a single self-interacting massless spin 1 particle! But suppose we have many of these particles labelled by the index a; thus the self-interaction of $a_1, a_2, a_3$ is further proportional to a coupling constant $f^{a_1 a_2 a_3}$. Note that for s = 1 the three particle amplitude $(1^{-1}\,2^{-1}\,3^{+1}) = \frac{\langle 12\rangle^{3}}{\langle 13\rangle\langle 23\rangle}$ is anti-symmetric under the exchange 1 ↔ 2, implying that $f^{a_1 a_2 a_3}$ must have the same property. Extending to all helicity configurations one can conclude that $f^{a_1 a_2 a_3}$ must be totally anti-symmetric. Next consider the four particle amplitude with labels $a_1, a_2, a_3, a_4$: the residue in the s channel has an additional factor of $f^{a_1 a_2 e} f^{e a_3 a_4}$, and similarly in the t, u channels. Now the ansatz for the four-particle amplitude has the form and matching the residues in s, t, u tells us that and now, we can solve for $A^{a_1 a_2 a_3 a_4}$, $B^{a_1 a_2 a_3 a_4}$, $C^{a_1 a_2 a_3 a_4}$ if and only if the $f^{a_1 a_2 a_3}$ satisfy the Jacobi identity $f^{a_1 a_2 e} f^{e a_3 a_4} + f^{a_2 a_3 e} f^{e a_1 a_4} + f^{a_1 a_3 e} f^{e a_4 a_2} = 0$ (3.11) Let's now move on to a single particle with s = 2.
Naively, since the residue in the s-channel is proportional to $1/u^2$, we might think that it is impossible for the four-particle amplitude to have the crucial property of having only simple poles! However, this $1/u^2$ is the residue just as s → 0, where t = −u, and so it could also be represented as $-\frac{1}{tu}$. Thus there is a unique possibility for the four-particle amplitude for a single massless spin two particle: which evidently has all the correct residues in all three channels! We can further investigate the possibility of several massless spin two particles, with a coupling constant $g^{a_1 a_2 a_3}$; the same analysis as for spin one then gives us quadratic constraints on the $g^{a_1 a_2 a_3}$ that are solved only by g's that, up to a change of basis, are non-vanishing only for $a_1 = a_2 = a_3$, i.e. which describe mutually non-interacting particles.
We have thus seen that consistently interacting massless spin one particles must have a Yang-Mills structure, and that a consistent theory of massless spin 2 particles does not nontrivially allow more than one such particle and gives us the standard gravity amplitude. Of course we have done more than simply show the amplitudes are consistent, we have computed them! For spin s > 2, the residue in the s-channel behaves at least as $1/u^3$, and so there is no way to have a consistent four particle amplitude with only simple poles in s, t, u. We thus conclude that there are no consistent theories of self-interacting massless particles of spin higher than two.
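The two spin-one statements above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the single-species check only encodes the three quoted conditions on A, B, C, and the Jacobi check uses the su(2) structure constants (the Levi-Civita symbol) as one assumed example of a totally anti-symmetric $f^{abc}$.

```python
from itertools import product


def single_species_consistent(res_s, res_t, res_u):
    """Solvability of A - C = res_s, B - A = res_t, B - C = res_u.

    Adding the first two equations gives B - C = res_s + res_t, so the
    system has a solution only if that equals res_u.
    """
    return res_s + res_t == res_u


def levi_civita(a, b, c):
    """Totally antisymmetric symbol on indices 0, 1, 2 (assumed example of f^{abc})."""
    return (a - b) * (b - c) * (c - a) // 2


def jacobi_sum(f, n, a1, a2, a3, a4):
    """Left-hand side of eq. (3.11), summed over the internal index e."""
    return sum(
        f(a1, a2, e) * f(e, a3, a4)
        + f(a2, a3, e) * f(e, a1, a4)
        + f(a1, a3, e) * f(e, a4, a2)
        for e in range(n)
    )


def satisfies_jacobi(f, n):
    """Check eq. (3.11) for every choice of external labels."""
    return all(
        jacobi_sum(f, n, *labels) == 0 for labels in product(range(n), repeat=4)
    )


if __name__ == "__main__":
    # Single photon-like particle: the quoted residue conditions clash.
    print(single_species_consistent(1, -1, 1))
    # su(2) structure constants pass the Jacobi test.
    print(satisfies_jacobi(levi_civita, 3))
```

The first check fails for the values (1, −1, 1) quoted in the text, while the second succeeds, in line with the conclusion that several spin-one particles are needed and that their couplings must obey the Jacobi identity.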

Interactions with other particles
Let's move on to determine what sorts of self-consistent interactions other particles can have with massless spin 1, 2 particles. Let's start with the coupling of a spin s particle to a spin one particle, for which the three particle amplitude is $\langle 12\rangle^{2s+1}\langle 23\rangle^{1-2s}\langle 13\rangle^{-1}$. Let us now consider the residues for the $(1^{-s}\,2^{+}\,3^{-}\,4^{+s})$ amplitude; we get residues in the s and u channels from gluing these three-particle amplitudes together. These residues are trivially computed to be We see there is a qualitative difference between s ≤ 1 and s ≥ 3/2. For s = 0, 1/2, 1, while the residues in one channel have poles in the other, we can write down a consistently factorizing four-particle amplitude: $\frac{(\langle 13\rangle[24])^{2s}\,[2|(p_1-p_4)|3\rangle^{2-2s}}{su}$ (3.14) But for s ≥ 3/2, the residues have (increasing powers of) the spurious pole in $[2|(p_4-p_1)|3\rangle$, and so no consistent four particle amplitude is possible. Thus we recover the correct Compton scattering expressions for particles of spin 0, 1/2, 1 scattering off photons, while also seeing that it is impossible to have a consistent theory of massless charged particles with spin ≥ 3/2.
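The power counting behind eq.(3.14) can be made explicit with a few lines of arithmetic. This is only a sketch under the assumption that (3.14) reads $(\langle 13\rangle[24])^{2s}[2|(p_1-p_4)|3\rangle^{2-2s}/(su)$, with each spinor bracket carrying mass dimension one; the function names are hypothetical.

```python
from fractions import Fraction


def spurious_power(s):
    """Exponent of [2|(p1 - p4)|3> in the assumed form of eq. (3.14)."""
    return 2 - 2 * Fraction(s)


def mass_dimension(s):
    """Mass dimension of the assumed form of eq. (3.14).

    <13>[24] and [2|p|3> each have dimension 2; the 1/(su) factor has -4.
    """
    s = Fraction(s)
    return 2 * (2 * s) + 2 * spurious_power(s) - 4


def compton_is_local(s):
    """True when the bracket appears with a non-negative power (no spurious pole)."""
    return spurious_power(s) >= 0


if __name__ == "__main__":
    for spin in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
        print(spin, spurious_power(spin), mass_dimension(spin), compton_is_local(spin))
```

The dimension comes out to zero for every s, as expected for a dimensionless gauge coupling, and the exponent turns negative exactly at s = 3/2.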
When there are several species of spin s particles i coupling with several spin one particles a, we attach an extra coupling $T^a_{ij}$ to the vertex. Consider $(1^{-}_{i}\,2^{+}_{a}\,3^{-}_{b}\,4^{+}_{j})$ scattering; writing the residues R in any channel as R = ( 13 [24] where $(r_s, r_u)$ satisfy s = 0 and u = 0 kinematics respectively. Note that if the commutator $[T^a, T^b]$ vanishes, we can get a consistent amplitude as with our Compton scattering example, with poles only in these s and u channels, but this is not possible if $[T^a, T^b] \neq 0$. This means that the 1/u in $r_s$ and the 1/s in $r_u$ must secretly be 1/t instead, i.e. the amplitude must also include a pole in the t channel. Fortunately we can have a residue in the t channel, using the cubic self-interaction for gluons. Quite nicely the same kinematical factor appears in $R_t$, and we find (writing this residue in an s, u symmetric way): and using the fact that when t = 0, s = −u, we find that the following amplitude indeed consistently factorizes in all channels: This agrees with the result in [16]. Also, clearly once again no consistent amplitudes are possible for spin s ≥ 3/2. Thus we have discovered the familiar structure of Yang-Mills theories for particles of spin 0, 1/2, 1.
The same sort of analysis extends to gravity; since the details are virtually identical we will leave them as enjoyable exercises for the reader. We can consider the coupling of two particles of spin s to a graviton, with strength g. The residues in the s, u channels are no longer equal, and the only way to make a consistent four particle amplitude is to also have a pole in the t channel, using the graviton self-interaction $\kappa = 1/M_{Pl}$. Thus once again the poles of the amplitude are forced to come in the combination 1/stu. This implies that the coupling constant appearing in the spin-s exchange channel must be identified with that of the graviton exchange. That is, consistency between the three factorization channels forces the universality of couplings to gravity, g = κ, with the following form for Compton scattering: Now we see that for s ≥ 2 one again develops a spurious pole, and one reaches the conclusion that for spin greater than 2, the particle cannot consistently couple to gravity. In other words, even if higher spin particles are non self-interacting and free, the moment one turns on gravity the theory ceases to be consistent in flat space. Thus we find that the only possible consistent theories that can couple to gravity can only have spins (0, 1/2, 1, 3/2). 6 We can also discover the need for supersymmetry when massless particles of spin 3/2 are present. Consider for simplicity the case with a single spin 3/2 particle ψ. Now let's imagine we also have a massless scalar φ. Both of these particles have a universal coupling to gravity, so there is inevitably an amplitude for ψ 1 ψ 1 φ 2 φ 2 scattering mediated by gravity. We can again compute the residue in the s-channel, and find that it has a pole in the t channel. But since there is no (ψ, φ, graviton) coupling (amplitudes must be Grassmann even), we can't have any t-channel poles, and so this theory is inconsistent.
The only way to have a consistent amplitude is if we also introduce a massless fermion χ; now we can have a (ψ, φ, χ) interaction with the same gravitational strength $1/M_{Pl}$, which provides the needed pole in the t-channel. The full amplitude is then given as: Thus we see that we must have a Bose-Fermi degenerate spectrum, with the couplings of the "gravitino" ψ to particles and their superpartners of universal gravitational strength.
We have given a lightning tour of some of the arguments leading to the determination of all consistent theories of massless particles via the "four-particle scattering" test. It is remarkable to see the architecture of fundamental physics emerge from these concrete algebraic consistency conditions in such a simple way. A more complete and systematic treatment can be found in [18].
Before moving on to considering massive amplitudes, let us briefly comment on the (in)consistency of theories with three-particle amplitudes for helicities satisfying $h_1 + h_2 + h_3 = 0$. Apart from the case of all scalars $h_1 = h_2 = h_3 = 0$, we have "phase" singularities in the couplings; for instance we have a coupling of the form $\langle 13\rangle/\langle 12\rangle$ or $[12]/[13]$ for a spin zero particle 1 to particles 2, 3 of helicity ±1/2. This peculiar interaction is unfamiliar, and does not arise from Lagrangian couplings. But, as expected, it is also impossible to find a correctly factorizing four-particle amplitude with these couplings [18], so consistency forces the couplings to vanish.

General Three Particle Amplitudes
In this section we will categorize the most general three-point amplitude with arbitrary masses. As discussed in section 2, the amplitude will be labeled by the spin-S representation of the SU(2) little group for massive legs and helicities for the massless legs. For amplitudes involving massive legs, it will be convenient to expand in terms of $\lambda^I_\alpha$, since any dependence on $\tilde\lambda^I_{\dot\alpha}$ can be converted using eq.(2.15). For example for a general one massive two massless amplitude, with leg 3 being a massive spin-S state, we have: where $(h_1, h_2)$ are the helicities. We will be interested in the most general form of the stripped M, which is now a tensor in the SL(2, C) Lorentz indices. The problem thus reduces to finding two linearly independent 2-component spinors that span this space, which we will denote as $(v_\alpha, u_\alpha)$. The convenient choice of $(v_\alpha, u_\alpha)$ will depend on the number of massive legs in a given setup and we will analyze each case separately. We note that a similar classification of three-point interactions using a different basis can be found in [19,20].

Two-massless one-massive
Let's first begin with the two massless and one massive interaction: Since both legs 1, 2 are massless, their spinors can serve as a natural basis: The helicity weights $(h_1, h_2)$ then completely fix the degree-2S polynomial in $\lambda_1, \lambda_2$ up to an overall coupling constant: where with appropriate factors of m such that it has the correct mass-dimension. Note that we can trade $[12]$ for $\langle 12\rangle$ using $[12] = \frac{m^2}{\langle 21\rangle}$. When the massive leg is a fermion, i.e. $S \in \tfrac{1}{2}\mathbb{Z}$ is half-integer, we must then require precisely one of the massless legs to be a fermion as well.
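The relation between $[12]$ and $\langle 12\rangle$ quoted above follows in one line from momentum conservation. The short derivation below is a sketch that assumes the common sign convention $2\,p_i\cdot p_j = \langle ij\rangle[ji]$ for massless momenta; with a different convention the overall sign may change.

```latex
\begin{align}
  m^2 = p_3^2 = (p_1 + p_2)^2 = 2\,p_1\cdot p_2 = \langle 12\rangle [21]
  \quad\Longrightarrow\quad
  [12] = -\frac{m^2}{\langle 12\rangle} = \frac{m^2}{\langle 21\rangle}.
\end{align}
```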
The fact that the structure of this three-point amplitude is unique implies no-go theorems for certain interactions. For example, for identical helicities the factor $[12]^{S+2h_1}$ will attain an extra factor of $(-1)^{1+2h_1}$ under 1, 2 exchange for odd spins. This will result in the wrong spin-statistics, thus a particle of odd spin S cannot decay to identical particles with the same helicity. Now suppose the particles have opposite helicity, namely $h_1 = -h_2 = h$. If we take into account that the exponents of $\lambda_1$ and $\lambda_2$ must both be non-negative, we conclude that the amplitude vanishes if |h| > S/2. For massive spin one states, this is Yang's theorem: a massive spin one particle cannot decay to a pair of photons. We also learn that a massive spin three particle cannot decay to a pair of gravitons. Note that we have invoked spin-statistics without giving its on-shell origin. As we will see in the coming subsection 4.3, when considering the three-point amplitude of identical massive spin-S states to gravity, spin-statistics is immediately forced upon us.
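The selection rules of this paragraph can be collected in a small helper. This is an illustrative sketch, not a formula taken from the text: it assumes that the powers of $\lambda_1$ and $\lambda_2$ are $S + h_2 - h_1$ and $S + h_1 - h_2$ (which reproduces the stated condition |h| ≤ S/2 for opposite helicities) and that the bracket power is $S + h_1 + h_2$; all function names are hypothetical.

```python
from fractions import Fraction


def spinor_exponents(S, h1, h2):
    """Assumed powers of lambda_1 and lambda_2 in the one-massive amplitude."""
    S, h1, h2 = Fraction(S), Fraction(h1), Fraction(h2)
    return S + h2 - h1, S + h1 - h2


def parity_sign(n):
    """Return (-1)^n for an integer-valued Fraction n."""
    n = Fraction(n)
    if n.denominator != 1:
        raise ValueError("exponent must be an integer")
    return 1 if n.numerator % 2 == 0 else -1


def decay_allowed(S, h1, h2, identical=False):
    """Apply the two rules from the text to a massive spin-S decay into massless legs."""
    e1, e2 = spinor_exponents(S, h1, h2)
    # Rule 1: both spinor powers must be non-negative.
    if e1 < 0 or e2 < 0:
        return False
    # Rule 2: identical particles of equal helicity need the exchange sign of
    # [12]^(S + 2h) to agree with the statistics sign (-1)^(2h).
    if identical and Fraction(h1) == Fraction(h2):
        h = Fraction(h1)
        return parity_sign(Fraction(S) + 2 * h) == parity_sign(2 * h)
    return True


if __name__ == "__main__":
    print(decay_allowed(1, 1, -1))                  # opposite-helicity photons
    print(decay_allowed(1, 1, 1, identical=True))   # same-helicity photons
    print(decay_allowed(3, 2, -2))                  # spin three to gravitons
    print(decay_allowed(0, 1, 1, identical=True))   # scalar to two photons
```

For a massive spin one state both photon configurations are rejected, the first by the exponent rule and the second by statistics, which together give Yang's theorem as stated above.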

One-massless two-massive
For two massive legs, the three-point amplitude is now labeled by $(h, S_1, S_2)$. The analysis depends on whether or not the masses are identical. For equal mass, the kinematics becomes degenerate and one expects some form of superficial non-locality. The reason is that the equal mass kinematics occurs precisely for minimal coupling, whose massless limit contains inverse powers of spinor brackets as discussed in the previous section. As we will see, for this case we need to introduce a new variable x that encodes this non-locality.

Unequal mass
For unequal mass, one of the basis spinors can be λ of the massless leg, while the remaining one can be chosen to be $\tilde\lambda$ contracted with one of the massive momenta. For example one can choose: Unlike the one massive case, here the amplitude is not unique. The helicity constraint only fixes the polynomial degree in u and v to differ by 2h. For $S_1 \neq S_2$ there are then a total of $C = S_1 + S_2 - |S_1 - S_2| + 1$ different tensor structures, and the general three-point amplitude is given by: where i labels the different structures and $g_i$ is the coupling constant for each tensor structure. Note that the number of possible tensor structures is determined by the lowest spin. For example for $S_1 = 1$, $S_2 = 2$, we have three tensor structures. For a minus helicity photon these are given by: where the parenthesis indicates the grouping of the symmetrized SU(2) little group index. One can also compare this with a Feynman diagram vertex $F_{3,\mu\nu}\,\epsilon_2^{\nu\rho}\,\partial_\rho\,\epsilon_1^{\mu}$, where $\epsilon_1, \epsilon_2$ are the polarizations for the massive particles. Again, substituting the on-shell form of the massless polarization vectors where $|\tilde\mu]$, $|\mu\rangle$ are reference spinors, and massive ones in eq.(2.29), one finds: Indeed the three-point amplitude for the vertex can be expanded on the basis in eq.(4.7), as it should.
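The counting $C = S_1 + S_2 - |S_1 - S_2| + 1$ is the same as $2\min(S_1, S_2) + 1$, which makes the dependence on the lowest spin explicit. A minimal sketch (the function name is hypothetical):

```python
from fractions import Fraction


def num_tensor_structures(S1, S2):
    """Number of independent structures for one massless and two unequal-mass legs."""
    S1, S2 = Fraction(S1), Fraction(S2)
    count = S1 + S2 - abs(S1 - S2) + 1
    # The count depends only on the lower of the two spins.
    assert count == 2 * min(S1, S2) + 1
    return int(count)


if __name__ == "__main__":
    print(num_tensor_structures(1, 2))
    print(num_tensor_structures(Fraction(1, 2), Fraction(3, 2)))
```

For $S_1 = 1$, $S_2 = 2$ this returns three, matching the example in the text.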

Equal mass: the x-factor
If the masses are identical, then u and v are no longer independent, since: Thus $(u_\alpha, v_\alpha)$ are parallel to each other and pick out just one direction in the SL(2,C) space. There is however a crucial piece of additional data in the constant of proportionality between u and v, which we will call "x": Note that x carries +1 little group weight of the massless leg. Furthermore, x cannot be expressed in a manifestly local way. Indeed contracting both sides of the above equation with a reference spinor ζ yields: so while x is independent of ζ, any concrete expression for it has an apparent, spurious pole in ζ. In the next section, as we glue the three-point amplitudes to get the four-point amplitude, it will be convenient to choose ζ to be the spinor of the external legs on the other side. The denominator then yields a pole in other channels! This yields a non-trivial constraint for the four-point amplitude to have consistent factorisation in all channels. Now the only objects we have carrying SL(2,C) indices are $\lambda_3$, as well as the anti-symmetric tensor $\epsilon_{\alpha\beta}$. 7 We can then express the three-point amplitude as: where the superscript on λ, ε, pλ/m indicates its power. For later purposes we present it in two equivalent representations.

Minimal Coupling for Photons, Gluons, Gravitons
We have seen that while there is a unique structure for massless three-particle amplitudes once the helicities are specified, for couplings of e.g. two equal mass particles of spin S to a massless particle there are (2S+1) independent structures, each term with n factors of ε with n = 0, · · · , 2S. Let us take the massless particle to be a graviton. Note that ε is antisymmetric with respect to the exchange 1 ↔ 2. Furthermore while the definition of x in eq.(4.10) implies that it picks up a minus sign under 1 ↔ 2, this is irrelevant for gravitational couplings, which are proportional to $x^2$. Thus we see that the amplitude for one graviton and two identical spin-S states will have a factor of $(-)^{2S+1}$ under the exchange of the spin-S states. This is nothing but the spin-statistics theorem! Now one of the (2S+1) structures is special, and corresponds to what we usually think of as "minimal coupling" to photons, gluons and gravitons. The defining characteristic of "minimal coupling" is physically very clear. For massless particles, the mass dimension of the couplings is given by $1-|h_1+h_2+h_3|$, and so the leading low-energy interactions with photons, gluons and gravitons (those with dimensionless gauge couplings e, g or gravitational coupling $1/M_{Pl}$) involve massless particles of opposite helicity. The definition of "minimal coupling" for massive particles is then simply the interaction whose leading high-energy limit is dominated by precisely this helicity configuration. As we will see the remaining (2S + 1) − 1 = 2S interactions represent the various multipole-moment couplings (such as the magnetic dipole moment in the coupling to photons). In our undotted SL(2,C) basis, the amplitude with a positive helicity state can be viewed as an expansion in λ. The leading piece in this expansion, namely that where the SL(2,C) indices are completely carried by the Levi-Civita tensors, precisely corresponds to minimal coupling! It is instructive to see why this is the case.
Using the simplest example, a photon coupled to two fermions, we find: Taking the high energy limit, we see that the leading term indeed corresponds to the two possible pairs of opposite helicity fermions, (4.14) In general the minimal coupling between a photon and two spin-S states is simply: where we've also included the negative helicity photon in its simplest dotted representation.
The proper amplitude (with little group indices) is then given as: For gravitons, we simply introduce an extra power of $\frac{m}{M_{pl}}x$. The fact that in this formalism minimal coupling is as simple as $\lambda\phi^3$ heralds its potential for simplification. It is also instructive to see how such a simple representation emerges from the usual vertices in Feynman rules. Here we present examples for scalar, spinor and vector at three points: Scalars : where we've used the identity $x m \lambda_3 = p_1|3]$. Similarly for spin-1/2 and 1, we have: Fermions : Vectors : The fact that minimal coupling is literally the "minimal" interaction in the undotted SL(2,C) representation indicates that the λ expansion should directly correspond to the presence of couplings through higher-dimensional operators. These precisely are the magnetic and electric moments. Let us begin with the magnetic dipole moment. Since this corresponds to a coupling of the particle with $F_{\mu\nu}$, it can only occur for particles with spin. Thus we can extract the magnetic dipole moment by separating the minimal coupling into a piece that is universal, and pieces that only exist for spinning particles.
Recall that the field strength in momentum space becomes $F_{\mu\nu} \to \lambda_\alpha\lambda_\beta\,\epsilon_{\dot\alpha\dot\beta} + \tilde\lambda_{\dot\alpha}\tilde\lambda_{\dot\beta}\,\epsilon_{\alpha\beta}$. This implies that couplings through the field strength will be transparent in the undotted frame for a negative helicity photon, and in the dotted frame for a positive helicity photon. With this in mind we convert the minimal coupling for spin-1/2 and a negative helicity photon into the dotted frame: Here the piece m x ε αβ is the same as that for scalars, sans the ε αβ factor which is necessary to carry the SL(2,C) indices, and thus a universal term. The extra piece $\lambda^{\alpha}_{3}\lambda^{\beta}_{3}$ then represents the magnetic moment coupling, with the amplitude given by $\frac{\langle 13\rangle\langle 32\rangle}{m}$. (4.21) Thus we immediately see that g = 2 for the magnetic dipole moment. 8 Thus for a minus helicity photon, the general spin-1/2 amplitude has the simple expansion: where we've manifestly separated the minimal coupling and the (g − 2) part of the magnetic dipole moment. It is straightforward to see that in the undotted frame, is simply $\lambda_3\lambda_3$. For the plus helicity, one has: One can trivially extend this to higher spin. For example for spin-1, the minimal coupling now contains both the magnetic dipole moment and the electric quadrupole moment. The minimal coupling yields: We again see that the first term is the universal piece, the terms quadratic in λ are the dipole moment, whereas the terms quartic in λ are the electric quadrupole moment. Thus the general three point amplitude for the charged vector and a photon is: where (g − 2) and (g + 1) are the anomalous magnetic dipole and electric quadrupole moments respectively.
8 As a comparison, for the positive helicity and insisting on the undotted frame, we can make the separation after contracting the $\lambda^I$s. More precisely:
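The dimension counting used above to single out minimal coupling, namely that a massless three-particle coupling has mass dimension $1 - |h_1 + h_2 + h_3|$, is easy to tabulate. The sketch below simply evaluates that formula for a few helicity assignments; the function name and the chosen examples are illustrative assumptions.

```python
from fractions import Fraction


def coupling_dimension(h1, h2, h3):
    """Mass dimension of a massless three-particle coupling: 1 - |h1 + h2 + h3|."""
    total = Fraction(h1) + Fraction(h2) + Fraction(h3)
    return 1 - abs(total)


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Opposite-helicity fermions and a photon: dimensionless gauge coupling.
    print(coupling_dimension(-half, half, 1))
    # Opposite-helicity fermions and a graviton: dimension -1, i.e. 1/M_Pl.
    print(coupling_dimension(-half, half, 2))
    # Same-helicity fermions and a photon: dimension -1, a dipole-type coupling.
    print(coupling_dimension(half, half, 1))
```

Opposite-helicity matter gives the lowest-dimension couplings to photons and gravitons, which is the configuration that defines minimal coupling in the text.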

Three massive
For all massive legs, we no longer have massless spinors to span the SL(2, C) space. This implies that the space has to be spanned by tensors instead. The fundamental building blocks are now The general form of the three-point amplitude is: where i = 0, 1 represents the number of εs and $\sigma_i$ labels all distinct ways the SU(2) indices can be distributed on the Os and should be summed over. It will be interesting to see whether the higher spin interactions from string theory, see [21] for recent results, span the space of all allowed interactions.

Four Particle Amplitudes For Massive Particles
Now that we have determined the structure of all possible three-particle interactions, we would like to proceed to investigating the consistency of four-particle amplitudes. Just as we did for all massless particles, we ask: given a spectrum of particles, and a set of three-particle interactions, is it possible to find a four-particle amplitude that consistently factorizes in all possible channels? We stress that this is a completely sharply defined and straightforward algebraic problem. To be maximally pedantic, suppose we have a set of particles with masses (zero or non-zero) given by $m_i$. Then the most general ansatz for the four-particle amplitude has the form $\frac{N}{\prod_i (s - m_i^2)(t - m_i^2)(u - m_i^2)}$ and we simply wish to determine whether there is a consistent numerator N that allows this function to factorize correctly in the s, t, u channels. 9 As we've shown before, it is convenient to expand the amplitude on the $\lambda^I_\alpha$ basis, in which case the contraction of little group indices now translates to the contraction of undotted SL(2,C) indices: To make contact with the usual Feynman rules, the numerator of the vector propagator is $G_{\mu\nu} \equiv \eta_{\mu\nu} - \frac{p_\mu p_\nu}{m^2}$, which in the SL(2,C) undotted representation is: as expected. This is not surprising: as we've discussed in the introduction, the transverse traceless-ness, which determines the numerator of the propagator, simply translates to symmetrization of the SL(2,C) indices. In practice, we don't need to work with this slavishly systematic ansatz for the amplitude with the giant denominator consisting of all possible simple poles. Instead, following the same steps as in the all massless case, given the spectrum and the three-particle amplitudes, we will first simply compute the residues $R^{(i)}_s, R^{(i)}_t, R^{(i)}_u$ in the s, t, u channels from the exchange of the i'th particle.
If these residues are local, we are trivially done, since the object $\sum_i \left( \frac{R^{(i)}_s}{s - m_i^2} + \frac{R^{(i)}_t}{t - m_i^2} + \frac{R^{(i)}_u}{u - m_i^2} \right)$ manifestly matches the poles in all the channels. This is the case for the massive $g\phi^3$ theory where these residues are all simply $R_s = R_t = R_u = g^2$. But as we already saw in the massless case, there are more interesting cases where the residues in one channel themselves have poles in another channel. With massive particles this will occur whenever we have minimal coupling and the "x" factor. In this case an ansatz separately summing the channels cannot work, and we must use building blocks that have simple poles in more than one channel. For massless particles, the requirement of four-particle consistency was so strong as to simply make certain theories (of high-enough spin charged or gravitating massless particles) impossible. It also enforced universality of the couplings to gravitons and the usual Yang-Mills structure for coupling to photons and gluons. We will see the analogue of these statements for massive amplitudes. Once again, consistent factorization will demand the standard couplings to photons, gluons and gravitons, and we will also see that any self-interactions have to be invariant under the (global part) of the gauge symmetry. But with these restrictions met, it is possible to find consistently factorizing four-particle amplitudes for any masses and spins. This is of course expected, since almost all interesting objects in the real world are massive particles of high spin! But as we will also see, the impossibility of consistent amplitudes for massless particles of high spin shows up in a singularity of the massive high spin amplitudes in the high-energy (or m → 0) limit, giving a very concrete sense in which particles of high spin cannot be "elementary".
9 Of course the amplitude cannot be uniquely determined in this way, since we can always simply have contact terms that are simply polynomials with no poles at all (corresponding to a piece in N that cancels all the poles). To avoid clutter, we will suppress the possible contact terms in what follows.
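For the massive $g\phi^3$ example, the statement that summing the three channels reproduces every residue can be checked numerically. The following is a rough sketch under stated assumptions: all external and exchanged particles have the same mass m (so that $s + t + u = 4m^2$), the amplitude is taken to be $g^2$ times the sum of the three propagators with overall sign conventions ignored, and the function names are hypothetical.

```python
def phi3_amplitude(s, t, m, g):
    """Sum over channels for equal-mass g*phi^3 theory (contact terms omitted)."""
    u = 4 * m**2 - s - t  # Mandelstam constraint for four legs of mass m
    return g**2 * (1 / (s - m**2) + 1 / (t - m**2) + 1 / (u - m**2))


def s_channel_residue(t, m, g, eps=1e-7):
    """Estimate the residue at s = m^2 by multiplying out the pole numerically."""
    s = m**2 + eps
    return (s - m**2) * phi3_amplitude(s, t, m, g)


if __name__ == "__main__":
    g, m, t = 0.7, 1.0, -1.3
    print(s_channel_residue(t, m, g), g**2)
```

Because the three-point amplitude is a constant, the t and u channel terms stay finite at $s = m^2$ and drop out of the residue, leaving $g^2$.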

Manifest local gluing
We first begin with the construction of amplitudes without any x-factor non-localities. Let's begin with the Yukawa amplitude, i.e. the amplitude for one massless scalar and two massive fermions. The three-point amplitude is simply where $m_f$ is the mass of the fermion. The gluing in the s- and u-channels yields: where by c.c. we are exchanging λ ↔ $\tilde\lambda$ and g ↔ g . As one can see, since the three-point amplitude was local, the resulting four-point amplitude can be written in a manifestly local way with two separate channels.
A "slightly" more complicated example would be the process γ − + t → gra + + t, via a massive spin-3/2 exchange: Here, $t_{1,4}$ are the massive top quarks with their mass denoted by $m_t$. The three-point amplitudes on both sides are: There are two tensor structures for $V_L$, reflecting the two distinct ways the SL(2,C) indices can be distributed. The resulting four-point amplitude is then, where $m_T$ is the mass of the spin-3/2 particle. In the above examples, the residues are manifestly local, as this is inherited from the three-point amplitude. The only place potential non-locality can occur is when factors of x appear in the three-point amplitude, for example in the minimal coupling. Thus in the next section we will focus on minimal coupling for massless spin-1 and 2 particles.

Minimal Coupling
In this subsection we will consider the gluing of minimally coupled higher spin particles. We will first begin with charged particles, which entails the three-point coupling of two massive spin-S states and a positive or negative helicity photon. The three point amplitude is given in eq.(4.15); after dressing with external spinors, the complete amplitude is:

Compton Scattering For S ≤ 1
Let us begin with the scalar. Here one simply has: Here the subscripts on x serve to distinguish between different three point vertices. Now since we see that the residue is given by: Again the s-channel residue is non-local and must be interpreted as a pole from the other channel! We now have a choice: it can either be interpreted as a massless particle in the t-channel, or a u-channel massive particle, since $-t = u - m^2$ when $s = m^2$. For there to be a t-channel massless pole, the vectors must be gluons instead of photons, and we leave this possibility to the later part of this subsection. For the case where one has a u-channel massive pole, the amplitude is simply: As the amplitude is symmetric under 1 ↔ 4 exchange, it is guaranteed to be consistent with the u-channel factorisation. It is straightforward to see that at high energies one obtains the usual two adjoint-scalar two gluon, and two charged scalar two photon amplitudes.
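The relation $-t = u - m^2$ at $s = m^2$ is a consequence of $s + t + u = 2m^2$ for two massive and two massless legs. The sketch below checks the Mandelstam sum on explicit centre-of-mass momenta; it assumes an all-incoming convention with massive legs 1, 4 and massless legs 2, 3, and $s = (p_1+p_2)^2$, $t = (p_2+p_3)^2$, $u = (p_1+p_3)^2$. The helper names are hypothetical.

```python
import math


def minkowski_sq(p):
    """Mostly-minus Minkowski square of a four-vector (E, px, py, pz)."""
    E, px, py, pz = p
    return E * E - px * px - py * py - pz * pz


def add(p, q):
    """Component-wise sum of two four-vectors."""
    return tuple(a + b for a, b in zip(p, q))


def compton_momenta(m, k, theta):
    """All-incoming momenta: massive 1, massless 2, massless 3, massive 4."""
    E = math.sqrt(m * m + k * k)
    kx, kz = k * math.sin(theta), k * math.cos(theta)
    p1 = (E, 0.0, 0.0, k)
    p2 = (k, 0.0, 0.0, -k)
    p3 = (-k, -kx, 0.0, -kz)  # minus the outgoing massless momentum
    p4 = (-E, kx, 0.0, kz)    # minus the outgoing massive momentum
    return p1, p2, p3, p4


def mandelstams(p1, p2, p3, p4):
    """Return (s, t, u) in the convention stated in the lead-in."""
    s = minkowski_sq(add(p1, p2))
    t = minkowski_sq(add(p2, p3))
    u = minkowski_sq(add(p1, p3))
    return s, t, u


if __name__ == "__main__":
    m = 1.0
    s, t, u = mandelstams(*compton_momenta(m, 1.5, 0.7))
    print(s + t + u, 2 * m * m)
    # (-t) - (u - m^2) equals s - m^2, so it vanishes on the pole s = m^2.
    print((-t) - (u - m * m), s - m * m)
```

Real momenta never reach $s = m^2$, but the identity $(-t) - (u - m^2) = s - m^2$ holds for any kinematics and reduces to the relation used in the text on the pole.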
Let us now consider Compton scattering for general spin. The s-channel gluing yields, Recall that $\frac{x_{12}}{x_{34}}m^2 = -\langle 3|p_1|2]^2/t$; if we rewrite $-t$ as $u - m^2$ and put back the s-channel propagator, this has the property that it is symmetric under 1 ↔ 4 (it is the scalar amplitude after all). This means that if P 2S m 2S matches to the u-channel residue then we are done! Finally using the identity: 10 1|P one derives the following ansatz for the four-point amplitude of minimally coupled general spin-S particles, In appendix D we reproduce this result using Feynman diagrams for fermions. By studying the high-energy limit, one can easily verify that this is correct. At high energies for S = 1 one obtains three terms, two of which are contributions where legs 1 and 4 are opposite helicity gluons, and a final one where they are both scalars, which are the Goldstone bosons that were eaten in the Higgs mechanism! Note that this is telling us that the Higgs mechanism provides a way to "unify" the independent massless amplitudes in the IR. We will discuss this phenomenon in more detail in section 6. Now in the above discussion the result from the s-channel gluing can be matched to the u-channel if we have a single species of spin-S. If there are multiple species, then similar to the massless discussion in section 3, we should assign a matrix $T^a_{ij}$ to each vertex, and due to $[T^a, T^b] \neq 0$, the matching to the u-channel will be off by a piece that is proportional to $f^{abc} T^c_{ij}$. This mismatch is a sign that the t-channel pole from the s-channel factorisation should be assigned to a physical massless pole, i.e. revealing the presence of a non-abelian vector. For this to hold we should show that the s-channel residue admits this interpretation.
10 This identity can be derived as follows: $|P^I\rangle[P_I|$ is the internal momentum that satisfies the s-channel on-shell constraint, The solution is given by:
Indeed taking a scalar for example, $\langle 3|p_1|2]^2/\big((s-m^2)\,t\big)$ can be matched to the t-channel residue since (5.22) The last equality utilizes the fact that when t = 0, $s - m^2 = -(u - m^2)$. Thus the final amplitude is given by:

Compton scattering for S > 1
The ansatz for general minimal coupling in eq.(5.20) appears to contain non-physical poles for S > 1. Of course this cannot be the final story since there's an abundance of charged higher-spin states in nature, and although we know that they are not fundamental, this has no bearing on the existence of an S-matrix for low energy scattering. In deriving eq.(5.20), we started from the s-channel residue and analytically continued $P_I$ to a form that is manifestly 2 ↔ 3 and + ↔ − symmetric, and thus can be directly matched to u-channel residues. This is not entirely necessary, since the full amplitude can contain terms that only contain the s- and not the u-channel pole. Thus the very fact that eq.(5.20) gives us non-physical poles for S > 1 is precisely telling us that such terms must be present. To see this subtlety in detail, let's consider minimal coupling for spin-3/2, for which the gluing from the s-channel yields: Now expanding $(A+B)^3$, only the $A^3$ term will contribute to both s- and u-propagators, while terms with B will contribute solely to s-channel propagators. Putting everything together, one finds the following local form for the amplitude: We now see that in the final local form, all terms contain 1/m factors and become singular in the high-energy limit. In other words, the obstruction to taking m → 0 reflects the absence of a consistent massless high energy amplitude. For example the leading term in 1/m that will contribute to $M(1^{+\frac{3}{2}}, \gamma_2^{+1}, \gamma_3^{-1}, 4^{+\frac{3}{2}})$ at high energies is given by: As we will elaborate below, this is the concrete sense in which charged particles with spin S ≥ 3/2 cannot be "elementary"; the same conclusion holds for any particles at all of spin S ≥ 5/2 that can consistently couple to gravity.

Graviton Compton Scattering
Let us again begin with scalars, with the massive scalars on legs 1, 4 and a positive and a negative helicity graviton on legs 2, 3 respectively. The s-channel residue is given as: where $M_{pl}$ is the Planck mass. As with the massless discussion we now have a double pole in t, which can be identified as the massive pole $1/(u - m^2)$ and a massless $1/t$ pole. Thus the four-point amplitude is simply (5.30) It is instructive to verify that the massless pole is correct. Let us take the residue at t = 0, in the kinematics where $\langle ij\rangle = 0$. The residue of eq.(5.30) is (5.31) Since $\langle ij\rangle = 0$, the massless three-point amplitude should be anti-MHV ($\overline{\rm MHV}$), and one has where P is the massless internal momentum. Finally using the identity where in the last line, we've applied Schouten on the denominator, keeping in mind that $\langle 23\rangle = 0$. Thus we see that eq.(5.30) yields correct factorization in all channels. For massive higher-spin particles, we again use the mixed representation. The s-channel residue yields: For S > 2, we see that the formula ceases to be local. Similar to our photon coupling analysis, this indicates that the residue of the s-channel must be separated into pieces that will combine with other channels and pieces that don't.

Massive higher spins cannot be elementary
We have seen that Compton scattering amplitudes for particles of high enough spin do not have a healthy high-energy limit, growing as powers of $(p/m)$. Of course, so long as the gauge/gravitational couplings are small, these amplitudes do not become $O(1)$ until energies parametrically above the particle mass $m$, so in that sense no inconsistency is encountered in the effective theory of a single massive higher-spin particle up to a cutoff parametrically above its mass. Nonetheless, the sickness of the $m \to 0$ limit does show that a single massive higher-spin particle cannot be "elementary", and that any consistent theory for such particles must also include new particle states with a mass comparable to $m$. As an example, suppose we have some strongly-interacting QCD-like gauge theory; can such a theory have a spectrum consisting of bound states of high spin, with a parametrically large gap up to higher excited states? Our analysis suggests that this is impossible. We can imagine weakly gauging a global symmetry of the theory, or coupling the system to gravity. The total cross-section for e.g. $\gamma\gamma \to X$ should be bounded by $\sigma < C \times e^4/s$ for some constant $C$ characterizing the current four-point amplitude. But if we have a charged higher-spin particle, just the cross-section for its production would grow as $e^4/s \times (s/m^2)^n$, and if there is a parametrically large gap up to other particle states, this will exceed the bound when $(s/m^2)^n > C$. Of course this is a somewhat qualitative argument, but we believe it captures the essence of why higher-spin massive particles must be composite. A sharpening of the argument may be able to give a more quantitative bound for the scale beneath which new particles must appear. We can ask if the presence of new states in the propagator can tame this high-energy behaviour by cancelling the $1/m^6$ singularity in eq.(5.28).
In other words, consider the case where one has a new spin-$S'$ state with a mass similar to that of the $S = \frac{3}{2}$ state; then one can include the contribution:

If $S' \neq S$, then in the degenerate mass limit it is easy to see that the three-point amplitude cannot involve the pure x-dependent pieces, and thus the residue must be local. This then tells us that the contribution of such terms in the high-energy limit must take the form $\frac{n_s}{s\,m^\alpha} + \frac{n_u}{u\,m^\alpha}$ for some $\alpha$, where $n_s$, $n_u$ are local functions of the kinematic invariants. This has a different high-energy behaviour from eq.(5.28), which behaves as $1/su$, and thus cannot cancel it.$^{11}$ For $S' = S$, if the masses are not identical then the residue is again local and we have the same issue. If the masses are the same, then one simply obtains the exact same form as eq.(5.28) with identical signs, and the H.E. behaviour is again untamed. Thus even with a finite number of states of comparable mass, the sick H.E. limit still rules out an isolated charged higher-spin state as a fundamental particle. The above analysis does provide a loophole: one can have an infinite tower of states of ever-increasing spin. While their presence in the propagator only produces terms with single poles in the H.E. limit, an infinite sum of $n_s/s$ terms can produce poles in u if the degree of the polynomial $n_s$ is unbounded, that is, if the exchanged states have unbounded spin. This is precisely what happens in string theories, which contain massive higher-spin states.
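The qualitative cross-section estimate in this section can be turned into a small numerical sketch. Comparing the bound $\sigma < C\,e^4/s$ with the growth $e^4/s \times (s/m^2)^n$, the common factor $e^4/s$ cancels and the bound is saturated at $s = m^2 C^{1/n}$. The function names and the sample values of $C$ and $n$ used below are illustrative assumptions; the text fixes neither.

```python
def breakdown_scale(mass: float, C: float, n: float) -> float:
    """Value of s at which (s/m^2)^n = C, i.e. s = m^2 * C**(1/n).

    Hypothetical helper: C and n are free inputs here, since the
    text only states that some constant C and some power n exist.
    """
    if mass <= 0 or C <= 0 or n <= 0:
        raise ValueError("mass, C and n must all be positive")
    return mass**2 * C ** (1.0 / n)


def production_over_bound(s: float, mass: float, C: float, n: float) -> float:
    """Ratio [e^4/s * (s/m^2)^n] / [C * e^4/s]; the factor e^4/s cancels."""
    return (s / mass**2) ** n / C


if __name__ == "__main__":
    m, C, n = 1.0, 100.0, 2  # illustrative numbers only
    s_star = breakdown_scale(m, C, n)
    print("bound saturated at s =", s_star)
    print("ratio just above:", production_over_bound(1.1 * s_star, m, C, n))
```

For any fixed $C$, the saturation scale is only a fixed multiple of $m^2$, which is the sense in which new states must appear at a mass comparable to $m$.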

All Possible Four Particle Amplitudes
Having discussed the four-particle amplitudes associated with the most familiar and important three-particle interactions, let us finally turn to computing all possible four-particle amplitudes. As we have seen when there are no "x" factors involved, we have local residues and the construction of four-particle amplitudes is trivial. We will therefore concentrate on discussing the cases where consistent factorization is non-trivial, which involve having at least one minimal coupling with an "x" factor, but now allowing for the most general set of other couplings. We will see (once again) that consistency demands that the minimal couplings have the standard Yang-Mills/gravitational forms, and that the other interactions have to be (globally) Yang-Mills invariant. But it is then possible to find consistently factorizing four-point amplitudes for any choice of three-particle interactions satisfying these conditions.

All Massive amplitude
This is the simplest case, since we only need to consider the massless exchange. Consider the exchange of a massless photon; for external scalars we have:

where the two terms correspond to the two different helicities. Using m |P ] and $P = p_1 + p_2$, we find:

where one uses the fact that $\langle P|p_i|P] = 0$ for any external momentum $p_i$. This is not the complete answer, as one expects $\frac{(p_1\cdot p_3) - (p_2\cdot p_3)}{s}$ from minimal coupling. The difference is proportional to $s/s$ and thus has no factorization pole. The correct answer can be inferred from symmetry arguments under $1 \leftrightarrow 2$ exchange. Thus the correct completion is

For the exchange of a general massless spin-S state, we simply get a factor of $((p_1 - p_2)\cdot p_3)^S$ for the numerator. Now we let the external particles carry spin. For simplicity we will consider the case where all four particles are of the same spin. Then the residue for the most general coupling is given by:

where we've included the contribution where the photon helicity is flipped. Finally, using the identity:

introduces $|P]\langle P|$, which can again be used to absorb the x-factors, leaving behind

where we've used eq.(5.40). Thus we see that the massless gluing of any three-point vertex can be converted into a local form. For more general external spins, the analysis is the same, albeit more complicated.

Three-massive one-massless
If we have three massive legs, the dangerous x-factors can occur in two types of diagrams for the s-channel residue:

The first is manifestly local. For the second, let's take the all-massive vertex to be a $\phi\phi'\phi''$ vertex, with the photon coupling only to $\phi$ and $\phi'$, with couplings $e$, $e'$. Then gluing leads to:

where legs 1, 4 are $\phi$, $\phi'$ respectively. We see that only when the charge is conserved, i.e. $e + e' = 0$, does the $\langle 2\xi\rangle$ pole cancel and the amplitude become local. If the scalars were all charged, with charges $e, e', e''$, the same analysis would tell us that $e + e' + e'' = 0$. Next suppose the photon was instead a gluon, with the scalars carrying indices $i, i', i''$ and the three-point amplitude given by $c_{ii'i''}$. We have already seen that consistency demands that the couplings to the gluons, $T^a_{ij}$, $T^a_{i'j'}$, $T^a_{i''j''}$, be generators in some representation of the Yang-Mills group. Then we discover that we must have $T^a_{ij}c_{ji'i''} + T^a_{i'j}c_{iji''} + T^a_{i''j}c_{ii'j} = 0$; in other words, the cubic interaction must be invariant under the (global) Yang-Mills symmetry. Finally, for a graviton, gluing to a $\phi^3$ vertex leads to:

where we've let all three scalars couple to gravity. Again, after rearranging the terms, one finds that the auxiliary spinor drops out only if $g_1 = g_2 = g_3$, and one arrives at:
Thus we see that for coupling to photons, the consistency of the four-point amplitude requires charge to be conserved; for a gluon it requires the particles to be in the adjoint representation; and finally for a graviton, it leads to the equivalence principle. Note that this discussion does not refer to any gauge redundancy and the independence thereof. On the other hand, the astute reader will recognize that the factor $\frac{[2|p_1|\xi\rangle}{\langle 2\xi\rangle}$ can be identified with $\epsilon_2\cdot p_1$ from Feynman rules, where $\lambda_\xi$ is the reference spinor for the polarization vector $\epsilon_2$. Indeed, from the photon and graviton soft theorems [23], it is precisely this factor whose gauge invariance (Ward identity) demands the conservation of charge and the equivalence principle. Here, there is no gauge redundancy: the auxiliary spinor $\lambda_\xi$ is simply a projection of eq.(4.10), and the independence thereof is the requirement that factorization is consistent for all solutions of x defined through eq.(4.10). Again the same applies if we consider external spinning particles. For example, for massive spin-1, diagram (a) yields,

where again the residue is local. For diagram (b) the only non-locality originates from the minimal coupling piece, and hence one recovers the same condition as before.
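The invariance condition $T^a_{ij}c_{ji'i''} + T^a_{i'j}c_{iji''} + T^a_{i''j}c_{ii'j} = 0$ is easy to test numerically for a candidate cubic coupling. The sketch below is only an illustration under stated assumptions: it takes all three scalars in the vector representation of SO(3), with real antisymmetric generators $(T^a)_{ij} = -\epsilon_{aij}$, and compares the invariant tensor $c = \epsilon$ with an arbitrary non-invariant choice. Neither the group nor the tensors are taken from the text, and the helper names are hypothetical.

```python
import numpy as np


def levi_civita() -> np.ndarray:
    """Totally antisymmetric tensor eps[i, j, k] in three dimensions."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps


def invariance_violation(T: np.ndarray, c: np.ndarray) -> float:
    """Largest entry of T^a_{ij} c_{jkl} + T^a_{kj} c_{ijl} + T^a_{lj} c_{ikj}.

    T[a, i, j] are the generators (same representation on every leg,
    an assumption of this sketch); c[i, k, l] is the cubic coupling.
    """
    term1 = np.einsum("aij,jkl->aikl", T, c)
    term2 = np.einsum("akj,ijl->aikl", T, c)
    term3 = np.einsum("alj,ikj->aikl", T, c)
    return float(np.abs(term1 + term2 + term3).max())


if __name__ == "__main__":
    eps = levi_civita()
    T = -eps  # (T^a)_{ij} = -eps_{aij}, real and antisymmetric

    c_bad = np.zeros((3, 3, 3))
    c_bad[0, 0, 0] = 1.0  # singles out one direction, so not invariant

    print("invariant coupling:    ", invariance_violation(T, eps))
    print("non-invariant coupling:", invariance_violation(T, c_bad))
```

In the abelian case the same statement collapses to the sum of the charges of the three legs vanishing, which is the condition $e + e' + e'' = 0$ found above.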

One-massive three-massless
So far we have found that all potential non-localities can be converted into local expressions, and hence the residue of one channel does not encode information about other channels. For three massless particles things are more interesting. The potential s-channel factorization diagrams are:

For our purpose, only minimal coupling is relevant for the two-massive one-massless vertex in (a). We will consider a massive scalar coupled to abelian and non-abelian vectors. First, for the abelian case we only need to consider diagram (a). Taking all vectors to be plus helicity, one finds the s-channel residue given by

where we've added the massless t-channel image, and the extra $1/m^2$ is to guarantee that both massless channels factorise correctly. One can check that the s-channel massive residue, which was given in eq.(5.52), matches once one takes into account that $\langle 34\rangle = \frac{m^2}{[43]}$. Note that the amplitude vanishes as $m \to 0$, as it should. Now let's move on to the case where there are external spins. For example, one can consider a massive spin-1 particle coupled to three massless vectors. If the vector is abelian, Yang's theorem tells us that there is no vertex to consider, and thus there is no factorizable four-point amplitude to consider. We instead begin with a massive vector and three gluons. We will start with colour-stripped all-plus-helicity gluons, whose residue for the massless s-channel is given as:

where the last equality sign is understood to hold on $\langle 34\rangle = 0$ kinematics. We see that unavoidably there is a $1/\langle 24\rangle$ pole in the s-channel massless residue, which is spurious unless it can be interpreted as a t-channel pole $1/(t - m^2)$. Thus the massless residue for the amplitude tells us that there must be a two-massive-vector, one-gluon matrix element to explain the apparent spurious singularity. The contribution of this matrix element for the s-channel is given by:

This suggests that we begin with the following piece, which factorises correctly on the s- and t-channel massive poles:

Note that the above is symmetric in $(2 \leftrightarrow 4)$ and contains $\langle 34\rangle$, $\langle 23\rangle$ poles as well. Taking $\langle 34\rangle \to 0$, only the second term in eq.(5.59) contributes to its residue:

This is nothing but the spurious residue appearing in eq.(5.57)! Putting together the information built from the s- and t-channel massive residues and the s-channel massless residue leads to:

The matching to the massless t-channel is straightforward, given that the above is symmetric in $(2 \leftrightarrow 3)$. Note that unlike the uniqueness of the one-massive two-massless amplitude, a priori the coupling between the two massive and one massless vector does not have to match that of minimal coupling. It is the consistency between the massless and massive factorisation that fixes this choice. A quick recap: beginning with the massless residue, for which the three-point coupling involving the massive spin-1 is unique, the anti-symmetry with respect to the massless legs tells us that the massive state must be in the adjoint representation of the colour group. Then the $1/\langle 24\rangle$ singularity is spurious unless it arises from the massive propagators evaluated on degenerate kinematics. Thus the massless residue in one channel encodes the massive residue in the other.

For the other helicity components, the derivation is simpler, as one can construct the full amplitude from the residue of the massive channel, and we simply list the results:

In the first line, we've listed the amplitude in the dotted frame for simplicity. One can check that the leading contribution in the H.E. limit of this amplitude yields the amplitude generated by the $\mathrm{tr}(F^3)$ extension of Yang-Mills theory. As a final example, let's consider a possible singlet massive spin-2 particle that interacts with gluons via a higher-dimensional operator $RF^2$. For the amplitude with one massive particle and three positive-helicity gluons, we expect the final result to be cyclically invariant in (2, 3, 4). The massless s-channel residue can now be written as,

The above result contains additional poles which, under cyclic rotation of (2, 3, 4), will generate terms that modify the original $\frac{1}{\langle 34\rangle}$ residue. Thus before summing over its cyclic images, we should augment eq.(5.64) with terms that kill the extra poles in $\langle 24\rangle$ and $\langle 23\rangle$.

Putting everything together, we find:

We give further examples of massive amplitudes involving one massive higher-spin particle and massless particles of non-identical spin in appendix F.

The spinning polynomial basis
The fact that our on-shell formalism provides a convenient basis to classify distinct three-point couplings lends itself to another important application: the construction of a basis of polynomials in which to expand the four-point amplitude. A well-known example of such a polynomial is the Gegenbauer polynomial, or its four-dimensional representation, the Legendre polynomial, as a basis for the four-point scalar amplitude. The Gegenbauer polynomials arise from the exchange of a spin-S particle in a four-scalar amplitude. Note that we have one polynomial for a given S because the three-point coupling between two scalars and a spin-S particle is fixed.
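To make the role of the Legendre polynomials as a basis concrete, the following sketch projects a function of $x = \cos\theta$ onto $P_S(x)$ using the standard orthogonality relation, $a_S = \frac{2S+1}{2}\int_{-1}^{1} f(x)P_S(x)\,dx$, evaluated by Gauss-Legendre quadrature. This is only the familiar four-dimensional scalar case; the function name and the sample input are illustrative assumptions, and the sketch does not implement the spinning polynomials constructed below.

```python
import numpy as np
from numpy.polynomial import legendre as L


def partial_wave_coefficients(f, max_spin: int, n_nodes: int = 64) -> np.ndarray:
    """Coefficients a_S in f(x) = sum_S a_S P_S(x) for S = 0..max_spin.

    Uses a_S = (2S+1)/2 * integral_{-1}^{1} f(x) P_S(x) dx with
    Gauss-Legendre quadrature (exact for polynomial integrands of
    degree below 2 * n_nodes).
    """
    x, w = L.leggauss(n_nodes)
    fx = f(x)
    coeffs = []
    for S in range(max_spin + 1):
        basis_vector = [0.0] * S + [1.0]  # selects P_S in legval
        P_S = L.legval(x, basis_vector)
        coeffs.append((2 * S + 1) / 2.0 * np.sum(w * fx * P_S))
    return np.array(coeffs)


if __name__ == "__main__":
    # Sample residue: 3 P_0(x) + 2 P_2(x) = 2 + 3 x^2 (illustrative input)
    residue = lambda x: 2.0 + 3.0 * x**2
    print(partial_wave_coefficients(residue, max_spin=3))
```

Each non-zero coefficient is read as the contribution of an exchanged state of the corresponding spin.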
As we've seen in the previous discussion, the three-point amplitude for one massive and two massless particles is also unique. This implies that we can similarly construct "spinning" Gegenbauer polynomials for massless scattering amplitudes, where each polynomial corresponds to a different spin exchange. To see how this works, let's consider the residue for a spin-S exchange in the s-channel for $M(1^{-h}2^{+h}3^{-h}4^{+h})$. We can write down the unique three-point amplitudes on both sides, the one on the right being proportional to $[34]^S/m^{2S-1}$. Such a coupling only exists for $S \geq 2h$. Now when we glue the two tensor structures together, the indices on $\lambda_1, \lambda_2$ must be fully contracted with those on $\lambda_3, \lambda_4$. This can be done in many ways, each with its own pre-factor counting the number of equivalent contractions. The gluing procedure is thus a sum over all possible contractions with suitable combinatoric factors:

The spinning Gegenbauer polynomial is then given as:

As a few examples (with $x = \cos\theta$):

The universal prefactor $(x - 1)^2$ can be identified with $\langle 13\rangle^2[24]^2$, which takes care of the overall helicity weights of this amplitude. Taking $h = 0$, we indeed recover the Legendre polynomials, $P^0_S(x) = P_S(x)$. For completely general helicities $h_1, h_2, h_3, h_4$ of the external massless particles, we have:

This reduces to the equal-spin polynomial if we take all $|h_i|$ to be equal. Three-point couplings with more than one massive leg are no longer unique. This means that for a given spin exchange, one instead has a symmetric matrix whose rows and columns label the independent three-point vertices on the two sides of the factorization channel. We illustrate this for the amplitude with two massive spin-1 and two massless spin-1 particles. Now the three-point coupling involved in the factorization is an amplitude with a massive spin-1, a massive spin-S and a massless spin-1 particle. The number of such couplings is determined by the lowest-spin massive particle, which in this case has spin 1, and there are three independent couplings.

To give an explicit example, consider $S = 2$. The building blocks of the tensor structures will be $\{\lambda_1, P_2\tilde\lambda_1\} = \{v, u\}$. If the massless particle has − helicity, we have the three tensor structures listed in eq.(4.7). Now imagine gluing the two three-point amplitudes:

The residue will be a polynomial in $(u_L, v_L, u_R, v_R)$ with

By gluing them we contract the internal indices in all possible ways, then sum them up with appropriate combinatoric factors. We can distribute the indices carried by the exchanged particle among a number of u's and v's:

where S is the spin of the exchanged particle. For a contraction with $(u_L)^{k_1}$ and $(u_R)^{k_2}$ on the exchanged leg, suppose $u_L$ and $u_R$ are contracted together $k_3$ times. Then we have

which means a factor of

The first two factors come from choosing which $u_L$'s and $u_R$'s are to be contracted together.

Since we can always redefine the coupling constants of the interactions, the $k_3$-independent factors shall not concern us here. Summing this factor over $k_3$ one gets $(2N)!$, the total number of permutations of $2N$ indices. Assigning a coupling constant $g_i$ to each three-point vertex, the residue of the four-point amplitude can then be expanded as $g_i M_{ij} g_j$, where each element of $M_{ij}$ is a polynomial given by the contraction of the corresponding three-point amplitudes. Since we have two external spin-1 particles, $M_{ij}$ is a $3 \times 3$ symmetric matrix irrespective of the exchanged spin. For the case where one exchanges a spin-2 particle, the matrix elements are given by:

where we've contracted each entry with the external $\lambda^I_1$, $\lambda^I_4$'s. For convenience, we will also give the representation in terms of the scattering angle. We can parameterize the kinematics as

One can explicitly check that $\sum_i p_i = 0$ and $p_i^2 = m_i^2$. In this parametrization, the matrix elements then take the following form, where we've stripped the external-spinor-dependent terms:

(Super)Higgs Mechanism as IR Unification
Our exploration of consistent four-particle amplitudes has given us an almost complete understanding of the broad architecture of particle physics. Theories of massless particles are incredibly constrained, allowing only helicities (0,1/2,1,3/2,2), and limited to the (super)gravity coupled to (super)Yang-Mills theories. Massless higher spins are made impossible by the mere presence of gravity. We have also seen that the amplitudes for massive particles of sufficiently high spin have sick high-energy limits-as expected, since there is no consistent theory of massless high-spin particles they can match to at high-energies-so such particles cannot be "elementary". The final case to consider is then that of massive particles of low spin S ≤ 2. Here of course there is in principle a consistent high-energy theory to match to, but as we will see in this section, doing so puts non-trivial restrictions on the particle content and interactions of the theory. This investigation will lead to the on-shell discovery of the Higgs and Super-Higgs mechanisms.
Note that we will not simply be rephrasing well-known "bottom-up" facts, such as the high-energy growth of scattering amplitudes for longitudinal components of massive spin one particles, and the attendant need for the Higgs particle to tame this growth, in an on-shell language. It is of course perfectly possible to do this, and the on-shell methods do simplify the explicit computations, but the advantage is purely technical and does not add anything conceptually new to this standard textbook discussion.
We will instead take a different, "top-down" point of view, where as described above we insist that massive amplitudes manifestly match to consistent massless amplitudes in the high-energy limit. As we will see, this gives us a satisfying understanding of the Higgs mechanism that is at least psychologically quite opposite to the usual picture of gauge symmetry "breaking". Indeed in textbook language, the gauge symmetry is "broken" or "hidden", and becomes more manifest only at high energies. By contrast, in the on-shell picture, the massive "Higgsed" amplitudes do not "break" or "hide" the (non-existent in this formalism!) gauge redundancies. Instead, they unify the different helicity components of massive amplitudes: the Higgs mechanism can be thought of as an infrared unification of massless amplitudes, and this unification is more disguised at high energies! We will see this beginning already at the level of three-particle amplitudes. Here, the non-locality associated with the poles in massless three-particle amplitudes gets IR-deformed to $1/m$ poles. Such $1/m$ poles non-trivially disappear in the high-energy limit, while the massive amplitudes unify different helicity components together. Matching the high-energy limit enforces all the usual consistency conditions associated with the Higgs mechanism. Moving on to four-particle amplitudes, we will obtain them both by gluing the three-particle amplitudes as usual, and also in a novel way: starting with the massless helicity amplitudes, simply adding them so they fit into massive multiplets, then shifting the poles and "BOLD"ing the spinor-helicity variables to make massive amplitudes! This will highlight the Higgs mechanism as an "IR unification" in an even more vivid way.
Rather than present a completely systematic analysis of all possible "Higgsings", in this section we will content ourselves with illustrating this physics in three standard examples: the Abelian Higgs model, the Super-Higgs mechanism in a simple model with N = 1 SUSY, and the general structure of the non-Abelian Higgs mechanism for a model with enough scalars so that all the spin one particles are massive. As alluded to above we will also discuss why gravity cannot be Higgsed in this way.

Abelian Higgs
Let us start with the simplest example: a theory with a massless photon and a charged scalar; we'll call the scalar's two real degrees of freedom "H" and "E".

The three-particle amplitudes are

We now want to see how to introduce masses as an "infrared deformation". The first step is a trivial kinematical one. We declare that (+, −, E) are to become the 3 components of a massive spin-1 particle, leaving H as an additional scalar. Now, the amplitude for two massive vectors (with $m_\gamma^2$) and one massive scalar (with $m_H^2$) can only be,

The coefficient is fixed by the requirement that this 3-particle amplitude matches the massless amplitude in the high-energy limit. It is illuminating to see how this happens explicitly. Recall that to take the HE limit we put

where we scale each of $\eta, \tilde\eta$ as $\sim m$. We are looking for pieces that survive in the $m_\gamma, m_H \to 0$ limit. The leading piece in the numerator consists of the terms with zero $\eta, \tilde\eta$'s, which is given as:

This term is more interesting. To compute it, note that in the UV we have our usual restrictions on 3-particle kinematics: either $\lambda_1 \propto \lambda_2 \propto \lambda_3$ or $\tilde\lambda_1 \propto \tilde\lambda_2 \propto \tilde\lambda_3$. This 3-particle amplitude vanishes in the first case. On the other hand, in the second case, we have by momentum conservation that

Here we see an interesting counterpart to the purely massless 3pt amplitudes, which are not manifestly local due to the presence of poles. Healthy theories of massless particles (which we should reproduce in the UV) do not have such non-local poles at 4pts and higher. When we perform this "IR deformation", we have removed the non-local poles but are left with seeming factors of $\frac{1}{m_\gamma}$ in the amplitude. But as we have seen, the 3pt amplitude is, by design, chosen to match the correct massless helicity amplitudes and thus be smooth as $m_\gamma \to 0$, and this will be inherited at higher points.

Indeed let us compute the 4-particle amplitude with all massive spin-1 particles, consistent with factorization into the three-point amplitude in eq.(6.2). Since we have no "x" factors to worry about, we can proceed in the most naive possible way, simply gluing the 3-pt amplitudes in the s, t and u channels, and we find:

Since there is no three-point amplitude with three massive spin-1 particles, there are no poles involving $m_\gamma$. Note that all possible contact terms here can be eliminated, since they give growing amplitudes for some of the helicity components in the UV, which we are assuming not to have. Now again, despite appearances, this amplitude is guaranteed (by construction!) to be smooth in the high-energy (or $m_\gamma, m_H \to 0$) limit. Let us first show this directly for some of the helicity components. For instance, the all-longitudinal amplitude is

where, with $p = \lambda\tilde\lambda + \eta\tilde\eta$, we define $\bar p = \lambda\tilde\lambda - \eta\tilde\eta$. Just to take a first look at the HE limit, which naively goes as $\frac{g^2}{m_\gamma^2}$, we drop the $\eta$'s and find at $O(\frac{1}{m_\gamma^2})$ the result in eq.(6.11), and so as expected there is no $\left(\frac{s,t,u}{m_\gamma^2}\right)$ singularity as $m_\gamma \to 0$. In order to find the leading high-energy limit, let us define $q \equiv \eta\tilde\eta$. Note that $p \cdot q = \frac{m_\gamma^2}{2}$, so $q = O(m_\gamma^2)$, and we will work to first order in $q$. Using $2p_1\cdot p_2 = s - 2m_\gamma^2$, and also $\bar p = p - 2q$, we find in the HE limit

So summing over channels gives

Hence the all-longitudinal amplitude is fixed to be 3 . This tells us we must have a quartic coupling in the UV, and by the $U(1)$ invariance it must be of the form $\lambda(\ldots)$ given in eq.(6.14). Let's see how some of the other component amplitudes work. Consider $(1^0 2^- 3^+ 4^0)$, which should match $(1^E 2^- 3^+ 4^E)$ in the high-energy limit. This is

Thus we find the correct amplitude for minimally charged scalars in the UV. All other helicity amplitude components vanish as $m_\gamma \to 0$. We have thus verified that the 4pt massive amplitudes are an "infrared deformation" of the massless ones, reproducing and unifying the different helicities in the HE limit.
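The kinematic relations used in this expansion, $p\cdot q = m_\gamma^2/2$ and $\bar p = p - 2q$, follow from writing the massive momentum as a sum of two null vectors. The sketch below checks them numerically. It assumes a mostly-minus metric, a particular frame, and the identification of $\lambda\tilde\lambda$ and $\eta\tilde\eta$ with explicit null four-vectors $k$ and $q$ satisfying $2k\cdot q = m^2$; these choices, and the function names, are illustrative and not taken from the text.

```python
import numpy as np

# Mostly-minus metric (an assumed convention for this sketch)
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a: np.ndarray, b: np.ndarray) -> float:
    """Minkowski product a.b with signature (+, -, -, -)."""
    return float(a @ METRIC @ b)


def massive_decomposition(energy: float, mass: float):
    """Null vectors k, q with 2 k.q = m^2, so that p = k + q has p^2 = m^2.

    k plays the role of lambda*lambda-tilde and q of eta*eta-tilde;
    the components of q are O(m^2 / E), small in the high-energy limit.
    """
    k = energy * np.array([1.0, 0.0, 0.0, 1.0])
    q = (mass**2 / (4.0 * energy)) * np.array([1.0, 0.0, 0.0, -1.0])
    return k, q


if __name__ == "__main__":
    m = 1.0
    k, q = massive_decomposition(energy=10.0, mass=m)
    p = k + q        # p    = k + q
    p_bar = k - q    # pbar = k - q = p - 2q
    print("p^2      =", dot(p, p))
    print("p.q      =", dot(p, q))
    print("pbar - (p - 2q) =", p_bar - (p - 2.0 * q))
```

Raising the energy at fixed mass shrinks the components of $q$, which is why working to first order in $q$ captures the leading high-energy behaviour.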

Higgsing as UV Unification → IR Deformation
Given that we see the massive amplitudes reproduce the massless ones at high energy, we are motivated to consider directly assembling the high-energy massless amplitudes in a way that one can readily "IR deform" the amplitude by simply putting in the mass for the propagator and "BOLDing" the spinor brackets. We are then guaranteed to have a result that gives the correct high-energy behaviour, and what remains is simply to add in higher-order corrections in the mass that ensure the massive residue is matched. Let's first consider all the different component amplitudes: Compton scattering for H, E, and the quartic interaction for E. We will first merely group these amplitudes together, ready to be "BOLD"ed and unified into a massive amplitude. The massive amplitude in the IR will be the four-massive-vector amplitude, and thus we will need a total of eight spinors to carry the SU(2) little group indices; these are the objects that will be BOLDed. Thus the name of the game is to write the massless amplitudes in a form which contains eight spinors, two for each leg, with everything else expressed only in terms of momenta. Note that because of this the $E^4$ quartic must be written in an interesting way. Naively it is just $3\lambda$, but to put it in a form where by BOLDing we can recognize it as a component of massive spin 1, we have to write it in the following way:

where {t}, {u} represent its t- and u-channel images. This is the only way to represent the "constant" without introducing double poles. Similarly, for the amplitudes with two photons and two E's we write

Collecting all the component amplitudes together, we are ready to IR deform: declaring the particles to have mass $m_\gamma$ by BOLDing the spinors, and deforming $s \to s - M_h^2$ etc., giving the IR-deformed object in eq.(6.20). The above result by construction gives the correct answer in the high-energy limit, with a mismatch at higher order in $m_\gamma^2$, $M_h^2$. Thus we have the identity

But now in this form, the challenge is to check the factorization channels, which will fix the $O(m_\gamma^2, M_h^2)$ terms. For example, in the limit where $m_\gamma^2 = M_h^2 \equiv m^2$, the remaining term is simply

We have thus seen the Higgs mechanism very explicitly as an IR deformation. Note that while it is pleasing to see everything work explicitly, the correct HE limit was guaranteed once we ensured the 3-particle amplitudes reproduced and unified the helicity amplitudes in the high-energy limit. Again: all the non-trivial physics was in the "unified packaging" of all the massless helicity amplitudes into the massive amplitudes, and everything was guaranteed to work after that point.

12 Using this representation for $\tilde\eta_2$, one can also show that the $O(m_\gamma^{-1})$ term in the amplitude vanishes as well, with $\tilde\lambda^I_2 \to \tilde\eta_2$, while all other massive spinors are set to their massless limit.

Now, we wish to see whether the $(\psi, \chi)$ amplitudes can be unified into those of a single massive spin-$\frac{3}{2}$ multiplet. The logic completely parallels the Abelian Higgs mechanism we discussed above. Indeed, again we simply have the following massive amplitude for massive spin-$\frac{3}{2}$, spin-$\frac{3}{2}$ and scalar:

which yields the correct massless amplitude in the HE limit. After this point everything is guaranteed to work just as with the Abelian Higgs mechanism, and we omit the details. (We have described spontaneous SUSY breaking with the chiral superfield $X = \phi + \theta\chi + \theta^2 F_\phi$ and $W = \mu^2 X$.)

Non-Abelian Higgs
Let us now look at the most general case. In the UV we have gluons and scalars in some representation R:

Now, we want to take the ± components of index a, together with some linear combination of the scalars $(u^a_J\phi_J)$, and make them part of a massive vector of mass $m_a$. Here, we are assuming that all the vectors are massive; in particular this means that the number of scalars $N_\phi$ is larger than or equal to the number of massless vectors. Then, what we are doing is considering a big $SO(N_\phi)$ matrix $U_{IJ}$, such that $U_{aJ}\phi_J$ will become the longitudinal component of the massive vector. The remaining scalars are "Higgses" $U_{iJ}\phi_J$. We can always diagonalise so these have mass

In particular, in the high-energy limit we must have, for example:

Being able to unify these into massive amplitudes will allow us some interesting interpretations of the U matrix. First, the only possibility for the first figure in (6.27) is$^{13}$

We can again compute the HE limit of the component amplitudes. The details of this limit are given in appendix E, and we simply summarise the result:

From the above we see that in order for the massless amplitudes to be unified into a single massive amplitude, the matrix $U^a_I$ must satisfy

Let's define $\tau^a_I = m_a U^a_I$; then $\tau^a_I\tau^b_I = m_a^2\delta^{ab}$ (6.34). So, we can re-write eq.(6.33) as

where we have suppressed the contraction of indices I, J. The solution to the constraint for $\tau^a_I$ is simply that $\tau^a_I = T^a_{IJ}V_J$ (6.36) for some constant vector $V_J$ (the "vev"). Indeed this is precisely what we get in the usual Higgs mechanism. The combination $T^a_{IJ}V_J\phi_I$ is "eaten", and diagonalising $(V^T T^a T^b V)$ gives the vector masses. One can check that after substituting for $\tau$, eq.(6.35) becomes

(note we are always writing with real states, so $T^a_{IJ} = -T^a_{JI}$).

13 This can be verified by noting that αα =

Now, if we assume that the "coupling tensor" $f^{abc}$ is the structure constant for the Lie group associated with $T^a$, then we can repeatedly use $T^aT^b = f^{abd}T^d + T^bT^a$, and we find,

Using the fact that $V^T T^a T^b V$ is diagonalised, we find:

Once eq.(6.33) is satisfied, the rest of the story is again the same as in our previous examples. Note in particular that we must have Higgses! Even if we have $N_{scalar} = N_{gluon}$ precisely, the interactions are not the correct ones for the full UV theory, due to the standard polynomial growth of the scattering of the longitudinal pieces, which is not present in the UV theory. But with the "uneaten Higgses" included, is simply chosen to match the high energy limit, and we manifestly match to a healthy UV theory.
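The relations $\tau^a_I = T^a_{IJ}V_J$ and $\tau^a_I\tau^b_I = m_a^2\delta^{ab}$ can be checked in a familiar example. The sketch below assumes an SU(2) gauge group with one complex doublet written as four real scalars, so that the generators are real antisymmetric $4\times 4$ matrices, in line with the real-state convention $T^a_{IJ} = -T^a_{JI}$. The choice of group, the normalisation with a coupling $g$, and all function names are assumptions made for illustration.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def real_generators(coupling: float = 1.0) -> np.ndarray:
    """Real antisymmetric 4x4 generators T[a, I, J] for an SU(2) doublet.

    The anti-Hermitian matrix t = -i g sigma/2 = X + iY acting on
    z = x + iy is rewritten as [[X, -Y], [Y, X]] acting on (x, y).
    """
    gens = []
    for sigma in PAULI:
        t = -0.5j * coupling * sigma
        X, Y = t.real, t.imag
        gens.append(np.block([[X, -Y], [Y, X]]))
    return np.array(gens)


def mass_matrix(T: np.ndarray, V: np.ndarray) -> np.ndarray:
    """M_ab = tau^a_I tau^b_I with tau^a_I = T^a_{IJ} V_J."""
    tau = np.einsum("aij,j->ai", T, V)
    return tau @ tau.T


if __name__ == "__main__":
    g = 0.65                              # illustrative coupling
    V = np.array([0.3, -1.2, 2.0, 0.5])   # illustrative vev direction
    T = real_generators(g)
    print(np.round(mass_matrix(T, V), 6))
    print("expected m^2 =", g**2 * (V @ V) / 4.0)
```

In this example the three vectors come out degenerate with $m^2 = g^2|V|^2/4$, and one of the four real scalars is left over as the uneaten Higgs.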

Obstruction for Spin 2
We now consider massive spin-2 particles, which in the HE limit should yield a graviton, a massless vector and a scalar. We would like to see if the massless interactions can be consistently unified into an IR massive amplitude. The three-point massive spin-2 amplitude can be easily written down as:

where m is the mass of the massive graviton. Let us look at the HE limit. We can directly import what was done for the non-abelian Higgs, and one finds:

Thus the scalar coupling at high energy is three times what it should be. This is unacceptable, since the gravitational coupling is universal and the coupling strength $M_{pl}$ has already been set by the self-interaction. Note that similar difficulties arise for the HE limit that yields the amplitude for one graviton and two minimally coupled vectors, where one obtains $-2\langle 12\rangle^4/(M_{pl}\langle 13\rangle^2)$. Again the factor of 2 is inconsistent with the graviton self-coupling. Thus we see that there is a fundamental obstruction to organising the massless degrees of freedom into a massive spin-2 particle in such a way that the massive interactions have a HE limit that morphs into a consistent UV theory.
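The counting behind these unifications is simple enough to spell out: a massive spin-S particle has $2S+1$ states, which at high energy split into helicities $-S, \ldots, +S$ and hence into massless particles of helicity magnitude $S, S-1, \ldots$. The sketch below performs this bookkeeping only; it is a hypothetical helper and says nothing about whether the interactions can be matched, which is the actual obstruction for spin 2.

```python
from fractions import Fraction


def helicity_components(spin) -> list:
    """Helicities S, S-1, ..., -S of a massive spin-S particle (2S+1 states)."""
    S = Fraction(spin)
    if S < 0 or (2 * S).denominator != 1:
        raise ValueError("spin must be a non-negative integer or half-integer")
    return [S - k for k in range(int(2 * S) + 1)]


def massless_content(spin) -> list:
    """Distinct |h| values: the massless particles appearing at high energy."""
    return sorted({abs(h) for h in helicity_components(spin)}, reverse=True)


if __name__ == "__main__":
    for S in (1, Fraction(3, 2), 2):
        states = helicity_components(S)
        content = [str(h) for h in massless_content(S)]
        print(f"S = {S}: {len(states)} states, massless |h| = {content}")
```

For $S = 1$ this gives helicity $\pm 1$ plus a scalar (the photon and E), for $S = 3/2$ helicities $\pm 3/2$ and $\pm 1/2$ (the pair $\psi$, $\chi$), and for $S = 2$ a graviton, a vector and a scalar.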

Loop Amplitudes
In this section we briefly touch on constructing loop amplitudes by an on-shell gluing of the tree amplitudes we have found in previous sections. We will follow the philosophy of "generalized unitarity" [2,24], where the integrand for loop amplitudes is determined by a knowledge of its (generalized) cuts, putting internal propagators on-shell. As is well known, at one loop this gives a systematic way of determining the integrand from gluing together on-shell tree amplitudes. 14 While we are not adding anything new to this conceptual framework, the technical advantages offered by our formalism for massive particles with spin are significant in many cases. These include dispensing with complicated gamma matrix algebra, the clear separation of electric and magnetic moments for charged particles, the extraction of UV divergent properties without contamination from IR divergences (by virtue of using massive external and internal states), and finally directly obtaining the (internal) mass-dependent pieces in the small mass expansion relevant for obtaining rational terms for massless one-loop amplitudes. In all of these processes, as they do not have tree counterparts, bubbles on external legs do not contribute. It is pleasing to continue seeing directly the way in which Poincare symmetry and Unitarity fully determine the physics, not just at tree level but incorporating the leading quantum loop corrections as well.

7.1 g−2 for spin-1/2 and 1
As seen in previous discussions, the simplicity of minimal coupling allows us to straightforwardly separate the magnetic moment pieces. The same simplicity translates to a straightforward computation of the loop-level magnetic moment.

14 There is an obvious subtlety in this on-shell approach to loop amplitudes, regarding "wavefunction renormalization". In the unitarity approach where one glues tree amplitudes on both sides of the cut, there will be diagrams which correspond to a bubble insertion on the external leg, and hence give rise to a 1/0 from the on-shell propagator. In the Feynman diagram approach, these are wave function diagrams that are to be amputated and replaced by counterterms. This procedure breaks gauge invariance in the intermediate steps. For massless internal states, these can be sidestepped since there will be UV-IR cancellation for these diagrams. For massive internal particles this is no longer the case, and we refer the reader to [25] for unitarity-based treatments of this issue. This subtlety will not affect any of the examples we discuss in this section: for (g-2) and rational terms, the 1-loop corrections are leading, while for the beta function the external massive particles are merely probes.

Let's consider $e^+, e^- \to \gamma$ at one loop. The diagram we want to build is: where we've glued the three-point vertices according to the two possible helicity configurations in the internal photon lines. Notice that here, we are using the three-point amplitude in the SL(2,C) undotted basis. This is motivated by eq.(4.23), which yields a clear separation of (g−2) factors in this basis. One can also understand this from the fact that anomalous moments should arise only if the particle carries spin. By expanding the integrand in eq.(7.1), one notices that the λ-independent terms will be present for charged scalars as well, and thus the piece of the integrand that can contain the magnetic moment is: This gives us the following integrand: This gives $(g-2) = \frac{\alpha}{2\pi}$ by comparing with eq.(4.23). Just to give us a little bit more of a challenge, let's now consider $W^+, W^- \to \gamma$ at one loop involving only the photon coupling. The integrand is again built from: Leaving behind the electric coupling, we now have two structures for the numerator of the integrand: Here $f_1(q)$ is the same as for the electron moment, and leads to: (7.6) For the second tensor structure, one has: where we've defined O αβ i,j ≡ p iα α pα β j .

The beta function
Let's now turn to the extraction of the beta function. For massless amplitudes, this can be obtained by extracting the coefficient of the bubble integrals in the scalar integral basis [4,24]. However, extra care needs to be taken with the subtraction of infrared divergences. Here we will instead consider two massive scalar probes of a photon propagator, and the correction to the propagator due to an internal massive scalar, fermion or vector (denoted by X):

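[Figure: one-loop correction to the photon propagator exchanged between two massive scalar probes, with the internal massive particle X running in the loop.]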
The UV divergence of this amplitude contains the contribution of a scalar to the beta function, without the IR contamination. The loop amplitude will be constructed by gluing the 2→2 amplitude involving the scalar probe particle exchanging a photon with X. This will allow us to obtain the beta function for different spins. From the massive vector, we will also be able to extract the contribution for a massless vector by simply subtracting a scalar. Assuming that the mass of X is identical to that of the scalar probe, the relevant tree amplitudes can be easily constructed by generalizing the examples in subsection 5.4.1: where we've again summed over the two possible photon helicity configurations and $P = p_3 + p_4$. The second equality for each amplitude gives the manifestly local form, which can be checked against the HE limit, where one should find a finite result as m → 0. Note that each term contains a piece which is identical to the scalar contribution.
We can now glue the tree amplitudes into the one-loop integrand. The beta function can be readily read off by picking out the divergent piece which is proportional to the tree amplitude. For further simplification, we can take the s → 0 limit, and we will be looking for the term that is proportional to $2(p_1\cdot p_3)/s$. Let us use the scalar correction as an example. The one-loop amplitude is now The one-loop integrand is then simply: where · · · represents terms that are purely functions of s, or finite. For fermions, there are now two pieces that are relevant: the square of the scalar piece, and the square of the p i P piece. All other contributions cannot generate the $p_1 \cdot p_3$ tensor structure. We find: The relevant part of the one-loop integrand is then: Finally, a similar analysis for vectors yields: which leads to 1 s s + · · · . (7.14) Thus we've found that the beta function coefficient is 1/6 for a scalar, 4/3 for a Dirac fermion, and −7/2 − 1/6 = −11/3 for a massless vector, where we've subtracted the scalar "eaten" by the massive vector.
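The final bookkeeping step, removing the eaten scalar from the massive-vector coefficient to obtain the massless-vector one, can be checked with exact rational arithmetic. This is only a sketch of the arithmetic using the coefficients quoted in the text; the variable names are illustrative.

```python
from fractions import Fraction

b_scalar = Fraction(1, 6)            # scalar coefficient quoted in the text
b_dirac = Fraction(4, 3)             # Dirac fermion coefficient quoted in the text
b_massive_vector = Fraction(-7, 2)   # massive vector, including its eaten scalar

# A massive vector carries one more scalar degree of freedom than a massless
# one, so the massless-vector coefficient is obtained by subtracting a scalar.
b_massless_vector = b_massive_vector - b_scalar

print(b_massless_vector)
```

The subtraction reproduces the familiar −11/3 of Yang-Mills theory.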

Rational terms
Another application of massive amplitudes is to derive rational terms for massless amplitudes, which are not constructible via four-dimensional cuts. These terms appear because the integrals are regulated, and one can encounter ε/ε ∼ O(1) effects. They can be obtained by taking the states in the internal loops to be massive [28], where the mass $m^2$ is identified with the extra (−2ε)-dimensional piece of $\ell^2$, denoted $\mu^2$. 15 For QCD, one considers the contribution of a massive adjoint scalar state that is minimally coupled to the external gluons. These "µ" terms are computed using the tree-level amplitudes in D dimensions [15,26], treating the extra-dimensional momenta as a four-dimensional mass.
Here we will directly use the four-dimensional massive amplitudes to obtain the integral coefficients for I 4 [µ 2k ], the four-point scalar box integral with µ 2k as its numerator. For the box-integral coefficient one considers the quadruple cut, where the two solutions for the cut loop momentum are: The box-coefficient is then obtained by gluing the four tree-amplitudes substituted with the cut loop momenta.
First consider the four-point all-plus amplitude, where the cut is given by: The above rational terms are in agreement with [26].

Form Factors and Correlation Functions
The ability to discuss scattering amplitudes for general mass and spin largely removes the distinction between amplitudes and "off-shell" objects such as correlation functions and form factors. Consider correlation functions for the stress tensor of some theory. The computations are precisely the same as what we would carry out if we were computing the scattering amplitude for a massive spin-two particle (arbitrarily) weakly coupled to the theory. The scattering amplitude for these massive particles gives us the correlation function in momentum space, which corresponds closely to the experiments that are actually done to measure correlation functions. Strictly speaking, we are coupling a continuum of particles of different masses, and we are getting the correlator in momentum space for the external legs $p_a$ in the timelike Lorentzian region where $p_a^2 > 0$. But we can then define the correlators for null and spacelike momenta by analytic continuation. At least in perturbation theory, which is what we will largely concern ourselves with in this subsection, there is no ambiguity about what this means in practice.
It is important to imagine that the massive particle O corresponding to the operator is simply an external probe and does not participate in the dynamics. In other words, we should not have any "internal propagators" associated with cuts that put O on-shell. In practice, this means that we should be able to make the coupling of O to our system proportional to a parameter ε that we can make as small as we wish. To take an example, consider a 3-point coupling of O to a pair of massless particles for the system of interest; making this proportional to ε means that the leading amplitudes will never involve internal O particles: In general, the leading amplitude involving N O's will be proportional to $\epsilon^N$ and will never involve internal O particles.

Observables in Gauge Theories and Gravity
Before moving on to illustrating how this interpretation is useful in concrete calculations, let us pause to interpret some standard and elementary facts about observables in gauge theories and gravity from this on-shell perspective.
In particular, let us understand the reason for the absence of charged local operators in gauge theory, or of any local operators whatsoever in gravity. Consider a charged operator Φ. We know that consistency enforces universal coupling of Φ to photons/gluons, with strength set by the gauge coupling g, and so we can't arbitrarily weakly couple Φ to the system. Thus we can't speak of charged local operators. Similarly with gravity, the coupling of any particle to gravity is universal, given by $\sqrt{G_N}$, so in the presence of gravity we can't meaningfully talk about any local operators at all. In a conventional Lagrangian description of the physics, this is associated with the impossibility of making local charged operators gauge invariant. Of course we can always fix a gauge and compute correlators for operators in that gauge, but then these are not quite local. If we start with correlators of local operators in the limit as $g^2 \to 0$ or $G_N \to 0$, the weak gauging attaches Wilson lines to the operators in some way. Of course this also has an obvious on-shell meaning, again corresponding closely to physical experiments that measure these Wilson-line dressed correlators.
Consider again a charged scalar Φ of charge +1 in an abelian gauge theory, and let's consider the correlator Φ * (x)Φ(y) first in the limit where we turn off the gauge coupling. We may have U (1) invariant self-interactions for Φ of the form e.g. (Φ * Φ) 2 , and we can also turn on the gauge-interactions. But we also couple Φ to some heavy external probe particles X (q) , Y (q+1) and A (Q) , B (Q+1) via the couplings X (q) Y (q+1) * Φ, A (−Q) B (−Q−1) * Φ * . Let's now look at the (XY * B * A) scattering amplitude. Since this breaks the global particle number symmetries acting separately on X, Y, A, B as , → 0, this amplitude is proportional to the product ; some of the diagrams contributing to the amplitude are shown below: As , → 0, stripping off this product from the amplitude yields the correlator where Φ * (x)Φ(y) is dressed with Wilson lines in the p X , p Y , p A , p B directions: 3) The fact that inequivalent "dressings" of the local operator with Wilson lines are possible simply reflects the many different ways we can couple Φ to external probes; since the probes themselves are charged and emit long-range gauge fields, the amplitudes (and hence the extracted correlator) does depend on the choices that are made. Thus, while correlation functions for local charged operators don't exist, dressed version of these correlators exist, for both gauge theory and gravity, to all orders in g and √ G N . There is a deeper difficulty with gravity, which makes even these quasi-local "Wilson-line dressed" correlators ambiguous at a non-perturbatively tiny level, of O(exp(−M 2 P l /s)). As we saw in our example above, in order to be able to identify the piece of the amplitude for the heavy probes that is unambiguously associated with the coupling to the operator Φ, it was important that the coupling to the probe broke some global symmetry of the problem. But we expect that gravity breaks all global symmetries, and in particular, we can't say that e.g. 
the XY * A * B amplitude is arbitrarily small; there is some (perhaps virtual black-hole mediated) rate for this process of O(exp(−M 2 P l /s)) that pollutes any attempt to associate this amplitude with "the" (Wilson-line dressed) correlator of interest, making it impossible to pick out a piece proportional to as , → 0. Summarizing more informally, in both gauge theories and gravity we don't have meaningful correlators of local charged operators, for the (relatively trivial) reason that we can't ignore the long-range gauge and gravitational fields. This can already be seen perturbatively in g 2 , G N , but to all order in these couplings, there are dressed versions of local operators that take care of the long-range fields at infinity, smoothly deforming the local correlators we have when g 2 , G N = 0. But in gravity, due to exponentially small effects of O(exp(−Area/G N )), associated with black-hole physics, even these dressed versions of local operators don't make precise sense. This is a concrete sense in which any notion of spacetime becomes ambiguous in quantum gravity, highlighting that e.g. the breakdown of locality in the context of the black-hole information paradox is an effect of O(exp(−S BH )), and is otherwise invisible to every order in G N .

Weinberg-Witten
The interpretation of correlators in terms of massive amplitudes allows us to relate some familiar facts about massive amplitudes we have already encountered to other well-known facts about QFTs. Consider the Weinberg-Witten theorem [30], which in this way of thinking is essentially identical to Yang's theorem. Recall the discussion of consistent couplings of a massive spin S particle to massless particles. Since conserved currents and stress tensors measure the charge and the momentum of single-particle states respectively, we will be interested in the interaction of the massive state with two massless particles of opposite helicity, $h_1 = -h_2$. 16 Our analysis showed that $S + h_2 - h_1$ and $S + h_1 - h_2$ must always be greater than or equal to 0. This tells us that for S = 1, $|h_1| = |h_2| \le 1/2$, i.e. massless particles with spin > 1/2 cannot couple to a Lorentz covariant conserved current. Similarly for S = 2, $|h_1| = |h_2| \le 1$, and massless particles with spin > 1 cannot couple to a conserved stress tensor. This is precisely the Weinberg-Witten theorem.
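The helicity bound can be made explicit by simply scanning over half-integer helicities and keeping those that satisfy both inequalities with $h_1 = -h_2$. The sketch below does exactly that; the function names and the scan range (helicities up to 4) are illustrative choices, not part of the original argument.

```python
from fractions import Fraction

def allowed(S, h1, h2):
    """Selection rule quoted in the text: both combinations must be non-negative."""
    return S + h2 - h1 >= 0 and S + h1 - h2 >= 0

def max_helicity(S):
    """Largest |h| with h1 = -h2 = h that can couple to a massive spin-S state."""
    candidates = [Fraction(n, 2) for n in range(0, 9)]  # h = 0, 1/2, ..., 4
    return max(h for h in candidates if allowed(S, h, -h))

for S in (1, 2):
    print(S, max_helicity(S))
```

With $h_1 = -h_2 = h$ the two conditions collapse to $|h| \le S/2$, which gives 1/2 for a current (S = 1) and 1 for a stress tensor (S = 2).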

Form Factors Example: Stress Tensor/Gluons
From the Weinberg-Witten theorem we know that the stress tensor can only couple to massless particles of spin ≤ 1, thus we will consider form factors of a stress tensor and three gluons. Identifying the stress tensor as a massive spin-2 state, we will map this to a four-point amplitude involving one massive and three massless states: Let us consider the t-channel massless residue. Since the gluon is "charged" under the stress tensor, for the one-massive two-massless coupling one should consider opposite-helicity gluons. The t-channel residue can then be written as: where again, the equality holds for 23 = 0. This leads us to the following simple expression for the form factor: It is straightforward to check that the above result matches all three factorisation channels, as expected from its cyclically invariant form, up to the overall factor of $(\lambda_2)^4$ that takes care of the excess helicity weight and the stress tensor's SL(2,C) indices. We can straightforwardly extend this to two stress tensors coupled to two gluons:

16 Recall that all momenta are outgoing, so for p1 and p2 to represent the same particle, h1 = −h2.

There is an elephant in the room that we have not yet addressed. So far we have been considering conserved operators as massive spinning states. But conserved operators are a tiny subset of the infinite number of tensor operators, all of which must have well-defined form factors (and, in the next section, momentum space correlation functions). Furthermore, we should be able to see that there must be a kinematic distinction between conserved and non-conserved operators, such that higher-spin conserved currents for an interacting theory can be ruled out, à la the Coleman-Mandula theorem [31].
As an exercise let's consider a theory with two scalars $(\phi, \tilde\phi)$ and the operators $O_{1\mu} = \phi \overleftrightarrow{\partial}_\mu \tilde\phi$ and $O_{2\mu} = \phi\, \partial_\mu \tilde\phi$. The first is a conserved current while the second is not. Let us now consider the three-point form factor for Converting the above result into pure undotted SL(2,C) indices by contracting with $(p_1 + p_2)$ one finds: Not surprisingly, the form factor for $O_2$ can be further decomposed into a combination of S = 2, 1 and 0 states. Thus we see that a general operator simply corresponds to a linear combination of lower-spin states. In position space this is the statement that a general current, for example, can be written as: where $\hat{O}_\mu$ is the conserved piece. Note that while there is a conserved piece in a general operator, the projection introduces non-locality, and the projected piece is thus distinct from a genuine conserved operator. This non-locality is present in all the lower-spin components of the projection. Let us look at this distinction more closely in the context of general form factors. For an interacting theory, the form factor will in general have poles whose residue reveals the existence of a non-trivial S-matrix: Let us consider the particles to be massless, and take the momentum of the operator to be soft. Then, just as in the usual Weinberg soft theorems for the S-matrix, the form factor will be dominated by diagrams where the operator is attached to an external leg where q is the soft momentum of the operator and $n(p_i, q)$ is the numerator function. If the operator is a tensor, then $n(p_i, q)$ should carry the corresponding Lorentz indices. Conservation of the tensor is reflected in the requirement that the form factor vanish when contracted with $q_\mu$. If we have a conserved current, then we can have $n(p_i, q)^\mu = e_i\, p_i^\mu$, where $e_i$ is the charge of each external state. The requirement of conservation then simply corresponds to the requirement of charge conservation.
Similarly, for a conserved stress tensor we have $n(p_i, q)^{\mu\nu} = \kappa\, p_i^\mu p_i^\nu$, and the conservation condition simply stems from momentum conservation if the coupling κ is universal. Note that for higher spins, S > 2, there are no local solutions for $n(p_i, q)^{\mu_1 \cdots \mu_S}$ such that conservation is respected. This is the Coleman-Mandula theorem! The assumptions that went into this argument are the existence of a non-trivial S-matrix, the analyticity of the form factor (which can be interpreted as a massive S-matrix), and Lorentz invariance. The fact that the argument is closely related to Weinberg's soft theorems for gauge bosons is not a surprise in view of our usual intuition that if a conserved tensor exists in an interacting theory, then we can always weakly gauge it and have a non-trivial S-matrix involving the gauge boson.
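The way conservation reduces to charge and momentum conservation can be checked numerically. The sketch below assumes that, at leading soft order, the operator attached to external leg i contributes a factor $n(p_i,q)/(2\,p_i\cdot q)$ multiplying the same lower-point amplitude for every leg; this Weinberg-type form, the mostly-minus metric, and the sample momenta and charges are assumptions made for illustration.

```python
def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def current_divergence(ps, charges, q):
    """q_mu contracted with sum_i e_i p_i^mu / (2 p_i.q)."""
    return sum(e * mdot(q, p) / (2 * mdot(p, q)) for p, e in zip(ps, charges))

def stress_divergence(ps, kappas, q):
    """q_mu contracted with sum_i kappa_i p_i^mu p_i^nu / (2 p_i.q), for each nu."""
    return [
        sum(k * mdot(q, p) * p[nu] / (2 * mdot(p, q)) for p, k in zip(ps, kappas))
        for nu in range(4)
    ]

# Four massless momenta, all outgoing, summing to zero (illustrative values).
ps = [(1, 0, 0, 1), (1, 0, 0, -1), (-1, 0, 1, 0), (-1, 0, -1, 0)]
q = (0.3, 0.1, -0.2, 0.05)            # arbitrary soft momentum
charges = [1, -1, 2, -2]              # total charge zero

div_current = current_divergence(ps, charges, q)           # (1/2) * sum of charges
div_universal = stress_divergence(ps, [1.0] * 4, q)        # (kappa/2) * sum of momenta
div_nonuniversal = stress_divergence(ps, [1, 1, 1, 2], q)  # fails to vanish
```

With these inputs the current contraction vanishes because the charges sum to zero, the stress-tensor contraction vanishes only when κ is the same for every leg, and a non-universal κ leaves a remainder proportional to the momentum of the mismatched leg.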
Note that while one can always project out a conserved piece for non-conserved tensors, the corresponding form factor will include non-local pieces. Indeed in this case we can have, for example, n(p i , q) µ = q µñ (p i ,q) . This non-locality is again reflected in the singularity of the m 2 → 0 limit. This of course is an artifact of our projection, since there will be lower spin contributions coming along that will contain the same singularity and conspire to cancel, producing a smooth m 2 → 0 limit.

Current and Stress-Tensor Correlators
Let's consider the two- and three-point correlation functions for stress tensors in a conformal theory. In momentum space, the tree-level correlators are computed by gluing tree-level amplitudes with one massive leg and two massless legs. For conformal theories, the available tensor structures are constrained by conformal symmetry. In momentum space, this constraint is simply a reflection of the uniqueness of the three-point amplitude, which is fixed by the spin of the massive state and the helicities of the massless legs.
For example, the two-point function receives contributions from: where we've listed the contributions from the different internal helicity configurations and $I_2[X]$ is defined as: where k is the momentum of the stress tensor. The operator S 1 2 α 1 α 2 is a shorthand notation for 1α 1β 2α 2β . Note that it is understood that the expression must be symmetrized over $\{\alpha_i\}$ and $\{\beta_i\}$ separately, as well as over exchanging $\alpha_i \leftrightarrow \beta_i$, which takes into account the conjugate helicity configurations. For the scalar and equal-helicity fermion contributions, the tensor structures are identical to that of the equal-helicity gauge field.
For the three-point function one has: and $I_3[X]$ is defined as: where $k_1$, $k_2$ are the momenta carried by the $\alpha_i$- and $\beta_i$-indexed stress tensors respectively. Again, symmetrisation in terms of $\{\alpha_i\}$, $\{\beta_i\}$ and $\{\gamma_i\}$ is implied, and the equal-helicity fermions on one of the vertices, as well as internal scalars, do not produce new tensor structures.

Outlook
Relativistic quantum mechanics governs the laws of nature at low enough energies so that physics can be described in flat space, with a finite number of interacting particles. "Quantum field theory" is the standard textbook approach to this physics, where, as useful theoretical constructs, "local quantum fields" are introduced, along with the attendant baggage of field redefinition and gauge redundancies, in order to allow a description of the physics in a way compatible with relativistic locality and unitarity. But the on-shell approach to scattering amplitudes suggests that this may not be the only way, and that we might instead be able to describe relativistic quantum mechanics without local quantum fields, directly in terms of the physical particles. 17 In this paper we have taken the first steps towards extending the ideas of this on-shell approach to cover particles of all masses and spins in four dimensions. The purely kinematical part of our discussion has been fundamentally trivial, but trivializing the kinematics allows us to understand the structure of the physics as following seamlessly from the foundational principles of Poincare Invariance, Locality and Unitarity in a satisfying way.
We have seen many aspects of this understanding throughout this paper. The structure of three particle amplitudes, for any mass and spin, is fixed by Poincare invariance. For massless particles, there is a peculiarity for high enough spin-the three particle amplitudes are superficially "non-local" in the sense of having poles; while this doesn't show up in (3,1) signature Minkowski space where these amplitudes vanish, it does mean that consistent factorization at four points is non-trivial, and indeed, all but the usual massless theories we know and love, of interacting spin (0, 1/2, 1, 3/2, 2), are ruled out by these considerations. We learn that we can only have a single massless spin two particle, with universal couplings, that the massless spin one particles must have the structure of Yang-Mills theories, and spin 3/2 requires supersymmetry. Furthermore the mere existence of a consistent amplitude coupling to gravitons rules out all higher spin massless particles.
Similarly there is still a superficial "non-locality" associated with the coupling of a single massive particle to massless particles with spin, the "x-factor", which again makes factorization non-trivial. Unlike the case for massless particles, we can (non-trivially) find consistently factorizing four-particle amplitudes for any choice of three-particle couplings (with the usual restrictions on consistent couplings to massless spin 1 and spin 2 particles). But for massive particles of high enough spin, these consistently factorizing amplitudes are badly behaved at high energies, growing with powers of $(p_i \cdot p_j/m^2)$, so that the massless limit cannot be taken smoothly. This tells us that even massive particles of high enough spin cannot be separated by a parametrically large gap from other particles: massive particles with high spin cannot be "elementary". Finally, three-particle amplitudes involving all massive particles are local, but naturally have powers of 1/m. Thus, theories of massive particles can only smoothly interpolate to massless amplitudes at high energies for special choices of spectra and couplings; conversely, starting from massless helicity amplitudes at high energies, we can "unify" subsets of these amplitudes into massive ones in some cases. This can be done for spin 1 and spin 3/2 particles, representing the on-shell avatars of the Higgs and super-Higgs mechanisms, but we can see that gravity can't be "Higgsed" in this way.

17 It is amusing that the on-shell program is often contrasted with the standard approach using Feynman diagrams, since Feynman's primary physical motivation for introducing his diagrams to begin with was to get rid of quantum fields, and he was famously disappointed to learn, via Dyson's proof, that his diagrams were so closely related to field theory after all!

In the context of this summary it is perhaps also worth briefly describing the on-shell understanding of the most famous general consequences of relativistic quantum mechanics: the existence of antiparticles and the spin-statistics connection.
The existence of antiparticles is essentially hardwired into the on-shell formalism, since by fiat we are considering analytic functions of Lorentz-invariant kinematical variables, with consistent factorization on all possible channels. To be a little more explicit on these ancient points, we can ask how causality is encoded in the S-matrix in any theory, with or without Lorentz invariance. At tree level, causality tells us that the amplitude can only have simple poles as a function of energy variables. If the particles have a dispersion relation of the form $E = \omega(\vec p)$, the poles can either be of the form $1/(E + \omega(\vec p))$, or also $1/(E - \omega(\vec p))$ if the interaction Hamiltonian allows particle production. But in a Lorentz invariant theory, neither $(E + \omega(\vec p))$ nor $(E - \omega(\vec p))$ is individually invariant, so Lorentz invariance and causality force us to have poles of the form $\frac{1}{E^2 - \omega(\vec p)^2} = \frac{1}{p^2 - m^2}$. This is how we see that causality demands this familiar pole structure at tree level, which as a byproduct also forces the existence of non-zero amplitudes for the production of degenerate particles and antiparticles.
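The step from the two non-invariant energy poles to the single invariant one is just a partial-fraction identity. The worked equation below is a sketch assuming the relativistic dispersion relation $\omega(\vec p)^2 = \vec p^{\,2} + m^2$ and a mostly-minus metric, so that $p^2 = E^2 - \vec p^{\,2}$; the relative weight $1/2\omega$ between the two poles is the one fixed by requiring an invariant sum.

```latex
\frac{1}{2\,\omega(\vec p)}
\left[\frac{1}{E-\omega(\vec p)}-\frac{1}{E+\omega(\vec p)}\right]
=\frac{1}{E^{2}-\omega(\vec p)^{2}}
=\frac{1}{E^{2}-\vec p^{\,2}-m^{2}}
=\frac{1}{p^{2}-m^{2}},
\qquad
\omega(\vec p)^{2}=\vec p^{\,2}+m^{2}.
```

Neither term on the left is Lorentz invariant on its own; only the combination with both poles, and hence both particle and antiparticle propagation, is.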
The on-shell understanding of the connection between spin and statistics is slightly more interesting, and makes use of the universality of the coupling to gravity. Indeed we saw vividly that the structure of the four-particle amplitude for gravi-Compton scattering off particles of general mass and spin is completely fixed, and in particular forces the correct spin-statistics connection. This relies deeply on the non-trivial way in which residues in different channels are consistent with each other, forcing the "s" and "u" channels, which are related by particle interchange, to have fixed relative signs. It is not surprising that an on-shell understanding of a classic fact related to locality and unitarity should be related to coupling to gravity; after all, it is precisely the ability to "weakly gauge" gravity that gives a physical probe (via the existence of an energy-momentum tensor) of the locality of quantum field theory. We also described how other famous general results in field theory, such as the Weinberg-Witten and Coleman-Mandula theorems, are interpreted in directly on-shell terms.
Moving beyond tree scattering, we also took some first steps towards computing amplitudes at one loop, where the on-shell picture is especially powerful, as seen in the speed and transparency of the computation of the electron (g − 2) and the QCD beta function. While not discussed in this paper, chiral anomalies, together with the possibility of cancelling them via the Green-Schwarz mechanism, also have a beautiful on-shell understanding, arising from the necessity to interpret poles in one-loop amplitudes fixed by generalized unitarity [32].
But of course, much more importantly than providing a conceptually transparent and technically straightforward understanding of standard results, we hope that the formalism introduced in this paper removes the trivial barriers to exploring the new frontier of massive scattering amplitudes, which is filled with fascinating physical questions. We close by listing just a small number of these.
We have focused almost entirely on the computation of tree-level three- and four-particle amplitudes, so one completely obvious question is the extension of e.g. BCFW recursion relations to any number of external particles, especially for Higgsed Yang-Mills theories.
Of course for massless particles the BCFW shift must be performed for massless particles of appropriate helicity in order to ensure the absence of poles at infinity, so the obvious challenge is that the massive amplitudes unify both the "good" and "bad" helicity combinations into a single object.
Another clear goal is the systematic computation of all the massive amplitudes in the Standard Model, starting at tree level but moving to multi-loop level. It is worth mentioning at least one exciting motivation for this undertaking. Future Higgs factories, like the CEPC or TLEP, can also run on the Z-pole, producing between $10^9$ and $10^{11}$ Z particles. Making full use of this data will require a computation of Z-couplings at three- to four-loop accuracy. And unlike QCD calculations of backgrounds at the LHC, for which the perturbative computations must ultimately be convolved with non-perturbative information such as PDFs and hadron fragmentation functions to connect with experiment, these precision electroweak calculations are unaffected by hadronic uncertainties at the needed level of precision, so any theoretical predictions can be unambiguously connected to exquisitely precise experimental measurements! It is also clearly of interest to investigate massive amplitudes in supersymmetric theories; this should of course be especially interesting in the context of N = 4 SYM on the Coulomb branch. Now even our first look at the on-shell avatar of the Higgs and super-Higgs mechanisms showed that the Higgsed amplitudes are more unified than their massless counterparts. Thus we should expect that all the natural objects encountered for massless amplitudes, such as tree amplitudes, leading singularities and on-shell diagrams, which are separated into different "k" sectors, are somehow unified into more interesting objects. Amongst other things, the extension of BCFW to the Higgsed theories might be most natural in the massive N = 4 on-shell diagram formulation. And of course it would be fascinating to see if the Grassmannian/Amplituhedron structures underlying the theory at the origin of the moduli space are somehow extended/deformed away from the origin.
All of the physics we have discussed in this paper has revolved around the consistency of long-distance physics: the on-shell focus on factorization and cuts at tree and loop level is meant to ensure that the infrared singularities needed by locality and unitarity are correctly accounted for, and this fixes the structure of the amplitudes. For theories whose amplitudes grow in the ultraviolet, and so need a UV completion, it is very natural to ask the same question: can the physics of UV completion also be determined from the consistency conditions of locality and unitarity? If the UV completion is weakly coupled, the question becomes perfectly sharply posed, and in the context of unitarizing the Fermi interaction or WW scattering, searching for a tree-level UV completion correctly led to the prediction of massive W particles and Higgses as the completion of the weak interactions. Turning to the even more famous problem of UV completion for gravity scattering amplitudes, we encounter a well-known novelty. As will be discussed at greater length in [6], any weakly coupled UV completion for gravity amplitudes (or for that matter Yang-Mills or $\phi^3$ theory, or any theory with non-trivial three-particle amplitudes) must involve an infinite tower of particles with infinitely increasing spins, as is of course familiar from string theory. It is a tantalizing prospect to try and "derive string theory" in this way, as giving the only possible consistent tree scattering amplitudes for gravitons coupled to the infinite tower of massive higher spin particles necessary for UV completion. But consideration of amplitudes involving massive higher spin particles is necessary for any possible uniqueness, since, as shown in [6], deformations of the string scattering amplitudes with only gravitons as external particles, compatible with all the standard rules, have been identified. This is not at all surprising. Since we know the presence of gravity makes massless higher spin particles impossible, the coexistence of gravity unified with an infinite tower of massive higher spin particles must involve the strongest consistency conditions imaginable. Again, the massive amplitude formalism we have discussed in this paper trivializes kinematical issues so that important physics points can be studied with an unobstructed view, and with this in hand we will return to string theory and the challenge of UV completion in [6].

B SU(2) Irreps as Symmetric Tensors
In this appendix we review, mostly to set notation, the elementary treatment of representations of SU(2) as symmetric tensors, and briefly discuss some of its standard applications, such as a transparent determination of spherical harmonics.

The standard treatment of representations of SU(2) is the one encountered by most undergraduates in beginning quantum mechanics courses. Since we can simultaneously diagonalize $J^2$ and $J_z$, eigenstates of these operators are labeled by $|s, j_{\hat z}\rangle$, where the $\hat z$ reminds us that we have chosen to diagonalize the operator $J_z$, and we have $J^2 |s, j_{\hat z}\rangle = s(s+1)\,|s, j_{\hat z}\rangle$ and $J_z |s, j_{\hat z}\rangle = j_{\hat z}\,|s, j_{\hat z}\rangle$. The irrep is $(2s+1)$-dimensional, with $j_{\hat z}$ taking all the values $-s \le j_{\hat z} \le +s$. The spin information in a general state $|\psi\rangle$ is then entirely contained in specifying $\langle s, j_{\hat z}|\psi\rangle$.

But for our purposes it is more convenient to describe an irrep of SU(2) as a completely symmetric SU(2) tensor with $2s$ indices:

$$\psi_{i_1 \cdots i_{2s}} \tag{B.1}$$

where $i$ is the SU(2) index. The inner product $\langle\chi|\psi\rangle$ between two states is given by

$$\langle\chi|\psi\rangle = \epsilon^{i_1 j_1} \cdots \epsilon^{i_{2s} j_{2s}} \left(\chi_{i_1 \cdots i_{2s}}\right)^* \psi_{j_1 \cdots j_{2s}} \tag{B.2}$$

Saying that $\psi$ is an SU(2) tensor is just the statement that the rotation generators $\vec J$ act as

$$(\vec J \psi)_{i_1 \cdots i_{2s}} = \left(\tfrac{1}{2}\vec\sigma\right)_{i_1}{}^{j_1} \psi_{j_1 \cdots i_{2s}} + \cdots + \left(\tfrac{1}{2}\vec\sigma\right)_{i_{2s}}{}^{j_{2s}} \psi_{i_1 \cdots j_{2s}} \tag{B.3}$$

Note that the dimensionality of this space is precisely $2 \times 3 \times \cdots \times (2s+1)/(1 \times 2 \times \cdots \times 2s) = (2s+1)$, as desired. Using that $\vec\sigma_i{}^j \cdot \vec\sigma_k{}^l = 2\delta^j_k \delta^l_i - \delta^j_i \delta^l_k$, we trivially see that $(J^2 \psi)_{i_1 \cdots i_{2s}} = s(s+1)\,\psi_{i_1 \cdots i_{2s}}$. If we choose to diagonalize $\sigma_z$ with eigenstates $(\sigma_z)_i{}^j\, \zeta^{\hat z,\pm}_j = \pm\,\zeta^{\hat z,\pm}_i$, then the spin-$s$ tensor that is an eigenstate of $J_z$ with eigenvalue $j_{\hat z}$ is

$$\psi^{s,j_{\hat z}} = \left(\zeta^{\hat z,+}\right)^{s+j_{\hat z}} \left(\zeta^{\hat z,-}\right)^{s-j_{\hat z}} \tag{B.4}$$

where here and in what follows, since the tensor indices on $\psi$ are always symmetrized, there is no need to write them explicitly when no confusion can arise.
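The statements around (B.3) and (B.4) can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the index convention in which $(\sigma)_i{}^j$ is the ordinary Pauli matrix element with row $i$ and column $j$, takes $\zeta^{\hat z,+} = (1,0)$ and $\zeta^{\hat z,-} = (0,1)$, and stores a rank-$2s$ tensor as a dictionary over index tuples. The function names (`apply_J`, `casimir`, `eigen_tensor`, `is_multiple`) are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Pauli matrices, stored as SIGMA[axis][i][j] = (sigma_axis)_i^j.
# (Assumed convention: row index i, column index j.)
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def apply_J(axis, psi, rank):
    """Eq. (B.3): act with sigma/2 on each tensor slot in turn and add up."""
    sig = SIGMA[axis]
    out = {}
    for idx in product((0, 1), repeat=rank):
        total = 0j
        for slot in range(rank):
            for j in (0, 1):
                # Replace the index in this slot by the summed index j.
                src = idx[:slot] + (j,) + idx[slot + 1:]
                total += 0.5 * sig[idx[slot]][j] * psi[src]
        out[idx] = total
    return out


def casimir(psi, rank):
    """J^2 psi, computed as the sum over axes of J_a (J_a psi)."""
    out = {idx: 0j for idx in psi}
    for axis in SIGMA:
        twice = apply_J(axis, apply_J(axis, psi, rank), rank)
        for idx in out:
            out[idx] += twice[idx]
    return out


def eigen_tensor(rank, n_up):
    """Unnormalized symmetrized product of n_up copies of zeta_+ = (1, 0)
    and (rank - n_up) copies of zeta_- = (0, 1), as in eq. (B.4).
    A component is nonzero exactly when n_up of its indices equal 0."""
    return {
        idx: (1.0 if idx.count(0) == n_up else 0.0)
        for idx in product((0, 1), repeat=rank)
    }


def is_multiple(phi, psi, factor, tol=1e-12):
    """True if phi equals factor * psi component by component."""
    return all(abs(phi[idx] - factor * psi[idx]) < tol for idx in psi)


if __name__ == "__main__":
    rank = 3          # rank 2s = 3, i.e. spin s = 3/2
    s = rank / 2
    for n_up in range(rank + 1):
        psi = eigen_tensor(rank, n_up)
        j_z = n_up - s  # because n_up = s + j_z
        ok_z = is_multiple(apply_J("z", psi, rank), psi, j_z)
        ok_c = is_multiple(casimir(psi, rank), psi, s * (s + 1))
        print(j_z, ok_z, ok_c)
```

The loop over `n_up` runs through `rank + 1 = 2s + 1` tensors, one for each allowed value of $j_{\hat z}$, which is the dimension count quoted in the text.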
We can also express the same fact in a different way, telling us how to extract $\langle s, j_{\hat z}|\psi\rangle$ from the tensor $\psi_{i_1 \cdots i_{2s}}$:

The tensor representation makes it trivial to give explicit expressions for finite rotations, and to expand the eigenstate $\psi^{s,j_{\hat n}}$ for a general direction $\hat n$, pointing in the usual $(\theta, \phi)$ direction, as a linear combination of the $\psi^{s,j_{\hat z}}$'s. We only need to know the relation for spin 1/2:

The tensor formalism also makes it trivial to construct spherical harmonics, which naturally arise in building irreps of SU(2) which are polynomials in a 3-vector $\vec x$. Of course we are used to converting $\vec x$ to SU(2) indices by dotting with the $\sigma$ matrices, but this gives us an object $\vec\sigma_i{}^j \cdot \vec x$ with an upstairs and a downstairs index, while for the purposes of building irreps we would like to work with symmetric tensors and all downstairs indices. So it is natural to look instead at $x_{ij} = \epsilon_{ik}\, x_j{}^k$; explicitly we have

We would like to make symmetric rank-$2s$ tensors from a product of $s$ $x_{ij}$'s. But we don't need to do the symmetrizations explicitly; again because of the symmetrization all the information is contained in

$$\zeta^{i_1} \zeta^{j_1} \cdots \zeta^{i_s} \zeta^{j_s}\, x_{i_1 j_1} \cdots x_{i_s j_s} = (\zeta\zeta x)^s \tag{B.10}$$

Putting $\zeta^i = (\alpha_+, \alpha_-)$ and so $\zeta_i = (-\alpha_-, \alpha_+)$, expanding the above gives us the generating function for spherical harmonics. Letting $\vec x$ be the unit vector with $(x + iy) = \sin(\theta)\, e^{i\phi}$ and $z = \cos(\theta)$, we have

$$(\zeta\zeta x)^s = \left(\alpha_+^2 \sin(\theta)\, e^{i\phi} - 2\alpha_+ \alpha_- \cos(\theta) - \alpha_-^2 \sin(\theta)\, e^{-i\phi}\right)^s \equiv \sum_{j_{\hat z}} \alpha_+^{s+j_{\hat z}} \cdots$$
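As a hedged illustration of the generating function $(\zeta\zeta x)^s$ above, the sketch below expands the polynomial in $\alpha_\pm$ numerically for integer $s$ and reads off the coefficient of $\alpha_+^{s+m}\alpha_-^{s-m}$. It assumes, as the text indicates, that this coefficient is the spherical harmonic $Y_{s,m}(\theta,\phi)$ up to an $m$-dependent normalization that is not fixed here; the function names `poly_mul` and `generating_coefficients` are hypothetical and introduced only for this example.

```python
import cmath
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists.
    Entry k is the coefficient of alpha_+^k (the alpha_- power is implied
    by homogeneity)."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def generating_coefficients(s, theta, phi):
    """Expand (a+^2 sin(t) e^{ip} - 2 a+ a- cos(t) - a-^2 sin(t) e^{-ip})^s.
    Returns {m: coefficient of alpha_+^(s+m) alpha_-^(s-m)} for integer s."""
    base = [
        -math.sin(theta) * cmath.exp(-1j * phi),  # alpha_+^0 alpha_-^2
        -2.0 * math.cos(theta),                   # alpha_+^1 alpha_-^1
        math.sin(theta) * cmath.exp(1j * phi),    # alpha_+^2 alpha_-^0
    ]
    poly = [1 + 0j]
    for _ in range(s):
        poly = poly_mul(poly, base)
    return {power - s: poly[power] for power in range(2 * s + 1)}


if __name__ == "__main__":
    theta, phi = 0.7, 1.3
    coeffs = generating_coefficients(2, theta, phi)
    # Unnormalized angular dependence of the s = 2 harmonics, for comparison.
    expected = {
        2: math.sin(theta) ** 2 * cmath.exp(2j * phi),
        1: -4 * math.sin(theta) * math.cos(theta) * cmath.exp(1j * phi),
        0: 2 * (3 * math.cos(theta) ** 2 - 1),
        -1: 4 * math.sin(theta) * math.cos(theta) * cmath.exp(-1j * phi),
        -2: math.sin(theta) ** 2 * cmath.exp(-2j * phi),
    }
    for m in sorted(expected):
        print(m, abs(coeffs[m] - expected[m]) < 1e-12)
```

For $s = 2$ the five coefficients reproduce the familiar angular dependence $\sin^2\theta\, e^{\pm 2i\phi}$, $\mp\sin\theta\cos\theta\, e^{\pm i\phi}$ and $3\cos^2\theta - 1$, each multiplied by a positive constant, so no explicit symmetrization over tensor indices is needed to obtain them.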

F Examples for 1 Massive 3 Massless Amplitudes
Since the all-massless and the one-massive two-massless three-point amplitudes are unique, the massless residue of the 1 massive 3 massless amplitude is unique as well. If the residue is non-local, then consistent factorization in the other channel may force the theory to have a particular one-massless two-massive interaction. Here we present some examples. We consider the four-point amplitude of a massive particle of arbitrary spin $S$, two massless scalars and a graviton: We can now look at the massless residue for the s-channel,