Amplitudes, Hopf algebras and the colour-kinematics duality

It was recently proposed that the kinematic algebra featuring in the colour-kinematics duality for scattering amplitudes in heavy-mass effective field theory (HEFT) and Yang-Mills theory is a quasi-shuffle Hopf algebra. The associated fusion product determines the structure of the Bern-Carrasco-Johansson (BCJ) numerators, which are manifestly gauge invariant and with poles corresponding to heavy-particle exchange. In this work we explore the deep connections between the quasi-shuffle algebra and general physical properties of the scattering amplitudes. First, after proving the double-copy form for gravitational HEFT amplitudes, we show that the coproducts of the kinematic algebra are in correspondence with factorisations of BCJ numerators on massive poles. We then study an extension of the standard quasi-shuffle Hopf algebra to a non-abelian version describing BCJ numerators with all possible gluon orderings. This is achieved by tensoring the original algebra with a particular Hopf algebra of orderings. In this extended version, a specific choice of the coproduct in the algebra of orderings leads to an antipode in the resulting Hopf algebra that has the interpretation of reversing the gluons' order within each BCJ numerator.

In this paper we mainly focus on the algebraic aspects of the colour-kinematics duality. The first critical milestone is to reveal the kinematic algebra and construct all the duality-satisfying numerators, known as Bern-Carrasco-Johansson (BCJ) numerators, from an algebraic product. In [57], building on [54][55][56], four of the present authors and Johansson proposed a kinematic algebra that underlies the colour-kinematics duality in a heavy-mass effective field theory (HEFT) and, using a decoupling limit, in Yang-Mills (YM) theory. Remarkably, it was found that the BCJ numerators for tree-level amplitudes of two heavy particles and any number of gluons can be constructed from a quasi-shuffle Hopf algebra, a structure that is well studied in the context of combinatorial Hopf algebras of shuffles and quasi-shuffles [58][59][60][61][62].
The HEFT BCJ numerators obtained in our approach enjoy several important properties: they are manifestly gauge invariant, and their poles correspond to heavy-particle exchange. The aim of this paper is twofold: first, we relate the recently discovered quasi-shuffle Hopf algebra [57] back to the physical properties of scattering amplitudes; and second, we introduce new mathematical structures and extend further the quasi-shuffle Hopf algebra to incorporate certain general properties of scattering amplitudes, which are otherwise difficult to describe using the standard quasi-shuffle algebra. These two directions go hand-in-hand.
Specifically, for better describing the ordering of particles in the scattering amplitudes we are led to a non-abelian extension of the standard quasi-shuffle Hopf algebra, and important structures of the algebra such as the coproduct and the antipode act non-trivially on the BCJ numerators. As we will show, the action of the coproduct is interpreted as a factorisation limit of the BCJ numerators on the massive poles, whereas the antipode reverses the ordering of the particles entering the numerators.
There are several reasons to pursue these goals. One stems from the importance of exploring the kinematic algebra from a mathematical viewpoint. In addition, one may recall that the gravitational amplitudes obtained after a double copy describe black hole scattering and gravitational wave emission. The resulting expressions in the approach described in this paper are extremely compact, and have led to one of the most efficient ways of studying black hole scattering [63].
The rest of the paper is organised as follows. In Section 2, we review the novel colour-kinematics duality and general properties of tree-level amplitudes in HEFT, including the decoupling limit leading to YM amplitudes. We also prove the double copy formula for the gravitational amplitudes involving heavy particles using KLT relations. In Section 3, we introduce the concept of the pre-numerators in HEFT and their construction from abstract algebraic generators. These generators obey a quasi-shuffle Hopf algebra, and the pre-numerators are obtained using a map from the generators to functions of momenta and polarisation vectors. The BCJ numerators are directly related to the pre-numerators in a simple manner through the so-called nested commutators. Section 4 aims to study the kinematic Hopf algebra for the BCJ numerators in HEFT and to understand the physical interpretation of various mathematical operations in the Hopf algebra. In particular, the coproduct is related to the factorisation of the pre-numerators. In Section 5, we introduce the non-abelian version of the quasi-shuffle Hopf algebra. This is realised by extending the algebraic generators to include the ordering of external particles. We show that it is this version of the kinematic algebra that directly describes the BCJ numerators in an algebraic fashion in the HEFT. Section 6 is concerned with the factorisation behaviour of BCJ numerators and tree-level amplitudes in HEFT and their connections with coproducts, using the non-abelian version of the quasi-shuffle Hopf algebra. Finally, we conclude in Section 7 and comment on several directions for further research. Three appendices complete the paper. In Appendix A we provide a recursive definition of the quasi-shuffle product, while in Appendix B we discuss a connection between shuffle and quasi-shuffle algebras. 
Finally, in Appendix C we give an alternative definition of the pre-numerator from a different mapping rule which ensures that the pre-numerators themselves are symmetric under the action of the antipode.
2 Colour-kinematics duality and double copy in HEFT

General setup
We will be interested in amplitudes involving two massive scalars with mass $m$ and $n-2$ gluons/gravitons, in the heavy-mass limit [64][65][66][67][68]. More precisely, we will define the HEFT amplitude to be the piece of the gluon or graviton amplitude which is of order $m$ or $m^2$, respectively.¹ For Yang-Mills-scalar amplitudes the HEFT amplitudes are simply obtained by taking the leading term in the heavy-mass limit. For instance, the colour-ordered two-scalar/two-gluon amplitude is given by²

$$A(1,2,3,4) = \frac{2\, p_3\cdot F_1\cdot F_2\cdot p_3}{s_{12}\,(s_{13}-m^2)}\,, \tag{2.1}$$

where particles 1, 2 are gluons and 3, 4 are adjoint massive scalars. $F_i^{\mu\nu} := p_i^{\mu}\varepsilon_i^{\nu} - p_i^{\nu}\varepsilon_i^{\mu}$ denotes a linearised field strength, and $s_{ij} := (p_i+p_j)^2$. To take the heavy-mass limit we write the momentum of the first and second scalar (in a convention where all particles are outgoing) as $p_3^{\mu} = -m v^{\mu}$ and $p_4^{\mu} = m v^{\mu} + q^{\mu}$, respectively, where $v^{\mu}$ is the velocity of the heavy particle and $q^{\mu} = p_1^{\mu} + p_2^{\mu}$ is the sum of the gluon momenta. Then we can take the leading term in the large-$m$ expansion, to yield the HEFT amplitude $A(12, v)$, in which we have replaced the two scalar labels 3, 4 with a single label for the velocity $v$. For n-point HEFT amplitudes we will also make the same replacement, writing them as $A(12\ldots n-2, v)$.

In the same manner as above, we can take the heavy-mass limit of a gravitational amplitude $M(1,2,\ldots,n-1,n)$ involving two massive scalars and extract the term which scales like $m^2$, giving the gravitational HEFT amplitude $M(12\ldots n-2, v)$. Note that gluons (gravitons) in YM (gravity) amplitudes $A$ ($M$) are labelled as $1,\ldots,n-2$ as in the figure above. We also quote the useful relation $v\cdot p_{12\ldots n-2} = 0$, which follows from momentum conservation and on-shell conditions, where we have defined $p_{12\ldots m} := \sum_{k=1}^{m} p_k$. From now on we will drop explicit factors of $m$ and $m^2$ since HEFT amplitudes always scale homogeneously in the mass.

¹ Strictly speaking, gravity amplitudes contain additional contact terms of order $m^3$ and higher [63] in the heavy-mass limit. These are given by products of lower-point HEFT amplitudes and delta functions, and vanish on generic kinematic configurations. In the following we will ignore such contributions, which is equivalent to dropping all Feynman $i\varepsilon$'s.
² We have omitted the overall coupling dependence.
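As a short consistency check of the heavy-mass limit described above, the leading term can be worked out directly from (2.1). The sketch below assumes that the denominator of (2.1) is the product $s_{12}(s_{13}-m^2)$ and that $v^2=1$ (so that $p_3^2=m^2$), together with $p_1^2=0$; these are assumptions made here for illustration.

```latex
\begin{align}
s_{13}-m^2 &= (p_1+p_3)^2 - m^2 = 2\,p_1\cdot p_3 = -2m\, v\cdot p_1\,, \\
p_3\cdot F_1\cdot F_2\cdot p_3 &= m^2\, v\cdot F_1\cdot F_2\cdot v\,, \\
A(1,2,3,4) &= \frac{2m^2\, v\cdot F_1\cdot F_2\cdot v}{s_{12}\,(-2m\, v\cdot p_1)}
            = -\,m\,\frac{v\cdot F_1\cdot F_2\cdot v}{s_{12}\; v\cdot p_1}\,.
\end{align}
```

Under these assumptions the result is linear in $m$, as expected for a gluon HEFT amplitude, and it has a single massive pole at $v\cdot p_1 = 0$.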
A novel colour-kinematic duality and double copy for HEFT amplitudes were recently proposed in [56,57]. The colour-ordered YM tree amplitudes with two heavy particles and n−2 gluons are found to take the following form where ρ denotes all ordered nested commutators of the labels {1, . . . , n−2}. Each term in the sum is in one-to-one correspondence with a cubic graph with gluon labels following the colour ordering of the amplitude, and d Γ is the product of inverse scalar propagators of massless particles associated with that cubic graph. N (Γ, v) represents the HEFT BCJ numerator for the cubic graph Γ, whose properties and explicit construction will be studied in detail in this paper. Note that the number of terms in the sum in (2.6) is given by the Catalan number C n−3 (for n-gluon amplitudes in YM, this number becomes C n−2 ).
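The counting of terms quoted above can be checked with a few lines of code. The sketch below assumes that each term corresponds to one way of fully bracketing the n−2 ordered gluon labels into nested commutators; the function names are illustrative and not taken from the paper.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_bracketings(k: int) -> int:
    """Number of ways to fully bracket k ordered labels into nested commutators."""
    if k == 1:
        return 1
    # Split the ordered labels into a left block of size i and a right block of size k - i.
    return sum(count_bracketings(i) * count_bracketings(k - i) for i in range(1, k))


def catalan(j: int) -> int:
    """Closed form for the j-th Catalan number."""
    return comb(2 * j, j) // (j + 1)


# Compare the direct count for n - 2 gluons with the Catalan number C_{n-3}.
for n in range(4, 10):
    print(n, count_bracketings(n - 2), catalan(n - 3))
```

For n = 5, for instance, the two bracketings are [[1,2],3] and [1,[2,3]].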
The BCJ numerators are manifestly gauge invariant and contain only physical heavy-mass propagators, which are linear. Thus, by construction, gluon HEFT amplitudes contain only single massive poles, while graviton HEFT amplitudes contain only double massive poles. The latter property is entirely non-obvious upon simply expanding graviton amplitudes in the full theory in the large-mass limit. Moreover, the antisymmetry property for a single graph and the Jacobi identity for a triplet of graphs hold automatically. We also note the following two important relations for colour-ordered gluon amplitudes:

Proof of the HEFT double copy
The particular representation of HEFT amplitudes given in (2.6) was proved in [57] by understanding the factorisation behaviour of the BCJ numerators. The main goal of this section is to prove the double-copy formula (2.8). To do so, we start from the KLT relation for the full theory [69],

$$M(1,2,\ldots,n-1,n) = -\sum_{\alpha,\beta\in S_{n-3}} S(1\alpha|1\beta)\, A(1,\alpha,n-1,n)\, A(1,\beta,n,n-1)\,, \tag{2.15}$$

where $S(1\alpha|1\beta)$ is the field theory KLT kernel, which is an $(n-3)!\times(n-3)!$ matrix, and the sum is over the $(n-3)!$ permutations of $(2,\ldots,n-2)$. In the heavy-mass limit for the particles $n-1$ and $n$, we see that swapping $n-1$ and $n$ amounts to sending $v\to -v$, and the HEFT amplitudes are odd under this transformation (2.16). Using this relation we obtain the KLT double copy relation for the HEFT amplitudes. We then write the colour-ordered HEFT amplitudes with two massive scalars and $n-2$ gluons as in (2.18) [1,39,69,70,71], where $N([1,\beta], v)$ are KLT numerators in the Del Duca-Dixon-Maltoni (DDM) basis [72]. This basis is associated with fully left-nested commutators where the first index is fixed to be 1, which we denote as $[1,\beta]$, where $\beta$ is a permutation of the remaining gluon legs $2,\ldots,n-2$. The matrix $m$ appearing in (2.18) is the propagator matrix [70], or inverse of the KLT matrix. Now, the last line of (2.21) can be recast in the form of (2.8). This can be seen from the definition (2.22) of the propagator matrix [70] in the bi-adjoint scalar theory [39,71,73,74,75], and then using the colour-kinematics duality to replace the factors of $c$ in those formulae by factors of $N$, thus transforming (2.21) into (2.8). Here $c$ and $\tilde{c}$ are nothing but the traces of the colour group generators for the external scalars, with a similar formula for $\tilde{c}$ as for $c$.
Alternatively, we use the BCJ form (2.6) to rewrite the gauge theory amplitude appearing in (2.21), giving

Comparison to Yang-Mills theory
It is interesting to compare (2.18) to the corresponding formula for an n-point gluon amplitude in Yang-Mills theory Here the expansion is over the (n−2)! fully left-nested DDM numerators N ([1, β], n) with particles 1 and n fixed, which are widely studied in the literature [76][77][78][79][80][81][82]. Each term in the sum corresponds to a half-ladder, or multi-peripheral graph, and in this case the matrix m(1α|1β) is singular. When we take the decoupling limit (2.14), mapping our HEFT amplitudes (with two scalars and n−2 gluons) into (n−1)-point YM amplitudes, the YM amplitudes thus obtained are in the form (2.31) (with n replaced by n−1).
We also note the alternative KLT representation of YM amplitudes with three legs being fixed [69,83], where the matrix m(12α|12β) is now invertible, which implies that the N are gauge invariant and unique, but may contain poles [84]. The N represent a basis of numerators which are equivalent to the N up to the kernel of the non-invertible matrix m(1α|1β) with α, β ∈ S n−2 . However, we have used generalised gauge transformations to explicitly set linearly dependent numerators to zero as per [83].
As an example, we review the four-point case.

Pre-numerators and quasi-shuffle products
The BCJ numerators introduced in the previous section can be generated from an object known as the pre-numerator [54,55,57], denoted in the same way as the BCJ numerators but without the commutator structure:

$$N(123\ldots n-2,\, v)\,. \tag{3.1}$$

We can construct the BCJ numerators by permuting the labels of the pre-numerator according to the commutator structure, for example

$$N([[1,2],3], v) = N(123, v) - N(213, v) - N(312, v) + N(321, v)\,, \tag{3.2}$$

and similarly for more complicated BCJ numerators. Written in this way the BCJ numerators manifestly satisfy the Jacobi identities.
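The expansion of a commutator structure into signed orderings of the pre-numerator labels can be sketched in a few lines. The representation used here (nested pairs of integers for the commutators, dictionaries from label orderings to integer coefficients) is an illustrative choice, not the paper's notation.

```python
def expand(expr):
    """Expand a nested commutator into {ordering: coefficient}.

    A leaf is an integer label; a commutator [a, b] is the pair (a, b).
    """
    if isinstance(expr, int):
        return {(expr,): 1}
    left, right = expr
    result = {}
    for a, ca in expand(left).items():
        for b, cb in expand(right).items():
            # [a, b] = ab - ba on the level of label orderings
            result[a + b] = result.get(a + b, 0) + ca * cb
            result[b + a] = result.get(b + a, 0) - ca * cb
    return {k: c for k, c in result.items() if c != 0}


def add(*terms):
    """Sum several expansions, dropping orderings whose coefficients cancel."""
    total = {}
    for term in terms:
        for k, c in term.items():
            total[k] = total.get(k, 0) + c
    return {k: c for k, c in total.items() if c != 0}


print(expand(((1, 2), 3)))
# The Jacobi identity holds identically for numerators built this way:
print(add(expand(((1, 2), 3)), expand(((2, 3), 1)), expand(((3, 1), 2))))
```

Each key of the returned dictionary is an ordering of gluon labels at which the pre-numerator would be evaluated, with the value giving its sign.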
The BCJ numerators are unique; however, the pre-numerators used to build them are not uniquely defined: there is still freedom in choosing their exact form, which we can use to make certain properties of the pre-numerators manifest. Remarkably, it was shown in [57] that for tree amplitudes in HEFT there exists a closed-form expression for crossing-symmetric pre-numerators in terms of gauge-invariant quantities for arbitrary n. Moreover, the pre-numerators in the canonical gluon ordering $N(12\ldots n-2, v)$ can be constructed from a quasi-shuffle algebra using a two-step procedure which we now outline, following [57]. First, we build an intermediate object which we call an "algebraic" pre-numerator [54,55,57,63], obtained by fusing abstract generators $T_{(i)}$, one for each gluon leg, using a quasi-shuffle product denoted by ⋆. Then we define a linear map (denoted by angle brackets ⟨•⟩) from these abstract generators to kinematic quantities. The linear map relates each single-index generator, e.g. $T_{(i)}$, to a vector current with a heavy source. Multi-index generators are mapped to multi-rank tensor currents of the same heavy source. The fusion product of these currents can be understood as an algebraic fusion rule from lower-rank to higher-rank currents once we perform the map (3.4). In this sense, the algebra generators are identified as the operators that generate the tensor currents [54,55].
In the next two sections we review the quasi-shuffle product, before describing the map ⟨•⟩ to tensor currents and finally pre-numerators.
To begin, we introduce some standard nomenclature for generators: we will refer to generators with a single subset $T_{(\tau_1)}$ as "letters", those with multiple subsets $T_{(\tau_1),\ldots,(\tau_r)}$ as "words", and the length of a word is the number of subsets, or letters, it contains.
Next we introduce the quasi-shuffle product ⋆ on the space of generators A, first in some examples and then in full generality. The simplest case to consider is the product of two letters, say, $T_{(1)}$ and $T_{(2)}$, which gives the four-point algebraic pre-numerator $\widehat{N}(12) = T_{(1)} \star T_{(2)}$ as a combination of three terms built from $T_{(1),(2)}$, $T_{(2),(1)}$ and $T_{(12)}$ (3.6). The first two terms correspond to the "shuffle" part of the quasi-shuffle product, while the last term is a letter made from "stuffing" (1) and (2) together (hence the quasi-shuffle nature of the algebra). The "stuffing" terms can be understood as giving rise to "contact term" corrections to the lower-point rules such that the numerators constructed in this way lead to the correct amplitudes. For instance, at four points, naively attaching two three-point numerators together by shuffling would lead to $T_{(1),(2)}$ and $T_{(2),(1)}$, and the required correction to obtain the correct numerator (or equivalently amplitude) is precisely $T_{(12)}$. The product of a letter with a word is also fairly straightforward and follows the same splitting into shuffle and stuffing terms. As an example, consider the product of the letter $T_{(1)}$ with the word $T_{(2),(3),(4)}$ (3.7), whose last stuffing term contains the letter (14). Note that this product preserves the ordering of the letters in the word $T_{(2),(3),(4)}$. This is true in general: the quasi-shuffle of two words $T_{(\tau_1),(\tau_2),\ldots,(\tau_r)}$ and $T_{(\rho_1),(\rho_2),\ldots,(\rho_s)}$ preserves the ordering of the $\tau_i$ and the $\rho_j$. The general formula for the product $T_{(\tau_1),\ldots,(\tau_r)} \star T_{(\rho_1),\ldots,(\rho_s)}$ of two arbitrary words is given in (3.8) [57], where the $\tau_i$ or $\rho_i$ are now any subsets of $\{1,\ldots,n-2\}$. The notation $\sigma|_{\{\tau\}}$ means that we restrict the partition $\sigma$ onto the subset $\tau = \tau_1\cup\tau_2\cup\cdots\cup\tau_r$. There is also a recursive definition of the quasi-shuffle product [58,61] which we give for completeness in Appendix A. The quasi-shuffle product defined in this way is associative and commutative, hence we can perform the products in any order we choose.
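The shuffle and stuffing structure described above can be made concrete with a short recursive sketch in the spirit of the standard recursive definition of quasi-shuffle products. The representation (a word is a tuple of letters, a letter is a sorted tuple of labels) and the names `quasi_shuffle` and `star` are illustrative. The relative sign of the stuffing terms, `STUFF_SIGN`, is an assumption made here and should be matched to the convention of (3.8).

```python
STUFF_SIGN = -1  # assumed sign of each stuffing; set to +1 for the unsigned convention


def quasi_shuffle(u, v):
    """Quasi-shuffle of two words u and v, returned as {word: coefficient}."""
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    a, b = u[0], v[0]
    result = {}
    # shuffle terms: the first letter of the result comes from u or from v
    for w, c in quasi_shuffle(u[1:], v).items():
        result[(a,) + w] = result.get((a,) + w, 0) + c
    for w, c in quasi_shuffle(u, v[1:]).items():
        result[(b,) + w] = result.get((b,) + w, 0) + c
    # stuffing term: the two first letters are merged into a single letter
    merged = tuple(sorted(a + b))
    for w, c in quasi_shuffle(u[1:], v[1:]).items():
        result[(merged,) + w] = result.get((merged,) + w, 0) + STUFF_SIGN * c
    return result


def star(x, y):
    """Bilinear extension of the quasi-shuffle to linear combinations of words."""
    result = {}
    for u, cu in x.items():
        for v, cv in y.items():
            for w, c in quasi_shuffle(u, v).items():
                result[w] = result.get(w, 0) + cu * cv * c
    return {w: c for w, c in result.items() if c != 0}


T1, T2, T3 = {((1,),): 1}, {((2,),): 1}, {((3,),): 1}
print(star(T1, T2))                 # two shuffles and one stuffing term
print(len(star(star(T1, T2), T3)))  # number of words in the five-point case
```

Independently of the sign choice, the product is commutative and associative, and the product of a letter with a word of length three contains four shuffle terms and three stuffing terms.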

Mapping to the pre-numerator
After constructing the algebraic pre-numerator $\widehat{N}(12\ldots n-2)$ using (3.8), we then use the angle-bracket map to obtain the pre-numerator (from which we finally obtain BCJ numerators as e.g. in (3.2)). The key step is clearly the ⟨•⟩ map, which we now discuss.
The angle-bracket map is simple for the special case of a single-label generator, while for a generic generator $T_{(\tau_1),\ldots,(\tau_r)}$ it is given by (3.10) [57], where the length of $\tau_1\cup\cdots\cup\tau_r$ is $n-2$. Here $\Theta(\tau_i)$ is the subset of labels in $\tau_1\cup\cdots\cup\tau_{i-1}$ which are smaller than the first label in $\tau_i$ for the canonical ordering $1, 2, \ldots, n-2$, and $\tau_1[1]$ is the first label in $\tau_1$. As an example, for $T_{(13),(25),(46)}$ we have $\Theta(25) = \{1\}$ and $\Theta(46) = \{1,2,3\}$, which can also be seen by locating all the labels to the south-west of the labels 2 and 4 in the "musical diagram" of [57]. We have also defined $V_{\tau}^{\mu\nu} = v^{\mu} p_{\tau}^{\nu}$, and $F_{\tau_i}$ is the product of linearised field strengths $F_j$ for all $j\in\tau_i$, where $\tau_i$ is ordered with respect to the canonical ordering $1, 2, \ldots, n-2$. For example, $F_{123} = F_1\cdot F_2\cdot F_3$. Finally, if any of the sets $\Theta(\tau_i)$ happens to be empty, then we set that generator to zero.
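The sets Θ(τ_i) entering the map can be computed mechanically from the definition given above. The following minimal sketch covers only the canonical ordering and only the Θ sets themselves, not the full map ⟨•⟩; the function name `theta` and the tuple representation of words are illustrative.

```python
def theta(word, i):
    """Theta(tau_i) for the canonical ordering.

    `word` is a tuple of letters (tuples of labels) and `i` is a 0-based position.
    Returns the labels in the letters before position i that are smaller than
    the first label of the i-th letter.
    """
    first_label = min(word[i])  # first label of tau_i in the canonical ordering
    earlier = [label for letter in word[:i] for label in letter]
    return sorted(label for label in earlier if label < first_label)


word = ((1, 3), (2, 5), (4, 6))  # the subscript of T_{(13),(25),(46)}
print(theta(word, 1))  # Theta(25)
print(theta(word, 2))  # Theta(46)
```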
Note that here we always consider the canonical gluon ordering $1, 2, \ldots, n-2$ when we take the map ⟨•⟩. The ordering therefore enters only through the map: the algebraic pre-numerator $\widehat{N}(12\ldots n-2)$ is symmetric in the gluon labels, while the kinematic pre-numerator $N(12\ldots n-2, v)$ is not. In Section 5 we will consider an extension to the algebra which includes arbitrary gluon orderings; for now we will use the canonical ordering.
For the first few lower-point cases, the pre-numerators are given explicitly; see (3.14). In those expressions, we have made explicit the terms that are mapped to zero according to the rules (3.10). From these examples it seems as though there is a large redundancy in the pre-numerator, since many terms are set to zero by the map. Despite this redundancy, we shall see in the next section that these terms are essential for linking the coproduct to the factorisation behaviour of the pre-numerator.

4 Kinematic Hopf algebra for the pre-numerator

The quasi-shuffle product described in the last section gives us an algorithmic way to produce n-point pre-numerators and hence all BCJ numerators in the HEFT. The quasi-shuffle algebra can be extended to a bialgebra with the introduction of two new operations, a coproduct ∆ and a counit ε, which must satisfy specific compatibility conditions we will detail later in this section. Additionally, there exists another operation, the antipode S, which further extends this bialgebra to a Hopf algebra, and is also subject to various compatibility conditions. In this section we will describe each one of these new operations in turn, starting with the coproduct and its relation to (multiple) factorisation of the pre-numerators.

The coproduct and factorisation
The coproduct is a linear map from the space A of generators $T_{\omega}$, introduced in Section 3.1, to the tensor product space A⊗A. For the quasi-shuffle product, the coproduct is well known [58,61] and can be derived from two key relations. First, we define the coproduct of the simplest object, a letter,

$$\Delta(T_{(\tau)}) = \mathbb{1}\otimes T_{(\tau)} + T_{(\tau)}\otimes \mathbb{1}\,, \tag{4.1}$$

where 𝟙 is the identity element of ⋆, the "empty" word. Then we extend the definition to products of letters using a defining property of the coproduct, namely compatibility with the quasi-shuffle product,

$$\Delta(T_{\omega}\star T_{\rho}) = \Delta(T_{\omega})\star\Delta(T_{\rho})\,, \tag{4.2}$$

where the extension of the product to tensors is defined naturally as $(A\otimes B)\star(C\otimes D) = (A\star C)\otimes(B\star D)$. We can represent the above equation (and similar equations later on) diagrammatically, with a red dot denoting the coproduct and a black dot the quasi-shuffle product.
Note that such diagrams should be read from top to bottom.
Using (4.1) and (4.2) we can deduce the coproduct of the algebraic pre-numerator, (4.4). Additionally, we can consider the coproduct of an arbitrary word (which will in general not be a product of other words), (4.5). Equation (4.4) is usually the fastest way to compute the coproduct, but first it is worth using the explicit definition (4.5) in an example. Consider the coproduct of the four-point algebraic pre-numerator, (4.6). We can also introduce ∆′, the reduced coproduct, which simply removes all trivial terms involving the identity 𝟙. In this example there are two important points to emphasise:
1. The coproduct has the general property of splitting up words (and algebraic pre-numerators) which, we will see, is reminiscent of factorisation on the massive propagators $v\cdot p_{\tau}$.
2. The non-trivial terms in the coproduct come precisely from the terms $T_{(1),(2)}$ and $T_{(2),(1)}$, which did not contribute to the pre-numerator upon taking the map ⟨•⟩ in (3.12).
This is true in general: we can derive an expression for the coproduct of the n-point algebraic pre-numerator in terms of tensor products of lower-point ones,

$$\Delta\big(\widehat{N}(12\ldots n-2)\big) = \sum_{\substack{\sigma_L\cup\sigma_R=\{1,\ldots,n-2\} \\ \sigma_L\cap\sigma_R=\emptyset}} \widehat{N}(\sigma_L)\otimes\widehat{N}(\sigma_R)\,, \tag{4.8}$$

where here we allow $\sigma_L$ and $\sigma_R$ to be empty and define $\widehat{N}(\emptyset) := \mathbb{1}$. Thus if we demand $\sigma_L$, $\sigma_R$ to be non-empty in the above equation we get $\Delta'(\widehat{N}(123\ldots n-2))$ instead. The proof of the above formula comes directly from the relation (4.4). Indeed, assuming that the (n−1)-point pre-numerator takes the form above, the n-point case follows by induction, as in (4.9).

To make the connection between the coproduct and factorisation precise we need to extend the map ⟨•⟩ to include tensor structures. To do this let us first consider the factorisation property of the four-point pre-numerator (4.10), whose factorisation as $v\cdot p_1\to 0$ is characterised by the corresponding residue. More generally, the pre-numerator has the factorisation behaviour (4.12), proven in [57], obtained from the residues at the poles $v\cdot p_{1\tau}=0$, where $\tau$ and $\omega$ are subsets of $\{2,3,\ldots,n-2\}$ such that $\tau\cup\omega=\{2,3,\ldots,n-2\}$, and $|\tau|$, $|\omega|$ are their respective lengths. Again, $\Theta(\omega)$ is the set of labels in $\{1,2,\ldots,n-2\}$ which are less than $\omega[1]$. Note that in the HEFT due to momentum conservation we have $v\cdot p_{1\ldots n-2}=0$; for example at five points we could write $v\cdot p_{12} = -v\cdot p_3$. Thus, in the above we have included leg 1 in $v\cdot p_{1\tau}$ to fix such ambiguities.

Now we come to a crucial point: each one of these massive poles can be matched to a term in the coproduct if we define a replacement rule C for tensor products as in (4.13), where $\Theta$ is defined in the same way as before, and if $\Theta(\omega)=\emptyset$ then we set $p_{\emptyset}=0$. Note that the algebraic generators have not yet been evaluated, and should be treated as commuting objects.
Finally, to map to physical quantities we must apply the angle bracket ⟨•⟩ to the generators, as in (4.14). To make the connection between factorisation and the coproduct, we must take the residue at the relevant poles before we perform the angle bracket map to kinematics; this gives a relation valid for any subset $\tau\subset\{2,\ldots,n-2\}$. We also note that the pre-numerator has poles which are not colour ordered, e.g. when $v\cdot p_{13}=0$, and therefore the residues at these poles must cancel when we combine these pre-numerators into colour-ordered amplitudes.
In summary, the coproduct decomposes a pre-numerator into a sum of terms, each of which corresponds to a factorisation channel on a single massive pole. It is useful to illustrate the connection between coproducts and factorisation in some examples.
n=4. In this case, we calculated the coproduct of the algebraic pre-numerator in (4.6). Mapping this result to physical quantities using (4.13) we obtain (4.16). Taking the residue at the pole $v\cdot p_1=0$ and mapping to physical quantities, we obtain a right-hand side that is nothing but the factorisation behaviour of the four-point pre-numerator in (4.10).
n=5. At five points the algebraic pre-numerator is given in (3.14). Proceeding as before, we act on its coproduct with the replacement rules C as given in (4.13), to find that only $\widehat{N}(1)\otimes\widehat{N}(23)$, $\widehat{N}(13)\otimes\widehat{N}(2)$, and $\widehat{N}(12)\otimes\widehat{N}(3)$ are non-vanishing. If we now take the residue at the pole, say, $v\cdot p_{13}=0$, and apply the map ⟨•⟩, we obtain the expected factorisation of the pre-numerator in the limit $v\cdot p_{13}\to 0$.
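The purely algebraic part of these examples, namely that the coproduct of an algebraic pre-numerator splits into tensor products of lower-point ones, can be checked with a small sketch. It assumes the deconcatenation coproduct on words, a recursive quasi-shuffle, and an illustrative sign convention `STUFF_SIGN` for the stuffing terms (the check itself does not depend on that sign). The replacement rule C and the map ⟨•⟩ are not implemented, and all names are hypothetical.

```python
from itertools import combinations

STUFF_SIGN = -1  # assumed sign of each stuffing term


def quasi_shuffle(u, v):
    """Quasi-shuffle of two words (tuples of letters), as {word: coefficient}."""
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    a, b = u[0], v[0]
    merged = tuple(sorted(a + b))
    result = {}
    for head, rest, sign in (((a,), quasi_shuffle(u[1:], v), 1),
                             ((b,), quasi_shuffle(u, v[1:]), 1),
                             ((merged,), quasi_shuffle(u[1:], v[1:]), STUFF_SIGN)):
        for w, c in rest.items():
            result[head + w] = result.get(head + w, 0) + sign * c
    return result


def star(x, y):
    """Bilinear extension of the quasi-shuffle product."""
    result = {}
    for u, cu in x.items():
        for v, cv in y.items():
            for w, c in quasi_shuffle(u, v).items():
                result[w] = result.get(w, 0) + cu * cv * c
    return {w: c for w, c in result.items() if c != 0}


def n_hat(labels):
    """Algebraic pre-numerator: product of the letters T_(i); empty product = identity."""
    result = {(): 1}
    for i in labels:
        result = star(result, {((i,),): 1})
    return result


def coproduct(x):
    """Deconcatenation coproduct; keys of the result are (left word, right word)."""
    result = {}
    for word, c in x.items():
        for k in range(len(word) + 1):
            key = (word[:k], word[k:])
            result[key] = result.get(key, 0) + c
    return result


def sum_over_splits(labels):
    """Sum of n_hat(left) (x) n_hat(right) over all splits of the labels into two sets."""
    result = {}
    for size in range(len(labels) + 1):
        for left in combinations(labels, size):
            right = tuple(i for i in labels if i not in left)
            for u, cu in n_hat(left).items():
                for v, cv in n_hat(right).items():
                    result[(u, v)] = result.get((u, v), 0) + cu * cv
    return result


def reduced(t):
    """Drop the trivial terms that contain the identity (the empty word)."""
    return {key: c for key, c in t.items() if key[0] and key[1]}


print(reduced(coproduct(n_hat((1, 2)))))
print(coproduct(n_hat((1, 2, 3))) == sum_over_splits((1, 2, 3)))
```

At four points the reduced coproduct retains only the two tensor products built from the single letters, which originate from the two shuffle words.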
We now move on to discuss iterated coproducts and their connection to generalised factorisation.

Generalised factorisation and iterated coproducts
We now examine the factorisation of the pre-numerators when we take multiple massive propagators on shell; we call this generalised factorisation. Interestingly, the result will depend on the order in which we take the limits, in contrast to amplitudes, where the order of limits is irrelevant. As an example consider the five-point pre-numerator $N(123, v)$. There are three massive poles we can consider here, which we write as $v\cdot p_1\to 0$, $v\cdot p_{12}\to 0$ and $v\cdot p_{13}\to 0$ (recall that $v\cdot p_{123}=0$ here so that $v\cdot p_2 = -v\cdot p_{13}$). If we first take $v\cdot p_1\to 0$ we can use (4.12); then taking $v\cdot p_2\to 0$, which is now equivalent to $v\cdot p_{12}\to 0$, we obtain (4.23). Note that the labels appearing in $\mathrm{Res}(v\cdot p_2, v\cdot p_1)$ are ordered. Alternatively, we could first take the limit $v\cdot p_{12}\to 0$ and then $v\cdot p_1\to 0$. Comparing the result with (4.23) we see that although the pre-numerator has been split into the same pre-numerators ($N(1,v)$, $N(2,v)$, $N(3,v)$), the momentum products differ. All of the different choices of multiple factorisations of $N(123, v)$ can be summarised in a single diagram. This process continues for higher-point pre-numerators, and in general the order in which the limits are taken is important. By applying the coproduct to a pre-numerator several times we aim to build an object which can recreate this factorisation structure by taking multiple residues, as we now discuss.
If we apply k coproducts to a generator we land in the (k+1)-fold tensor product space of A, which we write as $A^{\otimes(k+1)}$. A k-iterated coproduct is denoted as $\Delta^{(k)}$ and defined recursively in (4.26). For example, in evaluating $\Delta^{(2)} T_{(1),(2),(3)}$ we use $\Delta(\mathbb{1}) = \mathbb{1}\otimes\mathbb{1}$. As before, we can introduce a reduced iterated coproduct $\Delta'^{(k)}$, which simply removes the trivial terms involving the identity. Note that the choice of $(\Delta\otimes\mathbb{1})$ instead of $(\mathbb{1}\otimes\Delta)$ in (4.26) is arbitrary, due to the co-associativity of the coproduct, a property that can also be illustrated diagrammatically (4.30). Using the general form of a single coproduct of the algebraic pre-numerators (4.8), and the recursive definition for the iterated coproduct $\Delta^{(k)}$ in (4.26), we can deduce the action of $\Delta^{(k)}$ on a generic algebraic pre-numerator, where again we allow any of the $\tau_i$ to be empty, and we can obtain the expression for $\Delta'^{(k)}$ by demanding that the $\tau_i$ be non-empty.
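Co-associativity can be illustrated on a single word with a few lines of code. The sketch below assumes the deconcatenation coproduct on words and compares the two ways of applying a second coproduct; the function names are illustrative.

```python
def coproduct(word):
    """All ways of cutting a word into a left part and a right part."""
    return [(word[:k], word[k:]) for k in range(len(word) + 1)]


def delta2_left(word):
    """Second coproduct applied to the left tensor factor."""
    return sorted((a, b, right)
                  for left, right in coproduct(word)
                  for a, b in coproduct(left))


def delta2_right(word):
    """Second coproduct applied to the right tensor factor."""
    return sorted((left, a, b)
                  for left, right in coproduct(word)
                  for a, b in coproduct(right))


def reduced(terms):
    """Keep only the terms in which no tensor factor is the empty word."""
    return [t for t in terms if all(t)]


word = ((1,), (2,), (3,))  # the subscript of T_{(1),(2),(3)}
print(delta2_left(word) == delta2_right(word))
print(len(delta2_left(word)))
print(reduced(delta2_left(word)))
```

For a word of three letters the reduced twice-iterated coproduct keeps a single term, in which each tensor factor is one letter.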
We can now naturally extend the replacement rule C defined in (4.13) to include an arbitrary number of tensor products, where $\Theta(\omega_i)$ is the set of elements in $\omega_1\cup\cdots\cup\omega_{i-1}$ which are less than the first element of $\omega_i$ in the canonical ordering $(1, 2, \ldots, n-2)$. Now we can explicitly obtain k factorisations of a pre-numerator from the k-iterated coproduct by taking the residues on the appropriate poles and applying the map ⟨•⟩, where once again the residues are ordered and taken right to left.
As a nontrivial example, consider the five-point case. After applying the coproduct, we can identify which pieces contribute to factorisation by using the replacement rule C defined in (4.13). Extracting from the contributing terms the residue corresponding to $v\cdot p_1\to 0$ and then $v\cdot p_2\to 0$, and acting with the map ⟨•⟩, we obtain the factorisation (4.37). The other two factorisations can be obtained similarly. It is easy to check that the three residues are not linearly independent, which is a consequence of the global residue theorem of multi-dimensional integrals [85].
As a final comment, we note that for an n-point amplitude we can have non-trivial iterated coproducts $\Delta^{(k)}$ with $k = 2, \ldots, n-3$. In the context of multiple polylogarithms, $\Delta^{(n-3)}$ corresponds to what is usually known as the symbol [86]. Note however that in the case of iterated integrals, the relevant algebra is a shuffle algebra [87].

The counit
To obtain the quasi-shuffle bialgebra we introduce the counit ε, defined by its action on the generators $T_{\omega}$ and the identity 𝟙:

$$\epsilon(T_{\omega}) = 0\,, \qquad \epsilon(\mathbb{1}) = 1\,. \tag{4.40}$$

The counit can be used to "undo" the action of the coproduct, where by ⋆ we mean $\star(T_{\omega_1}\otimes T_{\omega_2}) = T_{\omega_1}\star T_{\omega_2}$. Again, the content of this statement can be shown using diagrams (4.42). One can also check directly that the counit is compatible with the quasi-shuffle product. Unlike the coproduct, the counit does not have a particularly useful interpretation in terms of a property of the pre-numerator, since it simply maps the algebraic pre-numerator to zero, $\epsilon(\widehat{N}(12\ldots n-2)) = 0$, which is easily seen from the definition (4.40).

The antipode
The final operation we need to introduce to form a Hopf algebra is the antipode S, which is a linear map from the space of generators A to itself. Its defining property, which can also be described diagrammatically, is the following: we can take a generator and first split it up via the coproduct, then act on one of the tensor product factors with S, and then recombine with the product. The result of this operation is that all generators except for the identity are mapped to zero as per (4.40). The explicit action of the antipode on a generator is defined recursively in (4.48). Using this, one can check the required compatibility properties (4.49). To find the action of the antipode on the algebraic pre-numerator we first use (4.48) to find the antipode of a single letter, then we use the first relation of (4.49) to find

$$S\big(\widehat{N}(12\ldots n-2)\big) = S\big(T_{(1)}\star T_{(2)}\star T_{(3)}\star\cdots\star T_{(n-2)}\big) = S(T_{(n-2)})\star S(T_{(n-3)})\star S(T_{(n-4)})\star\cdots\star S(T_{(1)})\,.$$

Evaluating the antipode of each letter and using the fact that the quasi-shuffle product is commutative, we see that the antipode changes the pre-numerator by at most an overall sign. This seems to suggest that, like the counit, there is no useful interpretation of the antipode. However, we shall see in the next section that we can extend the quasi-shuffle algebra to a non-abelian version, in which case the antipode becomes non-trivial.
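The statement that the antipode changes the algebraic pre-numerator by at most a sign can be tested in a small sketch. It assumes the deconcatenation coproduct, so that the defining property of the antipode yields the recursion S(w) = −Σ_{k<r} S(w_1…w_k) ⋆ (w_{k+1}…w_r) for a word of length r, with S(𝟙) = 𝟙. This recursion is a standard consequence of the defining property and is not necessarily the form used in (4.48). The sign `STUFF_SIGN` of the stuffing terms is again an assumed convention, and all names are illustrative.

```python
STUFF_SIGN = -1  # assumed sign of each stuffing term


def quasi_shuffle(u, v):
    """Quasi-shuffle of two words (tuples of letters), as {word: coefficient}."""
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    a, b = u[0], v[0]
    merged = tuple(sorted(a + b))
    result = {}
    for head, rest, sign in (((a,), quasi_shuffle(u[1:], v), 1),
                             ((b,), quasi_shuffle(u, v[1:]), 1),
                             ((merged,), quasi_shuffle(u[1:], v[1:]), STUFF_SIGN)):
        for w, c in rest.items():
            result[head + w] = result.get(head + w, 0) + sign * c
    return result


def star(x, y):
    """Bilinear extension of the quasi-shuffle product."""
    result = {}
    for u, cu in x.items():
        for v, cv in y.items():
            for w, c in quasi_shuffle(u, v).items():
                result[w] = result.get(w, 0) + cu * cv * c
    return {w: c for w, c in result.items() if c != 0}


def n_hat(labels):
    """Algebraic pre-numerator: product of the letters T_(i)."""
    result = {(): 1}
    for i in labels:
        result = star(result, {((i,),): 1})
    return result


def antipode_word(word):
    """Antipode of a single word via the recursion stated in the lead-in."""
    if not word:
        return {(): 1}
    result = {}
    for k in range(len(word)):
        for w, c in star(antipode_word(word[:k]), {word[k:]: 1}).items():
            result[w] = result.get(w, 0) - c
    return {w: c for w, c in result.items() if c != 0}


def antipode(x):
    """Linear extension of the antipode."""
    result = {}
    for word, c in x.items():
        for w, cw in antipode_word(word).items():
            result[w] = result.get(w, 0) + c * cw
    return {w: c for w, c in result.items() if c != 0}


def negate(x):
    return {w: -c for w, c in x.items()}


print(antipode({((1,),): 1}))                          # antipode of a single letter
print(antipode(n_hat((1, 2))) == n_hat((1, 2)))        # even number of letters
print(antipode(n_hat((1, 2, 3))) == negate(n_hat((1, 2, 3))))  # odd number of letters
```

In this sketch the overall sign is fixed by the number of letters multiplied together, which is consistent with the statement in the text.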

The non-abelian quasi-shuffle Hopf algebra
In Section 3 we laid out properties of the quasi-shuffle product used to build pre-numerators and, in turn, BCJ numerators. As an explicit example consider the four-point BCJ numerator $N([1,2], v)$, in whose evaluation we use $(p_1+p_2)\cdot v = 0$.
Note that in this example the pre-numerator $N(12, v)$ is not symmetric in the labels 1 and 2, whereas the algebraic pre-numerator $\widehat{N}(12) = T_{(1)}\star T_{(2)}$ is symmetric since the quasi-shuffle product is commutative. The ordering of labels (1, 2) is only imposed on the pre-numerator once we take the angle bracket map ⟨•⟩ given in (3.10). This raises the interesting question of whether we can incorporate arbitrary orderings of legs, beyond the canonical ordering $1, 2, \ldots, n-2$, into the algebra itself while preserving the Hopf algebra structure.
As we now demonstrate it is possible to extend the quasi-shuffle product to become non-abelian by endowing our generators with additional dependence on the overall ordering of the massless particles denoted by α ∈ S n−2 . First steps in this direction were taken in the appendix of the prequel paper [57], but here we will present the complete construction, the non-abelian kinematic Hopf algebra, and find, in particular, an amplitude interpretation for its coproduct and the antipode. The important result will be the ability to compute BCJ numerators directly at the level of the non-abelian quasi-shuffle algebra.

Extended generators
The first important step is to introduce extended generators with ordering α ∈ S_{n−2} of the form T^(α)_(τ_1),...,(τ_r) := r^(α) ⊗ T_(τ_1),...,(τ_r), (5.3) where the τ_i are subsets of {1, 2, . . . , n−2}. In practice the actual labels appearing in the subscript τ = τ_1 ∪ · · · ∪ τ_r will be a subset of α, for example T^(12435)_(12),(34). These new generators correspond to a tensor product extension of the underlying vector space of the quasi-shuffle algebra. As implied by (5.3), we extend the generators involving indices {1, 2, . . . , n−2} by tensoring with the free algebra over the elements of {1, 2, . . . , n−2}. This algebra is generated by taking products and linear combinations of the n−2 elements r^(i) for i ∈ {1, 2, . . . , n−2}.
As a free algebra, this product simply consists of concatenating indices, r^(α) · r^(β) := r^(αβ), (5.4) and in general builds objects with indices consisting of words taken from the alphabet {1, 2, . . . , n−2}. It is clear from its definition that this product is non-abelian. For now this is the only structure we will impose on the orderings; however, we shall add slightly more in the next section. The fusion product of the extended generators is defined similarly and is induced through their definition (5.3) in terms of a tensor structure and through the previously defined products (3.6) and (5.4). It is given by T^(α)_ω ⋆ T^(β)_ω′ = (r^(α) · r^(β)) ⊗ (T_ω ⋆ T_ω′). (5.5) This new product corresponds to the tensor extension of the original product via ⋆ → · ⊗ ⋆, where for convenience we use the same symbol for both.
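The concatenation product (5.4) can be illustrated with a minimal sketch. The encoding of an ordering r^(α) as a Python tuple of labels, and the names `concat` and `IDENTITY`, are assumptions made purely for illustration and do not appear in the text.

```python
# Illustrative encoding (an assumption): the ordering r^(alpha) is a tuple of labels.
IDENTITY = ()  # the empty word acts as the unit of the concatenation product


def concat(alpha, beta):
    """Concatenation product r^(alpha) . r^(beta) = r^(alpha beta), cf. (5.4)."""
    return tuple(alpha) + tuple(beta)


r12 = concat((1,), (2,))
r21 = concat((2,), (1,))
print(r12, r21, r12 == r21)  # the two products differ: the product is non-abelian
```

In this encoding associativity and the unit property are inherited directly from tuple concatenation, while commutativity visibly fails.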
The actual fusion products we will perform in order to build a BCJ numerator will involve orderings α and β of disjoint sets, hence the product αβ will also be an ordering of their union.
This extension of the quasi-shuffle algebra is non-abelian since the product on the superscripts is non-abelian: for instance, T^(1)_(1) ⋆ T^(2)_(2) ≠ T^(2)_(2) ⋆ T^(1)_(1), since r^(12) ≠ r^(21). We can now build algebraic pre-numerators with any ordering σ of the labels {1, 2, . . . , n−2} by taking successive products of extended generators, and further we can define algebraic BCJ numerators by simply expanding in terms of algebraic pre-numerators. For example, at five points one has N([[1, 2], 3]) = N(123) − N(213) − N(312) + N(321).
These algebraic pre-numerators can then be mapped to physical pre-numerators using a map which we again denote by ⟨•⟩. If we recall the original definition (3.10) of ⟨•⟩, we can extend it to arbitrary orderings, where now Θ^(α)(τ_i) is the subset of τ_1 ∪ · · · ∪ τ_{i−1} whose elements are less than the first leg in τ_i with respect to the ordering α. For example, for T^(254163)_(13),(25),(46) we have Θ^(254163)(46) = {2, 5}. If any of the sets Θ^(α)(τ_i) happens to be empty then we map that generator to zero. The expression F_{τ_j} denotes an ordered product of F_i for i ∈ τ_j, where the τ_j are also ordered with respect to the ordering α.
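The set Θ^(α)(τ_i) described above can be computed mechanically. The following sketch assumes that "less than, with respect to α" means "appears earlier in α", and that the first leg of τ_i is its element appearing earliest in α; the function name `theta` and the tuple encoding are hypothetical.

```python
def theta(alpha, taus, i):
    """Theta^(alpha)(tau_i) for a 1-based index i.

    Returns the labels of tau_1 ∪ ... ∪ tau_{i-1} that appear in the ordering
    alpha before the first leg of tau_i (assumed reading of the definition).
    """
    position = {label: k for k, label in enumerate(alpha)}
    first_leg = min(taus[i - 1], key=position.__getitem__)
    earlier_labels = set().union(*taus[: i - 1])
    return {x for x in earlier_labels if position[x] < position[first_leg]}


alpha = (2, 5, 4, 1, 6, 3)
taus = [(1, 3), (2, 5), (4, 6)]
print(theta(alpha, taus, 3))  # the example quoted in the text: {2, 5}
```

For the same generator, `theta(alpha, taus, 2)` is empty under this reading, since leg 2 is the first entry of α.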

The Hopf algebra of orderings
The extension of the quasi-shuffle algebra described above can also be turned into a Hopf algebra by defining a new coproduct, counit and antipode. This can be achieved by simply tensoring the original Hopf algebra of quasi-shuffles with another Hopf algebra of orderings which acts on the r (α) , as we now discuss.
The coproduct: To begin building such a Hopf algebra of orderings, we define a coproduct for the concatenation product (5.4). As in the quasi-shuffle case, the coproduct is a map from the space of orderings O to the tensor product space O ⊗ O. It turns out that there are many different choices for it. As before, we can define the coproduct by its action on the simplest object, r^(i), and then extend to all other elements using linearity and compatibility with the product: ∆(r^(α) · r^(β)) = ∆(r^(α)) · ∆(r^(β)). We will also abuse notation here and use the same symbol for the coproduct as in the quasi-shuffle Hopf algebra.
To explain the possible choices of coproduct we start with the following simple ansatz, ∆(r^(i)) = a (𝟙 ⊗ r^(i) + r^(i) ⊗ 𝟙) + b r^(i) ⊗ r^(i), where a and b are arbitrary constants, at least one of which is non-zero. In order to satisfy co-associativity, and compatibility with a non-trivial counit and antipode (see (4.42) and (4.47)), we require the coproduct to be symmetric between the left and right parts of the tensor product. If we consider the case where b = 0 and a = 1, the coproduct will not preserve the ordering of legs in the pre-numerator, for example: ∆_{a=1,b=0}(r^(12)) = 𝟙 ⊗ r^(12) + r^(1) ⊗ r^(2) + r^(2) ⊗ r^(1) + r^(12) ⊗ 𝟙. (5.11) The whole point of introducing the extended algebra was to make the ordering of particles manifest, so instead we will choose a = 0 and b = 1, which gives for a general ordering: ∆(r^(α)) := ∆_{a=0,b=1}(r^(α)) = r^(α) ⊗ r^(α). (5.12)
It is straightforward to show that this definition also satisfies co-associativity as per diagram (4.30).
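The two choices of coproduct can be compared in a small sketch. It assumes that the ansatz reads ∆(r^(i)) = a (𝟙 ⊗ r^(i) + r^(i) ⊗ 𝟙) + b r^(i) ⊗ r^(i), a form consistent with (5.11) and (5.12), and it extends ∆ to words multiplicatively. Elements of O ⊗ O are encoded as dictionaries mapping a pair of tuples to a coefficient; all names are illustrative.

```python
from collections import defaultdict


def delta_letter(i, a, b):
    """Assumed ansatz: a (1 ⊗ r^(i) + r^(i) ⊗ 1) + b r^(i) ⊗ r^(i)."""
    out = defaultdict(int)
    out[((), (i,))] += a
    out[((i,), ())] += a
    out[((i,), (i,))] += b
    return {k: c for k, c in out.items() if c != 0}


def tensor_mul(x, y):
    """Product on O ⊗ O: concatenate orderings in each tensor slot."""
    out = defaultdict(int)
    for (l1, r1), c1 in x.items():
        for (l2, r2), c2 in y.items():
            out[(l1 + l2, r1 + r2)] += c1 * c2
    return {k: c for k, c in out.items() if c != 0}


def delta(alpha, a, b):
    """Coproduct of r^(alpha), using Delta(x . y) = Delta(x) . Delta(y)."""
    result = {((), ()): 1}
    for i in alpha:
        result = tensor_mul(result, delta_letter(i, a, b))
    return result


print(delta((1, 2), a=1, b=0))     # four terms, two of which split the ordering
print(delta((1, 2, 3), a=0, b=1))  # a single term r^(123) ⊗ r^(123)
```

With a = 1, b = 0 the output contains the term r^(2) ⊗ r^(1), while with a = 0, b = 1 every ordering stays intact in both tensor slots.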
The counit: The counit can also be simply defined, to yield a bialgebra of the r^(α), ε(r^(α)) = 𝟙, (5.13) where again we are abusing notation and using 𝟙 to label the identity element of the product · in (5.4) and the quasi-shuffle product (3.8). This definition immediately satisfies the defining properties of the counit, the diagrams (4.42) and (4.44).
The antipode: Finally, to construct a Hopf algebra for the orderings r^(α) we need to define an antipode S, which by definition must satisfy (see diagram (4.47)) ·(S ⊗ 𝟙)∆(r^(α)) = ·(𝟙 ⊗ S)∆(r^(α)) = ε(r^(α)). (5.14) Using our choices for the coproduct and the counit, this becomes S(r^(α)) · r^(α) = r^(α) · S(r^(α)) = 𝟙. (5.15) It is clear from the above relation that S(r^(α)) should correspond to the inverse of r^(α); however, in the free algebra of the r^(α) no such inverses exist. Additionally, the antipode must be an antihomomorphism from the space of r^(α) to itself; in particular it must satisfy S(r^(ij)) = S(r^(j)) · S(r^(i)). (5.16)
To solve both of the above relations we must include inverses and instead consider a group algebra of orderings. The antipode will then be identified with the map which sends a group element to its inverse; this is a very general construction, applicable to any group, known as a group Hopf algebra.
We began with a free algebra of orderings r^(α), and in order to introduce the antipode we require inverse elements. The most general construction we could employ is a group algebra of the free group with n−2 generators. This group algebra is constructed in a manner very similar to the free algebra, and consists of linear combinations of products of generators g in the group and inverse elements g^{−1}, while assuming only the group axioms. However, to define the group algebra which corresponds to orderings we require that the antipode of a particular ordering r^(α) is also an ordering of the same indices contained in α. As such we define the inverse of an ordering α to be the reversed ordering ᾱ: r^(α) · r^(β) = r^(αβ), r^(α) · r^(ᾱ) = 𝟙, (5.17) for example ᾱ = 321 for α = 123, and as before we can take linear combinations of orderings. Note that the above conditions imply r^(i) · r^(i) = 𝟙. Instead of a free group this is the group freely generated by n−2 self-inverse elements, where "freely" here means assuming no other group relations beyond the axioms. This group is known as the universal Coxeter group with n−2 generators. Now that we have inverses in place, it is clear that we should define the antipode of an ordering as S(r^(α)) = r^(ᾱ). (5.18) This definition also satisfies the other compatibility conditions required by the diagrams (4.50).
To summarise, we have defined a coproduct, counit and antipode for the algebra of orderings with product given by (5.4). The resulting Hopf algebra has the interesting property of requiring an underlying group algebra structure of the universal Coxeter group, such that the antipode acting on an element is identified with the inverse element given by the reversed ordering. This is known as a group Hopf algebra and is a general construction that can be made for any group.
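The group structure summarised above can be sketched as follows. Words in the self-inverse generators are encoded as tuples, multiplication is concatenation followed by cancellation of adjacent equal letters (implementing r^(i) · r^(i) = 𝟙), and the antipode reverses the word. The function names are hypothetical and the encoding is only an illustration.

```python
def reduce_word(word):
    """Cancel adjacent repeated letters using r^(i) . r^(i) = 1."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter:
            stack.pop()
        else:
            stack.append(letter)
    return tuple(stack)


def mul(alpha, beta):
    """Product in the universal Coxeter group: concatenate, then reduce."""
    return reduce_word(tuple(alpha) + tuple(beta))


def antipode(alpha):
    """S(r^(alpha)) is the reversed ordering, cf. (5.18)."""
    return tuple(reversed(alpha))


alpha = (1, 2, 3)
print(mul(antipode(alpha), alpha))  # () : the identity element
print(antipode(mul((1, 2), (3,))) == mul(antipode((3,)), antipode((1, 2))))
```

The first line checks that the reversed ordering is the inverse, and the second checks the antihomomorphism property on one example.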

Combining the Hopf algebras
Equipped with a Hopf algebra of orderings, we can now describe the Hopf algebra structure of the non-abelian quasi-shuffle algebra by tensoring the ordering algebra O with the quasi-shuffle algebra A. As advertised at the beginning of this section, and much like the fusion product (5.5) for the extended generators, the coproduct, counit and antipode of the extended algebra are simply given by a product structure.
The coproduct on the extended generators is ∆(T^(α)_ω) = (𝟙 ⊗ T ⊗ 𝟙)(∆(r^(α)) ⊗ ∆(T_ω)), (5.19) where T : O ⊗ A → A ⊗ O is a transposition operator which swaps the entries in the tensor product. It is then an immediate consequence that the coproduct (5.19) is compatible with the product defined in (5.5) and therefore satisfies the diagram (4.3). We also again define the reduced coproduct ∆′, which removes all trivial generators without subscripts, i.e. T^(α).
Like the coproduct, the counit is extended such that it acts on both the generators T_ω and the ordering r^(α). That is, ε(T^(α)_ω) = 0, except when T_ω = 𝟙, in which case we have ε(T^(α)) = 𝟙. Again it is trivial to check that this counit is compatible with the product (5.5) as required by the diagram (4.44).

(5.23)
It is interesting to note that the antipode acting on an algebraic BCJ numerator simply mirrors the associated cubic graph up to an overall minus sign; for example, its action on a left-nested BCJ numerator is given in (5.24). We can also construct an algebraic amplitude by replacing the kinematic BCJ numerator with the algebraic one. The HEFT amplitude (2.6) satisfies the reflection identity (2.12), which is inherited from the well-known reflection identity of gluon-scalar amplitudes before taking the heavy-mass limit. We expect the algebraic amplitude to satisfy its own reflection identity, generated by the action of the antipode, which we now describe.
First, we define an operator S̃ which consists of the antipode S and a reversal of the labels in the massless propagators, hence giving the propagators d_Γ̃ of the mirrored graph Γ̃. Then, under the action of S̃, the algebraic amplitude picks up an overall sign, where we have first used the second line of (5.24), and then the fact that the HEFT amplitude is a sum over all cubic graphs, and hence every BCJ numerator appears together with its mirror. The action of S̃ simply multiplies by a factor of −1.
The action of S̃ on algebraic HEFT amplitudes can also be written in a form which reproduces the reflection identity (2.12) for the algebraic HEFT amplitude and relates it to the action of the antipode.
In summary, in this section we have constructed a non-abelian quasi-shuffle Hopf algebra which allows us to treat all possible colour orderings on the same footing, unlike its abelian counterpart, which only applied to the canonical ordering (1, 2, . . . , n−2). As a bonus we were able to give physical interpretations to the various operations of the Hopf algebra at the level of the numerators, rather than the pre-numerators. In the next section we discuss in detail how the coproduct of the new algebra is related to the factorisation of BCJ numerators.

Factorisation behaviour of BCJ numerators and amplitudes
In this section we discuss the factorisation properties of the BCJ numerators and HEFT amplitudes with gluons and gravitons. Just like the pre-numerators, the BCJ numerators and the HEFT amplitudes are formal polynomials in the T_ω, and we can use the coproduct to characterise the factorisation behaviour. To do this we must define an extension of the replacement rule C for the coproduct defined earlier in (4.13). This is given by (6.1), where we chose to restrict the ordering in the superscript to include only those indices appearing in the subscript. Note that this relation is only valid at the level of the kinematic pre-numerator and not at the level of the algebraic pre-numerator. 11 N([[1, 3], 2]). Single-cut factorisations are in one-to-one correspondence with terms in the single coproduct ∆(N(Γ)). Using (6.1) one finds

Five-point example
The single coproducts of the other two cubic graphs are related to the above by swapping the indices and by the Jacobi relation.
Following the factorisation behaviour for BCJ numerators, the factorisation behaviour of amplitudes can also be characterised by the coproduct. For example, consider the amplitude A(123, v), whose algebraic form is The double cut can be calculated similarly; we have only one independent double cut due to the missing pole at v·p 2 = 0. This agrees with the fact that the factorisation of an amplitude does not depend on the ordering in which we take the cuts, in contrast to the pre-numerator discussed in Section 4.2.

Six-point example
We now consider the six-point case, and mainly focus on the left-nested commutator [1,2,3,4]. Other commutators in the DDM basis are directly obtained by relabelling the indices of the gluons while the algebraic numerators beyond the DDM basis are then derived from the Jacobi relations.
The BCJ numerator for the left-nested commutator [1,2,3,4] has the following coproduct structure, , v) N([23], v) p_1·p_2 N([13], v) N([24], v) p_1·p_2 N([12], v) N([34], v) p_12
The factorisation behaviour of the amplitude can also be obtained from the coproduct by using the factorisation behaviour of the BCJ numerators. Considering the pole at
construction of BCJ numerators by considering commutators of pre-numerators with their labels swapped. We exactly emulated this construction of BCJ numerators in a purely algebraic manner by defining a non-abelian quasi-shuffle algebra. Then we constructed the non-abelian algebra in a simple way by tensoring the original quasi-shuffle algebra with an algebra of orderings of the gluon labels. The latter is also a Hopf algebra and is given by the group Hopf algebra of the universal Coxeter group. This non-abelian algebra has the appealing property that the action of the antipode is related to the reversal of the gluon ordering in a numerator, and in turn plays a role in the reflection identity (2.11) of the HEFT amplitude. Finally, after defining a direct algebraic construction of HEFT BCJ numerators, we have related their factorisation properties to the coproduct of the non-abelian quasi-shuffle Hopf algebra.
The results of this paper leave us with several interesting avenues of research, some of which we now outline. First, an important question is to assess the universality of the Hopf algebra. The first indication of this appears in [89], where it was found that very similar Hopf algebra structures govern the colour-kinematics duality in Yang-Mills-scalar theory with a bi-adjoint φ 3 interaction. The construction was also applied both to scattering amplitudes and form factors with an arbitrary number of gluons and massive scalars. The results of that paper point to the universality of the Hopf algebra, beyond the leading HEFT expansion. The connections between physical properties of amplitudes and mathematical structures of the Hopf algebra revealed in this paper should have an interesting application in those more general contexts.
One can also consider amplitudes with gravitons and massive particles such as fermions, vectors and even higher-spin particles, which enter the computation of the deflection angle, waveform and metric in the gravitational scattering of Kerr black holes. We expect the colour-kinematics duality and the double copy to play a significant role here, and it would be fascinating to identify a kinematic Hopf algebra in this context.
Another important direction concerns loop amplitudes. The colour-kinematics duality and double copy in Yang-Mills and gravity appear to hold also at loop level (see for instance [3]), and it would be very interesting to identify algebraic structures also in loop amplitudes, especially the loop integrands.
Finally, we are still missing an explicit realisation of the algebra in terms of differential operators, similarly to the work of [35] for the self-dual sector of Yang-Mills and gravity. This would avoid the appearance of abstract generators and the angle-bracket map. We leave these and other fascinating questions for future investigation.

A Recursive definition of the quasi-shuffle product
In this appendix we detail, for completeness, a recursive definition for the usual quasi-shuffle product [58,61]. It turns out that explicitly writing the generators T_ω leads to clumsy-looking formulas, so instead we will write relations involving only the words ω = (τ_1), . . . , (τ_r). The extension to the full generators is then immediate.

B Quasi-shuffles from shuffles
One important question is whether or not the quasi-shuffle algebra used to find pre-numerators is unique. For example, can one redefine the generators T_(τ_1)..., and change the quasi-shuffle algebra into another algebra with a different product? The answer is that one can indeed define new generators such that we obtain a shuffle algebra instead, although at the expense of making the mapping ⟨•⟩ of (3.10) much more complicated. To do this it will be necessary to introduce some extra machinery. Namely, we will need to define the action of analytic functions on abstract letters and words, which will form maps between algebras. In this appendix we briefly introduce shuffle products before explicitly relating them to quasi-shuffles. Here we apply the methods of [58,61] to our context, and adopt their notation. For example, once again we drop the T_ω notation and initially work with just the words ω = (τ_1), . . . , (τ_r).
As an example, consider the shuffle products, which should be compared to their quasi-shuffle cousins (3.6) and (3.7). Here we can see explicitly that terms like (12) etc. are not included in the shuffle product.
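The difference between the two products can be made concrete with a short recursive sketch. Words are tuples of letters, each letter a tuple of labels, and two letters merge into their sorted union. The sign of the merged term is exposed as a parameter: `sign=-1` is an assumption about the convention of the main text, while `sign=+1` gives the standard quasi-shuffle product of [58,61]. All function names are illustrative.

```python
def _add(out, word, coeff):
    """Accumulate coeff * word into the dictionary out."""
    out[word] = out.get(word, 0) + coeff


def shuffle(u, v):
    """Shuffle product of two words, as a dict {word: coefficient}."""
    if not u or not v:
        return {u + v: 1}
    out = {}
    for w, c in shuffle(u[1:], v).items():
        _add(out, (u[0],) + w, c)
    for w, c in shuffle(u, v[1:]).items():
        _add(out, (v[0],) + w, c)
    return out


def quasi_shuffle(u, v, sign=-1):
    """Quasi-shuffle product: shuffle terms plus merged-letter terms."""
    if not u or not v:
        return {u + v: 1}
    out = {}
    for w, c in quasi_shuffle(u[1:], v, sign).items():
        _add(out, (u[0],) + w, c)
    for w, c in quasi_shuffle(u, v[1:], sign).items():
        _add(out, (v[0],) + w, c)
    merged = tuple(sorted(u[0] + v[0]))  # fuse the two leading letters
    for w, c in quasi_shuffle(u[1:], v[1:], sign).items():
        _add(out, (merged,) + w, sign * c)
    return {w: c for w, c in out.items() if c != 0}


one, two = ((1,),), ((2,),)
print(shuffle(one, two))        # (1)(2) + (2)(1)
print(quasi_shuffle(one, two))  # the same two terms plus a multiple of (12)
```

The only difference between the two functions is the third recursion branch, which produces the fused letters such as (12).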

B.2 Compositions of words
To define the action of functions on words we first need to introduce the notion of a composition. Given a word of length n we define the set of compositions as follows, C(n) = {(i_1, . . . , i_k) | k ≥ 1, i_j ≥ 1, i_1 + · · · + i_k = n}. Therefore, a given composition is a list of positive integers which add up to give the length of the word. These compositions can then act on words to obtain shorter words. Given I = (i_1, i_2, . . . , i_k) and ω = (τ_1)(τ_2) . . . (τ_n) we define I[ω] = (τ_1 · · · τ_{i_1})(τ_{i_1+1} · · · τ_{i_1+i_2}) · · · (τ_{i_1+···+i_{k−1}+1} · · · τ_n). For example, the composition I = (2, 3, 1) acting on ω = (τ_1)(τ_2)(τ_3)(τ_4)(τ_5)(τ_6) is I[ω] = (τ_1 τ_2)(τ_3 τ_4 τ_5)(τ_6).
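A minimal sketch of compositions and their action on words is given below. It assumes that a composition merges consecutive letters into blocks of the given sizes, in line with the example I = (2, 3, 1); the names `compositions` and `act` and the tuple encoding of letters are illustrative.

```python
def compositions(n):
    """All ordered lists of positive integers that sum to n."""
    if n == 0:
        return [()]
    return [(first,) + rest
            for first in range(1, n + 1)
            for rest in compositions(n - first)]


def act(comp, word):
    """I[omega]: fuse consecutive letters of the word into blocks of sizes comp."""
    if sum(comp) != len(word):
        raise ValueError("composition must sum to the length of the word")
    out, start = [], 0
    for size in comp:
        block = word[start:start + size]
        out.append(tuple(label for letter in block for label in letter))
        start += size
    return tuple(out)


omega = ((1,), (2,), (3,), (4,), (5,), (6,))  # stands for (tau_1)...(tau_6)
print(compositions(3))
print(act((2, 3, 1), omega))
```

A word of length n has 2^(n−1) compositions, so the sums over compositions used in the next subsection grow quickly with the length of the word.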

B.3 Functions of words
To map the quasi-shuffle algebra to a different algebra we will need to define the action of analytic functions on words. Much like defining the action of a function on a matrix, the action of a function on words is defined using its Taylor series. Given a function of a single variable f(x) which is analytic and zero at x = 0 we write its Taylor series as f(x) = Σ_{i≥1} a_i x^i. Then we define the corresponding linear function Ψ_f on a word ω using compositions [58,61], Ψ_f(ω) = Σ_I a_{i_1} · · · a_{i_k} I[ω], where the sum runs over all compositions I = (i_1, . . . , i_k) of l(ω), and l(ω) is the length of the word ω, i.e. the number of letters it contains. This definition satisfies intuitive properties when taking compositions and inverses of functions, Ψ_f Ψ_g = Ψ_{f∘g} and (Ψ_f)^{−1} = Ψ_{f^{−1}}. Of particular importance for us are the exponential and logarithm maps, exp = Ψ_{e^x−1} and log = Ψ_{log(1+x)}, which take us between the shuffle and quasi-shuffle algebra. 12 In addition we will need a function which introduces a sign in our version of the quasi-shuffle algebra [58,61], χ(ω) = (−1)^{l(ω)} ω. 13 With these functions in hand, we can map shuffles to quasi-shuffles and vice versa as follows: for any words ω_1 and ω_2 we have χ exp χ(ω_1 ш ω_2) = χ exp χ(ω_1) ⋆ χ exp χ(ω_2), χ log χ(ω_1 ⋆ ω_2) = χ log χ(ω_1) ш χ log χ(ω_2). (B.11)
12 Note that due to linearity we have exp(ω_1 + ω_2) = exp(ω_1) + exp(ω_2) and log(ω_1 + ω_2) = log(ω_1) + log(ω_2).
13 In [58,61] this function is denoted by T; however, here we instead use χ for obvious reasons.
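The maps described above can be sketched in a few lines. The sketch assumes the construction of [58,61] in the form Ψ_f(ω) = Σ_I a_{i_1} · · · a_{i_k} I[ω], with exp built from e^x − 1, log from log(1 + x), and χ from −x, so that χ(ω) = (−1)^{l(ω)} ω. Letters are tuples of labels that merge into their sorted union, linear combinations are dictionaries, and all names are illustrative.

```python
from fractions import Fraction
from math import factorial


def compositions(n):
    """All ordered lists of positive integers that sum to n."""
    if n == 0:
        return [()]
    return [(i,) + rest for i in range(1, n + 1) for rest in compositions(n - i)]


def act(comp, word):
    """I[omega]: fuse consecutive letters into blocks (sorted union of labels)."""
    out, start = [], 0
    for size in comp:
        labels = [x for letter in word[start:start + size] for x in letter]
        out.append(tuple(sorted(labels)))
        start += size
    return tuple(out)


def psi(coeff, element):
    """Psi_f on a linear combination {word: c}; coeff(i) is the Taylor coefficient a_i."""
    out = {}
    for word, c in element.items():
        for comp in compositions(len(word)):
            weight = Fraction(c)
            for i in comp:
                weight *= coeff(i)
            fused = act(comp, word)
            out[fused] = out.get(fused, 0) + weight
    return {w: c for w, c in out.items() if c != 0}


def exp_map(x):   # f(x) = e^x - 1
    return psi(lambda i: Fraction(1, factorial(i)), x)


def log_map(x):   # f(x) = log(1 + x)
    return psi(lambda i: Fraction((-1) ** (i + 1), i), x)


def chi(x):       # f(x) = -x, i.e. a sign (-1)^length
    return psi(lambda i: Fraction(-1 if i == 1 else 0), x)


shuffle_12 = {((1,), (2,)): 1, ((2,), (1,)): 1}  # (1) shuffled with (2)
print(chi(exp_map(chi(shuffle_12))))
```

For this input the printed combination contains (1)(2), (2)(1) and the fused letter (12) with coefficient −1, which is the quasi-shuffle of the two single letters under the sign convention assumed here.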
The functions χ exp χ and χ log χ are clearly mutual inverses, and thus both are algebra isomorphisms. The equations above can be naturally extended to arbitrary chains of shuffles and quasi-shuffles. In particular, for the pre-numerator we can write (reintroducing the generator notation T_ω here) χ exp χ(T_(1) ш · · · ш T_(n)) = χ exp χ(T_(1)) ⋆ · · · ⋆ χ exp χ(T_(n)) = T_(1) ⋆ · · · ⋆ T_(n). (B.12) The above relation suggests that we can rebuild the pre-numerator using only shuffles, but to do so we will need to define some new generators. We will call these generators T the shuffle generators. It is worth going through a few examples of these new generators to see explicitly how they relate to the original ones: for any i = 1, . . . , n, (B.14) Note that the shuffle generators are only labelled by brackets containing a single leg, like T_(1),(2),(3) etc.
To show the equivalence of these generators to the original quasi-shuffle generators we need to define a shuffle product ш for the generators T. This product is defined, as expected, by the shuffling of the brackets associated to each generator T, T_(i_1),(i_2),...,(i_r) ш T_(j_1),(j_2),...,(j_s) := (B.20)
The point to note is that this map is more complicated than the original one, in that it involves a sum of many terms with no clear simplification. This is not too surprising: we have simplified the algebra to a shuffle algebra, and it is then reasonable to expect that the mapping is more complex. Despite this, one might hope that the angle bracket map ⟨•⟩ above could be simplified in some way, or that a totally different angle map could give rise to the same pre-numerators. Alternatively, there could be another choice of pre-numerator altogether, better suited to the shuffle product. Unfortunately, these options seem to be as difficult as finding the original map ⟨•⟩ for the quasi-shuffle generators T, and in this work we have been unable to find a simple equivalent shuffle version.
Indeed, the algebraic pre-numerator constructed from shuffles (for example T_(1) ш T_(2)) contains explicitly fewer terms than the equivalent quasi-shuffle pre-numerator (for example T_(1) ⋆ T_(2)), hence the map ⟨•⟩ is likely to be more involved.

C Pre-numerators with manifest antipodal symmetry
As mentioned in Section 3, there is some freedom in constructing pre-numerators, which can be used to make certain properties manifest. For instance, using the angle-bracket map defined in the main text creates a pre-numerator which, by construction, makes the crossing symmetry of the gluons manifest. As another example of this freedom, in this appendix we will define a pre-numerator which only picks up a sign under the action of the antipode map; we will refer to this property as "antipodal symmetry". Hence, these alternative pre-numerators have the same properties under the action of the antipode as the amplitudes, which implies that they obey their own version of the reflection identity (2.12). We now describe these alternative pre-numerators and their properties in detail.
We start from the algebraic pre-numerator N(α) = T where V^µν_τ and Θ^(α)(τ_i) are defined in Sections 3.2 and 5.1, respectively, and V̄^µν_τ = p^µ_τ v^ν and Θ̄^(α)(τ_i) is the subset of τ_1 ∪ · · · ∪ τ_{i−1} whose elements are larger than the last index in τ_i with respect to the ordering α. Note that the τ_j are also ordered with respect to α. For example, we write T_(31),(25),(46) and then, since the last index in the superscript is one, we use the Θ̄ functions, which for this example gives: Θ̄^(254631)(46) = {3, 1}, Θ̄^(254631)(31) = ∅ etc. As before, if any of the sets Θ^(α)(τ_i), Θ̄^(α)(τ_i) happens to be empty then we map that generator to zero. Finally, the three-point case is special and as usual we define it separately as which indeed agrees precisely with the BCJ numerator N([[1,2], 3], v). Note that for this choice of the map ⟨•⟩_S, any pre-numerator without leg 1 at the beginning or end is set to zero. This explicitly shows why crossing symmetry is no longer manifest at the level of the pre-numerators.