Hedgehog Bases for A_n Cluster Polylogarithms and An Application to Six-Point Amplitudes

Multi-loop scattering amplitudes in N=4 Yang-Mills theory possess cluster algebra structure. In order to develop a computational framework which exploits this connection, we show how to construct bases of Goncharov polylogarithm functions, at any weight, whose symbol alphabet consists of cluster coordinates on the $A_n$ cluster algebra. Using such a basis we present a new expression for the 2-loop 6-particle NMHV amplitude which makes some of its cluster structure manifest.


Introduction
In a series of recent papers following [1] it has been realized that (all known) multiloop n-particle scattering amplitudes of planar N = 4 super-Yang-Mills (SYM) theory possess special properties that are intimately connected to mathematical structures known as cluster algebras. The most basic aspect of this connection is that amplitudes are linear combinations of generalized polylogarithm functions whose symbol arguments are cluster coordinates on the Gr(4, n) Grassmannian cluster algebra 1 . This connection between scattering amplitudes and cluster algebras is undoubtedly related to a similar cluster structure that has been observed at the level of integrands in [2], though the precise connection has yet to be made. Nevertheless, the observed cluster structure of integrated amplitudes has already helped to facilitate the computation of new expressions for various quantities associated to amplitudes (see for example [3][4][5][6][7]). In parallel, work by Dixon, Drummond, and collaborators has resulted in spectacular progress in determining 6-particle amplitudes via a bootstrap approach (see [8][9][10][11][12], or the review [13]) utilizing input from the OPE of null Wilson loops (see for example [14][15][16][17][18][19]).
Typically, results in SYM theory take the form of colossal linear combinations of generalized polylogarithm functions. These special functions satisfy a huge number of functional identities: shuffle identities, stuffle identities, the Abel identity, the trilogarithm identity of [1], and many others. These make generalized polylogarithms notoriously difficult to work with. Moreover, with so many identities, there are a multitude of possible ways to write the same formula. In general there is no "best" way to write a given expression, nor is it even clear how one ought to define "best": perhaps the shortest expression, or one where certain physical or mathematical properties are manifest.
Significant progress towards finding canonical bases for generalized polylogarithms has been made by Brown in [20] (see also [21] for some applications) and employed by Dixon et al. in their 6-particle bootstrap program. In this paper we demonstrate a natural way to "clusterize" Brown's basis of polylogarithm functions. Namely, we show how to generate, at any weight, a basis of generalized polylogarithm functions whose symbols are manifestly expressible in terms of cluster coordinates on the $A_n$ cluster algebra. We call these "hedgehog" bases because they are naturally associated to certain spiny structures in the $A_n$ exchange graph. Hedgehog bases provide an almost canonical way to write expressions for 6-particle MHV and NMHV amplitudes, presumably at any loop order. Compared to other bases that have been considered in the literature, hedgehog bases have the theoretically pleasing advantage of making some of the cluster structure of such amplitudes manifest, as well as the practical benefit of allowing notably shorter expressions. The latter feature echoes a common theme in the amplitudes program: identifying underlying mathematical structure and improving computational efficiency go hand in hand.
Section 2 briefly reviews the necessary mathematical technology of polylogarithms, cluster algebras, and scattering amplitudes. Section 3 introduces the idea of a "hedgehog" for a cluster algebra, and sketches the rigorous proof (with details relegated to an appendix) that hedgehogs can be fashioned into a basis for polylogarithms. Section 4 presents, as an application of this technology, a construction of a hedgehog-basis representation for the 2-loop 6-particle NMHV amplitude 2 .

Review
Polylogarithms and cluster algebras are each subjects unto themselves. Thus this section is not an all-encompassing review, but rather a brief reminder of some of the mathematical technology needed for the rest of the paper, together with citations where the curious reader may find additional details. We also review the relevant aspects of the connection between Grassmannian cluster algebras and scattering amplitudes in SYM theory.

Generalized Polylogarithms
Polylogarithms are a broad class of special functions that generalize the logarithm. More details on the material in this section may be found in the recent review [22].
Recall that the ordinary logarithm can be written as $\log z = \int_1^z \frac{dt}{t}$. Generalizing this to an iterated integral of the type first studied systematically by Chen [23] gives the weight-k Goncharov polylogarithm [24]:

$$G(a_1, \ldots, a_k; z) = \int_0^z \frac{dt}{t - a_1}\, G(a_2, \ldots, a_k; t), \qquad G(; z) = 1, \tag{2.1}$$

with the special case

$$G(\underbrace{0, \ldots, 0}_{k}; z) = \frac{1}{k!} \log^k z. \tag{2.2}$$

In general, $a_1, \ldots, a_k$ are valued in the complex numbers with $z \in \mathbb{C} \setminus \{a_1, \ldots, a_k\}$, and one should specify a contour of integration. We will see that for scattering amplitudes in a certain domain these variables are all real-valued, and there is a natural ordering which allows one to take the "naive" contour straight along the real axis. A large class of L-loop amplitudes in SYM theory, including at least all MHV and NMHV amplitudes, are expected to be expressible as linear combinations of weight-2L polylogarithms.
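Goncharov polylogarithms with a common final argument obey the shuffle product

$$G(a_1, \ldots, a_k; z)\, G(a_{k+1}, \ldots, a_{k+l}; z) = \sum_{\sigma \in \Sigma(k, l)} G(a_{\sigma(1)}, \ldots, a_{\sigma(k+l)}; z), \tag{2.3}$$

where $\Sigma(k, l)$ denotes the set of shuffles: the orderings of $a_1, \ldots, a_{k+l}$ that preserve the relative order of $a_1, \ldots, a_k$ and of $a_{k+1}, \ldots, a_{k+l}$. The name comes from riffle shuffling a deck of cards; shuffling two stacks of cards together interweaves them while leaving each stack in the same order.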
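The riffle-shuffle picture can be made concrete with a few lines of Python. This is only an illustrative sketch: the helper name `shuffles` is hypothetical, and words are represented as plain tuples of labels rather than as polylogarithm arguments.

```python
from itertools import combinations

def shuffles(a, b):
    """All interleavings of tuples a and b that keep each one's internal order."""
    n, m = len(a), len(b)
    result = []
    # Choose which of the n + m slots are filled from a; the rest come from b.
    for slots in combinations(range(n + m), n):
        chosen = set(slots)
        word, ia, ib = [], 0, 0
        for pos in range(n + m):
            if pos in chosen:
                word.append(a[ia])
                ia += 1
            else:
                word.append(b[ib])
                ib += 1
        result.append(tuple(word))
    return result

# Shuffling the stack (a, b) with the single card (c,):
for word in shuffles(("a", "b"), ("c",)):
    print(word)
```

Each printed word corresponds to one term in the product of a weight-2 and a weight-1 function with the same final argument; the number of terms for stacks of sizes k and l is the binomial coefficient "k + l choose k".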
Each polylogarithm has an associated object called its symbol (see for example [25,26], and the review [27]). The symbol is a useful tool for converting the functional identities of polylogarithms into linear algebra, obviating many thorny problems. The symbol of a Goncharov polylogarithm admits a nice graphical interpretation as a sum over plane trivalent trees [28], and is given explicitly by the recursive formula

$$\mathcal{S}(G(a_k, \ldots, a_1; a_{k+1})) = \sum_{i=1}^{k} \Big[ \mathcal{S}(G(a_k, \ldots, \hat{a}_i, \ldots, a_1; a_{k+1})) \otimes (a_i - a_{i+1}) - \mathcal{S}(G(a_k, \ldots, \hat{a}_i, \ldots, a_1; a_{k+1})) \otimes (a_i - a_{i-1}) \Big]. \tag{2.5}$$
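Here $\hat{a}_i$ denotes that the argument $a_i$ is omitted, $a_0 = 0$ is the base point of the integration, and it is also understood that any term with 0 as a symbol entry (which can happen if some adjacent a's are equal) should simply be omitted. Symbols behave as if there were implicit "d log's" in front of each term: just as $d\log 1 = 0$ and $d\log \phi_1 \phi_2 = d\log \phi_1 + d\log \phi_2$, symbols obey $(\cdots \otimes 1 \otimes \cdots) = 0$ and

$$(\cdots \otimes \phi_1 \phi_2 \otimes \cdots) = (\cdots \otimes \phi_1 \otimes \cdots) + (\cdots \otimes \phi_2 \otimes \cdots). \tag{2.6}$$

The collection of $\phi_i$ which appear in the symbol of a given function is called its symbol alphabet.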
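The "d log" behaviour of symbol entries is easy to mechanize. The following is a minimal sketch, assuming each entry has already been factored into a product of powers of alphabet letters; the function name `expand_term` and the dictionary representation are illustrative choices, not notation from the literature.

```python
from collections import Counter
from itertools import product

def expand_term(entries, coeff=1):
    """Expand one symbol term whose entries are monomials in the alphabet.

    Each entry is a dict {letter: exponent}. The result maps tuples of
    letters to integer coefficients, using additivity of d log in each slot.
    """
    out = Counter()
    for choice in product(*(entry.items() for entry in entries)):
        letters = tuple(letter for letter, _ in choice)
        weight = coeff
        for _, exponent in choice:
            weight *= exponent
        out[letters] += weight
    return {word: c for word, c in out.items() if c != 0}

# (x1 * x2) (x) (y^2)  ->  2 (x1 (x) y) + 2 (x2 (x) y)
print(expand_term([{"x1": 1, "x2": 1}, {"y": 2}]))
# An entry equal to 1 is the empty monomial, so the whole term drops out.
print(expand_term([{}, {"y": 1}]))
```

Negative exponents carry through as minus signs, mirroring $d\log(1/\phi) = -d\log\phi$.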

Cluster Algebras
Cluster algebras are a relatively new area of mathematics, introduced in 2002 by Fomin and Zelevinsky in [29,30]. This section quickly reviews some salient facts about cluster algebras; the reader may consult [31,32] for additional mathematical background and [1] for the amplitude perspective. A cluster algebra starts with a seed: a quiver where each vertex is labeled with a cluster variable (also called a cluster coordinate) 3 . See figure 1A for an example of a seed.
Figure 1: (A) The initial seed for the $A_3$ algebra. (B) The result of mutating on the vertex $x_2$.
A cluster algebra is generated from an initial seed through an iterative process. An operation called mutation on a vertex generates a new seed and new cluster variables, according to the formula given in eq. (B.2) (or see [33]). For example, mutating on the middle vertex of figure 1A gives figure 1B, with new cluster variables determined by eq. (B.2). Mutation is an involution, so applying the same mutation twice does nothing. The cluster algebra is the algebra generated by the set of all cluster variables which arise from repeatedly mutating the initial seed. Under certain conditions on the initial quiver, all possible repeated mutations will yield only finitely many seeds. Such cluster algebras are said to be of finite type.
A natural domain for a cluster algebra is the positive domain, where all cluster variables take positive real values. This property is preserved under mutation: if all variables in a given seed are positive-valued, then all possible cluster variables on the same algebra are also positive-valued.
The structure of the cluster algebra as a whole can be displayed as an exchange graph, where each vertex represents a seed, and undirected edges are drawn between seeds linked by a single mutation. (See figure 2B.) Because applying a mutation will invert a single cluster variable, $x_i \to 1/x_i$, a directed edge of the exchange graph can be associated to a unique cluster variable $x_i$. The same edge with the opposite direction corresponds to $1/x_i$.
Later in this paper we will focus on the $A_n$ family of cluster algebras, which start with the initial seed

$$x_1 \to x_2 \to \cdots \to x_n$$

for $n \geq 1$. For the special case of $A_n$ algebras, which are all of finite type, there is a convenient alternative to representing clusters with quivers. This construction is reviewed in appendix A.

Figure 2: (A) The five seeds for the $A_2$ algebra, with each node labeled by its associated cluster variable. (B) The exchange graph, showing how to move from one seed to another by mutation.

The $A_2$ algebra, for example, has exactly 5 distinct seeds and 10 cluster variables 4 given by

$$x_1, \quad x_2, \quad x_3 = \frac{1 + x_2}{x_1}, \quad x_4 = \frac{1 + x_1 + x_2}{x_1 x_2}, \quad x_5 = \frac{1 + x_1}{x_2} \tag{2.7}$$

and their reciprocals. These variables obey the recursive formula

$$x_{i+1} = \frac{1 + x_i}{x_{i-1}}. \tag{2.8}$$

This paper makes extensive use of the $A_3$ cluster algebra, which starts with the initial seed in figure 1A. It has 14 seeds and 30 cluster variables. We take the opportunity to enumerate in eq. (2.9) 15 of these cluster variables (the other 15 are their reciprocals) in four ways: (1) in terms of the names $v_i$, $x_i^\pm$ and $e_i$ that these variables have been given in previous work (see in particular [1,3]), (2) as rational functions of the variables $x_1, x_2, x_3$ in the initial seed, (3) in terms of the $\{u, v, w, y_u, y_v, y_w\}$ variables used extensively by Dixon et al. in their study of 6-particle scattering amplitudes in SYM theory, and (4) in terms of Plücker coordinates on Gr(4, 6) (the connection to Plücker coordinates is explained in the following subsection) 5 .

[Eq. (2.9): table of these 15 cluster variables in the four forms listed above.]
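The five-fold periodicity of the $A_2$ variables can be checked with exact rational arithmetic. The sketch below assumes the standard exchange recursion $x_{i+1} = (1 + x_i)/x_{i-1}$; the function name `a2_variables` and the sample point $(x_1, x_2) = (2, 5)$ are arbitrary illustrative choices.

```python
from fractions import Fraction

def a2_variables(x1, x2, count=7):
    """Iterate x_{i+1} = (1 + x_i) / x_{i-1} starting from (x1, x2)."""
    xs = [Fraction(x1), Fraction(x2)]
    while len(xs) < count:
        xs.append((1 + xs[-1]) / xs[-2])
    return xs

xs = a2_variables(2, 5)
for i, x in enumerate(xs, start=1):
    print(f"x{i} = {x}")
```

Starting from a positive pair, every variable produced stays positive, and the sixth and seventh entries reproduce the first two, so only five distinct variables (and their reciprocals) ever appear.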

Scattering Amplitudes and Grassmannian Cluster Algebras
The connection between scattering amplitudes in SYM theory and cluster algebras was first made in [1] and further explored in [3,4,34,35]. The basic fact that allows for such a connection is that the kinematic domain for n-particle scattering in SYM theory, called $\mathrm{Conf}_n(\mathbb{P}^3)$, has, according to [33], the structure of a cluster Poisson variety associated to the Gr(4, n) Grassmannian cluster algebra. This fact is special to SYM theory in four dimensions because it relies on the dual conformal symmetry of the theory, discovered in [36][37][38][39][40][41].
An ordered scattering amplitude of n massless particles is a function of n null vectors in Minkowski space that sum to zero due to energy-momentum conservation. Using the momentum twistor variables of Hodges [42], the space of such configurations can be realized as n ordered points in $\mathbb{P}^3$, or concretely as a $4 \times n$ matrix $[Z_1\, Z_2 \cdots Z_n]$ where each column $Z_i$ is a four-component homogeneous coordinate on $\mathbb{P}^3$. In this presentation, dual conformal symmetry, which must leave all amplitudes invariant, acts as left-multiplication by SL(4, C). Passing to the quotient space we get a birational isomorphism (which means a bijection for generic points)

$$\mathrm{Conf}_n(\mathbb{P}^3) \cong \mathrm{Gr}(4, n)/(\mathbb{C}^*)^{n-1}. \tag{2.10}$$

Thus, scattering amplitudes can be (essentially) regarded as complex-valued functions on Grassmannians, making it natural to use the SL(4, C) invariant Plücker coordinates $\langle ijkl \rangle = \det[Z_i\, Z_j\, Z_k\, Z_l]$, which are well-defined complex-valued functions on Gr(4, n). However, since the Z's are homogeneous coordinates, it is necessary to use ratios of Plücker coordinates (or, more generally, ratios of homogeneous polynomials of Plücker coordinates), with the same Z's appearing in the numerator and denominator, such as

$$\frac{\langle 5713 \rangle \langle 5624 \rangle}{\langle 4512 \rangle \langle 3567 \rangle}, \tag{2.11}$$

to get well-defined coordinates on $\mathrm{Gr}(4, n)/(\mathbb{C}^*)^{n-1}$. Scattering amplitudes in SYM theory are naturally written as functions of such cross-ratios. This is where cluster algebras enter: the Plücker coordinates of any Grassmannian form a cluster algebra [43], and the quotient $\mathrm{Conf}_n(\mathbb{P}^3)$ has the structure of a cluster Poisson variety [33], with cluster coordinates given by certain very special cross-ratios of the abovementioned type 6 . The physics interest in such cluster algebras stems from the fact that all known multi-loop amplitudes that have been explicitly computed to date in SYM theory (including [1, 9-12, 26, 44-47]) are generalized polylogarithms whose symbol alphabets are subsets of cluster coordinates on this Gr(4, n) cluster algebra 7 .
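The two invariance requirements on cross-ratios (under SL(4) acting from the left, and under independent rescaling of each column $Z_i$) can be checked numerically. The sketch below reads the sample cross-ratio of eq. (2.11) as $\langle 5713\rangle\langle 5624\rangle / (\langle 4512\rangle\langle 3567\rangle)$; the sample twistors on the moment curve, the helper names, and the particular SL(4) matrix are all arbitrary illustrative choices.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total

def plucker(Z, i, j, k, l):
    """<ijkl> = det[Z_i Z_j Z_k Z_l]; Z is a list of 4-component columns, labels are 1-based."""
    cols = [Z[n - 1] for n in (i, j, k, l)]
    return det([[col[r] for col in cols] for r in range(4)])

def cross_ratio(Z):
    num = plucker(Z, 5, 7, 1, 3) * plucker(Z, 5, 6, 2, 4)
    den = plucker(Z, 4, 5, 1, 2) * plucker(Z, 3, 5, 6, 7)
    return Fraction(num, den)

def transform(M, Z):
    """Left-multiply every column of Z by the 4 x 4 matrix M."""
    return [[sum(M[r][c] * col[c] for c in range(4)) for r in range(4)] for col in Z]

# Seven sample momentum twistors on the moment curve, so every 4 x 4 minor is nonzero.
Z = [[1, t, t**2, t**3] for t in range(1, 8)]

# Rescale Z_5 by 3: it appears twice upstairs and twice downstairs, so the ratio is unchanged.
Z_rescaled = [list(col) for col in Z]
Z_rescaled[4] = [3 * x for x in Z_rescaled[4]]

# A unit-determinant (upper-triangular) matrix standing in for a dual conformal transformation.
M = [[1, 2, 0, 0], [0, 1, 0, 0], [0, 0, 1, 3], [0, 0, 0, 1]]

print(cross_ratio(Z), cross_ratio(Z_rescaled), cross_ratio(transform(M, Z)))
```

A single Plücker coordinate would fail the rescaling test, which is why only balanced ratios descend to the quotient in eq. (2.10).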
The $A_n$ family of cluster algebras reviewed in section 2.2 corresponds to the Grassmannian Gr(2, n + 3), which overlaps with the sequence of algebras relevant to scattering amplitudes in the case $A_3 \cong \mathrm{Gr}(2, 6) \cong \mathrm{Gr}(4, 6)$ relevant to 6-particle amplitudes.

The Cluster Bootstrap
The main problem we address in this paper is simple to state: given a cluster algebra A, with a set of cluster coordinates $\mathcal{X}_A$, we would like to write down a basis for weight-k polylogarithm functions whose symbols may be written in the alphabet $\mathcal{X}_A$. We call such functions "cluster polylogarithm functions" or simply cluster functions 8 on A.
To be explicit, let us note that $A_2$ cluster functions, for example, are those which can be written in the symbol alphabet consisting of the five $x_i$ shown in eq. (2.7). Thanks to eq. (2.6), we can equivalently consider the $A_2$ symbol alphabet to be the set

$$\{x_1,\ x_2,\ 1 + x_1,\ 1 + x_2,\ 1 + x_1 + x_2\}, \tag{2.12}$$

since each of the five $x_i$ may be (uniquely) expressed as products of powers of elements of this set. For $A_3$ only 9 of the 30 cluster variables are multiplicatively independent, and it is evident from eq. (2.9) that the $A_3$ symbol alphabet may be taken as the set given in eq. (2.13). Closely related symbol alphabets have appeared elsewhere, notably in Brown's work on polylogarithm functions on the moduli space $M_{0,m}$ of m marked points on the Riemann sphere [20]. For example, for the case m = 6, Brown's polylogarithms are based on the symbol alphabet

$$\{c_1,\ c_2,\ c_3,\ 1 - c_1,\ 1 - c_2,\ 1 - c_3,\ 1 - c_1 c_2,\ 1 - c_2 c_3,\ 1 - c_1 c_2 c_3\} \tag{2.14}$$

in cubical coordinates 9 or

$$\{t_1,\ t_2,\ t_3,\ 1 - t_1,\ 1 - t_2,\ 1 - t_3,\ t_1 - t_2,\ t_1 - t_3,\ t_2 - t_3\} \tag{2.15}$$

in simplicial coordinates. Neither alphabet is multiplicatively equivalent to eq. (2.13), but their relation will be uncovered in the following section. In fact, one way to express the central result of our paper is to say that we demonstrate how to construct explicit changes of variables between those of [20] on $M_{0,n+3}$ and the $A_n$ cluster $\mathcal{X}$-coordinates, for any n, which render the corresponding symbol alphabets multiplicatively equivalent.

7 A number of results in two-dimensional kinematics including [48][49][50][51][52] provide partial evidence in support of this assertion, though the full Gr(4, n) structure necessarily collapses in two-dimensional kinematics. This has been studied in [34].
8 Functions of this type were called "cluster $\mathcal{A}$-functions" in [3] to distinguish them from a smaller set of functions with more special properties called "cluster $\mathcal{X}$-functions", but we do not explore these additional properties here.
9 These cubical coordinates were called $x_i$ in [20], but we use $c_i$ in eq. (2.14) to distinguish them from our $x_i$ cluster coordinates.
Let us conclude our review by briefly recalling that for finite symbol alphabets this problem admits a conceptually straightforward, if computationally intensive, brute force solution. If the symbol alphabet for A has s multiplicatively independent letters $\{\phi_1, \ldots, \phi_s\}$, then the symbol of any weight-k cluster function may be expressed as a unique vector (with rational components) in the $s^k$-dimensional vector space $V_k$ spanned by basis elements $\phi_{i_1} \otimes \cdots \otimes \phi_{i_k}$. Going the other way around, any vector in $V_k$ which satisfies a set of linear integrability conditions (see for example [25]) corresponds to (the symbol of) some cluster function. Therefore, the problem of finding a basis for the (symbols of) weight-k cluster functions on A is the same as that of finding a basis for the nullspace of a certain linear operator on $V_k$.
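For orientation, the integrability conditions referred to above can be written out in their commonly quoted form. The coefficient notation $c_{i_1 \cdots i_k}$ is introduced here only for illustration: a vector $\sum c_{i_1 \cdots i_k}\, \phi_{i_1} \otimes \cdots \otimes \phi_{i_k}$ in $V_k$ is the symbol of a function when, for every pair of adjacent slots, wedging the two entries together gives zero.

```latex
\sum_{i_1, \ldots, i_k} c_{i_1 \cdots i_k}\,
  \big( d\log \phi_{i_j} \wedge d\log \phi_{i_{j+1}} \big)\;
  \phi_{i_1} \otimes \cdots \otimes \phi_{i_{j-1}} \otimes
  \phi_{i_{j+2}} \otimes \cdots \otimes \phi_{i_k} = 0,
  \qquad j = 1, \ldots, k - 1 .
```

Each condition is linear in the coefficients, which is why the search for cluster functions reduces to a nullspace computation.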
The efficiency of this approach can be considerably enhanced by recycling lower-weight information at higher weight, and by exploiting the Hopf algebra structure of polylogarithms (discovered in [28], and nicely reviewed for a physics audience in [53]) 10 . Collectively these "bootstrap" techniques have been implemented systematically by Dixon and collaborators for the 6-particle case (associated to the $\mathrm{Gr}(4, 6) \cong A_3$ cluster algebra) to great effect in [10][11][12]. A slightly modified "weight-skipping" bootstrap based on a symbol alphabet of Gr(4, 7) cluster coordinates allowed for the calculation of the symbol of the 3-loop 7-particle MHV amplitude in [7].
Finally, we note a fact we will use later: the classical polylogarithm functions $\mathrm{Li}_k$ (and products thereof) are known to span the space of all polylogarithm functions of weight $k \leq 3$, so it is trivial to write down a (vastly overcomplete) set of irreducible cluster functions at weights k = 1, 2, 3 (eq. (2.16)). The problem we address in this paper is that of finding bases for all weights, not just overcomplete sets of cluster functions.

Hedgehog Bases
We tackle the problem of constructing bases of cluster functions in three steps. (1) First we discuss the set of Goncharov polylogarithms whose symbols may be written in the alphabet of cluster coordinates. (2) Next, we review the form a generating set should have, based on work of Brown [20] and Drummond [54]. (3) Lastly, we define "hedgehogs" and prove that they provide bases for the space of $A_n$ cluster functions.

Good Arguments for Goncharov Polylogarithms
To construct suitable collections of functions there is no need to reinvent the wheel. We may attempt to solve this problem by using a nice set of polylogarithm functions we already have at our disposal: the Goncharov polylogarithms defined in eq. (2.1). Then it remains only to decide what kinds of variables we should allow as the arguments a 1 , . . . , a k ; z.
Let us write $G_k[Q]$ to denote the set of weight-k Goncharov polylogarithms whose arguments are drawn from some set Q:

$$G_k[Q] = \{ G(a_1, \ldots, a_k; z) : a_1, \ldots, a_k, z \in Q \}. \tag{3.1}$$

It is evident from eq. (2.5) that functions in $G_k[Q]$ have symbol entries of the form $q_i$ as well as $q_i - q_j$, for $q_i, q_j \in Q$. We may try to follow the path of least resistance by considering what happens when Q is chosen simply to be some subset of $\mathcal{X}_A$. Actually, although this doesn't matter at the level of symbols, for later convenience it will be better to consider subsets of $-\mathcal{X}_A$ since this will help to naturally provide Goncharov polylogarithms that are manifestly free of branch cuts in the positive domain.
(Henceforth we shall use $x_i \in \mathcal{X}_A$ to denote cluster coordinates and $q_i = -x_i$ to denote negative cluster coordinates.) Unfortunately, for two generic $q_i, q_j \in -\mathcal{X}_A$, there is nothing particularly nice about the quantity $q_i - q_j$; it may not even have definite sign in the positive domain, in which case it should never appear in the symbol of a cluster function. One approach to constructing bases of cluster functions would use special linear combinations of Goncharov polylogarithms for which all "bad" letters cancel out at the level of symbols. Several examples of such functions have been studied in the literature. For the particular case of $A = A_3$, Dixon et al. have constructed Goncharov polylogarithm representations for bases of "hexagon functions" through weight at least 8. These are cluster functions satisfying an additional important physical constraint (the first-entry condition), which we do not address here. The construction of these bases, and several impressive applications to 6-particle scattering amplitudes in SYM theory, are discussed in [9][10][11][12][13]. Also, the "cluster $\mathcal{X}$-functions" studied in [3,6] for more general algebras can be expressed as suitable linear combinations of Goncharov polylogarithms with all "bad" symbol entries cancelling out. These functions also play a prominent role in SYM theory: in particular, it appears from the result of [4] that all 2-loop MHV amplitudes can be expressed in terms of classical polylogarithms and the single non-classical cluster function $K_{2,2}$ defined in [6].
In the present paper, we would like to explore a different approach to cluster functions. We explore the possibility of constructing Goncharov polylogarithms at any weight which are manifestly free of any "bad" letters, rather than having to rely on solving a (potentially computationally challenging) linear algebra problem to ensure their cancellation. In light of the factorization property reviewed in eq. (2.6), it is evident that this will be the case if we can choose the set Q so that $q_i - q_j$ factors into a product of powers of cluster coordinates for all $q_i, q_j \in Q$. To be precise, let us define the multiplicative span of $\mathcal{X}_A$ to be the set

$$M_A = \Big\{ \pm \prod_{x_i \in \mathcal{X}_A} x_i^{n_i} : n_i \in \mathbb{Z} \Big\}.$$

(If A is an infinite algebra, then only finitely many of the $n_i$ may be nonzero.) We say that a set Q splits over $\mathcal{X}_A$ if $q_i - q_j \in M_A$ for all $q_i \neq q_j \in Q$. Then it is evident that $G_k[Q]$ is a set of cluster functions on A whenever $Q \subseteq -\mathcal{X}_A$ splits over $\mathcal{X}_A$. In fact, for any such Q we can get additional cluster functions "for free" by considering the enlarged set $G_k[\{0, 1\} \cup Q]$. The inclusion of 0 is trivial, and 1 is allowed because of the property that $1 - q = 1 + x \in M_A$ for all $q \in -\mathcal{X}_A$. A proof of this property, which played an important role in [1,4,6], is presented in appendix B.
We can conclude that $G_k[\{0, 1\} \cup Q]$, for any $Q \subseteq -\mathcal{X}_A$ that splits over $\mathcal{X}_A$, is a set of cluster functions on A.
Of course, additional functions of weight k may be constructed by taking products of functions of lower weight. It may be helpful to visualize sets of cluster coordinates satisfying the required property with the assistance of what we call a factorization graph. For a given algebra A, the factorization graph contains one vertex for each cluster coordinate $x \in \mathcal{X}_A$, and two vertices $x_i$, $x_j$ are connected if $x_i - x_j \in M_A$. The factorization graphs for the $A_2$ and $A_3$ cluster algebras are shown in figures 3 and 4.
In mathematics, a complete subgraph (that is, a collection of vertices such that each pair is connected by an edge) is known as a clique (or an n-clique, if it has n vertices). It is evident from figures 3 and 4 that $A_2$ has 10 2-cliques and no higher cliques, while $A_3$ has 60 2-cliques, 12 3-cliques, and no higher cliques. Also note that the $A_3$ factorization graph is composed of 6 intersecting copies of the $A_2$ factorization graph. Therefore we can rephrase the conclusion boxed above by saying that

Cliques give cluster functions.

Figure 3: The factorization graph for $A_2$. Each vertex is one of the 10 cluster coordinates on the $A_2$ cluster algebra (see eq. (2.7)), and two vertices $x_i$, $x_j$ are connected by an edge if $x_i - x_j$ factors into a product of cluster coordinates. Each of the 10 pairs of connected vertices, for example $\{1/x_2, x_5\}$, is a 2-clique.
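The edge count of the $A_2$ factorization graph can be reproduced by exact polynomial arithmetic. The sketch below assumes the standard expressions $x_3 = (1+x_2)/x_1$, $x_4 = (1+x_1+x_2)/(x_1 x_2)$, $x_5 = (1+x_1)/x_2$ and the five-letter alphabet $\{x_1, x_2, 1+x_1, 1+x_2, 1+x_1+x_2\}$; the helper names and the restriction to exponents $-1, 0, 1$ (enough for this example) are illustrative choices.

```python
from itertools import combinations, product

def pmul(p, q):
    """Multiply polynomials in (x1, x2) stored as {(deg1, deg2): coeff}."""
    out = {}
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] = out.get((a + d, b + e), 0) + c * f
    return {k: v for k, v in out.items() if v}

def psub(p, q):
    """Subtract polynomial q from polynomial p."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) - v
    return {k: v for k, v in out.items() if v}

ONE, X1, X2 = {(0, 0): 1}, {(1, 0): 1}, {(0, 1): 1}
P1 = {(0, 0): 1, (1, 0): 1}               # 1 + x1
P2 = {(0, 0): 1, (0, 1): 1}               # 1 + x2
P12 = {(0, 0): 1, (1, 0): 1, (0, 1): 1}   # 1 + x1 + x2
LETTERS = [X1, X2, P1, P2, P12]

# Assumed A_2 variables as (numerator, denominator) pairs, plus their reciprocals.
VARS = {"x1": (X1, ONE), "x2": (X2, ONE), "x3": (P2, X1),
        "x4": (P12, pmul(X1, X2)), "x5": (P1, X2)}
VARS.update({"1/" + name: (den, num) for name, (num, den) in list(VARS.items())})

def monomials(max_power=1):
    """(numerator, denominator) of every product of letters with bounded exponents."""
    for exps in product(range(-max_power, max_power + 1), repeat=len(LETTERS)):
        num, den = ONE, ONE
        for letter, e in zip(LETTERS, exps):
            for _ in range(abs(e)):
                if e > 0:
                    num = pmul(num, letter)
                else:
                    den = pmul(den, letter)
        yield num, den

MONOMIALS = list(monomials())

def factors(a, b):
    """True if a - b equals +/- a product of letters (within the exponent bound)."""
    (an, ad), (bn, bd) = VARS[a], VARS[b]
    diff_num = psub(pmul(an, bd), pmul(bn, ad))
    diff_den = pmul(ad, bd)
    for num, den in MONOMIALS:
        lhs, rhs = pmul(diff_num, den), pmul(num, diff_den)
        if lhs == rhs or lhs == {k: -v for k, v in rhs.items()}:
            return True
    return False

EDGES = {frozenset(pair) for pair in combinations(VARS, 2) if factors(*pair)}
print(len(EDGES), "edges")
```

Every edge found this way joins a reciprocal $1/x_i$ to some $x_j$, so the graph is bipartite and contains no triangles, in line with the statement that $A_2$ has no cliques larger than pairs.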
Since ordering will play a crucial role in what follows, this is the perfect opportunity for us to note the convenient fact that if $Q \subseteq -\mathcal{X}_A$ splits over $\mathcal{X}_A$, then there is a natural ordering on Q. Recalling that cluster coordinates are positive-valued everywhere in the interior of the positive domain, possibly taking value 0 or $+\infty$ only on the boundary of that domain, it is evident that for every pair $q_i \neq q_j \in Q$, the difference $q_i - q_j \in M_A$ takes uniform sign inside the positive domain. Therefore, for each pair either $q_i < q_j$ or vice versa, so the natural ordering on Q is simply the true numerical order $q_1 < q_2 < \cdots < q_n$ of these coordinates in the positive domain. It will be convenient to choose the ordering on the set $\{0, 1\} \cup Q$ to be $0, 1, q_1, \ldots, q_n$, even though this is not the true numerical ordering of these quantities (since the q's are negative in the positive domain).

Bases of Cluster Functions
So far, we have seen that elements of the set $G_k[\{0, 1\} \cup Q]$ are cluster functions on A, i.e. have symbols which can be written in the symbol alphabet $\mathcal{X}_A$ of cluster coordinates on A, whenever $-Q$ is a clique for the factorization graph of A. We now want a basis for $\mathcal{A}_\bullet(A_n)$, the space of cluster functions on $A_n$. Let's first consider a simple case. Suppose we fix the final argument z, draw the remaining arguments from an ordered set S, and define

$$G_k[S; z] = \{ G(a_1, \ldots, a_k; z) : a_i \in S \}, \qquad \mathcal{G}[S; z] = \mathrm{span}_{\mathbb{Q}} \bigcup_{k} G_k[S; z],$$

where "span" denotes the vector space of $\mathbb{Q}$-linear combinations of the indicated functions. We use the notation $\mathcal{G}$ to carefully distinguish $G_k[S; z]$, which is a set of weight-k functions, from $\mathcal{G}[S; z]$, which is a vector space of functions of any weight.
Although $\mathcal{G}[S; z]$ is a vector space, because of eq. (2.3) it is more useful to consider it as a shuffle algebra. When dealing with such functions, it is more natural not to look for a vector space basis, but rather to find a minimal generating set for the algebra, such that each element of $\mathcal{G}[S; z]$ has a unique expression as a linear combination of products of elements of the minimal generating set.
For this, we use Radford's Theorem (see [55]), which provides a minimal generating set for any free shuffle algebra in terms of Lyndon words. A Lyndon word of length k on an ordered set S is a sequence of k elements of S which is strictly smaller than all of its cyclic permutations with respect to the lexicographic order of $S^k$. (Several explicit examples will be presented in section 4.1.) Let $\mathrm{Lyndon}_k(S)$ denote Lyndon words of length k on S. It is a consequence of Radford's Theorem that $\mathcal{G}[S; z]$ has a minimal generating set

$$\bigcup_{k \in \mathbb{N}} G_k[\mathrm{Lyndon}_k(S); z]. \tag{3.4}$$

How can we use (3.4) to generate cluster functions? The answer to this question 11 is provided by Brown's extensive study of polylogarithms on the moduli spaces $M_{0,n+3}$ 12 in [20] (see also [21] for some applications). The results of [20] were presented in various useful coordinate systems on $M_{0,n+3}$. One key result was that (essentially) the space of Goncharov polylogarithms on $M_{0,n+3}$ is the tensor product of n spaces of polylogarithms with fixed last arguments in a certain ordered set of variables S. The analysis in the previous subsection has revealed that choosing S to be a clique Q along with $\{0, 1\}$ makes manifest the $A_n$ cluster structure of these functions. And, by eq. (3.4), we have a generating set for each of those n spaces of polylogarithms. Combining these observations we arrive at:

For $A_n$, each n-clique gives a generating set for all cluster functions.
If $-Q \subseteq \mathcal{X}_{A_n}$ is an n-clique of the factorization graph of the $A_n$ cluster algebra with an ordering $Q = \{q_1 < q_2 < \cdots < q_n\}$, then

$$\bigcup_{i=1}^{n} \bigcup_{k \in \mathbb{N}} G_k[\mathrm{Lyndon}_k(0, 1, q_1, \ldots, q_{i-1}); q_i] \tag{3.5}$$

is a minimal generating set for $\mathcal{A}_\bullet(A_n)$.

We call the basis generated by eq. (3.5) a Hedgehog Basis for reasons that will become clear in the next section. A very nice feature of this basis is that, thanks to the natural ordering $q_1 < q_2 < \cdots < q_n < 0$ on the set Q discussed above, it is manifest from eq. (2.1) that each G function in eq. (3.5) is free of branch cuts everywhere in the interior of the positive domain, with possible branch cuts only on its boundary, with one important exception that we should note. The exception is that at weight 1, instead of $G(0; q_i)$ we should use the function

$$G(0; -q_i) = \log(-q_i). \tag{3.6}$$

Being free of branch cuts in the positive domain is a necessary feature for these functions to be useful in describing scattering amplitudes, but the analytic constraints on amplitudes are far stronger still: they must be singularity-free everywhere inside the larger Euclidean domain, with branch points allowed only on boundaries corresponding to multi-particle production thresholds. It is an outstanding problem of great importance to find an explicit basis for the subspace of cluster functions spanned by functions satisfying these tighter analytic constraints.
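The Lyndon-word construction is simple to enumerate by brute force. The sketch below assumes that the generating set attaches to the final argument $q_i$ the ordered alphabet $0, 1, q_1, \ldots, q_{i-1}$; the function names and the string labels for letters are hypothetical, chosen only for illustration.

```python
from itertools import product

def lyndon_words(alphabet, k):
    """Length-k words strictly smaller than each of their nontrivial rotations.

    `alphabet` is a list in increasing order; words are compared
    lexicographically through the positions of their letters.
    """
    rank = {letter: n for n, letter in enumerate(alphabet)}
    words = []
    for word in product(alphabet, repeat=k):
        key = [rank[c] for c in word]
        if all(key < key[s:] + key[:s] for s in range(1, k)):
            words.append(word)
    return words

def hedgehog_generators(n, k):
    """Weight-k generators as (Lyndon word, final argument) pairs, under the assumption above."""
    gens = []
    for i in range(1, n + 1):
        alphabet = ["0", "1"] + [f"q{j}" for j in range(1, i)]
        gens += [(word, f"q{i}") for word in lyndon_words(alphabet, k)]
    return gens

print(lyndon_words(["0", "1", "q1"], 2))
print(len(hedgehog_generators(3, 1)), len(hedgehog_generators(3, 2)))
```

Under this assumption the number of weight-1 generators for $A_3$ is 2 + 3 + 4 = 9, matching the number of multiplicatively independent $A_3$ letters quoted in section 2.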

The Hedgehog Theorem for A n
We have now reduced the problem of finding a basis for cluster functions on A n to that of finding cliques Q of size n. In this section we show that there are precisely two such cliques for each A n−1 subalgebra of A n . This correspondence can be visualized, at the level of the exchange graph, by collections of cluster variables that we call hedgehogs.
Let us start by defining hedgehogs. Suppose A is a cluster algebra of rank r and B is a subalgebra of rank r − 1. The exchange graph of A is an r-regular graph (each vertex has valence r), and the exchange graph for B is an embedded (r − 1)-regular subgraph. Therefore, each vertex of B is incident to r −1 edges leading to other vertices of B and to one edge leading to a vertex of A \ B. In other words, each vertex of B has an edge which goes "out of" B and "into" A \ B. Recall from section 2.2 that a directed edge of the exchange graph can be associated with a cluster coordinate x ∈ X A . Let the hedgehog X (A, B) ⊆ X A be the set of cluster coordinates associated to the edges going out of B into A \ B.
Example hedgehogs for $\mathcal{X}(A_2, A_1)$ and $\mathcal{X}(A_3, A_2)$ are shown in figures 6A and 5 respectively. As can be seen from the pictures, the edge variables in the set $\mathcal{X}(A, B)$ radiate outwards, just like the spines of a hedgehog. We might also consider the set of cluster coordinates associated to inward directed edges, which just gives the "anti-hedgehog"

$$\mathcal{X}^{-1}(A, B) = \{ 1/x : x \in \mathcal{X}(A, B) \}.$$

We are now in a position to state the main result of this paper.

Figure 5: The $A_3$ algebra has six distinct hedgehogs (and six anti-hedgehogs). This figure shows the exchange graph for $A_3$, with one of its six pentagonal $A_2$ subalgebras highlighted. The "spines" of this $\mathcal{X}(A_3, A_2)$ hedgehog are the red edges connecting this $A_2$ to the rest of $A_3$. Specifically, this $\mathcal{X}(A_3, A_2)$ is the set of 3 cluster coordinates in $\mathcal{X}_{A_3}$ associated to these 5 outward directed red edges.
The Hedgehog Theorem for $A_n$: Hedgehogs are n-Cliques

Let $\mathcal{X}(A_n, A_{n-1})$ be any hedgehog (or anti-hedgehog). Then $Q = -\mathcal{X}(A_n, A_{n-1})$ is an n-clique of the factorization graph for $A_n$. In particular, eq. (3.5) generates a basis for the set of all cluster functions on $A_n$.
The details of the proof of this theorem are presented in appendix D, using the machinery of triangulations reviewed in appendix A. Here we will be content to use the notation of the latter appendix to provide explicit formulas for all A n−1 hedgehogs of A n , and to check that they are cliques.
Let us note that the symbol alphabet of the cluster functions generated by eq. (3.5), which consists of letters of the form $q_i$, $1 - q_i$, or $q_i - q_j$, has exactly the same form as that of the polylogarithm functions studied by Brown [20] in what he calls simplicial coordinates, $t_i$. We are therefore able to conclude that the two sets of functions can be related to each other by the identification $t_i = -q_i = x_i$ between simplicial coordinates $t_i$ on $M_{0,n+3}$ and the cluster coordinates $x_i$ of any hedgehog $\mathcal{X}(A_n, A_{n-1})$ or anti-hedgehog $\mathcal{X}^{-1}(A_n, A_{n-1})$.
As reviewed in appendix A, there are precisely $2\binom{n+3}{4}$ cluster variables on $A_n$ (counting x and 1/x separately); half of these can be enumerated explicitly as cross-ratios $r(i, j, k, l)$ of n + 3 points in $\mathbb{P}^1$, while the other half are their reciprocals $1/r(i, j, k, l) = r(j, k, l, i)$.
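These counts are quick to tabulate. In the sketch below the number of cluster variables is the formula quoted above, while the number of seeds uses the standard fact that seeds of $A_n$ correspond to triangulations of an (n + 3)-gon, counted by a Catalan number; the function names are illustrative.

```python
from math import comb

def num_cluster_variables(n):
    """2 * binom(n + 3, 4), counting x and 1/x separately."""
    return 2 * comb(n + 3, 4)

def num_seeds(n):
    """Triangulations of an (n + 3)-gon: the Catalan number C_{n+1}."""
    return comb(2 * (n + 1), n + 1) // (n + 2)

for n in (2, 3, 4):
    print(n, num_seeds(n), num_cluster_variables(n))
```

For n = 2 and n = 3 this reproduces the 5 seeds and 10 variables of $A_2$ and the 14 seeds and 30 variables of $A_3$ quoted in section 2.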
In the case n = 3, the Gr(2, n + 3) Plücker coordinates $\langle ij \rangle$ used here can be related to the Gr(4, 6) coordinates used in section 2. The $A_n$ cluster algebra has n + 3 subalgebras of type $A_{n-1}$, so there are n + 3 hedgehogs. In appendix D we show that these hedgehogs are given by sets of the form

$$\{ r(k, k+1, k+2, i) : i \notin \{k, k+1, k+2\} \},$$

where k + 1 and k + 2 are taken mod n + 3. It is easy to verify that these are n-cliques by taking two variables $r(k, k+1, k+2, i)$, $r(k, k+1, k+2, j)$ in this hedgehog and looking at their difference, which factors into a product of Plücker coordinates by virtue of a Plücker relation. To summarize, for the $A_n$ cluster algebras, hedgehogs are cliques of size n. There are n + 3 hedgehogs, and n + 3 anti-hedgehogs, related by the dihedral symmetry of the (n + 3)-gon. This provides, via eq. (3.5) and the Hedgehog Theorem, 2(n + 3) distinct, but equivalent, bases for cluster functions on $A_n$.
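The clique property can be checked numerically for $A_3$. The sketch below adopts one cross-ratio convention, $r(i,j,k,l) = \langle ij\rangle\langle kl\rangle / (\langle jk\rangle\langle il\rangle)$, purely as an assumption; it is chosen because it satisfies $1/r(i,j,k,l) = r(j,k,l,i)$, and the convention used in the appendices may differ. With it, the three-term Plücker relation $\langle ab\rangle\langle cd\rangle - \langle ac\rangle\langle bd\rangle + \langle ad\rangle\langle bc\rangle = 0$ turns the difference of two hedgehog variables into a single ratio of Plücker coordinates.

```python
from fractions import Fraction

def bracket(Z, i, j):
    """Pluecker coordinate <ij> of a 2 x m matrix Z given as a list of columns (1-based labels)."""
    (a, b), (c, d) = Z[i - 1], Z[j - 1]
    return a * d - b * c

def r(Z, i, j, k, l):
    """Assumed cross-ratio convention: <ij><kl> / (<jk><il>)."""
    return Fraction(bracket(Z, i, j) * bracket(Z, k, l),
                    bracket(Z, j, k) * bracket(Z, i, l))

def hedgehog_difference(Z, k, i, j):
    """Return r(k,k+1,k+2,i) - r(k,k+1,k+2,j) and its factored form from the Pluecker relation."""
    m = len(Z)
    k1, k2 = k % m + 1, (k + 1) % m + 1   # k+1 and k+2, wrapped into the range 1..m
    lhs = r(Z, k, k1, k2, i) - r(Z, k, k1, k2, j)
    rhs = -Fraction(bracket(Z, k, k1) * bracket(Z, k, k2) * bracket(Z, i, j),
                    bracket(Z, k1, k2) * bracket(Z, k, i) * bracket(Z, k, j))
    return lhs, rhs

# Six sample points on P^1 for A_3, ordered so that <ij> > 0 whenever i < j.
Z = [(1, t) for t in (0, 1, 3, 7, 12, 20)]

print(hedgehog_difference(Z, 1, 4, 5))
```

Since the factored form is a signed product of powers of Plücker coordinates, the difference has a definite sign throughout the positive domain, which is the property the ordering of Q relies on.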

Comments on Other Algebras
Our problem was to write down a basis of cluster functions on a cluster algebra $A$, and for $A = A_n$ we have found that eq. (3.5) gives such a basis whenever $-Q$ is a hedgehog (or anti-hedgehog) in $A_n$. The algebras of most relevance to SYM theory, however, are the Gr(4, n) algebras (see [1]). Happily, the one overlapping case $A_3$ = Gr(2, 6) = Gr(4, 6) underlies the structure of 6-particle scattering amplitudes. We present an application of our results to this case in the following section.
For more general algebras $A$, the definition of hedgehog given above still makes sense, but it does not appear to be useful. In particular, it is straightforward to check, for example, that there is no $A_3 \subset D_4$, nor $A_5 \subset E_6$, such that $X(D_4, A_3)$ or $X(E_6, A_5)$ is a clique. For such hedgehogs, sets of functions of the type shown in eqs. (3.1) or (3.5) are still perfectly fine sets of polylogarithm functions, but they are not cluster functions: their symbols contain non-cluster coordinates as entries.
One could, of course, look at smaller hedgehogs, associated to $A_n \subset A$ subalgebras, which are known to be cliques due to the Hedgehog Theorem. For example, $D_4$ has 12 distinct $A_3$ subalgebras, each of which has 6 $A_2$ subalgebras, so in all there exist 72 $X(A_3, A_2)$ hedgehogs sitting inside $D_4$. However it is easy to check that no individual hedgehog furnishes enough functions to provide a basis for all cluster functions on $D_4$. We know this because we can compare with the dimension of the spanning sets for weight ≤ 3 described in eq. (2.16). The same comment holds for $E_6$, which has seven $A_5$ subalgebras, each of which in turn has eight $A_4$ subalgebras, for a total of 56 $X(A_5, A_4)$ hedgehogs. On the other hand, at least for the $D_4$ case we have checked that the union of all cluster functions over these various hedgehogs provides a vastly overcomplete set of cluster functions at weight ≤ 3, but we have not found a collection of hedgehogs whose functions span the space exactly and so provide a basis. It may be that, just as eq. (3.5) gives a basis for cluster functions on $A_n \cong$ Gr(2, n + 3) by gluing together sets of the form (3.4) in a certain pattern, some different pattern of gluing might work for other algebras, including the cases Gr(4, n) of relevance to scattering amplitudes.

An Application to the 2-loop 6-particle NMHV Amplitude
All evidence available to date (including [1, 9-12, 26, 44, 45]) supports the hypothesis that all 6-particle scattering amplitudes in SYM theory can be expressed in terms of cluster functions on the A 3 cluster algebra. As an application of the Hedgehog Theorem, we discuss in this section how to express the 2-loop 6-particle NMHV amplitude in a hedgehog basis. This amplitude was originally computed in [9] and written (see eq. (2.27) of that paper) as [12345](V +Ṽ ) + cyclic, where [12345] is an R-invariant and X ≡ 8(V +Ṽ ) is a weight-4 polylogarithm function. The exercise of rewriting X in a hedgehog basis has some practical benefit in that it produces a formula which is notably shorter than results previously available in the literature. But from our perspective a greater benefit of working with a hedgehog basis is that it makes some of the cluster structure of the amplitude manifest.
To highlight this point, let us note that in the presentation of [9], the amplitude X is written as a linear combination of various generalized polylogarithm functions whose symbols may be written in the 10-letter alphabet given in eq. (4.1). The relation between these variables and ours may be read off from eq. (2.9). The tenth letter $1 - y_u y_v y_w$ is not "clustery": it cannot be expressed as a product of $A_3$ cluster coordinates, so it should never appear in the symbol of anything we would call a cluster function. Indeed the full amplitude (like all 6-particle amplitudes) has the property that when all of the individual contributing polylogarithm functions are added up, this tenth letter cancels out of the symbol of the full amplitude. This is suggestive: if all these terms cancel out in the end, it seems desirable to express the amplitude in such a way that they never arise in the first place. This is exactly what an $A_3$ hedgehog basis does.
An additional 45 functions of weight 2 may be obtained by taking products of pairs of the weight-1 functions shown in eq. (4.2), so the total space of weight-2 functions on A 3 has dimension 55.
It is a simple exercise to continue enumerating Lyndon words in this manner to higher weight. We find a total of 285 functions of weight 3 and 1351 functions of weight 4, which is as far as we need to go for the purpose of expressing the 2-loop amplitude X. Symbols of functions in this hedgehog basis can be expressed in the 9-letter "q" alphabet $\{q_1, q_2, q_3, 1 - q_1, 1 - q_2, 1 - q_3, q_1 - q_2, q_1 - q_3, q_2 - q_3\}$, where $-Q = \{-q_1, -q_2, -q_3\}$ is any 3-clique of the $A_3$ factorization graph.
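The counts 9, 55, 285 and 1351 quoted in this section can be reproduced with a short counting script. The sketch below rests on an assumption about the structure of the basis that is not spelled out in this section: that the irreducible functions are labeled by Lyndon words over three alphabets of sizes 2, 3 and 4 (in general 2, ..., n+1), and that the full basis consists of all products of these. Under that assumption the number of weight-w functions is the coefficient of $t^w$ in $\prod_{m=2}^{n+1} (1 - m t)^{-1}$. All function names are hypothetical.

```python
def mobius(n):
    """Moebius function, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def lyndon_count(alphabet_size, length):
    """Number of Lyndon words of the given length (Witt's necklace formula)."""
    total = sum(mobius(d) * alphabet_size ** (length // d)
                for d in range(1, length + 1) if length % d == 0)
    return total // length


def irreducible_count(n, weight):
    """Irreducible functions at this weight: alphabets of size 2, ..., n+1 (assumed)."""
    return sum(lyndon_count(m, weight) for m in range(2, n + 2))


def dimensions_from_products(n, max_weight):
    """Count products of irreducibles: expand prod_k (1 - t^k)^(-N_k)."""
    dims = [1] + [0] * max_weight
    for weight in range(1, max_weight + 1):
        for _ in range(irreducible_count(n, weight)):
            for w in range(weight, max_weight + 1):
                dims[w] += dims[w - weight]
    return dims


def dimensions_closed_form(n, max_weight):
    """Coefficients of prod_{m=2}^{n+1} 1 / (1 - m t)."""
    dims = [1] + [0] * max_weight
    for m in range(2, n + 2):
        for w in range(1, max_weight + 1):
            dims[w] += m * dims[w - 1]
    return dims


print(dimensions_from_products(3, 4))
print(dimensions_closed_form(3, 4))
```

For n = 3 both routines return the same list, whose entries at weights 1 through 4 agree with the numbers of functions stated in the text; at weight 2 the 55 functions split into 10 irreducibles and 45 products.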

Hedgehogs for A 3
According to the Hedgehog Theorem, cliques for $A_3$ are given precisely by hedgehogs (or anti-hedgehogs), which are in one-to-one correspondence with $A_2 \subset A_3$ subalgebras. The hedgehogs for $A_3$ are triples of cluster coordinates associated with triangulations of a hexagon. In terms of the variables defined in eq. (2.9), the six hedgehogs are listed in eq. (4.6). A simple calculation using the second line of that equation quickly reveals that the difference of each pair lies in the multiplicative span of the symbol letters shown in eq. (2.13), and also that they are listed in increasing numerical order in the positive domain. These properties are less apparent from the third line.
Each of the first 9 letters of the "y" alphabet can be written as a product of elements of the "q" alphabet so, by means of the symbol rule (2.6), the symbol of the NMHV amplitude X can be written in the "q" alphabet. Each element of the hedgehog basis can be expressed in the same alphabet and, because the symbol map is linear, the symbol of the amplitude can be written as a linear combination of the symbols of the basis vectors. To find the coefficients of this linear combination, it is convenient to work in the ambient $9^4$-dimensional space of length-4 symbols in the "q" alphabet. The symbols of the 1351 weight-4 hedgehog basis vectors, together with the symbol of the amplitude, constitute 1352 vectors in this larger space satisfying exactly one linear relation. Calculating the null space of the 1352 × $9^4$ matrix with the linear algebra library SuiteSparse gives the appropriate linear combination.
To summarize, the result of this calculation is a particular linear combination of 376 elements of the weight-4 hedgehog basis whose symbol matches that of the amplitude X exactly. To find a representation for the full amplitude we turn in the next section to the problem of fixing terms of the form (transcendental numerical coefficient) × (functions of weight less than four).
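The null-space step can be illustrated on a toy problem. The sketch below is not the sparse computation described above: it uses dense exact row reduction over the rationals, which is only practical for small examples, and the helper names are hypothetical. The toy data are length-2 symbols in a two-letter alphabet {a, b}, written as vectors in the basis of words (aa, ab, ba, bb): the three "basis functions" are $\frac{1}{2}\log^2 a$, $\log a \log b$ and $\frac{1}{2}\log^2 b$, and the target is $\frac{1}{2}\log^2(ab)$.

```python
from fractions import Fraction


def left_null_space(rows):
    """Exact basis of all c with sum_i c[i] * rows[i] == 0."""
    n, width = len(rows), len(rows[0])
    # Append an identity block so each row remembers how it was built.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(rows)]
    pivot_row = 0
    for col in range(width):
        pivot = next((k for k in range(pivot_row, n) if aug[k][col] != 0), None)
        if pivot is None:
            continue
        aug[pivot_row], aug[pivot] = aug[pivot], aug[pivot_row]
        for k in range(pivot_row + 1, n):
            if aug[k][col] != 0:
                factor = aug[k][col] / aug[pivot_row][col]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[pivot_row])]
        pivot_row += 1
    # Rows whose symbol part vanished encode the linear relations.
    return [row[width:] for row in aug[pivot_row:]]


def expand_in_basis(basis, target):
    """Coefficients c with target == sum_i c[i] * basis[i]."""
    relations = left_null_space(basis + [target])
    if len(relations) != 1:
        raise ValueError("expected exactly one linear relation")
    relation = relations[0]
    return [-c / relation[-1] for c in relation[:-1]]


basis_symbols = [
    [1, 0, 0, 0],   # a (x) a            : symbol of log(a)^2 / 2
    [0, 1, 1, 0],   # a (x) b + b (x) a  : symbol of log(a) log(b)
    [0, 0, 0, 1],   # b (x) b            : symbol of log(b)^2 / 2
]
target_symbol = [1, 1, 1, 1]  # (ab) (x) (ab) expanded by multiplicativity

print(expand_in_basis(basis_symbols, target_symbol))
```

As in the amplitude computation, the stacked matrix has exactly one relation among its rows, and normalizing that relation on the target row yields the expansion coefficients.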

Fixing Beyond-the-Symbol Terms
If the symbols of two functions are equal, then the functions are equal, modulo "beyond-the-symbol" terms of lower weight. So this 376-term expression is the highest-weight part of the NMHV amplitude. A priori, we might expect up to 65 possible terms of lower weight. These include 55 weight-2 functions times ζ(2), 9 weight-1 functions times ζ(3), and one overall additive constant proportional to ζ(4). The coefficients of these 65 terms can be fixed by numerically evaluating the amplitude and our 376-term highest-weight expression at 65 random points in the positive domain and performing a row reduction. All the coefficients turn out to be rational numbers with small denominators. Our final result 13 is a 416-term expression for the 2-loop, 6-particle NMHV amplitude X. The validity of our ansatz, and solution, for the lower-weight terms has been stringently tested by comparing our result to the known expression at high precision for additional random kinematic points 14 .
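The fitting procedure can be sketched on a toy example. Everything below is illustrative: the four ansatz functions of a single positive variable, the "remainder" being fitted, the sampling range and the denominator bound are all hypothetical stand-ins for the 65-term ansatz and the actual amplitude. The sketch samples as many random points as there are unknowns, solves the resulting linear system by Gauss-Jordan elimination in floating point, and then rounds each coefficient to the nearest rational with a small denominator.

```python
import math
import random
from fractions import Fraction

ZETA2 = math.pi ** 2 / 6
ZETA3 = 1.2020569031595942  # Apery's constant
ZETA4 = math.pi ** 4 / 90

# Hypothetical ansatz: (zeta value) x (lower-weight function of x).
ANSATZ = [
    lambda x: ZETA2 * math.log(x) * math.log(1 + x),
    lambda x: ZETA2 * math.log(x) ** 2,
    lambda x: ZETA3 * math.log(x),
    lambda x: ZETA4,
]


def solve(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(remainder, ansatz, rng, max_denominator=100):
    """Fit rational coefficients from one random sample point per unknown."""
    points = [math.exp(rng.uniform(-2.0, 2.0)) for _ in ansatz]
    matrix = [[g(p) for g in ansatz] for p in points]
    rhs = [remainder(p) for p in points]
    return [Fraction(c).limit_denominator(max_denominator)
            for c in solve(matrix, rhs)]


def toy_remainder(x):
    """Stand-in for (amplitude) - (highest-weight part), with known coefficients."""
    return 0.5 * ANSATZ[0](x) - 3 * ANSATZ[2](x) + 1.75 * ANSATZ[3](x)


print(fit(toy_remainder, ANSATZ, random.Random(0)))
```

In a real application one would, as the text describes, confirm the fitted coefficients by evaluating at additional independent points.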

Outlook
Hedgehog bases give a natural way to express 6-particle amplitudes, since they make manifest that these amplitudes have symbols which can be expressed in terms of $A_3$ cluster coordinates. In practice, this may translate into more "compact" representations of amplitudes than might be otherwise achieved. It should be stressed again that the hedgehog basis is a true basis for cluster functions, with no functional or linear relations between its elements.
However, hedgehog bases are clearly not the ultimate solution for representing scattering amplitudes. The most important reason is that amplitudes satisfy a stringent analytic constraint on the possible locations of their branch points, which translates into a condition that allows only certain letters to appear in the first entry of their symbols. For example, 6-particle amplitudes may only have the letters {u, v, w} in the first entry of their symbols, whereas all nine letters of the A 3 symbol alphabet appear as first entries in the hedgehog basis. It would be extremely interesting, as well as of great practical utility, to see if there is a natural way to construct bases of cluster functions manifesting this additional property. It would also be very interesting, both mathematically and physically, to find an appropriate extension of the Hedgehog Theorem to algebras other than A n .

Acknowledgments
We have benefitted from enormously valuable discussions with J. Drummond, correspondence with L. Dixon, and collaboration on closely related topics with J. Golden and A. Goncharov. This work was supported by the US Department of Energy under contract DE-SC0010010 (MS) and the DE-FG02-11ER41742 Early Career Award (AV), by the Sloan Research Foundation (AV), and by Brown University UTRA Awards (DP and AS). MS and AV are also grateful to the CERN Theory Group for support during the completion of this work.

A Triangulations and A n Cluster Algebras
Here we review from [30] the fact that in the special case of $A_n$ cluster algebras there is a convenient alternative to representing clusters with quivers: each cluster can instead be associated with a triangulation of an $(n + 3)$-sided polygon. Beginning with a labeled $(n + 3)$-gon, a triangulation is obtained by repeatedly adding non-crossing internal chords $ik$ between nonadjacent vertices $i$, $k$ until no further chords can be added. There are always $n$ chords in a triangulation. For example, the five chords in a particular triangulation of an octagon are shown here:
[Figure: a triangulated octagon with one chord labeled ik.]
Cluster coordinates are associated with these chords. Specifically, the chord $ik$ shared between two triangles $ijk$ and $ikl$ is associated with the cluster variable 15 $r(i, j, k, l)$, which satisfies $r(j, k, l, i) = 1/r(i, j, k, l)$ and is built from the Plücker coordinates $\langle ij \rangle$ of pairs of points $z_i$, $z_j$ in $\mathbb{P}^1$.
In this representation, mutations are associated with chord-flips. To perform a chord-flip on $r(i, j, k, l)$, remove the chord $ik$ and add the chord $jl$. It is easy to see that the resulting variable $r(j, k, l, i)$ is indeed equal to $1/r(i, j, k, l)$; adjacent chords (those which lie on the same triangle) take the place of adjacent nodes in a quiver.
The added convenience of using triangulations over quivers comes from the fact that every triangulation is associated with a single cluster whose variables can be found explicitly via the formula above. Using this explicit formula, one can determine many useful facts about $A_n$: its order is the Catalan number $C_{n+1}$, there are $2\binom{n+3}{4}$ cluster variables, and those variables are $r(i, j, k, l)$ for cyclically ordered $i, j, k, l$.
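These counts are easy to confirm by brute force for small n. The following sketch enumerates all triangulations of a convex polygon recursively and labels each cluster variable by the quadrilateral formed by the two triangles adjacent to a chord, together with that chord. It works purely combinatorially, with vertices labeled from 0, and never evaluates a cross-ratio; the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb


def triangulations(verts):
    """All triangulations of the convex polygon on `verts`, as lists of triangles."""
    if len(verts) < 3:
        return [[]]
    first, last = verts[0], verts[-1]
    result = []
    # The side (first, last) lies in exactly one triangle; choose its apex.
    for i in range(1, len(verts) - 1):
        for left, right in product(triangulations(verts[:i + 1]),
                                   triangulations(verts[i:])):
            result.append([(first, verts[i], last)] + left + right)
    return result


def cluster_variable_labels(size):
    """Distinct (quadrilateral, diagonal) pairs over all triangulations."""
    labels = set()
    for triangles in triangulations(tuple(range(size))):
        for t1, t2 in combinations(triangles, 2):
            shared = set(t1) & set(t2)
            if len(shared) == 2:  # the two triangles meet along a chord
                quad = tuple(sorted(set(t1) | set(t2)))
                labels.add((quad, tuple(sorted(shared))))
    return labels


def catalan(m):
    return comb(2 * m, m) // (m + 1)


for n in (2, 3, 5):
    size = n + 3
    print(n, len(triangulations(tuple(range(size)))), catalan(n + 1),
          len(cluster_variable_labels(size)), 2 * comb(size, 4))
```

For each n the number of triangulations matches $C_{n+1}$ and the number of labels matches $2\binom{n+3}{4}$; for n = 2 this gives the ten cluster coordinates of $A_2$.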
Triangulations also make it easy to enumerate and analyze subalgebras. Consider the case n = 5. Clusters of $A_5$ correspond to triangulations of a labeled octagon. Selecting the vertices 1, 3, 4, 7, and 8, we can form a pentagon within our octagon. Some triangulations of the octagon contain all of the edges of the pentagon (13, 34, 47, 78, 81), either as chords or as sides of the octagon. The subtriangulation obtained by discarding everything outside the pentagon is associated with a cluster of $A_2$. By flipping only the chords lying strictly within the pentagon, we can obtain other $A_2$ clusters, until we have an entire $A_2$ subalgebra. By flipping the chords lying strictly outside the pentagon, or choosing a different pentagon to begin with, we can obtain different $A_2$ subalgebras. Note that the different subalgebras sharing the same pentagonal boundary must all have the same set of cluster variables; therefore, if we consider two subalgebras "equivalent" when they have the same variables, there are exactly $\binom{8}{5} = 56$ nonequivalent $A_2$ subalgebras of $A_5$.
This generalizes nicely: the clusters of $A_m$ subalgebras of $A_n$ correspond to $(m+3)$-vertex subtriangulations of $(n + 3)$-gon triangulations. Up to equivalence, there are $\binom{n+3}{m+3}$ of these.

B A Theorem on 1 + X Coordinates
In this appendix we prove that for all cluster coordinates x, the quantity 1 + x can be expressed as a product of cluster coordinates on the same algebra.
To be precise: Suppose $C$ is a cluster algebra of A, D, or E type whose quivers are connected with more than one node. Suppose $X_C$ is its set of cluster coordinates. Then for every $x_i \in X_C$, $1 + x_i = \prod_j x_j^{n_j}$, with the product running over $x_j \in X_C$, for some $n_j \in \{-1, 0, 1\}$.
The proof of this statement is straightforward. Pick some quiver of $C$ that contains $x_i$. By connectedness, there exists some $x_k$ connected to it. One of the properties of an A, D or E-type cluster algebra is that $|B_{ij}| \le 1$ for all $i, j$. In particular, $B_{ik} \in \{-1, 0, 1\}$. Recall the mutation rule for cluster coordinates: depending on the sign of $B_{ik}$, mutating on $x_i$ multiplies $x_k$ either by $1 + x_i$ or by $(1 + 1/x_i)^{-1} = x_i/(1 + x_i)$. In the first case $1 + x_i = x_k'/x_k$, where $x_k'$ denotes the mutated coordinate; in the second, $1 + x_i = x_i x_k/x_k'$. Since $x_i$ and $x_k$ are connected, $B_{ik} \neq 0$, so the two cases $B_{ik} = 1$ and $B_{ik} = -1$ are exhaustive. Thus $1 + x_i$ factors as a product of cluster coordinates. Connectedness of a quiver is preserved by mutation, so if the initial quiver is connected, all quivers of an algebra are connected. Note also that the algebra $A_1$, as well as its derived algebras such as $A_1 \times A_2$, fall outside the hypotheses above (a single node, or a disconnected quiver), hence expressions of the form $1 + x$ do not necessarily factor in this case.
Let us note here that a sort of converse statement, which has been stated and used for example in [1,4,5], remains a conjecture: If a and b are two elements of the multiplicative span M A for some cluster algebra A, and if b = a + 1, then precisely one element of the set {a, −1 − a, −1 − 1/a} is a cluster coordinate.

C A Cluster Parameterization of 6-Particle Kinematics
We include here a parameterization of the positive domain of 6-particle scattering kinematics, in terms of momentum twistors, that we have found useful. It is easily checked that this parameterization lies in the positive domain (that is, all minors $\langle ijkl \rangle > 0$ when $i < j < k < l$) whenever $x_1, x_2, x_3 > 0$, and that when plugged into the last column of eq. (2.9), it precisely reproduces the second column.

D Hedgehogs are Cliques for A n Cluster Algebras
Here we provide the details of the proof of the Hedgehog Theorem presented in section 3.3. First consider the case n = 2. The general case can be reduced to the n = 2 case, so it is worth doing in detail. As reviewed in section 2.2, the mutation relations for $A \cong A_2$ give ten cluster coordinates $\{x_1, 1/x_1, \ldots, x_5, 1/x_5\}$ related by the exchange relations given there. For $B \cong A_1$ let us choose the subalgebra with coordinates $\{x_i, 1/x_i\}$. So the relevant hedgehog is $X(A, B) = \{x_{i+1}, 1/x_{i-1}\}$. Pictorially, the hedgehog consists of the red and blue edges in the exchange graph shown in figure 6A.
Figure 6: (A) The exchange graph for $A_2$, with the two vertices on the bottom row constituting an $A_1$ subalgebra. The hedgehog $X(A_2, A_1)$ contains the two cluster variables $1/x_{i-1}$, $x_{i+1}$ associated to the edges emanating away from the subalgebra. (B) The same exchange graph, but with each vertex showing the associated pentagon triangulation.
This is a clique because the difference of its two elements factors into a product of cluster coordinates. Recasting this in terms of polygon triangulations is very illuminating. Recall that $A_2$ can be described in terms of pentagon triangulations. The red dashed lines in figure 6B are the chords that change as the red edge is traversed, and similarly for the blue. What we have shown then is that the difference between the cluster coordinates for the red and blue edges can be written in terms of products of cluster coordinates.
Now for the general case consider an $(n + 3)$-gon. Choose three adjacent vertices $k$, $k + 1$, $k + 2$ and draw the chord $k(k + 2)$. This chord separates a triangle from an $(n + 2)$-gon. The variable associated with this chord depends only on the triangles containing it, and not on the rest of the triangulation. The triangle on one side of the chord will always have the vertices $k$, $k + 1$, $k + 2$. The other triangle will have vertices $k$, $k + 2$, $i$, where $i$ is any of the $n$ remaining vertices.
[Figure: three copies of the polygon with the vertices k, k+1, k+2 marked; the second and third also mark a vertex i.]
Therefore, there are $n$ cluster variables that can be associated with $k(k + 2)$. These $n$ variables are given by
$\{ r(i, k, k+1, k+2) : i \notin \{k, k+1, k+2\} \}$. (D.3)
Consider the subalgebra $B \cong A_{n-1}$ associated with the $(n + 2)$-gon that excludes vertex $k + 1$. Any triangulation containing $k(k + 2)$ will contain a triangulation of this polygon, and hence will be associated with a cluster in $B$. However, flipping $k(k + 2)$ will yield a triangulation that is not in $B$; therefore, the set of cluster variables associated with $k(k + 2)$ is a hedgehog of $A_n$! Because there are $n + 3$ choices for $k$, all hedgehogs can be so obtained, and all will be of cardinality $n$. We can also obtain the anti-hedgehogs, which are associated with the result of any chord-flip of $k(k + 2)$:
$\{ r(k, k+1, k+2, i) : i \notin \{k, k+1, k+2\} \}$. (D.4)
Take $x_i$, $x_j$ to be two arbitrary elements of the hedgehog, with $(i, j, k)$ cyclically ordered.
Then the mutation $x_i \to 1/x_i$ corresponds to flipping $k(k + 2)$ to $(k + 1)i$ and $x_j \to 1/x_j$ corresponds to flipping $k(k + 2)$ to $(k + 1)j$. These are indicated in figure 7 in (C) and (A) respectively. Any other sub-triangulation of the gray region of (A) preserves $x_j$, so in particular one can choose the sub-triangulation with the pentagon $\{k, k + 1, k + 2, i, j\}$, shown in (B). Similarly, one can go from (C) to (D) and $x_i$ will still be accessible by the red chord flip. But now notice that this is exactly the situation from the $A_2$ case! There is an embedded pentagon with exactly the same triangulations that appeared above. Therefore $x_i - x_j$ factors as a product of cluster coordinates, i.e. $x_i - x_j \in M_{A_n}$. We can also show this algebraically, making use of a Plücker relation, as displayed in eq. (3.11).
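The n = 2 step on which the whole argument rests can be verified in exact arithmetic. The sketch below assumes the $A_2$ exchange relation in the form $1 + x_i = x_{i-1} x_{i+1}$ (the relation itself is not reproduced in this appendix, so this normalization is an assumption) and checks two things: that the resulting sequence is 5-periodic, and that the two elements of the hedgehog $\{x_{i+1}, 1/x_{i-1}\}$ differ by the ratio $x_i / x_{i-1}$ of cluster coordinates. The function names are hypothetical.

```python
from fractions import Fraction


def a2_coordinates(x1, x2):
    """x_1, ..., x_7 from the assumed recurrence x_{i+1} = (1 + x_i) / x_{i-1}."""
    xs = [Fraction(x1), Fraction(x2)]
    while len(xs) < 7:
        xs.append((1 + xs[-1]) / xs[-2])
    return xs


def is_periodic(xs):
    """The recurrence closes up after five steps."""
    return xs[5] == xs[0] and xs[6] == xs[1]


def hedgehog_is_clique(xs):
    """x_{i+1} - 1/x_{i-1} == x_i / x_{i-1} for every i, indices taken mod 5."""
    five = xs[:5]
    return all(
        five[(i + 1) % 5] - 1 / five[(i - 1) % 5] == five[i] / five[(i - 1) % 5]
        for i in range(5)
    )


coords = a2_coordinates(2, 3)
print(coords[:5], is_periodic(coords), hedgehog_is_clique(coords))
```

The identity follows in one line from the assumed relation, since $x_{i+1} - 1/x_{i-1} = (x_{i-1}x_{i+1} - 1)/x_{i-1}$; the code simply confirms it at a sample point for all five choices of the $A_1$ subalgebra.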