The symbol and alphabet of two-loop NMHV amplitudes from $\bar{Q}$ equations

We study the symbol and the alphabet for two-loop NMHV amplitudes in planar $\mathcal{N}=4$ super-Yang-Mills from the $\bar{Q}$ equations, which provide a first-principles method for computing multi-loop amplitudes. Starting from one-loop N$^2$MHV ratio functions, we explain in detail how to use the $\bar{Q}$ equations to obtain the total differential of two-loop n-point NMHV amplitudes, whose symbol contains letters that are algebraic functions of the kinematics for n ≥ 8. We present explicit formulae, with simple patterns, for the part of the symbol involving algebraic letters for all multiplicities, and we find 17 − 2m multiplicatively independent letters for a given square root of a Gram determinant, with 0 ≤ m ≤ 4 depending on the number of particles involved in the square root. We also observe that these algebraic letters can be found as poles of one-loop four-mass leading singularities with MHV or NMHV trees. As a byproduct of our algebraic results, we find a large class of components of two-loop NMHV amplitudes which can be written as differences of two double-pentagon integrals and are particularly simple and free of square roots.
As an example, we present the complete symbol for n = 9 whose alphabet contains 59 × 9 rational letters, in addition to the 11 × 9 independent algebraic ones. We also give all-loop NMHV last-entry conditions for all multiplicities.


Introduction
Scattering amplitudes are central objects in fundamental physics: not only do they play a crucial role in connecting theory to high-energy experiments such as the Large Hadron Collider, but they also provide new insights into Quantum Field Theory (QFT) itself. Tremendous progress has been made in unravelling hidden mathematical structures of perturbative scattering amplitudes, especially in planar $\mathcal{N}=4$ supersymmetric Yang-Mills theory (SYM) at the (all-loop) integrand level (cf. [1][2][3]). Moreover, the theory has become an extremely fruitful playground for new methods of evaluating multi-loop Feynman integrals, a subject of enormous interest in its own right (cf. [4][5][6] and references therein). Along these fascinating directions, we have discovered numerous new structures of the theory (and in many [...] JHEP03(2021)278 [...] determine MHV and NMHV amplitudes given lower-loop ones. No serious attempts have yet been made to include both sets of equations, which would allow us to compute amplitudes with k ≥ 2 and in turn MHV and NMHV amplitudes at higher loops, thus significantly pushing the limits of this method. Within the limitations of the $\bar{Q}$ equations, the first application of the method produced the complete symbol of the two-loop MHV amplitudes for all n, the two-loop NMHV heptagon, and the three-loop MHV hexagon [24]; for external kinematics in two dimensions, all two-loop NMHV and three-loop MHV amplitudes have been computed using the $\bar{Q}$ equations [26]. Moreover, the equations provide all-loop constraints on scattering amplitudes which have proved to be very useful. As we will show shortly, without any loop-specific computation, the $\bar{Q}$ equations provide the so-called last-entry conditions for the symbol [27][28][29] of not only MHV but also NMHV amplitudes, to all loops and all multiplicities,¹ and for n = 6, 7 they have been exploited in the hexagon and heptagon bootstrap.
A crucial assumption of the hexagon and heptagon bootstrap is that the collection of letters entering the symbol, or the alphabet, consists of only 9 and 42 variables, respectively, known as cluster coordinates [30]; the main challenge starting at n = 8 is the lack of control over the symbol alphabet. The cluster algebra of $G_+(4,n)$ for n ≥ 8 is of infinite type, and it is unclear which letters can appear for L-loop N$^k$MHV amplitudes (see progress on the analysis based on Landau equations [31,32]). In addition, a new feature for n ≥ 8 is the appearance of algebraic letters, which can no longer be written as rational functions of momentum twistors [33]. It is of great interest, even at two loops, to understand the symbol alphabet and in particular which algebraic letters appear for n ≥ 8.
In our recent paper [34], we computed the symbol of the two-loop NMHV octagon using the $\bar{Q}$ equations, as the first example of a multi-loop amplitude with algebraic letters. We determined the alphabet of the two-loop NMHV octagon, which consists of 180 rational letters and 18 algebraic ones that are independent under multiplicative relations. The n = 8 alphabet of algebraic letters [34] has been explained, and conjectured to hold at higher loops, using a mathematical construction based on the tropical Grassmannian [35,36] (see related ideas on the positive configuration space [37]). More recently, these 18 algebraic letters have also been obtained by studying n = 8, k = 2 Yangian invariants or leading singularities [38,39] (see earlier works on "cluster adjacency" properties of rational letters based on poles of Yangian invariants [40,41]). It is tempting to ask whether all these results on algebraic letters and the symbol can be extended to higher multiplicities.
In this paper, we systematically derive such all-multiplicity results using the $\bar{Q}$ equations, both at two loops and to all loop orders, for NMHV amplitudes. Our results can be divided into three parts. First, since the $\bar{Q}$ equations alone determine the total differential of MHV and NMHV amplitudes, the last entries of their symbols directly follow. The famous MHV last-entry conditions [42] follow from a simple residue computation on the NMHV Yangian invariant [24], without specifying the loop order. We will show that, by the same residue computation on all possible N$^2$MHV Yangian invariants, we obtain the complete last-entry conditions of n-point NMHV amplitudes, which we expect to be valid to all loops.

¹ Recall that MHV and NMHV amplitudes are expected to contain only generalized polylogarithms of weight 2L at L loops [2].

As the main result of the paper, we present the $\bar{Q}$ calculation for the "new" part of the two-loop n-point NMHV amplitudes that contains algebraic symbol letters. By showing how to compute the action of collinear integrals on four-mass boxes of one-loop N$^2$MHV amplitudes, we determine the algebraic part of the symbol. All algebraic letters are grouped according to which "four-mass" square root $\Delta_{a,b,c,d}$ they contain: we find exactly 17 − 2m multiplicatively independent algebraic letters for each ∆, where 0 ≤ m ≤ 4 is the number of corners of the four-mass box that contain only two particles. In the most generic case, when all four corners contain more than two particles (m = 0), we have 17 independent letters which depend on at least 12 particles, a−1, a, a+1, ..., d−1, d, d+1; the cases with m > 0 are degenerate ones where some of the labels coincide. This nicely generalizes the 9 independent algebraic letters found for each square root at n = 8 [34], where m = 4. Moreover, we find that the symbol for this algebraic part can be written in a compact form: all new algebraic letters appear only in the third entry, and the first two entries are the symbol of the corresponding four-mass box; these weight-3 functions are also interlocked with specific (rational) last entries.
Two interesting observations can be made about our results on the algebraic alphabet and words. First, we will show that all the algebraic letters can be interpreted as "letters" or poles of one-loop four-mass leading singularities with four trees that are either MHV or NMHV, which generalizes the results for n = 8 in [38,39]. Second, we find a large class of components of two-loop n-point NMHV amplitudes which are free of algebraic letters. They are the simplest NMHV components, namely the coefficients of $\chi_i \chi_j \chi_k \chi_l$ for any non-adjacent i, j, k, l, and we show that they are completely free of square roots. Any such component can be written as the difference of two (cyclically related) double-pentagon integrals [43], connecting our result to these important two-loop integrals.
Finally, as the computation of the remaining part, which is independent of any algebraic letters, is straightforward but tedious, we content ourselves with presenting the explicit symbol for n = 9. We find precisely 59 × 9 rational letters in addition to the 11 × 9 independent algebraic letters. Almost all rational letters are predicted by Landau equations, except for one cyclic class, and we expect similar results for the rational alphabet to extend to all multiplicities. We also show that all the algebraic letters are consistent with the rational letters in the sense of Landau analysis.
The paper is organized as follows. In section 2, after a quick review of the $\bar{Q}$ equations, we list the complete collection of last-entry conditions for n-point NMHV amplitudes to all loop orders. In section 3, we show how to compute the action of collinear integrals on four-mass boxes of the one-loop N$^2$MHV amplitudes, which allows us to compute the algebraic part of the two-loop NMHV amplitudes. In section 4, we present the 17 − 2m independent algebraic letters after finding all the multiplicative relations they satisfy; we make an observation connecting them to one-loop four-mass leading singularities; we also describe the patterns in which these algebraic letters appear in the symbol, and the implication for a large class of components. In section 5, we present the full symbol of the two-loop n = 9 amplitude, including the complete alphabet.

A lightning review of $\bar{Q}$ equations
The infrared divergences of scattering amplitudes in planar $\mathcal{N}=4$ SYM exponentiate [44], and they are captured by the famous BDS ansatz [25]. For the n-point N$^k$MHV amplitude $A_{n,k}$, we define an infrared-finite object, the so-called BDS-normalized amplitude $R_{n,k} = A_{n,k}/A^{\rm BDS}_n$. $R_{n,k}$ is dual conformal invariant and enjoys a chiral half of the dual superconformal symmetries, but it is not invariant under the action of the other half [24], generated by $\bar{Q}^A_a = \sum_{i=1}^n \chi_i^A\, \partial/\partial Z_i^a$, where $Z_i$ and $\chi_i$ are the bosonic and fermionic parts of the super momentum-twistor built from the dual super coordinates (x|θ). The remaining unbroken SL(4|4) dual superconformal generators include the "good" chiral half $Q^a_A$. Thus super momentum-twistors make dual superconformal symmetry manifest, while the usual superconformal symmetry acts via level-one generators [45]. The most basic SL(4) invariants that one can build from bosonic momentum-twistors are the Plücker coordinates of Gr(4, n): $\langle i\,j\,k\,l\rangle := \varepsilon_{abcd} Z_i^a Z_j^b Z_k^c Z_l^d$. Using super momentum-twistors, one can build dual superconformal invariants or even Yangian invariants, which are most generally written in terms of contour integrals inside the positive Grassmannian [2,46,47]. For example, the most basic Yangian invariant, which was originally called the R invariant [21,48], reads

$$ [i\,j\,k\,l\,m] := \frac{\delta^{0|4}\big(\chi_i \langle j\,k\,l\,m\rangle + \text{cyclic}\big)}{\langle i\,j\,k\,l\rangle \langle j\,k\,l\,m\rangle \langle k\,l\,m\,i\rangle \langle l\,m\,i\,j\rangle \langle m\,i\,j\,k\rangle}. \qquad (2.3) $$

It is antisymmetric in the five particle indices and satisfies the so-called six-term identity

$$ [i\,j\,k\,l\,m] - [j\,k\,l\,m\,n] + [k\,l\,m\,n\,i] - [l\,m\,n\,i\,j] + [m\,n\,i\,j\,k] - [n\,i\,j\,k\,l] = 0. \qquad (2.4) $$

The R invariants are the most general Yangian invariants for k = 1; for general k, Yangian invariants are leading singularities of loop amplitudes, including the BCFW terms appearing in tree amplitudes. The NMHV tree amplitude, which we use shortly, is simply given by a sum of R invariants: $R^{\rm tree}_{n,1} = \sum_{i<j} [1\; i\; i{+}1\; j\; j{+}1]$ (one can replace the label 1 by any other label here, which gives the same result).
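As a small illustration of the Plücker coordinates just introduced, the following sketch evaluates four-brackets as 4×4 determinants of bosonic momentum twistors and checks two of their basic properties: antisymmetry, and the invariance of a cross-ratio under independent rescalings of each twistor. The sample twistors on the moment curve and the helper names `det4`, `bracket` and `cross_ratio` are assumptions made purely for illustration; they are not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations


def det4(rows):
    """Determinant of a 4x4 matrix via the Leibniz formula (exact for Fractions)."""
    total = Fraction(0)
    for perm in permutations(range(4)):
        # Sign of the permutation from its number of inversions.
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        term = Fraction(-1) ** inversions
        for row, col in enumerate(perm):
            term *= rows[row][col]
        total += term
    return total


def bracket(Z, i, j, k, l):
    """Four-bracket <i j k l> = det(Z_i, Z_j, Z_k, Z_l); Z maps labels to 4-vectors."""
    return det4([Z[i], Z[j], Z[k], Z[l]])


def cross_ratio(Z):
    """A ratio in which every label appears once upstairs and once downstairs."""
    return (bracket(Z, 1, 2, 3, 4) * bracket(Z, 5, 6, 7, 8)) / (
        bracket(Z, 1, 2, 5, 6) * bracket(Z, 3, 4, 7, 8)
    )


# Hypothetical sample kinematics: eight twistors on the moment curve (1, x, x^2, x^3).
Z = {i: [Fraction(i) ** p for p in range(4)] for i in range(1, 9)}

# Rescale each twistor independently; the cross-ratio should not change.
Z_rescaled = {i: [Fraction(i + 1, 3) * c for c in Z[i]] for i in Z}

print(bracket(Z, 1, 2, 3, 4))
print(cross_ratio(Z) == cross_ratio(Z_rescaled))
```

Exact rational arithmetic is used so that the invariance check is an equality rather than a floating-point comparison.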
One of the main results of [24] is the following anomaly equation for the $\bar{Q}$ generators: it has been argued, based on a Wilson-loop analysis, that $\bar{Q}$ of the amplitude is given by an integral of a higher-point amplitude with a fermion insertion (which increases k) in the collinear limit, and taking into account $\bar{Q}$ of the BDS ansatz we have:

where $\Gamma_{\rm cusp}$ is the cusp anomalous dimension [49] and "cyclic" denotes the n−1 images of the foregoing term under the rotation 1 → 2 → ... → n → 1. Eq. (2.5) is conjectured to hold non-perturbatively [24]. On the r.h.s., we have shown the term with particle n+1 added in the collinear limit with n, and its (super) momentum-twistor $Z_{n+1}$ is parameterized by ε and τ as in (2.6), with $C = \frac{\langle n{-}1\; n\; 2\; 3\rangle}{\langle n\; 1\; 2\; 3\rangle}$ and $C' = \frac{\langle n{-}2\; n{-}1\; n\; 1\rangle}{\langle n{-}2\; n{-}1\; 2\; 1\rangle}$. The integral measure $(d^{2|3} Z_{n+1})^A_a$ consists of the bosonic part $(d^2 Z_{n+1})_a := \varepsilon_{abcd} Z^b_{n+1} dZ^c_{n+1} dZ^d_{n+1}$ and the fermionic part $(d^3\chi_{n+1})^A$; using (2.6) the bosonic measure can be written as [...] with $(\bar{n})_a := (n{-}1\; n\; 1)_a$. Thus computing the $\bar{Q}$ anomaly is straightforward: after performing the three-fold fermionic integration over $\chi_{n+1}$ and inserting the collinear parametrization (2.6) for both $Z_{n+1}$ and the remaining $\chi_{n+1}$, we extract the coefficient of dε/ε in the collinear limit ε → 0 (this is the meaning of $\mathrm{Res}_{\epsilon=0}$), and finally we integrate over the "momentum fraction" τ from 0 to ∞. As shown in [24], it is precisely the difference of the two terms in the bracket that ensures that the r.h.s. of (2.5) is finite: not only do possible log ε divergences cancel, but the combination is also free of endpoint divergences in the τ-integral, which serves as an important consistency check of the calculation. In practice, we can make enormous progress by expanding (2.5) perturbatively: it relates the anomaly of the L-loop amplitude, $R^{(L)}_{n,k}$, to lower-loop ones such as $R^{(L-1)}_{n+1,k+1}$. The next question is whether we can use the anomaly to determine the full amplitude, which amounts to solving (2.5), viewed as a collection of first-order differential equations. From a "modern" perspective (cf. [50]), the right thing to do is to replace the Grassmann variables by differentials of momentum twistors (which are also anti-commuting) [51], $\chi^A_i \to dZ^A_i$ for i = 1, 2, ..., n, where we have identified the index of dZ with the R-symmetry index; by taking the trace of the operator $\sum_{i=1}^n dZ^A_i\, \partial/\partial Z^a_i$ we find that $dR^{(1)}_{n,k}$ is given by the r.h.s. of (2.5) with the same replacement $\chi_i \to dZ_i$. The remaining task in solving the differential equations is just to determine the "kernel" of $\bar{Q}$ (or of d, if we make the replacement). As shown in [24], for N$^k$MHV amplitudes with k ≥ 2, the kernel of $\bar{Q}$ does involve non-trivial dual conformal functions, so the $\bar{Q}$ equations cannot determine the result uniquely unless they are supplemented with the parity-conjugate $Q^{(1)}$ equations (with both the $\bar{Q}$ and $Q^{(1)}$ equations, the kernel must be a linear combination of Yangian invariants, which can be determined in turn by collinear limits etc.). However, as we will see, for NMHV (and MHV) amplitudes (2.5) is very powerful, as the kernel is essentially trivial, and it alone allows us to compute the differential of R [...] Nevertheless, one can prove that $\bar{Q}^A_a$ can never annihilate a function of the form [i j k l m] F(Z), where F(Z) is a conformally invariant function of bosonic momentum-twistors [24]. Thus, the $\bar{Q}$ equations (2.5), supplemented with dual conformal invariance, can determine the differential of NMHV amplitudes on their own.

Last-entry conditions for all-loop NMHV amplitudes
It is expected that MHV and NMHV BDS-normalized amplitudes admit a schematic form [...] where $Y_{n,k}$ denote the loop-independent Yangian invariants (recall $Y_{n,k=0} = 1$ and all $Y_{n,k=1}$ are given by R invariants of the form (2.3)), and the $I^{(2L)}$'s are linear combinations of generalized polylogarithms of weight 2L. They can be defined by 2L-fold iterated integrals [52], $G(a_1, \ldots, a_{2L}; z) := \int_0^z \frac{dt}{t - a_1}\, G(a_2, \ldots, a_{2L}; t)$, with the starting point $G(; z) := 1$. It is straightforward to see that the differential of a generalized polylogarithm takes the form $dI^{(2L)} = \sum_\beta I^{(2L-1)}_\beta\, d\log s_\beta$, where the $I^{(2L-1)}_\beta$ are generalized polylogarithms of weight 2L−1. Then, one can introduce a symbol map for generalized polylogarithms by recursively defining $\mathcal{S}(I^{(2L)}) := \sum_\beta \mathcal{S}(I^{(2L-1)}_\beta) \otimes s_\beta$, with $\mathcal{S}(\log a) := a$. We call the $s_\beta$'s generated in this way the symbol letters, each tensor product of letters a word, and the collection of all letters the symbol alphabet [27,53]. Now, it is natural to write the differential of NMHV or MHV L-loop amplitudes in the form (2.13). To derive (2.13) from the $\bar{Q}$ equations (2.5), we use the fact that the r.h.s. of (2.5) consists of terms of the form $Y_{n+1,k+1} F$.³ The key point is that the fermionic integral over $d^3\chi_{n+1}$ and the residue $\mathrm{Res}_{\epsilon=0}$ can be performed on the Yangian invariants $Y_{n+1,k+1}$ independently of the transcendental functions F. One may worry about the $\log^{L-1}\epsilon$ divergences arising from the collinear limit of F, but these divergences always cancel after integrating over τ, as shown in [24]. For MHV (k = 0), the ε-residue and the $d^3\chi_{n+1}$ integration yield terms of the form [...] with some rational function $f_{i,j}(\tau)$ for each term. The last step is trivial for MHV: we can simply replace $\chi^A_i$ in $\bar{Q}^A_a$ with $dZ^A_i$ and then take the trace to obtain the exterior derivative $d := \sum_{i,a} dZ^a_i\, \partial/\partial Z^a_i$, which reproduces the well-known MHV final entries $d\log\langle i{-}1\; i\; i{+}1\; j\rangle$ after collecting all cyclic terms. The one-dimensional τ-integrals of F give the weight-(2L−1) functions in (2.13).

³ For k > 0, we do not assume the transcendental functions F to be generalized polylogarithms, but we still schematically assign a "weight" 2L−2 to such functions at L−1 loops. In turn, from the $\bar{Q}$ equations alone we cannot prove that NMHV amplitudes must be generalized polylogarithms without inspecting the structure of the N$^2$MHV amplitudes on the r.h.s., though we expect this to be true.
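The recursive symbol map described in this section can be illustrated with a minimal sketch. It represents a symbol as a dictionary from words (tuples of letters) to integer coefficients, uses the shuffle product for products of functions, and assumes the common convention in which the differential acts on the last entry, so that S(Li_2(x)) = −(1 − x) ⊗ x. The helper names (`add`, `mul`, `log`, `li2`) and the string labels for letters are hypothetical choices for illustration only.

```python
from collections import defaultdict


def _clean(sym):
    """Drop words whose coefficient cancelled to zero."""
    return {word: c for word, c in sym.items() if c != 0}


def add(*symbols):
    """Sum of symbols, viewed as formal linear combinations of words."""
    out = defaultdict(int)
    for sym in symbols:
        for word, c in sym.items():
            out[word] += c
    return _clean(out)


def shuffles(u, v):
    """All interleavings of the words u and v that preserve their internal order."""
    if not u:
        yield v
    elif not v:
        yield u
    else:
        for w in shuffles(u[1:], v):
            yield (u[0],) + w
        for w in shuffles(u, v[1:]):
            yield (v[0],) + w


def mul(s1, s2):
    """Symbol of a product of functions: the shuffle product of their symbols."""
    out = defaultdict(int)
    for u, cu in s1.items():
        for v, cv in s2.items():
            for w in shuffles(u, v):
                out[w] += cu * cv
    return _clean(out)


def log(a):
    """S(log a) = a, a single-letter word."""
    return {(a,): 1}


def li2(x, one_minus_x):
    """S(Li_2(x)) = -(1 - x) (x) x, with the differential acting on the last entry."""
    return {(one_minus_x, x): -1}


# Letters are plain labels; "x" and "1-x" are treated as independent symbols.
x, y = "x", "1-x"

print(mul(log(x), log(y)))
# Li_2(x) + Li_2(1-x) + log(x) log(1-x) is a constant, so its symbol vanishes.
print(add(li2(x, y), li2(y, x), mul(log(x), log(y))))
```

In this representation the last entries discussed in the text are simply the final elements of each word.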
For NMHV (k = 1), the effect of the ε-residue and the $d^3\chi_{n+1}$ integration on the N$^2$MHV Yangian invariants $Y_{n+1,2}$ on the r.h.s. of (2.5) gives a list of possible $Y_{n,1}$ of the form (2.3) times final entries [...] where I, J can generally be intersections of momentum twistors, of the form e.g. (ij)∩(klm) (see [1,43]). To obtain the differential of $R_{n,1}$ in this case, the naive replacement above has an ambiguity due to the existence of a non-trivial kernel of $\bar{Q}$, which always takes the form (2.8). Nevertheless, since the kernel of $\bar{Q}$ cannot contain non-trivial functions of dual conformal invariants (DCI) in this case, the replacement $\chi_i \to dZ_i$ has no ambiguity once we convert the arguments of $\bar{Q}\log$ to DCI ones by adding "0" of the form (2.8). It is a straightforward but tedious algorithm to arrive at such a manifestly DCI form, which gives the final answer for $dR_{n,1}$.
To obtain all possible last entries for NMHV amplitudes, one needs to consider the action of $d^{2|3} Z_{n+1}$ on all possible N$^2$MHV Yangian invariants. Unlike the unique type of NMHV Yangian invariant (2.3), there are 14 distinct types of N$^2$MHV Yangian invariants up to cyclic rotations [2]. Apart from the algebraic leading singularities of four-mass boxes, which we will discuss below, the other 13 are all rational, and for them the effect of the ε-residue and the $d^3\chi_{n+1}$ integration is given in appendix A. We now list all the NMHV last-entry conditions obtained by applying this operation to all 14 types of invariants.
We have obtained three types of last entries (dressed with NMHV Yangian invariants). First, we obtain last entries of the form

The second type of last entries we obtain are [...] such last entries. Finally, we have the third type of last entries [...] of them. By considering cyclic rotations, we see that altogether there are $42\binom{n}{6}$ last entries. However, since $\bar{Q}$ has a non-trivial kernel, we need to turn the arguments of $\bar{Q}\log$ into dual conformal invariants (DCI): this has already been done in the second and third cases, but it remains to be done in the first case. This is achieved by expanding the last entries in a basis which has the minimal number of basis vectors that are not DCI. The coefficients of these basis vectors then vanish automatically due to the dual conformal invariance of the amplitudes. To see this, we temporarily introduce equivalence relations [...] last-entry conditions for all-loop NMHV BDS-normalized amplitudes. This reduces to the last-entry conditions given in [34] for n = 8.

Figure 1. The four-mass box and the four-mass leading singularity.

A quick review of box expansions in N = 4 SYM
According to the $\bar{Q}$ equations (2.5), we need the one-loop N$^2$MHV amplitudes as input for the computation of two-loop NMHV amplitudes. Such data are available from the familiar box expansion [54,55]. Let us quickly review this result and set up some notation. Because scattering amplitudes in $\mathcal{N}=4$ SYM are free of UV divergences, one-loop amplitudes can be expanded in a basis of box integrals $I_{a,b,c,d}$ involving four inverse loop-momentum propagators $x_{0a}^2$, $x_{0b}^2$, $x_{0c}^2$ and $x_{0d}^2$, as in (3.1), with the leading singularities $L_{a,b,c,d}$ as the coefficients. The most generic terms are the so-called "four-mass" boxes, for which all four massive corners {a, ..., b−1}, {b, ..., c−1}, {c, ..., d−1}, {d, ..., a−1} involve two or more particles (see figure 1). For such terms, the box integral $I_{a,b,c,d}$ is free of any divergence and evaluates to weight-2 polylogarithms (3.2). The subscript a, b, c, d will be restored to indicate the specific box when necessary, and is otherwise suppressed. The coefficient $L_{a,b,c,d}$ for each four-mass box is the sum of products of four tree amplitudes and the N$^2$MHV "four-mass" Yangian invariant [2], as in (3.5), where the first sum is over all sets of four tree amplitudes satisfying $\sum_{i=1}^4 k_i = k-2$, and the second sum is over the two solutions of the Schubert problem. The other terms in the box expansion (3.1) can be obtained from the general four-mass boxes by taking one or more massive corners to be massless (say, b → a+1). The coefficients $L_{a,b,c,d}$ vary smoothly in this limit. However, the box integrals become divergent and must be regulated. There are several regularization schemes, such as dimensional regularization [55] and Higgs regularization [56]. Here we follow the dual conformal invariant regularization scheme introduced in [57]. In this regularization, the infrared-finite and regulator-independent BDS-subtracted S-matrix $R_{n,k}$ at one loop reads [...] where $I^{\rm fin}_{a,b,c,d}$ denote the finite parts of the DCI-regulated box integrals and $L^{\rm MHV}_{a,b,c,d} = 0, 1$ are the one-loop MHV box coefficients. The reader interested in the other coefficients $L_{a,b,c,d}$ and the DCI-regulated box integrals is referred to [57] for a detailed discussion and a complete list.
For our purpose, we only need k = 2, so the four tree amplitudes in (3.5) are all MHV with $A_{k=0} = 1$, and we are left with the last line, which we denote by $f^\pm_{a,b,c,d}$ for the two solutions. An important point we want to emphasize here is that all boxes other than the four-mass ones are totally free of the square root ∆, since the corresponding u and/or v vanish. The $d^{2|3} Z_{n+1}$ integration for these boxes can thus be performed without any obstacle. However, the $d^{2|3} Z_{n+1}$ integration for four-mass boxes is non-trivial due to the presence of the square root ∆. In the rest of this section, we work out the prescription for four-mass boxes and obtain the algebraic part of the two-loop answer.
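The statement that ∆ disappears whenever u or v vanishes can be checked with a short numerical sketch. Since the defining equations (3.2)-(3.4) are not reproduced above, the code assumes the standard four-mass conventions u = z z̄, v = (1 − z)(1 − z̄) and ∆² = (1 − u − v)² − 4uv; the assignment of the sign of ∆ to z versus z̄ and the function names are likewise assumptions.

```python
import cmath
from fractions import Fraction


def delta_squared(u, v):
    """Assumed convention: Delta^2 = (1 - u - v)^2 - 4 u v."""
    return (1 - u - v) ** 2 - 4 * u * v


def z_pair(u, v):
    """Roots (z, zbar) with z * zbar = u and (1 - z) * (1 - zbar) = v."""
    delta = cmath.sqrt(delta_squared(u, v))
    return (1 + u - v + delta) / 2, (1 + u - v - delta) / 2


# Generic four-mass point: Delta is a genuine square root of the cross-ratios.
u, v = 0.3, 0.2
z, zbar = z_pair(u, v)
print(abs(z * zbar - u), abs((1 - z) * (1 - zbar) - v))

# Lower-mass boxes: u = 0 (or v = 0) makes Delta^2 a perfect square.
v_exact = Fraction(3, 7)
print(delta_squared(Fraction(0), v_exact) == (1 - v_exact) ** 2)
```

`cmath.sqrt` is used so that the sketch also works when ∆² is negative, in which case z and z̄ are complex conjugates.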

The prescription for four-mass boxes
Now we consider how $d^{2|3} Z_{n+1}$ acts on N$^2$MHV four-mass boxes with coefficients, and it is easy to see that only two kinds of $f^\pm_{a,b,c,d} I_{a,b,c,d}$ survive: $f^\pm_{1,b,c,n} I_{1,b,c,n}$ with corners [...] and $f^\pm_{a,b,c,n+1} I_{a,b,c,n+1}$. Since $\Delta_{1,b,c,n}$ becomes rational in terms of momentum-twistors in the collinear limit $Z_{n+1} \to Z_n$, it is straightforward to see that no square root remains for the first kind, and we have [...] Note that the log ε divergence and the divergence of the τ-integration cancel in the final answer. However, for the second kind of boxes, the square root ∆ remains after taking the collinear limit $Z_{n+1} \to Z_n$, and the integrand is not a rational function of τ. To perform the τ-integration, we first need to rationalize the τ integrand. In other words, we need to find a variable substitution t(τ) such that ∆² in terms of t is a perfect square. Since ∆² is a quadratic polynomial in τ (after factoring out a perfect-square denominator), this is just the classical problem of finding a rational parameterization of a quadratic curve. For the rational curve defined by eq. (3.12), if there is a rational point (x*, y*) on this curve, we can insert y = y* + t(x − x*) in eq. (3.12) to work out the rational parameterization x(t) and hence y(t). For a more comprehensive treatment of rationalizing roots in Feynman integrals, we refer the reader to [58,59]. For our problem, there are two kinds of obvious rational points, one with u(τ*) = 0 and the other with v(τ*) = 0. In what follows, we denote these two points by $\tau_u$ and $\tau_v$. Depending on the values of $\tau_u$ and $\tau_v$, the second kind of four-mass boxes can be further decomposed into four classes: (i) a > 2 and c < n−1; (ii) a > 2 and c = n−1; (iii) a = 2 and c < n−1; (iv) a = 2 and c = n−1. Case (i) is the generic case, which first appears for one-loop 10-point N$^2$MHV amplitudes. Cases (ii) and (iii) are two special cases of Case (i), both of which first appear for 9 points, and Case (iv) is the most special case, which first appears for 8 points. Let us consider Case (i) first:

Case (i): 2 < a < a+1 < b < b+1 < c < n−1. The $d^{2|3} Z_{n+1}$ integration for such boxes introduces two square roots, $\Delta_1$ and $\Delta_n$, given in eq. (3.13). Similarly, we have $z_1, \bar{z}_1$ and $z_n, \bar{z}_n$ as in eq. (3.4). In terms of these new variables, there are two rational parameterizations, (3.15) and (3.16), based on the rational points $\tau_u$ and $\tau_v$, respectively. By using the first parameterization [...] where (i ↔ j) denotes the exchange of the particle labels i and j in the foregoing terms, and [...] an + Z a n a−1 rather than I = Z n−1 n 1 a−1 a + Z n 1 a−1 a n + Z 1 a−1 a n−1 n, and similarly for the R invariant [...] R invariants of this kind can be expressed in terms of R invariants without any intersection by using the six-term identity (2.4) and [...] The box integral in this collinear limit, written in terms of t, becomes (3.20). Note that although the τ integrand becomes rational in terms of t, the integration region for t is either [...], from which these two square roots enter the final result, as in [34]. For future use, we also give the result under the second parameterization (3.16): [...] and $x_a$, $y_a$ etc. are the same as in eq. (3.18). The box integral under this parameterization has the same form as eq. (3.20) but now with [...]

Case (ii): 2 < a < a+1 < b < b+1 < c = n−1. Unlike the previous case, the $d^{2|3} Z_{n+1}$ integration for these boxes only gives the square root $\Delta_1$ in eq. (3.13), since now [...] In this case, the first rational parameterization (3.15) is still available while the second one (3.16) becomes singular. Thus, the $d^{2|3} Z_{n+1}$ integration for the sum of $f^\pm_{a,b,n-1,n+1}$ gives almost the same result as in (3.17) with c = n−1, except that the terms with $\bar{Q}\log\langle \bar{n}\, c\rangle$ have to be modified since now $\langle \bar{n}\, c\rangle = 0$. A straightforward calculation shows that [...] (3.26). Note that the one-dimensional integral over t for this term is now divergent, since the poles of (3.26) coincide with one endpoint of both integration regions. The divergence of this t integration can be easily removed by subtracting [...], which cancels in the final result.
Case (iii): 2 = a < a+1 < b < b+1 < c < n−1. This case is very similar to Case (ii). The $d^{2|3} Z_{n+1}$ integration for these boxes only gives the square root $\Delta_n$ in eq. (3.13), with [...] Now the second rational parameterization (3.16) is available while the first one (3.15) is not. Again, the $d^{2|3} Z_{n+1}$ integration for the sum of $f^\pm_{2,b,c,n+1}$ gives almost the same result as in (3.22) with a = 2, except that the terms with $\bar{Q}\log\langle \bar{n}\; a{-}1\rangle$ have to be modified since now $\langle \bar{n}\; a{-}1\rangle = 0$. A straightforward calculation shows that [...] (3.29). Again, the one-dimensional integral over t for this term is now divergent, since the poles of (3.29) coincide with one endpoint of both integration regions. The divergence of this t integration can be easily removed by subtracting [...], which cancels in the final result.
Case (iv): 2 = a < a+1 < b < b+1 < c = n−1. We remark that this last case does not in fact introduce any square root, since [...], and we do not need it for the algebraic part. However, for completeness, we present the result for this case in appendix C.
Having rationalized the τ integrand, one can easily perform the τ integration for the above four-mass boxes and obtain generalized polylogarithms of weight 3. This can be done using, say, PolyLogTools [60] or HyperInt [61], or the algorithm provided in appendix A of [24] if one only needs the symbol.
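The rationalization step used throughout this section, namely parameterizing a quadratic curve by the slope of a line through a known rational point, can be sketched as follows. The curve is taken to be of the form y² = A x² + B x + C, which is an assumption about the shape of eq. (3.12); the example coefficients and the function name `parametrize` are hypothetical.

```python
from fractions import Fraction


def parametrize(A, B, C, x0, y0):
    """Rational parametrization of y^2 = A x^2 + B x + C through the point (x0, y0).

    Inserting y = y0 + t (x - x0) and dividing out the known root x = x0 gives
    x(t) = (t^2 x0 - 2 t y0 + A x0 + B) / (t^2 - A).
    """
    if y0 * y0 != A * x0 * x0 + B * x0 + C:
        raise ValueError("(x0, y0) is not on the curve")

    def point(t):
        x = (t * t * x0 - 2 * t * y0 + A * x0 + B) / (t * t - A)
        y = y0 + t * (x - x0)
        return x, y

    return point


# Hypothetical example curve y^2 = 2 x^2 + 3 x + 4 with the rational point (0, 2).
A, B, C = Fraction(2), Fraction(3), Fraction(4)
point = parametrize(A, B, C, Fraction(0), Fraction(2))

for t in (Fraction(1), Fraction(-3, 5), Fraction(7, 2)):
    x, y = point(t)
    # y is rational, so the square root sqrt(A x^2 + B x + C) has been rationalized.
    print(t, x, y, y * y == A * x * x + B * x + C)
```

In the text the role of x is played by τ and the rational points are those where u or v vanishes; the sketch only illustrates the algebra of the substitution, not the resulting integration regions.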

Algebraic letters and their multiplicative relations
The full computation including contributions from lower-mass boxes becomes tedious for large n, and for now we will be interested in the part of the answer depending on algebraic letters, which turns out to be quite neat. Before spelling out the general result for this part, we present all the algebraic letters that appear in the result. We write them in the form (a + ∆)/(a − ∆), where a is a rational function of Plücker coordinates and ∆ is the square root for one of the four-mass boxes. The nice thing about this representation is that the multiplicative relations of algebraic letters do not involve rational ones, as shown in appendix B. In terms of the notation introduced in Case (i), we find the following algebraic letters from the $d^{2|3} Z_{n+1}$ integral of $\sum_\pm f^\pm_{a,b,c,n+1} I_{a,b,c,n+1}$: [...] which are the letters making up the symbol of the four-mass box integral (3.2), and new symbol letters of the form (4.2) (with * denoting possible superscripts), where the $x^*_{a,b,c,n}$ are simply $x_a$, $x_b$, $x_c$ defined in (3.18) as well as $x_{a-1}$, $x_{b-1}$, $x_{c-1}$, which differ by exchanges of particle labels. Here we restore the subscripts to indicate the specific boxes.

Since Case (ii) and Case (iii) can be viewed as degenerations of Case (i), the new algebraic letters produced by them have the same form as in (4.2).
All new algebraic letters are generated by cyclic rotations of eq. (4.2). Collecting all new algebraic letters, we find that they fall into two classes, the $X$'s and the $\tilde{X}$'s [...] It is straightforward to show that there are 50 − 2m algebraic letters involving the same square root if the corresponding four-mass box has 0 ≤ m ≤ 4 corners that contain only two particles. Note that m also determines the number of momentum twistors involved in such a square root. In the most generic case, m = 0, the square root involves at least 12 momentum twistors, a−1, a, a+1, ..., d−1, d, d+1. In degenerate cases with m > 0, some of these labels coincide, and in the most degenerate case, m = 4, only 8 momentum twistors are involved. These $X$'s and $\tilde{X}$'s, together with $z/\bar{z}$ and $(1-z)/(1-\bar{z})$, give a set of algebraic letters whose logarithms are invariant up to a sign under the cyclic rotation i → i+1 and the reflection i → n−i+1. Algebraic letters involving different ∆'s are manifestly multiplicatively independent, while algebraic letters involving the same ∆ are not. A rather remarkable observation is that there are precisely 33 multiplicative relations among them. In the most general form, these relations read: [...] (4.7). Note that eqs. (4.6) and (4.7) can also be written in terms of the $\tilde{X}$'s by using eq. (4.5).
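The counting in this paragraph can be cross-checked with a short enumeration. The sketch below lists all four-mass boxes of an n-point amplitude, reads off m for each, and sums (50 − 2m) − 33 over the boxes; that the total is a plain sum over boxes relies on the statement that letters with different square roots are multiplicatively independent. The function names are hypothetical. The output reproduces the 18 algebraic letters quoted for n = 8 and the 11 × 9 quoted for n = 9.

```python
from itertools import combinations


def four_mass_boxes(n):
    """Yield (a, b, c, d) and m for every four-mass box of an n-point amplitude.

    The corners are {a..b-1}, {b..c-1}, {c..d-1}, {d..a-1} (labels mod n), each with
    at least two particles; m counts the corners with exactly two particles.
    """
    for cuts in combinations(range(1, n + 1), 4):
        sizes = [(cuts[(i + 1) % 4] - cuts[i]) % n for i in range(4)]
        if min(sizes) >= 2:
            yield cuts, sizes.count(2)


def independent_letters(m):
    """50 - 2m letters per square root minus the 33 multiplicative relations."""
    return (50 - 2 * m) - 33


def count_algebraic_letters(n):
    """Total count, assuming letters with different square roots are independent."""
    return sum(independent_letters(m) for _, m in four_mass_boxes(n))


for n in (7, 8, 9):
    print(n, len(list(four_mass_boxes(n))), count_algebraic_letters(n))
```

For n = 7 the enumeration finds no four-mass box at all, in line with algebraic letters first appearing at n = 8.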
These relations leave us with 17 − 2m multiplicatively independent algebraic letters for each ∆, where ∆ involves m corners that contain only two particles. Let us complete this subsection by commenting on the consistency of these algebraic letters with the rational letters. The algebraic letters can be rewritten in terms of $(a \pm \sqrt{a^2 - 4b})/2$. The discriminants $a^2 - 4b$ are always proportional to ∆² in eq. (3.3), whose zeros are the square-root branch points from the Landau analysis [62], while the branch points b = 0 correspond to the zero loci of some rational letters, since $\log(a - \sqrt{a^2 - 4b}) = \log b + (\text{terms regular at } b = 0)$. It is straightforward to show that: [...] (4.8). A consistent alphabet for two-loop NMHV amplitudes thus requires the appearance of the factors on the right-hand side of eq. (4.8). As we will see in the next section, these rational letters indeed appear in the alphabet of the two-loop 9-point NMHV amplitudes.
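The behaviour near b = 0 described here can be made concrete numerically. The following sketch, with hypothetical sample values for a and b and assuming a > 0 and a² > 4b, checks that the product of the two letters (a ± √(a² − 4b))/2 is the rational quantity b, and that the logarithm of the smaller letter differs from log b only by a term that stays finite as b → 0.

```python
import math


def letter_pair(a, b):
    """Return ((a + r)/2, (a - r)/2) with r = sqrt(a^2 - 4 b), for a > 0 and a^2 > 4 b."""
    root = math.sqrt(a * a - 4 * b)
    plus = (a + root) / 2
    minus = b / plus  # equals (a - root)/2 but avoids cancellation for small b
    return plus, minus


a = 3.0
for b in (1e-2, 1e-4, 1e-6):
    plus, minus = letter_pair(a, b)
    # The product of the two letters is b, and log(minus) - log(b) stays finite
    # as b -> 0, tending to -log(a).
    print(b, plus * minus, math.log(minus) - math.log(b), -math.log(a))
```

The logarithmic branch point of the algebraic letter at b = 0 is therefore the same as that of the rational letter b, which is the sense in which the two kinds of letters have to be consistent.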

Algebraic words of the symbol and a large class of simple components
Once the algebraic letters are expressed in terms of (a + ∆)/(a − ∆), we can separate the words involving algebraic letters from the symbol unambiguously, and we call such words algebraic words. As noted in [34] for n = 8, these algebraic words, although not integrable, follow a very simple pattern: the first two entries consist of the symbol of the four-mass box integral $I_{a,b,c,d}$, while the third entry can be an arbitrary algebraic letter in the alphabet. Our calculation shows not only that this is true for all n, but also that there is a much stronger result, which we present now. In fact, the final entries and the accompanying R invariants are completely fixed once the first three entries of the algebraic words are known, as indicated in eqs. (3.17) and (3.22). In summary, when the third entry is one of the non-degenerate $X^*_{a,b,c,d}$'s, the algebraic words become extremely simple, and likewise for the $\tilde{X}$'s. We see that the $x^*_{a,b,c,d}$ variables, which we have used to define the $X$'s and $\tilde{X}$'s, appear exactly as the last entries for the corresponding third entries, and they also determine the accompanying R invariants.
When the third entry is $z/\bar{z}$ or $(1-z)/(1-\bar{z})$, one can directly show from eq. (3.17) or eq. (3.22) that, for general non-adjacent a, b, c, d, the algebraic words take the form given in eqs. (4.11) and (4.12),

JHEP03(2021)278
as well as their cyclic images under the rotation a → b → c → d → a. Again, the final entries have to be modified as in eqs. (3.26) and (3.29) when c = d−2 or a = d+2; more precisely, the modification affects the final entries in the first lines of (4.11) and (4.12), respectively. This concludes our result for the algebraic words of two-loop n-point NMHV amplitudes.
Finally, let us remark on an obvious but interesting corollary of the pattern of algebraic words. Consider the $\chi_i \chi_j \chi_k \chi_l$ components with non-adjacent i, j, k, l of the two-loop NMHV amplitudes (recall that the MHV tree amplitudes are stripped off), or Wilson loops. Since all R invariants in the algebraic words contain two pairs of adjacent particles, i.e. they are of the form [a, a+1, b, b+1, c], they do not contribute to any such component, and thus any such component is free of square roots! Note that when n is large enough, there are $O(n^4)$ such components, which make up the majority of all NMHV components.
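The combinatorics behind this corollary can be checked by brute force. The following sketch (illustrative helper names, not taken from the original computation) enumerates the non-adjacent quadruples {i, j, k, l} on a cycle of n labels and verifies that any four labels drawn from an R invariant of the form [a, a+1, b, b+1, c] necessarily contain a cyclically adjacent pair, so such an R invariant cannot feed a non-adjacent component.

```python
from itertools import combinations


def has_adjacent_pair(labels, n):
    """True if two of the labels are cyclically adjacent among 1..n."""
    chosen = set(labels)
    return any((x % n) + 1 in chosen for x in chosen)


def non_adjacent_quadruples(n):
    """All 4-subsets of {1, ..., n} with no two cyclically adjacent labels."""
    return [q for q in combinations(range(1, n + 1), 4)
            if not has_adjacent_pair(q, n)]


def r_invariant_reaches_non_adjacent(r_labels, n):
    """Can four of the five labels of an R invariant be pairwise non-adjacent?"""
    return any(not has_adjacent_pair(q, n) for q in combinations(r_labels, 4))


if __name__ == "__main__":
    for n in (8, 9, 10, 12):
        print(n, len(non_adjacent_quadruples(n)))
    # [a, a+1, b, b+1, c] with (a, b, c) = (1, 5, 9) at n = 12
    print(r_invariant_reaches_non_adjacent((1, 2, 5, 6, 9), 12))
```

For n = 8 the only non-adjacent quadruples are (1, 3, 5, 7) and (2, 4, 6, 8), and the count grows like $n^4/24$ at large n.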
Qualitatively, we do expect these to be the simplest components of NMHV amplitudes: not only do they vanish at tree and one-loop level (which means they are finite at two loops), but each of them can be written as a combination of only two integrals! These facts are clear in the representation of NMHV amplitudes (up to two loops) in [43], and follow even more manifestly from the super-Wilson-loop picture [63]. As noted in [43], it is straightforward to show that the component $\chi_i \chi_j \chi_k \chi_l$ is simply given by the difference of the double-pentagon integral $I_{\rm dp}(i, j, k, l)$ and its cyclic rotation $I_{\rm dp}(l, i, j, k)$. In the definition of the non-adjacent double-pentagon integral $I_{\rm dp}(i, j, k, l)$, $\ell_1$ and $\ell_2$ denote the two bi-twistors for the loop momenta.
Remarkably, in terms of these integrals, what we find, rather indirectly through two-loop NMHV amplitudes, is that for any non-adjacent i, j, k, l, this difference is free of algebraic letters! We have also obtained the complete symbol of the differences for n = 8, 9, which depends on a relatively small number of rational letters, and we expect the simplicity to continue to all n (note that the difference depends on at most 12 twistors).
Of course, these integrals themselves are important ingredients of two-loop amplitudes, and it would be fantastic to study them individually. The symbols of such integrals are currently unknown, and an individual integral does contain algebraic letters involving ∆'s, as shown in [64] by evaluating it at a specific kinematic point. We leave the comprehensive study of both the differences and the integrals themselves to the future.

Comments on algebraic letters from leading singularities
Following [38,39], we make the simple observation that all our algebraic symbol letters for two-loop n-point NMHV amplitudes, (4.3), are "letters", or simply singularities, of one-loop leading singularities for the four-mass boxes. As reviewed in section 3, these are leading singularities (LS) obtained by gluing together four tree amplitudes, each with at least 4 legs. If their numbers of legs are $n_1, n_2, n_3, n_4$ respectively, we have $n = \sum_{i=1}^{4} n_i - 8$. For details of "letters" of leading singularities, or Yangian invariants, please refer to [39]. For these algebraic functions, we do not need to compute all the "letters", and it suffices to list the poles of such leading singularities.
The simplest examples are the 8-point N$^2$MHV leading singularities (with 4-point MHV amplitudes at all four corners, i.e. $n_i = 4$ for i = 1, 2, 3, 4), which were the primary example in [38,39]. Quite nicely, we find that there are 9 independent letters associated with such a leading singularity, all containing the same square root, e.g. $\Delta_{2,4,6,8}$, and similarly 9 letters with the square root $\Delta_{1,3,5,7}$. We denote these leading singularities by $L^{k=2}_{2,4,6,8}$ and $L^{k=2}_{1,3,5,7}$. Using just these two leading singularities, we obtain exactly the 18-dimensional space of algebraic symbol letters for our n = 8 case. Encouraged by this success, we now move to the general one-loop four-mass leading singularity, which was given in (3.5). The independent algebraic letters/poles of such a leading singularity are given by the 9 independent ones of $L^{k=2}_{a,b,c,d}$ and those from the four tree amplitudes at the corners.
To be concrete, we focus on a particularly simple sub-class of these leading singularities, where each corner has either MHV or NMHV degree ($k_i = 0, 1$ for i = 1, 2, 3, 4). N$^2$MHV leading singularities correspond to all $k_i = 0$, and now we allow some (or all) of the corners to have $k_i = 1$. We start with the case where one corner, say the first one, has $k_1 = 1$, and without loss of generality we consider (a, b, c, d) = (1, 4, 6, 8) for n = 9 (i.e. 3 external legs at the first corner only). In this case we have $A_{n_1=5,k_1=1} = [\alpha, 1, 2, 3, \beta]$ (and the other three A = 1). Now, in addition to the 9 independent algebraic letters of $L^{k=2}_{1,4,6,8}$, we have 5 letters/poles from $[\alpha, 1, 2, 3, \beta]$:

$$\langle \alpha\beta 12\rangle, \quad \langle \alpha\beta 23\rangle, \quad \langle \alpha\beta 13\rangle, \quad \langle \alpha 123\rangle, \quad \langle \beta 123\rangle. \tag{4.14}$$

Note that α = (91)∩(87γ) and β = (34)∩(56δ), thus $\langle \alpha 123\rangle$, $\langle \beta 123\rangle$ and $\langle \alpha\beta 13\rangle$ are in fact rational functions of Plücker coordinates, so the only two algebraic letters are $\langle \alpha\beta 12\rangle$ and $\langle \alpha\beta 23\rangle$. It is straightforward to check that they are multiplicatively independent of the 9 letters above, thus we conclude that there are 9 + 2 = 11 independent algebraic letters for this leading singularity. By taking the ratio of the two solutions ±, they span precisely the same space as the 11 algebraic symbol letters for n = 9 which are associated with $\Delta_{1,4,6,8}$. More generally, it turns out that the complete algebraic alphabet of two-loop n-point NMHV amplitudes, (4.3), can be obtained from one-loop four-mass leading singularities with $k_i = 0, 1$.

The correspondence works for each four-mass configuration (a, b, c, d) individually: in the generic case, the 17 independent symbol letters of the two-loop NMHV amplitude with the square root of $\Delta_{a,b,c,d}$ can be obtained from a single leading singularity in (3.5) with 4 NMHV tree amplitudes, i.e. $k_i = 1$ for i = 1, 2, 3, 4. One can check that, in addition to the 9 algebraic letters of $L^{k=2}_{a,b,c,d}$, each tree amplitude contains at least two new algebraic letters similar to those in the n = 9 example. Altogether this means we can generate the 17 independent algebraic letters for the generic case, which we first encounter at n = 12. Note that, as we have seen before [39], the correspondence between letters from leading singularities and symbol letters does not preserve k: for NMHV (two-loop) amplitudes, we need leading singularities with up to k = 6.
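The bookkeeping in this subsection can be summarized in a few lines. The sketch below uses a hypothetical helper name and encodes three statements: the leg count $n = \sum_i n_i - 8$ quoted above; the total degree $k = 2 + \sum_i k_i$, which is read off from the two cases stated in the text (all $k_i = 0$ gives N$^2$MHV, all $k_i = 1$ gives k = 6); and the letter count 9 + 2 × (number of NMHV corners), which is an assumption extrapolated from the n = 9 and generic examples, where each NMHV corner has exactly five legs and adds two algebraic letters.

```python
def four_mass_leading_singularity(legs, degrees):
    """Return (n, k, letters) for a four-mass box with MHV/NMHV corners.

    legs    -- (n_1, n_2, n_3, n_4), legs of the four tree amplitudes
    degrees -- (k_1, k_2, k_3, k_4), each 0 (MHV) or 1 (NMHV)

    The letter count 9 + 2 * (#NMHV corners) is an extrapolation from the
    examples in the text, not a general theorem.
    """
    if len(legs) != 4 or len(degrees) != 4:
        raise ValueError("a box has exactly four corners")
    for n_i, k_i in zip(legs, degrees):
        if n_i < 4:
            raise ValueError("each massive corner needs at least 4 legs")
        if k_i not in (0, 1):
            raise ValueError("only MHV (0) or NMHV (1) corners are covered")
        if k_i == 1 and n_i < 5:
            raise ValueError("an NMHV tree needs at least 5 legs")
    n = sum(legs) - 8           # eight legs are glued internally
    k = 2 + sum(degrees)        # all-MHV corners give N^2MHV
    letters = 9 + 2 * sum(degrees)
    return n, k, letters


if __name__ == "__main__":
    print(four_mass_leading_singularity((4, 4, 4, 4), (0, 0, 0, 0)))
    print(four_mass_leading_singularity((5, 4, 4, 4), (1, 0, 0, 0)))
    print(four_mass_leading_singularity((5, 5, 5, 5), (1, 1, 1, 1)))
```

With m four-leg (two-particle) corners and the remaining 4 − m corners taken to be five-leg NMHV trees, the count 9 + 2(4 − m) reproduces the 17 − 2m independent letters of the symbol alphabet.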

The complete symbol and alphabet: n = 9 example
Now that we have the algebraic part of the symbol, we can finish the calculation by including the rational part, which also receives contributions from all lower-mass boxes. This part has been automated, which produces the complete symbol of the two-loop n-point NMHV amplitude. However, the length of the symbol (especially the rational part) grows rapidly as n increases, and we content ourselves with presenting the result for n = 9 as an example.

A few comments are in order. First of all, by combining these 59 × 9 rational letters with the 11 × 9 independent algebraic letters, we obtain the complete alphabet of 630 letters for n = 9. We see that the alphabet is consistent in the sense mentioned in the last section: for each algebraic letter of the form $a \pm \sqrt{a^2 - 4b}$, b is indeed a product of rational letters. We expect this to hold for all multiplicities. Furthermore, we have found some discrepancies with the predictions of the Landau analysis [62]: not only do some rational letters predicted there not appear in our alphabet, but more importantly exactly the last class, i.e. the cyclic rotations of $\langle 1\,(56)\cap(\bar{3})\,(78)\cap(\bar{3})\,9\rangle$, is absent from the Landau analysis of [62].

Consistency checks
Even before obtaining the final result, the $\bar{Q}$ calculation is very rigid: it cannot be carried through to the end unless various tests have been passed. For example, in the collinear integral, all the log divergences must be accompanied by vanishing τ-integrals; it is also highly non-trivial that we are able to convert the arguments of $\bar{Q} \log$ into DCI combinations. That being said, to make sure our result is correct, we have performed various consistency checks, including easy ones such as cyclicity, dual conformal invariance and the condition of physical first entries. We now present details of the more non-trivial checks, namely integrability, collinear limits and the absence of spurious poles.
In these checks, it is usually difficult to determine whether a symbol with algebraic letters vanishes or not before imposing the multiplicative relations among the algebraic letters. We address this technical problem in appendix B.
Integrability. It is a non-trivial but crucial check that our symbol is integrable. We expand the symbol of the total differential on a basis of $\binom{n-1}{4}$ R invariants, and we check that each coefficient can be integrated to a function. Each coefficient is a sum of symbols multiplying d log $l_i$, where the symbols in the coefficients of the d log's come from polylogarithms, so the coefficient is integrable if and only if the corresponding wedge-product condition on its last two entries holds (see [66,67]). In order to calculate d log $l_i$, we choose a positive parameterization of $\mathrm{Gr}_+(4, 9)/T$ [68], which makes all arguments of square roots positive; here $x_{0,0}$ and the unlabeled face variables are fixed to be 1.
We use this positive parameterization to check the integrability of the coefficients of all linearly independent R invariants, and we find that they are indeed integrable.
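To illustrate what such a numerical integrability check looks like, here is a toy sketch at weight two in two variables (u, v); it is not the weight-four computation on $\mathrm{Gr}_+(4, 9)/T$ described above. A symbol $\sum_i c_i\, f_i \otimes g_i$ is integrable if and only if $\sum_i c_i\, d\log f_i \wedge d\log g_i = 0$, and the code evaluates the coefficient of $du \wedge dv$ at a sample point using finite differences. All names and the choice of letters are assumptions made for the example.

```python
import math


def dlog(f, u, v, h=1e-6):
    """Central-difference gradient of log|f| at (u, v)."""
    du = (math.log(abs(f(u + h, v))) - math.log(abs(f(u - h, v)))) / (2 * h)
    dv = (math.log(abs(f(u, v + h))) - math.log(abs(f(u, v - h)))) / (2 * h)
    return du, dv


def wedge_sum(symbol, u, v):
    """Coefficient of du ^ dv in sum_i c_i dlog(f_i) ^ dlog(g_i)."""
    total = 0.0
    for c, f, g in symbol:
        fu, fv = dlog(f, u, v)
        gu, gv = dlog(g, u, v)
        total += c * (fu * gv - fv * gu)
    return total


# Letters of the toy alphabet {u, v, 1 - u*v}
def letter_u(u, v):
    return u


def letter_v(u, v):
    return v


def letter_one_minus_uv(u, v):
    return 1.0 - u * v


# Symbol of Li_2(u*v): -(1 - uv) (x) u - (1 - uv) (x) v, which is integrable
LI2_UV = [(-1.0, letter_one_minus_uv, letter_u),
          (-1.0, letter_one_minus_uv, letter_v)]

# A single term (1 - uv) (x) u on its own is not integrable
NOT_INTEGRABLE = [(1.0, letter_one_minus_uv, letter_u)]

if __name__ == "__main__":
    print(wedge_sum(LI2_UV, 0.3, 0.5))
    print(wedge_sum(NOT_INTEGRABLE, 0.3, 0.5))
```

In a realistic check one would evaluate the analogous sums at several random positive points and for every pair of coordinates, since a single point can vanish accidentally.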

Collinear limits.
We check that the NMHV 9-point amplitude reduces to the NMHV and MHV 8-point amplitudes upon taking the k-preserving and k-decreasing collinear limits, respectively. We consider the limit 9||8, parameterized by η, ε and τ: we keep τ fixed and take the limit η → 0 before ε → 0. Under the k-preserving limit, the R invariants behave as [abc89] → 0 and [abcd9] → [abcd8], while under the k-decreasing limit, the R invariants behave as [1a789] → 1 with the others vanishing. After taking such limits and keeping the leading terms in η and ε, it is highly non-trivial that the limits do not depend on the parameters η, ε and τ, i.e. that the limits are smooth; we then find that these two limits are exactly the known symbols of the NMHV 8-point amplitude [34] and the MHV 8-point amplitude [42].
Absence of spurious poles. The R invariants in our basis contain spurious poles, each of which is of the form $\langle 1abc\rangle$ and belongs to exactly one R invariant in our basis. The cancellation of the pole $\langle 1abc\rangle$ means that the coefficient of the corresponding R invariant vanishes as $\langle 1abc\rangle \to 0$. To see this, we send $Z_1 \to \alpha Z_a + \beta Z_b + \gamma Z_c + \delta Z_9$ for fixed α, β, γ, and verify numerically that the coefficient of [1abc9] vanishes in the limit δ → 0.
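The parameterization used in the spurious-pole check approaches the locus $\langle 1abc\rangle = 0$ linearly in δ: by multilinearity of the determinant, $Z_1 = \alpha Z_a + \beta Z_b + \gamma Z_c + \delta Z_9$ gives $\langle 1abc\rangle = \delta\,\langle 9abc\rangle$. The following sketch verifies this with exact rational arithmetic; the momentum twistors are arbitrary sample vectors chosen for the example, not kinematic data from the paper.

```python
from fractions import Fraction
from itertools import permutations


def det4(rows):
    """Exact 4x4 determinant via the Leibniz formula."""
    total = Fraction(0)
    for perm in permutations(range(4)):
        sign = 1
        for i in range(4):
            for j in range(i + 1, 4):
                if perm[i] > perm[j]:
                    sign = -sign
        term = Fraction(sign)
        for row, col in enumerate(perm):
            term *= rows[row][col]
        total += term
    return total


def four_bracket(zi, zj, zk, zl):
    """Four-bracket <i j k l> of momentum twistors."""
    return det4([zi, zj, zk, zl])


def approach_spurious_pole(za, zb, zc, z9, alpha, beta, gamma, delta):
    """Z_1 = alpha Z_a + beta Z_b + gamma Z_c + delta Z_9."""
    return [alpha * a + beta * b + gamma * c + delta * d
            for a, b, c, d in zip(za, zb, zc, z9)]


# Arbitrary sample twistors (illustrative only)
ZA = [1, 2, 3, 5]
ZB = [1, -1, 4, 2]
ZC = [2, 0, 1, 7]
Z9 = [3, 1, -2, 1]

if __name__ == "__main__":
    for delta in (Fraction(1, 10), Fraction(1, 1000), Fraction(0)):
        z1 = approach_spurious_pole(ZA, ZB, ZC, Z9, 2, -3, 5, delta)
        print(delta, four_bracket(z1, ZA, ZB, ZC))
```

Because the bracket vanishes linearly in δ, a coefficient that stays finite times $1/\langle 1abc\rangle$ would diverge like 1/δ, which is why the coefficient of [1abc9] itself has to vanish in the limit.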

Discussions
In this paper, following the n = 8 result [34], we have systematically studied NMHV amplitudes to all multiplicities based on the recursive method of $\bar{Q}$ equations [24]. In addition to the first all-loop results for last-entry conditions of n-point NMHV amplitudes (2.16)-(2.18), we have focused on the computation of two-loop NMHV amplitudes. The main results we have presented are the symbol and alphabet of the non-trivial, algebraic words, derived using the relevant four-mass boxes for one-loop N$^2$MHV amplitudes. For a generic square root, which involves four corners with at least 3 particles each, we find 50 algebraic letters (4.3) satisfying exactly 33 multiplicative relations, (4.5)-(4.7), thus resulting in 17 independent algebraic letters (for degenerate cases the number reduces to 17 − 2m, with 1 ≤ m ≤ 4 corners containing 2 particles). The symbol has a nice pattern where the R invariant and the last entry are directly correlated with the algebraic letter in the third entry (while the first two entries are the symbol of the four-mass box). Moreover, we have computed for the first time the complete symbol for n = 9, and obtained the full alphabet with 59 × 9 rational letters in addition to 11 × 9 algebraic ones. Our results have passed various consistency checks, and interestingly the rational letters for n = 9 are in tension with the Landau analysis, though the majority of them are consistent with it.
One of the motivations here is to extend the n = 8 alphabet [34] to higher n, namely to obtain the algebraic letters for all n and the full alphabet for at least n = 9. It is straightforward but tedious to compute the full alphabet for higher n, which would provide a new family of data points besides the n-point MHV alphabet [42]. It is then highly desirable to "explain" such alphabets from certain mathematical structures [35][36][37]. We have provided a simple explanation by listing the letters/poles of one-loop leading singularities with MHV/NMHV corners, and it would be interesting to pursue that direction further. For example, even when restricted to quadratic ones, we find many "new" algebraic letters/poles (most of them with new ∆'s) of higher-loop leading singularities already for n = 9, 10, and it would be interesting to see which of them appear as symbol letters.
Moreover, the remarkable simplicity of the algebraic words suggests deeper structures, and it is worth studying properties such as cluster adjacency/extended Steinmann relations [69][70][71], now for the part involving algebraic letters. It is also highly desirable to "complete" such algebraic words into integrable ones, which would allow us to write down weight-4 functions for the algebraic part.
Regarding the computation of loop amplitudes, the most pressing question is to compute the long-sought-after symbol of the three-loop n = 8 MHV amplitude from our two-loop n = 9 NMHV results. As is familiar from earlier $\bar{Q}$ computations, the computation of MHV amplitudes from NMHV ones can be completely automated, though again we need to rationalize all the square roots as we have done for two-loop NMHV amplitudes in this paper. This is a tedious but straightforward exercise, and we expect to report the result in the near future [72], which would add a data point for the alphabet as well as give the "lost symbol" for the octagon. Moreover, since the method for rationalizing square roots works for all multiplicities, it is conceivable that one can compute the algebraic words of higher-point three-loop MHV amplitudes from those of two-loop NMHV amplitudes as well.
We have focused on the symbol so far, but it should be possible to obtain polylogarithmic functions from our symbol, at least for two-loop NMHV octagons (see recent works on heptagons [73]). Moreover, a fascinating question is whether we can "bootstrap" octagons based on the alphabet, first and last entries, as well as constraints from collinear limits etc., similar to the hexagon and heptagon bootstrap. A potential issue is how to implement (extended) Steinmann relations, which at least naively do not apply to n = 8 (or any multiple of 4) due to the lack of a BDS-like normalization [74]. If one could resolve that issue, it may be possible to bootstrap to three loops and higher, which would be a strong test of some conjectural alphabet of octagons. A particularly simple example is given by our special class of components which are free of algebraic letters: for example, the component $\chi_1 \chi_3 \chi_5 \chi_7$ of the octagon has a simple symbol with only 68 (out of 180) rational letters, and it would be interesting to uplift it to a weight-4 function (or even to bootstrap it directly). Given the simple relation of such components to double-pentagon integrals, such results may also shed light on these unknown Feynman integrals.
It would be interesting to push the limit of our method based on anomaly equations even further. Higher-point three-loop NMHV and four-loop MHV amplitudes can be reached if we have the corresponding two-loop N$^2$MHV amplitudes. The simplest one is the two-loop N$^2$MHV octagon, which should be completely fixed by the $\bar{Q}$ equations and parity; the point is not only to re-derive the three-loop NMHV heptagon and the four-loop MHV hexagon from first-principle computations, but also to illustrate the structures of the $\bar{Q}$ method further. Of course, to go to even higher n, k and loops, we would need the more general method that involves solving both the $\bar{Q}$ and $Q^{(1)}$ equations, which are related by parity. We leave the study of the anomaly equations and their applications to higher-loop amplitudes to the future. Finally, it is tempting to ask the following: can we formulate a question based on these anomaly equations, to which the non-perturbative S-matrix of planar N = 4 SYM is the (unique) answer?

A The effects of $d\,d^3\chi_{n+1}$ on all rational N$^2$MHV Yangian invariants
Here we require $1 \le i_1 < i_2 < \cdots < i_{10} \le n+1$ and define $X := n \wedge B$, where $Z_B = Z_{n-1} - C\tau Z_1$. The effects of the operation $d\,d^3\chi_{n+1}$ on NMHV Yangian invariants are known from [24]. There are 14 classes of N$^2$MHV Yangian invariants, which can be found in [2]. One of them is the four-mass box Yangian invariant, which we have elaborated on in the main text.

The effect of $d\,d^3\chi_{n+1}$ on this Yangian invariant is the same as in (A.2).

The effect of $d\,d^3\chi_{n+1}$ on this Yangian invariant is the same as on an expression for which we can apply eq. (A.1).

For generic $i_1 < i_2 < i_3 < i_4 < i_5$ where $i_1 > 1$ and $i_5 < n-1$, the effect of $d\,d^3\chi_{n+1}$ on this Yangian invariant is the same as on an expression for which we can apply eq. (A.1).

For $i_1 = 1$ and $i_6 < n-1$, eq. (A.22) simplifies, and the effect of $d\,d^3\chi_{n+1}$ on this Yangian invariant is the same as on an expression for which we can apply eq. (A.1).

The effect of $d\,d^3\chi_{n+1}$ on this Yangian invariant is the same as on an expression for which we can apply eq. (A.1).

One of the R invariants has a smooth limit under $Z_{n+1} \to Z_n$, and the other follows the replacement rule (A.1).

B Simple facts of field extension
When the symbol involves algebraic letters, there is an important technical question: how do we find a basis of (numerical) algebraic letters such that every algebraic letter is a product of powers of letters in the basis, times some rational number? It is difficult to find such a basis directly because the rational numbers are undetermined, so we first normalize the algebraic letters to fix this ambiguity by introducing the norm of a number in a field extension [75]. Suppose the letters involve the square roots $\sqrt{c_1}, \sqrt{c_2}, \ldots, \sqrt{c_n}$, where the $\{c_i\}_{1\le i\le n}$ are multiplicatively independent. Consider the field $K = \mathbb{Q}(\sqrt{c_1}, \sqrt{c_2}, \ldots, \sqrt{c_n})$. As a field extension of $\mathbb{Q}$, K is a $2^n$-dimensional $\mathbb{Q}$-vector space; each element a ∈ K defines a linear operator $L_a: K \to K$ by $L_a(b) := ab$, and we define the norm N(a) to be $\det(L_a)$. It is clear from the definition that N(ab) = N(a)N(b), and that $N(a) = a^{2^n}$ if $a \in \mathbb{Q}$.
The main lemma used here is that 1 and −1 are the only possible rational numbers with unit norm in K.
Since our irrational letters can always be brought to a form with unit norm, such as the form (a + ∆)/(a − ∆) used in the main text, if a product $\prod_\alpha l_\alpha^{n_\alpha} \in K$ is rational, it can only be 1 or −1 according to the lemma. Therefore, such a multiplicative relation is equivalent, up to an overall sign, to a linear relation among the $\log|l_\alpha|$, namely $\sum_\alpha n_\alpha \log|l_\alpha| = 0$, which is very easy to handle by computer, e.g. with the PSLQ algorithm [76].
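The procedure can be illustrated in the simplest setting of a single square root, $K = \mathbb{Q}(\sqrt{c})$, where the norm of $p + q\sqrt{c}$ is the determinant of the 2 × 2 multiplication matrix, $p^2 - q^2 c$. The sketch below uses two made-up unit-norm letters in $\mathbb{Q}(\sqrt{5})$ and, instead of PSLQ, a brute-force search over small integer coefficients, which is only a stand-in adequate for toy examples. All names and sample letters are assumptions for the illustration.

```python
import math
from fractions import Fraction
from itertools import product


def norm_quadratic(p, q, c):
    """Norm of p + q*sqrt(c) in Q(sqrt(c)).

    In the basis {1, sqrt(c)}, multiplication by p + q*sqrt(c) is the
    matrix [[p, q*c], [q, p]], whose determinant is p^2 - q^2 * c.
    """
    p, q, c = Fraction(p), Fraction(q), Fraction(c)
    return p * p - q * q * c


def find_relation(logs, max_coeff=4, tol=1e-9):
    """Smallest nonzero integer vector n with sum_i n_i * logs[i] ~ 0.

    Brute-force substitute for PSLQ; candidates are ordered by their
    L1 norm, and the first nonzero entry is required to be positive.
    """
    candidates = sorted(
        product(range(-max_coeff, max_coeff + 1), repeat=len(logs)),
        key=lambda t: sum(abs(x) for x in t),
    )
    for coeffs in candidates:
        nonzero = [x for x in coeffs if x != 0]
        if not nonzero or nonzero[0] < 0:
            continue
        if abs(sum(n * x for n, x in zip(coeffs, logs))) < tol:
            return coeffs
    return None


# Two sample letters of the form (a + sqrt(5)) / (a - sqrt(5))
SQRT5 = math.sqrt(5.0)
LETTER_1 = (3 + SQRT5) / (3 - SQRT5)
LETTER_2 = (1 + SQRT5) / (1 - SQRT5)   # negative, hence the absolute value
LOGS = [math.log(abs(LETTER_1)), math.log(abs(LETTER_2))]

if __name__ == "__main__":
    # Unit norm: N(3 + sqrt5) / N(3 - sqrt5) and N(1 + sqrt5) / N(1 - sqrt5)
    print(norm_quadratic(3, 1, 5) / norm_quadratic(3, -1, 5))
    print(norm_quadratic(1, 1, 5) / norm_quadratic(1, -1, 5))
    print(find_relation(LOGS))
```

Here the relation found corresponds to LETTER_1 = LETTER_2², with the overall sign ambiguity removed by working with absolute values, exactly as in the lemma above.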

C Details on case (iv)
For case (iv), the rational parameterizations (3.15) and (3.16) are not available, but one can easily find the following parameterization:

$$\tau = \frac{r(t+s)}{t(t+1)}. \tag{C.1}$$