A Three-Point Form Factor Through Five Loops

We bootstrap the three-point form factor of the chiral part of the stress-tensor supermultiplet in planar $\mathcal{N}=4$ super-Yang-Mills theory, obtaining new results at three, four, and five loops. Our construction employs known conditions on the first, second, and final entries of the symbol, combined with new multiple-final-entry conditions, ``extended-Steinmann-like'' conditions, and near-collinear data from the recently-developed form factor operator product expansion. Our results are expected to give the maximally transcendental parts of the $gg\to Hg$ and $H\to ggg$ amplitudes in the heavy-top limit of QCD. At two loops, the extended-Steinmann-like space of functions we describe contains all transcendental functions required for four-point amplitudes with one massive and three massless external legs, and all massless internal lines, including processes such as $gg\to Hg$ and $\gamma^*\to q\bar{q}g$. We expect the extended-Steinmann-like space to contain these amplitudes at higher loops as well, although not to arbitrarily high loop order. We present evidence that the planar $\mathcal{N}=4$ three-point form factor can be placed in an even smaller space of functions, with no independent $\zeta$ values at weights two and three.


Introduction
The most important production mechanism for the Higgs boson at the Large Hadron Collider is the gluon fusion process, mediated by a top quark loop. Higher-order QCD corrections to this process are very large, necessitating its understanding to at least next-to-next-to-next-to-leading order (N$^3$LO) in the strong coupling $\alpha_s$ [1,2]. In particular, matrix elements for the Higgs boson plus $n$ additional gluons are required. In the limit where the mass of the top quark is taken to be infinite, these amplitudes can be thought of as the $n$-point form factors of the gauge-invariant local composite operator $\mathrm{tr}(F^2)$ [3,4]:
$$F_{\mathrm{tr}(F^2)}(p_1, \ldots, p_n; q) = \int d^dx\, e^{-iq\cdot x}\, \langle 1, \ldots, n|\, \mathrm{tr}(F^2)(x)\, |0\rangle\,, \qquad (1.1)$$
all loop orders by the integrability-based pentagon operator product expansion (POPE) [50-59]. Recently, an analogous description of the near-collinear limit of form factors has been developed, termed the form factor operator product expansion (FFOPE) [60-62]. It provides physical constraints on the near-collinear behavior of the form factors of the chiral part of the stress-tensor supermultiplet at any loop order.
In this paper, we study the three-point maximally-helicity-violating (MHV) form factor of the chiral part of the stress-tensor supermultiplet. This form factor depends on the three ratios $u = s_{12}/q^2$, $v = s_{23}/q^2$ and $w = s_{31}/q^2$. However, since $q^2 = s_{123} = s_{12} + s_{23} + s_{31}$, these variables satisfy the constraint $u + v + w = 1$, and the form factor actually depends on only two variables. These are exactly the same variables on which the corresponding Higgs boson amplitudes in QCD depend. The two-loop contribution to this form factor has been computed via unitarity methods; moreover, it was shown to be uniquely determined by a bootstrap approach involving conditions on the first, second, and final entries of the symbol [9]. The first- and second-entry conditions restrict the single and double discontinuities of the form factor, while the final-entry conditions constrain its differential structure. Here we exploit the recent progress on the FFOPE to bootstrap this form factor all the way to five loops.
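As a small numerical illustration of the constraint $u + v + w = 1$, the following Python sketch computes the three ratios from a set of Mandelstam invariants. The function name `form_factor_ratios` and the sample values are illustrative assumptions and do not come from the paper.

```python
def form_factor_ratios(s12, s23, s31):
    """Return (u, v, w) = (s12, s23, s31) / q^2, with q^2 = s12 + s23 + s31."""
    q2 = s12 + s23 + s31  # momentum conservation: q^2 = s_123
    if q2 == 0:
        raise ValueError("q^2 must be non-zero to define the ratios")
    return s12 / q2, s23 / q2, s31 / q2


# Hypothetical sample point, chosen only for illustration.
u, v, w = form_factor_ratios(2.0, 3.0, 5.0)
print(u, v, w, u + v + w)
```

Because the ratios sum to one by construction, any two of them can serve as independent variables.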
In addition to leveraging the FFOPE, we observe and exploit two new forms of mathematical structure in the infrared-finite part of the form factor, which are not obeyed by the remainder function. First, we observe new multiple-final-entry conditions that restrict linear combinations of the last two or three entries of the symbol of the form factor. Second, we discover certain "extended-Steinmann-like" (ES-like) conditions [63,64] at all depths in the symbol. In the three-point form factor alphabet {u, v, w, 1 − u, 1 − v, 1 − w}, these ES-like conditions imply that 1 − u never appears next to 1 − v or 1 − w in the symbol, nor 1 − v next to 1 − w. While this condition does not transparently follow from the standard Steinmann relations, it is inspired by studying the Steinmann and cluster algebra constraints for heptagon functions [45,64], since the pentabox ladder integrals [65,66] contribute to both the seven-particle amplitude and the three-point form factor.
To carry out our bootstrap, we define a space of polylogarithmic functions M that contains the infrared-finite part of the form factor. This function space draws from the symbol alphabet $\{u, v, w, 1 - u, 1 - v, 1 - w\}$ and obeys the ES-like conditions described in the last paragraph. We find that the dimension of M is exactly $3^w$ at weight $w$ (at symbol level). Although this growth rate is considerably slower than without the ES-like conditions, it is still considerably faster than the hexagon function space H associated with the six-point amplitude, which grows approximately as $\sim 1.8^w$ [63]. On the other hand, the simple dependence of the dimension on the weight suggests that a direct construction of this function space should be possible.
In the absence of a direct construction of M, it would be helpful if there were a smaller space of functions that contained the finite part of the form factor. And indeed, when we normalize the form factor in a slightly different way, we learn by taking its derivatives (or rather, its iterated $\{n - 1, 1\}$ coproducts) that the space M is still larger than necessary for the planar $\mathcal{N} = 4$ form factor. We also begin to see aspects of cosmic Galois theory, or the coaction principle [67-70], which is also seen in the space of hexagon functions [63]. Notably, in this smaller space C, the constants $\zeta_2$ and $\zeta_3$ no longer need to be treated as independent functions; they are locked to the other, symbol-level functions.
The rest of this paper is structured as follows. We review some basic properties of the three-point MHV form factor of the chiral part of the stress tensor in section 2. Moreover, we analyze its two-loop remainder function, which exhibits many of the properties that we will generalize to higher loop orders in subsequent sections. In section 3, we construct the function space M in which the form factor lives. We then bootstrap this form factor at three-, four-, and five-loop orders, and analyze the minimal space C ⊂ M that appears in the coproduct of these functions in section 4. In section 5, we study the behavior of the remainder function through five loops, plotting its dependence on various combinations of parameters, and considering several kinematic limits. In section 6, we present evidence that the space M may also govern amplitudes with the same kinematics in arbitrary massless theories. Our conclusions and outlook are contained in section 7.
There are two appendices. Appendix A describes the pentabox ladders and their relation to both M and heptagon functions. Appendix B collects some explicit results for the near-collinear limits of the form factor remainder function needed to make contact with the FFOPE.
Ancillary files: We include three ancillary files. The first one, cEandRsymbols.txt, gives the symbols of the finite part of the form factor through five loops, and of the remainder function through four loops. The second and third files, T2terms.txt and T4terms.txt, provide the $T^2$ and $T^4$ terms, respectively, in the near-collinear limit through five loops.

BPS Form Factors and Polylogarithms
In this paper, we study the half-BPS operator corresponding to the chiral part of the stress-tensor supermultiplet in planar $\mathcal{N} = 4$ sYM theory. This supermultiplet includes the scalar operator $\mathrm{tr}(\phi^2)$, which is part of the so-called $20'$ multiplet, as well as the chiral part of the on-shell Lagrangian, which includes the self-dual operator $\mathrm{tr}(F^2_{\rm SD})$. We refer to refs. [15,18,71] for more background on these form factors, and their formulation in $\mathcal{N} = 4$ harmonic superspace.
Similar to scattering amplitudes, a helicity degree can be assigned to form factors, corresponding to the helicity of the $n$ external massless states. Form factors with helicity degree $n - k - 2$ are related to those with helicity degree $k$ by parity, which acts on the external states as well as on the operator. Thus, non-trivial N$^k$MHV form factors first occur for $n = 2k + 2$.
For $n = 2$ external states, the form factor only depends on the single scale $s_{12} = (p_1 + p_2)^2$, where $p_i$ is the momentum associated with the $i$th external state. Dimensional analysis dictates that this scale dependence has to factor out, leaving us with just a number at each perturbative order. The first case described by a non-trivial function thus involves $n = 3$ external states. The only non-trivial form factor for this number of external states is MHV, namely
$$F_{n,k=0} \equiv F^{\rm MHV}\,. \qquad (2.1)$$
For $n = 3$, it depends on the three Mandelstam invariants $s_{12}$, $s_{23}$, and $s_{31}$, with $s_{ij} = (p_i + p_j)^2$. We rescale these invariants by the invariant mass of the operator's momentum, $q^2 = s_{123}$, in order to define three dimensionless ratios,
$$u = \frac{s_{12}}{q^2}\,, \qquad v = \frac{s_{23}}{q^2}\,, \qquad w = \frac{s_{31}}{q^2}\,. \qquad (2.3)$$
These variables obey the constraint
$$u + v + w = 1 \qquad (2.4)$$
due to momentum conservation; as in the case $n = 2$, the overall scale dependence factors out.
The form factor $F_n^{\rm MHV}$ obeys an exponentiation relation similar to that of amplitudes in planar $\mathcal{N} = 4$ sYM theory [9]. We thus define a finite remainder function $R_n$ by
$$F_n^{\rm MHV} = F_n^{\rm BDS} \exp[R_n]\,, \qquad (2.5)$$
where $F_n^{\rm BDS}$ is an appropriately exponentiated one-loop form factor [72]. (We will provide more details in the three-point case shortly.) Because the collinear behavior of amplitudes iterates in planar $\mathcal{N} = 4$ sYM theory [73], the remainder function has smooth collinear limits, Moreover, it is only non-zero starting at three points and at two loops, when expanded in the 't Hooft coupling for gauge group $SU(N)$, $g^2 = g^2_{\rm YM} N/(16\pi^2)$: Thus, the three-point form factor remainder function has vanishing collinear limits in all three channels,

The two-loop remainder function
The two-loop contribution to F MHV 3 was computed explicitly in ref. [9]. There, it was found that the remainder function could be expressed entirely in terms of classical polylogarithms as where u i+3 = u i , {u 1 , u 2 , u 3 } = {u, v, w}, and we have made use of the function (2.10) While this form of R 3 makes its dihedral symmetry manifest, we recall that it is really just a function of two variables due to the constraint (2.4).
Classical polylogarithms are particular examples of generalized polylogarithms, or iterated integrals over logarithmic integration kernels [74-79]. These functions can be defined iteratively by an integration base point and their total differential, where the sum is over all logarithmic branch points appearing in the polylogarithmic function $F$. They are commonly expressed in the notation For instance, classical polylogarithms become Another important subclass of generalized polylogarithms are the harmonic polylogarithms (HPLs) [77] $H_{\vec{a}}(z)$ with $a_i \in \{0, 1, -1\}$. They are related by where $p$ is the number of letters 1 in $\vec{a}$. Transcendental constants such as multiple zeta values (MZVs) and alternating Euler-Zagier sums also naturally appear in this space, as special values of the functions (2.12). Generalized polylogarithms can be assigned a transcendental weight, corresponding to the number of logarithmic integrations that appear in their definition; the weight of a product of polylogarithms is given by the sum of weights. In the $G_{a_1,\ldots,a_n}(z)$ notation, this simply corresponds to the number of indices $n$. Similar to amplitudes in planar $\mathcal{N} = 4$ sYM theory, the form factor remainder function (2.9) is observed to have a uniform transcendental weight equal to twice the loop order.
To understand the analytic structure of the polylogarithmic function (2.9), it proves useful to study its symbol [80]. The symbol of a generic polylogarithm F can be defined recursively in terms of its total differential (2.11) to be where S maps a rational function back to itself, allowing this recursion to terminate. The symbol can also be defined as the maximal iteration of the motivic coaction that acts on generalized polylogarithms [81][82][83][84][85]. This definition allows one to retain information about all constant letters φ other than iπ, which can be useful in practice; see for instance ref. [86]. The algebraic functions φ that appear in the tensor product (2.15) are referred to as symbol letters, and the set of all multiplicatively independent letters appearing in the symbol of F is referred to as its symbol alphabet. The symbol alphabet identifies the location of a function's logarithmic branch points; correspondingly, the symbol can be thought of as extracting all of a function's non-zero sequences of discontinuities.
The symbol of R 3 is found to take the remarkably simple form [9]: 16) where the expression in the brackets is summed over all three cyclic permutations (u i → u i+1 ) of the external states. From this result, we can read off the symbol alphabet of R 3 to be In addition, we see that the first entry of the symbol is always drawn from the smaller set consistent with the branch cuts for massless processes always starting at either s ij = 0 or There is a natural action of the dihedral group D 3 ≡ S 3 , which is generated by the two transformations: The MHV form factor and remainder function should both be invariant under S 3 , i.e. under all permutations of u, v, w.
After eliminating w in favor of u and v in the symbol alphabet (2.17) using the constraint (2.4), the symbol alphabet can be rewritten as (2.20) It follows that not all functions that draw on this symbol alphabet can be written as (products of) single-variable functions. Instead, the alphabet (2.20) can be seen to give rise to the space of 2dHPLs [87].

Adjacent-letter restrictions on the form factor
Further interesting properties arise when we study the finite part of the form factor itself, instead of the remainder function. This procedure is similar to defining a BDS-like normalized amplitude, which exposed the constraints from the Steinmann relations in six- and seven-point amplitudes [42,45]. The one-loop form factor is [9,14] $M^{(1)}$ where we have omitted an overall factor of $e^{-\epsilon\gamma_E}/(4\pi)^{-\epsilon}$, with $\gamma_E$ the Euler-Mascheroni constant, and where is the finite, dual conformally invariant part.$^3$ For our "BDS-like" ansatz here, we (initially) use the minimal infrared-divergent part of $M$ (2.23) When we convert from BDS normalization (2.5) to BDS-like normalization, (2.26)
$^3$ We chose the $\zeta_2$ part of $E^{(1)}$ so that $E^{(1)}$ has no constant under the logarithms in the soft limit $u, v \to 0$. In any event, we will shift to another normalization later.
$^4$ Henceforth, we drop the $n = 3$ subscript.
What consequences does this have for the symbol of $E^{(2)}$? Expanding eq. (2.25) to order $g^4$, we have Notice that in several terms in the symbol (2.16), the letter $(1 - u)$ appears adjacent to $(1 - v)$ or $(1 - w)$. However, these appearances are all attributable to the term $-2\,[\sum_i \mathrm{Li}_2(1 - 1/u_i)]^2$ in eq. (2.9). For example, the symbol of $J_4$ is which has the letters $(1 - u_i)$ only in the fourth entry. Using $\mathrm{Li}_2(1 - 1/u) = -\mathrm{Li}_2(1 - u) - \frac{1}{2} \ln^2 u$, it is easy to see that the terms with $(1 - u_i)$ adjacent to $(1 - u_j)$ for $i \neq j$ are all cancelled by adding $\frac{1}{2} [E^{(1)}]^2$ to the remainder function. Thus, the symbol of $E^{(2)}$ obeys the novel ES-like restrictions plus the five other conditions generated by the $S_3$ dihedral symmetry (2.19).$^5$ Using compatibility graphs, the same restriction can be seen to hold to all loop orders for planar ladder-box integrals [90], including the explicit three-loop result of ref. [91]. As we will discuss further in section 6, there is considerable evidence at two loops that these restrictions are broadly applicable to all processes with the same kinematics, i.e. four-point scattering amplitudes with one massive external leg and three massless ones, and all massless internal lines, planar and non-planar. Correspondingly, we will adopt eq. (2.29) as part of the definition of the form factor function space M in the next section. The condition (2.29) does not appear to arise from any standard (extended) Steinmann relation [92,93]; in fact, the letters $1 - u_i$ are not associated with physical thresholds. On the other hand, we understand these conditions in the pentabox ladder integrals [65,66]. As will be discussed further in appendix A, these integrals belong to both M and the space of heptagon functions relevant for seven-point amplitudes. As heptagon functions, they inherit adjacency restrictions from (extended) Steinmann relations (or cluster adjacency conditions) governing this space [45,64], and these restrictions imply eq. (2.29).
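The dilogarithm identity used in this argument, $\mathrm{Li}_2(1 - 1/u) = -\mathrm{Li}_2(1 - u) - \frac{1}{2}\ln^2 u$, can be checked numerically. The sketch below is a minimal illustration and not part of the original analysis: the helper names `li2` and `landen_residual` are hypothetical, and the dilogarithm is evaluated from its power series, so the check is restricted to $1/2 < u < 1$, where both arguments have modulus below one.

```python
import math


def li2(z, terms=400):
    """Dilogarithm Li_2(z) from its power series sum_k z^k / k^2 (needs |z| < 1)."""
    if abs(z) >= 1:
        raise ValueError("this series evaluation requires |z| < 1")
    return sum(z**k / k**2 for k in range(1, terms + 1))


def landen_residual(u):
    """Li_2(1 - 1/u) + Li_2(1 - u) + (1/2) ln^2 u, which the identity sets to zero."""
    return li2(1 - 1 / u) + li2(1 - u) + 0.5 * math.log(u) ** 2


# Sample points in (1/2, 1), chosen only for illustration.
for u in (0.6, 0.75, 0.9):
    print(u, landen_residual(u))
```

The printed residuals should vanish up to floating-point rounding, which is the statement that trading $\mathrm{Li}_2(1 - 1/u_i)$ for $\mathrm{Li}_2(1 - u_i)$ costs only a product of logarithms.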

Kinematic regions
Finally, let us discuss the various kinematic regions that can be accessed. All real values of $(u, v)$ correspond to physical scattering or decay processes, as depicted in Fig. 1. In the Euclidean region I, where all four of the Mandelstam invariants are negative and $0 < u, v, w < 1$, the form factor is manifestly real. For infrared-finite expressions, this region is equivalent to the pseudo-Euclidean region with all four invariants positive, which describes the decay of the (time-like) operator insertion into three massless particles, for instance a Higgs boson decaying into three positive-helicity gluons, $H \to g^+ g^+ g^+$. Scattering region IIa, where $w < 0 < u, v$, describes instead a scattering process involving a space-like operator (similar to deep-inelastic scattering), such as a (space-like) Higgs boson and a gluon scattering into two gluons, $H g^- \to g^+ g^+$. Scattering regions IIb and IIc are obtained by cyclically permuting $(u, v, w)$ from region IIa. Finally, scattering region IIIa with $u, v < 0 < w$, as well as its images under cyclic permutations, IIIb and IIIc, describe time-like Higgs production, say $g^- g^- \to H g^+$.
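The region assignments above depend only on the signs of $u$, $v$, and $w = 1 - u - v$. The following Python sketch classifies a real point $(u, v)$ accordingly. It is an illustration under stated assumptions: the function name `kinematic_region` is hypothetical, and rather than guessing which cyclic image carries the label b or c, it returns the region family together with the variable that distinguishes the image.

```python
def kinematic_region(u, v):
    """Classify (u, v), with w = 1 - u - v, by the signs of the three ratios.

    Returns ("I", None) if all ratios are positive, ("II", name) if exactly one
    ratio (called name) is negative, ("III", name) if exactly one ratio (called
    name) is positive, and None on a boundary where some ratio vanishes.
    """
    ratios = {"u": u, "v": v, "w": 1.0 - u - v}
    if any(x == 0 for x in ratios.values()):
        return None  # boundary between regions
    negative = [name for name, x in ratios.items() if x < 0]
    if not negative:
        return ("I", None)
    if len(negative) == 1:
        return ("II", negative[0])
    # Two ratios are negative; all three cannot be, since they sum to one.
    positive = [name for name, x in ratios.items() if x > 0]
    return ("III", positive[0])


print(kinematic_region(0.3, 0.3))    # Euclidean region I
print(kinematic_region(0.8, 0.7))    # w < 0 < u, v, i.e. region IIa in the text
print(kinematic_region(-0.5, -0.5))  # u, v < 0 < w, i.e. region IIIa in the text
```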
The dashed lines in Fig. 1 correspond to potential spurious poles, because they coincide with the vanishing loci of the letters 1 − u, 1 − v, and 1 − w. On the line u = 1, the letters (2.20) of S 3 become {v, 1 − v, 1 + v}, and so the functions in M all collapse to HPLs H a (v) with a i ∈ {0, 1, −1}, making it straightforward to plot the form factor or remainder function there. On the dotted line with u = v, the letters become {u, 1 − u, 1 − 2u}, which maps to the same function space with argument 2u − 1. In section 5.1, we will plot the remainder function on the dashed line u = 1 and the dotted line u = v through five loops.

The Form Factor Function Space
The polylogarithms that contribute to the one- and two-loop form factor have a number of notable features, which can be abstracted away from the specific weight-two and weight-four functions $E^{(1)}$ and $E^{(2)}$. In particular, the polylogarithms that appear in these functions can be individually chosen to obey certain branch cut conditions and restrictions on their adjacent symbol letters. In combination with the symbol alphabet (2.17), we generalize these properties to all transcendental weights to define the three-point form factor space of functions M. Specifically, we define M to be the space of polylogarithms that satisfies the following criteria:
(i) Symbol Alphabet: their symbol only involves letters that appear in $S_3$, as given in eq. (2.17) or equivalently eq. (2.20),
(ii) Branch Cut Condition: they develop logarithmic branch cuts only at physical thresholds, corresponding to their symbol's first entries belonging to $\{u, v, w\}$,
(iii) Extended-Steinmann-Like: the letter $1 - u$ never appears adjacent to $1 - v$ or $1 - w$ in the symbol, nor $1 - v$ next to $1 - w$.
We conjecture that this space of functions contains the perturbative three-point MHV form factor to all loop orders.
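At the level of individual symbol words, criteria (i)-(iii) are simple combinatorial tests. The Python sketch below checks them for a word given as a tuple of letter names. It is only an illustration: the string encoding of the letters and the function names are assumptions made here, and the test acts on single words, so it does not impose integrability and the word count it produces only bounds the dimension of the symbol-level space from above.

```python
from itertools import product

# Hypothetical string encoding of the six-letter alphabet.
LETTERS = ("u", "v", "w", "1-u", "1-v", "1-w")
FIRST_ENTRIES = {"u", "v", "w"}
ONE_MINUS = {"1-u", "1-v", "1-w"}


def allowed_word(word):
    """Check criteria (i)-(iii) for one symbol word (a tuple of letter names)."""
    if any(letter not in LETTERS for letter in word):
        return False  # (i) only letters from the alphabet
    if word and word[0] not in FIRST_ENTRIES:
        return False  # (ii) first entry must be u, v, or w
    for a, b in zip(word, word[1:]):
        if a in ONE_MINUS and b in ONE_MINUS and a != b:
            return False  # (iii) distinct letters 1 - u_i, 1 - u_j are never adjacent
    return True


def count_allowed_words(weight):
    """Number of words of the given length passing the word-level tests."""
    return sum(allowed_word(w) for w in product(LETTERS, repeat=weight))


print(allowed_word(("u", "1-u", "1-v")))  # violates (iii)
print(allowed_word(("u", "1-u", "1-u")))  # repeated letter is permitted
print(count_allowed_words(1), count_allowed_words(2))
```

Integrability then selects a subspace of the span of the allowed words, which is why the actual dimensions are smaller than these word counts.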
Since M contains only polylogarithms, it is graded by transcendental weight. Namely, we can decompose
$$M = \bigoplus_{w=0}^{\infty} M_w\,, \qquad (3.1)$$
where $M_w$ is the space of polylogarithms of weight $w$ that satisfy the above constraints. It is analogous to the hexagon and heptagon function spaces relevant to six- and seven-particle scattering in planar $\mathcal{N} = 4$ sYM theory [41-46, 63, 94-97], and it can be constructed iteratively in the weight. We also assume that the $L$-loop form factor has weight $2L$, i.e. $E^{(L)} \in M_{2L}$. We first describe how this iterative construction can be carried out using the coproduct formalism, and then we discuss how the dimension of $M_w$ grows with the weight $w$.

Construction of the space M
The method of building polylogarithmic spaces of functions from a fixed symbol alphabet has been described in a number of places in the literature (see for instance refs. [41,98] or appendix D of ref. [96]), so we here describe it only briefly. The basic idea is to build the space of coproducts that correspond to the desired polylogarithms, rather than the functions themselves. This simplifies the description of the function space at each weight, at the cost of making certain properties of the functions non-manifest. Information about integration constants also has to be retained separately.
To begin our iterative construction of M, we first determine which logarithms exist in M 1 . From condition (i), we have a candidate six-dimensional space of logarithms. However, as described by condition (ii), form factors are only expected to have discontinuities at thresholds for physical particle production, i.e. where (multiple) internal propagators in Feynman integrals can go on shell. Since we are in a massless theory, this can only happen when one of the Mandelstam invariants vanishes. From equation (2.3), it is clear that setting any single Mandelstam invariant to zero causes u, v, or w to vanish (or all of them to become infinite); in no case does it cause any of them to approach 1. Thus, the only functions appearing in M at weight one are ln u, ln v, and ln w.
To proceed to higher weights, we construct the space of coproducts corresponding to the functions in M w , rather than the functions themselves. More specifically, at each weight we start from an ansatz for the coproduct component involving functions of one lower weight in the first entry: , and the c ij ∈ Q are undetermined coefficients. The coproduct of each function in M w will be given by some value of the coefficients c ij in this ansatz, but not all values of these coefficients correspond to valid functions in M w . Thus, we need to constrain this ansatz further.
By assumption, each of the functions F (i) w−1 satisfy constraints (i)-(iii). Thus, condition (i) is automatically satisfied by our ansatz for M w since each of the φ j are also drawn from S 3 . To impose condition (iii), we replace each function F where we have introduced notation in which we denote the 'φ k coproduct entry' of a function F with a superscript, as F φ k . The ES-like conditions can then be imposed on the doublecoproduct ansatz (3.3) by requiring that plus all dihedral permutations, where F φ i ,φ j denotes the linear combination of functions appearing in the first entry of the tensor product (3.2) with second and third entries ln φ i and ln φ j . In addition to this constraint, we must also require that F is a genuine function. We can ensure this by solving the integrability conditions on adjacent pairs of symbol letters. Taking into account the conditions (3.4) that we have already imposed, it is sufficient to require plus all dihedral permutations. Together, equations (3.4) and (3.5) generate 12 conditions on our ansatz. Finally, we must impose the branch cut condition (ii) on our ansatz. This can be done by requiring plus the two cyclically related conditions These conditions prevent the logarithms ln(1−u), ln(1−v), ln(1−w) from appearing at higher weights when multiplied by zeta values. As discussed above, such functions have branch cuts in unphysical locations, so they must be forbidden. In principle, eq. (3.6) could be imposed at other points on the line u = 1. However, it is simplest to impose the condition here, because functions in M in the vicinity of u = 1 with v, w → 0 are simple polynomials in ln v and ln w, with zeta-valued coefficients. We require the vanishing of these polynomials for any This constraint removes functions even at symbol level, unlike what happens for the hexagon or heptagon functions. 
We can also fix the constant of integration for an arbitrary function F ∈ M at the same point (1, v → 0, w → 0); for example, we can require that the constant in the polynomial in ln v and ln w at this point is zero for each function.

Growth of the space M
Solving the conditions outlined in the last section at symbol level through weight eight, we find exactly $3^w$ independent functions at each weight $w$. The generating function for this dimensionality, $d(w) = 3^w$, is simply
$$\sum_{w=0}^{\infty} 3^w\, t^w = \frac{1}{1 - 3t}\,.$$
Such a simple dimensionality begs for an all-orders construction, but so far we have just let the computer solve the conditions. If we had not imposed the ES-like relations, we would expect a faster asymptotic growth rate of $4^w$. This growth rate stems from the fact that four of the letters in eq. (2.20) contain $u$. Hence $G_{0,\vec{a}}(u)$, $G_{1,\vec{a}}(u)$, $G_{1-v,\vec{a}}(u)$, and $G_{-v,\vec{a}}(u)$ all provide possible functions at weight $w$, for each allowed function $G_{\vec{a}}(u)$ at weight $(w - 1)$. At weight eight, we would expect about 10 times as many functions without the ES-like relations, making bootstrapping much more difficult.
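The factor of about 10 quoted at weight eight follows from comparing the two geometric growth rates. The Python sketch below reproduces the arithmetic; the function name is an assumption, and treating the space without ES-like relations as having exactly $4^w$ elements is a crude stand-in for its asymptotic growth rate, so the ratio is only an estimate.

```python
from fractions import Fraction


def geometric_series_coeffs(rate, max_weight):
    """Taylor coefficients of 1/(1 - rate*t) through t^max_weight, via c_w = rate * c_{w-1}."""
    coeffs = [1]
    for _ in range(max_weight):
        coeffs.append(rate * coeffs[-1])
    return coeffs


with_es = geometric_series_coeffs(3, 8)     # symbol-level dimensions 3^w
without_es = geometric_series_coeffs(4, 8)  # crude 4^w estimate without ES-like relations
ratio = Fraction(without_es[8], with_es[8])
print(with_es[8], without_es[8], float(ratio))
```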
When $v \to 0$, the symbol alphabet collapses to $\{u, 1 - u, v\}$. Thus, on the $v \to 0$ line, the limiting behavior of the function space M just involves logarithms in $v$, and HPLs $H_{\vec{a}}(1 - u)$ with $a_i \in \{0, 1\}$ [77]. The constants associated with these functions are MZVs, which (motivically) have the generating function [99-101] We define M to include constant functions for all such MZVs. Thus the dimensionality of the full function space M is generated by In practice, we have constructed the full function space through weight eight.
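The counting of constants can be sketched numerically. The code below uses the standard result that the number of independent motivic MZVs at weight $w$ is generated by $1/(1 - t^2 - t^3)$, i.e. $d_w = d_{w-2} + d_{w-3}$ with $d_0 = 1$, $d_1 = 0$, $d_2 = 1$. As a further assumption made here, and not a formula recovered from the text, it then counts the full space by multiplying every symbol-level function of weight $w - k$ by every MZV of weight $k$, which corresponds to multiplying the two generating functions. The function names are hypothetical.

```python
def mzv_dimensions(max_weight):
    """Number of independent motivic MZVs at each weight, from d_w = d_{w-2} + d_{w-3}."""
    dims = []
    for w in range(max_weight + 1):
        if w == 0:
            dims.append(1)  # the constant 1
        elif w == 1:
            dims.append(0)  # no MZV of weight one
        elif w == 2:
            dims.append(1)  # zeta_2
        else:
            dims.append(dims[w - 2] + dims[w - 3])
    return dims


def convolve(a, b):
    """Coefficients of the product of two power series, truncated to the shorter input."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[w - k] for k in range(w + 1)) for w in range(n)]


symbol_level = [3**w for w in range(9)]  # symbol-level dimensions quoted in the text
constants = mzv_dimensions(8)
# Assumed product structure: (symbol-level function) x (MZV constant).
full_space_estimate = convolve(symbol_level, constants)
print(constants)
print(full_space_estimate)
```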
Many of the functions in M are quite simple. Any polynomial in ln u, ln v, ln w is in the space. Furthermore, if a function F is in the space, then so is ln , where the last a k must be 1. On the other hand, we cannot multiply two of these functions for different u i together. For example, the product H 0, Let us define the "simple" functions in M to be all of the functions H a (1 − u i ) multiplied by arbitrary polynomials in all of the ln u j . The generating function for this space of simple functions is where the factor of 1/(1 − t) 3 accounts for the polynomials in all the ln u j . The remaining functions depend irreducibly on two variables, i.e. they are true 2dHPLs [87]. We do not yet have a closed-form construction of such functions in M. However, we can count them.
We remove the ones that are simply powers of logs times lower weight 2dHPL functions by multiplying by (1 − t) 3 . Thus the generating function for the 2d functions is at symbol level. Non-classical polylogarithms can first appear at weight four. Applying the Lie cobracket test in ref. [80], we find that there are 9 non-classical functions, which are contained within the set of 12 weight-four 2dHPLs in eq. (3.11). For example, the function r (2) i appearing in the operator form factor studied in ref. [102] contains non-classical polylogarithms and appears in the space M.

Bootstrapping the Three-Point Form Factor
Having constructed M up to weight eight, we now want to identify the planar N = 4 sYM form factor within this space. We first assume that the L-loop contribution to the form factor has uniform transcendental weight 2L, as is true for scattering amplitudes in this theory. A number of additional constraints were used to bootstrap the two-loop remainder function in ref. [9]. However, as discussed in section 2, the remainder function does not belong to M because it does not satisfy the constraint (2.29). Only the form factor E, defined via the relation (2.25), satisfies this constraint.
In fact, there is an even more optimal normalization for bootstrapping the three-point form factor. Its definition just differs from that of E by an exponentiated product of logs: where The shift from E to E corresponds to using a different BDS-like normalization of the form factor. Because Hence, although our bootstrap was initially carried out in terms of the form factor function E, we will also present the bootstrap constraints on E. In particular, we will show that E can be bootstrapped through four-loop order using very little information from the FFOPE. Before considering the constraints on E or E, let us first summarize the constraints that can be imposed on the remainder function: The translation of some of these constraints on R into constraints on E or E is not entirely simple. Consider for instance the vanishing of the L th discontinuity of R (L) in the u channel. From eq. (2.22), we have that Thus, if we take the coefficient of g 2L in eq. (2.25) and compute its L th discontinuity, we can immediately impose the condition that the second discontinuity of E (1) vanishes and similarly that the L th discontinuity of R (L) vanishes for all L. Doing so, we see that only the pure E (1) term in eq. (2.25) can contribute at symbol level. Using the Leibniz rule, we furthermore have that Thus, the vanishing discontinuity condition on the remainder function corresponds to the requirement that at the level of the symbol. Similarly, we can work out the final-entry condition on E. Using the fact that R u +R 1−u = 0 and that one derives the following conditions at higher loops: These relations hold at full function level. In practice it requires some work to utilize them, because the multiplication of the logarithms in X by the lower-loop form factors is not wellsupported by the {n−1, 1} coproduct organization of the function space in the bulk. 
However, one can make use of the behavior of each function in the collinear and near-collinear limits, as well as on the dotted and dashed lines in Fig. 1, to pin down where the functions on the right-hand sides of these equations sit within the function space.
In the E normalization, the final-entry condition can be phrased much more simply.
to all loop orders L. This normalization also turns out to obey simple multiple-final-entry conditions, which we will present in section 4.2.

Near-collinear limit via integrability
An important source of data for the perturbative form factor bootstrap stems from the near-collinear limit. Using the recently developed form factor operator product expansion, or FFOPE, the form factor can be determined in an expansion around the collinear limit for any value of the planar coupling constant [60-62]. Let us briefly review this construction. The FFOPE is based on the dual description of the form factor in terms of a periodic Wilson loop [14,34,35,37,38]. Similar to scattering amplitudes, this dual Wilson loop is defined via dual points $x_i$, where $x_{i+1} - x_i = p_i$. However, since $\sum_i p_i = q$, the dual Wilson loop for form factors is periodic instead of closed: $x_{i+n} - x_i = q$. Note that dual conformal symmetry acts on both the dual momenta and the periodicity constraint. Using dual momentum space, the expectation value of the (suitably regularized) Wilson loop can be written in terms of dual conformal cross ratios [60]. For the three-point form factor, we have (4.14) We can also express the cross ratios in terms of $T = e^{-\tau}$ and $S = e^{\sigma}$, in terms of which they are given by [60] . (4.15) Note that, in principle, algebraic symbol letters could appear in the form factor at higher weight; however, the fact that the OPE expansion parameter is $T^2$ rather than $T$ [60] provides strong motivation to consider only the six letters in eq. (2.17).$^7$ The FFOPE immediately applies to a finite ratio of Wilson loops, $W_n$, which is defined such that the ultraviolet divergences occurring at the cusps vanish; see ref. [60] for details. It can be translated to the remainder function $R_n$ via where W and $\Gamma_{\rm cusp}$ is the cusp anomalous dimension (2.26). The FFOPE is the large-$\tau$ expansion of the ratio $W_n$, and is given for $n = 3$ by [60]
where the sum is over a basis of eigenstates $\psi$ of the Gubser-Klebanov-Polyakov (GKP) flux tube. All ingredients in this expansion are known at finite coupling: the energies $E_\psi$ and momenta $p_\psi$ of states were found in ref. [119], the pentagon transitions $P$ in refs. [50-59] and the form factor transitions $F$ in refs. [60-62]. At any loop order in the weak-coupling expansion, only flux-tube states $\psi$ with a finite number of effective excitations contribute. The integral over the momenta of these excitations can then be done via residues to obtain a series expansion in $e^{\sigma}$ to any desired order.
$^7$ In higher-multiplicity form factors, it is also reasonable to expect elliptic polylogarithms and integrals over higher-dimensional manifolds to occur; see for instance refs. [104-118].
We have used the above procedure to obtain a series expansion of the terms of order T 2 up to O(S 300 ) through five loops, but this can be extended easily to higher loop orders and higher orders in S. 8 We have also generated selected results for the terms of order T 4 up to five-loop order. After a translation to the remainder function (4.16), they agree perfectly with the expansion of our result for the remainder function. In appendix B, we give the results for the expansion of R in closed form in S at two and partially at three loops; the higher-loop results are provided in the ancillary files T2terms.txt and T4terms.txt.
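The periodicity of the dual Wilson loop quoted earlier in this subsection follows from a one-line telescoping sum. The sketch below assumes that the momentum labels are extended periodically, $p_{j+n} \equiv p_j$, which is our reading of the construction rather than a statement taken from ref. [60]:

```latex
x_{i+n} - x_i
  = \sum_{j=i}^{i+n-1} \left( x_{j+1} - x_j \right)
  = \sum_{j=i}^{i+n-1} p_j
  = \sum_{j=1}^{n} p_j
  = q .
```

For a scattering amplitude the same sum vanishes by momentum conservation, which is why the dual polygon closes in that case.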

Multiple-final-entry conditions
One of our initial assumptions is that the remainder function obeys the same final-entry condition as the six-point amplitude, $R^{u_i} + R^{1-u_i} = 0$ [120]. 9 As described above, the L-loop form factor E (L) does not obey the same relation, since the one-loop form factor E (1) does not respect the same final-entry condition. However, the violation of this final-entry condition is predictable, as detailed in eqs. (4.8)-(4.11). From eq. (4.7), the sum over all three cyclic images of E (1), u + E (1), 1−u vanishes, and therefore E (L) does obey one homogeneous final-entry condition, Thus, E (L) seems to have five final entries. However, we know that only three of them can be truly new, due to the inhomogeneous relations (4.8)-(4.11) and the fact that we know the lower-loop form factors.
Similarly, we can ask how many independent k-th final entries there are, or {2L−k, 1, . . . , 1} coproducts of the L-loop form factor E (L) . This set of dimensions is given in Table 1. In Table 2 we present the same dimensions for E (L) . The number of independent functions appearing in the coproduct components of E (L) is considerably smaller, principally because E (1) obeys the same final-entry condition as R.

8 In practice, information from the perturbative form factor bootstrap was initially used to determine the higher-order behavior of the form factor transition, which in turn provided more subleading logarithms of T in the $T^2$ terms in the near-collinear limit, which were in turn needed to fix the form factor at higher loop orders.

9 It might be possible to derive this relation using methods similar to those used to derive the $\bar{Q}$ equation for closed polygonal Wilson loops.

Since the multiple-final-entry analysis is simpler for E, we focus on this normalization. For k = 1, the independent final entries can be taken to be (4.20) The rest are determined by the relation as well as its dihedral images. For k = 2, the independent next-to-final entries can be taken to be The final-entry pairs ending in one of the $u_i$ are related to those ending in $1 - u_i$ by taking further coproducts of eq. (4.21). So we only need to specify those ending in a $1 - u_i$, and of course $E^{1-u_j,1-u_i} = 0$ for $j \neq i$. That leaves only the $E^{u_j,1-u_i}$. They are all given by dihedral images of the relation, which was first found empirically at three loops.

Similarly, we find that, starting at three loops, there are 12 linearly independent triple final entries for E, which can be taken to be where the ellipses denote the cyclic images of the first four. The other triple final entries of the form $E^{x,u,1-u}$ and $E^{x,1-u,1-u}$ are related to these by as well as the (v ↔ w) flip of these relations. These relations, and eq. (4.23), hold for all the form factors through five loops.
Finally, Table 2 shows that there are 24 independent quadruple final entries, starting at four loops. There is a corresponding set of quadruple final entry relations that holds through five loops. We have not yet fully characterized these relations, or used them as constraints in our bootstrap.
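To make the content of the single final-entry condition stated at the start of this subsection explicit, it can be rewritten at the level of the total differential. The sketch below assumes the standard convention that the $\{n-1,1\}$ coproduct components $R^{x}$ of a weight-$n$ function are the coefficients of $d\ln x$ in $dR$, with $x$ running over the six letters $u_i$ and $1-u_i$:

```latex
dR = \sum_{i=1}^{3} \left[ R^{u_i}\, d\ln u_i + R^{1-u_i}\, d\ln(1-u_i) \right]
   = \sum_{i=1}^{3} R^{u_i}\, d\ln\frac{u_i}{1-u_i}
   \qquad \text{when } R^{u_i} + R^{1-u_i} = 0 .
```

In this form at most three of the six possible final entries are independent at symbol level, and they can be taken to be the ratios $u_i/(1-u_i)$.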

Imposing the constraints
We originally bootstrapped the form factor E (L) through five loops, and did not make much use of multiple-final-entry relations, relying more on data from the FFOPE [60][61][62]. 10 However, the implementation of the constraints for E (L) is generally simpler. In Table 3 we present the number of parameters remaining as we impose the various constraints described above on our ansatz for E (L) , working in the space of functions M. The final-entry condition in that table refers to eq. (4.21), while the next-to-final-entry condition refers to eq. (4.23). The vanishing of the L th discontinuity of R (L) in u is implemented at symbol level. At three loops, only one constant has to be fixed using the OPE data, and it can be done with just the T 2 ln 2 T S 2 ln 2 S coefficient, while we actually used significantly more OPE data when we  initially constructed E (3) . At four loops, Table 3 shows that we need to use the OPE data down to T 2 ln 1 T in order to fix all of the 32 parameters remaining after imposing the other conditions.
We do not present the numbers of parameters for five loops in Table 3 because redoing the whole construction for E (5) would be too time-consuming. In bootstrapping E (5) , we used the OPE data all the way down to T 2 ln 1 T , as well as the triple-final-entry conditions, in order to find a unique solution. The solution was validated on the T 2 ln 0 T data and selected T 4 ln k T data [62]. The fact that the numbers of independent coproducts at weights two, three, and four in Table 2 drop significantly when shifting from E (5) to E (5) provides additional circumstantial evidence that the five-loop solution is correct. We provide the symbols of the form factors E (L) through five loops and the remainder functions R (L) through four loops in the ancillary file cEandRsymbols.txt.
In ref. [9] it was observed that the symbol of the remainder function of the two-loop six-gluon amplitude, R (2) , can be written as the sum of two terms. The first term is precisely the symbol of the three-point form factor R (2) . The second term has a final entry which is the product of the three parity-odd hexagon letters, y u y v y w . This term actually vanishes on the surface u + v + w = 1, because y u y v y w → −1 there. Indeed, it is possible to show that the two functions coincide on this surface, R (2) 11 However, this agreement turns out to be a coincidence that is only true at two-loop order. That is, our three-loop remainder symbol and the corresponding one for the three-loop six-gluon amplitude [40] do not agree on the surface u + v + w = 1.

It is clear from Table 2 that there is a smaller space C ⊂ M which contains all of the coproducts of E. Because M grows rather rapidly with the weight, understanding C will be key to pushing this form factor bootstrap beyond five loops. At weight two, the missing function in C is the constant $\zeta_2$. It gets coupled to the other weight-two functions as follows:

but we do not have illuminating representations of the two more complicated three-orbits. We have not yet fully characterized C at higher weights, but we can construct a version of it based on the 21 weight-three functions listed in Table 2. We construct the weight-four functions by requiring that their {3, 1} coproducts lie in this 21-dimensional space, as well as imposing the ES-like, integrability, and branch-cut conditions. We find 51 symbol-level functions, plus ζ 4 . (It may be possible to discard three of these functions; the 47 functions in the coproducts of E (5) can be supplemented with 2 more functions from the pentaladder space P 4 in appendix A to obtain a space that appears to be complete. But it is more conservative to keep the three functions for now.) Continuing iteratively through weight eight, we find the dimensions listed in Table 4, along with those for M. The symbol-level dimensions are consistent with the generating function which also grows asymptotically like 3 w , although we emphasize again that the true C is likely to be considerably smaller.
We have redone the bootstrap construction in the function space C through four loops, in order to see how much it lessens the requirements for OPE data. Because we do not have a usable weight-eight basis for C yet, at four loops we worked at the level of the {7, 1} coproducts, and at symbol level. Due to this restriction, we impose dihedral symmetry and the final-entry condition simultaneously, and we report the numbers of surviving parameters at the level of the symbol. We see from Table 5 that through four loops there are considerably fewer parameters in early stages than in Table 3, although a similar amount of OPE information is required at the last stage. The smaller size of the space C should bode well for bootstrapping the three-point form factor beyond five loops.

Table 5 (rows recovered from the extraction): next-to-final entry 3, 12, 57; collinear limit 0, 1, 14.

It will also be important to explore the implications of cosmic Galois theory, or the coaction principle, for the function space C. This space shares with the space of hexagon functions H [63] the property of not including $\zeta_2$ or $\zeta_3$ as independent constant functions. At weight four, $\zeta_4$ is an independent constant function in both C and H. At weight five, neither $\zeta_5$ nor $\zeta_2 \zeta_3$ is an independent constant function in H. For C, we do not know yet whether $\zeta_5$ or $\zeta_2 \zeta_3$ are likewise locked to other functions. Determining this may require either the construction of the six-loop form factor E (6) , or else a deeper understanding of the function space C.
In addition to probing the coaction principle in bulk kinematics, we can investigate what constants appear at special points in this space. For example, in the case of H, there is a natural base point (u, v, w) = (1, 1, 1) where all of the weight-three functions vanish, so ζ 3 does not appear. This fact leads to strong coaction-principle constraints on the MZVs that can appear in higher-weight functions in H when evaluated at (1, 1, 1) [63]. Unfortunately, we have not yet found a point analogous to (1,1,1) in the (u, v) plane for C, where we can look for missing zeta values. There also may be a higher-loop constant normalization factor needed to maintain the coaction principle, analogous to the constant ρ in the case of hexagon  functions [63], but this issue is hard to assess without the analog of the point (1, 1, 1).

Numerical Results and Simplifications in Kinematic Limits
Let us now analyze the numerical and analytical behavior of the form factor remainder function found in the previous section. We first focus on the (u, u, 1 − 2u) and (1, v, −v) lines, and then consider various points on these lines where the form factor evaluates to MZVs or alternating sums.

Values on particular lines
Perhaps the simplest place to plot the remainder function is on one of the three equivalent lines bisecting the Euclidean triangle 0 < u, v, w < 1 in Fig. 1, which run from one corner to the mid-point of the opposite side. For example, one of these line segments is given by setting v = u. The remainder function is then forced to vanish as u → 0 (a corner) and also as u → 1/2, since w → 0 at that point (the midpoint of the opposite side). By symmetry, the maximum of the absolute value of the L-loop remainder function for 0 < u < 1/2 occurs at the dihedrally symmetric point u = 1/3, since u = v = w = 1/3 there. The numerical values at this point, and the ratios to the previous loop order, are given in Table 6. Most quantities in planar N = 4 sYM theory exhibit sign alternation from one loop order to the next, at least at high loop orders. The sign alternation is associated with a finite radius of convergence of the perturbative series, and a pole on the negative g 2 axis that is typically (e.g. for the cusp anomalous dimension [88]) at g 2 = −1/16. This behavior has been explored to seven loops in the six-point amplitude [43], and to four loops in the seven-point amplitude [47], mainly in the Euclidean region. Interestingly, Table 6 shows that for the three-point form factor remainder R there is no sign alternation from two to three loops in the Euclidean region. We will see this feature in other regions as well. From three to four loops the sign alternation starts up (also in other regions). Indeed, the successive loop-order ratio R (4) /R (3) = −19.934697 . . . is not that far from the asymptotic cusp value of −16. The ratio R (5) /R (4) given in Table 6 is extremely close to this asymptotic cusp value. (So close that it might be an accident.) In order to take into account the large dynamical range and lack of sign alternation, in Fig. 2 we plot the remainder function divided by its maximal value, given in Table 6. 
The L = 4 curve is hidden directly under the L = 5 curve, because they are almost exactly proportional in this region.
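The link between a singularity on the negative $g^2$ axis and the successive-loop ratios discussed above can be illustrated with a toy series. The function used below is not the remainder function: it is a hypothetical stand-in, $f(g^2) = \sqrt{1 + 16 g^2}$, chosen only because its nearest singularity sits at $g^2 = -1/16$. Its Taylor coefficients alternate in sign after the first step, and their ratios approach $-16$ from above, which is the kind of behavior the loop-order ratios are being compared against.

```python
from fractions import Fraction

def sqrt_series_coefficients(n_terms, a=16):
    """Taylor coefficients c_L of sqrt(1 + a*x) around x = 0, computed exactly.

    Uses the binomial recursion c_{L+1} = c_L * a * (1/2 - L) / (L + 1).
    """
    coeffs = [Fraction(1)]
    for L in range(n_terms - 1):
        coeffs.append(coeffs[-1] * a * (Fraction(1, 2) - L) / (L + 1))
    return coeffs

def successive_ratios(coeffs):
    """Ratios c_L / c_{L-1}, the toy analogue of R^(L) / R^(L-1)."""
    return [coeffs[L] / coeffs[L - 1] for L in range(1, len(coeffs))]

coeffs = sqrt_series_coefficients(60)
ratios = successive_ratios(coeffs)

# Print a few ratios to see the slow drift toward -16.
for L in (2, 5, 10, 59):
    print(L, float(ratios[L - 1]))
```

The exact rational arithmetic is a convenience for the toy model; for the actual remainder function the ratios come from the numerical values in Table 6.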
We can also study the behavior of the remainder function in scattering regions, where it becomes complex. To move from region I to region IIa, we let ln w → ln |w| − iπ in the vicinity of the line w = 0. We then use the coproduct formalism to integrate up from that line on the IIa branch. Similarly, to move from region I to region IIIa, we let ln u → ln |u| − iπ and ln v → ln |v| − iπ near the point u = v = 0, and then integrate up from that point.
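The replacement $\ln w \to \ln|w| - i\pi$ is what one obtains by evaluating the principal-branch logarithm at negative $w$ with an infinitesimal negative imaginary part. A minimal numerical illustration follows; the helper name `log_below_cut` and the size of the offset `eps` are arbitrary choices of ours, and nothing here is specific to the form factor.

```python
import cmath
import math

def log_below_cut(w, eps=1e-30):
    """Principal logarithm of w - i*eps, approaching the negative real axis from below."""
    return cmath.log(complex(w, -eps))

w = -0.25
value = log_below_cut(w)

# The real part reproduces ln|w|; the imaginary part is -pi.
print(value.real, math.log(abs(w)))
print(value.imag, -math.pi)
```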
In Fig. 3a, we plot the real part of the remainder function on the extension of the line (u, u, 1−2u) into the space-like scattering region IIa, for u > 1/2, divided by its value at u = 1. We see that the proportionality of the four- and five-loop results extends a bit into region IIa, but does not hold so strongly at larger values of u. In Fig. 3b, we plot the imaginary part of the remainder function on the same IIa line, also normalized by the value of its real part at u = 1.
The analogous plots for the extension of the line (u, u, 1−2u) into the time-like scattering region IIIa, with u < 0, are given in Fig. 4a and Fig. 4b. In this case, we normalize by the value of the real part at u = −1. In this region, the proportionality of the four- and five-loop remainder function holds to much larger values of |u| than in region II.
Finally, we plot the remainder function in scattering region IIa again, but now on the  line u = 1, in Fig. 5a and Fig. 5b. In this case the value at v = ∞ is real, and we normalize by that value.

Values at particular points
There are some distinguished points in the (u, v) plane where the form factor remainder function takes on interesting analytic values. Of course, R vanishes on the entire boundary of the Euclidean triangle, which includes the points (u, v, w) = (1, 0, 0), (0, 1, 0) and (0, 0, 1).
For the function space M as a whole, we expect only MZVs and logarithms of the two small variables at these points. The values at four and five loops can be written in terms of an alternating-sum f -basis using HyperlogProcedures [121], but the results are fairly lengthy. In Table 7 we present the numerical values of R (L) and of the ratio of successive loop orders R (L) /R (L−1) at this point.
When we take the limit v → ∞ on the u = 1 line, the remainder function becomes real, We provide the numerical values, and the successive loop order ratios, in Table 8. The approach to the asymptotic cusp ratio of −16 is slower out at infinity than it is closer to the origin, which is similar to what is observed for six-point amplitudes [43].

A second limit in which the remainder function becomes real and evaluates to MZVs is the limit where u → ±∞ on the line (u, u, 1 − 2u). The value is independent of the sign of u; that is, it is the same whether we approach this point from within region IIa or region IIIa. Through five loops, the limit is + 1680 ζ 3 ζ 7 + 240 ζ 2 ζ 5,3 + 192 ζ 7,3 . (5.10) The numerical values and the ratios to the previous loop order are shown in Table 9. The approach to the asymptotic cusp ratio is even slower here.

Table 9: The value of the L-loop remainder function at (u, v, w) = (u, u, 1 − 2u) with u → ∞, as well as the ratio to the previous loop order.

The general conclusion of our numerical investigation is a rough consistency with the expected large-order asymptotics. The two-loop remainder function is a bit smaller, and has the opposite sign, than one might have expected from the higher-loop results, and the more strictly sign-alternating behavior of six- and seven-point amplitude remainder functions [41,43,47,95]. Also, the form factor remainder function tends to a constant at infinity, which has interesting consequences for the high-energy or Regge limit, as we will mention in section 7.

Multi-Loop One-Mass Four-Point Integrals
As mentioned in section 2.2, the ES-like adjacency restriction (2.29), which is critical to defining the function space M, was first motivated by its appearance in the form factor (as opposed to the remainder function) computed in ref. [9]. We have also inspected several other three-point form factors, or amplitudes involving a Higgs boson and three gluons, that are available in the literature [6,9,102,122]. All obey the same ES-like restriction. The constraint (2.29) has no effect until weight three, because 1 − u cannot appear until the second entry, but it can be seen in the weight-three and -four functions in these references.
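The statement that the ES-like restriction first has an effect at weight three can be checked by counting symbol words. The sketch below assumes the six-letter alphabet $\{u, v, w, 1-u, 1-v, 1-w\}$, a first-entry condition allowing only $u$, $v$, $w$, and an adjacency rule forbidding $1-u_i$ next to $1-u_j$ for $i \neq j$, which is our reading of eq. (2.29). It counts words only: integrability and branch-cut conditions are not imposed, so these numbers are not the dimensions of M.

```python
from itertools import product

FIRST_ENTRIES = ("u", "v", "w")
ONE_MINUS = ("1-u", "1-v", "1-w")
ALPHABET = FIRST_ENTRIES + ONE_MINUS

def violates_es_like(word):
    """True if two distinct letters of the form 1-u_i sit next to each other."""
    return any(
        a in ONE_MINUS and b in ONE_MINUS and a != b
        for a, b in zip(word, word[1:])
    )

def count_words(weight):
    """Return (allowed, forbidden) counts among words obeying the first-entry condition."""
    allowed = forbidden = 0
    for first in FIRST_ENTRIES:
        for rest in product(ALPHABET, repeat=weight - 1):
            if violates_es_like((first,) + rest):
                forbidden += 1
            else:
                allowed += 1
    return allowed, forbidden

for weight in (1, 2, 3, 4):
    print(weight, count_words(weight))
```

At weights one and two no word is excluded, because a letter $1-u_i$ cannot sit in the first slot; at weight three the second and third slots can both hold such letters, and some words are removed.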
All of the quantities in question are linear combinations of a class of two-loop master integrals associated with four-point processes with one massive and three massless external legs, and all massless internal lines, for both planar and non-planar topologies. The master integrals for these topologies were first computed through weight four in refs. [87,123]. More recently they have been expanded further in the dimensional regularization parameter $\epsilon$, through weights five [124] and six [125]. Examining these two-loop master integrals through weight six [125], we find that they all lie within M. 12 There are 89 different master integrals. At weight four, 69 symbols and 79 functions of the 89 are linearly independent, while there are $3^4 = 81$ symbols and (from Table 4

Based on the fact that all the two-loop master integrals are in M, as well as the planar N = 4 sYM form factors through five loops, we expect that this space contains all master integrals for the same topologies, one massive and three massless external legs, and all massless internal lines, planar and non-planar, to higher loop orders. However, for reasons discussed in the "note added", we do not expect this statement to be true to arbitrary loop order. If our expectation is true, then any four-point amplitude with these kinematics, for example subleading-in-N corrections in N = 4 sYM, as well as QCD corrections to amplitudes for $gg \to Hg$ (in the heavy-top limit), $qg \to Zq$, and $e^+ e^- \to q\bar{q}g$, can be expressed to three loops, possibly further, in terms of the transcendental functions in M. All different weights up to 2L can be expected to appear at L loops in QCD, and the rational-function prefactors in front of these functions, especially the lower-weight ones, can be expected to become rather intricate. Nevertheless, this would be a remarkable property for a large class of high-order amplitudes in massless theories to exhibit.

Conclusions and outlook
In this paper, we have bootstrapped the three-point form factor of the chiral part of the stress-tensor supermultiplet in planar N = 4 sYM theory through five loops, extending the state of the art by three loop orders. To carry out this bootstrap, we have utilized new boundary data coming from the OPE in the near-collinear limit of these form factors [60][61][62]. The OPE also provides important cross-checks on our results. On the other hand, the information about subleading logarithms in T at order $T^2$ coming from our results also assisted the determination of the higher-order behavior of the form factor transition. The integrability-based bootstrap and the perturbative form factor bootstrap thus complement each other quite nicely.
These new form factor results provide a novel glimpse into the mathematical structure of gauge theory at high loop orders, complementing our rapidly-developing understanding of the mathematical structure of multi-loop amplitudes. In particular, we have observed novel extended-Steinmann-like constraints (2.29) that are obeyed by the finite part of the form factor E (L) . Similar to the extended Steinmann relations for amplitudes [63], we do not have a full proof of these relations based on physical principles. It would thus be interesting to elucidate the origin of these restrictions, perhaps using the types of methods employed in refs. [126][127][128][129][130][131].
We have also observed that the functions E (L) satisfy multiple-final-entry conditions. Namely, the double coproduct entries of E (L) satisfy relation (4.23), and its dihedral images, through at least five loops. It would be interesting to see if these relations could be derived using the $\bar{Q}$ equation [120,132]. Similarly, the triple coproduct entries of E (L) satisfy relations (4.25)-(4.29), and their dihedral images, through the same loop order. It is worth noting that the six-point amplitude also satisfies next-to-final-entry conditions [96]; however, when the extended Steinmann relations are used to build up the space of hexagon functions, these conditions are automatically satisfied [43]. As seen in Table 3 and Table 5, the multiple-final-entry constraints that we observe are not redundant with the other general assumptions we currently build into our ansatz.
In addition to exhibiting interesting mathematical structure, these form factors should be closely related to quantities of phenomenological interest. In particular, we expect the remainder function R (L) to match the maximally transcendental parts of the gg → Hg and H → ggg amplitude remainder functions in the heavy-top limit of QCD. Additionally, for the reasons described in section 6, we believe that the lower-transcendentality functions appearing in these QCD amplitudes at three loops will be contained in the form factor space M, although they will have more complicated rational prefactors. These form factors, and the function space M, are thus important for future studies of the Higgs boson, among other LHC processes.
An interesting avenue for future work is to study the high-energy or Regge behavior of four-point scattering involving a colorless operator. In the case of four-gluon scattering, where the remainder function vanishes, the BDS formula [72] is consistent with Regge behavior being controlled by the cusp and collinear anomalous dimensions [133,134]. In the present case, the remainder function does not vanish in the scattering region II (space-like operator), but as we saw in section 5.2, it goes to a real constant as v → ∞ for u = 1, or for u = v. This constancy follows from the final-entry condition, R u i + R 1−u i = 0. For u ≪ v, this region is the high-energy limit s 23 ≫ s 12 , q 2 . The logarithmic growth with energy again is captured by the infrared-divergent terms, as well as E (1) , which have been removed from the remainder function. However, there will also be an impact factor, which does not depend on s 23 , but does depend on the dimensionless ratio u = s 12 /q 2 . It can be extracted from the remainder function to high loop orders. Similar remarks apply to scattering region III (the time-like operator).
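The claim that constancy at large $v$ follows from the final-entry condition can be sketched as follows. We assume that $R^{x}$ denotes the coefficient of $d\ln x$ in $dR$, that $w = 1-u-v$, and that $R^{u}$ is regular enough on the line $u=1$ for the $du$ term to drop out. On that line $w=-v$, so the final-entry condition gives

```latex
\frac{\partial R}{\partial v}\bigg|_{u=1}
  = R^{v}\,\frac{\partial}{\partial v}\ln\frac{v}{1-v}
  + R^{w}\,\frac{\partial}{\partial v}\ln\frac{w}{1-w}\bigg|_{w=-v}
  = \frac{R^{v}}{v(1-v)} + \frac{R^{w}}{v(1+v)} .
```

Both terms are suppressed by $1/v^2$ at large $v$. Provided the coefficients $R^{v}$ and $R^{w}$ grow no faster than powers of $\ln v$ (an assumption we state rather than verify here), the derivative is integrable at infinity and $R$ approaches a constant.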
We have defined the form factor function space M to have the analytic properties expected of a generic Feynman integral that draws from the $S_3$ symbol alphabet, and to obey the extended-Steinmann-like relations (2.29). In order to carry out our bootstrap, we have iteratively constructed M through weight eight. While the space M involves fewer symbol letters than the hexagon function space H, we find that its dimension grows faster with the transcendental weight w, namely as $3^w$, compared to $\sim 1.8^w$ in the hexagon case. However, the clean growth of the function space with the weight and the simpler letters make M an ideal testing ground for constructing the function space in a closed form.
In fact, we have found that the function E is contained in the smaller subspace C ⊂ M. We expect this space to be crucial for going beyond five-loop order [135], but we do not yet have a bottom-up, or first-principles, construction of C.
It would also be interesting to bootstrap form factors with a larger number of points n. In particular, a non-trivial NMHV form factor is first encountered at n = 4. Higher-point form factors for tr(F 2 SD ) factorize onto n gluon scattering amplitudes for n ≥ 4. Since these amplitudes do not obey a maximal-transcendentality principle, we also do not expect such a principle to hold between N = 4 sYM form factors and QCD Higgs amplitudes, beyond n = 3.
Finally, it would be interesting to bootstrap form factors of different operators. Possible operators include those studied in refs. [33,102,[136][137][138][139][140][141][142][143][144][145][146]. In particular, in the case of the two-loop three-point form factor for tr(F 3 ) it has been shown that the next-to-leading order contribution to the Higgs amplitude remainder in the heavy-top limit of QCD shares its maximally transcendental part with its counterpart in N = 4 sYM theory [122,[147][148][149]. These form factors are liable to have just as interesting a structure at higher loops as the form factor we have studied in this work.
Note Added: The initial version of this paper included the conjecture that the transcendental functions appearing in four-point one-mass amplitudes would be contained in the M space to all loop orders. After it appeared, we were reminded by Erik Panzer that non-polylogarithmic periods are known to appear in massless φ 4 theory by eight loops [69,150,151]. Because these period integrals can be converted to propagator integrals by injecting external momentum, and since propagator integrals are contained within four-point one-mass integrals, the polylogarithmic conjecture we originally put forward must fail at higher loop orders (in generic massless quantum field theories).

A Pentaladder integrals
The pentaladder integrals, or pentabox ladder integrals, belong to both the form factor space M and the heptagon function space. As such, they provide a way of understanding the ES-like adjacency restrictions for some of the functions in M, in terms of adjacency restrictions which follow from extended Steinmann relations for heptagon functions, as we will discuss in this appendix.
The class of pentaladder integrals depicted in Fig. 6 was first considered in ref. [65]. With the inclusion of a numerator associated with the pentagon loop (which is depicted graphically by a dashed line), they are rendered infrared finite and dual conformally invariant (DCI). Starting from the one-loop pentagon integral, where $x_a$ is the dual point that satisfies the null-separation conditions $x_{a1}^2 = x_{a2}^2 = x_{a3}^2 = x_{a4}^2 = 0$, the L-loop pentaladder integral can be constructed iteratively by including L−1 further integrations of the form These integrals depend on the pair of cross-ratios which describe a two-dimensional subspace of seven-particle kinematics. An important aspect of this class of integrals is that adjacent loop orders are related to each other by a second-order differential equation [65]. Specifically, defining the function Using this relation, these integrals have been computed to high loop order and resummed in the coupling [66]. As a result, explicit polylogarithmic expressions for these integrals are available through eight loops.

Figure 6: The seven-point pentabox ladder integral, labeled by dual coordinates, in which the box ladder involves L − 1 loops. The dashed line represents a numerator factor that renders the integral DCI and infrared finite. This class of integrals maps into the space of functions relevant for the three-point form factor in the limit $x_6 \to \infty$.

These integrals also turn out to be relevant to the space of functions entering the three-point form factors in this paper. To see how, consider the limit in which the dual point $x_6$ is sent to infinity. In this limit, the seven-point cross ratios (A.3) simplify to ratios that only depend on three external momenta:
Thus, these variables map directly to the variables u and v defined in eq. (2.3), while all kinematic dependence on the box end of the ladders drops out.
Since the variables U and V each remain finite in the limit (A.6), the functional form of Ψ (L) does not change, and we can directly check whether they have the right properties to contribute to the form factor F MHV 3 . As it turns out, they only draw from the five-letter subset of the six-letter symbol alphabet (2.20). Also, they obey the same extended Steinmann constraints as the form factor space M, insofar as the two letters 1−u and 1−v never appear sequentially in their symbols. Since only the letters u and v can appear in the first entry in seven-particle kinematics, these functions also satisfy the first-entry condition relevant for form factors. Thus, we see that the functions Ψ (L) (u, v) have all of the right properties to appear in the space of functions M constructed in section 3. The differential equation (A.5) implies the single coproduct relations, and the additional double coproduct ones, where we have dropped the (L) superscript on Ψ (L) for clarity. Using these relations, the flip symmetry Ψ (L) (u, v) = Ψ (L) (v, u), and the vanishing of Ψ in the collinear limit w = 1 − u − v → 0, it is easy to locate the pentabox ladder integrals in M; we have done so through five loops.
The extended-Steinmann-like constraints we have observed for the form factor space M do not seem to have a direct physical explanation, based on causality, within that space alone. On the other hand, when Ψ (L) ∈ M is interpreted as a pentaladder integral in heptagon kinematics, the constraints follow from the extended Steinmann relations in planar N = 4 sYM theory [45,63], which have a causal interpretation. This connection can be seen most easily by building the full space of polylogarithms with the alphabet (A.7)-interpreted as heptagon letters via the definitions (A.3)-that satisfy the first-entry and extended Steinmann conditions. 13 This construction generates the P subspace of the form factor space M, which has dimension at symbol level. Notice that only two functions appear at weight one, consistent with the above comment that 1−u−v does not satisfy the first-entry condition in heptagon kinematics. Moreover, at weight two, only the functions survive after applying the heptagon Steinmann conditions. As has been previously observed in heptagon kinematics, imposing the extended Steinmann relations at each higher weight also implies cluster adjacency [64,153,154], even though the latter condition seems to imply more constraints.
In terms of the labelling of the heptagon cross ratios u i , and the heptagon g letters described in ref. [47], the five letters (A.7) become u = u 2 u 6 = g 1,2 g 1, 6 , v = u 4 u 7 = g 1,4 g 1,7 , To restate the previous connection in terms of these letters: the letters g 3,i and g 3,i+2 are never found adjacent to each other in the extended Steinmann heptagon function space. It would be very interesting to find a constructive description of the space P, perhaps as a prelude to finding one for M or C. We leave this to future work.

13 Since the extended Steinmann relations are most directly formulated in terms of non-DCI variables, their implications in the DCI heptagon alphabet (A.7) are not immediately clear; however, they can be imposed easily on an ansatz of DCI symbol letters. While a DCI formulation of cluster adjacency does exist, it only guarantees that a cluster-adjacent form of the symbol exists in terms of X -coordinates, not that every representation of the symbol in terms of X -coordinates will obey cluster adjacency [152].

B Near-collinear limit at orders $T^2$ and $T^4$
In this appendix, we describe further the structure of the near-collinear expansion of the remainder function discussed in section 4.1, providing explicit results at two and three loops. From the FFOPE side, the $T^2$ terms are completely determined, as are essentially all of the $T^4$ terms. As mentioned in section 4.1, the FFOPE results are generated as a high-order series expansion around S = 0. One can construct an ansatz for the closed-form dependence on S in terms of HPLs up to a certain weight with suitable rational-function prefactors. With enough terms in the series expansion, one can fix all of the coefficients in the ansatz.
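The ansatz-fitting step just described amounts to solving a linear system for the unknown coefficients and then checking the solution against further terms of the series. The toy sketch below illustrates only that logic. The three basis functions $H_1(x)$, $\mathrm{Li}_2(x)$ and $H_{1,1}(x)$, the "hidden" coefficients, and the absence of rational prefactors are all simplifying assumptions of ours; the actual ansatz involves higher-weight HPLs with rational-function prefactors.

```python
from fractions import Fraction

def series_H1(n_max):
    """Coefficients of x^1..x^n_max in H_1(x) = -ln(1-x)."""
    return [Fraction(1, n) for n in range(1, n_max + 1)]

def series_Li2(n_max):
    """Coefficients of x^1..x^n_max in Li_2(x)."""
    return [Fraction(1, n * n) for n in range(1, n_max + 1)]

def series_H11(n_max):
    """Coefficients of x^1..x^n_max in H_{1,1}(x) = ln^2(1-x)/2."""
    coeffs, harmonic = [], Fraction(0)
    for n in range(1, n_max + 1):
        coeffs.append(harmonic / n)      # harmonic equals H_{n-1} at this point
        harmonic += Fraction(1, n)
    return coeffs

def solve_exact(matrix, rhs):
    """Solve a square linear system over the rationals by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [entry / aug[col][col] for entry in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

basis = [series_H1, series_Li2, series_H11]
n_unknowns = len(basis)
n_checks = 2

# Hypothetical "data": the series of 5*H_1 + 3*Li_2 - 2*H_{1,1}, standing in for FFOPE output.
hidden = [Fraction(5), Fraction(3), Fraction(-2)]
columns = [f(n_unknowns + n_checks) for f in basis]
data = [sum(c * col[k] for c, col in zip(hidden, columns))
        for k in range(n_unknowns + n_checks)]

# Fix the coefficients from the first orders, then cross-check on the remaining ones.
matrix = [[col[k] for col in columns] for k in range(n_unknowns)]
fitted = solve_exact(matrix, data[:n_unknowns])
residuals = [data[k] - sum(c * col[k] for c, col in zip(fitted, columns))
             for k in range(n_unknowns, n_unknowns + n_checks)]
print(fitted, residuals)
```

Working over exact rationals avoids any ambiguity about whether a residual is zero; the extra orders kept in `residuals` play the role of the surplus series terms that validate the ansatz.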
Alternatively, it is straightforward to obtain the complete dependence on S by using the coproduct representation to compute the v (or T ) derivative of each function in M in terms of lower-weight functions. The integration in T is trivial to do, order-by-order in T , given the expansion (B.1). The HPLs are generated by the T 0 term in the expansion, where the derivative in u has to be integrated up (at least for the terms with no ln T ), but this is also straightforward. After constructing the near-collinear limits of all the functions, the results for E or E, and then for R, can be obtained, in principle to any power of T 2 .
Here we provide some terms at order T 2 and T 4 . It is convenient to take the argument of the harmonic polylogarithms to be x = −S 2 . At order T 2 , we can separate out the rational prefactors by writing   The remaining three-loop T 4 coefficients, and the four-and five-loop ones, can be found in the ancillary file T4terms.txt.