Virtual QCD corrections to gluon-initiated diphoton plus jet production at hadron colliders

We present an analytic computation of the gluon-initiated contribution to diphoton plus jet production at hadron colliders up to two loops in QCD. We reconstruct the analytic form of the finite remainders from numerical evaluations over finite fields including all colour contributions. Compact expressions are found using the pentagon function basis. We provide a fast and stable implementation for the colour- and helicity-summed interference between the one-loop and two-loop finite remainders in C++ as part of the NJet library.


Introduction
Precise theoretical predictions are in high demand for the current Large Hadron Collider (LHC) experiments which are aiming to look for tiny deviations from the Standard Model (SM). Due to the relatively large size of the strong coupling constant, next-to-next-toleading order (NNLO) corrections in quantum chromo-dynamics (QCD) are desirable for a wide variety of final state processes. In particular, a class of 2 → 3 scattering processes with many kinematic scales have presented a considerable challenge to the theoretical community and there has been a good deal of activity leading to new methods able of overcoming their algebraic and analytic complexity [1][2][3][4][5][6][7][8][9][10][11].
The production of a pair of high energy photons is an important experimental signature at hadron colliders and can be used for example to study the Higgs boson through its decay to photons. The SM backgrounds are dominated by QCD corrections and a precise description of the kinematics of these observables requires the theoretical predictions to include perturbative information from the production in association with additional jets. NNLO corrections to the process pp → γγ + j, which is initiated at LO by quark-antiquark and quark-gluon processes, have been considered a high priority for current and future experiments for several years [12][13][14], and were computed most recently [15]. The Bornlevel amplitude for gluon-initiated diphoton plus jet production contains a closed quark loop coupling to both photons. Consequently, this type of process starts to contribute to the cross section only from NNLO onwards. Owing to the large gluon luminosity, it yields a dominant contribution to the NNLO corrections and dominates their scale uncertainty [15]. To improve upon this uncertainty requires the NLO corrections to the closed quark-loop contributions, which amount to the two-loop virtual amplitudes for gg → γγg that we derive in this article. Curiously, the gluon channel has the opposite structure to the conventional expansion in the number of colour charges, N c . The dominant, leading colour, contributions to the quark-initiated process contain only planar diagrams, while in the gluon-initiated case the leading-colour limit contains both planar and non-planar graphs at two loops. Graphs with the highest complexity are thus unavoidable.
The last few years have seen rapid progress in our ability to compute two-loop 2 → 3 scattering processes in QCD which had been intractable for a long time. The analytic computation of the scattering amplitudes in a form suitable for phenomenological applications requires a number of major technical bottlenecks to be overcome. A basis of special functions must be identified that can be evaluated efficiently over the full phase space. For massless five-point scattering, such a basis has been identified [16][17][18][19][20] and became recently available as a fast and stable implementation in C++ valid in the physical scattering region [21]. Secondly, the amplitude must be reduced from tensor Feynman integrals onto a basis of master integrals that can subsequently be expanded in terms of special functions. Currently, the only viable approach to this task is through the solution of enormous systems of integrationby-parts (IBP) identities [22][23][24] of which many public implementations now exist [25][26][27][28][29][30]. There has been success in simplifying this problem using syzygy relations [7,[31][32][33][34], module intersection [35,36], intersection theory [37][38][39][40], η expansion [41][42][43][44][45], direct solution of IBPs through recursive relations [46], multivariate partial fractioning [36], and by-passing complicated algebraic steps through finite field arithmetic [47][48][49][50][51]. The latter method can be applied more broadly [48,50], in particular to a complete reduction of the amplitudes into a representation using special functions. In this article, we approach the problem through a direct analytic reconstruction of the amplitudes at the level of the pentagon functions performing all intermediate steps numerically over finite fields. This technique has been applied successfully to leading-colour (planar) five-parton amplitudes first numerically [52][53][54][55] and then analytically [56][57][58][59][60]. Leading-colour three-photon production has also been completed and cross checked by two independent groups both at the level of the amplitudes [61,62] and of differential cross sections [63,64]. Very recently, NNLO QCD predictions for a number of three-jet observables and differential three-to-two jet ratios have been computed at leading colour as well [65]. The process gg → gγγ contains the most complicated non-planar topologies with up to rank five tensor numerators even at leading colour.
Diphoton production has been known at NNLO for some time [66,67] and the two-loop scattering amplitudes were among the first complete 2 → 2 process to be calculated [68,69]. The first results for the amplitudes for pp → γγj appeared in the last few months both for the amplitudes [70,71] and NNLO differential cross section [15] in the leading-colour approximation. Very recently, the full-colour two-loop QCD corrections for the quarkinitiated channels to pp → γγj were presented [72].
We obtain sufficiently compact analytic expressions for the complete set of helicity amplitudes for which the ultraviolet (UV) and infrared (IR) poles have been subtracted, and implement them into an efficient and stable C++ code as part of the NJet library [73]. These expressions take the form of rational coefficients multiplied by pentagon functions. The code provides colour-and helicity-summed expressions for the two-loop amplitudes interfered with the one-loop amplitudes, which can be used directly in phenomenological applications.
Our paper is organised as follows. We first introduce the notation and describe the colour decomposition of the amplitudes. We then describe the methodology used to perform the integration-by-parts reduction and reconstruction of the finite remainders over finite fields. In particular we describe a method for performing a univariate partial fractioning of the rational coefficients of the special functions on the fly. This approach can be used inside the finite field workflow, reducing significantly the number of sample points required to complete the analytic reconstruction and yielding compact analytic expressions. In particular, we show explicitly some remarkably simple analytic forms we obtained for the all-plus helicity amplitude, which highlight its conformal properties. Finally, we present the implementation in the NJet library [73] and the performance of the code using a realistic set of phase-space points before concluding with a few remarks on future applications of the results and methods. We also include an appendix with some details of the momentum twistor formalism used to provide a rational parametrisation of the kinematics.

Kinematics and amplitude conventions
We consider the production of a pair of photons in association with a gluon from gluon fusion, up to two-loop order in QCD. All particles are massless, p 2 i = 0, and we take all momenta as outgoing, so that The square of tr 5 can be expressed in terms of the scalar invariants through the Gram determinant of the external momenta, which is a degree-4 polynomial in the s ij . The pseudo-scalar invariant tr 5 therefore introduces an algebraic dependence on the kinematics, since tr 5 = ± √ ∆. We emphasise that the sign of tr 5 changes under parity conjugation, which acts by flipping the sign of the spatial momentum components, and under odd-signature permutations of the external momenta. We work in the s 12 physical scattering region, which is delimited by the requirements that all s-channel invariants are positive and all t-channel invariants are negative, together with the negativity of the Gram determinant, ∆ < 0, which follows from the real-valuedness of the momenta [18]. The scattering of gluons and photons is a one-loop process at leading order. We decompose the scattering amplitude as where n = i(4π/µ 2 R ) e − γ E with µ R being the renormalisation scale. In Eq. (2.8), g s and g e are the strong and electromagnetic coupling constants, α s = g 2 s /(4π), N q and Q q are the number of quarks of type q and their electric charge in units of the electron charge, and a i is the adjoint SU(N c ) colour index of the i th gluon. The one-loop amplitude can be obtained from permutations of pure gluon scattering [74,75].
We further expand the loop amplitudes in powers of N c and n f (the number of light flavour fermions), 3 (1 g , 2 g , 3 g , 4 γ , 5 γ ) . (2.9) Surprisingly, the subleading-colour two-loop amplitudes contain only planar integrals, while the leading colour contains all of the four independent families shown in Figure 1. This pattern is the opposite to that of the quark-initiated channels computed in Refs. [70][71][72], for which the leading-colour contributions involve only the planar integrals and are therefore simpler to compute. Providing a prediction for the gluon-initiated channel necessarily requires handling the most complicated integral families. A simple analysis of the colour factors of each of the three-gluon vertex diagrams shown in Figure 2 illustrates how this pattern arises. Photons couple to any of the fermion propagators, and the colour factors remain the same. It can then be seen that non-planar contributions can come from the  Figure 2: The colour factor of each diagram in the gg → gγγ follows from the representative three-gluon, two-loop diagrams with a closed fermion loop shown here.
diagrams (a)-(c) only. Diagrams (d)-(e), which contribute to the subleading colour, remain planar (allowing for permutations of the external momenta). In our setup, we reduce directly to the finite remainder where the UV and IR poles have been subtracted analytically. The poles take a particularly simple form since there is no tree-level process and the one-loop amplitudes are finite in . The one-and two-loop finite remainders are given in terms of the bare amplitudes by [76][77][78][79][80], where β 0 = 11N c /3 − 2n f /3 and with n Γ ( ) = e γ E /Γ(1 − ) and γ g = β 0 /2 in the 't Hooft-Veltman scheme. The logarithms arising from the -expansion of I (1) can be analytically continued to the s 12 channel by adding a small positive imaginary part to each s ij . The β 0 term in the definition of the two-loop finite remainder accounts for the strong coupling renormalisation. The finite remainders inherit from the amplitudes the decomposition in powers of N c and n f given by Eq. (2.9), 3 (1 g , 2 g , 3 g , 4 γ , 5 γ ) . (2.12) Our final results are presented in the 't Hooft-Veltman scheme, although we make the distinction between the dimension d of the loop integration and the dimension d s = g µ µ arising from the numerator algebra. Amplitudes with d s = 2 have a much simpler algebraic structure and contain information that can then be used to reduce the complexity of the more difficult d s − 2 component (see e.g. Section 4.1). The one-loop finite remainder has only the d s = 2 term, and we expand the two-loop finite remainder around d s = 2 as (2.13) The 't Hooft-Veltman scheme is obtained by setting d s = d = 4 − 2 .

Computational setup and amplitude reduction
We take a diagrammatic approach to the calculation of the amplitude along the lines of previous work [81,82]. Here we briefly summarise the steps and refer the reader to Ref. [82] for details. All Feynman diagrams are generated using QGRAF [83] and subsequently processed using a combination of in-house Mathematica and FORM [84,85] scripts. In total, including contributions from ghost diagrams, we find 50 diagrams at one loop and 1527 at two loops. Aided by the Spinney [86] package to perform the 't Hooft algebra, the numerators are written for each independent helicity configuration. From the loop denominator structure we assign an integral topology to each diagram. At this point, the diagram numerators are linear combinations of monomials in loop-momenta dependent scalar and spinor products with coefficients depending only on external momenta. These coefficients are loaded into a dataflow graph using FiniteFlow [50]. This enables numerical sampling over finite fields, thus sidestepping analytically complicated intermediate expressions in further steps. We rewrite loop-momenta dependent monomials into inverse propagator denominators and a choice of irreducible scalar products (ISPs). The required mapping of the coefficients is performed numerically within the dataflow framework. After summing all diagrams and dropping scaleless integrals, we arrive at an expression ready for integration-by-parts (IBP) reduction. The reduction to master integrals has been obtained using an improved version of the Laporta algorithm [24]. For most integral families we generated identities containing no higher power of propagators with respect to those appearing in the amplitude, following ideas proposed in [7,31,33]. These identities have been found using the Baikov representation of loop integrals, for which identities (i) without higher powers of propagators and (ii) without dimension-shifted integrals can be found by solving polynomial equations called syzygy equations. Closed form solutions to both these constraints are separately known. Indeed, the solution of (i) is almost trivial and the solution for (ii) has been found in Ref. [34]. The two syzygy solutions need to be combined for generating identities that satisfy both constraints. For this purpose we used a custom syzygy solver that implements the algorithm in Ref. [32] using FiniteFlow [50]. More details on this method can be found in Refs. [7,31,33,34].
For each integral family, we generated integral identities only for one permutation of the external legs. Numerical solutions for all the permutations contributing to an amplitude have been found by solving the systems of equations several times, with different numerical inputs for the invariants. Mappings between master integrals with different permutations of external legs are applied afterwards to obtain a result in terms of a minimal set of them.
As an additional improvement, for each phase-space point evaluated on a finite field, we reconstruct the full dependence on the dimensional regulator of the amplitude reduced to master integrals before substituting their expressions in terms of special functions and computing the Laurent expansion in . With this setup, fewer numerical solutions of the integration-by-parts identities are needed in order to reconstruct analytic results for the amplitude. This is due to the fact that the expansion of the integrals into pentagon functions, before performing the Laurent expansion in of the final coefficients, complicates the dependence on of the result in this intermediate stage.
To make use of the finite field arithmetic we must have a rational parametrisation of the external kinematics. We parameterise the kinematics using momentum twistors [8,87] where, We stress that the pseudo-scalar invariant tr 5 , and hence the square root of the Gram determinant ∆, is a rational function of the x i variables. Moreover, since x 1 is the only dimensionful variable, we can set it to 1 and recover the dependence on it after the reconstruction by dimensional analysis. Further details on the momentum twistor parameterisation are presented in Appendix A. In the following sections, we will consider all coefficients of the special functions to be rational functions of the variables x i .

Analytic reconstruction over finite fields
In this section, we present three general strategies to optimise the reconstruction over finite fields of the rational coefficients in the finite remainders. At this stage, each component F (x) of the two-loop finite remainder is expressed as where r i are rational functions of the variables x which parameterise the momentum twistors, and mon i (f ) are linearly independent monomials of the pentagon functions. The entire chain of operations is implemented over finite fields in the framework FiniteFlow.
We therefore have a numerical algorithm which evaluates the rational coefficients r i (x) modulo some prime number. The final step consists in reconstructing the analytic expression of the rational coefficients from a sufficient number of numerical evaluations. We employ FiniteFlow's multi-variate functional reconstruction algorithms, supplemented with three strategies to reduce the number of required sample points: we determine the linear relations among the rational coefficients and an ansatz, use univariate slices to identify the factors belonging to another ansatz, and perform a univariate partial fraction decomposition on the fly. In the following subsections we discuss thoroughly each of these procedures, and their application to two-loop diphoton finite remainders.

Linear relations among the rational coefficients
The representation of the finite remainders in terms of rational coefficients and special function monomials given by Eq. (4.1) is in a sense not optimal. The special function monomials in fact do not all appear independently. They are present only in a number of independent combinations that is typically much smaller than the total number of monomials. As a result, the rational coefficients r i in the finite remainders are not linearly independent. Expressing the finite remainders in terms of a set of linearly independent rational coefficients not only leads to more compact expressions, but may also simplify their reconstruction. We can determine the linear relations among the rational coefficients {r i (x)} of the special function monomials by solving a linear fit problem, Since the coefficients of the linear relations a i are rational numbers, they require substantially fewer sample points to be reconstructed with respect to the rational coefficients themselves. We can then use these relations to express the rational coefficients in terms of a set of linearly independent ones, which remain to be reconstructed. Choosing the latter to be the simplest -i.e. the ones with the lowest polynomial degrees -may reduce the number of sample points required for the reconstruction. This strategy can be further refined by supplying an ansatz for the rational coefficients. We then fit the linear relations among the rational coefficients of the finite remainders and the coefficients of the ansatz, which we denote by In the best case scenario, all the rational coefficients r i can be expressed in terms of the ansatz coefficients e j and no further reconstruction needs to be performed. Even when the ansatz does not entirely cover the rational coefficients, it may still lower the degrees of the linearly independent coefficients which have to be reconstructed. The ansatz can be constructed from the tree-level amplitude and the rational coefficients of the one-loop amplitudes up to order 2 from the analysis of the leading singularities [88][89][90][91] or from other related amplitudes. In the diphoton case, we can use the two-loop five-gluon amplitudes. At one loop, the 3g2γ amplitudes can be expressed in terms of permutations of the five-gluon ones [74,75]. While this is no longer true at two loops, we find there is an important overlap between the rational coefficients of the 3g2γ amplitudes and those of the five-gluon ones. We use as ansatz in the linear relations the rational coefficients of the leading-colour two-loop five-gluon amplitudes (all two-loop five-parton amplitudes are available analytically at leading colour [56-58, 60, 92-94]; we made use of independent results, which are being prepared for publication).

Matching factors on univariate slices
The pole structure of the pentagon functions is determined by the letters of the pentagon alphabet [16]. The pentagon functions (or their discontinuities) may in fact have logarithmic singularities in the phase-space points where one of the letters vanishes. For this reason, it is natural to expect that the poles of the rational coefficients should be similarly linked to the pentagon alphabet. Indeed, we observe that the denominators of the rational coefficients in front of the pentagon functions factorise into a product of letters of the pentagon alphabet.
In other words, each rational coefficient r(x) has the form where e k are integers, n(x) is a polynomial in the variables x, and { k } is an ansatz of factors from the pentagon alphabet. The exponents e k in Eq. (4.4) may in general be negative, corresponding to factors in the numerator. We use the following ansatz for the factors, (4.5) The exponents e k in the ansatz (4.4) can be determined by reconstructing r(x) on a univariate slice modulo some prime number [58]. The univariate slice is defined by parameterising the variables by a single parameter t, for constant a i and b i . The latter are chosen randomly in the finite field to avoid artificial simplifications. The dependence on t is chosen to be linear so that the degrees of the numerator and denominator of r(t) := r (x(t)) correspond to the total degrees of r in x. Matching the reconstructed r(t) with the ansatz (4.4) evaluated on the same slice allows to determine the exponents e k straightforwardly. With a univariate reconstruction on just one prime field we can thus infer a lot of information about the analytic form of the rational coefficients: the denominators are entirely fixed, and typically some factors of the numerators are determined as well. What remains to be reconstructed therefore requires fewer sample points.

Univariate partial fraction decomposition over finite fields
Partial fraction decomposition is a standard and powerful tool for the simplification of rational functions. The decomposition in partial fractions is however not unique in the multivariate case. Its application to the multivariate rational functions in scattering amplitudes is therefore not straightforward. The necessity to simplify the rational coefficients of two-loop five-particle scattering amplitudes has recently spurred several approaches to handle the multivariate case efficiently [36,59,95], based upon Leinartas' algorithm [96,97]. These algorithms rely on algebraic geometry techniques, such as multivariate polynomial division and Gröbner bases, and require the arbitrary choice of a monomial ordering.
Our main goal in this work is actually to simplify the reconstruction of the rational coefficients over finite fields. In other words, we want to reconstruct the rational coefficients on the fly, directly in a form which is decomposed in partial fractions. The simplification of the ensuing analytic expressions comes as a welcome by-product. We observe that a univariate partial fraction decomposition is sufficient for this purpose. The advantage is that it can be straightforwardly implemented over finite fields, avoiding all algebraic geometry complications. The only arbitrary choice that remains to be done is which variable to partial fraction with respect to. The latter can be chosen by observing the impact of the partial fraction decomposition with respect to each variable separately on the lower order amplitudes. With the parameterisation of the kinematics in terms of momentum twistors, Eq. (A.3), we find it most convenient to partial fraction with respect to x 4 .
We now discuss our algorithm to reconstruct the univariate partial fraction decomposition of a multivariate rational function r from its numerical evaluations over finite fields. The algorithm requires as input an ansatz for the factors which may appear in the denominator of r. Only those factors which depend on the variable with respect to which the partial fraction decomposition is being performed are strictly necessary. Guessing other factors may further simplify the reconstruction. In the application to massless two-loop five-particle scattering amplitudes, the factor ansatz can be inferred from the letters of the pentagon alphabet [16]. We use the factors in Eq. (4.5).
Let r be a rational function of the variables x = {x i } n i=1 . In this work the x i 's are the momentum twistor variables defined by Eq. (3.1), so n = 5, but we outline the algorithm in general. The goal is to decompose r in partial fractions with respect to one of the variables, say x k . To simplify the notation, we denote the latter by y = x k , and the remaining variables byx = {x i } n i=1 \x k . We may not know the analytic expression of r, but we must be able to evaluate it numerically modulo some prime number through some algorithm. Let { i (x, y)} m i=1 be an ansatz for the factors which may appear in the denominator of r. Without loss of generality, we assume that the i 's are irreducible polynomials over Q. In other words, we assume that r has the form where e i ∈ Z, and N (x, y) is a function which depends polynomially on y and rationally onx. The ansatz { i (x, y)} m i=1 may catch some of the factors in the numerator of r (x, y), corresponding to negative values of the exponents e i in Eq. (4.7). This lowers the total degrees of N (x, y) and eventually simplifies its reconstruction, but is not necessary for the partial fraction decomposition with respect to y. Similarly, the ansatz may cover all the factors in the denominator of r, so that N (x, y) is a polynomial inx and y. What is necessary for the partial fraction algorithm to work is that the ansatz contains all the factors in the denominator of r which depend on y. We denote this subset by where deg y [h] is the degree in y of the polynomial h. The first step consists of fixing the exponents e i in the ansatz (4.7). We do this through the procedure discussed in Section 4.2. In the second step we determine the degree in y of the numerator N (x, y) in the ansatz (4.7). We recall that N (x, y) is by construction polynomial in y. We compute its degree in y by reconstructing it on another univariate slice, this time where only y varies, with a i chosen randomly in the finite field. Clearly, the degree in t of N (t) := N (x =ā, y = t) gives the degree in y of N (x, y). We introduce the short-hand notation for the degrees of N (x, y) and of the denominator factors i (x, y) in y.
Using the information about the factors in the denominator of r and the degree in y of its numerator, we construct the following ansatz for the partial fraction decomposition of r with respect to y: where U ijk (x), R (x) and V h (x) are unknown rational functions ofx. The right-most term in Eq. (4.11) is required only if d N > d Λy , i.e. only if the numerator of r has a higher degree in y than the denominator. The last step of the algorithm consists of reconstructing the analytic dependence onx of the unknown coefficients in the ansatz (4.11) from the numerical evaluations of r (x, y). To solve this linear fit problem, we use the algorithm implemented in the FiniteFlow framework [50]. The solution comes in the form of an algorithm which numerically evaluates U ijk (x), R (x) and V h (x). The rational reconstruction may be simplified by first reconstructing the coefficients on a univariate slice where all the remaining variablesx vary, and using that to match them with those factors in the ansatz { i (x, y)} m i=1 which depend only onx. This may lower the total degrees of the functions that need to be reconstructed.
In addition to the factors in the original ansatz { i (x, y)} m i=1 , the coefficients of the partial fraction decomposition (4.11) may also contain spurious factors. Consider for instance the toy example where a and b are arbitrary constants such that a = b. In this example, the inspection of the left-hand side indicates {y − a, y − b} as ansatz for the irreducible denominator factors.
The partial fraction decomposition however contains a factor of a − b in the denominator, which arises from the residue of the function at the zero of either of the denominator factors. Clearly a = b is a spurious singularity, manifestly absent on the left-hand side and produced by the partial fraction decomposition. In general, we can determine the potential spurious factors by evaluating the factors in the ansatz i (x, y) which depend on y at their zeros, where y * k is the zero of k (x, y), and Λ 1 y is the subset of factors which depend linearly on y, The restriction to zeros of linear functions of y is due to the facts that the i 's are irreducible polynomials over Q and that we are factoring over Q. The zeros of higher-degree irreducible polynomials would introduce algebraic and/or complex dependence. In practice, we observe that determining the spurious factors does not simplify the reconstruction. The greatest part of the denominators of the coefficients in the partial fraction decomposition (4.11) is in fact determined by the original ansatz { i (x, y)} m i=1 . What remains after they are multiplied away has a total degree which is typically lower than that of the numerators, which therefore dominates the determination of the number of sample points required for the reconstruction. While it is possible to determine entirely the denominators of the coefficients in Eq. (4.11), it would not reduce the number of required sample points substantially, and for this reason we refrain from doing it.
Having determined as many factors as possible in the coefficients of the partial fraction decomposition, we multiply them away and reconstruct the remainder using the multivariate rational reconstruction algorithms implemented in FiniteFlow. It is important to stress that the algorithm which evaluates the coefficients of the partial fraction decomposition contains the solution of a linear fit. For each numerical value ofx, Eq. (4.11) is sampled for several numerical values of y, roughly as many times as the number of unknowns. This generates a linear system of equations for the unknowns evaluated at the chosen value of x. The redundant equations are removed after the learning phase. The reconstruction on the univariate slices in the intermediate steps of the algorithm, because it requires several evaluations of the original functions, obviously has a higher computational cost with respect to directly evaluating r. On the other hand, the coefficients of the partial fraction decomposition depend on one fewer variable than the original function r, and may have substantially lower degrees. As a result of all these aspects, the partial fraction decomposition may be outperformed by a direct reconstruction for simple functions, but becomes more and more convenient as the complexity of the functions increases.

Summary and impact of the reconstruction strategy
The techniques discussed in the previous sections are general and can be applied to any rational reconstruction problem, in combination or separately. In order to reconstruct the  Table 1: Maximal numerator/denominator polynomial degrees of the rational coefficients of the most complicated finite remainders at each stage of our reconstruction strategy. The column "original" refers to the rational coefficients prior to any optimisation. The asterisk * highlights that, after the partial fraction decomposition in stage 3, the coefficients to be reconstructed depend on one fewer variable.
rational coefficients of the two-loop diphoton finite remainders we apply them consecutively as follows.
Stage 1. We fit the linear relations among the rational coefficients with an ansatz, as discussed in Section 4.1. We begin with the (d s − 2) 1 components and use the coefficients of the two-loop leading-colour five-gluon finite remainders as ansatz. For the (d s − 2) 0 components, which are more complicated, we add to the ansatz the (d s − 2) 1 -coefficients already reconstructed.

Stage 2.
We guess the factors from the ansatz (4.5) by reconstructing a univariate slice and multiply them away, as explained in Section 4.2.
Stage 3. We partial fraction on the fly with respect to x 4 , applying the algorithm presented in Section 4.3. The coefficients to be reconstructed after this stage are those in the ansatz for the partial fraction decomposition (4.11), and depend on one fewer variable.
Stage 4. We reconstruct another univariate slice and perform an additional factor guessing, as in the second stage.
The drop in the complexity of the rational coefficients after each stage for the most complicated two-loop diphoton finite remainders, which are in the Maximally-Helicity-Violating (MHV) configurations, is illustrated in Table 1. As proxy for the complexity of the coefficients we use the maximal numerator/denominator polynomial degrees, which can be evaluated by reconstructing univariate slices as discussed in Section 4.3.
Interestingly, we observe that the coefficients of the subleading-colour 3g2γ two-loop finite remainders F instead are not entirely fixed by the five-gluon ones, but using the latter as ansatz in the linear relations reduces significantly the maximal polynomial degrees of the coefficients which remain to be reconstructed.
As can be appreciated in Table 1, our strategy leads to a substantial drop in the polynomial degrees. Furthermore, the coefficients to be reconstructed after the partial fraction decomposition (stage 3) depend on one fewer variable. This makes the decrease in the number of sample points required for the reconstruction even more pronounced. The price to pay for this is that performing the partial fraction decomposition increases the evaluation time per point, as discussed at the end of Section 4.3. With our setup we observe that, for the most complicated finite remainders, the evaluation times grows roughly by one order of magnitude, while the number of sample points required for the reconstruction decreases by two orders of magnitude. This leads to an overall gain of roughly one order of magnitude in the reconstruction time. We stress that the evaluation time relevant here is that of the algorithm which evaluates the rational coefficients over finite fields, not the final evaluation time of the finite remainders. Once the reconstruction is completed, in fact, the rational coefficients are evaluated from their analytic expressions. For the evaluation time of the finite remainders, we refer to Section 6.
Our approach therefore leads to an important simplification in the reconstruction of the rational coefficients. Moreover, the ensuing analytic expressions are dramatically more compact. This makes them suitable for compilation in a C++ library, an essential step for future phenomenological applications.

Compact analytic expressions for the all-plus configuration
Prior to discussing the numerical implementation of all two-loop helicity amplitudes, we would like to comment on the all-plus amplitude, which displays a particularly simple analytic form. We find that the structures appearing are closely related to those appearing in the five-gluon all-plus amplitudes at one [98][99][100][101] and two loops [56,[92][93][94]. We present the finite remainders in the expansion around d s = 2.
The all-plus amplitude is finite and rational at one loop. The finite remainder can be written as Remarkably, this amplitude is invariant under conformal transformations, and the expression given here exhibits this property in a manifest way [101]. If all masses are neglected, the SM Lagrangian is conformally invariant. This symmetry is obscured at loop level by the appearance of scales associated with the divergences and it is therefore rather surprising to observe it in a one-loop amplitude. One might naïvely suppose that this is a consequence of the finiteness of the all-plus one-loop amplitudes. Yet, the single-minus one-loop amplitudes are equally finite, but they are not conformally invariant. This phenomenon still calls for an explanation. These properties are discussed in detail in Ref. [101], where the authors prove that the n-gluon amplitudes in QCD are conformally invariant at one loop. Since the diphoton amplitudes can be expressed as permutations of pure gluon scattering [74,75] and the conformal generators commute with permutations, all considerations regarding conformal symmetry trivially extend to the diphoton case. At two-loop order, the d s = 2 contribution is the only one involving transcendental functions. Its expression is remarkably simple, is the finite part of the one-loop box with an off-shell leg. The analytic continuation of the box functions to any scattering region can be easily achieved by adding a small positive imaginary part to each two-particle Mandelstam invariant, s ij → s ij + i0 + . The other partial amplitudes at two loops are rational, 3;0 (1 + , 2 + , 3 + , 4 + γ , 5 + γ ) = where tr 5 (p i , p j , p k , p l ) = tr(γ 5/ p i / p j / p k / p l ). The peculiar simplicity of this amplitude at two loops follows from the fact that it vanishes at tree level and it is rational in four dimensions at one loop. The one-loop amplitude can in fact be used as an effective on-shell vertex in four-dimensional unitarity [92,102,103]. In this way, the cuts of the two-loop amplitude become one-loop cuts with an insertion of the effective vertex. The one-and two-loop allplus finite remainders are thus treated as tree-level and one-loop objects, respectively. As a result, the special functions appearing in the finite remainder at two loops can have at most transcendental weight two (up to order 0 ). Moreover, the rational coefficients of the transcendental functions can be shown through four-dimensional unitarity to be given by (permutations of) the one-loop all-plus finite remainder. They thus inherit the symmetry under conformal transformations from the one-loop amplitude. These beautiful properties are manifest in our explicit expressions (5.2) and (5.1). Complementing four-dimensional unitarity with recursion relations for the rational terms allows to compute the two-loop all-plus finite remainders in the purely gluonic case avoiding altogether the computation of the two-loop integrals [92,94]. Some results are available even for amplitudes involving more than five plus-helicity gluons [104][105][106][107][108][109].
Amplitudes with a single minus helicity share some of the simplicity of the all-plus case. They also vanish at tree level, and are finite and rational at one loop. As a result, they also have maximum transcendental weight two at two loops. Differently from the all-plus amplitudes, however, they do not have the structure that F (2) 1;0 has uniform transcendental weight two with all other contributions being rational. For the amplitudes with two negative helicities, instead, the finite remainders have maximum weight two and four at one and two loops, respectively.

Implementation and performance
The finite remainders are coded up into the NJet C++ library, which is linked to the PentagonFunctions++ library [21] for the evaluation of the special functions. The six independent helicity amplitudes (shown in Table 2) are permuted analytically onto the global basis of pentagon functions defined in the 12 → 345 scattering region to provide a complete list of 16 "mostly-plus" helicity amplitudes required for the sum. This task is performed using the permuted coefficients from the six fully reconstructed amplitudes as an ansatz into the linear relations so additional reconstruction time is avoided (see Section 4.1). Having identified a global basis of pentagon functions for the complete colour and helicity sum, we formulate the partial amplitudes as where h is the helicity configuration, f h j is a list of integers corresponding to the global list of pentagon function monomials, M h ij are sparse matrices of rational numbers, and c h i are the independent rational coefficients written in terms of independent polynomials in the momentum twistor variables x i . The pentagon function monomials are split into parity-odd and -even components, which allows the remaining 16 "mostly-minus" helicities to be computed by simply flipping the parity of the special functions and applying complex conjugation to the coefficients. The colour-and helicity-summed matrix element is constructed numerically from these ingredients. The sparse matrix multiplication is implemented using the Eigen library [110]. Evaluation with 128-bit and 256-bit floating-point numbers (f128 and f256) is provided via the QD library [111]. The code is available through https://bitbucket.org/njet/njet, where we provide additional installation instructions and example programs demonstrating the usage.
The C++ code returns the values of the one-and two-loop hard functions, H (1) and H (2) , obtained by squaring Eq. (2.8), substituting the decomposition in N c and n f from Eq. (2.9), subtracting the IR and UV poles, and finally summing over colour and helicity, 3 .

(6.2)
The sum over colours for each helicity can also be returned if required. We find the evaluation time is dominated by the special functions, particularly when higher precision is required. In order to ensure fast and stable numerical evaluation, we adopt the following evaluation strategy.
1. The user-provided phase-space point is checked for the precision of the on-shell constraints. Points are adjusted in case the precision is not acceptable for the requested number of digits: 64-bit floating-point numbers (f64) ∼ 15; f128 ∼ 31; f256 ∼ 62.
2. The colour-and helicity-summed amplitude is computed using f64 precision at two points which differ only by overall dimension scaling factor. After accounting for the overall dimension of the squared amplitude, the two evaluations should only differ due to rounding errors at intermediate stages in the evaluation of the coefficients. This accuracy scaling test has been used extensively at one loop. We refer to this accuracy as f64/f64 since both coefficients and special functions use f64 precision.
3. If the estimated number of correct digits from the scaling tests falls below a userdefined threshold, the coefficients only are recomputed using f128 precision after the original point is corrected to f128 precision (as in step 1). We refer to this as f128/f64 precision.
4. The scaling test is performed again and if it fails the special functions are re-evaluated in f128 precision. This is f128/f128 precision.
5. These steps can be repeated to obtain up to f256/f256 precision. In practice these steps are rather expensive and unnecessary for standard phenomenological applications, so they are omitted from our strategy.
While the dimension scaling test has been used successfully at one loop, we need to be more careful in our applications when linking the PentagonFunctions++ library, which also makes use of the dimension re-scaling internally. To validate the reliability of the scaling test as an estimate of the error of the result, we evaluate both with a direct f128/f128 computation and via a scaling test with an error cutoff of four digits at f64/f64 for a set of 60 000 points. To ensure a realistic validation, we use "physical" points with a phase-space sampling density determined by the one-loop process, obtained from NNLOJET. We compare the estimated error provided by the f64/f64 scaling test to the relative difference between the f64/f64 and f128/f128 evaluations, which is taken as the true error. In the following, percentages are always with respect to the entire set of points.
The scaling test returns a negative for 2.8 % of the points. According to true error, an additional 0.2 % of the points should be failed and are missed by the scaling test (false positive). Of these points, almost all have true error of four digits, the remaining 0.008 % with three digits, so the effect on stability is small. The scaling test also fails some points unnecessarily (false negative), this subset comprising 0.7 % of all points, which incurs a small performance penalty in the evaluation strategy. The effects of the false estimates are considered to be allowably small.
We note that the dimension scaling test is statistical and therefore one will always find anomalies in a sufficiently large sample. Care should be taken when integrating over extreme regions of phase space.
To assess the stability of our implementation ( Figure 3) and measure timings, we evaluate the amplitude squared over 100 000 points of the physical phase space. We find a helicity N c F   We present a benchmark evaluation at a point taken from the physical phase space. We choose a generic configuration where the momentum invariants (GeV 2 ) and tr 5 (GeV 4 ) take the values, quoted to four significant figures,  Figure 3: Histogram of the error estimate on the two-loop evaluations as given by the scaling test. We use the evaluation strategy with a target accuracy of three digits and show errors for all precision levels. We see 1.8 % of points failing f64/f64 evaluation, with 1.2 % passing at f128/f64 and 0.6 % passing at f128/f128. The evaluation strategy achieves target accuracy for all of the 100 000 physical phase-space points tested.

Conclusions
In this paper we have presented a complete, full colour, five-point amplitude at two loops in QCD. All helicity configurations have been implemented into the NJet C++ library, which provides an efficient and stable evaluation over the physical scattering region. Though the algebraic complexity of the amplitude is considerable, the direct analytic reconstruction of the finite remainders was possible by making use of linear relation amongst the coefficients and partial fractioning in one variable, which could be done without any analytic knowledge of the intermediate steps in the reduction. We expect these techniques will have applications to other important high-multiplicity two-loop calculations with more external scales such as five-particle scattering with an off-shell leg, for which there has also been recent progress [17,81,[112][113][114][115][116]. We have found a form that is suitable for phenomenological applications and look forward to new precision predictions for diphoton production at hadron colliders including the dominant N 3 LO corrections we have computed here.

A Momentum twistor parametrisation
Following [8,52,87], the construction begins with where λ i is the negative-helicity spinor, and µ i is related to the positive-helicity spinorλ i viaλ with the indices defined modulo 5. Using the Poincaré and U (1) symmetries it is possible to fix all but 5 of the entries of the momentum twistor matrix Z = (Z i ) i=1,...,5 . Explicitly we choose the form, The parameterisation used in this work has some benefits: the only dimensionful quantity is x 1 and all holomorphic quantities are described using only x 1 , x 2 , x 3 . For real kinematics only x 2 and x 3 are complex. Notice that the conversion between the momentum twistor coordinates and spinor-helicity expressions is only invertible for phase-free quantities. For this purpose we may use the following relations, In our work we express the helicity amplitudes in terms of the momentum twistors variables x i . The phase information can be restored by multiplying and dividing by a suitable phase factor, where A is an helicity amplitude -or in general some object with a non-trivial phaseand Φ is an arbitrary factor with the same helicity weights as A. The quantities in the parentheses in Eq. (A.5) are both written in terms of momentum twistors. Their ratio is phase-free and can thus be expressed in terms of the scalar and pseudo-scalar invariants s ij 's and tr 5 , e.g. through Eqs. (A.4). The factor outside the parenthesis is instead written in terms of the spinor helicity variables and carries all the phase information of A.