Scattering amplitudes over finite fields and multivariate functional reconstruction

Several problems in computer algebra can be efficiently solved by reducing them to calculations over finite fields. In this paper, we describe an algorithm for the reconstruction of multivariate polynomials and rational functions from their evaluation over finite fields. Calculations over finite fields can in turn be efficiently performed using machine-size integers in statically-typed languages. We then discuss the application of the algorithm to several techniques related to the computation of scattering amplitudes, such as the four- and six-dimensional spinor-helicity formalism, tree-level recursion relations, and multi-loop integrand reduction via generalized unitarity. The method has good efficiency and scales well with the number of variables and the complexity of the problem. As an example combining these techniques, we present the calculation of full analytic expressions for the two-loop five-point on-shell integrands of the maximal cuts of the planar penta-box and the non-planar double-pentagon topologies in Yang-Mills theory, for a complete set of independent helicity configurations.


Introduction
Scattering amplitudes in Quantum Field Theory (QFT) are an essential ingredient for understanding the interactions among fundamental particles that we observe in nature. The development of techniques and algorithms for their calculation is therefore of crucial importance for comparing observations with theoretical predictions. In particular, the high accuracy expected from experimental data which is being collected at the Large Hadron Collider (LHC), as well as the high centre-of-mass energy of the interactions produced by this machine, require accurate predictions for processes with high-multiplicity final states. This has motivated, in recent years, many studies of the structure of scattering amplitudes in QFT, especially in gauge theories, which often led to the development of new efficient methods for their evaluation.
Despite the remarkable recent progress in the calculation of high-multiplicity tree-level and one-loop amplitudes, and the numerical automation of the latter in several codes and frameworks [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19], at two and higher loops these calculations are still essentially restricted to 2 → 2 processes. This is due to the significant increase in complexity of the two-loop problem with respect to the one-loop case. More in detail, the most common strategy for the calculation of loop amplitudes is to rewrite them as a linear combination of loop integrals. While the coefficients of this linear combination are rational functions of kinematic invariants, the loop integrals can instead be computed in terms of special functions. While for lowmultiplicity processes the most demanding task is arguably the calculation of the integrals, for high-multiplicity processes the computation of the coefficients can often have comparable or higher complexity (notice e.g. that a complete planar basis of massless five-point two-loop integrals is known [20,21], but five-point QCD amplitudes are still unavailable for a generic helicity configuration). While one-loop amplitudes are often computed numerically, two-loop amplitudes are more often computed analytically.
There are several reasons to prefer an analytic approach to a numerical one. Analytic expressions often yield a faster and more stable numerical evaluation than purely numerical algorithms. Moreover, analytic results allow to perform various kinds of studies and manipulations, such as the analysis of the behaviour of amplitudes in interesting kinematic limits (e.g. infrared and high-energy limits). Analytic calculations also allow to have better control over the results and possibly infer general properties which might be useful for the development of new analytic or numerical algorithms.
A well known bottleneck of analytic calculations in high-energy physics is however the large size of intermediate expressions, which can often be orders of magnitude larger than the final results. This is to be expected, since physical results often enjoy properties which are not shared by each intermediate step of the calculation. Moreover, intermediate steps are often described by a larger number of variables (such as the loop components) which do not appear in the final result. The problem can be mitigated by the use of computer algebra systems such as Form [22], which specializes in handling large expressions, or by using techniques such as generalized unitarity [23,24], where intermediate steps of the calculation are gauge invariant and hence the complexity of their expressions is reduced with respect to a diagrammatic approach.
In this paper we assess the possibility of side-stepping the issue of large intermediate expressions, by reconstructing analytic results from their numerical evaluation, where each intermediate step is trivially a number or a set of numbers (this will be better defined in the next paragraph). A polynomial or a rational function can be reconstructed, with very high probability, from its numerical evaluation at several values of its arguments. In particular we focus on the functional reconstruction of multivariate polynomials and rational functions with applications to calculations in high-energy physics.
The first question to address is which kind of numerical evaluation is suited for a functional reconstruction. An obvious choice would be a floating-point calculation, but these are affected by numerical inaccuracies which would add an additional layer of complexity to the functional reconstruction algorithms. Exact calculations might instead be performed over the field of rational numbers. However, numerical calculations with rational numbers are affected by a similar problem to the one of analytic calculations. Indeed, while a large intermediate expression would be translated into a rational number, in general the number of digits of the numerator and the denominator of this number would be very high. This requires extensive use of computationally expensive arbitrary-precision arithmetic, which can significantly slow down the calculation. A common and successful approach in computer algebra is the use of finite fields, which have a finite number of elements and can be represented by machine-size integers, offering the possibility of performing fast but exact calculations in statically-typed languages such as C and C++. Their main drawback is that, switching from the rational field to a finite field, some information is lost and must be recovered by repeating a functional reconstruction over several finite fields. This strategy is however much more efficient than a calculation over the rational field.
The usage of finite fields in computer algebra in actually quite common. Many computer algebra systems use finite fields under the hood for solving problems such as polynomial factorization and Greatest Common Divisor (GCD). The application of finite fields in highenergy physics has been introduced in ref. [25], in the context of Integration-By-Parts (IBPs). However, to the best of our knowledge, an application of finite-field techniques to a realistic problem in high-energy physics involving the reconstruction of multivariate rational functions is not present in the literature and is presented here for the first time. In particular, the main missing ingredient which we illustrate in this paper is the application of a functional reconstruction algorithm capable of handling relatively complex results which depend on many variables, such as those appearing in typical high-multiplicity multi-loop calculations.
The paper is roughly divided in two parts. In the first part we describe dense 1 functional reconstruction algorithms for univariate and multivariate polynomials and rational functions. These algorithms reconstruct polynomials and rational functions from their repeated numerical evaluation over finite fields (although in principle they can actually be used over any field) and they are independent of the specific algorithm used for their evaluation. In particular, for univariate polynomials we use the well known Newton's polynomial representation. For univariate rational functions, we use Thiele's interpolation formula. 2 For multivariate polynomials, we use a recursive version of Newton's formula. For multivariate rational functions we were not able to find a dense reconstruction algorithm suited for our needs in the literature. However we found that the technique proposed in ref. [26] for sparse rational functions can be easily adapted to the dense case, by combining it with the other techniques we mentioned for univariate rational functions and multivariate polynomials. The resulting algorithm is capable of efficiently reconstructing functions with many non-vanishing terms and depending on several variables, as we will show in the examples. Using these methods, the analytic calculation of any polynomial or rational function can be turned into the problem of providing an efficient numerical evaluation of the same function over finite fields.
The second part of the paper concerns the application of the mentioned reconstruction algorithms to techniques relevant for the calculation of scattering amplitudes. It should be stressed that any algorithm which can be implemented via a sequence a rational elementary operations (addition, subtraction, multiplication and division) is suited for the usage of these reconstruction techniques, which therefore have a very broad spectrum of applications. In particular, widely used methods such as tensor reduction and IBPs obviously fall into this category. In this paper we however focus on multi-loop integrand reduction via generalized unitarity [23,24,[27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46], since the algorithm is suited for high-multiplicity processes and, as 1 A dense reconstruction algorithm for polynomials and rational functions, unlike a sparse reconstruction algorithm, seeks to be efficient in the general case where the result has many non-vanishing terms, rather than in the special case where it only has a small number of non-vanishing terms compared with its total degree. 2 The formula is named after the mathematician Thorvald Nicolai Thiele (1838-1910).
stated, writing scattering amplitudes as linear combinations of integrals (which may be further processed by IBPs at a later stage) is currently one of the main bottlenecks of high-multiplicity multi-loop calculations. These techniques have indeed been used in recent five-and six-point calculations of two-loop amplitudes in non-supersymmetric Yang-Mills theory [37,47,48]. In order to provide the building blocks needed by generalized unitarity, we also discuss in some detail a finite-field implementation of the spinor-helicity formalism in four [49,50] and six dimensions [51][52][53], as well as the calculation of tree-level amplitudes over finite fields via Berends-Giele recursion [50]. In particular, the six-dimensional spinor-helicity formalism is used to provide a higher-dimensional embedding of loop momenta and spinors, which will thus have an explicit finite-field numerical representation.
As explicit examples combining all these techniques, we present the calculation of full analytic expressions for the two-loop five-point on-shell integrands of the maximal cuts of the planar penta-box and the non-planar double-pentagon topologies in Yang-Mills theory, for a complete set of independent helicity configurations. In particular, in the all-plus case we find agreement with the results available in the literature, while for the other helicity configurations the result is new and we make it publicly available for comparisons with future calculations.
All the algorithms discussed in the paper have been implemented in a C++ library which can produce the mentioned analytic results from finite-field evaluations using 64-bit integers, without relying on any external computer algebra system. The paper is structured as follows. In sect. 2 we set the notation and review some notions about finite fields which we will use in the rest of the paper. While all these notions are well known, they are rarely used in high-energy physics and hence they are reviewed in some detail for the convenience of the reader. In sect. 3 we describe the functional reconstruction algorithms mentioned above. In sect. 4 we discuss the implementation of spinor-helicity and tree-level techniques over finite fields. These are mostly meant as a stepping stone for the discussion of integrand reduction and generalized unitarity, which is illustrated in sect. 5, where we also provide the two-loop five-point examples mentioned above. In sect. 6 we give some details about our proof-of-concept C++ implementation of the algorithms illustrated in this paper, which might be useful for other implementations. In sect. 7 we finally draw our conclusions and briefly discuss further possible applications of these techniques. In Appendix A we recall some well known theorems and algorithms involving modular arithmetic and finite fields, highlighting the role they play in the reconstruction algorithms illustrated in this paper. More details about our usage of the six-dimensional spinor-helicity formalism over finite fields are given in Appendix B. In Appendix C we discuss an efficient method for generating two-loop unitarity cuts from Berends-Giele currents, which is a generalization of the one-loop algorithm used by the public code NJet [18], and whose two-loop extension is not present elsewhere in the literature.

Basic concepts, definitions and notation
In this section we set the notation and review some well known concepts about finite fields which will be used later in the paper.

Finite fields
Finite fields are fields containing a finite number of elements. For the purposes of this paper, we will only consider fields of integers modulo p, denoted by Z p , where p is a prime number.
In particular, we identify Z n , with n a positive integer (not necessarily prime), with the set of non-negative integers 3 smaller than n, (2.1) Elementary arithmetic operations in Z n , such as addition, subtraction and multiplication, are defined using modular arithmetic, namely by performing the corresponding operation in Z and taking the remainder of the integer division of the result modulo n. Given an element a ∈ Z n with a = 0, if a and n are co-prime we can define the inverse a −1 of a in Z n with respect to multiplication, i.e. an element b ∈ Z n such that One can indeed show that such a number b exists (and is unique in Z n ) if and only if a and n are co-prime. If n = p is a prime number, the existence of an inverse is therefore guaranteed for every non-vanishing element of Z p . This implies that Z p is a field and any rational operation on its elements is well defined. The multiplicative inverse can be computed using the extended Euclidean algorithm, as explained in Appendix A.1. The existence of a multiplicative inverse implies that we can define a map between rational numbers and elements of a finite field Z p . In particular, given a rational number q = a/b ∈ Q, we define This map is obviously not invertible (since Q is infinite and Z p is finite), however it turns out one can reconstruct q, with very high probability, from its images in several finite fields Z p i where {p i } is a set of prime numbers, as explained in sect. 2.3 and Appendix A. This will enable us to reconstruct rational functions with rational coefficients from their values over finite fields. It is worth observing that we can similarly map q in Z n , with n not prime, as long as n and the denominator b are co-prime.

Polynomials and rational functions
In this paper we use a multi-index notation. Given a sequence of n variables z = (z 1 , . . . , z n ), and the n-dimensional multi-index α = (α 1 , . . . , α n ) with integers α i ≥ 0, a monomial z α is defined as The total degree of a monomial is denoted by |a|, With F a generic field, we use the following (standard) definitions: is the ring of polynomials in the variables z with coefficients in F. Any polynomial function f ∈ F[z] can be uniquely identified by a set of multi-indexes {α} and coefficients • F(z) is the field of rational functions in the variables z with coefficients in F. Functions f ∈ F(z) can be expressed as a ratio of two polynomials p, q ∈ F[z] as where n α , d β ∈ F, while {α} and {β} are sets of multi-indexes. Unlike the polynomial representation given in Eq. (2.6), the representation of a rational function is not unique, even if we assume p and q to have no common polynomial factors. However, after GCD simplification, the only possible ambiguity is an overall constant normalization of the numerator and the denominator. In order to have a unique representation, useful e.g. when comparing functions obtained in different finite fields, we use the convention of defining the coefficient of the lowest degree term of the denominator (with respect to the chosen monomial order) to be equal to one.
The functional reconstruction methods described in this paper are based on multiple evaluations of the function f to be reconstructed, which in turn correspond to assigning to each variable a value in the field F. For the univariate case we denote these values by y i ∈ F while multivariate we denote them by y i ∈ F n , where i is a label distinguishing different evaluation points.
In this paper the field F will either be the rational field Q or a finite field Z p . More in detail, our goal is the calculation of polynomial functions in Q[z] and rational functions in Q(z). We will do so by performing a functional interpolation of the same functions in Z p [z] or Z p (z) respectively, for several primes p if needed, and then use these to reconstruct the results over the rational field.

Rational reconstruction from finite fields
In the next sections we will describe an algorithm for efficiently reconstructing polynomials and rational functions over finite fields. As apparent from their representation in Eq. (2.6) and (2.7), these functions can be identified by a sequence of monomials and their respective coefficients. The final step of the reconstruction algorithm therefore consists in promoting these coefficients from elements of a finite field Z p into a proper rational number. As we have seen in sect. (2.1), one can map a rational number q into an element of the set Z n of integers modulo n, as long as n and the (reduced) denominator of q are co-prime. Although this map is not invertible, it turns out one can use a variation of the extended Euclidean algorithm [54] (see also Appendix A.1) in order to make a guess for q from its image q mod n ∈ Z n . This method is known as rational reconstruction. The guess will, in general, be correct when the numerator and the denominator of q are much smaller than n (heuristically one finds that the threshold is around √ n). However, because of our requirement of working with machine-size integers (see Sect. 6 for more details), we cannot always choose the prime p defining the field Z p to be significantly larger than the numerator and the denominator of any rational number which can be expected to appear in the results.
The solution to the problem comes from the Chinese remainder theorem, which allows to uniquely reconstruct an element in Z n , with n = n 1 · · · n k and the n i pairwise co-prime, from its images in Z n i for i = 1, . . . k, as explained in Appendix A.2. This implies that, given a set of prime numbers p i , we can perform the functional reconstruction over Z p i and then combine the results to find the image of the rational coefficients in Z p 1 ···p k . By performing the rational reconstruction over Z p 1 ···p k , the result will thus be correct when the product of the selected primes is large enough. 4 This allows us to use machine-size integers for a fast functional reconstruction in Z p i , while the use of multi-precision arithmetic (computationally much more expensive) is restricted to this rational reconstruction step in Z p 1 ···p k , which takes a very small fraction of computing time compared to the one spent for the functional reconstruction over prime fields.
More in detail, given a sequence of primes p 1 , . . . , p k , we adopt the following algorithm for the full reconstruction of a rational function in f ∈ Q(z) (a completely analogous one can obviously be used for a polynomial): 1. Reconstruct the function f in Z p 1 (z) and store the result.
2. Use the rational reconstruction algorithm to promote the stored result to a (new) guess g ∈ Q(z).
3. Consider a new prime p i+1 , where i is the number of primes which have already been used so far. Evaluate the guess g over the new field Z p i+1 for several values of z. If the result for g(z) agrees with the evaluation of the function f (z), accept the guess g as the correct answer, i.e. assert g(z) = f (z), and successfully terminate the algorithm. Otherwise proceed to the next point. 4. Reconstruct the function in Z p i+1 (z) and combine it with the stored result in Z p 1 ···p i (z) using the Chinese remainder theorem in order to obtain the correct result in Z p 1 ···p i+1 (z). The latter thus replaces the previously stored result. Repeat from point 2.
The algorithm terminates when the comparison in point 3 is successful. For the examples presented in this paper, we typically only need to perform the functional reconstruction over one or two prime fields. We observe that the techniques reviewed in this section (which, as stated, are well known) allow reconstructing a multivariate rational function with very high probability. In practice, exceptional cases are very artificial and irrelevant for realistic applications. It should also be noted that the final result can be extensively checked against the evaluation of the function f on even more values of the prime p and the variables z.
In the rest of this paper, we will discuss a functional reconstruction algorithm which is suited for complex theoretical calculations in high-energy physics, and its application to techniques related to the computation of tree-level and multi-loop scattering amplitudes in QFT.

Functional reconstruction
In this section we describe a dense functional reconstruction algorithm for polynomials and rational functions whose performance scales well with the complexity of the result. For the sake of generality we make no further assumption about the functions to be reconstructed. Unless explicitly stated otherwise, the techniques illustrated in this section can be applied to functions over any field, although in practice we are mostly interested to their application over finite fields.

The black-box interpolation problem
Given a function f of n variables z = (z 1 , . . . , z n ) over a field F, the so-called black-box interpolation problem is the problem of reconstructing the function f from the results of its evaluation for several values of z. In other words, one can think about the function as a numerical procedure of the form while the functional reconstruction method has no knowledge about the algorithm used for the calculation of f . In this paper we are interested in reconstructing polynomials and rational functions over the rational field. Our setup is a modified version of the black-box interpolation problem, where the function is evaluated modulo a prime number p, and can schematically be represented as We recall that the rational reconstruction technique reviewed in sect. 2.3 reduces the problem of a functional reconstruction over Q to the problem of a functional reconstruction over prime fields Z p for generic p. This makes the setups in Eq. (3.1) and Eq. (3.2) effectively equivalent for our purposes. Hence, in the remainder of this paper, we will focus on functional reconstruction techniques over finite fields Z p , where p is an arbitrary prime. As stated, these are actually valid in any field F and the use of finite fields is meant to provide a fast but exact numerical evaluation of the black-box function f to be reconstructed. Therefore, in this section we will simply denote the generic field by F. The advantage of turning an analytic calculation into a black-box interpolation is that it reduces the problem of computing a function f into the one of providing a fast numerical evaluation for it. Since the reconstruction is independent of the algorithm used for the evaluation of f , it has a very broad spectrum of applications. Numerical calculations can in turn avoid issues such as large intermediate expressions, which affect many computations in high-energy physics. With this approach, the number of evaluations needed for the reconstruction of a function scales linearly with the number of terms of the result itself and is independent of the complexity of intermediate expressions which may appear using fully analytic techniques.
We remind the reader that, because we are dealing with polynomials and rational functions, which can be represented as in Eq. (2.6) and Eq. (2.7) respectively, the goal of a functional reconstruction algorithm is to identify the monomials appearing in their definition and the corresponding coefficients as elements of the field F.
Functional reconstruction algorithms roughly fall into two categories: dense and sparse algorithms. As suggested by their name, sparse reconstruction algorithms attempt to be more efficient (with respect to number of function evaluations needed) in the case where the number of non-vanishing terms is small compared to the one expected from their total degree. In this paper we focus, as mentioned, on dense reconstruction algorithms, seeking good efficiency in the most general case where many non-vanishing terms are present, rather than being optimal in the special cases where the function to be reconstructed is relatively simple.
In order to better motivate the discussion which follows, it is useful to consider first a straightforward system-solving strategy for the functional reconstruction, and point out why it is not suitable for our purposes. Given a set of values y i ∈ F n , using Eq. (2.6) and Eq. (2.7), one can build systems of equations of the form to be solved for the coefficients {c α } and {n α , d β } for polynomials and rational functions respectively, using basic techniques such as Gauss elimination. Notice, however, that one cannot know a priori which monomials will appear in the result. Moreover, in sect. 2.2 we defined the coefficients of our canonical representation of rational functions such that the coefficient of the lowest degree term in the denominator is equal to one, in order to solve the ambiguity in their representation. Unfortunately we have no knowledge about the mentioned term before having performed the reconstruction. This is however a minor issue which will be solved as discussed in sect. 3.5 using techniques proposed in ref. [26]. The same techniques will allow us to assess the total degree of the numerator and the denominator of a multivariate rational function, or the one of a polynomial, using a relatively small number of evaluations. Hence, a viable solution to the functional reconstruction problem consists in listing the full set of N monomial terms compatible with the total degree of the function involved, sampling the function with (at least) N values for the set of variables z, and solving the resulting N × N system of equations for the coefficients. While this method is straightforward and efficient for simple functions depending on only one or two variables, it has however a bad scaling behaviour when increasing the number of variables or the total degree of the result. This can be understood simply by recalling that solving an N × N dense system of linear equations is an O(N 3 ) operation, and the multivariate problems in which we are interested in can have several thousands (or even hundreds of thousands) of potentially non-vanishing terms (notice e.g. that the most general polynomial in n variables and total degree R has N = R+n R terms). The reconstruction algorithms we are going to describe have instead a much better scaling with the complexity of the result, and the time spent for the functional reconstruction itself is typically much smaller than the time required for evaluating the function to be reconstructed. In the following, we start by describing well established algorithms for the univariate case and later use them as ingredients for the multivariate one.

Univariate polynomials
For univariate polynomial functions f = f (z) we adopt a well known reconstruction method based on Newton's polynomial representation. Given a sequence of distinct values y 0 , . . . , y R ∈ F, a univariate polynomial f ∈ F[z] of total degree R can be written as [55] The coefficients a r can be computed recursively by evaluating the function f at the values y i as An important feature of the method is that each coefficient a r only depends on the evaluation of f at the points y i with i ≤ r. This implies that new evaluation points cannot change the value of the previously computed coefficients. This is ideal for the case where the total rank R of the polynomial f is not known a priori, and it is our main reason for preferring this method over alternatives. In practice, we apply the algorithm recursively until we find a set of consecutive coefficients a r which evaluate to zero. This is the termination criterion of the algorithm (notice that, even when the canonical form of the polynomials has several vanishing entries, in general the entries of its Newton representation will be non-vanishing, hence the described termination criterion is robust and an incorrect termination is extremely unlikely). The sequence y i is generated dynamically, taking into account that the algorithm evaluating of the function f might fail at a particular point. More in detail, we choose y 0 as an arbitrary element of the field F and then we recursively define the following ones as y i = y i−1 + 1, as long as the evaluation of f is successful. If the evaluation of f fails, we try replacing the current point y i with y i + 1, until we find a value for which the evaluation of f is possible. If too many consecutive evaluations fail, we terminate the algorithm declaring the reconstruction unsuccessful.
Even though Newton's representation is more practical for functional reconstruction purposes, after a succesful reconstruction it is convenient to convert it back into the canonical representation given by Eq. (2.6), which in the univariate case can be written as It is worth observing that the conversion from the representation in Eq. (3.5) to the one in Eq. (3.7) only requires the following two operations • addition of univariate polynomials, • multiplication of a univariate polynomial by a linear univariate polynomial.
Both operations are simple enough to be efficiently implemented over finite fields Z p in statically-typed languages, without resorting to external computer algebra systems. More details on our implementation are given in sect. 6.

Univariate rational functions
For univariate rational functions f ∈ F(z) it is worth distinguishing two cases. The first case is applicable when the degree of the numerator and the denominator (R and R respectively) are known and the constant term of the denominator is known to be non-vanishing. As we shall see, this is useful for the multivariate reconstruction discussed in sect. 3.5. In this particular case, the function admits a representation of the form with d 0 = 1. We find the system-solving method explained at the end of section 3.1 is well suited for this univariate case, since the rank of the numerator and the denominator are unlikely to be high enough to make Gauss elimination impractically expensive, at least for the kind of problems we are interested in this paper (as we stated, the same is not true for the multivariate case, as the number of variables or the degree of the numerator and the denominator increase). Having set d 0 = 1, the system of equations reads and can be solved for the unknown coefficients {n r , d r } by evaluating the function f at least R + R + 1 times. In practice we include a few more evaluation points y i , making the system slightly over-constrained as a cross check.
We now address the more general case where the degree of the numerator and the denominator are not known. We find convenient to use a method based on Thiele's interpolation formula [55], which expresses a rational function f ∈ F(z) as a continued fraction where y 0 , . . . , y N is a sequence of distinct elements of F. Thiele's interpolation formula can be regarded as the analog of Newton's formula for rational functions. The second line of Eq. (3.10), by comparison with the second line of Eq. (3.5), makes the analogy manifest.
The coefficients a i can be recursively computed by evaluating the function f at the values y i , Similarly to Newton's interpolation formula, each value a j only depends on the evaluations at z = y i with i ≤ j, which makes the algorithm convenient in the case where the total number of terms is not known, since new evaluation points do not change previously computed terms.
Our strategy for generating the sequence of points y i is also the same as the one used in the polynomial case. It is worth pointing out that Thiele's interpolation formula, being a continued fraction, contains spurious singularities which might make the application of equations (3.11) impossible. Similarly to the case where the evaluation of f fails, if such a spurious singularity is encountered at a point y i , we simply discard the value and replace it with y i + 1. The termination criterion is the agreement between a new evaluation f (y i ) with the evaluation of the rational function defined by the coefficients a 0 , . . . a i−1 already computed. We make the algorithm more robust by requiring agreement between the reconstructed function and several new evaluations of f . After a successful interpolation, the result is converted into the canonical form given by Eq. (3.8), except that the condition d 0 = 1 is replaced by d min r = 1, where z min r is the lowest degree monomial with a non-vanishing coefficient in the denominator. We observe that Thiele's formula with N + 1 terms represents a rational function with degree R and R for the numerator the denominator respectively, where R = R = N/2 if N is even, and R = R + 1 = (N + 1)/2 if N is odd. In other words, either the degree of the numerator and the denominator of the reconstructed function are equal, or they differ by one unity at most. This implies that the highest degree coefficients n r or d r , obtained by converting the result into its canonical form, might be vanishing, in which case they are discarded. For this reason, if the degrees R and R are already known, the system-solving strategy typically requires fewer evaluations of the function, since this way one can avoid reconstructing zeros. Thiele's interpolation is however preferred when R and R are not known.
The conversion into a canonical form can be implemented by performing on the polynomial numerator and the denominator of the function the same kind of operations we listed for converting Newton's polynomial representation into the canonical form (and inversion, which is simply implemented by swapping numerator and denominator). Hence, as we stated for the previous case, this does not require the usage of computer algebra systems and can be easily implemented in statically-typed languages using functions over finite fields Z p .
We stress that the result we obtain can be shown to be minimal with respect to the degrees of the numerator and the denominator, and hence no GCD simplification is needed after the reconstruction is converted into a canonical form (and possible high-degree zero terms are discarded).

Multivariate polynomials
Given a sequence of variables z = (z 1 , . . . , z n ) and a multivariate polynomial function f ∈ F[z], the interpolation of f can be performed by recursive application of the univariate Newton's reconstruction method described above. In other words, we consider a generic multivariate polynomial f ∈ F[z 1 , . . . , z n ] as a univariate polynomial in the first variable z 1 , whose coefficients are polynomials in the remaining variables (z 2 , . . . , z n ). Newton's interpolation formula (3.12) can thus be generalized to the multivariate case, by upgrading the coefficients a i from elements of the field F to elements of the polynomial ring F[z 2 , . . . , z n ], where, as before, the y i are distinct elements of F. The solutions for the coefficients a r in Eq. (3.6) also apply to the multivariate case, with the following substitutions In this case, the right hand side of each equation (3.6) for a r (z 2 , . . . , z n ) thus depends on the previously computed polynomial coefficients a j (z 2 , . . . , z n ) with j < r, and on the function f where the first variable z 1 has been set to the value z 1 = y r . Therefore the reconstruction of a r (z 2 , . . . , z n ) is reduced to the black-box interpolation of a polynomial function in n − 1 variables. In particular, the evaluation of a r (z 2 , . . . , z n ) can be obtained by evaluating f (y r , z 2 , . . . , z n ) and then combining it with the previously computed polynomial coefficients a j by applying the formulas in Eq. (3.6). The recursion ends with the univariate case, where we apply the univariate polynomial reconstruction algorithm we already discussed.
In order to convert a multivariate polynomial from a (recursive) Newton representation into the canonical form in Eq. (2.6), one needs the following basic operations • addition of multivariate polynomials, • multiplication of a multivariate polynomial by a linear univariate polynomial.
Although these operations are slightly more involved than the corresponding ones for the univariate case, they still allow a rather straightforward and efficient implementation in staticallytyped languages, especially in the case of polynomials over finite fields Z p . We will discuss the details of our implementation and the polynomial representation we used in sect. 6.

Multivariate rational functions
The reconstruction of multivariate rational functions is a considerably more complex problem that those addressed so far. We were not able to find a dense multivariate reconstruction algorithm suited for our purposes in the literature. However, we observe that the techniques proposed in ref. [26] for sparse rational functions, can also be applied, with some minor modifications, to dense rational functions as well. In this section we describe our version of the techniques proposed in [26], and in particular their combination with the dense functional reconstruction methods discussed above. The efficiency of the resulting algorithm turns out to meet our performance goals, as we shall see in the next sections.
We consider the variables z = (z 1 , . . . , z n ) and a rational function f ∈ F(z), which admits a canonical representation of the form of Eq. (2.7). A first issue to be addressed is the ambiguity in the overall normalization of the numerator and denominator of the function. As we stated, we get rid of this ambiguity by requiring the canonical representation of the function to have the coefficient of the lowest degree term in the denominator equal to one. However, identifying such a term (or indeed any term in the function) before the functional reconstruction is in general not possible. As observed in ref. [26], there is however a simple case where this issue can be easily solved, i.e. when the lowest-degree non-vanishing term of the denominator is the constant term, in which case we can simply set the canonical normalization by imposing (3.14) It is also noted in ref. [26] that, even in the case where d (0,...,0) = 0, one can always identify a shift s = (s 1 , . . . , s n ) ∈ F n in the arguments of f such that the shifted function satisfies d (0,...,0) = 0 (i.e. is non-singular in z = (0, . . . , 0)) and can thus be normalized as in Eq. (3.14). Hence, in these cases, the method we are going to describe can be applied to the function f s instead. A minor subtlety is the fact that, after such a shift, the new function f s might become considerably more complex than the original function f , and thus harder to reconstruct. This issue is actually much more relevant for the sparse reconstruction case discussed in ref. [26], but it can also affect the dense reconstruction case, especially if the variables z have been carefully chosen for being suited to describe the function at hand. We will address this subtlety later and for the time being we turn to the description of the functional reconstruction of a rational function f ∈ F(z) whose constant term in the denominator is non-vanishing. The method illustrated in ref. [26] consists in introducing an auxiliary variable t and defining a new function h ∈ F(t, z) as (3.16) Using the canonical representation of f given by Eq. (2.7) and denoting the total degree of the numerator and the denominator of f by R and R respectively, we get In other words, the function h can be regarded as a univariate rational function in the variable t, whose coefficients are multivariate homogeneous polynomials in the variables z. We Notice that the constant term of the denominator of f coincides with the constant term of the denominator of h seen as a univariate rational function in t, hence by normalizing the latter to 1, the same normalization is automatically applied to the former. Our implementation of the method consists in reconstructing the homogeneous polynomials p r and q r for 0 ≤ r ≤ R and 1 ≤ r ≤ R , using the multivariate Newton reconstruction method illustrated above. The evaluations of these polynomials at a generic point z = y i ∈ F n are in turn obtained by reconstructing as a univariate rational function in t and identifying p r (y i ) and q r (y i ) with its coefficients. Notice that this univariate reconstruction is actually equivalent to the evaluation of all the polynomials p r and q r at the same time. In order not to waste evaluations, for each sampled value y i ∈ F n we cache the corresponding reconstructed univariate function h y i ∈ F(t) so that its coefficients can be re-used for the reconstruction of several polynomials. Moreover, since p r (z) and q r (z) are homogeneous polynomials of degree r and r respectively, during their reconstruction we can drop their dependence on the first variable z 1 by setting z 1 = 1 and consider the polynomials p r (1, z 2 , . . . , z n ) and q r (1, z 2 , . . . , z n ) instead. The dependence on z 1 is thus restored by homogenizing the result. This simplification makes up for having introduced the auxiliary variable t.
We now briefly discuss the univariate reconstruction algorithm we use for the functions h y i ∈ F(t) at each point y i ∈ F n . Since we typically cannot know a priori the degrees R and R of the numerator and the denominator respectively, we first perform a few univariate reconstructions using Thiele's interpolation method, in order to obtain this information. As a byproduct, this is also used to check whether the constant term of the denominator vanishes, in which case, as mentioned, we proceed by specifying a different shift z → z+s in the arguments of f (more details about this point are given below). Once the degree of the numerator and the denominator are known, and a shift s such that d (0,...,0) = 0 is found, we switch to the system-solving algorithm described at the beginning of sect. 3.3, since it typically requires fewer evaluations of the function (as explained, Thiele's interpolation becomes optimal with respect to the number of evaluations needed only when the degree of the numerator and the denominator are equal or the former is one unity higher than the latter).
We finally briefly describe a convenient way to implement the aforementioned variable shift to avoid the case where d (0,...,0) = 0 (this point is more extensively discussed in ref. [26], to which we refer the reader for further details). For simplicity we focus on the polynomials p r in the numerator of the function, although it should be clear that completely analogous statements can be made for the polynomials q r in the denominator. The key observation is that a shift in the variables z → z+s in one of the homogeneous polynomials p r can only affect the homogeneous polynomials p r with r < r. In particular the highest degree polynomial p R is the same for the function f and the shifted function f s defined in Eq. (3.15). We then start with the reconstruction of the highest degree polynomial p R , which is the same for f and f s . We then compute the effects of the shift on lower rank terms, which for a generic p r are given by p r (z + s) − p r (z). (3.21) During the reconstruction of lower degree terms, each function evaluation of the polynomials defining f s are thus corrected by the effects of this shift induced by higher rank terms. With this method we can reconstruct the function f directly (up to an overall common normalization of the numerator and the denominator, which can be fixed at the end using our definition of their canonical form), which as stated can be expected to be simpler than f s . As a final remark, we point out that, on top of the basic polynomial operations needed for the multivariate polynomial reconstruction, in order to implement this algorithm one also needs to compute the effects of a shift z → z + s in the variables. This is relatively straightforward if done for one variable at a time, by using Another operation which is needed is the homogenization required to restore the dependence of the polynomials p r and q r on their first variable, but depending on the polynomial representation used this can be trivial, since it amounts to adjusting the first entry of the multi-indexes α = (α 1 , . . . , α n ) representing the exponents of the monomials. As already mentioned, more details about our implementation are given in sect. 6.
This completes the description of the functional reconstruction algorithms used in this paper. In sections 4 and 5 we show how to evaluate functions relevant for calculations highenergy physics over finite fields Z p , making thus possible to use these techniques for the reconstruction of their analytic form as rational functions over the field Q.

Examples
Before moving to applications in high-energy physics, we briefly discuss two simple examples of functional reconstruction, which might be useful in order to clarify the concepts introduced in this section, especially for the reconstruction of multivariate rational functions, which is considerably more involved than the other cases. We perform the reconstruction over the field Q, since it is easier to follow, but any step can equivalently be performed over finite fields Z p , yielding the same result modulo p.
Function non-singular at z = (0, . . . , 0) We consider the following rational function of two variables Because the function is well defined at z = (0, 0) there is no need to shift variables in this case. Our goal is thus to reconstruct the analytic formula of this function from a black-box providing a numerical evaluation of it. The first step consists in defining the function h in Eq. (3.17), which in this case is given by with p 0 (z 1 , z 2 ) = 3, q 0 (z 1 , z 2 ) ≡ 1, We thus reconstruct the polynomials p r and q r by interpolating the function h y i (t) ≡ h(t, y i ) for several points y i . As stated, since the polynomials p r and q r are homogeneous and have degree r and r respectively, we can set z 1 = 1 and homogenize the result at the end.
Function singular at z = (0, . . . , 0) We now briefly discuss the case of a function which is singular at z = (0, 0). We consider which differs from the previous example only for the missing constant term in the denominator. Using Thiele's interpolation formula for h (1,1) (t), we now get which tells us that the constant term of the denominator is vanishing and thus we need to specify an argument shift. By trial and error, repeating the univariate functional reconstruction for several shifts, we find e.g. that s = (2, 0) is a good shift, i.e.
is non-singular at (z 1 , z 2 ) ≡ 0. The function h s is now defined as and can similarly be written as in Eq. (3.24). Now its shifted polynomial coefficients are These coefficients can be reconstructed similarly to the previous case, but as we explained we use a different strategy where the effects of the shift are recursively subtracted from the evaluations of each polynomial coefficient. This allows us to reconstruct the polynomials coefficients p r and q r of the original function f , up to an overall normalization of the numerator and the denominator, rather than the coefficients p s,r and q s,r of the shifted function f s . For simplicity we focus on the reconstruction of the polynomials p r in the numerator. We observed that the highest degree polynomial p 2 is independent of the chosen shift, and thus we can simply reconstruct it from several interpolations of h s,y i (t) as in the non-singular case. We then compute the effects of the variable shift on the lower degree terms, which are given by The term in the first and second parenthesis on the r.h.s. are the effects of the variable shift coming from p 2 on the coefficients p 1 and p 0 respectively. When reconstructing p 1 , we thus evaluate p s,1 (for which we can re-use the interpolations of h s we already performed for p 2 ) and correct its values using Hence the polynomial p s,1 is never actually reconstructed, since we correct its evaluation as in the previous equation and reconstruct p 1 instead. We thus proceed with the reconstruction of p 0 , for which we need the effects of the shift on p 1

Spinor-helicity and tree-level techniques
In this section we describe how spinor-helicity techniques and tree-level calculations can be implemented over finite fields Z p , and thus used in combination with the reconstruction algorithms illustrated above. In particular, one has to ensure that both the inputs and each intermediate step of a calculation performed via these technique can be implemented through a sequence of rational operations. These techniques also serve as building blocks for multiloop calculations via integrand-reduction and generalized unitarity, whose implementation on finite fields will be described in sect. 5. More in detail, in sect. 4.1 we describe how to use the well known four-dimensional spinorhelicity formalism in finite-field calculations. As we shall see, for loop calculations in the context of generalized unitarity, it can also be convenient to embed loop momenta in a Ddimensional space with integer D > 4. For this purpose, we will also use the six-dimensional spinor-helicity formalism [51][52][53], as described in Appendix B. In sect. 4.2 we recall how to combine these techniques for tree-level calculations by means of Berends-Giele recursion.

Four-dimensional momenta and spinors
The four-dimensional spinor-helicity formalism offers a very convenient way of expressing helicity amplitudes in gauge theories. This formalism is widely used and here we only review a few details which are relevant for our implementation.
Given a massless momentum p, its corresponding two-component spinors |p and |p] can be defined as the independent solutions of the Dirac equation, where σ µ = (1 2×2 , σ i ) with σ i being the Pauli matrices, and they satisfy In particular, if p is the momentum of an outgoing particle of a process, |p and |p] are identified as the negative-and positive-helicity solution of the Dirac equation respectively. We consider an arbitrary n-point process, with external momenta p 1 , . . . , p n , all taken as outgoing for simplicity. In the following we assume all external particles are massless, but our statements can be easily generalized to the massive case by performing a massless decomposition of the massive external momenta into a sum of massless vectors. For each external particle p i we have the associated spinors |p i and |p i ], also denoted by |i and |i] for simplicity. Scattering amplitudes are rational functions of the spinor components, as apparent from well known relations valid for external momenta and polarization vectors µ where η is a reference vector such that the scalar product (p · η) = 0. The total number of these spinor components is however much larger than the number of independent variables needed to describe an amplitude. We therefore find much more convenient to adopt the strategy described in references [37,56,57], which consists in describing scattering amplitudes with a minimal number of variables, which in turn provide a rational parametrization of the components of the spinors.
More in detail, we recall that, under a redefinition of the spinor phases given by the little group tranformation an n-point amplitude A(1, . . . , n) transforms as where h i is the helicity of the i-th particle (e.g. ±1/2 for fermions and ±1 for gluons). It is thus convenient to extract from the amplitude an overall factor A (phase) (1, . . . , n) which, under Eq. (4.5), has the same transformation properties as the amplitude A. While there is obviously no unique choice for A (phase) , it is clear that it can be chosen based on the external helicities only and does not require any other knowledge about the amplitude A itself (see e.g. ref. [57] for a choice valid for any number of external legs, or sect. 5.1 for a five-point example). The ratio A/A (phase) is obviously phase-free, i.e. invariant under the transformation in Eq. (4.5). Any phase-free function of the (four-dimensional) kinematic components which is invariant under the Poincaré group can depend on 3 n−10 independent invariants. Depending on the problem at hand, these may be chosen as a subset of the Mandelstam invariants s ij = (p i + p j ) 2 , however these are not suited for our purposes since they don't provide a rational parametrization of the spinor components. It is however not hard to build, for an arbitrary process, a set of 3 n − 10 invariants which, up to a Poincaré and little group transformation, yield a rational parametrization of the spinor components. A particularly convenient one, based on earlier work by Hodges on momentum twistors [56], is given in references [37,57] (we will also give some examples later, when discussing explicit applications). Hence, in the following we denote by x 1 , . . . , x 3n−10 a complete set of invariants which provide a rational parametrization of the spinor components (up to a Poincaré and little group transformation). We refer to these as momentum-twistor variables. Twistor variables can be interpreted as a rational parametrization of the phase space and, despite having been introduced in the context of conformal theories, they can be used in any relativistic quantum field theory and indeed they played an important role in multi-leg higher-loop calculations in non-supersymmetric gauge theories presented recent years [37,47,48]. Since they can give a complete description of the ratio A/A (phase) , a generic scattering amplitude can be rewritten in terms of them as A(1, . . . , n) = A (phase) (1, . . . , n)Ã(x 1 , . . . , x 3n−10 ). (4.7) BecauseÃ is a rational function of the momentum-twistor variables, it is suited to being evaluated over finite fields Z p , and therefore to the application of the functional reconstruction algorithms described earlier in this paper. When dealing with helicity amplitudes, our starting point is therefore the definition of the spinor components in terms of momentum-twistor variables. In practice we observe that we can always choose our variables such that all but one, say x 1 , are dimensionless. For the purpose of the functional reconstruction, it is thus convenient to set x 1 = 1 and recover its functional dependence at the end via dimensional analysis. Hence our chosen set of independent variables is z = (x 2 , . . . , x 3n−10 ) . If internal or external massive particles are involved, we should also add their masses to the list of the independent variables. From the components of the spinors, we then build the external momenta p µ . Explicit analytic formulas for these in terms of the corresponding spinors can be easily worked out once and for all using Eq. (4.3). In particular we represent any (massless or massive) momentum p by its light-cone components where, with respect to their usual definition, we dropped a factor 1/ √ 2, which being irrational cannot be translated into Z p . 5 With this representation, the light-cone components of p are also rational functions of the momentum-twistor variables (factors i cancel out against those in the Pauli matrices). Similarly, we compute polarization vectors according to Eq. (4.4), except that we divide them by an extra √ 2 (more details are given in the next paragraph). The resulting expression, in light-cone components, is also a rational function of the momentumtwistor variables.
A minor issue arises from overall factors i and √ 2 present in the definition of polarization vectors and in the color-ordered Feynman rules of gauge theories. These factors are however completely spurious, sinceÃ(x 1 , . . . , x 3n−10 ) is always a rational function of the momentumtwistor variables, provided that a factor i is also extracted in A (phase) . Therefore, if µ is an external polarization vector, V n is a generic n-point vertex, and P is a propagator, with colour matrices normalized as tr[T a , T b ] = δ ab , we get rid of these unwanted factors by making the following substitutions Notice that the rescaling of V n is consistent when combining different kinds of vertexes (e.g. V 3 ×V 3 scales in the same way as V 4 ). The correct overall factor, which is a rational number, is restored at the end with a power counting on the number of external gluons, internal vertexes and propagators. The ingredients outlined so far are sufficient for expressing a tree-level amplitude, as well as the coefficients of a multi-loop amplitude written as a linear combination of loop intergrals, as a function of the momentum-twistor variables which can be evaluated, either on Q or on finite fields Z p , via a sequence of elementary rational operations. The application of functional reconstruction algorithms based on finite-field evaluations is thus possible, and it can be an efficient way of computing scattering amplitudes in gauge theories. In the following we give more details about how to combine these ingredients for tree-level calculations via recursion relations, and for multi-loop calculations via integrand reduction and generalized unitarity.
Similar concepts apply to the spinor-helicity formalism in higher numbers of dimensions which, as stated, can be useful in order to provide a higher-dimensional embedding of loop momenta. Details on the six-dimensional spinor-helicity formalism are given in Appendix B.

Berends-Giele recursion
Berends-Giele recursion [50] is an efficient method for the numerical calculation of tree-level amplitudes. Even though it is typically used with floating-point arithmetics, it can similarly be applied with trivial modifications to finite fields Z p .
It is straightforward to apply the concepts outlined in sect. 4.1 for the four-dimensional case and in Appendix B for the six-dimensional one, to Berends-Giele recursion. One can start defining one-point on-shell currents, which we symbolically denote by as external polarization vectors (J(i) ≡ µ h i (p i )), spinors (J(i) ≡ |i , |i]) or constants (J(i) ≡ 1) depending on the kind of particle involved, i.e. vector bosons, fermions, and scalars respectively. Then, higher-point off-shell currents are computed by contracting two or more lowerpoint currents with appropriate spinor and tensor structures which can be easily worked out from the Feynman rules of the theory. For color-ordered amplitudes, the recursion for the calculation of an off-shell current J(1, . . . , m) is depicted in Fig. 1. In the last step of the recursion for the calculation of an n-point amplitude, an on-shell current J(1, . . . , n − 1) is computed via a similar recursion relation as the one in Fig. 1, except that there is no multiplication by the propagator factor, and then it is contracted with the on-shell one-point current defined by the n-th external leg.
Besides its efficiency, largely due to the fact that lower-point currents can be re-used for the calculation of several higher-point currents, Berends-Giele recursion also has the advantage that it can be straightforwardly implemented for any theory and that the algorithm is largely independent of the number of space-time dimensions. For these reasons Berends-Giele currents can also be efficiently used as building blocks for multi-loop integrands, as we will describe in the next section and in Appendix C.

Multi-loop integrand reduction and generalized unitarity
In this section we discuss a finite-field implementation of a specific algorithm for the calculation of scattering amplitudes in gauge theories, namely integrand reduction via d-dimensional generalized unitarity. The algorithm uses as building blocks tree-level amplitudes and Berends-Giele currents, which in turn can be computed using the spinor-helicity formalism as discussed in sect. 4.
We consider -loop contributions to n-point scattering amplitudes in dimensional regularization. In particular we choose a regularization scheme such that [58] • external momenta and polarizations are in four dimensions The popular t'Hooft-Veltman [59] and Four-Dimensional-Helicity [58] schemes can be obtained as special cases, by setting d s = d and d s = 4 respectively at the end of the calculation.
A generic contribution to a loop amplitude takes the form where N and D are polynomials in the components of the loop momenta k i (a rational dependence on the external kinematic variables is always understood). In particular, the denominators D i correspond to loop propagators and have the generic quadratic form It is often useful to split the loop momenta k i in a four-dimensional part k [4] i and a (d − 4)dimensional part k and define extra-dimensional scalar products In the regularization scheme defined above, a loop integrand can only depend on the additional (d − 4)-dimensional components of the loop momenta through the scalar products µ ij defined above. In particular, for one-loop amplitudes we only have one extra-dimensional scalar product, i.e. µ 11 , while at two loops we have three of them, namely µ 11 , µ 22 and µ 12 . The dependence on these extra-dimensional components can be implemented by embedding the loop momenta in a D-dimensional space, where D is a sufficiently large integer. In particular, the choice D = 6 is sufficient for both one-and two-loop applications. An advantage of this higher-dimensional embedding, especially in the context of generalized unitarity, is that one can use a higher-dimensional spinor-helicity formalism for the calculation of the integrands, having thus an explicit representation of both external and internal states. This strategy has been used e.g. in references [37,48]. Having an explicit representation of loop momenta and spinors is also obviously advantageous in numerical calculations, including computations on finite fields Z p , which make this formalism suited for our goals.
Integrand reduction methods rewrite loop integrands of the form of Eq. (5.1) as a sum of irreducible contributions, where the sum on the r.h.s. runs over the sub-topologies T of the parent topology identified by the set of denominators D j . The so-called on-shell numerators ∆ T , also known in this context as residues, are linear combinations of basis elements {m α T }, namely where α runs over an appropriate topology-dependent set of multi-indexes and m T represents a sequence of polynomials in the loop momenta. The coefficients c T,α , which only depend on the external kinematics, can be obtained by evaluating the integrand on values of the loop momenta satisfying the so-called multiple-cut conditions {D j = 0} j∈T . This corresponds to put on-shell a subset of the loop momenta. In particular, by evaluating the integrand on several solutions of the cut constraints, one obtains a linear system of equations for the coefficients c T,α which can be solved e.g. via Gaussian elimination. In general, there is no unique choice for the set of basis elements {m α T }, which are only constrained by the requirements of being independent of the aforementioned cut conditions and forming a complete integrand basis compatible with the rank of numerator with respect to the loop momenta, up to terms proportional to the denominators D i of the topology T . Techniques for choosing an appropriate integrand basis have been proposed elsewhere [29][30][31]48] and their discussion is outside the purposes of this paper. Here we will limit ourself to specify case-by-case the integrand basis we used in each example.
As stated, in the calculations presented in this paper, we embed the loop momenta in a D-dimensional space (with D = 6). Each loop momentum k i is thus decomposed into a basis For each cut, we identify a set of independent free parameters {τ k } which describe the set of solutions. In particular we look for a set of variables such that the coefficients y ij = y ij (τ k ) of the linear combination are rational functions of the parameters τ k , which is particularly convenient when working with finite fields. 6 We point out that this has been shown to be possible in many two-loop examples in d dimensions [37,38,48], while in general it is not + + + Figure 2: Sum of diagrams with gluon (solid lines) and scalar (dashed lines) loops, for the penta-box topology. Each scalar loop should be multiplied by the number of scalar flavours, which in our case is equal to d s − D.
in the four-dimensional limit. Using D-dimensional cut loop momenta defined by assigning numerical values to the free parameters τ k , we thus evaluate both the integrand the on-shell basis m α T , generating the linear system of equations we can solve for the coefficients c T,α in Eq. (5.6).
The evaluation of the integrand on multiple cuts might be done by means of its analytic expression, if available (e.g. via diagrammatic techniques). The method of generalized unitarity instead exploits the fact that a multi-loop integrand, when a subset of the loop momenta are put on-shell by the cut constraints, factorizes as a product of tree-level amplitudes. This factorization can be understood by the fact that the numerator of an on-shell propagator factorizes as a sum of polarization states. Using generalized unitarity, the amplitude is expressed in terms of a smaller number of contributions compared with diagrammatic techniques, and gauge invariance is guaranteed in intermediate steps of the calculation, thus avoiding cancellations of gauge-dependent terms between different diagrams. Moreover, it offers the possibility of exploiting efficient tree-level techniques for loop calculations as well. When using Berends-Giele currents, additional simplifications are possible during the evaluation of the cut integrands, along the lines of what is implemented in the one-loop public code NJet [18], whose two-loop extension is briefly discussed in Appendix C.
Since we embed loop momenta in D = 6 dimensions, we can evaluate the amplitudes relevant for each cut using the six-dimensional spinor-helicity techniques presented in ref. [51] and whose application to finite-field calculations is discussed in Appendix B. We also add to the theory d s − D flavours of scalars representing additional polarizations of the internal gluons which, as stated at the beginning of this section, are taken to be d s -dimensional. In particular, each on-shell integrand at two-loops can be written as is a D-dimensional integrand with i scalar loops. Notice that the result for ∆ T does not depend on the dimension D of the chosen embedding, unlike each of the terms on the r.h.s. of the previous equation. Moreover, for pure Yang-Mills theories, the contribution ∆ (D,2) T is non-vanishing only for one-loop squared topologies. As an example, the two-loop planar penta-box is obtained as the sum of contributions in fig. 2. This is the same set-up used for the analytic calculations presented in references [37,47,48].
In the following examples, we apply the functional reconstruction algorithm to the coefficients c T,α . T,α = 0. All the analytic results are publicly available, and should be useful for comparisons with future calculations. 7

Two-loop five-point planar penta-box
We consider the two-loop penta-box topology, where the five external gluons are taken as outgoing. The loop momenta k 1 and k 2 are defined as in fig. 3.
As stated, the kinematics is defined by the spinor components, which in turn are parametrized by momentum-twistor variables. For the five-point case we use the parametrization given in [37], namely The variables x i can be expressed in terms of the Mandelstam invariants s ij and tr 5 (1 2 3 4) using Eq. (A.8) of ref. [37]. In particular, all the momentum-twistor variables but x 1 (we recall that x 1 = s 12 ) are dimensionless. This means that the dependence of the result on x 1 can be fixed by dimensional analysis and the functional reconstruction will thus use the following set of variables z = (x 2 , x 3 , x 4 , x 5 ). (5.11) As discussed in sect. 4.1, we factor out of the result a global helicity-dependent phase A (phase) . In this example we define it as and similar for all the ones obtained by cyclic permutations of the external legs. Helicity configurations with three or more negative helicities are not listed, since they can be obtained from the ones above and their cyclic permutations by charge conjugation. Notice that charge conjugation corresponds to swapping the spinors |i ↔ |i] both in the definition of A (phase) and in their parametrization in terms of momentum-twistor variables. In this case the definition of the x i is modified by applying the same transformation, which amounts to the substitution tr 5 → −tr 5 in Eq. (A.8) of ref. [37].
Given the symmetries of the penta-box diagram, all the helicity configurations can be obtained from the following set (1 − , 2 + , 3 + , 4 + , 5 + ), (1 + , 2 − , 3 + , 4 + , 5 + ), (1 + , 2 + , 3 + , 4 − , 5 + ), For each of these, the on-shell integrand is parametrized as where pb,α (x 2 , x 3 , x 4 , x 5 ), (5.15) and the c (i) pb,α are computed via the functional reconstruction method previously illustrated in this paper. The integrand basis is chosen as in ref. [37], namely m pb = (2 (k 1 · p 5 ), 2 (k 2 · p 2 ), 2 (k 2 · p 1 ), µ 11 , µ 12 , µ 22 ), pb,α , the maximum number of terms in a coefficient, the maximum degree (of either the numerator or the denominator) of a coefficient, and the average number of terms in the non-vanishing coefficients. The number of terms in a rational function is defined as the sum of the number of monomials in its numerator and the one in its denominator. Full analytic expressions are available at the url https://bitbucket.org/peraro/ff2lexamples. This basis is smooth in the four-dimensional limit µ ij → 0. The results we obtain for the all-plus topology agree with those computed in references [37,48]. The on-shell integrands for the other helicity configurations are new. The results are summarized in table 1.

Two-loop five-point non-planar double-pentagon
We now consider the two-loop non-planar double-pentagon topology depicted in fig. 4. The set-up of the calculation is completely analogous to the one for the penta-box. The kinematics and the global spinor phase are still defined using Eq. (5.10) and Eq. (5.12) respectively. For this topology, we consider the following complete set of independent helicity configurations For this example we use a slightly different approach for the parametrization of the integrand. Rather than using an integrand basis with a smooth four-dimensional limit µ ij → 0, we choose one which is only composed of scalar products between loop momenta and external momenta. While this, in general, might make some of the coefficients more complex and the integration harder, it has the advantage that the result can be directly plugged into available programs for integration by parts and other multi-loop techniques. This is also consistent with the adaptive integrand decomposition recently proposed in ref. [38], where the integrands are always expressed in terms of this kind of scalar product and the four-dimensional components of the loop which are orthogonal to the external legs are integrated out (as this is a five-point topology, it corresponds to a limiting case where no such orthogonal direction is present). More in detail, we use ∆ dp = A (phase) α (s 12 ) −|α| c dp,α m α dp , (5.19) where, as before, dp,α (x 2 , x 3 , x 4 , x 5 ), (5.20) while the integrand basis is now given by m dp = (2 (k 1 · p 5 ), 2 (k 2 · p 1 ), 2 (k 1 · p 3 ) − 2 (k 2 · p 3 )),   The results we obtain for the all-plus topology are in agreement with those computed in ref. [47], although in that reference they were deduced from the known expressions in the planar case, while here they have been directly computed using integrand-reduction via generalized unitarity on the non-planar topology. The on-shell integrands for the other helicity configurations are computed here for the first time. The results are summarized in table 2.

Implementation
In this section we give some detail about our implementation of the functional reconstruction method illustrated in sect. 3 and its application to the techniques described in sections 4 and 5.
The programming language of our implementation is C++, and in particular the C++-11 standard of the language. 8 The first design choice regards the definition of the finite fields Z p . Our requirement of working with machine-size integers imposes an upper limit p < P max on the choice of the primes p defining the fields. We work with 64-bit unsigned integers, which can take values between i min = 0 and i max = 2 64 − 1. The most stringent requirement is avoiding integer overflow in multiplication, i.e. the requirement that by multiplying two integers in Z p with p < P max , before taking modulo p of the result, we still obtain a value smaller than i max . This implies we can choose P max such that P 2 max ≤ i max . In our implementation we find more practical to use a more conservative choice, namely C P 3 max < i max with C = 9. This allows us to be slightly more relaxed in the implementation and take (sums of) products of three integers in Z p before taking modulo p. This approach also drastically reduces the number of mod p operations which are needed. We hard-code 60 primes satisfying these criteria, in the interval 897 473 ≤ p ≤ 978 947. With this choice, in most of the examples presented in this paper we only need one or two primes p for the rational reconstruction of the result over Q (according to the discussion in sect. 2.3 and Appendix A.1).
One important difference between native operations which involve machine integers or floating-points numbers and operations on elements of a finite field Z p concerns division. Indeed, as stated, performing a division in Z p requires the calculation of a multiplicative inverse (see Appendix A.1 for an algorithm). Hence, division in Z p is a relatively expensive operation (compared to other native operations) which needs a call to a specific routine, and it is generally a good idea to minimize the number of times it is required. However, because the calculation of a multiplicative inverse is implemented by a routine, catching divisions by zero is very easy. Notice that such cases can be quite common during a functional reconstruction procedure, where many values for the variables z are probed. This makes detecting (actual or spurious) singularities in the evaluation of a function relatively easy and, as stated, the reconstruction algorithms illustrated above can accomodate this possibility.
We now briefly describe our internal representation of polynomials and rational functions. For univariate Newton polynomials, defined as in Eq. (3.5), we store two arrays of integers, namely the coefficients a i and the values y i . A completely analogous representation is used for Thiele's rational functions in Eq. (3.10). For multivariate Newton polynomials depending on n variables, recursively defined as in Eq. (3.12), we store the array of integers y i and an array of pointers to the coefficients a i , which are now Newton polynomials in n − 1 variables. For our canonical representation of polynomials, given in Eq. (2.6), we adopt instead a sparse representation. More in detail, we store an array of pointers to a data structure representing each non-vanishing term. These terms are in turn stored as contiguous chunks of memory containing the coefficient c α and the exponent α corresponding to each monomial. We also store the total degree |α| for convenience. The internal representation uses the graded lexicographic monomial order (i.e. terms are ordered by their total degree, and terms with the same total degree are ordered lexicographically). We found this representation suited for implementing the basic polynomial operations needed by our reconstruction algorithm, namely polynomial addition, multiplication by linear univariate monomials, and homogenization with respect to one variable. In particular the last two operations take advantage of the possibility of manipulating the entries of the exponents α, which is trivial using our sparse representation. Finally, rational functions are simply stored as a pair of polynomials representing the numerator and the denominator respectively.
As stated, the only technique discussed in this paper which requires multi-precision arithmetic is the rational reconstruction algorithm. In our implementation we use the popular GNU Multiple Precision Arithmetic Library (GMP). The final result over the field Q is stored using a sparse representation similar to the one we use for Z p . For functions over Q however we do not implement any of the polynomial algorithms mentioned above, but only the routines needed for merging results over several finite fields Z p i to obtain a guess in Q, as discussed in sect. 2.3 and Appendix A.2.
It should be stressed that the results obtained with the illustrated reconstruction algorithm for rational functions is minimal with respect to the total degree of the numerator and the denominator, hence there is no need to import it into an external computer algebra system for polynomial GCD simplification (although it can obviously be used by algebra systems for other kinds of manipulations). It is worth making a few remarks about the application of the reconstruction algorithm to the examples described in sect. 5. As stated, the coefficients c (i) T,α to be reconstructed for each topology T are computed by evaluating the loop integrands on the multiple cuts and thus obtaining a linear system of equations. The latter can be solved for them by means of Gauss elimination, for which we have a very straightforward implementation suited for dense systems. In order to minimize the number of integrand evaluations we need, as well as the size of the system of equations, we first evaluate each coefficient for a set of arbitrary values of the variables z and the primes p defining the field Z p . These evaluations are used to collect information about which coefficients are non-vanishing for each helicity configuration. Then, when applying the functional reconstruction algorithm, we restrict the system to the non-vanishing coefficients only. With fewer unknowns, we thus require fewer evaluations of the integrand in order to find a solution. Moreover, as stated, because the solution of the system returns all the non-vanishing coefficients, we cache their value for each z in a hash table. This allows to quickly look up the values of c Our current implementation can be considered a proof-of-concept and lacks many optimizations which can be estimated to improve the performance by at least one order of magnitude. However it is already suited for applications to complex problems in high-energy physics such as multi-loop high-multiplicity calculations, where it can easily outperform equivalent fully-analytic approaches. In the examples presented in the previous section, for most of the helicity configurations, the total run time required for obtaining full analytic results for all the coefficients of the on-shell integrands varies between a fraction of a second and about ten minutes, on a single core. For exceptionally complex cases (such as the one involving functions of total degree up to 42), we need to perform the reconstruction over up to three finite fields, for a total run time of about a hundred minutes.
As a final observation, we point out that the purpose of the calculations in sect. 5 is to illustrate a straightforward application of functional reconstruction algorithms over finite fields to modern techniques for loop calculations. In particular, we have not attempted to find an integrand basis which would yield simple results (such as the local integrands discussed in ref. [48]). We observe however that, using the techniques we described in sect. 3.5, one can compute the total degree of the numerator and the denominator of the results using a relatively small number of function evaluations. This can be used in order to estimate the complexity of the total result, when looking for an integrand basis yielding simpler coefficients, without the need of performing a full functional reconstruction.

Conclusions and outlook
We illustrated an algorithm for the reconstruction of multivariate polynomials and rational functions from their evaluation over finite fields, which has good scaling with the number of variables and the complexity of the results. We then discussed its application to techniques for the calculation of scattering amplitudes in gauge theories, such as the spinor-helicity formalism, tree-level recursion relations, and multi-loop integrand reduction via generalized unitarity. The algorithm can compute complex analytic results, side-stepping the issue of large intermediate expressions which would arise in equivalent fully-analytic calculations. The only input required by the method is a procedure for the numerical evaluation of the function to be reconstructed over finite fields.
As an application of these techniques, we presented for the first time analytic results for the maximal cut of two five-point two-loop topologies in Yang-Mills theory, for a complete set of independent helicity configurations. The complexity of the reconstructed results highlights the suitability of the method to handle complex problems, in particular high-multiplicity two-loop calculations, which are currently of great relevance for high-energy phenomenology.
In this paper we discussed dense reconstruction algorithms. Given that some of our results contain several thousands of terms, we believe this to be the best choice to handle the general case. However, when the functions contain a relatively small number of terms compared to the one expected from the total degree of the result, dense algorithms can be outperformed by sparse algorithms. A possible approach is the usage of so-called racing algorithms, i.e. running a dense and a sparse algorithm in parallel, sharing the same function evaluations, and terminate the reconstruction when the fastest of the two is successful. If the time spent for the reconstruction is dominated by the numerical evaluation of the function, as in our case, this method can achieve optimal performance in virtually all cases.
One can observe that the simpler the result is, compared to the intermediate expressions appearing in an equivalent analytic calculation, the greater the advantage we can expect from using a black-box reconstruction method rather than following a purely analytic approach. This makes the method ideal in the case where large analytic cancellations yield simple final results. Even for somehow more complex results, such as many of the examples presented in this paper, we found that our functional reconstruction algorithm can easily outperform an equivalent analytic calculation by several orders of magnitude. We observed however that the analytic results presented here might have been significantly simpler using a better integrand basis, along the lines of what was proposed in ref. [48]. In particular, the capability of the proposed reconstruction method of quickly computing the total degree of the result can be exploited in order to estimate its complexity and thus look for a choice of variables or an integrand basis which makes the results simpler. As a future application, the method can thus be beneficial to the extension of the concepts outlined in ref. [48] to a generic helicity configuration.
Although we focused on integrand reduction via generalized unitarity, we stress that the number of potential applications of the proposed approach is much larger. Indeed, as we observed, any method which can be numerically implemented via a sequence of elementary rational operations is suited for the functional reconstruction algorithm we illustrated. On top of the ones presented here, other possible applications are diagrammatic techniques and IBPs (indeed finite-field applications to the latter have already been proposed in ref. [25]).
The methods proposed in this paper are very general and have a very broad spectrum of potential applications. In particular, as shown by our proof-of-concept implementation, they are suited for computing complex analytic results in combination with state-of-the-art multi-loop techniques. Because every non-vanishing element of a field must have a multiplicative inverse, we can consider Z n as a field if n = p is prime. Notice that the calculation of the sequence {t i } is not needed for the purposes of computing a multiplicative inverse. A modified version of the extended Euclidean algorithm can also be used for guessing a rational number q from its image a ∈ Z n [54]. Indeed, it is straightforward to show that each iteration of the Euclidean algorithm satisfies and therefore, by setting b = n and taking mod n of both sides, Hence r i /s i is a possible guess for q in each iteration of the extended Euclidean algorithm. However one can see that, when n is sufficiently large, most of these guesses will have very large numerators and denominators, except for the case where Hence the rational reconstruction algorithm can be implemented by terminating the Euclidean algorithm at the iteration i = k, when r 2 k < n. The calculation of the sequence {t i } is not needed for the rational reconstruction.
This algorithm typically succeeds in reconstructing a rational number q = a/b ∈ Q from its image in Z n when its numerator and denominator satisfy a 2 , b 2 n. (A.8) As explained in sect. 6, our requirement of working with machine-size integers imposes an upper bound on our choices of n = p (in our implementation, p < P max with P max = O(10 6 )) and thus the former relation is generally not guaranteed to hold. However, as we mentioned in sect. 2.3, this issue can be solved by performing the functional reconstruction on several finite fields Z p i with p i < P max , and combining them by means of the Chinese remainder theorem, as described below.

A.2 Chinese remainder theorem
The Chinese remainder theorem allows us to reconstruct a number a ∈ Z n with n = n 1 · · · n k and the n i pairwise co-prime, from its images a i in the sets Z n 1 , . . . , Z n k . Hence, by performing a calculation on several prime fields Z p i , with p i < P max , and combining them with the Chinese remainder theorem, one can reconstruct the image of the same result in Z p 1 ···p k and apply the rational reconstruction algorithm described above to the latter.
More in detail, given a ∈ Z n , a set of pairwise co-prime numbers n 1 , . . . n k such that n = n 1 · · · n k , and a set of congruences By using several primes p 1 , . . . , p k , one can thus make their product large enough for the reconstruction algorithm (described in the previous section) to succeed in Z p 1 ···p k . In practice, as we explained, we perform the functional reconstruction over one prime field Z p i at a time and recursively merge the result on the bigger set Z p 1 ···p k . The recursion terminates when the result of the rational reconstruction is in agreement with the evaluation of the function on different prime fields. Hence, all our calculations can be performed on Z p i with primes p i < P max using machine-size integers and only the application of the Chinese remainder theorem requires multi-precision arithmetic.
It is worth observing that, when applying the Chinese remainder theorem to two congruences at a time, as in our case, one has n = n 1 n 2 and the above formulas reduce to a = m 1 a 1 + m 2 a 2 mod n 1 n 2 , (A.12) with m 1 = (n −1 2 mod n 1 ) n 2 , m 2 = (n −1 1 mod n 2 ) n 1 . (A.13) The implementation can be further simplified by noting that m 2 = (1 − m 1 ) mod n 1 n 2 . (A.14) where σ (6) µ andσ (6) µ , for µ = 0, . . . , 5, are now 4 × 4 matrices which can be regarded as higher-dimensional versions of the Pauli matrices (an explicit representation can be found in ref. [51]). The indexes a,ȧ = 0, 1 label the two independent solutions of the Dirac equation, namely solutions with positive and negative helicity for spinors and anti-spinors respectively. A sub-set of the six-dimensional spinor components can be identified with the ones of |p and |p] in the four-dimensional limit, hence making the conversion from four to six dimensions trivial. For our purposes, the most important relation satisfied by six-dimensional spinors is where ab and ȧḃ are anti-symmetric Levi-Civita tensors in two dimensions with 01 = − 01 = 1, and a sum over repeated indexes is understood. The Levi-Civita tensors are also used for raising and lowering the indexes a andȧ. Unlike the four-dimensional case, where momenta are built from spinors (which in turn are determined by their representation in terms of momentum-twistor variables), in the sixdimensional one we are mostly interested in the opposite, i.e. building spinors |p a and |pȧ] from the entries in Eq. (B.1) of a six-dimensional momentum, such that Eq.(B.3) is satisfied. This is because, in the context of generalized unitarity, we start from six-dimensional momenta corresponding to on-shell loop propagators, and from these we must build their corresponding spinors. While the solution of the problem is not unique, it is not hard to work out a suitable representation, although one has to separately consider special cases where one or more of the entries are vanishing. It should be pointed out that, once the sum over the internal helicities of a unitarity cut has been performed, spinors corresponding to loop momenta always combine as in the right-hand-sides of Eq. (B.3), hence these relations (and the Dirac equation) are the only relevant ones for this purpose.
The last six-dimensional ingredient we need are polarization vectors, for cut loop propagators involving gluons. For six-dimensional gluons we have four independent polarizations, which can be identified by the labels (aȧ) of their SU (2) × SU (2)  While (++) and (−−) respectively correspond to the positive and negative helicity in the four-dimensional limit, the last two are instead only present in six dimensions. Given an auxiliary vector η µ , polarization vectors can be written as [51] µ aȧ (p, η) = 1 √ 2 (p · η) p a |σ µ |η b η c |pȧ] bc . (B.5) As in the four-dimensional case, in our representation we divide them by an additional factor √ 2, such that their light-cone components become rational functions of the spinor variables. In ref. [51] it is shown that these vectors obey all the properties required by polarizations of gauge bosons, and the following relation µ aȧ (p, η) ν aȧ (p, η) = g µν − 1 (p · η) (p µ η ν + p ν η µ ) .  ℓ j 1 ℓ j 1 +1 ℓ j 1 +j 2 +1 ℓ j 1 +j 2 +2 ℓ j 1 +2 ℓ j 1 +3 Figure 5: Schematic depiction of a unitarity cut. Grey blobs represent tree-level amplitudes and they are joined by the lines corresponding to the on-shell momenta of the cut loop propagators i . The loop momenta are defined as k 1 = 1 , and k 2 = − j 1 +j 2 . Double lines represent an arbitrary number of external legs.
Similarly to the four-dimensional case, the ingredients reviewed here allow us to work with six-dimensional spinors by performing rational operations on their components, hence they are suited for numerical evaluations over finite fields Z p and thus for the application of the functional reconstruction algorithms described in sect. 3.

Appendix C Two-loop unitarity cuts from Berends-Giele currents
In this appendix we briefly illustrate how to evaluate two-loop unitarity cuts efficiently using off-shell Berends-Giele currents. This is a straightforward generalization of the algorithm used by the public numerical code NJet [18] at one-loop. The algorithm is suited for numerical implementations, including evaluations over finite fields Z p . For simplicity, we consider a multiple cut of the form depicted in fig. 5, for a theory with only gluons, but it should be clear that everything can be easily extended to more general cases. In particular, fig. 5 represents a unitarity cut where the momenta of the loop propagators which are put on-shell are denoted by i . We split the loop propagators in three categories: { 1 , . . . , j 1 } are propagators depending on the loop momentum k 1 only, { j 1 +1 , . . . , j 1 +j 2 } depend on k 2 only, and { j 1 +j 1 +1 , . . . , j 1 +j 2 +j 12 } depend on both k 1 and k 2 . Double lines represent an arbitrary set of external legs. The cut is defined as the product of the tree-level amplitudes (represented as grey blobs in the picture) defined by the on-shell propagators, summed over the helicity states of the internal legs.
We first focus on the tree-level amplitude involving 1 and 2 . When this amplitude is computed via Berends-Giele recursion, the last step of such recursion is the contraction of the on-shell current involving 1 and the appropriate sequence of external legs with the polariza-tion vector corresponding to the on-shell propagator 2 . This is schematically represented by the following equation In the calculation of the current appearing on the r.h.s. of the previous equation, the lowerpoint currents depending only on the external legs only need to be computed once per phasespace point, since they are independent of the cut. We now recall that, in the product of amplitudes defined by this unitarity cut, the polarization vector h 2 ( 2 ) always appears in the combination µ h 2 ( 2 ) ν −h 2 (− 2 ). After the sum over internal helicities we have as apparent e.g. from Eq. (B.6). In the previous equation, η is the reference vector used to define the internal polarizations, and the second term on the r.h.s. may be dropped because of gauge invariance. This implies that, instead of computing a product of amplitudes and summing over the internal helicities, one can equivalently simply compute the Berends-Giele current on the r.h.s. of (C.1), without contracting it with the polarization vector h 2 ( 2 ), and use this current as an input on-shell current for the calculation of the next amplitude (in this case, the one involving 2 and 3 ), as if the current itself was the polarization vector associated to 2 . The argument can obviously be iterated over all propagators i with i ≤ j 1 , i.e. those which depend on the first loop momentum k 1 only, so that for each helicity h associated with