Tensorial Reconstruction at the Integrand Level

We present a new approach to the reduction of one-loop amplitudes obtained by reconstructing the tensorial expression of the scattering amplitudes. The reconstruction is performed at the integrand level by means of a sampling in the integration momentum. There are several interesting applications of this novel method within existing techniques for the reduction of one-loop multi-leg amplitudes: to deal with numerically unstable points, such as in the vicinity of a vanishing Gram determinant; to allow for a sampling of the numerator function based on real values of the integration momentum; to optimize the numerical reduction in the case of long expressions for the numerator functions.


Introduction
In the last few years we observed enormous progress in the computation of one-loop virtual corrections for processes involving many particles.
Commonly, all these calculations rely on the decomposition of the tensor integrals in terms of scalar master integrals which are known analytically, and can be computed by means of public libraries, like LoopTools [1], QCDloop [2,3], Golem 95 [4] or OneLOop [5,6]. The task of reducing tensor integrals to scalar integrals can be achieved by following a fully analytic procedure, according to the well established Passarino-Veltman reduction [7,8], which proved its effectiveness for reactions involving a small number of particles. Improved tensor reduction methods led to the development of tools [4,[9][10][11] which are able to deal in an efficient and numerically stable way with processes of high complexity such as the EW corrections to e + e − → 4f [12] and the QCD corrections to pp → ttbb [13][14][15], or qq → bbbb [16].
In recent years, inspired by unitarity arguments [17,18], alternative approaches have emerged, aiming at the direct determination of the coefficients of the master integrals.
A novel powerful framework for one-loop calculations was developed by merging the idea of exploiting the kinematic cuts of the scattering amplitudes in four dimensions [19,20] with a systematic analysis of the structure of their integrands [21], leading to a framework by now known as OPP method [22,23]. This approach has been further extended to the generalized d-dimensional unitarity-based method [24,25] for the reconstruction of scattering amplitudes in dimensional regularization [26].
Controlling the quality of the numerical reconstruction is crucial for the new approaches to one-loop calculations [27,28,43]: with respect to the tensorial reduction, the new approaches require a particular attention to the issues of numerical efficiency and the control over numerical instabilities, which are some of the strong features of purely algebraic techniques. In addition to standard tests, commonly used within all the reduction methods to monitor the quality of the final results (such as the checks on the coefficients of the UV and IR poles or the comparison between results using different levels of numerical precision), the OPP/d-dimensional unitarity framework provides internal-consistency checks on the quality of the reconstructed coefficients. The phase space points for which the reconstruction fails the tests are in general (re)processed using routines that provide higher numerical precision which demand longer computing time. Multi-precision procedures are implemented either in dedicated routines or by doubling the original algorithm within a copy running in higher floating point precision.
In this paper we present a new simple strategy for the numerical evaluation of one-loop amplitudes, based on the reconstruction of the tensorial representation of their integrands. Accordingly, any integrand can be expressed in a basis of tensors formed by products of the integration momentum, each multiplied by a tensorial coefficient. The tensorial decomposition is achieved by sampling the integrand for real values of the components of the loop momentum. Finally, the reconstructed coefficients can be contracted numerically with the corresponding tensor integrals, evaluated by means of dedicated libraries such as Golem 95 [4].
This technique shows its virtues in at least three cases. First, it is suitable to treat unstable configurations in the proximity of vanishing Gram determinants. The reduction of tensor integrals to express them in terms of scalar ones is responsible for the appearance of spurious, potentially singular coefficients, which might spoil the numerical accuracy in phase space regions where the Gram determinants tend to zero. Using a tensor basis, rather than a scalar one, this problem is bypassed completely. Although the evaluation of the tensor integrals demands more time than the computation of the scalar integrals, it might still be competitive when compared to a complete reduction in higher floating point precision. Moreover, a "rescue-system" based on the tensorial reconstruction of the integrand should be performed only for a small fraction of points when integrating over the full phase space.
Second, the tensorial reconstruction combined with the sampling by real momenta yields the automatic generation of the integrand available directly from tree-level generators like MadGraph [44] and HELAC [45,46], which provide tree amplitudes for real values of external momenta. This possibility can be an alternative to the unitarity-based generation of the integrand, which, by construction, requires tree-level amplitudes evaluated at complex (yet on-shell) external momenta.
A third possibility is represented by the use of the reconstructed tensor integrand as "preprocessed" integrand for the reduction algorithm. Namely, the tensorial reconstruction of the integrand, by definition, amounts to disentangling the part of the integrand depending on the loop momentum from the one depending only on the kinematic variables. The dependence on the kinematics is carried by the tensorial coefficients, which have to be evaluated only once per phase-space point. Therefore, given any one-loop integrand, one can reconstruct its tensorial structure at a given phase-space point, and use the reconstructed expression as input for the reduction procedure at the integrand level. The numerical evaluation of the preprocessed integrand can be more effective than the evaluation of the original integrand, because the kinematic information stored in the tensorial coefficients is constant during the reduction algorithm, as it evolves simply through the variation of the loop-momentum used for the sampling. Within the framework of the preprocessed numerator, the tensorial coefficients play the role of an algebraic alternative to a caching system.
The paper is organized as follows. The reconstruction algorithm is discussed in Section 2. Section 3 describes the various applications of the method, with particular emphasis on the "rescue-system" option. In this Section we also describe an example of implementation by means of existing codes such as samurai and Golem 95. Finally in Section 4 we present our conclusions.

Algorithm
Within the dimensional regularization scheme, any one-loop n-point amplitude can be written as We use a bar to denote objects living in d = 4 − 2ǫ dimensions, following the prescription / q = / q + / µ withq 2 = q 2 − µ 2 . Further, we use the notation f (q) as short-hand notation for f (q, µ 2 ).
Our task is to rewrite the numerator function N (q) as a linear combination of tensors of increasing rank, up to the maximum power of the integration momentum appearing in the numerator.
We first focus on the part of the numerator N (q) that depends only on the 4-dimensional part of the loop momentum q. Let's assume that the numerator has at most R powers in the integration momentum q. We aim at building numerically the tensorial representation which reproduces N (q) before integration for each value of q. In the most general case, we have an expression of the form where the contraction of the indices µ i = 1, . . . , 4 is performed with the trivial metric (1, 1, 1, 1). For r = 0, we indicate the constant term as C 0 . We observe that the reconstruction of Eq. (2.2), namely the calculation of all coefficients C µ 1 ...µr , is independent of the number of denominatorsD i appearing in the original amplitude. It will however depend on the specific phase space point or helicity configuration that we want to process, in the same way as the numerator functions of other numerical methods.
In order to determine all the coefficients, we can simply evaluate both sides of Eq. (2.2) for an arbitrary set of values of the integration momentum. Those values can be chosen as real four-momenta, thus allowing the treatment of numerators depending on a real integration momentum.
It is useful to rewrite N (q) by separating the tensorial components. Each of the terms in Eq. (2.2) can be written as a multivariate polynomial in the components of q, where q 4 denotes the energy component (2.3) Here, the notation ⊢ indicates that the indices i j have to form an integer partition of r.
We can now compute the coefficientsĈ (r) ; the conversion fromĈ → C is easy since each component of C µ 1 ...µr contributes to one particularĈ (r) i 1 ...i 4 where the tuple (µ 1 , . . . , µ r ) contains exactly i j occurrences of the number j. Conversely, when contracting the tensor C µ 1 ...µr with a tensor integral of rank r, one has to take into account thatĈ and therefore n r = {1, 4, 10, 20, 35, 56, 82} for r = 0, 1, . . . 6 respectively. To reconstruct the tensorial coefficients of Eq.(2.2) we will then consider the numerator for four-dimensional loop momentum q, N (q) = N (x, y, z, w), as a multivariate polynomial of degree R in the four variables x, y, z and w, being the components of q, i.e. q µ = (x, y, z, w). We will determine all the coefficients by sampling q, (hence its four components) in a bottom-up approach, starting from the simplest choice of q, namely the null-vector, and proceeding with sets of momenta q where one variable at a time is different from zero.

Level-0
The coefficient C 0 can be immediately determined from the identity: where q = (0, 0, 0, 0). This trivially completes the calculation of the constant term.

Level-1
We can subtract N (0) from N (q), and define In this case we may consider four polynomial systems, generated respectively by choosing q with only one non-vanishing component, whose solution leads to the determination of l 1 = 4R coefficients. Using q = (x, 0, 0, 0), we extract only the R coefficients that are proportional to the first component of q. Therefore our system reduces to and can be easily solved by choosing R different values for x. We repeat the same operation for q = (0, y, 0, 0), and proceed analogously with the remaining two choices of q with only one component different from zero.

Level-2
After completing Level-0 and Level-1, we know 4R + 1 coefficients, and we can again subtract the corresponding terms from N (q), to obtain We proceed with the calculation of the l 2 = 3R(R − 1) coefficients multiplying two nonvanishing components of q. They will be found from six systems, each containing R(R−1)/2 coefficients. We begin with q = (x, y, 0, 0), and with this choice of q, N (2) (q) reduces to: x a y bĈ (R) ab00 , (2.10) where a and b are different from zero. To extract the coefficients, we solve straightforwardly a system generated by sampling R(R − 1)/2 pairs (x, y). Then we choose q = (x, 0, z, 0), and get the generating polynomial for the second system, which yields again R(R − 1)/2 coefficients. We repeat the same sampling algorithm for the remaining four cases.

Level-3
The first step of Level-3 is the subtraction of the known coefficients from N (q), defining the polynomial This time, we proceed with the calculation of the l 3 = 2R(R − 1)(R − 2)/3 coefficients by multiplying three non-vanishing components of q. They will be found from four systems, each containing R(R − 1)(R − 2)/6 coefficients. We begin with q = (x, y, z, 0), so that N (3) (q) reduces to: where a, b and c are different from zero. By sampling R(R−1)(R−2)/6 triplets (x, y, z), we generate a system for the extraction of the coefficients. Analogous to level-2, we continue with the other three polynomials generated when N (3) (q) is evaluated respectively for q = (x, y, 0, w), q = (x, 0, z, w), and q = (0, y, z, w). The solutions of the associated systems complete the reconstruction of all coefficients multiplying three non-vanishing components of q.

Level-4
The final step is represented by the determination of the coefficients multiplying four non-vanishing components of q. They form the polynomial defined as This polynomial is evaluated at q = (x, y, z, w), At the end of Level-4, the tensorial reconstruction is complete: the original numerator N (q) is rewritten as a combination of tensors formed by products of loop momenta q. The tensorial coefficients multiplying each tensor store the information depending on the kinematics. We will denote the reconstructed numerator by N (q) . For example, in the case of a numerator N (q) of rank six, the reconstruction of N (q) requires the calculation of 210 coefficients, obtained by evaluating N (q) 210 times. More details on the combinatorics are given in Appendix A. We would like to point out that by construction the proposed algorithm does not introduce any spurious sources of instabilities. As already mentioned, N (q) can be employed in several ways. First, it can be used as rescue system to treat unstable configurations in the proximity of vanishing Gram determinants, because the use of a tensor basis, rather than a scalar one, avoids the appearance of spurious potentially singular coefficients. Second, as N (q) can be constructed from products of trees evaluated at real momenta only, it matches well with the automatic generation of the integrand directly from available tree-level generators like MadGraph [44] and HELAC [45,46], which provide tree amplitudes for real values of external momenta. Third, the reconstructed tensor integrand N (q) can be used as an integrand which is more suitable for further processing, as will be explained in detail in section 3.2.

Example of the reconstruction for rank two
If the numerator has at most two powers of q, Eq. (2.2) reads N (q) = C 0 + C µ q µ + C µν q µ q ν . (2.16) We have a total of 15 independent coefficients to determine, because q µ q ν selects the symmetric part of C µν . According to the algorithm outlined in the previous section, to reconstruct the unknown coefficient we have to solve the following systems.
• Level-0. We start from N (q) and evaluate it at the null-vector, obtaining trivially the constant C 0 .
• Level-1. Subtracting the known term, we define and evaluate it for the following four choices of q,
We again subtract the determined coefficients, define and evaluate it for the following six choices of q, The extraction of the l 2 = 3R(R − 1) = 6 coefficients is trivial in this case as well, because each monomial has to be sampled only once.
At the end of Level-2, we have found the numerical values of the 15 coefficients appearing in Eq.(2.16).

Reconstruction of the µ 2 -dependence
To account for the µ 2 -dependence of the numerator N (q), where µ 2 is the radial integration variable in the (d − 4)-dimensional subspace, the tensorial decomposition of the numerator requires more terms than in the 4-dimensional case previously discussed. The extended decomposition reads This form can be derived from the fact that rational terms only come from the combination of (d − 4)-dimensional terms with UV divergent integrals, and the listed numerators are the only ones leading to UV divergent integrals in addition to N (q) (in a renormalisable gauge). Note that the µ 2 terms can be inferred from the general relation In calculating the terms proportional to both powers of q and µ 2 , we should consider that not all terms will contribute to the final result. We will therefore limit our procedure to non-vanishing terms plus additional vanishing terms which are required to compute other pieces. For example, by power counting, we can exclude the presence of the term G (2) µ 4 from bubble and triangle diagrams. On the other hand, we cannot neglect the term G (1) µ 2 for box diagrams, even if we know that it will give a vanishing contribution: since we are operating at the integrand level, its presence is necessary in order to compute other non-vanishing terms in Eq. (2.34), starting from G (2) . Moreover, only the diagonal terms of G (4) in Eq. (2.34) are needed, since the non-vanishing part of the resulting integral is proportional to g αβ . Therefore, for n denominators, the numerator function N n (q) of Eq. (2.30) will reduce to: respectively. The presence of µ 2 amounts to dealing with multivariate polynomials with one more variable than in the four-dimensional case. We can reconstruct all the coefficients G (i) by applying a technique similar to the one employed in the previous Section, therefore, we will not repeat the discussion on the adopted sampling.
The integrals in d dimensions necessary for the evaluation of these terms have been given in several papers, see e.g. [47][48][49][50]; we list them here for completeness: Their multiplication by the coefficients G (i) completes the calculation of the rational part corresponding to the evaluation of R 2 in Refs. [50][51][52]. After this step is done, we can set µ 2 = 0 in all expressions and retain only the four-dimensional numerator in the remainder of the calculation.

Examples of Applications
The tensorial reconstruction, paired with an efficient program for the evaluation of tensor integrals, can represent a simple and efficient way of computing one-loop virtual corrections. This method is particularly efficient for processes involving integrands of maximum rank equal to four: in this case the number of coefficients to compute is small and routines for the evaluation of tensor integrals very efficient. Important processes such as pp → bbbb or pp → ttbb belong to this category.
In this Section however we explore different uses of the tensorial decomposition, in particular the advantages that this method can bring when combined with other advanced approaches for the reduction of one-loop amplitudes, such as OPP/d-dimensional unitarity.

Dealing with unstable points
One of the major challenges for the new reduction methods, in particular when compared to algebraic reduction techniques, is to deal efficiently and automatically with phase space points which are numerically unstable. This is typically in the proximity of a vanishing Gram determinant.
All the points that do not pass the reconstruction/stability test within any chosen reduction algorithm can be reprocessed by using the technique presented here. The tensorial reconstruction avoids the reduction to scalar integrals and thus the emergence of Gram determinants; on the other hand, the technique requires the evaluation of tensor integrals, which is in general more time consuming.
In order to approach continuously a kinematic configuration that is numerically unstable, we consider a 4-point rank 4 diagram made of a fermion loop with two massless and two massive vector particles attached to it. We approach the phase space configuration of a vanishing Gram determinant by taking the limit Q → 0 within the kinematics where E = m 2 + Q 2 changes with Q, while θ and m 2 are kept constant (in the following plots we set θ = 35 180 π and m 2 = 7). The Gram determinant is given by det G = 32 E 4 Q 2 sin 2 θ, while det S, the modified Cayley determinant, goes to a constant as Q → 0. The calculation is performed with samurai, using a standard d-dimensional integrandlevel reduction; the results are compared to the improved tensorial technique where we employ Golem 95 for the evaluation of the tensor integrals.
We present the case of a four-point function because this is typically the case in which the Gram determinant issue shows up. We would like to note that for kinematics far from a vanishing Gram determinant, the new method based on tensorial reconstruction and samurai have similar performances; however, for diagrams with more than four legs, the performance of samurai with "standard" reduction at the integral level is naturally better in these "safe" phase space regions.
As shown in the top panel of Figure 2, at a certain value of det G/ det S, the standard reduction technique starts deviating from the stable result obtained by the improved tensor reconstruction, exhibiting the sensitivity to the vanishing Gram determinant. The price to pay for the improved stability is an increased run time. The bottom panel of Figure 2 shows the timing evaluated for the tensorial reconstruction method in double precision, compared to an implementation in quadruple precision of the standard reduction at the integral level, each normalized to the standard reduction in double precision.
To make a comparison with quadruple or multiple precision we follow a strategy similar to the one proposed in [43]: we generate the kinematics in double precision, the scalar integral evaluation is performed in double precision as well. Then we upgrade formally the double precision kinematics to quadruple and go through the algorithm evaluating the numerator with multiple precision.
This can certainly give some benefit because near a configuration of vanishing Gram determinant the situation is such that numerator and denominator vanish, and with quadruple precision one can succeed to get a cancellation for small values of the Gram determinant. Such an improvement is evident from Figure 2 (top panel); nevertheless, within this construction we do not find further improvements increasing the precision of the algorithm from quadruple to multiple. Hence, this implementation of quadruple precision only delays the Gram determinant problem, thus reducing the affected part of the phase space rather than curing the problem at its root. One might argue that a little improvement here would mean a big improvement in terms of statistics when extensive Monte Carlo runs are performed. On the other hand, in our implementation, examining the timing indicates that quadruple precision is more time consuming then tensorial reconstruction, at least for the example considered here.
The performance of the tensorial reconstruction relies on the underlying implementation of the tensor integrals: for this purpose we employed the Golem 95 library. The timing depends on the value of det G det S , as expected given the internal structure of the Golem 95 library. When det G det S is above a certain threshold a fast, analytic reduction is used. For points where the ratio is below the threshold, the library switches to a method using the numerical quadrature of one-dimensional integral representations of the tensor integral form factors, thus avoiding the introduction of inverses of Gram determinants, but increasing the run time by up to a factor seven as compared to the standard reduction at integrand level, for this particular example.
Considering that the number of points where the standard approach fails in a crosssection calculation typically amounts to a few per mil, the additional cost is relatively small. Moreover, within the set of diagrams contributing to a specific process, only a subset will suffer from the instabilities and requires the additional evaluation time.
We have so far only discussed how the proposed method can cure numerical instabilities related to small Gram determinants, and will now elucidate strategies for an a priori detection of points requiring the application of this reduction technique. One would like to keep the fraction of points which are classified as unstable as low as possible to avoid switching unnecessarily to a slower method. In the remainder of this section we will study the performance in detecting such points for various reconstruction tests that have been proposed in the literature.
Cancellation of the poles. The cancellation of the infrared and ultraviolet poles is often used as a first indicator to identify points suffering from numerical instabilities. The panel below the top one of Figure 2 shows that there is indeed a correlation between the size of the Gram determinant and the deviation of the poles from the expected result, the latter being obtained from the tensorial method which gives a stable answer. However, one also observes a sizable spread in the values, which hampers the choice of a good threshold to discriminate unstable from stable points.
The "N = N " local test. The local test only examines the reconstructed polynomials corresponding to specific cuts. In the example we have plotted at each point the bubble cut which shows the maximum discriminant, the latter being defined as the relative difference between the reconstructed and the original polynomial. We find a steep slope in the region where the standard reduction method fails, however, the curve flattens out at smaller values of the Gram determinant.
The "N = N " global test. A stronger correlation can be found by looking at the global "N = N " reconstruction test. The idea behind this test is to compare the original numerator function to the reconstructed one for an arbitrary value of the integration momentum. The global test shows a clear correlation to the Gram determinant. By comparing the top plot of Figure 2 with the benchmark from the "N = N " global test of Figure 2, we can select an appropriate threshold: in this particular example, a threshold of O(1) in the discriminant is sufficient for the classification of points as "unstable".
The power test. The so called power test was first presented in [28]. It uses the fact that in the reconstruction of the numerator one typically allows for a higher power of the integration momentum in the reconstructed function than one would expect in the original numerator function. One can therefore find certain combinations of coefficients which should sum to zero if the reconstruction was successful. Moreover, with respect to other reconstruction tests, the power test has the advantage of being totally independent from the choice of the integration momentum. Similar to the case of the global N = N test we find a strong correlation between the goodness of these zeroes and the size of the Gram determinant and therefore find another trustworthy criterion for the discrimination of numerically unstable points in the presence of degenerate kinematics.

Improving the numerator sampling
Apart from dealing with numerical instabilities, even for stable phase-space points there are at least two possible applications of this tensorial reconstruction.
After the tensorial reconstruction is performed, we can feed the reconstructed numerator rather than the original one to the reduction algorithm. The added time needed to reconstruct the numerator can be compensated by a faster evaluation of the numerator during the reduction. For this reason, in the case of long expressions for the numerator function, processing the reconstructed tensor can be more efficient than the original numerator.
In order to address this issue, we performed a simple test: given a phase space point far from singularities, we consider a 6-point one-loop amplitude with a dummy numerator of rank R made of one line of code by means of sums and powers of various scalar products. We can increase linearly the complexity of the numerator by adding identical copies of the same line and monitor the run time for the evaluation of the amplitude with or without a preliminary reconstruction of the tensor integrand N (q) . The results for the time ratio between the two methods for increasing sizes of the numerator are given in Table 1. It is interesting to observe that, in the case of rank R = 4, we can easily gain a factor of 2 in the overall timing.  Table 1: Timing for a rank 4 and rank 6 six-point function with a numerator of N lines. We display the ratio between the run times of the "hybrid" method and the standard reduction. In the "hybrid" method we compute the reconstructed tensor N (q) and use it for the reduction in place of the original numerator: already for a numerator of about N = 100 lines, we get a significant improvement in computation time.
As a further application, the method allows for a sampling based on real values of the integration momentum: this feature can be used to facilitate the adaptation of tree-level generators to the task of producing one-loop numerator functions.

Implementation with samurai and Golem 95
For all the examples and calculations contained in this paper, the new techniques have been implemented using samurai for a d-dimensional integrand-level reduction, and Golem 95 for the evaluation of the tensor integrals. However the validity of the methods presented goes beyond our specific implementation and can be used in several alternative frameworks.
Coming back to our implementation, it required updates within samurai which will be part of a future release of the package. In particular: Rescue System: Unstable points, detected by means of one of the available tests described above, will be automatically reprocessed using the tensorial decomposition. The evaluation of tensor integrals is performed by Golem 95. Tensorial numerator: A flag will allow to reconstruct the tensorial integrand N (q) , to be processed by the reduction algorithm in place of the original numerator. This is useful in the case of long expressions for the numerator function.

Conclusions and Outlook
We presented a new approach to the reduction of one-loop amplitudes in which we reconstruct the tensorial expression of the scattering amplitudes by means of a sampling in the integration momentum at the integrand level. This approach could be used within existing techniques for the reduction of one-loop multi-leg amplitudes in several ways; for example to deal with numerically unstable points, to allow for a sampling of the numerator function based on real values of the integration momentum, and to optimize the numerical reduction in the case of long expressions for the numerator functions.
We also illustrated how the existing reconstruction tests can be employed efficiently to detect potentially unstable phase space points. We believe that this method can represent a viable option as a "rescue-system" to deal with unstable phase space configurations: the increase in computation time required by the evaluation of tensorial "master" integrals appears to be less costly than reprocessing the unstable points with quadruple-or multiprecision routines.
The tensorial reconstruction has been implemented and successfully tested within samurai and Golem 95 and will be part of the next release of these programs. Hence, the largest system appears at rank six and is of size 20.