Scattering AMplitudes from Unitarity-based Reduction Algorithm at the Integrand-level

SAMURAI is a tool for the automated numerical evaluation of one-loop corrections to any scattering amplitudes within the dimensional-regularization scheme. It is based on the decomposition of the integrand according to the OPP-approach, extended to accommodate an implementation of the generalized d-dimensional unitarity-cuts technique, and uses a polynomial interpolation exploiting the Discrete Fourier Transform. SAMURAI can process integrands written either as numerator of Feynman diagrams or as product of tree-level amplitudes. We discuss some applications, among which the 6- and 8-photon scattering in QED, and the 6-quark scattering in QCD. SAMURAI has been implemented as a Fortran90 library, publicly available, and it could be a useful module for the systematic evaluation of the virtual corrections oriented towards automating next-to-leading order calculations relevant for the LHC phenomenology.


Introduction
With the beginning of the experimental programs at the LHC, the need of describing particle scattering events with high accuracy becomes more pressing. On the theoretical side, perturbative calculation within leading order precision cannot be sufficient, therefore accounting for effects due to next-to-leading order corrections becomes mandatory.
The next-to-leading order (NLO) corrections to an n-parton final state process receive contributions from two sources: the one-loop correction to the (2 → n)-scattering, due to the exchange of an internal virtual particle; and the tree-level scattering (2 → n + 1), due to the real emission of an extra parton. Each contribution contains divergencies which cancel mutually in the final result where they are combined.
The increasing computational complexity of one-loop amplitudes, when the number of particles involved in the scattering increases, has limited the possibility of developing an automated multi-process evaluator for scattering amplitudes at NLO. The available results have been so far computed on a process-by-process basis, but, due to the recent advances in computational techniques for high-energy physics, that possibility is now at the horizon.
It is well known that any one-loop amplitude can be expressed as a linear combination of a limited set of Master Integrals (MI) [70,71]: therefore, the evaluation of one-loop corrections reduces to evaluating the coefficients that multiply each MI. Aiming at the full reconstruction of one-loop amplitudes through such a decomposition, several automated packages have appeared, either in public releases like CutTools [72] and Golem [73], or in private versions such as the routines described in [41], [74] and [75], and codes like BlackHat [76], Rocket [77], and Helac-1Loop [78].
The development of novel numerical techniques have received a boost by the combination of three important ideas: i) universal four-dimensional decomposition for the numerator of the integrand for any one-loop scattering amplitudes [79,80]; ii) four-dimensional unitarity-cuts, detecting only the (poly)logarithmic structure of the amplitude, known as the cut-constructible term [53,81] (see [82] for a more comprehensive list of references); iii) unitarity-cuts in d-dimension, yielding the complete determination of dimensionally regulated one-loop amplitudes [83][84][85][86][87][88].
The first two ideas merged in the by-now known as OPP-approach [80,89], proposed by Papadopoulos, Pittau, and one of us, where the multi-pole decomposition of the numerator of any Feynman integral is achieved by a polynomial sampling that exploits the solutions of generalized unitarity-cuts.
In the context of four-dimensional unitarity, the problem of computing the cut constructible term and the rational term, that escapes the four-dimensional detection, are necessarily considered as separate issues. The reconstruction of the latter usually requires information from an extra source. When not obtained from the direct calculation of Feynman integrals, the rational term can be reconstructed by adding a piece derived from the cut-constructible part (for instance, the overlapping-term within the on-shell method [90], or the R 1 -term within the OPP-approach [91]), and a remaining piece computed through an additional tree-level like construction (for instance, the BCFW-recursive term within the on-shell method [90], or the R 2 -term within the OPP-approach [91][92][93]).
In this paper we present samurai, a tool based on a hybrid algorithm for the numerical computation of one-loop amplitudes. samurai relies on the extension of the OPP-polynomial structures to include an explicit dependence on the extra-dimensional parameter needed for the automated computation of the full rational term according to the d-dimensional approach, and makes use of a polynomial interpolation based on the type of Discrete Fourier Transform (DFT) described in [94].
We aim at producing a versatile code which could deal with any one-loop corrections, in massless as well as massive theories. Our reduction algorithm can process both (numerator of) Feynman integrals, proper of diagrammatic methods, and products of tree-level amplitudes, as adopted in the framework of unitarity-based techniques. For a complete reconstruction of the rational term, the input should contain an explicit dependence on the dimensional-regularization parameters. In fact, it is expected to have a polynomial behavior in µ 2 , being µ the radial integration variable in the extra-dimensional subspace, and in ǫ ( = (4 − d)/2 ) according to the choice of the regularization scheme. The result is given as Laurent expansion in ǫ up to the finite-order, and accounts for the full rational terms.
samurai is implemented as a Fortran90 library, publicly available at the webpage: http://cern.ch/samurai and it is linked to OneLOop [78] and QCDLoop [95] for the numerical evaluation of the MI. We applied it to a series of known processes, like the four-, six-photon and eight-photon scattering in QED, the QCD virtual corrections to Drell-Yan, to the leading-color amplitude for V + 1jet production, to the six-quark scattering, q 1q1 → q 2q2 q 3q3 , and to the contributions of the massive-scalar loop-diagrams to the all-plus helicity five-and six-gluon scattering.
In particular, for the virtual corrections to q 1q1 → q 2q2 q 3q3 [52], we also considered the reduction of automatically generated integrands, by interfacing samurai with an infrastructure derived from golem-2.0 [96], which provides numerators of Feynman integrals. These examples are thought to be used both as a guide to understand the samurai framework, and as templates to generate the codes for other calculations.
In the context of collaborations among different groups aiming at automated NLO calculations relevant for LHC phenomenology [97], and, therefore, providing complementary structures to be interfaced [98], samurai could constitute the module for the systematic evaluation of the virtual corrections.
The paper is organized as follows. The reduction algorithm is discussed in Section 2; Section 3 describes the key-points of the samurai library, while a series of applications are illustrated in Section 4. In Section 5, we resume our conclusions.

Reduction Algorithm
The reduction method is based on the general decomposition for the integrand of a generic one-loop amplitude, originally proposed by Papadopoulos, Pittau and one of us [80,89], and later extended by Ellis, Giele, Kunszt and Melnikov [87,88]. Within the dimensional regularization scheme, any one-loop n-point amplitude can be written as We use a bar to denote objects living in d = 4 − 2ǫ dimensions, following the prescription Also, we use the notation f (q) as short-hand notation for f (q, µ 2 ).

Integrands
samurai can reduce integrands of one-loop amplitudes which can be defined in two ways, either as numerator functions (sitting on products of denominators), or as products of treelevel amplitudes (sewn along cut-lines). The former definition accommodates a reduction based on a diagrammatic method, while the latter is proper of a unitarity-based technology. According to the chosen dimensional regularization scheme, the most general numerator of one-loop amplitudes N (q, ǫ) can be thought as composed of three terms, 3) The coefficients of this ǫ-expansion, N 0 , N 1 and N 2 , are functions of q ν and µ 2 , therefore in our discussion, except when a distinction between them is necessarily required, we will simply talk about N , giving as understood that the same logic would apply to each of the three contributions N i .

Decomposition
According to [80,89], the numerator N (q) can be expressed in terms of denominatorsD i , as follows where i << m stands for a lexicographic ordering i < j < k < ℓ < m. The functions ∆(q) = ∆(q, µ 2 ) are polynomials in the components of q and in µ 2 . By using the decomposition (2.4) in Eq.(2.1), the multi-pole nature of the integrand of any one-loop n-point amplitude becomes trivially exposed, which, as we will see, is responsible of the decomposition of any dimensional regulated one-loop amplitude in terms of Master Integrals (MI) associated to 4-, 3-, 2-, and 1-point functions, respectively called boxes, triangles, bubbles, and tadpoles.

Polynomial Structures and Discrete Fourier Transform
The calculation of a generic scattering amplitude amounts to the problem of extracting the coefficients of multivariate polynomials, generated at every step of the multiple-cut analysis. To determine these coefficients we implement a semi-numerical algorithm whose main features are: • the extension of the OPP-polynomials [80,89] for quadruple-, triple-and double-cut to the framework of d-dimensional unitarity [87,88]; • the parametrization of the residue of the quintuple-cut affecting only the polynomial dependence on the extra-dimension scale [99]; • the numerical sampling of the multiple-cut solutions according to the type of Discrete Fourier Transform described in [94].

Polynomials
In this section we review the interpolation of the polynomial ∆(q), appearing in Eq.(2.3), implemented in samurai.
For each cut, we decompose q, namely the 4-dimensional part ofq, into a specific basis of four massless vectors e i [57,79,80], such that and where e 1 and e 2 are real vectors, while e 3 and e 4 are complex.
The massless vectors e 1 and e 2 can be written as a linear combination of the two external legs at the edges of the propagator carrying momentumq + p 0 , say K 1 and K 2 , The massless vectors e 3 and e 4 can be then obtained as, (2.10) In the case of double-cut, K 1 is the momentum flowing through the corresponding 2-point diagram, and K 2 is an arbitrary massless vector. In the case of single-cut, K 1 and K 2 cannot be selected from the diagram, and are chosen as arbitrary vectors. After defining the basis adopted for decomposing the solutions of the multiple-cuts, we can list the corresponding polynomial functions, whose variables are the components of the loop-momentum not-constrained by the cut-conditions.

Quadruple Cut
The residue of the quadruple-cut,D i = . . . =D ℓ = 0, defined as, is parametrized as, 14) where K 3 is the third leg of the 4-point function associated to the considered quadruple-cut.

Triple Cut
The residue of the triple-cut,D i =D j =D k = 0, defined as, is parametrized as,

Double Cut
The residue of the double-cut,D i =D j = 0, defined as, can be interpolated by the following form,

Single Cut
The residue of the single-cut,D i = 0, defined as, can be interpolated as follows, (2.20)

Discrete Fourier Transform
As proposed in [94], the coefficients of a polynomial of degree n in the variable x, say P (x), defined as, can be extracted by means of projections, according to the the Discrete Fourier Transform. The basic procedure is very simple: 1. generate the set of discrete values P k (k = 0, ..., n), by sampling P (x) at the points 2. using the orthogonality relation each coefficient c ℓ finally reads, The extension of the DFT projection to the case of multi-variate polynomials is straightforward.
As one can notice the formula for the coefficients c ℓ , although simple, diverges when ρ goes to zero. By using the parametrization in Eq.(2.6), the radius ρ happens to be constrained by the on-shell cut-condition. Depending on the external invariants and internal masses, the dangerous value ρ = 0 might occur. In a previous work [94], we described a safer sampling, which significantly reduces the numerical instabilities arising from the vanishing of ρ. We do not repeat the same discussion here, but recall that the sampling of the multiple-cut solutions used for the polynomial interpolation of the triple-and double-cut residues within samurai are chosen according to that algorithm. By using the DFT solutions as described in [94], we sample the numerator functions exactly as many times as the number of the unknown coefficients, without needing additional sampling points to improve the numeric precision, which would demand more computing time.

Amplitude and Master Integrals
The knowledge of all the coefficients appearing in the polynomials ∆ ijkℓm , ∆ ijkℓ , ∆ ijk , ∆ ij , and ∆ i implies the following expression for the one-loop n-point amplitude, where, beside the scalar boxes, triangles, bubbles and tadpoles, the other master integrals are [84,100] d dq The last two master integrals, J ij , respectively a linear and a quadratic 2point function, appear as a consequence of the polynomial structure of ∆ ij (q), defined in Eq.(2.18), which was chosen to have no singularity in presence of vanishing external invariant [89]. The vector e 2 entering their definition is an element of the loop-momentum basis, defined in Eq.(2.6), and used for the solutions of the double-cutD i =D j = 0. Also, because of the monomial parametrization of the quintuple-cut residue, ∆ ijkℓm (q), given in Eq.(2.12), the decomposition of the amplitude in terms of MI, Eq.(2.26), is free of scalar pentagons, as already noticed in [99].

Running samurai
In this section we give some details about using samurai. All the files are available on the webpage: The archive samurai v1.0.tar.gz contains the files for the samurai library, several examples of calculations, and also the routines for the evaluation scalar integrals QCDLoop [95] and OneLOop [78].
1. Download the archive samurai v1.0.tar.gz and extract the files. They will be copied in a folder called /samurai.

2.
Run the Install script. It will compile all useful routines and organize them. All routines are written in Fortran 90 and the default compiler is gfortran. In order to change compiler (or compiling options), the user should edit all the makefile commands.
After running the Install script, you will find four subfolders within the /samurai directory: the subdirectory named /libs will contain all the libraries, namely the

Initialization
To initialize the samurai library, one needs to choose the arguments of the subroutine initsamurai call initsamurai(imeth,isca,verbosity,itest) which specify the the type of input to reduce (imeth), the routines for the numerical evaluation of the scalar integrals (isca), the details of the output (verbosity), and the test to apply to the reconstruction (itest): • imeth -samurai can reduce integrands of one-loop amplitudes defined either as numerator of diagrams sitting on products of denominators, specified with imeth=diag; or as products of tree-level amplitudes sewn along cut-lines, specified with imeth=tree.
• verbosity -The level of information printed in the file output.dat can be chosen with the value of verbosity: verbosity=0, no output; verbosity=1, the coefficients are printed; verbosity=2, the value of the MI's are printed as well; verbosity=3, the outcome of the numerical test appears.
• itest -This option is used to select the test to monitoring the quality of the numerical reconstruction. The possibilities are: itest=0,1,2,3 to have respectively none, the global (N = N )-test, the local (N = N )-test, and the power-test, which are described in Sec. 3.4 While imeth=diag supports all the options for itest, the choice imeth=tree allows only itest=0,2.

Integrand definition
After selecting the routines for the scalar integrals and the reduction technique, the user should provide information about the integrand, by specifying the numerator and the denominators.
The denominators of the diagram to be reduced are defined through the subroutine InitDenominators which generates the lists of internal momenta Pi and squared masses msq characterizing each propagator: call InitDenominators(nleg,Pi,msq,v0,m0,v1,m1,...,vlast,mlast) The arguments of the subroutine, labeled as input/output ([i/o]) according to their role, are: In the notation Pi(i,m), the first index, i=0,...,nleg-1, runs on the set of the denominators; while the second index m=1,...,4, runs over the components of the vector, with the energy being given as 4 th component. • v0, m0 -[i]. The vector v0 and the mass m0 are assigned to the first denominator.
• vlast, mlast -[i]. The vector vlast and the mass mlast are assigned to the last denominator.

Reduction
Having defined the integrand denominators, characterized by Pi and msq, the actual reduction of the input (xnum) is performed by the library samurai, call samurai(xnum,tot,totr,Pi,msq,nleg,rank,istop,scale2,ok) which writes the total result of the reduction in tot. For convenience, the rational term is also separately written in totr.
Here comes the detailed description of each argument: The numerator of the diagram is defined in an external function, whose name can be decided by the user, but with fixed arguments. Hereby we adopt the dummy name xnum.
The complex function xnum(icut,q,mu2) is the integrand to be reduced. The arguments of the function xnum(icut,q,mu2) are: icut, an integer labeling the cut, where each digit corresponds to a cut-denominator in descending order (ex. icut= 3210 corresponds to the quadruple-cutD 0 =D 1 = D 2 =D 3 = 0 ); q, the virtual four-momentum, q (with the energy given as 4 th component); and mu2 the extra-dimensional mass-scale, µ 2 .
When imeth=diag, xnum is expected to have the form of a numerator, hence being polynomial in q and µ 2 . In this case xnum is a unique function to be processed at every level of the top-down reduction by cycling on icut, but does not depend on the considered cut.
When imeth=tree, xnum is expected to be formed by the product of tree-amplitudes, therefore the presence of propagators it is also allowed. In this case, xnum is not unique, but should change according to the considered cut. Therefore, the value of icut yields a selective access to the proper integrand within the same function.
• tot - [o]. The complex variable tot contains the final result for the integrated amplitude of numerator xnum. The finite part, that also includes the rational term, will be stored in tot(0), while tot(-1) and tot(-2) contain the single and double poles, respectively.
• totr - [o]. For the purpose of comparisons and debugging, we also provide the rational part totr alone. This complex number is the sum of all contributions coming from integrals in shifted dimensions, namely all contributions that contain a dependence from µ 2 in the reconstructed integrand.
• rank -[i]. This integer value is the maximum rank of the numerator. This information is extremely valuable in order to optimize the reduction and improve the stability of the results. Using this information, we can simplify the reconstruction of the numerator by eliminating contributions that do not appear in the reduction. If the information about the rank of the integrand is not available, rank should be set equal to nleg.
• istop -[i]. This flag stops the reduction at the level requested by the user. istop is an integer, whose range of values is from 1 to 5.
istop=5,4,3,2,1 will interrupt the calculation after determining pentagon, box, triangle, bubble, and tadpole coefficients respectively. This procedure can be particularly useful to improve the precision of calculations when one knows a priori that a particular set of integrals does not contribute to the considered process.
• scale2 - [i]. This is the scale (squared) that is used in the evaluation of scalar integrals.
• ok - [o]. This logical variable carries information about the goodness of the reconstruction. The default values is ok=true, and it is set to ok=false when the reconstrucion test fails.
As stated in Section 4.7.1, the generic one-loop integrand can be polynomial in ǫ up to the second-order. Each coefficient of the ǫ-decomposition can be assigned to a specific function, i.e. xnum0, xnum1, xnum2, which can be independently processed.

Reconstruction Tests
There are three different ways of monitoring the quality of the coefficients reconstructed by samurai.

Global (N = N )-test
The first option (itest=1) is the so-called "N = N " test on the reconstructed expression for the numerator functions, which was already discussed in [80,89]. It is based on the equality given by Eq. (2.4), between the original numerator in the l.h.s. and the reconstructed one in the r.h.s., evaluated at an arbitrary value ofq. A possible drawback of this precision test lies in the fact that the coefficients of tadpoles and bubbles in Eq.(2.4) multiply a large set of denominators: for a six-point function, each tadpole coefficient multiplies five denominators, namely a term proportional to masses or momenta, q, raised to ten powers, that can be huge in some cases or very small in other situations. This might have the effect of hiding the contribution of some coefficients or, as happens more frequently, might yield to overestimating the error in the reconstruction.

Local (N = N )-test
A second check is a "local N = N " test (itest=2), regarding the reconstruction of each polynomial ∆(q), respectively defined in Eqs. (2.11, 2.13, 2.15, 2.17, 2.19). In this case the value ofq used for the numerical check is chosen among additional solutions of the considered multiple-cut, which have not participated to the determination of ∆(q) itself. This option is suitable for a unitarity-based calculation (imeth=tree).

Power-test
A third option (itest=3) for testing the precision of the reconstruction is the "power test". We can observe that the maximum powers in q in the r.h.s and l.h.s of Eq.(2.4) are different: the reconstructed side can contain terms with high powers of q that are not present in the original numerator. Therefore it is clear that the overall coefficients in front of these terms should vanish. The reconstructed expressions in general are not simple, since they involve pieces coming from the polynomial spurious terms multiplied by the denominators. However, for each choice of the rank and number of denominators, there is at least one simple set of coefficients that sum to zero exactly. Moreover, this set is the lowest one in the reconstruction and therefore it carries information about any loss of precision at previous steps in the reduction. If the difference between the rank and the number of denominator is equal to three (nleg-irank =3), the sum of all the coefficients of three-point scalar integrals should be zero, namely: where the sum is over all possible triple cuts. Analogously, if the difference between the rank and number of denominator is equal to two (nleg-irank=2), the sum of the coefficients of two-point scalar integrals should be zero, namely: where the sum involves all double cuts. Finally, if the difference between the rank and number of denominator is equal to one, (nleg-irank=1), the sum of the coefficients of the tadpole scalar integrals should be zero, namely: where the sum involves all single cuts. The situation is slightly more complicated for maximum rank when difference between the rank and number of denominator is equal to zero. If (nleg-irank=0), we should consider all the one-point spurious coefficients As a final remark, we observe that the outcome of the "power test" does not depend in any way from the choice of the integrated momentumq, unlike the previous two methods.
The threshold values for the reconstruction checks can be set in the file ltest.dat, to be located in the directory where the call to initsamurai is made. The phase-space points failing the tests (ok=false) are stored in the file bad.points, in the same directory. In principle they could be re-processed enhancing the numerical precision by compiling the samurai library in quadruple-precision.

Comments on Precision
The precision of the results obtained using a reduction algorithm at the integrand-level depends on many variables.
When the numerator is a real function of the external momenta and masses there is a simple way to establish the quality of the reduction: real functions give rise to real coefficients of MI. In this case, the error on each coefficient can be estimated by the size of the imaginary part, that should vanish.
More generally, the quality of the reconstruction can be quantified by the ratio of the difference between the exact calculation (analytical or multi-precision) and the reconstructed one, and the former, evaluated over a large set of unweighted points. This procedure gives a good indication, but it is not always safe, because the error on the prediction in a calculation based on the importance sampling could suffer from the accumulation of bad points in the neighborhood of higher weights.
We identify three kinds of possible instabilities, which could be all controlled by adopting quadrupole or multiple precision routines. The first kind of instabilities is related to the well known problem of the vanishing of the Gram determinants, inducing an enhancement of the coefficients of the MI carrying such pathological kinematic factor. They can be monitored by the tests implemented in samurai, and the dangerous cases could be dealt with by introducing branches to dedicated reduction routines, hence without making use of the multiple precision. The second kind corresponds to big cancellations among the contributions from different diagrams in the same calculation. On-shell methods, which work with purely gauge invariant objects, seems to represent the best option to avoid such problem. The third type of instability can occur when the values of internal masses are sensibly larger then the phase-space invariants. In this case, both the cut-constructible part and the rational term are large but their sum remains relatively small. This in principle could be cured with a change of the integral basis where the cancellations are built-in.
Our tool does not switch automatically between double and quadruple precision. The running in the latter case is time-consuming, therefore, along the lines of the above considerations, we are investigating a more systematic treatment of the problematic configurations, which goes beyond the scope of this version of the code, and will be the subject of a future publication. samurai can process two different kinds of input, according to the strategy adopted for the generation of the integrand. In the Feynman diagrams approach one should provide a set of numerator functions, each accompanied by a corresponding list of denominators. On the other hand, in the generalized unitarity approach the input will be in the form of products of tree-level amplitudes. In the following we describe some calculations performed within both frameworks.

Examples of Applications
In several cases, we use Rambo [101] for generating phase-space points.

Four-photon Amplitudes
This example is useful to verify the proper reconstruction of the rational term. The leading term of the process γγ → γγ in QED proceeds via fermion-loop [102,103]. We treat both cases of massless and massive fermion. The four-photon amplitudes get contributions from the 6 Feynman diagrams representing the possible permutations of the 4 photons attached to the fermion loop. Indeed, only 3 permutations are independent and need to be evaluated, because loops related by flipping the fermion line give the same answer. Let us consider the diagram with the photons labeled in clockwise order 1234, carrying the following denominators, and numerator, where L i is the 4-dimensional part ofL i (= L 1 + µ). Note that now the whole expression can be evaluated numerically in terms of the four dimensional complex variable q and the real variable µ 2 . Using samurai, it is easy to see that: the term proportional to µ 2 m 2 in Eq.(4.4) gives rise to null integrals and does not contribute, the terms proportional to µ 2 q µ q ν are not individually zero but they cancel when summing over all contributions; and finally that the µ 4 -term gives the correct rational term.
The construction of the amplitudes follow closely the one that we used for the four photons. Out of the 120 contributing diagrams, all containing up to rank-6 tensor integrals, only 60 need to be computed. We can construct all of them as permutation of just one diagram. In the massless case, we consider the diagram with the photons in the clockwise order 123456, whose corresponding numerator reads, whereL 1 =q ,L 2 =q + p 2 ,L 3 =q + p 23 ,L 4 =q + p 234 , L 5 =q + p 2345 ,L 6 =q + p 23456 . (4.6) This example turns out to be challenging for the reduction algorithm, because each diagram separately admits a non-trivial reduction with non-vanishing coefficients for all the MI and rational terms but, after summing together the partial results of all diagrams, there are strong cancellations.
In the final answer all contributions coming from 2-point functions cancel out. Moreover, also the rational terms vanish. Indeed, the final expression contains only cut-constructible terms and no rational part and the knowledge of the coefficients of boxes and triangles alone is sufficient to obtain the correct answer for the total amplitude. After the dimensional decomposition of the loop momentumq, it is easy to see that all the terms containing one, two or three powers of µ 2 give rise to vanishing integrals and do not contribute. As a consequence, the only term needed in the numerical evaluation is the four dimensional one: N (q, µ 2 ) = N (q) = −Tr / L 1 / ǫ 2 / L 2 / ǫ 3 / L 3 / ǫ 4 / L 4 / ǫ 5 / L 5 / ǫ 6 / L 6 / ǫ 1 . By exploiting the knowledge that contributions from bubbles and rational terms will vanish, and therefore removing these terms from the reduction, we verify an improvement on the final result. Infact, by setting istop = 3 and isolating only the cut-constructible terms (by subtracting totr diagram by diagram), the results of samurai turn out to be in better agreement: As expected, the strong cancellations between the 60 diagrams spoil the precision of the full results even if the number of good digits for this specific phase-space point can still be considered sufficient for phenomenological studies.

Eight-photon Amplitudes
The eight-photon amplitudes [83,106,110] are an example of the functionality of samurai for many-particle scattering. The numerator function is written along the same lines as in the previous two sections. In this case, the number of diagrams is 5040. We evaluate the amplitudes for two helicity choices. By using the same sampling set as in [106], we show in Fig.1 how the numerical result produced with samurai in the MHV case, − − + + + + ++, are tight to the analytic behavior [83]. The NNMHV case, − − − − + + ++, shown in Fig.2, is a new result that confirms the structure of the amplitude discussed in [110], where only boxes do contribute.  The one-loop correction to uū → e + e − [111,112] is an easy example of a numerator with ǫ-dependent terms. The numerator of the diagram in Fig. 3 can be cast in the form
The value d = 4 in the expression above corresponds to the result in the Dimensional Reduction (DR) scheme, while the choice d = 4 − 2ǫ yields an ǫ-dependent term, according to the Conventional Dimensional Regularization (CDR) scheme. samurai can be used to reduce both the ǫ 0 and the coefficient of the ǫ 1 term individually, namely N 0 and N 1 of Eq.(2.3). It is easy to see that the inclusion of the latter has the well known effect of subtracting a contribution C F g 2 s times the tree-level amplitude from the finite part of the DR-result.

Leading-color Amplitude for V + 1jet
The leading color amplitude for the virtual NLO correction to V + 1jet production at the hadron collider is a good exercise to show the reduction in a case where the contribution of all diagrams is cast in a single numerator function.
Once the color factors have been stripped, this amplitude can be calculated at the Feynman diagram level taking the sum of the parent diagram in Fig. 4 and its pinched diagrams, i.e. four triangles and two bubbles. The presence of the γ 5 in the weak vertex imposes a choice on its treatment in dimensional regularization. Adopting the Dimensional Reduction (DR) scheme and assuming an anticommuting γ 5 one can get the right result adding a well known finite-renormalization contribution, amounting to (−N c /2) times the tree-level amplitude.
With the proper routing of the loop momentum in the diagrams, it is possible to collect all the diagrams over the four denominators of the parent box: the numerator of triangles is multiplied by the single missing denominator, while the bubbles by two denominators. In this, we should process only one numerator function. This way of collecting the diagrams does not spoil the precision of the result. Using this construction, we found perfect agreement with the expression for A 5;1 given in the Eqs.(D.1-D.5) of [113].

Five-and Six-gluon amplitudes
We choose two simple examples, namely the amplitudes contributing to the rational part of the all-plus helicity 5-gluon and 6-gluon scattering [60,[114][115][116], to show how a unitaritybased calculation can be implemented within samurai (option imeth=tree).
The diagrams involved correspond to one-loop amplitudes with external gluons coupled to a massive-scalar loop, whose integrand can be built by means of the tree-level amplitudes given in [116,117], namely

19)
A tree 5 (1 s ; 2 + , 3 + , 4 + ; 5 s )= , (4.20) where r 2 is the reference vector of the gluon-2, and p ij = k i +k j . For instance, the integrand of the quintuple-cut shown in Fig.5 can be written as, In the case of the 5-gluon amplitudes, we give the complete set of integrands, for quintuple-, quadruple-, triple-and double-cuts (istop=2), although only boxes appear in the result.
In this case, we see explicitly that triangles and bubbles do not contribute. For the same reason, in the 6-gluons case, we only give the integrands for the quintuple-and quadruplecuts (istop=4). The results of these calculations, due to the external helicity choice, are purely rational in the d = 4 limit and agrees with the results of [117].

Six-Quarks Scattering
When the number of diagrams contributing to the scattering amplitude is small, the input file that includes the numerators and the list of momenta to be processed by the reduction is fairly simple, and the calculations are feasible with a minimal amount of automation [118,119]. However, even in simple cases, a careful automation reduces the probability of introducing bugs or human mistakes in the code.
An automatized generation of the input files becomes a necessity as the complexity of the process increases. As a final example (with the diagrammatic approach), we tackle a more involved calculation, namely the one-loop QCD corrections to the 6-quark scattering q 1q1 → q 2q2 q 3q3 . The number of Feynman diagrams contributing to this process requires a fully automated approach. The amplitude for q 1q1 → q 2q2 q 3q3 involves 258 Feynman diagrams (8 hexagons, 24 pentagons, 42 boxes, 70 triangles, and 114 bubbles). Each diagram, or convenient combinations of them, should be processed by the reduction algorithm separately. The numerators and the lists of denominators required by the reduction have been generated and automatically written in a Fortran90 code, ready to be processed by samurai.
We use this example also as a first benchmark on the functionality of our framework. During the generation of the code, all Feynman diagrams contributing to the process are automatically written and organized in Fortran90 files fully compatible with the reduction library, ready to be run. In order to check our algebraic manipulations, we compute both N 0 (q) and N 1 (q) of Eq.(2.3), namely also the part of the numerator proportional to ǫ, although in an actual calculation this can be avoided by choosing the regularization scheme conveniently.
There are eight different helicity configurations that contribute to this process. Our numerical results have been compared with those obtained for the same process with golem-2.0 and golem95 [96] and we found perfect agreement.
On a Intel(R) Xeon(R) CPU X5482 3.20GHz machine, the generation of the code for the full process takes less than 10 minutes, and the result for each color summed helicity amplitude is produced in 55 ms per phase-space point. However, by avoiding the reduction of N 1 (q) with a proper scheme choice, the computing time goes down to 36 ms/ps-point.

Numerator
When working with Feynman diagrams, we prepare the numerator function N (q) by processing the output of a diagram generator symbolically with a computer algebra program; the actual computer program is written by an optimizing code generator (see also Fig. 6). This modular approach is very generic and, to a large extend, can be based on existing tools; in particular we have an automated setup using QGraf [120], Form [121] and haggies [122]. Furthermore, the matrix element generator golem-2.0 [96] has been extended to provide an interface which simplifies the use of the components mentioned above. We want to stress that the described setup is very modular and that any component in the workflow can be exchanged by alternative solutions.
As discussed in Section 2, the most general numerator of one-loop amplitudes, N (q, ǫ), can be written as, N (q, ǫ) = N 0 (q) + ǫN 1 (q) + ǫ 2 N 2 (q). (4.23) The functions N 0 , N 1 and N 2 are functions of q ν and µ 2 , therefore in our discussion, except when it is necessarily required a distinction between them, we will simply talk about N , giving as understood that the same logic would apply to each of the three contributions N i . We work with the helicity projections of the amplitude which are decomposed into subamplitudes formed by the sum of all diagrams sharing the same set of denominators. The color information is hidden from the reduction by defining the numerators of the subamplitudes, N (i) (q, ǫ), as the contraction of the numerators of the one-loop diagrams with the tree-level amplitude. If we call N {i 1 i 2 ...in} the numerator stemming from the sum of all diagrams which have (exactly) the denominatorD i 1D i 2 · · ·D in , the corresponding subamplitude would be In our implementation this product is done numerically and does therefore not add to the complexity of the expressions. In cases where the tree-level matrix element vanishes, one can always find an appropriate set of color projectors P † I P I into one-dimensional subspaces such that A † n · A n = I (P I A n ) † · (P I A n ). (4.25) where P I correspond to Wigner-Eckhard symbols. In the cases with no external color, the only projection is P 0 = 1. The objects P I · N {i 1 i 2 ...in} hence are the objects that undergo the reduction. Optionally, one can also group larger sets of diagrams into subamplitudes by also considering diagrams which contain a subset of the maximal set of denominators. The numerator of the corresponding subamplitude in the latter sense would be

Algebraic Simplification of the Lorentz Structure
In order to unravel the dependence of N (q, ǫ) on q, µ 2 and ǫ we use dimension splitting based on the 't Hooft-Veltman scheme. We define the subspaces of the regulated Minkowski space such that g µν = g µν +g µν ,ḡ µ µ = d, g µ µ = 4,g µ µ = −2ǫ, g µρg ρν = 0 (4.27) and with the corresponding projections of the Dirac matrices γ µ = g µ νγ ν andγ µ =g µ νγ ν the Dirac algebra is uniquely defined by Working with this scheme one can show [123] that after separating the four from the (d − 4) dimensional projection of each Dirac matrix one can factorize a mixed spinor line into In this notation the definition of the helicities is such that |p ± = 1 2 (1 ± γ 5 )u(p) and p ± | =ū(p) 1 2 (1 ± γ 5 ), where p and p ′ are lightlike vectors. The extension to massive vectors is straightforward by projecting each massive vector onto a sum of two lightlike vectors. The trace in Eq. (4.29) evaluates to a product of metric tensorsg µ i µ j using the usual rules for spinor traces. Since in the 't Hooft-Veltman scheme at one-loop the only d-dimensional vector is the integration momentum these metric tensors lead to factors of µ 2 and ǫ. The Lorentz indices inside the remaining, four-dimensional spinor lines are eliminated using Chisholm identities, of which we apply also a variant specific to spinor chains, where Γ and Γ ′ are strings of four-dimensional Dirac matrices and ← − Γ denotes the string in reversed order. After these steps, the numerator is suitable for efficient numerical evaluation since it is expressed entirely in terms of constants, dot products and spinor products of the form p λ |p ′ λ and p λ |/ q|p ′ −λ . The result of golem95 for the helicity configuration (q − 1 ,q + 1 , q − 2 ,q + 2 , q − 3 ,q + 3 ), at the ps-point given in Eq. showing a nice agreement (the color-average factor, 1/9, and the helicity-average factor, 1/4, are already included).
The double-and single-pole of the virtual contribution are consistent with the infrared poles amounting to [28],

Precision of Integrated Results
We have used the matrix element of the q 1q1 → q 2q2 q 3q3 amplitude for recalculating the q 1q1 → q 2q2 q 2q2 amplitude [96] by anti-symmetrizing over the final state. We have integrated the virtual matrix element with MadEvent [2,3] and compared the poles of the virtual amplitude to those of the integrated dipoles using MadDipole [34,36]. Figure 7 shows the remainder of the pole contributions which should sum up to zero. The results represent a realistic Monte Carlo integration and indicate that the precision is well under control.

Conclusions
In this work we have presented samurai, a tool for the automated numerical evaluation of one-loop corrections to any scattering amplitudes within the dimensional-regularization scheme. Its implementation is based on the decomposition of the integrand according to the OPP-approach, extended to the framework of the generalized d-dimensional unitarity-cuts technique, and on the use of the Discrete Fourier Transform as polynomial interpolation technique. We have shown how samurai can process integrands written either as numerator of Feynman integrals, like in diagrammatic methods, or as product of tree-level amplitudes, according to unitarity-based methods. In both cases, the advantage of working within a d-dimensional unitarity framework is that the result of samurai is complete and does not require any additional information for the reconstruction of the rational terms.  Figure 7: Estimate for the precision obtained from the difference between the single (resp. double) poles of the virtual amplitude and those of the integrated dipoles for q 1q1 → q 2q2 q 2q2 . The results have been obtained by integrating 10 5 phase-space points at √ s = 14 TeV, where we have used cuts on p T > 30 GeV and the rapidity η < 2.5 as well as a separation cut of ∆R > 0.8 between the final state particles. We used the CTEQ6m [124] PDF set with two-loop running for α s with a renormalisation scale of µ = i p T (i) 2 .
We discussed its application to a series of examples such as the 4-, 6-, and 8-photon scattering amplitudes in QED, the QCD virtual corrections to Drell-Yan, the leading color amplitude for V + 1jet production, the six-quark amplitudes, and contributions from massive-scalar loop to the all-plus helicity 5-and the 6-gluon amplitudes. For the sixquark scattering q 1q1 → q 2q2 q 3q3 , we also considered a fully automated reduction, from the integrand generation to the final result.
Given the versatility of the code, samurai may constitute a useful module for the systematic evaluation of the virtual corrections, oriented towards the automation of nextto-leading order calculations relevant for the LHC phenomenology.
The reduction library libsamurai and the examples are publicly available at the webpage: http://cern.ch/samurai the numerical comparisons of the 6-photon amplitudes. We like to thank Nicolas Greiner for providing the MadEvent code used for the example in Section 4.7.4. P.M. and F.T. are pleased to thank Zoltan Kunszt, Zoltan Trocsanyi and Bryan Lynn for clarifying discussions. G.O. and T.R. wish to acknowledge the kind hospitality of the Theory Department at CERN, at several stages while this project has been performed. The work of G.O. was supported by the NSF Grant PHY-0855489 and PSC-CUNY Award 60041-39 40; T.R. has been supported by the Foundation FOM, project FORM 07PR2556.