Tensor integrand reduction via Laurent expansion

We introduce a new method for the application of one-loop integrand reduction via the Laurent expansion algorithm, as implemented in the public C++ library Ninja. We show how the coefficients of the Laurent expansion can be computed by suitable contractions of the loop numerator tensor with cut-dependent projectors, making it possible to interface Ninja to any one-loop matrix element generator that can provide the components of this tensor. We implemented this technique in the Ninja library and interfaced it to MadLoop, which is part of the public MadGraph5_aMC@NLO framework. We performed a detailed performance study, comparing against other public reduction tools, namely CutTools, Samurai, IREGI, PJFry++ and Golem95. We find that Ninja out-performs traditional integrand reduction in both speed and numerical stability, the latter being on par with that of the tensor integral reduction tool Golem95 which is however more limited and slower than Ninja. We considered many benchmark multi-scale processes of increasing complexity, involving QCD and electro-weak corrections as well as effective non-renormalizable couplings, showing that Ninja’s performance scales well with both the rank and multiplicity of the considered process.


Introduction
Scattering amplitudes in quantum field theory describe the fundamental interactions between elementary particles and provide a powerful way for inferring theoretical models form highenergy phenomenology and vice-versa.At the scales probed by modern colliders, such as the Large Hadron Collider (LHC) at CERN, scattering amplitudes can be computed in perturbation theory as a Taylor expansion in the coupling constants.Leading-order results are plagued by very large theoretical uncertainties and as such they are often not reliable enough for direct comparisons with experimental results.Phenomenological studies can significantly benefit from theoretical predictions at next-to-leading order accuracy or beyond, which are however complicated by several factors, a crucial one being represented by quantum corrections to amplitudes computed via loop integrals.The calculation of these integrals can be extremely challenging, especially for processes involving many external legs and physical scales.Such processes can however be of great relevance, both for testing the Standard Model in unexplored regions of phase space and for simulating backgrounds to signals of interesting (new) physics.This makes the calculation of loop amplitudes a very active field of research.
A solution to the problem of computing generic one-loop integrals is offered by integrand reduction [1][2][3].
Integrand reduction methods rewrite one-loop integrands as a linear combination of terms in an integrand basis, each of which has five or less loop propagators and yields either a vanishing integral or a known Master Integral.The numerical evaluation of these master integrals is possible by means of public libraries such as OneLOop [4], Golem-95 [5,6], LoopTools [7] and QCDLoop [8] (the last two use the FF library [9] internally).Because the form of the integrand basis is universal and independent of the process or the number of external legs, the algorithm can be applied to any one-loop scattering amplitude in any Quantum Field Theory.The coefficients of this decomposition, also known as integrand decomposition or OPP decomposition [1], can be efficiently computed by evaluating the integrand on multiple-cuts, i.e. values of the loop momentum such that a subset of the internal loop propagator denominators vanish.This algorithm is based on repeated numerical evaluations of the integrands and the solution of the resulting subsystems of equations for the coefficients.The method has been implemented in the public codes CutTools [10] and Samurai [11], and has been used within several automated frameworks [7,[12][13][14][15][16][17][18][19] for producing a wide variety of phenomenological results.MadLoop [20], part of the MadGraph5 aMC@NLO [21] framework (abbreviated MG5aMC henceforth), is an example of such tool.It automatically generates one-loop matrix elements and computes them using both traditional OPP reduction (CutTools and Samurai) and tensor integral reduction [22,23] (as implemented in the tools Golem95 [24], PJFry++ [25] and IREGI).MadLoop features an in-house implementation of the OpenLoops method [16] using a modified version of the ALOHA [26] module to compute the components of the tensor integrand numerator.
More recently, a new approach to one-loop integrand reduction has been developed, namely the integrand reduction via Laurent expansion method [27], which elaborates on techniques first proposed in [28,29] for analytic calculations.Within this approach, the computation of the coefficients of the Master Integrals is significantly simplified by performing a Laurent expansion of the integrands with respect to the components of the loop momentum which are not constrained by the multiple-cut conditions.Since loop integrands are rational functions of the loop components, within semi-numerical calculations the semi-analytic Laurent expansion can be performed via a simplified polynomial division algorithm between the expansion of the numerator and the loop denominators.Such a technique has been implemented in the public C++ library Ninja [30], which combined to the one-loop package GoSam [15,31,32] has been used for producing several phenomenological results for complicated processes both within the Standard Model and beyond.
The Laurent expansion reduction algorithm implemented in Ninja needs as input procedures which return the leading terms of the above mentioned parametric Laurent expansions of the numerator of the integrand.Generating such input is straightforward and easy-toautomate for analytic one-loop generators such as GoSam [15,31] and FormCalc [7], but it has so far prevented other one-loop tools following a more numerical approach from using reduction via Laurent expansions.
However, as already noted in ref.s [27,30], the only explicit analytic information needed by Ninja about the integrand is its dependence on the loop momentum (and not, for instance, on the external kinematics or polarization states), which is always known in the case of tensor-based loop generators, regardless of whether the entries of the tensors are generated analytically or numerically.
We present an efficient numerical algorithm for constructing the Laurent expansions needed by Ninja directly from the (numerical) entries of loop tensor numerators.In particular, each term of these expansions can be computed by contracting the tensor numerator with specific cut-dependent tensorial projectors.In a numerical implementation, these projectors can in turn be constructed at run-time by means of simple recursive formulas, from lower to higher ranks.This algorithm has been implemented within the Ninja library and the only inputs it needs for the reduction, besides the definition of the loop propagators, are the numerical components of the tensor numerator.This allowed to interface Ninja to MadLoop whose ability to compute these components is now sufficient, as demonstrated in this paper.
In sect.2, we recall the definition of the tensor integrand and in sect.3 we briefly review the integrand reduction technique via Laurent expansion.We fix the notation and introduce the computational techniques for building symmetric tensors in sect.4 which we use in sect.5 to derive formulas for the projection of the tensor numerator onto the coefficients of the Laurent expansion.Details on the implementation of this projection in Ninja as well as its interface to MadLoop are provided in sect.6.We present a detailed study of the stability and timing performances of the combination of MadLoop and Ninja in sect.7 and we give our conclusions in sect.8.

Tensor integrands
A generic one-loop amplitude can be written as a sum of n-point integrals of the form The numerator N of the integrand is a polynomial in the components of the d-dimensional loop momentum q, with d = 4 − 2 , while the denominators D i correspond to Feynman loop propagators and they have the general quadratic form where p i is a linear combination of external momenta and m i is the mass of the particle propagating in the loop (which can be complex-valued when treating unstable particles in the loop within the complex mass scheme [33,34]).One can split the d-dimensional loop momentum q into a four-dimensional part q and a (−2 )-dimensional part µ, 3) The numerator thus becomes a polynomial in the four-dimensional components of q and µ 2 , We define a four-dimensional tensor numerator as a numerator cast into the form where the tensor coefficients Ñ (r) µ 1 •••µr are independent of the loop momentum.In principle one could consider the more general definition of a d-dimensional tensor numerator, i.e. a linear combination of terms with the form of the r.h.s. of Eq. (2.5) multiplied by powers of µ 2 .Although it is straightforward to generalize the results of this paper to d-dimensional numerators, we will only consider the four-dimensional case since tensor-based one-loop matrix element generators typically compute the contributions arising from µ 2 terms in the numerators (also known as R 2 contributions) separately using special tree-level Feynman rules [35].
In the definition in Eq. (2.5), a tensor numerator is therefore defined by a sum of tensors homogeneous in rank r, for r = 0, . . ., R. The maximum rank R satisfies R ≤ n for renormalizable theories, up to a gauge choice.In this paper we will consider a more general case, i.e.R ≤ n + 1, allowing for up to one effective non-renormalizable vertex in the loop.
In a naive implementation, a generic tensor-numerator of rank R would be defined by R r=0 4 r = (4 R+1 − 1)/3 entries, growing exponentially with the rank.However, since the tensor described by Eq. (2.5) is completely symmetric by construction, we can cast the numerator in the alternative form where the total number of µ i -symmetric coefficients N (r) R , which only grows polynomially (namely as R 4 ) with the rank.
In order to clarify the notation, we consider the following example of an arbitrarily chosen d-dimensional numerator function which can be recast into the tensorial structure where {p i } denotes the collective kinematic information of the phase-space point considered (external momenta, masses, helicity and color assignation, etc. . .).The {} sym notation indicates that the corresponding tensor components are symmetrized according to the procedure described by Eq. 4.1.The four-dimensional tensor numerator is thus obtained by setting µ 2 = 0 and identifying the Lorentz structures multiplying the loop momentum µν ({p i }) . (2.9) Several techniques have been proposed for constructing the tensor loop numerator.MadLoop, which features an independent implementation of the OpenLoops method [16], progressively builds the loop numerator polynomial by successive calls to building block functions numerically computing the loop four-momentum dependence of each vertex and propagator involved in the loop.Contrary to analytic methods, this approach makes it very difficult to reconstruct the d-dimensional µ 2 dependence of the loop numerator, so that it is important that the loop reduction algorithm works equally well with only the 4-dimensional projection N (q), in which case the missing rational terms arising from µ 2 will be reconstructed independently.
An alternative method proposed in ref. [19], addressing the case where a numerical evaluation of the integrand is available but its full polynomial structure is not known, reconstructs the entries of the tensor by sampling the numerator on several values of the loop momentum.
In this paper, we will consider a one-loop tensor integrand to be defined by the entries of the symmetric tensor numerator in Eq. (2.6), as well as the momenta p i and masses m i appearing in the loop denominators as in Eq. (2.2).Thanks to the new projection techniques introduced in this paper, these are now the only input required by Ninja for performing the corresponding loop reduction.

Semi-numerical integrand reduction via Laurent expansion
In this section we briefly review the input needed by the semi-numerical integrand reduction via Laurent expansion algorithm [27] as implemented in the C++ library Ninja [30].We make no attempt in giving a comprehensive description of this reduction method, which can instead be found in ref.s [27,30].Indeed, in this paper we are merely interested in illustrating how to provide the required input starting from a tensor integrand defined as in sect.2, while the internals of the reduction algorithm are unchanged with respect to what has already been presented in the literature.
Integrand reduction methods compute loop integrals by exploiting the knowledge of the algebraic structure of the respective integrands.In more details, any one-loop integrand in dimensional regularization can be written as a sum of contributions with five or less loop propagators, regardless of the number of external legs or the complexity of the process.The corresponding numerators, also known as residues, are polynomials with a universal, processindependent parametric form.The unknown process-dependent coefficients appearing in this parametrization can thus be found via a polynomial fit.After integration, the amplitude is expressed as a linear combination of known Master Integrals, whose coefficients can be identified with a subset of the coefficients appearing in the integrand decomposition.
An efficient way of computing the coefficients of the integrand decomposition is by evaluating the integrand on multiple cuts, i.e. on values of the loop momentum such that a subset of loop denominators vanish.Indeed, when evaluating the integrand on the generic multiple cut D i 1 = • • • = D i k = 0, the only non-vanishing contributions to the integrand decomposition are those coming form the residues sitting on the cut (i.e.vanishing) denominators, as well as from the higher-point residues having the cut denominators as a subset of their propagators.This suggested the possibility of computing the coefficients of each residue by evaluating the integrand on a subset of the solutions of the multiple-cut equations defined by its loop denominators, subtracting the non-vanishing contributions form higher-point residues and solving a system of equations for the unknown coefficients.This is therefore a top-down approach, where higher-point residues are computed first (starting form 5-point residues) and systematically subtracted from the integrand when evaluating lower-point residues.These are known as subtractions at the integrand level.This is the approach followed by the public reduction codes CutTools [10] and Samurai [11].
As we mentioned, the integrand reduction via Laurent expansion method [27] can achieve better stability and performance by exploiting the knowledge of the analytic dependence of the integrand on the loop momentum.More specifically, on top of the numerical evaluation of the loop numerator, the algorithm needs as input a numerical evaluation of the leading terms of the numerator with respect to properly defined Laurent expansions, parametric in the directions of the loop momentum unconstrained by the multiple-cut conditions.From these, the coefficients of the integrand decomposition are computed via a simplified polynomial division algorithm between the expansion of the numerator and the loop denominators and corrected by counter-terms depending on higher-point residues.These are referred to as subtractions at the coefficient level, which simplify and replace the ones at the integrand level of the original algorithm.
In the following we describe the four inputs needed by Ninja, assuming the rank R satisfies R ≤ n + 1, while in the next sections we will describe how to automatically generate them at run-time from the coefficients of a tensor numerator.In this section we consider a generic µ 2 -dependent numerator for the sake of generality, although, as already mentioned, we will later specialize to the case of a four-dimensional tensor numerator defined as in Eq. (2.6).Notice however that the µ 2 -dependence arising from the expansion of the loop momentum q must be considered in both cases.The four input functions are • the numerator function used for the cut-constructible part of 4-point residues and optional internal tests as a function of the loop momentum (notice that this is the same input as in traditional integrand reduction algorithms), • the µ 2 -expansion used for the rational part of 4-point residues, returning the terms n defined by the expansion N (q, µ 2 ) as a function of the four-vectors v ν 0 and v ν 1 n (i) • the t 3 -expansion used for 3-point and 1-point residues, returning the terms n (i,j) t 3 defined by the expansion N (q, µ 2 ) as a function of the four-vectors v ν 0 , v ν 3 , v ν 4 and the complex number β n (i,j) • the t 2 -expansion used for 2-point residues, returning the terms n defined by the expansion N (q, µ 2 ) as a function of the four-vectors v ν 1 , v ν 2 , v ν 3 , v ν 4 and the complex numbers β i , with i = 0, 1, 2 (3.7) We remind the reader that the vectors v ν i defining the expansions are cut-dependent, so that the methods for the corresponding coefficients will be called on all the relevant cuts (and possibly more than once per cut, as needed) within one loop reduction.The terms above are all those needed for calculations with R ≤ n + 1.If the rank is lower than n + 1, fewer terms are needed, and in numerical implementations one should take care that only a minimal number of terms is computed so as to optimize performances.
Any one-loop generator capable of providing a numerical evaluation for the terms in Eq. (3.3), (3.5) and (3.7), on top of the evaluation of the numerator as in Eq. (3.1), can use Ninja.We now turn to describing a method for building these expansions from a tensor numerator of the form of (2.6).The algorithm then proceeds in a purely numerical way, using as input only the (numerical) entries of a symmetric tensor numerator.Indeed, as already mentioned, the terms of each expansion (3.2)-(3.6)can be defined as contractions between the tensor numerator and cut-dependent tensors which Ninja can build at runtime by means of recursive algorithms.Since all the methods defined above have been implemented within the Ninja library for generic tensor integrands with R ≤ n + 1, this allows current one-loop tensor generators to use Ninja for the reduction, simply by providing the coefficients N defining the loop numerator as in Eq. (2.6).

Symmetric tensors
In this section we introduce some notation on symmetric tensors and recursive formulas useful for efficiently building the cut-dependent tensorial projectors appearing in our results.

Notation
Consider a set of independent vectors {v 1 . . .v k } and the symmetrized tensor product of r (not necessarily distinct) vectors v i 1 , . . ., v ir , with i 1 , . . ., i r ∈ {1, . . ., k}, namely This tensor, being completely symmetrized, only depends on the number of times each v i enters the product.As noted in ref. [36], one can exploit this and introduce a natural correspondence between symmetric tensors and polynomials.More in detail, we will use the following polynomial notation where r i is the multiplicity of occurrence of v i on the r.h.s. of the equation, with i r i = r.The conventional prefactor we introduced on the r.h.s.exactly cancels out against the equivalent permutations of the tensor indexes, which turns out to be particularly convenient for our application.This is better clarified with a couple of explicit examples 2 The notation is also useful for writing tensor relations in a compact way.As a shorthand, if T and U are symmetric tensors of identical rank r, we also define the contraction where the sum over repeated indices is restricted so as to be consistent with the definition in Eq. (2.6).

Recursive formulas
Tensors can be built recursively by multiplying lower rank tensors with vectors.For this purpose, we can define the tensor product of a rank-(r − 1) tensor T with a vector v as the rank-r tensor Notice that, even if T is symmetric, the r.h.s. will in general not be a symmetric tensor.However one can easily work out recursive formulas which build symmetric tensors from linear combinations of tensor products.The easiest recursive formula involves rank-r tensors obtained by multiplying a single vector v with itself r times, namely The easiest non-trivial case involving two vectors v 1 and v 2 is The tensor v r−1 in the first addend on the r.h.s.can in turn be built beforehand using Eq.(4.5), while the tensor v r−2 1 v 2 appearing in the second addend is instead of the same type of the one on the l.h.s.but with a lower rank.Eq. (4.6) can thus be read as a recursive formula for building symmetric tensors of the form v r−1 1 v 2 , where the recursion goes from lower to higher ranks r, starting from r = 1 which trivially reads which should be read as a recursion relation in both r and k.Indeed, the second addend involves a symmetric tensor v r−1−k 1 v k 2 which is of the same type as the l.h.s.but with total rank r−1.The tensor in the first addend may instead be rewritten as v r−k hence as a tensor of the same form as the l.h.s.but with lower values of r and k.
For our purposes we need one more recursive formula involving three vectors v 1 , v 2 and v 3 , which reads The ingredients for this recursion are, similarly as before, tensors with lower r and k, as well as tensors of the form of Eq. (4.7).
It is also worth observing that all these recursive formulas can be seen as special cases of a more general one An important observation for numerical calculations is that these recursive formulas have the nice side effect of automatically embedding a system of abbreviations based on reusing common subexpressions.Indeed, as one can see from the definition in Eq. (4.4), each entry in a tensor product of total rank r can be obtained from an entry of rank r − 1 by a single multiplication.Because our formulas are recursive on the rank and involve linear combinations of tensor products, they provide a built-in mechanism for reusing subexpressions of lower rank when building tensors of higher rank.Moreover, the possibility of reusing common subexpressions is not limited to contributions defined within the same recursive formula, but it can also be extended to contributions across different equations in a way which fits particularly well with the method we will use for building the Laurent expansions of the integrands.We will see in the next section that the leading term of a Laurent expansion can always be obtained from tensors of the form of Eq. (4.5).Next-to-leading terms, when needed, will be constructed using Eq.(4.6).As we already observed, the r.h.s. of this equation involves a lower-rank tensor of the same form as the l.h.s. and a tensor of the same form of Eq. (4.5).While the former is available simply by implementing the recursion from lower to higher ranks, the latter can instead be reused from the tensors recursively built for the leading term.An analogous strategy is also possible for Laurent expansion terms beyond next-to-leading, where one can always use tensors built in previous steps of the calculation as input for the recursive formulas (more explicit examples will be given in sect.5).This greatly reduces the total number of operations needed for the construction of these tensors, without hard-coding complex analytic formulas and while having a relatively simple bookkeeping and still being completely general with respect to the rank of the tensors appearing in the recursion relations.
In the next section we show how the Laurent-expansion terms needed by Ninja can be generated by contracting tensor numerators with tensors of the same kind as those in Eq. (4.5), (4.7) and (4.8).

Tensor projectors for Laurent-expansion terms
As we stated above, we can build the terms of the Laurent expansions described in sect.3 by contracting the tensor numerator with appropriate cut-dependent tensors, which can be seen as projectors.These can in turn be built recursively using the formulas of sect.4.2.We will illustrate the method by explicitly working out a few cases.A complete list of formulas for all the tensor projectors is given in Appendix A.
As stated above, in the following we will consider a four-dimensional tensor numerator defined as in Eq. (2.6), which thus only depends on the four-dimensional components q µ of the loop momentum.However, because Ninja implements a d-dimensional version of the integrand reduction method, the parametrization of q on the d-dimensional cut solutions (and thus its Laurent expansion) will still depend on the extra-dimensional variable µ 2 and we must therefore keep track of this dependence while building the expansions (knowing, of course, that it can only come from the loop momentum, and not from the numerator).This d-dimensional reduction yields, on top of the coefficients of the master integrals, the contributions to the rational part of the amplitude coming from the µ 2 dependence of the loop denominators, also known as R 1 .An alternative approach to the calculation of R 1 (used for instance by CutTools) is its reconstruction from the coefficients of a purely four-dimensional reduction, also known as cut-constructible part.It is worth stressing that, as one can observe from the results collected in Appendix A, the calculation of the Laurent expansion terms involving µ 2 , with the approach presented in this paper, can always be recycled from the identical terms needed for the cut-constructible part, except for the box residues where the expression is however very simple.This allows to efficiently provide the algorithm of Ninja with all the terms needed by its d-dimensional integrand reduction, while remaining completely agnostic about the µ 2 -dependence of the loop numerator within the reduction routines.
In the following, n is the number of loop propagators and R is the rank of the tensor numerator.The Laurent-expansion parameter is denoted by t, and it is always convenient to compute the terms from higher to lower powers of t.A first reason for this is that the highest powers of t are always needed, while the lower powers might not be.A second compelling reason is that, as we already mentioned, tensors built for the leading terms in t can be reused as input for building terms with lower powers of t, using the recursive formulas introduced in the previous section.

Numerator evaluation
The easiest function to provide is the evaluation of the numerator function which, using Eq.(2.5), simply amounts to where we used the notation introduced in sect.4 (in particular Eq.(4.1) and (4.3)).Each tensor q r can in turn be built using the recursive formula of Eq. (4.5), which here reads q r = q r−1 ⊗ q. (5.2)

The µ 2 -expansion
The µ 2 -expansion, only needed for R ≥ n, involves a single term for R = n and two terms for R = n + 1.Since the vector v 0 comes with a power of t, while v 1 is O(t 0 ), it is straightforward to see that the leading term in t of the expansion defined by Eq. (3.2) is As done before, we can build v R 0 recursively by means of Eq. (4.5) (5.4) For the case where R = n + 1, we also need the next-to-leading term in t, which is given by the following formula where the tensor appearing in the first addend of the r.h.s. was already built for the leading terms using Eq.(5.4), while for the second addend we can use the recursive relation in Eq. (4.6), which in this case reads The first addend of this recursive relation also depends on tensors built using Eq.(5.4) for the leading term, while the second depends on a lower-rank tensor which gives the recursion in r.

The t 3 -expansion
The t 3 -expansion has a more complicated structure due to the presence of three vectors (v 0 , v 3 and v 4 ) and the free variable µ 2 on top of the expansion variable t.More in detail, the vector v 3 comes with a power of t, the vector v 0 is O(t 0 ) and the vector v 3 has a O(1/t) term multiplied by the constant β and a O(µ 2 /t) term.Hence, the projector for the leading term is a tensor containing only v 3 , while replacing a v 3 by a v 0 decreases the power in t by one, and replacing a v 3 by a v 4 decreases the power in t by two and also adds a µ 2 term.
Since the leading and next-to-leading terms of the expansion do not involve v 4 (and thus neither µ 2 ), they have exactly the same structure as those for the µ 2 -expansion.They are and The next-to-next-to-leading terms in t are two, namely a O(t R−2 µ 0 ) term and a O(t R−2 µ 2 ) term, and their expression involves all the three vectors v 0 , v 3 and v 4 .They are given by the following formulas (5.10) It is worth making a few observations.We already mentioned that the µ 2 -dependent terms can be determined from the cut-constructible ones, and indeed the contribution ) is common between the two equations and thus only needs to be computed once.Moreover the tensor v R−1 3 v 4 , and more in general all those of the form v r−1 3 v 4 , can be computed from the recursion relation in of Eq. (4.5) which in this case depends on the tensors v r 3 already computed above for the leading term in t.The tensor v R−2 3 v 2 0 can instead be computed using the formula in Eq. (4.7) with k = 2, to be read as a recursion relation in r and depending on tensors of the form v r−1 3 v 0 already computed for the next-to-leading terms in t.Similarly, one can compute all the other terms with lower powers of t, by means of a simple power counting on the vectors v 0 , v 3 and v 4 , and building appropriate tensors with the formulas of sect.4.2.One can also check that the formulas of Eq. (4.5), (4.7) and (4.8) suffice for the calculation of all the terms down to O(t R−4 ), which is all one needs for integrands with R ≤ n + 1.At each step one can perform the recursion with respect to r, for r = 0, . . ., R, reusing as ingredients tensors computed for lower r or for terms with higher powers of t.Explicit formulas for all terms are given in Appendix A.

The t 2 -expansion
The terms for the t 2 -expansion can be computed using the same method as for the t 3expansion.Since we are now dealing with three variables (t, x and µ 2 ) and four independent vectors (v i , with i = 1, . . ., 4) the main difference is a more involved bookkeeping, partially mitigated by the need of less powers in t.Explicit formulas are collected in Appendix A.

Implementation
The tensor projectors for the Laurent expansion terms described above have been implemented in the Ninja library.
The reduction algorithm implemented in Ninja requires as input a numerator which is an abstract interface implementing the methods described in sect.3 (the C++ programming interface is described in ref. [30]).We thus implemented such an interface which computes the expansion terms collected in Appendix A from the coefficients N (r) µ 1 •••µr of a generic tensor numerator, defined according to Eq. (2.6).
The tensor numerator is treated as a polynomial whose coefficients are stored in a unidimensional array, from lowest to highest according to the graded lexicographic monomial order in the variables q µ with q 0 ≺ q 1 ≺ q 2 ≺ q 3 (i.e.terms are ordered by their total degree and terms with the same total degree are ordered lexicographically).This is the same monomial order used internally by MadLoop and turns out to be particularly convenient for building the tensors described in this paper, since we use formulas which are recursive with respect to the total rank.It is worth observing however that none of the results presented in this paper rely on a specific representation of the momenta (and consequently of the tensor numerator).In particular, the formulas collected in Appendix A, as well as the algorithms implemented for building the corresponding tensors, are unchanged after a change of coordinates q µ → q µ = Λ µ ν q ν and can thus be applied to any other representation of the four-dimensional components after converting the momenta v i used as input into the alternative representation.
We also implemented in Ninja a Fortran-90 wrapper exposing this tensor interface, which in principle can be used by any one-loop tensor-based generator by specifying the loop propagators and the coefficients of the tensor numerator defining the integral to be computed.Both the Fortran and the C++ interface are publicly available since version 1.1.0 of the library.MG5aMC (v2.3.4 and onwards) now includes a version of Ninja which is used as the default loop reduction method1 .This default installation of Ninja can be automatically updated to the latest online one (independently distributed) by running the following command in the MG5aMC interactive interface:

MG5 aMC> install ninja
Ninja, similarly to CutTools is available both in double and quadruple precision at runtime and MadLoop will dynamically switch to quadruple precision when its internal stability tests indicate that the double precision computation does not meet the requirement in numerical accuracy.In MadLoop, the computation of the tensor numerator coefficients is completely independent from the loop reduction and, as a result, the stability rescue is first attempted by re-performing in quadruple precision only the loop reduction (although with input kinematics already promoted to quadruple precision accuracy).This is often enough to restore satisfactory numerical stability, hence avoiding the much more time-consuming full-fledged quadruple precision computation.
We also point out that Ninja's internal kinematic matrix K ij = (p i − p j ) 2 , with quantities defined as in Eq. (2.2), is initialized directly in MadLoop where the following three on-shell limits are set to be exact when below a certain adimensional threshold δ set to 10 −8 by default: This proved to help the numerical stability of the reduction, essentially because it avoids ever approaching the kinematic region where the master integrals switch from a massive to massless description.The choice of the analytic expression to be evaluated by the Master Integral library is typically controlled by an internal infra-red threshold parameter which would apply to each integral independently.By regulating the kinematic matrix in MadLoop, we guarantee the consistency of the expression employed for all master integrals.
Finally, all loop reduction methods except CutTools and IREGI can be independently (de-)activated before the generation of the one-loop matrix element numerical code by setting the corresponding MG5aMC path options in <MG root>/input/mg5 configuration.txt.If activated at generation time, then their use at run time can be controlled via the parameter MLReductionLib specified in the file MadLoopParams.dat.

Applications
In this section, we will present the summary of a detailed study of the timing and stability performances of MadLoop interfaced to Ninja.When available, we compare the results ob-tained with Ninja against other reduction algorithms, namely CutTools, Samurai, IREGI, PJFry++and Golem95, whose limitations (for the versions interfaced to MadLoop) are summarized in table 1. CutTools is a library implementing a four-dimensional version of the integrand reduction method, as well as a reconstruction of the R 1 term as explained at the beginning of section 5. Samurai is a similar tool which always performs a full d-dimensional integrand reduction, making it capable of handling d-dimensional loop numerators at the price of being less efficient of four-dimensional ones, since it implements a more complex reconstruction of the integrand.: For reducing loops with 9 (11) loop lines and more, Samurai (CutTools) must be recompiled with an increased value for its default maximum number of denominators.† : Loops with rank n loop prop.+ 1 are supported in CutTools only for models with effective interactions involving only the Lorentz structures of the Higgs-gluons vertices.: This IREGI limitation stems from the observation that its reduction of loops with rank larger than 6 is typically unstable for all kinematic configurations.
Table 1: Limitations of the different reduction methods interfaced to MadLoop.The notation n loop prop.refers to the number of internal propagators in the loop considered.All reduction tools except PJFry support complex masses.
depending on its applicability to each individual (group of) loop(s) being reduced.
The study carried in this section focuses on the following five classes of processes, chosen for their different characteristics that cover a wide spectrum of one-loop matrix-element computations.The notation {i, j} • X denotes that we considered all the processes with either i or j occurrences of particle X in the final states.
This class of processes is a common benchmark for pure QCD computations as it introduces the top mass as an additional scale.The one-loop amplitudes for each multiplicity of this class of processes were first computed in ref.s [37][38][39].The one-loop matrix element for the process gg → t tggg is generated and computed here for the first time for specific kinematic configurations (see Appendix.D.1).
• B) gg → H + {1, 2, 3} • g These processes are computed within the Higgs Effective Interaction Theory as implemented in [40].In this effective theory the top-quark loop is integrated out, yielding effective interactions between gluons and the Higgs.The resulting dimension-5 operators lead to loops with rank n loop prop.+ 1 which are especially challenging to reduce.
Thanks to the trivial Lorentz structure of the effective Higgs interactions, both Cut-Tools and older versions of Samurai are applicable [27], even though they do no support completely general tensor numerators of higher rank.The one-loop amplitudes for each multiplicity of this class of processes were first computed in ref.s [41][42][43].
This set of processes is similar to the one above, but involving a spin-2 particle Y of mass 1 TeV and whose graviton-like effective interactions are described in sect.2.3 of ref. [40] (we considered κ g = κ q ).In this case, the tensor numerator of the resulting loops with rank n loop prop.+ 1 can have an arbitrary structure that CutTools cannot handle.The one-loop amplitudes for this class of processes were first computed in [44] and [45] for 0 and 1 additional gluon in the final states and are computed here for the first time for 2 and 3 additional gluons (see Appendix.D.2).The study of the phenomenology of QCD corrections within this effective theory featuring a spin-2 particle is in preparation [46].
This is a class of loop-induced processes for which event generation has recently been automated in MG5aMC [47].Loop-induced processes are processes without any contribution from tree-level Feynman diagrams, in which case the one-loop amplitude must be squared against itself.This implies that when using integrand reduction (and only then), loops must be reduced individually and independently for each helicity configuration.
Also, given the absence of any Born contribution, loop-induced processes are finite and build by themselves the complete Leading-Order (LO) prediction.For this reason, the speed of event generation and phase-space integration is entirely driven by the one of the one-loop matrix element, making optimizations especially relevant in this case.The gluon-fusion amplitude for gg → ZZ was first computed in [48] and results for the processes gg → Zγγ and gg → ZZZ were shown in [47,49], while the loop-induced processes with four and five final state Z-bosons have never been studied.
• E) uū → ZZZZZ, uū → e + ν e µ − νµ b b, uū → t tb bd d and uū → t tb bd dg This less uniform class of process with uū initial states serves different purposes.The first process is intended to be compared with its loop-induced counterpart.The second one includes both EW and QCD loop contributions, of all coupling orders, and it probes the behavior of the loop reduction algorithms in the presence of many scales and with complex masses in the loop propagators.The last two processes test the reduction for high multiplicity processes featuring loops with a large number of loop propagators (up to nine2 ) but low rank.These four high multiplicity processes have been selected for their specific characteristics from the standpoint of loop reduction and they have no direct phenomenological relevance except for the second one, so that their computation is not present in the literature.
The b-quarks are considered massive in all SM processes except for uū → e + ν e µ − νµ b b.

Timing profile
For all processes listed at the beginning of this section, we measure independently the time spent for the computation of the tensor components3 of the loop numerator (t num ) and the reduction of the loops (t red ), for one random kinematic and helicity configuration summed over color assignations.We stress that t red includes the time spent in evaluating the master integrals as well as in the computation of the coefficients of the Laurent expansion in the case of Ninja.
In MadLoop, there is no optimization in the computation of the loop integrand tensor numerator across helicity configurations, so that t num scales linearly with the number of nonvanishing helicity configurations.Conversely, t red remains independent of that number since the summation over helicity configurations can be performed before the loop reduction (except for loop-induced processes when using integrand reduction techniques).
We stress that the timing profile for a single helicity configuration is the relevant figure of merit for applications within MG5aMC which does not explicitly sum over helicity configurations for loop contributions, but instead adopts a Monte-Carlo procedure coupled with an adaptive importance sampling.
We summarize our findings in fig. 1 showing results obtained with MadLoop interfaced to either Ninja, Samurai or CutTools.
The x-axis registers the number of loop groups which combines all loops that can be reduced together.This corresponds to the set of loops sharing the same topology (i.e.ordered list of loop propagators), except for loop-induced processes where each loop must be reduced individually and therefore lies in a loop group of its own.Notice that since loops identical up to couplings (like fermion loops of different flavors) are combined already when generating the loop matrix element code, they only count as one.
The main feature of the upper panel of fig. 1 is that within each class of processes, the dependence of the reduction time w.r.t. the number of loop groups is linear, as already noticed in [16].The offset between each class of process is related to the difference in the rank of the constituting loops.The loop rank is typically larger in processes within models with higher dimensional operators (blue hexagons and green squares) as well as in loop-induced processes which involve fermionic loops only (purple triangles).Conversely, the rank becomes smaller as the number of external fermion lines increases and we indeed observe that the timings for the processes gg → t t + n • g and uū → t tb bd d(g) sit on a line underneath (black circles and red triangles).It is interesting to note that that the process uū → 5 • Z is almost two orders of magnitude faster than its loop-induced counterpart, even though they are both contributions to the same final state.
The second inset of fig. 1 shows the ratio of the time spent in the computation of the components of the tensor numerator with the loop reduction time with Ninja.This ratio rapidly increases with the multiplicity and number of loop groups, clearly establishing that within the MadLoop+Ninja implementation, the computation time is asymptotically dominated by the computation of the loop integrand numerator.When no loop grouping is possible, as it is the case for loop-induced processes, we observe the opposite asymptotic behavior hence showing the the loop grouping plays an essential role in this limit.We remind the reader that these conclusions apply to the computation of the loop matrix element for a single helicity configuration and only in the context of MadLoop's technique for the computation of the loop integrand, which is most flexible but less optimal than full-fledged recursion relations [50], especially for processes with large multiplicity.The bottom two insets of fig. 1 compare the performances of the three integrand reduction techniques interfaced to MadLoop and reveals that Ninja is about 3 to 5 times faster than CutTools and 5 to 10 times faster than Samurai.The reduction time relative to Ninja increases with the process multiplicity, hence assessing the impact of the advantages of the integrand reduction via the Laurent expansion method as the complexity of the considered processes increases.
We include in appendix B a table detailing the timing profile presented in fig. 1 as well as process generation times and results obtained with the tensor integral reduction tools IREGI, PJFry++ and Golem95.

Stability study
We now turn to the assessment of the numerical stability of Ninja for the benchmark processes A)-D) listed at the beginning of this section.We do so by applying the internal stability tests of MadLoop to a set of N P S random kinematic configurations and we report the resulting accuracy as a cumulative distribution for the fraction of points with a reduction relative accuracy larger than some target ∆ on the x-axis.
For the 2 → n processes, we chose N P S to be 100K for n=2, 10K for n=3 and 1K for n > 3.These N P S kinematic configurations are chosen randomly, with the constraint that all final states satisfy p t,i > 50 GeV with angular separation ∆R ij = ∆φ 2 ij + ∆η 2 ij > 0.5.The center of mass energy chosen is 1 TeV, except for the processes involving the spin-2 particle Y in which case the center-of-mass energy is set to 1.2 TeV.
MadLoop combines two stability tests to estimate the numerical accuracy of the result: • Loop direction test: The loop reduction is performed a second time with the order of all propagators reversed (corresponding to the loop momentum redefinition q → −q) and compared to the original evaluation.This changes the internal numerics of the reduction, hence assessing its stability.Given that the input kinematics remains unchanged, the tensor numerator components do not have to be recomputed.
• Lorentz test: The input kinematic is transformed by a Lorentz rotation for which the loop-matrix element is recomputed and compared to the original one.
Another commonly-used kind of stability test consists in rescaling all dimensionful quantities by a common factor.This is not used by MadLoop because it proves to be impractical in the general case where the dimension of each of the model input parameters is not necessarily available within the generated code.
The stability tests are performed on a computation of the loop matrix element summed over all helicity configurations, except for the loop-induced processes for which only the allminus helicity configuration is considered.
The vertical gray line at ∆ = 10 −3 (i.e. 3 stable digits) marks the typical threshold used for Monte-Carlo event generation during which MadLoop will attempt to rescue the phase-space points with a numerical stability estimate larger than this target by repeating the reduction (and possibly the computation of the tensor numerator) in quadruple precision.The crossing of the various curves with this gray line therefore gives the fraction of unstable kinematic configurations for which this rescuing procedure will be necessary.For all the processes of highest multiplicity in each class A)-D), this fraction is larger than 1% and almost 10% for gg → Y ggg, which shows that numerical stability becomes an important issue 4 when Fraction of points with accuracy less than target Comparison of Ninja reduction accuracy for various processes 2 TeV for the processes involving the spin-2 particle Y of mass 1 TeV), randomly chosen with the constraints that all final states have a p t,i > 50 GeV (except for the loop-induced processes) and an angular separation ∆R = ∆φ 2 + ∆η 2 > 0.5.The number of points considered is 100K, 10K, 1K and 1K for processes with 2, 3, 4 and 5 final states respectively.A vertical gray bar is shown at ∆ = 10 −3 which corresponds to the typical threshold applied during event generation.
attempting the integration of processes with loops of rank 6 and especially 7.
Fig. 3 compares the stability of all applicable reduction tools for the processes gg → t tgg, gg → Y gg and uū → e + ν e µ − νµ b b.We observe that Ninja is always the most stable reduction, comparable to that of Golem95 which is however considerably slower (see appendix B).
In fig.2, comparing the relative position of the curves for processes with equal number of external legs shows that the determining factor for stability is the tensor numerator rank.This is also manifest when observing that despite the large multiplicity of the process uū → e + ν e µ − νµ b b and the complexity of its contributing QCD and EW loops, the stability of its reduction is comparable to that of other processes with a maximum rank of 4.
In an actual Monte-Carlo integration, the phase-space points encountered are not uniformly distributed and as a result the stability profile can potentially be different in this context.For this reason, we also show the stability profiles obtained by considering the kinematics of unweighted events generated at LO accuracy for the LHC14 collider setup using the NNPDF 2.3 (NLO) PDF set [53].Except for IREGI, we find no qualitative difference and Figure 3: The setup is identical to the one described in the caption of fig. 2. The stability profiles obtained from a random distribution of kinematic configurations are compared to the ones obtained from unweighted events generated with LO accuracy at LHC14, using the NNPDF 2.3 (NLO) PDF set.
the Monte-Carlo distributions even tend to be slightly more stable for Ninja, showing that its reduction is mainly insensitive to a change of reference frame and c.o.m energy.
In appendix C, we show the stability profiles of processes A)-D) for all applicable reduction tools.These further establish the observations drawn in this section.

Conclusions
We presented an algorithm for the generation of the expansion terms needed by the oneloop integrand reduction via Laurent expansion implemented in the public library Ninja from the numerical components of a tensor numerator.We have shown how, within a numerical calculation, these expansion terms can be obtained by contracting the tensor numerator with appropriate cut-dependent tensors, which in turn can be efficiently built by means of simple recursive relations.
The algorithm has been implemented in the most recent version of the public library Ninja, which can thus be used by tensor-based one-loop generators by providing the numerical entries of the tensor numerator of an integral.We interfaced this library to the MadLoop generator, part of the MadGraph5 aMC@NLO framework (available from v2.3.4 onward).
This allowed us to extensively study the performance and the numerical stability of Ninja and compare it with several other available tools.In terms of reduction speed, we observe that Ninja outperforms all other reduction tools, and in particular Samurai and CutTools by a factor of about 6 and 3 respectively.Also, Ninja's improvement over other tools increases with the complexity of the process.In terms of reduction stability, Ninja improves on previous integrand reduction techniques by a considerable amount and in general stands on par with tensor integral reduction as implemented in Golem95, which is however more limited and significantly slower.Our results show that numerical instability with Ninja only becomes problematic, because of the slowdown induced by reprocessing unstable points using quadruple precision arithmetics, for loop numerators of rank 7 and above which are not of immediate concern for current phenomenology at particle colliders.
The algorithm and the results presented in this paper therefore enhance the capabilities of MadLoop and the applicability of Ninja, and will thus be valuable for future high-energy phenomenological studies, especially those involving amplitudes featuring loop diagrams characterized by loop numerators of high rank.

Appendix B Details of timing performances
This section presents MadLoop timing profile for the generation and the computation of the tensor numerator components for all the benchmark processes A)-D) introduced at the beginning of sect.7. We also show the time necessary for performing the loop reduction with each of the six reduction tools interfaced to MadLoop (when available): IREGI, PJFry++, Golem95, CutTools, Samurai and Ninja.The reader can easily reproduce analogous results for various compilers and machines by using the automated check timing command of the MG5aMC interface.all SM tree and loop contributions (i.e. of both QCD and EW origin, resonant as well as non-resonant ones and also including contributions of order O(α 0 s )).The timing of the loop matrix element 2 (A (loop) A (tree) † ) of the process uū → ZZZZZ (denoted uū → 5 • Z in the table) echoes the profiling presented in the lower table for the evaluation of the loop-induced matrix element |A (loop) | 2 of the gluon fusion contributions up to the same final states.All timings in this table refer to the the computation of the loop matrix element summed over colors but for a single helicity configuration.The test machine is using a single core (for process generation as well) of an Intel Core i7 CPU (2.7 GHz) and the executable is compiled with GNU gfortran -O2 (v4.8.2).The numbers shown in tables 2 and 3 refer to the computation of the loop matrix element averaged and summed over color assignments but for a single helicity and kinematic configuration.This is the relevant figure of merit in MG5aMC since it implements a Monte-Carlo over helicity (with importance sampling) when integrating loop contributions.It is worth noting however that for loop matrix elements with a Born contribution, only the numerator computation time (t num ) scales with the number of contributing (analytically non-zero) helicity configurations (n hel ) whereas the reduction time (t red ) remains constant as the integrand numerators can be summed over helicity configurations before being reduced.The total time for the computation of the loop matrix element summed over all helicity configurations can therefore simply be computed as since MadLoop does not implement optimizations across different helicity configurations.
For loop-induced processes however [47], this is not possible when using reduction at the integrand level, in which case both t num and t red scale with n hel .
The number of loop diagrams indicated does not count the multiple copies with different quark flavors in the loop.A loop group refers to a group of loops which can be reduced together.This number is equal to the number of loops for loop-induced processes since each loop must be reduced individually in this case; otherwise it regroups all loops sharing the same topology (ordered list of denominators identified by their mass m i and four-momentum flows p i ).
The synthetic fig. 1 of the main text illustrates best the results and we find that Ninja outperforms all reduction tools considered for all the benchmark processes.We note however that the advantageous apparent exponential growth of the integrand reduction time with the process multiplicity is mitigated by the factorial growth of the time spent in computing the numerator tensor components.This is intrinsic to MadLoop's approach based on Feynman diagrams which offers maximal flexibility at the expense of not taking advantage of the optimal scaling behavior of recursion relations [50].We stress that MadLoop implements a caching system for recycling part of trees and loops shared across different diagrams.This emulates what recursion relations achieve, but only to a lesser extent even though it already considerably improves the computation time of the tensor numerator.
Generation time is usually not considered as relevant given that it must be performed once per process only.In practice however, this can be of concern since it is typically not easily parallelizable and is also a general hinderance when it comes to testing, debugging or quickly exploring the impact of some modifications to a model.In MadLoop's approach, generation time is hardly an issue for current phenomenologically relevant processes, but its growth is such that we reached the limit of reasonable process generation for gg → t tggg which requires about 2 days of sequential runtime and 40 GB of RAM.
Figure 4: Same setup as in fig.2, showing the stability profile of all applicable reduction methods interfaced to MadLoop for the class of processes gg → t t + {0, 1, 2, 3} • g.Fig. 5 shows results obtained for the processes gg → H/Y + {1, 2, 3} • g involving effective interactions yielding loops of numerator ranks equal to one plus the number of loop denominators.Such loops are particularly challenging to reduce and we observe that the accuracy deteriorates significantly when assuming a completely general form of the higher rank tensor, as it is the case in gg → Y + n • g, compared to the simpler tensor numerators obtained in gg → H + n • g.Indeed, the residues and consequently the fitting procedure are considerably more involved for higher rank integrands.It is worth noticing that for the simpler tensor structure of gg → H + n • g however, the Laurent expansion method is significantly more accurate than other integrand reduction tools, since the numerator expansion methods will return zeroes for vanishing higher rank coefficients, thus avoiding the (inexact) numerical reconstruction of such zeroes from multiple evaluations of the integrand.In the Higgs case, the stability profile exhibits an unusual shape with a very steep dependence with ∆ in the region [10 −5 , 10 −3 ], indicating that any decrease of the default Monte-Carlo stability threshold of ∆ = 10 −3 would have a significant impact on runtime performances as the fraction of points that must be reprocessed using quadruple arithmetics increases.The comparison of the two profiles obtained using tensor integral reduction highlights the importance of the internal numerical stabilization mechanisms of Golem95 as the rank of the loop numerator increases.
Fig. 6 shows stability results for processes with high multiplicity but relatively low rank.We observe that even though the processes uū → t tb bd d(g) involve up to 8-(9-)loops, their numerical stability is only slightly worse than that of the lower multiplicity processes with equal maximum rank, such as gg → t tg(g).It is interesting to note that when the multiplicity is larger than the rank by several units, the numerical stability of CutTools and Ninja is almost identical, as expected from the fact that the integrand reduction via Laurent expansion method and the traditional OPP numerator fitting are very similar in this limit.Indeed, when the multiplicity n of the loop lines and the rank r of the numerator satisfy r ≤ n − 4, one can easily show that the result is only determined by the cut-constructible coefficients of the boxes.These in turn, given their simplicity, are the only coefficients that Ninja computes with the same algorithm as traditional integrand reduction.
The two processes of the bottom insets of fig.6 introduce new scales in the loop amplitudes; first with external massive lines for uū → 5 • Z for which CutTools is slightly more stable than Ninja (unlike for all other processes) and secondly with internal (complex) massive lines for the complete QCD+EW loop contributions to the process uū → e + ν e µ − νµ b b for which Ninja is more stable.From the numerical stability standpoint, the interesting characteristics of loop-induced processes such as gg → n•Z is that they only involve fermion loops of maximal numerator rank.Contrary to the stability studies performed on all other classes of processes, we investigated the loop-induced ones by considering only a single helicity configuration; either g (−) g (−) → n • Z (−) (all-minus) or g (−) g (−) → n • Z (0) (all longitudinal Z bosons).We find a similar stability behavior for both helicity configurations, except for the lowest multiplicity process.
The peculiarity of the gg → ZZ process is that its fermion box loop contribution becomes unstable when the transverse momentum p t of the final state Z-bosons tends to zero (i.e.all external momenta aligned on the beam axis).Given the constrained 2 → 2 kinematics, this configuration is often probed which is also why this process is typically integrated using a technical (very) small cut in the Z-boson p t .The upper left plot of fig.7 reveals that in this p t → 0 limit, the fermion box stability significantly depends on the helicity configuration considered 5 .For more than two Z-bosons in the final states, this dependence is much weaker, and mainly reflects the difference in the size of the relative contribution of the more stable Figure 7: Same setup as in fig. 2 except that no constraint on the final state transverse momenta was applied is applied in this case.The processes considered are loop-induced gluon fusion with up to five Z-bosons in the final states.Results are shown for the stability obtained when considering only a single helicity configuration where the two initial state gluons have negative helicity and the final state Z-bosons have either helicity 0 (solid line) or a negative one (dashed line).

Higgs channels.
Even though the highest multiplicity loop-induced process is of maximal rank 7, it is significantly more stable than gg → Y ggg which shares the same maximal rank but with lower multiplicity.On the other hand, its stability is on par with the other 2 → 5 process gg → t tggg which has many more diagrams but with a maximal rank of only 6.
In summary and in terms of numerical stability, Ninja performs better or at least as well as the best other public reduction tool available for all considered processes except two.First, contrary to Golem95, Ninja has around 1% of unstable kinematic configurations (featuring less than 3 digits) for gg → Hgg and secondly, it is slightly less stable than CutTools for uū > 5 • Z.
The observations drawn in this appendix suggest that the numerical stability of one-loop matrix elements can be classified according to mainly their maximal rank and multiplicity.However, a large difference in the number and complexity of the contributing loops (in terms of their tensor numerator structure and typical rank, as well as the number and spread of the different scales) can amount to variations as large as the gap between two consecutive such stability classes.

MG5 aMC> import <chosen model> MG5 aMC> generate <process definition> [virt=QCD] MG5 aMC> launch
The first step is optional when considering corrections for processes within the Standard Model.For mixed Electro-Weak (EW) and QCD correction, the syntax '[virt=QCD QED]' must be used instead.

D.1 Process gg → t tggg
We specify below the chosen values for the relevant Standard Model parameters.Notice that all dimensionful quantities in this Appendix D are indicated with GeV units, unless specified otherwise.We report below all stable digits (no rounding applied) obtained with double precision arithmetics for the coefficients a 0 , c −2 , c −1 and c 0 .

Parameter value
[GeV The details of the model that we considered including the spin-2 particle Y can be found in ref. [40].As already noted in [44], the operator renormalization constant for the energy momentum operator is identical to unity to all orders in perturbation theory.As a result, when the graviton minimally couples to the energy momentum tensor (i.e. with κ q = κ g ), there is no need for additional UV renormalisation counterterms.The parameters of this model that are relevant for the processes gg → Y + {2, 3} • g are chosen as follows:

Figure 2 :
Figure2: Comparison of the fraction of points with accuracy smaller than the target accuracy ∆ on the x-axis obtained with MadLoop+Ninja (using double precision arithmetics) for a variety of processes (see text for details).The cumulative distributions shown are obtained from kinematic configurations with √ s = 1 TeV (1.2 TeV for the processes involving the spin-2 particle Y of mass 1 TeV), randomly chosen with the constraints that all final states have a p t,i > 50 GeV (except for the loop-induced processes) and an angular separation ∆R = ∆φ 2 + ∆η 2 > 0.5.The number of points considered is 100K, 10K, 1K and 1K for processes with 2, 3, 4 and 5 final states respectively.A vertical gray bar is shown at ∆ = 10 −3 which corresponds to the typical threshold applied during event generation.

10 D. 2
Processes gg → Y + {2, 3} • g IREGI, PJFry++and Golem95 are instead tensor integral reduction tools.We stress that MadLoop can dynamically change at run time the active reduction tool Ninja unlimited n loop prop.+ 1 Samurai unlimited n loop prop.+ 1 Golem95 6 max(6, n loop prop.+ 1) Figure1: Overview of the timing performances of Ninja, Samurai and CutTools interfaced to MadLoop on a single core of MacBook (OS 10.8.5), 2.7 GHz Intel Core i7 and using the GNU gfortran -O2 (v4.8.2) compiler.The timings refer to the computation of the one-loop matrix element summed over color assignations but for a single helicity configuration.A loop group combines all loops which can be reduced together.Details on the processes considered are given at the beginning of sect.7.

Table 2 :
The upper table presents results for processes with high-multiplicity and low loop numerator ranks.Notice that the process uū → e + ν e µ − νµ b b (massless b-quarks) includes

Table 3 :
[40] setup as described in the caption of table 2. Profiling of the runtime of the processes gg → {X} + n • g with {X} = t t, H, Y and n = (0, )1, 2, 3.The symbol Y denotes a spin-2 particle with a mass of 1 TeV and interactions as described in sect.2.3 of ref.[40].