Machine learning Post-Minkowskian integrals

We study a neural network framework for the numerical evaluation of Feynman loop integrals, which are fundamental building blocks for perturbative computations of physical observables in gauge and gravity theories. We show that such a machine learning approach improves the convergence of the Monte Carlo algorithm for high-precision evaluation of multi-dimensional integrals compared to traditional algorithms. In particular, we use a neural network to improve the importance sampling. For a set of representative integrals appearing in the computation of the conservative dynamics of a compact binary system in General Relativity, we perform a quantitative comparison between the Monte Carlo integrators VEGAS and i-flow, the latter being an integrator based on neural network sampling.


Introduction
The success of gravitational-wave detections in the last decade [1][2][3][4] relies on our ability to construct high-precision waveform templates. The most common gravitational-wave sources are inspiralling binary systems of black holes and/or neutron stars. Whereas there has been exciting progress in numerical simulations of binary systems, mostly for the merger phase [5][6][7], we discuss here a different type of numerical method, used to compute certain constant ingredients for analytic approaches describing the binaries' motion.
Approaching the problem from a high-energy physicist's point of view has led to modern methods inspired by quantum field theory (QFT), ranging from worldline EFTs [37][38][39][40][41][42][43][44][45][46][47][48][49][50][51] to scattering-amplitude-based methods. All these methods have in common that they describe the binary problem in the scattering regime, with the gravitational coupling strength, i.e. Newton's constant G, as the expansion parameter. This resummation of velocity corrections to all orders at a given order in G is called a Post-Minkowskian (PM) expansion. The potential contributions to the scattering angle at the fourth PM (4PM) order [44,70] have been extended by conservative tail effects in two different approaches [49,82]. Very recently, complete knowledge of the gravitational dynamics in the scattering of non-spinning bodies at 4PM order, incorporating conservative and dissipative effects [84], has been achieved by a combination of the worldline EFT approach and modern field theory techniques [85]. The (analytically determined) integrals used in [49] were cross-checked by the numerical methods discussed here. Results for the hyperbolic (scattering) version of the two-body problem can be analytically continued to the elliptic case via a so-called boundary-to-bound map [86,87]. This map includes not only local conservative effects but also radiative corrections [88,89]. It has been successfully checked against state-of-the-art PN results for bound orbits in the overlapping expansion region.
Multi-loop integrals are at the core of QFT methodologies. Therefore, developing efficient techniques to evaluate these integrals is crucially important to advance the precision frontier of PM gravity. The goal of this work is to study a set of (cut) Feynman integrals appearing in such approaches; we will henceforth call them Post-Minkowskian integrals. In [38,44] the generic structure of the integrals needed for the computation of the deflection angle at 3PM and 4PM orders was identified, which easily generalizes to any order. The main technique used in these papers to compute, or rather bootstrap, such integrals is the method of differential equations [90,91], which reduces the calculation to solving a coupled system of first-order differential equations in one variable. Whereas solving the differential equations is an art by itself (see e.g. [85] for the integrals discussed here), in some cases the boundary conditions turn out to be surprisingly tricky as well.
One application of numerical integration methods is to cross-check analytic results. We develop here a machine-learning-based framework for the numerical evaluation of multi-loop integrals, which is targeted to lay the groundwork for applications beyond simple cross-checking. One can imagine that analytical methods will eventually hit a wall. Numerical methods will then provide a natural path forward for high-precision computations, for example via a hybrid analytical-numerical pipeline to efficiently produce waveform templates. Pushing in that direction, we apply this novel method to numerically evaluate boundary values of PM integrals, which are part of the pipeline for results in gravitational wave physics. In the future, one could try to directly determine boundary conditions of the differential equation system with numerical methods, inputting them either as high-precision constants in the final answer or as a way to conjecture its analytical form via integer relation algorithms like PSLQ [92,93]. The latter strategy was, for example, successfully applied in a similar computation in [94].
Due to the use of dimensional regularization, meaning that we compute integrals in D = 4 − 2ϵ dimensions, such boundary integrals depend on ϵ. Since we are only interested in ϵ-divergent and -finite contributions to an observable, it is sufficient to compute the boundary integrals up to a certain order as a power series in ϵ. Sector decomposition [95][96][97][98] is a method to perform such a power series expansion on the integrand level by breaking the integral into smaller pieces, so-called sectors. Many tools like (py)SecDec [99][100][101][102] or FIESTA [103][104][105][106][107] implement sector decomposition methods together with numerical integration algorithms. We used these programs to produce decomposed integrands, which we then integrated with machine learning techniques implemented in i-flow [108]. The main idea of i-flow is to use a neural network (NN) to improve the Monte Carlo integration and (importance) sampling, which improves the error estimates and leads to faster convergence of the numerical integration. i-flow uses the method of normalizing flows [109,110], which approximates the phase-space integrand via an (analytically) invertible neural network.
The main result of this paper is a quantitative analysis of the number of integrand evaluations required to reach a given accuracy goal, comparing traditional sampling methods such as VEGAS [111,112] to our neural-network-based framework. We analyze a representative set of Post-Minkowskian boundary integrals (in the so-called potential region) ranging from two (3PM) to four loops (5PM). Since the neural network needs to be trained for a fixed initial time, it performs worse at low relative precision (∼ 10^−3) but starts to scale significantly better at higher precision (∼ 10^−4 and below).
We begin in Sec. 2 by introducing the loop families of interest and listing analytical results for most boundary master integrals up to three loops, as well as a few representative four-loop integrals. The sector decomposition methods and our numerical setup, mostly focused on machine learning techniques, are introduced in Sec. 3. This section also contains the main results for our numerical integration framework. In Sec. 4 our findings are summarized and we conclude with a perspective on future applications of machine learning techniques to Feynman integration.

Post-Minkowskian integrals
This section introduces a set of Feynman integrals appearing in field-theory-based approaches to gravitational binary dynamics. We present a representative set of loop integrals and their analytic expressions. For one, two, and three loops these correspond to master integrals with respect to integration-by-parts relations. We will then apply machine learning techniques to numerically evaluate them in subsequent sections. We restrict ourselves to the first three orders in the ϵ series for numerical checks.

Prerequisites
At order O(G^{L+1}) we define the set of Post-Minkowskian integrals following [85]. We adopt the mostly-minus Minkowski metric, η_{µν} = diag(1, −1, −1, −1), and work in dimensional regularisation in D = 4 − 2ϵ dimensions. We introduce a convenient normalization factor e^{ϵγ_E} per loop, where γ_E is the Euler-Mascheroni constant. The inverse propagators P_i (including irreducible scalar products for ν_i < 0) can be expressed in terms of the external and loop momenta. We use implicit '−i0' prescriptions for all propagators in the rest of the paper. The external kinematical variables satisfy the on-shell relations given in [85]. A useful property is that there is a single dimensionful kinematical variable in the integrals, t = −q². Thus, the dependence on t is easily fixed by the mass dimension and is given by t^{L(D−1)/2−ν}. As a result, the integrals in (2.1) are dimensionless functions of a single variable γ = u_1 · u_2, where in the scattering region γ > 1.
An atypical feature of the integrals in (2.1) is that each loop integration is partially localized by a Dirac-delta constraint δ(ℓ_i · u_a), whose argument is linear in the loop momentum and one of the initial velocities of the bodies. Similar loop integrals appear in PM methods relying on gravitational scattering amplitudes [54-56, 66, 67, 69, 70, 82, 113, 114]. They are related to the PM integrals in (2.1) by so-called 'reverse unitarity' [115][116][117], in which a Dirac-delta function is understood as a cut of a propagator. Thus, many techniques, including the novel numerical techniques developed in this work, are applicable to loop integrals in both worldline EFT and S-matrix-based formulations.
It was found that the method of differential equations [90,91] provides an efficient way to determine the γ-dependence of PM integrals [38,49,65,85]. Using integration-by-parts (IBP) relations [118][119][120], one can derive a system of ordinary differential equations with respect to the kinematical variable γ for a set of basis (master) integrals. To be concrete, let us look at the simplest example, where the same velocity vector u_a (a = 1 or a = 2) appears in all delta-function constraints in (2.1). In this case, any integral obeys a simple first-order differential equation, whose solution can be written down immediately in terms of boundary integrals. These boundary integrals are defined in Euclidean space of d = D − 1 dimensions, where P_i is the d-dimensional part of P_i, i.e. with the time component removed. On the one hand, these integrals contribute to the test-particle limit (geodesic motion in a Schwarzschild background in the spinless case). On the other hand, more interestingly, in the γ → 1 potential region [121][122][123][124] all integrals of the form (2.1) from other sectors can be reduced to (2.6) as well. To be precise, working in the rest frame of particle 2, resolving the delta-function constraints δ(ℓ_i · u_1) δ(ℓ_j · u_2) yields ℓ⁰_i = β ℓ^z_i and ℓ⁰_j = 0. Therefore, using this frame and expanding the integrand around the small-velocity limit β → 0, or γ → 1, leads to (2.9). We refer to the integrals defined in (2.6) as static integrals. They play a crucial role in evaluating PM integrals in the context of the differential equation method, as they encode all boundary data in the potential region. They are the core objects of interest in this work. We list a representative set of static integrals and their analytic results in the following subsections.

2PM: One loop
At one-loop level, all static integrals can be brought into the following form: (2.10) These integrals are sufficient for the computation of the conservative dynamics of non-spinning [37] and spinning [40] binary systems at O(G²). Any integral in (2.10) is independent of the sign in front of the linear propagator ±ℓ^z − i0, where we have written out the otherwise implicit −i0.
Via IBP relations, any integral of the form (2.10) can be expressed in terms of the two master integrals {A 011 , A 111 }. Technically, it is not necessary to perform any IBP reduction, since the analytical expression for generic indices is known: (2.11) We have presented this result merely for completeness and are not interested in its numerical evaluation.

3PM: Two loops
At two-loop order, all static integrals can be mapped into the following family [38,39]. The five squared propagators in (2.12) graphically correspond to the Kite topology. Solving IBP identities using FIRE6/LiteRed [126][127][128] or Kira2 [129], we find 9 independent master integrals for all sign configurations of the linear propagators in (2.12). As expected, each master integral has either a double-bubble or a sunrise topology when considering only square-type propagators. We list all their analytical results below, where the sign superscript is omitted whenever a linear propagator is not present. These results were used in [38,39], and an analytical derivation is presented in [85]. Most of them can be computed by iterating the one-loop formula (2.11); one integral, however, is not as trivial. Two independent derivations, one based on a symmetrization trick and one via direct integration of a Feynman-parametrized form, are presented in App. A. The latter considers a generalized version of this integral with generic symbolic indices in some slots. The resulting expression needs a non-trivial transformation in order to arrive at the simple form presented here.

4PM: Three loops
At three-loop level, all static integrals appearing in the computation of the next-to-next-to-next-to-leading-order conservative dynamics of non-spinning binaries [44,49] can be reduced to the following three topologies (of squared propagators) [85]: (2.22) In the following we denote integrals by their topology, a subscript counter, and the usual superscript of signs of linear propagators. Let us start with integrals without linear propagators: (2.23) (2.24) (2.25) In this case, it is clear that they can be evaluated to products of Gamma functions by iterating the one-loop bubble formula (2.11). For the case of one linear propagator, we find the following master integrals: (2.30) (2.31) We have suppressed the sign superscript since these integrals are independent of the sign of the single linear propagator. A direct evaluation using the one-loop integrals in (2.11) yields their explicit expressions. Next, we find four static master integrals in the presence of two linear propagators [44,49]: (2.36) The following analytic results in terms of hypergeometric functions pFq were computed for the results presented in [44,49]. An extended analytic derivation is given in [85], which we have generalized to higher loops in App. A.2, with Γ_a being shorthand for Γ(a). Performing the Laurent expansion in ϵ for the first few orders is surprisingly tricky.
We numerically evaluated the expansion coefficients and conjectured the following analytic expressions using Mathematica's built-in implementation of the PSLQ algorithm, FindIntegerNullVector: (2.41) More details about this reconstruction are given in Sec. 3.2. We have also checked that the above results satisfy the relation which follows from the fact that the combination of linear propagators with different signs forms a maximal cut of all linear propagators. Finally, let us consider static integrals with three linear propagators: (2.44) We find that they fulfil the non-trivial relations (2.45) and (2.46), and they satisfy (2.47) for j = 5, 6, respectively. These relations follow from the fact that the combination of linear propagators with different signs yields a maximal cut of all linear propagators. This completes the set of all static master integrals for the conservative, non-spinning contributions at O(G⁴). The results for B_1, B_2, D_1, B_5, and B_6 are, to our knowledge, presented here for the first time.

5PM: Four loops
We pick a representative set of integrals which are likely to appear as static master integrals in upcoming computations of the conservative dynamics at 5PM order. Here we choose to study the most typical one, the four-loop banana topology, as a representative to test our numerical methods. Let us first consider the simplest case without any linear propagator: (2.48) Its analytic result is obtained once more via iterative application of the one-loop bubble formula (2.11). We then consider a generalization with a single linear propagator: (2.50) Similarly, its analytic form can be obtained using the one-loop bubble integral. Adding another linear propagator, we consider the following integrals (2.53) whose analytic solutions can be obtained in a similar way as for the three-loop integrals B_3 and B_4 in Eq. (2.35). Since the hypergeometric functions start contributing only at the third order in ϵ, we can analytically perform the series expansion up to that order. We realized that multiplying this series by (1 − 6ϵ) leads to a uniformly transcendental result. This allowed us to conjecture the coefficient at O(ϵ⁰) via an integer relation algorithm (see Sec. 3.2 for more details). We have further checked that the above results satisfy the relations with j = 2, 3. We will not consider any integral with three linear propagators. Finally, we consider an integral with four linear propagators: (2.61) They fulfil the relations (2.62) and furthermore satisfy the following non-trivial relation: (2.66) The analytic results for the integrals M_2, M_3, and M_4 are, to our knowledge, presented here for the first time. A derivation based on a symmetrization trick can be found in App. A. All of these integrals are vital for the conservative contributions to the binary dynamics at O(G⁵).

Numerical methods and results
In this section we present a framework to numerically evaluate dimensionally regularized multi-loop integrals, with a special focus on the integrals introduced in Sec. 2. We start by discussing some background material, explaining the two main steps of our computation: sector decomposition and Monte Carlo integration. We then present a neural network method to sample the phase space. Next, we detail our explicit software setup and analyze the precision required of the numerical methods, with analytic reconstruction methods in mind. We finish by presenting a comparison of the NN-based and VEGAS numerical integrators applied to PM boundary integrals.

Prerequisites
The numerical evaluation consists of two steps. First, the integral is decomposed into different sectors in order to write it as a Laurent series in the dimensional regularization parameter ϵ, where each series coefficient is expressed as a purely numerical integral. Methods that implement such an expansion are called sector decomposition [95][96][97][98]. Second, we numerically evaluate these integrals using Monte Carlo-based methods. We start by presenting two different codes for sector decomposition, FIESTA [103][104][105][106][107] and pySecDec [99-102], followed by a review of two different Monte Carlo methods for the evaluation of the integrals: the widely used VEGAS algorithm and a novel method based on neural networks.

Sector decomposition: FIESTA and pySecDec
Sector decomposition techniques date back to the proof of the BPHZ theorem [134] and have been used to isolate both infrared (IR) and ultraviolet (UV) singularities of Feynman integrals [96]. Different strategies for sector decomposition [97,103,135] can lead to different numbers of sectors and distinct structures of the poles in the regulator ϵ. Also, the convergence properties of the algorithm can vary significantly among strategies [135,136]. The general idea of sector decomposition (see e.g. [98] for a review) is to split the integration region iteratively into smaller pieces such that overlapping singularities (where a denominator vanishes as a set of integration variables x_i → 0) are factorized. This is always possible, and proven to terminate for appropriate strategies, owing to homogeneity properties of (Feynman-)parametrized integrals. Having arrived at such a factorized form, the extraction of poles becomes trivial and a Laurent series in ϵ can be extracted to any desired order. Further singularities of the integrand at other points (or surfaces) away from zero need to be handled by a contour deformation [137], which deforms the integration contour into the complex plane as dictated by the i0 prescription of the (Feynman) propagators.
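As a toy illustration of the factorization step, consider I(ϵ) = ∫₀¹∫₀¹ dx dy (x+y)^(−2+ϵ), which diverges for ϵ → 0 from the region x, y → 0. Splitting into the two sectors x > y and y > x and substituting y = x t (respectively x = y t) gives I = (2/ϵ) ∫₀¹ dt (1+t)^(−2+ϵ): the pole is factorized, and the Laurent coefficients become finite one-dimensional integrals. The following minimal numerical check (plain Python with a Simpson rule, not pySecDec or FIESTA) verifies the resulting expansion I = 1/ϵ + (1 − log 2) + O(ϵ):

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] (n must be even)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

# After sector decomposition, I(eps) = (2/eps) * J(eps) with
# J(eps) = int_0^1 dt (1+t)^(-2+eps).  Expanding J in eps gives the
# Laurent coefficients of I as finite integrals:
J0 = simpson(lambda t: (1 + t) ** -2, 0, 1)                    # 2*J0 = 1 (pole coefficient)
J1 = simpson(lambda t: math.log(1 + t) * (1 + t) ** -2, 0, 1)  # 2*J1 = 1 - log 2 (finite part)
```

Both coefficients agree with the analytic values 1 and 1 − log 2, mirroring on a tiny example what pySecDec and FIESTA automate for multi-loop integrands.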
We have used two different programs to study and automatize the sector decomposition and contour deformation for the PM boundary integrals. FIESTA was first developed in [103] and improved in [104,105]. Its core algorithms are implemented in C, and a Mathematica interface is provided. FIESTA provides many different strategies for sector decomposition based on the work in [103,105,136].
SecDec was developed in C++ and Fortran [99][100][101] and has a Python interface (pySecDec) [102]. It allows for three different decomposition strategies: an iterative one [97,98] and two geometric decomposition methods described in [101,136] that make use of the normaliz package [138]. In our tests, we found that the geometric method described in [101], which makes use of the Cheng-Wu theorem [139], leads to fewer sectors. For all integrals discussed here it produces the most compact integrands, allowing for the fastest numerical evaluation at a random phase-space point. This observation agrees with the analysis made in [101]. We use this method by default for the rest of this work.

Monte Carlo integrators: VEGAS family and Neural Networks
Monte Carlo algorithms estimate an integral I of a function f(x) over a domain Ω by sampling the integrand at N uniformly distributed points x_i, I ≈ V ⟨f⟩ = (V/N) Σ_i f(x_i), where V is the volume of Ω. The brackets represent the average taken with respect to uniform sampling in the variable x.
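In code, the basic estimator reads as follows (a minimal stdlib sketch over the unit hypercube, so V = 1; purely illustrative, not part of any of the packages discussed here):

```python
import math, random

def mc_estimate(f, dim, n):
    """Plain Monte Carlo: I ~ V<f>, with V = 1 for the unit hypercube."""
    total = total_sq = 0.0
    for _ in range(n):
        x = [random.random() for _ in range(dim)]  # uniform point in [0,1]^dim
        fx = f(x)
        total += fx
        total_sq += fx * fx
    mean = total / n
    var = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(var / n)  # estimate and its standard MC error
```

The error falls off as 1/√N independently of the dimension, which is why variance reduction, rather than more points, is the lever the rest of this section focuses on.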
Importance sampling means performing a variable change such that regions of phase space where |f| is large gain more weight than regions where |f| is small. This decreases the variance σ, the measure we use to estimate the accuracy of the result. The basic idea is to use a probability density function (PDF) that resembles f, g(x, θ) ∼ f(x)/I. It may depend on a nuisance parameter θ; nuisance parameters are used in the statistics literature to enlarge the parameter space of a theory in order to take into account known unknowns [140]. Letting G(x, θ) be the cumulative distribution of g, we have I = ∫ f(x) dx = ∫ [f(x)/g(x, θ)] dG(x, θ). Put another way, g is the inverse Jacobian determinant, J = |dx/dG|. The variance of the MC integral is estimated by σ² ≈ (1/N) [⟨(f/g)²⟩ − ⟨f/g⟩²], which helps us understand the effect of importance sampling: g reduces the overall MC variance the better it resembles f, and for the optimal choice g(x) = f(x)/I, in which one already knows I, the variance vanishes. For non-optimal choices, the better the shape of g resembles f, the more the peak regions get suppressed by the Jacobian J, reducing the variance of the integrand. The efficiency of importance sampling thus hinges on three distinct requirements: g should resemble f in shape, be invertible, and be fast to evaluate (comparable to the cost of evaluating f).
In other words, G(x, θ) is a coordinate transformation. Sampling uniformly in G-coordinates and mapping the points back to x-space (which requires the inverse Jacobian) allows one to reduce the variance. VEGAS and i-flow, introduced below, differ in how they construct the importance sampling function g.
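The variance reduction can be seen in a one-line example: for f(x) = 3x² on [0, 1] (exact integral 1), sampling from g(x) = 2x via the inverse CDF x = √u and weighting by f/g shrinks the error relative to uniform sampling. The densities here are illustrative choices of ours, not anything used in the paper:

```python
import math, random

def estimate(samples):
    """Mean and standard Monte Carlo error of a list of sampled weights."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, math.sqrt(var / n)

f = lambda x: 3 * x * x  # integrand, exact integral I = 1

random.seed(1)
n = 100_000
# uniform sampling: average f directly
uni = [f(random.random()) for _ in range(n)]
# importance sampling with g(x) = 2x: draw x = sqrt(u) via the inverse CDF
# (using 1 - u to keep x > 0) and average the weight f(x)/g(x);
# g roughly follows the shape of f, flattening the effective integrand
imp = [f(x) / (2 * x) for x in (math.sqrt(1 - random.random()) for _ in range(n))]

mean_u, err_u = estimate(uni)
mean_i, err_i = estimate(imp)
```

Both estimators are unbiased, but the importance-sampled weights 3x/2 fluctuate far less than f itself, so err_i comes out well below err_u at the same number of points.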

VEGAS:
VEGAS [111] is an iterative Monte Carlo scheme that approximates the function f by a histogram function g on a grid. When computing d-dimensional integrals, approximating each dimension by N histogram steps would lead to N^d integrand bins. In order to avoid this exponential scaling, VEGAS assumes the integrand dimensions to be independent, i.e. g(x_1, …, x_d) = g_1(x_1) ⋯ g_d(x_d), (3.6) leading to N·d integrand bins and therefore linear scaling. VEGAS iteratively refines the binning used to generate the histogram. After each evaluation of the integrand, this refinement is done through a weighting proportional to J²f², where J is the Jacobian determinant of the coordinate transformation evaluated at the previous iteration step. Hence, the bins get smaller around regions where |f| is larger.
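The grid refinement can be sketched in one dimension. The following is a toy re-implementation of the idea (not the actual VEGAS code): after each iteration, bin edges are redrawn so that every bin carries an equal share of the accumulated J²f² weight, which shrinks the bins where |f| is large:

```python
import math, random

def vegas_1d(f, n_bins=50, n_iter=6, n_samples=20_000):
    """Toy 1-D VEGAS: adapt bin edges on [0, 1] so that each bin carries
    an equal share of the accumulated J^2 f^2 refinement weight."""
    edges = [i / n_bins for i in range(n_bins + 1)]
    estimate = 0.0
    for _ in range(n_iter):
        weights = [0.0] * n_bins
        total = 0.0
        for _ in range(n_samples):
            b = random.randrange(n_bins)        # pick a bin uniformly
            w = edges[b + 1] - edges[b]
            x = edges[b] + w * random.random()  # uniform point inside the bin
            jac = n_bins * w                    # Jacobian of the grid map
            fx = f(x) * jac
            total += fx
            weights[b] += fx * fx               # refinement weight ~ J^2 f^2
        estimate = total / n_samples
        # redistribute the edges: equal cumulative weight per new bin
        total_w = sum(weights) or 1.0
        per, acc, b = total_w / n_bins, 0.0, 0
        new_edges = [0.0]
        for k in range(1, n_bins):
            target = k * per
            while acc + weights[b] < target:
                acc += weights[b]
                b += 1
            frac = (target - acc) / weights[b] if weights[b] else 0.0
            new_edges.append(edges[b] + frac * (edges[b + 1] - edges[b]))
        new_edges.append(1.0)
        edges = new_edges
    return estimate
```

For a Gaussian bump f(x) = exp(−100(x − 1/2)²), whose exact integral on [0, 1] is √π · erf(5)/10, the adapted grid concentrates bins near x = 1/2 and the estimate converges quickly; the real VEGAS applies this refinement per axis, which is exactly where the factorization assumption (3.6) enters.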
Note that the effectiveness of VEGAS depends on the integrand's lack of correlation among the integration variables, i.e. the factorization assumption underlying Eq. (3.6). For integrands that cannot be factorized, VEGAS samples points poorly [141]. Recent versions of VEGAS use adaptive stratified sampling (see [112]) to partially overcome this issue. Other algorithms, such as FOAM [142], have been proposed for cases in which the integrand does not factorize over its components. FOAM uses an adaptive method to divide the overall phase space into hypercubes, taking correlations into account. Though relatively efficient for low-dimensional integrals, FOAM becomes inefficient in higher dimensions [108]. Moreover, histogram-based methods lack precision around the edges of the histograms, leading to so-called edge effects. This problem is bypassed by neural networks, which approximate the integrand by smooth splines, as we will discuss now.
i-flow: When the independence of components, Eq. (3.6), fails, VEGAS generically provides a poor sampling of the phase space and can be inefficient at probing non-diagonal contributions [108,141], i.e. correlations between different axes in the phase space. As previously mentioned, the central task of importance sampling is to find a coordinate map G that satisfies three conditions: its Jacobian resembles the distribution of the integrand f, and it is invertible and fast to calculate. Neural networks are able to learn an approximation of a given function f that, unlike VEGAS, is independent of the axis alignment and hence captures non-diagonal features.
The basic idea of neural networks is to model (approximate) a function by a concatenation of a number of layers. Each layer depends on the output of the previous layer and on some internal parameters. The NN is then trained on a set of points by tuning these internal parameters. A trained NN can, for example, be used as a fast approximation to the original function, or it can provide a (fast) inversion of the original function.
Recently, NN architectures that are analytically (i.e. efficiently) invertible were proposed, built through the so-called normalizing flows technique [109] (see also [110]). Normalizing flows make use of coupling layers, each of which itself contains an efficiently invertible NN. The Jacobian matrix of the full transformation is designed to have a block-triangular form whose determinant does not involve the inner neural network function m (i.e. the function that is getting tuned), which only appears in the off-diagonal part. Each layer receives a data point ⃗x as input from the previous layer. This point is split into two non-empty subsets ⃗x_A and ⃗x_B; the layer outputs a new data point in which ⃗x_A passes through unchanged and serves as input to a coupling function ⃗C that transforms ⃗x_B. In the next layer the roles are reassigned by a distinct permutation (masking) p; for an illustration, see Fig. 1. The coupling function needs to be easily invertible since it appears in the diagonal blocks of the Jacobian. In Appendix A of [108] some choices for this coupling function are discussed.
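A stripped-down affine coupling layer makes this structure concrete. This is a toy stand-in for i-flow's piecewise rational quadratic layers, with the linear maps s and t acting as placeholders for the real inner network m:

```python
import math

def coupling_forward(x, s_w, t_w):
    """One affine coupling layer: x_A passes through unchanged; x_B is
    transformed using functions of x_A only, so the Jacobian is
    block-triangular and its determinant is cheap to compute."""
    h = len(x) // 2
    xA, xB = x[:h], x[h:]
    # toy 'inner network' m(x_A): linear scale/shift in place of a real NN
    s = [s_w * a for a in xA]
    t = [t_w * a for a in xA]
    yB = [b * math.exp(si) + ti for b, si, ti in zip(xB, s, t)]
    logdet = sum(s)  # log-determinant from the diagonal block only
    return xA + yB, logdet

def coupling_inverse(y, s_w, t_w):
    """Exact analytic inverse: y_A = x_A, so s and t can be recomputed."""
    h = len(y) // 2
    yA, yB = y[:h], y[h:]
    s = [s_w * a for a in yA]
    t = [t_w * a for a in yA]
    xB = [(b - ti) * math.exp(-si) for b, si, ti in zip(yB, s, t)]
    return yA + xB
```

Because ⃗x_A enters the output unchanged, the log-determinant is just Σᵢ sᵢ and the inverse is obtained without ever inverting m, which is exactly the property normalizing flows exploit.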
The integration algorithm then operates as follows on a batch-by-batch basis:
1. Sample uniformly in G-space and use the inverted NN to obtain a sample of points in x-space;
2. Make a Monte Carlo estimate of the integral I using both f(x) and g(x);
3. Update the NN by minimizing a cost function L(I[g(x)], I[f(x)]) (which must be provided to i-flow);
4. Return to step 1, sampling with the updated NN (i.e. updated G and g).
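The loop above can be mimicked with a one-parameter density (a pure-Python toy standing in for the NN and its gradient updates): for f(x) = 3x² on [0, 1], the family g(x; θ) = θ x^(θ−1) contains the optimal sampler at θ = 3, where the weights f/g become constant and a variance-based loss vanishes:

```python
import random

f = lambda x: 3 * x * x  # integrand, exact integral I = 1

def loss(theta, n=20_000):
    """Empirical variance of the importance weights f(x)/g(x; theta), with
    g(x; theta) = theta * x**(theta - 1), sampled via x = u**(1/theta)."""
    ws = []
    for _ in range(n):
        x = (1 - random.random()) ** (1 / theta)  # inverse-CDF sample, x in (0, 1]
        ws.append(f(x) / (theta * x ** (theta - 1)))
    mean = sum(ws) / n
    return sum((w - mean) ** 2 for w in ws) / n

# crude 'training': scan theta instead of taking gradient steps
random.seed(3)
best_loss, best_theta = min((loss(th), th) for th in (1.0, 1.5, 2.0, 2.5, 3.0, 3.5))
```

The scan settles on θ = 3 with a loss of essentially zero, illustrating steps 2-4 of the algorithm; i-flow replaces the scan with Adam updates of the coupling-layer parameters.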
NNs have also been used in other ways for the evaluation of (Feynman) integrals, for example to optimize the contour deformation [143]. Machine-learning-based algorithms have also been shown to outperform VEGAS and FOAM for simple non-factorizable integrands [141,144]. For applications of NNs in Monte Carlo event generation see [145][146][147].

Setup
For our comparison of the traditional Monte Carlo approach of VEGAS with the novel NN implementation of i-flow we used pySecDec 1.5.2 to construct sector-decomposed integrands. As discussed above, we used the geometric decomposition method [101], which produces the most efficient integrands for our purpose. We optimized the Feynman parametrization by analytically continuing the external data in order to obtain a positive Symanzik polynomial F whenever possible. The analytic continuation that worked in many cases is the following. Let u = (0, 0, 1), such that the linear propagators can be written as ±u · ℓ_i − i0. The overall power of the u dependence can be inferred by power counting. By computing the integral for u² = −1 instead of u² = 1, many parametrizations have positive Symanzik polynomials and a complex contour deformation is not necessary. Some technical details related to this are given in App. B. The presence of a contour deformation typically leads to poor convergence of the integral. Some integrals we considered (K^{(++)}_{11;00111} and B^−_4) had a technical difficulty related to poles appearing on the boundary of the Symanzik polynomial F when one of the Feynman parameters x_i → 1. These are not captured by standard sector decomposition. Such poles lead to poor convergence and in some cases even to erroneous results. This issue can be resolved by yet another split of the integral into more sectors. Details can be found in [107], where an option for the newest version of FIESTA was presented that takes care of this issue semi-automatically. The same paper also discusses in detail the correct treatment of one of our two-loop integrals, K^{(+−)}_{11;00111}. In pySecDec the same can be achieved by performing the split manually; in the presence of three or more linear propagators this requires quite some manual work. We have not observed such issues for integrals where no contour deformation was needed. For the families B_5/B_6 and M_4 we chose to numerically integrate only one integral per family, chosen to have a positive Symanzik polynomial F, namely B^{++}_5 and M^{+++}_4. All other integrals from these families can be obtained from the identities in Eqs. (2.46) and (2.63).
For the VEGAS integrator we used the default setup of the pySecDec C++ generator, which makes use of the CUBA library [148]. The setup of the i-flow pipeline is more involved.
i-flow makes use of the TensorFlow library [149]. In order to expose pySecDec's integrand to the TensorFlow interface, we created a TensorFlow operator (op) directly from the C++ integrand class. The i-flow code then takes care of the normalizing flows, with the number of (piecewise rational quadratic) coupling layers scaling with the dimension of the integral. Each coupling layer has 4 hidden layers, each with 32 nodes and a rectified linear activation function (ReLU). We also used an Adam optimizer [150] and an exponential loss function. Each epoch of the NN samples 4096 points. We noticed that i-flow results are slightly biased, but introducing a pre-training stage can attenuate this issue. In practice, this pre-training stage means that we run the NN algorithm until it reaches 50% of the required relative precision and then reset the samplings. We have checked that this amount of pre-training reduces the bias in the results to below 2σ, where σ is the target precision. A more in-depth study of the source of this bias and of a proper way to overcome it is still required, though this does not change the overall scaling of the NNs or the conclusions of this work. Finally, i-flow stores all sampled points, which may run into memory issues. For some of the integrals reported in Sec. 3.3, this memory limitation has been an obstacle for i-flow; similar problems were already reported in [146].
Precision

In order to get a feeling for the desired precision as a function of the order in the ϵ power series expansion, we discuss an example of an analytic reconstruction approach based on high-precision numerical results. For some of the integrals in the previous section we were not able to perform a series expansion to arbitrary order in ϵ, even though we were able to derive the complete analytical result. The reason is the appearance of hypergeometric functions with arguments depending on ϵ, which are inherently difficult to power expand [151,152]. Away from the leading order we relied on integer relation algorithms to conjecture an analytic result. Consider the integral B^−_4, for which we presented the series expansion in Eq. (2.41). Assuming uniform transcendental weight, we built an ansatz including only the set of transcendental numbers {π, log(2)}; at transcendental weight 3 one also has to include ζ(3) (and possibly other constants). The unknown rational coefficients a_1, b_1, b_2, c_1, c_2, and c_3 were then determined via the PSLQ algorithm. The leading coefficient, a_1 = −3/2, can in fact be computed analytically, since it does not involve derivatives of hypergeometric functions, or simply guessed by eye from a numerical result. For the coefficients b_1 and b_2 at O(ϵ^{−1}), the output of FindIntegerNullVector stabilizes already at a precision of 3 digits. Finally, for the c_i, 11 digits were needed for a stable prediction. To gain confidence in such a conjecture, one would like to check it to a much higher precision; with the full analytical results at hand, we have checked these results to a precision of 150 digits. Looking at this from a different angle, sometimes a good guess can work equally well, since the correctness of the reconstruction can be justified a posteriori, e.g.
in our case by comparing to Post-Newtonian results for physical observables that encapsulate all information about the small-velocity limit in which we compute the boundary integrals [44,49]. For all series expansions where the full analytic result contains hypergeometric functions, we observed similar numbers of digits for a stabilization of the PSLQ algorithm. Dropping the uniform-transcendentality constraint and including a bigger set of transcendental numbers would accordingly require higher-precision results for stabilization. Hence, it can be beneficial to identify uniformly transcendental integrals in order to use such a reconstruction.
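The digit counts quoted above can be explored with a small experiment. The following is a crude brute-force stand-in for PSLQ/FindIntegerNullVector (a direct scan rather than a lattice-reduction algorithm, with a made-up target value instead of one of the paper's coefficients):

```python
import math
from itertools import product

PI, LOG2 = math.pi, math.log(2)

# Pretend this number came out of a numerical integration:
# the "unknown" coefficient is 3*pi/2 + 5*log(2).
target = 3 * PI / 2 + 5 * LOG2

# Scan small rational combinations (a + b*pi + c*log 2)/den and keep
# the one closest to the target.
best = (float("inf"), 0, 0, 0, 1)
for den in range(1, 5):
    for a, b, c in product(range(-12, 13), repeat=3):
        err = abs((a + b * PI + c * LOG2) / den - target)
        if err < best[0]:
            best = (err, a, b, c, den)

err, a, b, c, den = best
print(f"target = ({a} + {b}*pi + {c}*log2)/{den}   residual {err:.1e}")
```

For serious reconstructions one would of course use a genuine integer-relation algorithm (PSLQ, as in the text) at high working precision, which scales to large coefficients and longer bases of transcendental constants.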
If such ideas are to become useful for results beyond the leading, and perhaps subleading, term in ϵ, the precision of the numerical integration results needs to increase exponentially. One improvement in that direction is presented in this paper. We decided to aim for a relative precision of σ = 10⁻⁴, since it is already sufficient to conjecture analytic results for many of the subleading terms. In order to get an idea of the scaling behaviour, we also give numbers for a relative precision of 10⁻³.

Numerical Results
We continue in this subsection by showing explicit results for the comparison of VEGAS and i-flow for the two-, three-, and four-loop integrals introduced in Sec. 2. For this comparison we present the number of integrand evaluations needed by i-flow and VEGAS for each integrand, at each order in ϵ, for relative precisions σ = 10⁻³ and 10⁻⁴. We note that comparing the computational time is not a satisfactory metric. VEGAS has been substantially optimized and its performance is fully parallelized. Even though we have parallelized the i-flow sampling, there is still plenty of room to improve its performance at the implementation level. Moreover, the training stage of i-flow is not the limiting part of the algorithm; the sampling is by far the most time-consuming part. Therefore, the number of samples is a more meaningful metric, as also used in previous comparisons [141,146].
The results are summarized in Tables 1, 2, and 3 for the two-, three-, and four-loop integrals, respectively. We note that we were not able to estimate some higher-order-in-ϵ terms, since the i-flow computations ran into memory problems similar to those reported in [146]. For 10⁻³ relative precision, VEGAS often required fewer evaluations than i-flow, especially for lower-dimensional integrals. When increasing the complexity of the integrand (by increasing the loop order, the integral dimensionality, or the ϵ-order, or by requiring more precision), i-flow starts to overtake VEGAS. This is consistent with the observations presented in [108]. When requiring 10⁻⁴ precision, i-flow outperformed VEGAS in almost all cases and orders in ϵ.
In order to understand the scaling behaviour of the relative error, we display in Fig. 2 the evolution of the i-flow (solid lines) and VEGAS (dashed lines) error as a function of the number of evaluations N. We use the exemplary integrals K^(±)_02;10110 and B₀ defined in Eqs. (2.21) and (2.23), respectively. The discontinuities in the i-flow graphs are due to the pre-training stage. We observe that for both integrals at 10⁻³, VEGAS indeed reaches the required precision faster than i-flow. i-flow underperforms here because the early stage of learning the phase-space distribution already requires a high number of evaluations. Also for 10⁻⁴ precision i-flow still has a latent training stage, but once it is fully trained, its error curve is significantly steeper than that of VEGAS, especially for the harder three-loop integral. The dotted lines represent the expected σ = 1/√N behaviour according to Eq. (3.5) in a late phase where only extra sampling is being performed. Hence, the asymptotic behaviour of VEGAS typically follows this 1/√N law. In contrast, NNs have an asymptotic behaviour better than 1/√N, since they continue gathering information and learning about the system even at this late stage. The expectation that NNs work better for more complex integrals is confirmed by the plots in Fig. 2. Note that the maximal dimensionality of the example integrals in Fig. 2 is 3. When increasing the dimension of the integrals (typically when going to higher loops), the crossover point at which i-flow outperforms VEGAS occurs earlier (see Table 2). For instance, for B₂, the required number of evaluations is five times smaller for i-flow when estimating it at 10⁻⁴ precision. Therefore, when computing high-loop integrals, NN technologies like i-flow lead to significant improvements.

Table 1: We list the number of integrand evaluations needed to reach 10⁻³ and 10⁻⁴ precision for the two-loop integrals using VEGAS and i-flow. The first column shows the order in ϵ after sector decomposition (see Sec. 2) and the second column the integral dimensionality in parametrized form. Empty entries correspond to integrals we skipped because i-flow runs into memory problems, as reported in [146].

Table 3: Results for the four-loop integrals. Same notation as in Table 1.
To get yet another impression of the scaling behaviour, we show in Fig. 3 the total number of evaluations needed as a function of the required relative precision for two (simpler) integrals. A more detailed study of these scaling fits will be the focus of a future study. On the other hand, we do not see any potential improvements to the VEGAS setup that could substantially change its asymptotic scaling.
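The 1/√N baseline discussed here is easy to verify empirically; the following toy experiment (a generic one-dimensional integrand, unrelated to the PM integrals above) measures the RMS error of plain Monte Carlo at two sample sizes:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
exact = 1.0 / 3.0  # integral of x^2 over [0, 1]

def rms_error(n_samples, n_trials=200):
    # RMS error of the plain MC estimate over independent trials
    x = rng.random((n_trials, n_samples))
    estimates = (x**2).mean(axis=1)
    return float(np.sqrt(((estimates - exact) ** 2).mean()))

e_small, e_large = rms_error(100), rms_error(10_000)
print(f"RMS error: N=100 -> {e_small:.4f}, N=10000 -> {e_large:.5f}")
print(f"ratio: {e_small / e_large:.1f}  (1/sqrt(N) predicts 10)")
```

Any integrator that keeps adapting while it samples, as i-flow does, can beat this ratio, which is exactly the behaviour visible in Fig. 2.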

Discussion and Outlook
In this work we have initiated the application of modern machine-learning techniques to the numerical evaluation of multi-loop Feynman integrals, with a special focus on loop integrals relevant for precision predictions for gravitational-wave observations. Using pySecDec's C++ interface for the sector decomposition and contour deformation, we have compared two different Monte Carlo integrators: the traditional VEGAS method, based on partitioning the phase space into non-uniform histograms, and i-flow, a neural-network sampler that autonomously learns the phase-space distribution of the integrand. We want to emphasize that numerical approximations are useful not only to check analytical expressions; they also open the door to using high-precision numerical results directly in the numerical construction of gravitational waveform templates, or in integer-relation conjectures for analytical reconstruction.
We have found that for simpler integrals, namely lower-dimensional ones, lower orders in ϵ, and integrals containing fewer linear propagators, VEGAS performs better. This is partially due to the learning phase that is required for an unbiased neural-network setup. However, increasing the complexity of the phase space or aiming to surpass per-mille precision makes integration with VEGAS significantly more time-consuming. In such cases i-flow starts to outperform traditional methods. Based on normalizing flows, i-flow provides an efficient and systematic method to sample the phase space. Our results are consistent with previous observations of i-flow applied to other systems: its error decreases slowly in the early stages due to an initial transient phase, but the normalizing flow keeps learning about the integrand topology. Due to its sampling strategy, i-flow's variance estimate then generically decreases faster than the naive 1/√N of traditional Monte Carlo sampling, where N is the number of integrand evaluations.
We would also like to point out the current limitations of numerical integration via sector decomposition and Monte Carlo methods. First, our sector-decomposed integrands tend to run into divergences (undetected singularities) that need to be taken care of manually. Second, requiring more precision (σ < 10⁻⁴) demands O(10¹⁰) evaluations, meaning that we hit a hardware wall in terms of the memory requirements of i-flow. Improved sector decomposition algorithms have the potential not only to overcome the former, but can also lead to better integrands in terms of convergence speed, which in turn reduces the number of required integrand evaluations. The memory issues of i-flow can be fixed with improved memory management, i.e. storing only the results that are actually required.
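The memory fix suggested above, accumulating statistics on the fly instead of storing every sampled point, can be sketched with Welford's online algorithm (our illustration of the idea; we have not inspected i-flow's internal bookkeeping):

```python
import numpy as np

class RunningMoments:
    """Streaming mean/variance so sample batches can be discarded."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, value):
        # Welford's update: numerically stable and O(1) in memory
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1)

rng = np.random.default_rng(seed=3)
samples = rng.normal(loc=2.0, scale=0.5, size=50_000)

acc = RunningMoments()
for s in samples:
    acc.update(s)

# acc.mean and acc.variance now match np.mean / np.var(ddof=1),
# without ever having to keep all samples in memory at once
print(acc.mean, acc.variance)
```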
One way to improve our current setup for PM integrals is to utilize integral identities like Eq. (2.42) to identify a set of independent integrals that have desirable properties for numerical algorithms. Of course, this could simply be done by trial and error, but it would also be interesting to have integrand-level criteria to determine whether a given integral is well suited for numerical integration. One simple criterion that we have identified is the positivity of the Symanzik polynomials of the parametrized form of the integral. Positive polynomials render complex contour deformations unnecessary, which can significantly decrease the integrand evaluation time and improve its convergence properties. We have analytically continued the external kinematics in order to achieve a positive F polynomial for many of our examples.
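Such an integrand-level check is cheap to automate. As a toy example (the textbook massless one-loop bubble with U = x₁ + x₂ and F = −s x₁x₂, not one of the PM integrals above), sampling the Feynman-parameter simplex immediately reveals whether F changes sign and a contour deformation is needed:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

def F_bubble(x1, x2, s):
    # Second Symanzik polynomial of the massless one-loop bubble
    return -s * x1 * x2

# Interior points of the simplex x1 + x2 = 1 with x1, x2 > 0
x1 = rng.uniform(0.01, 0.99, size=10_000)
x2 = 1.0 - x1

f_euclidean = F_bubble(x1, x2, s=-1.0)  # s < 0: Euclidean kinematics
f_physical = F_bubble(x1, x2, s=+1.0)   # s > 0: physical region

print("min F (s=-1):", f_euclidean.min())  # positive: no deformation needed
print("max F (s=+1):", f_physical.max())   # negative: deformation required
```

For the multi-loop F polynomials encountered in practice one would sample the full simplex in the same way; a single sign flip rules the integral out of the deformation-free class.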
As a further improvement, we note that pySecDec has recently been extended by a quasi-Monte Carlo (QMC) [153] integrator [154]. QMC uses quasi-random grids to generate sample points in the phase space, while traditional MC samples random numbers. This improves the theoretical scaling of the variance from 1/√N for traditional Monte Carlo to 1/N or even 1/N². A challenge for QMC algorithms is the exponential scaling of the variance with the integral dimension d, although methods have been developed to overcome this scaling [154]; in any case, for the foreseeable future we do not expect to encounter integrals with dimension significantly higher than 10. We hence expect that a QMC integrator could be combined with improved NN phase-space sampling to reach an even better performance (see e.g. [155]). We leave this for future work.
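The improved scaling can already be seen with the simplest low-discrepancy construction, a golden-ratio Kronecker sequence, which we use here as a hypothetical stand-in for the rank-1 lattice rules of [154]:

```python
import numpy as np

N = 4096
exact = 1.0 / 3.0  # integral of x^2 over [0, 1]

# Plain MC: pseudo-random points, error ~ 1/sqrt(N)
rng = np.random.default_rng(seed=0)
mc_estimate = float((rng.random(N) ** 2).mean())

# QMC: deterministic Kronecker sequence x_i = frac(i * phi), whose
# star discrepancy decays like log(N)/N
phi = (np.sqrt(5.0) - 1.0) / 2.0
x_qmc = np.mod(np.arange(1, N + 1) * phi, 1.0)
qmc_estimate = float((x_qmc**2).mean())

print(f"MC  error: {abs(mc_estimate - exact):.2e}")
print(f"QMC error: {abs(qmc_estimate - exact):.2e}")
```

For this smooth one-dimensional integrand the Koksma-Hlawka inequality bounds the QMC error by the discrepancy of the point set, whereas the MC error only shrinks statistically.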
It is clear that the framework developed in this paper is straightforwardly applicable to other multi-loop integrals, e.g. in the context of the effective field theory of large-scale structure [156][157][158][159][160][161]. The success of similar methods for similar integration problems [141,144,162,163], and in other areas such as precision measurements at high-energy colliders [145][146][147][164][165][166], strongly motivates us to apply normalizing flows to extending the multi-loop program in the context of gravitational waves. This work is intended as the beginning of a program in which numerical calculations and analytical results complement each other and together push the theory to exquisite precision.

A.2 Some deformation magic
The goal of this subsection is to find analytic expressions for B₃, B₄, M₂, M₃, as well as K₁₁;₀₀₁₁₁, introduced in Section 2, after integrating out up to two trivial loop momenta. The integral above receives contributions mainly from the region ℓ₁ᶻ ∼ ℓ₂ᶻ ∼ 0, where it behaves like (0 − i0)/(±0 − i0). This is in clear contrast to a structure like (finite − i0)/(0 − i0), for which the i0 in the numerator can safely be neglected.

The first integral I₁^±
We denote the Feynman parameters by x₁, . . ., x₅, corresponding to the five propagators, and write down the Feynman parametrization with the first Symanzik polynomial

U = x₃x₄ + x₄x₅ + x₅x₃ . (A.17)

We do not yet specify the subset I ⊂ {1, 2, . . ., 5}, since it can be chosen as an arbitrary nonempty set according to the Cheng–Wu theorem [167]. For illustrative purposes we split the integral into the two contributions from the regions x₃ > x₄ and x₃ < x₄. In order to understand how the integrations in x₁ and x₂ behave, we identify the directions that diagonalize the associated quadratic form, i.e. we rotate and rescale to new coordinates x₁″ and x₂″ with eigenvalues λ₁ and λ₂. Note that λ₂ > λ₁ > 0 for x₃ > x₄. Switching to x₁″ and x₂″ deforms the original integration region [0, ∞) × [0, ∞) for x₁ and x₂ in a nontrivial way, see Fig. 4; we call the deformed regions R±.

[Figure 4: Integration regions before and after the coordinate transformations performed in the main text. In one case the original region x₁ > 0, x₂ > 0 expands after transforming to the x₁″ and x₂″ coordinates, while in the other it shrinks. We call these deformed regions R±.]

The rotation angle θ is positive for I₁⁺ in the region x₃ > x₄ and negative for I₁⁻. As illustrated in Fig. 4, the original integration range x₁ > 0, x₂ > 0 translates into the regions R± with opening angles α±. Noting that 0 < α₋ < π/2 and α₊ + α₋ = π, one finds

α₊ = π − arctan[(x₃x₄ + x₄x₅ + x₅x₃)^(1/2)/x₅] , (A.27)
α₋ = arctan[(x₃x₄ + x₄x₅ + x₅x₃)^(1/2)/x₅] .
(A.28)

The same set of angles also appears for I₁± in the region x₃ < x₄. Since x₁″ and x₂″ appear in the integrand only through the combination x₁″² + x₂″², we may extend the integration region to the whole x₁″–x₂″ plane and compensate by multiplying with α±/(2π). The x₁″ and x₂″ integrations can then be performed (assuming 1 ∉ I and 2 ∉ I), and it is straightforward to integrate out x₃. Note that the condition x₃ > 0 implies x₄ < x₅x₆² from the RHS of (A.32).

The second integral I₂^±

The second integral I₂± has almost the same Feynman parametrization as the first one, because the two share an identical topological structure when it comes to the squared propagators. To be explicit, performing a shift of ℓ₂ according to ℓ₂ → −(ℓ₁₂ − q) leads to the identities in Eq. (A.37). Whereas we were able to confirm these identities numerically, we leave an analytic proof for future research.
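The angular extension trick used for I₁± above can be written out explicitly (our paraphrase of the argument in the text, with g a generic radial integrand):

```latex
\int_{R_\pm} \mathrm{d}x_1''\, \mathrm{d}x_2''\;
  g\!\left(x_1''^2 + x_2''^2\right)
= \frac{\alpha_\pm}{2\pi}
  \int_{\mathbb{R}^2} \mathrm{d}x_1''\, \mathrm{d}x_2''\;
  g\!\left(x_1''^2 + x_2''^2\right),
```

since R± is a wedge of opening angle α± with its apex at the origin and the integrand depends only on the radius.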

B Wick rotations
By default, pySecDec and FIESTA define loop integrals in Minkowski space. To compute a Euclidean loop integral with these programs, one therefore has to transform it into its Minkowskian counterpart by a (reverse) Wick rotation.
To proceed, we define the scalar products of two vectors in d-dimensional Euclidean and Minkowski space, respectively, and relate them through the so-called Wick rotation. Using this transformation, we can translate any integral from Euclidean space into Minkowski space, or vice versa. Let us consider a two-loop example with q_E · u_E = 0. According to the Wick rotation defined in (B.2), its Minkowskian counterpart satisfies q_M · u_M = −q_E · u_E = 0. To show their equivalence explicitly, one writes down their parametric representations, with q²_M = −q²_E according to (B.2). As discussed previously, F_E⁻ can be positive if we take an unphysical value of u_E such that u²_E = −1.
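For concreteness, one standard convention consistent with the sign relations quoted above, q_M · u_M = −q_E · u_E and q²_M = −q²_E, reads (our hedged reconstruction of the conventions, not a verbatim copy of Eqs. (B.1)–(B.2)):

```latex
(a \cdot b)_E = a^0 b^0 + \sum_{i=1}^{d-1} a^i b^i \,, \qquad
(a \cdot b)_M = a^0 b^0 - \sum_{i=1}^{d-1} a^i b^i \,, \qquad
\ell^0_M = i\, \ell^0_E \,,
```

so that every scalar product flips sign, (a · b)_M = −(a · b)_E, reproducing both relations used in the example.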

Figure 1 :
Figure 1: Normalizing flow scheme. We have n coupling layers with coupling transforms C and neural-network functions m. x_A and x_B are partitions of x. x_A goes through a NN transformation m and serves as input, together with x_B, for a coupling transform C. The output of C, together with x_A, serves as input for the second layer, now under a distinct permutation (masking) p. See also [110].

Figure 3 :
Figure 3: We plot the total number of evaluations N_tot as a function of the relative precision σ: on the left for the leading ϵ term of the two-loop integral K_00;00111; on the right for the leading term of the three-loop integral D₀. The black lines show the expected theoretical behaviour for comparison, see main text.

Table 2 :
Results for the three-loop integrals. Same notation as in Table 1.