Two-Loop Integrals for Planar Five-Point One-Mass Processes

We present the computation of a full set of planar five-point two-loop master integrals with one external mass. These integrals are an important ingredient for two-loop scattering amplitudes for two-jet-associated W-boson production at leading color in QCD. We provide a set of pure integrals together with differential equations in canonical form. We obtain analytic differential equations efficiently from numerical samples over finite fields, fitting an ansatz built from symbol letters. The symbol alphabet itself is constructed from cut differential equations and we find that it can be written in a remarkably compact form. We comment on the analytic properties of the integrals and confirm the extended Steinmann relations, which govern the double discontinuities of Feynman integrals, to all orders in $\epsilon$. We solve the differential equations in terms of generalized power series on single-parameter contours in the space of Mandelstam invariants. This form of the solution trivializes the analytic continuation and the integrals can be evaluated in all kinematic regions with arbitrary numerical precision.


Introduction
Since the early history of quantum field theory, perturbative scattering amplitudes have been a crucial tool in high-energy physics. As gauge-invariant consequences of the underlying field theory, their analytic structure unambiguously captures features of the theory which are not manifest in the action. It is then no surprise that they find many uses, both in formal studies of field theory and in more traditional applications such as making predictions for collider physics. The computation of scattering amplitudes has been a topic of intense study in recent years. Nevertheless, and despite great recent advances at the five-point frontier [1][2][3][4][5][6][7][8][9][10][11][12][13], it still represents a formidable challenge at the two-loop level. Loop scattering amplitudes are multi-valued functions, whose branch-cut structure depends on the kinematics and loop order. This largely theory-independent analytic structure can be packaged and understood in various ways. One practical presentation is through a collection of so-called 'master integrals', in terms of which all scattering processes with the same kinematics and loop order can be linearly expanded. When these master integrals evaluate to polylogarithmic functions, it is also fruitful to understand the analytic structure in terms of the so-called 'symbol' [14][15][16]. For instance, in maximally supersymmetric Yang-Mills theory, this organization has led to the growth of a 'bootstrap' program for the amplitudes [17][18][19][20][21][22][23][24][25], most recently culminating in a computation at the seven-loop order [26].
In this work, we contribute to the understanding of the analytic structure of multi-leg two-loop scattering amplitudes by computing the planar two-loop master integrals with one massive and four massless external legs. These integrals are highly relevant for QCD collider processes involving four massless partons and a heavy particle such as a massive vector boson. While the corresponding amplitudes have been computed numerically [13], here we take the first steps towards their analytic calculation, which is strongly desirable for phenomenological studies at the Large Hadron Collider (LHC) [27]. The computation of multi-scale Feynman integrals relevant for massless QCD has received much attention in the literature. At two loops, all relevant four-point integrals are known [28,29], and recently the planar five-point two-loop integrals have been evaluated analytically, both in terms of multiple polylogarithms [30,31] and of a more tailored set of pentagon functions [32]. Important progress has also been made beyond the planar limit [8,33]. Much less is known about five-point two-loop integrals with an external mass, for which only partial results are available [31,34]. The planar five-point one-mass integrals at two loops are the main topic of the present paper.
In the past few decades a great deal of progress has been made in novel computational techniques for Feynman integrals. One of the most effective is the differential equations method [35][36][37][38][39]. The method is particularly useful whenever a basis of integrals is available such that the differential equation assumes a 'canonical' form [40], where the dependence on the dimensional regulator $\epsilon$ factorizes and the matrix can be expressed in terms of so-called 'd log forms'. A canonical differential equation also naturally encodes the analytic structure of the integrals, directly manifesting the 'symbol alphabet'. Simultaneously, it provides an important way of evaluating the master integrals, and so is the perfect workhorse for our investigations. Nevertheless, the construction of a canonical differential equation is challenging because it requires a basis of 'pure' master integrals [41,42]. Devising an effective D-dimensional algorithm to find such a basis is an active field of research (see e.g. [33,[42][43][44][45][46][47][48][49][50][51]). Here we solve this problem by constructing the basis with a heuristic approach, which we then validate by constructing the differential equation and observing the canonical form. Even when the pure basis is known, the construction of the analytic form of the differential equation is a technically challenging procedure. We employ the numerical sampling method of ref. [49] implemented over finite fields [52,53], which we show can also be applied in cases where square roots must be taken in intermediate stages. Integral reduction can then be performed numerically using, for example, standard public packages [54][55][56] or modern unitarity-based methods [2,[57][58][59][60][61][62][63][64]. A further requirement for the application of the numerical sampling approach of ref. [49] is the symbol alphabet. With that in mind, here we show how the full symbol alphabet can be constructed from a technically simpler computation of cut differential equations.
We organize the symbol alphabet and observe that, despite the complex five-point one-mass scattering kinematics, it can be written in a remarkably compact form. We then obtain the symbols of the integrals from their differential equation and show that the (extended) Steinmann relations [24,[65][66][67][68] follow from the structure of the differential equation.
The differential equation is also a useful tool to represent the master integrals in terms of known sets of functions, which can then be used for their efficient numerical evaluation. There also exist numerical approaches based on Monte-Carlo integration [69][70][71][72][73][74], some of which have been used to supply two-loop integrals in amplitude computations [75][76][77][78][79][80]. However, because the efficiency and precision of Monte-Carlo techniques are often limiting, analytic results are still desirable. Obtaining such results from the differential equation can be practically difficult in multi-scale applications such as five-point integrals. Indeed, whilst the results would naturally be written in terms of multiple polylogarithms, the analytic continuation required to be able to use them over all of phase space can be challenging. In this paper we apply the method of [81], where analytic solutions to the differential equation are constructed on a one-dimensional path in the form of a collection of generalized power series. Such an approach trivializes the integration step, and analytic continuation is easily implemented by an appropriate choice of integration contours. As such, the integrals can easily be computed not just in the Euclidean but also in the physical regions with high numerical precision. The approach of ref. [81] has already been applied in the computation of two-loop integrals for the QCD corrections to Higgs+jet production [81][82][83]. Here we apply it for the first time to five-point kinematics. We review the application of the method to a canonical differential equation and explain how to use it to compute boundary conditions. Furthermore, we demonstrate the readiness of the method for LHC physics in a number of ways, such as computing high-precision boundary conditions for the integrals at hand in both Euclidean and physical regions and showing the efficiency with various studies over physical phase space.
The main numerical and analytic results of the paper are provided in a set of ancillary files. The definition of the pure master integrals is given in anc/*/pureBasis-*.m. The alphabet is given in the file anc/alphabet.m. The differential equations for the three integral families are given in the files anc/*/diffEq-*.m. High-precision reference values are given in anc/*/numIntegrals-*.m. For convenience, we provide an example of how to use these ancillary files in anc/usageExample.m, where we also generate the symbols of the master integrals.
The paper is structured as follows. First, in section 2 we describe important features of the kinematics relevant for five-point one-mass scattering. In section 3 we describe the loop integrals which we compute. Next, in section 4, we discuss our numerical construction of the differential equations and thereby the basis of pure integrals. Further, in section 5 we discuss the symbol alphabet and the implications of the differential equations for the analytic structures of scattering amplitudes. In section 6 we discuss the application of the generalized series method to the solution of the differential equations. In section 7, we study numerical evaluation of the integrals in physical regions. Finally, we summarize the results of our work and discuss extensions in section 8.

Scattering kinematics
The main result of this paper is a calculation of a basis of two-loop integrals relevant for planar five-point scattering processes with a single massive external leg. However, before we delve into that problem, we first briefly discuss the kinematics of these processes and introduce some quantities that will be relevant in the following sections.
The momenta of the scattering particles are labelled $p_i$, $i = 1, \ldots, 5$, and fulfil momentum conservation, $\sum_{i=1}^{5} p_i = 0$. Without loss of generality, we assume $p_1$ to be massive, i.e. $p_1^2 \neq 0$, and the remaining ones to be massless, $p_i^2 = 0$ for $i = 2, \ldots, 5$. Out of these momenta, we can form six independent Mandelstam variables of the form $s_{ij} = (p_i + p_j)^2$, which we choose to be
$$\vec{s} = \{p_1^2,\, s_{12},\, s_{23},\, s_{34},\, s_{45},\, s_{15}\} \,. \qquad (2.1)$$
For concreteness, in this paper we use the metric $g = \mathrm{diag}(+,-,-,-)$, which we extend with further minus signs when working in $D$ dimensions. These variables are not sufficient to characterize the kinematics of the scattering process: there is an additional parity label, which can be captured by the parity-odd Levi-Civita contraction
$$\mathrm{tr}_5 = 4\, i\, \epsilon_{\mu\nu\rho\sigma}\, p_1^{\mu} p_2^{\nu} p_3^{\rho} p_4^{\sigma} \,. \qquad (2.2)$$
Indeed, space-time parity inverts all spatial momentum components and, while the Mandelstam variables are invariant, $\mathrm{tr}_5$ changes sign under this transformation,
$$(\vec{s},\, \mathrm{tr}_5) \;\to\; (\vec{s},\, -\mathrm{tr}_5) \,. \qquad (2.3)$$
It is also useful to introduce Gram determinants when discussing the kinematics of scattering processes. They are given by the determinants of the Gram matrix $G(q_1, \ldots, q_n)$, which we define as
$$G(q_1, \ldots, q_n) = 2\, V^{T}(q_1, \ldots, q_n)\, g\, V(q_1, \ldots, q_n) = 2\, \{q_i \cdot q_j\}_{i,j \in \{1,\ldots,n\}} \,, \qquad (2.4)$$
where the factor of two is conventional and $V(q_1, \ldots, q_n)$ is a $4 \times n$ matrix whose columns are the vectors $q_i$. It can be shown from this definition that if the $q_i$ are linearly dependent then the Gram determinant vanishes, and also that the determinant is invariant under shifts of any of the $q_i$ by any of the other momenta. Returning to the discussion of five-point one-mass kinematics, we note that the parity-odd $\mathrm{tr}_5$ is related to the parity-even five-point Gram determinant through
$$\mathrm{tr}_5^2 = \Delta_5 \equiv \det G(p_1, p_2, p_3, p_4) \,. \qquad (2.5)$$
In other words, $\mathrm{tr}_5$ is a square root of a polynomial in the Mandelstam variables $\vec{s}$.
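The relation between $\mathrm{tr}_5$ and the five-point Gram determinant can be cross-checked numerically. The following sketch (our illustration, not part of the paper's ancillary files) generates a random momentum configuration with $p_1$ massive and verifies that $\mathrm{tr}_5^2 = \Delta_5$; we assume the common normalization $\mathrm{tr}_5 = 4i\,\epsilon_{\mu\nu\rho\sigma} p_1^\mu p_2^\nu p_3^\rho p_4^\sigma$, for which the contraction equals $4i \det V(p_1,\ldots,p_4)$ (only $\mathrm{tr}_5^2$ is convention-independent).

```python
import numpy as np

rng = np.random.default_rng(7)

def massless(rng):
    """Random massless momentum (E, px, py, pz) with E = |p|."""
    v = rng.normal(size=3)
    return np.array([np.linalg.norm(v), *v])

# p2..p5 massless, p1 fixed by momentum conservation (and hence massive)
p = {i: massless(rng) for i in range(2, 6)}
p[1] = -sum(p[i] for i in range(2, 6))

g = np.diag([1.0, -1.0, -1.0, -1.0])   # mostly-minus metric
def mdot(a, b):
    return a @ g @ b

# Gram matrix G(p1,...,p4) = 2 {p_i . p_j} and the five-point Gram determinant
gram = 2.0 * np.array([[mdot(p[i], p[j]) for j in range(1, 5)]
                       for i in range(1, 5)])
delta5 = np.linalg.det(gram)

# tr5 = 4i eps_{mu nu rho si} p1^mu p2^nu p3^rho p4^si = 4i det V
V = np.column_stack([p[i] for i in range(1, 5)])
tr5 = 4j * np.linalg.det(V)
```

On any such configuration one finds $\mathrm{tr}_5^2 = \Delta_5 < 0$ up to floating-point error, consistent with $\mathrm{tr}_5$ being a square root of a polynomial in the Mandelstam variables and with real momenta implying $\Delta_5 < 0$.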
Two other square roots related to Gram determinants which are not perfect squares are relevant for the scattering kinematics we are considering. The associated Gram determinants can be written in terms of the Källén function $\lambda(a,b,c) = a^2 + b^2 + c^2 - 2ab - 2bc - 2ca$:
$$\Delta_3 \equiv -\lambda(p_1^2,\, s_{23},\, s_{45}) \,, \qquad (2.6)$$
$$\Delta_3^{\mathrm{nc}} \equiv -\lambda(p_1^2,\, s_{34},\, s_{25}) \,, \qquad (2.7)$$
where the minus sign is conventional. Our notation is explained by the fact that $\Delta_3$ is naturally associated with a degeneration of the kinematics which preserves the cyclicity of the momenta, $\{p_1, p_2, p_3, p_4, p_5\} \to \{p_1, p_2 + p_3, p_4 + p_5\}$, while $\Delta_3^{\mathrm{nc}}$ is associated with a degeneration that does not preserve it, $\{p_1, p_2, p_3, p_4, p_5\} \to \{p_1, p_3 + p_4, p_2 + p_5\}$. Let us note that, by the properties of the Gram determinant, other equivalent choices of the arguments of $G$ are possible. For example, the choice in (2.5) of all momenta but $p_5$ is purely conventional.
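The statement that a two-particle Gram determinant reduces to (minus) the Källén function can be verified symbolically. The following sketch (our own check, with $\lambda$ as defined above) confirms $\det G(q_1, q_2) = -\lambda\big(q_1^2,\, q_2^2,\, (q_1+q_2)^2\big)$ for generic four-vectors:

```python
import sympy as sp

# generic 4-vector components, mostly-minus metric
a = sp.Matrix(sp.symbols('a0:4'))
b = sp.Matrix(sp.symbols('b0:4'))
g = sp.diag(1, -1, -1, -1)
dot = lambda u, v: (u.T * g * v)[0]

kallen = lambda x, y, z: x**2 + y**2 + z**2 - 2*x*y - 2*y*z - 2*z*x

# Gram matrix G(q1, q2) = 2 {q_i . q_j} and its determinant
G = sp.Matrix([[2*dot(a, a), 2*dot(a, b)],
               [2*dot(a, b), 2*dot(b, b)]])

# det G + lambda(q1^2, q2^2, (q1+q2)^2) should vanish identically
diff = sp.expand(G.det() + kallen(dot(a, a), dot(b, b), dot(a + b, a + b)))
```

Applied to $G(p_1, p_2+p_3)$ this reproduces $\Delta_3$, and to $G(p_1, p_3+p_4)$ the non-cyclic $\Delta_3^{\mathrm{nc}}$.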
We finish this section with a brief comment on the analytic structure of Feynman integrals, to which we will return later in the paper. They evaluate to functions of the Mandelstam variables $\vec{s}$ with a complicated branch-cut structure. More precisely, the integrals we compute have branch cuts starting at $p_1^2 = 0$, $s_{12} = 0$, $s_{23} = 0$, $s_{34} = 0$, $s_{45} = 0$ and $s_{15} = 0$. For each integral, we thus find it convenient to label different kinematic regions by the signs of the Mandelstam invariants. We highlight two types of regions that we will return to in following sections. First, the region where we are away from any branch cuts and where the integrals evaluate to real numbers. This region is called the Euclidean region, and in our case it corresponds to having all Mandelstam variables negative. For five-point one-mass kinematics, it is not a physical region (i.e., there is no physical configuration of momenta that corresponds to values of the Mandelstam variables in the Euclidean region). Second, we consider regions associated with the production of a massive vector boson in association with two jets in QCD. This physical process is a natural application of the one-mass five-point two-loop integrals. We assign the massless momenta $p_i$, $i = 2, \ldots, 5$ to massless partons and the massive momentum $p_1$ to the vector boson, which we assume to decay, e.g. into a lepton pair. This implies that the momentum $p_1$ is timelike, i.e., $p_1^2 > 0$. Since any of the parton momenta may be in the initial state, we have six different channels $ij \to 1kl$, where $i, j, k, l$ take distinct values in $\{2, 3, 4, 5\}$, and to each channel corresponds a kinematic region. In table 1 we give the signs of the kinematic invariants for each region. We note that, since the momenta corresponding to a physical scattering process must have real components and $\det(g) = -1$, it follows from (2.4) that $\Delta_5 < 0$.

Two-loop planar five-point one-mass integrals
Planar five-point scattering amplitudes with a single massive external leg can be written as a linear combination of Feynman integrals. These integrals form a linear space spanned by a set of so-called 'master integrals', which can be generated by considering the four integral topologies depicted in fig. 1. In this section we establish our notation and briefly describe these linear spaces. The topologies in fig. 1 can be categorized as either genuine two-loop or 'one-loop squared'. The integrals of the topology $I^{[\text{1-loop}^2]}$ of fig. 1d all factorize into a product of two one-loop integrals. Computing them is thus not a genuine two-loop problem, and indeed they lack many of the features associated with multi-loop integrals (for instance, there are no irreducible scalar products). Given that computing the master integrals in this topology is a one-loop problem, we will not discuss it further and in the remainder of this paper will instead discuss the corresponding one-loop integrals. In contrast, the integrals of the topologies in figs. 1a, 1b and 1c, which all have a 'penta-box' as top diagram, are genuine two-loop integrals and the main result of this paper. They differ by the position of the massive external leg, and we encode this in the label for each topology. Precisely, the notation characterizes the mass assignment for the three external legs attached to the pentagon sub-loop: they can all be 'zero mass' (zzz), the middle leg can be massive (zmz), or the first leg can be massive (mzz). All other assignments are related to these choices by relabelling of the kinematics, see footnote 1. This notation is also used in the ancillary files accompanying this paper.
For a given topology $f$, each set of propagator powers $\vec{\nu}$ defines an integral that is a member of a linear space $Y^{[f]}$. In this paper, we compute a set of integrals that form a basis of these spaces, that is, the set of master integrals associated with each topology. Any integral in $Y^{[f]}$ can be rewritten as a linear combination of the master integrals using integration-by-parts (IBP) identities [84]. For each topology, the master integrals all have a subset of the propagators of the top topology. The dimensions of the vector spaces can be determined in several different ways. Our choice of bases is given in the ancillary files anc/f/pureBasis-f.m, where f is to be replaced by the name of each topology (a pictorial representation of the basis can be found in the files anc/f/graphs-f.m, which were generated using ref. [61]). We note that the same integrals can appear in different topologies, and there is a large overlap between these different spaces.
In writing the elements of these bases, we often make use of functions $\mu_{ij}$ that are obtained by contracting the components of the loop momenta beyond four dimensions, which we denote
$$\mu_{ij} = -\,\ell_i^{[D-4]} \cdot \ell_j^{[D-4]} \,, \qquad (3.4)$$
where $\ell_i^{[D-4]}$ are the $(D-4)$-dimensional components of the loop momenta. These functions can also be written as polynomials in the $\rho_{i,f}$. The latter representation is more convenient if one wants to rewrite integrals defined with the help of these functions as members of the vector spaces $Y^{[f]}$. It is given in the ancillary file anc/determinants.m. While in this paper we compute for the first time the full set of master integrals required for two-loop five-point planar amplitudes, some of those master integrals also appear in other amplitudes. In particular, integrals associated with Feynman diagrams with four or fewer external legs appear in four-point processes with two external masses and have been previously computed [28,32]. We will thus pay particular attention to integrals corresponding to diagrams with five external legs. They are depicted in fig. 2, where we also give the number of master integrals supported on their respective propagator structures. Our choice of master integrals for these topologies, which we will discuss in the next section, is given in appendix B as well as in the ancillary files, as was already mentioned above. Finally, we also note that a full set of master integrals for topology $I^{[\text{mzz}]}[\vec{\nu}]$ has already been computed previously [31].

Semi-numerical construction of the differential equation and pure basis
One of the most effective approaches for computing Feynman integrals is solving the differential equations they satisfy [35][36][37][38][39][40]. Nevertheless, for complicated enough integrals, such as the ones we are considering in this paper, obtaining the differential equation can be challenging in itself. In this section we discuss how we constructed the differential equations required to compute the master integrals of topologies $I^{[\text{mzz}]}$, $I^{[\text{zmz}]}$ and $I^{[\text{zzz}]}$ of fig. 1.
Our approach is based on numerical evaluations and builds on the one presented in [49]. Before we discuss the details of our approach, we make some general comments on differential equations for Feynman integrals to set up our notation. Let $\vec{I}$ be a vector of master integrals associated with a given topology. The derivatives of the master integrals are part of the same topology, and we can thus reduce them to the basis of integrals in $\vec{I}$. We note that obtaining the IBP relations required for this step is often the bottleneck in constructing the differential equations. In full generality, the vector $\vec{I}$ fulfils a differential equation
$$\mathrm{d}\, \vec{I} = M(\epsilon, \vec{s}\,)\, \vec{I} \,, \qquad (4.1)$$
where the connection $M$ is a matrix of differential forms depending on the dimensional regulator $\epsilon = (4-D)/2$. A refinement of the differential-equation approach to the calculation of Feynman integrals was proposed in [40]: when Feynman integrals evaluate to multiple polylogarithms, the solution of the differential equation is greatly simplified if a basis of so-called 'pure' integrals is chosen. Indeed, in this basis the connection $M$ in the differential equation (4.1) takes a particularly simple form. The dependence on the regulator factorizes, and $M$ is the total derivative of a (singular) potential that depends logarithmically on kinematic expressions. That is,
$$M = \epsilon \sum_{\alpha} M_{\alpha}\; \mathrm{d}\log W_{\alpha}(\vec{s}\,) \,, \qquad (4.2)$$
where the entries of the matrices $M_{\alpha}$ are rational numbers. The functions $W_{\alpha} \in \mathcal{A}$ are known as the 'letters' of the so-called '(symbol) alphabet' $\mathcal{A}$ associated to the integrals. We will return to these notions in section 5.
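To see how a canonical connection of the form $\epsilon \sum_\alpha M_\alpha\, \mathrm{d}\log W_\alpha$ determines the integrals order by order in $\epsilon$, consider a toy example (our own construction, unrelated to the actual five-point alphabet): a $2\times 2$ system in a single variable $x$ with letters $\{x,\, 1+x\}$, solved by truncated power series with boundary values fixed at $x=0$, in the spirit of the generalized series expansions used later in the paper.

```python
import sympy as sp

x = sp.symbols('x')
N = 8                                    # series truncation order

# toy canonical connection: M = eps * (M1 dlog(x) + M2 dlog(1+x))
M1 = sp.Matrix([[0, 0], [0, 1]])
M2 = sp.Matrix([[0, 0], [1, 0]])

orders = [sp.Matrix([1, 0])]             # eps^0 coefficient: a constant boundary vector
for k in range(1, 3):                    # build the eps^1 and eps^2 coefficients
    rhs = M1 * orders[-1] / x + M2 * orders[-1] / (1 + x)
    rhs = rhs.applyfunc(lambda e: sp.series(e, x, 0, N).removeO())
    # integrate termwise with vanishing boundary condition at x = 0
    orders.append(rhs.applyfunc(lambda e: sp.expand(sp.integrate(e, x))))

I1, I2 = orders[1], orders[2]
```

The $\epsilon^1$ coefficient reproduces the series of $\log(1+x)$ and the $\epsilon^2$ coefficient that of $-\mathrm{Li}_2(-x)$, as expected for iterated integrals over d log forms.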
In [49], a numerical method of constructing the differential equation was introduced for the case of a pure basis of integrals with a known symbol alphabet. The approach requires only the solution of small linear systems, taking as input as many numerical evaluations of the differential equation as there are letters. Since only numerical evaluations are required, all IBP relations can be computed numerically, bypassing the often prohibitive complexity of intermediate analytic expressions. The non-trivial aspects of this method are the construction of the pure basis and of the symbol alphabet, which are both as yet unknown for five-point one-mass two-loop processes. In this section, we will address these points. After a brief description of how we numerically sample differential equations, we discuss our construction of the bases of master integrals in section 4.2 and then, in section 4.3, we construct the symbol alphabet from analytic cut differential equations. Crucially, using the data from the numerical evaluations allows us to target the simplest cut differential equations that are required to obtain the full alphabet.
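The linear-algebra step can be sketched on a toy example (our construction; the two-letter alphabet, the direction and the 'oracle' below are hypothetical stand-ins for the real letters and for the numerically IBP-reduced matrix entries): sample one entry of the connection at as many kinematic points as there are letters, then solve for the rational coefficients multiplying each d log.

```python
import sympy as sp

s12, s23 = sp.symbols('s12 s23')
letters = [s12, s23, s12 + s23]                  # toy symbol alphabet
c = [sp.Rational(3), sp.Rational(7)]             # a fixed 'random' direction

# directional derivative of log(W) for each letter: c . d/ds log W
dlogs = [c[0]*sp.diff(sp.log(W), s12) + c[1]*sp.diff(sp.log(W), s23)
         for W in letters]

# pretend oracle for one entry of C(eps,s)/eps; in practice this comes
# from a numerical IBP reduction
q_true = [sp.Rational(2), sp.Rational(0), sp.Rational(-3)]
oracle = sum(q*d for q, d in zip(q_true, dlogs))

# sample at as many points as there are letters and solve the linear system
points = [(1, 2), (3, 5), (7, 11)]
A = sp.Matrix([[d.subs({s12: u, s23: v}) for d in dlogs] for u, v in points])
rhs = sp.Matrix([oracle.subs({s12: u, s23: v}) for u, v in points])
q_fit = A.solve(rhs)
```

In the actual computation the samples are finite-field valued and the exact rational entries of the $M_\alpha$ are reconstructed from them; the linear algebra is the same.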

The random direction differential equation
Our approach to constructing differential equations for Feynman integrals is based on a numerical evaluation of the differential equation, where the Mandelstam variables $\vec{s}$ and the regulator $\epsilon$ take numerical values. To this end, we introduce a 'directional' partial derivative,
$$c \cdot \partial_{\vec{s}}\; \vec{I} = C(\epsilon, \vec{s}\,)\, \vec{I} \,.$$
The vector $c$ specifies a direction in the kinematic space, and the operator $c \cdot \partial_{\vec{s}}$ replaces the total derivative $\mathrm{d}$. In contrast to the connection $M$, the matrix $C(\epsilon, \vec{s}\,)$ is an algebraic function of the kinematic data which can easily be evaluated numerically. Nevertheless, for appropriate choices of $c$, it is still sensitive to all features of the connection. For this to be the case, the vector $c$ must not be chosen along any special direction. We thus fix it by making a random numerical choice for its components.
Let us now discuss the case of a pure basis. Then $C(\epsilon, \vec{s}\,)$ takes a very specific form,
$$C(\epsilon, \vec{s}\,) = \epsilon \sum_{\alpha} M_{\alpha}\; c \cdot \partial_{\vec{s}} \log W_{\alpha}(\vec{s}\,) \,. \qquad (4.3)$$
The factorization of the regulator can be checked by evaluating $C(\epsilon, \vec{s}\,)$ at different values of $\epsilon$. Furthermore, if the letters $W_{\alpha}$ are known, we can fix all of the $M_{\alpha}$ by evaluating $C(\epsilon, \vec{s}\,)$ for as many values of $\vec{s}$ as there are symbol letters. When numerically evaluating the matrix $C(\epsilon, \vec{s}\,)$, we find it convenient to perform operations in a finite field of large cardinality. This approach has many advantages, such as removing all issues related to loss of precision in algebraic operations. However, one might worry that the presence of square roots might render the numerical evaluation of the matrix $C(\epsilon, \vec{s}\,)$ in a finite field impossible. While this is true in general, it turns out not to be a problem in practical applications. Indeed, it is a fact of number theory that for a finite field of cardinality $p$, with $p > 2$ prime, $(p+1)/2$ of the elements are perfect squares, or more precisely 'quadratic residues' [85], and there exist completely general algorithms which allow for the systematic computation of square roots in the finite field. This fact can be easily understood, as one can enumerate the quadratic residues. Specifically, the set of distinct perfect squares is given by
$$\left\{ s^2 \bmod p \;:\; s = 0, 1, \ldots, \tfrac{p-1}{2} \right\} \,. \qquad (4.5)$$
These are quadratic residues by construction, so it remains to prove that they are distinct and complete. Now, consider two different elements $s$ and $r$ of the finite field which square to the same number. That is,
$$s^2 = r^2 \mod p \,. \qquad (4.6)$$
It is clear that the equation is solved by $r = s$ and by the partner solution $r = p - s$. If we now consider eq. (4.5), we see that no entries are partners of one another, but this would no longer be true if we added any other element. Hence the set is distinct, complete and manifestly contains $(p+1)/2$ elements. This observation implies that in practice we can take the square root approximately 50% of the time.
As such, to avoid any issues related to the fact that the square roots of certain numbers cannot be represented in a given finite field, we simply veto the randomly chosen points at which the arguments of the relevant square roots (see eqs. (2.2), (2.6) and (2.7)) are not perfect squares in the finite field.
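These number-theoretic facts are easy to demonstrate explicitly. The sketch below (ours, for illustration) counts the quadratic residues for a small prime and takes square roots using the shortcut valid for $p \equiv 3 \pmod 4$; for other primes a general algorithm such as Tonelli-Shanks is needed.

```python
p = 11

# the distinct perfect squares mod p: (p+1)/2 of them, as in the counting argument
squares = {s * s % p for s in range((p - 1) // 2 + 1)}
assert len(squares) == (p + 1) // 2

def is_square(a, p):
    """Euler's criterion: nonzero a is a quadratic residue iff a^((p-1)/2) = 1 mod p."""
    return a % p == 0 or pow(a, (p - 1) // 2, p) == 1

def sqrt_mod(a, p):
    """Square root mod p, valid for p = 3 (mod 4); general p needs e.g. Tonelli-Shanks."""
    assert p % 4 == 3 and is_square(a, p)
    return pow(a, (p + 1) // 4, p)

# a large prime of the kind used for finite-field sampling; veto non-squares
P = 2**31 - 1                  # Mersenne prime, P = 3 (mod 4)
a = 123456789
if is_square(a, P):
    r = sqrt_mod(a, P)
    assert r * r % P == a % P
```

Roughly half of the randomly sampled values pass the `is_square` test, matching the veto rate quoted above.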

Constructing pure master integrals
Despite much progress in recent years [33,[42][43][44][45][46][47][48][49][50][51], which includes the development of automated approaches, the construction of a pure basis for multi-scale dimensionally regulated Feynman integrals is not yet a fully understood problem. Furthermore, for five-point integrals, four-dimensional analyses of the integrands are often not enough, see e.g. [10,33]. In this section we discuss how we constructed our bases of pure master integrals. Our approach is based on constructing educated guesses for pure integrals, and then checking numerically that $\epsilon$ factorizes in the matrix $C(\epsilon, \vec{s}\,)$. Strictly speaking, this does not imply that we have a pure basis, which would also require that only d log forms appear in the connection. We will see in the next section that this is the case for the bases we construct here.
For the integrals we are concerned with, pure bases are known for all integrals with four or fewer external legs [28,29]. The five-point sectors for which we need to find pure integrals are depicted in fig. 2 and can be grouped into two sets: those where the number of master integrals is unchanged in the limit where the mass goes to zero, and those where it is not. We find that for those with the same master count (all penta-boxes, double-boxes, penta-bubbles and all but one triangle-box) pure master insertions can be constructed in the same way as in [8,49]. For each such topology, we can separate the pure masters into those that are even and those that are odd under the parity transformation of eq. (2.3). Educated guesses for pure even integrals can be motivated from a four-dimensional analysis of the integrand, as we now illustrate in an example. Consider the integrand of the penta-box of fig. 1c for $D = 4$, where $p_{ij} = p_i + p_j$, $p_{ijk} = p_i + p_j + p_k$, and we have introduced a loop-momentum-dependent factor $N_{\text{pb}}$ that we should fix such that the integrand integrates to a pure function. It has been conjectured that this is equivalent to having an integrand that is a d log form with unit leading singularity (see e.g. [41]), and we will now see how this can be achieved for this example. In this case we can proceed with a loop-by-loop analysis.
We first recall that in strictly four dimensions, the integrands of one-loop box integrals can be written in terms of a d log form,
$$\frac{\mathrm{d}^4\ell}{D_1 D_2 D_3 D_4} = R_{\text{box}}\; \mathrm{d}\log g_1 \wedge \mathrm{d}\log g_2 \wedge \mathrm{d}\log g_3 \wedge \mathrm{d}\log g_4 \,,$$
where the $D_i$ denote the box propagators, and $R_{\text{box}}$ and the $g_i$ depend on the configuration of masses of the box integral. For our purposes, the form of the $g_i$ is irrelevant, but that of $R_{\text{box}}$ is not. As the coefficients of d log forms, such factors are known as 'leading singularities'. They depend only on external kinematics and for one-loop box integrals are given by modified Cayley determinants associated with each integral (see e.g. [88]). Let us label the external legs of these box integrals cyclically by $q_i$ and take $s = (q_1 + q_2)^2$ and $t = (q_2 + q_3)^2$. We will be particularly interested in the case of a box with a single massive external leg (say $q_1^2 \neq 0$), which we call b1m, and the case with three massive external legs (say $q_4^2 = 0$), which we call b3m. For those cases,
$$R_{\text{b1m}} = \frac{1}{s\, t} \,, \qquad R_{\text{b3m}} = \frac{1}{s\, t - q_1^2\, q_3^2} \,.$$
Note that the external legs $q_1$ and $q_3$ are diagonally opposed to each other. It is clear that if we normalize the boxes by $R_{\text{box}}^{-1}$ we obtain a d log integrand with unit leading singularity, and indeed these correspond to pure functions.
Let us now return to the integrand in eq. (4.7), and focus on the loop momentum $\ell_2$. Since it is nothing but the integrand of a one-loop box with three external massive legs, it follows from the one-loop examples we just discussed that it can be brought into the d log form of eq. (4.10), where again the form of the $\omega_i$ is immaterial for our discussion. Now, if we choose $N_{\text{pb}}$ to be proportional to the inverse of the leading singularity in eq. (4.10), then the integrand of eq. (4.7) factorizes into $\ell_2$-independent propagators and a d log piece. We can then notice that the $\ell_2$-independent propagators are those of a one-loop box with a single massive leg, which we know has a d log representation, given explicitly in eq. (4.11). In summary, choosing $N_{\text{pb}}$ as in eq. (4.12), the integrand in eq. (4.7) can be written in a d log form with unit leading singularity (the factor of 4 is purely conventional). This four-dimensional argument is not sufficient to claim that the dimensionally regulated integral is pure, but we view this analysis as a way to construct an educated guess for a pure basis which we can later check.
Let us now discuss the remaining master integrals for topologies with the same master count as in the massless case, but which are odd under parity. These integrals involve numerator insertions that are written in terms of the $\mu_{ij}$ defined in eq. (3.4), and their parity properties follow from the parity-odd factor $\mathrm{tr}_5$ in the normalization. We find that the naïve generalization of the odd integrands from the massless to the massive case gives pure integrals: that is, we use the same integrands, but the expressions now implicitly depend on $p_1^2$.

Figure 3: Topologies with more masters than in the limit $p_1^2 \to 0$.

While these integrands vanish in strictly four dimensions and can thus not be captured by a four-dimensional analysis, they are natural objects to consider. A detailed analysis of why this is the case is beyond the scope of this paper, so we only suggest motivations. First, they can be used to shift the dimension of the integral, and the purity of Feynman integrals depends on the dimension in which they are computed. Second, they are related to (generalized) Gram determinants, and thus vanish at special configurations of the loop momenta. This naturally means that they remove maximal-codimension residues of the integrand, helping to construct d log forms with unit leading singularity. A further benefit is that, as these integrands vanish in exactly four dimensions, they lead to integrals whose Laurent series around $\epsilon = 0$ usually starts later than that of their even counterparts. As an illustration, for the penta-box of fig. 1c we can construct two odd pure integrals with the numerators given in eq. (4.13); while the integral obtained from eq. (4.12) starts at order $\epsilon^0$, the ones obtained from eq. (4.13) start at order $\epsilon^5$. This has clear advantages when using these integrals for evaluating two-loop amplitudes. Let us now discuss the two topologies that cannot be understood by simple generalization of the pure basis of the massless five-point two-loop integrals, see fig. 3. The first is the penta-triangle topology of fig. 3a. Despite the fact that it does not appear in the massless case, this is in fact a simple case to solve. We require a single pure integral, and can use a logic similar to the one discussed above for odd integrals. We recall that, with an appropriate normalization, a triangle in $4 - 2\epsilon$ dimensions and a pentagon in $6 - 2\epsilon$ dimensions are pure. As we already hinted at above, the latter can be represented by a $\mu^2$ insertion on the $4 - 2\epsilon$ pentagon, where $\mu$ denotes the $(D-4)$-dimensional components of the pentagon's loop momentum.
A natural educated guess for a pure integrand is then to take as a numerator

N^pt = 4 tr_5 µ_11 . (4.14)

The validity of this guess can be verified in the usual way. The final and most challenging case we need to address is the triangle-box with a massive leg on the triangle side, fig. 3b. The main difficulty lies in the fact that we must construct six pure integrals with this set of propagators. As for all other triangle-box topologies, two pure insertions can be obtained as simple generalizations of the massless case. Two further integrands can be constructed using the d log logic that was used to build eq. (4.12). Let us consider the triangle sub-loop in fig. 3b, and the one-loop IBP relation together with the same relation obtained by the exchange p_23 ↔ p_45. It can easily be shown that a one-loop bubble in D = 4 − 2ε dimensions multiplied by (D − 3) is related by a simple numerical factor to a (2 − 2ε)-dimensional bubble normalized by its scale. The latter is known to be pure, and thus to have an integrand which is a d log form with unit leading singularity. After replacing the box sub-loop by its d log form, equivalent to that in eq. (4.11), we can then formally replace the triangle sub-loop in fig. 3b by the left-hand side of eq. (4.16) (or its equivalent under p_23 ↔ p_45) to obtain the two candidate numerators of eq. (4.17), which correspond to pure integrals. There are two further pure integrals, for which we did not build educated guesses. Instead, we rely on the fact that we can easily test whether ε factorizes in the differential equation by using simple numerical evaluations. Combined with the fact that the system of differential equations can be further simplified by imposing that certain propagators are set to zero (see e.g. [49,59,60,89]), we obtain a very efficient method of constructing the remaining two insertions by requiring that ε factorizes in the differential equation.
With this approach we constructed the remaining two numerators. As noted throughout this section, at this stage we cannot yet determine if the integrals we have chosen for the five-point topologies of fig. 2 are pure. We can only check that ε factorizes in the matrix C(ε, s⃗) of eq. (4.3), which is a necessary but not sufficient condition for the master integrals to be pure. We have collected all the integrands for the diagrams in fig. 2 in appendix B.
The bases we have constructed through this procedure for the topologies in figs. 1a, 1b and 1c can be found in anc/f/pureBasis-f.m, with f=mzz, zmz, zzz. We also include the basis for the one-loop integrals in anc/1loop/pureBasis-1loop.m.

Analytic form of differential equations
We now discuss how we obtain the analytic form of the differential equations through an ansatz procedure. Specifically, we work with an ansatz consistent with the assumption that our bases of master integrals for the topologies in fig. 1 are pure. The structure of the ansatz is that the matrices C(ε, s⃗) take the form given in eq. (4.4). In our case, we take the ansatz as a working assumption; by verifying it against an overconstraining set of numerical data, we confirm that our bases are indeed pure. Working with this ansatz, in order to completely determine the differential equations we need to determine the letters W_α and the associated matrices M_α. At this stage, even the number of letters, i.e., the dimension of the symbol alphabet, is unknown.
Let us consider this dimensionality question for any differential equation (4.1) where the basis I is pure and of dimension n. Given the ability to numerically evaluate the directional derivative matrix C(ε, s⃗) of eq. (4.4), one can easily determine the dimension of the alphabet relevant for the basis. First note that, as the directional derivative matrix C(ε, s⃗) is considered in a random direction, its entries span a vector space which is equivalent to the one spanned by the alphabet. Therefore it is sufficient to count the number of entries of C(ε, s⃗) which are linearly independent. Numerically, this can easily be achieved by sampling the directional derivative matrix. We begin by flattening the n × n matrix C(ε, s⃗) into a single vector of length n². In a finite field of large cardinality, we now generate random phase-space points s⃗^(k), k = 1, . . . , m, and fix ε = ε_0. Now, we evaluate our vector on these points to construct a new, finite-field valued, m × n² matrix C_l(ε_0, s⃗^(k)) whose rows are the flattened matrices C(ε_0, s⃗^(k)); the indices of this matrix are k and l. Given this construction, the rank of the matrix C_l(ε_0, s⃗^(k)) is bounded from above by both the number of rows m and the dimension of the alphabet itself, therefore

rank C_l(ε_0, s⃗^(k)) ≤ min(m, dim(A)) . (4.19)

This follows as linear relations between the columns of the evaluation matrix are inherited from linear relations between the entries of the directional derivative matrix. Equation (4.19) implies that if we sequentially raise m and find that the rank stops increasing, then we have identified the dimension of the alphabet. To apply this approach to the integrals we are interested in, we note that we want to construct four different differential equations, one for each genuine two-loop topology of fig. 1 and one for the one-loop five-point one-mass topology.
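The rank-saturation procedure can be sketched in a few lines of modular arithmetic. The toy alphabet {x, y, x + y}, the sampling direction and the 2 × 2 "connection" below are invented for illustration; they stand in for the actual finite-field samples of C(ε_0, s⃗^(k)), not for the five-point data.

```python
import random

P = 2147483629  # a large prime, cardinality O(2^31)

def rank_mod_p(rows, p=P):
    """Rank of an integer matrix over GF(p) via Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)  # modular inverse by Fermat
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def flattened_C(x, y, p=P):
    """Toy flattened 2x2 directional-derivative matrix for the alphabet
    {x, y, x + y}, sampled along a fixed direction (3, 5)."""
    d1 = 3 * pow(x, p - 2, p) % p            # directional dlog x
    d2 = 5 * pow(y, p - 2, p) % p            # directional dlog y
    d3 = 8 * pow((x + y) % p, p - 2, p) % p  # directional dlog (x + y)
    return [d1, d2, (d1 + d3) % p, d3]      # entries are combos of the dlogs

random.seed(1)
samples = [flattened_C(random.randrange(1, P), random.randrange(1, P))
           for _ in range(6)]
# The rank saturates at 3 = dim of the toy alphabet as m grows.
ranks = [rank_mod_p(samples[:m]) for m in range(1, 7)]
```

Since every row is a fixed linear combination of the three independent dlog values, the rank stops increasing at 3, which is how the alphabet dimension is read off in the text.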
By direct application of the above steps we find the dimension of the alphabet for each topology. Another, perhaps more interesting, number is the dimension of the union of the alphabets, corresponding to all master integrals. We thus flatten the matrices C^[mzz](ε_0, s⃗^(k)), C^[zmz](ε_0, s⃗^(k)), C^[zzz](ε_0, s⃗^(k)) and C^[1loop](ε_0, s⃗^(k)) into four different vectors, and then join them together to form one larger vector. We then determine the dimension of the full alphabet by computing the rank of the matrix constructed from these vectors evaluated at successive random values s⃗^(k). We find that the dimension of the union of the four alphabets is 55.
We are now left with the task of determining the set of letters needed to express the four differential equations, i.e. a basis of the 55-dimensional alphabet. Once again, we can use the numerical data we have collected. We consider the matrices C^[f](ε_0, s⃗^(k)) and row reduce them. For each nonzero row in the row-reduced echelon form, the index of the leading non-zero column labels an independent basis element. By appropriately ordering the columns, one can prioritize different elements in this basis search. We first choose to determine as many letters as possible from the one-loop differential equation, since these are trivial to obtain in analytic form. This leaves 25 letters to be determined. To determine those, we first note that each column of the matrices C^[f] corresponds to the coefficient of an integral i in the differential equation of another integral j. We choose to prioritize the columns for which (i, j) share as many propagators as possible, essentially organizing the matrix into increasingly 'off-shell' blocks. This organization leads us to an important observation: a basis of symbol letters can be found in the maximal- and next-to-maximal-cut differential equations for sectors with 6 or 7 propagators (this statement is true for the full 55 letters, not just for the 25 that are new at two loops). Whilst this observation is theoretically interesting, it is also of immediate practical consequence. Cut differential equations are technically much easier to construct analytically, especially when using IBP-reduction methods tailored for the presence of unitarity cuts [49,89,90]. Alternatively, public Laporta-based IBP programs such as KIRA [55] can be used to compute the relevant cut IBP relations. Guided by the numerical differential equations, we have thus reduced the problem of determining the symbol alphabet to the calculation of a few trivial one-loop or cut two-loop differential equations.
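The pivot-column bookkeeping in the row reduction can be sketched as follows. The 3 × 4 sample matrix is a stand-in, constructed so that its third column is the sum of the first two (a multiplicatively dependent letter); it is not actual five-point data.

```python
def pivot_columns(rows, p=2147483629):
    """Indices of the independent columns, read off from the leading
    entries of the row-reduced echelon form over GF(p)."""
    rows = [r[:] for r in rows]
    pivots, rank = [], 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if piv is None:
            continue  # column is dependent on the pivots found so far
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        pivots.append(col)
        rank += 1
    return pivots

# Column 2 equals column 0 + column 1, so the algorithm keeps
# columns 0, 1 and 3 as the independent basis elements.
samples = [[1, 0, 1, 7],
           [0, 1, 1, 2],
           [5, 3, 8, 4]]
basis = pivot_columns(samples)
```

Reordering the columns before the reduction changes which representative of each dependent set is kept, which is exactly the prioritization mechanism used in the text.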
We also note that by checking that these trivial differential equations are pure we prove that the bases we have chosen in the previous section are indeed pure.
Armed with the basis of the symbol alphabet extracted from the cut differential equations, the only missing ingredients for the analytic form of the directional differential equation matrix C(ε, s⃗) are the matrices M_α in eq. (4.4). These can be constructed by reusing the set of numerical evaluations of the directional differential equation matrix. First, we compute the dim(A) × dim(A) matrix of evaluations of the "random directional" d logs,

L_αk = ∂_v log W_α(s⃗^(k)) .

This matrix is invertible, as the random directional d logs are independent by construction. This allows us to explicitly compute the coefficient matrices through

M_α = (1/ε_0) Σ_k C(ε_0, s⃗^(k)) (L^(−1))_kα .

As expected from previous experience [49], the rational numbers involved are easily reconstructed from their image in a single finite field of cardinality O(2^31).
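The last step, lifting a mod-p image back to a rational number, is the standard extended-Euclidean (Wang) rational reconstruction. A minimal sketch, with the usual bound |num|, |den| ≤ sqrt(p/2) that guarantees uniqueness from a single prime:

```python
def rational_reconstruct(a, p):
    """Return (num, den) with num/den congruent to a mod p and
    |num|, |den| <= sqrt(p/2), via the extended Euclidean algorithm."""
    bound = int((p / 2) ** 0.5)
    r0, s0 = p, 0
    r1, s1 = a % p, 1
    # Invariant: r_i = a * s_i (mod p) at every step.
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    if s1 == 0 or abs(s1) > bound:
        return None  # no small rational has this image
    if s1 < 0:  # normalize the denominator to be positive
        r1, s1 = -r1, -s1
    return r1, s1

p = 2**31 - 1                          # a prime of cardinality O(2^31)
image = -7 * pow(12, p - 2, p) % p     # the image of -7/12 in GF(p)
```

A production implementation would also verify gcd(num, den) = 1 and fall back to more primes when the entries exceed the bound; both refinements are omitted here.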
The differential equations we have constructed in this way can be found in the ancillary files anc/f/diffEq-f.m, for f=mzz, zmz, zzz or 1loop. They are written in terms of the alphabet that can be found in anc/alphabet.m, whose construction will be described in the next section.
Analytic structure of planar five-point one-mass scattering at two loops

The differential equations satisfied by the master integrals constructed in the previous section contain a great deal of information about the analytic structure of not just the integrals, but also the scattering amplitudes they appear in. In this section, we explore this structure with the help of the 'symbol' [14], which can be constructed with minimal effort from a canonical differential equation. Let us review some basic concepts to set up our notation. Consider the ε expansion of the master integrals. At each order in ε, the master integrals are computed by integrating the previous order against a kernel that is fixed by the connection matrix M in eq. (4.2). In particular, the kernel is given by linear combinations of d log forms. More precisely, we have

I(ε, s⃗) = Σ_{n≥0} ε^n I^(n)(s⃗) , with d I^(n) = Σ_α M_α d log(W_α) I^(n−1) , (5.1)

where we have used the fact that we normalize our master integrals to have no poles in ε. As I^(0) lives in the kernel of the derivative, it has to be a constant vector. The vector I^(n) is a function built from n iterated integrals over a series of d log kernels. The number of iterated integrations is called the weight of the function. To explicitly obtain the functions I^(n) we must specify the integration contour. However, a great deal of analytic information can be understood from the integrand alone. To this end, it is common to introduce the notion of a symbol, which captures the integrand information. The symbol is simply a vector in the tensor-product space of the letters,

S[I^(n)] = Σ c_{α_1,...,α_n} [W_{α_1} | · · · | W_{α_n}] ,

where the length of the tensor equals the weight of the function. Note that the fact that the differential equation is in canonical form naturally ties the order in the Laurent expansion to the weight of the functions, see eq. (5.1). It is clear that the symbol is controlled by the differential equation. In particular, the tensors c are computed from products of the matrices M_α in eq.
(4.2) and control which tensor products appear in the symbol of the integrals. In the case where I is a vector of Feynman integrals, there is a constraint on the symbol known as the first-entry condition [91]. In our case, it states that c_{α_1,...,α_n} = 0 if W_{α_1} ∉ s⃗, where we already use the fact that the Mandelstam variables s⃗ are part of our alphabet. As we will see in section 5.3, this proves to be a very strong constraint, which almost fully fixes the initial condition at weight 0.
In this section, we will first discuss how to construct a simple set of symbol letters in order to simplify the form of the differential equations (and thus of the symbol). We will then discuss some properties of the symbols of the master integrals.

Choosing letters
It is clear that the choice of symbol letters is not unique: their logarithms generate a vector space, and any basis of that space is equivalent. In section 4.3, we discussed how to extract a complete set of letters from cut differential equations. However, what we naïvely obtain from the differential equations might not be the most convenient choice of alphabet. We now discuss some steps we have taken to simplify the alphabet and attempt to choose letters that make manifest some analytic properties of the integrals.
Let us start from a pure differential equation whose connection takes the form of eq. (4.2). A first step is to take an independent set of irreducible factors of the d log forms in the differential equation as letters, but this can be practically difficult. The issue finds its origin in the square roots in the problem: in our case, the Gram determinants of eqs. (2.5), (2.6) and (2.7). As observed in the literature [92,93], expressions involving square roots cannot be factorized uniquely, meaning that elucidating multiplicative relations between candidate letters is analytically challenging. Furthermore, for letters involving these square roots, it is not a priori clear what the most compact and/or physically relevant basis is.
To combat these difficulties, we employ a numerical sampling approach, which uncovers multiplicative relations between letters even in the presence of square roots. Consider a set of functions Ω = {Ω_i}, i = 1, . . . , N, as new candidate letters. We want to know if they live in the alphabet, and if there are any multiplicative dependencies between them. To answer these questions, we construct the list

L(s⃗) = { log|Ω_1|, . . . , log|Ω_N|, log|W_1|, . . . , log|W_n| } .

That is, we take the list Ω, append the alphabet, and take the logarithm of the absolute value of each element. All multiplicative relations between the elements of Ω and the alphabet now become linear relations between the elements of L(s⃗). The absolute value plays the role of throwing away sign information, which is not relevant for symbol letters. Similarly to the algorithmic construction of the alphabet, all linear relations between the elements of L(s⃗) can be extracted by constructing the square matrix L_i(s⃗^(k)) from n + N randomly chosen values of s⃗ (the indices i and k denote the entries of the matrix). As the matrix is not large, all practical questions of numerical stability are avoided using high-precision floating-point arithmetic. Having constructed L_i(s⃗^(k)), we can now use techniques similar to those of section 4.3. Firstly, we can easily check if all elements of Ω indeed live in the alphabet, as this implies that rank(L_i(s⃗^(k))) = n. Secondly, by ordering the elements of L(s⃗) to put preferred elements first, a new basis of the alphabet is algorithmically picked out by reading the linearly independent columns from the row-reduced form of L_i(s⃗^(k)). This approach allows us, with no explicit rationalization of the kinematics, to easily construct alternative bases of the alphabet, prioritizing the letters with the properties we find most important. With this technique in hand, we can easily find a set of symbol letters by analytically factorizing those found in the differential equation. We favour letters with lower mass dimension.
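This sampling test can be sketched numerically. The alphabet {x, y, x + y} and the two candidate letters below are invented for illustration: the candidate x(x + y)/y is multiplicatively dependent on the alphabet, while x − y is not, so the 5-column log matrix has rank 4 rather than 5.

```python
import math
import random

def numeric_rank(M, tol=1e-8):
    """Rank of a float matrix via Gaussian elimination with partial pivoting;
    pivots smaller than tol are treated as exact zeros."""
    M = [row[:] for row in M]
    rank = 0
    for col in range(len(M[0])):
        piv = max(range(rank, len(M)), key=lambda i: abs(M[i][col]), default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        pivval = M[rank][col]
        for i in range(rank + 1, len(M)):
            f = M[i][col] / pivval
            M[i] = [a - f * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return rank

random.seed(7)
rows = []
for _ in range(5):  # five random sample points
    x, y = random.uniform(1, 10), random.uniform(1, 10)
    rows.append([math.log(abs(x * (x + y) / y)),  # candidate 1: in the alphabet
                 math.log(abs(x - y)),            # candidate 2: a genuinely new letter
                 math.log(abs(x)),                # alphabet: log|x|
                 math.log(abs(y)),                # alphabet: log|y|
                 math.log(abs(x + y))])           # alphabet: log|x+y|
# col0 = col2 + col4 - col3 is the one linear (i.e. multiplicative) relation.
```

In practice one would use high-precision arithmetic rather than doubles, as the text notes; doubles suffice for this toy check.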
It is nevertheless clear that the non-uniqueness of the factorization still remains a barrier to simplicity. To proceed, we rely on an observation made in reference [92], where it was pointed out that one can construct candidate symbol letters involving a single square root from knowledge of the polynomial part of the alphabet and the square root alone. Employing this method we find that it generates a large number of letters which live in the alphabet, but crucially many are new representations with lower mass dimension. Following these steps, we obtain a sufficient set of letters with one square root whose mass dimension is no greater than four.
The final step in our organization procedure is to choose the alphabet to have manifest behavior under changes of the signs of the square roots. The reason for this choice is that Feynman integrals are invariant under such a change, but this invariance might be broken by the normalizations introduced when constructing a pure basis (see for instance the distinction between even and odd integrals in section 4.2). It is clear that the operations of flipping each sign compose to form a group, which is known in the mathematics literature as a 'Galois group'.^5 By choosing each letter to map to itself, or to its reciprocal, under each element of the group, we ensure that the d logs form an irreducible representation of the group and that, consequently, so will the symbols.

The symbol alphabet
With the procedure described in the previous section we are able to construct an alphabet whose letters have low mass dimension, and with manifest properties under the Galois group associated to the square roots in the problem. As noted in section 3, the set of master integrals we compute is not sufficient for two-loop planar five-point one-mass amplitudes, as we also require the integrals obtained by the exchange (2 ↔ 5, 3 ↔ 4) of the external legs. To obtain the symbol relevant for the amplitude, we complete the letters by including their image under this transformation. This increases the size of the alphabet from 55 to 58. In this section we present the alphabet of the amplitude.
We split the letters into two main sets: those that do and those that do not appear in the master-integral symbols up to weight four (after imposing the first-entry condition discussed at the start of this section), which is the weight of the contributions relevant for two-loop amplitudes. We will first list the 49 'relevant' letters, which we organize according to their simplicity and transformation properties under the Galois group. The remaining 9 letters are 'irrelevant', in that they do not turn up in the symbols of the integrals up to weight 4. Each set we present is closed under the (2 ↔ 5, 3 ↔ 4) exchange. In the following, we often choose representations of the letters which help to make manifest the soft limits in which they vanish. In the ancillary file anc/alphabet.m we present the alphabet written explicitly in terms of independent Mandelstam variables.

^5 Mathematically, the Galois group arises when considering field extensions [94]. Here we are implicitly working in the field of rational functions of Mandelstam variables extended by the addition of the square roots in eqs. (2.2), (2.6) and (2.7), denoted by Q(s⃗, √Δ_3, √(Δ_3^nc), √Δ_5). This field has a privileged set of field automorphisms: those that reduce to the identity on the underlying field Q(s⃗). These automorphisms form a group under composition, the Galois group. Beyond square roots, these concepts generalize to more complicated radicals, such as those found in [95].
We start with letters that are invariant under the Galois group. The first set consists of the letters corresponding to the Mandelstam variables that are allowed in the first entry of the symbol. The next two sets are again invariant under the Galois group and of mass dimension two: they are either two-particle invariants or simple differences of Mandelstam variables. Here we have introduced tr_±(i_1 . . . i_n), which is defined as

tr_±(i_1 . . . i_n) = (1/2) tr[(1 ± γ_5) p̸_{i_1} · · · p̸_{i_n}]

and, in the case n = 4, gives

tr_±(i j k l) = 2 [ (p_i · p_j)(p_k · p_l) − (p_i · p_k)(p_j · p_l) + (p_i · p_l)(p_j · p_k) ± i ε_µνρσ p_i^µ p_j^ν p_k^ρ p_l^σ ] . (5.10)

This object is manifestly multilinear in the external momenta and manifestly vanishes in the limit where any of the involved momenta go to zero. We note that this object is chiral if the vectors p_i, p_j, p_k and p_l are linearly independent, as in this case tr_±(i j k l) depends on tr_5. If this is not the case, as in eq. (5.7), then tr_+(i j k l) = tr_−(i j k l) is invariant under the Galois group action associated with the flip of the sign of tr_5.
We next list some letters that are not invariant under the Galois group associated to the square roots in the problem. Two sets depend on the three-point Gram determinants Δ_3 and Δ_3^nc, and already arise in one-loop integrals [96]. The first is associated to three-mass triangle integrals, whilst the second is associated to the two-mass-hard box. Let us make a number of comments on the symbol alphabet. First, all 30 letters that appear in the one-loop alphabet are 'relevant' letters at two loops. Specifically, the one-loop alphabet comprises the letters listed above. One further letter is associated with the integral of fig. 4, normalized as in eq. (5.18); this integral is first non-zero at weight 4. We note that Δ_3^nc is also a letter, but it does not appear in any of the master integrals at weight 4 and as such is part of the 'irrelevant' letters. Finally, we note that only a small number of symbol letters cannot be determined from maximally-cut differential equations. Specifically, we find that, at amplitude level, the only letters that first appear at the next-to-maximal-cut level are W_54, W_56 and W_58. Remarkably, this implies that all 'relevant' letters can be determined from the maximal-cut differential equations.

Structure of symbols of master integrals
Having discussed the symbol alphabet, which describes the possible entries in the symbol, it remains to discuss the patterns of letters which turn up in practice in the master integrals.
We already discussed the first-entry condition at the beginning of this section. Here we illustrate how strong this condition is by showing how it determines the weight-0 value of the integrals. It is clear from the definition of the symbol that, at weight one, we have

S[I^(1)] = Σ_α (M_α I^(0)) [W_α] ,

where I^(0) is a vector of rational numbers (of weight 0). The first-entry condition states that S[I^(1)] should not contain [W_α] if α > 6. This means that the vector I^(0) must be in the kernel of the matrices M_α with α > 6, that is, it lives in the intersection of the nullspaces of these matrices. Constructing such a vector is a simple linear-algebra exercise. Remarkably, the intersection of the nullspaces has dimension 1, which means that I^(0) is fully determined by this exercise, up to an overall normalization to which any nullspace calculation is obviously blind. This is consistent with the fact that the differential equation is homogeneous in I.
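The nullspace-intersection exercise can be sketched with exact rational arithmetic: the intersection of the nullspaces of several matrices is the nullspace of their vertical stack. The two 1 × 3 matrices below are toy stand-ins for the M_α with α > 6.

```python
from fractions import Fraction

def nullspace(rows):
    """Basis of the right nullspace of a rational matrix, via RREF."""
    rows = [[Fraction(x) for x in r] for r in rows]
    ncols = len(rows[0])
    pivots, rank = [], 0
    for col in range(ncols):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        pivval = rows[rank][col]
        rows[rank] = [x / pivval for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        pivots.append(col)
        rank += 1
    # One basis vector per free (non-pivot) column.
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for r, pcol in enumerate(pivots):
            v[pcol] = -rows[r][free]
        basis.append(v)
    return basis

# Toy matrices: M_a kills vectors with v0 = v1, M_b kills those with v1 = v2.
M_a = [[1, -1, 0]]
M_b = [[0, 1, -1]]
intersection = nullspace(M_a + M_b)  # stack the matrices, take the nullspace
# One-dimensional: the analogue of I^(0), fixed up to overall normalization.
```

With the actual matrices M_α one would stack all α > 6 blocks; the one-dimensional result then fixes I^(0) up to the overall constant, as described above.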
As an example of how to use the differential equations in our ancillary files, we implemented this calculation in a Mathematica function that can be found in anc/usageExample.m and allows one to compute the symbols of all the master integrals.
Beyond the first entry, we also find that the letters appearing in the second entry of the symbols are highly constrained. Given the form of the differential equation, the weight-two symbols fully determine the first two entries of any symbol tensor at any weight. We find that, at weight two, the symbols of all the master integrals required for planar two-loop five-point one-mass amplitudes correspond to the (weight-two) symbols of one-loop boxes and triangles which preserve the cyclic ordering of the external legs. This fact was already observed in the massless case [32], and is well understood at one loop [96,97]. We stress that this is a non-trivial constraint on the symbols: simply imposing that the symbol tensors correspond to the symbol of a function (i.e., that they are 'integrable' [14]) is not sufficient to imply it. The only exception to this pattern is the integral of fig. 4 with the normalization in eq. (5.18) that we have already discussed.
It is also interesting to contrast other constraints on the symbol alphabet with the structure of the differential equations. For instance, the Steinmann relations [24, 65-68] state that there is no double discontinuity associated with overlapping channels. In our case, this means that there should be no double discontinuity associated with the s_12 and s_15 channels.^6 Consistent with this expectation, we observe that the letters W_3 and W_4 never appear consecutively in any symbol tensor. Our results also confirm a stronger version of the constraint, known as the 'extended Steinmann relations', which states that the two letters cannot appear in the n-th and (n+1)-th entries of a symbol tensor for any n. Indeed, we find that the matrices M_3 and M_4 satisfy the relations

M_3 M_4 = M_4 M_3 = 0 , (5.20)

which implies that the extended Steinmann relations will be satisfied at all weights. We thus see that the structure of the differential equations naturally encodes the extended Steinmann relations. In addition to the (extended) Steinmann relations, we have empirically observed more 'forbidden pairs' of symbol letters which never appear consecutively in the symbols, by looking for pairs (i, j) such that

M_i M_j = M_j M_i = 0 . (5.21)

We find many such pairs. We can however restrict them by demanding that the set of conditions be closed under the exchange (2 ↔ 5, 3 ↔ 4), so that they are conditions on the symbol of the planar amplitudes, and furthermore by imposing that i ≤ 6, that is, that W_i is a letter that can appear in the first entry, and that W_j appears in the second entry of at least one master integral. Under these conditions, we find four pairs of forbidden letters besides the pair (3, 4) which we already discussed. We stress that, given that these pairs satisfy eq. (5.21), these letters cannot appear next to each other in any symbol tensor, at any weight. We leave it to future work to elucidate the nature of these extra Steinmann-like relations.
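The all-orders argument can be checked mechanically: since symbol tensors are built from products of the connection matrices, a pair of letters never appears consecutively whenever both products of the corresponding matrices vanish. A toy check with invented 2 × 2 matrices:

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

# Toy connection matrices; M[1] and M[2] annihilate each other in both
# orders, mimicking the condition of eq. (5.20) for the pair (3, 4).
M = [
    [[1, 0], [0, 1]],   # a generic letter
    [[0, 1], [0, 0]],   # stand-in for 'W_3'
    [[0, 2], [0, 0]],   # stand-in for 'W_4'
]

forbidden = [(i, j) for i in range(len(M)) for j in range(len(M))
             if i != j and is_zero(matmul(M[i], M[j]))
             and is_zero(matmul(M[j], M[i]))]
```

Scanning all pairs of the actual matrices M_α in this way is how the extra forbidden pairs quoted in the text can be found.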

Series solution of the differential equations
In this section we discuss our approach to solving the differential equations constructed previously, which follows the strategy proposed in [81]. That is, we solve the differential equation along a path connecting a known boundary point and a target point, and the solution is written in terms of univariate generalized power series. After discussing how to construct the solution along a path, we discuss analytic continuation around the different branch-points, the determination of the boundary values, and the estimation of the numerical precision of our solutions.

Series solution along a path
Our approach to evaluating the master integrals is to solve their differential equations with generalized power series [81]. In this method, the system of partial differential equations (4.1) is integrated along a one-dimensional path connecting two fixed points in the space of the Mandelstam variables. For concreteness, we focus our discussion on the case where I is a vector of pure integrals, that is, where the connection M takes the form of eq. (4.2). In the following, the univariate path will be parametrized by t, and for convenience we will take it to be the straight line

s⃗(t) = s⃗_b + t (s⃗_e − s⃗_b) , t ∈ [0, 1] . (6.1)

The initial point s⃗_b, where we assume I is known, provides the boundary condition required to solve the differential equation, and the final point s⃗_e denotes the point in the space of Mandelstam variables where we wish to evaluate the integrals. Along the path, the differential equation (4.1) degenerates to a system of univariate ordinary differential equations depending on the parameter t,^7

d/dt I(t, ε) = A(t) I(t, ε) , with A(t) = ε Σ_α M_α (d/dt) log W_α(s⃗(t)) . (6.2)

As discussed at the start of section 5, such a system admits an iterative solution in ε,

I^(i)(t) = c^(i) + Σ_α M_α ∫_0^t dt′ [(d/dt′) log W_α(s⃗(t′))] I^(i−1)(t′) , (6.3)

where we assumed that the integrals are normalized such that their Laurent series in ε has no negative powers. The c^(i) are integration constants of the differential equation, uniquely fixed by the boundary condition at t = 0, which in this section we assume to be known. The starting point of the iterative solution is the integration constant I^(0) = c^(0). We recall that there is a concept of weight associated with solutions to differential equations of the type of eq. (6.2), which is aligned with the coefficient in the Laurent expansion in ε of the solution. We will sometimes refer to I^(i)(t) as the contribution of weight i to I(t).
While in principle the integrals in eq. (6.3) are expected to be computable in terms of multiple polylogarithms, in practice the symbol alphabet can make this a daunting task. Firstly, to apply direct-integration procedures one must simultaneously rationalize all square roots, which may not be possible (in appendix A we discuss some parametrizations that rationalize a subset of the square roots). Secondly, it can be complicated to accurately handle both spurious and physical branch points in any resulting expression. This problem can be exacerbated by introducing variables that rationalize the alphabet. Fortunately, these issues can either be sidestepped or clarified with locally valid solutions written in terms of (generalized) power series. Such local solutions are only valid in a well-defined region. The first task in solving eq. (6.2) on the path of eq. (6.1) with this approach is then to split the path into segments, each with its own local solution. More explicitly, we write the solution I^(i)(t) as

I^(i)(t) = Σ_{k=0}^{N_e−1} θ_{S_k}(t) I_k^(i)(t) , (6.4)

with

S_k = [t_k − r_k, t_k + r_k) , (6.5)

where θ_{S_k}(t) = 1 if t ∈ S_k and 0 otherwise. Here, t_k is the expansion point of the local power-series solution, r_k is the radius of the region where the local solution is used, i.e., the radius of segment k, and N_e is the number of segments.
Our goal is to compute the value of I^(i)(t) at t = 1, which is given by the local solution on the last segment,

I^(i)(1) = I_{N_e−1}^(i)(1) . (6.6)

In the following, we will first discuss the construction of the local solutions I_k^(i)(t), and then the segmentation of the path. A local solution I_k^(i)(t) is one that is valid in some region around the point t_k. We can easily construct such a solution through series expansion of the integrand in eq. (6.3). This series has a finite radius of convergence, and so the solutions will only be valid locally. The matrix A(t) determines the form of the series expansion of the integrand. Given the form of the alphabet discussed in the previous section, in our case it contains both simple poles and square-root branch cuts. The series expansion around the point t_k then takes the form

A(t) = Σ_{i≥0} A_{i,k} (t − t_k)^{i/2 − 1} , (6.7)

where the A_{i,k} are constant matrices. Through iterated integration, the series solution then takes the form of a half-integer power series with logarithmic terms,

I_k^(i)(t) = Σ_{j,l} c_k^{(i,j,l)} (t − t_k)^{j/2} log^l(t − t_k) , (6.8)

where we have exchanged the order of integration and summation. Here, the c_k^{(i,j,l)} are constant vectors; away from singular points of the differential equation, the I_k^(i)(t) simplify to a Taylor series. The radius of convergence of this solution is the same as that of the expansion of the matrix A(t) in eq. (6.7). We note that the solution in eq. (6.8) introduces logarithmic and square-root branch points at t_k that must be handled with care. This will be discussed in section 6.2.
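The iterated term-by-term integration can be sketched for a one-letter toy system, dI/dt = ε I/(1 + t), whose exact solution is (1 + t)^ε, i.e. I^(w)(t) = log^w(1 + t)/w!. All names below are illustrative; the real system is matrix-valued and, unlike this toy, also produces half-integer powers and logarithms at singular expansion points.

```python
import math

N_TERMS = 40  # truncation order of the local Taylor series in t

def series_mul(a, b, n=N_TERMS):
    """Truncated product of two power series given as coefficient lists."""
    c = [0.0] * n
    for i, ai in enumerate(a):
        if ai:
            for j in range(min(len(b), n - i)):
                c[i + j] += ai * b[j]
    return c

def series_int(a, n=N_TERMS):
    """Term-by-term integration from 0: sum a_i t^i -> sum a_i t^(i+1)/(i+1)."""
    return [0.0] + [a[i] / (i + 1) for i in range(n - 1)]

def series_eval(a, t):
    return sum(c * t**k for k, c in enumerate(a))

# Kernel a(t) = d/dt log(1 + t) = sum_k (-1)^k t^k, convergent for |t| < 1.
a = [(-1.0) ** k for k in range(N_TERMS)]

# Weight-by-weight solution of dI/dt = eps a(t) I with I(0) = 1:
# I^(0) = 1 and I^(w)(t) = integral_0^t a(t') I^(w-1)(t') dt'.
I = [[1.0] + [0.0] * (N_TERMS - 1)]
for w in range(1, 5):
    I.append(series_int(series_mul(a, I[w - 1])))
```

Evaluating at t = 1/2, i.e. at half the radius of convergence set by the singularity at t = −1, the 40-term truncation reproduces log^w(3/2)/w! to far better than double precision, which illustrates why restricting each segment to half its radius of convergence keeps the truncated series cheap and accurate.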
Let us briefly discuss how the integration constants c_k^{(i,0,0)} associated with the local solution around t_k are related to the boundary condition I^(i)(0) of the full solution. First, the constant of integration c_0^{(i,0,0)} is obtained by requiring that the k = 0 local solution I_0^(i)(t) matches the known boundary value at t = 0, that is

I_0^(i)(0) = I^(i)(0) . (6.9)

Note that this only requires I_0^(i)(t) to be valid at t = 0, rather than centered there. The remaining integration constants are then iteratively determined by exploiting the continuity of the full solution eq. (6.4) at the boundary of each segment. Explicitly,

I_k^(i)(t_{k−1} + r_{k−1}) = I_{k−1}^(i)(t_{k−1} + r_{k−1}) , (6.10)

where the right-hand side should be understood as the limit of I_{k−1}^(i)(t) as t → t_{k−1} + r_{k−1}, which exists by construction. In this way, the integration constant in each local solution can be determined from I^(i)(0). We will discuss how to compute I^(i)(0) in section 6.3.
To make the solution in eq. (6.8) practical, it will be necessary to work with truncated series expansions and control the numerical error associated with the truncation. As is well known, the convergence rate of the series decreases as one approaches the radius of convergence of the series. We must thus be careful with how we construct the segments S k , in particular in balancing the size and the number of segments used to cover the integration path: they should be small enough so that the truncated series solution converges fast enough on each segment, but there should not be too many segments as the complexity of the algorithm scales linearly with the number of segments. The remainder of this section is devoted to describing the segmentation of the path. We choose to work under the constraint that segments should never be larger than half the radius of convergence of the associated series solution to guarantee that convergence is fast enough on each segment. We note nevertheless that this constraint can be modified at the price of having more segments if we want to build local solutions that converge at a different rate. Finally, we note that the segmentation of the path is the same for all orders in the expansion, that is for all i in eq. (6.3).
A segmentation of the path is a collection of non-overlapping segments (or intervals) $S_k = [t_k - r_k, t_k + r_k)$ such that the union of all of the segments covers the interval $[0,1]$, that is $\bigcup_k S_k \supseteq [0,1]$. Each segment is specified by its center $t_k$ and radius $r_k$. The choice of the pairs $(t_k, r_k)$ is primarily dictated by the singular points of the differential equation (6.2). These singular points may occur for both real and complex values of $t$. Let us denote the set of real singular points $R = \{\sigma_k\}_{k=1,\dots,N_s}$ and the set of complex singular points $C = \{\lambda_k\}_{k=1,\dots,N_c}$. The complex-valued singular points will also affect the convergence properties of neighboring series solutions. In order to avoid using complex arithmetic, we define the set of real regular points $C_r = \{\mathrm{Re}(\lambda_k) - \mathrm{Im}(\lambda_k),\ \mathrm{Re}(\lambda_k),\ \mathrm{Re}(\lambda_k) + \mathrm{Im}(\lambda_k)\}_{k=1,\dots,N_c}$. Considering these real-valued points effectively accounts for the effect of the complex-valued singularities. It is clear that not all points in $R \cup C_r$ affect the series solution in $[0,1]$, but it is also clear that it is not sufficient to consider only the points that lie in $[0,1]$. Given our constraint of only using a series solution within half its radius of convergence, it is sufficient to consider the points $t_k \in R \cup C_r$ such that $t_k \in (-2, 3)$. To each $t_k$ we associate a radius $r_k$, chosen to be half the distance between $t_k$ and the closest point in $R \cup C_r \cup \{-2, 3\}$. The above procedure may not cover the full interval $[0,1]$. For these uncovered regions we turn to bisection, that is we add segments centered at regular points in the middle of the uncovered intervals of $(-2,3)$ that overlap with $[0,1]$. The associated radii are chosen to be the minimum of the following two quantities:
• half the distance between $t_k$ and the closest point in $R \cup C_r \cup \{-2, 3\}$,
• the distance between $t_k$ and the closest segment already determined.
We iterate the bisection until the interval $[0,1]$ is covered. We note that if $R \cup C_r$ does not contain any point $-2 < t_k < 3$, there is a single regular expansion point at $t_0 = 1/2$. Finally, we note that the segmentation procedure we described may have produced segments with no overlap with $[0,1]$, which we simply remove.
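The segmentation just described can be sketched in a few dozen lines. The following is a hypothetical reimplementation for illustration only: function and variable names are our own, and the bisection loop is a simplified stand-in for the procedure described above.

```python
def segment_path(real_sing, complex_sing, lo=-2.0, hi=3.0):
    """Cover [0, 1] with half-open segments (center, radius), each radius at
    most half the distance to the nearest singular point, so that the local
    series converges fast on every segment."""
    points = {p for p in real_sing if lo < p < hi}
    for lam in complex_sing:
        # Real proxies for a complex singularity lam = Re + i*Im.
        for p in (lam.real - abs(lam.imag), lam.real, lam.real + abs(lam.imag)):
            if lo < p < hi:
                points.add(p)
    anchors = points | {lo, hi}

    def max_radius(t):
        return 0.5 * min(abs(t - p) for p in anchors if p != t)

    def dist_to_segments(t, segs):
        return min((max(0.0, abs(t - c) - r) for c, r in segs),
                   default=float("inf"))

    def covered(t, segs):
        return any(c - r <= t < c + r for c, r in segs)

    # Segments centered on the (real proxies of the) singular points; if
    # there are none, a single regular expansion point at t = 1/2 suffices.
    segments = [(t, max_radius(t)) for t in sorted(points)]
    if not points:
        segments = [(0.5, max_radius(0.5))]

    # Bisection: add regular expansion points in the middle of uncovered gaps.
    while True:
        edges = sorted({0.0, 1.0}
                       | {c - r for c, r in segments}
                       | {c + r for c, r in segments})
        gap_mid = next((0.5 * (a + b) for a, b in zip(edges, edges[1:])
                        if 0.0 <= 0.5 * (a + b) <= 1.0
                        and not covered(0.5 * (a + b), segments)), None)
        if gap_mid is None:
            break
        r = min(max_radius(gap_mid), dist_to_segments(gap_mid, segments))
        segments.append((gap_mid, r))

    # Discard segments with no overlap with [0, 1].
    return [(c, r) for c, r in segments if c + r > 0.0 and c - r < 1.0]
```

For example, with real singularities at $-0.5$ and $1.5$ the two singular-point segments leave a gap around $t = 1/2$, which a single bisection step fills with a regular expansion point.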

Analytic continuation
As was noted below eq. (6.8), a local solution $I^{(i)}_k(t)$ of the differential equation will in general have a branch cut if the associated expansion point $t_k$ is either a singular point or a square-root branch point of $A(t)$. At each such point $t_k$, a subset of the letters in the symbol alphabet will either vanish or become infinite. In this section, we classify the different types of branch points we can encounter and then explain how we deal with the analytic continuation across each type.
Let us first introduce our naming for three different classes of branch points. The simplest to define are the 'square-root branch points', which arise from terms with non-integer exponents in eq. (6.8). The remaining two cases are logarithmic branch points. We distinguish those that are 'physical thresholds' from those that are 'non-physical thresholds' as follows. It is well known that Feynman integrals with massless propagators have logarithmic branch cuts when any of the Mandelstam invariants in $\vec{s}$ vanishes. These are the physical thresholds. From the alphabet we determined in section 5.2, it is nevertheless clear that there are many other potential branch points. To contrast these with the physical thresholds, we call them non-physical thresholds.
Consider now a $t_k$ that is associated with a logarithmic branch point in eq. (6.8). Given the distinction between physical and non-physical thresholds, we would like to determine to which class $t_k$ belongs. To achieve this, it is natural to make a connection with the letters of the symbol alphabet, since we expect that some of them should either vanish or become infinite at $t_k$. Naïvely, one might say that if $t_k$ corresponds to a physical threshold, it should be associated with one of the letters $W_1$ through $W_6$, and if it is a non-physical threshold it should be associated with any of the other letters. This is however not exactly the case, as we now show in an example. Consider a point $t_k$ where $p_1^2 \to 0$. It is clear that at this point $W_1 = 0$. Nevertheless, this is not the only letter that vanishes. For instance, letter $W_{33}$ can be written in the form of eq. (6.12), and will thus also vanish as $p_1^2 \to 0$ if $(s_{45} - s_{23}) < 0$. This observation might cast doubt on whether $t_k$ should correspond to a physical threshold or not. It is nonetheless true that $t_k$ is a physical threshold, and the fact that $W_{33}$ vanishes is, geometrically, a consequence of the fact that, due to the square root, the zero set of an odd letter does not correspond to an irreducible algebraic variety (with our choice of alphabet, each even letter defines an irreducible variety). This situation should however be distinguished from the case where, at a given point $t_k$, both $p_1^2 \to 0$ and $W_{33} \to 0$ independently. Then the point $t_k$ corresponds to an overlapping singularity, where two independent singular surfaces intersect. To make the different singular surfaces associated with each letter manifest, we can explicitly compute their $\mathrm{d}\log$ using the variables $\vec{s}$ as coordinates. In the case of $W_{33}$ we would find eq. (6.13) and identify the irreducible singular surfaces $s_{45} = 0$, $p_1^2 = 0$ and $\Delta_3 = 0$. In summary, the classification of $t_k$ into physical or non-physical thresholds should be done with care.
The first step is to check whether a given $t_k$ is associated with the vanishing of one of the entries of $\vec{s}$. If it is, one should check the behavior of the other letters. If they vanish (or become infinite) only because of the same entry of $\vec{s}$, then $t_k$ is a physical threshold. Otherwise, it is associated with an overlapping singularity. Finally, if $t_k$ is not associated with the vanishing of one of the entries of $\vec{s}$, then it is a non-physical threshold. Non-physical thresholds might also appear together in overlapping singularities, but this distinction is immaterial for our purposes. Now that we have classified all types of branch points we can encounter, we discuss the analytic continuation across each one of them.

Physical thresholds: Analytic continuation is determined by Feynman's $i\varepsilon$ prescription.
Assuming that the threshold is associated with the variable $s_i$, we take $s_i \to s_i + i\varepsilon$ (6.14). This can then be implemented by performing a deformation of the $t$-contour in the segment centered at $t_k$, given in eq. (6.15). As $\varepsilon$ is taken infinitesimally small, this only has an effect on the logarithmic terms of (6.8), which are then defined as in eq. (6.16).

Non-physical thresholds: It is well known that logarithmic singularities associated with non-physical thresholds are absent from Feynman integrals in the Euclidean region. Therefore, the associated logarithmic terms drop out of $I^{(i)}_k$ in eq. (6.8). Indeed, we will return to this observation in section 6.3 and use it to determine the boundary condition. In the Euclidean region there is thus no analytic continuation to perform through these branch points. In the physical region, a path might cross a non-physical threshold. In order to avoid having to continue through such a threshold, in practice we instead take another path with the same end point. Given that our paths are always straight lines, this means that we start from a different initial point $\vec{s}_b$ to reach the desired point $\vec{s}_e$.
Square-root branch points: Square-root branch points are an artefact of our choice of basis of master integrals. Indeed, they are absent from genuine Feynman integrals (in the language of section 5, Feynman integrals are invariant under the action of the Galois groups associated with each of the square roots), and are introduced in the pure bases when Feynman integrals are normalized by a square root. We can thus freely choose the analytic-continuation prescription for these branch points, as its effect drops out when we relate the pure basis back to Feynman integrals (provided we are consistent with this prescription in the normalizations). We use the prescription of eq. (6.17) for $t < t_k$.

Overlapping singularities: As discussed above, a given $t_k$ on a given path might correspond to an overlapping singularity. In practice we have never encountered such a situation. Nevertheless, we have implemented a check for this eventuality and, if such a situation were detected, we would simply veto that path and choose an alternative path to the end point.
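To make the branch choices concrete, the sketch below shows one consistent set of elementary prescriptions: the Feynman $s_i \to s_i + i\varepsilon$ branch of the logarithm at a physical threshold, and an arbitrary but fixed branch for the square roots. These helpers are our own illustration and do not reproduce the exact expressions of eqs. (6.16) and (6.17).

```python
import math

def log_feynman(s):
    """log(s + i*eps): for s < 0 the Feynman prescription selects the
    branch with +i*pi."""
    if s > 0:
        return complex(math.log(s))
    return complex(math.log(-s), math.pi)

def sqrt_fixed_branch(x):
    """An arbitrary but fixed branch choice for the square root below its
    branch point; any consistent choice drops out when relating the pure
    basis back to Feynman integrals."""
    if x >= 0:
        return complex(math.sqrt(x))
    return -1j * math.sqrt(-x)
```

For instance, `log_feynman(-2.0)` returns $\log 2 + i\pi$, while crossing the square-root branch point with `sqrt_fixed_branch` always lands on the same (conventional) sheet.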

Boundary conditions
Up to this point, we have assumed knowledge of the numerical value of the integrals $I(t)$ at some point in the space of Mandelstam invariants, and elaborated on how to use generalized series expansions and differential equations to transport this value to another point in Mandelstam space. More precisely, we assumed that $I(0)$ is known, and discussed how to obtain $I(1)$.
In this section we discuss how to determine $I(0)$. Our approach is based on arguments analogous to those used in section 5.3, where the symbol of the integrals was determined by imposing the 'first-entry condition'. This condition is a consequence of the fact that Feynman integrals have no branch cuts in the bulk of the Euclidean region. We now show how, by imposing this behavior, we can determine $I(0)$ up to an overall normalization.
Our approach to the determination of the boundary condition is most conveniently formulated order by order in the $\epsilon$ expansion. Throughout this discussion, we will thus assume that we have fully determined the function $I^{(i-1)}(t)$, and use it to compute the boundary value $I^{(i)}(0)$. This will be achieved by enforcing that $I^{(i)}(t)$ does not introduce spurious logarithms at order $i+1$. To build such a constraint, we consider a choice of path for which a spurious logarithmic singularity occurs at $t = t_k$. Here, by spurious we mean a point $t_k$ where $A(t)$ has a pole, see eq. (6.18), and none of the first-entry Mandelstam invariants vanish, that is all entries of $\vec{s}(t_k)$ are different from zero. In the language of local solutions, a spurious logarithmic singularity manifests itself as an explicit logarithm in the generalized series solution associated with the point $t_k$. If we consider the computation of the weight-$(i+1)$ solution through eq. (6.8), it is clear that such a logarithm arises if the contribution of the pole term from the right-hand side of eq. (6.18) is non-zero. We therefore see that requiring the absence of this spurious logarithm is equivalent to the condition (6.19). Given our assumption that $I^{(i-1)}(t)$ has been fully determined, it must also satisfy eq. (6.19), and the primitive in eq. (6.8) is thus regular at $t = t_k$. Using the continuity conditions of eq. (6.10), we can explicitly relate $I^{(i)}_k(t_k)$ to the boundary value through eq. (6.20), where $v^{(i)}_k$ is fully known. For instance, a useful implementation strategy is to note that it can be computed as a difference of local solutions, eq. (6.21), which is independent of the boundary condition $I^{(i)}(0)$. Imposing eq. (6.19) then becomes an explicit constraint (6.22) on $I^{(i)}(0)$. We note that the constraint (6.22) is particularly simple for $i = 0$, as the weight-zero integrals are constants. This implies that $v^{(0)}_k = 0$, and we reproduce the conditions determined in section 5.3 to constrain the symbol of the integrals.
It is clear that the above discussion can be repeated for a series of spurious singularities to build more and more constraints on the value of $I^{(i)}(0)$. Searching for these singularities can be implemented in many ways. In our case, we considered a piecewise straight-line path in the Euclidean region. More concretely, we consider the vertices of eq. (6.23) and the path $\vec{s}_{\rm eu\text{-}1} \to \vec{s}_{\rm eu\text{-}2} \to \vec{s}_{\rm eu\text{-}3} \to \vec{s}_{\rm eu\text{-}4} \to \vec{s}_{\rm eu\text{-}5}$. For each spurious singularity we encounter, we use (6.22) to build a further set of linear constraints on $I^{(i)}(\vec{s}_{\rm eu\text{-}1})$. In practice, we find that if we combine all the constraints determined along this path we are able to fix the value of $I^{(i)}(0)$ up to a single degree of freedom. Indeed, if we consider this analysis for $i = 0$ this is no surprise. As noted below eq. (6.22), the weight-0 solution is kinematically independent and so simultaneously lives in the kernel of all $A_{-2,k}$. Therefore, the general solution $I^{(i)}(0)$ to eq. (6.22) can be written in terms of any particular solution $I^{(i)}_p(0)$ to eq. (6.22) and the weight-zero solution, i.e., $I^{(i)}(0) = I^{(i)}_p(0) + f^{(i)}\, I^{(0)}(0)$, where $f^{(i)}$ is a constant we are yet to determine. We note that this is the order-by-order in $\epsilon$ incarnation of the fact that the differential equation (6.2) is invariant under an overall rescaling of $I(t)$ by a kinematically independent but $\epsilon$-dependent function. The value of $f^{(i)}$ can then be determined by computing a simple master integral, such as a factorized bubble-type integral, with the normalization chosen in eq. (3.1). We end by emphasizing that, for a given topology, the boundary condition only has to be computed once with this approach. It can then be transported to other regions of phase space with the procedure described in section 6.1, and the result obtained in this way used as a boundary condition for subsequent evaluations in each region.
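At each order in $\epsilon$, the procedure above amounts to solving a stacked linear system with a one-dimensional kernel. A minimal NumPy sketch (our own illustration; the pairs $(M_k, b_k)$ stand for whatever linear constraints on $I^{(i)}(0)$ are collected from the spurious singularities, and `known_index`/`known_value` for the single, easily computed master integral that fixes $f^{(i)}$):

```python
import numpy as np

def solve_boundary(constraints, weight0, known_index, known_value):
    """Fix the boundary vector x = I^{(i)}(0) from stacked linear
    constraints (M_k, b_k) with M_k @ x = b_k.  The combined system leaves
    one degree of freedom along the weight-zero solution,
    x = x_p + f * weight0; f is fixed by matching one independently
    computed master integral (entry `known_index`)."""
    M = np.vstack([m for m, _ in constraints])
    b = np.concatenate([v for _, v in constraints])
    x_p, *_ = np.linalg.lstsq(M, b, rcond=None)  # a particular solution
    f = (known_value - x_p[known_index]) / weight0[known_index]
    return x_p + f * np.asarray(weight0)
```

As a toy example, a single constraint $x_0 - x_1 = 0$ together with the weight-zero direction $(1, 1)$ and the known value $x_0 = 2$ fixes the boundary vector to $(2, 2)$.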

Numerical precision of integrals
In section 6.1 we already noted that our approach to solving the differential equation (6.2) relies on truncated series expansions. Here, we describe how to fix the truncation order so as to reach a given precision in the evaluation of $I(t)$, which we define as the number $p$ of correct digits after the decimal point. Throughout this discussion, we will refer to $p$ as the precision of the integrals. The precision is affected by two distinct factors: one is the precision of the boundary condition we compute with the procedure described in section 6.3, and the other is the precision of the numerical transportation of the solution along the integration path. We will first discuss the transportation precision, and then comment on the precision of the boundary condition.
The coefficients of the generalized series expansions in eq. (6.8) are represented by finite-precision numbers. In practice, we take these coefficients to be much more precise than $p$ digits so that there is no error associated with them. The precision of a numerical evaluation of an integral is then controlled by two factors: the precision of a local solution on the boundary of a segment and the accumulation of these errors along a path. That is, when computing $I(t)$ along a given path, one needs to concatenate multiple segments, and the truncation error of the integrals at the end point of the path is obtained by combining the error on each segment. We determine the required truncation order on a given segment by using the connection matrix $A(t)$, which is known exactly. We introduce the expression $A^{[k]}(t)$ for the truncated expansion over the $k$-th segment, where, unlike in eq. (6.7), the expansion is truncated at order $n_k$. For each segment, $n_k$ is determined by requiring that each element of $A^{[k]}(t)$ approximates the matrix $A(t)$ within a certain tolerance, see eq. (6.26), where we introduced another (positive) parameter $\delta$, which must be determined so that the precision of the integrals at the end of the integration procedure is indeed larger than $p$.
In order to understand how to determine $\delta$, we must first understand how the truncation errors of the local solutions accumulate into an error at the end point of the path. We start by estimating the truncation error on each segment. While we could in principle perform a detailed analysis of the error propagation, we find it more practical to obtain an estimate of the error from Cauchy's convergence criterion. Specifically, for a given local solution we consider the last $m$ terms of the series expansion and estimate the truncation error $\Delta_k$ at the end point of the associated segment as in eq. (6.27), where $m$ is a small integer compared to $n_k$ (we use $m = n_k/50$) and the index $a$ labels the different integrals in $I(t)$. That is, we conservatively assign the worst estimate across all weights $i$ and all master integrals $a$ to all integrals and weights. In practice we observe that, for a given choice of $p$ and $\delta$, the error of the local solution is of order $10^{-(p+\delta)}$ for each segment, i.e., $\Delta_k \sim 10^{-(p+\delta)}$ (6.28). Next, we consider how the errors associated with the segments accumulate when matching multiple local solutions along the path. Conservatively, we estimate that the error increases along the path by the sum of the errors $\Delta_k$, that is $\Delta(\vec{s}_e) = \Delta(\vec{s}_b) + \sum_k \Delta_k$ (6.29), where $\Delta(\vec{s}_e)$ is the error on the value at the end of the path and $\Delta(\vec{s}_b)$ is the error on the value at the beginning of the path, i.e., on the boundary condition. Note that, according to eqs. (6.27) and (6.29), the error associated with $I(t)$ is taken to be the same for all the master integrals in $I(t)$ and for all the weights $i$. We now have all the tools required to determine $\delta$. We distinguish two cases for which our evaluation strategy is slightly different: the evaluation of $I(t)$ at a single phase-space point $\vec{s}_e$ using a known boundary value at $\vec{s}_b$, and the evaluation of $I(t)$ at multiple phase-space points. Let us first discuss the case of a single evaluation.
We assume that the boundary value at $\vec{s}_b$ was computed as described in section 6.3 to a precision much higher than $p$, that is $\Delta(\vec{s}_b) \sim 0$. In this case, the error is fully determined by the accumulation of the truncation errors along the path, and from eqs. (6.28) and (6.29) it is of order $n_s\, 10^{-(p+\delta)}$, where $n_s$ is the number of segments on the path. Choosing $\delta \geq \log_{10}(n_s)$ then ensures that we obtain $I(1)$ with precision $p$. Let us now discuss the case of multiple evaluations. When evaluating many times in a given region of phase space, we may take previous evaluations as boundary points. Then, the error on the boundary value $\Delta(\vec{s}_b)$ is no longer negligible, but is given by a previously calculated $\Delta(\vec{s}_e)$, which can therefore be calculated by iteration of eq. (6.29). To guarantee that all evaluations have a precision of at least $p$ digits, we take $\delta \geq \log_{10}(\bar{n}_s\, n_{\rm ps})$, where $\bar{n}_s$ is the average number of segments required for each evaluation and $n_{\rm ps}$ is the number of phase-space points under consideration. As we will show in section 7.3, the average number of segments per path is of order two. Therefore, setting $\delta \geq \log_{10}(2\, n_{\rm ps})$ ensures that all evaluations of $I(1)$ have a precision of at least $p$. Finally, we note that, to err on the side of caution, in practice we always take $\delta$ to be an integer greater than the estimates we have discussed in this paragraph.
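The resulting rule for choosing the offset can be stated in a few lines (a sketch based on the estimates above; rounding up to a strictly larger integer reflects the cautious choice just mentioned):

```python
import math

def choose_delta(avg_segments, n_points=1):
    """Offset delta such that the accumulated truncation error,
    ~ (avg_segments * n_points) * 10**-(p + delta), stays below 10**-p.
    The estimate log10(avg_segments * n_points) is rounded up to a
    strictly larger integer."""
    estimate = math.log10(avg_segments * n_points)
    return math.floor(estimate) + 1
```

With the observed average of about two segments per path, a run over 20k phase-space points gives an estimate of $\log_{10}(40000) \approx 4.6$ and hence an integer offset of 5.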
We finish this section by discussing the precision of the boundary conditions determined with the procedure described in section 6.3. Aside from the truncation error associated with the different segments required to reach each vertex of the pentagon in eq. (6.23), which can be estimated with the same analysis as above, there is a new source of error associated with the numerical solution of the conditions of eq. (6.19). This error is harder to determine, but can be estimated a posteriori as follows. We start by noting that in the Euclidean region the integrals are either purely real or purely imaginary. This property is however broken by the fact that eq. (6.19) only holds up to a certain numerical accuracy, which leads to a residual imaginary part in real integrals and a residual real part in imaginary integrals. The magnitudes of these residual contributions are then a measure of the precision of the zero in eq. (6.19), and can thus be used to estimate the precision of the determination of the boundary condition. This estimation is applicable only when a Euclidean region exists, as it does in our case. Nonetheless, we expect that similar methods can be applied to estimate the error for boundary values computed at non-Euclidean points. We leave this analysis for future work.

Numerical evaluation of master integrals
In this section we illustrate the power of our approach to obtain numerical values for Feynman integrals. We demonstrate this by computing high-precision benchmark values, and by evaluating and plotting the integrals over a sub-region of physical phase space. Finally, we discuss the validation of our numerical results.

Integral evaluation over physical phase space
Having in mind future phenomenological applications, it is not sufficient to have precise evaluations at a single phase-space point. One also needs to have efficient and stable evaluations across phase-space that can be used for Monte Carlo phase-space integration. In the following, we describe how this can be achieved with our approach.
Let us begin by elaborating on the strategy we follow when evaluating master integrals over a large set of phase-space points. In this context it is possible to improve the average evaluation time of the integrals by exploiting previous evaluations. In our approach, when a vector of master integrals has already been computed over a set of phase-space points, any of these previous evaluations can be used as a boundary point for the next integral evaluation. As such, it is fruitful to pick the element in the set of available points that minimizes the evaluation time which, considering the analysis of the previous sections, depends linearly on the number of segments of the contour that connects the boundary point to the target point. It is therefore wise to search for a boundary point which decreases the number of segments. In order to find the optimal boundary point, in principle it would be necessary to consider the full set of available points. However, in the context of a phase-space Monte Carlo integration this set may be prohibitively large, and such an exhaustive search would undermine the aim of decreasing the average evaluation time. It is however natural to expect that the optimal boundary point will be close to the target point in the space of Mandelstam variables. We thus consider only the $k$ nearest points and choose the one that minimizes the number of segments. In practice we find that $k = 10$ gives an average speed-up of 40% in comparison to the $k = 1$ case. This can be justified by noting that the optimal boundary point is not in general the nearest one, as the number of segments also depends on the configuration of the singular points. An important feature of this approach is that, with each new evaluation, the pool of available points increases, and so the average number of segments, and therefore the evaluation time required for each new evaluation, decreases.
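This selection step can be sketched as follows (an illustrative helper of our own; `n_segments` stands for whatever routine counts the segments of the straight-line contour between two phase-space points):

```python
def pick_boundary_point(target, evaluated, n_segments, k=10):
    """From the pool of previously evaluated phase-space points, keep the
    k nearest to `target` (Euclidean distance in Mandelstam space) and
    return the one whose straight-line contour to `target` needs the
    fewest segments."""
    nearest = sorted(
        evaluated,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p, target)))[:k]
    return min(nearest, key=lambda p: n_segments(p, target))
```

Note that the chosen point need not be the nearest one: a slightly more distant point whose contour avoids clusters of singular points can require fewer segments.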
In order to demonstrate these features, we generated 20k sample phase-space points corresponding to vector-boson production at the LHC with the phase-space cuts of ref. [98]. (We used the Sherpa Monte-Carlo program [99] to generate the phase-space points.) The particles with momenta $p_2$ and $p_3$ are taken in the initial state. As a seed evaluation in this physical region, we took the high-precision value discussed in section 7.1. We evaluated the complete set of master integrals of the three two-loop topologies and of the one-loop topology over the 20k phase-space points and, for each evaluation, recorded both the number of segments per contour and the evaluation time. Figure 9 shows the average evaluation time per master integral as a function of the number of points evaluated. The figure corresponds to evaluations with 32-digit precision on a single CPU thread. As expected, the evaluation time decreases as the number of points increases, and we observe that it stabilizes after about 10k points. The asymptotic timings, along with other evaluation parameters for 32- and 16-digit evaluations, are presented in table 3. The number of segments per contour stabilizes around two for all the families. In the language of section 6.4, we performed the evaluations with an offset of $\delta = 6$. We thus obtained a numerical precision of at least 36 digits for the 32-digit run, and 20 digits for the 16-digit run.
Finally, we compare our timings with a fully analytic solution of the integrals. For this, we focus on the mzz topology and compare our evaluation timings with those of the analytic solution of ref. [31]. We point out that it is hard to make this comparison meaningful, since the two strategies are very different. In the spirit of this section, we thus choose to compare our asymptotic timing with the timing of a single evaluation of the expressions of ref. [31], since these are the relevant numbers when using the two implementations for e.g. Monte-Carlo phase-space integration. We find that in our approach the timings are stable in each of the physical regions of table 1. That is, in each region, evaluating all master integrals of the mzz topology at a phase-space point takes ∼ 75s with 16 digits, ∼ 150s with 32 digits and ∼ 1150s with 128 digits. We then evaluated the polylogarithms in the expressions of ref. [31] at the six phase-space points of eq. (7.1) using the GiNaC implementation of [100] on a single CPU core. We observed very large fluctuations of the evaluation times across the six phase-space points, ranging from 42s to 2800s for 16 digits, from ∼ 50s to ∼ 4800s for 32 digits, and from ∼ 120s to ∼ 22200s for 128 digits (we note that obtaining integrals with 16-, 32- or 128-digit precision requires running GiNaC with a slightly higher number of digits). Whilst we stress that, given the differences in the approaches, the timing comparisons are not straightforward, we conclude that our approach is competitive with a fully analytic solution, with a more stable behaviour across phase space.

Plots over physical phase space
A further way to demonstrate the efficiency and the numerical stability of our approach is to produce plots of the integrals over a sub-region of physical phase space. Specifically, we present plots of the highest non-vanishing integrals for each family (i.e., the same integrals for which we gave high-precision values in table 2), over a two-dimensional sub-region of the physical region relevant for $W$ + 2-jet production at the LHC, where the particles of momenta $p_2$ and $p_3$ are in the initial state. The signs of the independent Mandelstam variables in this phase-space region are given in eq. (7.2). The remaining Mandelstam variables are $s_{23} > 0$ and $s_{34} < 0$, but they do not take arbitrary values if they are to correspond to a physical phase-space point in the region under consideration. We shall now characterize this region explicitly. We base our analysis on the observation that, in this region, the Gram matrix $G(p_1, p_2, p_3, p_4)$ has three negative eigenvalues [101]. We note that this is a stronger condition than simply requiring the determinant to be negative. In fig. 10 we depict the two disconnected regions in the $s_{23} > 0$ and $s_{34} < 0$ quadrant where $\det G(p_1, p_2, p_3, p_4) < 0$. To determine the boundary of these regions, we first solve $\det G(p_1, p_2, p_3, p_4) = 0$ with respect to $s_{34}$, finding the two solutions $s_{34}^{(\pm)}$ as functions of $s_{23}$; the remaining constants in these expressions are determined by the values given in eq. (7.2).
We expect the master integrals to have interesting behavior near the branch points that can be approached within this region. These are $s_{34} \to 0$ and $s_{23} \to \infty$. Given the constraint of remaining in the region $R$, these correspond not to one-dimensional surfaces but to the points of eq. (7.9). In practical applications, however, we are not interested in approaching the point $s_{23} \to \infty$. Instead, we introduce a cut-off at $s_{23} = (13\ \text{TeV}/80\ \text{GeV})^2$, i.e., at the LHC center-of-mass energy divided by a scale similar to the $W$-boson mass, which we use as the regularization scale.
In order to plot the functions, it is convenient to map $R$ to a finite region. We thus map $R$ to a unit square with the change of variables of eq. (7.10), where we highlight that $s_{34}^{(\pm)}$ are functions of $x$, following from their dependence on $s_{23}$. Under this change of variables, the points in eq. (7.9) are mapped to $P_1 = \{x = 0.376542\ldots,\ y = 1\}$ and $P_2 = \{x = 1,\ 0 \leq y \leq 1\}$ (7.11). The cut-off at $s_{23} = (13\ \text{TeV}/80\ \text{GeV})^2$ translates to a cut-off at $x = 0.998926\ldots$. We expect the plots to have interesting features around $P_1$ and $P_2$ of eq. (7.11). Although the cut-off does not allow us to reach $x = 1$, it is sufficiently close to 1 that we expect to see the effect of the threshold. In summary, the phase-space region shown in figs. 11 and 13 is $\bar{R} = \{(x, y) \mid 0 < x < 0.998926\ldots,\ 0 < y < 1\}$ (7.12). In the remainder of this section we discuss the plots of master integrals in this region, shown in figs. 11 through 14. They were generated by computing selected integrals over a set of 200k points in the interval $0 < y < 0.9$, and over a set of 200k points in the interval $0.9 \leq y < 1$, where the integrals exhibit fine structure and large variance. In each interval, the points are evenly distributed over 200 equally spaced lines parallel to the $y$ axis. This gives us enough granularity to observe the smoothness of the functions in the bulk of the phase-space region we are exploring, as well as the behavior around the singular points $P_1$ and $P_2$.
The plots of the highest non-vanishing two-loop integrals at weight four over $\bar{R}$ are presented in fig. 11. For each topology f = mzz, zmz and zzz, we show the real and imaginary parts of the penta-box integrals with the insertion $N$ (the other two insertions for each penta-box are only non-zero starting at weight 5). As expected, the integrals have interesting features near the singular points, some of which are not always apparent in the plots due to the perspective. The plots in figs. 11a and 11b have clear logarithmic divergences at each threshold point, where the integral tends to $+\infty$. The plots in figs. 11c and 11d are consistent with a divergence to $-\infty$ at $P_2$. Regarding the behavior at $P_1$, the start of a logarithmic dip towards $-\infty$ can be seen in the imaginary part (see fig. 11d), but the behavior of the real part in fig. 11c is more intricate. We thus take a closer look at the region around $P_1$ in fig. 12a, to show that the real part also has a logarithmic divergence towards $-\infty$ at $P_1$. Similar conclusions hold for the integral of the zzz topology: the real part diverges to $+\infty$ at $P_2$ (see fig. 11e) and to $-\infty$ at $P_1$ (see fig. 12b). The imaginary part diverges to $-\infty$ at both $P_1$ and $P_2$, see fig. 11f. The same analysis was performed for the one-loop pentagon integral at weight four. In fig. 13, we plot the real and imaginary parts of the pure integral, which we recall is normalized by a factor of $\mathrm{tr}_5$ (see the pure basis in anc/1loop/pureBasis-1loop.m). This implies that the function is odd under $\mathrm{tr}_5 \to -\mathrm{tr}_5$, see e.g. the discussion in section 4.2. It must thus vanish at the edge of the region $R$, where $\mathrm{tr}_5 = 0$. Given the cut-off at $x = 0.998926\ldots$, the integral must vanish on all but this edge of $\bar{R}$. This is indeed what we observe (the vanishing is not apparent on the $y = 1$ edge of fig. 13b, but this is because of the perspective we chose).
Figure 13b has a peak around the singular point P 1 , consistent with the start of a logarithmic divergence. In fig. 14 we close in on that region and see that the condition that the function should vanish at y = 1 eventually wins and, as expected, pushes the integral back to zero on the edge.

Validation
We have performed several checks on the results obtained with our approach for the numerical evaluation of the integrals. Aside from verifying that we obtain the correct values for integrals that are trivial to evaluate at one and two loops, we have validated our program with the following checks:
• Two independent implementations of the approach were made to check for internal consistency.
• We compared the high-precision evaluations to values obtained from the pySecDec program [70]. All one-loop integrals were validated up to weight four on the physical and Euclidean phase-space points. All two-loop integrals were checked to match the pySecDec results within error estimates on the Euclidean point.
• The integrals of the mzz topology were validated against the results of ref. [31]. We tested at least one integral per sector on all 6 physical phase-space points. We found agreement, including for a high-precision comparison with 128 digits.
Figure 11: Integrals plotted over R̄, a two-dimensional sub-region of the physical region defined in eq. (7.12); panels (e) and (f) show Re(I_3) and Im(I_3) of the zzz topology. The integrals are singular at the point P_1 of eq. (7.11) on the top edge (y = 1) of the unit square, and on the right edge (x → 1) of the unit square, corresponding to the threshold at the point P_2 of eq. (7.11).

Conclusions
In this paper we described the computation of the full set of planar two-loop master integrals with one massive and four massless legs. These integrals are the complete set required to compute the amplitudes necessary for NNLO predictions of W-boson production in association with two jets in the leading-color approximation at the LHC. Furthermore, they are also a crucial ingredient for these amplitudes beyond leading color, and for Z- or Higgs-boson production in association with two jets at the LHC.
We computed the master integrals by obtaining canonical differential equations, which we solved using generalized series-expansion techniques. In order to construct the differential equations we found a basis of pure master integrals, all with surprisingly compact integrand representations. It would be interesting to further explore the mathematical properties which make them pure. The analytic differential equations were then constructed using numerical techniques, following the approach introduced in ref. [49]. Importantly, we showed how finite fields, despite not being algebraically closed, can be used throughout the calculation even when intermediate stages require taking square roots, as roughly half of the nonzero elements of the field are perfect squares. Beyond a pure basis, this method of constructing the differential equation requires a priori knowledge of the symbol alphabet. Remarkably, we find that the alphabet can be constructed by considering differential equations for only maximal and next-to-maximal cut integrals, which can easily be constructed analytically.
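As a minimal illustration of this point (our own sketch, not the authors' implementation), square roots modulo a prime p with p ≡ 3 (mod 4) can be taken explicitly whenever the argument is a quadratic residue, which by Euler's criterion holds for roughly half of the nonzero residues:

```python
# Illustrative sketch (ours, not the paper's code) of square roots in F_p:
# for a prime p with p % 4 == 3, a quadratic residue a has the explicit
# square root a**((p+1)//4) mod p.

def is_square(a, p):
    """Euler's criterion: nonzero a is a perfect square iff a^((p-1)/2) = 1 mod p."""
    return pow(a, (p - 1) // 2, p) == 1

def mod_sqrt(a, p):
    """Square root of a quadratic residue a mod p, valid for primes p = 3 (mod 4)."""
    assert p % 4 == 3 and is_square(a, p)
    return pow(a, (p + 1) // 4, p)

p = 2**31 - 1            # a prime with p % 4 == 3
t = 123456789
a = t * t % p            # a perfect square by construction
r = mod_sqrt(a, p)
assert r * r % p == a    # r is a valid square root of a
assert r in (t % p, p - t)
```

In an actual finite-field sampling setup one would, for instance, restrict to sample points (or rotate primes) such that the required arguments are perfect squares in F_p; the details here are our assumption, not a description of the authors' code.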
The alphabet displays a number of notable features. Firstly, despite the complex nature of the five-point one-mass kinematics, the full set of letters can be written in a remarkably compact form. We expect the alphabet itself to be of great use in the future. Indeed, it forms the minimal necessary information required for the construction of pentagon functions, extending the construction of ref. [32] to five-point one-mass kinematics. These are a valuable tool for compactly presenting scattering amplitudes, which has been shown to be of great use in the reconstruction of analytic results from numerical data [4,5,11,12].
We also considered the analytic structure of the master integrals at symbol level and made a number of interesting observations. First, we find that certain letters which arise in the master integrals at all orders in ϵ are in fact not present in the symbol at weight four, that is, they decouple from the four-dimensional physics. Second, we confirm that the master integrals satisfy the extended Steinmann relations to all orders in ϵ, and also observe that there are other, as-yet unexplained, similar relations.
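Schematically (in our notation, not the paper's), the extended Steinmann relations forbid sequential discontinuities in overlapping channels:

```latex
% Extended Steinmann relations (schematic): for any two overlapping
% channels s_i and s_j, the double discontinuity of an integral I vanishes,
\mathrm{Disc}_{s_i}\!\left(\mathrm{Disc}_{s_j}\, I\right) = 0,
% which at symbol level means that the letters s_i and s_j never appear
% in adjacent slots of the symbol -- not only in the first two slots
% (the original Steinmann relations), but at any depth ("extended").
```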
In order to solve the differential equations, we employed the generalized series-expansion method of ref. [81] both to compute the integrals in all kinematic regions relevant for vector-boson production in association with two jets, and to obtain Euclidean boundary conditions from consistency conditions of the differential equation. We demonstrated the viability of the method for applications to LHC physics through a number of numerical studies, both at high precision for individual phase-space points and more generically over physical regions.
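To illustrate the idea behind such a series solution (a toy sketch under our own conventions, not the paper's code), consider the scalar canonical-form equation dI/dt = ϵ a(t) I with a(t) = 1/(1−t): solving order by order in ϵ reduces to repeated multiplication and term-wise integration of truncated power series, with the exact solution (1−t)^(−ϵ) available for comparison:

```python
# Toy sketch of a generalized series solution of dI/dt = eps * a(t) * I,
# with a(t) = 1/(1-t), expanded around t = 0 (our example, not the paper's).
import math

N = 40          # truncation order in t
K = 4           # orders in eps
a = [1.0] * N   # coefficients of a(t) = 1/(1-t) = sum_n t^n

def mul(f, g):
    """Truncated Cauchy product of two power series."""
    h = [0.0] * N
    for i, fi in enumerate(f):
        for j in range(N - i):
            h[i + j] += fi * g[j]
    return h

def integrate(f):
    """Term-by-term primitive with zero constant term: int_0^t f."""
    return [0.0] + [f[n] / (n + 1) for n in range(N - 1)]

# I = sum_k eps^k I_k(t), with boundary value I(0) = 1:
# I_0 = 1 and I_k' = a * I_{k-1} with I_k(0) = 0.
orders = [[1.0] + [0.0] * (N - 1)]
for k in range(1, K + 1):
    orders.append(integrate(mul(a, orders[-1])))

def eval_series(f, t):
    return sum(c * t**n for n, c in enumerate(f))

# Compare against the exact solution I(t) = (1-t)^(-eps),
# whose eps-expansion coefficients are (-log(1-t))^k / k!.
t = 0.1
for k in range(K + 1):
    exact = (-math.log(1 - t))**k / math.factorial(k)
    assert abs(eval_series(orders[k], t) - exact) < 1e-12
```

In the multi-integral case the same recursion applies with a(t) replaced by the matrix of the canonical differential equation restricted to a one-parameter contour; stitching such expansions along contours between phase-space points is what trivializes the analytic continuation.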
A natural next step is to consider the non-planar extension of the integrals considered here, especially given their relevance for precise predictions for the production of a Higgs boson in association with two jets at hadron colliders. In the case of massless scattering, it was observed that the non-planar symbol alphabet could be obtained through permutations of the planar alphabet, and it would be interesting to see if this also holds here. As the generalized-series approach is powerful and applicable to any first-order linear differential equation, it would also be interesting to develop an automated public implementation for general Feynman integrals.
Another useful parametrization is one which rationalizes both tr_5 and √Δ_3 simultaneously. The corresponding change of variables is given via