Evolution and interpolation of double parton distributions using Chebyshev grids

Double parton distributions are the nonperturbative ingredients needed for computing double parton scattering processes in hadron–hadron collisions. They describe a variety of correlations between two partons in a hadron and depend on a large number of variables, including two independent renormalization scales. This makes it challenging to compute their scale evolution with satisfactory numerical accuracy while keeping computational costs at a manageable level. We show that this problem can be solved using interpolation on Chebyshev grids, extending the methods we previously developed for ordinary single-parton distributions. Using an implementation of these methods in the C++ library ChiliPDF, we study for the first time the evolution of double parton distributions beyond leading order in perturbation theory.


Introduction
Double parton scattering (DPS) is a mechanism in hadron-hadron collisions in which two partons from each hadron initiate two separate hard-scattering processes.The Tevatron and the LHC have provided evidence for this mechanism for a wide range of final states with heavy flavors, jets, photons, or electroweak gauge bosons [1][2][3][4][5][6].While often suppressed compared with single-parton scattering, DPS can be significant and even dominate in specific final states or kinematic regions.A prominent example is like-sign W production [5,[7][8][9][10][11][12][13], which (via its leptonic decay) also provides a background to searches for physics beyond the Standard Model [14,15].Significant progress in the theory description of DPS has been made in the last decade [16][17][18][19][20][21][22][23][24], and there is a sustained interest in phenomenological aspects (see for instance the recent study in [25] and the references therein).A detailed account of theoretical and experimental aspects of DPS is given in ref. [26].
Reflecting the multitude of information they contain, DPDs depend on a large number of variables, namely the momentum fractions x 1 and x 2 of the two partons, the transverse distance y between them, and the renormalization scales µ 1 and µ 2 associated with each parton (corresponding to the factorization scales in the two hard-scattering processes that the partons initiate).Their scale dependence is well understood (see eq. ( 1) below), but the numerical delivery of evolved DPDs is a computational challenge.We believe that this represents one of the bottlenecks for using realistic forms of DPDs in predictions, either at the analytical level or in the form of Monte Carlo event generators [41][42][43][44][45].In the present work, we present a method that addresses this challenge, allowing for high numerical accuracy at a moderate computational cost.
We recall that the evolution equations for PDFs can be solved numerically either by discretization in the momentum fraction x or in Mellin space, and that public codes are available for both options, see for instance refs.[46][47][48][49][50] and [51][52][53].For DPDs, even the simplest realistic initial conditions for evolution have a correlated dependence on x 1 and x 2 (see eqs. (7) and (8) below), which precludes the analytic computation of complex Mellin moments in these two variables.As far as we can see, this makes a discretization in x 1 and x 2 inevitable.
The main reason why the handling of evolved distributions is much more demanding for DPDs than for PDFs is the sheer amount of computer memory needed to store the former.As an example, consider interpolation grids for x 1 and x 2 with 64 points (for PDFs, this is at the lower end of typical grid sizes in the LHAPDF interface [54]), and grids with 48 points for y, µ 1 , and µ 2 .This corresponds to 64 2 × 48 3 ≈ 4.5 × 10 8 real numbers and thus (with 8 bytes for a double precision floating-point number) to 3.4 GiB per parton flavor combination.A full unpolarized DPD with 5 active quark flavors has 11 2 = 121 parton flavor combinations.Getting evolved DPDs from pre-computed grids would thus require an enormous amount of memory (with an added penalty of accessing this memory for interpolation in the five variables).The path we have chosen instead is to store a DPD only for a pair of initial scales, and to evolve "on the fly" to the desired values of µ 1 and µ 2 .In the above example, this reduces the memory imprint to 64 2 × 48 points and thus to 1.5 MiB per flavor combination.
We will see in subsection 3.4 that the computational effort for evolving a DPD with p x grid points in x 1 and x 2 scales like p 3 x , whilst the scaling with the number p y of points in y is at most linear.It is therefore of great importance to limit the size of interpolation grids for the momentum fractions, without sacrificing numerical accuracy.In our previous work [55] we found that a highly accurate interpolation and evolution of PDFs in x space is possible using Chebyshev grids.In the present paper, we show how to adapt this approach to the case of DPDs.We find that with p x ∼ 54 and p y ∼ 40 to 56 points, one can achieve high accuracy for the interpolation and integration of DPDs with x 1 , x 2 ≥ 10 −5 and y ≥ (7 TeV) −1 .
We have implemented our approach in the C++ library ChiliPDF, 1 which is under development and which will be made public in the future.Our implementation demonstrates that the methods we will describe do work in practice.It also allows us to explore a number of physics aspects.So far, DPD evolution has been studied only with DGLAP kernels at leading order (LO).We extend this to next-to-leading and next-to-next-to-leading order (NLO and NNLO).Moreover, our implementation includes the change in the number of active quark flavors at specified matching scales, and access to distributions with different scales (µ 1 and µ 2 ) and flavor numbers (n 1 and n 2 ) for the two partons.Last but not least, we can evolve DPDs for polarized partons, which may for instance have measurable impact on observables in like-sign W production according to the studies in refs.[12,13].
This paper is organized as follows.In section 2.1 we recall some basics about DPDs and set up the corresponding notation.Using our implementation for a physics study, we compare in section 2.2 the evolution of DPDs at LO, NLO, and NNLO.In section 3, we summarize the method we developed in ref. [55] for interpolating and evolving PDFs and describe its extension to DPDs.We study the joint interpolation in the two momentum fractions x 1 and x 2 in section 4, and the evolution in the two scales µ 1 and µ 2 in section 5. Section 6 is devoted to the interpolation in the distance y between the partons, from some minimum value to infinity.In section 7 we cross check our implementation against the DPD evolution code DOVE, which was originally introduced in ref. [28] and further developed in subsequent work.We conclude in section 8.In appendix A we prove an important statement about the independence of DPD evolution and flavor matching on the path taken in the (µ 1 , µ 2 ) plane.

Theory framework
To begin with, let us recall some properties of DPDs that are relevant to this work.A DPD F n 1 ,n 2 a 1 a 2 (x 1 , x 2 , y; µ 1 , µ 2 ) depends on the longitudinal momentum fractions x 1 and x 2 of the two partons, on their distance y in the transverse plane, and on the factorization scales µ 1 and µ 2 associated with each parton.The labels a 1 and a 2 specify the flavor and polarization of each parton.DPDs involving transverse quark or linear gluon polarization have open Lorenz indices and depend on the difference y of the transverse parton positions rather than on the length y of this vector.As specified in section 2 of ref. [56], they can be decomposed into scalar distributions that depend on y.The evolution and matching equations discussed below hold separately for these scalar distributions.Throughout this work, we consider only DPDs in the color-singlet channel, i.e.where the color is summed over separately for each of the two extracted partons.
Notice that we allow different numbers n 1 and n 2 of active quark flavors for the two partons, which is useful when the hard-scattering processes initiated by a 1 and a 2 take place at very different scales.Technically, n 1 and n 2 are specified by the renormalisation prescription for the twist-two operator pertaining to parton a 1 and a 2 , respectively.
The scale dependence of a DPD is given by the evolution equations with separate Mellin convolutions for the two momentum fractions.Although we define each convolution integral in eq. ( 2) with the upper integration limit equal to 1, the effective integration range is reduced by the support property of DPDs, which is conserved by the evolution equations (1).The evolution kernels P n ab (z; µ) in eq. ( 1) are identical to the familiar DGLAP kernels for the evolution of PDFs with n active quark flavors.Note also that the distance y plays no active role in the scale evolution, so that each "slice" of a DPD at fixed y evolves by itself.
We remark in passing that the homogeneous evolution equations (1) hold for DPDs depending on the interparton distance y.An inhomogeneous term appears if one integrates the distributions over y or performs a Fourier transform from y to its conjugate transverse momentum [19,24].This term is closely related with the splitting contribution (7) discussed below and has been extensively studied in the earlier literature [28,[57][58][59][60]. Throughout this work, we will exclusively work with DPDs that depend on y and evolve as given in eq. ( 1).
The transition from n − 1 to n active flavors in a DPD is described by matching equations where m n is the mass of the quark with flavor number n and the matching kernels A n ab (z, m n ; µ) are the same as the ones for PDFs [61]: Eq. ( 1) is a coupled set of differential equations in µ 1 and µ 2 .In appendix A we show that with an initial condition at some point (µ 01 , µ 02 ) one obtains a unique DPD regardless of the evolution path taken in the (µ 1 , µ 2 ) plane.We will also show that the result of successive flavor matching does not depend on the order in which the flavor thresholds for the two partons are crossed.Note that the path independence of evolution and flavor matching in the (µ 1 , µ 2 ) plane is exact even if the perturbative expansions of the DGLAP and matching kernels are truncated.This is in contrast to the independence on the scale at which the flavor matching is performed, which only holds to the perturbative order of the truncation.
The following evolution and matching kernels are currently implemented in ChiliPDF and used in the present work: • DGLAP evolution for unpolarized and longitudinally polarized partons up to NNLO (order α 3 s ).For the NNLO kernels we use the approximate parameterized forms given in refs.[62,63] and refs.[64,65].The kernels in these references were confirmed by the calculations in refs.[66,67].
• DGLAP evolution up to NLO for transversely polarized quarks and linearly polarized gluons, with the kernels given in refs.[68,69].
• Flavor matching for polarized partons up to NLO.At order α s , the matching kernels are directly proportional to the DGLAP kernels for the same polarization.
We now take a closer look at the dependence of DPDs on the kinematic variables x 1 , x 2 , and y.The y dependence is not directly observable, because DPDs enter cross sections (differential or integrated) via so-called double parton luminosities where y is integrated over with a lower cutoff y min of order 1/ min(µ 1 , µ 2 ).The origin of this cutoff is explained in ref. [24], where it is also shown how the dependence on this cutoff is removed when the cross sections for single and double parton scattering are combined.The form in eq. ( 6) holds for unpolarized and longitudinally polarized partons; for transverse quark or linear gluon distributions additional tensors depending on y appear under the integral.
At sufficiently small distances y, the dominant contribution to a DPD arises from the perturbative splitting of a single parton into the two observed partons a 1 and a 2 .At leading order in the coupling, this mechanism gives [19,24] F spl a 1 a 2 (x 1 , x 2 , y; µ y , µ y ) = where the scale µ y should be taken of order 1/y so as to avoid large logarithms ln(yµ) in higher-order corrections.In addition, there is an "intrinsic" two-parton contribution, which lacks the 1/y2 enhancement of the splitting part at small y but grows more strongly at small x 1 and x 2 for gluons and sea quarks.At distances y of nonperturbative size, our knowledge of DPDs is rather limited, and in practice one needs to make a model ansatz.Where needed, we will take recourse to the model used in refs.[24,30], where a DPD is written as a sum F int + F spl of intrinsic and splitting parts at all y.At a starting scale µ 0 , the intrinsic part of the DPD is assumed to be proportional to the product of two PDFs: with a phase space factor to ensure that the DPD smoothly goes to zero at the kinematic boundary Other models in the literature take only a power (1 − x 1 − x 2 ) r ; for the motivation of the form (9) we point to section 3.2 in ref. [28].
The splitting term in the model of refs.[24,30] is given by the perturbative splitting expression (7) multiplied with the same exponential factor exp −y 2 /(4h a 1 a 2 ) as in the intrinsic part.This retains the correct small-y limit whilst providing a more realistic decrease at large y.

Quantitative impact of higher orders in DPD evolution
With the methods presented in this work, we are able to evolve DPDs at different perturbative orders, for all relevant polarization combinations and with different scales and active flavor numbers of the two partons.As a demonstration, we now compare DPDs evolved at LO, NLO, and NNLO from a common starting condition.
As starting condition we take the perturbative splitting form (7) with n 1 = n 2 = 3 active flavors.The splitting formula is evaluated at µ y = 2 GeV and y ≈ 0.561 GeV −1 , which corresponds to the scale choice µ y = b 0 /y with b 0 = 2e −γ E ≈ 1.12, where γ E is the Euler-Mascheroni constant.This choice is suggested by the form of the NLO corrections to the splitting formula (7) and was found to keep these corrections at a moderate size [73,74].For the PDFs in the DPD splitting formula, we take the default NNLO set of the MSHT20 parameterization [75]. 2 The parameterization assumes quark masses m c = 1.4 GeV, m b = 4.75 GeV and a five-flavor coupling α s (m Z ) = 0.118.
We evolve these initial conditions to µ 1 = 9 GeV with n 1 = 3 for the first parton and to µ 2 = m W with n 2 = 5 for the second one.This is a setting relevant for the production of a J/Ψ and a W , a channel for which double parton scattering has in fact been observed experimentally [4,76].Flavor matching for the second parton is performed at the charm and bottom quark masses, with kernels of the same order (LO, NLO, or NNLO) as the evolution kernels.An exception is longitudinally polarized evolution at NNLO, where we use NLO flavor matching.
We show the effect of the higher-order DGLAP evolution for a selection of unpolarized DPDs in figure 1.The distributions are given in the top panel of each plot, and their ratios with respect to the leading-order distribution in the lower panel.At small momentum fractions, higher-order evolution has an effect of order 10% on F gg , whereas for the other flavor combinations shown in the figure, the effect is of order 1.In all cases, we find that the change from LO to NLO is much larger than the rather moderate change from NLO to NNLO.
A selection of polarized DPDs is shown in figure 2. For linear gluon and transverse quark polarization, we have only implemented evolution at LO and NLO.We find that the NLO corrections to evolution introduce effects of order 10% for ∆u∆ū and δuδ ū in different regions of x 1 , where as for ∆g∆g they reach 70% at small x 1 .For δgδg, one obtains a huge relative difference between NLO and LO evolution in the limit x 1 x 2 .This is because for z → 0 the splitting function P δgδg (z) vanishes like z at LO but grows like 1/z at NLO [69].At both orders, the evolved distribution at x 1 x 2 remains however small compared with its peak around As is the case for PDFs, DPDs are not observables, and distributions evolved at different orders are combined with parton-level cross sections computed at different orders.Nevertheless, the strong evolution effects we observed in several channels and kinematic regions suggest that NLO corrections in double parton scattering can be of substantial size.
In figure 3, we illustrate the scale dependence of DPDs, using evolution at NNLO and flavor matching at the highest available order (NNLO for unpolarized partons and NLO for polarized ones).We show the distributions at the starting point (µ 1 , µ 2 ) = (2 GeV, 2 GeV), and after evolution to different combinations of the scales µ i = 9 GeV or µ i = m W for the two partons, with associated flavor numbers n i = 3 in the first and n i = 5 in the second case.For unpolarized partons, we find that the effect of evolution from µ 1 = µ 2 = 2 GeV to 9 GeV is huge at small x, turning a shape that is flat or increasing with x at the low scale into a shape decreasing with x at the higher scale.Subsequent evolution to m W for one or both partons has a significant, although less dramatic, effect.For polarized partons, evolution is generally weaker, and we find DPDs increasing with x at small x even when both partons are taken at the high scale m W . tion of the DGLAP evolution equations.A detailed account of Chebyshev interpolation and its applications can be found in ref. [77]. 3

Chebyshev interpolation
We start by reviewing Chebyshev interpolation of a function The function is discretized on a grid consisting of the so-called Chebyshev points, which for given N read These are the points where the N th Chebyshev polynomial of the first kind, T N (t), assumes its maxima +1 and minima −1.They form a descending series from t 0 = 1 to t N = −1 and satisfy the symmetry property t N −i = −t i .We call the set of Chebyshev points a Chebyshev grid.
A sufficiently smooth function f (t) on t ∈ [−1, 1] can be approximated by the Chebyshev interpolant p N (t).This is a series of Chebyshev polynomials T j (t) with j = 0, . . ., N whose expansion coefficients are such that one has p N (t i ) = f (t i ) at the Chebyshev points.For details see equations (2.6) to (2.8) in ref. [55].
Barycentric formula.A simple and efficient way to compute the Chebyshev interpolant without having to evaluate Chebyshev polynomials is given by the barycentric formula with the barycentric basis functions where β 0 = β N = 1/2 and β i = 1 otherwise.Even though it is not directly evident from eq. ( 12), the b i (t) are polynomials of order N .The number of operations for evaluating the barycentric formula scales linearly with N .The barycentric formula is found to be numerically stable in the interpolation interval (however, this is not the case for extrapolating the function f (t) outside this interval, see chapter 5 of ref. [77]).
Accuracy estimate.A computationally inexpensive estimate for the accuracy of Chebyshev interpolation is obtained by interpolating f (t) on the Chebyshev grid without the end points t 0 and t N .The resulting interpolant q N −2 (t) is a polynomial of order N − 2 and can be computed with an adapted form of the barycentric formula, see equations (2.23) to (2.27) in ref. [55].Comparing p N (t) with q N −2 (t) yields an estimate for the interpolation accuracy, i.e. for the difference between p N (t) and f (t).This estimate was found to work well for PDF interpolation [55], and we will investigate in later chapters how reliable it is in the case of DPDs.
Integration.Using the expansion of f in terms of Chebyshev polynomials T i (t) and the analytic form for integrals of these polynomials, one readily obtains the integration rule with weights This integration rule is known as Clenshaw-Curtis quadrature.For a detailed discussion of its accuracy (and comparison with Gauss quadrature), we refer to chapter 19 of ref. [77] and to section 2 of ref. [55].
An estimate for the integration accuracy can be obtained by using a quadrature rule on the Chebyshev grid without its end points, in full analogy to the procedure described above for interpolation.The result is known as Fejér's second rule.Details are given in equations (2.31) to (2.33) of ref. [55].

Interpolation strategy
In this subsection, we consider the scales µ 1 and µ 2 to be fixed and omit them as arguments for brevity.We discuss the discretization of DPDs in the three variables x 1 , x 2 , and y, which we generically denote by z.Depending on the typical dependence of a DPD on z, different variable transformations z → u are used to map an interval [z min , z max ] onto a finite interval [u min , u max ].For the momentum fractions x 1 and x 2 we always use the transformation which corresponds to the choice for interpolating PDFs in ref. [55].For both x 1 and x 2 , the interpolation interval has a fixed upper limit x max = 1, whereas the lower limits may be equal or different (but must be nonzero).For the y dependence (where y max may be infinite or finite) we use a number of different transformations, which are presented in subsection 6.1.
For reasons given below, the z intervals are usually split into a few subintervals.On each subinterval, we perform a linear transformation from u to a variable t ∈ [−1, 1], which is used for Chebyshev interpolation as described in subsection 3.1.
Consider a particular subgrid with N + 1 points u 0 , . . ., u N , which is mapped by a linear transform onto the Chebyshev grid t 0 , . . ., t N given in eq. ( 10).The corresponding grid points in the physical variable are then given by the inverse variable transformation z i = z(u i ).Similarly to the PDF case, it is advantageous to interpolate DPDs scaled by their momentum fractions, because this leads to less steep functions in x 1 and x 2 .The interpolation in one variable is given by the barycentric formula where the ellipsis denotes the other variables and The interpolation of the full DPD is obtained by discretization in both momentum fractions and the interparton distance and reads where N x,1 , N x,2 , and N y specify the polynomial order of interpolation in each variable, and denotes the values of F on the interpolation grid.Eq. ( 18) reflects that the form of the barycentric basis functions (12) remains unchanged under a linear transform of the interpolation variable.Analogous formulae can be used to interpolate functions derived from DPDs, such as the Mellin convolutions of DPDs with an integral kernel (see subsection 3.3).Note that we interpolate F (x 1 , x 2 , y) on a full rectangle in the (x 1 , x 2 ) plane, given by x 1 ∈ [x 1,min , 1] and x 2 ∈ [x 2,min , 1].This includes the region x 1 + x 2 > 1 where the DPD is zero.In this way, we can use the same interpolation grid in x 1 for all values of x 2 and vice versa, which in turn permits a simple implementation of DGLAP evolution for the two partons.The implication of this choice on the interpolation accuracy is discussed in section 4.
Chebyshev interpolation is "global" in the sense that the interpolant at a given point z depends on the function values at all points z i in the interpolation interval.For grids with many points, this can lead to an accumulation of rounding errors, especially in regions of z where the function is much smaller than elsewhere in the interval.To limit this effect, it is advantageous to split the full z domain into several subintervals and to use the barycentric formula (19) separately on each subinterval.In ref. [55] this was found to substantially improve the interpolation accuracy for PDFs, provided that the number of points in each subinterval does not become too small.We find that the same is true for the interpolation of DPDs in x 1 and x 2 , as well as in y.
To specify such a composite grid with k individual subgrids, we use the notation where z i are the subinterval boundaries and p i = N i + 1 is the number of Chebyshev points in subgrid number i.We refer to this as a (p 1 , p 2 , . . ., p k )-point grid.Note that neighboring subgrids share their end points, such that the total number of grid points is p total = k i=1 p i − (k − 1).Splitting the interpolation interval in y has the additional benefit that one can use different variable transformations u(y) for regions with different characteristic y dependence, such as the power law behavior from perturbative splitting and an exponential decrease at large distances.Details will be discussed in subsection 6.1.

Mellin convolution
Consider now the convolution of a DPD with an integral kernel, which may be a DGLAP evolution kernel, a flavor matching kernel, or a hard-scattering coefficient.The general expression to be computed reads where x stands for x 1 or x 2 and the dependence of the rescaled DPD on all other variables is omitted for now.The kernel K(z) is appropriately rescaled, given that To discretize eq. ( 22), we consider for simplicity a single Chebyshev grid with p x points for x ∈ [x min , 1].The result of the convolution (K ⊗ F )(x) is a function defined in the same x interval as F (x) and can hence be interpolated on the same grid.It is thus sufficient to evaluate eq. ( 22) at the grid points x i , where in the second step we have approximated F (x i /z) by its interpolated form.Here and in the following, we use the number of grid points p x rather than the polynomial order N x = p x − 1 to specify summation ranges.At the grid points, the convolution is thus obtained by a matrix multiplication where the kernel matrix can be computed using standard numerical integration techniques (we use an adaptive Gauss-Kronrod routine).If K(z) contains plus-or δ distributions, this requires rewriting eq. ( 25) as specified in equations (4.5) to (4.10) of ref. [55].
It is straightforward to generalize the preceding discussion to the case where the interval [x min , 1] is split into several subintervals with corresponding subgrids.The relation (24) remains valid, with an appropriately generalized definition of the kernel matrix and with p x being the total number of grid points in x.
The kernel matrix method can be readily used for flavor matching of DPDs, discretizing the Mellin convolutions with the matching kernels in eq. ( 4).It is also a crucial part in DPD evolution, which is presented next.

DGLAP evolution
We now discuss the evolution of DPDs, starting with evolution in µ 1 at a fixed value of µ 2 .The evolution equation for the rescaled DPD reads where P (z; µ 1 ) = z P (z; µ 1 ) is the rescaled splitting function and we omit parton labels and the associated sums for the time being.After discretization on one or several subgrids with a total number of p x,1 , p x,2 , p y points in x 1 , x 2 , y, the integro-differential equation (26) turns into a linear system of ordinary differential equations in µ 1 , where F ijk (µ 1 , µ 2 ) is given in eq. ( 20) and P ii is the matrix for the kernel P (z) as defined in eq. ( 25) or its generalization to several subgrids.At NNLO accuracy, we have and after a change of variables from the scales µ i to the evolution times we obtain with a new kernel matrix given by ii + e −t 4π P ii + e −2t (4π where β k are the expansion coefficients of the QCD β function (see equation (5.4) in [55] for our normalization conventions).
To solve eq. ( 30) we use a Runge-Kutta algorithm with a fixed maximal step size in t 1 .Notice that the t 1 dependence of K ii (t 1 ) starts only at NLO, which leads to a more uniform accuracy of the algorithm compared with the direct solution of eq. ( 27), where P ii (µ 1 ) changes with ln µ 1 already at LO.As documented in appendix A of ref. [55], we find that the use of a higher-order Runge-Kutta algorithm -in particular the 8 th order algorithm of Dormand and Prince [78] -greatly improves the accuracy of evolution for a given number N RK of intermediate time values at which the r.h.s. of the differential equation is evaluated for evolution from one value of t 1 to another.
The Runge-Kutta algorithm is also used to compute the transformation (29), solving the differential equation for the running of α −1 s as a function of ln µ.Flavor labels and sums have to be added in eq. ( 30) as in the original version (1).The mixing between parton flavors under evolution is the same as for PDFs, and correspondingly we solve the evolution equations in the same basis that is used for PDFs in ref. [55].For each of the two partons, we thus form the linear combinations where q ± = q ± q.The combination Σ − and all flavor differences then evolve by themselves, and mixing is reduced to the combination Σ + and the gluon.
The system (30) needs to be solved separately for each combination of the indices j, k that correspond to the "inactive" variables x 2 and y during evolution in t 1 , and for each of the 2n 2 +1 active flavors of the second parton.This gives a large number of p x,2 p y (2n 2 +1) combinations -typically several thousand.It is much more efficient to apply the Runge-Kutta algorithm not to each of these combinations, but to a set of p x,1 basis vectors e j i that generates all possible initial conditions on the grid for x 1 .With the choice of basis e j i = δ ij , evolution from t 01 to t 1 requires solving the system with the boundary condition U ij (t 01 , t 01 ) = δ ij .The solution of eq. ( 30) is then obtained by a matrix multiplication where the evolution matrix U ii (t 1 , t 01 ) acts as the discretized version of the Green function for the original evolution equation (26).With flavor labels reinstated, eq. ( 33) is solved separately for the evolution kernels of the different channels (i.e. for q + i − q + j , q − i − q − j , Σ − , and the coupled system of Σ + and the gluon).
This approach is also attractive in the PDF case when a large set of different PDFs is evolved simultaneously, and it is for instance used in the PDF evolution codes APFEL [49] and EKO [53], with U ii (t 1 , t 01 ) being called an evolution operator and an evolution kernel operator, respectively.In ChiliPDF, the evolution of a DPD in one scale is implemented in the same way as the evolution of a set of PDFs.
The number of operations for evaluating eq. ( 34) for all indices i, j, k and flavors for the second parton scales like (p x,1 ) 2 p x,2 p y (2n 2 + 1), and the number of operations for solving the evolution equation ( 33) scales like (p x,1 ) 3 N RK with N RK introduced below eq.( 31).The number of grid points for the interpolation in the momentum fractions is therefore a major factor determining the computational speed of DPD evolution.
Let us briefly sketch how the initial conditions for DPD evolution are set up in ChiliPDF.In a first step, the values of F a 1 a 2 x 1 , x 2 , y; µ(y), µ(y) for a selected number n 0 of active flavors are computed on the grid points in x 1 , x 2 and y, either from pre-defined functions for the splitting form (7), or from functions implementing a product ansatz like the one in (8), or from functions entirely provided by the user.As is adequate for the physics, the input scale µ(y) may depend on y as specified by the user.For each grid point y k , the discretized DPD is then evolved to a pair of scales (µ 0 , µ 0 ) that is independent of y and stored as an initial condition.After this, all evolution calls can use the same set of evolution matrices for all grid points y k , which significantly reduces computing time.One may also add two DPDs (after evolution to a common scale pair), which is for instance needed for calculating F int + F spl .In the procedure just described, the flavor number n 0 is equal for both partons.DPDs with larger values of n 1 or n 2 are then obtained by flavor matching, with initial conditions at the pair of scales (µ 1 , µ 2 ) where the matching is carried out.More detail is given in subsection 5.1.

Interpolation of DPDs in both momentum fractions
As shown in the previous section, our approach to interpolating DPDs in x 1 and x 2 is a straightforward two-dimensional generalization of the method presented in ref. [55] for the interpolation of PDFs in x.In the present section, we study the performance of this approach.Let us point out some particularities of the DPD case: 1.To limit computation time, interpolation grids should not have too many points, while still giving numerically reliable results.For PDFs one can typically afford denser grids, and the demands on interpolation accuracy are often higher than for DPDs.
2. The DPD is a non-analytic function of x 1 at the point x 1 = 1 − x 2 because it vanishes identically above that value.Since we interpolate on the same grid in x 1 for all values of x 2 , this singular point is generally in the middle of an interpolation interval.Polynomial interpolation is not very accurate in the vicinity of such a point, but we will see that this effect is in general relatively mild.
3. The perturbative splitting mechanism gives a dependence on x 1 that is qualitatively different from the x dependence of a PDF.In some channels, the splitting contribution (7) goes to zero for x 1 x 2 and thus results in a strong rise of the DPD as a function of x 1 .
Of course, the last two points also apply to interpolation in x 2 at a given x 1 .
In the following, we will study the interpolation accuracy for DPDs that have a known analytic expression.To this end, we simplify the forms ( 8) and (7) to and respectively.For the PDFs, we take the simple parameterizations of the Les Houches benchmark PDFs given in section 1.32 of ref. [79] and section 4.4 of ref. [80].
Corresponding to the typical use case, we take the same interpolation grids for x 1 and x 2 (a situation in which different grids are useful is encountered in section 7).We consider three composite grids of the form where the shorthand notation from eq. ( 21) has been used.We find that for the given x range, three subgrids with the above interval limits generally give the best accuracy for a given total number of points.As a measure for the quality of interpolation, we use the relative accuracy defined as relative accuracy = interpolated result/exact result − 1 .
In the following, this interpolation accuracy will be shown as a function of x 1 for fixed values of x 2 .The latter are not grid points, so that interpolation is performed for both momentum fractions.

Interpolation of the product form
We start with the product form (35) for the three values r = 0, 1, 2 of the power in the phase space factor.The case r = 0 corresponds to a "naive" product form with a discontinuity at x 1 + x 2 = 1 that is not very plausible on physics grounds.In fact, evolution to higher scales quickly changes such a discontinuity to a steep but continuous decrease.We nevertheless include the case r = 0 here, in order to see how our method performs in this limiting case.
The following plots show the uū distribution; we verified that for other parton combinations the interpolation accuracy is not significantly better or worse.
In the left panels of figure 4 we see that all three grids yield an excellent interpolation accuracy at small momentum fractions.Not surprisingly, there is little dependence on the power r in this region, given that the phase space factor is very close to 1 for x 1 , x 2 1.In the right panels of the figure, we see that a degradation of the relative accuracy sets in as x 1 approaches its kinematic limit 1 − x 2 = 0.7, and that the degradation is stronger for smaller r.Nevertheless, the accuracy is better than 1% as long as the distance 1 − (x 1 + x 2 ) from the kinematic threshold is above 0.15 for r = 1, 2 and above 0.2 for r = 0.
The plots in figure 5 show the x 1 region around the point where the DPD has a nonanalytic behavior, which is at x 1 = 0.7 in this example.In these plots, we include the region x 1 > 1 − x 2 , where the interpolated DPD oscillates around its true value zero.Note that this region is included when one computes a Mellin convolution for the DPD with the kernel matrix method described in subsection 3.3.However, the effect of oscillations around the true value will partially cancel out in convolution integrals.
We see that for r = 0, one obtains a rather poor approximation of the DPD in the range x 1 ∈ [0.6, 0.8] with the coarse or medium grid.In situations where this region is deemed important, one should take a dense grid and check the reliability of the results by comparing with an even denser grid.Of course, this is only needed in the relevant subinterval (i.e. for x 1 ∈ [0.5, 1] in our example) and hence entails only a moderate increase in the total number of grid points.For r = 1, where the DPD is continuous but has a discontinuous first derivative at x 1 = 0.7, we see that the fine grid performs very well, and for r = 2 already the medium grid yields a good approximation of the DPD.

Interpolation of the splitting form
We now turn to the interpolation of the splitting form.We see in eq. ( 36) that it decreases almost as fast as a PDF for x 1 + x 2 → 1, so that according to the results of the previous subsection one does not expect particular problems with interpolation in that region.This is confirmed by our numerical studies.figure 4, which can be understood because the splitting form has more structure in that region.This is also seen for the interpolation of F spl δgδg at small momentum fractions in panel (c), where the accuracy of the medium grid is at the permille level.In panel (d) we see that the linear decrease of the distribution for x 1 x 2 requires the fine grid for a satisfactory interpolation at the smallest x 1 , where the DPD is many orders of magnitude smaller than its maximum value.On the other hand, the medium grid still performs well for x 1 > 10 −4 .We note that the coarse grid does not give a good accuracy in any of the cases considered here, and it is entirely unreliable in the case of panel (d).We conclude that at at least 16 points per subgrid should be taken, unless one considers only unpolarized DPDs and has only very moderate accuracy requirements.

Estimating the interpolating accuracy
Let us finally study the reliability of the accuracy estimate for interpolation described in subsection 3.1, given by relative accuracy estimate = value interpolated without end points value interpolated with end points − 1 .
In figure 7, we compare this estimate with the true relative accuracy (38) for the medium x grid in eq. ( 37).We see that the estimate is typically not more than two orders of magnitude away from the true accuracy (except around points where the error estimate or the true error has a zero crossing).Somewhat surprisingly, the estimate almost coincides with the true accuracy in some cases, as seen in figure 7(a).Closer inspection reveals that the estimate tends to be above the true accuracy close to subinterval end points; this is expected because in that region interpolation without the end points is particularly bad by construction.We find a qualitatively similar picture for the accuracy estimate in other kinematic settings, and for the other two grids in eq.(37).Although not perfectly accurate in all regions, the method based on eq. ( 40) provides a good and easy-to-compute indicator for the quality of interpolation.

Evolution and flavor matching of DPDs
In this section, we demonstrate the accuracy of DGLAP evolution and flavor matching for DPDs with the methods presented in subsections 3.3 and 3.4.As a default setting for DPD evolution, we take a maximal step size of h = 0.22 for the evolution time (29) in the Runge-Kutta method.We use the 8 th order algorithm of Dormand and Prince [78], where each Runge-Kutta step is divided into 13 substeps for which the r.h.s. of the evolution equation is evaluated.
As initial conditions we take the splitting form (7) evaluated with the Les Houches benchmark PDFs of refs.[79,80].Following the settings in these references, we take as initial scale with n = 3 active flavors for both partons.The evolution accuracy is independent of y, and we fix this variable to the value y = b 0 /µ 0 ≈ 0.794 GeV −1 .Unless  True interpolation accuracy in x 1 and x 2 compared with its estimate (40).All values refer to the medium x grid in eq.(37).The DPDs used are specified in eqs.(35) and (36).
specified otherwise, flavor matching is carried out at the canonical scales µ q = m q , with the quark masses m c = √ 2 GeV, m b = 4.5 GeV, and m t = 175 GeV.The strong coupling at the initial scale is taken to be α s (µ 0 ) = 0.35.Throughout this section, we consider unpolarized partons and use evolution and flavor matching at NNLO.We specify the scales and the number of active flavors for the two partons in the form (µ 1 , µ 2 ) and (n 1 , n 2 ).
Guided by the studies in section 4, we interpolate both x 1 and x 2 on the grid This corresponds to the medium grid in eq. ( 37), except for the third subinterval, where we afford a finer grid for improved interpolation accuracy at large momentum fractions.

Path independence of evolution and flavor matching
For a given heavy quark flavor q, we apply the matching equations (4) at a fixed scale µ q , so that in the (µ 1 , µ 2 ) plane we match at the line (µ q , µ 2 ) for the first parton and at the line (µ 1 , µ q ) for the second one.We show in appendix A that, under these conditions, the result of evolution and flavor matching between two points in the (µ 1 , µ 2 ) plane is independent of the evolution path, including the order in which the different flavor matching steps are carried out.We verify this numerically by comparing the results of evolving and matching from (1 GeV, 1 GeV) with (3,3) flavors to (500 GeV, 1000 GeV) with (6,6) flavors along the two paths shown in figure 8.The path in black is the default taken in ChiliPDF when matching from 3 to 6 flavors for both partons and then evolving to the final pair of scales.The staircase-like path in red can be selected by subsequent evolution calls for each line segment.Each point marked by a dot in the figure corresponds to the initial condition of the DPD with given flavor numbers (n 1 , n 2 ) in ChiliPDF.For any scale pair (µ 1 , µ 2 ), the DPD with these flavor numbers is computed by evolution from this initial condition.
The results of evolving and matching along the two specified paths are in excellent agreement, as illustrated in figure 9 for the t t distribution, for which the differences are largest among all flavor combinations.The agreement deteriorates close to the kinematic limit x 1 = 1 − x 2 but remains small compared with the interpolation accuracy in that region.The choice for a particular evolution path in our implementation is therefore of no consequence for the accuracy.

Backward evolution accuracy
A stringent way to test the accuracy of an evolution algorithm is backward evolution from large to small scales.Whilst small inaccuracies of distributions at a given scale are diminished when evolving to higher scales, they are amplified by backward evolution.In the following, we adapt the backward evolution test performed for PDFs in ref. [55] to the case of DPDs.
We start with the three-flavor input DPD specified at the beginning of this section, match to 4 flavors at the input scales (µ 0 , µ 0 ), and then evolve to (µ low , µ low ) with µ low = m b /2 = 2.25 GeV.At this point, we match from 4 to 5 flavors for both partons.Since the matching scale differs from m b , all DPDs are nonzero at that point, including the ones with bottom quarks or antiquarks.We take these distributions as starting conditions for evolution with 5 flavors, evolving

Characteristic y dependence and associated variable transformations
It is natural to distinguish the region of small y, where perturbative mechanisms are at work from the domain of large y.
As mentioned earlier and discussed at length in ref. [24], DPDs at small y can be represented as the superposition of a part due to perturbative splitting, with a power behavior like y −2 , and an "intrinsic" part that behaves like y 0 and is related to twist-four distributions.Both power laws are modified by logarithms from loop corrections.The relative weight of the two parts strongly depends on the parton combination and on the momentum fractions x 1 and x 2 in the DPD.As already mentioned, the y −2 behavior can be substantially flattened by evolution to high scales.In the small-y region, the transformation u(y) should therefore be suited to handle the interpolation of a power behavior y −q with 0 ≤ q ≤ 2.
A complication arises from the fact that for certain parton combinations, such as uu or u d, the leading-order splitting contribution ( 7) is zero.Quark-gluon mixing results in nonzero values for these combinations when evolving from the scale µ y ∼ 1/y at which the splitting formula is evaluated to the final scale µ.Hence, the splitting part of the DPD at scale µ has a zero around y ∼ 1/µ.The position of this zero is shifted when the intrinsic part is added.
At large distances, one expects a gradual decrease of DPDs governed by some characteristic nonperturbative mass scale.This is often modeled by a Gaussian, F ∼ exp(−M 2 y 2 ), but one may also argue in favor of a Yukawa-like decrease, F ∼ exp(−M y).The characteristic mass scale M may depend on the parton combination and on x 1 and x 2 [41,81], and it can slowly change under evolution.
We now discuss variable transformations that we found to work well for the types of y dependence just discussed.An overview of the transformations and their properties is given in tables 1 and 2. By convention, all transformations are such that u(y) has a positive derivative in the domain where it is used.In the case of interpolation, the function f (y) in table 2 represents a DPD at fixed momentum fractions or scales.For sum rule integrals f (y) is 2πy times a DPD (see eq. ( 55)), and for double parton luminosities f (y) is 2πy times the product of two DPDs (see eq. ( 52)).The overall normalization of f (y) is irrelevant for the interpolation or integration accuracy and therefore set to an arbitrary value here.
The last three transformations in the table are designed for functions with a decrease like exp(−M 2 y 2 ) or exp(−M y) at large y, possibly modified by a power law in y.They map the physical interval y ∈ [0, ∞] onto u ∈ [−1, 0] and depend on a mass parameter m, which is normalized such that dy/du = 4/m at y = 0.The integral is computed with the Clenshaw-Curtis quadrature rule, which includes the interval boundary u = 0 where the Jacobian dy/du is infinite.The upper limit on m specified in the last column of table 2 ensures that the product dy/du f y(u) is zero at the point u = 0, so that this point does not contribute to the Clenshaw-Curtis rule.We note that the value no.
− u(y) y(u) y(−1) dy/du Table 1.Variable transformations suitable for interpolation in y.The parameters α and m must be positive.All transformation satisfy u ≤ 0, y(0) = ∞, and dy/du Behavior of the variable transformations in table 1 for selected functions f (y).In the third column we used exp −cL(u) = |u| c .The condition in the last column ensures that dy/du f y(u) → 0 for u → 0. It is understood that the parameters M and m are always positive.
of the mass scale M in the integrand depends on whether one integrates a product of two DPDs or a single DPD.
Inverse power law: Transformation no. 1 in the table is u(y) = −y −α .We will show that with α ∼ 0.2 this is well suited for the range of power laws y −β expected for DPDs at small distances.The more general form u(y) = −(y + a) −α can be used down to y = 0 if a > 0, but we will not need this here.
Gaussian: Transformation no. 2 in the table has a Gaussian falloff at large y and is designed for functions with a Gaussian behavior, f (y) ∼ exp(−M 2 y 2 ).The transformed function f y(u) approximately behaves like a power of |u| for u → 0, and we find that good interpolation and integration accuracy is obtained for m ∼ M .
Exponential: Transformation no. 3 in the table is an exponential in y.It is suitable for both f (y) ∼ exp(−M y) and f (y) ∼ exp(−M 2 y 2 ).In the first case, the transformed function f y(u) behaves like a power of |u| for u → 0, whereas in the second case it falls off faster than any power of |u|.In both cases, values m ∼ M yield good interpolation and integration accuracy.
Exponential with square root: Transformation no. 4 in the table is designed for functions f (y) ∼ exp(−M y) with a Yukawa-type decrease.The transformed function f y(u) decreases faster than any power of |u| for u → 0, and good accuracy is obtained with m ∼ M .
When handling DPDs with different flavor combinations and in different kinematic regions, the mass scale M of f (y) in the preceding discussion does not have a unique value.M should instead be understood as an average, or in the case of the bounds in the last column of table 2 as the lower limit of the relevant values.

Composite grids
Given the different behavior of DPDs at small and large y, it is natural to use interpolation grids with different transformations u(y).At small y, there is an additional physics reason for using several subgrids.Not only the scale µ y ∼ 1/y but also the number n of active flavors for which the splitting DPD is naturally initialized depends on y.Since the value of a DPD in general changes with n, one should use a unique value of n in a subgrid, so as to avoid interpolating discontinuous functions.As discussed at length in ref. [82], it is appropriate to evaluate the perturbative splitting formula (7) with n active flavors for γ m n < µ y < γ m n+1 with γ ∼ 1.We therefore place subgrid boundaries in y at values around the inverse heavy quark masses.
In the following studies, we will assume a y dependence for a single DPD, with κ = 0 for the intrinsic and κ = 1 for the splitting part.We take M = 0.21 GeV, which roughly corresponds to the Gaussian parameter in quark-gluon DPDs of the model used in refs.[24,30].As for the function f (y) in table 2, the overall normalization of F (y; κ) is irrelevant for the interpolation accuracy.
The smallest value of y needed in a given physics setting corresponds to the inverse of the largest hard scale that will be considered.In the following, we will take y min = 1/(7 TeV), which covers the production of very heavy states at the LHC.For the interpolation of eq. ( 44) and for the computation of the associated double parton luminosities, we use the following combination of grids and transformations: 1/(7 TeV), 1/m b , 1/m c p 1 , p 2 inverse power law transformation with α = 0.2, (45) where the "inverse power law" and the "Gaussian" transformation are respectively given in entries no. 1 and 2 of table 1.We find that taking m = 2M results in a somewhat better overall accuracy than the rule-of-thumb value m ∼ M given in the previous subsection.We will consider three different settings for the number of points in each subgrid: 12,6,12) (coarse y grid) (16,8,16) (medium y grid) (24,12,24) (fine y grid) Relative interpolation accuracy at large y for the grid in eq. ( 46) with p 3 = 24 points (fine grid) and for the composite grids specified by eqs.( 49), (50), and (51) (finer and finest grid).The interpolated functions are the same as in figures 13(a) and 13(b), respectively.
As an example for this case, we take the simple form with different values of y 0 and the same mass parameter M = 0.21 GeV as before.The resulting accuracy is shown in the bottom panels of figure 13.We find that the coarse grid performs poorly, whereas with the medium grid the relative accuracy stays below 1% except in the immediate vicinity of the zero crossing.Very high accuracy is obtained with the fine grid.We note that for the intermediate y grid from 1/m b to 1/m c , the setting with p 2 = 8 grid points is sufficient to obtain a relative interpolation accuracy of 10 −6 or better in all cases considered here.In each of the grids for smaller and at larger y, one may take either 16 or 24 points, depending on how stringent the accuracy requirements are.This gives 40, 48, or 56 for the total number of points. 4ith the grids considered so far, the interpolation accuracy at very large y becomes poor mainly because there are too few grid points in that region.For many purposes, the behavior of the distributions in the extreme tail region is not of particular interest, and it contributes only little to integrals over y.Nevertheless, we wish to show that with an appropriate choice of subgrids and grid transformations, one can also accurately describe this region.To this end, we replace the large-y grid in eq. ( 46) with The integral is then computed using Clenshaw-Curtis quadrature on the reduced subgrid and on the subgrids for y ≥ y b .Let us investigate the accuracy of this procedure for the double parton luminosities that correspond to the simplified shapes of DPDs in eq.(44).We consider the cases κ 1 + κ 2 = 0, 1, 2, which respectively correspond to taking the intrinsic parts of both DPDs, the intrinsic part of one and the splitting part of the other DPD, or the splitting parts of both DPDs.Although only the sum of intrinsic and splitting parts enters in physical cross sections, it is useful to investigate these different combinations separately, as was for instance done in the studies in refs.[24,82].
The integration accuracy can be computed using the exact expression of ( 52) in terms of an incomplete gamma function.The result is shown in the left panels of figure 16 for the interpolation grids introduced in subsection 6.2.We see that for y min below 1 GeV −1 , good accuracy is achieved even with the coarse grid.For cutoff values above 1 GeV −1 , the accuracy quickly becomes unsatisfactory unless the fine grid is used.This reflects the degrading interpolation accuracy in that region, which was shown in the upper panels of figure 13.In physics applications, y min is of the order of an inverse hard scale, so that large values of the cutoff will only be needed in special circumstances, for instance if one is interested in the partial contribution of large y to an integral.We see that the fine grid setting is useful in such a case.
In the right panels of figure 16, we compare the true integration accuracy with its estimate obtained from comparing the values obtained with Clenshaw-Curtis quadrature and with Fejér's second rule (see subsection 3.1).We see that for y min below 1 GeV −1 , one obtains an overestimate of the true error by one or two orders of magnitude.This is qualitatively similar to what we found for integrals of PDFs over x in section 3.4 of ref. [55].

Cross check: comparison with DOVE
As a cross check for our implementation of DPD evolution in ChiliPDF, we performed a comparison with DOVE, a code that was introduced in ref. [28] and further developed or adapted in subsequent work in refs.[24,30,44,45,81]. 5The comparison is done with the version that was used in ref. [30] to compute the Gaunt-Stirling sum rules for DPDs.In the following, we show the momentum sum rule for a u quark and the number sum rule number sum rules are evaluated with ν = 144.6GeV.These µ and ν values correspond to grid points in our setup of DOVE and thus avoid additional interpolation.The detailed grid settings and integration procedure in DOVE are as specified in section 4 of ref. [30].
In figure 17, we show the relative deviation between the left-hand sides of the sum rules ( 53) and ( 54) evaluated in ChiliPDF and in DOVE.The numerical differences increase with x 1 and reach several percent at x 1 = 0.9.For sum rules with other parton combinations, we find similar or smaller discrepancies.We also compared the sum rules evaluated at smaller values of µ 1 = µ 2 and ν and found somewhat smaller discrepancies than at high scales.
An error estimate for the accuracy of evolution with DOVE in its original version is given in figure 9 of [28].Given that this estimate was carried out with finer grids than the ones used here, and given that relatively simple integration rules are used in the evaluation of the sum rules with DOVE, we find the level of agreement between the two codes satisfactory for the purpose of cross-validation.

Conclusions
We have shown that discretization on Chebyshev grids allows for a precise and economical representation of DPDs as functions of the momentum fractions x 1 , x 2 and the distance y between the partons.This generalizes the methods we developed for PDFs in ref. [55].We use grids for x 1 that are independent of x 2 and vice versa.This gives the highest degree of simplicity and decoupling between operations that involve only one momentum fraction, which is the case for the Mellin convolutions that appear in DGLAP evolution and in flavor matching.This method incurs interpolation errors close to the kinematic boundary x 1 + x 2 = 1 of the DPDs, which we find manageable.We investigated the interpolation accuracy for typical shapes of DPDs, both for the input distributions and after evolution.We find that grids with p x = 54 points for x ≥ 10 −5 yield a very accurate representation of DPDs, with relative errors that are much smaller than 10 −3 in large parts of the phase space and remain below 1% for x 1 + x 2 < 0.8 (see figures 10 and 11).
For the dependence on y, Chebyshev interpolation is used after variable transformations that are adapted to the functional form and the typical mass scale for the decrease of the DPDs at large distance.We find that with p y ∼ 48 grid points, very good accuracy can be achieved for interpolation and integration.
To solve the DGLAP equations for DPDs, we use a high-order Runge-Kutta algorithm for computing the evolution matrices in eqs.( 33) and (34), which are the discretized version of the Green functions for the evolution equations.This provides very high numerical accuracy for both forward and backward evolution, with errors well below the ones due to discretization in x 1 and x 2 .With our algorithm, it is natural to evolve DPDs separately in the scales µ 1 and µ 2 associated with the two partons.We note that the computational effort scales linearly with p y but with the third power of p x .
In our setup, a DPD with given flavor numbers (n 1 , n 2 ) is specified numerically by its values at all grid points in (x 1 , x 2 , y) and at one pair (µ 1 , µ 2 ) of scales.Its values at other scales are computed "on the fly" by solving the evolution equations.DPDs at other values of x 1 , x 2 , and y are computed using the barycentric interpolation formula, which is fast and numerically stable.As discussed in the introduction, we avoid grids in the renormalization scales, which would require a huge amount of computer memory.This is a notable difference between our approach and the widely used method to compute PDFs from grids in x and µ, for instance via the LHAPDF interface [54], and it reflects the notable difference in complexity between the distribution functions for one or for two partons inside a proton.
With our present implementation of the above methods in ChiliPDF, the evolution of a DPD with n = 5 active flavors (and thus 121 parton flavor combinations) from µ 1 = µ 2 = 15 GeV to 150 GeV takes 1 to 2 seconds6 for p x = 54 grid points in x 1 and x 2 and p y = 48 grid points in y.With grids of this size, the largest part of computing time is spent on multiplying the evolution matrices with the initial conditions, so that there is only a weak dependence of the timing on the perturbative order of evolution and on the initial and final scales.Since the current code is based on a simple implementation of linear algebra and matrix multiplication, we expect that significant gains can still be made with a dedicated performance tuning.
We used our code to investigate the impact of higher orders in the DGLAP kernels on the evolution of DPDs.For some parton combinations, the change from LO to NLO evolution produces O(1) effects at small momentum fractions, whereas we find only moderate differences when changing from NLO to NNLO kernels.
The numerical delivery of realistic DPDs has been a bottleneck in practice so far.The presented methods and their implementation provide a practical tool for computing evolved DPDs from user-given starting conditions, on a par with what has long been a standard for PDFs.This increases our ability to use the predictive power of perturbation theory and thus presents an important ingredient to making theory predictions for double parton scattering more realistic.
The evolution equations considered in this work apply to DPDs that are summed separately over the color of each parton.DPDs with color correlations between the two partons follow a different evolution pattern, with kernels that have recently been computed at NLO [84].The methods described in the present paper can be adapted to this case, as will be reported in a future publication.

A Path independence of evolution and flavor matching
In this appendix we show that the set (1) of DGLAP equations has a unique solution for a given initial condition at a point in the (µ 1 , µ 2 ) plane.In other words, the solution is independent of the path in the (µ 1 , µ 2 ) plane that is taken to evolve from the initial condition to the final scales.To establish this, it is enough to show that the second derivatives in the renormalisation scales commute, i.e. that is equal to Here and in the following, we change evolution variables from µ i to and drop the arguments x 1 , x 2 , and y of the DPDs for brevity.For evolution at fixed (n 1 , n 2 ), i.e. without flavor matching, the desired equality follows from the fact that the convolution integrals w.r.t. the first and second momentum fraction of the DPD commute with each other.This can readily be shown from their definition (2).Note that it is essential for the argument that the evolution kernel for one parton does not depend on the scale of the other parton.We now extend this argument, showing that one obtains the same result when flavor matching along different paths in the (n 1 , n 2 ) plane.Let us see how eqs.(59) and (60) have to be modified to include the effects from flavor matching.We first consider the case of PDFs, where the notation is less involved.For a set of given matching scales μn for the transition from n − 1 to n active flavors, we define the "canonical" number of flavors at given L = 2 ln µ as 4 for L 4 < L < L 5 5 for L 5 < L < L 6 6 for L > L 6 (62) where L n = 2 ln μn .We further define PDFs with the canonical number of flavors at a given renormalisation scale.From these distributions, one can obtain f n a (L) for flavor numbers different from the canonical values by evolution. 7t the matching scales, the distributions in eq. ( 63) can have discontinuities, which complicates their description via differential equations.We therefore introduce a smoothened version of f a (L).Taking a single matching step from n − 1 to n flavors at L = 2 ln μ for simplicity, one can define a smooth approximation θ (L) of the step function that takes the values θ (t) = 0 for t < − and is continuous and monotonic in the intermediate region − < t < .Its derivative δ (t) = dθ (t)/dt is then a smooth approximation of the Dirac distribution.Define now a PDF f a (L) that evolves according to where the O( ) contribution comes from the integral over the term with P n (L) in eq. ( 65).
In the limit → 0, this tends to the flavor matching prescription given in eq. ( 5).
It is straightforward to extend this construction from one to several matching steps and from PDFs to DPDs.In analogy to eq. ( 63), DPDs with canonical flavor numbers are defined as and their smoothened version F a 1 a 2 (L 1 , L 2 ) evolves according to where we abbreviated As at the beginning of this section, one can then show that so that with an initial condition given in the (L 1 , L 2 ) plane, eq. ( 69) has a unique solution independent of the evolution path.Taking the limit → 0, the desired result is obtained.

Figure 1 .Figure 2 .
Figure1.Unpolarized DPDs at x 2 = 5 × 10 −3 , evolved from (µ 1 , µ 2 ) = (2 GeV, 2 GeV) to (9 GeV, m W ) at different perturbative orders.The starting conditions and the procedure of flavor matching are specified in the text.The ratio shown in the small panels is taken with respect to the LO result.

Figure 3 .
Figure 3.Comparison of DPDs at different scales.The starting conditions at (µ 1 , µ 2 ) = (2 GeV, 2 GeV) and (n 1 , n 2 ) = (3, 3) are the same as those used for the previous two figures.At higher scales, the number of active flavors is n i = 3 for µ i = 9 GeV and n i = 5 for µ i = m W . Evolution and flavor matching are carried out at the highest available order.

3 Figure 5 .
Figure 5. Interpolation of DPDs at large momentum fractions.The DPDs considered are the same as in the right panels of figure 4. The plots on the left compare the exact and interpolated forms of the DPDs.The plots on the right show the difference between the interpolated and the exact values, along with the exact form.Notice that the plots in the top row have a different x 1 range than the remaining ones.

Figure 7 .
Figure 7. True interpolation accuracy in x 1 and x 2 compared with its estimate(40).All values refer to the medium x grid in eq.(37).The DPDs used are specified in eqs.(35) and(36).

ln µ 1 ln µ 2 (Figure 8 .
Figure 8. Two paths for evolution and flavor matching in the (µ 1 , µ 2 ) plane.Next to each line segment, we specify the numbers (n 1 , n 2 ) of active flavors for which the DPD is evolved.Flavor matching is performed at the points marked by dots.The path in black is the ChiliPDF default for matching from(3,3) to(6,6) and for subsequent evolution up to the final scale pair.

Figure 11 .
Figure 11.As figure 10, but for different flavor combinations evolved and matched to scales (1.01µ c , 1.01µ b ) with (4, 5) flavors.A dashed black line indicates that the interpolated function is negative.

Figure 17 .
Figure 17.Relative differences between the left-hand sides of the DPD sum rules (53) and (54), evaluated using DOVE and ChiliPDF.Details are given in the text.