On systematic effects in the numerical solutions of the JIMWLK equation

In the high energy limit of hadron collisions, the evolution of the gluon density in the longitudinal momentum fraction can be deduced from the Balitsky hierarchy of equations or, equivalently, from the nonlinear Jalilian-Marian-Iancu-McLerran-Weigert-Leonidov-Kovner (JIMWLK) equation. The solutions of the latter can be studied numerically by using its reformulation in terms of a Langevin equation. In this paper, we present a comprehensive study of systematic effects associated with the numerical framework, in particular the ones related to the inclusion of the running coupling. We consider three proposed ways in which the running of the coupling constant can be included:"square root"and"noise"prescriptions and the recent proposal by Hatta and Iancu. We implement them both in position and momentum spaces and we investigate and quantify the differences in the resulting evolved gluon distributions. We find that the systematic differences associated with the implementation technicalities can be of a similar magnitude as differences in running coupling prescriptions in some cases, or much smaller in other cases.


Introduction
One of the well known predictions of perturbative Quantum Chromodynamics (QCD) is a rapid growth of gluon distribution functions at a fixed resolution scale µ with a e-mail: piotr.korcyl@uj.edu.pl decreasing values of hadron longitudinal momentum fraction x. For sufficiently small x, the growth is expected to be tamed, because otherwise the unitarity bound would be violated. This phenomenon -the so-called gluon saturation -is one of the most interesting and still not fully resolved aspects of QCD at high energies. For a comprehensive and pedagogical review of these subjects, see, for instance, Ref. [1].
Pictorially, in the saturation domain, the occupation number of gluons inside a hadron is so large that they start to overlap. Consequently, a new perturbative mechanism is possible, the gluon recombination, which competes with the gluon splitting giving the aforementioned growth of the distribution. At a fixed resolution scale µ, the growth due to the gluon splitting with decreasing x is described by the linear Balitsky-Fadin-Kuraev-Lipatov (BFKL) evolution equation [2,3,4], while the inclusion of the recombination effects leads to nonlinear equations, the simplest example being the Balitsky-Kovchegov (BK) equation [5,6].
The BK equation is the mean field approximation to an infinite tower of entangled equations, describing the evolution in x of correlators of Wilson line operators U (x T ), stretching on the light-cone from minus to plus infinity, but positioned at a fixed transverse point x T (with respect to the light-cone directions). Such Wilson line operators appear naturally at high energies in the so-called eikonal approximation, as phases picked up by energetic colored parton fields traveling through the color field of the target hadron. The simplest two-point correlator of Wilson lines, the dipole, i.e. U (x T ) U † (y T ) , is directly related to the gluon distribution function unintegrated over the transverse momentum, often called unintegrated dipole gluon density or transverse momentum dependent (TMD) gluon density 1 probed in inclusive processes such as deep inelastic scattering.
In general, TMD gluon distributions turn out to be non-universal; various processes enforce a different gauge link structure required to maintain the gauge invariance of the operator defining the TMD distribution [9]. However, a TMD distribution for an arbitrary multiparticle process can be constructed from a restricted set of basis operators [10]. Correlators of four and more Wilson lines can be related to these non-universal TMD gluon distributions -the relation has been first shown at leading power [11,12,13] and recently beyond leading twist [14,15,16].
A systematic approach to the evolution in energy (or x) of the correlators of Wilson lines is provided by the Color Glass Condensate (CGC) theory (see e.g. [17]). In the CGC, the Wilson lines are treated as functionals of the target (typically nucleus) random gauge field. The averaging of the Wilson line operators is defined by means of the functional weight, which is typically taken to be the Gaussian functional at some initial scale x 0 , according to the McLerran-Venugopalan (MV) model [18,19]. The decrease of the scale modifies the weight functional according to the Balitsky-Jalilian-Marian-Iancu-McLerran-Weigert-Leonidov-Kovner (B-JIMWLK) equations [5,20,21,22,23,24,25,26,27], available now also at NLL accuracy [28,29,30]. Since the evolution Hamiltonian involves the Wilson lines themselves, the set of equations for correlators is not closed, i.e. each step in the evolution will generate more and more complicated correlators. Due to this last fact, the aforementioned BK equation is phenomenologically very important, because in the mean field approximation it allows to approximate any correlator by dipoles. However, the precise description of less inclusive processes, where the non-universality of TMD gluon distributions is strong, requires full solutions of the B-JIMWLK equations.
In particular, recent phenomenology studies of forward jet production processes [31,32,33,34,35] within a framework that directly uses various TMD gluon distributions in the small-x limit -the so-called small-x improved TMD factorization (ITMD) [36,14] -utilizes the Gaussian approximation and large-N c approximation to calculate the TMD distributions. As shown in [13,37], they can also be calculated from the full B-JIMWLK equation, without approximations, although at a fixed coupling constant.
It is possible to study the solutions of the JIMWLK equation numerically by rewriting it in terms of a more practical Langevin equation [38]. The first application of such numerical solutions to phenomenology was described in [39] and used the so-called "square root" prescription for the inclusion of effects of the running of the strong coupling constant. We devote this work to investigate the impact of the various systematic effects inherent to the numerical framework, such as the differences which arise from performing particular steps of the calculation in position or momentum space. The Langevin equation itself is defined only in position space. However, intermediate or additional steps, such as the computation of correlation function of Wilson lines, can be implemented both in position and momentum spaces. There are two reasons why the Fourier transform introduces unwanted systematic effects. On the one hand, the Fourier transforms mix short-distance discretization effects with large-distance finite volume effects. It is difficult to disentangle them and one needs to perform both the continuum and the infinite volume extrapolations to remove them. On the other hand, the Fourier transforms are in many cases performed in the color group instead of in the corresponding algebra and therefore many of the theorems known in Fourier analysis do not hold. We show numerically that under some circumstances the two ways of estimation, through position or momentum space, are not equivalent and we quantify these effects. In particular, we pay special attention to the various prescriptions for the running coupling constant effects and we try to quantify the associated differences.
Our results can be used as guidelines to estimate systematic uncertainties on the phenomenological parameters of the initial condition in the MV model. The recent phenomenological fit to the F 2 HERA data provided first estimates of these parameters. They are, however, dependent on different implementation choices of the numerical framework used to solve the JIMWLK evolution equation. We argue that some care needs to be taken when these parameters are compared with experimental determinations.
The paper is composed as follows. We start in Section 2 with a brief summary of the main features of the numerical framework. In Section 3, we describe in more detail existing implementations of the running coupling in the JIMWLK equation. In subsequent sections, we discuss the systematics associated to different stages of the computation. Some subtleties even at the stage of computing the correlation functions are mentioned in Section 2.2. In Section 5.1, we summarize the dependence of the initial gluon distribution on the algorithmic parameters, rederiving some of the results of [38] 3 and [13] for completeness. Further, in Section 5.2, we describe the systematics involved in the implementation of the evolution in rapidity. Finally, in Section 5.3, we discuss the differences induced by different implementations of the running coupling in the JIMWLK equation. We present our conclusions in Section 6.
Throughout this paper, we try to keep an explicit notation of all expressions. We find that in the literature many details of the implementation are implicitly assumed, but never precisely stated. Although some of our expressions seem to be trivial, their aim is to provide the Reader with expressions that correspond exactly to what was implemented in the computer code used to perform the computations. The open-source code is available as a git repository under https://bitbucket.org/piotrekkorcyl/jimwlk together with the documentation. The code is fully parallelized and multi-threaded and can be run on multicore CPU-based clusters. General information about the code can be found in Ref. [40].

Description of the numerical framework
In this section, we describe in detail the basic steps of the calculation. We describe the implementation of the MV model for the initial condition and present the computation of the dipole correlation function. Then, we proceed with the description of the evolution equation, providing details of the implementation entirely in position space or using the JIMWLK kernel in momentum space. The inclusion of the running coupling effects is described in the next section.

Initial condition and construction of Wilson lines
The basic object in our calculation is the straight infinite Wilson line along the light cone with fixed transverse position x. In the discretized setting, it is constructed as the product of N y elementary Wilson links, The infinitesimal Wilson link variables can be obtained from the gauge potentials A ab which are expressed in terms of color source fields ρ ab k and imposed to be the solutions of appropriate classical Yang-Mills equations, in accordance with the MV model. In the approximation of stationary, classical fields, these equations reduce to a single Poisson equation, where the artificial mass m regulates a possible zero eigenvalue of the Laplace operator. We solve Eq. (3) on the square lattice with periodic boundary conditions. The Laplace operator is diagonal in momentum space, so the simplest way of obtaining the solution is to perform the Fourier transform element-wise on the ρ matrix, solve the corresponding algebraic equation in momentum space and return to position space, where the two sums correspond to the two Fourier transforms and Λ (Λ) denote the sets of all discrete lattice momenta (positions), see Appendix A for details. The lattice has size L x × L y and we take L x = L y ≡ L. a is the dimensionful lattice spacing. The dimensionless lattice momentum and its squarê arise in Eq. (3) as a direct consequence of using a symmetrized finite difference instead of the continuum derivative.
For the remaining problem of generating initial color sources ρ, we follow the MV model with a discretized transverse x−y plane. For a given x = (x, y) on the transverse plane, N y color matrices in the rapidity direction represent successive, random gluon radiation. Such implementation follows the procedure described in Ref. [41]. Alternative implementations were discussed in Ref. [38].
Technically, on each site x of each plane k, k = 1, . . . , N y , we construct a random matrix from the SU(3) algebra by where the color sources ρ(x) a k are normally distributed and generated by the Box-Muller method. For their standard deviation, we have Above, g, N y and µ are input parameters of the MV model. As can be seen from the above expression, the relevant combination of MV parameters is g 2 µ. It is therefore sufficient to set g = 1 and vary µ. Summarizing, the Wilson line is given by On the lattice, the dimensionless (input) scale parameter is aµ and hence, a fixed physical setting is obtained when multiplying it with the lattice size L/a. Thus, in our numerical study, we take a constant value of g 2 µL = 30.72, as used in Refs. [42,13].

Two-point correlation function
Once we have constructed all Wilson lines, we can evaluate their correlation functions. At first sight, this seems to be a trivial task. However, depending on how the estimation of the correlation function is implemented, i.e. whether one stays entirely in position space or one uses Fourier acceleration and performs parts of the calculation in momentum space, the results differ significantly on a single realization of the initial condition. This is due to the ambiguity of the Fourier transform on the group manifold. The discrepancy disappears when a sufficiently large statistical sample is used for the evaluation. Below, we summarize both approaches and the numerical results are shown in Fig. 1. We stay in the fundamental representation and define C(x) following Ref. [41], where the average · is taken over different statistical realizations of Wilson lines. We enforce that C(δ) is an even function of δ = x − y, We average over symmetric distances and C(δ x , δ y ) = 1 4 C(δ x , δ y ) + C(L − δ x , δ y )+  where δ x ≥ 0 and δ y ≥ 0. Finally, we use the Fourier transform of complex-valued function C with periodic boundary conditions to obtaiñ which, due to the applied symmetrization, yields a strictly real functionC(k). A much faster implementation uses Wilson lines in momentum space. The starting point is the Fourier transform of Wilson lines, This Fourier transform is understood in terms of matrix elements of the matrices U , i.e. the above equation 5 involves 9 independent complex number Fourier transforms. Obviously, in general,Ũ (k) does not belong to the SU(3) group anymore, The advantage is that usingŨ (k) we can recover the correlation function of Eq. (14) by only local multiplications, where we have used the fact that tr U † (x)U (y) depends only on the distance x−y = δ. Numerical results for both implementations are shown in Fig. 1. In the top panel, we present the results for both approaches for a single configuration of Wilson lines. Clearly, the results differ significantly, especially for small values of the transverse momentum vector. On the contrary, after averaging over 100 realizations of the color charges, both methods give compatible results (bottom panel), the momentum space approach being statistically more precise. In the following, we use the momentum space approach to evaluate all correlation functions. Below, we present results for the rescaled dimensionless gluon distribution, (L/a) 2k2C (Lk T ), where k T is the norm of the two-dimensional vector k and where we use the lattice momentum, Eq. (6), fork 2 , as previously suggested in the literature. For clarity of our message, we do not consider in this work other correlation functions, in particular those containing derivatives of Wilson lines, such as the ones presented in Ref. [13]. They share all the systematic uncertainties with the basic gluon distribution analyzed below, their additional systematic uncertainties being solely related to the discretization effects of the derivative.

Evolution in rapidity
In this section, we describe the details of the implementation of the JIMWLK equation reformulated as a Langevin equation [38]. We use a numerically more economical formulation, where the Wilson line is multiplied by evolution kernels from the left and right side [42]. The evolution equation in rapidity s with a step of size δs reads where K(x) is a kernel function and ξ(x) are random vectors, both to be discussed below. To simplify the notation, we denote the arguments of the right and left exponentials by A and B, and so that the Langevin equation becomes Again, the construction of A and B can be performed in position or momentum spaces. Below, we describe both approaches and the corresponding numerical results are discussed in Section 5.2.

Construction of A and B
In order to use Fourier acceleration, we need several quantities in momentum space. We definẽ Then, In order to repeat a similar construction for the B matrix, we start by defining a matrix U which we subsequently transform to momentum spacẽ 6 Then, we have All Fourier transforms above are understood element-wise. Because of that and because of the different discretization errors in the JIMWLK kernel, both approaches, in position and in momentum space, are not equivalent. The evolution equation in momentum space reads Further numerical results are discussed in Section 5.

Random vectors
In both approaches, we start by generating random vectors ξ valued in the Lie algebra (with generators λ a ) on each site of the lattice, from a normal distribution with unit width. The vectors For the momentum space formulation, we also need the Fourier-transformed random vectors ξ(p), which satisfy the analogous constraint which states that the noise vectors are also uncorrelated in momentum space.

Kernel function
We now turn to the kernel function that enters the rapidity evolution equation and discuss the possible discretizations thereof in position space and in momentum space. The original JIMWLK kernel is defined in continuum position space as When implemented on a lattice, several symmetries of this kernel are broken and hence, various discretizations are possible and lead to different systematic effects. In particular, due to the slow, power-law decay of the kernel at large distances, finite volume effects must be carefully studied. We start our discussion with the position space kernel and we show how to construct the A and B matrices. Then, we present an analogous presentation for the momentum space kernel.

Position space kernel
On a discrete lattice, the kernel in Eq. (32) assumes the following form, where n = (n x , n y ) is a vector of integers, i.e. n i ∈ (−L/a, L/a), and the numbersn i (n 2 ) are in the chosen discretizations: naive with a discontinuity: regularized with the sine function: The numerator/denominator of the discrete kernel are plotted in the top/bottom panel of Fig. 2 for both discretizations. By construction, they agree for small separations n, whereas the largest discrepancy appears for separations close to the half of the lattice extent, which for the demonstrated situation happens at |x/a| = 16.

Momentum space kernel
In order to be able to compare position and momentum space results, we need to make sure that the relation between the JIMWLK kernel in position and momentum space is well established. As both sides of the equation have the same dimensions, we can write it in terms of dimensionless, still continuous, variables, The lattice size is L/a = 32.
We have discussed possible discretizations of the integrand for the position space kernel. Mimicking the continuum relation, discrete Fourier transform should produce the equivalent JIMWLK kernel in momentum space. Anticipating the result, we can discretize the right hand side of the continuum relation, Eq. (38), following one of the two ways, keeping the naive lattice momenta, or, alternatively, as is commonly done using the lattice momentak andk, The comparison of both definitions in momentum space is shown again in Fig. 2. Obviously, both discretizations lead to the same behavior as in position space, up to a rescaling of the vertical axis.
We can now test the discretized versions of Eq. (38). Note that we should not a priori expect that this equation holds in the discrete version. This is due to the fact that the Fourier transform propagates discretization effects and finite volume effects of the position space kernel to all values of the momentum k and vice versa. To simplify the picture, we look at a single component of the kernel, i.e. the x-component, where D pos and D mom denote one of the possible discretizations of the quantity in brackets, i.e. either the position or the momentum vector. In principle, there may not be a single region where Eq. (42) holds with a reasonable degree of accuracy. As a test (shown in the Fourier transform of the position space kernel. This is shown by the blue and black data points; the blue correspond to the naive discretization of the position space kernel, whereas the black to the sin-regularized kernel. Surprisingly, both data sets agree quite well for nearly all values of x and also y. The red data points show the momentum kernel with the usualk/k 2 definition, whereas the green data points are calculated using the simple lattice momenta. If Eq. (42) held, the blue and black data points should lie on top of the green and red. The discrepancy signals that the linear momentum discretization fails to reproduce the discretized JIMWLK kernel in the bulk of the lattice: green data is significantly above all others. On the contrary, notice that the Fourier transforms of both position space kernels (blue and black data) agree reasonably well with the momentum kernel defined using trigonometric regularization. The momentum kernel defined through lattice momenta provides good agreement only for small momenta. This is reasonable, as the numerator of the kernel is linear in the lattice momentum, hence as seen in Fig. 2, the naive and trigonometric momenta agree up to ≈ 6, then the boundary effects introduce large deviations.
We conclude that both forms of the position space JIMWLK kernel and the trigonometric momentum kernel provide a consistent picture. This conclusion will be further enhanced when we discuss the gluon distribution after evolution in rapidity: the momentum linear kernel introduces large finite size effects and distorts the shape of the final distribution. This prescription was proposed in Refs. [43,44] and discussed by Rummukainen and Weigert in Ref. [38], where they introduced the running coupling in the Langevin formulation of the JIMWLK equation following [45] and [46]. The running coupling effects are accounted for with the coupling at the scale given by the size of the parent dipole (x − y) 2 , i.e. we introduce a one-loop running for N c colors and N f flavors and Λ QCD being the QCD scale parameter. The running coupling in the Langevin equation (28) is hidden in the rapidity factor s = α s π 2 y, y = ln with x 0 (x 2 ) the initial (final) Bjorken-x of the evolution. In the "square root" prescription, Eq. (28) becomes where we have separated Depending on the way in which we have implemented the JIMWLK kernel, either in position or in momentum space, the running coupling has to be included accordingly. This question was addressed in Ref. [42], where the momentum prescription was provided. We therefore make use of the following definitions, where the parameter c is used to freeze the running for small momenta and large distances. As in Ref. [42] we take the numerical values of Lµ 0 = 15 and LΛ QCD = 6. These parameters are part of the model and so their precise values should be fixed by comparison with experimental data, e.g. through a fit to the DIS data.

"Noise" prescription
An alternative definition of the running coupling was proposed by Lappi and Mäntysaari [42,47]. The running coupling can be implemented as a modification of the properties of the noise vectors in the Langevin equation. This has a different physical motivation, as the scale of the running coupling is in this case provided by the momentum of the emitted gluon. This scale is then argued in Refs. [42,47] to correspond to the smallest of the three relevant dipole sizes (the "parent" and two "daughter" dipoles) [42]. Hence, instead of Eq. (28) we use where now instead of the uncorrelated, Gaussian random variables in position space ξ a,i (x). The noise η(x) becomes correlated in both momentum and position spaces.

Momentum space
The implementation in momentum space is straightforward. The correlation is diagonal and thus, for each p one generates uncorrelated Gaussian variable with variance σ = α s (p),

Position space
In position space, we need to construct a nontrivial correlation between any two lattice sites, As noted in Ref. [42], the coupling constantα x−y cannot be interpreted as being evaluated at the scale x−y, but rather it is the Fourier transform of the coupling constant in momentum space, as in Eq. (50). Hence, we construct the desired correlation matrix in position space, Σ is by construction symmetric and positive-definite. Thus, in the next step, we find the Cholesky decomposition of Σ, i.e. a matrix L such that Eventually, we generate a Gaussian vector of uncorrelated random variables, ξ, which we afterwards transform according to η(n) has the desired correlation, The correlation matrix Σ is color and spin blind, so that we can use it for all color and spin components as postulated by Lappi and Mäntysaari. A robust comparison of different implementations is shown in Fig. 4. We note that for both running coupling prescriptions the position space and the momentum space evolutions lead to consistent results within our statistical uncertainties. Although, for this small rapidity no significant deviation between the "square root" and "noise" prescription can be seen, for large rapidities these two prescriptions give significantly different saturation scales, as we will show below.

Hatta-Iancu prescription
In Ref. [48], the Authors provide a formulation of the JIMWLK equation with collinear resummation, which accounts for DGLAP kind of logarithms suppressing anti-collinear pole of the kernel. In particular, they reinvestigate the relation of momentum space expression for the running of α s to small dipole prescription discussed in [42,47]. According to Hatta and Iancu, the smallest dipole prescription corresponds to the dependence of α s on virtuality and not the transverse momentum. We proceed with implementation directly in coordinate space, where r is the size of the projectile. We implement this prescription with the position space evolution only, by  fixing the distance r at which the Wilson line correlator is evaluated and performing the entire evolution for that given distance. This adds an additional L factor to the scaling of computations, but in this way we avoid using Fourier transforms, as it is not clear how the Fourier transform would mix the contributions from distances different than r where the coupling constant would be evaluated at a wrong scale. On a purely numerical level, we note that this prescription coincides with the "square root" prescription for large distances r. In the latter situation, most of the contributing distances |x − y| are smaller than r, hence reproducing the "square root" definition. On the contrary, for small separations r in the correlation function, the scale of α s will be set by that distance, thus modifying the final correlation function with respect to the "square root" prescription. Numerical confirmation of this expectation will be discussed in Section 6.

Statistical analysis
The final results, i.e. the two-point correlation functions of two Wilson lines, are evaluated using 100 random realizations of the color charge distributions. Standard statistical analysis provides standard deviations for the resulting data points.
For the definition of the saturation scale Q s , we follow Ref. [38] and define it as the transverse momentum for which the rescaled gluon distribution reaches a maximum. The results are quoted in terms of the dimensionless quantity LQ s . When plotted on a logarithmic scale, the gluon distribution near the maximum resembles a quadratic or a Gaussian function. Therefore, we use two fitting ansatzes in order to estimate the maximum: with a, b, c and d being fit parameters. We associate the saturation scale with the parameter d. An example fit is shown in the uppermost panel of Fig. 5. The fitting range was chosen as Lk T ∈ [33,245] and is shown in the figure as they gray band. In order to disentangle the discretization effects from the continuum dependence on log(Lk T ), we increase the statistical uncertainties by a factor 3 in such a way that the data points form a continuous curve within their new uncertainties. In the example shown, both fitting functions describe the data well in the full range. The quality of the fits can be estimated by the value of χ 2 /dof. In order to estimate the systematics of the fitting procedure, we vary the fitting range by shifting the left and right borders of the fitting interval and keeping the interval symmetric around the approximate maximum at Lk max T ≈ 90. The minimal range is [55, 148] and the maximal is [12,665] and they contain respectively 50 and 250 data points. The middle panel of Fig. 5 shows the dependence of the χ 2 /dof on the fitting range for both fitting ansatzes. Starting at Lk initial T ≈ 20, the χ 2 /dof's of both kinds of fits are below 1 and approximately equal. The statistical uncertainty of the fitting parameters estimated using jackknife resampling from 100 fits is at a level of 1 ‰ and completely negligible compared to the systematic effects associated with the choice of the fitting range.
The final values of the saturation scale are shown in the bottom panel of Fig. 5. We exclude results for Lk initial T < 20 as the fitting ansatzes do not correctly describe the tails of the correlator. We also exclude results for Lk initial T > 50, as the fitting range contains too few data points to correctly resolve the maximum. For the remaining range of fitting intervals, we estimate the mean saturation scale obtained using the Gaussian ansatz as the final result, since the Gaussian curve describes the data for a wider range of Lk T and we associate with it a systematic uncertainty equal to the spread of the remaining fit results coming from both ansatze. Hence, in the example shown, the final saturation scale reads LQ s = 88.5 ± 2.0. This corresponds to the green band in the top panel of Fig. 5.

Systematics
This is the main section of our work, where we quantify and discuss the different systematic effects induced by   the implementations outlined above. We start in Section 5.1 with the discussion of the gluon distributions resulting from the initial condition (i.e. at zero rapidity) and their dependence on parameters such as volume, N y or the mass regulator in the Poisson equation. Next, in section 5.2, we focus on the different ways the evolution in rapidity can be implemented. Eventually, in Section 5.3, we discuss the systematic effects involved in the implementation of the running coupling.

Initial condition
In this section, we discuss the dependence of the twopoint correlation function on the parameters of the nu- merical setup. The parameters of the MV model have to be scaled appropriately when the lattice size is varied. We keep the dimensionless combination g 2 µL constant as we scan the different lattice extents from L/a = 12 up to L/a = 128. We use the value g 2 µL = 30.72 from Refs. [42,13]. As far as the lattice volume is concerned, we do not see any statistically significant deviations down to lattice sizes of L/a = 12, see Fig. 6. The dependence on the mass regulator in the Poisson equation, Eq. (3), is shown in Fig. 7 together with the dependence of the resulting saturation scale. We notice that within the systematic and statistical errors, all setups provide consistent saturation scales, except the single case of the largest mass regulator, am = 0.1. Finally, the dependence on N y is shown in Fig. 8. We check that when a statistical ensemble of 256 realisations is used, all values of N y saturate the product in Eq. (1). Since the generation of the initial distribution is a negligible cost of the computation compared to the evolution in rapidity, we conservatively use the value of N y = 50 for all subsequent calculations.

Evolution at fixed coupling
In this section, we study systematic effects which arise from the different implementations of the Langevin equation. As we will see, the situation at fixed coupling constant is not satisfactory, as the continuum limit is not well-defined. Therefore, we do not present here all the possible implementation combinations and only concentrate on some representative features of the fixedcoupling setup. The phenomenologically relevant situation with the running coupling constant is discussed in the next section.
We study the dependence of the gluon distributions on various parameters, in particular the size of the time step and the lattice extent and the different discretizations of the JIMWLK kernel. In the top panel of Fig. 9, we show the dependence of the distributions on the size δs of the Langevin step in the evolution equation. We show data for rapidity s = 0.04 together with the initial condition to set the scale. In the fixed coupling scenario, we remind that one can relate s to the rapidity y by [13] Therefore, taking e.g. α s = 0.16, s = 0.04 corresponds to y ≈ 2.5. For each value of the time step, we estimate the saturation scale and its systematic and statistical error and plot it as a function of the time step in the bottom panel of Fig. 9. Two extrapolations to the vanishing value of the Langevin time step are shown in the figure: i) a constant one, fitted to the last three data points, which are compatible within their systematic uncertainties, ii) a linear one in the time step, fitted up to δs = 0.0008. Both extrapolations yield val-ues which are compatible within their uncertainties. Hence, we notice convergence. For time steps smaller than δs = 0.0001, there is no systematic difference in the saturation scale and the results at non-zero step size are compatible within uncertainties with the extrapolated results. Another important parameter is the lattice extent L/a. Its impact on the evolution is shown in Fig. 10. We present data for different volumes for the evolution with time step of δs = 0.0001 and for final rapidity of s = 0.04. We see that the results significantly depend on the lattice volume for volumes in the range from L/a = 128 up to L/a = 1024. When analyzing the resulting saturation scales, a systematic trend is clearly visible with the saturation scale diverging as the infinite volume limit is taken. We find the same behaviour in both position and momentum space formulations, where, for the latter, we were able to reach volumes as large as L/a = 16384. This observation is also irrespective of the fact whether the 'linear' or 'sin' kernel is used. This result is in agreement with the previous findings discussed in Ref. [38]. The difficulty in taking the continuum limit makes the evolution at fixed coupling not a viable numerical approach, hence we do not discuss it further here and we turn our attention to the case of evolution with running coupling.
Before doing so, we provide a consistency check of the implementations of the coupling, shown in Fig. 11. Namely, we compare evolved gluon distributions with a fixed coupling and with a running coupling, having switched off the running. Appropriate rescaling of noise vectors guarantees that the evolved distributions agree.

Evolution at running coupling
Now, we turn on the effects arising from the running of the coupling. As discussed in Section 3, there are several possible ways to implement the latter and here, we compare the systematics arising in the "square root" and "noise" prescriptions.
Let us start by noticing that in the setup where the running coupling constant is included, the numerical evolution proceeds much slower. This results from the fact that now, the Langevin step contains a scaledependent α s factor. For example, the value of s = 0.04 that we used to illustrate the systematics of the fixedcoupling setting, corresponds in the case of a running coupling to y ≈ 0.4 (taking α s evaluated at the saturation scale) instead of y ≈ 2.5. Thus, in order to reach the same physical rapidity, the simulations with a running coupling must be considerably longer. We will comment more on the speed of the evolution with different prescriptions for α s in the next section.  We start with the discussion of the dependence on the size of the Langevin step in the numerical integration of the Langevin equation. We note that there is a significant difference between the two running coupling prescriptions, see Fig. 12. The Langevin step dependence of the "square root" prescription is similar to the one in the fixed coupling setting (cf. Fig. 9), with a similar slope, e.g. the result at δs = 0.0004 is around 10-15% above the linearly extrapolated one and one needs to go to δs ≤ 0.0001 to ensure agreement with the latter. In turn, the "noise" prescription allows for a step size at least one order of magnitude larger than the "square root" prescription and even the result at our largest employed step size is consistent with the extrapolated value. The origin of this effect was not identified, as both implementations use the same Wilson line update algorithm, and will be investigated in the future. However, the practical conclusion is that much larger rapidities can be attained with the same computational cost with the "noise" prescription.
As a second systematic effect, we consider the volume dependence of the gluon densities obtained with different α s prescriptions at a fixed rapidity of s = 0.16. The results are shown in Fig. 13. The upper two panels show the comparison of distribution shapes for both the "square root" and "noise" prescriptions. We note that, contrary to the fixed coupling setting (see Fig. 10), the peak of the distributions, i.e. the saturation scale, does not depend on the volume, provided the latter is large enough. Differences between the shown volumes are vis-  One sees a dramatic difference in the convergence rate: the "noise" prescription is roughly insensitive to the step size in the investigated range, whereas the "square root" prescription exhibits a linear slope similar to the one found in the fixed coupling setting.
A constant fit to the two finest step sizes (δs ≤ 0.0001) for the "square root" data was also performed, showing that the continuum limit has been attained. The data set corresponding to the finest step size, δs = 0.00005 is not shown on the middle panel, as it overlaps almost ideally with δs = 0.0001. ible only in the tails of the distributions and can be attributed to discretization effects -in a fixed physical volume, smaller lattice volumes correspond to larger lattice spacings. Even though saturation scales implied for the chosen parameter values are different for the two prescriptions, the systematics of the volume is the same, i.e. L/a ≥ 256 ensures that the saturation scale becomes independent of L/a within uncertainties. This is illustrated in the bottom panel of Fig. 13, where we show the volume dependence of the saturation scale Q s determined from the maxima of the distributions. In order to contain both sets of data on the same extrapolation plot, the "square root" data used for the extrapolation are taken at a twice smaller rapidity, s = 0.08. The data can be extrapolated to the infinite volume limit using a constant fitting ansatz when the smallest two volumes used, namely L/a = 64 and L/a = 128 are removed from the fit. At this value of rapidity, they are affected by finite volume effects and hence are not reliable.
As a next step, Fig. 14 shows the comparison of gluon distributions obtained using different JIMWLK kernel discretizations. As discussed above, we proposed to discretize the kernel in two ways, both in position space and in momentum space. The figure only shows results obtained in momentum space. The outcomes in position space follow a similar pattern. Comparing the gluon distributions obtained with the kernel using the sine function ("sin kernel") and the naive discretization ("linear kernel"), we see that the latter induces large distortions for momenta Lk T near the maximal value. In the case of the "noise" prescription, these distortions are limited to large Lk T values, as opposed to the "square root" prescription, where the modifications also affect the position of the maximum of the gluon distribution and hence the saturation scale itself. As the results obtained with the linear kernel exhibit an unphysical growth of the gluon distribution for the maximal momenta, we believe this implementation choice is inferior to the discretization involving the sine function.
Finally, we can compare the results obtained under evolution implemented in position and momentum spaces. Note, as already signaled in the introduction, that the Langevin equation is written only in position space. The Fourier acceleration used to perform some of the computations in momentum space may introduce uncontrolled discretization/finite volume effects. In Fig. 15, we show the comparison of evolved gluon distributions performed in both spaces at different values of the rapidity s. Although, as suspected, some differences can be seen in the shape of the gluon distributions, their size is negligible on the scale of the effect of the evolution itself. The discrepancies appear for the   the boundary as the distribution rises for large k T . In the case of the "square root" prescription, the distortions of the linear kernel also propagate to small k T . We conclude that the linear kernel should not be used in subsequent calculations.
maximal transverse momenta Lk T where the boundary conditions and finite volume effects are maximal. The above observation remains true for both running coupling prescriptions. Since the evolution in position space is computationally much more demanding, we can therefore conclude that from a practical perspective the solutions obtained in momentum space are reliable and can be used in subsequent phenomenological studies. However, for observables sensitive to the large k T tail, some care must be taken.

Discussion
Finally, we are in position to discuss physical implications of the different implementation choices. As our first observable, we study the evolution speed, i.e. the rapidity dependence of the derivative of the saturation scale. In Fig. 16, we plot the quantity where where we employed the definition Eq. (47) of α s (k). In the previous section, we have demonstrated that our results are robust against systematic uncertainties, i.e. increasing the volume or changing the evolution between the momentum space and position space implementations provides results compatible within their uncertainties. We demonstrate this once again for the observable λ in the top panel of Fig. 16 where the dependence on the lattice extent is plotted. Clear agreement of all data sets corroborates again the control of systematic effects. Conversely, the bottom panel demonstrates that changing the running coupling prescription has a genuine physical effect, as the rate of change of the saturation scale is different between the "square root" and "noise" prescriptions for small saturation scales, up to around 10Λ QCD . This is even true with our rather con-servative estimates of the systematic uncertainty in the extraction of Q s . Fig. 16 also contains another set of data points. As a consistency check, we have determined the evolution of the saturation scale with rapidity using an alternative definition of the saturation scale. We follow Ref. [42] where Q s was fixed from the gluon correlation function in position space, Although this is only a change in the definition, it has a significant effect on the evolution speed λ, as can be inferred from Fig. 16. At small saturation scales, i.e. Q s smaller than 10 − 15Λ QCD , for both running coupling prescriptions the value of λ defined from the definition Eq. (64) is larger than Q s obtained from the maximum of the correlation function in momentum space. At large saturation scales, all prescriptions and definitions seem to converge to a single value of around λ ∼ 0.2. At the technical level, the definition (64) suffers from large discretization effects for large saturation scales, which is visible as a scatter of data. This is because the exponential decay of the correlation function is loo large to precisely estimate the intercept with the e − 1 2 line. This effect should decrease when the lattice size is increased. Conversely, the definition in momentum space described in Section 5 is more reliable in this region, giving smaller statistical and systematic uncertainties. Again, both definitions and implementations yield quantitatively similar values for the evolution speed at large saturation scales.
Secondly, we present for the first time the results obtained using the Hatta-Iancu running coupling prescription, see Fig. 17. Since the definition of the latter is provided in position space as a function of the separation in the final correlation function, we keep all the calculations in position space, avoiding Fourier transforms. In order to show the physical effect of the Hatta-Iancu prescription, we contrast the position-space correlation function obtained using this prescription with the one from the "square root" coupling. We observe statistically significant differences at small separations. We plot results for two volumes, L/a = 48 and L/a = 64, which enable us to draw the conclusion that the differences cannot be attributed to finite volume effects, but to the intrinsically different physical effect of the running coupling constant. We note that the slower decay of the correlator in the Hatta-Iancu case implies, according to Eq. (64), smaller values of the saturation scale.
Finally, we discuss the large-k T asymptotics, for which analytic arguments exist in the large rapidity limit. The correlator at large rapidity is a difficult observable, as it  requires large lattice extents and one has to make sure that the appropriate scaling regime is reached. It is expected that the 1/k 2 T behaviour [1], present in the distribution corresponding to the initial condition, will be modified by the evolution to a less steep dependence. In Fig. 18, we demonstrate the situation when the effects of the running coupling constant are included for two large lattices. Two regimes can be identified: the first one for moderate values of k T L, where both volumes give comparable results, and the second for large k T L, where finite volume effects distort the tail of the distributions. Although the volume independence of the data in the first region may indicate that the scaling regime is reached, the resulting power law dependence strongly depends on the details of the implementation and of the running coupling constant prescription. The separation between the two regions is marked roughly by the black vertical line in the figure. Therefore, this suggests that the proposed prescriptions, not only yield values for the saturation scale which differ by a rough factor of 2, but they also differ significantly in the regime of large transverse momenta. For the data shown, the fitted power law (L/a) 2 k 2 TC (Lk T ) = ax b yields b ≈ −0.5 for the "noise" prescription and b ≈ −1 for the "square root" prescription. It is a question for further studies how the exponent in the power law depends on rapidity. As shown in Fig.17, the Hatta-Iancu prescription is equivalent to the "square root" definition at large distances, hence the differences we observe at small distances should translate to differences in the scaling for large k T . The Fourier transform, needed to go from position to momentum space, requires the knowledge of the correlation function at all positions on the lattice, which is prohibitively expensive for the current numerical setup. Therefore, we do not have reliable data from large enough lattice sizes to include the large k T asymptotics with the Hatta-Iancu prescription.
To summarize, we have presented a systematic comparison of solutions of the JIMWLK equation using in the case of the "noise" ("square root") prescription. In order to increase the figure readability, the "square root" data were divided by a factor 100.
numerical, lattice-related techniques. On the technical side, among the main systematic effects that we investigated are momentum/position space implementation of the evolution, dependence on the volume, Langevin step size and kernel discretizations. On the phenomenological side, we discussed the inclusion of the effects of the running of the strong coupling constant following three prescriptions proposed in the literature. All these results are fundamental for a reliable program of relating the numerical solution of the JIMWLK equation to experimental data (such as the F 2 structure function), since they give constraints on the parameters of the numerical setup such that the different systematic uncertainties are under control. Thus, they allow one to optimize the numerical setup and ensure efficient and robust fitting with quantified errors. correlated in momentum space. We adopt the following conventions, (with V = (L/a) 2 ) This hints to the following equivalence: ξ p generated in momentum space: i.e. at each lattice site we generate uncorrelated random numbers with a unit standard deviation ξ x generated in position space: i.e. at each lattice site we generate uncorrelated random numbers with a unit standard deviation by taking the forward Fourier transform to obtain ξ p , it will be indistinguishable from the ξ p .
We checked numerically that the above is indeed true.
The practical choice to make is either to: generate the random vectors in one space and Fouriertransform them to the other one, or generate separate random vectors in both spaces.
Obviously, both choices are equivalent in the limit of an infinite number of stochastic realizations.