Parton Distribution Functions from Ioffe time pseudo-distributions

In this paper, we present a detailed study of the unpolarized nucleon parton distribution function (PDF) employing the approach of parton pseudo-distribution functions. We perform a systematic analysis using three lattice ensembles at two volumes, with lattice spacings $a = 0.127$ fm and $a = 0.094$ fm, for a pion mass of roughly 400 MeV. With two lattice spacings and two volumes, the systematic errors of the PDF associated with the continuum and infinite-volume extrapolations are estimated. In addition to the $x$ dependence of the PDF, we compute its first two moments and compare them with the pertinent phenomenological determinations.


Introduction
Parton distribution functions (PDFs) [1] describe the structure of hadrons in terms of the momenta and spins of the quarks and gluons. Deep inelastic scattering (DIS) experiments have allowed for a phenomenological determination of the PDFs, but a direct calculation using Quantum Chromodynamics (QCD) remains out of reach. The theoretical definition of PDFs requires the calculation of hadronic matrix elements with separations on the light cone; a direct calculation on a Euclidean lattice is therefore not possible. Previously, the Mellin moments of PDFs and distribution amplitudes (DAs) of baryons and mesons have been calculated with lattice QCD [2][3][4][5][6][7], but the reduced rotational symmetry of the lattice only allowed access to the lowest few moments, while more moments than are available are needed to reconstruct the $x$ dependence of the PDF.
Unlike the quasi-PDF, the pseudo-PDF has canonical support in $-1 \le x \le 1$ for all values of $z^2$, even when the PDF limit has not yet been reached. In a super-renormalizable theory, the pseudo-PDF will approach the PDF in the $z^2 \to 0$ limit. In renormalizable theories, the pseudo-PDF will have a logarithmic divergence at small $z^2$ which corresponds to the DGLAP evolution of the PDF. The pseudo-PDF and the pseudo-ITD can be factorized into the PDF and perturbatively calculable kernels, similar to experimental cross sections. This fact means that the pseudo-PDF and pseudo-ITD fall into the category of "Good Lattice Cross Sections" as described in [33]. The first lattice implementation of this technique was performed in [34, 35] to compute the iso-vector quark pseudo-PDF in the quenched approximation. Other Good Lattice Cross Sections have been calculated to extract the pion DA [36, 37] and the pion PDF [38]. We refer the reader to [39][40][41][42] for detailed reviews of these topics.
Possible difficulties with these approaches were raised in [43, 44]. In [43], the authors observed that the power-divergent mixing of moments of the PDF calculated in Euclidean space would cause a divergence of the moments of the quasi-PDF. Due to this issue, they argued the PDF could not be extracted from the quasi-PDF. This claim was refuted in [45], where the authors showed that the non-local operator can be matched to the PDF without the presence of power-divergent mixings. In [43, 44], the authors noted that the Fourier transformation of the logarithmic $z^2$ dependence, generated by the DGLAP evolution of the PDF, will create contributions to the quasi-PDF in the region $|y| > 1$ which do not vanish in the limit $p_3 \to \infty$. This effect is unavoidable in the quasi-PDF formalism, since the Fourier transform must be performed before matching to the $\overline{\rm MS}$ PDF. It is this contribution which generates the divergent moments of the quasi-PDF. In [46], the origin of this contribution was described in terms of the "primordial transverse momentum distribution". It was argued that the non-perturbative part of the $|y| > 1$ contributions vanishes in the $p_3 \to \infty$ limit, while the non-vanishing perturbatively calculable contributions are canceled after implementation of the matching procedure. As a result, the moments of extracted PDFs are finite.
It should be noted that in the pseudo-PDF formalism the $z^2$ dependence of $\mathcal{M}(\nu, z^2)$ is not subject to a Fourier transform, and the issue is completely avoided. As was shown in [32], pseudo-PDFs $\mathcal{P}(x, z^2)$ have the canonical support $[-1, 1]$ for the momentum fraction $x$. The unphysical region $|x| > 1$ is avoided and the moments of the pseudo-PDF are finite. Finally, in [47] it was demonstrated using lattice data that the finite moments of the PDF can be extracted from the non-local matrix element for the reduced pseudo-ITD, refuting the claim in [44] that the pseudo-PDF moments would be power divergent. In the OPE, the power-divergent mixings of the moments are explicitly canceled by the corresponding Wilson coefficients. This feature of the OPE has been known for some time [48], and this method of extracting moments from non-local operators is referred to as "OPE without OPE" [49]. This cancellation of divergences is unsurprising. The reduced pseudo-ITD is by design a renormalization group invariant quantity. There can be no difference between this object calculated with lattice regularization or dimensional regularization in the continuum. Since all of the moments are finite, a matching relationship between the pseudo-ITD and the $\overline{\rm MS}$ ITD can be derived from these Wilson coefficients.
In this paper, we present the first calculation of the Ioffe-time pseudo-distribution function with dynamical fermion ensembles. Another aspect new to the pseudo-ITD analysis is that we have applied the method of momentum smearing [50] to the pertinent matrix element, which substantially improves our results at high momenta when compared to Gaussian smearing. The remainder of the paper is organized as follows. In Section 2, the Ioffe time distribution is outlined, and in Section 3 the details of its lattice implementation are described. In Section 4, the results of the calculation are presented, and finally in Section 5, we summarize our findings and propose future research directions.

Ioffe time pseudo distributions
The unpolarized ITD is defined in terms of a special case of the helicity-averaged, forward, non-local matrix element

$$M^\alpha(p, z) = \langle p | \bar\psi(z)\, \gamma^\alpha\, U(z; 0)\, \psi(0) | p \rangle\,, \qquad (2.1)$$

for $p = (p^+, \frac{m^2}{2p^+}, 0_T)$, $z = (0, z^-, 0_T)$, and $\alpha = +$, where we use light-cone coordinates, i.e. $a^\mu = (a^+, a^-, a_T)$ with $a^\pm = (a^t \pm a^z)/\sqrt{2}$ and $a_T = (a^x, a^y)$. Given arbitrary choices of $p$, $z$, and $\alpha$, the Lorentz decomposition of the matrix element in Eq. (2.1) is

$$M^\alpha(p, z) = 2 p^\alpha\, \mathcal{M}(\nu, z^2) + 2 z^\alpha\, \mathcal{N}(\nu, z^2)\,, \qquad (2.2)$$

where the Lorentz invariant $\nu = p \cdot z$ is called the Ioffe time. For the choice of parameters which corresponds to the ITD, only $\mathcal{M}$ contributes to the matrix element. For arbitrary $z^2$, the function $\mathcal{M}$, called the Ioffe-time pseudo-distribution function, can be thought of as a generalization of the ITD to separations other than light-like. The pseudo-ITD contains the leading-twist contribution, but also higher-twist contributions at $O(z^2 \Lambda^2_{\rm QCD})$. The removal of these contributions, through cancellation or small $z^2$, is necessary for accurately determining the ITD from the pseudo-ITD.
From the relevant handbag diagrams, it has been shown [32] that the Fourier conjugate of $\nu$, denoted by $x$, has a restricted range of $-1 \le x \le 1$. The Fourier transform of the pseudo-ITD, called the pseudo-PDF $\mathcal{P}(x, z^2)$, is defined through

$$\mathcal{M}(\nu, z^2) = \int_{-1}^{1} dx\; \mathcal{P}(x, z^2)\, e^{i x \nu}\,. \qquad (2.3)$$

This definition of $x$ is completely Lorentz covariant, and there is no need to restrict displacements onto the light cone or to require infinite momenta. This feature is promising for a lattice calculation, where only space-like displacements are possible and large momenta are plagued by both large statistical and systematic errors. An extended discussion of this approach is provided in [32, 34, 35, 47, 51-53].
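As a numerical aside (not part of the original analysis), the transform relating the pseudo-PDF and the pseudo-ITD can be sketched with the model distribution used later in the text, $\mathcal{P}(x) = x^a(1-x)^b/B(a+1, b+1)$ with $a = 0.5$, $b = 3$; the function name `pseudo_itd` and the midpoint quadrature are our own choices:

```python
import numpy as np
from math import gamma

def pseudo_itd(nu, a=0.5, b=3.0, n=4000):
    """M(nu) = integral of P(x) exp(i x nu) for a model valence pseudo-PDF
    P(x) = x^a (1-x)^b / B(a+1, b+1), supported on x in [0, 1]."""
    norm = gamma(a + 1) * gamma(b + 1) / gamma(a + b + 2)   # Beta(a+1, b+1)
    x = (np.arange(n) + 0.5) / n                            # midpoint rule
    P = x**a * (1 - x)**b / norm
    return np.sum(P * np.exp(1j * x * nu)) / n

# nu = 0 gives the normalization (quark-number sum rule) of the model PDF
print(abs(pseudo_itd(0.0)))
```

At $\nu = 0$ the transform reduces to the integral of the PDF, so the model returns a value close to 1; at larger $\nu$ the magnitude decreases, mirroring the curves discussed in the results section.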

PDFs, TMDs, and pseudo-PDFs
Specific choices of $z$, $p$, and $\alpha$ connect the Lorentz-invariant pseudo-PDF to the standard light-cone PDF. Using light-cone coordinates, and assuming a light-like displacement $z^\mu = (0, z^-, 0_T)$, longitudinal hadron momenta $p^\mu = (p^+, p^-, 0_T)$, and the Dirac matrix for $\alpha = +$, the standard light-cone PDF can be determined as the Fourier transform of the ITD,

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\nu\; e^{-i x \nu}\, \mathcal{M}(\nu, 0)\,.$$

One can now introduce transverse degrees of freedom through the displacement $z = (0, z^-, z_T)$ in order to define a TMD, $F(x, k_T^2)$, as

$$\mathcal{M}(\nu, -z_T^2) = \int_{-1}^{1} dx \int d^2 k_T\; e^{i x \nu - i k_T \cdot z_T}\, F(x, k_T^2)\,.$$

This definition of the TMD uses a straight-link operator for $U(z; 0)$ of Eq. (2.1) and describes the undisturbed or "primordial" distribution of the nucleon. Standard definitions of the TMD rely on staple links, which account for interactions with either the initial- or final-state colored particles which exist in scattering processes. The primordial TMD may not be determinable in a scattering experiment, but it can still be treated properly within the realm of quantum field theory. Unfortunately, light-cone coordinates are not suitable for lattice field theory applications. Instead, by choosing the displacement and momentum along a particular lattice axis, $z = (0, 0, z_3, 0)$ and $p = (0, 0, p_3, E)$, and the Dirac matrix for $\alpha = 4$, the pseudo-PDF can be determined from

$$\mathcal{M}(\nu, z_3^2) = \int_{-1}^{1} dx\; \mathcal{P}(x, z_3^2)\, e^{i x \nu}\,,$$

where $\nu = p \cdot z$. Ignoring the logarithmic divergences of a renormalizable theory, the pseudo-PDF will converge to the PDF in the limit $z_3^2 \to 0$. Due to Lorentz invariance, the pseudo-PDF calculated in either light-cone or Cartesian coordinates will produce the same function. The difference between the pseudo-PDF and the PDF can then be described using the $k_T$ dependence of the primordial TMD, $F(x, k_T^2)$. This limit can be shown to be the same convergence limit as that of the quasi-PDF, after one recognizes $z_3^2 = \nu^2/p_3^2$.

The complicated evolution of the quasi-PDF can be explained by the fact that, in the space of the Lorentz invariants $\nu$ and $z^2$, it is an integral with respect to $\nu$ of $\mathcal{M}(\nu, \nu^2/p_3^2)$ along the curve $z^2 = \nu^2/p_3^2$. This feature makes the quasi-PDF at a given value of the momentum $p$ a mixture of the Ioffe-time distribution at different scales, some of which may not be in the perturbative regime. In order to ensure the applicability of perturbative matching formulae, the momenta used in the quasi-PDF determination need to be very large, so that the $\nu$ dependence in the second argument can be neglected. In contrast, the pseudo-PDF is an integral along the line $z^2 = \mathrm{const}$. The single scale makes the validity of perturbation theory, or lack thereof, more transparent. A verification of the validity of the perturbative formula will be necessary for any lattice-calculated PDF to be believable.

Reduced distribution
In order to improve the calculation of the ITD from the pseudo-ITD, it has been suggested [54] to remove the $O(z^2 \Lambda^2_{\rm QCD})$ contributions by considering the reduced pseudo-ITD

$$\mathfrak{M}(\nu, z^2) = \frac{\mathcal{M}(\nu, z^2)}{D(z^2)}\,. \qquad (2.7)$$

Ideally, $D(z^2)$ will contain all the non-trivial $z^2$ dependence from higher-twist effects. The chief requirement is that in the $z^2 \to 0$ limit, $D(z^2)$ approaches a non-zero finite constant. This feature ensures that the OPE for the reduced pseudo-ITD will be the same as that of the original pseudo-ITD, and factorization will lead to the same PDF moments. For a lattice calculation, the choice $D(z^2) = \mathcal{M}(0, z^2)$ is particularly useful, and in the following this choice will be assumed. An alternative choice has been proposed in [55]. The authors claim that if one used a vacuum matrix element of the same operator, instead of the rest-frame hadron matrix element, then the pseudo-PDF would have smaller higher-twist corrections in the limit $x \to 1$.
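A toy numerical check (entirely ours, with an invented $z^2$-dependent factor) illustrates why dividing by $D(z^2) = \mathcal{M}(0, z^2)$ helps: any multiplicative factor depending only on $z^2$ drops out of the reduced object:

```python
import numpy as np

def itd_model(nu):                     # hypothetical twist-2 Ioffe-time dependence
    return np.exp(-nu**2 / 10)

z2 = np.array([1.0, 2.0, 4.0])
Z = np.exp(5.0 / np.sqrt(z2))          # toy z^2-dependent UV factor (invented)

for nu in (0.0, 1.5, 3.0):
    M = Z * itd_model(nu)              # "bare" matrix element M(nu, z^2)
    reduced = M / (Z * itd_model(0.0)) # divide by D(z^2) = M(0, z^2)
    assert np.allclose(reduced, itd_model(nu))
print("reduced pseudo-ITD is z^2-independent in the twist-2-only toy")
```

In the real calculation the residual $z^2$ dependence does not cancel exactly; the leftover logarithmic piece is precisely the evolution discussed next.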
In order to take a continuum limit, the operator $O_{WL} = \bar\psi(z)\, \gamma^\alpha\, U(z; 0)\, \psi(0)$ must first be renormalized. In QCD, the renormalization constant of a Wilson line is proportional to $e^{-m(g)\, z/a + c(z)}$, which when expanded in $g$ gives a power divergence in perturbation theory; $c(z)$ is a term which depends on the number of cusps in the Wilson line and the specific angles formed at the cusps [56, 57]. The full Wilson-line operator $O_{WL}$ is multiplicatively renormalizable [58]. This convenient feature allows the ratio in Eq. (2.7), with the choice $D(z^2) = \mathcal{M}(0, z^2)$, to have a finite continuum limit. The renormalization constants in the numerator and denominator, which depend only on the length and shape of the Wilson line, not on the external momentum, cancel exactly. Even more importantly, this ratio is Renormalization Group Invariant (RGI). Therefore, it can be factorized using the $\overline{\rm MS}$ scheme into the PDF and an appropriate Wilson coefficient, contrary to what was claimed in [43, 44].

z 2 evolution and MS matching
All phenomenological PDF fits are performed in the $\overline{\rm MS}$ scheme, either with NLO or NNLO truncation of the perturbative series for the Wilson coefficients. Since the reduced pseudo-ITD is RGI, its $z^2$ dependence is independent of any particular scheme, but this dependence must match the $\mu$ dependence of the $\overline{\rm MS}$ ITD. The $O(\alpha_s)$ perturbative $z^2$ evolution equation is given by [32]

$$\frac{d\, \mathfrak{M}(\nu, z^2)}{d \ln z^2} = -\frac{\alpha_s}{2\pi}\, C_F \int_0^1 du\; B(u)\, \mathfrak{M}(u\nu, z^2)\,, \qquad (2.8)$$

where $B(u) = \left[\frac{1+u^2}{1-u}\right]_+$ is the Altarelli-Parisi kernel. This equation is the pseudo-ITD analog of the DGLAP evolution equation for the ITD. Solving Eq. (2.8) in the leading-log approximation, as in [34], the reduced pseudo-ITD at various scales $z^2$ can be evolved to a common scale $z_0^2$ by applying

$$\mathfrak{M}(\nu, z_0^2) = \mathfrak{M}(\nu, z^2) + \frac{\alpha_s}{2\pi}\, C_F \ln\!\left(\frac{z^2}{z_0^2}\right) \int_0^1 du\; B(u)\, \mathfrak{M}(u\nu, z^2)\,.$$

As was done in [34], one can estimate the effects of evolution and matching by performing the convolution on a model reduced pseudo-ITD. Consider the pseudo-ITD for the model PDF $q_v(x) = \frac{x^a (1-x)^b}{B(a+1, b+1)}$, where $B(x, y)$ is the Beta function, for $a = 0.5$ and $b = 3$. The convolutions with the DGLAP kernel are shown in Fig. 1. Due to this logarithmic divergence in $z^2$, the determination of the PDF from the pseudo-ITD is not as straightforward as a simple $z^2 \to 0$ limit. At the leading-twist level, there exists a factorization relationship between the reduced pseudo-ITD $\mathfrak{M}(\nu, z^2)$ and the $\overline{\rm MS}$ ITD $Q(\nu, \mu^2)$, defined as

$$\mathfrak{M}(\nu, z^2) = \int_0^1 du\; K(u, z^2\mu^2)\, Q(u\nu, \mu^2) + O(z^2)\,. \qquad (2.13)$$

The kernel $K$ for matching the pseudo-ITD to the $\overline{\rm MS}$ ITD has been calculated at NLO [59][60][61], where

$$L(u) = \left[4\, \frac{\ln(1-u)}{1-u} - 2(1-u)\right]_+$$

is a scale-independent piece related to the specific choice of the $\overline{\rm MS}$ scheme. This gives the full NLO matching relationship

$$Q(\nu, \mu^2) = \mathfrak{M}(\nu, z^2) + \frac{\alpha_s}{2\pi}\, C_F \int_0^1 du \left[\ln\!\left(z^2\mu^2\, \frac{e^{2\gamma_E + 1}}{4}\right) B(u) + L(u)\right] \mathfrak{M}(u\nu, z^2)\,. \qquad (2.15)$$

The convolution of $L(u)$ with the pseudo-ITD model is shown in Fig. 1. The scale-dependent part of the kernel can be identified with the previously mentioned evolution to the scale $z_0^2 = 4 e^{-2\gamma_E - 1}/\mu^2$, at which the logarithm vanishes. The reduced pseudo-ITD at many scales could instead be directly matched to the $\overline{\rm MS}$ ITD in a single step. Separating the evolution and matching procedures, however, allows the two steps to account for the higher-twist contamination in different ways.
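The leading-log evolution step can be sketched numerically (our own discretization; $\alpha_s = 0.3$ is an assumed illustrative value) on the model ITD, implementing the plus-prescription as $\int_0^1 du\, B(u)\,[\mathfrak{M}(u\nu) - \mathfrak{M}(\nu)]$:

```python
import numpy as np
from math import gamma

CF, alpha_s = 4.0 / 3.0, 0.3          # alpha_s chosen only for illustration

x = (np.arange(2000) + 0.5) / 2000
qv = x**0.5 * (1 - x)**3 / (gamma(1.5) * gamma(4.0) / gamma(5.5))

def model_itd(nu):
    """Real part (cosine transform) of the model pseudo-ITD."""
    return np.sum(qv * np.cos(x * nu)) / 2000

def evolve(nu, log_ratio, n=500):
    """Leading-log evolution from z^2 to z0^2, log_ratio = ln(z^2/z0^2),
    using the plus-prescription of the Altarelli-Parisi kernel B(u)."""
    u = (np.arange(n) + 0.5) / n
    B = (1 + u**2) / (1 - u)
    Mnu = model_itd(nu)
    conv = np.sum(B * (np.array([model_itd(v) for v in u * nu]) - Mnu)) / n
    return Mnu + alpha_s / (2 * np.pi) * CF * log_ratio * conv

print(model_itd(2.0), evolve(2.0, np.log(2.0)))
```

Note how the plus-prescription tames the $u \to 1$ singularity of $B(u)$: the difference $\mathfrak{M}(u\nu) - \mathfrak{M}(\nu)$ vanishes linearly there, so the integrand stays finite. Evolving data from a larger $z^2$ to a smaller $z_0^2$ (a higher scale) raises the model ITD, as the data points in the results section also show.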

Moments of pseudo-PDF
Taylor expanding the right-hand side of Eq. (2.3) about $\nu = 0$ provides access to the moments of the pseudo-PDF,

$$\mathfrak{M}(\nu, z^2) = \sum_{n=0}^{\infty} \frac{(i\nu)^n}{n!}\, b_n(z^2)\,, \qquad b_n(z^2) = \int_{-1}^{1} dx\; x^n\, \mathcal{P}(x, z^2)\,. \qquad (2.16)$$

These moments can be related to the Mellin moments of the PDF through the OPE of the reduced pseudo-ITD in the limit $z^2 \Lambda^2_{\rm QCD} \ll 1$ [47]. The leading contribution in the OPE of the reduced pseudo-ITD is given in terms of the Mellin moments of the PDF, $a_n(\mu^2)$, as

$$\mathfrak{M}(\nu, z^2) = \sum_{n=0}^{\infty} \frac{(i\nu)^n}{n!}\, K_n(\mu^2 z^2)\, a_n(\mu^2) + O(z^2)\,, \qquad (2.18)$$

where $O(z^2)$ schematically represents terms at sub-leading power in the twist expansion and $K_n(\mu^2 z^2)$ are the Wilson coefficients of the moments. These Wilson coefficients are the Mellin moments of the matching kernel in Eq. (2.13). By comparing the $\nu$ dependence of Eq. (2.16) and Eq. (2.18), one finds the matching relationship between the pseudo-PDF moments and the PDF moments,

$$b_n(z^2) = K_n(\mu^2 z^2)\, a_n(\mu^2)\,.$$

This matching relationship is multiplicative, unlike the convolution for matching the pseudo-ITD to the ITD. The advantages of the Mellin-space representation have been exploited to calculate high-order evolution of the PDF [62, 63].
As given in [47], the NLO Wilson coefficients in Eq. (2.18) are given by

$$K_n(z^2\mu^2) = 1 - \frac{\alpha_s}{2\pi}\, C_F \left[\gamma_n \ln\!\left(z^2\mu^2\, \frac{e^{2\gamma_E + 1}}{4}\right) + l_n\right]\,,$$

where

$$\gamma_n = \int_0^1 du\; B(u)\, u^n$$

are the well-known moments of the Altarelli-Parisi kernel, and

$$l_n = \int_0^1 du\; L(u)\, u^n\,.$$

With the Wilson coefficients computed, we can now obtain the $\overline{\rm MS}$ moments up to $O(\alpha_s^2, z^2)$ directly from the reduced function $\mathfrak{M}(\nu, z^2)$ as

$$a_n(\mu^2) = \frac{(-i)^n}{K_n(z^2\mu^2, \alpha_s)} \left. \frac{d^n}{d\nu^n}\, \mathfrak{M}(\nu, z^2) \right|_{\nu = 0}\,.$$

There are other analogous proposals for calculating moments of parton distributions from lattice QCD which do not use local twist-2 matrix elements. Using two spatially separated current operators [25, 27], one can extract the moments of the DA or of the PDF. These ideas, as well as the above technique for determining moments, go under the name of "OPE without OPE", where one calculates a non-local operator in order to estimate the moments defined by local operators [49]. This procedure is particularly useful when the local operators are subject to some systematic difficulty, such as the power-divergent mixing of the twist-2 operators for PDF moments. In principle, one could calculate all PDF moments from the pseudo-ITD, though this is not necessarily true in practice due to the discretized values of $z$ and $p$ available to lattice calculations.
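The moment relation can be checked in a small numerical sketch (ours): the moments $b_n$ of the model pseudo-PDF appear as Taylor coefficients of the ITD at $\nu = 0$. For the model with $a = 0.5$, $b = 3$, the first moment is $b_1 = (a+1)/(a+b+2) = 3/11$:

```python
import numpy as np
from math import gamma

a, b = 0.5, 3.0
norm = gamma(a + 1) * gamma(b + 1) / gamma(a + b + 2)
x = (np.arange(4000) + 0.5) / 4000
P = x**a * (1 - x)**b / norm              # model pseudo-PDF at fixed z^2

def M(nu):                                # pseudo-ITD of the model
    return np.sum(P * np.exp(1j * x * nu)) / 4000

b1_direct = np.sum(x * P) / 4000          # b_1 = integral of x P(x)
b1_taylor = M(1e-3).imag / 1e-3           # Im M(nu) ~ nu * b_1 near nu = 0
print(b1_direct, b1_taylor)               # both ~ (a+1)/(a+b+2) = 3/11
```

Dividing such a $b_n$ by the Wilson coefficient $K_n$ then gives the $\overline{\rm MS}$ moment $a_n$, which is the multiplicative matching described above.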

Details of the Lattice Calculation
The numerical calculation is performed on three different ensembles of lattice QCD configurations. The ensembles that are used in this article were generated by the JLab/W&M collaboration [64] employing 2+1 flavors of stout-smeared clover Wilson fermions and a tree-level tadpole-improved Symanzik gauge action.
In the fermionic action, one iteration of stout smearing was used with the weight $\rho = 0.125$ for the staples. A direct consequence of this smearing is that the tadpole-corrected tree-level clover coefficient $c_{SW}$ used is very close to the non-perturbative value determined, a posteriori, employing the Schrödinger functional method [64]. The parameters used for the three ensembles are listed in Table 1. The lattice spacings, $a$, are estimated using the Wilson flow scale $w_0$ described in Ref. [65].
Table 1. Parameters for the lattices generated by the JLab/W&M collaboration [64] using 2+1 flavors of stout-smeared clover Wilson fermions and a tree-level tadpole-improved Symanzik gauge action. The lattice spacings, $a$, are estimated using the Wilson flow scale $w_0$. The stout smearing implemented in the fermion action makes the tadpole-corrected tree-level clover coefficient $c_{SW}$ very close to the value determined non-perturbatively with the Schrödinger functional method. The a127m415L ensemble contains 10 independent streams of 256 configurations each. The other ensembles contain a single stream.

Momentum smearing
In order to improve the overlap of the interpolating fields onto the nucleon ground state, smearing procedures are applied to the quark fields. Standard Gaussian smearing, which was used in the previous study of pseudo-ITDs, helps improve the overlap with the state at rest, but deteriorates the overlap with states of higher momenta. The momentum smearing procedure [50] modifies the smearing operation to improve the overlap of the interpolating field onto nucleon ground states with arbitrary momenta. The quark fields are transformed by

$$\tilde\psi(x) = \frac{1}{1 + 6\rho} \left( \psi(x) + \rho \sum_{j = \pm 1, \pm 2, \pm 3} e^{i \zeta \cdot \hat{j}}\; U_j(x)\, \psi(x + \hat{j}) \right)\,,$$

where $\rho$ and $\zeta$ are tunable parameters. This transformation has the same form as standard Gaussian smearing, with an extra phase $e^{i \zeta \cdot \hat{k}}$ multiplying the gauge links. Here, $\rho$ plays the same role as in Gaussian smearing, and $\zeta$ determines which momentum states the quark field will predominantly overlap with. As suggested in previous works, this procedure was implemented using the existing iterative Gaussian smearing routines, but with a set of rotated gauge links $\tilde U_k(x) = e^{i \zeta \cdot \hat{k}}\, U_k(x)$ in order to account for the phase. Unlike previous works with the momentum smearing technique, the momentum smearing parameter $\zeta$ was not chosen to be dependent on the nucleon momentum. Also, unlike the sequential-source technique, the matrix element extraction based upon the Feynman-Hellmann theorem allows for calculating several nucleon momentum states without additional propagator inversions. Fixing the smearing parameter as a fraction of the momentum would greatly increase the cost of this calculation by requiring different propagator calculations for each momentum used. Instead, the parameters were chosen to overlap with the higher-momentum states which had a poor signal using standard Gaussian smearing. The choice of $\zeta$ was made by maximizing the signal-to-noise ratio of the 2-point correlation function for the desired range of momenta.
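A one-dimensional free-field toy (entirely ours; the links are set to $U_j = 1$ so they carry only the momentum-smearing phase, and the phase sign is a convention chosen to peak at $+\zeta$) illustrates why the phase matters: iterative smearing acts as a momentum-space filter centered at $k = \zeta$ rather than at $k = 0$:

```python
import numpy as np

L, rho, iters = 64, 0.5, 30
p = 2 * np.pi * 8 / L                      # target momentum in lattice units

def smear(src, zeta):
    """Iterative 1-D smearing with a momentum phase on the hopping terms
    (free links, U = 1), which peaks the smearing kernel at k = +zeta."""
    f = src.astype(complex)
    for _ in range(iters):
        hop = (np.exp(-1j * zeta) * np.roll(f, -1)
               + np.exp(1j * zeta) * np.roll(f, 1))
        f = (f + rho * hop) / (1 + 2 * rho)
    return f

src = np.zeros(L)
src[0] = 1.0                               # point source
wave = np.exp(1j * p * np.arange(L))       # boosted free "ground state"

for zeta in (0.0, p):
    f = smear(src, zeta)
    overlap = abs(np.vdot(wave, f)) / (np.linalg.norm(wave) * np.linalg.norm(f))
    print(f"zeta = {zeta:.3f}  overlap = {overlap:.3f}")
```

With $\zeta = 0$ the smeared source is filtered toward zero momentum and barely overlaps the boosted wave; setting $\zeta = p$ recenters the filter and the overlap increases by orders of magnitude, which is the mechanism behind the improved signals in Fig. 2.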
In order to demonstrate the efficacy of the momentum smearing procedure, the effective masses of the pion and nucleon were calculated on the a127m415L ensemble and are plotted in Fig. 2. Each of the momentum smearings was set to improve the signal-to-noise ratio for a different range of nucleon momentum states. As can be seen, the smearing without the momentum phase performs poorly for the highest momentum states, the intermediate $\zeta$ improves the signal for the middle range of momenta, and the largest $\zeta$ performs significantly better at the highest momentum states than the other two. It should be noted that momentum smearing only alleviates one of the sources of the decaying signal-to-noise ratio. Momentum smearing improves the overlap of the operator onto the ground state at nonzero momentum, which allows for more time sources to be available before machine-precision issues arise. The other source is the variance inherent in the correlation functions, which decays exponentially in $T$ at a rate determined by the hadron's energy and the pion mass,

$$\frac{S}{N}(T) \sim e^{-\left(E_N - \frac{n_q}{2} m_\pi\right) T}\,,$$

where $n_q$ is the number of quarks and anti-quarks in the interpolating operator. This variance will not be affected by the momentum smearing procedures, because it is an inherent property of the theory.
Figure 2. The signal-to-noise ratio for the effective mass, particularly for the pion, decays without momentum smearing. Even without tuning $\zeta$ for each momentum, the momentum smearing procedure significantly improves the signal and allows for a precise determination of high-momentum effective masses which were not attainable with Gaussian smearing alone.

Feynman-Hellmann Matrix element extraction
The extraction of the matrix element is performed with a method based on the Feynman-Hellmann theorem. Hadron matrix elements can be calculated by studying the time dependence of the ratio of 3-point and 2-point correlation functions. We refer the reader to [34,66] for more details on the implementation of this method. For completeness, the key points are highlighted below.
The Feynman-Hellmann theorem relates matrix elements of energy eigenstates to derivatives of the corresponding energy eigenvalue with respect to a parameter of the theory,

$$\frac{\partial E_\lambda}{\partial \lambda} = \left\langle E_\lambda \left| \frac{\partial H_\lambda}{\partial \lambda} \right| E_\lambda \right\rangle\,.$$

In order to calculate matrix elements of arbitrary operators, following the procedure described in [66], we consider a change of the action by adding the term

$$S \to S_\lambda = S + \lambda \int d^4x\; j(x)\,,$$

where $j(x)$ is some operator of interest. The vacuum state is labeled by $|\lambda\rangle$. The 2-point correlation functions are defined as

$$C_\lambda(t) = \frac{1}{Z_\lambda} \int \mathcal{D}\Phi\;\; O(t)\, \bar{O}(0)\; e^{-S_\lambda[\Phi]}\,,$$

where $\Phi$ collectively represents the gauge and fermion fields of the theory, $O(t)$ is an interpolating field for the desired hadron, and $Z_\lambda$ is the partition function defined by

$$Z_\lambda = \int \mathcal{D}\Phi\;\; e^{-S_\lambda[\Phi]}\,.$$

These three objects, $|\lambda\rangle$, $C_\lambda$, and $Z_\lambda$, become the true QCD vacuum, correlation function, and partition function, respectively, in the $\lambda \to 0$ limit. In this formulation, the hadron matrix element of the operator $j(x)$ can be computed through the Feynman-Hellmann theorem as

$$\left. \frac{\partial m_{h_0}(\lambda)}{\partial \lambda} \right|_{\lambda = 0} = \frac{\langle h_0 | J | h_0 \rangle}{2 m_{h_0}}\,,$$

where $J(t) = \int d^3x\; j(\vec{x}, t)$ and $|h_n\rangle$ is the $n$-th hadron state with the quantum numbers of $O(t)$ and mass $m_{h_n}$. In the large Euclidean-time limit, the ground-state mass can be approximated by the effective mass

$$m_{\rm eff}(t) = \ln \frac{C(t)}{C(t+1)}\,.$$

The derivative of the effective mass can be shown to be

$$\left. \frac{\partial m_{\rm eff}(t, \lambda)}{\partial \lambda} \right|_{\lambda = 0} = R(t) - R(t+1)\,, \qquad R(t) = \frac{\left. \partial_\lambda C_\lambda(t) \right|_{\lambda = 0}}{C(t)}\,. \qquad (3.9)$$

The contributions to the time-ordered matrix element in the numerator of $R(t)$ from the regions where the current is not inserted between the hadron states were shown in [66] to be suppressed, for lattices with large time extent $N_t$, in the difference of the two terms which appear on the RHS of Eq. (3.9). At large Euclidean time $t$, Eq. (3.9) isolates the matrix element of interest up to exponentially small corrections,

$$\left. \frac{\partial m_{\rm eff}(t, \lambda)}{\partial \lambda} \right|_{\lambda = 0} = \frac{\langle h_0 | J | h_0 \rangle}{2 m_{h_0}} + A\, e^{-\Delta t} + B\, t\, e^{-\Delta t} + O(e^{-\Delta T})\,,$$

where $A$ and $B$ are fit parameters related to the matrix elements of the lowest excited state with mass gap $\Delta$. The last term, $O(e^{-\Delta T})$, represents the neglected higher-state corrections. It should be noted that the higher-state effects of this method are significantly smaller than in the typical ratio method, where excited-state effects are of order $O(e^{-\Delta T/2})$.
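A minimal two-state toy (synthetic numbers, ours) shows the basic building block of the method, the effective mass converging to the ground-state energy:

```python
import numpy as np

E0, E1 = 0.5, 0.9                             # toy ground/excited energies
t = np.arange(20)
C = np.exp(-E0 * t) + 0.4 * np.exp(-E1 * t)   # toy 2-point function

m_eff = np.log(C[:-1] / C[1:])                # effective mass ln[C(t)/C(t+1)]
print(m_eff[0], m_eff[-1])                    # plateau approaches E0 = 0.5
```

The excited-state pollution decays as $e^{-(E_1 - E_0) t}$, so the plateau sets in once $t$ exceeds a few inverse energy gaps; the $\lambda$-derivative of this plateau is what carries the matrix element.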
This method of matrix element extraction has a number of advantages over the more common techniques. First, the summation over the operator insertion time eliminates one of the independent variables of the correlation function. This reduction allows for a clear identification of excited state contamination.
Without the summation over operator insertion time, points with several insertion times are required to visually identify a plateau. Furthermore, several source/sink separations are needed to unambiguously determine the ground-state matrix element. A second advantage, demonstrated in [67], is that the matrix element extraction can begin at much earlier times in contrast to what is possible from other calculations, whose excited state effects are much larger. This advantage is particularly important for physical mass calculations where the excited state effects require long time extents in order to be controlled.

Desired lattice correlation functions
All correlation functions were calculated with randomly chosen source points. Correlation functions with smeared source operators and both point and smeared sink operators are constructed, each for 3 different values of the momentum smearing parameter $\zeta$. All of these correlation functions are simultaneously analyzed to extract the matrix element. The nucleon states were boosted up to a maximum momentum of $p_{\rm max} = \pi/2a$, except for the ensemble a127m415L, for which the maximum momentum was $p_{\rm max} = 3\pi/8a$. Within these ranges of momenta, the continuum energy dispersion relation is still reasonably satisfied by the lattice calculation within errors, as can be seen in Fig. 2. The momentum smearing technique allows for more precise access to the high-momentum correlation functions.
To calculate the matrix element for the nucleon Ioffe-time distribution, the relevant 2-point correlation function is defined by

$$C(p, T) = \langle N_p(T)\, \bar{N}_p(0) \rangle\,, \qquad (3.12)$$

where $N_p$ is a helicity-averaged interpolating field of a nucleon with momentum $p$ and $T$ is the Euclidean time separation between the nucleon creation and annihilation operators. The quark fields in $N_p$ have all been smeared using Gaussian momentum smearing in order to improve the overlap with the boosted ground state. The relevant 3-point correlation function is defined by

$$C_O(p, z, T) = \sum_t \langle N_p(T)\, O_\Gamma(z, t)\, \bar{N}_p(0) \rangle\,,$$

where $O_\Gamma(z) = \bar\psi(0)\, \Gamma\, W(0; z)\, \tau_3\, \psi(z)$. In this extraction method, the time of the operator insertion is summed over. The flavor-isospin Pauli matrix $\tau_3$ is used in order to create the iso-vector quark combination, $u - d$. This step avoids the potentially costly additional calculation of disconnected diagrams. The effective bare matrix element, $M(z \cdot p, z^2, T)$, is defined by the difference of ratios

$$M(z \cdot p, z^2, T) = \frac{C_O(p, z, T+1)}{C(p, T+1)} - \frac{C_O(p, z, T)}{C(p, T)}\,.$$

The nucleon energy $E_N$ enters through the relativistic normalization of the states; the resulting factor of $2E_N$ cancels the factor of $2p_4$ in the pseudo-ITD definition from Eq. (2.2), such that the difference of ratios is an effective bare pseudo-ITD $\mathcal{M}(\nu, z^2, T)$. The separation of the quark fields and the nucleon momentum are both chosen along the $\hat{z}$ axis, $z = z_3$ and $p = p_3$. The bare matrix element is determined through the large-time asymptotics of the effective matrix element,

$$\mathcal{M}(\nu, z^2) = \lim_{T \to \infty} M(z \cdot p, z^2, T)\,.$$

As found in [66], the bare matrix element extracted with this method has different excited-state effects than a typical calculation of a 3-point correlation function. The bare matrix element will have contamination from higher states proportional to $e^{-\Delta T}$ and $T e^{-\Delta T}$, where $\Delta$ is the energy gap between the nucleon's ground state and first excited state. These terms will be included in the matrix element fit to control these effects in the low-$T$ region.
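The large-$T$ behavior of the ratio difference can be mimicked with a schematic toy (ours; the linear-in-$T$ growth of the summed ratio is put in by hand rather than built from spectral sums):

```python
import numpy as np

M_true, Delta = 0.45, 0.35             # toy matrix element and energy gap
T = np.arange(2, 15)

# summed-insertion ratio: linear growth in T plus excited-state pollution
R = 0.7 + M_true * T + 0.5 * np.exp(-Delta * T)

M_eff = R[1:] - R[:-1]                 # effective matrix element R(T+1) - R(T)
print(M_eff[0], M_eff[-1])             # approaches M_true = 0.45 at large T
```

Taking the finite difference removes the $T$-independent offset and isolates the slope, which is why the method needs no separate fit over operator-insertion times.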

Fits and Excited States Effects
The simplest method to extract the reduced pseudo-ITD from the effective matrix element is to take a mean value in a large-time region where the effective reduced matrix element exhibits a plateau. These data have plateau regions, especially at low momenta, where this procedure could be performed. A more sophisticated method takes into account the effects of the lowest excited states, providing a better-justified account of the systematic errors. The effective bare matrix element has excited-state contamination of the form

$$M(z \cdot p, z^2, T) = M(z \cdot p, z^2) + A_p(z^2)\, e^{-\Delta_p T} + B_p(z^2)\, T\, e^{-\Delta_p T}\,, \qquad (3.18)$$

where $A_p(z^2)$ and $B_p(z^2)$ are the contributions from matrix elements containing the first excited state and $\Delta_p$ is the effective energy gap between the ground state and the first excited state. The coefficients of the excited-state terms are, in general, correlation-function dependent, i.e. different for each set of smearing parameters. Further simplifications of this functional form are possible to reduce the number of fit parameters. One could adopt the lattice energy dispersion relation to fix $\Delta_p$ in terms of $\Delta_0$, removing another parameter. The rest-frame energy gap could be extracted from 2-point functions prior to the study of the matrix element, or left as a fit parameter common to all matrix elements. The latter option, requiring a simultaneous fit of many matrix elements for the sake of fixing a single parameter, may not be too practical. Another choice is to perform spectroscopy fits on the 2-point functions to find the energy gaps and hold those values fixed in a fit for the coefficients of the exponentials.
In this work, a fit of the data to Eq. (3.18) is performed simultaneously for all correlation functions, with the ground-state matrix element and the effective energy gap held common. Specifically, a fit of $N$ different effective bare matrix elements with different smearing setups, $M^{\rm eff}_j$, is performed to the form

$$M^{\rm eff}_j(z \cdot p, z^2, T) = M(z \cdot p, z^2) + A_j\, e^{-\Delta T} + B_j\, T\, e^{-\Delta T}\,.$$
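With the gap held fixed (e.g. taken from 2-point spectroscopy), a fit of this form is linear in the remaining parameters, so an ordinary least-squares solve suffices. A sketch with synthetic data (all numbers ours):

```python
import numpy as np

rng = np.random.default_rng(1)
T = np.arange(2, 16, dtype=float)
M_true, A_true, B_true, Delta = 0.62, 0.25, -0.08, 0.40   # toy parameters

data = M_true + A_true * np.exp(-Delta * T) + B_true * T * np.exp(-Delta * T)
data = data + 0.002 * rng.standard_normal(T.size)         # toy noise

# with Delta fixed, the Eq.-(3.18)-style fit is linear in (M, A, B)
design = np.column_stack([np.ones_like(T),
                          np.exp(-Delta * T),
                          T * np.exp(-Delta * T)])
M_fit, A_fit, B_fit = np.linalg.lstsq(design, data, rcond=None)[0]
print(M_fit)
```

In the actual analysis the fit is correlated and simultaneous over smearings, but the structure is the same: one shared ground-state matrix element plus per-correlator excited-state coefficients.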

Cancellation of Renormalization Constants
With a lattice regulator (unlike in dimensional regularization), the operator $O_{WL}(z)$ has a power divergence in $z/a$. The handling of this power divergence in lattice QCD renormalization schemes, such as the popular RI-MOM scheme, and the associated matching relationships have generated a large amount of discussion; see [23] for a comparison of methods. In a lattice QCD calculation, renormalization constants require a separate calculation. Alternatively, when possible, one can form ratios of matrix elements in which the UV divergences cancel. In this spirit, a ratio which has the same leading order in the OPE as the pseudo-ITD will be constructed, in which all renormalization constants cancel.
The local vector current matrix element, $M^0(z \cdot p, z^2)|_{z=0}$, where $M^0$ is the bare matrix element, can be used to ensure quark number conservation. In the continuum limit, where the vector current is conserved, this matrix element should be equal to 1. Due to lattice artifacts, the local vector current is not conserved and possesses an $ap$ dependence, but it has a finite renormalization constant $Z_V$, with the property $Z_V \to 1$ in the continuum limit $a \to 0$. Again, the leading-order behavior of the OPE for the ratio will match the original pseudo-ITD. This ratio still contains the logarithmic and power divergences associated with the Wilson-line operator. The UV divergences of the Wilson-line operator will be canceled by forming ratios which have in the denominator the rest-frame matrix element $M^0(z \cdot p, z^2)|_{p=0}$. The imaginary component of this matrix element is consistent with zero for all $z$. The real component, plotted in Fig. 6, shows the exponential behavior expected from the non-perturbative effects generated by the Wilson-line operator. The low $z/a$ region exhibits a cusp as $z \to 0$, which is a signal of the power divergence. For the pseudo-ITD, the matching kernel is unity at $\nu = 0$, meaning the rest-frame matrix element is the integral of the PDF, which for the iso-vector quark combination is 1, up to potential higher-twist and discretization errors. The leading-order behavior of the OPE for this ratio will be the same as for the pseudo-ITD, so this matrix element satisfies the properties required for the reduced pseudo-ITD.
Figure 6. The rest-frame distribution $M^0(z \cdot p, z^2)|_{p=0}$. The left plot is from the ensemble a127m415, the middle plot from the ensemble a094m390, and the right plot from the ensemble a127m415L. The cusp as $z \to 0$ is the signal of the power divergences which occur in perturbation theory. The large-$z$ limit reveals the exponential nature of the Wilson-line renormalization constant.
Finally, the reduced matrix element is defined by the double ratio
$$\mathfrak{M}(\nu, z^2) = \frac{M(\nu, z^2)/M(\nu, 0)}{M(0, z^2)/M(0, 0)}\,.$$
This double ratio not only cancels the multiplicative renormalization constants; it also achieves the desired goal of canceling some $O(z^2)$ higher-twist contaminations. This feature was demonstrated in the quenched approximation [54], where at fixed Ioffe time and large $z^2$ the reduced pseudo-ITD was independent of $z^2$ instead of showing a polynomial behavior. The reduced pseudo-ITD is a renormalization-scheme-independent quantity which can be matched directly to the $\overline{\rm MS}$ light-cone PDF through the OPE-based Eq. (2.15).
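As a minimal numerical sketch of why the double ratio works, the code below forms it from a toy table of bare matrix elements in which every $z$-dependent renormalization factor is multiplicative. The array layout `M[p, z]` and the toy functions are assumptions for illustration, not the paper's data format.

```python
import numpy as np

def reduced_itd(M, p_idx, z_idx):
    """Form the reduced pseudo-ITD double ratio from bare matrix elements.

    M[p, z] holds the bare matrix element on a grid of momentum indices p
    and Wilson-line lengths z (hypothetical layout). Any multiplicative
    z-dependent renormalization, including the Wilson-line power
    divergence, cancels between numerator and denominator.
    """
    return (M[p_idx, z_idx] / M[p_idx, 0]) / (M[0, z_idx] / M[0, 0])

# Toy check: for M[p, z] = Z(z) * f(nu) with nu = 0.3 * p * z, the
# double ratio reduces to f(nu) / f(0), independent of Z(z).
Z = np.exp(-0.7 * np.arange(5))            # mock Wilson-line factor
f = lambda nu: 1.0 / (1.0 + nu**2)          # mock Ioffe-time dependence
M = np.array([[Z[z] * f(0.3 * p * z) for z in range(5)] for p in range(4)])
r = reduced_itd(M, 2, 3)                    # equals f(0.3 * 2 * 3)
```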
The form of this double ratio has the additional advantage of being exactly equal to unity at $\nu = 0$, free of higher-twist effects and lattice-spacing errors. This feature explicitly enforces the iso-vector quark PDF sum rule, $\int_0^1 dx\, (u(x) - d(x)) = 1$. Any error in the sum rule for a PDF determined from this reduced pseudo-ITD must be an artifact of the procedure used to extract the PDF.

Reduced pseudo-ITD results
In this section we discuss the results obtained for the reduced Ioffe-time pseudo-distribution. The real and imaginary components of the ITD, and therefore of the continuum reduced pseudo-ITD, can be analyzed separately, yielding additional insight into the structure of the hadron. The real (CP-even) component describes the valence quark distribution,
$${\rm Re}\, Q(\nu, \mu^2) = \int_0^1 dx\, \cos(\nu x)\, q_v(x, \mu^2)\,,$$
while the imaginary (CP-odd) component describes the sum of the quark and anti-quark distributions,
$${\rm Im}\, Q(\nu, \mu^2) = \int_0^1 dx\, \sin(\nu x)\, \left[ q_v(x, \mu^2) + 2\bar{q}(x, \mu^2) \right]\,.$$
A combined analysis of these two components allows for isolating the valence and sea/anti-quark contributions to the reduced pseudo-ITD. In Figs. 7 and 8 we show the reduced pseudo-ITD as a function of $p_3$ and $z_3$ respectively. The real-component curves all have a Gaussian-like shape, which suggests that the renormalization of the Wilson line was indeed canceled. The curves look similar, but their width decreases with increasing momentum. If instead the real component of the Ioffe-time pseudo-distribution is plotted as a function of the Ioffe time (see Figs. 9-11), the data fall on a more universal curve which is nearly independent of $z_3^2$. In the absence of higher-twist effects this feature is expected, since the perturbative $z^2$ dependence of $\mathfrak{M}(\nu, z^2)$ only begins at $O(\alpha_s)$.
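The cosine/sine relations above are straightforward to evaluate numerically. The sketch below computes both ITD components for toy model PDFs (the model functions `qv` and `qbar` are hypothetical inputs, chosen only so the valence sum rule is satisfied):

```python
import numpy as np

# Midpoint grid on (0, 1) for the x integrals
_x = (np.arange(4000) + 0.5) / 4000

def itd_components(nu, qv, qbar):
    """Real and imaginary ITD components from model PDFs: the real part is
    a cosine transform of the valence distribution, the imaginary part a
    sine transform of q_v + 2*qbar (midpoint-rule integration)."""
    re = np.mean(np.cos(nu * _x) * qv(_x))
    im = np.mean(np.sin(nu * _x) * (qv(_x) + 2.0 * qbar(_x)))
    return re, im

qv = lambda x: 6.0 * x * (1.0 - x)       # toy valence PDF, integrates to 1
qbar = lambda x: 0.5 * (1.0 - x)**3      # toy sea PDF

re0, im0 = itd_components(0.0, qv, qbar)
# re0 -> 1 (valence number sum rule), im0 -> 0
```

At $\nu = 0$ the real part reproduces the valence normalization and the imaginary part vanishes, mirroring the sum-rule discussion above.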

Perturbative evolution and matching to MS
Even though the data follow a nearly $z_3$-independent curve, the DGLAP evolution of the PDF dictates a perturbatively calculable dependence on the scale $z^2$. Understanding this $z^2$ dependence is particularly necessary for comparing the data to phenomenological fits, which are renormalized and given at a single scale. The $\overline{\rm MS}$ matching could be performed in a single step by applying the kernel in Eq. (2.13) to each set of data with different $z_3$ independently. It is also possible to separate the $z^2$ evolution from the $\overline{\rm MS}$ matching step. As long as both steps are of the same order in $\alpha_s$, the one-step and two-step procedures agree up to higher-order corrections. Above a certain length scale, the perturbative evolution of the data ceases and a separation of the $\nu$ and $z^2$ variables appears, up to the neglected higher-twist effects which are partly canceled by the ratio. This separation of variables results in a $z^2$ independence of the reduced pseudo-ITD at large $z^2$. For the evolution of the data points in this regime, the initial scale could be taken to be the scale at which evolution first appears to stop. In the quenched approximation, this scale was found to be $z^{-1} \approx 400$ MeV [54]. This scale is particularly low for a perturbative evolution, and a non-perturbative evolution method would be preferable. Failing to account for this cessation of perturbative evolution would evolve the longest-distance points away from the universal curve. In order to perform the convolutions in Eq. (2.13), the reduced pseudo-ITD at constant $z^2$ is fit to a sixth-degree polynomial and subsequently integrated over. The real and imaginary components are fit to the even and odd powers of the polynomial respectively, with three free parameters each,
$${\rm Re}\, \mathfrak{M}(\nu) = 1 + c_1 \nu^2 + c_2 \nu^4 + c_3 \nu^6\,, \qquad {\rm Im}\, \mathfrak{M}(\nu) = c'_1 \nu + c'_2 \nu^3 + c'_3 \nu^5\,.$$

Figure 11. The real and imaginary components of the reduced pseudo-ITD on the ensemble a094m390 as a function of $\nu$.
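The even/odd polynomial fits described above can be done with a simple linear least-squares solve. The sketch below fixes the constant term of the real part to 1, as the reduced-ITD normalization requires; jackknife error propagation, used in the paper, is omitted for brevity.

```python
import numpy as np

def fit_reduced_itd(nu, re_M, im_M):
    """Least-squares fit of the reduced pseudo-ITD at fixed z^2 to
    even/odd sixth-degree polynomials (three free parameters each),
    with the constant term fixed to 1 by the normalization."""
    # Real part: Re M(nu) = 1 + c1 nu^2 + c2 nu^4 + c3 nu^6
    A_even = np.column_stack([nu**2, nu**4, nu**6])
    c_re, *_ = np.linalg.lstsq(A_even, re_M - 1.0, rcond=None)
    # Imaginary part: Im M(nu) = c1' nu + c2' nu^3 + c3' nu^5
    A_odd = np.column_stack([nu, nu**3, nu**5])
    c_im, *_ = np.linalg.lstsq(A_odd, im_M, rcond=None)
    return c_re, c_im
```

The fitted polynomial then serves as a smooth interpolation of $\mathfrak{M}(\nu)$ that can be integrated against the evolution and matching kernels.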
Fig. 15 shows the reduced pseudo-ITD evolved to $z^{-2} = 4 e^{2\gamma_E + 1}\,{\rm GeV}^2$. This particular scale was chosen so that the reduced pseudo-ITD can be easily matched to the $\overline{\rm MS}$ ITD at $\mu = 2$ GeV. The value $\alpha_s(2\,{\rm GeV}) = 0.303$ was taken from the evolution used by LHAPDF [68] for the dataset cj15nlo from the CTEQ-Jefferson Lab collaboration [69]. There are two ways in which these convolutions can be used to evolve and match the reduced pseudo-ITD to the $\overline{\rm MS}$ ITD. The most straightforward is a direct inversion of Eq. (2.15) for the data with each $z^2$ independently. An alternative, but equivalent, approach is to first perform the $z^2$ evolution of the reduced pseudo-ITD, for each $z^2$ independently, using Eq. (2.9), to the scale $z_0^2 = e^{-2\gamma_E - 1}\mu^{-2}$. With all the data evolved to this common scale, the inverse of Eq. (2.15) can then be applied to match the pseudo-ITD to the $\overline{\rm MS}$ ITD using the data from all $z^2$ simultaneously. The convolutions with $L$ are performed by fitting the evolved reduced pseudo-ITD to the same polynomials as before in Eq. (3.24). The common scale was chosen such that the scale-dependent logarithm in Eq. (2.15) vanishes when matching to the $\overline{\rm MS}$ ITD at a given $\mu$.
In this work, the scale $\mu = 2$ GeV was chosen. The evolved reduced pseudo-ITD and the matched $\overline{\rm MS}$ ITD are shown in Fig. 15. We have checked that the evolution and matching procedure performed in a single step or in two steps yields a consistent $\overline{\rm MS}$ ITD. For the remainder of this work, only the one-step matching results are used.

PDF extraction
Due to the restrictions on the allowed quark-field separations and momentum states on the lattice, the data lie on a discrete set of Ioffe times covering only a finite interval of $\nu$. These limitations make the extraction of the PDF from Eqs. (3.22) and (3.23), given lattice data, an ill-posed inverse problem. In order to reliably extract a PDF from the lattice data, one must provide additional information. Which information, and how it is applied, constitute different solutions to the inverse problem, a few of which were studied for use in PDF calculations in [28,53,70].

Moments of PDF and pseudo-PDF
Information about the PDF can still be obtained from the reduced pseudo-ITD without directly performing the Fourier transform. The moments of the pseudo-PDF can be used to calculate PDF moments while entirely avoiding the inverse problem [47]. A discretized version of the relationship in Eq. (2.16) between the moments of the pseudo-PDF and the reduced pseudo-ITD data can be written in the matrix form
$$M = C\, b\,,$$
where $M$ is a vector of $N$ data points, $C$ is the Vandermonde matrix in $\nu$, and $b$ is a vector of $M$ moments weighted by the factor $i^n/n!$ appearing in Eq. (2.16). The Vandermonde matrix is an $N \times M$ matrix of the form $C_{in} = \nu_i^n$, where $\nu_i$ is the Ioffe time of the $i$-th data point in $M$. This relationship can also be split into real and imaginary components, which contain only even and odd powers of $\nu$ and yield the even and odd moments of the pseudo-PDF respectively. The equation is inverted for points with fixed $z^2$. At small separations, the resulting first and second moments of the pseudo-PDF show little dependence on $z^2$. After applying the DGLAP evolution and matching relationships of Eq. (2.20), any residual $z^2$ dependence of the moments of the PDF, which would be caused by higher-twist contaminations, appears negligible. Fig. 16 compares this calculation of the pseudo-PDF and $\overline{\rm MS}$ PDF moments with those calculated from various global fits. As in the direct calculation of the local matrix element [71], the PDF moments at heavy pion mass are systematically higher than the phenomenologically determined results.
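The real-component version of this Vandermonde inversion can be sketched as a least-squares solve. Here the expansion ${\rm Re}\,\mathfrak{M}(\nu_i) = 1 + \sum_k (-1)^k \nu_i^{2k} b_{2k}/(2k)!$ is assumed, with $b_0 = 1$ fixed by the normalization:

```python
import numpy as np
from math import factorial

def even_moments(nu, re_M, kmax=2):
    """Least-squares inversion of the rectangular Vandermonde system for
    the even pseudo-PDF moments b_{2k}, using
        Re M(nu_i) = 1 + sum_k (-1)^k nu_i^{2k} / (2k)! * b_{2k}.
    Returns [b_2, b_4, ...] up to 2*kmax."""
    C = np.column_stack([(-1.0)**k * nu**(2 * k) / factorial(2 * k)
                         for k in range(1, kmax + 1)])
    b, *_ = np.linalg.lstsq(C, re_M - 1.0, rcond=None)
    return b
```

The imaginary component is handled identically with odd powers of $\nu$, yielding the odd moments.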
In principle, one can also use this technique to extract higher moments. The procedure has been tested on the next two higher moments. Only the results with the largest few $z^2$ had statistical errors comparable to the lower moments. The range of Ioffe time for those large-$z$ data points was sufficiently large to determine the second term in the Taylor expansion of Eq. (3.24) with reasonable statistical precision. Since these data potentially have significant higher-twist corrections, the results should be treated with caution. The small-$z$ data points appear almost entirely described by the terms proportional to $c_1$ and $c_2$. Further studies on finer lattice spacings will be required to extend the range of Ioffe time of the low-$z$ data, in order to constrain the higher moments and to confirm the absence of higher-twist contamination observed in the lower moments.
Reconstructing the PDF from its moments is itself an inverse problem. Instead of inverting a Fourier transform, like most of the procedures discussed in this section, it requires the inversion of a Mellin transform. The PDF can be parameterized by some function such as the PDF Ansatz in Eq. (4.2), or the moments can be parameterized by a function with a known inverse Mellin transform, as suggested in [27]. In this work, with at best two moments constrained, there is little hope of performing a reliable fit to any of these functional forms.
The determination of the moments of the pseudo-PDF is a useful check when studying these Ioffe-time distributions. The inversion of the Fourier transform is an ill-posed problem with many complications, and the moments of the pseudo-PDF allow a quick sanity check that the data contain reasonable information. The residual $z^2$ dependence of the $\overline{\rm MS}$ moments, derived from the moments of the pseudo-PDF, is a non-trivial check of the size of higher-twist effects. The lack of any statistically significant $z^2$ dependence in the $\overline{\rm MS}$ moments calculated on these ensembles justifies using the data at all $z^2$ in the following PDF extractions.

PDF fits
All solutions to the ill-posed inverse problem require additional information to constrain a unique solution for the unknown function. In the global PDF fitting community, the most common choice for solving the inverse problem is to adopt a physically motivated functional form for the PDF. By choosing a PDF parameterization with fewer parameters than data points, the inverse problem is regulated. This model Ansatz can be designed to exhibit known limiting behaviors and physically motivated features, and to satisfy possible constraints. The better motivated the information used to construct the Ansatz, the more successful this technique will be, but any particular choice introduces a model-dependent bias into the final result. The ill-posed inverse problem does not have a unique solution, and ultimately some bias must be introduced into the PDF determination. Ideally, several different parameterizations would be checked and compared; in effect, this cross-checking has occurred amongst the several phenomenological PDF fits employed, each with different model choices.

Figure 16. The moments of the pseudo-PDF compared to phenomenologically determined PDF moments from the NLO global fit CJ15nlo [69], and the NNLO global fits MSTW2008nnlo68cl_nf4 [72] and NNPDF31_nnlo_pch_as_0118_mc_164 [73], all evolved to 2 GeV. The top, middle, and bottom plots are from the ensembles a127m415, a127m415L, and a094m390 respectively. The left and right columns show the first and second moments respectively. Only the lowest two moments have signal for most $z$. The higher moments only have signal at the largest $z$, where the maximum Ioffe time allows for identifying more than the leading behavior in the Taylor expansion.
It is known that the PDF can be reasonably well described simply by the following expression, which parameterizes its limiting behaviors,
$$q(x) = \frac{x^a (1-x)^b}{B(a+1,\, b+1)}\,, \qquad (4.2)$$
where the Beta function $B(a+1, b+1)$ normalizes the distribution.
To add more generality, the phenomenological PDFs are fit to a more flexible functional form,
$$q(x) = \frac{1}{N}\, x^a (1-x)^b\, P(x)\,, \qquad (4.3)$$
where $P(x)$ is a yet-to-be-specified interpolating function with additional model parameters and $N$ normalizes the distribution.
There exist well-known features of PDFs, such as vanishing as $x \to 1$, diverging as $x \to 0$, and the constraints of the PDF sum rules. The limiting behaviors are controlled by the signs of the model parameters $a$ and $b$ in Eq. (4.3), and the normalization can be fixed to satisfy the sum rules. By separating out these features, $P(x)$ is allowed to be a smoother, more slowly varying function which is easier to determine. One choice of $P(x)$ for the valence quark PDF, employed by the CJ [69] and MSTW [72] collaborations, is
$$P(x) = 1 + c\sqrt{x} + d\, x\,. \qquad (4.4)$$
This functional form explicitly enforces the PDF sum rule and allows all moments to be calculated directly as ratios of sums of Beta functions.
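The "ratio of sums of Beta functions" statement can be made concrete: each term $x^{a+m}(1-x)^b$ integrates to $B(a+1+m, b+1)$, so any Mellin moment of the normalized ansatz follows in closed form.

```python
from math import gamma

def beta_fn(x, y):
    """Euler Beta function B(x, y) = Gamma(x) Gamma(y) / Gamma(x + y)."""
    return gamma(x) * gamma(y) / gamma(x + y)

def ansatz_moment(n, a, b, c, d):
    """n-th Mellin moment <x^n> of the CJ/MSTW-style form
    q(x) = x^a (1-x)^b (1 + c*sqrt(x) + d*x) / N, with N fixed by the
    sum rule <x^0> = 1; each moment is a ratio of sums of Beta functions."""
    s = lambda m: (beta_fn(a + 1 + m, b + 1)
                   + c * beta_fn(a + 1.5 + m, b + 1)
                   + d * beta_fn(a + 2 + m, b + 1))
    return s(n) / s(0)
```

For example, with $c = d = 0$ and $a = b$ the distribution is symmetric about $x = 1/2$, so $\langle x \rangle = 1/2$.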
The statistical errors are obtained with the jackknife resampling technique. In the dynamical-quark ITDs, there does not appear to be a strong dependence on the separation $z$ with which a data point was calculated. This $z^2$ independence indicates that large polynomial $z^2$ corrections do not appear to exist; therefore, we fit all $z^2$ separations simultaneously. Figs. 17-19 show the results of fitting the ITD to the functional form used by the CJ and MSTW collaborations. The larger the variance of the fit result at large Ioffe time, where data are lacking, the more the resulting PDF tends to have large oscillatory errors. It will be seen for other functional forms as well that when the variance grows significantly at large Ioffe time, the PDF contains oscillatory solutions which generate large errors. In these functional forms, the large Ioffe-time behavior of the PDF is largely governed by the low-$x$ parameter $a$: the large-$\nu$ behavior of the ITD from the model in Eq. (4.2) falls as $\nu^{-(1+a)}$, up to oscillating factors. The PDF must have a finite integral for the sum rules to be enforced. This restricts the power of the $x \to 0$ divergence to $a > -1$, so this ITD must vanish in the limit $\nu \to \infty$. All of the polynomials tried above only add terms which force the ITD to converge to 0 more rapidly. Any fit solution which does not eventually converge to 0 should be rejected, because it cannot have a finite integral and violates the sum rule.

Figure 17. The nucleon valence distribution obtained from the ensemble a127m415 fit to the form used by the JAM collaboration in Eq. (4.4). The $\chi^2$/d.o.f. for the fit with all the data is 2.5(1.5). The uncertainty band is obtained from the fits to the jackknife samples of the data. The resulting fits are compared to the phenomenologically determined PDFs from the NLO global fit CJ15nlo [69], and the NNLO global fits MSTW2008nnlo68cl_nf4 [72] and NNPDF31_nnlo_pch_as_0118_mc_164 [73], all evolved to 2 GeV.
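The fitting strategy can be illustrated with a minimal sketch: compute the model ITD of the two-parameter ansatz by a numerical cosine transform, then minimize a naive (uncorrelated) $\chi^2$. A crude grid scan stands in for the full correlated jackknife fits used in the paper.

```python
import numpy as np

_x = (np.arange(4000) + 0.5) / 4000   # midpoint grid on (0, 1)

def model_itd(nu, a, b):
    """Real part of the ITD for the two-parameter ansatz of Eq. (4.2),
    q(x) = x^a (1-x)^b / B(a+1, b+1), via a numerical cosine transform."""
    q = _x**a * (1.0 - _x)**b
    q /= np.mean(q)                   # enforce the sum rule numerically
    return np.array([np.mean(np.cos(v * _x) * q) for v in np.atleast_1d(nu)])

def grid_fit(nu, data, a_vals, b_vals):
    """Crude chi^2 grid scan over (a, b): a sketch only, standing in for
    the correlated jackknife fits described in the text."""
    best = min((float(np.sum((model_itd(nu, a, b) - data)**2)), a, b)
               for a in a_vals for b in b_vals)
    return best[1], best[2]
```

In practice a gradient-based minimizer over continuous $(a, b, c, d)$ replaces the grid scan, and the fit is repeated on every jackknife sample to build the uncertainty band.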

Continuum extrapolation
Previous publications of quasi-PDFs and pseudo-PDFs from lattice QCD have used only one lattice spacing at a time. With results from two lattice spacings, we can estimate the systematic error of the continuum extrapolation. Though the action is $O(a)$ improved, the quark bilinear operator $\bar\psi(0)\gamma^\alpha W(0;z)\psi(z)$ is not; therefore, $O(a)$ effects are still possible. With only two lattice spacings, it is not possible to fit both $O(a)$ and $O(a^2)$ terms. If one is cavalier, it could be supposed that $O(a)$ effects may have been significantly reduced, or even canceled, in the ratios defining the reduced matrix elements. This is almost certainly true in the low-$\nu$ region, where the normalization explicitly fixes the value to 1. This hope is further supported by the fact that the results from both lattice spacings are statistically consistent with each other in this region. With only two lattice spacings an extrapolation may not be reliable; however, the primary goal of this exercise is not the extrapolation itself but to understand the regions of $p$ and $z$ which show signs of larger discretization errors. As will be argued, the large-momentum discretization errors, proportional to $O(ap)$, are more significant than the short-distance discretization errors, proportional to $O(a/z)$.

Figure 18. The nucleon valence distribution obtained from the ensemble a127m415L fit to the form used by the JAM collaboration in Eq. (4.4). The $\chi^2$/d.o.f. for the fit with all the data is 2.1(6). The uncertainty band is obtained from the fits to the jackknife samples of the data. The resulting fits are compared to the phenomenologically determined PDFs from the NLO global fit CJ15nlo [69], and the NNLO global fits MSTW2008nnlo68cl_nf4 [72] and NNPDF31_nnlo_pch_as_0118_mc_164 [73], all evolved to 2 GeV.

Figure 19. The nucleon valence distribution obtained from the ensemble a094m390 fit to the form used by the JAM collaboration in Eq. (4.4). The $\chi^2$/d.o.f. for the fit with all the data is 2.0(5). The uncertainty band is obtained from the fits to the jackknife samples of the data. The resulting fits are compared to the same phenomenologically determined PDFs, all evolved to 2 GeV.
In order to study the discretization effects, the real component of the reduced pseudo-ITD calculated on the ensembles a094m390 and a127m415, which have approximately the same spatial extent, is fit to the polynomial expansion
$$\mathfrak{M}(\nu) = 1 + a\nu^2 + b\nu^4 + c\nu^6 + d\nu^8\,. \qquad (4.6)$$
For this fit, data with the same Ioffe time are averaged and the $z^2$ dependence is neglected. Due to the discretization of the allowed nucleon momenta $p$, the evolved reduced pseudo-ITD $\mathfrak{M}(\nu)$ is calculated on a different set of $\nu$ for configurations with different lattice lengths $L$; some $\nu$ are common to both ensembles, but far from all of them.
The results of these fits, as well as the extrapolation to the continuum limit, are shown in Fig. 20. The discretization effects are assumed to have the form
$$\mathfrak{M}(\nu, a) = \mathfrak{M}(\nu, 0) + c_a(\nu)\, a^n\,,$$
where $n = 1, 2$. Without more lattice spacings, particularly finer ones, these extrapolations should be regarded with reservations; they are rather an attempt to quantify how significant the discretization effects can be, especially at large Ioffe time $\nu$ (originating from the region of large momenta). Consequently, no attempt is made to determine a continuum-limit extrapolated PDF from these data. The discrepancy between the two lattice spacings is small at low $\nu$, but becomes significant at large $\nu$. The low-$\nu$ discretization errors would come from effects proportional to powers of $a/z$, but their size is restricted by the normalization of the reduced pseudo-ITD. On the other hand, the large-$\nu$ discretization errors are proportional to powers of $ap$, which are not constrained in any way. The size of the coefficient $c_a(\nu)$ is shown in Fig. 20 for errors proportional to $O(a)$ and $O(a^2)$. The discretization errors appear to raise the lattice results at these fairly coarse lattice spacings. To obtain results in the critical large Ioffe-time region, finer lattice spacings are required, both to reduce discretization effects and to produce more data in this region thanks to the finer momentum resolution. It should be noted that lattice-spacing errors at large Ioffe time will be dominated by $O(ap)$ effects rather than by short-distance effects that scale as $O(a/z)$. Therefore, a careful study of the continuum extrapolation of this large Ioffe-time region should be performed.
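With only two spacings, the assumed form above can be solved exactly point by point in $\nu$. The sketch below performs that two-point extrapolation; the power $n$ is an input assumption, not something the data can determine.

```python
import numpy as np

def continuum_extrap(M1, M2, a1, a2, n=1):
    """Two-point continuum extrapolation assuming
        M(nu, a) = M(nu, 0) + c_a(nu) * a^n,   n = 1 or 2.
    M1, M2 are arrays of fitted ITD values at spacings a1, a2 (in fm).
    Returns the continuum estimate M(nu, 0) and the coefficient c_a(nu)."""
    c = (M1 - M2) / (a1**n - a2**n)
    return M1 - c * a1**n, c
```

Comparing the extrapolations for $n = 1$ and $n = 2$ gives a rough handle on how strongly the continuum estimate depends on the assumed power, which is the spirit of the comparison shown in Fig. 20.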

Finite Volume effects
Another potential pitfall in the study of PDFs in numerical lattice QCD arises from the non-local operators used in these studies. Numerical lattice QCD requires a finite volume, whose effect on local matrix elements is typically suppressed exponentially as $e^{-mL}$, where $L$ is the spatial extent of the lattice and $m$ is the mass of the lightest particle of the theory with the appropriate quantum numbers. This effect can be thought of as arising from a particle traveling from the operator across the boundary and back to the operator. Generally, the lightest such particle in QCD simulations with dynamical quarks is the pion, and lattices are designed with $m_\pi L$ sufficiently large that these effects are small. This picture for local matrix elements is modified by the finite size of the operator. In the case of a non-local operator of size $z$, the lightest particle does not have to travel the full distance $L$ to return to the operator, but only a distance $L - z$, leading to finite-volume effects of the form $C_L\, e^{-m(L-z)}$. The case of a two-current operator has been studied for a model of scalar "pions" and "nucleons" in [74]. That operator, $O(z) = J(z)J(0)$, has a periodic behavior under shifts of the lattice size, $z \to z + L$. This periodicity drives the significant finite-volume effects observed in [74], particularly for distances $z \sim L/2$. It is also possible that $C_L$ should be augmented by powers of $(L - z)$, as was found in [74]. For simplicity, these unknown powers are neglected in the following. In a proper study of the finite-volume effects, the functional form would need to be determined for the Wilson-line matrix element.
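To get a feel for the size of this effect, the suppression factor $e^{-m_\pi(L-z)}$ can be evaluated for the two volumes used here. The choice $z = 0.75$ fm below is a hypothetical representative separation, and powers of $(L-z)$ in the prefactor are neglected as in the text.

```python
import numpy as np

HBARC = 0.1975  # GeV * fm, to convert masses to inverse fm

def fv_factor(mpi_gev, L_fm, z_fm):
    """Leading finite-volume suppression exp(-m_pi (L - z)) for an
    operator of spatial extent z (prefactor powers of (L - z) neglected).
    Masses in GeV, lengths in fm."""
    return np.exp(-(mpi_gev / HBARC) * (L_fm - z_fm))

# Ratio of the suppression factors on the ~3 fm and ~4.5 fm boxes at a
# representative separation z = 0.75 fm, for m_pi ~ 415 MeV
ratio = fv_factor(0.415, 3.0, 0.75) / fv_factor(0.415, 4.5, 0.75)
```

Even at this heavy pion mass, the leading finite-volume effect on the smaller box is enhanced by roughly a factor of $e^{m_\pi \cdot 1.5\,{\rm fm}} \approx 20$ relative to the larger box, which is why the two-volume comparison below is informative despite both effects being small in absolute terms.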
The Wilson-line operator defining the pseudo-ITD does not have this periodic feature, but finite-volume effects can still be significant. The two ensembles with lattice spacing $a = 0.127$ fm, a127m415 and a127m415L, have spatial extents of approximately 3 fm and 4.5 fm respectively. Just as in the study of lattice-spacing effects, the reduced pseudo-ITD is fit to the form in Eq. (4.6). In Fig. 21, the reduced pseudo-ITDs calculated on both ensembles are compared. The data show only a slight sign of deviation between the two volumes, but the fits to the polynomial expression show clear deviations. The finite-volume effects appear to be small, particularly at small Ioffe time; the difference between the fit results is shown in Fig. 21. These differences are largest in the large Ioffe-time region, $\nu \gtrsim 5$, where the data originate from large $z$ and the matrix elements are least precise. The largest $z$ used on either of these lattices was $L/4$. At these heavy pion masses, the product $m_\pi L$ is fairly large for a lattice calculation; when performing a calculation at lighter pion mass, the significance of finite-volume effects should be rechecked.

Figure 21. The real components of the reduced pseudo-ITD from the ensembles a127m415 and a127m415L, with spatial extents of approximately 3 fm and 4.5 fm respectively. There appear to be slight finite-volume effects, whose difference is plotted on the right.

Conclusions
In this work we presented a detailed and systematic analysis of the extraction of nucleon PDFs based on the formalism of pseudo-PDFs. We employed lattice ensembles with $N_f = 2+1$ Wilson-clover fermions with stout smearing and a tree-level tadpole-improved Symanzik gauge action, at two values of the lattice spacing, $a = 0.127$ fm and $a = 0.094$ fm. While two values of the lattice spacing are not sufficient for stringent control of the continuum extrapolation, they do provide considerable information regarding the size and source of discretization errors in our formalism. In future studies, more lattice spacings should be used for a more robust extrapolation to the continuum limit, along with a study of the functional forms used to interpolate the $\nu$ dependence. Moreover, since toy-model calculations of PDFs employing formalisms based on spatially non-local operators could potentially suffer from enhanced finite-volume effects [74], we have also studied two different physical volumes at the coarser lattice spacing. Our studies revealed slight finite-volume effects for the parameter values investigated. The volume dependence appears to be larger for matrix elements coming from large separations, as expected from the toy-model calculation. For a proper study of finite-volume effects, the functional form for the Wilson-line matrix element will need to be determined. However, we believe the functional form used in our studies is well motivated if one assumes that finite-volume effects arise from pion exchanges wrapping around the periodic volume. As with the lattice-spacing analysis, the dependence of this discrepancy on the functional form used in the interpolation in $\nu$ should be studied in future work, together with additional volumes.
As was shown in [67], the matrix element extraction based on the Feynman-Hellmann theorem can begin at much earlier times, which is very advantageous for simulations at the physical pion mass. Additionally, we employed the method of momentum smearing, which has proven to substantially improve the overlap of the interpolating fields with the boosted hadron ground state.
Beyond the extraction of the $x$ dependence of the PDF, we also extracted the lowest two moments of the PDF and compared them to the pertinent phenomenological determinations from the CJ15, NNPDF, and MSTW collaborations. Our results lie above those of CJ15 and MSTW, as expected due to the relatively heavy masses of our pions, but agree within errors with NNPDF due to the larger error bars of the latter. As analyzed in detail in [39], complementary synergies between the global-fit and lattice QCD communities would be very fruitful in the forthcoming years. In that article, different scenarios of lattice data being included in the global-fit analyses were presented. In all cases, the conclusion was that a close collaboration of the two communities is necessary in order to achieve the best possible PDF extraction.
In this work, despite the relatively heavy pions, we have addressed many of the pertinent systematics of the extraction of light-cone PDFs with the method of pseudo-PDFs. Our studies show that the lattice community is steadily bringing all the systematics of these calculations under better control. In our forthcoming studies we plan to employ lattice ensembles at the physical pion mass, and to consider the pion PDF in addition to that of the nucleon.