Extraction of unpolarized transverse momentum distributions from fit of Drell-Yan data at N$^4$LL

We present the extraction of unpolarized transverse momentum dependent parton distributions functions (TMDPDFs) and Collins-Soper kernel from the fit of Drell-Yan and weak-vector boson production data. The TMDPDF are parameterized, as commonly done, using their (large transverse momentum) asymptotic matching to PDF. The analysis is done at the next-to-next-to-next-to-next-to leading logarithmic accuracy (N$^4$LL) (performed only approximately because PDF evolution is known so far at next-to-next-to leading order (NNLO)). The non-perturbative model used for TMDPDF is flavor dependent to reduce the colllinear PDF bias. The estimation of uncertainties is done with the replica method and, for the first time, it includes the propagation of uncertainties due to the collinear distributions.


Introduction
The formulation of the transverse-momentum dependent (TMD) factorization theorems for Drell-Yan (DY) and semi-inclusive deep inelastic scattering (SIDIS) about a decade ago [1][2][3][4] has driven an increasing effort towards the understanding the TMD distribution of partons within the nucleon.The cornerstone of this program is made up by the unpolarized transverse momentum-dependent parton distribution functions (TMDPDFs) because they have an impact on the determination of all the other TMD distributions.For this reason, the precise knowledge of unpolarized TMDPDFs is of utmost importance.In this work, we perform a global analysis of the vector boson production data and determine the unpolarized TMDPDFs.We include an extended dataset and we incorporate consistently the perturbative QCD information from the highest available orders.
The previous unpolarized TMDPDFs extractions [5][6][7][8][9] were based on the next-to-next-toleading logarithm (N 2 LL) resummation, and next-to-next-to-next-to-leading order (N 3 LO) of perturbative accuracy.The perturbative ingredients for this order were computed in refs.[10][11][12][13][14][15].High perturbative orders are needed to meet the precision of modern experiments, mainly ATLAS and CMS, at the LHC.The latest measurements of differential Z-boson-production cross-section at ATLAS [16] reach an extraordinary precision of ∼ 0.1%.The progress on the experimental side has stimulated the calculation of even higher perturbative orders in Quantum Chromodynamics (QCD), reaching very recently the three-loop accuracy for all terms of the factorized cross-section, and even higher for the anomalous dimensions [17][18][19][20][21][22][23][24][25].All this has prompted the explicit calculation of cross-sections, including next-to-leading logarithmic effects up to power four1 (N 4 LL).
The perturbative calculations, however, are just the tip of the iceberg.The TMD distributions, the heart of the TMD factorization theorem, are essentially non-perturbative functions.As such, they cannot be computed in perturbative QCD, but must be determined from the data.For this task we take advantage of a well-known relation between TMDPDFs and PDFs when the TMDPDFs are written in the transverse momentum conjugate variable b.Then, going to the asymptotic small-b limit, one has where x is Bjorken collinear momentum fraction, q j (x, µ) and f i (x, b; µ, ζ) are the collinear PDFs and TMDPDFs respectively.These distributions depend on the factorization and rapidity scales, µ and ζ.In the eq.(1.1), C i←j 's are the Wilson coefficient functions, the subscripts i and j label the flavors of partons and we have omitted power-suppressed corrections for simplicity.Implementing eq.(1.1) in a fit ensures a correct behavior of TMDPDFs in the collinear limit.Simultaneously, its usage propagates all the biases of collinear PDFs into the TMDPDFs, and extractions done using different collinear PDFs (keeping fixed all the rest) might show significant deviations from one another, up to the point of not overlapping the uncertainty bands.This bias is called PDFbias [26], and it represents one of the largest problems of modern TMD phenomenology.A way to mitigate the PDF-bias consists in taking into account the theoretical uncertainty of the collinear PDFs along with a flexible non-perturbative ansatz in the TMDPDF determination.The effects of PDF selection were discussed in ref. [8] and explicitly shown in ref. [26] using four sets of collinear proton PDFs.
The main goal of this work is to update the unpolarized TMDPDFs determined within the artemide-framework [5,7,8,27].For this reason, we name this extraction "ART23".Compared to the previous extraction [8] (SV19), ART23 is based on a larger dataset, which is greater by almost 50%.It includes the recently released data [16,[28][29][30][31][32] on the Z-boson production and, for the first time in the TMD phenomenology, the data on W-boson production [33,34].We increase the perturbative accuracy to N 4 LL and include the flavor dependence into the non-perturbative ansatz.The baseline collinear distribution that we use is the MSHT20 PDF set [35].We provide a critical and comprehensive treatment of uncertainties.For the first time, we consistently include the PDF uncertainty in the analysis along with the data uncertainty obtaining a more trustful result.
This article is organised as follows.We devote section 2 to the theoretical framework used for the computations.It includes the formulas for cross-sections, a brief description of the ζ-prescription for the TMD evolution employed by artemide [36], and the ansätze for the non-perturbative parts of the TMDPDFs and Collins-Soper (CS) kernel.In section 3 we discuss the experimental data and the selection criteria employed, while section 4 contains the details pertaining to the fitting procedure.The outcome of the fit can be found in section 5. Finally, we present our conclusions in section 6.The plots comparing the experimental data and theoretical predictions are presented in Appendix B. In Appendix A, we present the outcome of the same analysis using the NNPDF3.1 collinear PDFs [37] as baseline.

Theory overview
There are several implementations of the TMD factorization framework.All of them are based on the same evolution equations, but differ in the realization of the solution, which is not unique.Conceptually, all realisations produce the same final result [38,39].Practically, there are differences due to the truncation of the perturbative series.Also, the correlations between non-perturbative (NP) elements are different, which could affect the results of the extractions.In our fit we use the ζ-prescription [5,39], which eliminates (theoretically) the correlation between the CS kernel and the TMD distributions.

Cross-section in the TMD factorization
The DY lepton pair production is defined by the process with h 1 , h 2 the colliding hadrons, l, l ′ the final state leptons, and the symbols in parentheses denoting the momentum of each particle.In the following, we neglect both hadron and lepton masses, i.e., P 2 1 ≃ P 2 2 ≃ l 2 ≃ l ′2 ≃ 0, since the target-mass corrections are negligible at the typical energies of DY data.
The relevant kinematic variables in DY read where q + and q − are the components of q µ along P µ 1 and P µ 2 , correspondingly.The transverse components of vectors are projected by a tensor g µν T , orthogonal to P µ 1 and P µ 2 , 3) The transverse momentum of the exchanged boson is q 2 T = −q 2 T = g µν T q µ q ν .In the center-of-mass frame, the components of momenta are P + 1 = P − 2 = s/2, and the variables x 1,2 are respectively At leading power in the TMD factorization, the cross section of the DY process mediated by a neutral boson reads where M Z and Γ Z are the mass and decay width of the Z-boson, respectively.The summation index f runs over all active quarks and anti-quarks.The function W f f f1f1 describes the hadronic part of the process and is defined below in eq.(2.11).The function P 1 accounts for the modifications of the lepton phase space (fiducial cuts) and it is given in eq.(2.7).The factors z jk i are the combinations of Z and γ couplings to quarks and leptons : with s W and c W the sine and cosine of the Weinberg angle, respectively.Here z γZ f corresponds to the interference term in the product of amplitudes, and z γγ f and z ZZ f corresponds to the diagonal terms.
Some experiments do not correct the data for the detector acceptance of leptonic phase space.In these cases, the collaborations provide the description of the fiducial regions to be accounted for in the theoretical predictions.On the theory side, the fiducial cuts are part of the leptonic interactions described by the leptonic tensor, and do not interfere with the hadronic tensor.Due to this, the corrections for fiducial cuts can be incorporated exactly, as part of the integration over the phase-space volume of the lepton pair.In the fits that we present here, these corrections are collected in a factor P 1 defined as where E and E ′ are the energy components of the leptonic momenta l and l ′ , and θ(cuts) is the Heaviside step function defining the volume of integration.Typically, the cuts on the lepton pair are reported as where η and η ′ are the pseudo-rapidity of the leptons.The derivation of eq.(2.7) can be found in refs.[5,8].The factor P 1 is normalized such that it is equal to unity in the absence of cuts.The integral on the right hand side of eq.(2.7) is computed numerically.
In the case of W-boson production the formula eq.(2.5) changes to where M W and Γ W are the mass and the decay width of the W-boson, and with V f f ′ either an element of the Cabibbo-Kobayashi-Maskawa (CKM) matrix (for quarks) or unity (for leptons).Notice that for the data considered in this work it is not necessary to pass from the variable Q 2 to the square of the transverse mass m 2 T ; it is sufficient to know that Q 2 min > m 2 T and one can integrate on Q 2 > Q 2 min (see also ref. [43]).The hadronic function W f f ′ f1f1 is given by where f 1 is the unpolarized TMD distribution, C V is the hard coefficient function (that coincides with the vector form factor of the quark), and J 0 is the Bessel function of the first kind.The variable b is the Fourier conjugate of the transverse momentum q T , and µ H is the hard factorization scale.Finally, the argument ζ is the rapidity evolution scale, which is typical in TMD factorization [1][2][3][4][38][39][40][41][42] that prescribes Eq. (2.5) and eq.(2.9) are only the leading power terms of the TMD factorization theorem.The power corrections scale as q 2 T /Q 2 and Λ 2 QCD /Q 2 .Currently the theory of these corrections is   [44].The quark masses are taken as in the MSHT20 collinear PDF extraction [35].
in a developing stage, see e.g.refs.[42,45].Thus, in what follows, we consider only the data for which the power corrections are (presumably) negligible.
The values of the electroweak parameters and heavy-quark masses used in this work are reported in table 1.For the electroweak parameters we have taken the central values published in the Particle Data Group [44].We do not include their uncertainties, since they are smaller than other uncertainties involved.The strong coupling constant and the quark masses are taken from the PDF set that we use, that is MSHT20 [35] for the main fit.

TMD evolution and optimal TMD distributions
The scale evolution of TMDPDFs is essential to include high and low energy data in an unique theoretical frame.The TMD evolution equations are ) Note that these equations do not depend on the quark's flavor.This system of equations consists of a standard renormalization group equation, eq.(2.12), coming from the renormalization of ultraviolet (UV) divergences, and a rapidity evolution equation, eq.(2.13), specific of the TMD factorization that comes from the factorization of rapidity divergences.The function D(µ, b) is called the CS kernel and it is a non-perturbative (NP) function.The integrability condition (also known as Collins-Soper equation [46]) holds, where Γ cusp (µ) is the cusp anomalous dimension.Eq. (2.14) expresses the formal pathindependent evolution in the (µ, ζ)-plane.The TMD anomalous dimension is The perturbative expansion for γ F is (2.16) In the present work we use the five-loop Γ cusp [47] and four-loop γ V [48].
The selection of the initial evolution scale (i.e. the scale where the NP functions are extracted) is a key point.In our work, we use the initial scale associated with the ζ-prescription [8,39] so that the boundary conditions for the system in eq.(2.12, 2.13) are given by the optimal TMD distribution [39].This optimal TMDPDF is defined at the scale (µ, ζ µ (b)), which belongs to the special equi-potential (or null-evolution) line (of the evolution potential introduced in ref. [39]) defined by the equation (2.17) with boundary conditions For that reason, the optimal TMDPDF is exactly scale-independent (for any µ and b) and it is denoted without scales, Eq. (2.18) defines the (unique) saddle point (µ 0 , ζ 0 ) of the evolution potential.Due to it, the value of ζ µ is finite for any µ (bigger than Λ QCD ) and b.A TMDPDF at any other scale can be obtained evolving the optimal TMDPDF along the path µ = const., (2.20) The hard factorization scale can be arbitrary since the dependence on it cancels between factors in the factorized expression of eq.(2.11).Practically, we set it to in order to minimize logarithms in the hard coefficient function.Substituting eq.(2.20, 2.21) into the definition of the structure functions W f f1f1 we obtain, These are the final expressions used to extract the NP functions.
The most important feature of the ζ-prescription is that it exactly removes the correlation between the CS kernel and TMDPDF due to renormalization scales.The optimal TMDPDF does not depend on the CS kernel because it is determined exactly at the saddle point D = 0. Therefore, the CS kernel is treated as an independent NP function.Thus, the solution of eq.(2.17) with boundary conditions eq. (2.18) must be found for a generic D since it will change during the fitting procedure.This problem is solved in ref. [27].The corresponding solution for ζ µ (b) as a functional of D is denoted as The expression is rather lengthy, and can be found in ref. [8] at N 3 LO, and directly in the code of artemide [36] at N 4 LO.In contrast to SV19, in this work we use ζ exact µ without any modification at small values of b.

TMD distributions at small-b
As mentioned in sec. 1, in the regime of small-b the TMDPDF can be expressed via the collinear PDFs with the help of the operator product expansion (OPE).The relation between TMDPDF and PDFs reads where q(x, µ) is the unpolarized PDF, the label f ′ runs over all active quarks, anti-quarks and gluon, and with γ E being the Euler constant.At LO the coefficient function reads C [0] f ←f ′ = δ(1 − x)δ f f ′ and the higher order terms are known up to N 3 LO [11,13,21,49].In the ζ-prescription, the expressions of the coefficient functions are different from those presented in refs.[11,13,21,49], e.g.all double-logarithm contributions disappear.Up to N 3 LO the expressions are f ←k ⊗ P (1) f ←k ⊗ P (1) f ←k ⊗ P (1) f ←k ⊗ P (2) f ←k ⊗ P (1) (1,0) k←f ′ + C (2,0) where the symbol ⊗ denotes the Mellin convolution, a summation over the intermediate flavour index k is to be understood, and we have omitted the argument x of the C f ←f ′ on the r.h.s. for brevity.The functions P (n) (x) are the coefficients of the DGLAP kernel, P (x) = n a n s P (n) (x) and, up to three-loops, they can be found in ref. [50].The functions C (n,0) f ←f ′ (x) are the finite parts of the coefficient functions given in refs.[11,13,21,49].In particular, the NLO terms are with x = 1 − x.
The OPE has an internal renormalization scale, µ OPE , which is not connected to the scales of the TMD evolution, as it happens f.i. in the case of the b * -prescription [2,9].Therefore, the expansion in eq.(2.23) is independent of µ OPE , and its value can be conveniently chosen such that it minimizes the logarithmic contributions at b → 0 and, at the same time, it avoids the Landau pole at large-b.We have decided to use the same value for µ OPE as in the SV19 extraction, i.e, The choice of the large-b offset of µ OPE as 2 GeV is motivated by a typical reference scale for PDFs (and lattice calculations).We remark that the factorization of the cross-section with TMD distributions is superior to a particular realization of the TMD distributions in terms of PDFs.Therefore, the actual choice of µ OPE is a part of a TMDPDF modeling which (in the present case) includes the asymptotic collinear limit.Any modifications of µ OPE would be absorbed by the NP parameters.
The CS kernel is an independent NP function, defined by the vacuum matrix element of a certain Wilson loop [51].Analogously to TMDPDFs, the CS kernel can be computed at small values of b using the OPE.The leading power expression has the form where the explicit expressions for d (n,k) are given in [14,15,41] at N 3 LO, and in [24,25] at N 4 LO.The power corrections to eq. (2.30) have been computed in ref. [51] and they are proportional to the gluon condensate.The coefficients of OPE and the values of the anomalous dimensions depend on the number of active quark flavors N f .To treat this number we use the (naive) variable flavor number scheme, which sets N f = 3 for µ < m c , N f = 4 for m c < µ < m b , and N f = 5 for µ > m b .In this scheme, the evolution integrals are smooth, whereas the coefficient functions have discontinuities (steps) at some b values, those corresponding to the thresholds of µ OPE (b).These discontinuities produce tiny oscillations in W f1f1 after the Fourier transform.The input PDF distribution that we choose is the MSHT20 extraction ref. [35], which uses a similar scheme.We noticed that it is important to set our threshold parameters identical to those used in MSHT20, so that the oscillations are reduced to a negligible ∼ 0.01%.The values of the threshold masses are reported in tab. 1.

Models for TMD distributions and CS kernel
We use the following phenomenological ansatz for our optimal unpolarized TMDPDFs: where the functions f f NP accumulate the effect of power corrections to the small-b matching.To satisfy the general structure of OPE [52], f f NP must be a function of b 2 and behave as NP must decay at large b to ensure the convergence of the Hankel transformation.Note that, in the ansatz of eq.(2.31), the logarithm of b in the coefficient function grows unrestricted at large-b (the so-called "local ln(b)"-setup).
There is a large freedom in the definition of the functions f f NP .The main criterion for their construction is to have the maximum flexibility with the smallest number of free parameters.From our experience in previous extractions we deduce that the optimal b-profile is the one with an exponential decay at b → ∞ and Gaussian behavior at intermediate b [5,7,8].The x-profile should distinguish large and small-x contributions [6][7][8][9].After several attempts we opted for the following functional form where λ f 1,2 are free parameters.In the present fit, we distinguish {u, d, ū, d, sea} flavors, where sea stands for {s, s, c, c, b, b}-quarks.This decomposition is suggested by the data as they do not allow for the flavor separation of sea quarks yet.In total we have 10 free parameters, The novel feature of the present model is the flavor dependence.In previous determinations of unpolarized TMD distributions f NP was chosen to be flavor-independent, which led to a number of undesirable effects, see ref. [26] for a discussion.First of all, the extraction of the TMD distribution appeared to be strongly dependent on the choice of the collinear PDF, and often a choice of f NP valid for one PDF set was not successful for another (we call this effect "PDF bias").Secondly, the uncertainties of f NP were essentially underestimated.The inclusion of flavor dependence significantly reduces these problems.Additionally, the functional form used for each flavor f in eq.(2.32) is much simpler in comparison to SV19 [8] or MAP22 [9].Note that the TMD flavor dependence were also studied in ref. [53,54].
The CS kernel reads where D small-b is given in eq.(2.30), and D NP provides the rest of the NP terms.The term with the integral in eq.(2.33) performs the evolution of the CS kernel2 from the scale µ * to the scale µ.Therefore, generally, eq.(2.33) does not depend on µ * , apart from the truncation of the perturbative series.The functions b * and µ * are with a free parameter B NP .This definition implies that L µ * (b * ) = 0. Analogously to f NP , the NP part of the CS kernel must be a function of b 2 to support the structure of the OPE.At large b the CS kernel must be positive (to guarantee the convergence of the Hankel transform in eq.(2.22)), and not grow faster than (b 2 ) 1/2−δ with δ ⩾ 0 [51].The expression for D NP generalizes the one used in SV19 including logarithmic corrections, where c 0,1 > 0. One can easily identify three free parameters in our ansatz for the CS kernel, namely, {B NP , c 0 , c 1 }.At large-b, the logarithmic term vanishes and the expression for the CS kernel becomes linear in b: The term proportional to c 1 simulates the logarithmic dependence of the power corrections, and gives an extra flexibility to the ansatz at b ∼ B NP .In preliminary studies, we have found that such a correction provides a better agreement with data in comparison to other models.This fact conveys the important message that both theory and experiment have achieved a degree of precision at which these effects become measurable.

Definition of perturbative order and scale variation uncertainties
In the factorized cross section defined above, we encounter three perturbative inputs and associated scales: • The perturbative hard coefficient function C V , and associated hard factorization scale µ, that separates C V and the TMD distributions in eq.(2.22).
• The coefficient function of the small-b operator product expansion for TMDPDF C [n] f ←f ′ and the associated scale µ OPE in eq.(2.23).
• The small-b expansion for the CS kernel D small-b and the associated scale µ * in eq.(2.33).
Thanks to the ζ-prescription, each perturbative series can be truncated irrespectively of the perturbative orders included in the others.
In this work, we use the highest known orders for all perturbative ingredients: the N 4 LO (fourloop, ∼ a 4 s ) hard coefficient function C V [58], the N 3 LO (four-loop, ∼ a 4 s ) light-like-quark anomalous dimension γ V [48], the N 3 LO (four-loop, ∼ a 4 s ) expression for the CS kernel D small-b [24,25], and the N 3 LO (three-loop, ∼ a 3 s ) expression for the matching coefficient functions C f →f ′ [21,49].The cusp anomalous dimension is taken at order (∼ a 5 s ) [47] (the expression in ref. [47] is approximate, we consider the central value).
The input collinear PDFs, MSHT20 [35] (and NNPDF3.1 [37] discussed in appendix A), were obtained at NNLO, which implies the usage of the NNLO evolution kernel P (3) .As a result, the  logarithms included in the N 3 LO small-b coefficient functions are entirely compensated by the PDF evolution.The value and evolution of a s is provided together with the collinear PDF.The orders of the anomalous dimensions and coefficients functions are adjusted to each other, such that the scale-dependence is canceled at a given perturbative order.In the resummation nomenclature this combination of orders is referred to as N 4 LL [6,56] (or N 4 LL − in [9], or, here, approximate N 4 LL).The summary of the perturbative orders is also given in tab. 2.
To define the scale-variation band we multiply each scale with an independent factor Summary of the perturbative orders used for each part of the factorized cross section.The evolution of αs is provided by the LHAPDF library and comes together with the PDF set (uniformly NNLO).
-10 -2, 3, 4), defined by the rule The labels of parameters s i follow the enumeration used in ref. [59].The rule for µ OPE is designed such that the variation of scale does not impact the NP large-b part of the TMD distribution.As customary, each s i is allowed to vary by a factor 2, i.e. they can take the values {0.5, 1, 2}.In total, there are 27 combinations of {s 2 , s 3 , s 4 }.For each one of them we compute the cross-section dσ i .The size of the variation band is defined as the maximum (symmetric) deviation from the central value, i.e. ∆dσ = max (|dσ i − dσ| i=1...27 ) . (2.37) In fig. 1 we compare the sizes of variation bands for a representative sample of high and low energy experiments for four consecutive orders, while using the same values of the NP parameter and the same PDFs.In fig. 2 the same comparison is done for the cross-sections.As expected, we observe that the size of the bands reduces with the increasing of the perturbative order.Each next-order curve is inside the variation band of the previous one.We also notice that for q T > 5 GeV the curves are close to each other.It indicates that convergence of the perturbative series is better than what one could estimate from the variation bands and that the rule in eq.(2.37) is too conservative.
For small values of transverse momentum (q T < 5 GeV), the dominant contribution to the variation band arises from the factor s 4 .However, in this regime the scale variation band is still not significant, as the effects of non-perturbative parameters override those of the perturbation theory.For q T > 5 GeV, the variation band is largely determined by the variation of s 3 , and in this range, it remains nearly constant.For the ATLAS experiment at √ s = 13 TeV, the mean value of the variation band for 5 GeV < q T < 25 GeV is 1.3%.At low energies, the variation band remains large (around 10%) even at the N 4 LL level, but this is not a problem because in this range the theory prediction is largely non-perturbative.
We also note that the oscillations happen in the variation band at q T > 10 GeV.Studying this effect in detail, we have concluded that it is connected to our implementation of the flavor variable number mass scheme in a complex way.Basically, the discontinuities in the shapes of the distributions (due to the quark mass thresholds) change positions with the variation of the parameters s i .These discontinuities generate tiny oscillations in the cross-section.For a natural selection of scales (that minimizes logarithms), the discontinuities (and hence the oscillations) are negligible, but some combinations of s i 's are especially oscillatory since they generate many discontinuities.Therefore, the final shapes in fig. 1 are generated by the overlapping of the 27 curves with small oscillations in each one.We will address this problem in future works.

Summary of inputs from theory
The cross-section of the vector boson production is computed with eq.(2.5) and eq.(2.9), for neutral and charged EW boson, respectively.The structure function W f1f1 is evaluated in the TMD factorization framework and given in eq.(2.22).The evolution of TMDPDFs is computed using the ζ-prescription, and the phenomenological ansätze for the optimal unpolarized TMDPDFs and CS kernel are defined in eq.(2.31) and eq.(2.35).At small-b the TMDPDFs are matched to the unpolarized PDFs, for which we use the MSHT20 extraction at NNLO [35].In tab. 2 we list the perturbative orders used in each factor of the approximated N 4 LL cross section.In total, there are 13 free parameters to be determined by the fitting procedure: 3 of these describe the CS kernel, while the remaining 10 are for the unpolarized TMDPDFs.

Overview of data
The factorization formulas, eq.(2.5) and eq.(2.9), are valid at small values of q 2 T /Q 2 .This restriction has been studied from the phenomenological point of view in refs.[5,6,8].The common conclusion is that, for q T /Q < 0.25, the power corrections remain at the level of 1%, and therefore the data can be safely included in phenomenological extractions.Above this threshold, the deviation between the theory and the measurements grows.However, such a rule does not work for data with precision of the order of (or better than) 1%.In this case, the power corrections significantly affect the quality of the description, despite being numerically small.
In this work, we use the same general strategy for selecting the data as in refs.[7,8].Namely, we only include in our fit a data point if it fulfils the conditions ⟨q T ⟩ ⟨Q⟩ ≡ δ < 0.25 and where ⟨q T ⟩ and ⟨Q⟩ are the average values of q T and Q for the bin, and σ is the relative uncorrelated uncertainty.The second condition is actually needed only for the high-energy data, as it is satisfied by all the data from the lower energy experiments with Q < 40 GeV.The selection rules of eq.(3.1) allow us to keep control of the predictive power of the theory, and still incorporate a large amount of data into the fit procedure.They are slightly softer than the rules used in refs.[7,8], because we plainly include all data with ⟨q T ⟩ < 10 GeV.The bulk of the data considered here has already been used in previous extractions, such asin refs.[5][6][7][8][9].This includes the fixed-target E288, E605, E772 experiments from FermiLab (263 data points) [60][61][62], the Z-boson production data from the CDF and D0 experiments at Tevatron (107 data points) [63-67], and the LHC run-1 and run-2 measurements of Z-boson production by the ATLAS, CMS, and LHCb collaborations (75 data points) [68][69][70][71][72]. Since these datasets are wellknown and have been well-studied in the past, we refer the reader to refs.[5,6,8,9] for a detailed discussion on their properties.In addition to these, we have included the latest measurements done at RHIC [28,73] and the LHC [16,[29][30][31][32], and the W-boson production data from Tevatron [33,34].As we consider these data in the framework of TMD factorization for the first time, we find it worthwhile to highlight the particularities of each set in the following lines.
The PHENIX data [73] were taken at √ s = 200 GeV, which restricts the Q range (⟨Q⟩ = 7 GeV).It is the only modern DY measurement at low energy presently available, and has already been studied within TMD factorization in refs.[6,8,9].The Z/γ-boson production measurement at STAR [28] was made at moderately high energy ( √ s = 510 GeV) during the 2018-2020 runs and the final results are currently in preparation for publication.Here we used the preliminary data.
In the present fit we include the recent y-differential measurements of Z-boson production at CMS [29] and LHCb [32], at √ s = 13 TeV.These replace the corresponding integrated measurements used in [8].We also include the more precise (∼ 0.1% uncertainty) measurement of the Z-boson differential cross-section by ATLAS [16].Finally, we include the high-Q neutral-boson production measurements by the CMS collaboration [31].This dataset is unique, since it spans up to Q = 1 TeV (in several bins).For the interpretation of these data one should take into account that the bin-migration effects due to final state radiation are not incorporated in the published tables 3 .Accounting for these effects is critical to confirm the agreement between theory and measurement.Nevertheless, we are not able to describe the lowest lying bin 50 GeV< Q < 76 GeV, for which we encountered a large difference in the normalization, and therefore, we exclude this bin.We have also excluded the 76 GeV< Q < 106 GeV bin, using instead the y−differential measurement from the same run [29].For the first time in TMD phenomenology, we include W-boson production data [33,34].Generally, the description of this observable is problematic within the TMD factorization framework because, usually, the data are integrated over a wide kinematic range, including regions where the TMD factorization conditions are not fulfilled.For a detailed discussion of this issue, see [43].While the measurements [33,34] are fully integrated in Q, an explicit restriction on the transverse energy of the electron and neutrino (the missed transverse energy) was also imposed.This permits to find the lowest limit for Q.We have restricted the upper limit of integration to 300 GeV, since the contribution of higher Q provides a negligible correction.To estimate the cut rules of eq.(3.1) for these data we used ⟨Q⟩ = M W .
In total, the present analysis includes 627 data points, summarized in tab. 3. The kinematical coverage of the datasets in the (x, Q)-plane is shown in fig. 3.All the new data (w.r.t.[6][7][8]) are at high energy.In fact, the new dataset totally supersedes the previous ones in both number of points (e.g. the present fit includes 227 points from LHC, vs. 80 in [8]) and precision.Therefore, the present selection allows for a more precise determination of the CS kernel (due to increased span in Q) and provides a finer flavor separation due to the W-boson measurements.

Fit procedure
Comparing the theoretical predictions with the data, we are able to restrict the free parameters of our ansatz for the TMD distributions, and in this way determine its NP part.This procedure is standard, and the present implementation is generally the same as the ones used in refs.[5][6][7][8][9].However, in the present fit we treat the uncertainties more accurately and the PDF uncertainties are taken into account.The details of the procedure are reviewed in this section.*Bins with 9 ≲ Q ≲ 11 are omitted due to the Υ resonance.
Table 3. Summary table of the data included in the fit.For each dataset we report: the reference, the centre-of-mass energy, the coverage in Q and y/xF , the cuts on the fiducial region (if any), and the number of data points that survive after the cut of eq.(3.1).

Treatment of the experimental data
In the measurement of the cross-sections in an experiment there are several features that should be treated accurately in order to achieve a better consistency.We point them out one-by-one below.Bin integration.The expression for the cross-section in eq.(2.9) is given for a single point in the (Q 2 , y, q T )-space.On the experimental side, the measurement is provided for a volume-element of phase space, which can be obtained by averaging the theoretical expression . Here, the Q max/min , y max/min , and q T max/min are the boundaries of the phase-space volume (bin).These integrations could not be simplified analytically, since the expression for the cross-section is too involved (especially in the presence of fiducial cuts).Accounting for the effect of finite bin size is of crucial importance, since for most of the experiments dσ| th.changes significantly and nonlinearly within the bin (an explicit study of this can be found in ref. [8]).It should also be taken into account that some experiments provide integrated (rather than averaged) data.For example, the LHCb measurements in refs.[71,72] are given for ∆σ, that is, the bin integrated cross-section without any weighting factors.
Normalization.In a few cases the measurement is normalized to the total cross-section, i.e.
This practice helps to reduce the normalization uncertainty of the measurement.Our formalism does not allow the computation of the weighting factor, since it includes values of q T beyond the factorization range.In these cases we adopt the following procedure [5,7,8].We compute i.e. we normalize to the area of included data points.Note, that the normalization is done after the bin-integration.The numerator of N tells which part of the cross-section is included into the fit, while the denominator gives the theoretical estimate of normalization.If the shape of data is perfectly described by the theory, the factor N is independent on the number of included bins.Due to it, we expect that the deviation of the normalization factor computed with the full data set (in a framework that describes also large-q T data) from eq. (4.3) is minimal and it cannot significantly impact the results of the fit.The normalization procedure reduces the amount of information which can be gained from the data and it also potentially introduces an unknown uncertainty due to eq. (4.3); therefore, we use unnormalized data whenever available.Only the following datasets require a normalization factor: D0 (run2) [66, 67], ATLAS (13 TeV) [16], and W-boson measurements [33,34].The normalization factor is computed independently for each new set of values of the NP parameters (including the PDF replicas), and for each replica of the data, which provides the proper propagation of the fitting and data uncertainties.
Nuclear effects.The fixed target experiments are done on nuclear targets (Cu for E288 and E605 [60,61], and 2 H for E772 [62]).To simulate the nuclear environment we perform the iso-spin rotation of the TMD distribution.Namely, we set ) where A is the atomic number and Z is the charge of a nuclear target.The effects of heavier mass in the TMD distribution can be ignored since they are compensated by the scaling of the momentum fraction, as it has been shown in ref. [52].We do not include any finer modifications, since the fixed target data are not precise enough to distinguish them.
Artemide.The computation of the theory prediction, as well as all integrals and factors required for comparison with the data, are done in artemide, a multi-purpose code for the phenomenology of TMD factorization.It is based on the ζ-prescription, which allows for many simplifications of the code and improves the speed of the computation.In particular, the computation of the theory prediction for the full dataset (627 points) with all integrations required for comparison with experiment consumes 20-30 seconds on the average desktop (12 cores processor) depending on the NP input.The code of artemide is written in FORTRAN95.It is open-source and available at [36].The values of α s and the collinear PDFs are obtained from the LHAPDF interface [74].
The legacy of artemide is established in many previous global fits, including fits of various unpolarized [5,7,8,26,27,43,75], and polarized [76][77][78] observables.For the present work we have updated it with the expressions for N 3 LO coefficient functions and N 3 LO anomalous dimensions, without further modifications of the internal structure of the code.

Definition of the χ 2 -test function and related quantities
The agreement of the theory prediction and the data is quantified by the χ 2 -test function.We use the standard definition adopted from the fits of collinear PDFs in refs.[79,80], to which we refer the reader for a detailed discussion.The χ 2 -test function is defined as where i and j run over all data points included into the fit, m i and t i are the experimental value and theoretical prediction for point i, respectively, and V −1 ij is the inverse of the covariance matrix.The covariance matrix is defined as where ∆ i,uncorr.is the uncorrelated uncertainty of the measurement (if there are more than one, they are summed in squares), and ∆ (l) i,corr.is the l-th correlated uncertainty.The normalization uncertainty (due to the luminosity) is included in the χ 2 as one of the correlated uncertainties.This definition of the covariance matrix and the χ 2 -test function takes into account the nature of the experimental uncertainties and also provides a faithful estimate of the agreement between data and theoretical predictions.
Additionally, this definition of χ 2 allows for an one-to-one separation of correlated and uncorrelated parts [80] for each correlated dataset.Therefore, where χ 2 D incorporates the contribution due to the uncorrelated uncertainties of the measurement, while χ 2 λ is the rest.The definitions of χ 2 D and χ 2 λ are with λ and t defined below.Loosely speaking, χ 2 D (χ 2 λ ) shows the agreement in the shape (normalization) between the theory prediction and the experimental measurement.
To perform this decomposition, one should compute the "nuisance parameters" λ l (l enumerates the number of correlated uncertainties for the given dataset) [7,80] for a given theory prediction.Then the theory prediction is decomposed as The terms d i are interpreted as correlated shifts in the predictions which generate the χ 2 λ contribution.The terms ti represent the part of the theory curve that contributes solely to χ 2 D , and thus has a "perfect" normalization.
This decomposition is often useful for analysis and visualization because the correlated (normalization) uncertainty is generally much larger than the uncorrelated one.Hence, in plots comparing with data, we show the ti part of the prediction.This allows us to visually confirm the "good" values of χ 2 .The average value of correlated shifts relative to the cross-section ⟨d i /σ⟩ is presented in tab. 4. It exhibits the general disagreement in the normalization between the predictions of TMD factorization and the measurements.
The computation of the χ 2 value and further manipulations are performed with the DataProcessor library, which is written in PYTHON and interfaced to artemide via the standard f2py library.The code of DataProcessor, together with the collection of the experimental data points, and all programs used for the present work, can be found in [81].

Minimisation procedure and uncertainty estimation
The ansatz for the TMD distributions contains in total 13 parameters, which we denote as To find the optimal values of − → λ we minimize the χ 2 using the library iMinuit [82].The resulting value − → λ center is called central value fit.The central value fit is used only as an initial assumption for all further minimization procedures described below.
The propagation of initial uncertainties to extracted values of the TMD distributions is both the central component and the most time-consuming step in the computation.We employ the resampling method to perform this task, generating samples of setups distributed according to the initial uncertainties.We distinguish two sources of uncertainties: • Experimental uncertainties.These are uncertainties due to experimental measurements (statistical, systematic, etc.).To propagate these uncertainties we generate replicas of pseudodata.A replica consisting of pseudo-data is obtained by adding Gaussian noise to the values of the data points.The parameters of the noise are dictated by the correlated and uncorrelated experimental uncertainties.Here, one should account for the nature of the correlated uncertainty in order to avoid the so-called D'Agostini bias [83].The detailed algorithm for the generation of pseudo-data is given in ref. [79].
• Uncertainty in collinear PDFs.To propagate these uncertainties we use the Monte-Carlo sampling of the PDF distributions.The MSHT PDF set is given with (asymmetric) Hessian uncertainties, and the Monte-Carlo samples are generated according to the prescription given in ref. [84].
Other sources of uncertainties (such as uncertainties in M Z or α s , uncertainties due to missed higher perturbative orders, etc) are considered negligible.
The initial uncertainties included in the analysis originate from different sources, and it is not always clear how they should be combined.This is because the propagation mechanism for each uncertainty differs.The experimental uncertainties modify the expression of χ 2 (by changing V and m), while the PDF uncertainty changes the theory expression (by changing the boundary value of the TMD distribution).In this work, we consider them on the same foot and generate samples by varying data and PDF simultaneously.Here, the PDF replica is randomly selected from the pre-generated sample 4 , and thus could be present in the final ensemble several times.This method of uncertainty propagation is more accurate than any used before.For example, in the method of ref. [26] both types of uncertainties were sampled independently and then combined into a single uncertainty band; in refs.[6,9] the PDF uncertainty was introduced into the definition of χ 2 .
For each setup sample, we minimize the χ 2 -function and find a set of parameters − → λ , which defines the minimum.This procedure is repeated 1000 times, giving us an ensemble {Λ i = ( − → λ i , n i )}, where i = 1, ..., 1000 and n i is the serial number of the PDF replica used in the sample setup.The PDF replica's serial numbers must be preserved since the values of − → λ are essentially correlated with it.The ensemble of Λ's entirely describes the TMD distributions and the CS kernel.This ensemble is used in all further manipulations.The list of values of Λ i can be found in the artemide repository [36], in the (human-readable) format suitable for automatic processing by the DataProcessor.
Using the ensemble Λ, we can find the mean values of the parameters ⟩ associated with the central PDF replica (since the averaging of PDF replicas produces the central one by definition).The distribution is not entirely Gaussian, and this uncertainty band is associated with the 68%-confidence interval (68%CI).The boundaries of the 68%CI are computed using the resampling method by computing the 16% and 84% quantiles.
The central values and uncertainty bands for secondary values, such as TMD distributions, cross-sections, χ 2 , etc., are computed starting from the ensemble Λ.For example, to obtain the TMD distribution we compute the ensemble The central value is then the mean ⟨F i ⟩ and the uncertainty band is 68%CI of due to the correlations in-between members of Λ i .The procedure described above allows to propagate all correlations correctly.
This work presents a comprehensive analysis of error propagation in TMD phenomenology, which is the first of its kind.The proposed procedure is expected to reduce the dependence on PDFs as input parameters.However, the approach comes at the cost of increased computational complexity.In the present work, we use the MSHT20 PDF set [35], which we present as the main result.To cross check we also made an independent run (with 300 replicas) with the NNPDF3.1 PDF set [37].The results of this run are given in appendix A.

Results
In this section we present the results of the fitting procedure, starting with the quality of the data description, and finishing with the presentation of the extracted TMD distributions and CS kernel.

Quality of data description
We find that the current setup perfectly describes the data.The central value fit results in χ 2 /N pt = 0.93.For the mean prediction (i.e.⟨dσ[Λ i ]⟩), χ 2 /N pt = 0.957, with the 68%CI (0.950, 1.048).The histogram of χ 2 /N pt is given in fig. 4. The complete list of the χ 2 -values for all datasets is presented in tab. 4. In comparison to the SV19 fit [8] we observe an overall improvement in the χ 2 , which is especially significant for the description of the LHC data (χ 2 LHC /N pt = 1.26 +0.76 −0.15 with N pt = 230), and the low-energy DY data (χ 2 low /N pt = 0.50 +0.09 −0.03 with N pt = 266).Similarly to SV19, we observe that the low-energy DY data suffer of deficits in the normalization.This is a known feature of TMD factorization (see e.g. the extended discussions in refs.[8,27]).Given that the data have very large normalization uncertainties, these deficits do not significantly impact the value of χ 2 ; therefore it is not clear at the moment whether the problem arises from a shortcoming of the theory or of the measurements.Let us also mention that the PHENIX measurement (⟨Q⟩ = 7 GeV) does not show any problem with the normalization.
In fig. 5 we present the comparison of theory vs. ATLAS 13 TeV measurement, which is the most precise measurement at our disposal (with uncorrelated uncertainties < 0.5%).In this plot one can see that TMD factorization works up to q T ≃ 0.2 Q (even if in this particular case only data up to q T = 10 GeV are included into the fit).At larger q T , the theory prediction is systematically lower than the measurement: this is a signal of the necessity for power corrections.The full collection of data plots is given in appendix B.
We emphasize that the uncertainty band obtained in this fit is larger than in previous analyses with artemide [7,8] because of the PDF uncertainty evaluated in the present work.Also, since we cannot control the PDF uncertainty, the band is often larger than the uncertainty of the measurement.It indicates that a simultaneous extraction of PDF and TMD distributions will reduce the uncertainties of both.We also have observed that most part of our uncertainty band is correlated.To determine the size of the correlation we computed the covariance matrix for the full set of replicas, cov(dσ(Λ i ), dσ(Λ j )) and fitted it with the form of eq.(4.6), and determined ∆ th.uncorr and ∆ th.corr for the theory prediction.We found that the portions of correlated and uncorrelated parts depend on x.Generally, the higher the value of x, the larger the uncorrelated part of the band.For example, for the Z-boson production at √ s = 13 TeV ⟨∆ th.uncorr ⟩ ≃ 8.5% at |y| < 0.4, ⟨∆ th.uncorr ⟩ ≃ 15% at 2.0 < y < 2.4, and ⟨∆ th.uncorr ⟩ ≃ 48% at 4 < |y| < 4.5.
The values of the cross-sections predicted by the TMD factorization are somewhat lower than the experimental ones, see the last column in table 4.This is a known feature and has been observed in many analyses (see e.g.refs.[6][7][8][9]27]).The discrepancy is of the order of a few percent for LHC energies (increasing for larger rapidity bins), and of the order of ∼ 40% for fixed-target experiments.These discrepancies are within the published correlated uncertainties, and thus result in a minor increase of the χ 2 .The corresponding contribution is provided in table 4 (column labeled χ 2 λ /N pt ).Theoretically, there could be several sources of disagreement in normalization, f.i.power corrections, see for instance ref. [85].
Few experiments provide the normalized cross-section, which we match using the procedure dataset  We present these estimations in table 5, together with the experimental measurement (if available in the publication), and the theoretical prediction by DYNNLO [86,87] (done with MSHT20 collinear PDFs).Clearly, the results of DYNNLO agree within errors to the ones determined in our simplified procedure, demonstrating the consistency of the approach.

Collins-Soper kernel
The plot of the CS kernel is presented in fig.6.The values of parameters that we obtain are B NP = 1.56The parameters B NP and c 0 are compatible with the ones extracted in SV19.In particular, for the MMHT14 PDF (predecessor of MSHT20) SV19 found B NP = 1.55±0.29 and c 0 = (4.7±1.47)•10−2 .Nonetheless, the shape of the distributions changes significantly due to the new logarithmic term ∼ c 1 in the ansatz of eq.(2.35).This term modifies the shape of the distribution at b ∼ 1 − 3 GeV, leaving the large-b asymptotic behaviour untouched.Removing this parameter, the fit shows worse values of χ 2 , χ 2 /N pt = 1.51 (for central replicas of PDF and the data).The general shape and value of the CS kernel are in good agreement with the MAP22 determination, as can be seen in fig.6, despite the differences in the two codes.
A significative feature of the current fit is that the size of the uncertainty band for the CS kernel is reduced, in contrast to that of the TMD distribution itself.This is correct, since in the ζ-prescription the CS kernel is exactly decorrelated from the TMDPDF (on the theory side), and the quality of the data has increased.We also notice that the parameter c 1 is clearly non-zero, which indicates the presence of a non-negligible logarithmic behaviour in the next-to-leading power term of the small-b expansion of the CS kernel.GeV.Different lines correspond to the independent extractions CASCADE [88], SV19 [8], MAP22 [9], and ART23 (this work).

Unpolarized TMD distribution
The values of the TMDPDF parameters extracted in the fit are Most of these parameters have reasonable sizes, and they agree (within uncertainty) with similar ones found in ref. [26].However, the parameters λ d 1 and λ sea 2 show some problematic behaviour.Namely, they almost vanish at their lower boundary.For negligible values of λ's the b−profile of the corresponding TMDPDF flattens.This is a clearly non-physical behavior, which results in disturbed shapes of the uncertainty bands for d and sea flavors at large-b.Simultaneously, it does not produce any problem in the prediction for the cross-section, since the TMDPDFs contributes in products with the evolution factors.It merely indicates that the present observables/data are not restrictive enough for these flavor combinations.
The shapes of the TMDPDFs are shown in fig.7 for u and d quarks (other flavors show similar behaviour).The sizes of the uncertainty bands are shown in fig.8. Generally, the uncertainty bands are increased by an order of magnitude in comparison to the SV19 fit, and grow faster with the increase of b.This is the result of the PDF uncertainties, which to account for the PDF-bias and allows for a more realistic uncertainty estimation.The x-shape of the uncertainties has become more involved.Their minimum is at x ∼ 10 −2 , where more precise data are located.The sizes of quark-and anti-quark uncertainties are compatible, because most part of the data depends on the product f 1q f 1q that does not distinguish between quarks and anti-quarks.

Impact of inclusion the PDF uncertainty
The inclusion of PDF uncertainties in an analysis of TMDPDFs inflates the uncertainty on TMD-PDFs themselves by almost an order of magnitude.This feature has been already analysed in ref. [26] using the SV19 setup as baseline.In this subsection we present the impact of PDF uncertainty to our analysis.To determine this impact we have performed an independent fit of the data including into the fit procedure only experimental uncertainties (see sec.4.3).We have observed that the central values of parameters are largely unaffected, while the uncertainty band is essentially reduced.The quality of the data description remains high as χ 2 /N pt = 0.95 +0.008 −0.006 .Note that the uncertainty band here is reduced by an order of magnitude, which already tells that the largest part of the replica distribution is due to the PDF uncertainty rather than due to the data.
The values of the NP parameters obtained in this fit are No PDF uncertainty B NP = 1.51 Comparing these parameters to eq. (5.2, 5.3) we see that they stay within the uncertainty band of the full analysis, and in most cases have practically the same central value.The uncertainty band is reduced by a factor 2-10.There are two parameters (λ u 2 and λ ū 1 ) whose central value shifted more in comparison to the complete fit.It possibly indicates that these parts of PDFs have some tension with TMD data.The comparison of shapes is presented in fig.8.
The theoretical uncertainty in the cross-section also shrinks by an order of magnitude.The central value almost coincides with the central line for the full fit, although in some cases there are deviations in the low-q T bins.An example is shown in fig. 5.

Conclusions
The present extraction of unpolarized TMDPDF from the global fit of Drell-Yan data (refereed as ART23) represents a significant step forward in comparison to previous analyses.The main improvements are a higher order perturbative input (which reaches N 4 LL with NNLO evolution for the collinear PDFs), a consistent treatment of PDF uncertainties in the error analysis, and the inclusion of additional data.We also use the flavor dependent form of the fitting ansatz for TMDPDF.The introduction of a flavor dependence reduces the sensitivity to the choice of PDF sets, as observed in [26].Furthermore, the newly incorporated non-perturbative logarithmic dependence of the CS evolution kernel, eq.(2.35), plays a crucial role in achieving a successful fit.
We also find it particularly interesting that several groups find a reasonable agreement on the CS kernel (see fig. 6) despite somewhat different functional forms and uncertainty-estimation procedure.The agreement with the new Z-boson data at LHC and W-boson mediated data at Tevatron is particularly encouraging.
It is important to stress that the PDF uncertainties dominate the TMDPDF extraction, see fig. 8.For that reason, the uncertainties on TMDPDFs that we find are larger by almost an order of magnitude in comparison to other global extractions [6][7][8][9], even though the present dataset is about 30-40% larger and more precise than those considered earlier.We argue that this increased uncertainty is more realistic, and previous studies are biased in several aspects.The χ 2 value that we obtain is very stable, see fig. 4, and it shows a very good agreement between the theory and the data.This observation is noteworthy and suggests that future improvements to the TMDPDFs determination could be achieved by a joint fit of PDFs and TMDPDFs.These possibilities will be explored in due time along with the inclusion of new data.
The values of ART23 unpolarized TMDPDFs, as well as, the artemide code release used for their extraction is published in the artemide-repository [36].The accompanying code (for minimisation, generation of plots, etc) is presented in ref. [81].We also release ART23 TMDPFs in the unified format of TMDlib2 [89].
Our current understanding of the TMDPDFs is summarized in fig.7, which illustrates the shape and the uncertainty of the unpolarized TMD distributions in the (x, b) plane for up and down quarks.An important point for future consideration is the impact of power corrections, for which preliminary theoretical results have been obtained but are not yet directly applicable to our present analysis [52,90,91].The present extraction opens the path to the N 4 LL analysis of SIDIS data, where, in addition to the unpolarized TMDPDFs and CS kernel, one can also determine TMD fragmentation functions.Generally, we observe that the NNPDF3.1-fityields a different spread in the parameters with a larger uncertainty.Most of the parameters are in relative agreement; however, the parameters λ u 2 , λ ū 2 , λ d 2 and λ sea 2 disagree, and do not overlap within the uncertainty bands.All these parameters are responsible for the behaviour of the TMDPDF at large-x, i.e. exactly in the region where collinear distributions differ.However, for the middle-x range (x ∼ 10 −2 − 10 −3 ) the TMDPDFs are in general agreement, except for the sea-quark (see fig. 10).
The resulting theory predictions are given in fig.11, for several bins of ATLAS at √ s = 8 TeV, and LHCb at √ s = 13 TeV.Other experiments demonstrate a similar picture.Comparing these two experiments one can see that both PDFs produce very similar results for ATLAS (which has x ∼ 10 −2 − 10 −3 ), while for LHCb (which has x ∼ 10 −1 ) the curves are very different.It confirms our previous conclusion that the extractions of TMDPDFs are very sensitive to the large-x range.
Let us stress that without the inclusion of the flavor-dependence the extractions with MSHT20 and NNPDF3.1 diagree drastically [26].The inclusion of the flavor-dependence reduces the problem of the PDF-bias, but does not resolve it completely.q T [GeV] q T [GeV] 3.0 < |y| < 3.5

B Comparison with data
In this appendix, we present all the data used for the fit, along with the resulting theory prediction of our main fit (with MSHT20 PDF input).The depicted theory prediction is the distributions (of replicas) average.The 68%CI of the theory prediction (see the discussion in sec.4.3) is shown as a blue band.For a better visual comparison of data and theory predictions, the theory curves are shifted by a factor d (see eq. q T [GeV] q T [GeV]  q T [GeV] q T [GeV] q T [GeV] q T [GeV] q T [GeV] q T [GeV]  q T [GeV] q T [GeV] q T [GeV] q T [GeV] q T [GeV] q T [GeV]  q T [GeV] q T [GeV]  q T [GeV] q T [GeV] q T [GeV] q T [GeV] q T [GeV]

Figure 1 .
Figure 1.Ratio of scale variation band over theoretical cross section at different perturbative orders for Z/γ-boson production at ATLAS at √ s = 13 TeV (left), and for the DY process at PHENIX (right).The NP parameters and the PDF set are kept fixed.The definition of the variation band is given in eq.(2.37).

Figure 2 .
Figure 2. Ratio of cross sections at different orders over the one at N 4 LL with the corresponding scalevariation band for the kinematics of Z/γ-boson production at ATLAS at √ s = 13 TeV, and for DY process at PHENIX.

Figure 3 .
Figure 3. Distribution of data in the (x, Q) plane.Each data point can span large regions.The color gradient darkens with an increasing number of data points contributing to a particular (x, Q) point.

Figure 4 .
Figure 4.The histogram of the χ 2 values for the full dataset.The black line marks the position of the mean prediction, and the blue band shows the 68%CI of the χ 2 distribution.

described in sec. 4 . 1 .
The values of N provide us estimates for the integrated cross-sections for the experiments with normalized data σ th.= N −1 .(5.1)

Figure 7 .
Figure 7. Shape of TMDs in the (x,b)-space.The color indicates the uncertainty.

Figure 8 .
Figure 8. Sizes of the full uncertainty bands in ART23 (green) in comparison to the extraction at the central PDF replica (grey).The upper panel shows the b-dependence at x = 0.01.The lower panel shows the x−dependence at b = 0.5 GeV −1 .

Figure 9 .
Figure 9.Comparison of the CS kernels extracted with the MSHT20 (solid grey curve) and NNPDF3.1 (dashed orange curve) collinear PDFs.

Figure 10 .
Figure 10.Comparison of uncertainty bands for unpolarized TMDPDFs extracted with MSHT20 (green) and NNPDF3.1 (gray) collinear distributions.The plots are normalized to the MSHT20 case.

Figure 11 .
Figure 11.Comparison of theory predictions based on MHST20 (blue) or NNPDF3.1 (orange) collinear PDFs.The rapidity range is indicated above the plot.All data and uncertainties are normalized to the blue line (MSHT20 result).
(4.9)) computed for the central line of the prediction.In all plots, we demonstrate more data than those described by the TMD factorization theorem.The data points used in the fit are shown by filled points, while the rest are shown by empty points.

Figure 12 .
Figure 12.Differential cross-section for the Z/γ * boson production measured by CMS, at different values of √ s and Q, according to the legends in the plots.

Figure 13 .
Figure 13.Differential cross-section for the Z/γ * boson production measured by ATLAS, at different values of √ s, y, and Q.All details pertaining the values of the kinematic variables and their cuts can be seen in the plots.The plot for comparison with ATLAS at √ s = 13 TeV can be found in fig. 5 in a larger scale.

Figure 14 .
Figure 14.Differential cross-section for the Z/γ * boson production measured by LHCb, at different values of √ s and y, according to the legends in the plots.

Figure 15 .
Figure 15.Upper and central rows: differential cross-section of the Z/γ * boson production measured by CMS at different values of y (see the text on the plots for details).Lower row: differential cross-section for the DY process measured at PHENIX (left) and STAR (right).

Figure 16 .
Figure 16.Differential cross-section for the Z/γ * (upper and center rows) and W (lower row) boson production in the DY-process measured by CDF and D0, at different values of √ s.

Table 1 .
The masses and electroweak parameters used in the present work.The CKM matrix elements are taken from the Particle Data Group (ed.2022)

Table 4 .
The values of χ 2 for the individual datasets.The last column shows the average relative shifts computed by eq.(4.9), which indicate the size of normalization discrepancy.

Table 5 .
[16]arison of data (here, the Z-Boson production by ATLAS √ s = 13 TeV measurement in ref.[16]) with the theoretical prediction (blue).The blue band is the 68%CI.The filled points are included into the fit and the displayed χ 2 corresponds to those only.The purple line (band) shows the prediction (uncertainty) obtained without taking into account the PDF uncertainties.Comparison of the total cross-section values for the normalized datasets.The column σexp.shows the measured value (if available in the original publications) with uncertainties from all sources.The column σDYNNLO shows the prediction by DYNNLO, with PDF uncertainty.The column σ th.shows the value computed by eq.(5.1), with uncertainties from TMD extraction only.