Analysis of vector boson production within TMD factorization

We present a comprehensive analysis and extraction of the unpolarized transverse momentum dependent (TMD) parton distribution functions, which are fundamental constituents of the TMD factorization theorem. We provide a general review of the theory of TMD distributions, and present a new scheme of scale fixation. This scheme, called the $\zeta$-prescription, allows to minimize the impact of perturbative logarithms in the large range of scales and does not generate undesired power corrections. Within $\zeta$-prescription we consistently include the perturbatively calculable parts up to next-to-next-to-leading order (NNLO), and perform the global fit of the Drell-Yan and Z-boson production, which include the data of E288, Tevatron and LHC experiments. The nonperturbative part of the TMDs are explored checking a large variety of models. We support the obtained results by a study of theoretical uncertainties, perturbative convergence, and dedicated study of the range of applicability of the TMD factorization theorem. The considered nonperturbative models present significant differences in the fitting behavior, which allow us to clearly disfavor most of them. The numerical evaluations are provided by the"arTeMiDe"code, which is introduced in this work and that can be used for current/future TMD phenomenology.


Introduction
The transverse momentum dependent (TMD) distributions are universal functions that describe the interactions of partons in a hadron. The TMD distributions naturally appear within the TMD factorization theorem for the differential cross section of double-inclusive hard processes. A lot of efforts have been made to arrive to a comprehensive picture of TMD factorization (for the latest works see [1][2][3][4][5][6][7][8]). In this work we perform a detailed comparison of the experimental measurements with the theory expectations based on our studies of higher-order perturbative expansions and power corrections for unpolarized TMDPDFs made in refs. [9][10][11][12]. Among many different spin (in)dependent TMD distributions, the unpolarized TMD parton distribution functions (TMDPDFs) play a central role. From the practical point of view, their precise knowledge is required to extract further TMD distributions and perform other precision measurements. The ideal process to study the unpolarized TMDPDFs is the unpolarized vector boson production. The data on the q T -dependent cross-section for the Drell-Yan process are collected by many experiments, including the precise measurements done by Tevatron and LHC. The -1 -150, 8 · 10 3 ) GeV (ATLAS collaboration [30]). Typically, the low-and high-energy data are considered separately. The main reason for a separate scan is the assumed physical picture of strong interactions, which describes different energies. The description of the high-energy data requires a precise perturbative input and it is expected to be less sensitive to the fine non-perturbative dynamics. The situation is the opposite for the low-energy measurements. Our experience shows that the inclusion of data of different energies is not only possible within TMD formalism, but it is also desired because it cuts away inappropriate models very sharply. We find also that the precision achieved by LHC is already sensitive enough to the non-perturbative structure of TMDs. We show that low and high energy data are sensitive to different regions of b-space, and consequently to different non-perturbative regimes of the TMDs: high energy data are better described by a Gaussian non-perturbative correction, while low energy data prefer an exponential type of non-perturbative models. The code (arTemiDe) that we have prepared allows to test all these hypotheses, and can be adaptded also to test different non-perturbative inputs for TMDs.
In order to extract the non-perturbative core of the TMDs, in the present study we choose a neutral tactic. We have scanned many possibilities such as a Gaussian/exponential behavior, with/without inclusion of power corrections, and so on. We have also studied the non-perturbative correction to the evolution kernel. During the examination of models we have prioritize the following criteria: (i) Stability. The TMD factorization is valid at small-q T (the dilepton transverse momentum) up to a certain limit. Therefore, an acceptable model should produce a stable and good description within the allowed q T -range. In other words, the value of χ 2 should be valuable and values of parameters are stable independently on the number of included data points, as far as the points belong to the allowed range.
(ii) Convergence. The agreement with data should improve with the increase of the perturbative order. Given the current state of the art of the theory, we can define four successive perturbative orders, which is enough to test the perturbative convergence. Also, the value of the phenomenological non-perturbative constants that one extracts should converge to some central value.
(iii) χ 2 minimization. Naturally, among the models with similar behavior we select the model with the minimal χ 2 . We have found that it is difficult to find a model (with one or two parameters), which fulfills the demands (i) and (ii), and that at the same time provides a good χ 2 value on the whole set of data points (although it is relatively easy to achieve this selecting a particular experiment). The models that we test consider a kind of minimal set of parameters which can be enlarged in future studies, refining the fitting hypotheses.
In the present fit, we have included the measurements of E288 at low-energies, Z-boson production at CDF, D0, ATLAS, CMS and LHCb, and Drell-Yan measurements from ATLAS. To our knowledge, this is the largest set of Drell-Yan data points ever simultaneously considered in a fit within the TMD formalism. We find also that the LHC data below the Z-boson peak and at small q T are very important for current/future TMD studies. In the article we present the most successful models that we have found, and discuss some popular models. In order to numerically evaluate the theoretical expressions, we have produced the package arTeMiDe. arTeMiDe has a flexible module structure and can be used at any level of TMD theory description, from the evaluation of a single TMDPDF or evolution factor to an evaluation of differential cross-section. The arTeMiDe code is available at [31] and can be used to check our statements or test a possible future/alternative ansatz (for instance [14,32]). In arTeMiDe we have collected all recent achievements of TMD theory, including NNLO matching coefficient function, and -3 -N 3 LO TMD anomalous dimensions. In the current version, arTeMiDe evaluates only unpolarized TMDPDFs and related cross-sections, however, we plan to extend it further.
The body of the article is divided as in the following. In sec. 2 we review the theoretical construction of the Drell-Yan cross section and summarize the theoretical knowledge on unpolarized TMDPDFs. In this section, we also describe all the theoretical improvements which are original for this work. The main original point, namely ζ-prescription is presented in sec. 2.4 and appendix B. The phenomenological studies are presented in sec. 3. This section includes also a dedicated discussion of the shape of the non-perturbative part of the TMD. The allowed range of validity of the TMD factorization is explored in sec. 3.4, the presentation of theoretical uncertainties is given in sec. 3.5. The results of the final fit are presented in sec. 3.7. A final discussion and conclusions can be found in sec. 4.

Theoretical framework
We consider the Drell-Yan reaction h 1 + h 2 → G(→ ll ) + X, where G is the electroweak neutral gauge boson, γ * or Z. The incoming hadrons have momenta p 1 and p 2 with (p 1 + p 2 ) 2 = s. The gauge boson decays to the lepton pair with momenta k 1 and k 2 . The momentum of the gauge boson or equivalently the invariant mass of lepton pair is Q 2 = q 2 = (k 1 + k 2 ) 2 . The differential cross-section for the Drell-Yan process can be written in the form [33,34] where 1/2s is the flux factor, ∆ G is the (Feynman) propagator for the gauge boson G. The hadron and lepton tensors are respectively (2π) 4 δ 4 (k 1 + k 2 − q) l 1 (k 1 )l 2 (k 2 )|J G ν (0)|0 0|J G µ (0)|l 1 (k 1 )l 2 (k 2 ) , where J G µ is the electroweak current. The point of our interest is the q T dependence of the cross-section, where q T is the transverse component of the produced gauge boson in the center-of-mass frame. More precisely, we are interested in the regime q T Q, where the TMD factorization formalism can be applied. Within the TMD factorization, one obtains the following expression for the unpolarized hadron tensor (see e.g. [35]) where g T is the transverse part of the metric tensor and the summation runs over the active quark flavors. The variable µ is the hard factorization scale. The variables ζ 1,2 are the scale of soft-gluons factorization, and the subject of relation ζ 1 ζ 2 Q 4 . In the following, we consider the symmetric point ζ 1 = ζ 2 = ζ = Q 2 . The variables x 1,2 are the longitudinal parts of parton momenta (2.5) The factors z GG f f are the electro-weak charges. The explicit form of factors z is given in sec. 2.1. The factor C V is the matching coefficient of the QCD neutral current to the same current expressed in terms of collinear quark fields. The explicit expressions for C V can be found in [36][37][38], and are also given in appendix A. The functions F f ←h are the unpolarized TMDPDFs for quark f in the hadron h. They are universal non-perturbative functions and the main objects of our study. The details of their definition and their parametrization are given in sec. 2.3. Finally, the term Y denotes the power corrections to the TMD factorization theorem (to be distinguished from the power corrections to the TMD operator product expansion). The Y -term is of the order q T /Q and composed of TMD distributions of the higher dynamical twist. In our study, we restrict ourself to the limit of low q T such that the Y -term can be dropped.
Evaluating the lepton tensor, and combining together all factors one obtains the cross-section for the unpolarized Drell-Yan process at the leading order of TMD factorization in the form [1,2,6,[39][40][41] dσ where y is the rapidity of the produced gauge boson. The factor P is a part of the lepton tensor and contains information on the fiducial cuts. It is discussed in details in sec. 2.6. In the rest of this section a more detailed description of the particular components is presented.

Expressions for cross-section for different produced bosons
In the case of neutral vector bosons production, the sum over G and G in eq. (2.6) has three terms which correspond to γ * -production, Z-production and interference of γ * -Z production amplitudes. These three terms of the cross-sections differ from each other only due to the factors z GG f f in eq. (2.6), which are where M Z and Γ Z are the mass and the width of the Z-boson, s W and c W are sine and cosine of the Weinberg angle. We use the following of values [42] In many studies (see e.g. [15,19,20,22,43]) the contribution of γ * to the cross-section is neglected in the vicinity of the Z-peak, i.e. the zero-width approximation is used. Here, instead, we include the γ * and interference terms in the evaluation of the the cross-section. The inclusion of these terms is important for LHC (in particular ATLAS experiment), where the measurement precision often exceeds the theory precision.

TMD parton distributions: evolution
The quark unpolarized TMDPDFs are given by the matrix element [1,2,11] where n is the light-cone vector along the large component of the hadron momentum, ξ = {0 + , ξ − , b}, Z and R are the ultraviolet and rapidity divergence renormalization factors. The Wilson lines W n pointing along the direction n to the infinity. For the detailed definition of all constituents in this expression we refer to [11]. The peculiar feature of the TMD operator is the presence of two types of divergences and, as a consequence, two renormalization factors Z and R. Firstly, we have ultraviolet divergences, which have their collinear counterpart in the coefficient function C V . These divergences are the result of collinear factorization and give rise to the logarithms of the factorization scale µ. Secondly, we have rapidity divergences, which arise in the factorization of the soft-gluon exchanges between partons. The singular soft-gluons exchanges can be collected into the soft factor, which in turn, can be written as a product of rapidity renormalization factors R, see e.g. [10,11,44]. This procedure introduces the rapidity factorization scale ζ.
The dependence of TMDPDF on the factorization scales µ and ζ is given by the pair of evolution equations The TMD anomalous dimensions γ and D are known up to order a 3 s (see [45] for γ V , and [44,46,47] for D). They satisfy the consistency condition (Cauchy-Riemann condition), which guaranties the existence of the common solution for equations (2.11-2.12), where Γ f is the cusp anomalous dimension. This equation fixes the logarithmic part of the anomalous dimensions. So, the anomalous dimension γ is linear in logarithm at all orders, while the rapidity anomalous dimension D has all powers of logarithms, (2.14) Here and in the following, we use the following notation for logarithms The explicit expressions for the anomalous dimensions up to third-loop order can be found e.g. in the appendix of [11,44].
The initial values of the factorization scales are dictated by the kinematics of the considered process. In particular, the scales ζ 1,2 are related to the momentum of hard partons as In the following, we use the symmetric normalization point, ζ 1 = ζ 2 = ζ = Q 2 . The µ-dependence cancels between the parts of factorization formula, namely between hard coefficient function |C V | 2 and the TMDPDFs. The natural choice of µ is such that logarithms appearing in |C V | 2 are minimized, i.e. µ = Q. Therefore, TMDPDFs enter in the cross-section in eq. (2.6) at the hard point A typical construction of a model for a TMD distribution requires its evolution to a different set of scales. The evolution from (µ f , ζ f ) to (µ i , ζ i ) takes the form Here, the P denotes the integration along the path P in the (µ, ζ)-plane from the point (µ f , ζ f ) to the point (µ i , ζ i ). The integration can be done on an arbitrary path P , and the solution is independent on it, thanks to the Cauchy-Riemann condition eq. (2.13). At a finite perturbative order, the condition eq. (2.13) is violated by the next perturbative order. As a consequence the expression for the evolution factor R is dependent on the path of integration. The two simplest choices of integration paths are the combinations of straight segments as These paths are depicted in fig. 1. The factor R evaluated along these paths reads The numerical difference between these two expressions represents the value of the uncertainty at a given perturbative order.  The expressions for the evolution factor R given in eqs. (2.19-2.19), contain the rapidity anomalous dimension D(µ, b). The latter contains potentially large values of L µ , which should be resummed with the help of eq. (2.13). Additionally, the rapidity anomalous dimension can acquire power corrections from the higher orders in the power expansion of the factorization theorem [48]. These power corrections can be also observed in the renormalon structure described in [12]. The non-perturbative correction takes the form of a series of even powers of the transverse distance. Therefore, the practical expression for the rapidity anomalous D is where g K is an unknown parameter. Here, D f pert is the perturbative expression for D. Correspondingly, the value µ 0 should be chosen such that L µ0 is minimal in the perturbative region. Substituting this expression to the evolution factor, we obtain . (2.21) In this form, the evolution factor R is independent on the path of evolution, as can be checked explicitly. The perturbative uncertainty which previously has been given by the variation of evolution path, now is represented by the dependence on the parameter µ 0 . Thus, using eq. (2.21) the uncertainties of the perturbative calculation can be measured by varying the scale µ 0 . In the following, we use the evolution factor as in eq. (2.21).

TMD parton distributions: b-space behavior
The TMDPDF is a genuine non-perturbative function, which is to be fitted by a certain ansatz, which covers the whole domain in b-space. Different intervals of b-space describe different regimes of strong interactions. The schematic picture of b-regions is shown in fig. 2. In order to construct an optimal and physically meaningful fitting ansatz, the behavior in every part of the b-space should be reproduced. In this section, we collect the main information on the b-dependence of TMDPDFs, as it is understood in the modern theoretical picture. The starting point of our description of a TMD distribution is the small-b operator product expansion (OPE), which results in the series -8 -where f (n) are PDFs of a 2(n + 1)-twist, C (n) are coefficient functions of OPE and the symbol ⊗ represents the convolution in momentum fractions of partons. The parameter B is an unknown non-perturbative parameter which represents an intrinsic hadron scale. Region 1: In the range b B the TMDPDF is dominated by the n = 0 term of OPE eq. (2.22). The leading term is represented by the usual matching onto twist-2 PDFs and reads where C is known up to two-loop order [11,49]. There is a subregion b 1/Q, which should be considered specially. While the TMD distribution is completely perturbative within this region, the contributions of this region to the crosssection strongly overlaps with the Y -term, eq. (2.4), which is formally O(1/(bQ)). The behavior of TMD distributions within this tiny range together with the Y-term dictates the asymptotics of the cross-section at large q T . As a consequence, it has a significant influence on the value of the total cross-section. In our current evaluation we restrict ourself to the range of small-q T (for a dedicated study of the applicability of this approximation in practice, see sec. 3.4). Therefore, we drop the Y -term and do not need any special treatment of b Q −1 region. Region 2: In the range b B the OPE is still valid. However, one has to include the higher order terms in addition to the leading one. Very little is known about power suppressed terms of the small-b OPE. Our recent study of the renormalon singularities [12] suggests several hints that can be used to model this region: (i) The OPE contains only even powers of b. Moreover, the coefficient function of n'th order has a prefactor x n . In other word, the natural scale of OPE is xb 2 /B 2 rather then just b 2 /B 2 .
(ii) The higher order OPE contributions induced by renormalons, can be summed together to some effective non-perturbative function under the convolution integral.
Therefore, in this region the TMDPDF can be approximated by the form where the leading term of the power series in b/B of G is given by C. As the power n grows, the sub-leading terms of OPE switch on, which is schematically presented by gray lines in fig. 2. The particular contributions at higher n are not so important in the continuous TMD picture. However, (iii) The n = 1 contribution to OPE can be estimated by the leading renormalon contribution of order ∼ xb 2 [12]. It has the form where Λ = Λ QCD is the position of the Landau pole.
Region 3: At b B the small-b OPE cannot be considered as a source of information, and the TMD is completely non-perturbative. Luckily, this region is suppressed by the evolution factor. As a consequence, the cross-section is not very sensitive to the fine structure of TMD distribution in this region, but the general behavior is important. We have tested several asymptotical forms of the TMDPDF, including Gaussian, exponential and power-like and found that the best agreement with the experimental data is achieved with exponential behavior. This observation is in agreement with the general physical intuition, that at high distances the strong forces are dominated by meson exchange, while the Gaussian and power-like asymptotics can not be produced in any simple way.
We should mention that the size of the parameter B, as well as, the order of convergence of the small-b OPE, which influences the size of the intermediate region 2, are not known. Our estimations of these characteristic sizes are presented in sec. 4.

Definition of scaling parameters
The small-b matching is the starting point for the construction of the majority of phenomenological ansatzes for TMD distributions. It can be considered as an additional collinear factorization, which is performed at some convenient set of scales (µ i , ζ i ). The difference of (µ i , ζ i ) from the initial (defined by process kinematic) scales of TMD distribution is compensated by the evolution factor in eq. (2.17). As usual, the all-order expression is independent on (µ i , ζ i ), but in practice, these scales are to be chosen such that the coefficient function C f ←f has good perturbative convergence. This procedure is alike the choice of hard-factorization scale, with one essential difference: the parameter b, which accompanies µ i and ζ i in the logarithms, has no fixed value. It varies from zero to infinity within the Fourier integral.
The choice of scales (µ i , ζ i ) is one of the central point of the TMD phenomenology. To make the discussion clearer, let us recall the expression for the coefficient function at NLO. It reads where the notation for the logarithms is defined in eq. (2.15). Ideally, the scales µ and ζ should be chosen such that no large perturbative contributions appear in the coefficient function. Clearly, it cannot be done at arbitrary b due to the presence of µ in the coupling constant and in L µ . However, such a strict choice is not required. The only requirement for scales is to keep the perturbative ansatz stable, i.e. to prevent its blowing up. There are several solutions of this problem. The most famous is the b * -prescription [27]. Within the b * -prescription the logarithms L µ are absent, an this fact allows the control of the perturbative series in the full region of b. However, the b * -prescription introduces artificial power corrections to the small-b OPE, which washes out any theoretical intuition. Another popular scheme [50,51] is based on the re-expression of Hankelintegral as an integral in the complex b-plane. In this way, the logarithms L µ can be minimized by µ ∼ b −1 and the Landau pole at large-b is by-passed in the complex plane. The drawback of this scheme is the necessity of the analytical continuation into the complex b-plane, and the restriction to NNLO (since the analytical solution for running coupling at N 3 LO is unknown). In this work we use another scheme which we call ζ-prescription. It is a novel one (to our best knowledge), and it is described in the following.
The ζ-prescription uses the fact that the TMD operator and hence its small-b OPE depends on two scales µ and ζ, which are entirely independent. This simple fact has been overlooked so far. Indeed, the first typical step is to fix ζ = C 2 0 /b 2 , or ζ = µ 2 [1,12,52]. It reduces the problem to a single variable problem, which looks simpler, but finally, it does not provide a simple solution for the appearance of large logarithms in the OPE.
The initial point of the ζ-prescription is the observation that not all logarithms in the coefficient function are dangerous. So, the terms L 2 µ and L µ l ζ in eq. (2.26) are problematic, while the logarithm in first term is not. There are several reasons for it. First, the double logarithm contributions violate the normal perturbative counting and at large-b grows faster than the single logarithms. Second, the first term of eq. (2.26) comes together with the DGLAP kernel, and thus, it preserves the area (say, the integral over x) of the TMDPDF, due to the conservation of the electromagnetic -10 -charge. We remind that logarithms accompanying the DGLAP kernel are related to PDF evolution, while the rest of logarithms are related to the TMD evolution. For this reason, the main problem of convergence is represented by the logarithms that are related to the TMD evolution. The logarithms related to the PDF evolution come with a particular x-dependent function. The probabilistic interpretation of PDF ensures their minimal contribution in the very large domain of b. Practically, this fact has been already demonstrated although not entirely realized in the fit [20]. In the realization of ref. [20], the DGLAP logarithms were left unregulated and they do not influence the convergence of the fit.
The logarithms related to the TMD evolution can be eliminated completely by a particular choice of ζ = ζ µ . Along the curve ζ µ , the TMD distributions are independent on µ. In other words, the curve ζ µ is an equi-evolution curve in the plane (µ, ζ). Such a curve satisfies the equation Using the definition of anomalous dimensions in eq. (2.11) we rewrite this equation as where f (L µ ) = l ζµ . The perturbative solution is discussed and presented in the appendix B.1. The curve ζ µ is different for quark and for gluon TMDs, and it is expressed in terms of the TMD anomalous dimensions eq. (B.7). In our analysis, we need only the quark case. Up to NNLO it reads Note, that in eq. (2.29) we have set the boundary condition such that no terms singular at L µ → 0 appear in l ζ (see appendix B.1, for details). Also, in the current work we drop the power contributions to the rapidity anomalous dimension D. The influence of these decisions should be investigated later. One can check that the leading term of ζ µ (i.e. l ζ = L µ /2) cancels leading powers of logarithms at all orders in perturbation theory (i.e. all terms a n s L 2n µ ). Then, including the next correction (a s β 0 L 2 µ /12) cancels subleading powers of logarithms at all orders of the perturbation theory (i.e. all terms a n s L 2n−1 µ ) , and so on.
Substituting the leading term of the solution in eq. (2.29) to the quark small-b coefficient function, we obtain This coefficient function is stable for any value of L µ , which can be seen by considering its integral which is independent on L µ .  Table 1. The perturbative orders studied in the fit. For each order we indicate the power of as of each piece that enters in the TMDs. Note, that the order of as and PDF set are related, since the values of as are taken from the PDF set.
The general expression for the coefficient of arbitrary flavour at NNLO has the form where C (n,0) is the finite part of the coefficient function at n'th perturbative order, and P (x) = a n s P (n) is the DGLAP kernel. The detailed derivation of eq. (2.32) is presented in the appendix (B.2). Eq. (2.32) has the form of the usual coefficient function for an object without external evolution (e.g. coefficient function for DIS). In other words, it is straightforward to check that by direct differentiation of eq. (2.32). The integral of this function over x is independent on L µ due to the charge conservation, and thus at least the area of TMDPDF is stable at large b.
A further positive point of the ζ-prescription is that the scale µ remains unconstrained. Often, the scale µ is selected such that it behaves as ∼ 1/b at b → 0. Such a choice minimizes the small-b logarithms in small-b OPE and in the evolution exponent. At large-b the scale µ should be frozen to some fixed value (of the order of a few GeV's), in order to avoid the Landau pole. We use the simplest function which satisfies these criteria where the low-energy fixed scale 2 GeV is chosen due to the fact that it is the standard scale of PDF extractions. Finally, we should also select the value for the parameter µ 0 that enters expression for the evolution factor eq. (2.21). To keep our discussion simple, we set µ 0 = µ b .

Theoretical uncertainties and perturbative ordering
In the construction of the cross section, one finds several sources of perturbative uncertainties. The size of these uncertanties can be estimated by the variation of associated scales. We list here the ones that we have considered in the present work.
• Uncertainty associated with the perturbative matching of rapidity anomalous dimension : This uncertainty arises from the dependence (at the fixed perturbative order) on µ 0 , which should be compensated between the Sudakov factor and the boundary term D(µ 0 ) in the TMD evolution factor eq. (2.21). This uncertainty can be tested by changing µ 0 → c 1 µ 0 and varying c 1 ∈ [0.5, 2].
• Uncertainty associated with the hard factorization scale: This uncertainty arises from the dependence (at the fixed perturbative order) on the scale µ f (∼ Q) which is to be compensated between the hard coefficient function |C V | 2 and the TMD evolution factor. This uncertainty can be tested by changing µ f → c 2 µ f and varying c 2 ∈ [0.5, 2].
• Uncertainty associated with the TMD evolution factor: This uncertainty arises from the dependence (at the fixed perturbative order) on initial scale of TMD evolution µ i , which is to be compensated between the evolution integral and the µ-dependence of ζ i in eq. (2.21). This uncertainty can be tested by changing µ i → c 3 µ i and varying c 3 ∈ [0.5, 2].
• Uncertainty associated with the small-b matching: This uncertainty arises from the dependence (at the fixed perturbative order) on the scale of the small-b matching µ OPE which is to be compensated between the small-b coefficient function C f ←f and evolution of PDF. This uncertainty can be tested by changing µ OPE → c 4 µ OPE and varying c 4 ∈ [0.5, 2].
We remark that our definition of perturbative uncertainties c 1,2 is commonly used in the literature (as far as it can be compared among different schemes of calculation), see e.g. [21,43]. The uncertainties c 3,4 are usually non distinguished and they are commonly varied simultaneously i.e. in the literature one finds discussions of errors for the case c 4 = c 3 . To our best knowledge, the distinction of the matching and evolution uncertainties is made here for the first time.
In this way, the general expression for the cross-section in eq. (2.6) with our choice of scales reads where the evolution factor R is given in eq. (2.21) and the explicit expression for the ζ µ is given in eq. (2.29). The low-normalization point µ i and the scale of small-b operator product expansion µ OPE are fixed at the same point (2.34) The central value of the constants c 1,2,3,4 is 1 and they are varied in order to estimate the theoretical uncertainties in the usual range (0.5, 2). The perturbative orders of each constituent are to be combined consistently. Having at our disposal the NNLO expressions for coefficient function and N 3 LO expressions for anomalous dimensions, we can define four successive self-contained sets of ordering. This is reported in table 1. In our definition of orders we use the following logic: (i) The order of the a s -running should be the same as the order of PDF set, since their extraction are correlated. (ii) The order of D should be the same as the order of Γ since they enter the evolution kernel R with the same counting of logarithms (i.e. a n s ln n+1 µ), and one-order higher then the order of γ V , since it has counting a n s ln n µ. (iii) The order of small-b matching coefficient should be the same as the order of evolution of a PDF, because large logarithms of b are to be compensated by the PDF evolution. (iv) The order of ζ µ should be such that no logarithms appear in the coefficient function, and the general logarithm counting coincides with the counting of the evolution factor. In table 1 the order of the ζ µ is defined as following: NLL is l ζ = L µ /2, NLO has in addition finite part at order a 0 s (i.e. two first terms of eq. (2.29)), NNLL has in addition logarithmic part at order a 1 s (i.e. the first line of eq. (2.29)), and NNLO is given by whole expression eq. (2.29).
-13 -To label the orders we use the nomenclature where the part with 'LO suffix designates the order of coefficient functions, and the part with 'LL suffix designates the order of the evolution factor in the a s ln µ ∼ 1 scheme. So, our highest order is NNLL/NNLO, which at the moment the highest available combination of the perturbative series. The order NLL/LO appears to be barely inconsistent, because it requires the LO PDF evolution to match the trivial coefficient function. Therefore, we exclude the NLL/LO from our phenomenological studies.

Implementation of lepton cuts
In modern experiments, the cross-section is often evaluated with the fiducial cuts on the dilepton momenta. That is, the lepton tensor in eq. (2.3) should be evaluated taking into account the experimental cut phase-space. At the leading order the lepton tensor takes the form where θ-function restricts the lepton momenta to the allowed region.
In the limit Q → ∞ and no restriction on the lepton pair phase space we obtain Substituting this expression to the cross-section we obtain the standard formula to the Drell-Yan cross-section within TMD factorization [1,2,6,[39][40][41]. In order to include the corrections due to a finite Q and experimental cuts let us introduce a factor P, i.e.
which is consistent with the cross section expression presented in eq. (2.6). The function P in the absence of cuts reads In the presence of cuts the expression for P is involved. For example, at q T = 0 and y = 0 it reads (2.41) Generally, P cannot be evaluated analytically, but it is rather easy to evaluate numerically.

Review of experimental data
In this section we present the experimental data that have been included in our fit. We have splitted the data into two large data sets with respect to a general energy scaling. They include the measurements from the following experiments: • Low-energy data set -E288: Drell-Yan process, at 4 < Q < 14 GeV.
• High energy data set: Ref.
In the present study, we have not included the data of other experiments, such as E605, or R209.
The main reason is that these data require some special discussion, due to possible internal inconsistencies (see e.g. [20]), or due to low precision. We expect that the results of the present work are not sensibly affected by this omission.
In the following, we present each included measurement in more detail.
E288. The E288 experiment [29] presents a large number of low energy points. So, the total number of low-energy point is nearly equal the total number of points of high energy experiments. For convenience we have splitted this data set into three subsets with different center of mass energy s. The characteristics of the measurements are shown in table 2. Concerning these data we can comment the following: • We exclude the data points in the range 9 < Q < 11 GeV, because these data are dominated by the Υ-resonance (M Υ 9.5 GeV). The description of Υ-resonance production is beyond the scope of current TMD factorization approach.
• The E288 experiment is made on a copper target. To simulate the effects of copper nuclei we replace the proton PDFs by the following combinations where Z = 29, A = 63 and N = A − Z = 34, are charge, atomic number and the number of neutrons in copper correspondingly.
• The absolute normalization of the E288 p T -cross-section is unknown. Typically, one includes an additional normalization factor N E288 , as a parameter of the fit, see e.g. [13,15,19,20]. There is no agreement on this factor values, it varies from ∼ 0.8 [13,19,20] to ∼ 1.2 [15]. In our analysis we fix N E288 = 0.8.
The theoretical uncertainties for low energy experiments are large, of the order ±10% at the best (see sec. 3.5). As a consequence, the value of the cross-section is very sensitive to the choice of the PDF set and the overall normalization factor. For example, we have checked that the E288 data can be fitted also with N E288 = 0.9 with the same (or better) value of χ 2 by an additional variation of µ b . However, we consider this as a bad practice and restrict ourself to N E288 = 0.8, as the most conventional solution.   • The data are splitted into different bins with different dilepton invariant mass. For each bin we evaluate the cross-section eq. (2.6) as where Q max,min are the boundary of the Q-bin.
CDF and D0. The data on the Z-boson production measured by CDF and D0 collaborations at Tevatron Run 1 and Run 2 [54][55][56][57][58] have been used nearly in every fit of unpolarized TMDPDFs. They are summarized in tables 3-4. Concerning these data we can comment the following: • There is a known tension between the values of total cross-section at run I of CDF and D0.
Here we restrict ourself to the fit of the shape of the cross-section and normalize the theoretical points on the bin-by-bin integrals in the allowed range of q T . I.e. we multiply the theoretical cross-section by the factor where ∆q T is the size of q T -bins. As we show in sec. 3.6 the obtained normalization factors are very close to one (at NNLO), and the values of partial cross-sections are in agreement Theor. [30] [30] Table 6. The characteristics of the data for the vector boson production off the Z-peak measured by ATLAS collaborations.
with the experimental ones within error-bars. In the tables 3-4, we also present the values of the total cross-sections evaluated by DYNNLO code [17,53]. In this calculation of the total-cross-section, we have used the same inputs as in the TMD fits, i.e. the PDF are taken from MMHT2014 set [59].
• The experimental values for cross-section points are obtained by integrating over all values of y, integrating over measure range of Q and averaging in q T . Consequently, we have used the following expression for the cross-section where y 0 = 1 2 ln(s/Q 2 ), q T,low and q T,high are boundaries of q T -bin, and ∆q T is the size of the q T -bin.
ATLAS. The data by ATLAS collaboration in [30,61] cover a broad range of dilepton invariant masses for the Drell-Yan process with small experimental error-bands. So, this set provide the  biggest constraints on TMD extraction coming from high energy data points. The characteristics of the measurements are resumed in tables 5-6 and here we comment the following: • The data from the ATLAS detector at 8 TeV run are presented in several sets [30], which corresponds to different treatment of final-state photon radiation. We have considered the "dressed" set of the data.
• The values of cross-section have been calculated by the expression in eq. (3.4), where y 0 = 2.4, as it is presented in the tables 5-6.
• There is a known tension between the theoretical calculation of the integrated cross-section and the measured one, see e.g. [30,60]. Moreover the theoretical cross-section for vector boson production is not known (at least, the DYNNLO package [17,53], which we have used for evaluation of total cross-sections, does not guarantee the accurate calculation outside the Z-peak). Therefore, we normalize the calculated cross-sections by a factor, as explained in more detail in the text around eq. (3.9). In sec. 3.6, we compare the obtained values of normalization to the total cross-section. We have found that the values of obtained normalization are practically independent on the non-perturbative input of the TMD model, and at NNLL/NNLO correctly reproduce (within the error-bars) the measured total cross-sectio.
• All data sets from LHC are presented within fiducial cross-sections. Therefore, we have implemented the cut leptonic tensor as it is discussed in sec. 2.6.
CMS and LHCb. The CMS and LHCb collaborations provide data around the Z-boson peak in [62][63][64][65][66], see tables 7,8. The treatment of these data is similar to the case of ATLAS data: • The values of cross-section have been calculated by the expression in eq. (3.4), where the limits for y-integration y 0 are taken in accordance to the tables 7-8.
• Just as in the case of ATLAS data we have normalized the calculated cross-sections by the factor provided in eq. (3.3) discussed in sec.3.6. We have found a good agreement between the theoretical and experimental values for total cross-section for LHCb data.
• All data sets from LHC are fiducial cross-sections. Therefore, we have implemented the cut leptonic tensor as it is discussed in sec. 2.6.
-  Finally, we have considered only points which allow a consistent TMD treatment. I.e. the points with the value of q T < δ T Q, where δ T is sufficiently small. In the literature we have not found any special study on this limiting ratio. So, we present our study in sec. 3.4, and conclude that TMD factorization range is q T /Q < 0.2.

arTeMiDe
In order to evaluate the cross-sections we have prepared the program package arTeMiDe. The arTeMiDe package is a collection of FORTRAN modules that evaluates individual terms of the TMD factorization formalism, such as TMD evolution factors, TMDPDFs, and combines them into the differential cross-sections. arTeMiDe forms a flexible package for TMDPDF phenomenology based on the ζ-prescription, as described in this article. It is publicly available at the web-page [31].
arTeMiDe version 1.1 evaluates the quark and gluon unpolarized TMDPDFs (although in the discussed fit the gluon TMDPDFs are not necessary) for any given function f N P , at any composition of perturbative orders from LO to NNLO, with or without renormalon-induced power corrections. For the current study, the input PDFs are taken from the MMHT2014 PDF set [59].
The most time-consuming part of the numerical evaluation of the TMDPDFs, is the convolution integral in eq. (3.6), which is especially expensive at NNLL/NNLO. Within the arTeMiDe package the convolution integral is optimized using an approximate expression for NNLO coefficient functions. The approximate form of the NNLO coefficient function is (note, that NLO and renormalon coefficient functions can be presented in this form without approximation) where coefficients A, B and c are functions of L µ . Such an approximate form is widely used in NNLO+ phenomenology of PDFs, see e.g. [67]. Here, the coefficients A and B represent the singular at x → 1 and x → 0 terms, and are evaluated exactly. The coefficients c represent the smooth part of the coefficient function, which is reconstructed by appropriate values of c i with better then 1% accuracy. The values of constants A, B and c are presented in the appendix B.3. In the convolution -19 -integral the main numerical contribution comes from the singular terms proportional to A and B, which are exact. The relative difference between the convolution with exact coefficient function and approximate expression in eq. (3.5) is of order 10 −6 . This numerical error is compatible with the numerical error of integration procedure and far inside the theoretical error-bands. The evaluation of the Hankel-type integral over b is one of the main source of numerical errors. Typically, in order to obtain sufficient precision one should include a large number of points into the integral, which is very costly especially at NNLL/NNLO. arTeMiDe evaluates this integral with the Ogata quadratures [68]. The Ogata quadrature is a double exponential quadrature, whose nodes are the zeros of the Bessel function. It provides a fast and precise evaluation of Hankel-type integrals with the minimal number of integrand calls.
The fitting procedure has been performed by minimizing the χ 2 -function. The minimization of the χ 2 distribution have been done using the MINUIT package from the CERN library [69,70]. The estimation of the statistical uncertainties for non-perturbative parameters is made with the MINOS procedure, performing the variation of parameters in the range χ 2 ±∆χ 2 , with ∆χ 2 corresponding to the 68% confidence level (i.e. ∆χ 2 {1.03, 2.32, 3.55} for 1,2,3 fitting parameters, correspondingly.) The sources of theoretical uncertainties have been pointed in sec 2.5, and parameterized by the constants c 1,2,3 . The variation of these constants in the region (0.5, 2) produces the error-bands. The discussion on the individual contributions of theoretical uncertainties associated with different scales is given in sec. 3.5.

Models for non-perturbative part of TMDPDFs
The non-perturbative part of the TMDPDF in general needs some ansatz, the parameters of which are to be extracted from data. In our study we have tested different ansatzes of the following general form where f f ←h is PDF of the parton with the flavour f . The non-perturbative information of the TMDPDF, which is unreachable from the PDFs, is contained in f N P . In order to match the perturbative regime, the function f N P should approach 1 for b → 0. Instead, the behavior of f N P for b → ∞ is not so well established, which requests a test of different models. In the current study, we restrict ourself to flavor independent f N P , i.e. f N P is common for TMDPDFs of different flavours. The flavour-dependence of TMDPDFs enters only through PDFs and coefficient functions, i.e. it is completely determined. The large-b behavior of TMD distributions is the key point of TMD parametrization and extraction. There is no common agreement on this behavior. Clearly, such an agreement cannot be achieved in general, since the b-shape of a TMD distribution is strongly dependent on the large-b prescription. For example, the Gaussian behavior is typically observed in the models based on b * -prescription. Moreover, the classical fits by ResBos package [15] disfavor other non-perturbative behaviors, such an exponential one (for more recent discussion, see [24]). Also the Gaussian shape is used in DYRes code [21] (together with b * -prescription) and in DYqT code [18] (together with the minimal prescription). Contrary, the fit made in ref. [20], which does not employ the b * -prescription, uses an exponential shape of f N P and also obtains an agreement with data. We point out that the use of LHC data for TMD extraction is made here for the first time (to our knowledge). Given the precision of LHC data, the consistency and/or goodness of all previous hypotheses have to be rediscussed.
In order to decide the best shape of f N P within ζ-prescription, we have considered several subsets of the data. It appears important to include simultaneously both high-energy and lowenergy data because they are sensitive to different parts of the b-space spectrum. We have found  that the most optimal data subset is given by the E288 data and the ATLAS Z-boson production data, see tables 2 and 5. In this subset, the very small error-bands of ATLAS data are compensated by a large number of points in E288 data, and as a result, we have a certain equilibrium between low and high-energy inputs.
Using the E288/ATLAS subset we have performed multiple fits using several different functional forms of f N P . Probably, the most informative preliminary test is the comparison of the pure Gaussian and exponential behavior for separate/joint low and high energy data points. In table 9 we demonstrate results of fit with some simple single-parameter models. According to this table, although the quality of the fit is still not optimal, the high-energy data clearly favor the Gaussian shape of f N P , while the low-energy data favor the exponential behavior of f N P . This difference is simply explained if we recall that at higher energies (and thus at generally higher q T ) the Fourier integral in eq. (2.4) is saturated by small values of b. At lower energies (and thus at generally smaller q T ) the Fourier integral in eq. (2.4) is affected by a wider interval of values of b. Therefore, the results presented in the table 9, suggest that f N P should be Gaussian at small-b and exponential at large-b. This is in complete agreement with the theory expectations discussed in sec. (2.3). The expected f N P should be a function with a Taylor series expansion (around b = 0) of even powers of b, with an exponential decay at b → ∞. A simple representative of such functions is cosh −1 (λb).
The test of this f N P is given in the last columns of the table 9 which clearly shows that this function alone, although it works much better than a Gaussian or an exponential, is not able to describe both low and high energy data, and thus we need extra parameters.
The preliminary tests with simple one-parameter dependence for the f N P shape can be summerized by the following: (i) The high and low energy data should be considered altogether, because they test different intervals of the b-space spectrum of f N P .
(ii) The subset of data points E288 and ATLAS Z-boson, is very selective for the f N P . A good fit of this subset guaranties the good fit for the whole set of data points. Nevertheless, in the following sections, we include all experiments, for consistency.
(iii) Both theoretically and phenomenologically, we argue that f N P should be a function of even powers of b with an exponential asymptotic behavior at b → ∞. Using a minimal set of two parameters (and the evolution parameter g K ) we find that one can easily fit the data with a χ 2 /d.o.f ∼ 1.2-1.3. The addition of more parameters (say for the control of b 4 correction and/or flavor dependence) has the possibility to increase the quality of the fit. However, in this work, we do not consider extra parameters, since the current quality of the fit is already typical and reasonable for the modern TMD extraction (compare e.g. with [22]).
(iv) One needs at least two parameters (one to control ∼ b 2 behavior at b → 0 and another to control the asymptotics) to fit simultaneously low and high-energy data. However, the multiplication by polynomials (e.g. f N P ∼ (1 + λb 2 )/ cosh(b)) does not work well, which suggests that the asymptotic terms ∼ b 2 e −b are disfavored.
Based on this experience we have formulated some simple ansatzes for f N P .
• Model 1 : This ansatz uses the fact that the simplest even-b function with exponent asymptotics is the hyperbolic cosine. The model reads where λ 1 [GeV] > 0 and λ 2 [GeV 2 ] > 0 are free parameters. The advantage of this model is its simplicity and independence of the Bjorken variable. The model 1 has a quadratic (Gaussian) behavior at small-b f N P ∼ e −λ2b 2 and exponential behavior at large-b f N P ∼ e −λ1b .
• Model 2 : The model 2 reads where λ 1 [GeV] > 0 and λ 2 [GeV 2 ] > 0 are free parameters. In this model we attempt to incorporate the theoretical expectations on the z-dependence of f N P . So, the model 2 has a zb 2 -behavior at small-b f N P ∼ e −λ2zb 2 and exponential behavior at large-b f N P ∼ e −λ1b .
Both models have two parameters, which we include in the parameterization such that the parameter λ 1 [GeV] dictates the asymptotical behavior at large b. and the parameter λ 2 [GeV 2 ] gives the quadratic term. A priory, the parameter λ 1 should be of order of m π ∼ 0.14 GeV, since it is the only natural scale of strong forces at large distances. The parameter λ 2 [GeV 2 ] roughly corresponds to the size of the leading power correction to small-b OPE, see sec. 2.3. We can associate λ 2 with the scale B as λ 2 ∼ B −2 . In ref. [12] we have estimated the size of this parameter in the large-β 0 approximation as λ 2 ∼ 0.075 GeV 2 .
Additionally, to the parameters λ 1,2 we have studied the parameter g K [GeV 2 ] > 0, which parametrizes the non-perturbative contribution to the rapidity evolution kernel D (see eq. (2.20)). The importance of this parameter is not clear from the literature. In ref. [12] we have estimated its size in the large-β 0 approximation as 0.01 ± 0.03 GeV 2 , i.e. consistent with zero. Also, the fit of [20] shows a negligible influence of this parameter on the final results. Therefore, in the following we consider both possibilities g K = 0 and g K = 0. In section 3.7, we demonstrate that the parameter g K is important at lower perturbative order, but its influence is negligible at NNLL/NNLO.

The domain of TMD factorization
The TMD factorization is restricted to the small-q T range. The size of the allowed q T -region is a priory unknown. We have not found any phenomenological studies on this point but only some statement on the strong dependence of the fit on the q T -window. A specific study on TeVatron Z-boson production data in ref. [71] shows that the Y-term contribution is extremely marginal for q T < 30 GeV.
In order to make a qualitative study, we introduce the parameter δ T and we consider all data points with q T < δ T Q. The amount of data points which are allowed by such a restriction are shown in the table 10. In order to estimate the maximum value of δ T we perform a series of fits with increasing values of δ T . Ideally, the χ 2 /d.o.f. and the fitting parameters should be stable within and unstable outside of the allowed δ T interval. In this way, considering the dependence on δ T one should find an interval of δ T for which the fit is not sensitive to the Y -term. This point indicates the region of TMD-factorization, and should not depend of the perturbative order.  CDF+D0 run1  27  34  38  41  44  47  49  51  52  CDF+D0 run2  22  28  32  38  43  49  55  60 63 ATLAS Z-production (7+8 TeV) 10       We have performed such a test for high-energy data set with different one-parameter forms of f N P . We have especially used the one parameter models to guarantee the absence of fine-tuning of the cross-section. For this reason we also exclude the E288 data, because it is impossible to describe high-and low-energy data with a single non-perturbative parameter. The result of the fits practically agrees for all tested models and orders. In fig. 3, we present some typical outcome of the fits.
-23 -In plots 3 one can see that all models reproduce the data very-well at very small δ T , which is expected since the TMD factorization is only valid at q T Q. Then the value of χ 2 slightly grows but keeps less then one until δ T = 0.2 and after this threshold it jumps to higher values. The next jump is seen at δ T = 0.25. After δ T = 0.25 the value of χ 2 increases rapidly. We interpret this fact saying that at δ T = 0.2 we become sensitive to Y -term, and at δ T = 0.25 the Y -term starts to dominate the cross-section, i.e. we leave the domain of TMD factorization. We have found that the presented plots rather strongly depend on the set of pertubative scales. For some choice of these scales, one can obtain an ideally flat plateau of χ 2 for δ T 0.2. However, the values of the two important thresholds, namely, δ T = 0.2 (where deviation form TMD factorization appears) and δ T = 0.25 (where the TMD factorization is completely broken), are stable with perturbative scales.
As a result of these tests, in the following we use the data points with q T 0.2 Q, or say δ T = 0.2. The choice of δ T that we make is consistent with [71]. This range includes 163 highenergy and 146 low-energy data points (in total 309 data points). Comparing this number of points with the literature, we observe that, it is the largest set of points for Drell-Yan/Z-boson production used up to present in a simultaneous fit of TMDPDF (to our knowledge), which also has the largest considered range of energies from (Q, √ s) = (4, 19.4) GeV (from the E288 experiment) to (Q, √ s) = (150, 8000) GeV (from the ATLAS experiment).

Scale variations and theoretical uncertainties
The theoretical uncertainties of the perturbative inputs are tested by varying the perturbative scales around their central values, as it is discussed in sec. 2.4. The distribution of uncertainties through orders for a typical high energy experiment is shown in figs. 4 and 5, and for a typical low-energy experiments in fig. 6-7. The complete set of plots for every included experiment can be found in [31]. The uncertainty associated with the TMD evolution factor is parameterized by the c 1 -variation. This uncertainty drops down between NLL/NLO and NNLL/NLO orders, that is together with the increase of the perturbative order for D (see table 1). The size of the band is correlated with the energy of the process, that is, it is less significant for higher-energy experiments.
The uncertainty associated with the hard scale depends on the c 2 -variation. This band is independent on q T . This error is the main one at NLL/LO (which we do not present here), but becomes negligible at higher orders.
The uncertainty associated with the low-energy behavior of the evolution factor is parameterized by the c 3 -variation. We have found that it significantly influences the shape of the crosssection and also it is rather large at small-q T . As expected it is decreases going from NLL/NLO to NNLL/NNLO. At NNLL/NNLO it gives the main contribution to the uncertainty band for the cross-section.
The uncertainty associated with the small-b matching of coefficients and PDFs is represented by the c 4 -variation. It is the most interesting error because it checks the convergences of the ζ-prescription. The corresponding error-band is larger at q T → 0, which corresponds to the contribution of large L µ (we remind that in ζ-prescription, L µ grows unrestrictedly). The important observation is that the large uncertainty area significantly shrinks between NLL/NLO and NNLL/NNLO, although the NNLL/NNLO contains a higher power of L µ . This shows a very good behavior of the ζ-prescription. In total this error is dominant at NLL/NLO, but becomes smaller (although compatible) to the one coming from the c 3 variation at NNLL/NNLO.
The size of the theoretical error-band is significantly bigger at small-Q, as can be visually checked comparing figs. 4-5 to figs. 6-7. The uncertainties reduces when one increases the perturbative orders, both in high and low energy cases. However for the low energy case the error remains of order ∼ 20% or higher even at NNLL/NNLO, which can be problematic for a precise description of these experiments. We additionally stress that at NLL/LO the uncertainties range from 50% to  100% and higher. This shows that this particular order has no prediction power, and should not be considered any serious for a well based extraction of TMDs. This is the main reason for excluding NLL/LO order from our analysis. In order to provide a final definition of the theoretical error, we use all scale variations and we take the maximum deviation among them. We have found that our definition of uncertainties is close, as far as one can compare different theoretical expressions, to the common definition used e.g. in [21,43]. In total, for the high-energy experiments we find that the theoretical uncertainty (at NNLO) is of the order 2 − 3% at the peak. It grows to ∼ 5 − 6% at maximum allowed q T , and to ∼ 10% at q T → 0. This value seems to be smaller (but comparable) to the typical values of uncertainties presented ResBos or DYRes. This is a definite positive point of the ζ-prescription. Indeed, the main contribution (at high energies) to it comes from the c 3 -and c 4 -error-bands, which are controlled by ζ-prescription. The c 4 -band would be significantly larger in the presence of doublelogarithms, which are absent due to the ζ-prescription.

Normalization
As the TMD factorization approach describes the shape of the differential cross section only in a limited range of q T , we need some extra input to normalize the curves. In order to compare with      Table 11. The normalization factors for the cross-section for each experiment. The dimensional-less numbers are ratios of partially integrated cross section over qT (3.9)(theory/experiment, i.e. N −1 ), for the data with the published value of total cross-section. For the data sets with unpublished values of total cross-section, the value of the total cross-section used for normalization is presented. The numbers are given for the model 1. The variation of the scales and models gives the change of numbers in the unrepresented digits. The numbers shown in bold are those which agree with the measured cross-section within the error bars.
the data, we weight the differential cross-section by the total (or fiducial) cross-section. The values     of the theory predictions for total cross-sections can be obtained from the studies of other groups. For example, one can use the DYNNLO code [17,53]. Its predictions for the total cross-sections are presented in the tables 3-4. However, we found that such a strategy is unreliable, because even tiny disagreement in the normalization leads to huge effects in the χ 2 -minimization. This is especially important for LHC data sets, which have very small error-bands. Additionally, as we demonstrate later, the DYNNLO predictions are worse than that obtained using our normalization factors. Therefore, to fit the high energy data set we introduce a normalization factor for each data set. This factor equals the partial integral over q T for experimental and theoretical cross-sections, and reads where ∆q T is the size of the q T bin. In this way, we fit only the q T -shape of cross-section, which is already very restricting, as we discussed in the previous section. The values of N −1 resulting from the calculations are presented in table 11. It is clear that the      deviation between the theory and experiment decreases with perturbative orders. For the majority of experiments (excluding the Z-boson production measured by ATLAS), we find a good agreement for the absolute value of the differential cross-section obtained from the data points and the TMD factorization. It is important that the values of N are very stable with respect to the change of non-perturbative model and to the scale variation. In particular, we do not present the error-band on the normalization values in the table 11, because they are smaller then the present precision. The normalization of the data from E288 experiment is generally unknown. Most probably, the main source of discrepancy comes from the fiducial cuts made for E288 experiment, which cannot be restored nowadays. The small fiducial cuts do not seriously influence the q T -shape of the differential cross-section, but can sizably decrease the total normalization. In our analysis, we change the common normalization of all E288 data points as N E288 = 0.8. (3.10) This or close values have been used in different fits, see e.g. [15,20]. However, we do not seriously ground on it, e.g. we can switch to 0.85 or 0.9 without significant loss in χ 2 (however, the value  Table 12. The results of the χ 2 -minimization procedure with gK = 0. The values of χ 2 are given including the theoretical error-band. The values of extracted parameters are given with statistical error-band (the first pair of numbers) and the theoretical error-band (the second pair of numbers). The visual presentation of this table is given in fig.9. 1 produces serious disagreement with our current input). One should take into account that the theoretical uncertainty at small−Q is very large, see figures 6 and 7. It also implies that lowenergy cross-sections are very sensitive to the choice of PDF set (in particular, our approximation of eq. (3.1) for nuclei PDF could be too crude). We have checked that the E288 data can be also fitted with N E288 = 1 to the same values (or better) of χ 2 by additional variation of Q 0 (similar to the fit made in [24]). Such an ambiguity represents a problem in the analysis of the low-energy data.

Results of the fits and TMD extraction
In this section, we present the results of the global fit for the complete data sets presented in sec. 3.1, which allows the extraction of the unpolarized TMDPDF. We have made two independent fits, with g K = 0 and with g K = 0. The results of the χ 2 minimization and the values of the extracted parameters are presented in tab. 12 and 13. The visual presentation is given in fig. 8 Table 13. The results of the χ 2 -minimization procedure with non-zero gK . The values of χ 2 are given with theoretical error-band. The values of extracted parameters are given with statistical error-band (the first pair of numbers) and the theoretical error-band (the second pair of numbers). The visual presentation of this table is given in fig.9. We have estimated both statistical and theoretical errors on the fit parameters. The statistical errors are related to the uncertainty of the χ 2 -minimization and are induced by the experimental error-bands. The statistical errors have been estimated by the MINOS procedure of MINUIT package [70]. The theoretical errors are related to the uncertainty of perturbation series. There is no common procedure for the estimation of the theoretical error. Therefore, we propose the method presented in the following.
The theoretical error is estimated by a set of independent fitting procedures for each variation of the scale constants c 1,2,3,4 ∈ [0.5, 2], as discussed in sec. 2.5. In other words, we set, say, c 1 = 2 and perform the minimization of χ 2 . In this way, we obtain a new set of model parameters (and a new value of χ 2 ). In total, we have 8 independent variations and hence have 8 values of parameters. The final theoretical error-band is given by the maximal positive and minimal deviations from the central value and the results are reported in tab. 14. A drawback of this procedure is the variation of a scale can lead to the serious increase in χ 2 . In other words changing the matching scales affects also the quality of the fit. In general, the size of the band for χ 2 value represents the stability of -30 -  Table 14. Example of parameter extraction with the variation of c1,2,3,4 constants, and evaluation of the theoretical error. Bold numbers in brackets represent the deviation of the parameter from its central value.
the theoretical model, and they are also reported in tab. 14. One can see that the error for χ 2 significantly drops with orders. The values of the parameter λ 1 , which parametrizes the asymptotics of TMDPDFs, extracted at different orders agree with each other within the error-band, that slightly reduces with the increase of the order. It has a natural size of the order of pion mass, λ 1 ∼ 1.3m π − 2.3m π . The values of the parameter λ 2 , which parameterizes the quadratic correction to the small-b regime, are not so stable although the values at different orders are compatible within the errors. In particular, they have large error-bars at NNLL/NLO order. The behavior of g K is the most peculiar. It decreases with the increase of the perturbation order. Moreover, at NNLL (both /NLO and /NNLO) its error-band touches the zero. It can be interpreted as following: the parameter g K is very small (or even zero) but within the fit, it tends to compensate the missing higher perturbative orders of evolution exponent. We also observe that all extractions of g K agrees with the theoretical estimation g K = 0.01 ± 0.03 GeV 2 made in [12]. One can see that both models produce very similar results both for χ 2 and the parameters.
As expected the theoretical error is reduced with the increase of the perturbative order. In particular, the band on the value of χ 2 is significantly smaller at NNLL/NNLO. The distribution of parameter values over perturbative orders presented in table 14 is typical. The variation of c 1 does not represent the main contribution to the error-band. It implies that the low-energy matching for the rapidity anomalous dimension is not so important (in comparison to other matchings), as typically expected.
The variation of c 2 is almost negligible. Here, however, we recall that c 2 influences only the  common normalization factor, and thus the effect of its variation could be underestimated due to our fitting procedure. The variation of c 3 and c 4 produces the most part of the error-band and the strongest variation of χ 2 . At g K = 0 these variation are more-or-less equivalent. At g K = 0 there is a clear pattern. In this case, the variation of c 3 gives the main error-band on g K , while the variation of c 4 gives the main error-band on parameters λ 1,2 . It is very natural since the variation of c 3 tests the low-energy normalization point of the evolution factor, and c 4 tests the uncertainties of perturbation determination of the TMDPDF. In tab. 15 we present the distribution of values for χ 2 among experiments. One can see that the most stringent constraints come from the Z-boson production data of ATLAS and D0 run2. This is due to the small experimental uncertainty of these data points. At the low-energy, the main tension -

Conclusion
The unpolarized Drell-Yan process at small-q T offers the simplest application of the TMD factorization formalism, and as such it has been studied by many groups. In this work, we have revised the main points of the practical implementation of TMD factorization, and reveal some new aspects of the TMD phenomenology. Altogether it allows us to critically reanalyze the available Drell-Yan data and to extract consistently the unpolarized TMDPDFs, within some approximation. The primary aim of our analysis is to answer some general questions for the TMD approach such as: Up to which q T the TMD factorization works? What is the best asymptotical behavior of a TMD distributions? How convergent is TMD formalism at higher orders of perturbative expansion? The answers to these questions are naturally affected by the used prescriptions for the practical implementation of the TMD formalism. Even so, these important issues of TMD phenomenology are undiscussed in the literature or discussed very superficially. Implementing consistently the TMD factorization formalism, we are able to fit a large set of Drell-Yan data points which ranges from low (Q = 4 GeV) to high (Q = 116-150 GeV) dilepton invariant masses on a wide interval of center of mass energies and using a limited set of parameters (two for the non-perturbative part of TMDPDF and one for the non-perturbative part of the TMD evolution).
In this work, we have formulated and used the ζ-prescription, which is one of the main new theoretical contributions of this article. The ζ-prescription consists of a particular choice of the  Figure 11. The comparison of the data for Z-boson production collected at ATLAS and CMS experiments to the fit of model 2 at NNLL/NNLO. Red data points are those which included in the fit (i.e. with δT = 0.2). Gray data points are those which are not include in the fit (i.e. δT > 0.2). The blue band is the theoretical uncertainty obtained from the variation of scales.  rapidity evolution scale ζ = ζ µ , which depends on µ, b and the parton flavor (quark or gluon). This choice corresponds to the equi-evolution line in the space of TMD scales, and thus a TMD distribution is µ-independent along this line. As a consequence, all logarithms related to the TMD evolution, which are essentially double logarithms, are eliminated from the small-b OPE. It significantly improves the perturbative convergence and the radius of convergences for the small-b OPE.
The value of ζ µ is dictated by the differential equation (2.28), which can be solved order-by-order in the perturbation theory. We stress that the ζ-prescription does not strictly solve the problem of -  large-b logarithms, which are still present in the matching coefficients. However, these logarithms are not related anymore to the TMD scales. Moreover, these logarithms are accompanied by the x-dependent coefficients which preserve the integral over x in accordance with the probability interpretation of PDFs. Note, that ζ-prescription is universal for all TMDs of the leading dynamical twist, due to the universality of TMD ultraviolet and rapidity renormalization factors. There are multiple possibilities to apply ζ-prescription, see some discussion in appendix B.1. In this work, we have used the simplest one, which can certainly be improved. A further study of ζ-prescription will be done elsewhere. Within our implementation of TMD factorization and TMD distributions, which has a generic form, we have three independent perturbative series: one for hard matching, one for rapidity evolution, and one for small-b matching. To defend the approach we provide the estimation of the perturbative uncertainty by variation of associated scales by factor 2, see sec. 3.5. We have considered three successive perturbative orders (from NLL/NLO to NNLL/NNLO, see table 1), and demonstrate that the theory uncertainties and the agreement with the data improve with the increase of the perturbative order. The agreement of the theory with the experiment resulting in our fit is a consequence of the ζ-prescription to a large extent. The lowest possible combination of perturbative order, namely NLL/LO, produces very large theoretical error-bands and thus has been excluded from the present study.
Our analysis shows that data are very selective about the non-perturbative part of the TMDs and only well-behaved models can accommodate the fit. The best models for the non-perturbative part of TMDPDF that we have found are formulated in sec. 3.3. They have a common nonperturbative structure, namely where f is the PDF, C is the small-b matching coefficient and f N P is a non-perturbative input. We have found that the best agreement with data is given when the function f N P behaves as Nonetheless, the models produce nearly identical values of χ 2 and of parameters λ 1,2 . It implies that the parameters λ 1,2 that largely determine the shape of TMDPDF have a precise physical meaning. The values of parameters are reasonable λ 1 ∼ 1.5 m π and λ 2 ∼ 0.5 GeV 2 . We also study the influence of the parameter g K , that parameterizes the non-perturbative part of TMD evolution. We have found that this parameter is significant at lower order (in our case NLL/NLO) and less significant at higher orders. Moreover, at NNLL/NNLO it is compatible with zero within the error-bars. We supplement our extraction with the estimation of the theoretical and statistical errorbars. The obtained theoretical uncertainty on the extracted parameters is notably larger in comparison to the experimental one (see fig. 8 and 9). It can indicate the purity of the models (say, the independence on the flavor, or insufficient parameterization), or the general weakness of the theory.
Another aspect that we point out, is the practical limitation of TMD factorization. To make the discussion quantitative we introduce the parameter δ T , which is the highest allowed ratio q T /Q accounted in the fit. Clearly, at very low δ T the TMD formalism should perfectly work, e.g. provide small values of χ 2 -distribution. Our expectation is that within the domain of the TMD-factorization the value of χ 2 /d.o.f. is largely constant and starts to grow outside of this domain. Indeed, for the best models, the observed picture agrees with the expectation. In this way, we have shown that TMD factorization as it is, i.e. in the absence of Y -term, is capable of describing the data with q T 0.2 Q, i.e. δ T = 0.2. With some risk, one can prolong it to δ T = 0.25. After δ T = 0.25 the TMD factorization loses any agreement with the experiment. This analysis is unique, or at least we do not know any analog in the literature.
The fit and the plots of the data has been done with the help of arTeMiDe, version 1.1, available at [31]. This is a code package for the numerical evaluation of TMD distributions and related crosssections. It has a flexible structure and allows to consider an arbitrary combination of perturbative orders up to NNLO for coefficient functions and N 3 LO for evolution factors. In the current version, it evaluates only unpolarized TMDPDFs, but we expect to update it for polarized cases and TMD fragmentation functions, as well as, to include the Y -term, in the future. -37 - The explicit NLO expression for l ζµ for gluon TMDPDF reads The expression for the ζ µ then reads (B.11) -39 -

B.2 Scale dependence and logarithmic part of coefficient function
The small-b coefficient function satisfies the pair of equations where P (x) is the DGLAP kernel of the PDF evolution, and ⊗ denotes the Mellin convolution in the variable x. Using these equations one finds the logarithmic part of the coefficient function. At NNLO the expression for the coefficient function reads +a s − L µ P (1) where we omit the argument (x) of DGLAP kernel P (x) = n a n s P (n) (x) and the finite part of the coefficient function C (n,0) (x), and δ f f (x) = δ f f δ(x).
Substituting the NLO expression for l ζ , eq. (B.7) into the coefficient function, eq. (B.14) we obtain at NNLO Note, that despite the fact that the solution for ζ-prescription in eq. (B.7) has inverse powers of L µ , the coefficient function has not. It is easy to check that this expression convoluted with PDF is renormalization-invariant,

B.3 Expression for NNLO coefficient function in ζ-prescription
The NNLO coefficient functions are cumbersome structures, which contain logarithms and polylogarithms of order 2 and 3 and their straight numerical evaluation is costly. To speed up the evaluation of convolutions within arTeMiDe, we have used an approximate expression for the coefficient function. A similar method for higher-order expressions has been suggested in ref. [67] and it is widely used in NNLO+ phenomenology of PDFs. We parameterize the NNLO coefficient function by 17 terms +c 1 + c 2 x + c 3 x 2 + c 4 x 3 + c 5 lnx ln x + c 6 lnx ln 2 x.
Here, the coefficients A represent the singular at x → 1 terms, and are evaluated exactly. The coefficients B represent singular at x → 0 terms, and also evaluated exactly. The coefficients c represent interpolation functions between asymptotics. These coefficients are fit numerically. The relative precision of the approximation is ∼ 10 −3 . The convolution integral receives the main numerical contributions at singular points, while the rest are minor corrections. So, we find that the relative accuracy of the convolution is better then 10 −6 , which is far beyond any currently needed accuracy. The values of coefficients A, B, and c are given in the tables 16, 17 and 18.  Table 16. Exact values of the coefficients of the unpolarized TMD matching coefficient in the parameterization (B.17), for the terms singular at x → 1.
-42 -   Table 18. Approximate values of the coefficients of the unpolarized TMD matching coefficient in the parameterization (B.17), for the regular in 0 < x < 1 terms.