Systematic analysis of double-scale evolution

Often the factorization of differential cross sections results in the definition of fundamental hadronic functions/distributions which have a double-scale evolution, as provided by a pair of coupled equations. Typically, the two scales are the renormalization and rapidity scales. The two-dimensional structure of their evolution is the object of the present study. In order to be more specific, we consider the case of the transverse momentum dependent distributions (TMD). Nonetheless, most of our findings can be used with other double-scale parton distributions. On the basis of the two-dimensional structure of TMD evolution, we formulate the general statement of the ζ-prescription introduced in [1], and we define an optimal TMD distribution, which is a scaleless model-independent universal non-perturbative function. Within this formulation the non-perturbative definition of the distribution is disentangled from the evolution, which clarifies the separation of perturbative and non-perturbative effects in the phenomenology. A significant part of this work is devoted to the study of the effects of truncation of perturbation theory on the double-scale evolution. We show that within truncated perturbation theory the solution of evolution equations is ambiguous and this fact generates extra uncertainties within the resummed cross-section. The alternatives to bypass this issue are discussed. Finally, we discuss the sources and distribution of the scale variation uncertainties.


JHEP08(2018)003
effects in the TMD factorization formula. In fact, the traditional choice of scales mixes up the parameters of TMD distributions and the scales of evolution, rendering unclear the interpretation of the distribution and bringing undesired dependence on the perturbative order into the model of TMD distribution.
The theoretical uncertainty of the factorized cross-section is produced by the truncation of the strong coupling perturbative series. Despite the fact that the TMD evolution is known up to the third order in the strong coupling expansion [23,24], the error coming from the evolution is the largest among all other theoretical inputs, as it has been shown in ref. [1]. In this work we demonstrate that the theoretical uncertainty of evolution is originated by the combination of two effects: the actual uncertainty in the next higher order perturbative correction, and the ambiguity of evolution procedure. Therefore, a part of the error-band is fictitious, in the sense that it is produced by a poor comprehension of the double-scale evolution, rather then the lack of perturbative information. Some comments of this effect can be found in the literature (see e.g. [2,8]) although they have not been the central topic of any study. In our attempt to cover this gap here, we motivate this statement and we show explicitly that the numerical effect of the evolution ambiguity is huge and, counterintuitively, the error caused by the truncation is larger for larger energies. The ambiguity is not entirely cured by the increase of the perturbative order, and can have even more dramatic consequences on TMD phenomenology. We cite here two. As a first, it violates the transitivity of the evolution procedure. As a consequence, the comparison of different evolution schemes is possible only with a work of reverse engineering of equations which can be also un-precise. Ultimately, this destroys also the concept of universality of the non-perturbative functions. Secondly, it is difficult (but not impossible) to trace internal inconsistencies of the phenomenological applications. An efficient realization of the perturbative part of the TMD is fundamental to provide a correct interpretation of the QCD non-perturbative information.
The dissection of the cross-section using the factorization theorem and the asymptotic limit of Operator Product Expansion (OPE) puts into evidence several important constituents of the TMD formalism beyond the evolution of TMD distributions, such as their asymptotic matching onto collinear functions, etc. Every step of this theoretical process is accompanied by the introduction of specific matching scales that control the goodness of the factorization/expansion. Traditionally, one sets up the scales to minimize the impact of individual logarithms in accordance to a classical one-dimensional evolution picture. However, the double-scale evolution grants an unprecedented freedom to set up the scales, if all scales are fixed coordinately. In this work we describe the fundamental origin of this freedom, and give the non-perturbative definition of the ζ-prescription, which is a selection of scales that completely eliminate double logarithm contribution. Additionally, in the ζ-prescription one completely disentangle the notion of the modeling of TMD distribution from the influence of TMD evolution. Altogether, the set of prescriptions that we propose leads us to obtain what we think is an optimal TMD distribution.
The TMD evolution is also affected by an additional complication coming from the fact that it is partially non-perturbative. In other words, we need to match the perturbative and non-perturbative parts of the TMD evolution. This issue has been discussed in several -3 -JHEP08(2018)003 works in the literature (see e.g. [2,25,26]). The renormalon nature of this behavior has been known for many years in [27] and the object of explicit calculations [28]. Thus, one should not be surprised that the non-perturbative effects included in this part of evolution strongly depend on prescriptions used to solve the solution ambiguity.
The article is structured as in the following. In the first part, given in the section 2, we present the elementary theory of TMD evolution, stressing its two dimensional nature. In order to emphasize it, we introduce the vector notation and the concept of the "evolution field", which allows multiple analogies to mathematical physics. We explicitly demonstrate the freedoms granted by the two dimensional nature of the TMD evolution, such as the freedom in the selection of the scales, contours of integrations, etc., which has not been used so far. We also discuss the structure of singularities of the evolution field, that gives a new point of view of some well-known concepts.
In the second part of the paper we discuss the mathematical aspects of TMD evolution in the truncated perturbation theory (section 3) and show that it leads to an ambigous TMD evolution. In section 4 we discuss the opportunities to fix the ambiguity. In particular, we demonstrate that the traditional method of "resummed" rapidity anomalous dimension (or Sudakov exponentiation) does not entirely solve the problem, but only reduce the uncertainties. From our side we suggest an alternative method to fix the evolution by "improving" the ultraviolet anomalous dimension. The suggested method is simple, obeys all expected demands, and it is easily generalizable to any model of non-perturbative evolution.
In the third part of the paper, given in section 5, we discuss the role of scale choices in the definition of TMD distributions, and introduce the concept of ζ-prescription. We show that the ζ-prescription is a general feature of double-scale evolution. This feature has been completely overlooked in the applications. The particular realization of ζ-prescription that is characterized by the absence of any restriction on the model for TMD distribution defines the optimal TMD distribution. As the standard selection of scales gives no benefits, we suggest here the optimal TMD distribution as a universal object for phenomenological studies.
Finally we collect the formulas needed for a generic TMD cross-section of a Drell-Yan or SIDIS process and we resume our findings. Using these cases, we also recall the perturbative series that enters the cross-sections and systematize the sources of perturbative uncertainties, checking the variation of all relevant scales in several examples. We observe directly that the solution that we propose, with the implementation of the optimal TMD, reshuffles the distribution of theoretical errors and, globally, it provides a better control of theoretical uncertainties.

General structure of TMD evolution
The purpose of this section is to provide the basic concepts and notation for the TMD distributions and their evolution equations. We also introduce a convenient vector notation, which makes transparent some properties of the evolution of TMD distributions which should taken into account carefully. Everywhere in this section, we consider every perturbative series as un-truncated, so their properties can be easily established. Many results of the section could be translated to the cases of other double-scale functions. here & [1,26,29] ζ γF Γ γV D [2,25] ζ γF (= γD) 1 2 γK −γF (g(µ); 1) − 1 2K [7,13,30] --Γcusp 2γ q 1 2 F ff [8] ν Table 1. Correspondence of notation for TMD anomalous dimensions used here to some other popular notations.

Definition of anomalous dimensions
The evolution of the TMD distributions (or TMD evolution for simplicity) is given by the following pair of equations where F f ←h is the TMD parton distribution function (TMDPDF) of the parton f in hadron h. The argument x is the usual Bjorken variable, and b is the transverse distance. The evolution equations for TMD fragmentation functions (TMDFF, and symbolically D f →h ) have the same form with the replacement of F f ←h by D f →h . For the exact field theoretical definition of TMD distributions see e.g. [16]. The equation (2.1) is a standard renormalization group equation, which comes from the renormalization of the ultraviolet divergences. The function γ F (µ, ζ) is called the TMD anomalous dimension and contains both single and double logarithms (see e.g. definition in [16] and eq. (2.5)). The equation (2.2) results from the factorization of rapidity divergences (for the detailed description see e.g. [21,22,29]). The function D(µ, b) is called the rapidity anomalous dimension. TMD and rapidity anomalous dimensions have not unified notation in the literature. The notations γ F and D, used in this article, were suggested in [26]. For convenience we list some popular notations and their relation to our notation in the table 1.
Starting from the definition of TMD operators, whose matrix elements give the TMD distributions, some properties of the evolution have already been established in the past. The evolution equations are independent of quantum numbers of the hadrons which enter in the TMD distributions, because they are properties of the TMD operators. Moreover, they do not depend on the polarization of partons [2,3,16,31] and they are the same for TMDPDF and TMDFF (at least at the two-loop order, see [16]). Altogether, these properties describe the universality of TMD evolution. The only important quantum number for TMDs is the color representation the initiating parton, which is tied to the parton flavor, namely, quark (fundamental representation) or gluon (adjoint representation). However, as the TMD evolution does not mix the flavors and for simplicity of notation, we omit the flavor index f in most of the article. The restoration of the flavor index is straightforward.
The equation (2.1)-(2.2) are coupled, due to the fact that the ultraviolet divergences of the TMD operator partially overlap with the rapidity divergences. As a result, the anomalous dimensions of the two scales are correlated. The mutual dependence can be worked out explicitly (see e.g. [2,8,22]), where Γ is the (light-like) cusp anomalous dimension. The equation (2.3) entirely fixes the logarithm dependence of the TMD anomalous dimension, which reads The anomalous dimension γ V refers to the finite part of the renormalization of the vector form factor. In contrast, the equation (2.4) cannot fix the logarithmic part of D entirely, but only order by order in perturbation theory, because the parameter µ is also responsible for the running of the coupling constant. It has been shown [28] that the perturbative series for D is asymptotical and it has a renormalon pole, whose contribution is significant at large-b. Therefore, the rapidity anomalous dimension D is generically a non-perturbative function, which admits a perturbative expansion only for small values of the parameter b.
On the other side, in conformal field theory, where the coupling constant is independent on µ, the rapidity anomalous dimension is linear in logarithms of µ b and coincides with the soft anomalous dimension [22,24].

General properties of the TMD evolution factor
The solution of eq. (2.1)-(2.2) can be written as where R is the TMD evolution factor. The uniqueness of solution for the system (2.1)-(2.2) is guaranteed by the integrability condition which obviously follows from the equations (2.3) and (2.4). The general form of the evolution factor is where (µ f , ζ f ) and (µ i , ζ i ) refer respectively to a final and initial set of scales. Here, the P denotes the line integral along the path P in the (µ, ζ)-plane from the point (µ f , ζ f ) to the point (µ i , ζ i ). The integration can be done on an arbitrary path P , and the solution is independent on it, thanks to the integrability condition eq. (2.7).

JHEP08(2018)003
The TMD evolution factor R obeys the transitivity relation (2.9) where (µ 3 , ζ 3 ) is arbitrary point in (µ, ζ)-plane and the point inversion property These equations are the cornerstones of the evolution mechanism, since they allow an universal definition of the non-perturbative distributions and the comparison of different experiments.
In practical applications one then has to make a choice for the initial and final scales. For the final scales the typical choice is the hard scale appearing in the process, Q (see also section 6.1). So, (µ f , ζ f ) = (Q, Q 2 ), (2.11) and of course Q Λ. The initial scale (µ i , ζ i ), instead, is the scale where the non-perturbative input for TMD distributions is inserted. This non-perturbative input is usually provided by models, and it is not a subject of TMD factorization. A typical model for TMD distributions incorporates the small-b operator product expansion (OPE), which matches the TMD distributions with integrated distributions and improves the prediction power for high-energy experiments. In this case, the model for TMD distribution has the form where C is the Wilson coefficient function and ⊗ is a convolution in the Bjorken variable x. Here and in the following we use the notation The request for minimization of the logarithmic contributions in the coefficient function in eq. (2.12) dictates the choice of initial scale µ i ∼ b −1 . Let us emphasize here that the parameter ζ remains unrestricted. Often (see e.g. [2,25]), one sets ζ i = µ 2 i . This choice is naively justified by the elimination of ln µ 2 i /ζ i from coefficient function, but actually is not the ideal one. In section 3 and section 5, we critically analyze these common choices, and suggest another selection of scales that guarantees the minimization of the logarithmic contribution in eq. (2.12) on the whole range of b.
Another important point in the implementation of the TMD evolution factor R, is represented by the integration path. The TMD evolution factor R is path independent, however in practice, one has to provide a choice. The two simplest choices of integration paths are the combinations of straight segments as Figure 1. Illustration of evolution paths corresponding to different solutions. Red lines show the solutions 1, 2, and 3, defined in eqs. (2.14), (2.15), and (2.14), correspondingly. The blue line shows the path of the improved D solution (4.6) with the normalization point µ 0 . Green line shows the path of the fixed-µ solution, (2.35). The light-green curve shows the null-evolution curve which passes though the point (µ i , ζ i ). The evolution along light-green curve is absent.
In the first path the evolution is along µ first and then along ζ, while in the second path the evolution is along ζ first and subsequently along µ. In the (µ, ζ)-plane these paths form a rectangle, see figure 1. We call the solutions corresponding to these paths as solutions 1 and 2, for simplicity. Their explicit forms are The solution 1 is practically the only one used in the literature, since it has the form of the resummed Sudakov exponent, see e.g. [15,32,33].
In the next section we discuss the effects of violation of path-independence. So that the solutions 1 and 2 can serve as natural extreme cases. For comparison, we also introduce an intermediate solution whose path has the form of a straight line between points (µ f , ζ f ) and (µ i , ζ i ). We call it the solution 3. Its explicit form reads where t parameterizes the path of integration, µ(t) = (µ f −µ i )t+µ i and ζ(t) = (ζ f −ζ i )t+ζ i .

Two-dimensional notation and the scalar potential for TMD evolution
The TMD evolution is naturally formulated in the terms of two-dimensional vectors and fields. In this section, we introduce the vector notation and rewrite the main equations of the previous sections. By the bold font we designate the two-dimensional vectors.

JHEP08(2018)003
Let us introduce the convenient two-dimensional variable which treats scales µ and ζ equally, (2.17) Here the notation 1 GeV 2 is set to indicate the unit transformation from the dimensional parameters µ and ζ to dimensionless ν. The particular value of normalization plays no role in the following discussion, but could be easily reconstructed if necessary. We also define the standard vector differential operations in the plane ν, namely, the gradient and the curl The TMD evolution is defined by the anomalous dimension which form the vector field E(ν, b). Explicitly, it is defined as Here and in the following, we use the vectors ν as the argument of the anomalous dimensions for brevity, keeping in mind that D(ν, b) = D(µ, b), γ F (ν) = γ F (µ, ζ), etc. In other words, the anomalous dimensions are to be evaluated on the corresponding values of µ and ζ defined by value of ν in eq. (2.17). The TMD evolution equations (2.1), (2.2) in this notation have the form and thus the vector field E has the meaning of the evolution flow field. Correspondingly, the TMD evolution factor (2.8) reads Written in such form the TMD evolution suggests multiple analogies with different branches of physics. Individually, the equations (2.3), (2.4) do not imply any special geometrical meaning. In contrast, the integrability condition in eq. (2.7) that can be seen as a consequence of eqs. (2.3), (2.4), has a deep meaning and it is equivalent to the statement that the evolution flow is irrotational, The irrotational vector fields are also known as conservative fields, and they can be presented as a gradient of a scalar potential, i.e. U is the scalar potential for TMD evolution. According to the gradient theorem any line integral of the field E is path-independent and equals to the difference of values of potential at end-points. Therefore, the solution in eq. (2.20) can be presented as In this form the evolution kernel is explicitly path-independent and obeys the transitivity property in eq. (2.9). The explicit form of the scalar potential can be found by integrating eq. (2.23), namely where ν 1,2 are the components of the vector ν in eq. (2.17), and the last term is an arbitrary b-dependent function.

Singularities on evolution plane
The evolution flow and the scalar potential have a non-trivial structure which is discussed in the present and in the following sections. The graphical representation of the evolution flow is shown in figure 2. It is of great importance to classify the singularities of the scalar potential and the evolution flow. In particular we are interested in the singularities that are located at finite values of parameters. There are two of them. First, there is the line µ = Λ (where Λ is the position of the Landau pole) at which both components of E turn to infinity. On top of this line, and for smaller µ the scalar potential is undefined. In figure 2 this line is not shown and it is located on the left side of the plotted region. Second, there is a saddle point where both components of E turn to zero. In figure 2 the saddle point is depicted by a blue dot. The position of the saddle point is dictated by the equation (2.26) In the standard notation this equation reads

JHEP08(2018)003
At one-loop these equations are functionally independent on a s (µ) and the saddle point position can be found explicitly. This value can be used as a good approximation of saddle point position (here for the quark flavor) (2.28) The position of saddle point depends on the parameter b, see figure 4. Generally, it moves to larger values of µ and ζ for smaller-b. In particular, at some (large) valueb the saddle point crosses the Landau pole line and escapes the observable region. Using eq. (2.28) we can estimate thatb ≈ 2e −γ E /Λ ≈ 4GeV −1 .

Null-evolution curves
The equipotential curves play the special role. Along these curves the scalar potential for TMD evolution does not change its value, and consequently the TMD evolution is 1 (unity) between points laying on the same equipotential curve. For this reason the equipotential curves are also null-evolution curves. Let us denote the equipotential curve which passes through the point ν B as ω(t, ν B , b). This curve is also a solution of dω dt · ∇U (ω, b) = 0, (2.29) where t parameterizes the curve ω. A convenient parameterization of equipotential curve is where we identify the first component of the vector ω with the parametrization parameter. In this form eq. (2.29) turns into where we omit the arguments ν B and b of the function ω for brevity. The solution of this equation reads Using the connection of the derivative of rapidity anomalous dimension to cusp anomalous dimension (2.7) we simplify the solution (2.32) and obtain This expression can be also obtained using the definition of equipotential curve as U (ω) = U (ν B ), and the fact that the scalar potential in eq. (2.25) is linear in ν 2 . Note, that there is an additional equipotential curve that is not included in the solution (2.32). It is the line µ = µ saddle . In eq. (2.32) this line is singular.

JHEP08(2018)003
The equipotential curves in eq. (2.32) do not intersect with each other with a single exception: the line µ = µ saddle , and the line defined by eq. (2.32) with ν B = ν saddle . These lines intersect at the saddle point. For their selected definition we call these curves as special null-evolution curves. Special null-evolution curves are shown in figure 2 by red lines. The evolution plane is cut by the special equipotential lines into quadrants and in each quadrant the sign of the components of the evolution field E is preserved. In particular, both components of E are negative in the first quadrant.
The evolution along any null-evolution curve is absent. This property can be used to simplify the explicit expression for the evolution kernel in eq. (2.8). Using the transitivity property of R, eq. (2.9), the evolution path can be split into two segments one of which is along an null-evolution curve, i.e.
on the null-evolution curve can be selected arbitrarily. Nevertheless it is convenient to use the point with t = ln µ 2 f so that the path of evolution has only a single vertical segment, see the green curve in figure 1. We address to this particular path as to the fixed-µ solution. In the standard notation the evolution kernel along the fixed-µ solution path reads fixed-µ solution :

Effects of truncation of perturbation theory
The picture described above is idealistic. In real applications one operates with only a few terms of the perturbative series for the anomalous dimensions. Nowadays, these anomalous dimensions are known up to three-loop order (i.e. including term a 3 s or up to NNLO), see [23,24,[34][35][36]. In figures 3 we show the function R for different orders of perturbation theory and for different explicit path solutions given in eq. (2.14), (2.15) and (2.16). The final point of the evolution is set to Q = M Z = 91GeV, which corresponds to the Z-boson production threshold. The initial point for the evolution has been set to ( GeV, as it has been used in [1]. We observe that dissimilar realizations of R, which differ only by the integration path (and, in principle, are equivalent), produce enormous numerical differences. Even at b ∼ 0.5GeV −1 , which is still a typical perturbative value (the strong coupling a s varies in the range ∼ 0.01 − 0.02 within the evolution integral), the difference between solutions 1 and 2 is (∼ 56%,∼ 35%,∼ 18%) at (LO, NLO, NNLO) respectively. The large spectrum in the values of the solution is clearly an effect of truncation of the perturbative series that is enhanced by the presence of logarithms in the rapidity anomalous dimension D. In the case of solution 1 these logarithms are ln(µ f b), while for solution 2 these are ln(µ i b). This effect can be reduced by an appropriate resummation procedure.
The blue line is the solution 1. The red Line is the solution 2. The green line is the solution 3. The error band is obtained from the improved D solution at µ 0 = µ i by variation of µ 0 ∈ (0.5, 2)µ i . The blue line with error-band corresponds to the solution used in [28].
The path dependence of the solution leads to another potentially very dangerous problem, namely, the explicit violation of evolution transitivity and evolution inversion relations of eq. (2.9), (2.10). This effect is especially difficult to control. The path dependence prevents a clear direct comparison of fits when they are obtained with evolutions over different paths (which is practically always the case). Additionally, the path dependence makes more evident that the shape of non-perturbative modifications of the rapidity anomalous dimension D, which are necessary at large b, are even more difficult to compare.
A common approach is to use the "renormalization-group improved" rapidity anomalous dimension (see e.g. [2,8]). In section 4.1 we demonstrate that such a method corresponds to an evolution along a specifically selected path. It is not the only method to resolve the solution dependence problem, because the path-dependence is caused not by large logarithms but by the run of coupling constant as it is demonstrated in section 3.1. The presence of logarithms only amplify the numerical evidence. Therefore, there are two principal solutions for the problem, either to use the commonly defined classes of evolution paths, either to use a solution that is explicitly independent on the path. We present examples of both methods in section 4.1 and section 4.2 respectively.
To our best knowledge such a problem is unique for a double-scale evolution. Clearly, it must be taken into account in phenomenological applications and during the comparison of models and fits. We emphasize that the naive application of resummed rapidity anomalous dimensions does not solve the problem of path dependence of the solution, although it reduces its numerical importance. To control the effects of resummation and guarantee the perturbative convergence for the evolution factor one should take into account the two dimensional nature of TMD evolution. This section is devoted to a detailed description of the effects of truncation of the perturbative series in TMD evolution, and to disclose the sources of solution dependence.

TMD anomalous dimensions in truncated perturbation theory
We recall that the perturbative expansions for the ultraviolet anomalous dimensions read where a s = g 2 /(4π) 2 . The leading coefficients in these expansions are Γ 0 = 4C F and γ 1 = −6C F for the quark. In the gluon case, they are Γ 0 = 4C A and γ 1 = −2β 0 (where β 0 is defined after eq. (3.3)). For the collection of higher order terms see e.g. appendix D in ref. [16]. The perturbative series for the rapidity anomalous dimension D is where L µ is defined in eq. (2.13) and d (n,k) are numbers. Note, that using eq. (2.4) the coefficients d (n,k) with k > 0 are expressed in the terms of d (i,0) , Γ i and the coefficients of βfunction. The leading terms of D are d (1,1) = Γ 0 /2 and d (1,0) = 0. The explicit expressions for d (n,k) up to n = 3 can be found in [22]. The running of the coupling constant is given by In order to study the effects of the truncation of perturbation theory one has to carefully examine some formally exact relations. In our case the path dependence of the TMD evolution is introduced by the violation of (2.4). Since the relation among anomalous dimensions is spoiled, in the following, we consider γ V , D and Γ as three independent functions.
Let us introduce a new function which accumulates the violation effect, namely By δΓ (N ) we denote the function δΓ when the expression for the D and Γ are truncated at a N s (inclusive). One can show that whereβ n is the β-function with first n terms removed For instance, we have In these expressions we take care not to expand the β-function because in applications it can be of different perturbative order with respect to the rest of anomalous dimensions. Given a truncation of the perturbative series at order N , the function δΓ is formally of the next perturbative order. Nonetheless, it is easy to see that its main contribution is always enhanced by powers of logarithms. In fact, we have Therefore, at any (finite) perturbative order there is a region of (large-)b where δΓ ∼ O(1). Moreover, since typically at large b the scale µ approaches some fixed value, the boundary of the region δΓ ∼ O(1) approaches some fixed value for N → ∞. In other words, it is not possible to keep δΓ small by increasing the order of perturbative theory N . One can always find the region of b > b 0 where δΓ (N ) ∼ O(1) for any N . Clearly, such large values of b correspond to the non-perturbative regime of QCD. Nonetheless, even within a defined model for non-perturbative physics, this ambiguity is present and it should be fixed. A direct consequence of the violation of eq. (2.4) is the loss of the integrability condition in eq. (2.7) and consequently the solution of eq. (2.8) is path-dependent. On top of this, the violation of integrability condition turns into the violation of the transitivity condition of eq. (2.9) and the inversion rule of eq. (2.10). For example, for the solutions 1 and 2 we have (3.10) This demonstrates that if a particular evolution solution has been used for modeling or fitting, in order to extend it to a broader interval of energies one should apply an inverted evolution solution. In turn, this can introduce some extra effects due to the violation of transitivity. It is clear that the effect of solution dependence is proportional to the area between different paths. Therefore, the evolution between well separated scales has an additional enhancement. Specially for this reason the problem of ambiguity should be considered with care before any global fit which would connect high-energy Drell-Yan and low-energy SIDIS data.
To conclude this sub-section we recall that at small-b the discussed problem could be softened by resumming of the contributions ∼ a s L µ , which can be done either implicitly by an improved D method, which is discussed in section 4.1, either explicitly as in [26] (see also appendix A). In this case, we have (3.11) Therefore, the integrability condition is still violated but to a smaller extent. Yet the resummation methods are valid only for the regions of b where the non-perturbative effects are negligible. For larger-b some prescription has to be used. Let us emphasize that the violation of integrability condition, and thus the path dependence of evolution, is not caused by the logarithms in the rapidity anomalous dimension. The logarithm contributions only amplify the numerical amount of violation and make this effect evident. This argument can be evinced by examining the expression (3.8), which is non-zero even when the logarithmic terms were absent. Therefore, the path dependence problem can not be solved entirely by a resummation of logarithmic contributions. On the contrary, the integrability condition is exactly preserved if the β-function is zero (i.e. in conformal field theories), even if the value of L µ is large.

Formal treatment of TMD evolution in the truncated perturbation theory
In this section we present the formal treatment of the evolution field in the truncated perturbation theory, where eq. (2.22) does not hold. In other words, the evolution field E JHEP08(2018)003 is a non-conservative vector field. Using the Helmholtz decomposition we split the evolution field into two parts The fieldsẼ and Θ are irrotational and divergence-free respectively, where curl(curl) = ∇ 2 . They are orthogonal to each other The irrotational fieldẼ is the conservative part of evolution flow, and can be written as the gradient of a scalar potentialẼ The divergence-free part in two-dimensions can be written as the vector curl of another scalar potential where operation curl is defined in eq. (2.18). The curl of the evolution field can be calculated using the definitions (2.3), (2.4), (3.4), and, using to Green's theorem, the closed-contour integral of the evolution field is where C is some closed contour and Ω is the area surrounded by this contour. Using this expression, we can calculate the difference between solutions evaluated on different paths, see eq. (2.21), where P 1 ∪ P 2 is the closed path build from paths P 1 and P 2 and Ω(P 1 ∪ P 2 ) is the area surrounded by these paths. In turn using the independence of δΓ on the variable ζ, eq. (3.4), we can rewrite it as where ζ 1,2 (µ) is the ζ-component of the path P 1,2 at the scale µ. In the case of solutions 1 and 2, paths are straight and thus ζ 1,2 are independent on µ. Therefore, comparing solution 1 and 2 we obtain

JHEP08(2018)003
One can see that this expression is enhanced by an extra logarithm of scale separation. This logarithm is typically large, namely ∼ O(L Q ). Using the order estimation eq. (3.9) we have In figure 3 one can observe the difference in the numerical value of eq. (3.21) comparing red and blue lines. In the resummed case eq. (3.11) one obtains These estimations describe the observation that the effect of solution path-dependence is significant, even in the resummed case. Indeed, assuming counting a s L ∼ 1, the difference between solution 1 and 2 is ∼ a N −2 s in the resummed case (in the fixed order case it is fixed ∼ a −1 s ). So at N = 3 (that is indicated as NNLO) the difference is as large as improvement between LO and NLO, which is clearly seen in figure 3.

Restoration of path-independence
From the discussion above one can infer that the path-independence of the TMD evolution passes through the conservation of the evolution flow field E.
One possibility to achieve it consists in modifying the evolution field such that the divergence-free component vanishes and, as a result, only the curl-free component enters in the evolution factor. The expression for the TMD evolution factor has the potential form (compare to eq. (2.24)) whereŨ is the scalar potential determined byẼ, eq. (3.15). In general, the potentialŨ does not coincide with the potential U defined in eq. (2.25). Moreover, the scalar potential U satisfies the gradient equation (2.23), while, in contrast, the scalar potentialŨ satisfies the Poisson equation Consequently, the potentialŨ can be fixed only up to an arbitrary harmonic function f (ν) (with ∇ 2 f = 0). To fix this ambiguity, an additional statement on the fieldẼ is required, e.g. a boundary condition on a line. Such a boundary condition is equivalent to imposing a null value of the divergence-free component Θ. Unfortunately, nowadays, any statement on the non-perturbative behavior of D is mostly a conjecture.
In this work instead we pursue a different strategy. Instead of defining the boundary condition for the eq. (4.2), we repair the compatibility condition in eq. (2.7) by improving the definition of anomalous dimensions γ F and/or D with terms of higher-perturbative order. Of course this improvement is not unique, so that here we explore the cases where only one of these anomalous dimensions is changed. In the following section we consider both scenarios, and call them improved D in section 4.1 and improved γ scenarios, section 4.2. Of course, both these scenarios are equivalent to a particular selection of the scalar potentialŨ .

Improved D scenario
In order to fix the features of this scenario one observes that the relation (2.4) can be used as an exact relation, i.e. in order to guarantee it to all orders, we replace the perturbative expression for D by the solution of (2.4). In this way one obtains In the improved D picture the scalar potentialŨ is obtained from eq. (2.25) replacing D by eq. (4.3). It reads One can demonstrate that this approach is equivalent to imposing to the solution of the Poisson equation eq. (4.2) the condition The expression for the corresponding TMD evolution factor depends on µ 0 and reads improved D solution: Comparing the improved D solution with the solutions 1 and 2 in eq. (2.14), (2.15) we conclude that it corresponds to a composition of solution 1 and 2 in the usual implementation of TMD evolution where ζ 0 is arbitrary. The integration path of the improved D solution is shown in figure 1 by blue lines. The improved D solution satisfies transitivity and inversion relation eq. (2.9), (2.10), and at µ 0 = µ i (µ f ) it turns into the solution 1, eq. (2.14) (into the solution 2, eq. (2.15)). The improved D scenario, is often used in the literature in different forms. For instance, the equation (4.3) is used for the resummation of logarithmic contributions within D in [1,2,8,15,25,26,37]. In these cases one has to select µ 0 such that the effect of logarithms in D is minimized, that is, typically µ 0 ∼ b −1 at small-b.
Since the improved D solution is a composition of solutions 1 and 2, it can be seen as the convention for the fixation of a common path for all evolution procedures which depends on the choice of µ 0 . Once the convention for µ 0 is established the comparison of different fits and models is plain. For instance one can propose to accept the solution of eq. (4.5) as a basic agreement. Nevertheless in the absence of such an accepted convention, the improved D solution should be considered with caution because the numerical differences between -18 -JHEP08(2018)003 different µ 0 could be large. It can be seen already in figure 3, where the initial scale is selected as µ i ∼ b −1 , and thus fulfills the requirement for logarithm minimization. The solution 1 corresponds to (4.3) with µ i = µ 0 . The blue band on it corresponds to variation of µ 0 ∈ [µ i /, 2µ i ] and all these values have reduced logarithm contributions. The width of the band reduces with the increase of perturbative order, but it is still non-negligible at the highest available order.

Improved γ scenario
The integrability condition in eq. (2.7) can be fixed modifying the anomalous dimension γ F and without using directly eq. (2.3). In this way, one changes the value of the higher order terms in γ F . The modified value of γ F (that in the following is denoted by γ M ) is dependent on b, and reads The corresponding scalar potentialŨ is obtained from eq. (2.25) by the replacement of Using the definition of δΓ, eq. (3.4) and integrating by parts we rewrite this expression in a notably simpler form Therefore, the corresponding solution for the evolution factor reads improved γ solution: The expression in eq. (4.11) is exceptionally simple, and it explicitly satisfies the transitivity and inversion relations eq. (2.9), (2.10). We stress that this solution is independent of any intermediate points (like µ 0 in the improved D case) so that one does not have to rely on a common convention for this intermediate point. This is a clear advantage of the improved γ scenario in comparison to the more traditional improved D scenario. All these advantages are also true when the values of D is modified (e.g. by a non-perturbative contribution). Note that for practical applications one has to take care of the logarithmic contributions within D. In contrast to improved D scenario, where the logarithmic contributions were effectively resummed by the selection of scale µ 0 , the improved γ scenario does not include any explicit resummation. Therefore, the rapidity anomalous dimension should be taken in a resummed form, e.g. by means of renormalization group For large-b the perturbative expansion of D is not valid. The range of validity of the perturbative expansion b <b can be determined by different methods. One can use the resummed expression [26] (see also the appendix A) and determine the position of the singularity in it. At the leading order the singularity happens at X = β 0 a s (µ)L µ = 1. Within such determination the value ofb depends on µ (see the discussion on the behavior of this value in [26]) and it does not give a clear indication of the perturbative domain. Another way to fix the range of validity of the perturbative series of the D function is to consider the stability of the resummed large-β 0 series as it was done in [28]. The analysis made in ref. [28] demonstrates that the boundary of perturbative region isb ∼ 3−4 GeV −1 .
In the present framework we observe that there exists another natural definition of b, as the value at which µ saddle < Λ. This value isb ∼ 3.5 GeV −1 , and thus practically coincides with the renormalon estimation [28].
At large-b the shape of the rapidity anomalous dimension is unknown. In fact, the only known information about non-perturbative structure of D is that it receives renormalon correction ∼ b 2 [27,28] (see also [38]). It is clear that this contribution is only the first one of a series of power corrections. So, at large-b the expression for D should be extracted from data fitting, while at small-b it should match the perturbative expression. Practically the passage from the perturbative to the non-perturbative regime can be done, e.g., by a simple modification where b max is a parameter, such that b max <b. An example of such a form for the nonperturbative correction for rapidity anomalous dimension has been suggested a long ago in [33], as part of the b * prescription [2]. Let us stress that the choice of a b * can be admissible separately for the evolution factor and that eq. (4.12) does not imply b * -prescription for the whole TMD distribution. With the choice b max <b the saddle point is always in the observable region, which (as it is discussed in the section 5) allows to determine the optimal TMD.
We note that at large-b the derivative of D NP determines the function δΓ NP . I.e. (4.14) In the model in eq. (4.12) it is equal to δΓ(µ, b * ). Note, that δΓ NP is smaller at large-b since there is no L µ to blow up. Therefore, given the non-perturbative model the problem of solution path-dependence is weakened, and the improved D and improved γ solutions converge to the same. Practically, the implementation of non-perturbative modification of evolution consists in a replacement of D in the formulas of previous the sections by D NP .

JHEP08(2018)003 5 ζ-prescription and optimal TMD distribution
The proper construction for the TMD evolution factor is only an (important) piece of the TMD evolution implementation. Another (important) piece is the selection of initial scales for the TMD distribution model. In this section we demonstrate that this problem has a natural solution, that we call the ζ-prescription. In section 5.1 we introduce the concept and main characteristics of ζ-prescription. In section 5.2 we provide expressions for matching coefficients in ζ-prescription. Finally in section 5.3 we present a particular implementation for ζ-prescription that has some exceptional properties. We call the TMD distribution defined in this particular prescription, the optimal TMD distribution. It is one of main proposals of this work.

ζ prescription
The final point of the rapidity evolution, ζ f in eq. (2.11), is as usual dictated by the hard subprocess. On the contrary, the initial value of the rapidity scale ζ i should be selected depending on the input for the non-perturbative behavior of the TMD distribution. In practice the majority of phenomenological models at small values of b match the TMD distribution to the corresponding collinear distribution. This matching guarantees the agreement of model with high-energy data, and determines significant part of the TMD distribution. The expression for small-b matching has the form where f is PDF or FF, and C is the Wilson coefficient function. For the unpolarized TMDPDF and TMDFFs the coefficient functions are known at NNLO [16,29,30], while for the polarized cases they are know only for twist-2 matching at NLO [31]. The coefficient function includes the dependence on b within the logarithms L µ and L √ ζ . In this way, the initial scales (µ i , ζ i ) explicitly enter in the TMD modeling.
The traditional choice of initial values used in many studies suggests ζ i = µ 2 i , see e.g. [2,25,39]. While this choice looks natural, it has some serious drawback which undermines its stability. In particular, this scale choice leaves uncompensated the logarithms L µ in the coefficient function. The remaining logarithms L µ unrestrictedly grow at larger b. In this regime the matching in eq. (5.1) is not valid, and thus should to be modified. In turn, any non-perturbative modification requires another matching procedure of the largeb non-perturbative regime with the small-b perturbative expansion. An example of such a procedure is offered by [2], where b * -prescription is used as a non-perturbative modification of eq. (5.1). We remark that such a procedure has a poor stability in the perturbative-tonon-perturbative transition , due to the fact, that any deviation from the matching scale uncovers the uncompensated logarithms. As a result the scale variation around ζ i = µ 2 i , induces some large error-bands. All-in-all, we come to conclusion that the popular choice of initial scales ζ i = µ 2 i is accidental and does not grant any improvement in the understanding of TMD distributions.
The main idea of the ζ-prescription is to use the two-dimensional nature of TMD evolution to improve and to extend the perturbative stability of the small-b expansion -21 -

JHEP08(2018)003
to the full range of b. This idea has been used in [1], where it has been shown that a particular choice of ζ i as a function of µ i completely eliminates the double logarithms from the coefficient functions. In [1] the largest known set of Drell-Yan data has been fitted within ζ-prescription, and without any extra non-perturbative matching, which shows the practical success of ζ-prescription.
The ζ-prescription consists in a special choice of ζ i value as a function of µ and b. The value of ζ i is selected such that the initial-scale TMD distribution is independent on µ i . We denote the corresponding value of ζ i as ζ µ i (b). The function of ζ µ (b) draws a curve on the evolution plane. By definition of ζ-prescription, the TMD distribution does not evolve along this curve, and thus it is one of the null-evolution curves defined in section 2.5. Therefore, the expression for a TMD distribution in the ζ-prescription reads Note, that the point ν B in eq. (5.2) just represents a label. It only indicates the selected null-evolution curve, but does not enter the function F (x, b; ν B ) explicitly. In other words, in eq. (5.2) the scale ν B can be changed to another scale ν B , as long as ν B belong to the same null-evolution curve, In this sense, instead of labeling a TMD distribution by a two parameter label (µ i , ζ i ), we can specify a single parameter label, given by an equipotential curve ν B . To emphasize this concept we use the single argument ν B in the notation of TMD distribution, eq. (5.2). Since the single-labelled TMD distributions depend only on the selected null-evolution curve the value of the initial scale µ i is irrelevant (as far as it belongs to a selected nullevolution curve). In particular, it allows to eliminate the parameter µ i from error analysis, which is equivalent to the evolution along the path of the fixed-µ solution in eq. (2.34). Obviously, such a form is very convenient because the scale µ f is related to the hard scale Q, and thus the evolution exponent is entirely perturbative. Additionally, the explicit form of the fixed-µ solution is notably simpler, see eq. (2.35). The passage from one null-evolution line to another can be done using the TMD evolution. We have Here the TMD evolution factor is a universal constant that measures the difference between potentials of null-evolution curves. In section 5.3 we show that when performing a TMD modeling, there is a preferred choice for the null-evolution curve namely ν B = ν saddle . This choice defines the optimal TMD distribution. The ζ-prescription separates the modeling of the TMD distribution from the factorization procedure. This is the central feature of ζ-prescription, which is absent in formulations -22 -

JHEP08(2018)003
of TMD factorization used before. In non-ζ-prescription formulation the TMD distribution has a µ-dependence that is typically related to the scale b. Thus the evolution, and hence non-perturbative modification of D, is somehow incorporated into the model for the TMD distribution. This fact makes difficult and sometimes impossible the comparison among different TMD non-perturbative estimations such as lattice or low-energy effective theories. The ζ-prescription is self-consistent only when the evolution field is conservative. If this is not the case, the ζ-prescription in principle could not be implemented because equipotential curves could not be defined for non-conservative fields. Therefore, in the truncated perturbation theory (which is the only practically possible case) the improved scenarios should be used. The naive version of ζ-prescription used in [1] uses the improved D scenario with µ 0 = µ i , and thus it is not entirely consistent. Additionally, the naive ζ-prescription in [1] also uses the perturbative series for the definition of the null-evolution curve, instead of eq. (2.33), which gives additional inconsistency. These inconsistencies have been somewhat tested by variation of scales c 1 and c 3 (see discussion in section 6.1). The corresponding variations give the dominant contribution to [1] error-band. An updated version of the arTeMiDe [40] code which removes these inconsistencies and implements the optimal TMD distributions will be soon released.

Matching coefficient in ζ-prescription
A typical model for TMD distribution incorporates the small-b matching to the collinear functions. The ζ-prescription guarantees that the matching coefficient is free from the double logarithmic contribution, which makes it more stable at larger b. In this section we derive the details for the small-b matching coefficient within ζ-prescription. We do not restrict the discussion to some particular quantum numbers of the TMD distributions and collinear distributions, since the general structure is universal.
The small-b matching has the form of eq. (5.1). The label n enumerates the collinear distributions contributing to small-b OPE at the desired order, which in general, are not restricted to leading twist distributions. The evolution of the collinear distribution f is given by where the function P is the splitting function and f enumerates all intermediate flavors that mix in the matching. For the twist-2 distributions the eq. (5.6) is known as DGLAP equation, and the sign ⊗ represents the Mellin convolution. The distributions of twist higher then 2 generally depend on several variables x i . In this case, the variable x in eq. (5.6) represents a collection of variables and ⊗ is an integral convolution over these variables. Using eq. (5.6) and the TMD evolution eq. (2.1), (2.2) we derive

JHEP08(2018)003
These equations fix the logarithmic part of the coefficient function order-by-order in perturbation theory. The explicit expression for the logarithmic part up to two-loop order can be found e.g. in [1,16,29]. The value of ζ µ (b) is defined through eq. (2.31), which explicitly reads as Evaluating equations eq. (5.7), (5.8) at ζ = ζ µ (b) and using eq. (5.9) for simplifications, we obtain whereĈ(x, b; µ) = C(x, b; µ, ζ µ (b)). Thus the perturbative series for the coefficient function has the form (here up to NNLO) where we omit the arguments of the functions as well as the sign for the summation on f for brevity. In eq. (5.11) the functions P (n) and C (n) admit the expansion The constants c i do not depend on x, but they depend on the boundary choice of the equipotential curve, ν B . The dependence on the parameter ν B is entirely concentrated in the constants c i . To fix it eq. (5.9) has to be solved order by order in perturbation theory. The solution of eq. (5.9) up to NNLO is We remark here that the perturbative expression for the equipotential line eq. (5.13), (5.14) is universal in the sense that it depends on the quark or gluon origin of the parton, but not on other quantum numbers.
Within the ζ-prescription, the convolution C ⊗ f is more stable at large-b, due to the absence of double logarithms and the peculiar functional form of logarithm coefficients. Even in the extreme asymptotic cases the shape (the x-dependence) of the convolution C⊗f does not blow up, but behave as it is expected from the naive probabilistic interpretation of collinear distributions. For example, in the case of unpolarized distributions the first x−moment of the convolution C ⊗ f is constant at all orders of perturbative expansion, due to the charge conservation. This fact has been already tested and confirmed in fits of the unpolarized distribution made in ref. [1].

Optimal TMD distribution
As an outcome of previous section one finds that the coefficient function does not depend on the scale of TMD evolution µ i . Instead, the scale that appears in the explicit expressions of eq. (5.11), is the intrinsic scale of OPE. To avoid confusion we denote it by µ OPE . Therefore, the small-b matching of TMD distribution within ζ-prescription has the generic form The matching scale µ OPE is the intrinsic scale of OPE, and is a free parameter. However, its values are restricted to the values of µ spanned by the defining null-evolution curve. In accordance to the general structure of the evolution plane presented in section 2.5-2.4, we -25 -

JHEP08(2018)003
have following restrictions on the parameter µ OPE if ν B,1 < ln µ 2 saddle ⇒ µ OPE < µ saddle , It is clear that the last case is preferable, since the model of TMD distribution is completely unrestricted. Additionally, only this case has a unique definition. The optimal TMD distribution is the distribution defined on this special null-evolution curve. We denote it simply as F (x, b) emphasizing its scale independence and uniqueness.
The values of ν saddle are given by eq. (2.27). Comparing the second equation eq. (2.27) with the perturbative expression (5.13) we find the values of constants r i The corresponding values of constants c i are The values of L s could be found by solving the transcendental equation D(µ saddle , b) = 0. At one loop the solution is L s = 0 and given in (2.28). At higher loops this equation can be solved only numerically. The value L s slowly grows at larger b, but it remains numerically small in comparison to other ingredients of TMD evolution, see figure 4. Practically, it is inconvenient to have functions L s in the coefficient function, because it requires to update the expression for the coefficient function with each correction to the evolution. The more convenient way is to determine the coefficient function on the curve with r 1,2 = 0 (which is equivalent to L s = 0), and take into account the deviation from the exact special null-evolution line by the factor in eq. (5.5). Then the expression for the coefficient function reads where C pert f ←f is given by eq. (5.11) evaluated on the particular values of r that determine ζ pert µ . In particular, r 1,2 = 0. The parameter µ in eq. (5.24) is a free parameter.
At large-b the saddle point could escape the observable region, i.e. it could appear that µ saddle < Λ. In this case the determination of universal scale-independent TMD distribution is ambiguous, since there is no way to fix a special null-evolution line. Of course all this should be prevented by an appropriate non-perturbative modification of D, as it is discussed in section 4.3.

Perturbative uncertainties in TMD factorization
The TMD factorization describes processes such as Drell-Yan, SIDIS and back-to-back hadron production in e + e − -annihilation. The factorized expressions for these cross-sections have as common form the Fourier transformation of a pair of TMD distributions. In this section we concentrate on the tests of perturbative stability of factorized cross-section that is usually done varying renormalization/factorization scales. Despite the fact that the analysis by variations of renormalization scales have not statistical meaning, it is an important part of the phenomenological studies, since it tests the falsifiability of the theory. Eventual large bands produced by such variations indicate a convergence problem in the perturbative approach and shows the limits of factorization. In this section we demonstrate that the usage of the path-independent solution reduces the variation band, due to absence of the associated path dependence uncertainty. We start this sections recalling in section 6.1 the most common inputs for the implementation of TMD inside cross sections. Then in section 6.2 we discuss the implementation of the improved γ-scenario and finally in section 6.3 we write the cross section using the optimal TMDs. In the following, we show examples with the Drell-Yan cross-section, for definiteness. All the results presented in this section hold for other TMD processes with the appropriate replacements.

More traditional implementation of cross-sections within TMD factorization
Within TMD factorization (and hence at q T Q), the cross-section for Drell-Yan processes has the generic form where dX is the q T -differential phase-space element, σ 0 is the normalization of the crosssection, H is the hard part, and F are TMDPDFs. The values of x 1,2 are dictated by the kinematics. The parameters ζ are constrained as ζ f ζ f = Q 4 . It is natural to consider The scale µ f is generically unconstrained but it is selected µ f ∼ Q in order to minimize the logarithms of Q/µ f that are present in the hard coefficient function. Therefore, the final evolution scale of the TMD distributions is (µ f , ζ f ) = (Q, Q 2 ), as it is discussed in eq. (2.11).
The model for TMD is made at the initial scale (µ i , ζ i ). The connection between the external-kinematic dependent hard scale and the TMD distribution, is made by the TMD evolution factor. In the improved D picture eq. (4.6), the practical expression for TMD cross-section reads where R is defined in eq. (4.6). Here, we have used the fact that R f = R f as far as both partons are gluons, or quarks. The models for a TMD distribution conventionally include the small-b matching to the integrated distribution supplemented by a non-perturbative function. The typical form is where C is the perturbatively calculable matching coefficient, f is the collinear distribution, and f N P is an ansatz for the non-perturbative large-b behavior of TMD distribution and it is the object of the fitting procedure. Here, we specially separate the scales of TMD distribution from the scale of OPE, to keep the discussion at the most general level. This is a typical construct used for the phenomenology. The particular details and the choice of scales the implementation vary among authors, compare e.g. realizations used in refs. [1,11,25,39,41,42]. In this way the traditional implementation of the TMD cross-section contains four renormalization scale entries of the perturbative series and consequently four scales µ. These are (µ 0 , µ f , µ i , µ OPE ). The scales (ζ i , ζ f ) are usually related to (µ i , µ f ) and they are not independent. In the infinitely precise perturbation theory the cross-section is independent on each scale µ separately and the residual dependence on each of these scales is an artifact of truncation of the perturbative series.
The standard method to test the dependence on the scales, and thus the stability of the perturbation theory prediction, is to multiply each scale by a parameter [1,11,41,43] and vary the parameters nearby the central value. E.g. in the notation of [1], one changes scales as and checks the variations of c i ∈ (1/2, 2). The variation produces a band which roughly represents the size of the next-perturbative order contribution. Each constant c i explores a particular theoretical error. The numerical source of the band is the mismatch between resummed (and hence "exact") expression (e.g. TMD evolution factor, or PDF) and the fixed  Figure 6. Effect of variation of constants c i on the Z-boson production cross-section. The right panel shows the envelope of bands. The picture is from [1]. For the definition of perturbative orders and other details see [1].
order coefficient function (e.g. hard coefficient function H, or small-b matching coefficient). In the TMD cross-section there is an additional source of perturbative scale-dependence, namely, the solution path-dependence. This error is undesirable, since it does not entirely tests the convergence of perturbation theory, and depends on the particular realization of the numerics. Examining the expression (6.2) we can sort the variation which test these cases.
• The variation of c 1 tests only the path dependence of evolution. For that reason the variation band for c 1 is uniformly large, unstable, and not significantly reducing with order improvement. It is absent in any path-independent solutions.
• The variation of c 2 and c 3 tests both the perturbative convergence and the path dependence. The latter is the subject of a particular realization of the evolution exponent. E.g. for the improved D solution (for well-separated µ 0 and µ f ) the variation of c 2 does not deform the path, as it is demonstrated in figure 5. Whereas the variation of c 3 does deform the path. The effect of it is clearly seen in figure 6 where the c 3 band is dominant (figure 6 is taken from [1], where cross-section has been taken in the form of eq. (6.2).). The usage of a path-independent solution removes this contribution from the c 2 and c 3 bands leaving only perturbative uncertainty band.
• The variation of c 4 tests only the perturbative stability. In fact, this scale is not related to TMD factorization and the problems of its implementation.
In the following sections, we give explicit expressions for a TMD factorized cross-section in path independent scenarios with and without optimal TMD definition. We also demonstrate, and it is one of the main results of the article, that the variation error-bands improve, in accordance to the general expectations discussed above.

The TMD cross-sections with evolution in the improved γ picture
In order to avoid the undesired ambiguity coming from the solution path-dependence, one can use the improved γ-picture, suggested in section 4.2. In this case the TMD crosssection looks precisely the same as in eq. (6.2), with the only difference that the TMD JHEP08(2018)003  Table 2. The adjustment of perturbative order for the cross-section in the improved γ picture. For explicitness we indicate the highest included power of a s in the expression.
evolution factor is taken as in eq. (4.11). Let us express these formulas restoring all dropped superscripts for convenience. The TMD cross-section has the form The modified rapidity anomalous dimension D f NP is perturbative at small-b and can have non-perturbative correction at large-b. In order to improve the perturbative convergence of the anomalous dimension D the resummed version can also be used, see appendix A.
The cross-section in the improved γ picture eq. (6.5), (6.6) is self-consistent in the sense that the incorporated TMD evolution is explicitly transitive, invertible and path independent. Therefore, the extractions of TMDs are simple to compare knowing the TMD functions F , the non-perturbative evolution D f NP and the scales (µ i , ζ i ) that are used for each extraction. The cancellation of the µ-dependence is achieved adjusting correctly the perturbative orders of ingredients. We stress that the typical question about which order of Γ one should use in comparison to other anomalous dimensions is absent in this scheme, due to the absence of Γ. Instead, the rapidity anomalous dimension D and γ V should be of the same order, since their finite parts jointly contribute to the integral in eq. (6.6). In the table 2 we present the consistent order composition in the improved γ scheme.
The test of the perturbative stability can be done in the same manner as usual, i.e. by rescaling the parameters µ in eq. (6.4). However now in the improved γ picture the parameter µ 0 is absent by definition. Indeed, the parameter µ 0 and its variation in eq. (6.2) parameterizes and measures the path dependence of the solution (see figure 5(left)) only. Therefore, it disappears in the path-independent solution.

The TMD cross-sections using the optimal TMD distributions
The expression for the cross-section can be simplified even more with the application of the optimal TMD definition discussed in section 5.3. In this case the TMD cross-section   Figure 9. Comparison of error bands obtained by the scale-variations for cross-sections given by (6.2) (top), (6.5) (middle), (6.12) (bottom) at NNLO. Here, the kinematics bin-integration, etc., is for the Z-boson production measure at ATLAS at 8 TeV [44]. where the evolution exponent can be given by two equivalent expressions .
where in eq. (6.8), the scale µ saddle is b-dependent, and defined by the equation The value of ζ µ (µ f , b) is defined by eq. (2.33). The optimal TMD distribution is by definition scale independent. Its matching coefficient at small-b is given by eq. (5.11) with c 1 and c 2 defined in eq. (5.15). We stress that this construction is independent of the type of TMD distribution and of the process, due to the universality of the TMD evolution. The scheme presented in eq. (6.7), (6.8) is still not very practical due to the necessity of recalculating the small-b matching coefficient with each modification of the D NP . To be more precise, using this implementation is not very costly (in the machine time) at NLO (since coefficient c 1 appears only near the δ-function) but it becomes more expensive at higher orders.
In order to have a faster implementation we suggest to exponentiate the boundary constants r i (5.21), (5.22), which is equivalent to switching from the exact special nullevolution line, to the close null-evolution line with r i = 0. In this way we obtain the distributionF f ←h (x, b) defined as Note that, at small-b, the condition r i = 0 line coincides with the special line, and at oneloop accuracy they coincide for all values of b. The change of the line is to be taken into -32 -

JHEP08(2018)003
account by an extra factor in the coefficient function eq. (5.24). This factor is universal and in the fitting expression it can be extracted from the TMD distribution and recombined with the evolution factor R. Thus the practical expression for the optimal TMD crosssection reads where C 0 = 2e −γ E and the function v is given by eq. (5.14) at r i = 0. In these expressions we recommend to use the resummed versions of ζ pert µ and D NP at small-b to improve the perturbative convergence. The corresponding expressions are derived in the appendix A. The comparison of R factors in all three versions of evolution presented here is given in figures 7. One can see that at three-loop order the difference among these functions is negligible. We emphasize that the expression in eq. (6.13) is given by a product of elementary functions, and thus, numerically much cheaper to calculate. We stress that the function in eq. (6.12), (6.13) depends only on the factorization scales (µ f , ζ f ).
One of the essential benefits of the optimal TMD definition is that it cuts out the question of the low-energy point normalization. In the suggested universal definition the low-energy normalization is defined "non-perturbatively" and uniquely by eq. (6.10). For that reason the constant µ i is absent together to the associated uncertainty factor c 3 . The part of the ambiguity related to the non-ideal perturbation theory is pumped into c 2 (since effectively, in eq. (6.8), µ f = µ i ). Therefore, the error-band for this cross-section can be obtained by the variation of c 2 and c 4 only. The same is true for the cross-section in the form eq. (6.12).
The TMD distributionF is not entirely the optimal TMD distribution. In particular, the coefficient function for small-b matching ofF is given by C pert defined in eq. (5.11) with c 1 = 0 and c 2 = γ 1 d (2,0) /Γ 0 . We note that this definition ofF coincides with the definition ofF in the "naive" ζ-prescription used in [1]. The formula (6.13) is very simple for practical implementation, since it has no integration and does require a solution of eq. (6.10), but it consists only of sums and products of elementary functions.
At the physical point (µ f , ζ f ) = (Q, Q 2 ), the expression for TMD distribution reads In this expression the coupling constant is defined at fixed hard scale Q, and it is in principle small. At very small-b (b Q −1 ) and large-b (b Q −1 ) the contribution of the logarithms can potentially appear. This behavior is unavoidable, because any resummation procedure that would move the scale inside the logarithm to a better value is equivalent to a redefinition of a point on the null-evolution line, and thus it reduces to an un-evolved expression.
At the leading order, the expression (6.14) has a very simple explicit form. Indeed All three distributions, namely, the general TMD distribution F (x, b; µ, ζ) used in eq. (6.5); the optimal TMD distribution F (x, b) used in eq. (6.7); and the universal TMD distribution defined on perturbative curveF (x, b) used in eq. (6.12) are related to each other in unique way: In these relations the convergence improves increasing the order of the perturbative series, but they are not affected by solution path-dependence effects. Therefore, given the model for D NP the comparison of TMD distribution is straightforward.
In figures 9 and 10 we compare the variation bands obtained from different versions of the cross-section at NNLO. Note, that to compare the bands we use the same model parameters for all plots, which however does not coincide with the best fit values. One can see that the size of the variation band is slightly decreased in comparison to the standard case given in eq. (6.2). This is the effect of the restoration of solution path-independence. The error bands of eq. (6.5) and eq. (6.12) do not contain the error coming from the change of the evolution path. In contrast, the error bands of eq. (6.5) and eq. (6.12) are practically the same since the only difference between these solutions is the point at which the null-evolution line is used. We appreciate that the solution in eq. (6.12) is numerically more stable, since all parameters are well inside the finite region (the numerical artifacts of error bands in the first and the second lines come from the numerical uncertainty of the extremely small a s (b −1 ) at asymptotically small b. The values of a s are taken from the MMHT package [46].) A test of the relative convergence of variation bands and central values is not so simple and will be made in future studies. The source of difficulty is the non-perturbative structure of TMD distributions, that plays an important numerical role, and thus should be fit at each pertrubative order separately.

Conclusion
The existence of a double-scale evolution of non-perturbative hadronic matrix elements poses new questions regarding an efficient implementation of these observables. In this work we have studied in detail the main consequences of double-scale evolution. We have concentrated on the evolution of TMD distributions. Nonetheless, our methods and conclusions can possibly be adapted to other non-perturbative functions/distributions. The possible areas of extension include jet-observables [18], resummation in momentum space [19], double-parton distributions [20].

JHEP08(2018)003
A consistent and efficient composition of TMD factorization formalism is fundamental for the precise extraction of non-perturbative function. It is especially important nowadays when big efforts are running to join a large amount of data coming from semi-inclusive DIS and Drell-Yan [39] or when one wants to include new LHC data, characterized by a great precision [1]. It is time to tackle the problem of the stability of the matching of the perturbative and non-perturbative parts of TMD factorization formalism. In this respect, it is essential to keep in mind the double nature of the non-perturbative structure of the TMDs. From one side we have non-perturbative corrections in the evolution factor and from the other side we want to explore the intrinsic non-perturbative content of the TMDs beyond its collinear limit. The disentanglement of these two non-perturbative effects results to be fundamental in the TMD program.
In this work we have discussed the main problem of the double-scale evolution. Namely, the absence of a unique solution within the (unavoidably) truncated perturbation theory. In this way the final implementation of the TMD evolution does depend on the particular choice of integration path in the (µ, ζ) plane. We have demonstrated and described that this problem is poorly cured by an increase of the perturbative order. In standard error estimations, this theoretical error is accounted changing the parameters µ 0 and µ i in eq. (6.4). Within such schemes, a proper definition of these parameters becomes essential for the extraction of the non-perturbative parts, whereas the theory predicts total independence on the scale fixation. The additional horrifying effect of solution ambiguity is the violation of transitivity of TMD evolution. It makes practically very difficult an accurate comparison of fits made in different schemes. The choice of a conventional set of scales which facilitates the comparison of different fits does not present much practical advantages. We propose here instead a way to bypass this problem by a forceful restoration of fundamental properties of the evolution at each order of perturbation theory. As a result, the problem of solution path-dependence is not present. Practically it results into a better control of variation error-band due to the absence of path-dependent uncertainties.
The recognition of double-scale evolution naturally proceed to the idea of ζprescription, which consists in the identification of the TMD distributions by the value of evolution potential, rather then by scales (µ, ζ). Such identification leads to many natural advantages. The main two of them is the complete elimination of double logarithms from OPE, and the disentanglement of TMD evolution from modeling of TMD distribution. This feature will result essential in the phenomenological study of the non-perturbative content of TMDs.
We denote as optimal a particular realization of ζ-prescription which is primely characterized by a unique (non-perturbative) definition. An additional benefit is the outstanding simplicity of numerical implementation of TMD factorization within the optimal definition.
The implementation of TMD factorization within ζ-prescription has only two matching scales, µ f and µ OPE . These scales have different physical meaning, and they are the only necessary scales. Notice, in fact, that in more classical approaches all other scales (such as µ i and µ 0 ) are only intermediate scales which do not depend on kinematics or hadronization and serve the purpose to smooth the transition between different regimes. The scales (µ f , ζ f ) ∼ O(Q, Q 2 ) are limited by kinematics and characterize the hard subprocess. The scale µ OPE ∼ q T appears in the re-factorization of TMDs onto collinear distributions, and characterizes the intrinsic distribution scale.
The optimal solution is implemented in the code arTeMiDe [40]. Using the arTeMiDe we have performed the test of various implementation of evolution discussed in the paper. The comparison of the approaches is given in figure 9, 10. We indeed observe all theoretically expected result, such as control of error-band in solution independent schemes and similarity of error-band in the ζ-prescription and non-ζ-prescription schemes. We admit the improved timing and numerical stability of the optimal realization of TMD factorization. Altogether it opens the road for the global fit of data well-separated in energy scales, such as Drell-Yan and SIDIS. More phenomenological studies are expected in the future.