Universality-breaking effects in $e^+e^-$ hadronic production processes

Recent BELLE measurements provide the cross section for single hadron production in $e^+e^-$ annihilations, differential in thrust and in the hadron transverse momentum with respect to the thrust axis. Universality breaking effects due to process-dependent soft factors make it very difficult to relate this cross sections to those corresponding to hadron-pair production in $e^+e^-$ annihilations, where transverse momentum dependent (TMD) factorization can be applied. The correspondence between these two cross sections is examined in the framework of the Collins-Soper-Sterman factorization, in the collinear as well as in the TMD approach. We propose a scheme that allows to relate the TMD parton densities defined in 1-hadron and in 2-hadron processes, neatly separating, within the soft and collinear parts, the non-perturbative terms from the contributions that can be calculated perturbatively. The regularization of rapidity divergences introduces cut-offs, the arbitrariness of which will be properly reabsorbed by means of a mechanism closely reminiscent of a gauge transformation. In this way, we restore the possibility to perform global phenomenological studies of TMD physics, simultaneously analyzing data belonging to different hadron classes.


Contents
1 Introduction QCD describes hadronic matter through the dynamics of its elementary consituents, quarks and gluons. However, confinement prevents the direct observation of partonic degrees of freedom, which are shaded by the hadronization mechanism.
Recently, the BELLE Collaboration at KEK has measured the e + e − → HX cross section at a c.m. energy of Q 2 ∼ 112 GeV 2 as a function of P T , the transverse momentum of the observed hadron h relative to the thrust axis [1]. The data are binned in P T and selected in thrust in such a way to ensure that T ∼ 1, which corresponds to a two jet configuration. This is one of the measurements which go closer to being a direct observation of a partonic variable, the transverse momentum of the hadron with respect to its parent fragmenting parton.
These data have indeed triggered a great interest of the high energy physics community, especially among the experts in the phenomenological study of TMD phenomena and factorization. However, there are difficulties in the analysis of these data as, due to the nature of this process, a TMD factorization as that formulated in Ref. [2] cannot be directly applied. In this case, in fact, collinear factorization would rather be the correct approach.
In this paper we will follow very closely the formulation proposed by J. Collins in Ref. [2] for e + e − → H A H B X processes and we will give a different definition of TMDs which, by extending their degree of universality, becomes suitable to be applied also to the e + e − → H X process. We will move along the lines suggested, for instance, in Ref. [3].
In this new definition, the soft factor of the process, which is responsible for potential universality breaking effects, is not absorbed in the TMD, to prevent it from influencing its genuinely universal nature. Instead, it appears explicitly in the cross section which acquires a new term, that we will call soft model, M S . After being modelled using a suitable parameterization, it can be extracted from experimental data. While the TMDs are truly universal and can be extracted from any process, M S is universal only among a restricted number of processes. In other words, M S is universal only within his hadron class. Later on in the paper we will define what we mean exactly by "class"; for the moment being we anticipate that, for instance, Drell-Yan, Semi Inclusive Deep Inelastic Scattering (SIDIS) and e + e − → H A H B X processes belong to the same hadron class, while DIS and e + e − → H X belong to a different class.
The advantage of this formulation is that a well defined expression relates TMDs extracted using different definitions. Consequently, all results obtained in past phenomenological analyses can easily be reformulated according to this new framework, with no loss of information.
We stress that the factorization procedure itself will not be altered from its original form. Rather, we introduce a new methodology to implement the phenomenological application of that same scheme, changing the focus on the fundamental ingredients of the phenomenological models.
The paper is organized as follows. In Section 1.1 we will outline the basics of TMD and collinear factorization. Section 2 will be devoted to the study of the soft factor and its factorization properties, while in Section 3 we will examine the collinear parts of hadronic processes. Here we will define the TMDs and show how a particular transformation, which we will call "rapidity dilation", allows to consider them invariant with respect to the choice of the rapidity cut-offs introduced by the regularization of the rapidity divergences. In Section 4 we will briefly outline a new way of classifying hadronic processes in terms of their "hadron class". Section 5 will be dedicated to the study of 2-hadron processes and their cross sections, respectively. Finally, in Section 6 we will apply this formalism to e + e − → H X, giving a simple example of how this scheme can be applied to a phenomenological analysis. Appendices A, B and C will be dedicated to the definition of Wilson lines, to the small b T behaviour of the soft factor and of the TMDs, and to the kinematics of a e + e − → H X process, respectively.
Throughout the paper we will adopt a pedagogical approach, as we intend to provide a review which could be useful to young beginners as well as to experienced researchers.

Collinear and TMD Factorization
Modern studies of high energy QCD processes are based on factorization, a procedure that allows to separate the cross section of a hadronic process involving a hard energy scale Q into a part which is fully computable in perturbation theory and a non-perturbative contribution, with an error suppressed by powers of m/Q, where m is a typical low energy mass scale. In general, the perturbative part is process dependent but it can be computed, at any given order, for any given process. The non-perturbative part, cannot be computed: it should rather be inferred from experimental data. However, when defined in an appropriate way, it is universal, in the sense that it can be extracted from one process and then used in any other. If factorization applies and universality is preserved, then the theory can be predictive. Nowadays, several different schemes are available to implement factorization. In the following, we will adopt the modern version of the Collins-Soper-Sterman (CSS) scheme [4,5], often referred to as CSS2, presented in Ref. [2].
When factorization applies, then the cross section of the process will appear as a convolution of contributions which can be classified in terms of the following three categories: 1. Hard part. It corresponds to the elementary subprocess and it provides the signature of the process, as it identifies the partonic scattering uniquely. It is fully computable in perturbation theory in terms of Feynman diagrams, up to the desired accuracy.
2. Collinear parts. These contributions are associated to the initial and final state hadrons of the process and contain the collinear divergences related to the massless particles emitted along the hadron direction. Each of them corresponds to a bunch of particles strongly boosted along this direction, which move almost collinearly, very fast. Due to their characteristic divergences, collinear parts cannot be fully computed in perturbation theory: their non-perturbative content has to be extracted from experimental data. Among all the particles in the collinear group, two of them deserve special attention: the reference hadron and the reference parton. If the (a) (b) Figure 1. (a): Pictorial representation of (the hadronic part of) a DIS process. The struk quark is associated to the collinear part relative to the target hadron, while the radiated gluon is the hard real emission. (b): Pictorial representation of (the hadronic part of) e + e − → H X. The quark line corresponds to the fragmenting quark associated to the collinear part representing the final hadron, while the radiated gluon is the hard real emission.
collinear group refers to the initial state of the process, the reference hadron coincides with the initial hadron and the reference parton is the parton confined inside it that is struck in the hard scattering; if the collinear group refers to the final state, the reference hadron is the detected hadron and the reference parton is the fragmenting parton, i.e. the particle that initiates the hadronization process.
3. Soft part. It embeds the contribution due to the soft gluon radiation that connects the collinear parts and that flows through the detector. It contains soft divergences and carries non-perturbative information, therefore it cannot be computed in perturbation theory. It cannot be directly extracted from data, either, as the energy of the soft radiation is so low that detectors are not sensitive to it. Since the collinear parts interact among each other only through soft gluons, their contribution can affect the cross section in a non-trivial way. Moreover, the soft part is always associated with the collinear terms and there is no way to extract them separately. This is sometimes referred to as the soft factor problem.
In several cases the contribution of the soft part is trivial. In particular, any time in addition to the collinear partons there are real emissions with hard transverse momentum, the soft factor fully factorizes but its value reduces to unity. In these cases, in fact, the soft gluons are kinematically overpowered and do not correlate the collinear parts anymore: in this way each collinear cluster of partons is totally independent from any other. Technically speaking, in such a situation the soft factor involves an integration over all the components of the total soft momentum so that the soft information is washed out in the integral. Whether there could be a hard real emission or not is determined by kinematics. Hence, it is the hard factor that discriminates among different cases. Kinematical configurations where hard real emissions are present, see for instance Fig. 1, represent instances in which collinear factorization holds. In these cases it is possible to relate each collinear part with a Parton Distribution Function (PDF) or a Fragmentation Function (FF), depending on whether the associated reference hadron is in the initial or in the final state, respectively. As an example of a collinearly factorized process, one could consider the case of an e + e − scattering where two spinless hadrons H A and H B are produced in the final state, in a configuration far from being back-to-back in

Soft Factor
In this Section we focus on those kinematical configurations in which there is no hard radiation, where TMD factorization has to be applied and the soft factor plays a non trivial role.
From the point of view of the soft gluons, each collinear group is simply a bunch of particles strongly boosted in a certain direction. The boost is so strong that the soft gluons are only sensitive to the color charge and to the direction of the collinear particles. As a consequence, the propagation of collinear particles is well approximated by a Wilson line, see Appendix A, in the direction of the corresponding collinear group, usually represented by double lines, see Eq. (2.1). In the massless limit, the versor which identifies this direction is light-like. However, a light-like Wilson line brings unregulated rapidity divergences. In order to cancel them, it is common to introduce a rapidity cut-off y i which tilts the corresponding Wilson line away from its original light-like direction. Obviously, the final result for the cross section should not depend on these rapidity cut-offs, which then have to be removed in the final stage of the computation. As explained in Ref. [2], the selfinteractions of these Wilson lines should not be included into the definition of the soft factor. If k S, T is the total transverse momentum of the real soft radiation flowing through the detector, then the soft factor of a generic process is defined as [2]: where D is the dimension of space time (D = 4 − 2 in dimensional regularization), µ is the renormalization scale and {y i } are Lorentz invariant combinations of the rapidity cut-offs (i = 1, . . . N where N is the number of collinear parts in the process). The dependence on the flavors j 1 . . . j N of the partons associated to the collinear parts is only on their Wilson line approximation, which changes according to their color representation. The label "NO S.I." reminds us not to consider the Wilson lines self energies Ref. [2]. This implies that N = 1 is excluded, since it would correspond only to a Wilson line self energylike contribution. Finally, the factor Z S is a UV-renormalization factor that cancels, order by order, the UV divergences generated when the integration region stretches outside of the soft region. The role of Z S will become clear later on, when the soft factor will be defined in the Fourier conjugate space, Eq. (2.2). It is important to stress that, with this definition, the soft factor is sensitive to the number N ≥ 2 of collinear groups, each one associated to a reference hadron h. Therefore it is not totally blind to the rest of the process, but carries some residual information about the overall process. For this reason, in what follows we will always add a label "N-h" to the soft factor S in order to take into account this dependence.
It is usually more convenient to define the soft factor in the Fourier conjugate b T space of k S, T , where the quantities involved in the cross section can be identified through an operator definition. In the following, the Fourier transformed quantities will be labeled by a tilde. In particular, the Fourier transform of the soft factor, S N-h , is a matrix in color space, given by the vacuum expectation value of a product of Wilson lines: (2. 2) The Wilson line W j i (∞, b T /2; n i ) goes from b T /2 towards infinity in the direction of n i , which is not light-like thanks to the rapidity cut-off y i , and has the color representation given by the flavor j i (quark, anti-quark, gluon).
We can obtain more information about the soft factor by studying its structure in detail. Since all the collinear information is replaced by spinless eikonal propagators, k S, T is the only vector appearing in the soft factor. Therefore, S is always rotational invariant and depends only on the modulus | k S, T | = k S, T . This reflects on the Fourier conjugate space, where the dependence on b T is only through its modulus | b T | = b T . Moreover the natural leading momentum region of S is where all the momenta are soft, with components of size λ S = λ 2 /Q, where λ << Q is a very low energy scale. When the soft factor is Fourier transformed, the total transverse soft momentum k S, T is integrated out and its dependence is replaced by b T . At fixed b T we can roughly access all momenta with k S, T ≤ 1 b T , hence this operation can be regarded as a sort of analytic continuation of the function S 2-h (k S, T ) outside of its natural momentum region, since when b T is small k S, T can be very large. This generates UV divergences which will have to be canceled order by order by the UV counterterm Z S .
The application of the factorization procedure to the soft factor itself gives us the possibility to express S N-h in terms of perturbative and non-perturbative parts (see Ref. [2]). Leading regions involve hard, collinear and soft subgraphs, as represented in Fig. 2. The hard factor is associated to the external Wilson line vertices and contains hard subgraphs with highly virtual loops. There is a collinear subgraph corresponding to each Wilson line and all of them are connected by the soft subgraph. Furthermore, if the entering transverse momentum k s is large enough, there can be more hard subgraphs C α with production of final-state jets of high transverse momentum (i.e. hard gluon emissions which cross the cut). In each hard jet there is a fully inclusive sum/integral over final states, hence the sum-over-cuts argument presented in Ref. [2] allows us to consider them as being far off-shell and part of the hard factor. In this case collinear factorization holds and the soft factor is unity. Furthermore, there is no convolution between the hard part and the collinear factors C i , since the cut eikonal propagators that exit from the hard subgraphs do not carry momentum. As a consequence, all the collinear parts are integrated 2-h soft factors and are unity as well. Therefore, the only remaining effective region is the hard factor with all the extra hard jets. It has the same structure of S N-h but now it is fully computable in perturbation theory. In particular, it is a standard result that general soft functions exponentiate and that the exponent can be computed by using web technology, see for example Ref. [6][7][8][9]. Hence at small b T the soft factor can be written as: where the sum is extended to all the (multiparton) web diagrams. In order to separate the small and large b T behavior of S N-h , we can modify its functional dependence on b T by introducing a function b T ( b T ) such that it coincides with b T at small b T , while at large b T it is no larger than a certain b max . A possible choice, according to Ref. [2,4,5], is given by: Then, by dividing and multiplying S N-h in Eq. (2.3) by its small b T behavior, we easily obtain a factorized expression which holds valid at any value of b T : is the fully non-perturbative function that models the N -h soft factor at large b T , while the whole perturbative content is gathered in the webs. In the t'Hooft limit 1 , the soft factor is strongly simplified. Regarding the perturbative part, the only surviving diagrams are planar and the exponentiation becomes trivial. Furthermore, we can also make some guess on the non-perturbative part which is, in principle, a fully arbitrary function, since there is no way to extract it independently from experiments. In this limit the non-perturbative contribution of S N-h only regards the incoherent emission of free glueballs, of every possible kind 2 . The function that models this kind of emission is a Poisson distribution, similarly to what happens for photons in QED.

2-h Soft Factor
In the 2-h class, there are two directions for the collinear parts which can be identified to the plus and the minus direction in the c.m. frame. The Wilson lines are tilted with respect to these light-like directions by introducing two rapidity cut-offs y 1 and y 2 . The original plus and minus directions are restored if the cut-offs are removed, i.e. by taking the limits y 1 → +∞ and y 2 → −∞. In total, there are four Wilson lines, two on each side of the final state cut. The only relevant case for applications involves Wilson lines that replace fermionic collinear partons. Hence, in the following we will drop the dependence on the flavors for simplicity. Furthermore, the 2-h soft factor is color singlet, proportional to the identity matrix in color space, i.e. ( S 2-h ) i j ∝ δ i j . Then, S 2-h is defined as the coefficient in front of the delta function. By using the definition in Eq. (2.2) we have: 6) where N C is the number of colors available for quarks and antiquarks (3 in QCD). The Eq. (2.6) describes a loop for the full path outlined by the Wilson lines. It starts (e.g.) from − b T /2 and goes to b T /2, passing through ∞, along the almost plus direction n 1 . Then it comes back, again passing through ∞, along the almost minus direction n 2 . Notice that the only Lorentz invariant combination for a function depending on two rapidities (e.g. y 1 and y 2 ) is their difference (e.g. y 1 − y 2 ). It is possible to write the evolution equation for S 2-h in the b T -space with respect to both rapidity cut-offs, y 1 and y 2 , using a single rapidity-independent kernel K( b T ; µ) defined as [2]: 1 NC → ∞ and αS NC is fixed. 2 In order to preserve unitarity the sum must run over all the possible final states.
It has an anomalous dimension γ K : where γ K depends on µ through the strong coupling α S and is independent of b T . Then, K can be written as: For large values of (y 1 − y 2 ), the solution to the evolution equations for S 2-h is given by: where the reference values of the RG scale and of the rapidities are chosen to be µ 0 and y 1, 0 = y 2, 0 , respectively. In the solution of the evolution equation, two functions appear: the fixed scale soft factor S 2-h (b T ; µ 0 , 0) and the soft kernel K(b T ; µ). Both of them can be separated in terms of their perturbative and non-perturbative contents by using the b prescription, similarly to what was done in Eq. (2.5): Finally, consistency between Eqs. (2.5) and (2.11) requires that: Notice that the non-perturbative function M S (b T ; µ, y 1 − y 2 ) loses its dependence on µ in the large rapidity limit, as g K does not depend on the RG scale. Since we are only interested in the asymptotic behaviour of S 2-h , we will drop the label (0) from M S (b T ) and we will refer to it as the soft model, i.e. the non-perturbative part which will have to be parametrized and treated phenomenologically, possibly taking inspiration from the properties of the soft factor in the t'Hooft limit. The two non-perturbative functions M S and g K should not contribute at small b T by definition, hence we require that g K (b T ) → 0 and M S (b T ) → 1 when b T → 0. Furthermore, since the Fourier transform of S 2-h has to be well behaved, the contribution of g K and M S should be suppressed at large b T . Notice that the factor in front of g K , being proportional to the difference of the rapidity cut-offs, is always large and negative in the large rapidity cut-off limit. In conclusion, the 2-h soft factor in b T space can be written as: This result shows that the soft factor itself can be factorized in a purely perturbative part, process dependent but calculable within pQCD, and a part which is genuinely non perturbative and, inevitably, will have to be committed to a phenomenological model, in this case embedded in the functions M S (b T ) and g K (b T ).
Although the definition of Eq. (2.6) implies that S 2-h = 1 at b T = 0, a direct fixed order perturbative computation of K does not reproduce the correct behavior in this region. In this regard, since the soft factor is unity at b T = 0, then K goes to zero at small b T , but an explicit calculation gives instead a larger and larger value as b T decreases, forcing S 2-h to vanish in b T = 0. This kind of problems arise because the integrated soft factor can be defined through perturbative QCD only as a bare quantity. A solution can be found by applying some regularization procedure, for instance one can modify the b prescription of Eq. (2.4) allowing for the introduction of a new parameter b MIN = 0 that provides a minimum value for b T (see Appendix B).

Collinear Parts and TMDs
Let's now consider a generic collinear part. If kinematics forbid hard real emissions, the information about the transverse momentum k T of the reference parton survives. All the collinear particles are boosted very strongly in the collinear group direction, that we can identify with the plus direction without loss of universality. To them, everything outside of the collinear group is moving very fast in the opposite direction, so fast that the only surviving information is the color charge and the direction. In other words, as seen from the collinear factor, the rest of the process is well approximated by a light-like Wilson line flowing in the direction opposite to that of the collinear group.
Assuming for simplicity that the reference parton is a quark, if k T is the total transverse momentum of the collinear group, then the collinear factor (along the plus direction) is defined as in Ref. [2]: final state, where the color average Tr C /N C is due to the fact that collinear factors are color singlets and, analogously to the 2-h soft factor, they are defined as the coefficient in front of the delta in color space. The variable ξ is the light-cone fraction of the momentum k of the reference parton of flavor j with respect to the momentum P of the reference hadron H, µ is the renormalization scale at which C is evaluated and y P is the (very large) rapidity of the reference hadron. The definition of ξ in the initial and final state is given by: To the Wilson line, instead, we can associate a very large and negative rapidity. Similarly to the definition in Eq. (2.1), Z C is the UV-counterterm of C, while the label "NO S.I." reminds us not to consider the Wilson lines self interactions. It is important to stress that the collinear factor C is totally blind to the rest of the process, it only depends on its intrinsic variables. As for the soft factor, the operator definition is simpler in the Fourier conjugate space. Here, we have: where x = (0, x − , b T ) and w − is the light-like minus direction of the Wilson line. The presence of a light-like Wilson line allows for particles with a low, or even a very large negative rapidity, to be considered as part of the collinear group. This contradiction reflects in the computation by inducing the presence of unregulated rapidity divergences. This problem can be solved by subtracting out these unphysical contributions from the collinear factor by using the subtraction method described in Ref. [2]. Since all the non truly collinear contributions are due to the overlapping with the soft region, they can be rearranged in one global term which turns out to be a 2-hadron soft factor. Hence we can use the definitions given in Eqs. (2.1) and (2.2) to subtract them out and obtain: , where y 1 is the rapidity cut-off carried by S 2-h that, similarly to the case of the soft factor, should be removed in the final result for the cross section. After subtraction, the particles in C can only have a rapidity y such that y 1 < y < y P ∼ +∞. Hence if y 1 is chosen to be sufficiently large, only strongly boosted particles in the plus direction contribute to C, according to the naive physical intuition. The subtracted collinear factor has its own UVcounterterm Z sub C , for this reason in the previous definition the quantities inside the limit are bare, in the sense that they have to be considered without their UV-renormalization factors. Since the unsubtracted collinear part is defined with renormalized quark fields, see Eq. (3.3), then if Z sub C is the ratio of the renormalized collinear part to the unrenormalized collinear part, we have to multiply explicitly by the wave-function renormalization factor of the quark field, Z 2 .
Having given the general definition of C, TMDs can be obtained straightforwardly. In fact, as C is an operator acting onto the space of Dirac spinors, it belongs to the Clifford algebra built from the Dirac matrices {γ µ }. Therefore, we simply expand C on the basis of this algebra. Neglecting all the dependences on partonic and hadronic variables, we have: (3.5) Then, the TMDs are related to the coefficients S, V, . . . , T µ ν of the Clifford Algebra expansion and the definition in Eq. (3.4) naturally extends to TMDs. Such coefficients can be further expanded in terms of all the Lorentz tensors contributing to the leading twist approximation (see e.g. Ref. [10]). This allows to isolate all the dependence on the vector part of b T in the coefficients of such expansion, leaving a set of scalar functions depending only on the modulus b T . These scalar functions are the TMDs. For example, the coefficient of γ + defines the unpolarized TMDs and the Sivers function:  Formally, if C is a generic TMD function referring to a collinear factor in the plus direction, then its definition equipped with subtractions is inherithed directly from Eq. (3.4) and it is given by: where Γ is the proper Dirac matrix combination to extract the desidered TMD and Z TMD is its own UV counterterm. The label "leading twist coeff." means that the TMDs are obtained, after the projection onto the Clifford Algebra, as the coefficients of the expansion at leading twist. The operator definition of TMD as given in Eq. (3.7), which follows directly from the TMD factorization prescription, will be referred to as the factorization definition. Notice that within this definition, the TMD is a purely collinear object, as all soft sub-divergences have been subtracted out.

Evolution Equations for TMDs
In the factorization definition 3 of TMDs, Eq. (3.7), a 2-h soft factor appears as a consequence of the subtraction mechanism. Therefore, we can use the results of Section 2.1 to write the evolution equation (Collins-Soper evolution) for C with respect to the rapidity cut-off y 1 . On the other hand, the evolution with respect to the scale µ (i.e. the Renormalization Group evolution) is ruled by the anomalous dimension γ C . The equations are given by: which, for later convenience, have been re-written in terms of a new variable, ζ, defined as follows: ζ = (M x) 2 e 2(y P −y 1 ) initial state hadron; where M is the mass of the reference hadron, while x and z are the light-cone fractions of the momentum of the reference parton with respect to the hadron. Thanks to the definitions in Eq. (3.2), in both initial and final states we can write ζ = Q 2 e −2y 1 . The information about the direction of the reference hadron is encoded inside the ζ variable.
In the previous definition, this direction is identified with the plus direction. In addition to the previous evolution equations, we also have the RG evolution of K, Eq. (2.9), and the CS evolution of γ C , given by: (3.11) which gives: With the help of Eqs. (2.9),(3.11) and (3.12), we can rewrite the solution to Eqs. (3.8) and (3.9) as [2]: where the standard choices for the reference values of the scales are 4 : (3.14) (3. 16) In the solution of the evolution equation the b T prescription, Eq. (2.4), has been used in order to separate the perturbative from the non-perturbative content, in complete analogy to what was done for the soft factor in Section 2.1. In particular, in Eq. (3.13), the nonperturbative behavior of the TMD is described by two functions. The first is g K , the same function that appears in Eq. (2.16) in the asymptotic behavior of S 2-h . The second is the TMD model function (M C ) j, H (ξ, b T ), that embeds the genuine non-perturbative behavior of the TMD: it depends on the flavor of the reference parton and on the reference hadron associated to the collinear part. By definition, the model should not influence the TMD at small b T . Furthermore, since the Fourier transform of the TMD has to be well behaved, the model should be sufficiently suppressed at large b T 5 . These properties restrict the behaviour of the non-perturbative function M C at small and large b T as follows The factorization procedure can be applied either to the full collinear factor or to the TMDs themselves, in order to study their behavior at small b T , outside of their natural collinear momentum region. This is given by a convolution of a finite (calculable in perturbative QCD) hard coefficient C with the TMD integrated over k T . The proof can be found in Chapter 13 of Ref. [2]. Hence, TMDs at small b T can be written as Operator Product Expansions (OPE): where C k j are the Wilson Coefficients of the OPE, which are matrices in the flavor space. A sum over k is implicit. In the second line of Eq. (3.18) we distinguish the Wilson Coefficients of the initial state from those corresponding to the final state according to the position of their upper and lower flavor indices. The convolution ⊗ of two generic functions f and g is defined as where we recall that the Wilson Coefficients of the final state have a normalization factor ρ 2−2 when the convolution is made explicit, see Ref. [11]. The integrated TMDs are indicated by lowercase letters. In the following, c k, H will be a generic integrated TMD, while f will label integrated TMD PDFs and d will refer to integrated TMD FFs. Thanks to the OPE, the solution of the evolution equations, Eq. (3.13), can be rewritten as The definition of integrated TMDs coincides with the Fourier transformed TMDs in b T = 0. Perturbative QCD fails to give the right result in b T = 0 because of the new UV divergences introduced by the integral over the whole range of k T . In fact, as we explain in Appendix B.2, C goes to zero as b T → 0 (see Eq. (B.12)) and the usual collinear PDFs and FFs are not recovered. This problem is completely analogous to that encountered in Section 2.1 and it can be solved in a similar way, by defining a regularization procedure for the definition of the integrated TMDs (see Appendix B).

Rapidity dilations
The TMDs defined in Eqs. (3.7) and (3.13) are not invariant for different choices of the rapidity cut-off y 1 . The transformation rule for a shift in the rapidity cut-off follows from the Collins-Soper evolution equation, Eq. (3.8). Let's suppose, for instance, that y 1 , Eq. (3.2), is shifted to y 1 = y 1 − δθ, where δθ is an infinitesimal real number. Then (neglecting the dependence on all variables except ζ): This result can easily be resummed to give the transformation law valid for any shift θ: as one can check by shifting the rapidity cut-off directly in the solution of the evolution equations, Eq. (3.13). Therefore, the full effect of this transformation is a dilation factor which depends on the soft kernel K(b T , µ) and the shift parameter θ.
TMDs are not invariant under a shift in the rapidity cut-off alone, however we can make them invariant under a transformation that transforms the model M C at the same time, compensating for the dilation factor exp 1 2 θ K in Eq. (3.22). Notice that the model M C is a fully non-perturbative object which cannot be computed and will have to be modelled. It is not a physical observable like the full TMD. Furthermore, M C carries only truly collinear information, since the non-perturbative soft part, g K , has already been separated out in Eq. (3.13). The only requirement on the transformed model will be that it should preserve its proper b T behaviour. In particular, the following transformation rules, when applied simultaneously, make the TMDs invariant under the choice of rapidity cut-off: where the dependence of the model on the collinear momentum fraction ξ has not been shown explicitly. Due to the dilation factor in front of the model, we will refer to the previous transformation D θ as a rapidity dilation (RD), that makes TMDs invariant with respect to the choice of the rapidity cut-off: The transformed model, Eq. (3.24), acquires the same properties of Eq. (3.17). In fact, since K goes to zero at small b T , then the dilation factor is 1 for b T ∼ 0. The properties of g K and M C ensure that also the behavior at large b T is correct. Furthermore, the transformed model acquires a dependence on µ that is expressed by the following evolution equation: This is necessary in order to keep the anomalous dimension γ C invariant under rapidity dilation. In fact, from Equations (3.9), (3.12) and (3.25): As a consequence, also the UV counterterms of the TMDs are RD-invariant. When a TMD appears in a cross section, its non-perturbative content, i.e. the last line of Eq. (3.20), has to be extracted from experimental data and in principle the result will depend on the choice of the rapidity cut-off. This will not modify the full TMD, thanks to its invariance under rapidity dilations. If C NP denotes the full non-perturbative content of the TMD, then its transformation rule under rapidity dilation is given by: where the second step is given by Eqs. (2.13), (3.23) and (3.24). In this case, the dilation factor is fully computable in perturbative QCD. Notice that the function g K is not affected by the rapidity dilation, as it should. In fact g K is also involved in the definition of the soft factor S 2-h (Eq. (2.16)), which must not depend on the extraction of the TMD. Rapidity dilation make TMDs invariant under the choice of the rapidity cut-off y 1 , which nevertheless has to be considered an arbitrary and large parameter. This is in fact one of the necessary hypothesis at the basis of any factorization formula: all the particles described by a collinear part (and ultimately by a TMD) must have a large and positive rapidity, according to the reference direction of the collinear group. Hence, a rapidity dilation performed with a very large and positive θ would contradict the initial hypothesis on the validity of factorization itself. The correct way to interpret this symmetry is to apply it only after the factorization formula has been derived, in the limit of y 1 → ∞. Roughly speaking, the model associated with a certain choice describes how collinear particles with rapidity in the range y 1 < y < y P (where y P is the large and positive rapidity of the reference hadron) behave in the non-perturbative regime. Then, rapidity dilations simply balance the perturbative and non-perturbative information encoded in the TMDs according to the choice of rapidity cut-off, in order to keep the whole physical observable invariant.
Notice that rapidity dilations define a group under the multiplication laws: Rapidity dilations are closely reminiscent of a 1-parameter gauge transformations for the "fields" y 1 and M C , that make the TMD invariant. Here the TMD plays the role of the "Lagrangian". In this sense, rapidity dilations might be considered a symmetry for the TMDs.
There is an interesting analogy between the action of rapidity dilations on the rapidity cut-off, y 1 , and the action of the Renormalization Group (RG) on the energy scale µ. Here, an arbitrary µ allows to regularize the UV divergences, but it introduces some arbitrariness in the theory, as µ can be set to any value. Some quantities are independent of the choice of this scale, like cross sections, where the RG-variation of the fields is exactly compensated by the RG-variation of the external on-shell particles (LSZ mechanism). In this case, this is enough to save the predictive power of the theory. Similarly, rapidity dilations (RD) allow to control the arbitrariness in the choice of the rapidity cut-off, y 1 , that regularizes the rapidity divergences. In particular, TMDs defined along the plus direction are RDinvariant, as the transformation of the model M C , Eq. (3.24), exactly compensates for the rapidity shift, Eq. (3.23). However, there are quantities that are not RD-invariant. As we will show in Sections 3.2.1, the TMD defined along the minus direction C − or the 2-h soft factor S 2-h are examples of such quantities. On the other hand, combinations as C + C − S 2-h are RD-invariant, because the dilation factor in C − exactly compensates the variation in S 2-h . Physical observables like cross sections have to be RD-invariant in order to preserve the predictive power of the theory. For instance, the cross section of e + e − → H X is expressed in terms of one single TMD (see Section 6), while the cross section of e + e − → H A H B X involves the combination C + C − S 2-h (see Section 5.1). Therefore, their predictive power is saved by RD-invariance.
Due to rapidity dilations, the choice of the model depends on the choice of the rapidity cut-off. Therefore, in general two independent extractions of TMDs, that use different values of ζ, will feature different models. However, rapidity dilations allow to relate these independent extractions of TMDs. The main difficulty here is that theory is devised in the b T -space, while measurements are performed in the transverse momentum space.
As a practical example, let's suppose we want to compare the TMD extractions of two independent research Groups, A and B. They will provide two TMD functions in transverse momentum space C (A) and C (B) . Since they obtained their result fitting the same experimental data, the two functions have to be compatible within the overlapping of the respective uncertainty bands, built by considering all source of errors (collinear PDFs/FFs uncertainties, experimental errors, fitting uncertainties, etc ...). However a meaningful comparison can only be made for small values of transverse momentum, because TMDs turn non-physical at large k T (see B.2). Both results can be written as the Fourier transform of their b T counterparts. Schematically: where only the dependence on the rapidity cut-off and on the model are shown explicitly and C P denotes the perturbative content of the TMD. In principle, also the choice of g K may be different for the two extractions. However, Group A and B have to agree also in the estimate of the S 2-h , and this gives further constraints. Hence, even if g (A) K and g (B) K have different functional forms, they should share more or less the same shape. For simplicity, in the following we will set g (A) K . The two rapidity cut-off are different, but a certain real number θ must exists such that ζ = ζe 2θ . Hence, Group A can perform a rapidity dilation in order to comply with the choice of Group B. By using Eq. (3.28), they can write:  .30) and (3.31) have to be compatible at small k T , but there is no constraint for larger values. Furthermore, we known that the perturbative content is constant for b T larger than a certain b SAT ≥ b MAX . At the same time, the non-perturbative content should be of order 1 at small/moderate b T , i.e. up to b SAT , in order to not interfer with the perturbative information. Therefore, we can roughly split the Fourier transform in two parts as: The first part is integrated only up to the saturation value b SAT . It is clearly dominated by perturbative information, since C NP is not drastically different from 1 in that range, for any choice of ζ and M . As a consequence, this part is almost the same for both Eq. (3.32) and (3.31). On the other hand, the second part is simply proportional to the Fourier Transform of the non perturbative content of the TMD. Different choices of ζ and M can give integrands of the same order up to b SAT but they can differ on how rapidly they go to zero as b T goes to infinity: at large b T they could be very small and at the same time differ for many orders of magnitude. This difference is not evident at small values of k T , since the area under the curve in b T space, after the saturation value, can be neglected in any case. However, the differences may be consistent at large k T . This is not a problem, since TMDs lose their physical meaning in this region. Hence, Group A can compare its result with that of Group B by Fourier transforming its non-perturbative part after it has been rapidity-dilated. Then, the following relation should hold within uncertainties: If the previous equation is intended as an equivalence relation between models, then the model M (ξ, b T ) can be considered unique, modulo this equivalence relation.

Rapidity dilation and z-axis reflection
The behaviour under z-axis reflection, which simply exchanges the plus and minus directions, is particularly important for widely studied processes, like SIDIS, Drell-Yan and e + e − → H A H B X, where two TMDs associated to opposite directions are multiplied together. If R z is the Lorentz transformation that reverses the z-axis, then the rapidity of the reference hadron swaps its sign under the action of R z . On the other hand, the rapidity cut-off is not the rapidity of any real particle. It is just an ad hoc number and hence it is trivially invariant under the action of R z . However, the particles belonging to the collinear group associated to the TMD in the minus direction should have a very large negative rapidity according to the limit y 1 → +∞. Therefore, a proper rapidity cut-off would be y 2 = −y 1 , as if y 1 had changed its sign. Summarizing: As a consequence, the variable ζ for a TMD in the minus direction is obtained by simply replacing ζ + ∝ exp (y P − y 1 ) with ζ − ∝ exp (y 2 − y P ) and the full TMD transforms as: where only the dependence on the rapidity cut-off has been made explicit. There is a non trivial interplay between z-axis reflection and rapidity dilations, since the two transformations do not commute. In fact, if the rapidity cut-off y 1 of C + is shifted, then the rapidity cut-off y 2 of C − is shifted as well, but with the sign reversed. This can easily be seen by a direct computation, with the help of Eqs. (3.23) and (3.35): (3.37) Therefore, according to Eq. (3.24), the model of C − transforms as: However, in the z-reversed TMD, C − , the rapidity cut-off appears with the opposite sign with respect to C + . Hence, there is no more compensation between the rapidity shift and the transformed model, and C − is not invariant under rapidity dilations. This can be summarized by saying that the two transformations do not commute: Finally, lets consider the behavior of the (asymptotic) 2-h soft factor, defined in Eq. (2.16), under rapidity dilations. Since the soft model is not affected by the transformation and y 1 → y 1 − θ, while y 2 → y 2 + θ, it transforms as:

Universality and Process Classification
Process-independent quantities play the most important role in factorized cross sections. They are universal, which means that once they have been estimated they can be used in any cross section, regardless of the specific process. This is particularly useful for those quantities that carry non-perturbative information. Since they cannot be computed analytically, they have to be extracted from experimental data. However, if they are universal, any process that allows for their presence in the cross section can be exploited, and we can prefer those with a richer amount of data. A lack of universality would undermine the predictive power of QCD itself. In fact, if the non-perturbative quantities had to be extracted again for each individual process, the phenomenological analysis of a hadronic cross sections would be reduced to a mere fit of experimental data.
In general, a factorized cross section is a convolution of three different objects: the hard part, the collinear factors and the soft factor (see Section 1).
The hard part is completely process-dependent. However, it can be computed in perturbative QCD and its lack of universality does not affect the predictive power of the theory.
Collinear parts and the TMDs, as defined in Section 3 by the factorization definition, Eq. (3.7), depend only on their internal variables and hence are completely blind to the kinematics of the process. Therefore, they can be really considered universal quantities.
On the other hand, the soft factor, defined in Section 2, is not completely processindependent. In fact, it depends on the number N of the collinear factors involved in the factorized cross section, each related to its reference parton with flavor j and to its reference hadron H. Therefore, they are not insensitive to the kinematics of the process in which they appear, because they depend both on the number of the Wilson lines replacing the collinear parts and also on their color representation, which is fixed by the flavor j and differs from quark and gluons. However, at fixed N and for reference partons of the same kind, soft factors are actually the same object, modulo crossing symmetry. As an example, Drell-Yan scattering with two quark-initiated collinear factors in the initial state, e + e − → H A H B X, with two quark-initiated collinear factors in the final state, and also SIDIS, with one quark-initiated collinear factor in the initial and one in the final state, share the same soft factor S 2-h modulo the crossing symmetry that relates the three processes.  Notice that in this case there are only two collinear factors and charge conservation allows only two quarks as reference partons.
Since processes with a different number N of collinear factors have a different soft factor in their factorized cross section, it is possible to classify them according to this number. This coincides with the number of reference hadrons participating to the hadronic process. The classes derived with this criterion will be called hadron classes. Formally, a process belongs to the N-h class if it globally involves N collinear parts, which can appear in the initial and/or in the final state, in all possible combinations and for all the allowed kind of reference partons. Therefore, S N-h can be considered universal only within the N -h class, modulo crossing symmetry and the possible color representations of its Wilson lines. This is a weaker kind of universality, that holds only for a limited number of processes. For instance, processes involving one collinear group belong to the 1-hadron class: Deep Inelastic Scattering (DIS), corresponding to one reference hadron in the initial state; e + e − → H X correspoding to one reference hadron in the final state. Processes involving two collinear groups belong to the 2-hadron class: here we have Drell-Yan like scattering, e + e − → H A H B X, and SIDIS.
The classification above has nothing to do with the nature of the factorization adopted (collinear or TMD): it depends only on the specific kinematics of the processes, that can be different case by case. However, it is possible to identify common properties within each hadron-class that allows to determine, a priori, which factorization scheme should be used. Consider for example the 1-hadron class case. In both DIS and e + e − → H X there is at least one hard real emission, since there is always a fermion leg crossing the final state cut (see Fig. 3). The collinear factor associated to the real emission is totally crossed by the final state cut and hence it does not have any reference hadron. Therefore, it can be considered far off-shell and part of the hard factor. All the information about soft transverse momentum is washed away and the collinear factorization scheme has to be applied. The factorized cross section for 1-hadron class processes is then written as a convolution of the collinear part associated to the reference hadron with an hard factor that, once considered together with the hard real emissions, can be interpreted as a partonic cross section, i.e. the partonic counterpart of the process.
In the 2-hadron class, instead, the choice of factorization scheme is non-trivial and depends on the specific kinematics of the process. It is dictated by the size of one parameter, namely the ratio between the modulus of the weak boson transverse momentum q T and the typical energy scale of the process Q (see Ref. [2]). When q T /Q 1, TMD factorization has to be applied, while if q T /Q 1, collinear factorization will be appropriate 6 . The cross sections predicted in these two kinematical ranges, computed within two different approximations, do not automatically match; in fact, the intermediate region, where q T ∼ Q, is usually called "matching region". Several studies have been devoted to the implementation of different algorithms to map these kinematics regions and to match the collinear cross sections to the TMD cross section (a problem known as "matching"), see for example Refs. [12][13][14][15][16].
According to the previous considerations, one can build a hierarchy based on universality. The lowest level is occupied by quantities, like the hard part, that are completely process dependent but usually fully computable in perturbation theory. At the top of the hierarchy we find quantities, like the collinear factors, that are absolutely process independent: they carry non-perturbative information but their universality properties guarantee that they can be extracted from one particular process and then used in any other. In the middle there are quantities which are only universal within their own N -hadron class, like the soft factors. As they carry non-perturbative information, they cannot be computed perturbatively. They too have to be extracted from experimental data, but they can only be used, class by class, for the processes involving the same number of collinear groups and for the same kind of reference partons.
In this sense, it is very important to provide a working scheme where objects with different degrees of universality are neatly separated, in such a way to maximize their perturbative content and their universal parts, while reducing the class-dependent factors to the minimum. For the latter, special experimental efforts will be required in order to gather a large number of high quality data corresponding to several different processes, which will then be analyzed simultaneously in a completely consistent framework. The latest analyses of the BELLE Collaboration and the current plans towards the realization of a new Electron Ion Collider (EIC) are indeed moving towards this direction [1,[17][18][19].
The classification introduced above has to be intended as a criterion to classify processes on the basis of their factorized cross section properties, and of their corresponding soft factor. Therefore, the number N that labels the classes is not the number of all hadrons involved in the process, in general much greater than the number of collinear factors. The difference is more evident when we consider the final state of a scattering process. In general, experimentalists detect a huge number of hadrons, grouped in jets. The number of jets does not correspond to the number of collinear factors, which instead is the number of reference hadrons, i.e. the number of jets in which the hadron is detected in order to study the jet's fragmentation properties. The actual topology of the event (e.g. the number of jets) is described by event-shape variables, like thrust. An example will be presented in Section 6.

The 2-hadron class
We will now focus on the 2-hadron class of processes. As mentioned above, in this class the choice of factorization scheme depends on a single parameter, the ratio q T /Q (see Section 1.1). The 2-hadron class plays a crucial role, as its soft factor S 2-h is exactly the same object that appears at denominator in the subtracted collinear factor C, Eq. (3.4) and, consequently, in the general definition of the TMD, Eqs. (3.5) and (3.6).

2-h Class Cross Section
In Section 2 we have provided a useful formalism to decompose the 2-h class soft factor and the collinear part C in a fully perturbative (computable) part and a strictly non-perturbative term, which can be modeled through the functions g K (b T ), M S (b T ) and M C (b T ), as shown in eqs. (2.16) and (3.13). At this stage we have achieved all the necessary tools to be able to write an explicit expression for the 2-hadron class cross section. Its generic structure is analogous to that given in Eq. (1.2) for e + e − → H A H B X, with H A and H B in an almost back-to-back configuration: where C + , C − refers to TMDs defined along the plus and the minus direction, respectively. The soft factor S 2-h appearing in Eq. (5.1) is the same object that appears as subtraction factor in the factorization definition of the TMDs. Reorganizing the three S 2-h factors and reabsorbing them in the TMD, leads to a different definition of TMDs (see e.g. Ref. [2], [20]): This definition of TMDs is often referred to as the square root definition.
There are many advantages to it. First of all, a single rapidity cut-off y n is sufficient to regularize all rapidity divergences, the perturbative computations are much easier and the evolution equation are unified and symmetrized, see Ref. [2]. Moreover, as mentioned above, the square root definition allows to solve the soft factor problem in the 2-hadron class. In fact, according to this definition, the cross section assumes a "Parton-Model"-like structure, where all soft gluons are reabsorbed in the TMD definition, very convenient for phenomenological applications: As an example, the unpolarized cross section for e + e − → H A H B X for almost back-to-back spinless hadrons, Eq. (1.2), becomes: Despite its numerous advantages, the square root definition lowers the degree of universality of the TMD, as it relates it to the 2-h soft factor which, by definition, is only universal within its corresponding 2-h class. In other words, the square root definition is optimal for the 2-hadron class, as it beautifully simplifies the 2-h cross section making it suitable for phenomenological applications; its drawback, however, is that it ceases to be valid outside the 2-hadron class. On the other hand, abandoning the square root definition of the TMDs in favor of the factorization definition, Eq. (3.5), will force us to face the soft factor problem and take a new (and potentially very hard) challenge: reformulating the way we do phenomenology, in terms of newly defined fundamental objects, where the soft factors are modeled explicitly rather than absorbed in the definition of the TMDs.
We will attempt such a strategy, adopting the factorization definition of the TMD, Eq. (3.7), and relying on the results of Sections 2 and 3 for the decomposition of the collinear and soft factors in terms of their perturbative and non-perturbative parts.
Using the solution of the evolution equations for the TMDs, Eq. (3.20), and the soft factor, Eq. (2.16), it is possible to write the 2-hadron class cross section in terms of perturbative and non-perturbative functions. Apart from the hard factor and a Fourier transform, the relevant structure is given by: where the reference values of the scales can be set to standard choices, Eqs. (3.14), (3.15), (3.16) and the errors due to evolution equations are neglected, since they are suppressed by O e −(y 1 −y 2 ) . From Section 3.2.1, the product of the two rapidity cut-off gives ζ 1 ζ 2 ∼ Q 4 e −2(y 1 −y 2 ) , hence the second and the third lines in Eq. (5.6) generate contributions that exactly cancel the fourth line and the exponential of the fifth line, respectively. Therefore, we simply have: As expected, in the previous equation there is no residual dependence on the rapidity cutoffs y 1 and y 2 , hence we can simply set ζ 1, 2 = Q 2 . Needless to say, this result is compatible with rapidity dilations since, as shown in Section 3.2.1, the combination C + C − S 2-h is invariant on the choice of the rapidity cut-off.

Factorization Definition vs. Square Root Definition
We can now compare the factorization definition with the square root definition of the TMDs. Ref. [2] shows that the unsubtracted TMDs C unsub i (i = 1, 2), are the same in the two definitions. Hence we can compute their ratio (here we pick the plus direction): where in the second line we used the asymptotic part of the solution to the evolution equations for the 2-h soft factor, Eq. (2.11), which is the only part that survives in the large rapidity cut-off limits, while in the last step we used Eq. (2.16) in order to separate the perturbative from the non-perturbative content. Obviously, a perfectly analogous result holds for the TMD relative to the opposite direction, C − . As we are interested in the comparison of the two TMD definitions at the same value of the rapidity cut-off, we can take y 1 = y n so that in Eq. (5.8) the dependence on the soft kernel K disappears, leaving only a square root of the soft model M S (b T ). Therefore we have: which clearly holds for both C + and C − . This is a very important result, as it shows that the choice of TMD definition (square root or factorization definition) only affects the non-perturbative content of the TMDs, while having no impact on the perturbative part. Consequently, C sqrt will differ from C mainly in the small k T region. According to Eq. (5.9), the square root definition is obtained from Eq. (3.13) by multiplying the TMD defined through the factorization definition by a square root of the soft model. In other words, the contribution of the soft physics just acts on M C (ξ, b T ): (5.10) To conclude, we can compare the effect of using either one of two different TMD definitions in the cross section. Had we used the square root definition, its net effect in Eq. (6.27) would have been the replacement: Clearly the square root definition offers an ideal framework to perform the phenomenological study of the 2-h class of processes: it solves the soft problem by reabsorbing the soft factor in the TMD definition and allows to extract the model functions M sqrt C 1, 2 from experimental data. However, this operation makes it impossible to disentangle the nonperturbative soft effects due to M S which, instead, remains explicit when using the factorization definition for the TMD.
Eq. (5.11) is particularly important from the phenomenological point of view, as it relates the TMDs obtained from data analyses based on the square root definition (which has been very widely used in the last ten years) to the TMDs extracted using the factorization definition. In this regard, the methodology proposed in this paper allows to profit of the past experience and to benefit of all the results obtained in previous analyses, while extending the scheme to all those processes which could not be considered before, because they belong to a different hadron class. A rather straightforward example of this strategy will be the combined analysis of the BELLE measurements of the polarization of Λ hyperons [17] in e + e − → Λπ(K)X processes (2-h class), already studied in Ref. [21] within a generalized-parton model approach, and in e + e − → ΛX, i.e. in a 1h-class process. This will be presented in a forthcoming paper.
6 Factorization of e + e − → H X In this section we will focus on the e + e − → H X process, which belongs to the 1-hadron class according to the classification of the Section 4. It has only one true collinear part, associated to the reference hadron H. It can be quark-or gluon-initiated; in any case there will always be at least one hard real emission that gives a collinear contribution crossing the final state cut and hence actually included into the hard factor, which can then be interpreted as a partonic cross section. The soft factor of the process is unity according to the collinear factorization scheme, see Ref. [2].
The thrust T will be included in the derivation of the final result. It is an event-shape variable that describes the topology of the final state, i.e. the number of observed jets. It can take values from 0.5 to 1.0, where the lower limit corresponds to a spherical distribution of particles in the final state, while the upper limit indicates an exact two-jet configuration (pencil-like events). Among all jets, only one is related to the collinear part, while the others have to be included into the partonic cross section. Therefore, the value of thrust will determine which Feynman graphs have to be considered in the calculation of the hard part, that will acquire a non-trivial dependence on T .
In the (thrust dependent) cross section of e + e − → H X, the leptonic tensor L µ ν , corresponding to the initial state contribution, is Lorentz contracted with the hadronic tensor W µ ν H , associated to the final state (see for example Ref. [2]). The cross section is then written as: dσ Since the coupling of QED is much smaller than α S , the leptonic tensor can be well approximated by its lowest order: where l 1 and l 2 are the momenta of the incoming electron and positron, and the electron mass is neglected.
The hadronic tensor W µ ν H depends on the momentum P of the outgoing hadron and on the momentum q of the boson connecting the initial with the final state. Furthermore, it depends on thrust, T . It can be decomposed in terms of structure functions: Thanks to this decomposition and by using the definition of the fractional energy z = 2 P · q/Q 2 , see Eq. (C.7), we can easily compute the projections:

Factorization of the hadronic tensor
The factorization procedure allows to factorize the hadronic tensor W µ ν H into hard, collinear and soft parts, as shown in Fig. 4 (a). According to Ref. [2] and by using dimensional regularization, we have: In Eq. (6.6) the collinear bubbles are represented by C j : they depend only on the total momentum entering k j and on the flavor j of the corresponding parton, and they are averaged over the color of the initiating parton. Among them, C 1 and C 2 are associated to the fermionic legs of the quark and the antiquark, hence they appear associated to the fermionic projectors, P j and P j , which connect them to the hard parts and make the jet partons on-shell. Since the hard part and the collinear parts are computed in the same  frame (the h-frame, as defined in Appendix C) the expressions for these projectors are simply Furthermore, by charge conjugation, j 2 = j 1 . The projectors defined above will be fundamental in extracting the leading twist FFs of the quark and the anti-quark in the cross section. All the other collinear bubbles, C α , are generated by gluons. In this case, the role of the fermionic projectors of Eq. (6.7) is played by a gluon density matrix ρ j j that encodes the information about the gluon polarization. In the following, we will consider the case of a fragmenting quark, corresponding to the collinear bubble C 1 as depicted in Fig. 4. In Eq. (6.6) the hard parts are represented by H and its hermitian conjugate, H † : they encode the kinematics of the process. Momentum conservation is ensured by the appropriate delta function. However, in the hard contributions, the parton momenta are approximated, in that only their leading components are considered, as stressed by the " ∧ " hats on the k momenta. In practice, the momentum k α is k α projected onto the direction of its corresponding collinear part: where w α and w α are the light-like vectors corresponding to the plus and the minus directions, respectively, in the α-bubble frame. In other words, in this frame the approximated partons move very fast along the plus directions. The approximated momentum of the fragmenting quark is simply: where k + 1, h = k + 1, h , since the bubble-frame of the fragmenting parton corresponds (by definition) with the hadron frame. Furthermore, kinematics impose constraints on the possible values that k + 1, h can assume, since P + h < k + 1, h < P + h /z (see Eqs. (C.3), (C.5) and (C.6)).
Finally, the soft contribution is a N -h soft factor, where N is the total number of partons exiting the hard scattering. It depends on the flavor of the collinear partons only through their color representation. Notice that the total soft momentum k S cannot be involved in the kinematics of the process, since it is washed out by the real hard emission (at least one, C 2 ). In fact, none of the k S components appear in the conservation delta. As a consequence, S N-h is integrated over all the components of k S and its contribution becomes trivial. As expected for a process belonging to the 1-hadron class, the soft factor is unity and can be omitted in the leading region representation [2]. All the collinear parts except C 1 , whose reference hadron is the detected hadron H, are actually hard contributions. Therefore, they can be included in one single (larger) hard bubble that will involve not only H and H † , but also all the C α . The final result, depicted in Fig. 4 (b), is given by: (6.10) In the above equation, all the hard contributions have been collected in the hard coefficient H µ ν . Notice that, while the collinear part C 1 depends on all the components of k 1 , the hard contribution depends only on its leading component, k + 1, h . Then, C 1 and H µ ν are not completely disentangled, because a convolution over k + 1, h will survive. In the following we will drop the index "1" related to the fragmenting parton, which has become redundant. Applying the fermionic projectors and parity conservation, the only surviving contribution in the case of e + e − → H X is given by the coefficient of γ − in the expansion of Eq. (3.5): The Dirac trace of γ + C(k) j, H defines two TMD FFs (as in Eq. (3.6)): where M and S T are the mass and the transverse spin of the detected hadron, while z = P + h /k + h . The function D 1, H/j is the unpolarized TMD FF, while D 1T, H/j is the Sivers-like TMD FF. For simplicity, in the following we will collectively indicate with D j, H ( z, − z k h, T ) the sum of the two contributions in the r.h.s. of Eq. (6.12). Therefore: where the kinematics constraints over z have been taken into account. The role of the hard factor in the previous equation is played by the function W µ ν j , which is the partonic analogue of the full hadronic tensor W µ ν H . It is defined as: Notice that since the approximated parton momentum has only a plus component, we can write k + h γ − = / k = spin u( k )u( k ). Therefore, W j is the algebraic expression corresponding to the pictorial representation given in Fig. 5. Its actual definition has to be equipped with the subtraction of the double counting due to the overlapping with the collinear momentum region (see Section 6.3). In Eq. (6.13), the dependence on the parton transverse momentum is only in the collinear part and, in principle, the integrand of W µ ν H could be defined as the hadronic tensor differential in k h, T . However, the parton transverse momentum is not a physical observable but kinematics relates k h, T with the transverse momentum P p, T of the outgoing hadron in the parton frame, i.e. measured with respect to its final state jet axis that we identify with the thrust axis (see Appendix C). This can be measured (e.g. as it has been done by the BELLE Collaboration, Ref. [1]) and the definition of the hadronic tensor differential in P p, T is obtained by the change of variables (C.13)). Therefore: The differential of P p, T carries information about two variables: the modulus P p, T and the azimuthal angle β in the x y-plane of the parton frame. While the first can be measured, the angle β cannot be determined experimentally. In fact, an angular dependence in the TMD contribution D j, H can originate from the Sivers-like contribution | S T × P p, T | (see Eq. (6.12)). However, as explained in Ref. [17], the transverse spin of the hadron is orthogonal to its transverse momentum with respect to the axis of the jet, identified with the thrust axis. Hence | S T × P p, T | = ±S T P p, T for any choice of the x-axis in the parton frame. Therefore, the integration over β is trivial and results just in a 2π factor on the r.h.s of Eq. (6.15): where: 1T, H/j ( z, P p, T ). (6.17)

Factorized Cross Section
The full cross section is obtained by contracting the hadronic tensor of Eq. (6.16), and its partonic counterpart, Eq. (6.14), with the leptonic tensor, as in Eq. (6.1): where the dependence on thrust has been made explicit. Lets focus on the r.h.s. of the previous equation. The only non-zero component of the approximated parton momentum k is in the plus direction, as defined in Eq. (6.9). Therefore, its Lorentz invariant phase space measure can only be written as: (6.19) and carries information about the polar angle θ and the azimuthal angle φ with respect to the beam axis (LAB frame, see Appendix C). On the l.h.s the same variables have to appear explicitly. Hence, the Lorentz invariant phase space of the detected hadron is written in the LAB frame as well: Finally, the cross section is given by: There are five independent observables: 1. The fractional energy z = 2| P |/Q.

2.
The polar angle θ of the outgoing hadron with respect to the electron.
3. The azimuthal angle φ of the outgoing hadron with respect to the x-axis in the LAB frame. This is significant only if such axis can be defined unambiguously, as in the case of polarized leptons. Otherwise, we can simply drop dφ on both sides of Eq. (6.21) as a result of integration, which is our case.
5. The (modulus of the) transverse momentum of the outgoing hadron P p, T with respect to its final state jet axis, that we identify with the thrust axis.
Common scenarios are those in which experiments provide two or three of the variables listed above: • z and θ are measured, but the thrust axis is not reconstructed, hence P p, T is unknown. In this case, in addition to the integration over T , the previous cross section has to be integrated over all possible values of the transverse momentum P p, T , restoring the integrated TMDs as in Eq. (6.13): where we used the results of Appendix B.2. This result coincides with the cross section presented in Chapter 12 of Ref. [2]. The convolution over z is between renormalized quantities, as we did for the OPE of TMDs at small b T in Eq. (B.18).
The dependence on θ, both in the partonic and in the full cross section, can expressed in terms of longitudinal (L) and transverse (T) contributions: where x can be z in the full cross section, or z/ z in its partonic counterpart. The structure functions are related to the transverse and the longitudinal component of the cross section as follows: • z and P p, T are measured, but the polar angle θ of the outgoing hadron with respect to the beam axis is integrated over. Indeed, the measurement of the transverse momentum P p, T has to be done with respect to the jet axis, which for our purposes coincides with the thrust axis. Therefore, if the cross section is differential in P p, T , it also has to be differential in T (or in an analogous variable that allows to determine the axis of the jet) 7 .
The integration of the partonic cross section with respect to θ is straightforward and follows from Eqs. (6.23), (6.24) and (6.25): For simplicity, in the following we will collectively indicate with dσ/dx the sum of the two contributions in the r.h.s. of Eqs. (6.26). Therefore: Since TMDs are defined in the Fourier conjugate space, see Eq. (3.7), it is more convenient to write the cross section using their b T -space counterparts: where D j, H is actually only a function of the modulus of P p, T . Hence: Notice that all definitions in Eqs. (3.7) and (3.20) hold for the Fourier transformed TMD FFs D j, H . Finally, the cross section in its final form is given by: As for the cross section in Eq. (6.22) the convolution is between renormalized quantities, as we will discuss in the next Section. Furthermore, in constrast to Eq. (6.27), the cross section written in terms of the Fourier transform is a function defined for any value of P p, T and in fact the errors are only sized as M 2 /Q 2 . However, the physical meaning is lost for too large values of transverse momentum of the outgoing hadron, because the TMDs themselves are ill defined in the large P p, T region (see discussion in the end of Appendix B.2). Hence, the cross section of Eq. (6.30) has to be trusted only if P p, T << Q or, more precisely, when P p, T << P + = z Q/ √ 2, because this is the very real condition that allows to consider the outgoing hadron as a collinear particle according to the power counting rules.

Rapidity dilations and RG invariance
As it is clear from Eq. (6.30), the e + e − → HX cross section, differential in z, in the thrust T and in the transverse momentum of the detected hadron with respect to the thrust axis, P p, T , offers a direct probe of the transverse motion of partons. Recently the BELLE Collaboration has provided high statistics experimental data corresponding to such cross section [1]. For each flavor j, the final result of the previous section in Eq. (6.30) is the Fourier Transform of the convolution between a thrust-dependent hard factor, i.e. the partonic cross section integrated over θ, and a TMD FF. However, the phenomenological application of Eq. (6.30) requires special care, especially regarding the treatment of rapidity divergences.
Beside b T and the collinear momentum fraction z (which is integration variable of the convolution), the TMD FF depends on other variables. In particular, there is a dependence on the RG scale µ and on the rapidity cut-off y 1 , as shown in the definition of Eq. (3.7). Let us first consider the latter. Since y 1 is completely arbitrary and the hard factor does not depend on the rapidity cut-off, the final factorized cross section would be affected by an intrinsic arbitrariness, that will affects its predictive power. However, TMDs equipped with the rapidity dilation symmetry are invariant with respect to the choice of y 1 (see Section 3.2). As a consequence, invariance is restored in the full cross section of Eq. (6.30).
As far as the µ dependence is concerned, the final cross section is RG invariant if the UV counterterm of the hard factor is exactly equal and opposite to that of the TMD FF. This argument applies to renormalized quantities, i.e. functions provided with the proper UV counterterm. Furthermore, the hard factor in Eq. (6.30) has to be properly subtracted to avoid double counting due to the overlapping with the collinear momentum region. Therefore, the hard factor of the final cross section is defined in two steps: first it is equipped with subtractions, then it is renormalized.
The unsubtracted analogue of the hard factor in Eq.(6.30) is the partonic version of the full cross section. Being a partonic quantity, it is completely unaware of the outgoing hadron. It describes the process at partonic level, which means e + e − → f X, where f is a parton of flavor f that replaces the detected hadron. The most convenient frame where to compute σ unsub is the analogue of the hadron frame, where the momentum of f lies along the plus direction. The expression of its final state tensor is obtained from the integrand of Eq. (6.13). In b T -space, it reads: where the space-time dimension has been set to D = 4 − 2 according to the dimensional regularization prescription. The unsubtracted tensor W µ ν f , unsub is the massless limit of the partonic process, i.e. its hard approximation, and can be directly computed from Feynman graphs. For a given value of T , the phase space available for real emissions is restricted, because only the final state topology associated with that particular value of thrust can be reached. In principle, this may induce unregulated rapidity divergences. However, T acts as a regulator, and the only divergences associated with the unsubtracted tensor are the collinear divergences due to the real emissions in the direction of the outgoing parton f . In momentum space, the factorization procedure applied to W µ ν f , unsub separates out all the dependence on the transverse momentum of the fragmenting parton in a delta, as δ 2−2 ( k T ). For this reason, its Fourier Transform does not depend on b T .
The second line of Eq. (6.31) is a convolution between two objects: W µ ν j , sub and D f /j is the collinear approximation of W µ ν f , unsub and can be considered the partonic version of the TMD FFs. They can be computed perturbatively and are manifestly collinearly divergent, in fact they represent the boundary of the phase space corresponding to the emissions along the direction of the outgoing parton f . On this boundary region, the dependence on thrust is replaced by a rapidity cut-offζ kin that naturally regulates the rapidity divergences and that is fixed by a kinematics relation, different at any order in perturbation theory. In particular, in b T -space, the rapidity cut-offζ kin can be explicitly written as a function of b T . This will be discussed in detail in a forthcoming paper [22]. Particles collinear to the plus direction should have a very high rapidity, however when their transverse momentum becomes too large (in the UV region), their rapidity can be very small. Hence, the only requirement onζ kin is that it should go to +∞ when b T << Q, in order to properly subtract the soft and the back-going contributions in the TMDs (as in the factorization definition, Eq. (3.7)). Finally, the label (0) means that the partonic TMDs are bare quantities, considered without their UV counterterm and computed with bare fields. Hence, D f /j are also UV divergent functions. The other fundamental ingredient of the convolution in Eq. (6.31) is W j , the subtracted counterpart of W µ ν f , unsub , equipped with subtractions of the overlapping between the hard and the collinear momentum region. All its dependence on b T is through the rapidity cut-offζ kin , which appears only in its pole part. In fact, its poles are exactly equal and opposite to the UV divergences of the partonic TMDs and hence its pole part is canceled by Z −1 TMD j ( , µ,ζ kin ). This suggests how to write a formula involving only renormalized quantities, easily obtained by using the associative property of convolutions: where a sum over repeated upper-lower flavor indices is implicit and W µ ν j, R , sub are the renormalized, subtracted hard coefficients. They are finite quantities, b T -independent. Order by order in perturbation theory, they are determined recursively by using: j, f ( z, b T , µ,ζ kin (b T )) , (6.33) The previous result follows also from the fact that the lowest order of the partonic TMDs is just a delta D [0] z). From now on, when the labels "sub" and "R" are not explicitly indicated W j will be implicitly considered both subtracted and renormalized.
The subtraction term (second line in Eq. (6.33)) can be approximated by the (in general non-trivial) thrust distribution corresponding to the whole neighborhood of the phase space boundary associated to real emissions collinear to the outgoing parton f , properly subtracted with some function that carries all the b T -dependence. By using the kinematics relationship which definesζ kin , the combination of such function with the UV counterterm that renormalizes the partonic TMDs does not depend on b T anymore. An explicit 1 loop example will be presented in Ref. [22].
The hard coefficient of the cross section of Eq. (6.30) are defined throught the subtracted, renormalized hard coefficient W j of the hadronic tensor. They have been renormalized through Z −1 TMD (ζ kin ), which involves a precise choice of the rapidity cut-off, contrary to the TMDs that instead present an arbitrary ζ. Rapidity dilations solve this discrepancy. In fact, the convolution in the final cross section can be schematically written as: where d σ l, sub is the subtracted, unrenormalized partonic cross section. This expression is RG invariant thanks to the invariance of TMDs anomalous dimensions, Eq. (3.27), under rapidity dilations. In fact, they guarantee that the convolution of UV counterterms is a delta function, as the two Z TMD differ only by the choice of the rapidity cut-off.
Summarizing, the final cross section of Eq. (6.30) is invariant under both rapidity dilations and Renormalization Group transformations. Therefore, µ and ζ can be set to the most convenient values for perturbative calculations and phenomenological analyses. Common choices are µ = Q and ζ = Q 2 . Figure 6. Amplitude squared for the LO partonic tensor, in the limit T → 1.

A Lowest Order example
The most trivial example for thrust-dependent the cross section of e + e − → H X is the basic QCD approximation in the 2-jet limit, i.e. T → 1. At lowest order, the subtraction mechanism is trivial and the subtracted, renormalized hard coefficient are easily computed from Eq. (6.33): In the 2-jet limit, the only Feynman diagram contributing to the l.h.s of the previous equation is given by Fig. 6. It is an exact 2-jet configuration, hence T = 1. The actual computation is easier for the projections (see Eqs. (6.4) and (6.5)): (z, T ) = 0. (6.37) Notice that the gluon contribution is always suppressed in a 2-jet configuration. Then we can compute the lowest order subtracted, renormalized structure functions: 1, f (z, T ). (6.39) Finally, by using Eqs. (6.24), (6.25) and (6.26), the lowest order subtracted, renormalized partonic cross section appearing in the final result of Eq. (6.30) is given by: where the factor a T accounts for the limited acceptance in the polar angle θ. In the following, the detected hadron will be considered spinless for simplicity. Hence, the Siverslike contribution disappears and in the cross section will remain only the unpolarized TMD FF D 1 . Its rudest estimate is the Leading Log (LL) approximation, given by: where L b = log (Q/µ b ) and the functions g LL 1 , g LL 2 are given in Eqs. (B.15). Therefore, the lowest order, leading log cross section is given by: Notice that this formula represents the simplest, non trivial approximation above a parton model picture. It holds valid to LO in the perturbative expansion and at T=1. Therefore, a complete phenomenological analysis should not rely on Eq. (6.42), but rather on the full NLO expression, with the appropriate order of logarithms. This will soon be available in Ref. [22]. In the following, we will give a prototypical application of this LO cross section formula to a small sub-sample of the BELLE data [1], which should only serve as an example of the rapidity dilation mechanism discussed in Section 3. The simplicity of its usage and the small number of free parameters involved in the fitting procedure are indeed the points of strength of Eq. (6.42). For our example, we will consider only the subset of the BELLE e + e − → HX cross sections, corresponding to 0.55 < z < 0.6, 0.85 < T < 0.90, in 20 P T bins ranging from 0.06 to 2.5 GeV. For the BELLE experiment Q = 10.58 GeV. This data sub-sample is shown in Fig. 7. Statistical and systematical errors are added in quadrature. The analysis will be performed using the NNFF10 fragmentation function set at LO [23], and fixing the values of b MIN and b MAX as follows Let's now suppose that, somewhere around the globe, Group A performs a phenomenological analysis of the above BELLE data subset using a power-law parameterization of the model in P T -space which, in the b T space, corresponds to a Bessel-K function, normalized in such a way that it is 1 at b T = 0: where K −1+p is the modified Bessel function of the second kind. This model was successfully used in Ref. [24] to fit the e + e − → HX cross sections measured by the TASSO and MARKII Collaborations [25,26]. Group A knows that the TMD cross section will become unphysical as P T grows larger, as it is only valid in the TMD region where P T << P + (here P + = z Q/ √ 2 ∼ 4.3 GeV). Therefore they fix P T,MAX = 1.8 GeV. After this point the cross section will rapidly fall to zero and become negative. Having set their rapidity cut-off at ζ = Q 2 , Group A best fit returns m A = 0.35 and p A = 3.00 for their two free parameters. The resulting cross section is shown in Fig. 7 (red, solid line).
On the other side of the planet Group B, totally unaware of the work of Group A, performs a fit on the same data sample, but they choose a Gaussian parameterization for the model of their cross section (clearly the perturbative part has the same functional form in both cases) Here there is only one free parameter, m, as the power p has been fixed to 2 to obtain a Gaussian form. They set their rapidity cut-off to ζ = Q 2 /4 and decide to be conservative on their TMD-regime requirement, so they fix P T,MAX = 1.3 GeV. Their fit has only one free parameter, m B , which the χ 2 minimization procedure sets to 0.12. The corresponding cross section is shown in Fig. 7, by the green dashed line. Notice that, in principle, there is at least one more free parameter in both analyses, which is used to model the g K function, see Eq. (6.42). As explained in Section 3, however, g K does not depend on the rapidity cut-off, nor on the flavour j of the fragmenting quark. Therefore, it does not play any active role in a rapidity dilation and is not relevant in this example. We will therefore suppose it to be the same for Group A and B and parameterize it as g K = a b 2 T with a = 0.11, fixed "a priori". As it is clearly shown in Fig. 7, the results obtained by Group A and B are consistent, within errors, as they fit the same data sample. Similarly, also the TMD fragmentation functions extracted by the two groups will be consistent at small P T , where they carry a truly physical information about the transverse motion of the hadronizing parton. In b T -space, the two TMDs are very similar at small b T but they may differ in their large b T behaviour, because of the different choices of models, M A (b T ) and M B (b T ).
It is at this point that Eq. (3.34) becomes crucial: in fact, it allows the two Groups to relate their independent extractions through a rapidity dilation. The two extractions will not correspond to a one-to-one relation in b T -space, nevertheless rapidity dilations preserve the physical meaning of TMDs and ultimately the predictive power of the theory. First of all, Group A performs a rapidity shift on their extraction: as expected the cross section is not invariant for a variation of the rapidity cut-off. This is illustrated in fig. 8 Here θ = log 2. This is shown in Fig. 9, where the black solid line represents the results of Group B for the non-perturbative contribution to the full cross section (left hand side of Eq. (6.45)), while the green line corresponds to the results of Group A for the analogous quantity after the application of a rapidity dilation (right hand side of Eq. (6.45)). Notice that σ N P A is related to σ N P B by a factor which is purely perturbative and therefore calculable and totally model independent, see Eq. (6.45). Fig. 10 shows the ratio of these two curves as a function of P T , R NP . In an ideal world, were all extraction converged to the same model, R NP would be 1 at all values of P T (dashed red line). However, in a realistic case R NP is very close to 1 only at small P T , as it should, and it starts deteriorating as P T grows larger. It is not by chance that it stays close to 1 up to P T ∼ 1.3, which corresponds to the value of P T,MAX set by group B. After that point, the cross section starts to become unphysical and the ratio itself becomes meaningless. In Fig. 10 a thin gray vertical line marks P T,MAX = 1.3 GeV. Notice that the invariance under rapidity dilation is considerably powerful: it allows to preserve the physical part of the cross section, embodied by the TMD function at small P T , even in a realistic situation in which a very limited range of P T is constrained by experimental information, while compensating for the variation of the rapidity cut-off in the perturbative part by a transformation of the non-perturbative model. It therefore preserves the predictive power of the theory by making it independent of the choice of the rapidity cut-off.

Conclusions
In this paper we have extended the TMD factorization mechanism to processes belonging to different hadron classes. This is potentially a very powerful tool, as it allows us to exploit the same definition of TMD parton densities in different processes, which up to now could not be used in a simultaneous data analysis. With this extended definition of TMD, in particular, we have been able to apply the TMD formalism to the process of one hadron production from e + e − scattering, belonging to the 1-h hadron class. Within this scheme, the TMD FFs extracted from a phenomenological analysis of the P T dependent e + e − → HX cross sections, can be related to the analogous TMD FFs as extracted in a 2-h class process, like SIDIS or e + e − → H A H B X.
Clearly the extension of the factorization scheme comes to a price, a price that in this case turns out to be rather large and two-folded. First of all the soft factor, which is responsible for a (partial) breaking of universality, cannot be included in the definition of the TMD, as it is elegantly done in the standard TMD factorization through the "squareroot" TMD definition. Freed by its soft contribution, TMD becomes truly universal and can be used in any class of processes. The soft factor, however, assumes a fundamental role as it becomes a pivotal ingredient of factorized cross sections, where the non-perturbative effects of soft physics are embodied by the soft model M S . It will have to be extracted within its corresponding hadron class and should only be used within that class. The process e + e − → HX is a slightly exceptional case, as the soft factor here becomes unity, as shown in Section 6.
Having recovered a solid and truly universal definition of a TMD, we can use it to factorize cross sections as that of e + e − → HX, where there is only one single TMD playing the role of the long-distance contributions. Hence we have to face an additional problem: the arbitrariness in the choice of the rapidity cut-off reflects in the cross section, undermining its predictive power. To make the TMD independent of the choice of the rapidity cut-off, they have to be made invariant under a specific transformation, which we call "rapidity dilation".
Invariance under rapidity dilations is the most important and innovative point in this work, and it is at the very heart of the technical mechanism which allows us to reabsorb the arbitrariness of the rapidity cut-offs in our extended factorization scheme. This is somehow similar to the action of the renormalization group, where the energy scale µ is introduced as an arbitrary parameter to regularize UV divergences: in principle µ can be fixed to any value, but physical observables turn out not to depend on the choice of µ. Similarly, if rapidity dilation invariance is restored in physical observables, the predictive power of the scheme is preserved.
Moreover, rapidity dilations regulate how the perturbative and non-perturbative contributions are balanced within the TMD itself. In fact, in principle, the cut-off has to be taken very large (y 1 → ∞) but in practical computations there is total arbitrariness in choosing its particular value. Rapidity dilations control this arbitrariness by acting both on the rapidity cut-off and on the model M C . The larger y 1 the more M C is suppressed, and the TMD is, basically, only perturbative. Less extreme values of y 1 , instead, will correspond to a more dominant non-perturbative contribution.
Separating perturbative and non-perturbative contributions is a highly non-trivial problem, which affects any phenomenological analysis. For example, ambiguities originate when we have to fix the value of b M AX , which marks the critical value of the impact parameter at which non-perturbative contributions start becoming non negligible. TMDs are well defined within the approximation in which the partonic k + is very large while k T is small (i.e. collinear according to power counting), they should therefore correspond to partons with a very large rapidity and very small transverse momentum with respect to the jet axis. The rapidity cut-off y 1 , formally, will have to be taken to infinity but, in practice, the specific size of y 1 will determine how far we stretch the perturbative content of the TMD and where the non-perturbative contribution will become dominant.
To clarify the practical relevance of rapidity dilation invariance, in the last Section of this paper we have concentrated on the e + e − → HX cross section, presenting a simple, lowest order example to show how rapidity dilations can offer a tangible help in relating phenomenological analyses performed using different non-perturbative model assumptions and different values of the rapidity cut-off, and a solid basis for the interpretation of the results of independent TMD extractions.
Finally, we want to stress that the scheme we are proposing does not require a new start in the phenomenological analysis of all classes of hadronic processes. In fact, we can relate the TMDs obtained from data analyses based on the square root definition to the TMDs extracted using the factorization definition. This allows us to benefit all previous phenomenological analyses and extend them to 1-h class processes. This is indeed the strategy we are planning to pursue in the near future.
coupling constant and the gluon field are bare quantities, as indicated by the label "0". In the previous formula, t a are the generating matrices of the gauge group, in the appropriate representation. The Wilson lines guarantee that PDFs and FFs (in both collinear and TMD cases) are gauge invariant, by linking the quark to the anti-quark fields in the definition of the collinear factor (see Eq. (3.4) ). The Wilson line represents the (all order) propagation of a particle strongly boosted in some direction n. If this direction is a straight line the Wilson line depends only on the endpoints of the path and can be written in a compact way as: If the strongly boosted particle is a quark, the associated Feynman rules are: More details can be found for instance in Chapter 7 of Ref. [2].

B Perturbative QCD and small b T region
Soft factors and collinear parts are well defined functions only over a rather small region in the transverse momentum space, according to power counting rules. The Fourier transform to the impact parameter space can be regarded as a kind of analytic continuation, because at fixed b T we can roughly access all transverse momenta with k T ≤ 1 b T , even trespassing the original momentum region. In particular, the small b T region is associated with large transverse momenta, where perturbative QCD can be applied and a power expansion in α s allows us to perform explicit calculations. This can be proved by a direct application of the factorization procedure to the small b T approximation of the Fourier transformed function. For the soft factor this can be found in Section 2, while for collinear parts we refer to Chapter 13 of Ref. [2] and to Ref. [27].
Despite the undeniable advantage provided by the possibility to perform explicit calculations in the small b T region, perturbative QCD is not enough to reproduce integrated quantities, which correspond to the Fourier transformed functions evaluated in b T = 0. These can be recovered from the operator definitions, that obviously give a nonperturbative, all-order point of view. Therefore in b T = 0, Eqs. (2.2) and (2.6) simply confirm that the integrated soft factor is the identity matrix, while Eq. (3.3) reproduces the integrated PDFs and FFs. The failure of perturbative QCD in b T = 0 is due to the fact that the integral over k T is intrinsically ill defined, since it extends well beyond the physical momentum region where the TMDs and the soft factor are defined. As a consequence, new UV divergences arise and the counterterms in Eqs. as bare quantities and they need a renormalization in order to acquire physical meaning and reproduce the correct results. In the following, such renormalization procedure will be investigated for both the 2-h soft factor and the TMDs.

B.1 Small b T behaviour of 2-h Soft Factor
The Feynman graphs in Fig. 11 show that in the small b T region the (renormalized) 2-h soft factor is given by: This expressions implies that K(b T , µ) is large and positive as b T decreases. Therefore, the resummed soft factor of Eq. (2.16) vanishes in b T = 0. An improvement can be reached by using a leading log estimate of K by using its evolution equation solution, Eq. (2.10). Actually, it is inappropriate to count the logs of a quantity, like the soft kernel, which is already the result of a resummation procedure. Despite this, we can apply the same recipe and set all terms to order α 0 S except γ K , which has to be taken to 1 loop. This gives: that coincides with Eq. (B.2) in the limit α S → 0. With this estimate, the divergence of K is much less severe but it is still there. An easy way to solve the problem and ensure that the perturbative QCD computation agrees with the operator definition prediction is to introduce a cut-off that prevents the soft transverse momentum to reach the UV Then, the integrated soft factor is given by the unintegrated

B.2 Small b T behaviour of TMDs
Formally, the integrated TMD is the Fourier transformed TMD computed at b T = 0. In order to recover this result from Eq. (3.20) by applying perturbative QCD, the Fourier transformed TMD has to be renormalized, otherwise it would vanish in b T = 0. This result can be proved by following the procedure described in Ref. [14]. First of all, thanks to the properties of the model M C , Eq. (3.17), and of g K , we can neglect all the non-perturbative content in Eq. (3.20) at small b T . Furthermore, in this region we can approximate b T with b T . Then, it is a standard result that the α S expansion of Wilson coefficients can be written as: where [k/2] denotes the integer part of k. If the scales are fixed according to the standard choices of Eqs. (3.14) and (3.15), all the logs disappear and the only b T dependence in the Wilson coefficients is given by α S (µ b ). Since µ b ∝ 1/b T , when b T → 0 the energy scale becomes very large and α S can be considered a small parameter. For example, at 1 loop: Then, the Wilson coefficients evaluated at the scales µ b , ζ b are well approximated at small b T by their lowest order term, which is simply a delta function: On the other hand, K, which is at exponent, allows for a different number of logarithms in front of each power of α S : As in the previous case, all the explicit dependence on b T vanishes if µ = µ b and K at small b T is well approximated by its lowest order term. However, since in this case the series starts from O (α S (µ b )), we can simply neglect this contribution. Finally, the anomalous dimensions have a simple expansion in α S : K . (B.10) The easiest way to study the behavior of their contribution in Eq. (3.20) at small b T is to consider its derivative with respect to log b T . Since ∂/∂ log b T = −∂/∂ log µ b , we can compute with the help of Eq. (B.6): This behavior affects the whole TMD at small b T , giving: From Eq. (B.2) γ [1] K = 16 C F , then the TMDs goes to zero when b T → 0 with a power-law behavior. This is also confirmed by a direct computation of the leading log (LL) estimate of Eq. (3.20). In this approximations, all the quantities are taken at order α 0 S , except γ K which instead is computed at 1 loop. The result is: where L b = log µ/µ b , a S = α S /4π and: Notice that the function g LL 2 contributes to the LL estimate even if it typically appears at NLL. This is due to the presence of three scales instead of two. In fact, if √ ζ equals either µ or µ b , only g LL 1 contributes to LL.
As a consequence of the previous arguments, integrated TMDs are bare quantities when approached perturbatively. Formally: H/j (z, µ) final state. (B.16) The bare integrated TMDs in the equation above acquire their dependence on µ through the renormalized fields used to compute them. Real bare quantities are defined through bare fields and are obtained by multiplying by Z 2 as in Eq. (3.7). Notice that integration makes the soft-collinear subtractions trivial, because the S 2-h appearing in the factorization definition is unity when integrated over all soft transverse momentum. The required UV counterterm depends on the plus component of the momentum of the reference parton, i.e. on the collinear momentum fraction ξ. Hence, the renormalized quantities are not simple products of the bare quantities with the UV counterterm, like in Eq. (3.7), but rather convolutions c j, H (ξ, µ) = (Z int ) k j (α S (µ)) ⊗ c The factorization procedure applied to the TMD at small b T does not give Eq. (3.18) directly. Instead, it expresses the final result as a convolution between a collinear part, represented by the unrenormalized integrated TMDs, and a hard factor H which has to be properly subtracted in order to cancel the double counting due to the overlapping between the hard and the collinear momentum region. This subtraction mechanism is completely anologous to that used in the definition of the subtracted collinear part in Eq. (3.4). Roughly speaking, the UV part of the bare integrated TMDs is (minus) Z int , then the subtracted hard part acquires the divergence induced by the counterterm. Despite this, we can still define a finite hard part by interpreting H sub as a bare quantity as well, with its renormalized finite counterpart represented by the Wilson Coefficients in the OPE. As a consequence, the required counterterm will be exactly Z −1 int . Then, a straightforward application of the convolution property shows that: where in the last step we used the OPE expansion valid at small b T . In general, this result does not coincide with c f, H (ξ, µ), but it will do if the Wilson Coefficients can be well approximated by their lowest order. If µ can be considered a large energy scale (e.g. if it can be set equal to the hard energy scale Q of the process) then we can set b MIN ∝ 1/µ. Then all the logs inside the Wilson Coefficients are heavily suppressed and the lowest order approximation is reliable. Therefore, if µ is large enough, the cut-off approach gives the same result of the renormalization through the UV counterterm Z int . Thanks to b MIN , the subtraction mechanism implemented in the factorization procedure applied to the TMD at small b T is now applied to the collinear parts instead of the hard factor. Therefore, we do not have to worry about subtracting the hard part. However, the final result coincides with that of Eq. (3.18) because, trivially, H sub ⊗ C unsub = H unsub ⊗ C sub .
The integration over k T of the TMD, actually gives the area under the curve designed by the TMD in k T -space. Even with the introduction of an explicit b MIN , the value of such integral is very small. Since in momentum space, at small k T , the TMD is positive (e.g. Gaussian behavior), the small value of the integrand implies that the TMD has to change sign at a certain k T . This is equivalent to say that the TMD loses its physical meaning when k T becomes too large. In fact, the power counting imposes k T ∼ λ, where λ is some small IR energy scale.

C Kinematics
As stressed in Section 1.1, kinematics play a crucial role in factorization, as it determines whether we need to apply a TMD or a collinear factorization scheme. The study of kinematics is strictly connected to the choice of the frame. In the case of e + e − → H X, three four-vectors underlay the kinematical configuration: • The momentum k of the fragmenting parton.
• The momentum P of the outgoing detected hadron H of mass M , P 2 = M 2 .
• The momentum q of the highly virtual time-like photon that makes the partonic state. Its squared momentum gives the square of the center of mass energy Q >> M , q 2 = Q 2 .
Clearly, the choice of the frame is completely arbitrary since the cross section will be Lorentz invariant. Three main frames are useful in deriving the final form of the factorized cross section: in this appendix we will provide a short description of all of them.
1. Hadron frame, labeled by h. This is the frame where the outgoing hadron H has no transverse components and it moves very fast along the (positive) z h -direction: Furthermore, since H is strongly boosted in the plus direction its plus component is very large, of order ∼ Q. As a consequence, its minus component has to be very small in order to satisfy the on-shell condition P 2 = 2P + h P − h = M 2 . Therefore, in this frame, the full four-momentum P can be written as: The fragmenting parton belongs by definition to the same collinear group of the outgoing hadron, hence it is almost collinear to it: it has a very large plus component, a low transverse momentum and an even lower minus component. It is almost onshell, with a very low virtuality. Power counting (see Chapter 5 in Ref. [2]) allows us to quantify the sizes of these quantities by introducing a small infrared scale λ << Q. Then k 2 = λ 2 , which means k + h ∼ Q, k − h ∼ λ 2 /Q and k h, T ∼ λ. Neglecting all the suppressed components, k and P become exactly collinear, i.e. k ∝ P . This can be made explicit by setting: where k α is the momentum of a generic real emission. As explained in Section 1.1, since the process e + e − → HX belongs to the 1-hadron class, there is always at least one real emission (in this case the anti-quark leg that does not fragment) with a hard momentum, i.e. with all components very large, at least of order Q. As a consequence, the only component of k that survives in Eq. (C.8) is k + h , while all the others are strongly suppressed by the large momenta k α .
2. c.m. frame, labeled by γ. In this frame the spatial momentum of q is zero q γ = 0 , (C.9) which means (C.10) Since rotations send null spatial vectors into null spatial vectors, the condition in Eq. (C.9) is defined modulo a rotation in space. Therefore, if we set the z-axis of this frame to be the direction of the outgoing hadron, we can identify the hadron frame with the c.m. frame and apply power counting and the whole factorization procedure directly in this frame. This is a big advantage, since usually the calculation of the hard part of the cross section is much easier in the c.m. frame but in general it does not coincide with the hadron frame, which on the other hand makes simpler the application of the factorization procedure 8 . Then we can write the components of q in the h-frame as in Eq. (C.10). From Eq. (C.8) it follows that the total transverse momentum of the real emissions exactly cancels the contribution of k T, h , hence | α k α, T, h | ∼ λ.
Notice that the LAB frame, in which the z-axis coincide with the beam axis, is a valid c.m. frame but it is not the hadron frame, as they differ by a spatial rotation, as shown in Fig. (14). The lepton pair is back-to-back in both the frames, but the direction of their spatial momenta is different.
3. Parton frame, labeled by p. As explained in Ref. [2], in order to properly define a fragmentation function we need a frame in which the fragmenting parton has zero transverse momentum. This is the parton frame, defined by requiring k T, p = 0 T . (C.11) In principle we have two Lorentz transformations available that we can use to reach the parton frame from the hadron frame: a rotation of the (small) angle between the 8 For example, this is the case of e + e − → HA HB X, with the two hadrons almost back-to-back. In this case, the hadron frame is defined as the frame in which both hadrons have zero transverse momentum, i.e. where they are exactly back-to-back. However, a spatial rotation can fix only one hadron and the c.m. frame cannot be identified with the h-frame. The two frames are actually connected by a light boost in the transverse direction, where the boost parameter is (proportional to) q T, h . As a consequence, we need boost-dependent projectors connecting the collinear and the hard parts of the cross section. In principle, we can use a boost also in the case of the production of a single hadron, however the boost will depend on q T, h which, in this case, is not observed.
parton pair, while for three partons T p = max{x 1 , x 2 , x 3 } ≥ 2/3, with x i = 2| k h, i |/Q, and n p is the direction of the i-th parton. Since P p, T is strictly connected to k h, T , as shown in Eq. (C.13), its measurement offers powerful information on the partonic variables. However, the experimental measurement is on the transverse momentum of the outgoing hadron with respect to the hadron thrust axis n h , which is the direction that maximizes the hadronic thrust T h defined as where now the sum runs over all the detected particles in the center of mass frame (e.g. the LAB frame). Its value is close to its partonic counterpart, but they are not the same. As shown in Ref. [28], the observed distribution of hadronic thrust is related to the distribution with respect to the partonic thrust (which can be computed in perturbation theory) by a correlation function C(T h , T p ) that is sharply peaked around T h ∼ T p . Therefore, roughly speaking, we can set C(T h , T p ) ∼ δ(T h − T p ) and the direction which maximizes the hadronic thrust is approximately the same axis that maximizes the partonic thrust, i.e. n p ∼ n h . The estimate of how much they differ can be made more quantitative in the simple case of a 2-jet configuration.
In fact, in this case we have T p = 1 and T h ∼ 1 − (M 2 1 + M 2 2 )/Q 2 (see Ref. [28]), where M 1, 2 is the invariant mass of the hadronic jets, hence T p − T h ∼ O(M 2 /Q 2 ). In this paper, we consider P p, T as a valid estimate of the transverse momentum of the outgoing hadron with respect to the hadronic thrust axis.