Kinematic power corrections in TMD factorization theorem

This work is dedicated to the study of power expansion in the transverse momentum dependent (TMD) factorization theorem. Each genuine term in this expansion gives rise to a series of kinematic power corrections (KPCs). All terms of this series exhibit the same properties as the leading term and share the same nonperturbative content. Among various power corrections, KPCs are especially important since they restore charge conservation and frame invariance, which are violated at a fixed power order. I derive and sum a series of KPCs associated with the leading-power term of the TMD factorization theorem. The resulting expression resembles a hadronic tensor computed with free massless quarks while still satisfying a proven factorization statement. Additionally, I provide an explicit check of this novel form of factorization theorem at the next-to-leading order (NLO) and demonstrate the restoration of the frame-invariant argument of the leading-power coefficient function. Numerical estimations show that incorporating the summed KPCs into the cross-section leads to an almost constant shift, which may help to explain the observed challenges in the TMD phenomenology.


Introduction
The transverse momentum dependent (TMD) factorization theorem is known to be an invaluable tool in understanding the transverse momentum distribution of inclusive and semi-inclusive processes across a wide range of energy regimes.The transverse momentum distribution of Z-boson or Higgs bosons, semi-inclusive deep-inelastic scattering (SIDIS), single-and double-spin asymmetries -these are only a few examples of successes of TMD factorization approach (for review see [1,2]).As the list of accomplished applications continues to expand, it becomes evident that the current form of TMD factorization has practical limitations.It can only be confidently applied within a restricted phase-space, which leads to many problems and tensions already today (see f.i.discussions in [3][4][5]), and will be critical for the future generation of colliders [6,7].Therefore, a systematic extension and refinement of the TMD factorization approach are crucial priorities for modern studies.
A principal feature of the TMD factorization approach is its ability to systematically account for the transverse momenta of partons within the hadron.As a result, the main application domain is processes characterized by transverse momenta compatible with typical hadronic scales.At the same time, the characteristic longitudinal momenta must be large enough to justify the application of the parton model.In this study, I focus on the Drell-Yan reaction (h 1 + h 2 → γ * + X), which is the classical case of the TMD factorization [8].In this particular reaction, the relevant momenta are the transverse (q T ) and longitudinal (q ± ) components of the produced-photon momentum.
The leading term of the TMD factorization theorem is derived [9][10][11] in the limit of the longitudinal momenta being greater than any other scales.In the position-space representation, the leading term has the general form (omitting the scaling argument for brevity) where F 1,2 are TMD parton distribution functions (TMDPDFs) in the Drell-Yan case, x 1,2 are the longitudinal momentum fractions of partons.C 0 is the hard coefficient function which depends on q + q − /µ 2 with µ being the factorization scale.TMD distributions satisfy the pair of evolution equations [12][13][14], which have multiplicative kernels in the position space.Due to it, the positionspace representation (1.1) is convenient in practice.The momentum-space representation of (1.1) is where F 1,2 are TMDPDFs in the momentum space obtained by the Fourier transform of F 1,2 .Explicit examples of these structures for various processes can be found in refs.[15][16][17].The perturbative elements of the factorization formula (1.1) (the hard coefficient function and the evolution kernels) are known up to next-to-next-to-next-to-next-to-leading order (N 4 LO), making it one of the most developed formula in the perturbation theory.
The corrections to the leading term are suppressed by powers of q ± .To simplify the exposition, I generally refer to the large scale as Q (∼ q + ∼ q − ).The power corrections can be categorized into four conceptual types: k T /Q power corrections: In the position-space representation, these corrections manifest as transverse derivatives of TMD distributions, while in the momentum representation, as the powers of k T 's.These corrections are commonly known as kinematic power corrections (KPCs).
q T /Q power corrections: In the position space, these corrections arise as terms accompanied by inverse powers of b.Once integrated, such factors transform into powers of q T .Λ/Q power corrections: These corrections indicate terms that contain nonperturbative distributions of higher twist.Generally, a TMD distribution of twist-n is considered as a (Λ/Q) n−2 correction.These corrections are often referred to as genuine power corrections.
Target-mass corrections: These corrections account for the finite mass of the hadron.Currently, they are the least studied corrections.
Together, these power corrections form a complex system, which is discussed further in the remaining part of the introduction.Importantly, each type of power correction exhibits distinct theoretical and practical features, making some corrections more significant than others in different circumstances.In this study, I present the derivation, summation, and discussion of (pure) KPCs and examine their practical importance.
In recent years, there has been significant progress in the study of power corrections to the TMD factorization theorem [18][19][20][21][22][23][24][25].This progress has been made possible by formulating the TMD factorization on the operator level, as opposed to the method-of-region analysis [9] used in the original derivation of the leading power (LP) term.As a matter of fact, the power corrections are simpler to analyze with operator methods since they allow for unambiguous identification of the different types of power corrections.This is crucial for a comprehensive understanding of the structure of the factorization theorem beyond the LP approximation.Still, the status of the TMD factorization theorem beyond the LP approximation remains an open problem.Currently, there exists a complete expression for TMD factorization at the next-to-leading power (NLP) level.It has been derived with three different methods [21][22][23] (and agrees between them), and checked at next-to-leading order (NLO) [22,24,26].Furthermore, several general statements (such as the all-order structure of rapidity divergences for twist-three distributions [22,23], the cancellation of special rapidity divergences [24,26]) provide us hope that the NLP TMD factorization is valid at all perturbative orders.The research on the next-to-next-to-leading power (NNLP) TMD factorization is very limited [19][20][21].
The brute force evaluation of power correction is impractical.Already at NNLP, one faces many novel theoretical problems and a vast number of structures.Therefore, to proceed further, it is crucial to understand and account for the general hierarchy of power-suppressed terms in the TMD factorization framework.This hierarchy can be deduced from the power counting and the dimension analysis.Naturally, there is a degree of arbitrariness in such a generalization.Nonetheless, if TMD factorization holds at higher powers, one should expect it to exhibit the following general structure (omitting target-mass corrections for simplicity) where Φ n is a TMD distribution of twist-n, which depends on b and some number of collinearmomentum factions.The power of D indicates the number of boost-invariant transverse derivatives (defined in (3.39)) in the term, which acts to TMD distributions in various combinations.The symbol × indicates an integral convolution in the collinear-momentum fractions.Each term is equipped with a coefficient function which is not shown.The first line in the brackets is the LP term (1.1).The second line is the NLP term (see [19,22]), and so on.Each term in expression (1.3) represents a complicated composition of various distributions.
In expression (1.3), one can easily recognize power corrections of different types.So, the first term in the third line (DΦ 2 × Φ 2 ) is the correction ∼ k T /Q, and the second term (Φ 2 × Φ 3 ) is the correction Λ/Q.In the fourth line, the first term (D 2 Φ 2 × Φ 2 ) is k 2 T /Q 2 , the second term (DΦ 2 × Φ 3 ) is k T Λ/Q 2 , the third and forth terms (Φ 3 × Φ 3 and Φ 2 × Φ 4 ) are Λ 2 /Q 2 , and the last term (Φ 2 × Φ 2 /b 2 ) is q 2 T /Q 2 .And so on.In eqn.(1.3), the terms with different order of k T /Q corrections but the same order of other corrections are aligned into columns.These columns represent a series of KPCs to the first term.Each column incorporates unique nonperturbative content, as it contains a combination of TMD distributions that do not appear in other columns.Hence, each column can be considered as an independent contribution to the TMD factorization theorem.All global properties expected from the hadronic tensor, such as charge conservation and frame invariance, must hold for each column individually.
It is important to emphasize that the decomposition (1.3) is based on the assumption that TMD distributions of different twists (i.e., Φ n and Φ m for n = m) are completely independent functions.This assumption holds if these distributions do not mix under TMD evolution.The requirement for non-mixture is a fundamental prerequisite because otherwise, power corrections of different types would entangle due to evolution or by higher perturbative orders of coefficient functions, and the presentation form (1.3) would be impossible.Moreover, it would violate the universality property of nonperturbative distributions1 .Therefore, the problem of systematization of power corrections in TMD factorization is equivalent to the problem of the definition of TMD-twist.This work adopts the definition of TMD twist introduced in ref. [22], which constitutes a TMD generalization of the concept of geometrical twist (aka "dimension minus spin") used for collinear operators [27,28].This definition ensures, at least, the independence of non-mixture under ultraviolet renormalization.
Apart from the theoretical challenges posed by power-suppressed terms, there is also a practical concern regarding their relevance.It is evident that a plethora of TMD distributions appear in the power corrections, and the number of new functions grows faster than the number of observables.Already at NLP, one encounters 32 twist-three TMDPDFs [26] (in addition to the 8 TMDPDFs at the leading power).Identifying and constraining all these functions from experimental data alone is practically impossible.In this regard, genuine power corrections can be characterized as "bad" as they introduce unknown functions (although offering insights into new facets of quantum physics).The other types of power corrections are "better" as they do not increase the number of unknowns but instead refine and extend the LP term.
Thus, it is sensible to incorporate power corrections by types rather than by orders.First, this approach preserves the validity of the factorization theorem, as different types of power corrections are additive and do not interfere with each other.Secondly, each type of correction contributes to a specific kinematic region and could be distinguished phenomenologically.It is convenient to consider the limit Q bigger than any over-scale as four independent statements: Including power corrections of a specific type softens the corresponding restriction, while summing the power corrections of a specific type eliminates the restriction (or replaces it with a strict inequality).At low Q, all types of power corrections are significant simultaneously.Notably, the q T /Q corrections are special, as their magnitude is controlled by the experimental kinematics.By including only q T /Q corrections, the cross-section can be described up to q T ∼ Q, while the other corrections remain negligible (for large Q).The remaining corrections follow a hierarchy of k T > m > Λ and are valid in the regime q T Q.In this work, I study KPCs for the LP term.Within the expression (1.3), these power corrections are represented by the first column.Notably, these power corrections are of utmost importance within the conventional range of application for TMD factorization q T Q. Primarily, KPCs play a crucial role in restoring the electromagnetic (EM) gauge invariance (charge conservation) and frame invariance of the hadronic tensor, both of which are explicitly broken in the LP term.Consequently, KPCs can be viewed as an integral part of the LP factorization, as the absence of these corrections render the LP factorization inconsistent.Furthermore, unlike other scales such as q T , Λ, or m, the scale k T is not fixed but rather an integration variable (1.2).This introduces a self-contradiction, as the factorization formula includes values of k T that exceed Q, contradicting the initial assumptions.Moreover, the average value of k T is on the order of the ultraviolet cut-off due to the large-k T asymptotic behavior of TMD distributions (∼ k −2  T ).An extra point in favor of KPCs is that the factorization theorem for them does not require dedicated proof.KPCs inherit the factorization property from the LP term, for which the theorem is already proven.Considering these reasons, it is sufficient to include KPCs independently of other power corrections.
The paper is structured as follows.In sec. 2 I review the concept of the TMD-twist and present in detail the simplest (but the most useful) example of twist-decomposition for the quark operator (sec.2.2) and for a general bi-quark TMD distribution (sec.2.3).Sec. 3 is dedicated to the derivation of KPCs order-by-order in power expansion.Here, I start from the sketch of the operator derivation of the LP term (sec.3.1) and continue to this derivation to all-powers (sec.3.2).In sec. 3 I inspect the influence of loop corrections for KPCs and derive the argument of the coefficient function.The summation of the KPC series is done in sec.4. To perform the summation, I first sum the series of twist-two terms for a general TMD correlator (sec.4.1) and then insert it into the factorized expression (sec.4.2).In sec.5, I inspect the properties of the summed expression, using the example of a completely unpolarized contribution to hadron tensor of Drell-Yan.The subsections 5.1 and 5.2 are devoted to the EM gauge invariance and the frame invariance, correspondingly.Finally, the estimation of numerical impact is presented in sec.5.3.

Twist decomposition in TMD factorization
The definition of TMD-twist is a vital for systematisation of power corrections.Given this definition, it becomes possible to decompose operators of any dimension into irreducible components and express their matrix elements in terms of universal and independent nonperturbative TMD distributions.A suitable definition of TMD-twist was proposed in reference [22].TMD-twist is determined by a pair of geometrical twists associated with the semi-compact operators that constitute any TMD operator.This approach strictly ensures non-mixture of operators with distinct TMD-twists under evolution with the ultraviolet (UV) renormalization scale µ.Although the nonmixture under the rapidity evolution remains questionable, this property has been demonstrated in [22] for a wide range of TMD distributions (given by quasi-partonic operators), which encompasses all TMD distributions of twist-two and twist-three.Furthermore, the equivalence of rapidity evolution between TMD distributions of TMD-twist two and three was established in [23].Therefore, this definition is satisfactory for the current objective, namely the determination of KPCs for the LP term, as they solely include TMD distributions of twist-two.
In this section, I provide a review of the general definition of TMD-twist and present the simplest yet non-trivial example of twist-decomposition.This example serves as the foundation for the subsequent computation of KPCs.

TMD-twist
At any power of TMD factorization theorem TMD distributions have the same general structure, where U is a light-cone operator, labels A and B indicate the quantum numbers of U , and {z} A is a set of coordinates z.The operator U ({z} A ; b) is a T-ordered product of QCD fields positioned at (z i n + b) for z i ∈ {z} A .The fields in the operator are connected by light-like Wilson lines, which continue to the light-cone infinity.So, the operator U spans an infinite range, and for that reason it is called semi-compact.
The renormalization of TMD distribution consists of two renormalization factors -one for each operator U .These factors are independent since the distance between operators is transverse and thus UV finite.The evolution equation reads where γA is the anomalous dimension of U A and ⊗ is the integral convolution in corresponding positions {z i }.Note, that anomalous dimensions γ contains the double logarithmic part, which is resulted from gluon propagated along Wilson line to the infinity (collinear singularity).The same singularity is presented in the rapidity renormalization factor (or soft factor).Their cancellation leads to a logarithm contribution ∼ ln(µ 2 /ζ) in the anomalous dimensions.The equation (2.2) describes only the UV evolution, while TMD distributions also obey rapidity evolution equation (2.26).Currently, there is no general consensus regarding the rapidity divergences of operators that arise at higher powers.The only established result is that all quasi-partonic TMD operators conform to the leading power (LP) rapidity evolution [22].Thus, it is plausible that the present definition of TMD-twist is incomplete, and some operators could potentially mix due to the rapidity evolution.However, this hypothetical situation would only apply to non-quasi-partonic TMD distributions, which have a minimum twist-four and emerge at NNLP for the first time, and thus, are irrelevant for the present discussion.
The light-cone operators U A and U B have independent anomalous dimensions if they have different geometrical twist (defined as "dimension-minus-spin" of the operator).Consequently, TMD distributions Φ AB and Φ A B do not mix with each other if the geometrical twist of U A or U B is different from the geometrical twist of U A or U B .It gives raise to a natural definition of TMD-twist for Φ AB , as a pair of integer numbers (N, M ), where N (M ) is a geometrical twist of semi-compact operator U A(B) .
Formally, the definition of geometrical twist is defined only for local operators.Its generalization for semi-compact operator can be done by the following procedure.First, one selects L such that {|z|} < |L|, and drop the part of Wilson line from Ln to infinity.Than the operators is expanded as series at L. It takes a form U A (z, 0) ∼ n (z − L) n U A,n , where U A,n are local operators.The geometric twist of local operators U n is the geometric twist of U A .After the decomposition of operator into components with the same twist, each component can be summed over n.Finally, the limit L → ∞ is taken.This method properly reconstructs the properties of semi-compact operators, as it is demonstrated in ref. [29].
An indication of twist for TMD operator by a single number (used so far), refers to (N + M ).For example, it was used in the expression (1.3), where Φ n = Φ M N with N + M = n.Meanwhile, all TMD distributions of twist-two have TMD-twist (1, 1).However, the single-number terminology is ambiguous beyond the twist-two case.For example, a TMD distribution of twist-three can have TMD-twist (1, 2) or (2, 1), which are two entirely independent distributions with separate evolution equations (see ref. [26]).TMD distribution of twist-four can have TMD-twist (1, 3), (3,1) or (2, 2), and so on.Still, the single-number indication is shorter and convenient to use if it does not create a confusion.

Example of twist-decomposition
As an example of twist decomposition let me consider the simplest semi-compact operator that appears in TMD factorization.It is a quark field with an attached semi-infinite Wilson line (2.4) As usually, the vectors n µ and nµ are two independent light-cone vectors with normalization The decomposition of any vector v µ reads where v + = (nv), v − = (nv) and v T is the transverse component orthogonal to (n, n)-plane v µν T = g µν T v ν with (2.7) Following the procedure described above the operator (2.3) can be presented as where D µ = ∂ µ − igA µ is the QCD covariant derivative.The local operator D µ1 ...D µn q i has the mass-dimension n + 3/2.It is a Lorenz tensor of mixed nature with n vector indices and one spinor index.The maximum possible spin of this tensor is n + 1/2.To archive it, one should symmetrize indices µ, subtract traces (in the present case, it automatically achieved by contraction with n µ1 ...n µn ), and make the spinor index γ-traceless [30], i.e. such that γ µj D µ1 ...D µj ...D µn q = 0 (for a more formal discussion see refs.[31,32]).For the present (contracted with n µ1 ...n µn ) tensor it implies that the maximum-spin operator should vanish after multiplication by γ − .The spinor q decomposes as q = ξ n + η n, where (2.9) Here, the subscript n indicates that these components are defined with respect to γ − .The component ξ is known as a "good" component, and η as a "bad" component [33].Since γ − ξ = 0, the operator D n + ξ has the maximum spin, and its geometrical twist equals one.Summing the series over n and limiting L → −∞, the twist-one part of (2.3) is (2.10) The remaining part of U q is [−∞n, 0]η n(0).It is twist-two, but the field η is not dynamically independent.The components η and ξ are related to each other via the QCD equation of motion (EOM).The EOM is / Dq = 0 for massless quark.After projection (2.9) EOM splits into two equations, which are convenient to write as Inserting the left equation into eqn.(2.8) and commuting the transverse derivative to the outer position one gets where is the gluon field-strength tensor.The subscript on a gamma-matrix indicates that its index is transverse.The summing over n gives The spin of the last term is n − 1/2, since it is traceless and γ-traceless, and has all, except one, indices symmetrized.Thus, the last term is twist-two operator.
The final expression for twist-decomposition of U q can be assembled.Let the transverse component of the gluon field vanish at light-cone infinity2 , so lim L→∞ D µ (L) = ∂ µ .Then one receives where U 1 is the twist-one operator (2.10) and U 2 is the twist-two operator (2.15) The operator U q is decomposed into operators of twist-one (first term) and twist-two (last term), and the total derivative of twist-one operator (second term).The procedure presented above is standard for analysis of higher-twist collinear operators, see e.g.examples in refs.[35][36][37].The only difference in the present case is the open spinor index, which should be extra treated in the tensor decomposition.Note that the "good" and "bad" components has also different counting with respect to λ ∼ k T /k + .Assuming that a quark field has large k + and almost massless k 2 ∼ 0, one finds that η n ∼ λξ n from EOMs (2.11).This counting is convenient to use for sorting operators in the power series, and it is widely used.

Twist-decomposition for bi-quark TMD distribution
The procedure of twist-decomposition for semi-compact operators leads to the decomposition of TMD matrix elements to definite twist parts.Let me demonstrate it for the bi-quark TMD matrix element, which is also important for derivation of KPCs.The matrix element reads where |p is a hadron state with momentum p, b is a transverse vector and U = U † γ 0 .Being contracted with n µ this matrix element results into the unpolarized and Sivers TMD distributions that are twist-two TMD distribution.For a general γ µ , the matrix element Φ [γ µ ] does not represent a conventional distribution, but is a sum of distributions with different properties.It can be checked by computing the renormalization of matrix element (2.16) and observing that it could not be expressed via Φ [γ µ ] (see f.i.computations in refs.[26,38]).The reason is that the operator (2.16) has an undefinite twist.
The momentum space representation of TMD distribution is obtained by Here and in the following, I use the (un)tilded notation to indicate the distributions in the (momentum) position space.The momentum space representation of (2.19) is The derivatives over b turn into powers of k T .
It is also critical to observe that total derivatives of semi-compact operators does not always result to the definite-twist TMD distributions.It works out for total derivatives ∂ + (which produces factors of x in the momentum space), and ∂ µT (which produces factors of k µ T ).However, total derivatives ∂ − cannot be moved outside of the operator.Nonetheless, they could be always expressed via ∂ + , ∂ µT and extra gluon fields, thanks to EOMs (2.11).The example is given in sec.3.2.
The expression (2.22) can be written as with k µ = xp + nµ + k µ T + n µ k − being the four-momentum of the parton.This respresentation gives rise to a interpretation of the leading twist distributions, as the part of the hadron's structure carried by the free-quark approximation.In the sec.4.1 I provide a further generalization of eqn.(2.23).
The twist-decomposition is a standard procedure for consideration of power corrections for factorization theorems, and the expressions (2.14) and (2.19) are rather typical for such kind of calculus.Many analogies can be found in the literature related to the collinear factorization.For example, the expressions (2.19) and (2.22) are ideologically analogous to the Wandzura-Wilczek approximation for collinear distributions [39].In the present work, the main interest is the terms with total derivatives of the leading twist operator.These terms are responsible for generation of KPCs.It is analogous to the case of Deeply-Virtual Compton Scattering (DVCS) and Generalized parton distributions (GPDs) [40,41].
Note that for the operator with a differently oriented light-like Wilson line, the vectors n and n should be swapped.For example, for decomposition for the operator [−∞n, 0]q(0) is the same as (2.14) but with n ↔ n.The "good" and "bad" components are defined as (2.24) The direction of Wilson line (future vs. past) does not play a role in this example.
For the following discussion, it is important to specify the evolution equation for twist-two TMDPDFs.these equations are [12,13] where Γ cusp , γ V and D are the cusp, quark-vector and rapidity anomalous dimensions, correspondingly.The rapidity anomalous dimension is also known as the Collins-Soper kernel [42].It is a nonperturbative function that is associated with the properties of QCD vacuum [43].In the momentum space the equation (2.26) turns to the integral equation [44].
3 KPCs for LP term TMD operators of twist-two represent possess a crucial property of incorporating only two good components of fields (quark and anti-quark, or two gluon fields).This simplifies the procedure of their extraction from higher-dimensional operators and allows for the unambiguous elimination of higher twist contributions at early stage of computation.As a result, the determination of KPCs for the LP term can be achieved at all orders of the power series, as presented in this section.For concreteness, I focus on the Drell-Yan process where the arguments denote the momenta of the respective particles.I start with the schematic review of the derivation of LP term, and then generalize this example including KPCs.

TMD factorization at LP
The derivation of the LP term of the TMD factorization theorem has been extensively discussed in numerous references employing various formulations.Here are only some of them [9-11, 15-17, 22, 45].In this section, I provide an overview of the derivation from the perspective of operator manipulations, similar to the TMD operator expansion, Soft-Collinear effective theories or High-Energy expansion.Emphasis is done on the technical aspects that are relevant for further explanations.I omit detailed proofs that can be found in the literature.
The QCD part of the cross-section for a hard reactions is given by the hadronic tensor.For the Drell-Yan process (3.1) the hadronic tensor reads where J µ is the electromagnetic (EM) current with q being the quark field.The quark charges and flavor part of the current are omitted for brevity and restored in the following sections.The kinematics of the process is defined by the photon momentum q µ and the hadrons momenta p µ 1 and p µ 2 .To avoid complications related to the target mass corrections, the hadrons are considered as massless p 2 1 = p 2 2 = 0.These momenta introduce the natural system of vectors n and n (2.5) (3.4) The leading perturbative order contribution to the hadronic tensor of Drell-Yan process (3.2) is shown in fig.1(A).This diagram and its charge-conjugated are convenient to write as where dots represent the sub-leading terms, and The fields q n (q n ) are created by |p 1 (|p 2 ) states, and, according to the parton model, have momentum almost along p 1 (p 2 ).Usually, fields q n and q n are referred as collinear and anti-collinear fields.At LP approximation the states |p 1 and |p 2 do not interact with each other directly.The complete set of states in-between currents is omitted for simplicity.
To proceed further, one should specify the counting rules for the elements of equation (3.5).For the TMD factorization the counting rules are [10,11,22] where λ ∼ Q −1 .The components are presented in the order {+, −, T }.In the regime q ± ∼ Q and q T ∼ Q 0 the vector y satisfies the counting The inhomogeneity of counting for components of y µ is the only conceptual difference between collinear and TMD factorizations.
The counting rules implies that the interactions between fields along certain directions are power-suppressed.The field must be expanded around these directions 3q n(y) = q n(y − n + y T ) + y + ∂ − q n(y − n + y T ) + ... , (3.9) where the first term is ∼ λ 0 , the second term is ∼ λ 2 , and the dots indicate the further suppressed terms.Importantly, there is no expansion in the transverse direction, because y µ T ∂ µ q n ∼ λ 0 according to (3.7) and (3.8).Meanwhile, the transverse derivatives that appear in other parts of computation (e.g. in the loops) and that are not accompanied by y T , produce power suppressed terms.
In addition to the diagram A, one should take into account the diagrams that are of the same order of power counting.These are diagrams with radiation of collinear components of gluon, such as diagram B in fig. 1. Accounting of all such diagrams restores the QCD gauge invariance and equips the collinear and anti-collinear quark fields by half-infinite Wilson lines [34,45].Practically, it is convenient to impose the light-cone gauge for the background field.The gauge can be fixed for each collinear sector such that gluon fields with a ∼ λ 0 power counting vanish, i.e.A n,+ = 0 and A n,− = 0.The detailed justification of such choice can be found e.g. in ref. [22].For the present work, it is enough to know that this set of gauges provides the correct factorization.
Dropping the power-suppressed terms in eqn.(3.9), one get the hadronic tensor in the form The central assumption of the parton model is that high-energy hadron consists of fields with corresponding collinear counting rules.Therefore, we can sort the fields with different collinearity into separate matrix elements.To do so, one needs to recouple the spinor and color indices, which could be done using the completeness relation of Dirac matrices for any matrix A. Here, Γ is a complete basis of Dirac matrices.The result reads where the common minus sing appears from the anti-commutation of the quark fields, and the factor 1/N c is due to the recoupling of color indices.The matrix elements are These distributions are Fourier images of distributions Φ (2.16) over the variables x.In expressions (3.13, 3.14) the complete set of states in-between quark fields can be omitted because all distances are space-like.The expression (3.12) still contains the power corrections the TMD distribution Ψ [γ µ ] is a mixture of terms with different powers, as it is follows form eqn. (2.17), where the first line is ∼ λ 0 , the second and fourth lines are ∼ λ 1 , etc. Therefore, the pure LP term is obtained by eliminating all, except first, terms in eqn.(2.17).It can be done by selecting only those Γ-matrices that project only "good" components.For the collinear part these matrices are with α being transverse index.The corresponding elements of the basis For the anti-collinear sector one should replace n ↔ n.After restriction of Γ-matrices to the set (3.15) the hadronic tensor contains only the LP contribution.Finally, one passes to the momentum-fraction representation for TMD distributions and integrates over y ± .The result is Tr where y T is renamed as b, to much the standard notation.The expression (3.17) is the celebrated expression for the LP TMD factorization.The TMD distributions with negative x are related to the anti-quark distributions.This is a complete derivation of LP TMD factorization at the tree order.To go beyond LO, one should include interaction diagrams.Some examples are presented in fig. 1.The diagrams of type B (with a non-collinear gluon) or C do not contribute to LP factorization.Indeed, any addition interaction with the hadrons (such as fig.1B) increases number of fields in the operator, and hence increases its dimension.The only exception is the radiation of collinear gluons, which is already accounted in the LP term.Therefore, at LP, allowed interactions are possible only in-between the hard fields.Even so, some hard interactions are power suppressed, due to inhomogeneous counting rules for y.Indeed, the diagram 1C is ∼ 1/b 2 in the position space, which leads to ∼ q 2 T /Q 2 in the momentum space 4 .Therefore, the perturbative correction at LP are given by the interactions in the vicinity of currents, such as the diagram 1D.As the result, the coefficient function is the product of coefficient functions for each current.It is in the one-to-one correspondence with the TMD-twist idea.Each semi-compact operator produces independent UV renormalization factor, that cancels the infrared poles of a corresponding part of coefficient function.This structure of pole cancellations is preserved at all power of TMD factorization.
Finally, one should also take into account that separation of collinear and anti-collinear modes is ambiguous.For the small values of momenta the collinear and anti-collinear sectors overlap.There are several methods to avoid it.The double-counting of modes can be subtracted by means of the soft factor [9,49] or rapidity renormalization factor [60], or modes can be defined with explicit cut parameter that prevents the overlap [22,50].The results coincide 5 .This procedure leads to an extra scales ζ and ζ, and corresponding evolution equations (2.26).

KPC for LP term at all orders
The extension of the TMD factorization beyond LP is conceptually straightforward (although very complicated technically).One should systematically expand the interaction vertices in the background fields, and sort operators over the twists and power counting.Detailed discussions are presented in refs.[19,22].Performing this procedure, one receives the structure (1.3).In the following, I concentrate on the KPCs to the LP term.They are much simpler than other other types of power corrections, because they incorporate only the TMD distributions of twist-two.Let me consider the series of power corrections and extract only the twist-two component from it.This component produces the complete series of KPCs.
The key observation is that the TMD distributions of twist-two are given by only two-point operators.Indeed, insertion of any extra field increases the dimension and, thus, the TMD-twist by at least one.The geometrical-twist-three operators are all three-point operators.Such a simple rule (twist ∼ number of fields) does not hold for higher-twist operators starting from twist-four, which could be three-point or four-point operators [51].Still, one can formulate a rule: any two-point operator can contain geometrical twist-two part, but any three-point and higher-number-of-points operators necessarily have geometrical twist higher than two.It implies that any diagram with three or more particles in a single collinear sector (such as diagram B) does not contribute to KPCs for LP term.Consequently, the diagram A contains full information about twist-two part at all powers at LO in perturbative expansion.
Inspecting the LP derivation in sec.3.1, one finds that the power corrections were dropped in two places: the multipole decomposition (3.9) and selection of "good" components of fields.If one does not perform these approximation the result will contain full information on KPCs.Lets release these approximations.
Starting with eqn.(3.5) and performing the multipole expansion one obtains Here, the dots represent contributions with larger number of fields.The LP term (3.10) is at n = m = 0. Recouping of spinor indices gives where the I use the notation (2.3) to highlight the semi-compact operators.Note that operators in the matrix elements with p 1 (p 2 ) are (anti-)collinear.In contrast to the LP term, there are no constraints on Γ-matrices, since any Dirac component could produce a twist-two contribution, see example in eqn.(2.19).
The semi-compact operators in eqn.(3.19) contain the mixture of TMD distribution, and should be decomposed over the definite twist contributions.For the pure U q operators the decomposition is given in eqn.(2.14).However, the "improper" derivatives of U q (such as ∂ − q n) demolish this decomposition because they could not be presented as total derivatives of distributions.This issue is resolved by application of EOMs (2.11).The analysis of these operators is complicated in the general case, but essentially simpler if one seeks for only twist-one contribution.In this case, all gluon fields in EOMs could be dropped (since gluon-fields necessarily turn one-point operator to a higher-number-of-points operators), resulting to a simple rule where dots represent the terms with two or larger number of fields.Here inverse derivatives are understood as the integral, alike (2.13).Basically, the equation (3.20) tells that at the leading-twist approximation partons are free massless fields.The higher powers of ∂ − are After these transformations, the fields ξ could be promoted to the semi-compact operators U 1 , by equipping them by semi-infinite Wilson lines, as it is demanded by the QCD gauge invariance.
As the result of these manipulations, one obtains a long expression where the argument of Ψ n11 is (y − n+y T ), and the argument of Ψ n11 is (−y + n−y T ).This expression contains all twist-two terms of TMD factorization that appear at LO in perturbation theory.The eliminated terms have higher twist.
Finally, one integrates over y ± and obtains , where arguments of the TMD-distributions are (x, b) for Φ n11 and (x, b) for Φ n11 .The derivatives ) are resulted from the integration over y − (y + ).Herewith, distributions Φ n11 are quark distributions and Φ n11 are anti-quark distributions.The sum q,q indicates the addition of the term with quark and anti-quark distributions exchanged.This expression is valid for any polarizations.The example of unpolarized cases are given in sec. 5.In particular, the LP expression is given in eqn.(5.8),NLP in eqn.(5.9), and NNLP in eqn.(5.10).The NLP part of eqn.(3.23) exactly reproduces the KPC-part of the full NLP expression derived in ref. [19,22].
It is complicated to investigate the properties (3.23) in the general form.However, one can confirm that the expression (3.23) is exactly EM-gauge invariant, The truncated at N n LP order part of series (3.23) is not transverse, but violates EM-gauge invariance up to order N n+1 LP.The exact restoration involves all terms.Moreover, it could not (or at least it is not natural) be split into parts that are independently gauge-invariant.The terms with ∂ x and ∂ x play important role in the restoration of gauge invariance, since they (after integration by parts) turn q ± to p ± , which is required for the cancellation between different orders.The expression (3.23) is also frame-invariant, although it is complicated to confirm.These invariances are discussed explicitly with the unpolarized example in sec.5.1 and sec.5.2, and they are obvious in the summed form (4.17).

KPCs beyond LO and the argument of the coefficient function
In the present approximation (solely twist-two TMD operators, neglecting q T /Q corrections) the perturbative corrections comes only from the corrections to the EM currents.Indeed, an inclusion of a non-collinear external gluon increases the twist of operators, and the in-between currents interaction (alike fig.1C) induces q T /Q corrections.Therefore, one can consider perturbative corrections for the currents J nn and J nn (3.6) independently.The NLO LP computation the coefficient function is performed in multiple works, see f.i.refs.[9-11, 22, 52].Presently, it is known up to N 4 LO accuracy (4-loop) [53,54].The NLP NLO computation has been done explicitly in ref. [22], and the result coincides with LP.It is a non-trivial check since, intermediate expressions and the algebra of NLP computation are different from those at LP (see the detailed discussion in sec.6 of ref. [22]).
As it is explained above, the coefficient function of KPC must be the same as the LP coefficient function.In this section I would like to confirm it by explicit computation.The perturbative computations beyond LP are much simpler in the position space.It is because Feynman propagators in the position space are free of external momenta, which appear only in the numerator of diagrams.Therefore, one should not worry about increasing singularity of power suppressed parts of integrals as it happen in the momentum space.The explanation of convenient technique for loop-computation with multiple examples can be found f.i. in refs.[22,36,55,56].
The NLO correction for EM current is given by the diagram shown in fig. 1 D. In the position space, this diagram (here for J nn (0)) reads where C F = (N 2 c − 1)/(2N c ) and d = 4 − 2 is the parameter of the dimensional regularization.This expression incorporates operators of all powers and twists.The power expansion is obtained from the decomposition of external quark fields over "good" and "bad" components, and their expansion in the vicinity of the collinear direction.Importantly, the power expansion can be done at the level of the integrand (since integral is convergent due to the dimensional regularization).In this way, the only complications of higher-power computation is the increasing (from power to power) size of the numerator.
The LP and NLP computation has been carried out in ref. [22].The result is convenient to write in the mixed momentum-coordinate representation where and k 1 and k 2 are the momenta of quark fields qn and q n (in the position space they appear as derivatives).The terms (3.26) and (3.27) provide the coefficient function for the LP and NLP parts EM-current.In this way, one confirms that the coefficient function of KPC at NLP is the same as for LP term (at least at NLO).The coefficient C( ) contains infra-red divergences (since J µ is a conserved current, C( ) is UV finite).These divergences are compensated by the UV renormalization factors of operators U 1 .These factors are Z U 1 for U 1 (anti-collinear) and Z † U 1 for U 1 (collinear).These factors contain collinear divergence, which is removed by the analogous divergence in the rapidity renormalization.The LO expression for Z U 1 is where ζ is the scale of the rapidity renormalization.Performing the renormalization procedure, one obtains where a s = g 2 /(4π) 2 .Importantly, for the cancellation of poles one must impose ζ ζ = (2q + q − ) 2 , where it is used that k + 1 = q + and k − 2 = q − due to the δ-functions (3.17).The coefficient function where L = ln(−2q + q − /µ 2 ).The coefficient function is multiplicative.The product of current has the coefficient function where it has been taken into account that q + q − > 0.
Let me emphasize that the argument of the coefficient function is 2q + q − .This is the result of the formal computation in the factorization approach (see e.g.[22,52]).Often, the argument of coefficient function at LP is replaced by q 2 for simplicity.However, it is not entirely correct since q 2 = 2q + q − + q 2 T has an indefinite power counting.The NLO computation can be easily automatized (necessary integrals in general form are given in appendix B or ref. [22]).Using the package FeynCalc [57], I have generated the first few powers 6 .They are and so on.For the convincingness, the computation has been performed up to N 10 LP, and it reproduces the presented pattern.At the first glance expressions (3.33) demonstrate the violation of the factorization theorem.Indeed, naively, one should expect to obtain the effective current (3.35) multiplied by C( ).I.e.all terms except the first for diag N 2 LP should vanish.Instead, the expressions contain the new terms, equipped by different coefficients.It is especially confusing that these new terms have different power of singularities.Due to the latter, these terms cannot be renormalized by Z U 1 's (3.30), and the factorization theorem seems to be violated at NNLP and higher.
Further inspection of corrections (3.33) reveals a certain pattern.The all-power series (3.33) can be rewritten as where is the twist-two part of the EM current.Therefore, the coefficient function for all KPCs is the same, but has a different argument.The argument of the coefficient function is instead of 2q + q − , that appear at LP and NLP cases.The cancellation of poles also imposes The coefficient function C 0 (X) can be used only as the whole, because any strict fixedpower expansion would lead to the non-cancellation of poles between infra-red poles of coefficient function, and UV poles of twist-one operators.To my best knowledge, it is the first example of accumulation of power corrections into the argument of the coefficient function in factorization theorems.In sec.4.2, it is shown that X 2 turns to Q 2 once KPCs are summed.
In the position space the variables k 1,2 are represented by the differential operators.Therefore, in the position space, the coefficient function (3.34) is a differential operator in the transverse space.That is the main reasons to switch to the momentum space in the following discussions.
Another contribution that reveals only beyond LO is the restoration of the boost-invariance.The boost-invariance demands that the factorization theorem is independent of the values of ζ's as far as ζ ζ = X 2 (this can be demonstrated using different approaches [58][59][60]).Therefore, the factorized expression should be invariant under The restoration of the boost-invariance happens with the accounting of the higher-twist terms.The higher-twist distributions appears in the factorized expressions with the divergent convolution.These divergences are called the special rapidity divergences [26].They are somewhat similar to the end-point divergences in the heavy-quark factorizations [61,62], but have a different nature, and are rapidity divergences according to the general criterion [60].The leading contribution of these divergences is proportional to the derivative of the Collins-Soper kernel and twist-two TMD distributions.In the sum of all terms these divergences cancel, and leave the remnant proportional to ln(ζ/ ζ), which restores the boost-invariance.Practically, it leads to the replacement and analogous for Φ n.Here, The mechanism of special rapidity divergences is well-understood at NLP, [24,25], but not beyond.The explicit check requires the computation with increased number of loops for each next power.The terms ∼ (∂ T D) n , which required at n'th order, are O(a n s ) in perturbative order and could be obtained only with n-loop computation.Also, one need the complete higher-twist structure to extract the divergences.Such computation goes far beyond the present work 7 .Nonetheless, it seems save to extend the NLP case to all powers and declare that the complete expression for KPCs (3.23) requires the replacement (3.38).This replacement restores the boost-invariance (3.37) at each order of power expansion independently.
In this section, I have demonstrated at NLO that the coefficient functions for all KPCs are the same.It is in the full agreement with the expectations, since KPCs must reproduce LP term, in order to preserve EM-gauge invariance (discussed in sec.5.1).However, it appears that the coefficient function has the mixed-power argument (as shown in eqn.(3.34)).Also, I argue that the transverse derivatives must be replaced by the boost-invariant transverse derivatives (3.38).Equipped by these modifications the expression (3.23) represents the all-power part of the TMD factorization that contains solely twist-two distributions, without q T /Q corrections.

Summation of KPC series
The series of KPCs can be summed to a simpler expression.The direct summation of expression (3.33) is possible but very tedious.The main complication is the derivatives over x and x which produces generating functions with complicated argument (see (5.29)).The simpler way is to sum the multipole expansion separately for each collinear sector before the tensor decomposition.It leads a simple and intuitive picture of KPCs as the scattering of free massless partons.The details of computation are presented in this section.

Twist-two part of a general TMD correlator
The procedure of the assembling fields into matrix elements and the multipole expansion are commuting operations.Due to it, one can collect the fields of different modes into matrix elements already in eqn.(3.5).The result is (compare with (3.12)) where Ψ(y) are the generalization of (3.13) for a four-dimensional argument, These matrix elements are alike ordinary TMD distributions, but with a generally-valued separation between semi-compact operators.The direction of Wilson lines is preserved.The expression (4.2) is not a factorization theorem, but only an intermediate expression that is used to sum KPCs.The functions Ψ should be understood as the generating functions for the series of power corrections (3.19).I.e n (y − n + y T )], (4.4) 7 I have checked explicitly that the cancellation of special-rapidity divergences between twist-three part of the NNLP terms, which are indicated as DΦ 2 × Φ 3 and Φ 3 × Φ 3 in the expression (1.3).These terms are relatively simple to obtain using NLP factorization.I have found that their one-loop special rapidity divergences produce the terms required for restoration of boost-invariance.The performed check in incomplete (and for that reason is not presented here explicitly), because it was done using only one-loop computation and caught only the terms ∼ ∂ T D, but does not catch terms ∼ (∂ T D) 2 . where and similar for Ψ n .Substituting this expression into eqn.(4.1) and making the twist-decomposition one gets eqn.(3.19).
The function Ψ is an infinite series of TMD distributions of different twists.The twist-two part of (4.4) can be extracted term-by-term using EOMs (2.11, 3.20) in the same way as it was done sec.3.2.The result is convenient to write in the following form where The dropped terms are the distributions of twist-three and higher.Note that on the level of the cross-section the derivatives turns to the boost-invariant derivatives (3.39), and thus the expression satisfy the leading-twist TMD evolution equation ( The summation of series (4.6) is straightforward in the momentum space, where it reads The sum over n produces the exponent exp(−iy + k 2 T /(2xp + )).Introducing an auxiliary variable k − = −k 2 T /(2xp + ), one can rewrite (4.8) in a compact form where the factor 2k + is the Jacobian for δ(k 2 ) and all scalar products are four-dimensional.The same expression is valid for Ψ n after the replacement of n ↔ n.The twist-two part of a general TMD distribution is related to the matrix element of the free massless parton.In fact, this statement is correct for leading-twist (twist-two) distributions of any kind.For example, in the case of PDF the equivalent derivation can be found in ref. [36].
The hadron tensor with inclusion of all KPCs is given by eqn.(4.1) with replacement Ψ → [Ψ] tw-2 .For the particular elements of Dirac basis one finds One can see that the TMD correlators are divergence-free and satisfy Laplace equation These equations are identical to the equations for twist-two part of collinear operators [32,36,63].

Hadron tensor with summed KPCs
Substituting expressions (4.10 -4.14) into (4.1)one obtains the hadron tensor with resummed KPCs.The general expression reads , where In this expression, the product of two distributions should be understood as following where in the second term the quark and anti-quark TMDPDFs are exchanged.The collinear momentum fractions are denoted by ξ in order to distinguish them from the ordinary Drell-Yan momentum fractions x.Note, that in eqn.(4.17) all indices and momenta are four-dimensional.The argument of the coefficient function is restored to Q 2 .It is due to the δ-functions in the integrand of eqn.(4.17) which allow to rewrite X (3.36) as The expression (4.17) has the same hard-coefficient function as the LP TMD factorization, and the TMDPDFs are the usual TMDPDFs, with the standard evolution equations.The main limitation of eqn.(4.17) is that the rapidity scales are taken in the symmetric point, ζ = ζ = Q 2 .It is necessary in order to remove the terms proportional to ∂ µ D ln( ζ/ζ).Otherwise the summation of KPCs is not possible.However, if required, these terms could be restored infinitesimally.Practically, the restriction ζ = ζ = Q 2 is not essential, since all phenomenological applications are done at the symmetric point.
The expression (4.17) is the TMD factorization with resummed KPCs and the main result of this work.The dropped terms either contain TMD distributions of higher-twist, either proportional to q T /Q.Being expanded at large-Q W µν KPC reproduces the ordinary fixed-power TMD factorization power-by-power.
The hadron tensor W µν KPC is exactly transverse to the vector q µ (5.12).It is also frame-invariant, up to the corrections which come due to the selection direction of the Wilson lines.The later follows from the the initial assumption on the momenta counting rules for partons.These counting rules also receive power corrections and modify the definition of TMD distributions.The resulting factorization theorem does not depend on the definition of vectors n and n.The inclusion of these corrections goes beyond the present work, and will be performed in the future.

KPC for the unpolarized Drell-Yan reaction
To exemplify the general structure derived above, let me consider the case of the sphericallysymmetric contribution to the unpolarized Drell-Yan.In this case, the only8 contributing TMD-PDFs are unpolarized TMDPDFs f 1 .They are defined as where dots indicate the omitted Sivers function.In this case the series of KPCs (3.23) reads where X is given in eqn.(3.36).The summed expression (4.17) turns to The same expression could be obtained computing the Drell-Yan reaction with free massless quarks produced by TMD distributions.In other words, this result exactly reproduces naive parton model.However, it is not naive, because it is known which terms of factorization series were dropped and thus, it could be systematically improved.
For the examples discussed in this section, it is instructive to have the first three terms of the standard power expansion in explicit form.Let me denote the terms of expansion as where W n ∼ Q −n in power counting.Explicitly, these terms are where x 1 = q + /p + 1 and x 2 = q − /p − 2 .The rapidity-scaling arguments for the LP expression are , and for the NLP they are f 1 (x 1 , k 1T ; µ, 2q + q − )f 1 (x 2 , k 2T ; µ, 2q + q − ).The argument for the coefficient function at LP and NLP is 2q + q − /µ 2 .For the (pure) NNLP term one could not specify rapidity scales and the argument of the coefficient function without introducing higher-power terms as it is discussed above.
In the position space expressions for the first three power are where arguments for TMDPDFs are omitted completely for the NNLP case.The operator D is the "long" derivative defined in eqn.(3.39).In contrast to the momentum-space expressions (5.5 -5.7), the position space expressions are boost-invariant (3.37), and the rapidity-scaling arguments are ζ and ζ.The expressions (5.8 -5.10) can be compared to the literature.Naturally, the NLP part agrees with the one computed in ref. [22].Another computation was performed in refs.[19,21] using the small-x approximation.These results can be directly compared to eqn.(5.2).I have found that the LP and NLP parts (5.8, 5.9) agree with ref. [19], but the NNLP part disagrees.The same holds for the double Boer-Mulders term, which is not presented here.The hadron tensor derived in refs.[19,21] is transverse exactly at NNLP, while (5.2) is transverse in the sum of all powers.Possibly, it is due to the differences in the definition of twist-four terms.

Restoration of EM gauge invariance
The EM gauge-invariance (or electric-charge conservation) imposes ∂ µ J µ = 0, (5.11) which on the level hadronic tensor turns to q µ W µν = W νµ q µ = 0. (5.12) This statement is exact in QCD.Nonetheless, EM gauge-invariance is violated at each power of TMD factorization.It is a consequence of the fact that the relation (5.12) does not have a definite power counting, since it contains a sum of q ± and q T .The EM gauge-invariance is restored by the power corrections.Moreover , it is restored for each terms with unique nonperturbative content by a the series of KPCs.
Let me demonstrate the restoration of EM gauge invariance using the unpolarized DY example in the momentum space 9 .Contracting the LP term (5.5) with q µ , one obtains So, the EM gauge-invariance is violated, but the violation goes beyond the LP approximation, because the left-hand-side(LHS) of the formula is ∼ Q 1 while the RHS is ∼ Q 0 .The inclusion of the NLP term (5.6) leads to The RHS expression (5.14) is of order ∼ Q −1 , i.e.NNLP in comparison to the LHS.The LP violation term (5.13) is canceled by the NLP violation term, but leaves an NNLP remnant.The twist-three contributions are not presented here, but it could checked using expression in ref. [21][22][23] that they also result into ∼ Q −1 terms.The NNLP remnant in eqn.(5.14) is than cancelled by NNLP (5.7) term: Here, the RHS is of order ∼ Q −2 , i.e.N 3 LP in comparison to LHS.This N 3 LP violation term is to be canceled by the corresponding part of q µ W µν 3 , and so on.In this way, EM gauge invariance is restored only in the sum of all powers of TMD factorization, and violated at any fixed power.A truncation of the series at power n results into the violation of EM gauge invariance at order 1/Q n+1 .This mechanism is standard for the of factorization theorems with inhomogeneous counting of q µ components.Another very well studied example is DVCS, see refs.[40,41,64,65] for detailed discussion.
Importantly, the restoration of EM gauge invariance takes place independently for the terms proportional to twist-two TMDPDFs, and twist-three TMDPDFs.This is due to the proper definition of the TMD-twist.With the present definition TMD distributions of different twists do not mix with each other, and are entirely independent nonperturbative functions.Each independent combination of TMDPDFs forms its own series of KPCs that restores the EM gauge invariance.
The complete series (5.2) is exactly EM-gauge invariant.The check is straightforward, one should only take into account that q ± turns to xp by δ-function and does not commute with the derivative.All terms of expression (5.2) participate in the restoration, which is in contrast to the small-x-based derivation made in ref. [21], where NNLP term already completes the gauge-invariant sequence.The summed expression (4.17) is also EM-gauge invariant since q µ = k µ 1 + k µ 2 and because k 2 1,2 = 0 due to the δ-functions.The part ∼ t µν in (4.17) is also transverse.The cancellation between successive terms takes a place only if their hard coefficient functions are identical.This has been checked up in sec.3.3 explicitly at NLO, but the gauge-invariance guaranties the equivalence at all perturbative orders.The EM-gauge invariance does not determine the coefficient function of other genuine contributions, since they have independent nonperturbative content.

Restoration of frame invariance
Another important symmetry that is violated by the factorization approach is the Lorenz invariance.In the factorization theorems approach it is often called the frame-invariance or reparameterization invariance.It has been intensively studied for the case of collinear factorization for DVCS, see refs.[40,66], and also in the soft-collinear/heavy-quark effective fields theories, see refs.[67][68][69].Meanwhile, I do not know any dedicated discussion for the TMD factorization case.
The frame-or reparameterization-invariance is based on the fact that the field-mode separation depends on the directions n and n, which, in turn, defined only approximately.In other words, the counting rules (3.7) can be modified by power corrections, if they preserve the leading counting.The light-cone vector n µ , which defines the collinear direction, could be turned n µ → n µ by a power-suppressed amount such that n 2 = 0, and the factorization theorem remains the same.If there are several collinear directions the transformation should preserve the relation between them.This is an obvious statement in the deep-inelastic scattering.In other cases the frame-invariance is rather involved, and usually is violated by the LP term and restored by power corrections similarly to EM gauge invariance.
Let me inspect the implication of frame-invariance for the TMD factorization approach.There are two collinear directions given by n µ and nµ , which satisfy (nn) = 1.It is straightforward to see that there are two possible transformations n → n , that preserve n 2 = n 2 = 0 and (n n ) = 1.They are I: where ∆ and ∆ are transverse vectors in the original frame, i.e. (∆n) = (∆n) = 0.The vectors have the counting ∆ ∼ Q 0 , and could be treated as infinitesimal parameters.Note, that one cannot transform n and n simultaneously, because in this case, the equation (nn) = 1 requires ∆ ∼ Q 1 , which is not a small transformation.Nonetheless, transformations I and II can be applied successively, simulating a simultaneous rotation of n and n.Important to mention that the vectors p µ 1 and p µ 2 are external vectors, and thus they are not transformed under (5.17).The definition (3.4) is not modified by ∆.
The transformations I and II are symmetric to each other.Therefore, the result valid for one of them is automatically valid for another.Thus, in the rest of the section, I consider only the transformation I.
On the level of factorized expression transformations (5.17) change the definition of TMDPDFs, since (k T n ) = 0. Therefore, for the comparison of momentum space integrands (similar analysis can be done for position space integrands), one should compensated the redefinition of k's, such that f 1 (x, k T ) is invariant.Recalling that TMDPDF is obtained by Fourier transforms (2.21) one finds that the same definitions can be achieved by the contra-rotation Here, (k 1T ∆) = 0 is imposed, in order to preserve the definition of TMDPDF f 1 (x 1 , k 1T ).The variables x are also impacted by the rotation because they are defined via q ± .One finds that where it is used that (q∆) = (k 2 ∆), due to the delta-function.So, the simultaneous transformations (5.17) (part I), (5.18) and (5.19) should keep the integrand of the factorized hadronic tensor invariant.
Applying the transformation to the LP term (5.5) one finds where the dots denote the higher power terms.The first term in brackets is ∼ Q −1 i.e.NLP.So, the LP term is frame-invariant up to NLP corrections.Adding the NLP contribution (5.6) one obtains The arguments of TMDPDFs are omitted but the relative order of TMDPDFs is preserved.Again the violation term starts with ∼ Q −2 and is NNLP.The derivative term is obtained from the expansion of x 1 .Finally, one can check that where O(N 3 LP) is a long expression.Importantly, that the frame-invariance involves the derivativeof-TMDPDF terms, which did not participate in the gauge-invariance check at NNLP order.Therefore, the series of KPCs cannot be split into sub-series which are gauge and frame invariant.Alike the gauge-invariance, the frame-invariance is not close at any given power-order.Moreover, at each order it generates the infinite series of derivative corrections, due to the Taylor expansion of f 1 (x 1 , k 1T ).The frame invariance holds for each nonperturbative sector, i.e. separately for the power series involving twist-two TMDPDFs, separately for the power series involving twist-three TMDPDFs, etc.The expression with summed KPCs (5.3) is invariant under the transformations (5.17) since it does not depend vectors n and n.Here, I recall that the external momenta p µ 1 , p µ 2 are not transformed, and thus δ-functions with arguments (ξp ± − k ± ) are not modified.

Estimation of numerical importance
The factorization theorem with summed KPCs is valid in the traditional regime of TMD phenomenology, Q q T and Q Λ but does not have restriction Q k T .In fact, the restriction Q k T was ignored in all phenomenological studies, assuming that k T ∼ Λ.The assumption k T ∼ Λ is not entirely correct because k T is an integration variable.In this section, I test the impact of inclusion of KPCs.For it, I compare LP and KPC-summed cross-sections of the unpolarized Drell-Yan (only γ-channel for simplicity).
The cross-section for the Drell-Yan reaction is computed from the hadronic tensor by where q 2 T = −q 2 T > 0, α em is the QED coupling, and e q is the charge of the quark q.The leptonic tensor is where l and l are leptons' momenta and l + l = q.Substituting the hadronic tensor from eqn.(5.5) one obtain the cross-section in the LP TMD factorization where The correction q 2 T /2Q 2 in the common factor is the result of convolution of the leptonic tensor with g µν T .The inhomogeneity of this factor in power-counting is due fact that the leptonic tensor is exact (i.e. it contains all powers) while the hadronic tensor is pure LP.Exactly this expression is usually used for the phenomenology of unpolarized TMD distributions, see e.g.[70][71][72].
The integral over transverse momenta k 1T and k 2T is convenient to present as the integral over k 2 1T and k 2 2T .The delta-function restricts the values of these variables as The region R T is shown in the fig.2(left) by yellow color.Notably, the region spans to infinite values of k 2 T 's, despite initial assumption of Q being the largest scale.The cross-section with resummed KPC is obtained from eqn.(4.17) and reads It is straightforward to check that in the limit q ± → ∞ the expression (5.28) reproduces (5.25) (up to the power-suppressed term in the common factor).Note that to make this comparison one should commute the limit Q → ∞ and integral operation.In other words, one should assume that k 2 T Q 2 , despite the integral range of k 2 T 's is infinite.The values collinear momentum-fractions of TMDPDFs are not fixed in the summed formula (5.28), but integrated in particular range.It could be anticipated apriory, because the reparametrization invariance modifies the value of x's (5.19), and thus any fixed choice is not frame-invariant.The δ-functions express ξ's as ) where τ 2 = 2q + q − = Q 2 +q 2 T , and λ(a, b, c) = a 2 +b 2 +c 2 −2ab−2ac−2bc is the kinematic function.The values of ξ 1,2 are restricted 0 < ξ 1,2 < 1 (due to the support properties of TMDPDF), which constraints the integration region of k 2 1T and k 2 2T to (5.30) The region R ξ is shown in the fig.2(left) by blue color.The variables ξ (5.29) can be seen as a kindof-Nachmann variables [73,74] for TMD distribution, which correct for the convolution integral for non-zero transverse momentum of partons.
In full the integration region for (5.28) is R ξ ∩ R T .It is compact.The maximum values of k 2 T are ( √ τ 2 + q 2 T ) 2 /4 (= Q 2 /4 at q 2 T = 0) which are reached at the points a 1,2 in fig. 2. The arguments ξ 1,2 belongs to the region shown in fig.2(right).This region does not include the LP values of collinear momentum-fraction x 1,2 (5.26), but it includes the values Q 2 /se ±y , which are often used in the phenomenology and is shown by the red star in fig. 2.
In figs. 3 and 4, the comparison of KPC-summed (5.28) and LP (5.25) cross-sections is shown.For the comparison the extractions SV19 [71] (presented in the left panels) and ART23 [72] (presented in the right panels) were used.The perturbative orders are the same as used in the extractions.They are N 3 LL for SV19, and N 4 LL for ART23.The evolution of distributions is performed in the position space, then the TMDPDFs are Fourier-transformed and substituted into (5.28,5.25).The inclusion of KPCs results into an almost flat increase of the cross-section as the function of q T , see fig. 3. The size of the shift grows at smaller x's and Q.In other words, the inclusion of KPCs does not significantly modify the shape of the LP prediction, but significantly changes the normalization.At the typical kinematic of the LHC (aka Z-boson production) the size of corrections is of the order of 1%.It grows to ∼ 30 − 40% at Q ∼ 10GeV.Using the presented curves one can estimate the effective size of KPCs.Their order of magnitude is approximately described by (5.31) This approximation is obtained by a two-parameter fit of SV19 and ART23 curves.
It is interesting to mention that the modern TMD phenomenology faces problems with the description of normalization of cross-section, while perfectly describes the shape.For LHC energies the issue is of order of few percents, while for low-energy fix-target data (Q ∼ 5 − 20GeV) is tenths of percents [3,75,76].The situation is worse for Q ∼ 2 − 4GeV which are typical for Semi-Inclusive Deep-Inelastic Scattering (SIDIS).In this case the problem with normalization is of order of factor 2-3 [5].Therefore, one could hope that inclusion of KPCs into the phenomenology will resolve this important problem.

Conclusion
The power corrections in the TMD factorization theorem provide a rich field for investigation.
As described in the introduction, there are four conceptual types of power corrections: q T /Q, k T /Q, Λ/Q, and target-mass corrections.These corrections have distinct origins and characteristics.Importantly, different power corrections become significant in different kinematic regimes, which justifies considering them independently.In this work, I have derived the series of kinematic power corrections (KPCs) or k T /Q-corrections that follow the leading power (LP) term.These corrections are essential because they are responsible for restoring electromagnetic (EM) gauge invariance (charge conservation) and frame invariance, which are broken by the LP approximation.In that sense, the series of KPCs is the entailed part of the LP factorization theorem.The KPCs that follow the LP term possess a distinctive mark: they only involve TMD distributions of twist-two.This feature allows for their straightforward extraction from the generic power expansion of operators.The resulting series is presented in equation (3.23).This series is summed in a simple expression (4.17), which is the main result of the work.The computation is done for the Drell-Yan process but can be generalized for other processes without conceptual problems.
One of the most significant features of KPCs is that they must obey the factorization theorem, even if other types of power corrections may violate it.This statement follows from the fact that KPCs are responsible for restoring the fundamental properties of the LP term, namely charge conservation and frame invariance.Consequently, the coefficient function for KPCs remains identical to the LP coefficient function.In sec.3.3, I have explicitly verified this statement at the nextto-leading order (NLO) and demonstrated the restoration of the argument of the LP coefficient function to q 2 .The latter is a non-trivial result as it indicates the impossibility of a strict power expansion beyond NLP.
The final formula for the cross-section is almost trivial -it tells that the cross-section can be computed with free massless quarks similar to a naive parton picture.However, with the present derivation, it attains a different status.It obeys the factorization theorem, and it is clear which part of the power expansion is included and which is neglected.The excluded terms consist of TMD distributions of twist-three and higher, as well as powers of q T /Q It is important to note that the derivation of the series of KPCs is contingent upon the definition of TMD-twist and higher twist distributions.The current derivation is based on the approach outlined in ref. [22], which leads to a consistent result.However, if an alternative definition is proposed, the series of KPCs could differ.In such a case, some of the present higher twist terms would be absorbed into the KPC series.In particular, the difference in the treatment of higher twist terms may describe the discrepancy between the present computation and ref. [19].
Incorporating KPCs into TMD phenomenology is essential, as they play a crucial role in restoring the consistency of formalism.Moreover, all perturbative ingredients remain unchanged since KPCs adhere to the LP factorization theorem.Therefore, including KPCs in phenomenological studies can be done without encountering conceptual difficulties while maintaining the achieved level of accuracy (currently at the N 4 LL order [72]).The summed formula is valid in the same kinematic range of application as ordinary TMD factorization.Estimations made in sec.5.3 demonstrate that including KPCs results in an almost constant increment of the cross-section.The magnitude of this correction depends on Q and x.For typical LHC kinematics, the correction is around 1%, while at Q ∼ 4 − 5 GeV, the correction can reach 100%.Interestingly, the deficiency in normalization for the TMD factorization at low energies has been reported by multiple groups.One could expect that these problems will be resolved with the inclusion of KPCs.

Figure 1 .
Figure 1.Example of diagrams contributing to the DY process.Gray blobs show the hadron nonperturbative state.The vertical dashed line denote the insertion of complete set of states.

Figure 2 .
Figure 2. (left) The region of integration for variables k 2 1T and k 2 2T in the TMD factorization formulas (5.25) and (5.28).(right) The region of values of parameters ξ1,2 (5.26) covered during the integration over k 2 1T and k 2 2T .The points ai at both plots corresponds to the same values of k 2 1T and k 2 2T .

Figure 3 .
Figure 3.The difference of KPC-summed and LP cross-sections relative to the qT = 0 LP cross-section for the Drell-Yan reaction at different qT and different x.The plot in left(right) panels are based on SV19 [71] (ART23 [72]) extraction of TMDPDF.Upper row is for Q = 90GeV, and lower row is for Q = 20GeV.Lines of different color corresponds to different values of x (at y = 0).

Figure 4 .
Figure 4.The ratio of KPC-summed to LP cross-sections for the Drell-Yan reaction at qT = 0 at different Q and different x.The plot in left(right) panel is based on SV19 [71] (ART23 [72]) extraction of TMDPDF.Lines of different color corresponds to different values of x (at y = 0).