Definition and evolution of transverse momentum dependent distribution of twist-three

We present an in-depth analysis of transverse momentum dependent (TMD) distributions of twist-three. In particular, we focus on evolution equations, symmetry relations, parameterization, interpretation, small-b asymptotic behaviour and the structure of singularities. The starting point of discussion are the correlators with the definite TMD-twist. By considering suitable combinations of these correlators, we introduce physical TMD distribution of twist-three that can be used for practical applications. We also establish relations with generic TMD distribution of twist-three, and demonstrate that their evolution equations are autonomous in the large-Nc limit.


Introduction
The transverse-momentum dependent (TMD) distributions are universal functions that enter the TMD factorization theorems and describe various aspects of partons' nonperturbative dynamics.The TMD distributions that appear in the leading power (LP) term of the factorization theorem for a variety of processes are known very well.For the review of the recent theoretical and phenomenological development and further references, see refs.[1][2][3][4].At the next-to-leading power (NLP) accuracy, a new set of TMD distributions emerges.These TMD distributions are usually referred to as twist-three distributions, and in general are studied much less.Despite being introduced long ago [5], and discussed in many works, twist-three TMD distributions suffer from a lack of a systematic approach.This is a natural consequence of the NLP TMD factorization theory being at its earlier stage of development.The first convincing studies appeared only recently [6][7][8][9][10].Further progress in this direction requires the systematization of knowledge about TMD distributions of twist-three, which we present in this paper.
The term higher-twist (and particularly twist-three) distributions requires some clarification.There are several meanings of the term "twist", which are often mixed up in the literature.The exact definitions and pedagogical explanations can be found in refs.[11][12][13].One should clearly distinguish between genuine and generic higher-twist distributions.The genuine distributions are defined by the specification of quantum numbers of the corresponding operator.They are mathematically self-contained, although often too complicated for direct practical applications.On the contrary, generic distributions are usually defined by the power counting for observables, and thus have no simple theoretical structure, but are convenient practically.The best common sense analogies for the generic distributions are structure functions, which are functionals of genuine distributions.
The relations and properties of genuine and generic twist-three collinear distributions are known well (some recent development and further references can be found, e.g., in [14][15][16]).For the TMD distributions, the theory is much less developed.So far, all twist-three TMD distributions that have been discussed were generic and, therefore, their properties remain almost unexplored.For instance, there is no complete classification, and even evolution equations, which are the central element of definition, are unknown.The reason is that a mathematically-rigorous definition of a "genuine twist" for TMD distributions has been introduced only recently in ref. [9].
In ref. [9] the TMD-twist is defined as a composition of geometric twists ("dimension-spin") of the light-ray parts of the TMD operators.The accurate definition is presented in sec.2.1.The decomposition with respect to this parameter guarantees the independence of operators in the sense that their evolution is autonomous.As a result, the basis of corresponding TMD distributions is orthogonal.Each TMD distribution with a definite TMD-twist is an independent nonperturbative function.In this way, the TMD distributions with the specific TMD-twist serve as elementary blocks for constructing other twist-three TMD distributions.However, as we discuss in this paper, the proper definition of TMD requires extra details.It appears that individual terms of crosssection, computed with the naïve definition, are divergent (although the sum of all terms is finite).Therefore, the definition of physical TMD distributions (i.e., such that provide finite cross-section term-by-term) includes extra subtraction terms.
The aim of this work is to find a suitable definition of physical TMD distributions of twist-three and derive their main properties (such as evolution equations, symmetries, support properties, etc.).To find a good definition, we start with the most mathematically convenient definite-TMD-twist TMD distribution in position space.Then, step-by-step, we modify it improving its properties.This procedure allows us to derive evolution equations and other aspects of physical TMD distributions without significant effort.The resulting functions are well-defined and suitable for phenomenological studies.Additionally, we establish relations with widely-used generic distributions and compute the evolution equations for the latter for the first time.
To make the exposition concentrated, we deal only with the quark TMD parton distribution function (TMDPDFs) in the singlet color representation.These distributions are the most important since they are the main twist-three building blocks for cross-sections of the Drell-Yan (DY) process and Semi-Inclusive Deep-Inelastic scattering (SIDIS).Other distributions, such as gluon TMDPDFs, TMD fragmentation functions (TMDFFs), or distributions with octet color representation, can be considered in the same way.Moreover, many properties are the same (e.g., evolution equations for TMDFFs can be easily obtained from those for TMDPDFs).
The paper is split into five sections, each representing a complete discussion on a particular stage of manipulations.
• We start in sec. 2 by introducing the definite-TMD-twist TMD distribution in position space, as it was proposed in ref. [9].This is the defining representation, since the operator expansion that gives rise to the factorization theorem is derived in the position space.However, in such a form, distributions are useless for phenomenology.They are complex functions with indefinite T-parity and unclear interpretation.
• In sec.3, we define definite-TMD-twist TMD distribution in momentum-fraction space, discussing their interpretation and support properties.However, the Fourier transformation to the momentum-fraction space makes evolution equations bulky.
• In sec.4, we introduce particular combinations of the definite-TMD-twist TMD distributions, in such a way that they present more 'natural' symmetry properties: specifically they have definite complexity (i.e., they are either real or purely imaginary) and definite T-parity.This is the most convenient stage to introduce the parameterization of particular components of TMD correlators.These distributions are not entirely autonomous but obey pairwise mixing under evolution.These distributions are not usual functions but rather generalized functions, i.e., they have indefinite values at certain points but definite integrals.
• The expression for the cross-section involves combinations of the zeroth Mellin moments of TMD distributions, which are individually singular but their sum, relevant for the crosssection, is finite.In sec.5, we explicitly compute the singular part of the zeroth moment and demonstrate that it is equivalent to the rapidity divergence.Finally, we introduce the physical TMD distribution by subtracting the divergent terms.The definition of physical distributions is the final product of this work.The cross-section evaluated in terms of physical TMD distributions is finite term-by-term.
• In sec.6, we discuss the connection of physical TMD distributions, and generic TMD distributions introduced in works [5,17,18].In particular, we demonstrate that evolution equations for generic twist-three distributions are not closed.However, the evolution equation essentially simplifies and closes in the large-N c limit which allows the phenomenology of these distributions with controllable precision.
We support the work with three appendices, which contains the synopsis of vector algebra definition (appendix A), additional details on the LO evolution kernels (appendix B), and leading small-b asymptotic form for TMD distribution of twist-three (appendix C).

TMD correlators in position space
In this section, we define and discuss the TMD correlators in the position space.The general discussion about the construction of TMD operators and distributions at sub-leading powers of TMD factorization can be found in ref. [6,8,9].The LO expressions for the evolution kernels are taken from ref. [9], where one can also find details of its derivation.

Definition
All TMD correlators at any power of the TMD factorization theorem have the same general structure.Namely, they are a composition of two semi-compact light-cone operators U , separated by a transverse distance b Φ AB (z 1 , ..., z a , z b , ..., z n ; b) ∼ p, s|U A (z 1 , ..., z a ; b)U B (z b , ..., z n ; 0)|p, s , where A and B indicate the quantum numbers of U .The operator U ({z i }; b) is a product of QCD fields positioned at points (z i n+b), where n µ is the light-cone vector and b µ is the transverse vector.
The fields are first transported to the light-cone infinity by light-cone Wilson lines, and then to transverse infinity by a transverse Wilson line 1 , such that the matrix element (2.1) is explicitly gauge invariant.The operator U spans an infinite range, and for that reason it is called semi-compact.
The properties of TMD correlators are related to the TMD-twist of the defining operator, the notion of which was introduced in ref. [9].The TMD-twist is given by a pair of integer numbers (N, M ), where N and M are geometrical twists (i.e."dimension-spin") of semi-compact operators 2 U in eqn.(2.1).The LP TMD factorization theorem deals with distributions of the TMD-twist (1,1).At NLP, a new set of TMD distributions appears with TMD-twists (1,2) and (2,1).At NNLP one has TMD-twists (1,3), (3,1) and (2,2), and so on.This structure resembles the ordinary structure of power expansion in the collinear factorizations with the twist equals to the sum (N +M ).Importantly, even if (N +M ) is the same, operators with different TMD-twists (N, M ) obey different evolution equations and symmetry properties as a consequence of the Lorentz invariance.However, for simplicity, we refer to TMD-twist (1,1) as twist-two and TMD-twists (2,1) and (1,2) as twist-three if it does not create a confusion.
At LP and NLP of the TMD factorization theorem for DY and SIDIS processes only two kinds of semi-compact operators appear.They are where g is the QCD coupling constant, q is the quark field, F µ+ = F µν A t A n ν is the gluon fieldstrength tensor with t A being the generator of SU (N c ).The index i is the spinor index.The gauge links are defined as usual (2.3) 1 Transverse gauge links do not contribute to perturbative calculation in regular gauges (such as the covariant gauge), because in these gauge the gluon field vanishes at infinities.In singular gauges (such as the light-cone gauge), transverse links could produce a non-vanishing contribution [19].However, one can select an auxiliary gauge condition such that interactions with the transverse link vanish.Explicit inclusion of transverse links essentially complicates the computation, and spoils determination of some global properties.For instance, it makes impossible the definition of twist in any form.Therefore, we omit writing transverse links along the paper, assuming that all relations are valid in regular gauges or gauges with appropriate boundary conditions.An extended discussion on inclusion of traverse links into the perturbation theory and selection of convenient gauge can be found in refs.[9,19,20].
2 Formally, the geometrical twist is defined for a local operator, and generalized to non-local operators by means of the generating function [21].Applying this definition to semi-compact operators one needs to regularize them by setting L being finite.Regularized operators can be presented as a series of local operators, for which geometrical twist are computed by the usual "dimension-spin'-rule.For example, at finite L, the operator U 1 can be presented as For good components of the quark field, operators D n + q have the dimension n + 3/2 and the spin n + 1/2, and thus U 1 has geometrical twist-1.For bad components of quark field, operator has no definite geometrical twist, since it can have spin n + 1/2 or n − 1/2.After the twist-decomposition the limit L → ±∞ can be taken.Such scheme correctly reproduces properties of the operator product expansion as demonstrated in ref. [22].
The letter L denotes the infinity and depends on the process for which the factorization is derived.So, L = s∞, with s = +1, for SIDIS, −1, for DY, (2.4) where s is introduced for future convenience.
For convenience, we explicitly write the conjugation of the operators in eqns.(2.2): Semi-compact operators (2.2) and (2.5) have triplet and anti-triplet color representations.Other semi-compact operators with same power counting have higher color representations.We do not considered neither these operators nor the pure gluon operators in this work.
The product of semi-compact operators (2.2) and (2.5) produces the TMD operators, whose matrix elements define TMD correlators and distributions.There is a single TMD correlator of twist-two = p, s|T {q j (z and two TMD correlators of twist-three = g p, s|T {q j (z In these expressions, the symbol × indicates the position where the transverse link should be inserted (see footnote 1).The color indices are not explicitly indicated but contracted in a natural way from quark to anti-quark fields along gauge links, such that the operator is color-neutral.The subscript "bare" indicates that these TMD correlators still require renormalization, which is discussed in the next section.The ket |p, s denotes the hadron state with the spin s µ and the momentum p µ .The large and small components of the hadrons' momentum defines the light-cone vectors nµ and n µ , where M is the hadron's mass.Along the paper we neglect mass corrections, and effectively consider massless hadron.We use the standard notation for vector decomposition a µ = a + nµ + a − n µ + a µ T , where a µ T is the transverse component.The synopsis of vector algebra definitions is presented in app.A.
The spinor indices i, j play a crucial role for the correlator's properties.In particular, the TMD-twist of the operator is different for different components of spinor matrix, since they define the projection of the quark's spin.The genuine twist-two and twist-three operators require good components of both quark fields.
It is convenient to adopt the standard notation where Γ is a Dirac matrix.Let us denote the set of Dirac matrices which project both quark fields to good components as Γ + .Such matrices form a vector space with the standard basis where α is a transverse index, and γ 5 = iγ 0 γ 1 γ 2 γ 3 .For Γ ∈ Γ + , the TMD correlator Φ [Γ] 11 has TMD-twist (1,1), whereas Φ µ,12 } (with Γ ∈ Γ + ) is complete at NLP.In the sense, that no other quark TMD correlators appear in the TMD factorization theorem [9].
In some applications, it is convenient to introduce TMD correlators Φ 11 with a Dirac matrix Γ ∈ Γ + that projects a good and a bad components of quark fields (see e.g.refs.[10,18]).The set of such Γ's is denoted as Γ T , see (A.7).The operator U 1 with a bad component has a mixed geometrical twist, and therefore, TMD correlator Φ also do not have a definite TMD-twist.As the result, their evolution equations are not closed, and are expressed via the basis correlators.Additionally, the rapidity divergences of such correlators cannot by renomalized in the usual manner, as it is shown in sec. 5.In the sec.6, we discuss the relation between Φ and TMD correlators with the definite twist and derive some of their properties.

Renormalization
Any TMD operator has two kinds of singularities that are separately renormalized.These are ultraviolet (UV) singularities and rapidity singularities.The UV singularities are specific to each semi-compact operator, and therefore, a TMD correlator is renormalized by two UV renormalization constants.The rapidity divergences appear due to the interaction of a semi-compact operator with the Wilson line of another semi-compact operator, and thus their renormalization is common for the whole TMD correlator.The renormalized TMD correlators (2.7) read where R is the rapidity renormalization constant, Z U 1 and Z U 2 are the UV renormalization constants for the semi-compact operators U 1 and U 2 , correspondingly.The symbol ⊗ denotes the integral convolution between Z U 2 and the TMD correlator.The scales µ and ζ are renormalization scales for UV and rapidity divergences, respectively.The UV-and rapidity-renormalized TMD correlators are called subtracted [23,24].Since they are the main object of this paper, we restrain from introducing any special label.
The expressions (2.11) implicitly depend on the process in which they appear.This dependence enters through the renormalization condition for rapidity divergences.The standard renormalization condition states that the cross-section for DY and SIDIS processes is proportional to a convolution of two TMD distributions without extra nonperturbative factors.Schematically, where ζ ζ = (2q + q − ) 2 .This scheme is defined nonperturbatively.In particular, it implies that the nonperturbative LP soft factor can be written as a product where δ ± are the chosen regulators for rapidity divergences in the n and n directions.Therefore, the rapidity renormalization factor contains nonperturbative parts.For a detailed discussion we refer to refs.[23,25,26].Note that the soft factor has it own UV renormalization factor.Moreover, Z U N and R have collinear divergences which cancel in the product.Therefore, the explicit expressions for Z U N 's and R depend on the order of the renormalization.This mutual dependence gives rise to the Collins-Soper equation (2.23).Apart from it, renormalization constants are independent.The explicit expression for R up to NNLO (three-loops) can be found in ref. [26].The renormalization constant Z U 1 is equal to the quark-field renormalization in the light-cone gauge, and it is know at NNLO (three-loops), see e.g.[27].The renormalization constant Z U 2 has been derived in ref. [9] at LO.

Evolution equations
The presence of two independent renormalization factors produces two scales, traditionally denoted as µ (for UV scaling) and ζ (for rapidity scaling).Correspondingly, each TMD correlator obeys a pair of evolution equations.
The evolution equations with respect to µ are where anomalous dimensions are integro-differential operators.At LO they are where for N c colors.The parameter q + arises due to the redefinition of the rapidity renormalization scale in the boost invariant form [23,25,26], and is defined together with the values of ζ and ζ.The convenient choice for q + is where p q and p q are the momenta of anti-quark and quark fields, correspondingly.Such selection naturally appears in the DY and SIDIS cross-sections due to the renormalization condition (2.12), and implies (where Q and q T are the invariant mass and the transverse momentum of the hard current).In the following, we adopt this convention.The expressions for a different (positive) value of q + can be found by the rescaling q + → αq + and ζ → α −2 ζ.
The logarithms ln(q + /∂ + ) are a short (and convenient) notation for particular integral convolutions which are understood formally by their action on the generating function: ln(q + /∂ + )e izp + = ln(q + /ip + )e izp + .Such notation allows us to operate in position space (where expressions are generally shorter), without the necessity to write the convolution explicitly.The notation is also convenient for the subsequent Fourier transformation.
Figure 1.Diagrams (c1) and (c2) produce the collinear divergence in the UV evolution kernels for operator U2.The diagrams (c3) produces the corresponding divergence in the LP soft factor.The blue arrows indicate the limits in which in the loop diagram the divergences occur.Divergences cancel in the ratio between the correlator and the soft factor, leaving the remnant logarithms ln(q + /(−s∂ + )) in eqn.(2.14).These logarithms are responsible for the complex part of the evolution kernel.
Importantly, the logarithms ln(q + /∂ + ) are in general complex-valued 3 .The complexness of evolution kernel translates into important physics effects discussed in the following sections.The origin of these logarithms are the collinear divergences that appear in the UV divergent diagrams shown in fig. 1.In the δ-regularization, the collinear divergence of diagrams (c1) and (c2) are ∼ (C F −C A /2)/ ln δ + /(−s∂ + z1 ) and ∼ C A /(2 ) ln δ + /(−s∂ + z2 ) , correspondingly.These divergences are cancelled by the collinear divergence of the LP soft factor, ∼ C F / ln(δ + /q + ).The logarithms ln(q + /∂ + ) are remnants of this cancellation.The soft-factor is momentum-independent and real, while the momenta of quark and gluon can have any sign.Therefore, for any choice of q + these logarithms have complex parts.The same takes place for U 1 operators.
In the case of γ1 (z 1 ) and γ1 (z 2 ) acting to Φ 11 , the imaginary parts of logarithms have opposite signs, and thus Φ 11 has a real-valued evolution kernel.This is the consequence of the translation invariance, which for the forward matrix element 4 implies (∂ z1 + ∂ z2 ) Φ 11 = 0.
The integral kernel H in eqn.(2.17) is the ordinary evolution kernels for quasi-partonic operators [29,30].Its action to the correlator reads where the scaling arguments are omitted for brevity, ᾱ = 1 − α and The action of H to Φ µ,21 is analogous but the order of Dirac matrices should be reversed.In the appendix B, we provide extra details about the structure of H. 3 To determine the complex part of expression, one should inspect the regularized expression prior to the cancellation.In the δ-regularization, the integral that produces these logarithms has the form [9] lim

.18)
Since ∂ + is pure imaginary (and s is defined in eqn.(2.4)) this expression is regular and totally determines the phase. 4The evolution equations (2.14) represent the properties of the TMD operator only, and is valid for any matrix element.In particular, they are also valid for the case of generalized TMD distributions (GTMDs), which are off-forward matrix elements of TMD operators [28].
The evolution in the rapidity scale is the same for all TMD correlators.It reads where D is the Collins-Soper kernel [31], which is sometimes denoted as K = −2D [23].The Collins-Soper kernel is known up to NNLO (three-loops) [32].The LO expression in the dimensional regularization where in the second line the limit → 0 is taken.The anomalous dimensions γ and the Collins-Soper kernel satisfy the relation where N and M are 1 or 2, and Γ cusp is the anomalous dimension for the cusp of the light-like Wilson lines.This relation represents the integrability condition of the pair of differential equations (2.14) and (2.21), and guaranties the existence of a common solution.

Symmetry properties
There are three essential symmetry relations for TMD distributions and correlators.They follow from complex conjugation, parity transformation and time-reversal transformation.Their derivation is straightforward and discussed in many articles, see e.g.refs.[33][34][35].Here, we summarize these relations without derivation.The complex conjugation gives [ Φ Importantly, the complex conjugate of Φ µ,21 , and vise versa.It implies that these correlators cannot be parametrized by real distributions, but only combinations of them can.
The parity transformation gives where p = (p 0 , −p 1 , −p 2 , −p 3 ) and same for s.The time-reversal transformation is conveniently combined with the parity transformation (which effectively gives charge-conjugation), producing where T (γ 0 ) * T −1 = γ 0 and T (γ i ) * T −1 = −γ i .Note that the PT transformation preserves the orientation of the light-cone vectors, but changes the position of the light-cone infinity (L → −L).It effectively connects TMD correlators defined in DY and SIDIS kinematics.Alike the complex conjugation, PT transformation relates DY and SIDIS definition of TMD distributions of the TMD twist-(1,1), but turns TMD-twist-(1,2) to (2,1) (and vice-versa).Therefore, TMD correlators Φ 12 and Φ 21 do not have definite T-parity, which was pointed out in ref. [33].
Finally, being defined by forward matrix elements, the TMD correlators are independent on the global position of the operator, i.e.
where y is an arbitrary number.

TMD correlators in momentum-fraction space
In this section we discuss the TMD correlators in the momentum-fraction space.They are related to distributions in position space by a Fourier transform, and represent distributions of partons with specified collinear momenta.The momentum-fraction distributions are certainly more useful in phenomenology.However, the distributions in position space are simpler theoretically, due to much more compact expressions.Therefore, in the following sections, we often return to the position space definition, even talking about the momentum-fraction representation.

Definition and support properties
The TMD correlators in momentum-fraction space are defined as where the integration measure is x 1 x 2 x 3 Φ 21 .Each sector of the hexagon has independent interpretation as a process with radiation/absorption of partons with positive collinear momentum fractions.The difference between transverse momenta of quark and anti-quark-gluon pair is kT .The interpretation process for each sector is shown by diagrams, where horizontal/vertical axis is for collinear/transverse momentum.For the correlator Φ12 the picture is analogous, but with the transverse momentum of the gluon reversed.
simpler evolution equations 5 .The support (3.2) is conveniently drawn in the barycentric coordinate system, where it takes the form of the hexagon [37,39] shown in fig. 2.
Formally, the Fourier transformation between position and momentum-fraction spaces is made by the integration over the infinite domain.However, the correlators Φ are zero for |x i | > 1.This statement is derived following the same route as the one used in ref. [40] for collinear distributions.First, one observes that the T-ordering in eqn.(2.7) can be omitted, because the (anti-)commutators of fields are zero due to the causality relation.Next, inserting the compete set of states in-between fields and solving the momentum conservation condition (for p + > 0) one finds that correlators vanish at |x| > 1.In the TMD case (in contrast to the collinear case discussed in ref. [40]) one should also account for the transverse distance/momentum.However, it only makes momentum inequalities sharper.
The same consideration gives rise to the partonic interpretation for correlators, as the probability of emission (for x > 0) and absorption (for x < 0) of partons in the target's infinite momentum frame.Ordering of parton fields from positive to negative x (such that energies of partons are positive) allows to identify twist-two TMD correlators in positive/negative domains of x with In this notation some expressions are a bit shorter (see e.g.expressions for evolution kernels in appendix C of ref. [9]), but the support of these distributions is more cumbersome, Therefore, manipulations in these variables are generally more involved.
quark/anti-quarks distributions.The relations are where θ(x) is the Heaviside theta-function.The relative minus sign is due to the sign produced by the charge-conjugation [5].The distributions Φ q and Φ q are defined for 0 < x < 1.These distributions are entirely independent, and do not mix under evolution.
The interpretation of twist-three correlators is more cumbersome.They are interpreted as amplitudes of processes with radiation/absorption of a parton versus a parton pair.For example, the segment x 1 < 0 and x 2 , x 3 > 0 can be interpreted as the radiation of a quark-gluon pair and absorption of a quark, whereas the segment x 2 > 0 and x 1 , x 3 < 0 can be interpreted as the radiation of a gluon and absorption of a quark-anti-quark pair.This is analogous to collinear distributions [40].The TMD correlators additionally store the information about the transverse momentum of emitted/absorbed partons.The transverse momentum k T (Fourier conjugated to b) is the relative transverse momentum of the quark and the quark-gluon pair, irrespectively their x's signs.Therefore, distributions of transverse momentum is different for Φ 21 in any segment of x's.For example, in the segment x 1 < 0 and x 2 , x 3 > 0, the transverse momentum k T is the difference between momenta the quark-gluon pair and the quark for the correlator Φ 12 , while it is the difference between momenta of the anti-quark-gluon pair and the anti-quark for the correlator Φ The involved interpretation makes impossible to identify quark and anti-quark twist-three distributions in a similar way to (3.5).However, one can introduce a similar notation, which also helps conveying similar properties.We define where the sign ± is selected in accordance to Γ in the same way as in eqn.(3.5).The distributions Φ q and Φ q do not mix with each other under evolution (this statement is proved in the following section), and therefore, they are entirely independent.However, the signs of momenta of any parton pair are not fixed, and they mixes in the evolution.Therefore, any further decomposition is not practical.The splitting (3.6) preserves the general picture of physical process and naturally applied to many practical cases (see e.g.sec.6)).For the sake of compactness of results, without loss of generality we will not employ the decompositon (3.6).
The inverse transformation (from position to momentum-fraction spaces) requires fixation of the global position of the correlators.For example, fixing the position of the quark-field at the origin, one gets Illustration for the elements of the evolution kernel.Left panel: the function Θx 1 x 2 x 3 defined in eqn.(3.14).The same shading of the segment corresponds to the same coefficient.The LO value for the coefficient is written inside each sector.Right panel: example of the integration paths for the evolution kernel Px 2 x 1 in the barycentric coordinates.The black and white dot represent the coordinates of the special points of the integral kernels (v = 0).The black one for the 'natural' ordered coordinates.The white one for the exchanged quark and gluon momenta, which take place in the last lines of eqns.(3.15, 3.16). where If a distribution with a different global position appears, it is always possible to shift it using eqn.(2.27).
We emphasize that negative values of momentum fractions are physical and contribute to the factorized expression.We also stress that the evolution equation (see the next section) mixes contributions of different sectors.This effect is well-known in the case of collinear distributions of twist-three and higher, see e.g.discussion in ref. [37,41].However, in contrast to collinear twist-three distributions, the TMD distributions are not definite at points x i = 0.This point is elaborated on in details in the following sections.

Evolution equations
The pair of evolution equations for the LP TMD correlator is well known (see [24,42]): where γ V is the anomalous dimension of the quark vector form factor.At LO one has (3.9) The higher perturbative orders can be found, e.g. in ref. [43].
The evolution equation for the TMD correlators of twist-three are more complicated.The expression for the UV evolution kernels is computed by where ⊗ are integral convolutions.The resulting equations are conveniently written in the form , where we suppressed the argument (x 1 , x 2 , x 3 , b; µ, ζ) of the TMD correlators for brevity.The elements P, Υ and Θ are defined below.The coefficient in front of the ln ζ (i.e.Γ cusp ) is fixed by the integrability condition (2.23) and at LO is given in eqn.(3.9).Although this equation is derived at LO, it is clear that the same pattern of kernels is preserved at all orders, since it is the most general pattern that can be written.The function Υ incorporates the terms that multiply TMD correlator without convolution.The LO expression is The expression for Υ x3x2x1 is obtained from Υ x1x2x3 replacing x 1 ↔ x 3 .To obtain this function we used the convention (2.17).For arbitrary q + the function Υ reads where the dependence on q + /p + is accumulated in the first term.
The function Θ results from the complex parts of the logarithms ln(q + /∂ + ), whose real parts are collected into Υ's.The values of Θ depend on the segment of the support domain.At LO, one has x 1,2,3 ∈ (+, +, −), where x 1,2,3 ∈ (+, −, −) indicates the sector {x 1 > 0, x 2 < 0, x 3 < 0} (see also fig. 2) and similar for other cases.The function Θ x3x2x1 is obtained from Θ x1x2x3 with replacement x 1 ↔ x 3 .The visual representation of Θ x1x2x3 is given in fig. 3. The fact that the evolution kernel has a complex part is worrisome.However, the evolving function is complex-valued and thus it only indicates that the real and complex parts of TMD correlators evolve by different evolution kernels.The physical observables, that are real, are expressed via the real-valued combinations of TMD distributions (see the next section).Such combinations are evolved by real kernels.We emphasize that the complex part of the evolution kernel depends on the process, via the sign-variable s.To our best knowledge, it is the first observation of the process-dependence in the evolution equation.
The evolution kernels P in momentum-fraction space are obtained from eqn. (2.14) via the transformation (3.10).They read +O(a 2 s ), In these expressions The kernels P x2x3 are defined analogously with exchange x 1 ↔ x 3 and Φ(x 1 , x 2 , x 3 ) → Φ(x 3 , x 2 , x 1 ).The integration paths in the barycentric coordinates are shown in fig. 3. The kernels P x2x1 are regular and continuous at x 1 = 0 or x 2 = 0. Also they preserve the value of x 1 + x 2 = −x 3 at each point of the integral, and therefore, the values of function at x 1 + x 2 > 0 (x 3 < 0) does not mix with the values at x 1 + x 2 < 0 (x 3 > 0).This allows identification of quark-and anti-quark components in eqn.(3.6).However, the signs of x 1 and x 2 are not preserved individually.The evolution equations with respect to rapidity scales are the same as in the position space (2.21):

Symmetry properties
The symmetry relations for TMD correlators in momentum-fraction space are obtained by substituting definition (3.1) into relations listed in sec.2.4.The complex conjugation relations (2.24) give [Φ The parity transformation relations (2.25) give The PT-transformation relations (2.26) give The translation invariance (2.27) is already accounted in the definition of TMD correlator in momentum-fraction space since it is implicitly assumed that x 1 + x 2 + x 3 = 0.

TMD distributions with definite T-parity
The main drawback of the definite-TMD-twist TMD distributions is their complex-valued evolution kernels.Also, the TMD correlators Φ [Γ] µ,12 and Φ [Γ] µ,21 do not satisfy simple rules of complex conjugation (3.19) and have indefinite T-parity (3.21).For that reason they are impractical.In this section, we introduce combinations more suitable for practical applications.The new combinations remove the issues above at the price of more involved evolution equations.We refer to these combinations as TMD distributions with definite T-parity.

TMD correlators with definite T-parity
We define two independent combinations of the definite-TMD-twist TMD distributions, where we omit arguments (µ, ζ) for brevity.Note that the combinations involves momentumfractions with opposite signs, but they have the same transverse separation.Such correlators lack a simple partonic interpretation.Since the renormalization of Φ 12 and Φ 21 operators is independent, the renormalization of Φ ⊕ and Φ is convoluted.For this reason, the correlators (4.1) cannot be presented as a matrix element of a single bare operator.Nonetheless, these combinations are the ones that appear in the practical applications [9,18,33].
The symmetry relations for TMD distributions with definite T-parity resemble the relations for the leading-twist TMD correlator.The complex conjugation gives [Φ The parity transformation gives Quark TMD distributions of twist-three sorted with respect to polarization properties of both the operator (columns) and the hadron (rows).The labels U, H, and T are for the unpolarized, longitudinal and transverse polarizations.The subscript J differentiates different angular momentum for the transversely-polarized case.The bullet • stands for the ⊕, labels.
The PT-transformation gives PT Note that, due to the composition of momentum fraction x's in the definition (4.1), the form of the split into functions with quark and anti-quark labels (3.6) is preserved.
The evolution equations for Φ ⊕ and Φ are , where it was used that Υ In this representation the evolution kernels are real.An amazing feature of these evolution equations is that they mix the functions with different T-parity.However, the mixing terms are proportional to s, which changes sign under L → −L.Thus, the T-parity of each correlator is preserved, and it remains process independent (apart of trivial sign-change for T-odd functions).To our best knowledge, it is the first example of such behaviour.

Parameterization
The discussion is matured enough to allow us to introduce the parameterization for the TMD correlators in the terms of TMD distributions.For each TMD correlator we write all possible spin and tensor structures in accordance to its parity and dimension.As an elements of contraction one can use the vectors b µ and s µ , and the tensors g µν T and µν T .The spin vector is spit into the longitudinal and transverse projections where M is the mass of the hadron.It implies λ = M s + /p + .The standard parameterization of the leading twist TMD correlators has been carried out in ref. [5].We present this parameterization here for completeness where b 2 < 0. All TMD distributions are dimensionsless real function that depend on b 2 (the argument b is used for shortness).
The parameterization of the sub-leading correlators is where • is ⊕ or , and (x 1,2,3 , b) is a shorten notation for (x 1 , x 2 , x 3 , b; µ, ζ).We emphasize that the parameterization is written for distributions with the upper index µ, i.e.Φ . This is important because all indices are transverse and thus change sign upon rising and lowering, i.e. g 11 T = g 22 T = −1, b 1 = −b 1 , etc.The distributions defined in (4.10, 4.11, 4.12) are dimensionless and real functions.The elements of parameterization (signs and tensors) are adjusted such that the evolution equations have simpler structure and minimal mixing between distributions 6 .The notation for the TMD distributions follows the traditional pattern used in the parameterization of leading TMD distributions (4.7, 4.8, 4.9) (known as the Amsterdam notation [5]).Namely, the proportionality to b is marked by the superscript ⊥, and the polarization by subscript L (for longitudinal) or T (for transverse).In the tensor case, one faces four structures ∼ b µ s α T , which are denoted as for antisymmetric, diagonal, symmetric, and traceless components.In total there are 32 TMD distributions of twist-three.The Dirac matrices project particular components of the quark polarizations, which provide an extra layer of interpretation for TMD distributions, as densities of unpolarized (for γ + ), helicity (for γ + γ 5 ) and transversity (for iσ α+ γ 5 ) quark compositions.Twist-three distributions must also The T-parity of TMD distributions of the twist-two, twist-three and bi-quark distributions.
account for gluon polarization vector.The cases of unpolarized and helicity operators correspond to different combinations with opposite quark helicities, such as ↑⇑↓, ↓⇑↑, etc (here the single (double) arrow indicates the spin component of the quark (gluon), respectively).In the transversity case, one can split operator into three components with angular momenta 0, 1, and 2. They correspond to the trace part ∼ σ α+ F α+ , the anti-symmetric part ∼ αβ T σ α+ F β+ , and the symmetric-traceless combination.In the terms of helicity states, these operators are build from the combinations with the same quark helicities, such as ↑⇑↑, ↑⇓↑, etc.Note that the exact interpretation of these distributions is not possible since the operators for (4.1) are not simple.Nonetheless, it allows to sort TMD distributions of twist-three with respect to their spin-content, see table 1.
Among the 32 TMD distributions, 16 distributions change the sign under T-parity transformation, and 16 do not.It means that 16 distributions are naïvely T-odd.They have different sign but same shape once measured in the Drell-Yan and SIDIS processes, similarly to the Sivers and Boer-Mulders functions [33,44].The T-parity of all TMD distributions discussed in this paper is summarized in the table 2.

Evolution equations for TMD distributions
The evolution equations for the twsit-two TMD distributions are the same as for TMD correlators, i.e. where , and anomalous dimensions are defined after (3.8).As already alluded, the evolution equations for TMD distributions of twist-three take on a nontrivial matrix form.To find it we substitute the parameterizations (4.10-4.12)into the evolution equations for the correlators (4.4,4.5), and extract independent tensor combinations.We found that chiral-even and chiral-odd sectors obey different evolution equations.
The evolution equations for chiral-odd distributions split into two subsets with equations where again the argument of distributions ( The evolution equations for chiral-even sector (4.15) can be rewritten in the same form as (4.16, 4.17).We write where the notation is the same as in eqn.(4.15).These combinations of F and G functions naturally appear in the applications, as it is shown in sec.6.The evolution equations are explicitly real, and explicitly preserve the T-parity, despite mixing the distributions of different parity.The full set of 32 TMD distributions of twist-three is split into two subsets -the one evolving with the kernel P A (4.18, 4.16), and the one evolving with the kernel P B (4.18, 4.16).The two subsets of equations correspond to the different spin content of a quark-gluon pair.So, the pair with helicity structures ⇑↑ and ⇓↓ evolves with P A , while the pair with ⇑↓ evolves with P B .In sec.6.3 we show that kernels P A and P B (and the associated distributions) have different properties in the large-N c limit.

Physical TMD distributions
The TMD correlators and distributions of twist-three are indefinite at x i = 0.It can be seen in several ways.First, the evolution kernels are discontinuous at x i = 0 due to the Θ-term, and also has logarithmic singularity due to Υ-term.As a consequence any continuous (or even vanishing) function at x i = 0 will turn to a discontinuous and singular function after evolution to a different scale.Second, the discontinuity is explicitly revealed in the small-b limit which can be computed explicitly.This computation is presented in appendix C. Already, at one-loop the expressions at x i = 0 are indefinite, see eqn.(C.4).Nonetheless, TMD distributions of twist-three are integrable at x i = 0.
In a sense, TMD distributions of twist-three (and higher twists as well) are generalized functions.They do not have definite values at each point of the support, but have definite integrals.This situation is unusual, and it is not yet clear how to incorporate such functions into the phenomenology.The positive point is that all known observables are expressed via integrals, and thus can be defined.
Here, however, one faces another complication.Many physical observables, such as crosssections of Drell-Yan and SIDIS processes [7,9,10], quasi-TMD distributions [45], etc. contain integrals with an additional singularity at x 2 = 0.A typical expression contributing into the hadronic tensor incorporates zeroth Mellin moment of twist-three distributions: These integrals are divergent, since Φ's are non-analytical at x 2 = 0 and lim x2→+0 Φ = lim x2→−0 Φ.In this section, we demonstrate that divergence of integrals (5.1) is the rapidity divergence.It can be explicitly computed and subtracted.It gives rise to a new layer of definition of TMD distributions of twist-three, which we refer as physical distributions.Physical distributions have finite zeroth-momentum and the observables written in their terms are finite term-by-term.

5.1
Divergence at x 2 = 0 at LO We start with the explicit computation of the divergent part of zeroth moments (5.1) at LO.As usual, it is more intuitive to perform computation in the position space, where the zeroth moments Here, the transformation between position and momentum spaces is done according to "single momentum-fraction rule" (3.1), i.e.
and similar for Φ µ,12 .We would like to compute the rapidity divergence for these operators at one-loop.The rapidity divergence appears in the loop diagrams with the gluon attached to the far-end of the light-cone Wilson line.There are two possibilities to receive divergence for discussed operators.
• The first case (standard) is to have a diagram with a gluon coupled to the Wilson line.Such diagrams are shown in fig. 4 ((r1), (r2) and (r3)).
• The second case (special) is to couple F µ+ to the rest of the operator.In the limit lim z2→L F µ+ (z 2 ) this diagram (diagram (r4) in fig.4) is also rapidity divergent.Importantly, the special case does not appear for the operators Φ, where the position of F µ+ is fixed.This special case appears only for operators Φ (0) , and represents the divergent part of zeroth moments (5.1) at x 2 = 0.This divergence can be also classified as the end-point divergence.
We regularize the rapidity divergences by the δ-regularization [43,46].It consists in the multiplication of gluon fields in the Wilson line, including F µ+ , by the suppressing factor e −sδ , with δ > 0. The technique of computation with δ-regulator can be found in refs.[9,26,39].In particular, the detailed computation of diagrams (r1), (r2) and (r3) is provided in sec.8.1 of ref. [9].The result of computation is µ,21 (z 1 , z 2 , b) + fin.terms, (5.6) where "fin.terms" are terms finite at δ + → 0. The value of q + is |p q | and |p q | for Φ µ,12 and Φ (0)[Γ] µ,21 , correspondingly.The expressions (5.6) provide the bare LO expression for the rapidity 7 Explicitly, these integrals can be written in terms of operators as follow: where Dµ is the covariant derivative.In the terms of SCET fields, these correlators are where we use the convention by ref. [8].
renormalization factor R introduced in eqn.(2.11).The complete expression for R at O(a s ) reads [26,46] where we include 1/ term from the renormalization.The diagram (r4) is calculated similarly.In this diagram, the gluon field is totally quantum and thus the final operator contains only quark fields.The difference in dimensions of operators is compensated by an inverse power of b.We obtain 8 expressions that are finite at → 0, 11 (z 1 , z 2 , b) + fin.terms, (5.9) Note that the coefficient in front of the correlator Φ11 in eqn.(5.9) exactly reproduces the derivative of R (5.7).
The expressions (5.9) represent only the perturbative part of the rapidity renormalization factor.At larger values of b the nonperturbative corrections appears.Altogether they must combine into the derivative of (nonperturbative) factor R = exp(−2D ln δ + B), where B is some finite terms [26].Therefore, the complete LO expressions for rapidity divergent part of zeroth moments are where is the derivative of the nonperturbative Collins-Soper kernel.
One could expect a contribution to the "special" rapidity divergence from the operators with two emitted gluons (see diagrams C and D in fig.5).The explicit computation presented in appendix C shows that such diagrams are finite.
We combine the expressions for rapidity divergent parts of Φ (0) , (5.12) 8 The equation (5.9) can be also derived from know results.Using the relation we can relate the rapidity divergent diagrams for twist-two and twist-three operators (at least at the one-loop order).We find , (5.8) where the derivative does not act on the quark field, and factor 1/2 is to compensate the absence of the symmetric contribution.The LO rapidity divergence for Φ [Γ] 11 has been computed in many papers, see e.g.[23,39,43,47].Explicitly, it is given in, e.g., equation (5.35) in ref. [39].Substituting Φ [Γ] 11 rap.div.into (5.8),we receive eqn. (5.9).
The terms proportional to twist-three correlators are exact at all orders.The terms proportional to twist-two correlators at higher perturbative orders are different from derivative of R, and can contain integral convolutions.The term in brackets is the same for both correlators at all perturbative orders as a consequence of PT-invariance (2.26).

Definition of physical TMD distributions
The physical cross-section is a sum of terms with various combinations of TMD correlators.Some of them have the following schematic form where H is a hard coefficient function, Φ tw3 is a twist-three TMD distribution, and ⊗ is a convolution with respect to momentum fractions.For example, the LO of H ⊗ Φ tw3 is the zeroth moment (5.1).After the renormalization procedure (2.11) explicit divergences of H and Φ's cancel.Nonetheless, the convolution H ⊗ Φ tw3 still contains the (special) rapidity divergences.These divergences, however, do not imply the breaking of the factorization theorem at NLP, because they cancel in-between various terms of the factorization theorem.Therefore, even if the factorization theorem holds and the cross-sections are finite, it makes impossible to straightforward compute the convolutions term-by-term, using the TMD distributions defined earlier.This is, of course, a problem for any phenomenological study involving twist-three TMD distributions.
In order to make physical quantities well-defined term-by-term, we further modify the definition of twist-three TMD distributions by subtracting the (special) divergent part.We define where ⊗ is some integral convolution.The kernel R is defined with respect to the hard-coefficient function H, such that the integral convolution H ⊗Φ tw3 is finite.The physical observables expressed in terms of Φ are finite term-by-term.For that reason, we call correlators defined in eqn.(5.14), and corresponding distributions, physical.Physical TMD correlators satisfy the same symmetry properties as subtracted TMD correlators discussed in sections 2 and 3. Thus one can define TMD correlators with definite T-parity analogously to eqn.(4.1), and define TMD distribution as in sec.4.2.The physical analogs of corresponding distributions we denote by the bold font.
Note that not each physical TMD distribution has subtraction term.In some cases, the subtraction term [R ⊗ Φ] [Γ] has a zero projection to corresponding tensor structure.
The functions R are to be constructed order-by-order in the perturbation theory.Using the computation made in the previous section, we can construct R at LO.We find It is straightforward to check that integrals (5.4) computed with (5.15) reproduce (5.9).Therefore the zeroth moment of the physical distribution and corresponding term in the cross-section are finite.The subtraction term removes the "special" rapidity divergence of the zeroth moment.Therefore, the physical TMD distributions satisfy ordinary evolution equation with respect to the scale where • is ( 12), ( 21), ⊕ or .Importantly, the physical distributions also satisfy ordinary rapidityevolution equation (2.21) without modifications where • is ( 12), ( 21), ⊕ or .The evolution equations with respect to µ are identical to the corresponding equations for twist-three TMD correlators, because d[R ⊗ Φ]/dµ ∼ O(a 2 s ) at LO.At the moment, it is not clear how the transformation (5.14) affects the evolution equations with respect to µ at higher perturbative orders.It is also not clear how to explicitly guarantee the finiteness of integrals in the numerical form.
In the momentum-fraction space, the expressions (5.15) become At x 2 = 0 the values of expressions (5.18) are not defined.In this case, the integral depends on the order in which one evaluate the Dirac-delta functions.However, the integral over x 2 is well-defined.Therefore, eqns.(5.18) represent generalized functions.The zeroth moments (5.1) of eqns.(5.18) are rapidity divergent at x 2 = 0. Thus, the expression for the physical TMD correlator has finite zeroth momentum, and evolves according to eqn.(5.16). For The value of discontinuity at x 2 = 0 is proportional to the difference between quark and anti-quark distributions.Indeed, where x = |x 3 | = |x 1 |, and similar for the (12)-case.Since the physical TMD correlator has a finite zeroth moment, one can expect that it is smooth at x 2 = 0, and thus the discontinuity of Φ µ,21 is given by eqn.(5.20).
The integrals (5.1) appear in the factorized cross-section at the tree-level.At higher perturbative orders one faces more singular combinations, such as ln x 2 /x 2 [9].To study these NLO terms one has to perform two-loop computation, which goes beyond the present work.However, we expect that the general strategy presented in this section holds at all orders of perturbation theory.It is also evident that the higher powers of TMD factorization theorem will suffer similar problems of singular integrals.We expect that the receipt suggested here is of general validity, i.e. the higher twist TMD distributions can be turned to physical ones, that provide term-by-term finite cross-section, by subtracting lower twist parts.
Since the physical distributions are defined with respect to kernel H, they generally are not universal, and could depend on the process.However, for SIDIS, Drell-Yan and semi-inclusive annihilation processes the hard coefficient functions are the same.It is also the same in the factorization theorem for quasi-TMD correlators [45].Therefore, the physical TMD distributions defined here have at least a large spectrum of validity, and can be used for description of all these cases.
The present definition of physical TMD distributions requires only the finiteness of integral convolution [H ⊗ Φ].It leaves a large freedom in the definition of the subtraction terms.The suggested terms (5.15) represent the "minimal subtraction scheme", where only the divergent part is removed.Clearly, one can also incorporate finite terms, and use this freedom to improve the properties of physical distributions.For example, one can prepare physical distributions of twistthree regular in the small-b limit (see appendix C).This is a very important point which requires a detailed study in the future.

Bi-quark TMD correlators
The distributions introduced in the previous sections are fundamental.They have closed evolution equations, and the factorization theorem is expressed in their terms at any order of the perturbative expansion.However, they are also very complicated from the practical point of view.To start with, they are functions of two independent momentum fractions, which complicates all formulas.For practical applications, it may be convenient to also have a simpler approximation.
Until recently, the TMD factorization theorem at NLP was known only at the three order, which can be expressed in the terms of TMD correlators Φ 11 , where Γ T projects one good and one bad components of quark fields.Such bi-quark distributions are visually simpler, albeit having very convoluted properties.In this section we derive the relations between bi-quark distributions and genuine twist-three TMD distributions.We also demonstrate that the evolution equations for bi-quark distributions are not closed.Finally, we show that the large-N c limit ensures a simpler system of evolution equations and closure of the evolution of bi-quark distributions.

Definition
The bi-quark TMD correlators were introduced in analogy to the leading-twist TMD correlators in ref. [5].They are defined as where Γ is any Dirac matrix.For Γ ∈ Γ + the bi-quark correlator has TMD-twist-(1,1) and coincides with Φ [Γ] 11 .For other Γ's bi-quark correlators do not have definite twist.We consider the correlators of twist-three with Γ ∈ Γ T , where Γ T projects one good and one bad component of the quark fields (A.5).The standard parameterization for bi-quark twist-three TMD distributions reads9 Φ where all indices are transverse.There are 16 TMD bi-quark distribution of twist-3.Eight of them are T-even and eight are T-odd, see table 2. These distributions are often used in the phenomenology.Among these distributions, one of the most studied is the e distribution, see, e.g., ref. [48,49] and references therein.The e distribution has one of the simplest structures in terms of twist-three distributions and, unlike others, contains no leading-twist contributions.

Relation between bi-quark and definite-twist distributions
On the theory side, the bi-quark TMD-correlators for Γ ∈ Γ + are not well-defined objects, in the sense that their renormalization is troublesome.To explicitly reveal the problems, we use the equations of motion for the bad components of the quark fields to obtain [18,33] Φ Here we omit the quark mass terms for simplicity.These terms are proportional to Φ11 , and can be simply restored if needed.The matrices {γ µ γ + Γ T , Γ T γ + γ µ } ∈ Γ + for any Γ T .Therefore, the distributions in the first line of eqn.(6.8) have TMD-twist-(1,1), while in the second line TMD-twists (2,1) and (1,2).In the momentum fraction representation the relation is where the zeroth moments are defined in eqns.(5.1).This bare relation clearly demonstrates the issues of the definition: • The UV renormalization of the zeroth moments Φ (0) µ,• does not express via itself.Therefore, the evolution equation for Φ [Γ T ] qq is not closed, but mixes with integrals of genuine twist-three TMD distributions; • The rapidity renormalization of the zeroth moments Φ (0) µ,• is not multiplicative.Namely, in addition to ordinary rapidity divergences (that are renormalized by R), there are special rapidity divergences alike ones considered in sec.5.1.Thus, the bi-quark TMD distributions are not well-defined.
Let us also point that the derivative of bare twist-two distributions does not commute with the rapidity renomalization factor R(b).It produces extra terms ∼ ∂ µ R during the renormalization procedure.These terms resemble special rapidity divergences in zeroth moments terms (5.12).However, they do not cancel each other due to extra factor 1/2 in eqn.(5.12).
All mentioned problems arise at NLO of perturbative expression.In fact, bi-quark correlators were introduced just as convenient combinations which appeared at the tree-order of factorization theorem [18,33].Already at NLO of factorization theorem, this notation is inconvenient, because genuine twist-three and twist-two parts receives different hard coefficient functions [9], which break down the combinations.The situation here is similar to the bi-quark twist-three collinear distributions g T , h L , e (see e.g.[11]), which appears in the polarized DIS.They are not more than a convenient combinations of genuine parton distributions, and their convenience does not hold beyond LO approximation.
For applications beyond LO, the bi-quark distributions are defined as the corresponding combinations of renormalized distributions.Similarly, we define bi-quark TMD distributions as combinations of physical TMD distributions, This expression is well-defined at all orders in perturbation theory, and reduces to eqn.(6.1) at LO.
The relations (6.11 -6.26) involves only a half of the twist-three quark-gluon-quark distributions, namely .
These are the distributions with the quark-gluon pair having opposite helicities only, i.e.only ⇑↓ or ⇓↑.The pairs (6.28) evolve autonomously with the kernel P A (3.15) (see sec.4.3), and do not mix with other distributions.Elements of a pair have opposite T-parity, and mix only due to non-zero Θ.This mixture implies that pairs of twist-three bi-quark distributions {f, g}, {h, e} (with the same labels) are intrinsically connected.
The rest distributions, do not contribute to the NLP factorization theorem, and evolve with the kernel P B (3.16).However, they could appear at NNLP factorization.

Evolution equations
To find the evolution of the bi-quark distributions one needs to compute the zeroth moment of the evolution kernels (4.4, 4.5).This computation is presented in the appendix B.2.The resulting expression does not reproduces the zeroth momentum and has also terms with convolutions of twist-three distributions.Therefore, the evolution equations with respect to µ for the bi-quark distributions has the following schematic form where {F + , F − } is a pair bi-quark distributions ({f, g} or {h, e} with the same labels), {f + , f − } are the corresponding twist-two parts, and {Φ + , Φ − } are twist-three distributions (6.28).The cusp anomalous dimension Γ cusp and the quark anomalous dimension γ V are defined in eqn.(3.9).The anomalous dimension γ 1 at LO is The mixing with twist-two terms depend on the specific distribution and are given in eqns.(6.34 -6.49).The last in eqn.(6.30) term mixes distributions with different T-parity.The kernels P and Θ are combinations of P A (3.15), Υ (3.12) and Θ (3.14).They are given in eqn.(B.12).Since equations (6.30) do not have practical importance we do not write anomalous dimensions and kernels explicitly.If necessary they could be found combining definitions (6.11 -6.26), with equations (4.18, 4.16, B.12).Importantly, these equations significantly simplify in the large-N c limit, where which is derived in appendix B.2.This structure is also evident in the NLO coefficient function for TMD factorization (see eqn. (6.15) in ref. [9]).
In the large-N c limit, the zeroth moments of twist-three TMD distributions from the list (6.28) have autonomous evolution.Alike twist-two distributions, they satisfy the pair of equations where F A is the zeroth moment (5.1) of any twist-three distribution listed in eqn.(6.28), Γ cusp is the cusp anomalous dimension, and γ 1 is given in (6.31).The evolution for the zeroth moments of other twist-three distributions, i.e. those that are listed in eqn.(6.29), does not simplify in the large-N c limit.It is given in equation (B.12).
Consequently, the evolution equations for bi-quark distributions in the large-N c limit are also closed.This observation is analogous to the situation with the twist-three structure functions for DIS and baryon distribution amplitudes, for which the evolution equations at LO and large-N c are also closed [50,51].In the collinear case, this effect happens due to the conformal properties of the operator [12].In the TMD case, we do not have any general explanation yet.
For completeness we provide the complete list of evolution equations in the large-N c limit explicitly where we omit the arguments (x, b; µ, ζ) for each function, and the term O(a s /N c ) for brevity.Note that the twist-two and twist-three parts of bi-quark distributions can be evolved separately, and added together afterwards.
The evolution with respect to scale ζ is much simpler, thanks to the physical distributions Φ in the definition (6.10).Since twist-two and twist-three distributions satisfies the same evolution with ζ, the only deviation from ordinary evolution equations happen due derivatives of twist-two terms.For completeness we provide the list of evolution equations with respect to ζ where we omit the arguments (x, b; µ, ζ) for each distribution and arguments (b, µ) for the Collins-Soper kernels for brevity.Note that these equations are exact at all orders of perturbation theory.The derivative of the Collis-Soper kernel is singular at b → 0, where β 0 is the LO coefficient of the QCD beta-function.The "NP-terms" are non-perturbative terms which are O(1) [52].Additionally, the terms with D are multiplied by x −1 .Therefore, the contribution of D to the evolution is numerically large.From this, we expect that the functions {f ⊥ , f ⊥ T , g ⊥ L , h ⊥ T , h T } are larger than other bi-quark TMD distributions.Although this effect can be isolated to the twist-two part, we observe that the genuine twist-three part of these functions have a singular small-b behavior (see appendix C.3).This implies that the functions {f ⊥ , f ⊥ T , g ⊥ L , h ⊥ T , h T } contribute to the cross-section (at large p T ) at one power higher in comparison to naïve counting.This observation could resolve the old-stated problem of mismatches in power counting between fixed-order and TMD factorization for some observable [53].The possibility of such a scenario was discussed in ref. [54].The detailed study of this mechanism is left for future publications.

Conclusion
This paper presents a study of twist-three TMD distributions starting from the operator definition obtained within the NLP factorization theorem [9].We discuss and demonstrate issues of this definition, such as complex-valued functions, discontinuities at x i = 0, and others.We revise these issues step-by-step, modifying the definition of twist-three TMD distributions in order to solve them.In such a way, we introduce physical TMD distributions of twist-three, which are well-defined in all aspects and can be used for practical applications.The results of this work are of principal importance for the description of available and future data on spin asymmetries in Drell-Yan and SIDIS processes.The recent measurements by CLAS [55] allow testing of twist-three distributions' predicted scaling and asymptotic values.
The bottom-top strategy allows us to keep track of the properties of TMD distributions and derive them with minimal efforts.So, we equip the definition with the evolution equations, LO evolution kernels, symmetry relations, discussions on interpretation, support properties, and smallb limit.We also link the genuine distributions discussed here with the generic distributions.It provides the evolution equations for the latter.In this work, we consider only quark TMDPDFs.The study of TMDFFs and/or gluon distributions can be made in the same way.
To our best knowledge, it is the first systematic first-principles study of genuine TMD distributions of twist-three.Let us emphasize several points that we consider to be the most important.
• There are 32 genuine TMDPDFs of twist-three.Only half of them contribute to the NLP cross-section of the SIDIS or Drell-Yan process.Out of these TMDPDFs, 16 are T-odd, i.e., they change signs between DY and SIDIS definitions.
• The evolution equations for twist-three TMD distributions mix T-even and T-odd TMD distributions.This mixture is proportional to a process-dependent sign factor, such that each function preserves its T-parity and in-between-process universality.
• The genuine twist-three TMD distributions are generalized functions.Namely, their value is not defined for some points of support domain (most importantly for the vanishing gluon momentum), albeit they have definite integrals.
• The convolutions that contribute to the factorized cross-section are singular due to discontinuities of TMD distributions, albeit producing a finite cross-section thanks to the cancellation of divergences between different terms.The value of discontinuity can be computed and subtracted.It gives rise to the definition of the physical TMD distribution of twist-three, which provides a term-by-term finite cross-section.
• Some of the genuine twist-three TMD distributions have singular behavior at small-b, namely as b −1 .This behavior changes the naïve counting of terms in cross-section at large transverse momenta, such as the one studied in ref. [53].
• The evolution of bi-quark twist-three TMD distributions is autonomous in the large-N c limit.
Each of these points is important for the further development of the theory and practice of twistthree TMD distributions. where The elementary evolution kernels are [30], where z α ij = z i (1 − α) + z j α.These kernels obey different properties, each one representing various components of conformal transformation.The action of the kernel on q(z 3 )F µ+ (z 2 ) is analogous, but with inverted order of gamma-matrices.The momentum representation can be found in refs.[9,56].

B.2 Evolution kernels for zeroth moments
The zeroth moment is defined as where δ > 0 is a regulator required for determination of phases in the following integrals.The evolution kernels for the zeroth moment can be computed using that where H is elementary kernel defined in eqns.(B.3).This computation is easier to perform in terms of the generation functions.Using the shift operator we present where ∂ z are derivatives acting on the argument z.Then the integrals can be carried out formally.We find In this expression the logarithms and their complex parts are presented and understood in the same way as in eqn.(2.14) (see footnote 3).Combining other elements of the evolution equation (2.14) we obtain the LO equation for the zeroth moment of Φ 12 where we use that (∂ z1 + ∂ z2 + ∂ z3 )Φ = 0, to simplify the expression.Let us emphasize the combination of logarithms in the second line.These logarithms have the same real but distinct imaginary parts.The equation for Φ 12 is obtained by replacing ∂ z1 ↔ ∂ z3 and reverting the order of gamma-matrices.
In the momentum-fractions space, the kernel reads In this expression we set q + = |x 3 p + | as prescribed by the convention eqn.(2.17).The expression enclosed by square brackets is regular at x 2 = 0.The complex parts are x 1,2,3 ∈ (+, +, −).

(B.13)
This expression can be compared with the logarithmic part of the hard coefficient function in the factorization theorem [9], and agrees with it.We emphasize the non-trivial cancellation between complex parts of the logarithms.
In the large-N c limit the expression essentially simplifies The part of the kernel with Φ [Γ] expresses via the zeroth moment only, whereas the part involving Φ [Γγ ν γ µ ] does not.Therefore, the action of the kernel P A (3.15) to the zeroth moment expresses via the zeroth moment again in the large-N c limit, whereas the action of the kernel P B (3.16) has a more general structure.The definition q + (2.17) plays the crucial role in the large-N c simplification.For a general q + one gets an additional term in the square brackets of eqn.(B.12) proportional to ln(q + /|x 3 p + |).This term survives in the large-N c limit, and produces the term 2N c ln(q 21 in the square brackets in eqn.(B.14).It spoils the simplification of evolution equations for bi-quark TMD distributions.

C On leading behavior of TMD distributions at small-b
At the small values of b, the TMD correlators are expressed via the spatially compact light-cone operators.The knowledge of asymptotic small-b expansion is important for phenomenology since it relates collinear and TMD distributions, increasing their mutual predictive power [57].In this appendix, we discuss the leading terms of small-b expansion for some of sub-leading TMD distributions, with a particular emphasis on the singular ∼ b −1 behavior.The list of relations presented here is incomplete.It lacks some of the 32 twist-three distributions which have leading matching at higher twist order.The main purpose of this appendix is to present examples of singular smallb behavior, as the phenomenon of the lowering of collinear twist.The complete evaluation and detailed study is a topic left for a separate investigation.

C.1 Computation of the small-b matching
The small-b expansion for the twist-two TMD distributions is studied in details.The coefficient functions for leading terms are known at NLO for all TMD distributions [39,47,58,59] (except pretzelosity h ⊥ 1T that has leading term at twist-four, see ref. [22]).In some cases the coefficient functions are known at NNLO [43,60,61] and at N 3 LO [62,63].
The tree order of the small-b expansion corresponds to a Taylor expansion of the TMD operator.Clearly, this operation can only increase the geometrical twist of the light-cone operator.Therefore, for an operator with the TMD-twist-(N,M) the leading small-b term has twist-(N+M).Therefore, for the present case the smallest-twist distributions are of twist-three.The leading term is Φ The complete computation of the one-loop correction to (C.1) is complicated and we do not cover it, for the time being.However, there exists a particular contribution which gives a singular behavior at small-b.This contribution arises from the 2 → 1 diagrams10 with two quantum fields turning to a single classical field shown in fig. 5.The leading small-b expression for these diagrams is straightforward to compute using the background-field approach, see ref. [39].The result is UV finite.The quark contribution is given by diagrams (A) and (B) x 1 Tr(γ ρ γ ν γ − Γ) − x 3 Tr(γ ν γ ρ γ − Γ) 2x 2 Φ µν (−x 2 )(θ(x 1 , x 3 ) − θ(−x 1 , −x 3 )), x 1 , x 3 = 0, where we used x 1 + x 2 + x 3 = 0 for simplification.At x 1 = 0, or x 2 = 0, or x 3 = 0 these expressions are not defined, but have finite integral over these points.Let us for concreteness inspect the diagram B. Once x 2 or x 3 turns to zero, the product of δ function became indefinite, in the sense that its result depends on the order of integration over α and y.However, performing the integration over x 2 or x 3 , one specifies the order of integration and the result of integration is definite.Therefore, the functions Φ Nonetheless, they receives the singular matching at the two-loop level, at least due to the evolution equations (4.15, 4.16, 4.17).Therefore, we have These are only a part of small-b relations.To obtain the leading term for the left 12 distributions, one needs to compute a one power higher.Without this computation we cannot strictly identify is the power of the leading small-b term.
The presented here relations violate the naïve expectations about power counting of TMD distributions in the large-p T asymptotic.It explains the problems with "mismatching" between the fixed order computations and the leading twist TMD factorization observed in ref. [53].In ref. [54] authors discuss a possible resolution of this problem by applying "violated" counting.Comparing the known asymptotics for the SIDIS structure function F cos φ U U , and computing the deficit, they derive the large-k T expression for the TMD bi-quark function f ⊥ (6.4).Their result agrees with our expressions for f ⊥ and g ⊥ ⊕ in the power-scaling, but has the different coefficient.It is due to the fact the authors of ref. [53] ignore the contribution of twist-four distributions to the large-p T asymptotic.

C.3 Parameterization for collinear distributions
The standard parameterization for the collinear distributions of the twist-two [11]  where f 1 , g 1 and h 1 are unpolarized, helicity and transversity PDFs (also denoted as q, ∆q and δq).The s µ T is the transverse part of the spin vector, and λ = M s + /p + is its longitudinal part.The twist-two PDFs are defined at −1 < x < 1 and for the negative values of x are associated with PDF for anti-quarks f 1 (x) = θ(x)f 1,q←h (x) − θ(−x)f 1,q←h (−x), (C.15) and similar for g 1 and h 1 .

Figure 2 .
Figure 2. Support of the TMD correlator Φ21 in the space x1, x2, x3.The momentum fractions are constrained to x1+x2+x3 = 0 and xi ∈ [−1, 1].Each sector of the hexagon has independent interpretation as a process with radiation/absorption of partons with positive collinear momentum fractions.The difference between transverse momenta of quark and anti-quark-gluon pair is kT .The interpretation process for each sector is shown by diagrams, where horizontal/vertical axis is for collinear/transverse momentum.For the correlator Φ12 the picture is analogous, but with the transverse momentum of the gluon reversed.

21 .
In total, there are six interpretation regions for each correlator Φ shown in fig.2.These twelve combinations of x's and k T show all possible quark-gluon-quark-transition processes with a single measured k T .

Figure 4 .
Figure 4.The diagrams contributing to the rapidity divergence of TMD-twist-(1,2) operator.The blue arrow shows the rapidity divergent limit.

Figure 5 .
Figure 5.The diagrams that are singular at b → 0, and which provide leading contribution to the matching ∼ as/b for Φ [Γ] µ,12 (diagrams A and C) and Φ [Γ] µ,21 (diagrams B and D).