Collinear matching for Sivers function at next-to-leading order

We evaluate the light-cone operator product expansion for unpolarized transverse momentum dependent (TMD) operator in the background-field technique up twist-3 inclusively. The next-to-leading order (NLO) matching coefficient for the Sivers function is derived. The method, as well as many details of the calculation are presented.


Introduction
The exploration of the internal structure of nuclei is a fascinating task, which identifies transverse momentum dependent (TMD) distributions as one of its most powerful tools. Transverse momentum dependent factorization theorems present a consistent description of double-inclusive processes, such as Drell-Yan/Vector/Scalar boson production(DY) [1,2] and semi-inclusive deep inelastic scattering (SIDIS) [1,3,4] in the regime of small transverse momentum. Within the TMD factorization approach, the information on hadron structure is encoded in TMD parton distribution functions (TMDPDFs) and TMD fragmentation functions (TMDFFs). The presence of the transverse scale allows to resolve the internal structure of hadron with more details than collinear parton distributions. Many polarization phenomena, which are subleading in collinear factorization, are described by the leading order TMD factorization. In this work, we study the Sivers function [5,6], which describes the correlation of an unpolarized parton transverse momentum and a hadron polarization vector.
The Sivers function is an essential part of the single-spin asymmetry (SSA) phenomenon. Experimentally, SSA has been measured in SIDIS at Hermes [7], COMPASS [8,9], JLab [10] and in Drell-Yan at RHIC [11][12][13]. Its measurement is planned also for the future Electron-Ion Collider (EIC) [14]. SSA has been also an object of intensive phenomenological analysis, see e.g. [15][16][17][18][19][20]. The resulting predictions differ substantially among these studies owing to TMD evolution [21], which shows the importance of a correct treatment of QCD perturbatively calculable parts. In the literature, there are several available calculations of the SSA in perturbative QCD. The leading order (LO) (and partially the next-to-leading order (NLO)) calculations for the SSA were performed in many works [22][23][24][25][26][27][28]. In principle, following these works it is possible to obtain the perturbative expression for Sivers function at NLO (however, different schemes are used for different parts of the calculation, see discussion in section 7.3). Therefore, the SSA and the Sivers function are probably one of the most renowned and intensively studied polarized TMD quantities.
Although the TMD distributions are genuine non-perturbative functions that should be extracted from data, they can be evaluated in a model-independent way in terms of collinear distributions in the limit of large-q T [29], or small-b in the position space. This procedure is called "matching" and typically it serves as an initial input for the nonperturbative model of the TMD distributions, see e.g. [17,30,31]. The matching greatly increases the agreement with data [30]. From the theory side, the matching procedure consists in the selection of the leading term in the light-cone operator product expansion (OPE) for the TMD operators [32,33]. Alternatively, the matching can be obtained by
Only a few TMD distributions of leading-dynamical twist match the twist-2 collinear distributions. These are the unpolarized, helicity and transversity TMDPDFs and TMDFFs. The matching coefficients for these distributions are known uniformly at the next-to-leading order (NLO) [1,2,33,35,36] and some are known at NNLO [32,37,38]. The remaining TMD distributions match twist-3 collinear distributions (apart of the pretzelosity which is apparently of twist-4 [38,39]). The knowledge of the matching for these distributions is very poor: the quark TMDPDFs are all known at LO [22,23,26,40,41] and only Sivers function is known at NLO [27,28] (however, see discussion in section 7.2). The matching for some of quark TMDFFs, such as Collins function, is known at LO [40]. The matching for the majority of gluon TMD distributions is unknown.
The importance of the computation of the perturbative part of a TMD distribution in order to meet an agreement between theory and experiment has been shown already in [30] for the unpolarized case. Depending on the experimental conditions, the measured data can be sensitive to various aspects of the theory such as power corrections in the evolution [42], power correction [43], small-x effects in the evolution [44] and many others. The full control of all of these sources of non-perturbative physics requires an accurate setting of the perturbative scales, as provided, for instance, by the ζ-prescription of [45].
In this work, we perform a complete NLO computation of the Sivers function starting from its operator definition and performing a light-cone OPE in background field [46]. To our best knowledge, this approach is used for the description of TMD operator for the first time, despite the fact that it is a standard tool in higher twist calculation, see e.g. [47,48]. This technique grants an unprecedented control of the operator structure and it allows a very general treatment for twist-3 distributions. Therefore, the result obtained in this work is also interesting for a broader study. For the first time, we demonstrate how the TMD renormalization (ultraviolet and rapidity renormalization [49]) is organized at the operator level. We also articulate the role of the gauge links and their direction and show (at the level of operators) the famous sing-change in-between DY and SIDIS definitions of the Sivers function [50]. Motivated by these considerations, we provide a detailed and pedagogical explanation of the calculation method, which is a major target of this article. For that aim, the Sivers function represents an ideal case, because one can cross-check the calculation with other methods already used in the literature. We anticipate that our results agree with the results present in the literature only partially, however, the origin of the discrepancy is clear.
The article is organized as following. Section 2 is a general introduction to SSA in the TMD factorization approach. Here we collect the expressions for SSA structure functions and describe the role of Sivers function and its collinear matching. In section 3.1 we introduce and describe in detail the operator that defines Sivers function. Its renormalization properties are discussed in section 3.2. Section 3.2 is devoted to the detailed derivation of OPE at LO. We discuss separately the evaluation in regular (section 4.1) and light-cone (section 4.2) gauges. The NLO evaluation is presented in section 5. We make a pedagogical introduction to the background field method in section 5.1-5.2. The details on the NLO JHEP05(2019)125 evaluation of diagrams are given in section 5.3. In section 5.4-5.5 we discuss the appearance of rapidity divergences and their renormalization. The difference in the evaluation of DY and SIDIS operators is discussed in section 5.6. The extra details on the calculation are given in appendices B, where we present a step-by-step calculation of a diagram and C.1, where we give the diagram-by-diagram expressions for OPE. The collinear distributions are defined in section 6. Additional details of the parametrization definition are given in appendix. A. The transition from operators to distributions is discussed in section 7.1 and the collection of diagram-by-diagram expressions can be found in appendix C.2. The final result of calculation is given in section 7.2. The discussion and comparison with earlier calculations is given in 7.3.

Sivers effect and TMD factorization
TMD distributions are defined by a large set of parameters: collinear momentum fraction x, transverse distance b (or transverse momentum p T ), polarization, parton flavor f , the type of hadron h, ultraviolet and rapidity renormalization scales (µ and ζ) and the defining process (DY or DIS). An explicit designation of all these parameters would lead to a heavy notation such as which should be read as the Sivers function for a quark q with momentum faction x at the transverse parameter b produced by hadron h in the DY kinematics, measured at scales µ and ζ. Most of this information is not needed in perturbative calculations and in the following we skip the unnecessary parts of the notation, e.g. the renormalization scales are usually dropped. We also distinguish the momentum and coordinate space TMD distributions only by their arguments. In the rest of this section we show how the Sivers function arises in SIDIS and DY cross sections.

Sivers function in SIDIS
The semi-inclusive deep inelastic scattering (SIDIS) is a common name for a set of processes l(l) + N (P ) → l(l ) + h(P h ) + X, (2.1) where l(l ) is a lepton, N is a nucleon target and h is the produced hadron. The TMD factorization is applicable in the regime |P h | Q, where Q 2 = (l − l ) 2 is a hard scale of the scattering, P h is the transverse component of the momentum P h . In the following, we use the bold font notation for the transverse components of vectors.
In the case of unpolarized lepton beam, unpolarized produced hadron h and a transversely polarized target N , the cross-section for SIDIS contains three structures. The socalled Sivers effect (proportional to sin(φ h −φ s )), Collins effect (proportional to sin(φ h +φ s )) and the sin(3φ h − φ s ) asymmetry. The structure functions corresponding to these effects within TMD factorization can be found e.g. in [4,15,51]. The structure function for the

JHEP05(2019)125
Sivers effect is denoted by F sin(φ h −φs) U T . Within the TMD factorization it is [4] F sin(φ h −φs) U T (x, z, Q, P h ) = −xH DIS (Q, µ) f e 2 f d 2 pd 2 kδ (2) where the variables x and z are the momentum fractions of partons and M is the hadron mass. The functions D 1 and f ⊥ 1T are unpolarized and Sivers TMD distributions. The factorization scale µ is typically chosen to be of order Q. The scales of soft exchanges (rapidity factorization) ζ 1,2 satisfy ζ 1 ζ 2 = Q 4 .
The TMD factorization is naturally formulated in position space, where the Fourier convolution in eq. (2.2) turns into a product of functions. In position space the structure function reads The functions D 1 and f ⊥ 1T depend only on the length of the vector b but not on its direction and one can also simplify the angular dependence [41,52] where J 1 is the Bessel function of the first kind. The equation (2.4) is the usual starting point for the parametrization of the Sivers effect in TMD factorization.

Sivers function in DY
The Sivers effect also appears in the Drell-Yan/vector boson production process where one of the initial hadrons is polarized [51,[53][54][55]. In general one refers to structure functions F 1 U T when the hadron h a is polarized and F 1 T U when the hadron h b is polarized. The structure function F 1 T U in TMD factorization (i.e. for q T Q) reads [56]

JHEP05(2019)125
where Q 2 = (l + l ) 2 is the hard scale of the process, x a,b are momentum fractions of partons, q T is the transverse component of q = l + l relative to the scattering plane and f 1 is the unpolarized TMD distribution. The factorization scales are defined similarly to the SIDIS case, i.e. µ ∼ Q and ζ 1 ζ 2 = Q 4 . The transformation of the structure function under interchange of the polarized hadron ( The structure functions can be also written in the form and where we have integrated out the angular dependence. The Sivers functions in SIDIS, eq. (2.2) and DY, eq. (2.6), have different labels that specify the processes. These functions have different operator definitions (see section 3.1). However, de facto, the process-dependence reduces to a simple sign change [22,50,57,58] In the following, we demonstrate the origin of the sign-change at the level of OPE.

TMD evolution and operator product power expansion
The practical application of TMD factorization relies on the concept of TMD evolution, which allows to relate structure functions at different values of Q. Here, we should stress that a TMD distribution is an involved non-perturbative function. In fact, in addition to the non-perturbative structure of TMD distribution (which involves the dependence on the variables (x, b)), the TMD factorization also contains a non-perturbative part of the evolution factor (which depends only on b). An efficient implementation of the TMD approach should be able to disentangle these non-perturbative contributions. The parametrization and extraction of three non-perturbative functions (two TMD distributions and the evolution kernel) of two variables would be a hopeless task if the TMD factorization would not allow us to separate the problem into pieces. First of all, the TMD evolution is regulated by two scales (µ, ζ) and it is process independent. It factors out the non-perturbative evolution effects into an evolution factor which is strictly universal for all structure functions and for all TMD factorizable processes. Nonetheless, the TMD evolution still non-trivially affects the (x, b) dependence of the distribution which should be modeled as a function of two variables. To simplify this procedure one can use any available information that restricts the functional form of the JHEP05(2019)125 TMD. In particular, at small values of b a TMD distribution can be related to collinear distributions in a model-independent way in perturbation theory. Such a relation has the general form provided by OPE where C i are perturbatively calculable Wilson coefficient functions which depend on b only logarithmically via L µ (to be defined in eq. (5.2)), f i are collinear distributions of increasing twist and ⊗ is an integral convolution in the variable x. This expansion is valid only in a certain range of b, say |b| < R, where R is some matching scale. For values of b larger than R TMD distribution is completely non-perturbative. In fact, as the value of b gets closer to R, the contribution of higher order terms in the small-b expansion becomes more important. However, our knowledge of the corresponding higher-twist distributions is very limited. Thus, it is of practical convenience to use only the first term of the small-b expansion in eq. (2.10) and replace the rest by a generic non-perturbative function, i.e. (2.11) The practical success of such an ansatz can be easily understood if we notice that the main contribution to the Fourier integrals in eqs. (2.4), (2.8) comes from the small-b region. Therefore, we can expect that the function f N P has a simple behavior in x and b, which is indeed confirmed by phenomenological applications of this formula. The details of the modeling procedure which is based on eq. (2.11) are different in different approaches, but the core picture described here remains unchanged. The small-b matching is an essential part of the modern TMD phenomenology. In ref. [30] a comparison of different orders of the matching to experimental results has been performed. It has been shown that the NLO matching is essential for the predictive power of the approach. The NNLO matching provides further improvements and it can be necessary for the description of the most precise experiments.
The achievable precision can also be affected by the choice of scales in the matching. Let us also mention that in [45] the authors have proved the possibility to disentangle the procedure of small-b matching and TMD evolution using the ζ-prescription which is not entirely possible in other formulations. The ζ-prescription allows using different perturbative orders for TMD evolution and small-b matching. This means that the modeling of the TMD through eq. (2.11) is completely separated from the evolution part of the TMD (that is, the scale choice does not mix up non-perturbative pieces of different origin). This fact results to be extremely useful for phenomenology since it allows to use the highest allowed/known expression of evolution [59] in combination with polarized observables whose high perturbative orders are unknown. The universal non-perturbative part of evolution can be extracted from the most precise data (such as Z-boson production at LHC) [60].
Let us conclude this section recalling that the hard coefficient functions H DIS and H DY within TMD factorization are given by the quark form factor evaluated in the different JHEP05(2019)125 analytical regions. At the NLO they differ only by a π 2 -term, (2.13) where l Q 2 = ln(µ 2 /Q 2 ) and a s = g 2 /(4π) 2 . The NNLO and NNNLO expression can be found in [61].

Operator definitions for unpolarized and Sivers TMD distributions
In this section, we introduce and review the main properties of TMD distributions.

Definition of TMD distributions
Through the article we use the standard notation for the light-cone decomposition of a vector The vectors n andn are light-like Their particular definition is related to the factorization frame of the scattering process. The transverse part (with respect to vectors n andn) of the metric and Levi-Civita tensors are where µνρσ is in the Bjorken convention ( 0123 = − 0123 = 1). In four dimensions (with n andn localized in the plane (0, 3)) both tensors have only two non-zero components, g 11 T = g 22 T = −1 and 12 T = − 21 T = 1. Since the transverse subspace is Euclidian, the scalar product of transverse vectors is negative, v 2 T < 0. In the following, we adopt the bold font notation to designate the Euclidian scalar product of transverse vectors, i.e. b 2 = −b 2 > 0, when it is convenient.
Using this notation, the transverse momentum dependent parton distribution functions (TMDPDFs) for unpolarized quark are defined by the matrix element [1,2,62] where M is the mass of the hadron and s T is the transverse part of the hadron spin-vector S, i.e. s µ T = g µν T S ν . The function f 1 is the unpolarized TMDPDF, which measures the unpolarized quark distribution in an unpolarized hadron. The function f ⊥ 1T is known as the Sivers function, which measures the unpolarized quark distribution in a polarized hadron.
The parametrization of eq. (3.5) is given in position space. The distributions in momentum space are defined in the usual manner where the scalar product (bp) is Euclidian. Correspondingly, the momentum space parameterization reads [4,63] Φ Some explicit relations among particular TMDPDFs can be found in the appendix of ref. [41]. These relations are used to relate structure functions in momentum and coordinate representations in section 2. The anti-quark TMD distribution is defined as Using charge-conjugation, one can relate the quark and anti-quark TMD distributions [62], from which it follows Therefore, in the following we associate the anti-quark distributions with the negative values of x and we define the TMD distributions in the range −1 < x < 1 as The small-b expansion (often called small-b matching or collinear matching) presents a TMD distributions as a series of collinear distributions and Wilson coefficients in the vicinity of b = 0 as in eq. (2.10). For instance, the leading term of the small-b expansion for unpolarized TMD is expressed by the (unpolarized) collinear PDF f 1 (x) where the sum index f indicates gluons, quarks and antiquarks of all flavors. The coefficient function C is the perturbative Wilson coefficient, which depends on b logarithmically. Its leading term is δ(1 − y) and the perturbative corrections are known up to NNLO [37]. The power corrections (as in eq. (2.10)) contain collinear distributions of twist-2 and twist-4 and they are currently unknown. The expression for the small-b matching of the Sivers function is (3.15) where T are the collinear distributions of twist-3, to be defined in sections 6.1, 6.2. The symbol ⊗ denotes an integral convolution in the variables x 1,2,3 . At leading order the expression for the coefficient function is known to be ±πδ( [22,23,26,41] (and we also re-derive it in the next section). The status of the NLO expressions is cumbersome. In principle, the quark-to-quark part can be found in [27], where it has been extracted from computation of the cross-section made in [23][24][25]. However, the computations made in [23][24][25] miss certain parts and for this reason they are partially incorrect (see extended discussion in [64]). The quark-to-gluon part is evaluated in [28], however, the authors use a scheme which is different from the standard one for twist-2 computations. We return to this discussion in section 7.2.

Evolution and renormalization
The renormalized TMD, unlike usual parton distributions, depend on a pair of scales. This is a consequence of the TMD factorization procedure, which decouples the hard scattering

JHEP05(2019)125
factorization and the factorization of the soft-gluon exchanges [1,49,65,66]. As a result the evolution of TMD is given by a pair of equations where γ F and D are respectively the ultraviolet (UV) and rapidity anomalous dimensions. Eq. (3.16)-(3.17) are independent of polarization and TMD structure. The double-scale nature of factorization and evolution opens also unique possibilities for the phenomenological implementation of TMD. In particular, it allows a universal scale-independent definition of a TMD distribution [45]. At the operator level the double-scale nature of evolution is reflected by the presence of two types of divergences, namely UV and rapidity divergences. Both divergences are to be renormalized. The UV renormalization factor is known as TMD-renormalization factor Z TMD f and it can be extracted from the UV renormalization of quark (or gluon) vertex attached to the (light-like) Wilson line. The rapidity renormalization is made through the rapidity renormalization factor R f (for the proof of multiplicativity of rapidity divergence renormalization, see ref. [49]). It is compulsary that both renormalizations are made at the level of operator and thus do not depend on the hadron states. The renormalized TMD operators U f that defines the physical TMD distribution, reads where we explicitly write the scaling variables for each expression. In eq. (3.18) Z i is the renormalization of the field wave functions (Z 2 for the quark field and Z 3 for the gluon field). The TMD operators U relevant for this work are defined later in eq. (4.1), (4.3). Both renormalizations are scheme dependent. We use the conventional MS-scheme together with the dimensional regularization for the UV divergences. For the rapidity renormalization we use the conventional scheme [1,2,49,66,67] that is fixed by the requirement that no remnants of the soft factor contribute to the hard scattering. Apart from this one should worry about the overlap between collinear and soft modes in the factorization of the cross sections, which is rapidity regulator dependent. This is resolved in the δ-regulator scheme where the form of the rapidity renormalization factor is given by the inverse square root of the TMD soft factor R = 1/ √ S, see ref. [68]. This regulator has been already used several times in higher order calculations, see refs. [32,37,38,68].
The particular expression depends on the order of application of the renormalization factors. In this work, we fix the order as in eq. (3.18) and we use the δ-regularization, whose definition is given in section 5.4. Then the rapidity renormalization factor in MS-scheme reads [49]

JHEP05(2019)125
where B = b 2 /4 and a s = g 2 /(4π) 2 . The UV renormalization constant is [32] Here, we list only the renormalization constants for quark operators at one-loop, since they are the only required in the present calculation. The gluon case, as well as, two-loop expressions can be found in ref. [32]. We emphasize that the rapidity renormalization factor depends on the boost-invariant combination of scales δ/p + [65] (here, δ regularizes rapidity divergences in n-direction and thus transforms as p + under Lorentz transformations). Such a combination appears in the factorization of the cross section of DY and SIDIS and when splitting the soft factor into parts with rapidity divergences associated with different TMD distributions [2]. In the course of factorization procedure, the accompanying TMD distribution (e.g. D 1 in (2.4) or f 1 in (2.8)) gets the rapidity renormalization factor with (δ − /p − )ζ argument, where δ − regularizes rapidity divergences inn-direction. The values of p + and p − are arbitrary, however, they dictate the value of ζ andζ, since ζζ = (2p + p − ) 2 . The standard and convenient choice of scales is ζζ = Q 4 , which is the only physical hard scale appearing in the reference processes. This scale determines the value of p + and p − as momenta of partons that couple to test current, see also section 5.4. For an extended discussion see section 6.1.1 in ref. [49] and also refs. [65,66].

Light-cone OPE at leading order
In this section we present the operators that enter in the definition of the Sivers function and their LO limit for small-b, recovering the results of [41]. The notation for operators established in this section is the one used in the NLO computation.

Light-cone OPE in a regular gauge
Let us denote the operator that defines the TMD distributions in DY case as where the Wilson lines are defined as The operator that defines the TMD distributions in the SIDIS case reads

JHEP05(2019)125
Generally, the links which connect the end points of Wilson lines at a distant transverse plane must be added in both operators (for DY and for SIDIS) [69,70]. Here, we omit them for simplicity, assuming that some regular gauge (e.g. covariant gauge) is in use. In nonsingular gauges the field nullifies at infinities, A µ (±∞n) = 0 and the contribution of distant gauge links vanishes. The case of singular gauges is discussed in the following section. We point out that for convenience of calculation and presentation the operators in eq. (4.1), (4.3) are defined differently in comparison to original operator in eq. (3.4). In particular, we double the transverse distance between fields and write it in symmetric form. Also, the operators in eq. (4.1), (4.3) are defined for arbitrary light cone positions z 1 and z 2 , although the definition of a TMD distribution depends only on the difference of these points. Such a generalization does not complicate the calculation, moreover, it allows to cross-check certain results. These modifications are undone on the last step of calculation, see eq. (7.1). Note, that the operators in eq. (4.1), (4.3) define the generalized transverse momentum distributions (GTMDs) and thus the obtained OPE can be applied for generalized TMD (GTMD) kinematics as well.
It is straightforward to check that the spatial separations between any pair of fields in the operators defined in eq. (4.1), (4.3) are space-like. 1 For that reason we can replace the T -andT -orderings by a single T -ordering. This significantly simplifies the calculation and in the following we do not explicitly show the symbol of T-ordering, but we suppose that each operator is T-ordered. The possibility to reorder the fields is not a general feature, e.g. TMD operators for fragmentation functions do not allow this simplification and thus, their properties are drastically different.
At LO in perturbation theory one can treat the fields as classical fields, i.e. omit their interaction properties. In this approximation, the small-b expansion is just the Taylor expansion at b = 0. Expanding U in b up to linear terms we obtain The leading term is the same for DY and SIDIS cases Note that the half-infinite segments of Wilson lines compensate each other due to the unitarity of the Wilson line and the resulting operator is spatially compact. The derivative term in eq. (4.4) is different for different kinematics 1 There is a single exception. The fields of anti-quark operator and the attached Wilson line have light-like separations but anti-time-ordered. However, the reordering of the operator can performed in the light-cone gauge, where the gauge links vanish. The detailed discussion on the ordering properties of quasi-partonic operators can be found in ref. [71].

JHEP05(2019)125
Here, the derivative prevents the compensation of infinite segments of Wilson lines. Acting by derivative explicitly we obtain where the covariant derivative and the field-strength tensor are defined as usual The operators which contribute to each order of the small-b expansion have different geometrical twists. 2 In particular, the first term in eq. (4.8) is a mixture of twist-2 and twist-3 operators, while the second term is a pure twist-3 operator (the same for eq. (4.9)). The procedure of separation of different twist contributions is explained in details in [41]. In the present paper, we skip this discussion because the Sivers function contains only contribution of geometrical twist-3 operator. Indeed, comparing the results for DY in eq. (4.8) and SIDIS in eq. (4.9) kinematics we observe that the first terms are the same, while the last terms differ. Therefore, already at this stage it is clear that the Sivers function is made of the operators from the last terms, i.e. pure twist-3 operator.

Light-cone OPE in the light-cone gauge
Before entering a detailed description of the background field method it is convenient to formulate the derivation of the small-b limit of the TMD functions at LO in the light-cone gauge. This gauge will then be used in the following to describe the background fields. The definition of TMD operators is gauge invariant. In order to demonstrate this explicitly, let us restore the formal structure of gauge links in eq. (4.1), (4.3). We have Notice, that in order to write eq. (4.11), (4.12) we have explicitly used the fact that the Tordering can be removed. In the absence of such assumption the finite distance transverse link must be replaced by two half-infinite links [69].

JHEP05(2019)125
The light-cone gauge is defined by the condition The application of this condition removes the contribution of gauge links along vector n in the TMD operator, i.e. [zn + b, ±∞n However, the status of the transverse gauge links is unresolved. This reflects the known fact that the gauge fixing condition (4.13) does not fix the gauge dependence entirely but should be supplemented by an additional boundary condition. There are two convenient choices for boundary conditions in our case 3 retarded: g µν T A ν (−∞n) = 0, (4.14) Clearly, each of these boundary conditions is advantageous in some particular kinematics.
As so, we apply the retarded boundary condition for the DY operator. That is, the transverse link at −∞n vanishes, Whereas for the SIDIS operator we apply the advanced boundary condition. That is, the transverse link at +∞n vanishes, Thus, the operators have the same expression in different gauges. In order to recover the structure of gauge links (and hence to obtain the explicitly gauge-invariant operators), we can make a gauge transformation of the operator and subsequently replace each gaugetransformation factor by a Wilson line along the vector n to the selected boundary. The OPE in the light-cone gauge has a compact form. The leading term of eq. (4.4) is The expression for the derivative of the operator is also independent of the underlying kinematics (compare to eq. (4.6), (4.7)) and in fact, it already gives the final expression of the correction linear in b in the lightcone gauge. Let us show how the results for LO OPE in eq. (4.8), (4.9) are recovered starting from eq. (4.19). One starts rewriting eq. (4.19) explicitly in a gauge-invariant form. With

JHEP05(2019)125
this purpose we replace the partial derivatives in eq. (4.19) with covariant derivatives, see eq. (4.10), by adding (and subtracting) appropriate gluon fields To proceed further, we have to recall the used boundary condition in the form in the advanced light-cone gauge, (4.22) where x is an arbitrary point. Substituting these expressions into eq. (4.20) we arrive to eq. (4.8), (4.9).

Light-cone OPE for the gluon TMD operator
The small-b OPE at NLO contains both quark and gluon collinear operators. The gluon operators that appear in a quark TMD are those that would appear in the small-b OPE for gluon TMD operator. Since this expansion for gluons has never been considered in the literature we briefly describe it here. We define the gluon TMD operator as (compare to eq. (4.1), (4.3)) where the Wilson lines are in the adjoint representation, i.e. the contraction of the color . The parametrization of the corresponding TMD matrix elements can be found e.g. in [36].
The evaluation of the light-cone OPE for gluon operators is totally analogous to the one made in section 4.1. The only difference is that the quark fields are replaced by F +µ and the covariant derivatives act in the adjoint representation. We obtain the following analog of eq. (4.8), (4.9) where the covariant derivatives are in the adjoint representation. Alike the quark case, the only operators which contribute to the Sivers function are given in the second lines of these equations.

JHEP05(2019)125
5 Light-cone OPE at next-to-leading order The object of this section is to introduce the calculation of OPE for U up to terms linear in b at NLO in perturbation theory. The OPE is realized when b 2 Λ −2 and it looks like where C are the coefficient functions which depend on b 2 logarithmically, n enumerates all available operators at this order and ⊗ is some integral convolution in variables z. Here, we also introduce the notation for the coupling constant a s = g 2 /(4π) 2 and for the logarithm combination that typically enters in perturbative calculations The variable µ represents the scale of OPE. The complexity of the computations for OPE increases drastically passing from LO to NLO in perturbative QCD. In the latter case one cannot omit the field interactions, as it happens in ordinary Taylor expansion as in eq. (4.4). The propagation of fields between different points is responsible of the fact that eq. (4.4) is to be modified in the presence of interactions which can pick up additional fields from the vacuum. Moreover, the OPE with interacting fields contains all possible operators with correct (as prescribed by the theory) quantum numbers.
An additional difficulty in the present calculation is that only a few computing methods have been tested on higher twist operators. For the twist-2 TMD operators the matching procedure is simple because in the OPE a TMD is in a one-to-one correspondence with the on-shell matrix elements over collinear-parton states. In the case of higher twist operators the only matrix elements of collinear partons are not suitable for obtaining the matching coefficients, since a transverse component of momentum is needed to carry the operator indices. It can also happen that a matrix element over collinear partons is not infrared-safe and it requires an additional regularization with a (specific) separation of pole contributions, see e.g. [24,73]. These problems are solved using off-shell matrix elements, which is significantly more complicated, due to the fact that the higher-twist operators mix with each other via QCD equations of motion and that off-shell colored states are not generally gauge invariant. The best method to evaluate the coefficient functions at higher twist results to be the background-field method. At the diagram level, the method is equivalent to the evaluation of a generic matrix elements, with the main difference that the result of the calculation is given explicitly in operator form. The method allows to keep track of gauge properties and significantly simplifies the processing of equations of motion. Altogether, these properties make the background-field method very effective for higher twist calculations. In the following we concentrate on this method, for which we provide a brief general introduction in section 5.1. The details of the calculation are given in section 5.2-5.3. The JHEP05(2019)125 treatment of rapidity divergences and renormalization needs a special discussion which is provided in section 5.4-5.5. All the computation is done for the DY case, but the passage to the SIDIS case does not present particular difficulties and the comparison of the two cases is provided in section 5.6.

OPE in background field method
The background-field method is founded on the idea of mode separation. The operator matrix element between states S 1 and S 2 is defined as where the letter Φ represents any QCD field {q, q, A µ }, Ψ S is the wave function of the state S and S is the action of QCD. Let us split the fields into the "fast" and "slow" (or "short-correlated" and "long-correlated" in position space terminology) components, as Here, the "fast" modes φ have momentum p > µ, while "slow" modes have momentum p < µ. The (factorization) scale µ is not explicitly defined but it is large enough to guarantee the convergence of the perturbative series. In the following we omit the argument µ for the fields. We postulate that physical states (hadrons) are built from the "slow" components, i.e. Ψ S [Φ] = Ψ S (ϕ) so that eq. (5.3) turns into In this expression the integral over "fast" components can be evaluated and the expression for observables has the following effective form The mode separation then assumes that the "slow" fields can be treated as free-fields on distances x 2 . This hypothesis is typical for effective field theories (see for instance [74][75][76] for the application of similar concepts in soft collinear effective theory (SCET) or [48] for TMD factorization at small-x). One can interpret the construction in eq. (5.6) as an evaluation of the perturbative QCD fields in a general parton background, which gives the method its name. After the integration of the "fast" fields in eq. (5.6), the resulting effective operator is then expanded using free-theory twist expansion, as it was done in section 4. It is important to realize that in background calculation the result is gauge-invariant and satisfies QCD equations of motion at each step of the evaluation (even for each diagram). The result then is also JHEP05(2019)125 universal, that is, it is valid for all states (we do not even specify them) and thus, we can operate only with fields ϕ. Essentially, the background field methods is concentrated in a single definition, eq. (5.6).
The background field method is an essential tool of the modern small-x calculations. In this case the separation of kinematic modes is based on the strong ordering in rapidity, which is a distinctive feature of the small-x kinematics. To define different modes one has to introduces a rapidity cutoff parameter σ, which separates "fast" (p + < σ) and "slow" (p + > σ) fields based on the value of the longitudinal component of the momenta p + . Instead of the twist expansion the calculation of the functional integral over "fast" fields (5.6) is now performed in the so-called shock-wave approximation. Since the procedure of separation of modes is quite general, the method can incorporate different kinematic regimes, which has been recently employed in [44,48].

QCD in background field
The QCD Lagrangian reads where the covariant derivative and F µν are defined in eq. (4.10). Following the mode separation we split the fields as A µ → A µ + B µ and q → q + ψ, where ψ and B µ are "fast" fields and q and A µ are "slow" (background) fields. The separation of modes in the main body of the Lagrangian is straightforward, but the gauge fixing term should be considered with caution. The ultimately convenient point of the background field method is the possibility to choose different classes of gauge fixing for different modes. The detailed discussion on gauge fixing in QCD with background method is given in [77,78]. We choose the most convenient combination of gauges for our task. For "fast" components we use the background-field gauge, which is the analog of covariant gauge fixing in the usual QCD perturbation theory. In particular, the propagator has the familiar form where α is a free parameter. For background fields we use light-cone gauge eq. (4.13) with retarded boundary condition eq. (4.21) for DY operators and advanced boundary condition eq. (4.15) for SIDIS operators.
In background field formulation, the Lagrangian of QCD splits into three parts where the first two terms are usual QCD Lagrangians built for particular modes and the last term is the "fast-slow" modes interaction, δL = g q Bψ +ψ Bq +ψ Aψ + δL ABB + δL AABB + δL ABBB ,  Figure 2. Example of diagrams that vanish in our scheme of calculation. Diagrams (1) and (2) vanish due to A + = 0. Diagram (3) is proportional to 1 − α and vanish at α = 1. Diagrams (4) and (5)  where δL ABB (δL ABBB ) is the interaction of a single field A µ with two (three) fields B µ and δL AABB is the interaction of two fields A µ with two fields B µ . These terms depend on the gauge fixing condition. For our calculation we need only the δL ABB interaction. It reads The rest of the terms can be found in [77]. In the following, we consider the case α = 1, which corresponds to the "Feynman gauge version" of the background gauge.

Evaluation of diagrams
We would like to evaluate the effective operator in eq. (5.6) up to twist-3 corrections, at a s order. The computation proceeds expanding the interaction part of the exponent in eq. (5.6) and integrating the "fast" modes by the Gaussian integration formula. i.e we obtain the Feynman diagrams with background fields as the external sources. The divergences of loop-integrals are regularized by dimensional regularization and δ-regulator as in [32,37,68], which allows us to use renormalization factors of eq. (3.19), (3.20).
• The UV and collinear divergences are regularized by the dimensional regularization with d = 4 − 2 . We use the conventional MS scheme with (e −γ E /4π) factor for each a s = g 2 /(4π) 2 .
• The rapidity divergences are regularized by δ-regularization, defined in [32]. See detailed discussion in section 5.4.  The diagrams with the background field coupled directly, or through a sub-graph, to the Wilson lines, such as diagrams diagrams (1) and (2)  The rest of contributions are conveniently ordered with respect to the number of background fields. Since the number of fields in the operator is less or equal to the twist of the operator, only the diagrams with two or three background fields contribute at a specific power of OPE. There are 6 non-vanishing diagrams at this order (4 of them have charge conjugated diagrams). The diagrams with two quark fields are shown in figure 3. The diagrams with two quark and gluon fields are shown in figure 4. There are also diagrams (with two and three field) that mix the quark operator with the gluon operator, as in figure 5. In principle, there could be also diagrams with more gluon insertions, which are to be combined with a single gluon insertion into a gauge invariant combination F µν (with both transverse indices). However, we recall that only F µ+ contributes to operators of twist-3 and in the light-cone gauge F µ+ = −∂ + A µ . Thus, such diagrams should not be considered at twist-3 accuracy.

JHEP05(2019)125
The process of diagrams computation is almost elementary. Let us show here the evaluation of the simplest diagram, diagram A. A similar evaluation (with the only difference in the path of Wilson lines) is presented in [47], which allows an instructive comparison. Also, in ref. [79] the diagram A (and the diagram B) has been calculated in momentum space for all values of b, which allows to match the scheme factors. Importantly, the diagram A plays a special role in TMD physics, since it is the only diagram which has rapidity divergences as discussed in the next section. In appendix B we also present a detailed explanation of the computation technique for one of the most difficult diagrams (diagram E).
The diagram A comes from the following contraction of fields in eq. (5.6) 14) where the factor in the square brackets is part of the Wilson line and the factor in the round brackets is part of δL (see eq. (5.12)). Note, that here we consider the DY operator, which dictates the integration limits over σ. The propagators in dimensional regularization where the gluon propagator is taken with α = 1. Explicitly, the diagram reads where we have simplified gamma-and color-algebra.

JHEP05(2019)125
In order to evaluate the integral over y, we recall that the background field is a classical field and the expressions of the form eq. (5.18) should be understood as a generating function for the whole tower of twist-operators. Therefore, we are allowed to make the twist-expansion under the loop-integral sign. In the considered case, we make the Taylor expansion at y µ = 0, q(y + x) = (1 + y µ ∂ µ + y µ y ν /2 ∂ µ ∂ ν + . . .)q(x). The loop-integration can be taken for each term in the series. The necessary loop-integral reads where g s is a completely symmetric composition of metric tensors. For an odd number of indices the loop-integral is zero. Metric tensors produced by loop-integration can contract derivatives, vectors b µ and n µ . Each term in the series should be sorted with respect to its twist. The thumb rule is that each transverse derivative increases the twist of an operator, but the light-cone derivative does not. Thus, the higher derivative term could be dropped. Alternatively, one can count the power of the vector b. In our current calculation, we evaluate up to terms linear in b. Note, that strictly speaking we should also expand fields in the powers of b, but it does not affect the diagram evaluation and can be postponed until later stage.
The expression in eq. (5.18) has a very simple numerator, which is linear in y. So, only odd terms of Taylor series contribute. Moreover, already the second term in the expansion, the one with three derivatives ∼ y µ y ν y ρ ∂ µ ∂ ν ∂ ρ q/3!, vanishes after contraction. Indeed, it generates ∂ + ∂ 2 q, that is at least twist-4 (on top, this contributions is proportional to b 2 ). Therefore, we consider only the single-derivative term of the series and obtain (5.21) Charge-conjugated diagrams can be evaluated independently, or obtained from the direct diagrams by reversing the order of field arguments and with the replacement z 1 ↔ z 2 . I.e. the diagram A * reads (5.22) These expressions contain rapidity divergences, which are discussed in the next section. All other diagrams are evaluated similarly.
The expression for the diagram A in SIDIS kinematics is almost identical to DY case. The only modification is the lower limit for integration over σ iin eq. (5.14), which must be changed to (+∞) for the SIDIS case. Such a replacement does not affect the evaluation of the diagram and thus the analog of eq. (5.21) in the SIDIS kinematics is obtained replacing (−∞) by (+∞).

Treatment of rapidity divergences
The rapidity divergences appear due to the localization of a gluon field in the transverse plane at the light-cone infinity [49]. There are three diagrams that have interactions with a Wilson line and thus, that are potentially rapidity divergent. These are diagrams A, C and D. However, according to the general counting rule [49], only the diagram A is rapidity divergent. In this section, we demonstrate how rapidity divergences arise in background field calculation. The fact that diagram A is rapidity divergent is well-known. It has been calculated in numerous works, see e.g. the discussions in refs. [1,2,33,37,79]. In all these works, the diagrams have been calculated in momentum space, where the loop-integral is explicitly divergent. In our case the loop-integral in the diagram A has been evaluated without any problems, however, as we demonstrate shortly, the result of the integral in eq. (5.21) is ambiguous and the resolution of this ambiguity gives rise to the rapidity divergence.
The ambiguity in diagram A is hidden in the argument of the quark field. Indeed, its value at point (α, σ) = (0, −∞) depends on the path used to approach this point. In particular, we find In order to resolve the ambiguity, the dependence on α and σ should be separated. Let us rewrite eq. (5.21) as where we set b in the arguments of the fields to 0, for demonstration purposes (the presence of b in the argument does not change the procedure of rapidity divergence elaboration and we restore it at the end of the section). In eq. (5.25) the ambiguity at (0, −∞) is enforced by the divergence of the integrand at α → 0. We isolate the ambiguous part of the diagram splitting the integration into two parts

JHEP05(2019)125
The regular part does not contain the problematic point and thus the order of integration is irrelevant. Taking the integral over σ by parts, we obtain This expression is regular at α → 0 since z α=0 21 = z 2 and it is a position representation form of the well-known "plus"-distribution.
To evaluate the singular part we introduce a regulator. Here, we use the δregularization, which consists in the following modification of the Wilson line where δ > 0. Such modification breaks gauge invariance by power corrections and therefore, only the limit δ → 0 is gauge invariant. For the detailed discussion of this issue we refer to [68]. In δ-regularization the interaction vertex with Wilson line as in eq. (5.14) receives a factor e σδ , which passes through all calculation untouched and appears in the integrand of eq. (5.28). With such a factor the ambiguity is resolved because the integrand is zero at σ → −∞ irrespectively of the value of α. In order to evaluate it, we make the change of variable τ = α(σ − z 2 ) and we obtain The integral over α is singular in the limit δ → 0 The logarithm of δ represents the rapidity singularity. In order to evaluate the construction (5.31) explicitly we rewrite q(n(z 2 + τ )) = e iτ (n·pq) q(nz 2 ), (5.33) where (p q ) µ = −i − → ∂ µ is the momentum operator acting on the quark field. Then the integral (5.31) can be taken formally

The singular part of the diagram A is
This expression literally (including the complex part) coincides with the calculation of the rapidity divergent part in δ-regularization in the momentum space [32,79].

JHEP05(2019)125
The same method can be used when the position of fields is shifted by b. The result for the diagrams A can be written in the form is the momentum operator acting on the anti-quark field. Note, that we have added a total shift ∼ αb to the first operators, to make the expression more compact. Including such a shift does not affect the expression for the TMD distribution, since it is proportional to the difference between the momenta of initial and final states. Notice that while in TMD distributions this difference is null, it is not the case for generalized TMD distributions (GTMD).

Renomalization
Performing the evaluation of all the other diagrams in a similar manner (see an explicit example for diagram E in the appendix B), we get the OPE for the bare TMD operator, which schematically can be written as where the indices i enumerate all operators that enter the expression, ⊗ is some integral convolution in the light cone positions variables z, and 1 i = 1(0) for the operators that contribute at LO (otherwise). Here, the coefficientsC depend on , δ and light-cone positions z 1,2 , the dependence b is concentrated entirely in the factors b 2 . The explicit form of each term in eq. (5.38) is rather lengthy. We present it diagram-by-diagram (since there is practically no simplification in the diagram sum) in appendix C. The bare OPE eq. (5.38) requires renormalization as in eq. (3.18), i.e. both sides of eq. (5.38) are to be multiplied by Z −1 2 Z TMD q R q , whose LO expressions are given in eqs. (3.19) and (3.20). We recall that this renormalization is universal, in the sense that, it is common for all terms of the small-b expansion and for various Lorentz structures of TMD operator. An example of this universality is already provided by the diagram A, discussed in the previous section. Indeed, according to eqs. (5.36), (5.37) the rapidity divergence enters the expression multiplying the bare TMD operator U (z 1 , z 2 ; b). In other words, we can extract the rapidity divergent terms from eq. (5.38) and write it as z 2 ; b) + a s (rapidity finite terms),

JHEP05(2019)125
where p + is the momentum of the parton. 5 Multiplying it by R q , given in eq. (3.19), the logarithm of δ cancels for all terms of the small-b expansion to all orders of . To our best knowledge this is the first explicit demonstration of rapidity divergences renormalization of TMD at higher twists. The renormalization of eq. (5.38) makes this expression finite. However, coefficientsC contain singularities in . These singularities are collinear singularities and are compensated by UV behavior of light-cone operators. To remove them explicitly we replace the bare operators on r.h.s. by the renormalized operators O bare = Z −1 ⊗ O R (µ). The factor Z −1 being convoluted with coefficient function removes the remaining poles in .
Concluding, the renormalized expression for small-b OPE has the form where the operators are renormalized at scales µ and ζ and we have set the scale of renormalization for light-cone operators to be the same as for TMD operator for simplicity. The expression for the coefficient functions at NLO for any twist can be written as where the rapidity divergences inC tw-n i are explicitly canceled and we have expressed the renormalization factors in MS-scheme, see eq. (3.19), (3.20). With this formula it is simple enough to obtain the coefficient functions for the small-b OPE in coordinate space. However, they are of little use, since in practice, one operates in terms of momentum fractions x and the corresponding collinear distributions. The transition to the distribution and the corresponding expressions are discussed in section 7.

Difference in the evaluation of DY and SIDIS operators
The operators for the DY and SIDIS initiated TMD distributions differ by the geometry of Wilson lines. This dependence influences the calculation in two aspects. The first one is the explicit expression for diagrams that have interaction with Wilson line, such as diagrams A, C and E. The second one is the preferred boundary conditions for the gauge fixing for the background field, the retarded for DY-type operators, eq. (4.14) and advanced one for SIDIS-type operators, eq. (4.15). Let us note, that boundary conditions do not influence the process of diagram evaluation, but rather the procedure of recompilation of the expressions in terms of gauge-invariant operators, see eq. (4.21), (4.22). 5 In GTMD case, initial and final partons have different momenta. We cannot specify which momentum appears in the soft factor in the absence of the process and factorizaton theorem which would fix the kinematic scales. Nonetheless, in any case, the rapidity divergences are renomalized by factor Rq, but possibly leave extra terms of the form ln(p + q /p + q ).

JHEP05(2019)125
In both cases the only difference between expressions for DY and SIDIS kinematic is the sign of infinity in the integration limits. I.e. a term contributing to OPE for DY operator has the form DY : (5.42) whereas the same term in the OPE for SIDIS operator is SIDIS : Here, dots indicate various compositions of fields, functions and integrals that do not change. Such a structure is already evident at the tree level order, as one finds comparing eq. (4.8) and eq. (4.9). As we will see, in terms of distributions this difference will result into a different global sign of the coefficient functions.

Definition of collinear distributions
In order to proceed further we need to evaluate the hadronic matrix element of OPE. This procedure is scheme dependent in the following sense: We recall that our computation is made in dimensional regularization and after the renormalization procedure the expressions are finite for → 0. Nonetheless, the finite part of the results depends on and moreover the expressions so obtained have a tensor structure which also depends on the number of dimensions. Thus, in order to completely define the scheme, we should specify the order of operations with respect to the limit → 0. There are two major options. The first one consists in setting → 0 before the evaluation of matrix elements (i.e. at the level of operators) and defining the distributions in 4-dimensions. The second one is to define the distributions in d-dimensions and to perform the limit → 0 after the evaluation of matrix elements. Both schemes have positive and negative aspects. In fact, this problem has not been accurately addressed in the TMD-related literature. Checking the traditional calculations of TMD matching at twist-2 [1,32,33,80], we conclude that the second scheme is used in all these cases. Therefore, to be consistent with earlier calculations, we use the second scheme. Nonetheless, we have also performed the calculation in the first scheme and we have found that for the Sivers function some differences appear only in the quark-gluon mixing diagrams. These differences are -suppressed and thus the expression for the NLO matching coefficient is the same in both schemes. In appendix C.2 we present the expressions for diagrams with an explicit designation of the origin of which allows to re-derive the complete result.
In the rest of this section we define the twist-2 and twist-3 matrix collinear distributions and evaluate the TMD matrix element over the small-b OPE obtained in the previous section.

Quark distributions
The forward matrix elements of the light-cone operators are parametrized by collinear distributions, or parton distribution functions (PDFs). For this work we need the forward JHEP05(2019)125 matrix element of twist-2 and twist-3 operators only. We start discussing the required quark distributions, while the gluon distributions are treated in the next section.
We emphasize that all indices appearing in eq. (6.2), (6.3) are transverse. The forward matrix element depends only on the distance between fields, but not on the absolute position. A shift of the common position can be written as a total derivative of the operator, which is a momentum transfer between initial and final states. It is the consequence of the quantum-mechanical definition of the momentum operator: where O is any operator. It allows to move each term of OPE to a convenient position and to drop terms with total derivatives. Altogether it significantly simplifies the evaluation.
To resolve the total derivative terms one should consider a non-forward kinematics, that defines GTMD distributions and generalized parton distributions. In the following, we consider each operator in a convenient point. The standard unpolarized PDF comes from the forward matrix element of O γ + , The PDF is non-zero for −1 < x < 1 and where q(x) andq(x) are the quark and anti-quark parton densities in the infinite momentum frame. The definition of twist-3 PDFs is more cumbersome since they depend on two momentum fractions x i and they have a different interpretation relative to a domain of variables. The notation simplifies considerably if one writes the twist-3 distributions as a functions of three momentum factions x 1,2,3 . Each momentum fraction is the Fourier conjugate of the corresponding coordinate z 1,2,3 . We define where M is the mass of the hadron and the integral measure is defined as Such an integral measure automatically takes into account the independence of forward matrix element on the total shift, eq. (6.5).
The functions of three variables T (x 1 , x 2 , x 3 ) have several symmetry properties. It is natural to consider them as functions defined on the hyperplane x 1 + x 2 + x 3 = 0, since only this domain contributes to forward matrix element. The domain can be split into six regions, corresponding to different signs of the variables x i , see figure 6. Each of these regions has a different interpretation in parton language: depending on the sign of x i the corresponding parton is either emitted (x i > 0) or absorbed by a hadron [71], as it is shown schematically in figure 6.
The functions T and ∆T are not independent and mix under the evolution. In ref. [64] it is shown that there exist a combination of T and ∆T which evolve autonomously, but we do not use it in this work.
The definitions in eq. (6.8), (6.9) are understood in d-dimensions. That is, the vector s µ is some vector that turns intos µ = µν T s ν when → 0. The definition of the nonperturbative functions T and ∆T coincides 6 with the one made in [41]. Also it is coincides (up to a factor M ) with the definition given in [64]. The articles [24][25][26][27]81] use a less convenient two-variable definition, which is related to the definition with three variables by (here we compare to [81]) To compare the definitions that we have used, consider the 4-dimensional relation γ + γ µν T = −i µν T γ + γ 5 .

JHEP05(2019)125
x 1 x 3 Using time-reversal and hermiticity, one can show that the functions T and ∆T are real and obey the property These properties are central in the following calculation. They represent the simple statement that gluon is a neutral particle. In barycentric coordinates the time-reversal transformation turns the picture upside down as shown in figure 7. Therefore, the function T (∆T ) is (anti)symmetric with respect to the horizontal line x 2 = 0 (given by red dashed line in figure 6). PDFs defined on these lines are known as Qui-Sterman distribution. They play a special role in TMD physics, since they provide the LO matching, as it is shown in the next sections.

JHEP05(2019)125
Generally, the decomposition (6.17) should additionally contain a symmetric-traceless component. The corresponding distribution is however zero in forward kinematics. The distributions g(x) and ∆g are conventional unpolarized and polarized gluon distributions. There is no standard parametrization for the twist-3 gluon operator. Here we introduce the parameterization that is convenient for our calculation. It is different (but equivalent) to other parameterizations used e.g. in [28,64,73,[81][82][83]. The main difference is that we use two distributions with different properties, instead of a single one. We have The overall minus sign is set in order to have a simple relation to the distributions defined in [64,81]. The foundation for this parameterization is discussed in appendix A. Despite its cumbersome appearance, this parameterization has some natural properties, that significantly simplify the calculation. Time-reversal and hermiticity imply that (6.20) which reflects the fact that the gluon is a neutral particle and thus, "anti-gluon" distribution is equal to the "gluon" one. Due to the permutation properties of the operator, the distributions are highly symmetric. Namely, the distribution G − (G + ) is (anti-)symmetric with respect to permutation of any pair of arguments The distribution Y − (Y + ) is (anti-)symmetric with respect to to permutation of x 1 and x 3 , Additionally, the distributions Y ± obey a cyclic rule The graphical representation of these transformation in barycentric coordinates is shown in figure 7. The symmetry properties in eq. (6.20)-(6.23) significantly restrict the functional form of distributions. In particular, the functions G ± are entirely defined by its values in the region 0 < x 1 /2 < −x 2 < x 1 . Whereas the functions Y ± are defined by its values in the region 0 < x 1 /2 < −x 2 < 2x 1 . Graphically these relations are demonstrated in figure 8.
The functions G and Y mix under evolution. In many aspects they are similar to the functions T and ∆T of the quark case. Nonetheless, the parametrization given here grants many simplification during calculation, because each of the structures in eq. to an irreducible representation of the Lorentz group. For that reason these structures enter the dimensionally regularized expression with different -dependent factors. The relation of the functions G ± and Y ± to the functions used in [64] is It is important to note that this comparison is made at = 0, because at = 0 the comparison is impossible. The inverse relation is .
(6.26) Therefore, our basis is equivalent to a decomposition of a general 3-variable function into antisymmetric and cyclic components. The reduction of three-variable notation used here and in [64] to the two-variable notation used in [73,81,83] is the same as for quarks in eq. (6.11). In [28,82] a different notation is used, which again can be related to our functions at → 0. For a detailed comparison we refer to the discussion in [82].

Small-b expansion for unpolarized and Sivers distributions
Having at hand the parametrization of the matrix elements we can obtain the matching coefficient for TMD distributions to collinear distributions. The standard protocol to achieve this is the following. We derive the TMD distribution using the operators U (compare eq. (3.4) and eq. (4.1)), Next, we substitute the expression for OPE eq. (5.40) into the matrix element and we evaluate the Fourier transform using the parameterization for collinear matrix elements.

JHEP05(2019)125
In this way we obtain the small-b expansion for the TMD distribution Φ [γ + ] . Collecting all terms with appropriate Lorentz structures, eq. (3.5), we obtain the small-b expansion for individual TMD distributions, in our case these are the unpolarized and Sivers distributions. The procedure is rather straightforward and it can be performed for each diagram independently. In section 7.1 we give several comments on the evaluation of it, while the final result is presented in section 7.2. The results for individual diagrams are presented in appendix C.2.

From operators to distributions and tree level results
The tree level order of OPE is given in eq. (4.4). Applying the transformation in eq. (7.1) and using the definitions in eq. (6.6), (6.8) we obtain 8 To evaluate the second line we use the following trick. We consider the two integrals over τ separately and change the variables x 1,2,3 → −x 3,2,1 , τ → −τ in the second one. The integrand is invariant under such transformation, due to the property in eq. (6.12) while the limits of integration change to (−z, +∞). As a result the two integrals over τ can be combined into a single integral over τ from −∞ to +∞, Let us stress that the dependence on the intermediate gluon position τ disappears. This property holds for all diagrams and allows to combine seemingly cumbersome expressions into simple ones. It is the result of time-reversal symmetry. Therefore, to observe such cancellation, one should collect a diagram with its conjugated. I.e. the dependence on the intermediate point cancels in combination of diagrams A and A * , C and C * , E and E * , D and D * . The rest diagrams are self-conjugated. The time-reversal symmetry is also responsible of the different relative sign in the matching of DY and SIDIS operators. Indeed, since the integrands are symmetric under time-reversal, the intermediate point cancels and the only thing that matters is a common global sign. This sign is necessarily different between DY and SIDIS expressions, due to different boundary conditions holding in two cases. In other words, all gluon fields in the DY case are connected to −∞ and the corresponding integrals are −∞ . Whereas for SIDIS they are connected to +∞ and corresponding integrals are +∞ = − +∞ . In this way, we observe the well-known relation observed a long ago [50]. Coming back to eq. (7.3), the integrals over τ and z decouple and both produce a δ-function. We obtain Using the delta-function in the definition of [dx] in eq. (6.10), the integrals over x's can be evaluated, This expression gives the leading order matching for unpolarized and Sivers TMD distributions in eq. (3.5) where + sign is for DY operator and − sign is for SIDIS operator. The same procedure with minimal modifications can be done for each term of OPE also at higher orders. In appendix C.2, we present the expressions for each diagram at NLO and the corresponding final result is given in the next section. The T and ∆T distributions defined on the line x 2 = 0 are generally known as Efremov-Teryaev-Qui-Sterman (ETQS) distributions [84,85]. In the next section, we write explicitly the evolution equation for these functions in eq. (7.15). Here, we just remind that the ETQS functions are not autonomous, meaning that their evolution involves the values of these functions in a full domain of x 1,2,3 . However, we have found that the finite part 9 of the small-b matching coefficient involves only ETQS functions.
The line x 2 = 0 plays a special role in the matching of TMD distributions as shown in red in figure 6. In the parton picture the distributions defined on this line can be interpreted as "gluonless". Indeed, while the quarks are normally emitted and absorbed by a hadron (as in usual twist-2 distribution), here the gluon is in an "intermediate state" nor emitted, nor absorbed, but smoothly distributed all-over the space. This picture also supports the interpretation of variables x, as the parton momenta measured as the fraction of the hadron momentum. In such a momentum picture, the line x 2 = 0 corresponds to null-energy gluon.
The symmetry properties of the distributions allow some simplification along the line x 2 = 0. In particular, the ∆T function (which in principle appears when x 2 = 0) does not explicitly contribute to the matching due to eq. (6.13) ∆T (−x, 0, x) = 0, (7.10) but it will appear in the evolution of the ETQS functions, as we show in the next section. 9 Following common terminology, we name C(Lµ = 0) as the finite part of the coefficient function C(Lµ), whereas C(Lµ) − C(Lµ = 0) is named the logarithmic part.

JHEP05(2019)125
Due to the anti-symmetry property the function G ± when one of their arguments in 0, they can be expressed as ETQS distributions The functions Y ± at x i = 0 also can be expressed via ETQS distributions, but with a different rule The application of these rules significantly simplifies the calculation.

Results at NLO
The NLO matching of Sivers TMD distribution at small-b reads where on the right hand side all distributions are defined at the scale µ,ȳ = 1 − y and l ζ = ln µ 2 ζ . (7.14) Eq. (7.13) is written for the DY definition of the TMD distribution. In the case of the SIDIS definition the factor π in the first line should be replaced by −π. The symbol P ⊗ T represents the evolution kernel for the function T (x 1 , x 2 , x 3 ) on the x 2 = 0 line. It reads where the plus-distribution is defined as usual (f (y)) + = f (y) − δ(ȳ)

JHEP05(2019)125
Note that the gluon part is regular for ξ → 0 since functions G ± and Y ± vanish at x 1,2,3 = 0. In eq. (7.13), (7.15) the integrals over y and ξ together with the δ(x − yξ) reproduce the Mellin convolution. This convolution naturally appears during the calculation and it is defined for the whole range of x, (−1 < x < 1) (and we recall that the anti-quark TMD distributions are given by values of x < 0, see definition in eq. (3.13)). It should be understood literally (7.17)

Discussion and comparison with earlier calculations
The evolution kernel in eq. (7.15) derived by us agrees with the known results in [64,86]. Also, the matching of the twist-2 part coincides with earlier works exactly i.e. as the whole function of . Altogether this provides a very strong check for the whole procedure and results derived by us.
It is instructive to compare eq. (7.13) to the small-b expansion of the unpolarized TMD distribution, which we have also reevaluated in this work to provide an additional cross-check. Following the notation of this work, it reads [1,2,32,79] (7.18) where the evolution kernel is .
One can see that eq. (7.13) and eq. (7.18) have a very similar structure and, more precisely, the finite parts 9 of these expressions have the same y-behavior. It is possible that this fact indicates some hidden correspondence which is to be understood in the future. Let us note that our calculation scheme (namely, the definition of distributions in ddimensions, as it is discussed in section 6) affects only the quark-from-gluon terms. In appendix C.2 we present these mixing diagrams with the explicit designation of 's from different sources. We have found that the scheme dependence enters the expressions via factors ∼ /(1−˜ ), where is the parameter of dimension regularization and˜ is the parameter of d-dimensional definition of distributions. Therefore, the current choice of scheme influences only the -suppressed terms of the final expression and thus it can contribute only from NNLO. Let us mention, that the same observation (namely, the suppression of JHEP05(2019)125 the details of the d-dimensional definition in the NLO coefficient function) is valid also in the case of the helicity distribution, which contains γ 5 -matrix, see ref. [33].
The expressions for coefficient functions in eq. (7.13)-(7.18) are given for a general scale setting (µ, ζ). For practical applications, it is convenient to use the ζ-prescription [30,45], where a TMD distribution is defined at the line ζ = ζ(µ). This line depends on certain boundary conditions that can be uniquely fixed and which define the so-called optimal TMD distribution, see a detailed discussion in [45]. The line ζ µ is universal for all TMD distributions and on this line the expression for the coefficient function simplifies. Namely, in eq. (7.13), (7.18) one should set in ζ-prescription: It is easy to see that in ζ-prescription the TMD distribution is (naively-)independent on the scale µ.
The matching coefficient for Sivers function can be found in the literature scattered in different works: the quark-to-quark part has been deduced in [27] and the quark-to-gluon part has been evaluated in [28]. In both references the derivation of the matching coefficient has been made indirectly, refactorizing the factorized cross-section for SSA with the help of known matching for unpolarized TMD distribution. In our approach we evaluate the Sivers function directly, which grants us a better control over factors and schemes. Let us compare and comment on these works one-by-one.
In [27] the quark-from-quark part of the matching (the first term in square brackets in eq. (7.13)) is derived. A comparison with this work shows a disagreement in the logarithmic part, 9 but an agreement in the finite part (i.e. compare eq. (7.15) with eq. (12) of [27]). The origin of this difference is clear. The calculation of ref. [27] is based on the fixed-order calculation of SSA made in [23,25]. The latter considers only gluon-pole contributions and misses a quark-pole contribution, which roughly corresponds to our diagrams D (see detailed discussion in [64,86,87]), which in turn, contributes only to the logarithmic part of matching coefficient, i.e. second line of eq. (7.13)).
In [28] the quark-to-gluon matching has been calculated. The result is presented using the functions N (x 1 , x 2 ) and O(x 1 , x 2 ) which can be related to a combination of the functions G and Y , similar to eq. (6.25), (6.26) (for a comparison of the definitions of these functions see [82]). In particular, . Using these relations and comparing with eq. (44) of [28] we find a complete agreement with the logarithmic part (which is expected since it is given by the evolution kernel), but disagreement in the finite part. We claim that this disagreement is the result of a different parametrization of the gluon PDF used in [28]. Indeed, according to eq. (39) of [28], the authors of [28] define PDF in d-dimensions, but they do not decompose the tensors to irreducible representations and therefore -dependent pre-factors of PDFs are different.
In fact, the method of ref. [28] could be inconsistent beyond LO. Indeed, the parameterization of the twist-3 matrix element used by [28] is based on the 4-dimensional relation JHEP05(2019)125 (see also [82]) g µν αβρδ = g µα νβρδ + g µβ ανρδ + g µρ αβνδ + g µδ αβρν , (7.21) which is used to reduce the number of degrees of freedom. In d-dimensions the relation in eq. (7.21) is not valid. Instead one has to use the decomposition to irreducible components (see discussion in appendix A), as it is made in this work. In order to consistently use the parameterization based on eq. (7.21), the limit → 0 must be taken prior to the application of the parameterization, i.e. the approach one, as it is discussed in the introduction to the section 6. Contrary, the authors of [28] have used a 4-dimensional parametrization within the d-dimensional calculation. There is no apparent contradiction at one-loop level, however, it can appear at higher perturbative orders.

Conclusion
We have derived the matching of the Sivers function to collinear distributions at NLO. The final result is given in eq. (7.13) both for quark-to-quark and quark-to-gluon channels. The final result can be compared to the known calculations piece by piece: the logarithmic part agrees with the evolution kernel derived in [64,86], the finite quark-to-quark part agrees with the one derived in [27] and the finite quark-to-gluon part is in disagreement with [28]. In section 7.3 we argue that the disagreement between our calculation and the calculation made in [28] is due to the difference in calculation schemes. The peculiarities of our calculation scheme are given in beginnings of section 5.3 and section 6. We also argue that our calculation scheme is equivalent to the scheme commonly used for twist-2 TMD matching, which we also confirm by comparing the twist-2 part of our calculation, eq. (7.18).
In contrast to all previous evaluations of Sivers function we do not consider any process but derive it directly from the definition of the TMD operator. The evaluation presented here is in many aspects novel, especially for the TMD community. Our calculation is made at the level of operators within the background field method which provides the most complete type of calculation and in the text we have described many details. In particular, for the first time, we explicitly demonstrate the appearance of rapidity divergences at the operator level, section 5.4 and explicitly demonstrate its renormalization at all twists of collinear OPE (section 5.5). We also demonstrate the appearance of the famous sign flip for Sivers functions defined for DY and SIDIS, eq. (2.9).
The method outlined in this work can be used also for the evaluation of the other leading order distributions which match on collinear twist-3 operators. All intermediate results of the calculation are presented in the appendix. Since the calculation is made at the level of operators, it contains the complete information on small-b OPE. In particular, it can be used to write down the matching of GTMD distributions to GPDs. Also, many diagrams can be used without recalculation for other polarizations. We expect that this line of research will give new results in the near future and before the advent of the Electron Ion Collider (EIC).

JHEP05(2019)125
Acknowledgments A.V. gratefully acknowledges V. Braun and A. Manashov for numerous stimulating discussions and help in clarifying several aspects of higher twist calculus. I.S. is supported by the Spanish MECD grant FPA2016-75654-C2-2-P. A.T. is grateful to J.W. Qiu and W. Vogelsang for valuable discussions and is supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics under contract DE-AC02-98CH10886 and in part by the US DOE Transverse Momentum Dependent (TMD) Topical Theory Collaboration.

A Parametrization of twist-3 operators and decomposition of 3-tensors
The light-cone gluon operators that enter our calculation are where f ABC and d ABC are structure constants of the gauge-group. Here we omit the Wilson lines, for simplicity. To find an appropriate parametrization of these operators in dimensional regularization, we proceed as the following. First of all, we decompose the V × V ×V -tensor (with V being a 2−2 dimensional vector) into irreducible components. There are 7 irreducible components, which can be selected by appropriate projectors. Explicitly the projectors read [88], symmetric-traceless P µλν;µ λ ν 1 = S µνλ;µ ν λ − P µλν;µ λ ν The dimension of corresponding irreducible sub-spaces are JHEP05(2019)125 hered = 2(1 − ). So, one can see that 3'd, 5'th and 7'th subspaces vanishes at → 0. These subspaces represent evanescent components of operator.
In the next step we construct tensors that belong to particular subspaces, These tensors parameterize the forward matrix element and thus can be built out of single s µ , a µν (a d-dimensional analog of µν T ) and g µν T . We found t µνλ 2 = s α a µα g νλ T + s α a να g λµ T + s α a λα g µν T , (A.13) t µνλ 3 = s α a µα g νλ T − 2s α a να g λµ T + s α a λα g µν T + (1 − 2 )(s µ a νλ − s λ a µν ), (A.14) Note, that s µ a νµ =s ν . The tensor t µνλ 1 = 0, since it is not possible to build completely traceless tensor with a single entry of a vector.
Finally, we parametrize the matrix element as where the integral measure is defined in eq. (6.10). The distributions F 3,5,7 do not mix with other distributions at the perturbative order that we discuss here. Therefore, they could be safely set to zero. Therefore, we have three functions F 2,4,6 that survive in the limit → 0.
The operators T ± have the following property under permutation of arguments which put some constraints on the functions F 2,4,6 . Consequently, the function F − 2 (F + 2 ) is completely (anti)symmetric, Another consequence of relation (A.20) is that functions F 4 and F 6 are related to each other. We find it convenient to use F 4 as independent, setting The function F ± 4 has the following symmetry properties

JHEP05(2019)125
For convenience of comparison we introduce additional -dependent factors and denote , (A. 24) and we obtain the parametrization in eq. (6.19). Let us also make an analogy with the parameterization of quark operator. The general quark operator with positive parity has three indices (if we omit evanescent operators with anti-symmetric products of 4, 6, etc. indices). It reads where all indices are transverse. Here, we omit the Wilson lines, for simplicity. Therefore, it is parameterized by the same set of tensors, For the same reasons as for the gluon operator we drop all functions Q 3,5,7 . The remaining functions Q 2,4,6 are not independent, but can be related by time-reversal symmetry. In particular we get Q 2 = Q 4 . Comparing to the parameterizations in eq. (6.8), (6.9) we get Therefore, we can conclude that the function ∆T is the quark analog of F 6 gluon distribution.

B Example of evaluation: diagram E
In this appendix we give a detailed technical description of the evaluation of a diagram. For demonstration purposes we have selected the diagram E (see figure 4) since it is the most involved diagram, which allows to demonstrate all particularities of the calculation. The remaining diagrams are obtained in a similar manner, albeit the evaluation is typically shorter.

B.1 Evaluation of contribution to OPE
The diagram reads where the factors in the round brackets come from the expansion of the action exponent. we obtain and we have used that γ + γ + = 0.
The expression in eq. (B.3) should be understood as a generating function that contributes to all orders of small-b expansion. The typical task requires a consideration of terms with a particular counting only. For instance, in this work we need only the terms proportional to b µ . The most straightforward approach to extract particular contributions from such generating function is to Taylor expand all fields around a point (say 0) and evaluating the loop integral that decouples from the fields. In the resulting series, the desired contributions are to be sorted out and resummed back to the non-local form. However, this is a very algebraically heavy way. Here we use an equivalent, but much more efficient, strategy that requires the evaluation of only several terms. It is described in the following.
First of all we decouple the expansion parameter (here the vector b) from the integration variables. The natural way to do so, is to join propagators by the Feynman variables and make the shift of variables. For this diagram we introduce four Feynman variables α, β, γ and ρ for propagators from left to right in (B.3). Then we make a shift of variables

JHEP05(2019)125
where the definition of Q is given in (C.13) and ∂ 1,2,3 is the ∂ + that acts onq, A, q in Q. The expression for the diagram E * could be obtained from this one by inversion of order of γ-matrices and field order and z 1 ↔ z 2 . The analogous expressions for other diagrams are given in appendix C.1.
We also replace A µ by F µ+ using the identity valid in the light-cone gauge This is valid for the operator in the DY kinematics while in SIDIS kinematics the identity eq. (4.22) should be used instead. The result of these operations reads Finally, making Fourier transformation to momentum faction x as in eq. (B.11) we get where we rename ρ → y and x 1 → ξ, and rescale b → b/2. All other diagrams are evaluated in the same manner, with the only difference that self-conjugated diagrams are already symmetric with respect to x 1,2,3 → −x 3,2,1 . The diagram-by-diagram expressions are given in appendix C.2.

C Diagram-by-diagram expressions
In this appendix we collect the expressions for diagrams presented in figures 3, 4 and 5.

C.1 Expressions for OPE
In this appendix we provide the full set of expressions obtained from the evaluation of diagrams in background field. The expressions are given in light-cone gauge for the Drell-Yan operator eq. (4.1) (i.e. with retarded eq. (4.14) boundary conditions). The analogous expressions for the SIDIS operator, eq. (4.3), are obtained by replacing −∞ with +∞ in the integration limits, as it is discussed in section 5.6. We stress that the calculation has been done for an operator with unrelated light cone positions of fields z 1 and z 2 . Therefore, the OPE presented here is also suitable for evaluating the matching of the GTMD distributions.

JHEP05(2019)125
where we explicitly show the color indices. In the expression for the diagram L , the fields are left unexpanded in b. In the expression for diagram M ∂ 1,2,3 is ∂ + that acts on A µ , A σ and A ν correspondingly.

C.2 Expressions for TMD distributions
In this section, we present the results for the matrix element in eq. (7.1) of the OPE contributions, We collect all diagrams with their corresponding time-reversal and we have In these expressions we distinguish the parameter that comes from the dimensional regularization (i.e. from the loop integral measure d 4−2 x) and the parameter˜ that comes from the definition of distributions in 4 − 2˜ −dimensions, their normalization and tensor convolutions. The parameters and˜ enter only as a universal composition /(1 −˜ ) and thus at this order of perturbative expressions the difference between schemes is absent. Combining these expressions with the renormalization constants and taking the limit → 0, as it is discussed in eq. (5.41) we find eqs. (7.13), (7.15), (7.18), (7.19).
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.