Factorization and Resummation for Massive Quark Effects in Exclusive Drell-Yan

Exclusive differential spectra in color-singlet processes at hadron colliders are benchmark observables that have been studied to high precision in theory and experiment. We present an effective-theory framework utilizing soft-collinear effective theory to incorporate massive (bottom) quark effects into resummed differential distributions, accounting for both heavy-quark initiated primary contributions to the hard scattering process as well as secondary effects from gluons splitting into heavy-quark pairs. To be specific, we focus on the Drell-Yan process and consider the vector-boson transverse momentum, $q_T$, and beam thrust, $\mathcal T$, as examples of exclusive observables. The theoretical description depends on the hierarchy between the hard, mass, and the $q_T$ (or $\mathcal T$) scales, ranging from the decoupling limit $q_T \ll m$ to the massless limit $m \ll q_T$. The phenomenologically relevant intermediate regime $m \sim q_T$ requires in particular quark-mass dependent beam and soft functions. We calculate all ingredients for the description of primary and secondary mass effects required at NNLL$'$ resummation order (combining NNLL evolution with NNLO boundary conditions) for $q_T$ and $\mathcal T$ in all relevant hierarchies. For the $q_T$ distribution the rapidity divergences are different from the massless case and we discuss features of the resulting rapidity evolution. Our results will allow for a detailed investigation of quark-mass effects in the ratio of $W$ and $Z$ boson spectra at small $q_T$, which is important for the precision measurement of the $W$-boson mass at the LHC.


Introduction
Differential cross sections for the production of color-singlet states (e.g. electroweak vector bosons or the Higgs boson) represent benchmark observables at the LHC. For the Drell-Yan process, the measurements of the transverse momentum (q T ) spectrum of the vector boson (and related variables) have reached uncertainties below the percent level [1][2][3][4][5][6], allowing for stringent tests of theoretical predictions from both analytic resummed calculations and partonshower Monte-Carlo programs. An accurate description of the q T spectrum is also a key ingredient for a precise measurement of the W -boson mass at the LHC, which requires a thorough understanding of the W -boson and Z-boson spectra and in particular their ratio [7][8][9][10]. The associated uncertainties are one of the dominant theoretical uncertainties in the recent m W determination by the ATLAS collaboration [11].
So far, mass effects from charm and bottom quarks in the initial state have been discussed extensively for inclusive heavy-quark induced cross sections, leading to the development of several variable-flavor number schemes in deep inelastic scattering and pp collisions (see e.g. refs. [12][13][14][15][16][17][18]). On the other hand, analogous heavy-quark mass effects from initial-state radiation have received little attention so far in the context resummed exclusive (differential) cross sections, i.e. where the measurement of an additional (differential) observable restricts the QCD radiation into the soft-collinear regime requiring the resummation of the associated logarithms. While e.g. for m q T the mass effects in the resummed q T distribution are simply encoded by the matching between the parton distribution functions across a flavor threshold (e.g. matching four-flavor PDFs onto five-flavor PDFs including a b-quark PDF at the scale m b , which happens much below the scale q T ), this description breaks down for q T ∼ m or q T m. A comprehensive treatment of these regimes in resummed predictions has been missing so far. This concerns in particular also parton-shower Monte-Carlo generators, which include massive quark effects primarily as kinematic effects and by using massive splitting functions. Since heavy-quark initiated corrections are one of the main differences between the W and Z boson spectra, this issue can play therefore an important role for m W measurements at the LHC.
In general, one can distinguish two types of mass effects as illustrated in fig. 1, which have different characteristics: Contributions where the heavy-quark enters the hard interaction process are called primary mass effects. Contributions from a gluon splitting into a massive quark-antiquark pair with light quarks entering the hard interaction are called secondary. For the q T spectrum, earlier treatments of the heavy-quark initiated primary contributions for m q T have been given in refs. [19][20][21], essentially combining the ACOT scheme with the standard CSS q T resummation. A systematic and complete treatment also requires to account for secondary mass effects. Their systematic treatment for differential spectra in the various relevant hierarchies between mass and other physical scales has been established in the context of event shapes in e + e − collisions [22,23] and for threshold resummation in DIS [24], see also refs. [25,26] for a recent utilization in the context of boosted heavy quark initiated jets. The application to differential spectra in pp collisions will be part of the present paper.
We present a systematic effective-theory treatment of quark mass effects including both types of mass effects and all possible scale hierarchies using soft-collinear effective theory (SCET) [27][28][29][30]. We focus on the Drell-Yan process, pp → Z/γ * → + − , and consider two types of observables that resolve additional QCD radiation and are used to constrain the process to the exclusive region, namely the transverse momentum q T of the gauge boson and  beam thrust [31], Here, p i are all hadronic final-state momenta (i.e. excluding the color-singlet final state), and n µ a,b = (1, ±ẑ) are lightlike vectors along the beam axes. Due to transverse momentum conservation q T measures the total transverse momentum of the final state hadronic radiation, while beam thrust measures the momentum projections of all hadronic particles onto the beam axis. The exclusive regime we are interested in corresponds to q T Q or T Q, where Q = q 2 is the dilepton invariant mass. These two observables restrict the allowed QCD radiation into the collinear and soft regime in different ways, leading to different effective-theory setups with distinct factorization and resummation properties. These two cases also provide simple prototypical examples that cover all basic features needed for essentially any other more complicated jet resolution variables, and their corresponding resummation in the massless limit is well known (see e.g. refs. [32][33][34][35][36][37][38][39][40][41][42] and refs. [31,43,44]). Throughout the paper we always consider the limit Λ QCD q T , T allowing for a perturbative description of the physics at these kinematic scales. We then consider all relevant relative hierarchies between the heavy-quark mass m and the kinematic scales set by the measurement of q T or T , respectively.
In the second part of the paper, we explicitly compute all required ingredients for incorporating m b effects at NNLL order, which combines NNLL evolution with the full NNLO singular boundary conditions (hard, beam, and soft functions). For Z-boson production at NNLL , primary effects contribute via O(α s ) × O(α s ) heavy-quark initiated contributions, illustrated in fig. 1(a). Secondary effects contribute as O(α 2 s ) corrections to light-quark initiated hard interactions, illustrated in fig. 1(b). Due to the strong CKM suppression primary m b -effects do not play any significant role for W -production, which represents a key difference to Z-boson production. Primary m c -effects enter W -production in the (sizeable) cs-channel, where they start already at NLL via O(α s ) × O(1) corrections. For this case, our explicit results for the regime q T ∼ m c allows for up to NNLL resummation. (Here, the resummation at NNLL would require the O(α 2 s ) primary massive contributions.) The paper is organized as follows: We first discuss in detail the effective field theory setup for the different parametric regimes for the case of q T in sec. 2 and for T in sec. 3. Here, we elaborate on the relevant mode setup in SCET, the resulting factorization formulae, and all-order relations between the factorization ingredients in the different regimes. In sec. 4, we give the O(α s ) and O(α 2 s ) results for the various ingredients for NNLL resummation. We also verify the consistency of our results with the associated results in the massless limit. Further details on all calculations are given in the appendices, where we also give the analytic results at fixed-order for the massive quark effects in the q T and T distributions in the singular limit q T , T Q. In sec. 5, we discuss the consequences of the secondary mass effects on the rapidity evolution, in particular for the q T distribution in the regime q T ∼ m b . In sec. 7, we conclude by providing an outlook an estimate of the potential size of the bottom quark effects for low-q T Drell-Yan measurements.
2 Factorization of quark mass effects for the q T spectrum

Factorization for massless quarks
Before discussing the massive quark corrections, we first briefly summarize the EFT setup and factorization for massless quarks. The relevant modes for the measurement of q T in the limit q T Q are n a -collinear, n b -collinear, and soft modes with the scaling n a -collinear: which we have written in terms of light-cone coordinates along the beam axis, withn a ≡ n b . Besides these perturbative modes there are also nonperturbative collinear modes with the scaling (Λ 2 QCD /Q, Q, Λ QCD ) and (Q, Λ 2 QCD /Q, Λ QCD ), which describe the initial-state protons at the scale µ ∼ Λ QCD , and which are unrelated to the specific jet resolution measurement. The typical invariant mass of the soft modes is parametrically the same as for the collinear modes, p 2 na ∼ p 2 n b ∼ p 2 s ∼ q 2 T , which is the characteristic feature of a SCET II theory. The soft and collinear modes are only separated in rapidity leading to the emergence of rapidity divergences and associated rapidity logarithms. The traditional approach for their resummation in QCD relies on the work by Collins, Soper, and Sterman [32][33][34]. In SCET the factorization and resummation were devised in refs. [39][40][41][42].
Here we will use the rapidity renormalization approach of refs. [40,41], where the rapidity divergences are regularized by a symmetric regulator and are renormalized by appropriate counterterms (by a MS-type subtraction). The rapidity logarithms are then resummed by solving the associated rapidity renormalization group equations. Within this framework the factorized differential cross section with n f massless quarks reads 1 1 In principle there is also a corresponding contribution for a gluon initiated hard interaction. However, taking into account the decay of the electroweak boson into massless leptons this correction vanishes for onshell gluons and only contributes to the power suppressed terms of O(qT /Q). where with Y denoting the rapidity of the color-singlet state. In eq. (2.3), the superscript (n f ) on all functions indicates that the associated EFT operators and the strong coupling constant in these functions are renormalized with n f active quark flavors, which matters for the evolution already at LL. H ij denotes the process-dependent (but measurement-independent) hard function. It encodes the tree-level result and hard virtual corrections of the partonic process ij → Z/W/γ * at the scale µ ∼ Q. Following refs. [31,45,46], the renormalized transverse-momentum dependent (TMD) beam functions B i , which are essentially equivalent to TMDPDFs, can be matched onto PDFs as where the perturbative matching coefficients I ik describe the collinear initial-state radiation at the invariant mass scale µ ∼ q T and rapidity scale ν ∼ Q, and the nonperturbative parton distribution functions (PDFs) are denoted by f k . In the following, we abbreviate the Mellintype convolution in x as in the second line above. Finally, the soft function S describes the wide-angle soft radiation at the invariant mass and rapidity scale µ ∼ ν ∼ q T . The matching coefficients I ik and the soft function are process-independent and have been computed to O(α 2 s ) in refs. [47][48][49][50] allowing for a full NNLL analysis of Drell-Yan for massless quarks. The three-loop noncusp rapidity anomalous dimension required for the resummation at N 3 LL has recently become available [51,52].
In eq. (2.3), the logarithms of q T /Q are resummed by evaluating all functions at their characteristic renormalization scales and evolving them to common final scales µ and ν by solving the set of coupled evolution equations Only the evolution of the PDF leads to flavor mixing. Consistency of RG running implies that Note that in practice, the evolution is usually performed in Fourier space, such that one actually resums the conjugate logarithms ln(bµ) where b = | b T | ∼ 1/q T is the Fourier-conjugate variable to q T . The q T spectrum is then obtained as the inverse Fourier transform of the resummed b-spectrum. The exact solution and evolution directly in q T space, which directly resums the (distributional) logarithms in q T , has been derived recently in ref. [53], and turns out to be significantly more involved due to the intrinsic two-dimensional nature of q T . In the following subsections, we discuss how the mode and factorization setup changes when massive quark flavors are involved. These lead to the appearance of additional modes related to fluctuations around the mass shell as discussed extensively in refs. [22,23]. For the different hierarchies between the mass scale m and the scales Q and q T the relevant modes are illustrated in fig. 2. In the first case, q T m ∼ Q, the massive flavor is integrated out at the hard scale, which leads to the above massless case with n l massless flavors, as discussed in sec. 2.2. The second case, q T m Q, where the quark mass is larger than the jet resolution variable, is analogous to the corresponding case for thrust in e + e − → dijets in refs. [22,23] and DIS in x → 1 limit [24]. We refer to these papers for details and only summarize briefly the main features for this regime in sec. 2.3. Our main focus is on the hierarchies q T ∼ m Q and m q T Q, which are important for bottom and charm quark mass effects at the LHC, and which are discussed in secs. 2.4 and 2.5.

Quark mass effects for m ∼ Q
If the quark mass represents a large scale ∼ Q (which concerns the top quark at the LHC), this quark flavor does not play a dynamic role in the low-energy effective theory and is integrated out at the hard scale in the matching from QCD to SCET. The relevant modes are shown in fig. 2(a). The massive quark only contributes via mass-dependent contributions to the hard function. This yields the factorization theorem which is essentially equivalent to the massless case in the previous subsection with n l massless flavors. The hard function H ij (Q, m, µ) can be evaluated either in the (n f = n l ) or the (n f = n l + 1) flavor scheme for α s , where n l is the number of light (massless) quark flavors. The associated massive quark corrections are directly related to the virtual contributions to the quark form factors, e.g. given at O(α 2 s ) by the virtual diagrams in fig. 1(b). In general both primary and secondary corrections contribute for initial (massless) quarks. Using the (n l ) flavor scheme for α s these vanish as O(Q 2 /m 2 ) in the decoupling limit m Q for the conserved Figure 2. Effective theory modes for the q T spectrum with massive quarks for q T Q and m Λ QCD .
vector current. For the axial-vector current, contributing to Z-boson production, there are in addition also anomaly corrections starting at O(α 2 s ) from the massive quark triangle in fig. 1(a) that do not decouple. 2 Since the massive quark does not appear as a dynamic flavor in the EFT below the hard scale Q, the entire RG evolution to sum the logarithms of q T is performed with n l massless flavors as in eq. (2.3).

Quark mass effects for q T m Q
Next, we consider the hierarchies where the quark mass is parametrically smaller than the hard scale, m Q. These require a different factorization setup than m ∼ Q since fluctuations around the mass-shell are now parametrically separated from hard fluctuations, which would lead to large unresummed logarithms inside the hard function H ij (Q, m, µ). In this subsection, we start with the case where the transverse momentum is much smaller than the mass, q T m Q, while q T ∼ m Q and m q T Q are considered in the following subsections. In a first step the QCD current is matched onto the SCET current with n l + 1 dynamic quark flavors at the scale µ ∼ Q. Since m Q this matching can be performed (at leading order in the expansion parameter m/Q) only with massless quarks, leading to the hard function with n l +1 massless flavors, H (n l +1) ij , with the strong coupling inside it renormalized with n l +1 flavors. The matching is performed onto SCET containing n a -collinear, n b -collinear, and soft mass modes with the scaling as illustrated in fig. 2(b). These mass-shell fluctuations arise here purely from secondary virtual contributions.
In a second step at the scale µ ∼ m, the mass modes are integrated out and the SCET with n l massless and one massive flavor is matched onto SCET with n l massless flavors with the usual scaling as in the massless case in eq. (2.1). Since the soft and collinear mass modes have the same invariant mass set by the quark mass and are only separated in rapidity, there are rapidity divergences in their (unrenormalized) collinear and soft contributions. Their renormalization and the resummation of the associated logarithms can be again handled using the rapidity RG approach in refs. [40,41], which has been explicitly carried out in ref. [54]. 3 In addition, all renormalized parameters like the strong coupling constant are matched at the mass scale from n l + 1 to n l flavors taking into account that the massive flavor is removed as a dynamic degree of freedom.
After these steps, the factorization at the low scale ∼ q T proceeds as in the massless case with all operator matrix elements depending on the n l massless flavors, which yields the factorization theorem Here H c and H s denote the hard functions that arise from the matching at the mass scale µ ∼ m. Their natural rapidity scales are ν ∼ Q for the collinear contributions and ν ∼ m for the soft ones. They can be evaluated in either the (n l ) or (n l + 1) scheme for α s . We will give their expressions at O(α 2 s ) in sec. 4.1. The resummation of all logarithms of ratios of q T , m, and Q is achieved by performing the evolution in µ and ν of all functions appearing in eq. (2.10) from their natural scales. Figure 3. Illustration of the renormalization group evolution for q T of the hard, beam, soft, and parton distribution functions in invariant mass and rapidity. The anomalous dimensions for each evolution step involve the displayed number of active quark flavors. The label m indicates that the corresponding evolution is mass dependent.
In principle, the µ evolution can be performed by evolving all functions with their respective number of quark flavors without switching the flavor scheme, i.e. with n l + 1 flavors for H, n l flavors for B and S and an additional evolution for the collinear and soft matching functions H c and H s . The consistency of RG running for the factorization theorems in eqs. (2.10) and (2.8), and eq. (2.7) with n l massless flavors, implies that the µ-dependence of the mass-dependent hard functions H c and H s is precisely given by the difference between n l and n l + 1 active quark flavors in the evolution of the hard function H ij , γ Hc m, µ, ν ω a + γ Hc m, µ, where γ (n f ) H is defined in eq. (2.6), and γ Hc and γ Hs are defined analogously. At two loops this relation can be checked explicitly using the results in eqs. (4.11), (4.13) and (A.2). As a result, the µ evolution for the hard functions can be conveniently implemented as illustrated in fig. 3(a), by carrying out the µ evolution with n l active quark flavors below the matching scale µ m ∼ m and with n l + 1 flavors above µ m , providing in this sense a "variable-flavor number scheme" [23,24]. (This effectively corresponds to using operator running for the hard scattering current, which is renormalized with n l + 1 flavors above the mass scale and with n l flavors below the mass scale.) In addition there is also a rapidity evolution, which is carried out at µ m = m, i.e. at the border between the (n l + 1) and (n l )-flavor theories (see ref. [54]), which is governed by the mass-dependent rapidity anomalous dimensions for H s and H c , ln H s (m, µ, ν) . (2.12)

Quark mass effects for q T ∼ m Q
If the q T scale is of the order of the quark mass, q T ∼ m, the massive quark becomes a dynamic degree of freedom, which contributes to the q T spectrum via real radiation effects.
The mass modes in eq. (2.9) are now the same as the usual massless SCET II modes for the q T measurement in eq. (2.1), since their parametrically scaling coincides for q T ∼ m, as illustrated in fig. 2(c). In this case, there is only a single matching at the hard scale µ ∼ Q from QCD onto SCET with these common soft and collinear modes. This hard matching gives again rise to the (mass-independent) hard function H (n l +1) ij for n l + 1 massless flavors. The SCET operator matrix elements at the scale µ ∼ q T , i.e. the beam and soft functions, now encode the effects of the massive quark. They are now renormalized with n l + 1 quark flavors and contain an explicit dependence on the quark mass. When integrating out the modes with the virtuality q T also the massive quark is integrated out and the collinear matching functions I ik between the beam functions and the PDFs thus also contain the effect from changing from n l + 1 to n l flavors, i.e. (2.13) Written out explicitly, the factorization theorem reads where i, j = Q,Q denotes the massive quark flavor in the sum over flavors. We stress that the renormalization of the bare soft and beam function with n l massless and one massive flavor is carried out in the n l + 1 flavor scheme for α s , while the strong coupling in the PDFs (which are defined in the lower theory with n l massless flavors) is renormalized with n l flavors. The renormalized soft function and beam function coefficients I ik can then be expressed in terms of either the (n l + 1) or the (n l ) flavor scheme for α s without introducing large logarithms.
In this hierarchy quark mass effects enter in eq. (2.14) at O(α 2 s ) in two ways: There are secondary radiation effects appearing in the two-loop soft function S (2) and the flavor-diagonal beam function matching coefficients I (2) qq . In addition, there are primary mass effects arising from a massive-quark initiated hard process. For Z/γ * production, this requires the production of the massive quarks via gluon splitting in both collinear sectors, which manifests itself in two one-loop collinear matching coefficients I   Qg with Q = c. The resummation of logarithms ln(q T /Q) and ln(m/Q) is again obtained by performing the RG evolution for eq. (2.14), which is illustrated in fig. 3(b). While the evolution of the PDFs proceeds in n l flavors, the µ-evolution for the hard, beam, and soft functions above the scale m is now carried out purely with n l + 1 flavors. Consistency of RG running for eq. (2. 13) In total, the factorization theorem reads × k∈{q,q,Q,Q,g} l∈{q,q,g} × k∈{q,q,Q,Q,g} l∈{q,q,g} As in sec. 2.4, massive quark corrections can arise at O(α 2 s ) either via primary mass effects involving the product of two one-loop PDF matching corrections M Qg (for Z/γ * ) generating a massive quark-antiquark pair that initiates the hard interaction, or via secondary mass effects involving one two-loop contribution M (2) qq . Note that also the running of the light quark and gluon PDFs above µ m generates an effective massive quark PDF via evolution factors U f,jk (m, z, µ) . (2.20) The absence of soft mass modes in this regime implies there is no rapidity evolution at the mass scale, while the rapidity evolution between beam and soft functions is the same as for n l + 1 massless flavors. In this regime, the mass dependence is thus fully contained in the collinear sectors. Within each collinear sector, the EFT setup is completely analogous to that of the heavy-quark induced inclusive cross section discussed in detail in ref. [18], with the beam functions here playing the role of the inclusive cross section there and the q T scale here playing the role of the hard scale there.

Relations between hierarchies
After discussing all hierarchies separately, we now show how the ingredients in each of the associated factorization theorems are related to each other. This will also make it obvious how the mass-dependent fixed-order corrections that are kept in one hierarchy but are dropped in another can be combined with the resummation of logarithms to obtain a systematic inclusion of the mass effects over the whole q T spectrum. The relations between the modes and their contributions between the different regimes are summarized in fig. 4.  Figure 4. Relevant modes for the q T spectrum with q T Q for different hierarchies between the quark mass m and the scales q T and Q. The arrows indicate the relations between the modes and their associated contributions.
In the products on the right-hand sides, which appear in eq. (2.10), logarithms ln(q T /m) are resummed to all orders in the limit q T m. One can include the associated O(q 2 T /m 2 ) power corrections that are important for q T ∼ m, by obtaining them from the fixed-order expansions of eqs. (2.22) and (2.23) and adding them to the (n l )-flavor beam function coefficients and soft function at the scale µ ∼ q T .
Finally, the fixed-order contributions for the operator matrix elements appearing in the hierarchy m q T are also encoded in the corresponding ones for q T ∼ m. Hence, the massdependent beam function matching coefficients are related to those for m q T and the PDF matching functions by Similarly, the mass-dependent and massless soft function are related by since there are no relevant soft IR fluctuations below the mass scale. In the functions on the right-hand sides, which appear in eq. (2.19), logarithms ln(m/q T ) can be resummed to all orders in the limit m q T . This can be combined with the associated O(m 2 /q 2 T ) power corrections relevant for q T ∼ m, by obtaining them from the fixed-order expansions of eqs. (2.24) and (2.25) and adding them to the (n l + 1)-flavor beam function matching coefficients and soft function at the scale µ ∼ q T .
By including the various power corrections, one combines the factorization theorems in the different hierarchies and obtains a theoretical description that is valid across the whole q T spectrum and includes the resummation of logarithms in all relevant limits. This can be considered a variable-flavor scheme for the resummed q T spectrum. (In addition one should of course also include the usual q T /Q nonsingular corrections to reproduce the full fixed-order result for q T ∼ Q.) We stress that different specific ways of how to incorporate the various power corrections are formally equivalent as long as the correct fixed-order expansion and the correct resummation is reproduced in each limit. Any differences then amount to resummation effects at power-suppressed level and are thus beyond the formal (leading-power) resummation accuracy.
A particular scheme ("S-ACOT") to merge the m q T and q T ∼ m regimes was discussed in ref. [19] for the primary massive quark corrections. In practice, for the numerical study of b-quark mass effects at low q T m Q the off-diagonal evolution factor U f,bg and thus the effective b-quark PDF at the scale q T are still quite small, so that one may effectively count f b (µ B ) ∼ O(α s ). In particular, this counting facilitates the seamless combination with the nonsingular corrections for m ∼ q T encoded in the beam function matching coefficients in eq. (2.14). This was discussed in ref. [18] in the context of the inclusive bbH production cross section, and the analogous discussion applies here as well. In refs. [23,24], the power corrections were included implicitly in the construction of the variable-flavor number schemes for thrust in e + e − and DIS in the endpoint region by applying different renormalization schemes for the massive quark contributions to the EFT operators above and below the mass scale.

Factorization for massless quarks
For the measurement of beam thrust with T Q the relevant EFT modes are n a -collinear, n b -collinear and usoft modes with the scaling The usoft and collinear modes are now separated in invariant mass, p 2 us ∼ T 2 p 2 na ∼ p 2 n b ∼ QT , which is the characteristic feature of a SCET I theory. In this case, there are no rapidity logarithms and the renormalization and evolution is solely in invariant mass. The resulting factorization formula reads [31] dσ This as well as the expressions including mass effects in the subsequent subsections are valid for the primary hard scattering, and do not account for spectator forward (multiparton) scattering effects, since the Glauber Lagrangian of ref. [55] has been neglected. (There are also corrections from perturbative Glauber effects starting at O(α 4 s ) [56,57], which are well beyond the order we are interested in, but can be calculated and included using the Glauber operator framework of ref. [55].) This is sufficient for our purposes of discussing the mass effects in a prototypical SCET I scenario. Our results are also directly relevant to include massive quark effects at NNLL +NNLO in the Geneva Monte-Carlo program [58,59], which employs T as the jet resolution variable for the primary interaction and where multiparton effects are included [60] via the combination with Pythia8 and its MPI model [61][62][63].
The hard function H ij in eq. (3.2) is measurement independent and the same as in eq. (2.3). The beam and soft functions depend on the measurement and are different from those in eq. (2.3). The virtuality-dependent beam functions B i can be factorized into perturbative matching coefficients I ik at the scale µ ∼ t ∼ √ QT and the standard nonperturbative The matching coefficients I ik have been calculated to O(α 2 s ) [65,66]. The soft function at the scale µ ∼ T is equivalent to the thrust soft function [67], which is known to O(α 2 s ) [68,69]. The noncusp anomalous dimensions required at N 3 LL are available from existing results [64].
The resummation of logarithms ln(T /Q) is performed by evaluating all functions at their characteristic scales and evolving them to a common final scale µ using the solutions of the In contrast to eq. (2.3), there is no rapidity evolution in SCET I for massless quarks. Consistency of the RG evolution implies that For beam thrust the number of possible scale hierarchies with a massive quark is larger due to the fact that the (massless) collinear and soft modes have different invariant mass scales. The discussion for the hierarchies with √ QT m where the massive quark cannot be produced via real emissions, is completely identical to q T m, since the quark mass effects in these cases are independent of the low-energy measurement. For m ∼ Q, all mass effects are encoded by using the mass-dependent hard function from sec. 2.2 in eq. (3.2) together with n f = n l everywhere else. Similarly, the case √ QT m Q is described by using eq. (3.2) with n f = n l , and replacing the hard function by the product of massless (n l + 1)flavor hard function and the soft and collinear mass-mode functions H s and H c , as for the case q T m Q in sec. 2.3. We therefore proceed directly to the hierarchies m √ QT , where the massive quark can be produced in collinear and/or soft real radiation. The four possible hierarchies and the relevant EFT modes in the p + p − -plane are illustrated in fig. 5, and are discussed in the following subsections.

Quark mass effects for √ QT ∼ m Q
For √ QT ∼ m Q massive quarks can be produced via collinear initial-state radiation, but not via soft real radiation. After the hard matching, carried out with n l + 1 massless quark flavors as discussed in sec. 2.3, the degrees of freedom in the EFT are collinear and soft modes with the scaling n a -collinear + MM: as illustrated in fig. 5(a). While the usual usoft modes live at a lower virtuality scale than the collinear modes, the soft mass-modes are separated from the collinear modes only in rapidity, leading to a mix of SCET I and SCET II features. In particular, there will be mass-related rapidity divergences. At the scale µ ∼ √ QT ∼ m this theory with n l + 1 dynamical quark flavors is matched onto a theory with n l flavors integrating out also fluctuations related to initial-state collinear radiation of massless particles. The matching in the collinear sectors leads to mass-dependent beam function coefficients I ik , analogous to eq. (2.13). The dependence on the rapidity scale ν here arises due to virtual secondary massive quark corrections and is the same as for the collinear mass-mode function H c in eq. (2.10), i.e., In the soft sector the soft mass modes are integrated out, leaving only the usoft modes. This gives exactly the soft mass-mode function H s in eq. (2.10), which encodes the effects of virtual secondary massive quark radiation. As usual, also the strong coupling constant has to be matched from n l + 1 to n l flavors. The remaining contributions at the lower scales, the soft function and the PDFs, are given in terms of n l massless flavors and in the (n l )-scheme for α s . The resulting factorized cross section reads The resummation of logarithms in eq. (3.9) is obtained by evolving all functions from their natural scales, as illustrated in fig. 6(a). The mass-dependent ν evolution, which resums the rapidity logarithms ln(Q/m), is identical to the one for the hard functions H c and H s in sec. 2.3. The µ evolution can be conveniently carried out by evolving the hard, beam, and soft functions with n l + 1 active flavors above the mass scale and with n l active flavors below the mass scale, which automatically takes into account the µ dependence of H S . To see this, the consistency of RG running for eq. (3.9) together with the consistency relation for n l + 1 massless quarks in eq. (3.5) implies are the anomalous dimensions for the soft and beam functions with n l and n l + 1 massless flavors as defined in eq. (3.4), and γ (n l +1) B,m (t, m, µ, ν/ω) is the anomalous dimension of the mass-dependent beam function, The consistency relation in eq. (3.10) can be confirmed explicitly at two loops with the expressions in eqs. (A.10), (A.16), (4.11), and (4.25). Note that this relation does not imply that γ (t, µ) are the same, which is indeed not the case for the massive quark corrections as we will see explicitly in sec. 4.2. The reason is that the presence of the quark mass leads to a SCET II -type theory, in which the required rapidity regularization redistributes the µ anomalous dimension between soft and collinear corrections with individually regularization scheme dependent pieces. Only their sum, as given on the left-hand side of eq. (3.10), is independent of the regularization scheme and yields the combined running for beam and soft functions with n l + 1 massless flavors above µ m ∼ m, as on the right-hand side of eq. (3.10).

Quark mass effects for T m √ QT
When the beam scale becomes larger than the mass scale, but the soft scale is still larger than the mass, which happens for m 2 /Q T m, the beam function matching coefficients Figure 6. Illustration of the renormalization group evolution for beam thrust of the hard, beam, soft, and parton distribution function in invariant mass and rapidity. The anomalous dimensions for each evolution step involve the displayed number of active quark flavors. The label m indicates that the corresponding evolution is mass dependent.
I ik encode only fluctuations related to initial-state collinear radiation with n l + 1 massless quarks. The EFT below √ QT contains the usual collinear and soft mass modes scaling as , which do not contribute to the beam thrust measurement. However, besides these there are also additional modes with fluctuations around the mass scale which can have a dynamic impact on the T spectrum in this hierarchy, as illustrated in fig. 5(b). Their scaling is precisely determined by this condition and the on-shell constraint, yielding the scaling n a -csoft MM: p µ cs,na ∼ T , We refer to these intermediate modes as collinear-soft (csoft), since they are simultaneously boosted (by a factor m/T ) but are softer than the standard collinear modes, thus coupling to the latter via Wilson lines and leading to a SCET + theory [70]. This type of intermediate SCET + modes have appeared in various contexts [70][71][72][73]. The setup here is similar to the case of double-differential distributions with a simultaneous q T and beam thrust measurement discussed in ref. [71]. Also there, several hierarchies are possible ranging from a SCET II regime for q T ∼ T to a SCET I regime for q T ∼ √ QT with a SCET + regime in between. The csoft modes in their SCET + regime are separated from the collinear modes only in rapidity. In our case here, the csoft mass modes are separated in invariant mass from the standard SCET I soft and collinear modes and in rapidity from their SCET II -type soft mass-mode cousins. The matching in the collinear sector can be performed in two steps as in eqs. (2.17) and (2.18). After integrating out all of the mass modes, the PDF and the soft function are still given in a (n l )-flavor theory. Thus the factorization formula reads × dt a k∈{q,q,Q,Q,g} l∈{q,q,g} The functions S c here are the csoft matching functions encoding the interactions of the collinearsoft radiation at the invariant mass scale µ ∼ m and the rapidity scale ν ∼ m 2 /T . The M ij correspond to the well-known PDF matching correction incorporating the effect of the collinear mass modes, as in eq. (2.19). The virtual soft massive quark corrections are still described by the function H s at the rapidity scale ν ∼ m as in eq. (3.9). The RG evolution for eq. (3.13) is illustrated in fig. 6(b). The csoft function satisfies the same rapidity RGE as the collinear mass-mode function H c in eq. (2.10) and the massive beam functions in eq. (3.8), i.e., 14) The only difference with respect to the rapidity evolution in eq. (3.9) is that it now happens between H s and S c with ν Sc ∼ m 2 /T rather than between H s and the beam functions with ν B ∼ Q, such that now the (smaller) rapidity logarithms ln(m/T ) are resummed. The µ evolution can be performed with n l + 1 flavors for the hard function H ij , the beam and soft function above the mass scale and with n l flavors below. This automatically accounts for the µ dependence of S c and H s above µ m ∼ m, which precisely gives the difference between the evolution of the soft function with n l + 1 and n l flavors, as implied by the consistency of RG running for eq. (3.13) and the relation in eq. (3.5) with n l + 1 massless quarks,

Quark mass effects for T ∼ m and m T
For T ∼ m the csoft and soft mass modes in the previous section merge with the usual usoft modes, In this hierarchy massive quarks can be also produced in soft real radiation leading to a soft function at the scale µ ∼ T that depends on the quark mass. In addition, there are the usual collinear modes as well as the collinear mass modes, as illustrated in fig. 5  have m √ QT , the matching in the collinear sectors is the same as in the previous subsection.
The factorization formula reads Now all rapidity divergences cancel within the soft function and do not leave behind any potentially large rapidity logarithms. The RG evolution for this case is illustrated in fig. 6(c). Finally, for m T the mass dependence in the IR insensitive soft function vanishes, if expressed in terms of the (n l +1)-flavor scheme for α s . Otherwise, eq. (3.18) remains unchanged, such that now the only dependence on the mass scale arises in the PDF matching corrections M ij . The hard, beam, and soft functions can now be always evolved with n l + 1 massless flavors and only the evolution of the PDF changes, when crossing the flavor threshold.

Relations between hierarchies
We now discuss how the ingredients appearing in the different factorization formulae are related to each other. The relations between the modes and their contributions are illustrated in fig. 7 for the different possible hierarchies. As in sec. 2.6, these relations show how one can combine the resummation of logarithms relevant in one regime with the power-suppressed fixed-order content that becomes important in the neighboring regimes, enabling a systematic inclusion of mass corrections across the entire T spectrum.
Similar to eq. (2.22), the mass-dependent beam function coefficients appearing for √ QT ∼ m (incorporating massive quark fluctuation as discussed in sec. 3.2) are related to those for √ QT m with n l massless quarks and the collinear mass-mode function H c by At the same time, the mass-dependent beam function also encodes information about the fixed-order content for T m √ QT . Comparing eqs. (3.9) and (3.13), they are related to those with n l + 1 massless flavors, the PDF matching functions, and the csoft function S c by The mass-dependent soft function for T ∼ m in eq. (3.18) contains massive quark fluctuations that for T m get split into the massless soft function with n l flavors, the soft mass mode function H s , and the csoft function S c in eq. (3.13) as Finally, as already mentioned below eq. (3.18), the soft function approaches its massless limit for m T , (3.22)

Relation to previous literature
Here, we briefly comment on the connection of the factorization setup presented here for beam thrust to the closely related SCET I setup in refs. [22,23] for thrust in e + e − -collisions (or similarly also for DIS with x → 1 [24]). Besides the fact that the jet functions appearing for thrust in e + e − are replaced by virtuality-dependent beam functions for beam thrust in pp collisions, there are also some differences in the description of the different regimes. While we have discussed each possible hierarchy in a strict EFT sense identifying a single operator matrix element or matching function with each EFT mode, refs. [22,23] already set up their factorization theorems in a way that they apply for neighboring hierarchies (e.g. T m ∼ √ QT and T m √ QT ). Using appropriate renormalization conditions, the mass dependent corrections to the jet and soft functions were assigned such that they directly give the massless results in the small mass limit and decouple in the infinite mass limit. In addition, the factorization theorems contained mass mode matching functions for hard, jet, and soft function, whenever the evolution of one of the matrix elements crossed the mass scale. In our setup this essentially amounts to a specific practical choice how to incorporate the power corrections in eqs. (3.19)- (3.22). Although the final outcome is thus essentially the same once the correct rapidity scales are chosen in the mass mode matching functions, it is perhaps more transparent conceptionally to first distinguish all hierarchies with the associated modes as we do here, and separately discuss the possible ways to add the nonsingular corrections later. In particular, for the hierarchy T m √ QT , this leads us to identify the csoft modes as a relevant degree of freedom with a corresponding function evaluated naturally at the rapidity scale ν ∼ m 2 /T . In contrast, refs. [22,23] the corresponding corrections appeared inside the mass mode matching functions as soft-bin contributions that had to be evaluated at this rapidity scale to minimize large rapidity logarithms.

Results for massive quark corrections
In this section we present our results for the contributions from primary massive quarks at O(α s ) and from secondary massive quarks at O(α 2 s ) to all components of the various factorization theorems discussed in secs. 2 and 3, providing all required ingredients for the Drell-Yan spectrum at NNLL . The results in this section are only given for a single massive quark flavor and with the rapidity divergences regularized by the symmetric Wilson line regulator introduced in refs. [40,41]. The actual computations of the primary and secondary massive quark corrections to the beam and soft functions are carried out in some detail in app. B. In sec. 4.4, we show explicitly that the results satisfy the small and large mass limits, and illustrate the numerical size of the mass-dependent corrections for the case of b quarks.
The fixed-order results for the mass-dependent corrections can be expanded either in terms of the (n l )-flavor or (n l + 1)-flavor scheme for α s . For definiteness we expand in this section any function F (m) using α (4.1) The different two-loop contributions to F (2) (m) are written as where F (2,h) contains all mass dependent two-loop corrections and F (2,l) the associated contributions for massless flavors. The expansion of F in terms of α (n l ) s can be easily obtained by using the matching relation for α s , where here and in the following we abbreviate

Hard matching functions
All hard matching functions, i.e. the hard function H at the scale Q and the mass mode matching functions H c and H s at the scale m Q, are insensitive to the measurement performed at a lower scale and are therefore the same for q T and beam thrust T . Since the QCD and SCET currents are the same as for e + e − → 2 jets, the results can be read off from the corresponding ones in refs. [23,54].

Massive quark corrections to the hard function
The secondary massive quark corrections to the hard function in eq. (2.8) read where H (0) denotes the tree-level normalization and H (1) the massless one-loop contribution given in eq. (A.1). The function h virt contains the O(α 2 s C F T F ) virtual massive quark bubble correction in full QCD shown in fig. 1. It has been calculated in refs. [74,75] and is given by x + 1060 27 ln x with r = √ 1 + 4x. For m → ∞ the massive quark decouples such that h virt (x) → 0 for x → ∞. For Z-boson production there is an additional primary massive quark contribution to the axial vector current, namely the massive quark triangle correction in fig. 1, which we denote by ∆h axial with the same prefactor as for h virt using the narrow width approximation for notational simplicity. It has been computed in refs. [76][77][78] and is given by where the vector and axial vector couplings for up-and down-type quarks are proportional to v u = 1 − 8/3 sin 2 θ W , v d = −1 + 4/3 sin 2 θ W , a u = 1, a d = −1. The functions G 1 and G 2 are given in eqs. (2.8) and (2.9) of ref. [77]. In the small mass limit m Q the function G 1 (m 2 /Q 2 ) vanishes, such that ∆h axial gives the same result as for a massless flavor in the loop, For a massless isospin partner this correction is thus canceled within the SU (2) L doublet, while for different masses (as for m b m t ) there is a (µ-independent) remainder. Note that for Q m the function ∆h axial gives a nonvanishing contribution In this case one would integrate out the heavy quark at the scale µ m ∼ m and evolve the axial current to µ H ∼ Q to resum logarithms ln(m 2 /Q 2 ).

Soft and collinear mass-mode matching functions
The contributions to the mass-mode matching functions originate only from secondary radiation. The soft mass-mode function H s appearing in eqs. (2.10), (3.9), and (3.13) has been computed at two loops with the symmetric η-regulator in ref. [54]. It is given by The rapidity anomalous dimension is even known at O(α 3 s ), see ref. [24]. The result for the collinear mass-mode function H c in eq. (2.10) can be inferred at O(α 2 s ) from the computations in refs. [23,54] and reads Its anomalous dimensions are One can easily verify that the relation in eq. (2.21) between the massive hard function in eq. (4.5), the hard function contribution for a massless flavor in eq. (A.1), and the two mass-mode functions in eqs. (4.10) and (4.12) is satisfied, (4.14)

Beam functions
Here we give our results for the massive quark beam function coefficient I Qg at O(α s ) and the secondary massive quark corrections to the light-quark coefficients I qq at O(α 2 s ), which appear in eqs. (2.14) and (3.9) for the q T and beam thrust measurement. We also give the massive quark contributions to the beam function anomalous dimensions. We also give the well-known results for the corresponding PDF matching coefficients

TMD beam function coefficients
The matching coefficient I Qg generating a massive beam function from a gluon splitting is calculated at O(α s ) in sec. B.1 and corresponds to the diagram shown in fig. 8. The result reads (p 2 with the splitting function This result is equivalent to the Fourier transform of the mass-dependent matching functions C h/G in ref. [19]. After performing an appropriate crossing it also agrees with the massive final-state splitting functions [79,80] or fragmenting jet function [81]. The contributions from secondary massive quarks to the matching coefficient I qq are computed in sec. B.3 at O(α 2 s ). The corresponding diagrams are shown in fig. 9. The result is given by 18) and the one-loop term I qq is given in eq. (A.4). Here L n (1 − z) denotes the standard plus distribution as defined in appendix D.
In the (n l + 1)-flavor scheme for α s there is also a correction from a virtual massive quark loop to the flavor-nondiagonal matching coefficient I (2) qg . This contribution is trivial, since it factorizes into a vacuum polarization correction corresponding to the matching of α s between the (n l ) and (n l + 1)-flavor schemes, and the one-loop contribution, such that with I (4.20) The L 0 ( p T , µ) distribution is defined in appendix D. The µ anomalous dimension here is the same as for a massless quark flavor, γ

Virtuality-dependent beam function coefficients
The massive quark-gluon virtuality beam function matching coefficient at O(α s ) shown in fig. 8 is given by The contributions from secondary massive quarks to the light-quark coefficient at O(α 2 s ) as shown in fig. 9 are given by and the one-loop term I qq is given in eq. (A.8). In the (n l + 1)-flavor scheme for α s there is also the analogous contribution to eq. (4.19) to the flavor-nondiagonal coefficient with I (4.25) We emphasize that the massive quark contribution to the µ anomalous dimension is not the same as for a massless flavor, but is in fact the same as for the TMD beam function in eq. (4.20). This is required by consistency with the large mass limit QT , q T m, where the massive flavor can only contribute to the (local) running of the common current operators, which are independent of the measurement. Only in combination with the soft mass-mode function H s and the soft function, the combined µ evolution above the mass scale is the same as for n l + 1 massless flavors as discussed in eq. (3.10).
The secondary massive quarks introduce rapidity divergences and associated logarithms also in the virtuality-dependent beam function. The ν anomalous dimension induced by the secondary massive effects is the same as for the collinear mass-mode function, see eq. (3.8), given in eq. (4.13).

PDF matching coefficients
The matching coefficients relating the PDFs in the (n l + 1) and the (n l )-flavor scheme are all known at two loops [82] and partially beyond (see e.g. refs. [83,84] and references therein). The matching coefficient for a primary massive quark originating from an initial-state gluon at O(α s ) is The matching coefficient coming from secondary massive quark corrections to the light-quark PDFs reads up to O(α 2 s ) The matching coefficient between the gluon PDF in the (n l ) and (n l + 1)-flavor schemes at O(α s ), which is also required for Drell-Yan at O(α 2 s ), is equivalent to the matching relation for α s Note that taking into account the nondiagonal evolution of the PDFs the known O(α 2 s ) corrections for all matching factors M ij become relevant at NNLL .

Soft and collinear-soft functions
Here we give all massive quark corrections at O(α 2 s ) to the soft and csoft functions. They arise exclusively from secondary radiation. Note that the soft functions satisfy Casimir scaling at this order and can be thus applied also to color-singlet production in gluon-fusion by replacing an overall C F → C A .

TMD soft function
The contributions from secondary massive quarks to the TMD soft function, which appears in eq. (2.14) for q T ∼ m, are calculated in app. B.4 at O(α 2 s ) and correspond to the diagrams shown in fig. 10. The result reads wherem = m/p T and c = √ 1 + 4m 2 as in eq. (4.18) and the one-loop soft function S (1) given in eq. (A.11).
The massive quark contributions to the anomalous dimensions of the soft function are The µ anomalous dimension here is the same as for an additional massless flavor, γ

Csoft function for beam thrust
The csoft function is a matching coefficient between an eikonal matrix element in the n l + 1 and n l flavor theories appearing for the hierarchy T m √ QT in eq. (3.13). The relevant diagrams at O(α 2 s ) are shown in fig. 10 and are calculated in sec. B.5. The result is given by We can see that with the scale choices µ ∼ m and ν ∼ µ 2 / ∼ m 2 /T all large logarithms (including the implicit one inside the plus distribution) are minimized. The µ anomalous dimensions of the csoft matching function is given by The ν anomalous dimension is the same as for the collinear mass mode function in eq. (4.13), γ ν,Sc = γ ν,Hc .

(Beam) thrust soft function
The secondary massive quark corrections to the (beam) thrust soft function at O(α 2 s ) were calculated in ref. [85] and are given by and the one-loop soft function S (1) is given in eq. (A.14). The term ∆S τ ( , m) contains the correction from two real final-state emissions entering two opposite hemispheres, which vanishes both for m and m and is currently only known numerically. The integral expression for this numerically small contribution is given in eq. (61) of ref. [85], and a precise parametrization can be found in ref. [23].
The massive quark contribution to the anomalous dimension is the same as for a massless flavor, γ

Small and large mass limits
In secs. 2.6 and 3.5 we explained how the ingredients in the factorization theorems for different hierarchies are related to each other. Here we verify these relations for the beam and soft functions up to O(α 2 s ). We also scrutinize the numerical impact of the power corrections for these functions. We focus in particular on the O(m 2 /q 2 T ) corrections the q T spectrum for b quarks, which are contained in the factorization theorem eq. (2.14) for q T ∼ m but not in the massless limit for m q T in eq. (2.19), as these are phenomenologically important hierarchies for b-quark mass effects at the LHC.
For the numerical results we consider the use the MMHT2014 NNLO PDFs [86] and evaluate the contributions for µ = m b = 4.8 GeV, ω = m Z , and E cm = 13 TeV. The main qualitative features of the results do not depend on these specific input parameters.

Limiting behavior for q T
We first consider the primary mass effects at one loop, which are encoded in the TMD beam function matching coefficient I  On the other hand, in the opposite limit m p T it becomes Qg in eq. (4.26). To account for the correct distributive structure in p T that emerges in the massless limit, one can integrate the expressions with massive quarks and identify the distributions at the cumulant level.
In fig. 11 we show the result for the massive quark beam function B (1) Qg ⊗ x f g at O(α s ) as function of p T using the full massive matching coefficient I (1) Qg (solid orange) and its small mass limit in eq. (4.36). Note that the results differential in p T are not explicitly µ-dependent at O(α s ). In the right panel we show the corresponding results for the cumulant which also includes the δ (2) ( p T ) constant contribution. We can see that in both cases the small mass limit is correctly approached for p (cut) T m b , while for p (cut) T m b the primary mass effects decouple with the result going to zero. The corrections to the small mass limit become sizeable for p T ∼ m b and vanish quite fast for larger p T .
In fig. 12 we show the result for the convolution between two massive quark beam functions, which enters the result for Z-boson production at O(α 2 s T 2 F ) and NNLL . The analytic expression for the convolution between the two one-loop mass-dependent coefficients is given in eq. (C.6). We see that now the corrections to small-mass limit remain nonnegligible even for larger values of p T . This is due to the fact that the p T -convolution generates a logarithmic dependence in the spectrum, such that the power corrections of O(m 2 b /p 2 T ) become enhanced by logarithms ln(p 2 T /m 2 b ). Next, we consider the secondary massive quark corrections at O(α 2 s C F T F ). The result for the mass-dependent TMD beam function coefficient I (2,h) qq ( p T , m, z) is given in eq. (4.17). In the decoupling limit p T m all its terms without distributions in p T give O(p 2 T /m 2 ) powersuppressed contributions. Combining its remaining distributional terms with the contributions arising from changing the α s scheme from n l + 1 to n l flavors yields such that all infrared mass dependence is given by the PDF matching, as required by the relation in eq. (2.24). The results for the massless coefficient and the PDF matching coefficient are given in eqs. (A.6) and (4.27), respectively. For the coefficient I at O(α 2 s T 2 F ) the limiting behavior is trivial, since it vanishes identically in the (n l )-flavor scheme for α s , and in the (n l + 1)-flavor scheme for α s it is exactly The mass-dependent TMD soft function is given in eq. (4.29). In the limit p T m all its terms without distributions in p T become O(p 2 T /m 2 ) power suppressed, just as for the beam function. Combining its remaining distributional terms with the contributions arising from changing the scheme of the strong coupling from n l + 1 to n l flavors yields We now discuss the numerical impact of the O(m 2 /p 2 T ) terms from secondary mass effects. Since the individual results for the beam and soft functions depend on the specific regularization scheme, we consider their symmetrized combinatioñ ν ω S( p T , m, µ, ν) , (4.44) which is independent of ν. 6 The O(α 2 s C F T F ) corrections explicitly depend on µ and the flavornumber scheme, but the difference between the full result and the small mass limits given in eqs. (4.40) and (4.43) do not. In fig. 13 we show the result for the O(α 2 s C F T F ) corrections (with α s = α (n l +1) s ) to the u-quark beam function, both differential in p T and the corresponding cumulant. We see that the full mass dependent results correctly reproduce the small and large mass limits. The corrections to the massless are much larger than for the primary mass effects. In particular, they are still of O(100%) for p (cut) T ∼ 10 GeV. This clearly indicates that for secondary radiation involving two massive quarks in the final state the corrections are rather of O(4m 2 /p 2 T ), as one might expect.

Limiting behavior for T
We carry out the discussion for beam thrust in close analogy. The virtuality-dependent massive quark beam function coefficient at one loop is given in eq. (4.21). In the limit t m 2 the primary massive quarks correctly decouple, (4.45) In the opposite limit m 2 t we get The secondary massive quark corrections to the virtuality-dependent beam function are given in eq. (4.22). In the decoupling limit t m 2 all its nondistributional terms become O(t/m 2 ) power suppressed. Combining the remaining distributional terms in t with the contributions arising from changing the scheme of the strong coupling from n l + 1 to n l flavors yields in agreement with eq. (3.19). The massless result for I (1) qq and the collinear mass-mode function H The mass-dependent corrections to the (beam) thrust soft function are given in eq. (4.33). In the limit m all its nondistributional terms become O( 2 /m 2 ) power suppressed. Combining the remaining distributional terms with the contributions arising from changing the scheme of the strong coupling from n l + 1 to n l flavors yields which was already checked in ref. [85]. In fig. 14, we show the numerical results for the one-loop massive beam function and the convolution between two of these (which is the leading order correction from primary massive quarks for the Z-boson production) as a function of √ t ∼ √ QT . The mass effects become relevant for √ t ∼ m b ∼ 5 GeV (corresponding to T 1 GeV for Q = m Z ). The corrections to the massless limit for the convolution of two beam functions is nonnegligible also for larger values. In fig. 15, we show the result for the secondary O(α 2 s C F T F ) corrections to the beam and soft function. The corrections to the massless limit for the beam function remain sizeable even for √ t 2m b . For the soft function, the mass effects are important for T ∼ ∼ m b and become small for > 10 GeV ∼ 2m b . Note that the small bump in the soft function in fig. 15 originates from the correction term ∆S τ in eq. (4.33). The associated correction in the massless limit is fully contained in the δ( ) term.

Rapidity evolution
Here, we discuss the solutions of the rapidity RGEs in eq. (2.12), or equivalently eqs. (3.8) and (3.14), and in particular the rapidity evolution for the mass-dependent soft function in eq. (2.16) for q T ∼ m, where the massive quark corrections give rise to a different running than for massless flavors. Our primary aim here is to highlight the different features with respect to the massless case, while leaving the practical implementation for future work.
The rapidity evolution for the mass-mode matching functions H s and H c according to eq. (2.12) has been discussed in ref. [54]. The evolution for the beam thrust beam function and csoft function according to eqs. (3.8) and (3.14) is completely analogous. For example, the ν-evolved soft matching function H s is given by The evolution function η Γ is defined by and resums the µ-dependent logarithms inside the ν anomalous dimension as required by consistency with the µ evolution to maintain the path independence in µ-ν-space [41]. With the canonical scale choice all logarithmic terms in the boundary condition γ ν,Hs (m, µ 0 (m)) are minimized. The solution of the rapidity RGE for the soft function is substantially more involved due to its two-dimensional convolution structure on p T . The formal solution of the rapidity RGE for massless quarks in eq. (2.6) is most conveniently found by Fourier transforming to impact parameter space with b = | b|, where the rapidity RGE becomes multiplicative The consistency (path independence) between µ and ν evolution requires the rapidity anomalous dimension in Fourier space to satisfy Its solution is given byγ The logarithms of ln(µ b e γ E /2) in the second boundary term are eliminated by the canonical scale choice With this choice, the ν evolution of the soft function in Fourier space at any given scale µ is given byS As is well known, the rapidity evolution kernel becomes intrinsically nonperturbative at 1/b Λ QCD [32][33][34]. This nonperturbative sensitivity appears through the resummed rapidity anomalous dimension, which with the canonical scale choice in eq. (5.7) gets evaluated at α s (1/b). It is important to note that this is not an artefact of performing the evolution in Fourier space. Rather this is a physical effect, which also happens when the ν evolution is consistently performed in momentum space. As shown in ref. [53], in this case the appropriate resummed result for γ ν,S ( p T , µ) explicitly depends on α s (p T ), which means it becomes nonperturbative for p T Λ QCD . For the massive quark corrections in the regime q T ∼ m the µ dependence of the rapidity anomalous dimension is the same as for the massless quarks, i.e. eq. (5.5), such that , µ 0 (b, m)) .
This means that the massive quark correctionsγ ν,S are the same as for a massless flavor in the limit m 1/b and are the same as the rapidity anomalous dimension of the soft mass mode function H s in the limit 1/b m, provided one uses the (n l + 1) and (n l )-flavor scheme for α s , respectively. To eliminate the logarithms insideγ Since µ (h) 0 (b, m) freezes out naturally at the perturbative mass scale for 1/b → 0, the nonperturbative sensitivity in the ν evolution gets regulated by the quark mass for the massive quark contributions.
We first illustrate this behavior in a simple one-loop toy example: We consider the radiation of a massive gluon (with mass M ) having the same couplings as a (massless) gluon in QCD, which exhibits the main features of the full results for secondary massive quarks. The associated corrections are obtained in the calculations of app. B.4.1 as intermediate results for the twoloop case. In b-space the one-loop rapidity anomalous dimensions for massless and massive gluons are given byγ  where K 0 denotes the modified Bessel function of the second kind and The mass-dependent result has the limiting behavior 14) in close analogy to eq. (5.10). A natural choice to eliminate any large terms in eq. (5.12) in both limits is where G denotes a Meijer G function. This result has the limiting behavior Hence, the correct massless limit is recovered, while in the large-mass limit one obtains the anomalous dimension in eq. (4.11). Note that one needs to perform a change for the strong coupling between the n l + 1 and n l flavor schemes to obtain both limits correctly. To minimize the logarithms for any regime one should thus adopt a canonical scale choice that satisfies eq. (5.11), as for example in eq. (5.15).

Outlook: Phenomenological impact for Drell-Yan
Our results can be applied to properly take into account bottom quark mass effects for the Drell-Yan q T spectrum at NNLL . While a full resummation analysis is beyond the scope of this paper, we can estimate the potential size of the quark-mass effects by looking at the fixed-order q T spectrum. In fig. 17, we show separately the contributions from primary and secondary massive quarks to the cross section at O(α 2 s ), normalized to the O(α s ) spectrum dσ (1) including all flavors (treating the charm as a massless flavor). We utilize the MMHT2014 NNLO PDFs [86] and evaluate the contributions for µ = m b = 4.8 GeV, Q = m Z , Y = 0, and E cm = 13 TeV. Note that the secondary mass contributions at O(α 2 s ) are explicitly µ-dependent and schemedependent, the nonsingular mass correction, i.e. the difference between the full massive result for µ ∼ m b and the massless limit (encoded partially in a massive PDF), is µ independent at this order. As can be seen, the relative contribution of the bb-initiated channel grows with larger q T , while the impact of the secondary contributions including the full mass dependence is at the sub-percent level throughout the spectrum. As expected, the nonsingular mass corrections are very small for m b q T , but can reach several percent for q T m b , which roughly corresponds to the peak region of the distribution where the cross section is largest.
The same can also be seen in fig. 18, where we show the mass nonsingular corrections to the massless limit for primary and secondary contributions as well as their sum. They are shown for µ = m b on the left and for µ = q T on the right. We see that these corrections are (at fixed order) indeed only weakly dependent on the value of µ (for q T 2 GeV). All in all, the bottom quark mass can have a relevant effect for high precision predictions of the q T -spectrum, reaching several percent around the peak of the distribution (∼ 5 GeV).
For W production sizable corrections from bottom quark effects arise only through secondary contributions (due to the strong CKM suppression of the primary contributions), which have a similar impact as for Z-production. On the other hand, charm-initiated production plays an important role and enters already at O(α s ). Estimating the nonsingular mass corrections for q T ∼ m c is more subtle, since higher-order corrections in the strong coupling and nonperturbative effects are likely to dominate the effect from the known beam function at O(α s ) at these low scales. Thus, we do not attempt to determine their characteristic size here and leave this to future work. An analysis based on the leading-order matrix element and its potential impact on the determination of m W can be found in ref. [20].

Conclusions
Massive quark effects provide a challenge for high-precision predictions at colliders. Using a SCET-based factorization framework, we have discussed how to systematically incorporate massive quark corrections into exclusive differential cross sections at the LHC, using the measurement of the transverse momentum q T and beam thrust for Drell-Yan production as prototypical examples. We have discussed the relevant factorization setup for the different hierarchies between the mass scale and the other relevant kinematic scales. We find that the presence of (secondary) massive quarks can lead to the emergence or alteration of rapidity logarithms thus changing the resummation structure in a nontrivial way.
The generic framework for the description of mass effects generalizes to other exclusive cross sections with different jet-resolution measurements and final-state kinematic cuts, which will require additional calculations of the relevant factorization ingredients. Our results for the beam thrust spectrum allow for a systematic inclusion of massive quark effects at NNLL +NNLO into in the Geneva Monte-Carlo program [58,59], which employs beam thrust as the underlying jet resolution variable. Several of our results are also immediately relevant for other processes besides Drell-Yan. The massive quark beam functions are relevant for any heavy-quark initiated process, for example exclusive bbH-production. The mass-dependent soft function and rapidity anomalous dimension at O(α 2 s ) satisfy Casimir scaling and can be therefore also utilized for the description of gluon-fusion processes, e.g. the Higgs q T -spectrum.
An important application of our framework is to the precise theoretical description of the Drell-Yan q T spectrum. To this end, we have computed all required mass-dependent beam and soft functions up to O(α 2 s ) allowing for the description of massive quark effects in the Drell-Yan q T spectrum at NNLL . In particular, our results pave the way for a detailed investigation of quark-mass effects in the ratio of W and Z boson spectra at small q T , which is important for the precision measurement of the W -boson mass at the LHC.

A Results for massless quarks
Here we summarize the relevant results with massless quarks for the hard, beam, and soft functions.

A.1 Hard function
The massless quark hard function is directly related to the QCD form factor and has been computed at O(α 2 s ) in ref. [87]. The O(α s ) and O(α 2 s C F T F ) corrections read in an expansion in terms of α s = α (n f ) s (µ) in analogy to eq. (4.1) (with L Q = ln(Q 2 /µ 2 )) where H (0) is the tree-level contribution. Note that for a single quark flavor there is in addition a nonvanishing correction to the axial current contribution relevant for Z-boson production, but cancels within an isospin doublet for massless quarks. The anomalous dimensions are

A.2.1 TMD beam function
The matching coefficients entering the TMD beam function have been computed at O(α 2 s ) in various schemes [47][48][49]88] and are obtained for the symmetric η-regulator in ref. [50]. The results at O(α s ) are The splitting functions are At O(α 2 s C F T F ) the massless matching coefficient is given by The anomalous dimensions of the massless quark TMD beam function, as defined in eq. (2.6), are given at O(α s ) and O(α 2

A.2.2 Virtuality-dependent beam function
The virtuality-dependent beam functions for massless quarks are known to two loop order [65,66]. The matching coefficients at O(α s ) read The massless matching coefficient at order O(α 2 s C F T F ) for one quark flavor reads The anomalous dimension of the massless quark beam function at order O(α s ) and O(α 2 s C F T F ) are given by (A.10)

A.3 Soft functions A.3.1 TMD soft function
The TMD soft function for massless quarks with the symmetric η-regulator has been computed at two loops in ref. [50]. At O(α s ) and O(α 2 s C F T F ) it is given by The corresponding anomalous dimensions are

A.3.2 Thrust soft function
The thrust soft function is known to two loops [68,69]. At O(α s ) and O(α 2 s C F T F ) it is given by The corresponding µ anomalous dimension is given by

B Calculations of massive quark corrections
We calculate the quark mass dependent beam and soft functions for primary and secondary contributions at one and two loops, respectively. The final renormalized results are given and discussed in sec. 4. For the computation of the collinear massive quark corrections we use the Feynman rules determined from the collinear massive quark Lagrangian [89,90]. where χ n,m indicates a massive collinear quark field, P µ is the label momentum operator, and p + extracts the residual momentum component n · k. For the transverse momentum dependent (TMD), virtuality dependent, and fully differential case the measurement functions are For convenience we discuss also the fully differential case here, from which the other two cases can be obtained by an integration over the respective other variable. The beam functions are proton matrix elements of the operators O Q . To compute the (perturbative) matching coefficients onto the PDFs, we take matrix elements with partonic states, denoting e.g. for an initial collinear gluon state with momentum p µ = p − n µ /2.
At O(α s ) the only contribution to the massive quark beam function originates from an initial collinear gluon splitting into a heavy quark-antiquark pair. The corresponding diagram is given in fig. 19. The kinematics of the on-shell final state is fully constrained at one loop, so that the diagram can be evaluated without performing any integration. For the fully differential case we obtain where P qg (z) = z 2 + (1 − z) 2 is the leading-order gluon-quark splitting function. The correction B Qg at O(α s ) is UV and IR finite. It corresponds directly to the matching coefficient I  which yields the results in eqs. (4.15) and (4.21). Note that in general, this integration has to be performed for the bare result with the full dependence on the UV and rapidity regulator. However, in this case all matrix elements are finite and do not require any renormalization at this order.

B.2 Dispersive technique for secondary massive quark corrections
For observables where only the sum over the final-state hadronic momenta enters the measurement, one can use dispersion relations to obtain the results for secondary massive quark radiation at O(α 2 s ) from the corresponding results for massive gluon radiation at O(α s ). This has been discussed in detail in ref. [23]. The key relation is that the insertion of a vacuum polarization function for massive quarks Π µν (m 2 , p 2 ) between two gluon propagators can be written as The first term contains a gluon propagator with effective mass M and the absorptive part of the vacuum polarization function, which reads in d = 4 − 2 dimensions To obtain the first term on the right-hand side in eq. (B.6) the vacuum polarization function (and thus the strong coupling) was renormalized in the on-shell scheme, i.e., with n l active quark flavors. The second term in eq. (B.6) translates back to an unrenormalized strong coupling and consists of a massless gluon propagator and the O(α s ) vacuum polarization function at zero momentum transfer, which is given by In the following we will first carry out the computation of the beam and soft functions at O(α s ) for the radiation of a "massive gluon" and in a second step use the relation in eq. (B.6) to obtain the associated results for massive quarks at O(α 2 s C F T F ). In our calculations we drop the contributions from the terms proportional to p µ p ν , which vanish in total due to gauge invariance.

B.3 Secondary mass effects in light-quark beam functions
We compute the massive quark corrections to the TMD and virtuality-dependent light-quark beam function at O(α 2 s C F T F ) starting with the massive gluon case at O(α s ). Only the contributions to the matching coefficient I qq are nontrivial, so we consider only diagrams with a quark in the initial state.

B.3.1 Quark beam function with a massive gluon at O(α s )
Contributions to the fully-differential beam function As in sec. B.1 we start also here with the computation of the corrections for the fully-differential beam function. The contributing one-loop diagrams to the matrix element B qq with massless quarks in the initial state, defined in analogy to eq. (B.3), are displayed in fig. 20. They consist of a purely virtual and a real-radiation part, qq,real (t, p T , M, ω, z) . (B.9) The virtual massive gluon contributions in fig. 20(c) are the same as for other collinear quark operators like the current or the PDF and have been computed e.g. in ref. [41]. Including Figure 20. Light-quark beam function diagrams for massive gluon radiation at one loop. In addition, also the wave function renormalization correction and the mirror diagrams for (b) and (c) have to be included in the calculation. the wave function renormalization diagrams the d-dimensional result reads [24] where H α = ψ(1 + α) + γ E is the Harmonic number. Here the rapidity divergences have been regulated using the symmetric η regulator acting on the Wilson lines [40,41], while UV divergences are regulated with dimensional regularization as usual. Furthermore, the gluon mass provides an IR cutoff. The real radiation contributions in figs. 20(a) and 20(b) can be easily evaluated, since all momentum components are fully determined by the measurement. For the first diagram we get Since UV divergences do not appear for the real radiation corrections and the gluon mass regulates all IR divergences we do not need to employ dimensional regularization here. The second diagram in fig. 20(b) yields While the fully-differential quark beam function itself does not contain any rapidity divergences, we have included here the η regulator, since we will use this result to obtain the TMD beam function by integrating over the virtuality, which results in rapidity divergences for this real radiation correction. The full real radiation contributions at one loop yield For both virtual and real radiation corrections all soft-bin subtractions are parametrically power suppressed or scaleless and therefore do not contribute.
Contributions to the TMD beam function The corrections for the TMD beam function with a massive gluon can be obtained by integrating the fully-differential beam function in eq. (B.9) over the virtuality t. We write them again as Here it is necessary to keep a nonvanishing value for η in the second term to regularize the rapidity divergence for z → 1. Expanding for η → 0 we get Contributions to the virtuality-dependent beam function The virtuality-dependent beam function with a massive gluon can be obtained by integrating the results for the fullydifferential beam function over p T . We decompose the corrections again into a virtual and real radiation part, with the fully-differential real radiation contributions in eq. (B.13). Here the η regulator has already been dropped, since for the virtuality-dependent beam function no rapidity divergences arise from the real radiation contributions.

B.3.2 Secondary massive quark effects in the TMD beam function
To obtain the secondary massive quark corrections from the one-loop results with a massive gluon, we first convolve the one-loop results with the imaginary part of the vacuum polarization function according to eq. (B.6) and define The results from these dispersion integrations are andm, c, d defined in eq. (4.18). Using eq. (B.19) entails that the massive quark corrections to the strong coupling are renormalized in the on-shell scheme, i.e., the expansion is in terms of α s = α (n l ) s . Since the beam function matrix element has to be renormalized entirely in the n l + 1 flavor theory, we need to account for the second term in eq. (B.6) (which switches back to an unrenormalized α s ) and renormalize the massive quark corrections to the strong coupling in the MS scheme, such that the expansion is in terms of α s = α (n l +1) s . The beam function operator is renormalized according to where the counterterm encodes also the rapidity divergences. This yields for the renormalized matrix element with initial state quarks at O(α 2 where the (bare) vacuum polarization function Π (1) (m 2 , 0) is given in eq. (B.8). The one-loop counterterm reads The two-loop counterterm Z B absorbs all remaining UV and rapidity divergences in eq. (B.24) and is given by This yields the anomalous dimensions in eq. (4.20). The renormalized one-loop partonic beam function B (1) qq still contains IR divergences, so its exact form depends on the choice of the IR regulator.
The beam function matching coefficient I qq as defined in (2.13) can be now easily obtained. Note that the PDFs are renormalized in an n l -flavor theory with α s = α (n l ) s in contrast to the beam function. Thus, there is a contribution coming from the scheme change of α s to n l + 1 flavors for the (renormalized) one-loop PDF correction, i.e. .

(B.27)
Here the IR divergences cancel between the one-loop beam function and the PDF to give the finite one-loop matching coefficient I

B.3.3 Secondary massive quark effects in the virtuality-dependent beam function
We proceed with the virtuality-dependent beam function. While the virtual contributions are the same as for the TMD beam function given in eq. (B.20), the dispersion integration for the real radiation terms yields qq,real (t, M, z) andm t , u, v as in eq. (4.23).
To obtain the quark mass dependent matching coefficient I we carry out our calculation using a gluon mass Λ √ QT ∼ m as IR regulator. Although the result is independent of the regulator, this is technically most convenient, since this allows us to match two SCET II theories with each other in a straightforward way. 7 While the SCET II theory with n l + 1 flavors (i.e. above the mass scale) contains collinear modes, the SCET II theory with n l flavors (i.e. below the mass scale) contains collinear and csoft modes like in the mode setup of sec. 3.3.
The matching relation reads where B (n l +1) qq corresponds to the pure SCET II beam function matrix element and S (n l ) represents the csoft matrix element.
To separate UV, rapidity, and IR divergences properly from each other, we also employ the SCET II -type IR regulator (here a gluon mass Λ) for the one-loop expressions, and at this stage the renormalized matrix elements and the counterterms still depend on this IR regulator. The matching coefficient I qq can now be calculated as (in an expansion in terms of α qq (t,z,µ) .
(B.32) 7 Alternatively, one can also perform the matching between theories where the fluctuations related to the n l massless flavors are described within a SCETI theory. In this setup, there is no csoft function on the right-hand side of the matching relation in contrast to eq. (B.30). However, in this case the zero-bin subtractions for the collinear fields with respect to the ultrasoft modes in the SCETI n l flavor theory yield a nontrivial contribution to the beam-function matrix element on the left-hand side of the matching relation. Their contribution is equivalent to the inverse of the csoft function in eq. (B.30), such that the resulting matching coefficient Iqq is the same.
Here the IR divergences cancel between the one-loop beam function, the PDF, and the csoft matrix element and yield the finite one-loop matching coefficient I

B.4 Secondary mass effects in the TMD soft function
The TMD soft function is defined as with the soft Wilson line S n given by [41] S n = perms exp − g n · P ν η/2 |2P 3 | η/2 n · A s , (B.37) and in analogy for the others. Again we will first calculate the one-loop corrections to the soft function with a massive gluon, which is used in a second step to obtain the corrections from secondary massive quarks at O(α 2 s C F T F ). The UV-finite and IR-finite real radiation diagram in Fig. 21(b) gives After expanding in η and adding the mirror diagram, the real radiation contribution to the TMD soft function at one loop then reads The results from these dispersion integrations are

B.5 Csoft function at two loops
We compute the csoft function S c for beam thrust appearing in the hierarchy T m √ QT .
As in the computation for the beam function matching coefficient in app. B.3.3 we carry out the calculation using a SCET II IR regulator (a gluon mass Λ m). In this context the csoft function is the matching coefficient between the csoft matrix elements in the n l + 1 and n l flavor SCET II theories, S (n l +1) ( , m, µ, ν) = d S c ( − , m, µ, ν) S (n l ) ( , µ, ν) . (B.50) The latter are defined for any direction n as X n = perms exp − g n · P ν η/2 (n · P) η/2 n · A cs , V n = perms exp − ḡ n · P ν η/2 (n · P) η/2n · A cs , (B.52) Besides replacing the soft fields by csoft fields we have also expanded the η regulator according to the soft scaling as in ref. [24]. 9 B.5.1 Csoft function with a massive gluon at O(α s ) We will first calculate the one-loop corrections to the csoft matrix elements S with a massive gluon, that can then be used to obtain the two-loop corrections with secondary massive quarks using the dispersion technique described in sec. B.2. The one-loop results for the csoft matrix elements can be written as 1 + z 2 a + 2m 2 z a (1 + z 2 a ) + 4m 4 z 2 a (−5 + 6z a − 5z 2 a ) ln Depending on the hierarchy between m and q T and Q some of the contributions in eqs. (C.4) and (C.6) are power-suppressed and therefore only appear via nonsingular corrections in the factorization formula for the associated parametric regime in sec. 2. Note also that virtual corrections are reshuffled among the components of the factorization theorem, which are in addition evaluated with α s in different flavor number schemes. This essentially allows for a consistent factorization and the resummation of logarithms at higher orders. (C.10) Depending on the hierarchy between m and T and Q some of the contributions in eqs. (C.8) and (C.10) are power-suppressed and therefore only appear via nonsingular corrections in the factorization formula for the associated parametric regime in sec. 2.

D Plus distributions
The standard plus distribution for some dimensionless function g(x) is defined as The special case used in this paper is with L b defined in eq. (5.13).