A fast and accurate method for perturbative resummation of transverse momentum-dependent observables

We propose a novel strategy for the perturbative resummation of transverse momentum-dependent (TMD) observables, using the $q_T$ spectra of gauge bosons ($\gamma^*$, Higgs) in $pp$ collisions in the regime of low (but perturbative) transverse momentum $q_T$ as a specific example. First we introduce a scheme to choose the factorization scale for virtuality in momentum space instead of in impact parameter space, allowing us to avoid integrating over (or cutting off) a Landau pole in the inverse Fourier transform of the latter to the former. The factorization scale for rapidity is still chosen as a function of impact parameter $b$, but in such a way designed to obtain a Gaussian form (in $\ln b$) for the exponentiated rapidity evolution kernel, guaranteeing convergence of the $b$ integral. We then apply this scheme to obtain the $q_T$ spectra for Drell-Yan and Higgs production at NNLL accuracy. In addition, using this scheme we are able to obtain a fast semi-analytic formula for the perturbative resummed cross sections in momentum space: analytic in its dependence on all physical variables at each order of logarithmic accuracy, up to a numerical expansion for the pure mathematical Bessel function in the inverse Fourier transform that needs to be performed just once for all observables and kinematics, to any desired accuracy.

This has mostly to do with the peculiar structure of the factorized cross section which makes the resummation of large logarithms an interesting problem. The cross section can be factorized in terms of a hard function, which lives at a virtuality Q, the invariant mass of the gauge boson, and soft and the beam functions (or TMDPDFs) which describe the IR physics and live at the virtuality q T Q, which is the transverse momentum of the gauge boson. The soft and collinear emissions are the ones providing the recoil for the transverse momentum of the gauge boson. This automatically means that these functions are convolved with each other in transverse momentum space so that the q T of the gauge boson is a sum of the q T contribution from each emission: at s = (P 1 + P 2 ) 2 , with colliding protons of momenta P 1,2 , and gauge boson invariant mass Q 2 and rapidity y. For the case of the Higgs, we have a Wilson coefficient C t after integrating out the top quark. (For DY we just set C t = 1 in Eq. (1.1), and consider explicitly only the γ * channel in this paper.) Here, S is the soft function accounting for the contribution of soft radiation to q T , f ⊥ 1,2 are the TMDPDFs (or beam functions) accounting for the contribution of radiation collinear to the incoming protons to q T , and they depend on kinematic variables p ∓ = Qe ∓y = x 1,2 √ s. The peculiarity of the factorization is that even though the TMDPDFs form a part of the IR physics, they depend on the hard scale Q (c.f. [27]), which, as we shall see later, will play an important role in our resummation formalism. The hard function H encodes virtual corrections to the hard scattering process, computed by a matching calculation from QCD to SCET. The scale µ is the renormalization scale normally encountered in the MS scheme and plays the role of separating hard modes (integrated out of SCET) from the soft and collinear modes, by their virtuality. The additional rapidity renormalization scale ν, introduced in [28,29], arises from the need to separate soft and collinear modes, which share the same virtuality µ, in their rapidity (Fig. 1). The cross section itself is independent of these arbitrary virtuality and rapidity boundaries, but the renormalization group (RG) evolution of factorized functions from their natural scales, where they have no large logs, to arbitrary µ, ν can be used to resum the large logs in the cross section.

RG and RRG Evolution in Impact Parameter vs. Momentum Space
These functions obey the renormalization group (RG) equations in µ where F i can be C 2 t (M 2 t , µ), H(Q 2 ; µ), S( q T s ; µ, ν) or f ⊥ i ( q T i , Q, x i ; µ, ν). The RG equations in ν have a more complicated convolution structure: where G i can be soft functions or TMDPDFs. The symbol ⊗ here indicates convolution defined as Apart from the complicated structure of the RG equations, the anomalous dimensions themselves are not simple functions but are usually plus distributions [29] which makes it even harder to solve these equations directly in momentum space. A typical strategy to get around this is to Fourier transform to position (i.e. impact parameter) space, defining (1.5) the latter definitions accounting for the fact that all the distributions we encounter will have azimuthal symmetry in q T or b. This then gives ordinary multiplicative differential equations (instead of convolutions), and a closed form solution to the RG equations can be easily obtained. Moreover the cross section now takes the simpler structure, dσ dq 2 T dy = σ 0 π(2π) 2 C 2 t (M 2 t , µ)H(Q 2 , µ) db bJ 0 (bq T ) (1.6) where J 0 is the n = 0 Bessel function of the first kind. Note we have changed variables from q T in Eq. (1.1) to q 2 T in Eq. (1.6). The b-space soft and beam functions S and f ⊥ i now obey multiplicative rapidity RGEs in ν, whose anomalous dimensions and solutions we shall give below. Only the b integration in Eq. (1.6) stands in the way of a having a simple product factorization of the momentumspace cross section. Finding a way to carry it out will be one the main focuses of this paper.
For perturbative values of q T , the TMDPDF's can be matched onto the PDF's. The b-space cross section, defined as the following product of factors in the integrand of Eq. (1.6): σ(b, x 1 , x 2 ; µ, ν) = H(Q 2 , µ) S(b; µ, ν) f ⊥ 1 (b, x 1 , p − ; µ, ν) f ⊥ 2 (b, x 2 , p + ; µ, ν) , (1.8) computed in fixed-order QCD perturbation theory then contains logs of Qb 0 where b 0 = be γ E /2 (see Eq. (A.37)). Schematically, the expansion takes the form where i,ī = g for Higgs production and i = q for DY, and where we ignore effects of DGLAP evolution for the moment (we include them in Eq. (A. 37) and in all our analysis below). This takes the typical form of a series of Sudakov logs. The number of coefficients G nm that need to be known is determined by the desired order of resummed accuracy. Using the heuristic power counting ln Qb 0 ∼ 1/α s in the region of large logs needing resummation, the leading log (LL) series includes the O(1/α s ) terms m = n + 1, the next-to-leading log (NLL) series the O(1) terms up to m = n, at NNLL the O(α s ) terms up to m = n − 1, etc. When we later talk about resummation in momentum space, we will define our accuracy by the corresponding terms in the b-space integrand that we have successfully inverse Fourier transformed (cf. [30]). For a TMD cross section, the logs in the full QCD expansion Eq. (1.9) are factored into logs from the hard and soft functions and TMDPDFs of ratios of the arbitrary virtuality and rapidity factorization scales µ, ν and the physical virtuality and rapidity scales defining each mode. Each function contains logs: where the individual anomalous dimension coefficients satisfy the constraints Z H +Z S +2Z f = 0 and γ 0 H + γ 0 (For DY, γ C 2 t = 0.) RG evolution of each factor-hard, soft, and collinear-in both virtuality and rapidity space from scales where the logs are minimized, namely, µ H ∼ Q, µ T ∼ M t and, naively, µ S,f ∼ 1/b 0 for the virtuality scales, while ν S ∼ µ S and ν f ∼ Q for the rapidity scales, to the common scales µ, ν achieve resummation of the large logs, to an order of accuracy determined by the order to which the anomalous dimensions and boundary conditions for each function are known and included. This will be reviewed in further detail in Sec. 2. This, at least, is the procedure one would follow to resum logs in impact parameter space. It corresponds, in SCET language, to how to obtain the result of the standard CSS resummation through traditional [10] or modern techniques [31], as well as recent EFT treatments like [32]. Then the resummed b-space cross section is Fourier transformed back to momentum space via Eq. (1.6). The main issue with this procedure is that the strong coupling α s (µ) in the soft function and TMDPDFs is then evaluated at a b-dependent scale µ S,f ∼ 1/b 0 , which enters the nonperturbative regime at sufficiently large b in the integral in Eq. (1.6). So the integrand must be cut off before reaching the Landau pole in α s . There are quite a few procedures in the literature to implement precisely such a cutoff by introducing models for nonperturbative physics, see e.g. [33][34][35][36][37].
Motivated by these observations, in this paper we explore the following main questions: • Even though the natural scale for minimizing the logarithms in the soft function and TMDPDFs is a function of the impact parameter b, can we actually set scales directly in momentum space, after performing the b integration? (without an arbitrary cutoff of the b integration?) • If that is possible, can we obtain a closed-form expression for the cross section which will be accurate to any resummation order and ultimately save computation time?
In Sec. 2 we shall propose a way to answer the first question, and in Sec. 3 we shall develop a method to answer the second. To aid the reader in quickly grasping the main points of our paper, we offer a more detailed-than-usual summary of these sections here, which is somewhat self-contained and can be used as a substitute for the rest of the paper upon a first reading. Readers interested in the details of our arguments can then delve into the main body of the paper. Except for a brief discussion near the end, we emphasize we address only the perturbative computation of the cross section in this paper.

A hybrid set of scale choices for convergence of the b integral
Regarding the first question, the issue with leaving the µ, ν scales for the soft function and TMDPDFs unfixed before integrating over b in Eq. (1.6) is that the integral, while avoiding the Landau pole from long-distance/small-energy scales, is then plagued by a spurious divergence from large-energy/short-distance emissions [29], e.g. at NLL accuracy: although H, S, f ⊥ are truncated to tree level at NLL while C t = α s . The hard scale µ H is usually set to iQ to implement what is called π 2 resummation to improve perturbative convergence [38,39]. This integral, as we will see below in Eq. (2.33), is still divergent. At this point, µ L and ν H,L are b-independent and cannot help with regulating the integral. What we need in Eq. (1.12) is a factor that damps away the integrand for both large and small b. In this paper, we adopt the approach that there are already terms in the physical cross section itself that can play the role of this damping factor and that we should use them. Namely, at NLL order and beyond, the soft function evaluated at the low scales µ L , ν L in the integrand of Eq. (1.12) contains logs of µ L b 0 that we can use to regulate the integral, see Eq. (A.24). Since S(b, µ L , ν L ) no longer contains large logs (if µ L , ν L are chosen near the natural soft scales), it is typically truncated to fixed order (see Table 1). However, we know that the logs themselves still exponentiate, being predicted by the solution Eq. (A.21) to the RG and νRG equations. If we could keep the exponentiated one-loop double log in S in Eq. (A. 24) in the integrand of Eq. (1.12), exp αs(µ L ) 8π Z S Γ 0 ln 2 µ L b 0 , where Z S = −4, it would play precisely the role that we desire. Now, as we argue below, if we are going to keep this term exponentiated, we should also include a piece of the 2-loop rapidity evolution kernel ∼ α 2 s ln 2 µ L b 0 ln(ν H /ν L ) given by Eqs. (2.26) and (2.27) in the exponent, as it is of the same form and same power counting, so that the terms we wish to promote to the exponent of Eq. (1.12) at least at NLL order are: (1. 13) These are terms that would otherwise be truncated away at strict NLL accuracy. Since they are subleading, we are in fact free to choose to include them (and not other subleading terms that are formally of the same order). While this is admittedly a bit ad hoc, we take the view that it is no more arbitrary than any regulator or cutoff we might choose to introduce to Eq. (1.12), and these are terms that actually exist in the expansion of the physical cross section. We can rephrase this choice of subleading terms in Eq. (1.13) to include in Eq. (1.12) as part of our freedom to choose the precise scale ν L in Eq. (1.12) (the variation of which anyway probes theoretical uncertainty due to missing subleading terms). Namely, if one were otherwise to choose ν L ∼ µ L in S in Eq. (1.12), we propose then shifting that choice to: Maintaining a Gaussian form for the exponent in ln b inside the b integral will be crucial to this strategy. Beyond NLL, we will choose to keep the same shifted scale choice Eq. (1.14), but to ensure that we do not introduce higher powers of logs of µ L b 0 than quadratic into the exponent of the integrand in Eq. (1.12), we make one additional modification to how we treat the rapidity evolution kernel. Namely, in the all orders form of the rapidity evolution kernel: we divide the anomalous dimension into a purely "conformal" part containing only the diagonal pure cusp terms with a single log of µ L b 0 and the same for the non-cusp part γ RS . We divide the rapidity evolution kernel Eq. (1.15) into corresponding "conformal" and "nonconformal" parts: where V Γ contains pure anomalous dimension coefficients, and V β contains all the terms with beta function coefficients, whose expansion is shown in Eq. (2.44). We will keep V Γ exponentiated as in Eq. (1.18), and the shift ν L → ν * L in Eq. (1.14) will turn it into a Gaussian in ln µ L b 0 and thus allow us to carry out the b integral in Eq. (1.12) (updated beyond NLL). However, to keep this Gaussian form of the exponent, we will then choose to truncate V β at fixed order. The logs of µ L b 0 in V β will give integrals in Eq. (1.12) that we can carry out by differentiating the basic result we obtain in Sec. 3. Admittedly, this expansion and truncation of V β is not part of any usual scheme for N k LL resummation, but is our addition. In particular, V β still contains large logs of ν H /ν L as seen in Eq. (2.44). This means that starting at NNLL order, we will not actually exponentiate all the logs that appear at this accuracy, as usual log counting schemes in the exponent require. This is the price we choose to pay for the (semi-)analytic solution we obtain in Sec. 3, which requires a Gaussian exponent in ln b in the b integrand. This is essentially an implementation of Laplace's method for evaluating the b integral. As we argue in Sec. 2.3.3, our truncation of V β in fixed order is not as bad as failing to exponentiate large logs involving µ (which we do exponentiate) would be. There is never more than a single large log of ν H /ν L appearing in the exponent of the rapidity evolution. Thus, the series of terms in the fixed-order expansion of the exponentiated V β are suppressed at every order by another power of α s . 1 Our expansion 1 In the µ evolution kernels, the exponents, e.g. Eqs. (A.7a) and (A.8a), themselves contain higher and higher powers of large logs of µH /µL, and truncating any part of it to fixed order would not be sensible. Truncating V β in Eq. (2.44) to the same order as other corresponding genuinely fixed-order terms at N k LL accuracy makes more sense. Loosely speaking, we maintain counting of logs in the exponent for most of the cross section, except for V β , in which we revert to older log counting in the fixed-order expansion (N k LL E vs. N k LL F in [27].) of V β should be viewed as asymptotic expansion, which, indeed, we find truncating at a finite order yields a good numerical approximation to the resummed cross section (within the theoretical uncertainties otherwise present in the resummed cross section at N k LL accuracy) in the perturbative region. 2 Note, furthermore, that in the conformal limit, V β = 1, and the exponentiated part V Γ of the rapidity evolution would be exact.
We should point out that, through the shift Eq. (1.14), we do introduce b dependence into our choice of scale ν L , so we would not call our resummation scheme entirely a momentumspace scheme. (See [40,41] for such proposed methods.) We do, however, leave the µ L scale unfixed until after the b integration, and this still allows us to avoid integrating over a Landau pole in α s (µ L ) in Eq. (1.6).
In Sec. 2 we also use our freedom to determine exactly where µ L ∼ q T should be in order to improve the convergence of the resummed perturbative series. We argue it should be set at a value such that other unresummed fixed-order logs make a minimal contribution to the final momentum-space cross section. For small values of q T , this scale turns out to be shifted to slightly higher values µ L ∼ q T + ∆q T . Without making such a shift we find instabilities in the evaluation of the cross section. This is similar in spirit to the shift µ L → q T + q * T proposed in [42,43], though not identical in motivation, implementation, or interpretation in terms of nonperturbative screening.

A semi-analytic result for the b integral with full analytic dependence on momentum-space parameters
If we stopped there, our choice of scale Eq. (1.14) might be no more than just another in a long series of proposed schemes to avoid the Landau pole in Eq. (1.6), and, in addition, our division of the rapidity evolution kernel Eq. (1.15) into an exponentiated and a fixedorder truncated part in Eq. (1.17) would be quite unnecessary and inexplicable. However, what we find in Sec. 3 is that all of these scheme choices together yield a form of the bspace integrand Eq. (1.8) that is Gaussian in ln b so that we can integrate it analytically into a fairly simple form, modulo a numerical approximation for the pure Bessel function in Eq. (1.6). The dependence on all physical parameters and scales such as q T , µ H,L , ν H,L , is obtained analytically. We now briefly summarize our procedure and results.
With the division Eq. (1.15) of the rapidity evolution kernel and the scale choice ν * L in Eq. (1.14), the momentum-space cross section Eq. (1.6) can be written in the form, given in Eq. (3.1), where we isolated the b integral, 20) 2 We expect that the way in which it breaks down for small qT will yield clues to the behavior of the nonperturbative contributions to the cross section, which however are not the subject of this paper.
F contains the fixed-order terms, including powers of logs of µ L b 0 , contained in the soft function, TMDPDFs, and the part V β in Eq. (1.17) of the rapidity evolution kernel that we choose to truncate at fixed order. As we will show in Sec. 3, the exponentiated part of the rapidity evolution kernel V Γ in Eq. (2.43), with the scale choice ν * L in Eq. (1.14), can be written in the form of a pure Gaussian in ln b, where C, A, χ are functions of the scales µ L , ν L , ν H and the rapidity anomalous dimension, given explicitly in Eq. (3.6). In particular A ∼ Γ[α s (µ L )]. If we could figure out how to integrate this Gaussian against the Bessel function in Eq. (3.2), we would be done. Now, the presence of terms in F in Eq. (3.2) with nonzero powers of ln µ L b 0 can be obtained from the basic result by differentiation, as we will derive in Sec. 3.3, so we really only need to figure out how to evaluate the basic integral, where Ω ≡ µ L e γ E χ/2. Now, our mathematical achievements in this paper do not reach so far as to evaluate Eq. (1.22) analytically in its precise form. We will, however, develop a procedure to evaluate it in a closed form, with analytic dependence on q T , A, Ω (and thus all scales and anomalous dimensions), to arbitrary numerical accuracy determined by the goodness of an approximation we use for the Bessel function. We find a basis in which to expand the pure Bessel function, in which just a few terms are sufficient to reach a precision better than needed for NNLL accuracy in the resummed cross section, and which can be systematically improved as needed. The details of this derivation are in Sec. 3, but we summarize the key steps here.
The first step is to use a Mellin-Barnes representation for the Bessel function, where the contour lies to the left the poles of the gamma function Γ(−t), so c < 0. The choice c = −1 turns out to be well behaved, and useful as it is closely related to the fixed-order limit of where we parametrized the contour in Eq. (1.23) as t = c + ix, and where t 0 = −1 + AL, where L = ln(2Ω/q T ). We also used the reflection formula Γ(−t)Γ(1 + t) = −π csc(πt). It may appear that we are no farther along than when we started with Eq. (1.22)-we still have to do the x integral. However, we now observe that thanks to the form of the Gaussian with a width ∼ √ A, which vanishes in the limit α s → 0, we only need to know the rest of the integrand, in particular in a fairly small region of x. In fact we shall not need it out to more than |x| ∼ 1.5 for any of our applications. Thus if we can find a good basis in which to expand f where every term gives an analytically evaluable integral in Eq. (1.24), we shall be in good shape. Now, this would not have been a good strategy in Eq. (1.20) for the Bessel function itself, as it is highly oscillatory out to fairly large b, and the Gaussian does not damp the integrand away quickly-its width only grows as α s (i.e. A) goes to zero. However, inside Eq. (1.24), we find an expansion of f (Eq. (1.25)) in terms of Hermite polynomials H n to work very well: where we now pick c = −1 and factor out Gaussians with widths set by a 0 , b 0 which closely (but not exactly) resemble the real and imaginary parts of Γ(1 − ix) 2 itself, near x = 0. Their departures from an exact Gaussian are accounted for by the remaining series of Hermite polynomials. It would be natural to choose the scaling factors α, β for the Hermite polynomials to be α 2 = a 0 and β 2 = b 0 , but instead we leave them free, to be determined empirically to optimize fast convergence of the series. We find we can get sufficient numerical accuracy acceptable for NNLL accuracy in the final cross section with just a few (3 or 4) terms in each series, real and imaginary. The coefficients c n in Eq. (1.26) still have to be determined by the numerical integrals Eq. (3.25), which unfortunately prevents us from having a fully analytic result for the momentum-space cross section. However, the series Eq. (1.26) with these numerical coefficients depends only on properties of the pure mathematical function Γ(1−ix) 2 itself-not on any physical parameters. The dependence on these we keep analytically. All that is left is to evaluate analytically the integral of each Hermite polynomial against the Gaussian in Eq. (1.24), leading to the result we derive in Eq. (3.37), where each term H n is defined by the integral, each of which has the closed form result, (1.28) are the primary mathematical result of our paper. The final and primary physics result of our paper, Eq. (3.77), the resummed cross section in momentum space, is given in terms of the analytic result Eq. (1.27) for I 0 b above. While a first glance at these formulas may not be particuarly illuminating, we would like to emphasize that the results Eq. (1.29) of the integrals Eq. (1.28) in terms of which the final result is written contain within them explicit dependence on all the physical parameters such as q T and the scales µ L,H , ν L,H that one would want to vary not only to evaluate the cross section but estimate its theoretical uncertainties. This is made very fast to compute by our explicit analytic formula, modulo only the numerically computed coefficients in Eq. (1.26), but that can be done once and for all, for any TMD observable or kinematics.
It is important to emphasize that the result Eq. (3.77) we give for the resummed momentumspace cross section represents, then, a triple expansion: • V β expansion: The additional truncation of the V β part of the rapidity evolution kernel in Eqs. (1.17) and (2.54) to a fixed order in α s , according to Table 1, makes possible the integration of a rapidity exponential Eq. (1.21) Gaussian in ln b, and behaves as an asymptotic expansion. This expansion becomes exact in the conformal limit of QCD. , truncated to a finite number of terms, as needed to achieve a numerical accuracy in the cross section better than the perturbative uncertainty already present.
These are the expansions we find necessary to obtain the analytical (up to the numerical Hermite coefficients) result for the cross section in Eq. (3.77). Each expansion is systematically and straightforwardly improvable. The last two expansions could be avoided if one is satisfied with a fully numerical evaluation of the b integral in Eq. (1.20). We find the expansions worthwhile as they yield the faster and similarly accurate formula Eq. (3.77). 3 In the rest of the paper, we will do our best to make clear which expansion(s) are being used at each stage.
The remainder of our paper is organized as follows. Before concluding Sec. 3 we match our resummed result onto fixed-order perturbation theory in Sec. 3.3.3 and obtain and illustrate our results resummed to NNLL accuracy and matched to O(α s ) fixed order. In Sec. 4 we offer some comments about expected nonperturbative corrections to our perturbative predictions, and in Sec. 5 we survey other methods to resum TMD cross sections in the literature as compared to ours. We conclude in Sec. 6. In the Appendices we offer an array of technical results we need to evaluate the integrals and cross sections in the rest of the paper, as well as some alternatives to particular choices of schemes or methods we made in the main body of the paper.

Resummed cross section
In this section, we first review RG and rapidity RG methods to resum logs of separated hard and soft/collinear virtuality scales and collinear and soft rapidity scales in TMD cross sections. We review a standard procedure to set scales in impact parameter space, and then inverse Fourier transforming to momentum space. Then we propose a hybrid scale setting scheme where the soft rapidity scale is chosen to depend on b, but the virtuality scales are chosen only after we transform back to momentum space, allowing evaluation of the b integral without encountering a Landau pole. We also organize the rapidity evolution kernel in a way that anticipates making use of it to perform the b integral semi-analytically in Sec. 3. We also address the choice of the soft virtuality scale itself in momentum space to ensure stable power counting of logs.

RGE and νRGE solutions
We defined the b-space cross section in Eq. (1.8). The cross section is independent of the virtuality and rapidity factorization scales µ, ν, but each factor H, S, f ⊥ does depend on them, and contains logs of ratios of the scales µ, ν to their "natural" virtuality or rapidity scales µ H , (µ L , ν L ) and (µ L , ν H ), at which no large logarithms exist. Thus we would like to evaluate each factor at these separate scales, and then use RG and νRG evolution to take them to the common scales (µ, ν) at which the cross section is evaluated. The solutions to these evolution equations are in a form where the large logs of ratios of separated scales are resummed or exponentiated.

Hard function
The hard function H = |C| 2 depends only on the virtuality scale µ, and obeys the RGE, where the anomalous dimension takes the form, where Γ cusp is known as the cusp anomalous dimension, the proportionality constant Z H = 4, and γ C [α s ] is the non-cusp part of the anomalous dimension. The cusp anomalous dimension can be written as an expansion in the strong coupling α s (µ).
The RGE Eq. (2.1) has the solution where the evolution kernel is and U H = |U C | 2 . The pieces K γ , η Γ , K γ of the evolution kernel are given by: Explicit expressions for these kernels up to NNLL accuracy are given in App. A.1.
For the case of the Higgs production, we have another Wilson coefficient (C 2 t ) obtained from integrating out the top quark. So in addition to the hard function, we also have a running for this coefficient.
The anomalous dimension takes the general form where γ i is a number. The RGE has the solution where the evolution kernel is Explicit expressions for these kernels up to NNLL accuracy are given in App. A.1.

Soft function and TMDPDFs
The soft function in b space obeys the µ-and ν-RGEs, while the TMDPDFs/beam functions obey The µ anomalous dimensions take the form: where µ and ν independence of the cross section require In γ f µ we recall the large rapidity scales are given by p ± = Qe ±y = x 1,2 √ s for the two colliding hard partons. Note p + p − = Q 2 . As for the form of the ν anomalous dimensions, at one-loop fixed order in perturbation theory, they take the values the µ-scale at which rapidity logs are minimized in b space. Beyond O(α s ), the form of the ν anomalous dimensions can be deduced from the consistency relation: Solving this equation in µ, we obtain (2.18) where the boundary condition of the evolution at 1/b 0 determines the non-cusp part γ Ri [α s ] of the ν anomalous dimension. The independence of the cross section Eq. (1.8) on ν requires, again, Z S = −2Z f , and γ RS = −2γ Rf . The solutions of the µ and ν RGEs for S and f ⊥ are: where each pair of equalities accounts for two, equivalent paths for RG evolution in the two-dimensional µ, ν-space (see Fig. 1). The evolution kernels U S,f in the µ direction are: Note that the µ anomalous dimension for f ⊥ in Eq. (2.14b) does not have a log of µ in its cusp anomalous dimension term, so no K Γ term appears in its evolution kernel U f in Eq. (2.20b). Meanwhile, the ν evolution kernels V S,f are given by integrals over ν of Eq. (2.18),

RG evolved cross section
We can now put these pieces together to express the cross section Eq. (1.8) in terms of the hard, soft, and beam functions evolved from their natural scales where logs in each are minimized, and thus logs in the whole cross section are resummed: in which we observe that the explicit dependence on the arbitrary scales µ and ν has exactly canceled out, leaving only the dependence on the natural scales µ H,T,L,b and ν L,H where the hard, soft, and beam functions live. Note U C 2 t , K γ C 2 In Eq. (2.24), we envision that the rapidity evolution takes place at (or around) the scale 1/b 0 (see Fig. 1). Then we can actually just expand the evolution factor η Γ (1/b 0 , µ L ) and the rapidity anomalous dimension γ RS in a fixed-order expansion in α s (µ L ), to the order required for N k LL accuracy. This is in fact what we will do below. Then it becomes useful to split up U tot in Eq. (2.24) into two factors, In practice we truncate this expansion at the appropriate order of logarithmic accuracy. We will always pick µ L in such a way that none of these generate large logs (either µ L ∼ 1/b 0 in b space, or in momentum space in such a way that they remain small after inverse transformation-see Sec. 2.3.5), except the factor of ln(ν H /ν L ) in Eq. (2.26). This is an observation that will become key below, when we split γ ν into two separate parts in Sec. 2.3.3.

How to choose the scales?
To evaluate the cross section Eq. (2.22) (and its inverse Fourier transform back to momentum space Eq. (1.6)) explicitly, we need to make explicit choices for the scales µ H,L and ν L,H between which to run in Eq. (2.24). Choosing these near the scales at which the logs in each individual function are minimized in principle achieves resummation of all large logarithms. However, these natural choices are different in impact parameter and momentum space.
There are various possible ways in which this resummation can be handled. In this paper, we envision, in Eqs. (2.25) and (2.26) and Fig. 1, running the hard function in µ to the natural low scale of the soft and collinear functions, and the soft function in ν to the natural rapidity scale of the TMDPDFs. The high scales µ H and ν H for the running of the hard and soft functions are unambiguously best chosen near the invariant mass ∼ Q of the gauge boson. The choices of the low scales µ L and ν L are under debate since we can choose those scales either in b space or in momentum space.
The µ scales, like in a usual EFT, are a measure of virtuality of the modes that contribute to that function. For the hard function, this virtuality scale, not surprisingly, is the hard scale Q, which also happens to be the scale choice for which the logarithms in the hard function are minimized. The virtuality for soft and beam functions is of the order of the transverse momentum that the function contributes to the total transverse momentum. This can be seen in momentum space where the product of these functions in impact parameter space turns into a convolution over transverse momentum, Eq. (1.1). Since the total transverse momentum is a sum over the transverse momenta contributed by each function, for a given total q T , the contribution of any one of these functions traverses a range of scales. While this situation is not unique to this observable, what is different is the dependence of the TMDPDF on the hard scale Q. As we will see, due to this Q dependence, the conjugate natural scale to b 0 in the resummed result is no longer q T but is shifted away from q T towards Q.
However, the final aim of any resummation is to have a well behaved perturbative series. Whenever the fixed order logs become too large, the expansion in α s does not converge, and it becomes necessary to reorganize the series in terms of resummed exponents. A successful resummation is then one in which the fixed order terms that are left behind form a rapidly converging series in α s . Since the large logarithms are, in fact, the terms that spoil the convergence of the fixed order perturbative series, the general strategy is then to minimize the effect of these logarithms in the residual fixed order series.
Keeping these issues in mind, we explore two possible sets of scale choices for µ L , ν L for resummation: the standard choices in impact parameter space in Sec. 2.2, and a new proposed set of choices in Sec. 2.3 allowing evaluation of the resummed cross section in momentum space.

Scale choice in impact parameter space
To choose scales for resummation, we need some idea about the natural scales at which each of the three functions (hard, soft and TMDPDF) live. This is easily seen by looking at their behavior up to one loop. From the results given in App. A, we find that each of these functions are function of the logs, given in impact parameter space for S, f ⊥ . In this space, it is perfectly evident from the fixed-order calculation that the natural scale which minimizes all the logs in µ for the soft and beam functions is µ = µ L ∼ 1/b 0 . Since the final cross section at a given q T involves an integral over a range of b, the scale choice is in fact spread over a range of scales. This is to be expected from the earlier discussion of their being no unique physical scale for the soft and beam functions. The natural scales for the various functions then are µ = µ H for the hard function, (µ, ν) = (µ L , ν L ) for the soft function and (µ, All the logs can then be resummed by running the hard function from the scale Q to 1/b 0 and the soft function in ν from Q to 1/b 0 . This will produce the result (for the central values, not counting scale variations) of the CSS formalism. Therefore, this scheme resums logarithms of the form ln(Qb 0 ). The power counting adopted for this resummation then is straightforward since there is only one type of log. It is usually chosen as α s ln(Qb 0 ) ∼ 1. Leading log (LL) accuracy then resums α n s ln n+1 (Qb 0 ), with NLL and NNLL down by one and two powers of the logarithm respectively.
Since the lower scales µ L are chosen in b space, the cross section involves an inverse Fourier transform over arbitrarily large values of b, so eventually we hit the nonperturbative scale which manifests itself in the form of the Landau pole: α s (1/b 0 ). This corresponds to the fact that the beam and soft functions can contribute arbitrarily small values of transverse momentum even when the total transverse momentum is perturbative. This is usually handled by putting a sharp or smooth cutoff in b space which provides a way to model nonperturbative physics [33][34][35][36][37]. The impact of these nonperturbative effects will be discussed in Sec. 4.
The obvious advantage of this scheme is that the power counting is unambiguous and we can guarantee that with the central values of scale choices in b space, all the logs in the residual fixed order series are set exactly to zero. As far as the choice of central values is concerned, the terms that are resummed are exactly equal to the CSS resummation formalism [10]. However, due to the introduction of the new rapidity renormalization scale ν, there is much better control over which terms can be included in the exponent and which terms remain in the fixed order [28,29]. This directly translates into a much better estimates of error due to missing higher order terms.
Another advantage of having control over what exactly goes in the exponent is when we match the resummed cross section to the fixed order cross section at large q T . To maintain accuracy over the full (perturbative) range of q T (Q ≥ q T Λ QCD ), we need to turn off resummation at the value q T where the nonsingular contribution is the same order as that of the singular one. Due to the two independent scales µ, ν available, this can be done very easily by using profiles in these scales to smoothly turn off the resummation and simultaneously match onto the full (including non-singular pieces) fixed order cross-section. This technique was implemented for the Higgs transverse spectrum in [32] to obtain the cross section to NNLL+NNLO accuracy. In this paper, for the purposes of comparison with other resummation schemes, we present the results for the cross section at NNLL accuracy for both the Higgs and DY using this scheme (Fig. 2).
After having decided on the central values, we next need to estimate the size of higherorder perturbative corrections we have missed by using scale variations. This is accomplished by varying the two renormalization scales µ and ν independently as detailed in [32]. Since we resummed and chose scales in b space, our final result involves an inverse Fourier transform It turns out the cusp anomalous dimension for DY (Γ 0 = 4C F ) is much lower in magnitude than that for Higgs (Γ 0 = 4C A ). This results in a much lower damping effect at large values of b, see Fig. 3. The plot shows the b space integrand K(b) for q T = 5 GeV. The divergence beyond b = 3 in the left-hand plot in Fig. 3 is the Landau pole. It is clear that there is only a narrow window of stability before we hit the Landau pole. The situation worsens when try and do a scale variation about the central choice µ L ∼ 1/b 0 , specifically µ ∼ µ L /2, as shown in the right-hand plot of Fig. 3. What this does is to bring the Landau pole closer by factor of 2, in which case there is no clear separation between the perturbative and nonperturbative regimes. It is then unclear how a hard cut-off or even a smooth one would give an accurate estimation of the perturbative uncertainty band. The error bands in Fig. 2 for the case of DY at NNLL, therefore cannot be generated meaningfully via this scale variation. For the purposes of error estimation then we were forced to make an educated guess for the upper boundary of the error band at NNLL. Clearly, we would like to find a procedure that does better.

Resummation in momentum space
The Landau pole in the b-space resummation scheme above comes about due to the running of the strong coupling α s all the way down to 1/b 0 which goes all the way down to Λ QCD . A natural question to ask, then, is if we can avoid the Landau pole by choosing the µ scale in momentum space. It is clear from the above discussion that we cannot choose a single scale in momentum space which will put all the logs to 0. However, what we can ask is whether it is possible to make an appropriate choice for µ, ν directly in momentum space such that the resummed exponent, on an average, minimizes the contribution from the residual fixed order logs. These small logs, though nonzero, can then be included order by order ensuring that they only contribute to the same order as the error band.

Leading logs
As before let us assume the power counting α s ln(µ H /µ L ) , α s ln(ν H /ν L ) ∼ 1, where µ L , ν L will be our choices of the renormalization scales in momentum space. According to this power counting then, at leading log (LL) we wish to resum terms of the form α n s ln n+1 ( µ H µ L ) in the exponent of the cross section. We also assume that the residual logs of the form ln(µ L b 0 ) are small (of O(1)) and need not be resummed. The resummation then involves running just the hard function from µ H ∼ Q to µ L . All the residual fixed order logs as well as logs in ν are then subleading at this order. The cross section we get is where we can obtain U LL H from Eq. (2.5). This is highly singular and gives a trivial result at nonzero q T . This suggests that the supposedly higher order pieces that we are ignoring are not unimportant. Let's look at the fixed order pieces left over at one loop: The biggest term here appears to be ln(Q/µ L ) ln(µ L b 0 ). According to our naive power counting, this term is subleading at LL and hence should be ignored. Going to momentum space, this term gives us ln( Q µ L )/q T , which is in fact the leading logarithmic term in the fixed order cross section at nonzero q T . So then it appears that we haven't resummed any of the logs which contribute at nonzero q T , which makes sense since we get a trivial result at nonzero q T . We can then conclude that the LL cross section in Eq. (2.30) with this power counting only gives us the result at zero q T . We will have to go to NLL to have any handles to "fix" it.

Next-to-leading logs
The next logical step is to go to NLL. We first update the hard function to resum all logs of the form α n s ln n (Q/µ L ). However this by itself is not useful since it will lead to the same problem as for the LL case in that it will only contribute at zero q T . But power counting demands that we also run the soft function in ν from the scale We can then use the leading term of this anomalous dimension to resum the soft function: . This result works for q T > 0. Clearly then we still have a singularity in the cross section at ω s = 1. As was noted in earlier papers [29], this is a divergent series. The reason for this is that the logarithms in b space of the form ln n (µb 0 ) do not translate directly to logarithms of ln n (µq T ). The simplest example of this is the inverse Fourier transform of ln(µb 0 ). This gives us a term proportional to the plus distribution function L 0 = 1 µ 2 µ 2 /q 2 T + as defined in [29]. For nonzero values of q T , this is simply 1/q 2 T . The same thing happens for higher powers of logarithms. For example ln 2 (µb 0 ) also gives a term proportional to L 0 along with terms which go like ln(µ/q T ). So an all order summation of the logarithms in b space eventually adds up all of these pieces (whose coefficient is controlled by ω s ) which leads to a divergence for sufficiently large ω s . So this step softens the singularity somewhat moving it away from q T = 0, so that we at least have a nonzero cross section for nonzero q T . However, the result is still singular. The singularity results from the single logarithm of ln(µ L b 0 ) in the exponent which diverges at very small values of b. We would have expected that the integrand would receive its dominant contribution from the region b ∼ 1/q T . Instead, as it was noted in [29] it is the UV region b ∼ 1/Q which appears to dominate the integral.
Since our resummation kernel in b space is unstable, we cannot yet assign a power counting to the residual logs. This is because the fixed order logs (of the form α n s ln m (µ L b 0 ) are weighted by this exponent in the inverse Fourier transform. So before we can talk of power counting for these logs, we must stabilize this kernel such that the region b ∼ 1/q T dominates. Clearly, to cure this singularity (which is a UV singularity) we need an even powered logarithm ln 2m (µb 0 ) with a negative coefficient in the soft resummation exponent. So we move ahead and attempt to see if we can resum the next biggest term (which would technically be subleading at this order) ∼ α s ln 2 (µ L b 0 ). There are two places we can find such a double logarithm, one from the second term in Eq. (2.32), which is part of the 2-loop rapidity anomalous dimension, and another from the fixed-order soft function at 1-loop, see Eq. (A.24), or Eq. (2.31), which includes the beam function contribution. Since the term in Eq. (2.32) is already part of the RRG exponent, including it just means tacking on a subleading term in the exponent, which we are free to do at NLL order. As for the fixed-order term in Eq. (A.24), while standard resummation schemes tell us to truncate the logs in the soft function S(b, µ L , ν L ) at fixed order, these logs in the soft function, though not large, do actually exponentiate, see Eq. (A.21). The higher order logs are subleading at NLL, but again, we are free to include them, either in fixed-order or exponentiated form.
Specifically, together with standard soft exponent at NLL, the terms we want to put in the total soft exponent are: The typical default choice for the low rapidity scale is ν L = µ L . We imagine making a choice near this scale. But we will rescale it so that the extra ln 2 (µ L b 0 ) terms in Eq. (2.34) get automatically included in the standard soft NLL exponent. We now attempt to reproduce the above terms in the pure NLL soft exponent, with a modified scale choice ν * L : This can be done with the scale choice: The value of p that allows us to obtain the double log terms in Eq. (2.34) is determined by comparing them with the exponent in Eq. (2.35): By comparing above equation to Eq. (2.34) we find This ensures that we have now resummed all the terms of the form α s ln 2 (µ L b 0 ). (We assume one did not choose the default ν L scale on the right-hand side of Eq. (2.36) with nontrivial b dependence.) Since Γ 0 > 0, the double log now provides the necessary stability to the exponential kernel in b space (at both large and small values of b). With this term in place, we can now talk of a systematic power counting for the fixed order logs, which, hitherto, was not meaningful. So we now adopt the usual power counting that ln(µ L b 0 ) ∼ 1, i.e., this log is small. We still need to confirm numerically, that the fixed order logs that remain , when integrated against our exponent in b space, are actually small so that our series is well behaved perturbatively. With this power counting in place, our result for the resummation at NLL then looks like At NLL, all fixed order logs are subleading and hence not included. We will find below that not only does the quadratic ln 2 µ L b 0 term in the exponent of the b integrand in Eq. (2.39) introduced by the scale choice Eq. (2.36) make the integral converge for both small and large b, it actually makes it integrable analytically (after a very good numerical approximation for the Bessel function). We will describe this in detail in Sec. 3. First, we explore how to generalize the above-described procedure beyond NLL.

NNLL and beyond
At NNLL and higher order, we have some freedom in choosing how to update the accuracy of the rapidity evolution kernel in Eq. (2.26). In this subsection, we will use this freedom to look for a way to preserve both the stable power counting of fixed-order logs of µ L b 0 after integrating over b as well as our ability, in Sec. 3, to evaluate that b integral (semi-)analytically.
A simple, standard, choice would simply be to include the next order terms in the rapidity anomalous dimension Eq. (2.27), e.g. the O(α 2 s ) terms at NNLL, while keeping the scale choice ν * L in Eq. (2.36) for the soft rapidity scale. There are two somewhat undesirable consequences of this choice, however. First, as we noted at the end of Sec. 2.3.2, an exponent in the b integrand that is quadratic in ln µ L b 0 will make the integral analytically computable, to a very good approximation to be described in Sec. 3-but not higher powers of ln µ L b 0 , which will begin to enter starting at N 3 LL order in Eq. (2.27). Second, starting at NNLL order, maintaining the scale choice ν * L would put some terms into the exponent twice, namely, the Z S Γ 0 β 0 term in O(α 2 s ), once from the shift from ν L to ν * L Eq. (2.37) in the O(α s ) term of the exponent, and once from updating the anomalous dimension appearing in Eq. (2.26) with the two-loop value in Eq. (2.27). Of course, this is compensated by the shift from ν L to ν * L in the fixed-order soft function. From Eq. (A.24): where we see that the one-loop double log of µ L b 0 has canceled-the scale choice ν * L has promoted this term to the exponent in Eqs. (2.26) and (2.27). The added O(α 2 s ) term subtracts off (at fixed order) the corresponding Γ 0 β 0 term that was double-added to the exponent at NNLL. While this is acceptable as far as power counting goes, it seems awkward to have this term double-counted in the exponent by itself. Similar observations apply to additional terms at higher orders.
There are a number of ways to avoid these problems, while maintaining the desirable quadratic ln 2 µ L b 0 terms in the exponent of the b integrand of the cross section. One possibility would be just to revert back to the ordinary scale choice ν L from ν * L beyond NLL, but for meaningful comparisons between orders of accuracy, we should maintain the same scale choice as we increase accuracy. Moreover this solution would not prevent higher than quadratic power terms in ln µ L b 0 from entering the exponent, which will spoil our analytic integration below. Since the choice ν * L is needed at NLL to stabilize the b integral in Eq. (2.39) and restore a meaningful resummed power counting, we will go ahead and keep it beyond NLL as well. We will then need a prescription for how to update the rapidity evolution kernel in Eq. (2.37) that respects the stabilization of the b integral at NLL, without introducing unwanted double counting of terms or higher powers of ln n>2 µ L b 0 as discussed in the previous paragraph.
Another possibility, then, is to just keep the NLL part of the exponent in the rapidity evolution kernel Eq. (2.35), and expand out the NNLL and higher-order parts in α s , i.e.
At NNLL we would keep the second term of order α s in the bracket, and at N 3 LL two more terms of order α 2 s would be included, etc. This is not insensible, as the rapidity kernel in Eq. (2.26) contains only a single large log (ln ν H /ν L ) multiplying the whole ν-anomalous dimension. Thus, while the NLL terms are all order α s × 1/α s ∼ 1 in log counting and should be exponentiated, the NNLL O(α 2 s ) part of the ν-anomalous dimension and higherorder terms are all truly suppressed by additional powers of α s . This is in contrast to the µ evolution kernels, e.g. Eqs. (A.7a) and (A.8a), where there are infinite towers of terms of the same order in log counting, because terms at higher powers in α s are multiplied by large logs of µ/µ 0 . This is not the case in Eq. (2.27), since higher order terms in α s generated by µ running do not come with large logs-we are doing the rapidity evolution at a scale µ L ∼ 1/b 0 , generating only small logs of µ L b 0 . All the effects of µ running between widely separated scales and their associated large logs are contained in U in Eq. (2.26).
However, we do not need to be so draconian in truncating the terms we could resum using the RRG kernel in Eq. Let us then use this freedom to divide the terms in the rapidity anomalous dimension Eq. (2.27) into two parts, those that we will exponentiate and those that we will expand out in fixed order. Namely, we will exponentiate all the pure Γ n and γ n RS anomalous dimension terms; these are at most single logarithmic in µ L b 0 , and the shift from ν L to ν * L introduces the double logs 1 2 Z S Γ n ln 2 µ L b 0 in the fixed-order soft function S associated with Γ n , see Eq. (A.24); as well as (part of) the ∼ Z S Γ n β 0 ln 2 µ L b 0 term in the rapidity anomalous dimension Eq. (2.27), which stabilize the b-space integrand. The remaining terms will be expanded out in α s , and these are all associated with all the beta function terms coming from α s running of the "pure" Γ n and γ n RS terms. Concretely, we split the rapidity evolution kernel in Eq. (2.26) into: which remains exponentiated and contains all the "pure" anomalous dimension terms, and which is the fixed-order expansion of the part of the rapidity evolution kernel Eq. (2.26) coming from all the remaining (β n ) terms in Eq. (2.27) that are not included in Eq. (2.43). This division Eq. (2.42) ensures that the exponentiated part of the rapidity evolution kernel contains at most double logs of µ L b 0 after the shift from ν L to ν * L , and also avoids double counting of the β 0 induced terms in Eq. (2.27) in the exponent as described above.
We can give a more formal definition of the two factors V Γ and V β in Eqs. ( (2.45) and divide up each term into pieces containing just the "pure" anomalous dimension coefficients and those generated by beta function terms. For γ RS this is easy: The first piece in the last line of Eq. (2.46) contains those terms in the anomalous dimension that would survive in the conformal limit of QCD, where α s does not run. The second set of terms, in ∆γ RS contain all the beta function induced terms. For example, the ratio 1/r has the fixed order expansion up to O(α 2 s ): (2.48) and the "1" term is subtracted off in Eq. (2.46), leaving just the β i terms. The similar thing happens for 1/r n+1 . We can similarly split up η Γ into two pieces. To all orders in α s , η Γ (1/b 0 , µ L ) is given by Eq. (A.4), and is expanded out in fixed orders in Eq. (A.8b). We want to split up η Γ into the "pure" Γ n ln µ L b 0 pieces along the diagonal of Eq. (A.8b), and the remaining β i induced terms. We do this by very straightforwardly defining: contains the "pure" anomalous dimension terms in η Γ (the ones which would survive in the conformal limit), and ∆η Γ is given simply by We will keep η conf. Γ exponentiated part of Eq. (2.42), while ∆η Γ will go into the part expanded in fixed orders in α s . The explicit expansion for ∆η Γ is of course given by Eq. (A.8b) with the diagonal terms deleted, or can be worked out to all orders in α s (up to NNLL accuracy) from the expression in Eq. (A.4). The corresponding expression for ∆η Γ up to terms of NNLL accuracy is then: where we notice the subtraction terms at the end of each line just modify the pure anomalous dimension Γ i /Γ 0 pieces, as designed. Note the following properties of the expanded functions of r = α s (µ L )/α s (1/b 0 ) that appear in each line of Eq. (2.52): , (2.54b) indicating that V β is to be truncated to fixed order in α s according to Table 1. This defines the "V β -expansion" we first mentioned in Sec. 1.3. In this paper we shall not need it beyond the O(α 2 s ) terms given in Eq. (2.44). Our master expression for the resummed cross section, using the scale choices and prescriptions we have explained above, is then:   The β-function dependent part of the RRG kernel V β is also truncated at fixed order in our scheme, and is defined in Eq. (2.54) and given by Eq. (2.44) up to O(α 2 s ), the highest order we shall need in this paper.
In order for Eq. (2.55) to successfully resum large logs of scale ratios, we recall that the scales µ H , µ L and ν H , ν L should be chosen near the natural scales of the respective hard, soft, and beam functions. µ H and ν H should be chosen ∼ Q. We expect ν L to be chosen ∼ µ L , and then shifted to ν * L according to Eq. (2.36) to introduce the quadratic damping factor in the b exponent. As for µ L itself, it should be chosen ∼ q T in momentum space, although we will explore in Sec. 2.3.5 a modified choice for this scale that better preserves stable power counting. For now, it remains a free scale.

Truncation and resummed accuracy
Here we summarize the rules for how to truncate the various objects in the full resummed cross section Eq. (2.55). These rules are mostly standard and well known, but with our introduction of the division of the RRG kernel into exponentiated and fixed order pieces V Γ × V β , it seems prudent to restate these rules here. These are given in Table 1.
We should note here that our choice of terms to group into the exponentiated part V Γ and those expanded in fixed order in V β in Eq. (2.55) is not unique. Terms in V Γ and V β in Eqs. (2.43) and (2.44) can be shifted back and forth by a different choice of prescription, and different choices of scales (such as ν * L in Eq. (2.36)). Indeed, with our choice of the split between V Γ and V β and the scale ν * L , the rapidity evolution exponent contains only a subset of the terms in the full rapidity anomalous dimension Eq. (2.27)-but the subset that allows an analytic evaluation of the b integral, as we will show in Sec. 3. One may very well make a different set of choices that put a different set of terms in the exponent, based on a different set of desired criteria. This freedom is allowed by the presence of only a single large log in the RRG kernel in Eq. (2.26) at the virtuality scale µ L . We present our particular choice as just one such example. We advocate that readers make use of the freedom to choose the scales ν L,f and µ L,H in Eq. (2.55), either before or after b integration, along with the organization of terms in the RRG kernel Eq. (2.42) into exponentiated and fixed-order parts (beyond NLL) to achieve their desired properties and results for the resummed momentum-space cross section. 4 The way we have organized the resummed cross section Eq. (2.55), all logs of µ L b 0 are contained in the exponent of V Γ , and in fixed-order terms in V β , S, f ⊥ i . Furthermore, the power of ln µ L b 0 in the exponent Eq. (2.56) with the scale choice ν * L is at most quadratic to all orders in α s . Our choices to arrange this property are motivated, as we will see later, by the fact that it enables us, using some approximations, to obtain an analytical expression for the b space integral. This is only possible as long as the quadratic nature of the exponent holds. At N 3 LL (or NNLL ) and higher order, the fixed-order coefficients V β , S, f ⊥ i contains other powers of ln µ L b 0 , but the contributions of these fixed-order logs can be dealt with analytically as well, as long as the exponent is no more than quadratic in ln µ L b 0 .

Stable power counting in momentum space and the scale µ L
We have not yet specified exactly what we will choose for the scale µ L . All of the above arguments are contingent upon the fact that our power counting ln(µ L b 0 ) ∼ 1 holds. What this means in momentum space is that after performing the b integral in Eq. (2.55), fixedorder logs in the integrand of the form ln n µ L b 0 do not generate parametrically large terms after integration. This, it turns out, depends on what we pick for the scale µ L . Let us see what µ L we should choose for this power counting to remain true. It turns out that the "obvious" choice in momentum space µ L = q T is not always a very good choice for minimizing the contributions of the fixed order logs. This is not too surprising, since the kernel against which they are integrated in transforming back to momentum space is no longer a simple inverse Fourier transform with a single scale q T . Instead it involves an exponent which is also a function of the high scale. It is then natural that the scale at which the logarithms are minimized (if not put exactly to 0) is shifted towards the high scale. This effect is particularly pronounced at low values of q T .
For the rest of the fixed order logarithmic pieces, we then check what value of scale will minimize the contribution to the cross section ( 10%). Here, the nature of the resummed kernel can help us. The soft exponent in b-space provides damping at both small and large values of b which, combined with the Bessel function, gives an integrand which has the form of damped oscillations. Then it is reasonable to assume that the most of the contribution to the integral is coming from the around the region of the first peak. A ballpark choice for the scale µ then is 1/b * , where b * is the scale at which the resummation kernel has the first peak. Since the hard kernel is independent of b, we only need to consider the soft resummation. A straightforward analysis of the b-space integrand then gives the following condition for the peak  J 0,1 (x) are the zeroth and first order Bessel Functions of the first kind respectively. We now set b * = 1/µ L and we can further simplify the expression above by keeping the dominant terms.
This can be solved numerically to obtain the scale µ L . We can also confirm this by checking the contribution of the leading term α s ln(µ L b 0 ) at O(α s ) and cross checking it against the next biggest piece α 2 s ln 2 (µb 0 ), α 2 s ln 3 (µb 0 ) at O(α 2 s ). While we are currently keeping only one-loop fixed order pieces in our NNLL cross section, we can always include the two-loop pieces α 2 s ln n (µb 0 ) which are known to us at NNLL (this would be part of the NNLL cross section). Fig. 4 looks at the percentage contribution of the fixed order logs when integrated against the NNLL exponent in b space. This particular plot is for the Higgs q T distribution No cutoff of the b integral is required, and the µ L scale can be varied in its full range from µ L /2 to 2µ L to obtain the uncertainty, without hitting a pole as in Fig. 3. These plots are obtained by performing the b integral numerically. We stop plotting at low q T where true nonperturbative effects will enter.
for q T = 10 GeV. It is pretty clear that µ = q T is a poor choice for µ and that a good choice would be somewhere between µ ∼ 20 GeV for the contribution from the fixed order logs to be ∼ 2%. In comparison, the value predicted by Eq. (2.58) is 22 GeV. Fig. 5 gives the scale choice for Higgs and DY (Choosing a common value of Q = 125 GeV and √ s = 13 TeV. For low values of q T , the scale is shifted away from q T toward Q as expected. The shift is far more pronounced for Higgs than for DY since the cusp anomalous dimension for Higgs is much larger so that the soft exponent has far more impact on the shape of the b-space integrand. At large values of q T , the scale is more or less q T . The key point to notice here is that apart from the single log α s ln(µ L b 0 ), which we will include in the fixed order cross section at NNLL, the contribution from the higher order logs shows a flat behavior at at values of µ ≥ 15 GeV for q T = 10 GeV in Fig. 4. So the result is insensitive to the choice of µ L as long it is chosen greater than this threshold value.
Using these scale choices, we now obtain the transverse spectrum, again at NNLL (Fig. 6). The uncertainties are obtained by scale variations described in Eq. (3.71), obtained reliably by varying both µ and ν scales, without cutoffs in the b integral.

Explicit formula for resummed transverse momentum spectrum
In this section we will provide an expression for the transverse momentum spectra of gauge bosons that is analytic in terms of its dependence on all the kinematic variables. It requires a numerical but efficient approximation for the Bessel function in the b integral in Eq. (1.6) or Eq. (2.55).
We can write the resummed cross section Eq. (2.55) in the form where the we have isolated the terms in the b integral, grouping together the terms in the integrand that are to be expanded in fixed order in α s (µ L ), separating them from the exponentiated V Γ factor. The V Γ factor is explicitly to all orders in α s , plugging in Eq. (2.36) for ν * L in Eq. (2.56), which can always be written in the form where C, A, and Ω are independent of b and thus constants as far as the integral I b in Eq. (3.2) is concerned. Explicitly, These show all the dependence on the scales and on the anomalous dimensions contained in V Γ . They are to be truncated to the order in resummed accuracy we intend to work (which we show at NLL and NNLL in App. A in Eqs. (A.38) and (A.39)).
Thus we now just have to figure out how to integrate the Gaussian V Γ in ln b in Eq. (3.5) against the Bessel function in the integral I b in Eq. (3.2). We will encounter integrals of the form where the factors ln k (µ L b 0 ) come from the fixed-order factor F in Eq. (3.3). We will first focus on the case where F = 1 (e.g. at NLL) and compute I 0 b , and then discuss below how to generate the terms I k>0 b resulting from integrating the fixed-order terms containing logs of µ L b 0 against the rest of the integrand.
In the next two subsections we shall develop a method to evaluate I 0 b semi-analyticallywith a numerical series expansion for the Bessel function but deriving the analytic dependence of Eq. (3.7) on all relevant physical parameters including q T . Then in Sec. 3.3 we shall show how to obtain arbitrary I k b from derivatives on I 0 b .

Representing the Bessel Function
The main issue in doing the integrals in Eq. (3.7) analytically is the presence of the Bessel function inside the integrand. The exponent, in our scheme, has terms only up to the quadratic powers of ln(µ L b 0 ). Analytic integration is then possible if we can approximate the Bessel function using a series of simpler functions, such as polynomials, which can be easily integrated given the quadratic nature of our exponent. However, a simple power series expansion fails to reproduce the Bessel function in the region, where it contributes to the integrand. This is basically the region up to b ∼ 2 GeV −1 . This is because the argument of the Bessel function is bq T and at larger values of q T (≥ 10 GeV), the power series expansion is no longer useful. We can possibly switch to the large b q T asymptotic form in terms of the cosine function, but then analytic integration is again not possible. An alternative way then is to rewrite the Bessel function in terms of an integral representation, so that the b space integrand can them be done exactly. This automatically rules out any representations in terms of trigonometric functions, and given the discussion earlier, the most expeditious choice is if a polynomial representation can be used. One choice that we will find amenable to a polynomial expansion is the Mellin-Barnes type representation of the Bessel function, where the contour is to the left of all the non-negative poles of the Gamma function, i.e., c < 0. We give a proof of this identity in App. E.1.
Going back to our all-orders b space integral I 0 b given by Eq. (3.7), it now looks like Since we do not have a Landau pole, we can extend the limit of integration in b space all the way to infinity, in which case, the b integral is now in the form of a simple Gaussian integral and admits the analytical result: Let's examine the Gaussian exponent in this integrand. A simple rearrangement gives us where t 0 = −1 + AL , which is a saddle point for this Gaussian and lies on the real line. In some sense, the t space integral is a dual of the b space integral since the degree of suppression inverts itself from one space to another. So then in a region of large A, we should stick to the b space integral, fit a polynomial to the Bessel function (which will now work since the integrand is highly suppressed). On the other hand for small A (which would be relevant for most perturbative series), we should go to t space.
If we parametrize the contour as t = c + ix, we have It is clear that the path of steepest descent passes through this saddle point (c = t 0 ) and is parallel to the imaginary axis. One can then consider doing a Taylor series expansion of the rest of the integrand along this path around the saddle point, truncating the series after a finite number of terms. The primary difficulty with a polynomial expansion is that to have percent level accuracy that we desire for this integral, we need to have a good description of f (t) out to x l ∼ 2A ln (10) at which value the exponent drops to 1% of its value. The factor of A is essentially the cusp anomalous dimension, ( for e.g., it is 2α s C A /π for the Higgs ) which is a small number. Considering the worst case scenario A ∼ 0.5, we would need x l ∼ 1.5, which clearly cannot be accomplished using a Taylor series expansion because the radius of convergence is around 1. We will, instead, find an expansion for f (t) in the next section in terms of (Gaussianweighted) Hermite polynomials that performs quite well with just a few terms. There a few customizations of this expansion we will make to optimize fast convergence. Our particular procedure presented here is by no means unique, and we give a couple of alternative methods for expanding and approximating the integrand of I 0 b in App. D. There are, undoubtedly, others that would work also.

Expansion in Hermite polynomials
One of the difficulties with finding a series expansion of f (t) in Eq. (3.13) is that it grows exponentially for large |x|, with t = c + ix. We can factor out this exponential growth by using Euler's reflection formula: . (3.14) Then The exponential behavior for large |x| is now factored out in the sine function in front, and we can focus on finding a good expansion for Γ(−t) 2 . The integral Eq. (3.12) is then: The sine can be written in terms of exponentials, which shift the linear and constant terms in the Gaussian exponential. It is straightforward to work out that the result is By changing variables in the second line from x → −x, we can write the result compactly as The remaining function Γ(−t) 2 is exponentially damped for large x. Indeed, Stirling's formula tells us that as |x| → ∞. The contribution of this exponential tail is further suppressed by the Gaussian factor it multiplies in Eq. (3.18). On the other hand, near x = 0, the function Γ(−t) 2 itself closely resembles a Gaussian, times polynomials. To determine the curvature of the Gaussian, we look at the Taylor series expansion of Γ(−t) 2 near x = 0: It now remains to find a good series expansion for Γ(−c − ix) 2 that enables us to perform the integral in Eq. For practical purpose of series expansion, we set c = −1 which makes the gamma function less oscillatory than the values |c| < 1. This is the saddle point in the limit A → 0, i.e., when we are in the fixed order regime with the resummation turned off. This would still induce some imaginary part and hence oscillations in the exponent away from A = 0, but with a good expansion, this is not an issue. Thus one can try a series expansion for Γ(1 − ix) 2 in terms of Hermite polynomials H n (x) which form a complete orthogonal basis. They are well known, of course, but we nevertheless remind ourselves of the first several H n :

etc. They satisfy the orthogonality relation
We expand Γ(1 − ix) 2 in terms of these polynomials in the following way: We have introduced weighting factors e −a 0 x 2 and iγ E e −b 0 x 2 for the real and imaginary parts to help with faster convergence of the series, as they capture the behavior of Γ(1 − ix) 2 near x = 0, using the values of a 0 and b 0 obtained from the Taylor series expansion in Eq. (3.20) at c = −1: These integrals still have to computed numerically, as far as we know, but note they are purely mathematical, having no dependence on any of our physical parameters, and need only be computed once. Thanks to the damped behavior of the integrand and the normalization factors in front, the expansion coefficients fall off fairly rapidly with n.
To make the series expansion well behaved, the width parameter of Gaussian functions should be positive definite: α 2 − a 0 > 0 and β 2 − b 0 > 0. These widths define the region of x where the function Γ(1 − ix) 2 is expanded in terms of the Hermite bases. For a narrow width, the series converges swiftly with n but is valid only in a narrow region around x = 0, while for broader width, the convergence is slower but the expansion is valid in a wider region around x. The Gaussian function in our integration in Eq. 10 and found that for small values ∼ 1 the convergence is slow and hence more terms are needed for an accurate description. For large values ∼ 10 the accuracy of the integration in Eq. (3.18) is very good with a few basis terms but is not further improved by including higher order terms because the series expansion is resolving only a narrow x region compared to the one dictated by Eq. (3.18). Empirical tests show that the series converges rapidly for α 2 − a 0 and β 2 − b 0 around 3 ∼ 5 while maintaining required accuracy over the range of x (from 0 to ±1.5) desired. Fig. 7 shows the agreement between the exact result and series expansion up to 3 or 4 basis terms for the real (even) and imaginary (odd) parts, for: which indeed show a rapid convergence. In practice we include up to c 6 in our numerical results; from c 7 onwards the impact is negligible. From staring at Fig. 7, one notices a residual deviation in the real part above x ∼ 1, which thus appears to be the potentially largest source of error from our method. However, the region of larger x in Fig. 7 is suppressed by the remaining Gaussian in Eq. (3.18). The remaining deviation can easily be further suppressed if desired by including higher-order polynomials. In practice, at NNLL accuracy the cross section has ∼ 10% uncertainties, and the error due to our series truncation at n = 5 or 6 is significantly smaller than this perturbative uncertainty. This is clearly seen in Fig. 10, which shows the effect of increasing the total number of terms in the Hermite expansion from 6 to 7.

In terms of this expansion, the integration in Eq. (3.18) is rewritten in terms of following basis of integrals
The integrals for odd n arising from the expansion in Eq. (3.23) are obtained from Eq. (3.28) with the substitutions α → β, a 0 → b 0 . The prefactors in front have been included in the definition of H n for later convenience. Now we go about evaluating analytically the integrals in Eq. (3.28). There are a number of ways to do this, we choose one that seems particularly elegant.

Generating function method to integrate against Hermite polynomials
Using the generating function for Hermite polynomials, we can efficiently evaluate all the integrals H n in Eq. (3.28) at the same time. By forming the infinite series, we are able to use the generating function relation Eq. (3.29) to obtain a Gaussian integral on the RHS. By evaluating the integral on the RHS and expanding the result back out in powers of t n , we will be able to obtain expressions for the individual H n . Rescaling the integration variable and completing the square in the exponent on the RHS of Eq. (3.30), The exponent of the x-dependent Gaussian is complex, but the result of integrating it is just √ π (see Eq. (E.5)). Thus, where we have expanded the exponential in t in Eq. (3.31) in a Taylor series. We cannot directly read off the coefficients of powers of t in Eq. (3.32) to obtain H n in Eq. (3.30) due to the powers of the binomial in t, but using the binomial expansion and some reindexing, we can do so. The proof is given in App. E.3, with the result:

33)
where n 2 is the floor operator, i.e. the integer part of n 2 , and If one desires, Eq. (3.33) can be written separately for even and odd n, where

Explicit result of integration
Explicitly, the first several H n given by Eq. (3.33), including H 0,1 given above, are given in Eq. (E.14). The integral I 0 b in Eq. (3.18) that we sought to evaluate in the first place is then given explicitly by, for c = −1, where the coefficients c n are given by Eq. (3.25). As many terms in the sum over n may be included to achieve the numerical accuracy desired. In practice, we include the first few terms in the sum over n (three or four even c 2n and three odd c 2n+1 coefficients), which gives us percent level accuracy for the cross section in the perturbative resummation region. Although the coefficients c n still need to be evaluated numerically, we note that they depend only on properties of the pure function Γ(1 − ix) 2 and need be determined only once (Eq. (3.27)). Our Hermite expansion is applied only to this function, which arises solely from the Bessel function J 0 (z) which appears in the factorization convolution in Eq. (1.6). The dependence on all the physical variables, such as q T , Q, and the scales µ L,H , ν L,H , enters through the evolution exponent in Eqs. (3.5) and (3.6) and is captured analytically by Eq. (3.37). For other processes or observables, the same basis and coefficients we used and resultant analytic integration Eq. (3.37), should apply, though the number of terms needed to get good convergence (determined by the width of the evolution exponent Eq. (3.5)) may vary.

Fixed-order prefactors in resummed expression
At NNLL (or NLL ) and higher orders, fixed order logarithmic terms of the form ln k (µ L b 0 ) appear in the prefactor F multiplying the resummed exponent in the integrand of Eq. (3.2). Explicitly, F can be expanded: (2.36) that we use. Then we have for the terms we need up to NNLL accuracy, at tree level: (3.39) where i,ī = g for Higgs production and i = q for DY, and at O(α s ): Note that F (1) 2 vanishes due to the extra term from the shift ν L → ν * L since the double log term was promoted from the fixed-order soft function to the exponent in Eq. (2.37), as designed (though this cancellation will no longer be exact once we implement profile scales in the next subsection). Up to NNLL accuracy, the only O(α 2 s ) terms we need come from the the α 2 s piece of the soft function induced by ν L → ν * L , and the O(α 2 s ) terms in the V β piece of F in Eq. (3.3), which from Eq. (2.44) we see has the expansion coefficient actually just multiplies a small log ln 2 µ L b 0 , so strictly at NNLL we can drop it. Then the only relevant piece of F (2) we would need at NNLL accuracy is where the second term in F where we used Eq. (3.6) and we have defined Using the final expression Eq. (3.37) for I 0 b we can now write In the expression Eq. (3.33) for H n , the variable χ only appears inside of (3.47) which appears inside z 0 and H 0 in Eq. (3.33). So we can writê To construct fixed order terms up to order k, we need up to k th derivatives of the L dependent pieces that make up I k b . There is very simple recursion relation satisfied by the derivatives of H n : With these expressions, we can now write the resummed cross section Eq. (3.1) in q T space in terms of the above results of integrating I b and I k b in Eqs. (3.2) and (3.7), where the integrals I k b are given in final evaluated form in Eq. (3.46) with the first few derivatives (∂ χ ) k H n given by Eq. (3.50). The calculation of these integrals have formed the bulk of this Sec. 3 and constitute one of the main results of this paper.
To turn Eq. (3.52) into a final prediction for the perturbative q T cross section, we still need to match it on to the fixed-order prediction of full QCD for the large q T tail. We turn our attention now to this task.

The large q T limit
In the large q T ∼ Q limit the resummation is turned off and this corresponds to the A → 0 limit in the soft evolution of Eq. (3.10). In this limit the resummed part of the cross section Eq. (3.52) should reduce to a sum of the fixed-order singular terms in the cross section. Since we evaluate the I k b 's in Eq. (3.52) using an expansion in Hermite polynomials giving Eqs. (3.37) and (3.46), it is not obvious whether these expansions, when truncated to our desired number of terms, maintain their accuracy in the fixed order limit. We check the accuracy here. In the A → 0 limit, the integrals of the even and odd Hermite polynomials reduce to where f k is a linear combination of the coefficients c n in the series expansion : where we used the values in Eqs. (3.26) and (3.27). We can compare to the result of taking the A → 0 limit of the exact expression for I k b in Eq. (3.7) before expanding the Bessel function in a series representation. Using Eqs. (3.7), (3.10), and (3.44), we obtain Using Eq. (3.48), we can computê which we can re-express as a derivative with respect to t, where we also used Eq.
Then, integrating repeatedly by parts in Eq. (3.55), we obtain

59)
We can now take the A → 0 limit in Eq. (3.59), and in so doing turn the Gaussian into a Dirac delta function: (3.60) recalling we pick c = −1 in t = c + ix, and where we used the Dirac delta identity: Thus, in the A → 0 limit, We compute the exact derivatives: Inserting Eq. (3.63) into Eq. (3.62), we have Comparing Eq. (3.64) to Eq. (3.53), we find exact values of f k which are 1,1, γ 2 E ≈ 0.33317 for k = 1, 2, 3 and the approximate values in Eq. (3.54) agree better than 1% for k = 1, 2 and agree within about 15% for k = 3. Note that the term I b,k at k = 3 is the O(α 2 s ) contribution induced by our prescription in Sec. 3.4 as it multiplies the coefficient F (2) 3 in Eq. (3.43),which is suppressed by another order of α s at the cross section level. Taking this into account, the error in f 3 itself induces only an order of magnitude smaller error in the cross section, much smaller than the total theoretical error at NNLL accuracy, and so we need not be concerned about it. If desired, one can go beyond this accuracy by including higher-order Hermite polynomials in the computation of I k b , and thus of f k . This means that the fixed order series probes very narrowly the behavior of f (−1 + ix) near x = 0, and the more accurate our expansion is near x = 0, the higher the order in α s that we can reproduce accurately in the fixed order (high q T ) regime. Practically, we would like to reproduce the one loop fixed order cross section since we would be matching our NNLL predictions to that result in the high q T region, which means we need our expansion to match the exact result up to the second order in the Taylor expansion of f , i.e. = 0, 1, 2.

Matching to the fixed order cross section
The resummed formula Eq. (3.52) accurately predicts the spectrum for relatively low (but not too low) values of q T . At large q T ∼ Q, the non-singular terms in q T become just as big as the logs of q T /Q themselves, and we should use the fixed-order perturbative expansion for the cross section there. It has been shown in the context of B → X s + γ [44], thrust [45] as well as the Higgs jet veto calculation [46], that if one does not turn off the resummation then one can over estimate the cross section, in the region where fixed order perturbation theory should suffice, by an amount which goes beyond the canonical error band in the fixed order result. This overshoot happens despite the fact that the resummed terms are formally sub-leading in the expansion. The reason for this overshoot has been shown [44][45][46] to be due to the fact that there are cancellations between the singular and non-singular terms in the tail region and that this cancellation will occur only if the proper scale is chosen in the logarithms. We can smoothly combine the resummed and fixed-order formulas by turning off resummation in Eq. (3.52) in the high q T region using profiles in µ and ν [45] and match the result to the one loop (O(α s )) full theory cross section.
Our resummation as given in Eq. (3.1) (and evaluated in Eq. (3.52)) is neatly divided into two parts, the hard function which runs in µ and functions C and I b which implement ν running. To turn off resummation at large q T we implement the following profiles for both µ, ν in Eq. (3.1): The function ζ(q T ) is chosen so that at low values of q T where resummation is important, its value is 0 while for values near Q it approaches 1. µ L (q T ) is given by Eq. (2.58) and illustrated in Fig. 5, and we have indicated ν run so that inside Eq. (2.36) for ν * L the µ L is also set to µ run . This is designed so that not only resummation of logs of µ L /µ H and ν L /ν H is turned off as ζ → 1 but also the shifting of logs of µ L b 0 into the soft rapidity exponent in Eq. (2.56) as we move out of the resummation region. The exponent p defined in Eq. (2.38) is also now evaluated at µ run .
The choice of ζ(q T ) that we make for this function is: Here q 0 determines the central value for the transition and ρ determines its rate. In practice we use ρ = 3. The value of q 0 is determined by the scale at which the non-singular pieces  become as important as the resummed singular cross section. The profiles in µ and ν for the case of the Higgs spectrum are shown in Fig. 9. (Here q 0 has been chosen to be 40 GeV for the central profile). We also probe the effect of varying the profiles by varying the value of q 0 , in our case, between 30 GeV and 50 GeV as shown in Fig. 8. For each of these profiles, we consider the scale variation by a factor of 2 for µ, ν (Fig. 9) For the case of the DY, we use a similar value for q 0 . This procedure is straightforward to implement for the hard function running, let's see what effect it has on the ν running. Going back to the ν running, our exponent in Eq. (2.26) with the scale choice Eq. (2.36) for ν L looks like Putting in our choices for ν L we have So the effect is to merely multiply the argument of the exponent by a factor of 1 − ζ(q T ). Notice that we have also put in µ run in the ν anomalous dimension since it is a function of µ L . This is basically the same as A → (1 − ζ)A and γ RS → (1 − ζ)γ RS in the expressions Eqs. (3.5) and (3.6) for the rapidity evolution factor V Γ inside I b in Eq. (3.2). Thus, using these profile scales simply modifies the definitions in Eq. (3.6) to: and Ω, χ in Eq. (3.6) remain unchanged (with the exception that µ L → µ run ). The final expressions for I 0 b and I k b given by Eqs. (3.37) and (3.46) are modified accordingly. So what we are doing is smoothly taking the limit A → 0 (and γ RS → 0) as we enter the high q T region. We already know from the previous section (see Eq. (3.53)) that in this limit I 0 b goes to 0 smoothly, which means the resummation factor is 0, and each I k b goes to the appropriate fixed order limit as well.
The central profiles we choose for the scales are: µ H is chosen as iQ, which implements the resummation of enhanced π 2 terms at each order in α s , which improves the perturbative convergence [38,39]. While implementing the hard function running, we take the absolute value of U H . We make six scale variations for µ, ν by a factor of 2 about their central values for each of the profiles shown in Fig. 8: to estimate higher order terms as shown in Fig. 9. In the first two variations in Eq. (3.71) µ L and ν L are each varied independently, while in the last one, they are varied simulatneously. The anticorrelated variation ((µ L /2, 2ν L ) ← (µ L , ν L ) → (2µ L , ν L /2)) is not included to avoid double counting [32].
We take the largest envelope in µ, ν and q 0 variations as our estimate of the error band. In principle, we can make variations at either end of the running (i.e. at the high scales µ H and ν H , or the low scales µ L and ν L ). In practice, since the value of α s is larger at lower scales, the scale variation at the low scale of running yields the largest error band. There is implicit q T dependence in z(q T ), µ run , and ν run . Now we are ready to give results for our final resummed cross section matched to the fixed-order QCD result. We give an explicit expression of the transverse spectrum which can be written as sum of resummed and nonsingular parts. We match the resummed and perturbative QCD results such that final result is valid in both small are large q T regions. This can be done by adding the resummed result to nonsingular terms where all logarithmic singular terms reproduced by EFT results in Eq. (A.37) are subtracted from the perturbative result σ = σ res + σ ns , σ ns (µ) = σ pert (µ) − σ sing (µ) , (3.72) where the differential in q 2 T and rapidity y is implied. The fixed-order σ ns is evaluated at a scale µ, which we choose to be equal to |µ H |, the high scale to which the profile Eq. (3.65a) goes for large q T . In this section we give explicit expressions for σ res and σ sing , and App. C gives σ pert .
The resummed cross section σ res is given by Eq.
where the coefficients F while the two-loop coefficients F (2) k remain zero at NNLL accuracy. In the pure fixed order limit ζ = 1, the resummation is turned off as µ run = ν run = |µ H | = ν H , and we see the term containing explicit µ L , ν L in Eq.
The coefficients in the singular piece of the cross section can be found in App. A. Together with σ pert given in App. C, Eq.

Final resummed cross section in momentum space
Then, the resummed cross section in momentum space Eq. (3.1), using Eqs. (3.5), (3.7), and (3.46) to express the integral I b , implementing the profile scales Eq. (3.65a), and matching onto the fixed-order QCD cross section for large q T , is given by the expression: (3.77) This fairly compact expression still has many pieces to it. We provide a roadmap to where to find them all: The basic cross sections σ 0 for Higgs production (gg → H) and for DY (qq → γ * ) are given by:  A.14). The top matching coefficient C 2 t for Higgs production is given by Eq. (A.17) while it is set to 1 for DY. The µ (virtuality RG) evolution kernel U (µ L , µ H , µ T ) is given in Eq. (2.26), with the parts of the exponent K Γ,γ , η Γ defined and expanded up to NNLL accuracy in App. A.1. These all contain dependence on the cusp anomalous dimension, whose coefficients Γ n for DY and Higgs are given by Eq. (A.9). The factor C(µ L , ν L , ν H ), which came from expressing the exponentiated "conformal" part of the rapidity evolution kernel V Γ in Eq. (2.56) as a Gaussian in ln b, is defined in Eq. (3.6), and expanded to NLL and NNLL accuracy in App. A.6.
The coefficients F which were given in Eq. (3.24), and values of α, β given by α 2 − a 0 = 4 and β 2 − b 0 = 4, which were given in Eq. (3.26). These choices are by no means unique; other values can also be chosen, which would then give different coefficients in the Hermite expansion. Our choices allowed sufficiently accurate representation of the exact function Γ(1 − ix) 2 with an economical basis of a few terms each for the real and imaginary parts, the coefficients of which we gave in Eq. (3.27). Fig. 10 illustrates the agreement of our resummed cross section Eq. (3.77) to NNLL accuracy with a total of six or seven (four real, three imaginary) terms in the Hermite expansion vs. the result of numerically integrating the exact b integrand in Eq. (3.1). The truncation error lies well within the perturbative NNLL uncertainty band. Table 1 gives the orders to which the anomalous dimensions and fixed-order pieces of Eq. The final piece dσ ns /dq 2 T dy is defined by Eq. (3.72) and can be obtained from the fixedorder results in App. C, subtracting off the singular terms Eq. (3.76).
It is good to remind ourselves at this point that the final formula Eq. (3.77) is a threefold expansion as described in Sec. 1.3: a perturbative expansion in α s and resummed logs, an additional fixed-order expansion of the non-conformal piece V β of the rapidity evolution inside F , and an expansion in Hermite polynomials in evaluating each I k b . It is thus quintessentially a "formula" according to a delightful definition we recently encountered: an expression given by an = sign with a controlled error of known parametric form. 5

Remarks on nonperturbative region of q T
The final result Eq. (3.52) for the resummed cross section is obtained by doing an integral Eq. (3.2) over a wide range of b. It is a reasonable assumption that the perturbative resummation will hold as long as 1/b > Λ QCD ∼ 500 MeV. This is roughly the value at which we begin to see the Landau pole in the b space resummation scheme (see Fig. 11). Beyond this value of b, we expect that nonperturbative effects will play a nontrivial role. Although it is not the goal of this paper to include or advance any new method to account for these nonperturbative effects, we will freely make some loose observations here, mainly delineating where we do and do not need to worry about them. While they are certainly important, we will have to leave their incorporation into our results to future work.

Remarks on nonperturbative effects
As we approach such low values of 1/b, it is no longer possible to match perturbatively the beam functions onto PDFs as in Eq. (A.27). Instead we need to retain the complete transverse momentum dependent beam function, which is now a fully nonperturbative function that need to be fit from data. In the b space resummation scheme, where µ ∼ 1/b, giving the result Eq. (2.29), the running using the perturbative anomalous dimension will work only for 1/b Λ QCD . A sensible thing to do in this scheme is to freeze the resummation at 1/b ∼ 2Λ QCD and evaluate the beam function at this frozen low scale which is still perturbative. Interestingly, for the momentum resummation scheme leading to Eq. (3.52), while we still need to replace the PDF's with the full nonperturbative TMDPDFs at low q T , the resummation is always in the perturbative regime since µ Λ QCD for any value of q T . For the case of the soft function, since both resummation schemes use a b dependent value for ν, we need to freeze out the resummation at a perturbative value of ν L ∼ 1/b * > Λ QCD and fit the soft function at this scale from data. All this seems rather complicated and it might appear that without complete information about the nonperturbative functions, it is not possible to give a prediction for the cross section. However, once again the nature of the resummed perturbative exponent comes to the rescue. Even before we reach a nonperturbative value, the double logarithmic term in b in the exponent completely damps out the integrand. Then the nonperturbative corrections become irrelevant since the region of b space in which they start contributing is heavily suppressed.
If we consider b space resummation, the damping which is provided by the resummed exponent depends mainly on the hard scale Q and the cusp anomalous dimension. For the Higgs, Q is fixed and the cusp anomalous dimension is large so the damping is always large. Increasing the center-of-mass energy only changes the x value where the PDFs are evaluated, but there is only a mild dependence on this factor. So, we can safely neglect any nonperturbative effects and rely completely on the perturbatively resummed cross section.
For the case of DY, the cusp anomalous dimension is much smaller and the value of Q is variable. So if we go to low Q, nonperturbative effects become important, see Fig. 11. As can be seen from this figure, b ∼ 3 GeV −1 is a rough estimate of the value beyond which nonperturbative effects become important where we begin observing the divergence due to the Landau pole in the b space resummation scheme. For low values of Q, several different ways of incorporating nonperturbative effects using model functions (which effectively cut off the Landau pole) in b space resummation have been proposed [33][34][35][36][37] . In this paper, we stick to showing results for larger values of Q where these effects are not as important. A detailed discussion of how to handle nonperturbative effects in our hybrid impact parametermomentum space resummation scheme will be given in future work. We suspect 6 , among other things, that the nature of our asymptotic V β expansion in Eqs. (2.44) and (2.54) will in fact give clues about the best way to include nonperturbative effects together with our perturbative resummation scheme. This follows from the fact that in the b space resummation scheme, the Landau pole, related to the running of α s was the indicator of the onset of nonperturbative effects. In our scheme, we have expanded out the running of α s in the form of the V β function and it is natural that the breakdown of this expansion will dictate the onset of nonperturbative effects. (See [49] for a recent study of nonperturbative power corrections based on renormalon divergences of perturbative expansions of TMD functions.)

Remarks on perturbative low q T limit
The q T → 0 behavior of the perturbative resummed q T distribution has of course been extensively discussed, and is known to be affected by configurations of multiple large momentum emissions k i T that cancel vectorially i k i T = q i T ∼ 0, leading dσ/dq 2 T to go to a constant nonzero value e.g. [16,40,42,50]. We will not add anything to these discussions here, postponing consideration of this regime until we also include nonperturbative effects as remarked above.
We observe, however, that our scheme in Sec. 3 applies practically only to the larger q T regime. Specifically, we are using an approximation for the ratio of gamma functions While the approximation that we are currently using is good enough in the perturbative regime (q T ≥ 2 GeV), below this q T scale, if we compare it to the CSS resummation then it shows exponential suppression as opposed to a constant behavior. However, the q T scale at which this deviation from the exact resummed cross section happens could be systematically lowered by improving our expansion of F (t) (i.e. adding more terms to our final formula Eq. (3.37)). This is evident from the behavior of the integrand in t space Eq. (3.12). As q T is lowered, both A and t 0 become large. Since A controls the suppression of the integrand which allows us to approximate the ratio of Γ functions as a series, as q T is lowered, this suppression becomes smaller and smaller which will force us to include more and more terms in our expansion in order to maintain the same accuracy. While we can get good accuracy in the perturbative regime q T ∼ 2 GeV with a few terms, below this scale it is impractical to continue using this series approximation. So in principle, to recover the behavior as q T → 0, we can no longer do an expansion in the Hermite polynomials.
We observe, however, that as A increases at lower q T , the integrand in b space (Eq. (3.7)) is now highly suppressed at large b. This means that it would be more practical to now do an expansion of the Bessel function itself (which was not possible at larger q T ), rather than go to t space. For q T 2 GeV we would need a series expansion that approximates J 0 (bq T ) well just out to its first peak or so. It should be possible to find such an expansion similar to what we did for large q T above that still allows us to do the b space integrals analytically and accurately, which by construction would now display constant behaviour at low q T . We defer a presentation of the details of this procedure to future work.
So the Mellin-Barnes representation is in some sense the dual of the Bessel function, in that at perturbative q T , t space is more amenable to an expansion and hence an analytical result with a few terms, but fails as we move into the low-q T region where nonperturbative effects kick in, where expanding directly in b space makes more sense.

Comparison with previous formalisms
There have been numerous other techniques developed for implementing the resummation for transverse momentum spectra of gauge bosons. We will briefly comment on how ours compares to some of them, though we will not undertake any sort of in-depth comparison here and will not really do justice to any of these other methods. A more detailed discussion of them was given in [32] as well as [40].
The earliest one was the CSS formalism [10,31], which was applied in, e.g. [14,51], for computing DY and Higgs transverse momentum cross sections. The value of µ L is implicitly chosen to be 1/b 0 . There is no explicit independent scale ν, however a comparison of the resummed exponents reveals that the implicit choice for ν L is also 1/b 0 . So the central values agree with the b space resummation implemented in Eq. (2.29) and an earlier paper [32].
The difference in the two approaches is two-fold. Firstly, the error analysis using scale variation is different in the absence of an independent scale ν. Varying only µ scales is likely to underestimate the uncertainty. Second, in the high q T region, the matching procedure is different. In the CSS formalism, there is no systematic way of turning off resummation while matching to the fixed order cross section, so that the predicted results differ in the high q T region. The Landau pole is handled by implementing a smooth cut-off in b space. However, as we have seen, as long as we are at high Q, this does not affect the prediction. An explicit comparison between the two schemes was given in [32].
In Fig. 12, we compare the b space resummation scheme for the implementation in [32] at NNLL+NLO accuracy for the Higgs and DY transverse spectrum with the hybrid b-space/momentum-space resummation scheme developed in this paper. This will serve transitively as a comparison also with other b space resummation schemes. We can deduce the following • The width of the error bands is comparable in the entire region of q T which is not too surprising since the error analysis in both [32] and the present paper were based on the same variations Eq. (3.71) around the respective central values.
• In the low q T region, the central value in our hybrid scheme is lower that the pure b space scheme even though it is within the error band. This is to be expected since the two schemes differ in subleading terms at a given resummation accuracy. . The overlap is a good cross-check of the accuracy of our method, and the improvement in the reliable estimation of uncertainties and computation time in our resummation scheme has been described in the text. The Higgs cross section is differential only in q T .
• In the high q T regime, the results agree exactly since in this range, the resummation has been turned off and the cross section is just the one-loop fixed order cross section in both schemes.
Another technique was implemented in [42,43] which again follows the CSS formalism with the implicit ν choice 1/b, but the µ choice is made in momentum space choosing µ ∼ q T + q * T , where q * T is chosen as 2 GeV for DY and 8 GeV for the Higgs. For the kinematics we chose to illustrate in Sec. 2.3.5, we actually found very similar shifts at low q T based on our analysis of the scales which minimize the contributions of the residual fixed order logs, which in itself parallels the logic in [42,43], though we do not necessarily adopt the same physical interpretations. The matching procedure is again similar to the CSS case and hence differs from the scheme in [32] the high q T regime. Again a detailed discussion of the differences was provided in [32].
There also have been methods proposed for setting all renormalization scales in momentum space. The most recent [40] technique has been to solve the RG equations in momentum space directly. In momentum space the beam and soft functions are functions of plus distributions of the form 1/q 2 T ln n (q 2 T /Q 2 ) + and these terms are resummed directly in momentum space using a technique of distributional scale setting. This involves setting the scale under an integral of the plus distribution. The integral turns the plus distribution into ordinary logarithms which can be minimized by choosing a specific momentum scale. However, they also observe that for transverse momentum spectra of gauge bosons, a direct scale choice of µ, ν ∼ q T does not work since this scale choice gives a spurious contribution from highly energetic emissions (k T q T ) in the phase space and hence a scale that varies with the energy of emissions has to be used so that the region of phase space of energetic emissions is suppressed. This is reflective of the divergence observed in the soft resummation Eq. (2.33) at low values of b, which, in our method we chose to cure by adding subleading terms in the cross section through Eq. (2.36). Mathematically the solution proposed in [40] is quite elegant. It will be interesting to see its implementation numerically and to compare the results at NNLL.
Another method of obtaining the transverse momentum spectra has been proposed [16,41] that uses the coherent branching formalism. The cross section is given in terms of a convolution over independent emissions off the initial gluons (or quarks for DY). It then singles out the hardest emission which also sets the scale for ν which again suppresses the energetic emissions since all other emissions are by construction of lower energy. This differs mainly from [40] in this scale choice, as in [40] ν ∼ k i which follows the energy of each emission instead of just the hardest one.
Amusingly, every proposal we know of so far (e.g. [16,40,41] and this paper) to implement TMD resummation in momentum space yields a result formally correct at a given order of logarithmic accuracy, but in terms of either an infinite sum or infinite nest of expressions (beyond the perturbative expansion itself) that must be truncated to yield a result that can be evaluated numerically. Refs. [16,40,41] obtain their final resummed results in terms of infinite sums over gluon emissions, which [16,41] implemented in a Monte Carlo routine. Our final result, on the other hand, contains the infinite sum in Eq. (3.37), over Hermite polynomials in the basis expansion of the function Γ(t) 2 arising from the representation Eq. (3.8) we used for the Bessel function in the inverse Fourier transform from impact parameter to momentum space. This is of course quite different from sums over gluon emissions. Truncating our formula corresponds to the level of numerical accuracy one attains for the Bessel function and resultant integral, rather than the number of gluons one includes in the emission amplitudes.
All the methods have their pros and cons in terms of the perturbative series obtained, error analysis, and how rigorously or easily nonperturbative effects can be included (see, e.g. [36]). Again, we leave it to future work to show how we do the latter in our method.

Conclusions
We took a fresh look at resummation for transverse momentum spectra of gauge bosons in momentum space. In contrast to the classic procedure which chooses both virtuality and rapidity scales µ, ν for resummation both in impact parameter space, we proposed a hybrid prescription for resummation, choosing the rapidity renormalization scale ν with impact parameter dependence and the virtuality scale µ in momentum space. We made a choice ν * L ∼ ν L (µ L b 0 ) −1+p for the low (soft) rapidity scale, and observe that with this choice, the integral over the b space rapidity resummation exponent is convergent. We stress that a well-defined power counting for ln(µ L b 0 ) is not possible before we have a stable soft exponent and that only when this exponent is in place, we can treat ln(µ L b 0 ) as a small log with an appropriate choice of µ L . We also give a prescription for obtaining the µ L scale in momentum space using the analysis of the b space integrand to justify our power counting ln(µ L b 0 ) ∼ 1, which shifts the scale up from the naïve momentum-space choice µ L ∼ q T .
We then use the idea that restricting the soft exponent in b space to be at most quadratic and thus Gaussian in ln(µ L b 0 ) allows us to obtain a semi-analytic formula for the cross section. Using the Mellin-Barnes representation of the Bessel function and the absence of the Landau pole in our resummation formalism, we are able, with certain approximations for the Bessel function appearing in the inverse Fourier transform that are independent of the details of the observable or kinematics, to give a closed-form analytic expression for the cross section at any order of resummation accuracy.
In brief, the main ideas and results of our paper are: • Exponentiation of quadratic fixed-order small logs of µ L b 0 from the soft function and rapidity evolution that are formally subleading at a given order of resummation accuracy, but automatically make the b integral in going to momentum space convergent without an additional regulator or cutoff. This is formally achieved by the shifted scale choice ν L → ν * L in Eq. (2.36).
• Division of the rapidity exponent V into an exponentiated part V Γ (ν * L , ν H ; µ L ) in Eq. (3.5) that is quadratic and thus Gaussian in µ L b 0 , and a part V β in Eq. (2.44) expanded in an asymptotic series, making the b integral Eq. (3.2) doable.
• Use of the Mellin-Barnes representation Eq. , which for NNLL accuracy in the cross section can be safely truncated to a few terms each in the real (even) and imaginary (odd) parts, each term of which gives rise to an integral over b (or x) which can be done be performed analytically, giving the result Eq. (3.33).
• The above steps give rise to the final resummed cross section in momentum space, Eq. (3.77), in which we can implement scale variations and profiles to reliably estimate theoretical uncertainty and match smoothly onto fixed-order predictions for large q T .
The final result Eq. (3.77) represents a threefold expansion: perturbative expansion in α s and resummed logs in the RG and RRG evolution kernels and fixed-order hard, soft, and collinear functions; V β expansion, a fixed-order asymptotic expansion of the non-conformal part of the RRG evolution kernel to ensure a Gaussian rapidity kernel; and Hermite expansion, in number of terms in basis expansion of the pure Bessel function in the inverse Fourier transform between b and q T space. Each of these is systematically improvable.
We do not even claim that the particular methods, expansions, and strategies we implemented are the fastest or most accurate amongst all similar strategies. It is fast, and it is accurate. Keeping just a few terms in the Hermite expansion we obtain an error in the cross section at the percent level, much better than the NNLL perturbative accuracy to which we work in this paper, while obtaining the result with a ∼five-fold improvement in computation time in our tests. We hope our presentation provides a blueprint and an example to obtaining a faster, more accurate predictions for many TMD observables in momentum space, and is certainly open to further development and improvement.
We applied our results to obtain the transverse spectrum of the Higgs as well as the DY q T spectrum at NNLL, matched to fixed-order O(α s ) results at large q T . We give a comparison with results obtained using the CSS formalism and observe a very good agreement where they should agree, consistent within subleading terms which is observed from the overlapping of the error bands at both NLL and NNLL. We also gave cursory discussions of the relevance of nonperturbative effects in different kinematical regimes, and also of how our method compares with some recently proposed methods of resummation directly in momentum space for all renormalization scales.
The techniques we have proposed should be applicable to other observables that depend on a transverse momentum or are sensitive to "soft recoil," (e.g. [28,29]), and admit of a factorization of the form Eq. (1.1) with a convolution between soft and collinear functions in (2-D) transverse momentum q T describing modes separated in rapidity as in Fig. 1. When a (semi-)analytic formula can be obtained as we have done, it should drastically cut down computation time and improve our understanding of the physical behavior of the cross section and its computational uncertainties as a function of the scales it depends on. We will perform a a more detailed phenomenological study using our expressions with comparisons to data in the near future, and then also apply our techniques to other TMD observables. dimension [54,55] in MS are T F n f , where C i is C F and C A for the quark and gluon, respectively.

A.2 Hard Function
The hard function H is given as the square of the SCET matching coefficient C arising from matching QCD and SCET amplitudes, H = |C| 2 . The form of the fixed-order expansion of C can be deduced from Eq.
At an arbitrary scale µ, C then has the expansion, where to O(α 2 s ),

A.3 Soft Function
The fixed-order expansion of the soft function S can be deduced from its (R)RG solution Eq. (2.19a) and the fixed order expansions of U S and V S in Eqs. (2.20a) and (2.21a). At the scales µ L = ν L = 1/b 0 , all the logs in S vanish, Evolving it to arbitrary scales µ, ν by Eq. (2.19a), we obtain We expand the exponent using Eqs. (A.7a) and (A.8a), the constants in front using Eq. (A.20), and the rapidity anomalous dimension in powers of α s (µ), using In principle all the higher-order β-function terms could contribute at the same order as lowerorder ones, but we will always evaluate µ close to 1/b 0 in a fixed-order soft or beam function or RRG evolution factor, so the higher-order logs are not large, and those terms are genuinely suppressed by powers of α s relative to lower-order terms.
Putting together these pieces, we obtain the expansion of the soft function, 2π S(b, µ, ν) = 1 + The two-loop terms I (2) ij (z) can be found from [21] (see also [20]). The 1-loop splitting functions defined in Eq. (A.29) are given by 8 The anomalous dimensions γ i Rf that we need are given above in Eq. (A.26).

A.6 Gaussian rapidity exponent
The pieces is in the exponentiated rapidity evolution kernel Eq. (3.5) are given to all orders by Eq. (3.6), and to NLL accuracy by: and to NNLL accuracy by:

B A general scheme for soft resummation
The default prescription for ν running in our scheme uses the choice (Eq. (2.36)) which automatically resums the leading double logarithms of ln(µ L b 0 ). This scheme also partially resums single and double logarithms of the argument µ L b 0 at higher orders in α s . This is the simplest scheme that provides a stable b space kernel that respects the power counting that l = ln(µ L b 0 ) is small (see Sec. 2.3). This scale choice is by no means unique. There is still a lot of space to play around with the choice of this scale, where each choice would differ from the other in exactly which set of small logs l get included in the exponent. All of these different schemes, therefore, would only differ from each other in subleading terms, and hence we would expect that each of these would lead to an overlapping error band at any given order in resummation.
In this section, we give a general prescription for the scale choice ν L , that covers all of these schemes, while still allowing us to obtain an analytical expression. So we still obey the constraint of putting terms at most of quadratic power in l in the exponent. The generalization that we propose isν * where we now expand out both r and s as a power series in α s . ζ 0 is a constant r = In practice, while resumming to a particular order, we truncate the series in r and s to that order in accuracy. The r i parameters will control the coefficient of both l and l 2 at each order in α s , while the s i parameters will control the coefficient of l. The s i parameters also induces constant terms in the exponent, which ideally, one would not find in an exponent, however, their effect at each order can be cancelled out by including the corresponding constant terms induced in the fixed order by this scale choice. We have checked that the effect of several different choices for the parameters r and s produce variations in the resummed cross sections smaller than the inherent perturbative uncertainty already present at each order seen in Fig. 6.

C Perturbative QCD results at NLO
The QCD results of q T spectrum for Higgs and for DY are known up to NNLO [65][66][67][68][69][70][71][72]. Here we give NLO expression [66,[73][74][75] which is used to obtain the nonsingular part defined in Eq. (3.72) where G ij is a reduced partonic cross section depending on partonic Mandelstam variables. All variable above are defined as For Higgs production, the tree-level and partonic cross sections are given by Empirical tests imply the series converges well forα 2 −ã 0 andβ 2 −b 0 around 3 ∼ 5. Fig. 7 shows the exact result and series expansion up to 4th order forα 2 −ã 0 = 4 andβ 2 −b 0 = 4. Note that the deviations from the exact results above x = 1.5 is suppressed by the Gaussian kernel in Eq. (3.12) and resulting error in the integral should be smaller than that appearing in Fig. 13. The coefficientsc 2n andd 2n are given bỹ

D.2 A tailored basis for expanding Γ[−t]
Γ [1−t] While the weighted Hermite polynomial basis presented in the main text is a systematic expansion in an orthogonal basis, we can come up with a basis more closely tailored to the behavior of the function f (t) = Γ[−t] Γ [1−t] , although it is not as systematic in that it is not orthogonal and there is not a simple formula for the basis coefficients. This basis is: f app (t) = n c n f (t; N n , a n , b n ) (D.6) f (t; N n , a n , b n ) = (t + 1) Nn e an(t+1) 2 +bn(t+1) = i Nn (x − x 0 ) Nn e −an(x−x 0 ) 2 +ibn(x−x 0 ) , (D.7) where N n are integers and a n and b n are complex constants. In the 2nd equality, we set t = c + ix and x 0 = i(1 + c). This form has been deliberately chosen in anticipation of our choice of c = −1. Because we are not aware of a systematic expansion in terms of this basis unlike the weighted Hermite polynomial expansion, the values of a n , b n , and N n as well as c n in Eq. (D.6) should be determined by fitting to the exact function f (t). The integration against the evolution kernel is given by = ω N (a, b) F 0 (a, b) , (D. 8) where ∂ L = ∂/(∂L) and ω N is determined by N th derivative of F 0 (a, b). The integral I b is rewritten as T n c n F Nn (a n , b n ) , (D.9) where the derivative on F N can be replaced by F 0 multiplied by a coefficient d N,k . Here, we show the result of using the basis in Eq. (D.7) for N n =0 and 2. The real part of f app (t = −1 + ix), which we call f R (t) is even, while the imaginary part f I (t) is an odd function of x. We therefore write f R (t) = g 1 (e −g 2 x 2 − cos[g 3 x]) + g 4 x 2 e −g 5 x 2 = g 1 f (t; 0, g 2 , 0) − where t = −1 + ix. In practice we first find the fit in the first equality then, rewrite it in terms of our basis. The b space integral obtained by replacing f (t, N n , a n , b n ) by F Nn (a n , b n ). where we have defined L = ln 2Ω q T . The value of the parameters g i , h i shifts as we shift the contour via the value of c, so that the final result is independent of the contour chosen. In this paper we have made the following choice for the contour and hence the corresponding parameters c = −1, g 1 = 0.5532, g 2 = 1.77, g 3 = 2.465, g 4 = 0.4582, g 5 = 2.42, h 1 = 0.0525, h 2 = 4.09, h 3 = 0.98, h 4 = 0.793. It is to be stressed that once the contour is fixed, these parameters are also fixed and hence can be used for any observable in any kinematical regime. This is because the fitting is only done for the ratio of Gamma functions f (t) which, in no way involves the details of the specific observable or its kinematics. The only condition as we specified earlier that A be a small number to ensure adequate suppression.
Let This agrees with Eq. (3.63) better than 1% that is acceptable at NNLL accuracy.

E.1 A proof of the Mellin-Barnes identity for the Bessel function
Here we present a short proof of the key identity Eq. This identity can be found, e.g. in [76], §10.9.22, which is given as valid for J α (z) for α > 0. We briefly verify that it works also for α = 0. For convenience let us change the integration variable in Eq. (E.1) from t → −t. Then we have where now the contour lies to the right of all the poles of Γ(t), i.e. c > 0. The contour can be closed in the left half plane out at t → −∞. The value of the integrand falls rapidly in this limit, and the circular part of the contour contributes zero to the integral. Deforming the contour, we pick up the residues at all the poles t = −n of the Gamma function Γ(t), which are which is precisely the series representation of the Bessel function J 0 , proving the identity.

E.2 Integral of complex Gaussian
We can evaluate the integral over a complex Gaussian using the contour integral: where C is the contour shown in Fig. E.2.

E.3 Proof of Gaussian integral of Hermite polynomials
Here we prove the result Eq.
where in the last line we integrated the d/dx term by parts, using that the boundary terms are zero at x → ±∞. Now we can use the known recursion relation for derivatives of Hermite polynomials: H n (x) = 2nH n−1 (x) , (E. 19) and obtain which proves Eq. (3.49).

E.4.2 Second proof
Using the all orders result for H n in Eq. For the second term in ∂ L H n (α, a 0 ), we can make the following observations • For even values of n, the term m = n/2 does not contribute to the derivative.
• For odd values of n, n/2 is the same as (n − 1)/2 .