A fast and accurate method for perturbative resummation of transverse momentum-dependent observables

We propose a novel strategy for the perturbative resummation of transverse momentum-dependent (TMD) observables, using the qT spectra of gauge bosons (γ∗, Higgs) in pp collisions in the regime of low (but perturbative) transverse momentum qT as a specific example. First we introduce a scheme to choose the factorization scale for virtuality in momentum space instead of in impact parameter space, allowing us to avoid integrating over (or cutting off) a Landau pole in the inverse Fourier transform of the latter to the former. The factorization scale for rapidity is still chosen as a function of impact parameter b, but in such a way designed to obtain a Gaussian form (in ln b) for the exponentiated rapidity evolution kernel, guaranteeing convergence of the b integral. We then apply this scheme to obtain the qT spectra for Drell-Yan and Higgs production at NNLL accuracy. In addition, using this scheme we are able to obtain a fast semi-analytic formula for the perturbative resummed cross sections in momentum space: analytic in its dependence on all physical variables at each order of logarithmic accuracy, up to a numerical expansion for the pure mathematical Bessel function in the inverse Fourier transform that needs to be performed just once for all observables and kinematics, to any desired accuracy.


Introduction
The transverse momentum spectra of gauge bosons is well trodden territory. They are important for measurements of, e.g. Higgs production, as well as the dynamics of QCD in Drell-Yan (DY) processes. There are calculations available at NNLL+NNLO accuracy using a variety of resummation schemes both using the framework of soft collinear effective theory (SCET) [1][2][3][4][5], e.g. [6][7][8][9], and Collins-Soper-Sterman (CSS) [10] formalisms, e.g. [11][12][13][14][15], and even N 3 LL+NNLO [16] (see also calculations in, e.g. [17][18][19][20][21][22][23]). Joint resummation of threshold and tranverse-momentum logs is even possible to NNLL and beyond (e.g. [17,[24][25][26]).Why, then, do we wish to visit this subject anew? This has mostly to do with the peculiar structure of the factorized cross section which makes the resummation of large logarithms an interesting problem. The cross section can be factorized in terms of a hard function, which lives at a virtuality Q, the invariant mass of the gauge boson, and soft and the beam functions (or TMDPDFs) which describe the IR physics and live at the virtuality q T Q, which is the transverse momentum of the gauge boson. The soft and collinear emissions are the ones providing the recoil for the transverse momentum of the gauge boson. This automatically means that these functions

JHEP04(2018)149
are convolved with each other in transverse momentum space so that the q T of the gauge boson is a sum of the q T contribution from each emission: dσ d 2 q T dy = σ 0 C 2 t (M 2 t , µ)H(Q 2 ; µ) d 2 q T s d 2 q T 1 d 2 q T 2 δ 2 q T − ( q T s + q T 1 + q T 2 ) (1.1) × S( q T s ; µ, ν)f ⊥ 1 q T 1 , x 1 , p − ; µ, ν f ⊥ 2 q T 2 , x 2 , p + ; µ, ν , at s = (P 1 + P 2 ) 2 , with colliding protons of momenta P 1,2 , and gauge boson invariant mass Q 2 and rapidity y. For the case of the Higgs, we have a Wilson coefficient C t after integrating out the top quark. (For DY we just set C t = 1 in eq. (1.1), and consider explicitly only the γ * channel in this paper.) Here, S is the soft function accounting for the contribution of soft radiation to q T , f ⊥ 1,2 are the TMDPDFs (or beam functions) accounting for the contribution of radiation collinear to the incoming protons to q T , and they depend on kinematic variables p ∓ = Qe ∓y = x 1,2 √ s. The peculiarity of the factorization is that even though the TMDPDFs form a part of the IR physics, they depend on the hard scale Q (cf. [27]), which, as we shall see later, will play an important role in our resummation formalism. The hard function H encodes virtual corrections to the hard scattering process, computed by a matching calculation from QCD to SCET. The scale µ is the renormalization scale normally encountered in the MS scheme and plays the role of separating hard modes (integrated out of SCET) from the soft and collinear modes, by their virtuality. The additional rapidity renormalization scale ν, introduced in [28,29], arises from the need to separate soft and collinear modes, which share the same virtuality µ, in their rapidity (figure 1). The cross section itself is independent of these arbitrary virtuality and rapidity boundaries, but the renormalization group (RG) evolution of factorized functions from their natural scales, where they have no large logs, to arbitrary µ, ν can be used to resum the large logs in the cross section.

RG and RRG evolution in impact parameter vs. momentum space
These functions obey the renormalization group (RG) equations in µ where F i can be C 2 t (M 2 t , µ), H(Q 2 ; µ), S( q T s ; µ, ν) or f ⊥ i ( q T i , Q, x i ; µ, ν). The RG equations in ν have a more complicated convolution structure: where G i can be soft functions or TMDPDFs. The symbol ⊗ here indicates convolution defined as Apart from the complicated structure of the RG equations, the anomalous dimensions themselves are not simple functions but are usually plus distributions [29] which makes it JHEP04(2018)149 Figure 1. Left: EFT modes and their scalings in light-cone momentum k ± space. For the TMD cross sections we consider, the small parameter can be taken to be λ ∼ q T /Q or ∼ Qb 0 in impact parameter b space, where b 0 = be γ E /2. Right: RG and rapidity RG evolution. µ runs between the hard and soft hyperbolas of virtuality shown in the left-hand figure, while ν runs between the soft and collinear modes which are separated only by rapidity. The evolution is path independent, one convenient path is shown here.
even harder to solve these equations directly in momentum space. A typical strategy to get around this is to Fourier transform to position (i.e. impact parameter) space, defining the latter definitions accounting for the fact that all the distributions we encounter will have azimuthal symmetry in q T or b. This then gives ordinary multiplicative differential equations (instead of convolutions), and a closed form solution to the RG equations can be easily obtained. Moreover the cross section now takes the simpler structure, dσ dq 2 T dy = σ 0 π(2π) 2 C 2 t (M 2 t , µ)H(Q 2 , µ) db bJ 0 (bq T ) (1.6) where J 0 is the n = 0 Bessel function of the first kind. Note we have changed variables from q T in eq. (1.1) to q 2 T in eq. (1.6). The b-space soft and beam functions S and f ⊥ i now obey multiplicative rapidity RGEs in ν, whose anomalous dimensions and solutions we shall give below. Only the b integration in eq. (1.6) stands in the way of a having a simple product factorization of the momentumspace cross section. Finding a way to carry it out will be one the main focuses of this paper. For perturbative values of q T , the TMDPDF's can be matched onto the PDF's. The b-space cross section, defined as the following product of factors in the integrand of eq. (1. where i,ī = g for Higgs production and i = q for DY, and where we ignore effects of DGLAP evolution for the moment (we include them in eq. (A. 37) and in all our analysis below). This takes the typical form of a series of Sudakov logs. The number of coefficients G nm that need to be known is determined by the desired order of resummed accuracy. Using the heuristic power counting ln Qb 0 ∼ 1/α s in the region of large logs needing resummation, the leading log (LL) series includes the O(1/α s ) terms m = n + 1, the next-to-leading log (NLL) series the O(1) terms up to m = n, at NNLL the O(α s ) terms up to m = n − 1, etc.
When we later talk about resummation in momentum space, we will define our accuracy by the corresponding terms in the b-space integrand that we have successfully inverse Fourier transformed (cf. [30]). For a TMD cross section, the logs in the full QCD expansion eq. (1.9) are factored into logs from the hard and soft functions and TMDPDFs of ratios of the arbitrary virtuality and rapidity factorization scales µ, ν and the physical virtuality and rapidity scales defining each mode. Each function contains logs: ln 2 µb 0 + 2 ln µb 0 ln ν µ + Z f Γ 0 ln µb 0 ln ν 2 Q 2 + 2γ 0 f ln µb 0 , (1.11) where the individual anomalous dimension coefficients satisfy the constraints Z H + Z S + 2Z f = 0 and γ 0 H + γ 0 ) RG evolution of each factorhard, soft, and collinear -in both virtuality and rapidity space from scales where the logs are minimized, namely, µ H ∼ Q, µ T ∼ M t and, naively, µ S,f ∼ 1/b 0 for the virtuality scales, while ν S ∼ µ S and ν f ∼ Q for the rapidity scales, to the common scales µ, ν achieve resummation of the large logs, to an order of accuracy determined by the order to which the anomalous dimensions and boundary conditions for each function are known and included. This will be reviewed in further detail in section 2.
This, at least, is the procedure one would follow to resum logs in impact parameter space. It corresponds, in SCET language, to how to obtain the result of the standard CSS resummation through traditional [10] or modern techniques [31], as well as recent EFT JHEP04(2018)149 treatments like [32]. Then the resummed b-space cross section is Fourier transformed back to momentum space via eq. (1.6). The main issue with this procedure is that the strong coupling α s (µ) in the soft function and TMDPDFs is then evaluated at a b-dependent scale µ S,f ∼ 1/b 0 , which enters the nonperturbative regime at sufficiently large b in the integral in eq. (1.6). So the integrand must be cut off before reaching the Landau pole in α s . There are quite a few procedures in the literature to implement precisely such a cutoff by introducing models for nonperturbative physics, see e.g. [33][34][35][36][37].
Motivated by these observations, in this paper we explore the following main questions: • Even though the natural scale for minimizing the logarithms in the soft function and TMDPDFs is a function of the impact parameter b, can we actually set scales directly in momentum space, after performing the b integration? (without an arbitrary cutoff of the b integration?) • If that is possible, can we obtain a closed-form expression for the cross section which will be accurate to any resummation order and ultimately save computation time?
In section 2 we shall propose a way to answer the first question, and in section 3 we shall develop a method to answer the second. To aid the reader in quickly grasping the main points of our paper, we offer a more detailed-than-usual summary of these sections here, which is somewhat self-contained and can be used as a substitute for the rest of the paper upon a first reading. Readers interested in the details of our arguments can then delve into the main body of the paper. Except for a brief discussion near the end, we emphasize we address only the perturbative computation of the cross section in this paper.

A hybrid set of scale choices for convergence of the b integral
Regarding the first question, the issue with leaving the µ, ν scales for the soft function and TMDPDFs unfixed before integrating over b in eq. (1.6) is that the integral, while avoiding the Landau pole from long-distance/small-energy scales, is then plagued by a spurious divergence from large-energy/short-distance emissions [29], e.g. at NLL accuracy: dσ dq 2 T dy = σ 0 π(2π) 2 C 2 t (M 2 t , µ T )U NLL (µ H , µ T , µ L )H(Q 2 ; µ H ) db bJ 0 (bq T ) S(b; µ L , ν L ) although H, S, f ⊥ are truncated to tree level at NLL while C t = α s . The hard scale µ H is usually set to iQ to implement what is called π 2 resummation to improve perturbative convergence [38,39]. This integral, as we will see below in eq. (2.33), is still divergent. At this point, µ L and ν H,L are b-independent and cannot help with regulating the integral. What we need in eq. (1.12) is a factor that damps away the integrand for both large and small b. In this paper, we adopt the approach that there are already terms in the physical cross section itself that can play the role of this damping factor and that we should use them. Namely, at NLL order and beyond, the soft function evaluated at the low scales µ L , ν L in JHEP04(2018)149 the integrand of eq. (1.12) contains logs of µ L b 0 that we can use to regulate the integral, see eq. (A.24). Since S(b, µ L , ν L ) no longer contains large logs (if µ L , ν L are chosen near the natural soft scales), it is typically truncated to fixed order (see table 1). However, we know that the logs themselves still exponentiate, being predicted by the solution eq. (A. 21) to the RG and νRG equations. If we could keep the exponentiated one-loop double log in S in eq. (A.24) in the integrand of eq. (1.12), exp αs(µ L ) 8π Z S Γ 0 ln 2 µ L b 0 , where Z S = −4, it would play precisely the role that we desire. Now, as we argue below, if we are going to keep this term exponentiated, we should also include a piece of the 2-loop rapidity evolution kernel ∼ α 2 s ln 2 µ L b 0 ln(ν H /ν L ) given by eqs. (2.26) and (2.27) in the exponent, as it is of the same form and same power counting, so that the terms we wish to promote to the exponent of eq. (1.12) at least at NLL order are: These are terms that would otherwise be truncated away at strict NLL accuracy. Since they are subleading, we are in fact free to choose to include them (and not other subleading terms that are formally of the same order). While this is admittedly a bit ad hoc, we take the view that it is no more arbitrary than any regulator or cutoff we might choose to introduce to eq. (1.12), and these are terms that actually exist in the expansion of the physical cross section. We can rephrase this choice of subleading terms in eq. (1.13) to include in eq. (1.12) as part of our freedom to choose the precise scale ν L in eq. (1.12) (the variation of which anyway probes theoretical uncertainty due to missing subleading terms). Namely, if one were otherwise to choose ν L ∼ µ L in S in eq. (1.12), we propose then shifting that choice to: which we derive in eqs. (2.36) and (2.38). This achieves the shifting of the terms in the exponent of eq. (1.13) that would otherwise be truncated away into the integrand of eq. (1.12) where they appear explicitly, and can be used to regulate the b integral. This particular choice of regulator factor in eq. (1.14) is motivated, furthermore, by the fact that it will allow us actually to evaluate the b integral eq. (1.12) (semi-)analytically, as we show in section 3. Maintaining a Gaussian form for the exponent in ln b inside the b integral will be crucial to this strategy.
Beyond NLL, we will choose to keep the same shifted scale choice eq. (1.14), but to ensure that we do not introduce higher powers of logs of µ L b 0 than quadratic into the exponent of the integrand in eq. (1.12), we make one additional modification to how we treat the rapidity evolution kernel. Namely, in the all orders form of the rapidity evolution kernel: we divide the anomalous dimension into a purely "conformal" part containing only the diagonal pure cusp terms with a single log of µ L b 0 and the same for the non-cusp part γ RS . We divide the rapidity evolution kernel eq. (1.15) into corresponding "conformal" and "non-conformal" parts: where V Γ contains pure anomalous dimension coefficients, 18) and V β contains all the terms with beta function coefficients, whose expansion is shown in eq. (2.44). We will keep V Γ exponentiated as in eq. (1.18), and the shift ν L → ν * L in eq. (1.14) will turn it into a Gaussian in ln µ L b 0 and thus allow us to carry out the b integral in eq. (1.12) (updated beyond NLL). However, to keep this Gaussian form of the exponent, we will then choose to truncate V β at fixed order. The logs of µ L b 0 in V β will give integrals in eq. (1.12) that we can carry out by differentiating the basic result we obtain in section 3.
Admittedly, this expansion and truncation of V β is not part of any usual scheme for N k LL resummation, but is our addition. In particular, V β still contains large logs of ν H /ν L as seen in eq. (2.44). This means that starting at NNLL order, we will not actually exponentiate all the logs that appear at this accuracy, as usual log counting schemes in the exponent require. This is the price we choose to pay for the (semi-)analytic solution we obtain in section 3, which requires a Gaussian exponent in ln b in the b integrand. This is essentially an implementation of Laplace's method for evaluating the b integral. As we argue in section 2.3.3, our truncation of V β in fixed order is not as bad as failing to exponentiate large logs involving µ (which we do exponentiate) would be. There is never more than a single large log of ν H /ν L appearing in the exponent of the rapidity evolution. Thus, the series of terms in the fixed-order expansion of the exponentiated V β are suppressed at every order by another power of α s . 1 Our expansion of V β should be viewed as asymptotic expansion, which, indeed, we find truncating at a finite order yields a good numerical approximation to the resummed cross section (within the theoretical uncertainties otherwise present in the resummed cross section at N k LL accuracy) in the 1 In the µ evolution kernels, the exponents, e.g. eqs. (A.7a) and (A.8a), themselves contain higher and higher powers of large logs of µH /µL, and truncating any part of it to fixed order would not be sensible. Truncating V β in eq. (2.44) to the same order as other corresponding genuinely fixed-order terms at N k LL accuracy makes more sense. Loosely speaking, we maintain counting of logs in the exponent for most of the cross section, except for V β , in which we revert to older log counting in the fixed-order expansion (N k LLE vs. N k LLF in [27]).

JHEP04(2018)149
perturbative region. 2 Note, furthermore, that in the conformal limit, V β = 1, and the exponentiated part V Γ of the rapidity evolution would be exact.
We should point out that, through the shift eq. (1.14), we do introduce b dependence into our choice of scale ν L , so we would not call our resummation scheme entirely a momentum-space scheme. (See [40,41] for such proposed methods.) We do, however, leave the µ L scale unfixed until after the b integration, and this still allows us to avoid integrating over a Landau pole in α s (µ L ) in eq. (1.6).
In section 2 we also use our freedom to determine exactly where µ L ∼ q T should be in order to improve the convergence of the resummed perturbative series. We argue it should be set at a value such that other unresummed fixed-order logs make a minimal contribution to the final momentum-space cross section. For small values of q T , this scale turns out to be shifted to slightly higher values µ L ∼ q T + ∆q T . Without making such a shift we find instabilities in the evaluation of the cross section. This is similar in spirit to the shift µ L → q T + q * T proposed in [42,43], though not identical in motivation, implementation, or interpretation in terms of nonperturbative screening.

A semi-analytic result for the b integral with full analytic dependence on momentum-space parameters
If we stopped there, our choice of scale eq. (1.14) might be no more than just another in a long series of proposed schemes to avoid the Landau pole in eq. (1.6), and, in addition, our division of the rapidity evolution kernel eq. (1.15) into an exponentiated and a fixed-order truncated part in eq. (1.17) would be quite unnecessary and inexplicable. However, what we find in section 3 is that all of these scheme choices together yield a form of the b-space integrand eq. (1.8) that is Gaussian in ln b so that we can integrate it analytically into a fairly simple form, modulo a numerical approximation for the pure Bessel function in eq. (1.6). The dependence on all physical parameters and scales such as q T , µ H,L , ν H,L , is obtained analytically. We now briefly summarize our procedure and results. With the division eq. (1.15) of the rapidity evolution kernel and the scale choice ν * L in eq. (1.14), the momentum-space cross section eq. (1.6) can be written in the form, given in eq. (3.1), where we isolated the b integral, F contains the fixed-order terms, including powers of logs of µ L b 0 , contained in the soft function, TMDPDFs, and the part V β in eq. (1.17) of the rapidity evolution kernel that we choose to truncate at fixed order. As we will show in section 3, the exponentiated part of 2 We expect that the way in which it breaks down for small qT will yield clues to the behavior of the nonperturbative contributions to the cross section, which however are not the subject of this paper.

JHEP04(2018)149
the rapidity evolution kernel V Γ in eq. (2.43), with the scale choice ν * L in eq. (1.14), can be written in the form of a pure Gaussian in ln b, where C, A, χ are functions of the scales µ L , ν L , ν H and the rapidity anomalous dimension, given explicitly in eq. (3.6). In particular A ∼ Γ[α s (µ L )]. If we could figure out how to integrate this Gaussian against the Bessel function in eq. (3.2), we would be done. Now, the presence of terms in F in eq. (3.2) with nonzero powers of ln µ L b 0 can be obtained from the basic result by differentiation, as we will derive in section 3.3, so we really only need to figure out how to evaluate the basic integral, where Ω ≡ µ L e γ E χ/2. Now, our mathematical achievements in this paper do not reach so far as to evaluate eq. (1.22) analytically in its precise form. We will, however, develop a procedure to evaluate it in a closed form, with analytic dependence on q T , A, Ω (and thus all scales and anomalous dimensions), to arbitrary numerical accuracy determined by the goodness of an approximation we use for the Bessel function. We find a basis in which to expand the pure Bessel function, in which just a few terms are sufficient to reach a precision better than needed for NNLL accuracy in the resummed cross section, and which can be systematically improved as needed. The details of this derivation are in section 3, but we summarize the key steps here.
The first step is to use a Mellin-Barnes representation for the Bessel function, where the contour lies to the left the poles of the gamma function Γ(−t), so c < 0. The choice c = −1 turns out to be well behaved, and useful as it is closely related to the fixedorder limit of eq. (1.20) (see section 3.3.2). This trades the b integral in eq. (1.22) for the t integral, and we obtain where we parametrized the contour in eq. (1.23) as t = c + ix, and where t 0 = −1 + AL, where L = ln(2Ω/q T ). We also used the reflection formula Γ(−t)Γ(1 + t) = −π csc(πt). It may appear that we are no farther along than when we started with eq. (1.22)-we still have to do the x integral. However, we now observe that thanks to the form of the Gaussian with a width ∼ √ A, which vanishes in the limit α s → 0, we only need to know the rest of the integrand, in particular

JHEP04(2018)149
in a fairly small region of x. In fact we shall not need it out to more than |x| ∼ 1.5 for any of our applications. Thus if we can find a good basis in which to expand f where every term gives an analytically evaluable integral in eq. (1.24), we shall be in good shape. Now, this would not have been a good strategy in eq. (1.20) for the Bessel function itself, as it is highly oscillatory out to fairly large b, and the Gaussian does not damp the integrand away quickly -its width only grows as α s (i.e. A) goes to zero. However, inside eq. (1.24), we find an expansion of f (eq. (1.25)) in terms of Hermite polynomials H n to work very well: where we now pick c = −1 and factor out Gaussians with widths set by a 0 , b 0 which closely (but not exactly) resemble the real and imaginary parts of Γ(1 − ix) 2 itself, near x = 0. Their departures from an exact Gaussian are accounted for by the remaining series of Hermite polynomials. It would be natural to choose the scaling factors α, β for the Hermite polynomials to be α 2 = a 0 and β 2 = b 0 , but instead we leave them free, to be determined empirically to optimize fast convergence of the series. We find we can get sufficient numerical accuracy acceptable for NNLL accuracy in the final cross section with just a few (3 or 4) terms in each series, real and imaginary. The coefficients c n in eq. (1.26) still have to be determined by the numerical integrals eq. (3.25), which unfortunately prevents us from having a fully analytic result for the momentum-space cross section. However, the series eq. (1.26) with these numerical coefficients depends only on properties of the pure mathematical function Γ(1 − ix) 2 itself -not on any physical parameters. The dependence on these we keep analytically. All that is left is to evaluate analytically the integral of each Hermite polynomial against the Gaussian in eq. (1.24), leading to the result we derive in eq. (3.37), where each term H n is defined by the integral, each of which has the closed form result, (1.29) the first several of which are written out explicitly in eq. (E.14). In eqs. (1.28) and (1.29), z 0 = A(π/2 + iL), in terms of which the integral eq. (1.24) can be written, the shifted exponents arising from absorbing the sine function in eq. (1.24).
The results eq. (1.29) for the integrals eq. (1.28) are the primary mathematical result of our paper. The final and primary physics result of our paper, eq. (3.77), the resummed JHEP04(2018)149 cross section in momentum space, is given in terms of the analytic result eq. (1.27) for I 0 b above. While a first glance at these formulas may not be particuarly illuminating, we would like to emphasize that the results eq. (1.29) of the integrals eq. (1.28) in terms of which the final result is written contain within them explicit dependence on all the physical parameters such as q T and the scales µ L,H , ν L,H that one would want to vary not only to evaluate the cross section but estimate its theoretical uncertainties. This is made very fast to compute by our explicit analytic formula, modulo only the numerically computed coefficients in eq. (1.26), but that can be done once and for all, for any TMD observable or kinematics.
It is important to emphasize that the result eq. (3.77) we give for the resummed momentum-space cross section represents, then, a triple expansion: • Perturbative expansion: usual expansions in α s of matching coefficients and resummed exponents in eq. (1.19), counting α s ln(µ H /µ L ) ∼ 1 or α s ln(ν H /ν L ) ∼ 1, and fixed-order tails (not shown in eq. (1.19)).
• V β expansion: the additional truncation of the V β part of the rapidity evolution kernel in eqs. (1.17) and (2.54) to a fixed order in α s , according to table 1, makes possible the integration of a rapidity exponential eq. (1.21) Gaussian in ln b, and behaves as an asymptotic expansion. This expansion becomes exact in the conformal limit of QCD.
• Hermite expansion: the integral of the Gaussian in eq. (1.21) against the Bessel function in eq. (1.22) is performed in terms of analytic integrals, by expanding J 0 through the representation eq. (1.23) and the series of Gaussian-weighted Hermite polynomials eq. (1.26), truncated to a finite number of terms, as needed to achieve a numerical accuracy in the cross section better than the perturbative uncertainty already present.
These are the expansions we find necessary to obtain the analytical (up to the numerical Hermite coefficients) result for the cross section in eq. (3.77). Each expansion is systematically and straightforwardly improvable. The last two expansions could be avoided if one is satisfied with a fully numerical evaluation of the b integral in eq. (1.20). We find the expansions worthwhile as they yield the faster and similarly accurate formula eq. (3.77). 3 In the rest of the paper, we will do our best to make clear which expansion(s) are being used at each stage. The remainder of our paper is organized as follows. Before concluding section 3 we match our resummed result onto fixed-order perturbation theory in section 3.3.3 and obtain and illustrate our results resummed to NNLL accuracy and matched to O(α s ) fixed order. In section 4 we offer some comments about expected nonperturbative corrections to our perturbative predictions, and in section 5 we survey other methods to resum TMD cross sections in the literature as compared to ours. We conclude in section 6. In the appendices

JHEP04(2018)149
we offer an array of technical results we need to evaluate the integrals and cross sections in the rest of the paper, as well as some alternatives to particular choices of schemes or methods we made in the main body of the paper.

Resummed cross section
In this section, we first review RG and rapidity RG methods to resum logs of separated hard and soft/collinear virtuality scales and collinear and soft rapidity scales in TMD cross sections. We review a standard procedure to set scales in impact parameter space, and then inverse Fourier transforming to momentum space. Then we propose a hybrid scale setting scheme where the soft rapidity scale is chosen to depend on b, but the virtuality scales are chosen only after we transform back to momentum space, allowing evaluation of the b integral without encountering a Landau pole. We also organize the rapidity evolution kernel in a way that anticipates making use of it to perform the b integral semi-analytically in section 3. We also address the choice of the soft virtuality scale itself in momentum space to ensure stable power counting of logs.

RGE and νRGE solutions
We defined the b-space cross section in eq. (1.8). The cross section is independent of the virtuality and rapidity factorization scales µ, ν, but each factor H, S, f ⊥ does depend on them, and contains logs of ratios of the scales µ, ν to their "natural" virtuality or rapidity scales µ H , (µ L , ν L ) and (µ L , ν H ), at which no large logarithms exist. Thus we would like to evaluate each factor at these separate scales, and then use RG and νRG evolution to take them to the common scales (µ, ν) at which the cross section is evaluated. The solutions to these evolution equations are in a form where the large logs of ratios of separated scales are resummed or exponentiated.

Hard function
The hard function H = |C| 2 depends only on the virtuality scale µ, and obeys the RGE, where the anomalous dimension takes the form, where Γ cusp is known as the cusp anomalous dimension, the proportionality constant Z H = 4, and γ C [α s ] is the non-cusp part of the anomalous dimension. The cusp anomalous dimension can be written as an expansion in the strong coupling α s (µ).

JHEP04(2018)149
The RGE eq. (2.1) has the solution where the evolution kernel is and U H = |U C | 2 . The pieces K γ , η Γ , K γ of the evolution kernel are given by: Explicit expressions for these kernels up to NNLL accuracy are given in appendix A.1.
For the case of the Higgs production, we have another Wilson coefficient (C 2 t ) obtained from integrating out the top quark. So in addition to the hard function, we also have a running for this coefficient.
The anomalous dimension takes the general form where γ i is a number. The RGE has the solution where the evolution kernel is Explicit expressions for these kernels up to NNLL accuracy are given in appendix A.1.

Soft function and TMDPDFs
The soft function in b space obeys the µ-and ν-RGEs, while the TMDPDFs/beam functions obey The µ anomalous dimensions take the form: where µ and ν independence of the cross section require In γ f µ we recall the large rapidity scales are given by p ± = Qe ±y = x 1,2 √ s for the two colliding hard partons. Note p + p − = Q 2 . As for the form of the ν anomalous dimensions, at one-loop fixed order in perturbation theory, they take the values the µ-scale at which rapidity logs are minimized in b space. Beyond O(α s ), the form of the ν anomalous dimensions can be deduced from the consistency relation: Solving this equation in µ, we obtain The solutions of the µ and ν RGEs for S and f ⊥ are:

JHEP04(2018)149
where each pair of equalities accounts for two, equivalent paths for RG evolution in the two-dimensional µ, ν-space (see figure 1). The evolution kernels U S,f in the µ direction are: Note that the µ anomalous dimension for f ⊥ in eq. (2.14b) does not have a log of µ in its cusp anomalous dimension term, so no K Γ term appears in its evolution kernel U f in eq. (2.20b). Meanwhile, the ν evolution kernels V S,f are given by integrals over ν of eq. (2.18),

RG evolved cross section
We can now put these pieces together to express the cross section eq. (1.8) in terms of the hard, soft, and beam functions evolved from their natural scales where logs in each are minimized, and thus logs in the whole cross section are resummed: in which we observe that the explicit dependence on the arbitrary scales µ and ν has exactly canceled out, leaving only the dependence on the natural scales µ H,T,L,b and ν L,H where JHEP04(2018)149 the hard, soft, and beam functions live. Note U C 2 t , K γ C 2 t are present only in the case of the Higgs.
In eq. (2.24), we envision that the rapidity evolution takes place at (or around) the scale 1/b 0 (see figure 1). Then we can actually just expand the evolution factor η Γ (1/b 0 , µ L ) and the rapidity anomalous dimension γ RS in a fixed-order expansion in α s (µ L ), to the order required for N k LL accuracy. This is in fact what we will do below. Then it becomes useful to split up U tot in eq. (2.24) into two factors, which are of course just U H (µ L , µ H )U C 2 t (µ L , µ T ) and V S (ν L , ν H ; µ L ) as given by eqs. (2.5), (2.10), and (2.21a). For brevity in the rest of the paper we will just use U, V in eq. (2.26).
Inside V in eq. (2.26), we use the fixed-order expansion of γ S ν (µ L ) given in eq. (2.18) using the expansions eq. (A.8b) for η Γ and eq. (A.22) for γ RS : In practice we truncate this expansion at the appropriate order of logarithmic accuracy. We will always pick µ L in such a way that none of these generate large logs (either µ L ∼ 1/b 0 in b space, or in momentum space in such a way that they remain small after inverse transformation -see section 2.3.5), except the factor of ln(ν H /ν L ) in eq. (2.26). This is an observation that will become key below, when we split γ ν into two separate parts in section 2.3.3.

How to choose the scales?
To evaluate the cross section eq. (2.22) (and its inverse Fourier transform back to momentum space eq. (1.6)) explicitly, we need to make explicit choices for the scales µ H,L and ν L,H between which to run in eq. (2.24). Choosing these near the scales at which the logs in each individual function are minimized in principle achieves resummation of all large logarithms. However, these natural choices are different in impact parameter and momentum space.

JHEP04(2018)149
There are various possible ways in which this resummation can be handled. In this paper, we envision, in eqs. (2.25) and (2.26) and figure 1, running the hard function in µ to the natural low scale of the soft and collinear functions, and the soft function in ν to the natural rapidity scale of the TMDPDFs. The high scales µ H and ν H for the running of the hard and soft functions are unambiguously best chosen near the invariant mass ∼ Q of the gauge boson. The choices of the low scales µ L and ν L are under debate since we can choose those scales either in b space or in momentum space.
The µ scales, like in a usual EFT, are a measure of virtuality of the modes that contribute to that function. For the hard function, this virtuality scale, not surprisingly, is the hard scale Q, which also happens to be the scale choice for which the logarithms in the hard function are minimized. The virtuality for soft and beam functions is of the order of the transverse momentum that the function contributes to the total transverse momentum. This can be seen in momentum space where the product of these functions in impact parameter space turns into a convolution over transverse momentum, eq. (1.1). Since the total transverse momentum is a sum over the transverse momenta contributed by each function, for a given total q T , the contribution of any one of these functions traverses a range of scales. While this situation is not unique to this observable, what is different is the dependence of the TMDPDF on the hard scale Q. As we will see, due to this Q dependence, the conjugate natural scale to b 0 in the resummed result is no longer q T but is shifted away from q T towards Q.
However, the final aim of any resummation is to have a well behaved perturbative series. Whenever the fixed order logs become too large, the expansion in α s does not converge, and it becomes necessary to reorganize the series in terms of resummed exponents. A successful resummation is then one in which the fixed order terms that are left behind form a rapidly converging series in α s . Since the large logarithms are, in fact, the terms that spoil the convergence of the fixed order perturbative series, the general strategy is then to minimize the effect of these logarithms in the residual fixed order series.
Keeping these issues in mind, we explore two possible sets of scale choices for µ L , ν L for resummation: the standard choices in impact parameter space in section 2.2, and a new proposed set of choices in section 2.3 allowing evaluation of the resummed cross section in momentum space.

Scale choice in impact parameter space
To choose scales for resummation, we need some idea about the natural scales at which each of the three functions (hard, soft and TMDPDF) live. This is easily seen by looking at their behavior up to one loop. From the results given in appendix A, we find that each of these functions are function of the logs, given in impact parameter space for S, f ⊥ . In this space, it is perfectly evident from the fixed-order calculation that the natural scale which minimizes all the logs in µ for the soft and beam functions is µ = µ L ∼ 1/b 0 . Since the final cross section at a given q T involves

JHEP04(2018)149
an integral over a range of b, the scale choice is in fact spread over a range of scales. This is to be expected from the earlier discussion of their being no unique physical scale for the soft and beam functions. The natural scales for the various functions then are µ = µ H for the hard function, (µ, ν) = (µ L , ν L ) for the soft function and (µ, All the logs can then be resummed by running the hard function from the scale Q to 1/b 0 and the soft function in ν from Q to 1/b 0 . This will produce the result (for the central values, not counting scale variations) of the CSS formalism. Therefore, this scheme resums logarithms of the form ln(Qb 0 ). The power counting adopted for this resummation then is straightforward since there is only one type of log. It is usually chosen as α s ln(Qb 0 ) ∼ 1. Leading log (LL) accuracy then resums α n s ln n+1 (Qb 0 ), with NLL and NNLL down by one and two powers of the logarithm respectively.
Since the lower scales µ L are chosen in b space, the cross section involves an inverse Fourier transform over arbitrarily large values of b, so eventually we hit the nonperturbative scale which manifests itself in the form of the Landau pole: α s (1/b 0 ). This corresponds to the fact that the beam and soft functions can contribute arbitrarily small values of transverse momentum even when the total transverse momentum is perturbative. This is usually handled by putting a sharp or smooth cutoff in b space which provides a way to model nonperturbative physics [33][34][35][36][37]. The impact of these nonperturbative effects will be discussed in section 4.
The obvious advantage of this scheme is that the power counting is unambiguous and we can guarantee that with the central values of scale choices in b space, all the logs in the residual fixed order series are set exactly to zero. As far as the choice of central values is concerned, the terms that are resummed are exactly equal to the CSS resummation formalism [10]. However, due to the introduction of the new rapidity renormalization scale ν, there is much better control over which terms can be included in the exponent and which terms remain in the fixed order [28,29]. This directly translates into a much better estimates of error due to missing higher order terms.
Another advantage of having control over what exactly goes in the exponent is when we match the resummed cross section to the fixed order cross section at large q T . To maintain accuracy over the full (perturbative) range of q T (Q ≥ q T Λ QCD ), we need to turn off resummation at the value q T where the nonsingular contribution is the same order as that of the singular one. Due to the two independent scales µ, ν available, this can be done very easily by using profiles in these scales to smoothly turn off the resummation and simultaneously match onto the full (including non-singular pieces) fixed order cross-section. This technique was implemented for the Higgs transverse spectrum in [32] to obtain the cross section to NNLL+NNLO accuracy. In this paper, for the purposes of comparison with other resummation schemes, we present the results for the cross section at NNLL accuracy for both the Higgs and DY using this scheme (figure 2).
After having decided on the central values, we next need to estimate the size of higherorder perturbative corrections we have missed by using scale variations. This is accomplished by varying the two renormalization scales µ and ν independently as detailed in [32]. Since we resummed and chose scales in b space, our final result involves an inverse Fourier JHEP04(2018)149 Figure 2. Result of b-space resummation for Higgs production and DY, setting the central values for the low scales for resummation at µ L , ν L = 1/b 0 and varying around these by factors of 2 to obtain the uncertainty bands. The plots in momentum space are obtained by an inverse Fourier transform., which hits a Landau pole for large b, which must thus be cut off, see figure 3. The uncertainty for the DY curve is actually thus underestimated due to an ambiguity in this cutoff, see also figure 3.
transform over the resummed b space result: It turns out the cusp anomalous dimension for DY (Γ 0 = 4C F ) is much lower in magnitude than that for Higgs (Γ 0 = 4C A ). This results in a much lower damping effect at large values of b, see figure 3. The plot shows the b space integrand K(b) for q T = 5 GeV. The divergence beyond b = 3 in the left-hand plot in figure 3 is the Landau pole. It is clear that there is only a narrow window of stability before we hit the Landau pole. The situation worsens when try and do a scale variation about the central choice µ L ∼ 1/b 0 , specifically µ ∼ µ L /2, as shown in the right-hand plot of figure 3. What this does is to bring the Landau pole closer by factor of 2, in which case there is no clear separation between the perturbative and nonperturbative regimes. It is then unclear how a hard cut-off or even a smooth one would give an accurate estimation of the perturbative uncertainty band. The error bands in figure 2 for the case of DY at NNLL, therefore cannot be generated meaningfully via this scale variation. For the purposes of error estimation then we were forced to make an educated guess for the upper boundary of the error band at NNLL. Clearly, we would like to find a procedure that does better.

Resummation in momentum space
The Landau pole in the b-space resummation scheme above comes about due to the running of the strong coupling α s all the way down to 1/b 0 which goes all the way down to Λ QCD . A natural question to ask, then, is if we can avoid the Landau pole by choosing the µ scale JHEP04(2018)149 On the left is the result of the central value µ L = 1/b 0 for the low virtuality scale, for which a somewhat stable plateau exists before hitting the Landau pole above b ∼ 3 GeV, allowing imposition of a cutoff to which the result is insensitive. On the right is shown the result of varying this scale down by a factor of 2, in which case the pole moves to lower b, and no stable region for a reasonable cutoff exists.
in momentum space. It is clear from the above discussion that we cannot choose a single scale in momentum space which will put all the logs to 0. However, what we can ask is whether it is possible to make an appropriate choice for µ, ν directly in momentum space such that the resummed exponent, on an average, minimizes the contribution from the residual fixed order logs. These small logs, though nonzero, can then be included order by order ensuring that they only contribute to the same order as the error band.

Leading logs
As before let us assume the power counting α s ln(µ H /µ L ), α s ln(ν H /ν L ) ∼ 1, where µ L , ν L will be our choices of the renormalization scales in momentum space. According to this power counting then, at leading log (LL) we wish to resum terms of the form α n s ln n+1 ( µ H µ L ) in the exponent of the cross section. We also assume that the residual logs of the form ln(µ L b 0 ) are small (of O(1)) and need not be resummed. The resummation then involves running just the hard function from µ H ∼ Q to µ L . All the residual fixed order logs as well as logs in ν are then subleading at this order. The cross section we get is where we can obtain U LL H from eq. (2.5). This is highly singular and gives a trivial result at nonzero q T . This suggests that the supposedly higher order pieces that we are ignoring are not unimportant. Let's look at the fixed order pieces left over at one loop: 31) The biggest term here appears to be ln(Q/µ L ) ln(µ L b 0 ). According to our naive power counting, this term is subleading at LL and hence should be ignored.

JHEP04(2018)149
Going to momentum space, this term gives us ln( Q µ L )/q T , which is in fact the leading logarithmic term in the fixed order cross section at nonzero q T . So then it appears that we haven't resummed any of the logs which contribute at nonzero q T , which makes sense since we get a trivial result at nonzero q T . We can then conclude that the LL cross section in eq. (2.30) with this power counting only gives us the result at zero q T . We will have to go to NLL to have any handles to "fix" it.

Next-to-leading logs
The next logical step is to go to NLL. We first update the hard function to resum all logs of the form α n s ln n (Q/µ L ). However this by itself is not useful since it will lead to the same problem as for the LL case in that it will only contribute at zero q T . But power counting demands that we also run the soft function in ν from the scale ν H ∼ Q to ν L ∼ µ L . Using eq. (2.18), We can then use the leading term of this anomalous dimension to resum the soft function: where ω s = Γ 0 αs 2π ln ν H ν L . This result works for q T > 0. Clearly then we still have a singularity in the cross section at ω s = 1. As was noted in earlier papers [29], this is a divergent series. The reason for this is that the logarithms in b space of the form ln n (µb 0 ) do not translate directly to logarithms of ln n (µq T ). The simplest example of this is the inverse Fourier transform of ln(µb 0 ). This gives us a term proportional to the plus distribution function L 0 = 1 µ 2 µ 2 /q 2 T + as defined in [29]. For nonzero values of q T , this is simply 1/q 2 T . The same thing happens for higher powers of logarithms. For example ln 2 (µb 0 ) also gives a term proportional to L 0 along with terms which go like ln(µ/q T ). So an all order summation of the logarithms in b space eventually adds up all of these pieces (whose coefficient is controlled by ω s ) which leads to a divergence for sufficiently large ω s . So this step softens the singularity somewhat moving it away from q T = 0, so that we at least have a nonzero cross section for nonzero q T . However, the result is still singular. The singularity results from the single logarithm of ln(µ L b 0 ) in the exponent which diverges at very small values of b. We would have expected that the integrand would receive its dominant contribution from the region b ∼ 1/q T . Instead, as it was noted in [29] it is the UV region b ∼ 1/Q which appears to dominate the integral.
Since our resummation kernel in b space is unstable, we cannot yet assign a power counting to the residual logs. This is because the fixed order logs (of the form α n s ln m (µ L b 0 )

JHEP04(2018)149
are weighted by this exponent in the inverse Fourier transform. So before we can talk of power counting for these logs, we must stabilize this kernel such that the region b ∼ 1/q T dominates. Clearly, to cure this singularity (which is a UV singularity) we need an even powered logarithm ln 2m (µb 0 ) with a negative coefficient in the soft resummation exponent. So we move ahead and attempt to see if we can resum the next biggest term (which would technically be subleading at this order) ∼ α s ln 2 (µ L b 0 ). There are two places we can find such a double logarithm, one from the second term in eq. (2.32), which is part of the 2-loop rapidity anomalous dimension, and another from the fixed-order soft function at 1-loop, see eq. (A.24), or eq. (2.31), which includes the beam function contribution. Since the term in eq. (2.32) is already part of the RRG exponent, including it just means tacking on a subleading term in the exponent, which we are free to do at NLL order. As for the fixed-order term in eq. (A.24), while standard resummation schemes tell us to truncate the logs in the soft function S(b, µ L , ν L ) at fixed order, these logs in the soft function, though not large, do actually exponentiate, see eq. (A.21). The higher order logs are subleading at NLL, but again, we are free to include them, either in fixed-order or exponentiated form. Specifically, together with standard soft exponent at NLL, the terms we want to put in the total soft exponent are: The typical default choice for the low rapidity scale is ν L = µ L . We imagine making a choice near this scale. But we will rescale it so that the extra ln 2 (µ L b 0 ) terms in eq. (2.34) get automatically included in the standard soft NLL exponent. We now attempt to reproduce the above terms in the pure NLL soft exponent, with a modified scale choice ν * L : This can be done with the scale choice: The value of p that allows us to obtain the double log terms in eq. (2.34) is determined by comparing them with the exponent in eq. (2.35): By comparing above equation to eq. (2.34) we find

JHEP04(2018)149
This ensures that we have now resummed all the terms of the form α s ln 2 (µ L b 0 ). (We assume one did not choose the default ν L scale on the right-hand side of eq. (2.36) with nontrivial b dependence.) Since Γ 0 > 0, the double log now provides the necessary stability to the exponential kernel in b space (at both large and small values of b). With this term in place, we can now talk of a systematic power counting for the fixed order logs, which, hitherto, was not meaningful. So we now adopt the usual power counting that ln(µ L b 0 ) ∼ 1, i.e., this log is small. We still need to confirm numerically, that the fixed order logs that remain , when integrated against our exponent in b space, are actually small so that our series is well behaved perturbatively. With this power counting in place, our result for the resummation at NLL then looks like dσ At NLL, all fixed order logs are subleading and hence not included. We will find below that not only does the quadratic ln 2 µ L b 0 term in the exponent of the b integrand in eq. (2.39) introduced by the scale choice eq. (2.36) make the integral converge for both small and large b, it actually makes it integrable analytically (after a very good numerical approximation for the Bessel function). We will describe this in detail in section 3. First, we explore how to generalize the above-described procedure beyond NLL.

NNLL and beyond
At NNLL and higher order, we have some freedom in choosing how to update the accuracy of the rapidity evolution kernel in eq. (2.26). In this subsection, we will use this freedom to look for a way to preserve both the stable power counting of fixed-order logs of µ L b 0 after integrating over b as well as our ability, in section 3, to evaluate that b integral (semi-)analytically.
A simple, standard, choice would simply be to include the next order terms in the rapidity anomalous dimension eq. (2.27), e.g. the O(α 2 s ) terms at NNLL, while keeping the scale choice ν * L in eq. (2.36) for the soft rapidity scale. There are two somewhat undesirable consequences of this choice, however. First, as we noted at the end of section 2.3.2, an exponent in the b integrand that is quadratic in ln µ L b 0 will make the integral analytically computable, to a very good approximation to be described in section 3-but not higher powers of ln µ L b 0 , which will begin to enter starting at N 3 LL order in eq. (2.27). Second, starting at NNLL order, maintaining the scale choice ν * L would put some terms into the exponent twice, namely, the Z S Γ 0 β 0 term in O(α 2 s ), once from the shift from ν L to ν * L eq. (2.37) in the O(α s ) term of the exponent, and once from updating the anomalous dimension appearing in eq. (2.26) with the two-loop value in eq. (2.27). Of course, this is compensated by the shift from ν L to ν * L in the fixed-order soft function. From eq. (A.24):

JHEP04(2018)149
where we see that the one-loop double log of µ L b 0 has canceled -the scale choice ν * L has promoted this term to the exponent in eqs. (2.26) and (2.27). The added O(α 2 s ) term subtracts off (at fixed order) the corresponding Γ 0 β 0 term that was double-added to the exponent at NNLL. While this is acceptable as far as power counting goes, it seems awkward to have this term double-counted in the exponent by itself. Similar observations apply to additional terms at higher orders.
There are a number of ways to avoid these problems, while maintaining the desirable quadratic ln 2 µ L b 0 terms in the exponent of the b integrand of the cross section. One possibility would be just to revert back to the ordinary scale choice ν L from ν * L beyond NLL, but for meaningful comparisons between orders of accuracy, we should maintain the same scale choice as we increase accuracy. Moreover this solution would not prevent higher than quadratic power terms in ln µ L b 0 from entering the exponent, which will spoil our analytic integration below. Since the choice ν * L is needed at NLL to stabilize the b integral in eq. (2.39) and restore a meaningful resummed power counting, we will go ahead and keep it beyond NLL as well. We will then need a prescription for how to update the rapidity evolution kernel in eq. (2.37) that respects the stabilization of the b integral at NLL, without introducing unwanted double counting of terms or higher powers of ln n>2 µ L b 0 as discussed in the previous paragraph.
Another possibility, then, is to just keep the NLL part of the exponent in the rapidity evolution kernel eq. (2.35), and expand out the NNLL and higher-order parts in α s , i.e.
At NNLL we would keep the second term of order α s in the bracket, and at N 3 LL two more terms of order α 2 s would be included, etc. This is not insensible, as the rapidity kernel in eq. (2.26) contains only a single large log (ln ν H /ν L ) multiplying the whole ν-anomalous dimension. Thus, while the NLL terms are all order α s × 1/α s ∼ 1 in log counting and should be exponentiated, the NNLL O(α 2 s ) part of the ν-anomalous dimension and higherorder terms are all truly suppressed by additional powers of α s . This is in contrast to the µ evolution kernels, e.g. eqs. (A.7a) and (A.8a), where there are infinite towers of terms of the same order in log counting, because terms at higher powers in α s are multiplied by large logs of µ/µ 0 . This is not the case in eq. (2.27), since higher order terms in α s generated by µ running do not come with large logs -we are doing the rapidity evolution at a scale µ L ∼ 1/b 0 , generating only small logs of µ L b 0 . All the effects of µ running between widely separated scales and their associated large logs are contained in U in eq. (2.26).
However, we do not need to be so draconian in truncating the terms we could resum using the RRG kernel in eq. (2.26). The terms in eq. (2.41) that we either exponentiate or truncate at fixed order are basically the same order as terms in the fixed-order expansion of the soft function S(b, µ L , ν * L ) given by eqs. (A.23) and (A.24) that are kept at each order of logarithmic accuracy, so there is a freedom in choosing which terms in the rapidity anomalous dimension eq. (2.27) and the soft function eq. (A.23) we will exponentiate or JHEP04(2018)149 leave in a fixed-order expansion, to obtain desirable behavior of the b-space integrand in eq. (1.6).
Let us then use this freedom to divide the terms in the rapidity anomalous dimension eq. (2.27) into two parts, those that we will exponentiate and those that we will expand out in fixed order. Namely, we will exponentiate all the pure Γ n and γ n RS anomalous dimension terms; these are at most single logarithmic in µ L b 0 , and the shift from ν L to ν * L introduces the double logs 1 2 Z S Γ n ln 2 µ L b 0 in the fixed-order soft function S associated with Γ n , see eq. (A.24); as well as (part of) the ∼ Z S Γ n β 0 ln 2 µ L b 0 term in the rapidity anomalous dimension eq. (2.27), which stabilize the b-space integrand. The remaining terms will be expanded out in α s , and these are all associated with all the beta function terms coming from α s running of the "pure" Γ n and γ n RS terms. Concretely, we split the rapidity evolution kernel in eq. (2.26) into: which remains exponentiated and contains all the "pure" anomalous dimension terms, and which is the fixed-order expansion of the part of the rapidity evolution kernel eq. (2.26) coming from all the remaining (β n ) terms in eq. (2.27) that are not included in eq. (2.43). This division eq. (2.42) ensures that the exponentiated part of the rapidity evolution kernel contains at most double logs of µ L b 0 after the shift from ν L to ν * L , and also avoids double counting of the β 0 induced terms in eq. (2.27) in the exponent as described above.
We can give a more formal definition of the two factors V Γ and V β in eqs. (2.43) and (2.44). Since they are built out of pieces of the anomalous dimension γ S ν (µ L ) in eq. (2.27), we go back to its all orders expression given by eq. (2.18): and divide up each term into pieces containing just the "pure" anomalous dimension coefficients and those generated by beta function terms. For γ RS this is easy: The first piece in the last line of eq. (2.46) contains those terms in the anomalous dimension that would survive in the conformal limit of QCD, where α s does not run. The second set of terms, in ∆γ RS contain all the beta function induced terms. For example, the ratio 1/r has the fixed order expansion up to O(α 2 s ): 48) and the "1" term is subtracted off in eq. (2.46), leaving just the β i terms. The similar thing happens for 1/r n+1 . We can similarly split up η Γ into two pieces. To all orders in α s , η Γ (1/b 0 , µ L ) is given by eq. (A.4), and is expanded out in fixed orders in eq. (A.8b). We want to split up η Γ into the "pure" Γ n ln µ L b 0 pieces along the diagonal of eq. (A.8b), and the remaining β i induced terms. We do this by very straightforwardly defining: contains the "pure" anomalous dimension terms in η Γ (the ones which would survive in the conformal limit), and ∆η Γ is given simply by We will keep η conf. Γ exponentiated part of eq. (2.42), while ∆η Γ will go into the part expanded in fixed orders in α s . The explicit expansion for ∆η Γ is of course given by eq. (A.8b) with the diagonal terms deleted, or can be worked out to all orders in α s (up to NNLL accuracy) from the expression in eq. (A.4). The corresponding expression for ∆η Γ up to terms of NNLL accuracy is then: where we notice the subtraction terms at the end of each line just modify the pure anomalous dimension Γ i /Γ 0 pieces, as designed. Note the following properties of the expanded

JHEP04(2018)149
functions of r = α s (µ L )/α s (1/b 0 ) that appear in each line of eq. (2.52): etc. So the remaining terms in the expansion of ∆η Γ in eq. (2.52) contain only the β i induced terms, that we want to put in to V β in eq. (2.42), as designed.
With the splitting up of terms in eqs. (2.46) and (2.49), we can formally define the two pieces V Γ , V β into which we have split the rapidity evolution kernel in eq. (2.42): indicating that V β is to be truncated to fixed order in α s according to table 1. This defines the "V β -expansion" we first mentioned in section 1.3. In this paper we shall not need it beyond the O(α 2 s ) terms given in eq. (2.44). Our master expression for the resummed cross section, using the scale choices and prescriptions we have explained above, is then: where U in eq. (2.26) and V Γ given by eq. (2.43) or eq. (2.54) are exponentiated: The β-function dependent part of the RRG kernel V β is also truncated at fixed order in our scheme, and is defined in eq. (2.54) and given by eq. (2.44) up to O(α 2 s ), the highest order we shall need in this paper.
In order for eq. (2.55) to successfully resum large logs of scale ratios, we recall that the scales µ H , µ L and ν H , ν L should be chosen near the natural scales of the respective hard, soft, and beam functions. µ H and ν H should be chosen ∼ Q. We expect ν L to be chosen ∼ µ L , and then shifted to ν * L according to eq. (2.36) to introduce the quadratic damping factor in the b exponent. As for µ L itself, it should be chosen ∼ q T in momentum space, although we will explore in section 2.3.5 a modified choice for this scale that better preserves stable power counting. For now, it remains a free scale. Table 1. Order of anomalous dimensions, beta function, and fixed-order functions (hard, soft, TMDPDF, and V β in eq. (2.54)) required to achieve N k LL and N k LL accuracy in the resummed cross section eq. (2.55).

Truncation and resummed accuracy
Here we summarize the rules for how to truncate the various objects in the full resummed cross section eq. (2.55). These rules are mostly standard and well known, but with our introduction of the division of the RRG kernel into exponentiated and fixed order pieces V Γ × V β , it seems prudent to restate these rules here. These are given in table 1.
We should note here that our choice of terms to group into the exponentiated part V Γ and those expanded in fixed order in V β in eq. (2.55) is not unique. Terms in V Γ and V β in eqs. (2.43) and (2.44) can be shifted back and forth by a different choice of prescription, and different choices of scales (such as ν * L in eq. (2.36)). Indeed, with our choice of the split between V Γ and V β and the scale ν * L , the rapidity evolution exponent contains only a subset of the terms in the full rapidity anomalous dimension eq. (2.27)-but the subset that allows an analytic evaluation of the b integral, as we will show in section 3. One may very well make a different set of choices that put a different set of terms in the exponent, based on a different set of desired criteria. This freedom is allowed by the presence of only a single large log in the RRG kernel in eq. (2.26) at the virtuality scale µ L . We present our particular choice as just one such example. We advocate that readers make use of the freedom to choose the scales ν L,f and µ L,H in eq. (2.55), either before or after b integration, along with the organization of terms in the RRG kernel eq. (2.42) into exponentiated and fixed-order parts (beyond NLL) to achieve their desired properties and results for the resummed momentum-space cross section. 4 The way we have organized the resummed cross section eq. (2.55), all logs of µ L b 0 are contained in the exponent of V Γ , and in fixed-order terms in V β , S, f ⊥ i . Furthermore, the power of ln µ L b 0 in the exponent eq. (2.56) with the scale choice ν * L is at most quadratic to all orders in α s . Our choices to arrange this property are motivated, as we will see later, by the fact that it enables us, using some approximations, to obtain an analytical expression for the b space integral. This is only possible as long as the quadratic nature of the exponent holds. At N 3 LL (or NNLL ) and higher order, the fixed-order coefficients V β , S, f ⊥ i contains other powers of ln µ L b 0 , but the contributions of these fixed-order logs can be dealt with analytically as well, as long as the exponent is no more than quadratic in ln µ L b 0 .

JHEP04(2018)149
2.3.5 Stable power counting in momentum space and the scale µ L We have not yet specified exactly what we will choose for the scale µ L . All of the above arguments are contingent upon the fact that our power counting ln(µ L b 0 ) ∼ 1 holds. What this means in momentum space is that after performing the b integral in eq. (2.55), fixedorder logs in the integrand of the form ln n µ L b 0 do not generate parametrically large terms after integration. This, it turns out, depends on what we pick for the scale µ L . Let us see what µ L we should choose for this power counting to remain true. It turns out that the "obvious" choice in momentum space µ L = q T is not always a very good choice for minimizing the contributions of the fixed order logs. This is not too surprising, since the kernel against which they are integrated in transforming back to momentum space is no longer a simple inverse Fourier transform with a single scale q T . Instead it involves an exponent which is also a function of the high scale. It is then natural that the scale at which the logarithms are minimized (if not put exactly to 0) is shifted towards the high scale. This effect is particularly pronounced at low values of q T .
For the rest of the fixed order logarithmic pieces, we then check what value of scale will minimize the contribution to the cross section ( 10%). Here, the nature of the resummed kernel can help us. The soft exponent in b-space provides damping at both small and large values of b which, combined with the Bessel function, gives an integrand which has the form of damped oscillations. Then it is reasonable to assume that the most of the contribution to the integral is coming from the around the region of the first peak. A ballpark choice for the scale µ then is 1/b * , where b * is the scale at which the resummation kernel has the first peak. Since the hard kernel is independent of b, we only need to consider the soft resummation. A straightforward analysis of the b-space integrand then gives the following condition for the peak J 0,1 (x) are the zeroth and first order Bessel Functions of the first kind respectively. We now set b * = 1/µ L and we can further simplify the expression above by keeping the dominant terms.
This can be solved numerically to obtain the scale µ L . We can also confirm this by checking the contribution of the leading term α s ln(µ L b 0 ) at O(α s ) and cross checking it against the next biggest piece α 2 s ln 2 (µb 0 ), α 2 s ln 3 (µb 0 ) at O(α 2 s ). While we are currently keeping only one-loop fixed order pieces in our NNLL cross section, we can always include the two-loop pieces α 2 s ln n (µb 0 ) which are known to us at NNLL (this would be part of the NNLL cross section). Figure 4 looks at the percentage contribution of the fixed order logs when integrated against the NNLL exponent in b space. This particular plot is for the Higgs q T distribution for q T = 10 GeV. It is pretty clear that µ = q T is a poor choice for µ and that a good choice would be somewhere between µ ∼ 20 GeV for the contribution from the fixed order logs to be ∼ 2%. In comparison, the value predicted by eq.   is 22 GeV. Figure 5 gives the scale choice for Higgs and DY (Choosing a common value of Q = 125 GeV and √ s = 13 TeV. For low values of q T , the scale is shifted away from q T toward Q as expected. The shift is far more pronounced for Higgs than for DY since the cusp anomalous dimension for Higgs is much larger so that the soft exponent has far more impact on the shape of the b-space integrand. At large values of q T , the scale is more or less q T . The key point to notice here is that apart from the single log α s ln(µ L b 0 ), which we will include in the fixed order cross section at NNLL, the contribution from the higher order logs shows a flat behavior at values of µ ≥ 15 GeV for q T = 10 GeV in figure 4. So the result is insensitive to the choice of µ L as long it is chosen greater than this threshold value.
Using these scale choices, we now obtain the transverse spectrum, again at NNLL (figure 6). The uncertainties are obtained by scale variations described in eq.  No cutoff of the b integral is required, and the µ L scale can be varied in its full range from µ L /2 to 2µ L to obtain the uncertainty, without hitting a pole as in figure 3. These plots are obtained by performing the b integral numerically. We stop plotting at low q T where true nonperturbative effects will enter.

Explicit formula for resummed transverse momentum spectrum
In this section we will provide an expression for the transverse momentum spectra of gauge bosons that is analytic in terms of its dependence on all the kinematic variables. It requires a numerical but efficient approximation for the Bessel function in the b integral in eq. (1.6) or eq. (2.55).
We can write the resummed cross section eq. (2.55) in the form where the we have isolated the terms in the b integral, grouping together the terms in the integrand that are to be expanded in fixed order in α s (µ L ), separating them from the exponentiated V Γ factor. The V Γ factor is explicitly to all orders in α s , plugging in eq. (2.36) for ν * L in eq. (2.56),

JHEP04(2018)149
which can always be written in the form where C, A, and Ω are independent of b and thus constants as far as the integral I b in eq. (3.2) is concerned. Explicitly, These show all the dependence on the scales and on the anomalous dimensions contained in V Γ . They are to be truncated to the order in resummed accuracy we intend to work (which we show at NLL and NNLL in appendix A in eqs. (A.38) and (A.39)).
Thus we now just have to figure out how to integrate the Gaussian V Γ in ln b in eq. (3.5) against the Bessel function in the integral I b in eq. (3.2). We will encounter integrals of the form where the factors ln k (µ L b 0 ) come from the fixed-order factor F in eq. (3.3). We will first focus on the case where F = 1 (e.g. at NLL) and compute I 0 b , and then discuss below how to generate the terms I k>0 b resulting from integrating the fixed-order terms containing logs of µ L b 0 against the rest of the integrand.
In the next two subsections we shall develop a method to evaluate I 0 b semi-analytically -with a numerical series expansion for the Bessel function but deriving the analytic dependence of eq. (3.7) on all relevant physical parameters including q T . Then in section 3.3 we shall show how to obtain arbitrary I k b from derivatives on I 0 b .

Representing the Bessel function
The main issue in doing the integrals in eq. (3.7) analytically is the presence of the Bessel function inside the integrand. The exponent, in our scheme, has terms only up to the quadratic powers of ln(µ L b 0 ). Analytic integration is then possible if we can approximate the Bessel function using a series of simpler functions, such as polynomials, which can be easily integrated given the quadratic nature of our exponent. However, a simple power series expansion fails to reproduce the Bessel function in the region, where it contributes to the integrand. This is basically the region up to b ∼ 2 GeV −1 . This is because the argument of the Bessel function is bq T and at larger values of q T (≥ 10 GeV), the power series expansion is no longer useful. We can possibly switch to the large b q T asymptotic form in terms of the cosine function, but then analytic integration is again not possible. An alternative way then is to rewrite the Bessel function in terms of an integral representation, so that the b space integrand can them be done exactly. This automatically JHEP04(2018)149 rules out any representations in terms of trigonometric functions, and given the discussion earlier, the most expeditious choice is if a polynomial representation can be used. One choice that we will find amenable to a polynomial expansion is the Mellin-Barnes type representation of the Bessel function, where the contour is to the left of all the non-negative poles of the Gamma function, i.e., c < 0. We give a proof of this identity in appendix E.1.
Going back to our all-orders b space integral I 0 b given by eq. (3.7), it now looks like (3.9) Since we do not have a Landau pole, we can extend the limit of integration in b space all the way to infinity, in which case, the b integral is now in the form of a simple Gaussian integral and admits the analytical result: Let's examine the Gaussian exponent in this integrand. A simple rearrangement gives us where t 0 = −1 + AL, which is a saddle point for this Gaussian and lies on the real line. In some sense, the t space integral is a dual of the b space integral since the degree of suppression inverts itself from one space to another. So then in a region of large A, we should stick to the b space integral, fit a polynomial to the Bessel function (which will now work since the integrand is highly suppressed). On the other hand for small A (which would be relevant for most perturbative series), we should go to t space.
If we parametrize the contour as t = c + ix, we have It is clear that the path of steepest descent passes through this saddle point (c = t 0 ) and is parallel to the imaginary axis. One can then consider doing a Taylor series expansion of the rest of the integrand along this path around the saddle point, truncating the series after a finite number of terms.

JHEP04(2018)149
The primary difficulty with a polynomial expansion is that to have percent level accuracy that we desire for this integral, we need to have a good description of f (t) out to x l ∼ 2A ln (10) at which value the exponent drops to 1% of its value. The factor of A is essentially the cusp anomalous dimension, (for e.g., it is 2α s C A /π for the Higgs) which is a small number. Considering the worst case scenario A ∼ 0.5, we would need x l ∼ 1.5, which clearly cannot be accomplished using a Taylor series expansion because the radius of convergence is around 1. We will, instead, find an expansion for f (t) in the next section in terms of (Gaussian-weighted) Hermite polynomials that performs quite well with just a few terms. There a few customizations of this expansion we will make to optimize fast convergence. Our particular procedure presented here is by no means unique, and we give a couple of alternative methods for expanding and approximating the integrand of I 0 b in appendix D. There are, undoubtedly, others that would work also.

Expansion in Hermite polynomials
One of the difficulties with finding a series expansion of f (t) in eq. (3.13) is that it grows exponentially for large |x|, with t = c + ix. We can factor out this exponential growth by using Euler's reflection formula: . (3.14) Then The exponential behavior for large |x| is now factored out in the sine function in front, and we can focus on finding a good expansion for Γ(−t) 2 . The integral eq. (3.12) is then: The sine can be written in terms of exponentials, which shift the linear and constant terms in the Gaussian exponential. It is straightforward to work out that the result is By changing variables in the second line from x → −x, we can write the result compactly as The remaining function Γ(−t) 2 is exponentially damped for large x. Indeed, Stirling's formula tells us that

JHEP04(2018)149
as |x| → ∞. The contribution of this exponential tail is further suppressed by the Gaussian factor it multiplies in eq. (3.18). On the other hand, near x = 0, the function Γ(−t) 2 itself closely resembles a Gaussian, times polynomials. To determine the curvature of the Gaussian, we look at the Taylor series expansion of Γ(−t) 2 near x = 0: It now remains to find a good series expansion for Γ(−c − ix) 2 that enables us to perform the integral in eq. This is the saddle point in the limit A → 0, i.e., when we are in the fixed order regime with the resummation turned off. This would still induce some imaginary part and hence oscillations in the exponent away from A = 0, but with a good expansion, this is not an issue. Thus one can try a series expansion for Γ(1 − ix) 2 in terms of Hermite polynomials H n (x) which form a complete orthogonal basis. They are well known, of course, but we nevertheless remind ourselves of the first several H n :

etc. They satisfy the orthogonality relation
We expand Γ(1 − ix) 2 in terms of these polynomials in the following way: We have introduced weighting factors e −a 0 x 2 and iγ E e −b 0 x 2 for the real and imaginary parts to help with faster convergence of the series, as they capture the behavior of Γ(1 − ix) 2 near x = 0, using the values of a 0 and b 0 obtained from the Taylor series expansion in eq. (3.20) at c = −1:

JHEP04(2018)149
Note that the real and imaginary parts of the l.h.s. of eq. (3.23) are respectively even and odd functions of x. Hence on the r.h.s. , we need even polynomials for the real part and odd polynomials for imaginary part. Although the relation eq. (3.22) would make it seem natural to pick α 2 = a 0 and β 2 = b 0 in the arguments of H n in the expansioneq. (3.23), we choose α, β to be floating, and will determine their optimal values to ensure the fastest convergence for this expansion. Using eq. (3.22), the coefficients in eq. (3.23) are given by These integrals still have to computed numerically, as far as we know, but note they are purely mathematical, having no dependence on any of our physical parameters, and need only be computed once. Thanks to the damped behavior of the integrand and the normalization factors in front, the expansion coefficients fall off fairly rapidly with n.
To make the series expansion well behaved, the width parameter of Gaussian functions should be positive definite: α 2 − a 0 > 0 and β 2 − b 0 > 0. These widths define the region of x where the function Γ(1 − ix) 2 is expanded in terms of the Hermite bases. For a narrow width, the series converges swiftly with n but is valid only in a narrow region around x = 0, while for broader width, the convergence is slower but the expansion is valid in a wider region around x. The Gaussian function in our integration in eq. (3.18) resolves the region |x − Aπ 2 | ∼ √ A and for the maximal value we encounter, A ∼ 0.5, the broadest region is up to |x| ∼ 1.5. The parameters α, β should be chosen so that the Gaussians in eq. (3.25) roughly match the width of this region and resemble Γ(1 − ix) 2 itself as closely as possible. We explored various values of these parameters such that the exponents α 2 − a 0 and β 2 − b 0 were between 1 and 10 and found that for small values ∼ 1 the convergence is slow and hence more terms are needed for an accurate description. For large values ∼ 10 the accuracy of the integration in eq. (3.18) is very good with a few basis terms but is not further improved by including higher order terms because the series expansion is resolving only a narrow x region compared to the one dictated by eq. (3.18). Empirical tests show that the series converges rapidly for α 2 − a 0 and β 2 − b 0 around 3 ∼ 5 while maintaining required accuracy over the range of x (from 0 to ±1.5) desired. Figure 7 shows the agreement between the exact result and series expansion up to 3 or 4 basis terms for the real (even) and imaginary (odd) parts, for: From staring at figure 7, one notices a residual deviation in the real part above x ∼ 1, which thus appears to be the potentially largest source of error from our method. However, the region of larger x in figure 7 is suppressed by the remaining Gaussian in eq. (3.18). The remaining deviation can easily be further suppressed if desired by including higher-order polynomials. In practice, at NNLL accuracy the cross section has ∼ 10% uncertainties, and the error due to our series truncation at n = 5 or 6 is significantly smaller than this perturbative uncertainty. This is clearly seen in figure 10, which shows the effect of increasing the total number of terms in the Hermite expansion from 6 to 7.
In terms of this expansion, the integration in eq. (3.18) is rewritten in terms of following basis of integrals where z 0 = Aπ/2 − i(c − t 0 ). For c = −1, z 0 = A(π/2 + iL). The integrals for odd n arising from the expansion in eq. (3.23) are obtained from eq. (3.28) with the substitutions α → β, a 0 → b 0 . The prefactors in front have been included in the definition of H n for later convenience. Now we go about evaluating analytically the integrals in eq. (3.28). There are a number of ways to do this, we choose one that seems particularly elegant.

Generating function method to integrate against Hermite polynomials
Using the generating function for Hermite polynomials, we can efficiently evaluate all the integrals H n in eq. (3.28) at the same time. By forming the infinite series,

JHEP04(2018)149
we are able to use the generating function relation eq. (3.29) to obtain a Gaussian integral on the r.h.s. By evaluating the integral on the r.h.s. and expanding the result back out in powers of t n , we will be able to obtain expressions for the individual H n .
Rescaling the integration variable and completing the square in the exponent on the r.h.s. of eq. (3.30), The exponent of the x-dependent Gaussian is complex, but the result of integrating it is just √ π (see eq. (E.5)). Thus, where we have expanded the exponential in t in eq. (3.31) in a Taylor series. We cannot directly read off the coefficients of powers of t in eq. (3.32) to obtain H n in eq. (3.30) due to the powers of the binomial in t, but using the binomial expansion and some reindexing, we can do so. The proof is given in appendix E.3, with the result: If one desires, eq. (3.33) can be written separately for even and odd n,

Explicit result of integration
Explicitly, the first several H n given by eq. (3.33), including H 0,1 given above, are given in eq. (E.14). The integral I 0 b in eq. (3.18) that we sought to evaluate in the first place is JHEP04(2018)149 then given explicitly by, for c = −1, where the coefficients c n are given by eq. (3.25). As many terms in the sum over n may be included to achieve the numerical accuracy desired. In practice, we include the first few terms in the sum over n (three or four even c 2n and three odd c 2n+1 coefficients), which gives us percent level accuracy for the cross section in the perturbative resummation region. Although the coefficients c n still need to be evaluated numerically, we note that they depend only on properties of the pure function Γ(1 − ix) 2 and need be determined only once (eq. (3.27)). Our Hermite expansion is applied only to this function, which arises solely from the Bessel function J 0 (z) which appears in the factorization convolution in eq. (1.6). The dependence on all the physical variables, such as q T , Q, and the scales µ L,H , ν L,H , enters through the evolution exponent in eqs. (3.5) and (3.6) and is captured analytically by eq. (3.37). For other processes or observables, the same basis and coefficients we used and resultant analytic integration eq. (3.37), should apply, though the number of terms needed to get good convergence (determined by the width of the evolution exponent eq. (3.5)) may vary.

Fixed-order prefactors in resummed expression
At NNLL (or NLL ) and higher orders, fixed order logarithmic terms of the form ln k (µ L b 0 ) appear in the prefactor F multiplying the resummed exponent in the integrand of eq. (3.2). Explicitly, F can be expanded: so that we can isolate integrals of each term in the form eq. (3.7). We can build each coefficient F In doing so we must take into account extra powers of ln µ L b 0 in the soft function due to the shifted scale ν * L = ν L (µ L b 0 ) −1+p in eq. (2.36) that we use. Then we have for the terms we need up to NNLL accuracy, at tree level:

JHEP04(2018)149
where i,ī = g for Higgs production and i = q for DY, and at O(α s ): Note that F vanishes due to the extra term from the shift ν L → ν * L since the double log term was promoted from the fixed-order soft function to the exponent in eq. (2.37), as designed (though this cancellation will no longer be exact once we implement profile scales in the next subsection). Up to NNLL accuracy, the only O(α 2 s ) terms we need come from the α 2 s piece of the soft function induced by ν L → ν * L , and the O(α 2 s ) terms in the V β piece of F in eq. (3.3), which from eq. (2.44) we see has the expansion The V (2) 3 coefficient actually just multiplies a small log ln 2 µ L b 0 , so strictly at NNLL we can drop it. Then the only relevant piece of F (2) we would need at NNLL accuracy is where the second term in F where we used eq. (3.6) and we have defined

JHEP04(2018)149
Using the final expression eq. (3.37) for I 0 b we can now write In the expression eq. (3.33) for H n , the variable χ only appears inside of which appears inside z 0 and H 0 in eq. (3.33). So we can writê To construct fixed order terms up to order k, we need up to k th derivatives of the L dependent pieces that make up I k b . There is very simple recursion relation satisfied by the derivatives of H n : which can be derived either from the original definition eq. (3.28) of H n , or its explicit allorders result in eq. (3.33). These proofs are given in appendix E.4. Derivatives to arbitrary order (∂ χ ) k H n can then be obtained by repeatedly applying this relation eq. (3.49): With these expressions, we can now write the resummed cross section eq. (3.1) in q T space in terms of the above results of integrating I b and I k b in eqs. (3.2) and (3.7),

JHEP04(2018)149
where the integrals I k b are given in final evaluated form in eq. (3.46) with the first few derivatives (∂ χ ) k H n given by eq. (3.50). The calculation of these integrals have formed the bulk of this section 3 and constitute one of the main results of this paper.
To turn eq. (3.52) into a final prediction for the perturbative q T cross section, we still need to match it on to the fixed-order prediction of full QCD for the large q T tail. We turn our attention now to this task.

The large q T limit
In the large q T ∼ Q limit the resummation is turned off and this corresponds to the A → 0 limit in the soft evolution of eq. (3.10). In this limit the resummed part of the cross section eq. (3.52) should reduce to a sum of the fixed-order singular terms in the cross section. Since we evaluate the I k b 's in eq. (3.52) using an expansion in Hermite polynomials giving eqs. (3.37) and (3.46), it is not obvious whether these expansions, when truncated to our desired number of terms, maintain their accuracy in the fixed order limit. We check the accuracy here. In the A → 0 limit, the integrals of the even and odd Hermite polynomials reduce to where f k is a linear combination of the coefficients c n in the series expansion: where we used the values in eqs. (3.26) and (3.27). We can compare to the result of taking the A → 0 limit of the exact expression for I k b in eq. (3.7) before expanding the Bessel function in a series representation. Using eqs. (3.7), (3.10), and (3.44), we obtain

JHEP04(2018)149
which we can re-express as a derivative with respect to t, where we also used eq. (3.47) in the last equality. eqs. (3.56) and (3.57) then allow us to make the replacement in eq. (3.55): Then, integrating repeatedly by parts in eq. (3.55), we obtain We can now take the A → 0 limit in eq. (3.59), and in so doing turn the Gaussian into a Dirac delta function: (3.60) recalling we pick c = −1 in t = c + ix, and where we used the Dirac delta identity: Thus, in the A → 0 limit, We compute the exact derivatives: Inserting eq. (3.63) into eq. (3.62), we have

JHEP04(2018)149
Comparing eq. (3.64) to eq. (3.53), we find exact values of f k which are 1,1, γ 2 E ≈ 0.33317 for k = 1, 2, 3 and the approximate values in eq. (3.54) agree better than 1% for k = 1, 2 and agree within about 15% for k = 3. Note that the term I b,k at k = 3 is the O(α 2 s ) contribution induced by our prescription in section 3.4 as it multiplies the coefficient F (2) 3 in eq. (3.43),which is suppressed by another order of α s at the cross section level. Taking this into account, the error in f 3 itself induces only an order of magnitude smaller error in the cross section, much smaller than the total theoretical error at NNLL accuracy, and so we need not be concerned about it. If desired, one can go beyond this accuracy by including higher-order Hermite polynomials in the computation of I k b , and thus of f k . This means that the fixed order series probes very narrowly the behavior of f (−1 + ix) near x = 0, and the more accurate our expansion is near x = 0, the higher the order in α s that we can reproduce accurately in the fixed order (high q T ) regime. Practically, we would like to reproduce the one loop fixed order cross section since we would be matching our NNLL predictions to that result in the high q T region, which means we need our expansion to match the exact result up to the second order in the Taylor expansion of f , i.e. = 0, 1, 2.

Matching to the fixed order cross section
The resummed formula eq. (3.52) accurately predicts the spectrum for relatively low (but not too low) values of q T . At large q T ∼ Q, the non-singular terms in q T become just as big as the logs of q T /Q themselves, and we should use the fixed-order perturbative expansion for the cross section there. It has been shown in the context of B → X s + γ [44], thrust [45] as well as the Higgs jet veto calculation [46], that if one does not turn off the resummation then one can over estimate the cross section, in the region where fixed order perturbation theory should suffice, by an amount which goes beyond the canonical error band in the fixed order result. This overshoot happens despite the fact that the resummed terms are formally sub-leading in the expansion. The reason for this overshoot has been shown [44][45][46] to be due to the fact that there are cancellations between the singular and non-singular terms in the tail region and that this cancellation will occur only if the proper scale is chosen in the logarithms. We can smoothly combine the resummed and fixed-order formulas by turning off resummation in eq. (3.52) in the high q T region using profiles in µ and ν [45] and match the result to the one loop (O(α s )) full theory cross section.
Our resummation as given in eq. (3.1) (and evaluated in eq. (3.52)) is neatly divided into two parts, the hard function which runs in µ and functions C and I b which implement ν running. To turn off resummation at large q T we implement the following profiles for both µ, ν in eq. (3.1): The function ζ(q T ) is chosen so that at low values of q T where resummation is important, its value is 0 while for values near Q it approaches 1. µ L (q T ) is given by eq. (2.58) and illustrated in figure 5, and we have indicated ν run so that inside eq. (2.36) for ν * L the µ L JHEP04(2018)149  is also set to µ run . This is designed so that not only resummation of logs of µ L /µ H and ν L /ν H is turned off as ζ → 1 but also the shifting of logs of µ L b 0 into the soft rapidity exponent in eq. (2.56) as we move out of the resummation region. The exponent p defined in eq. (2.38) is also now evaluated at µ run . The choice of ζ(q T ) that we make for this function is: Here q 0 determines the central value for the transition and ρ determines its rate. In practice we use ρ = 3. The value of q 0 is determined by the scale at which the non-singular pieces become as important as the resummed singular cross section. The profiles in µ and ν for the case of the Higgs spectrum are shown in figure 9. (Here q 0 has been chosen to be 40 GeV for the central profile). We also probe the effect of varying the profiles by varying the value of q 0 , in our case, between 30 GeV and 50 GeV as shown in figure 8. For each of these profiles, we consider the scale variation by a factor of 2 for µ, ν (figure 9) For the case of the DY, we use a similar value for q 0 . This procedure is straightforward to implement for the hard function running, let's see what effect it has on the ν running. Going back to the ν running, our exponent in JHEP04(2018)149 eq. (2.26) with the scale choice eq. (2.36) for ν L looks like Putting in our choices for ν L we have So the effect is to merely multiply the argument of the exponent by a factor of 1 − ζ(q T ). Notice that we have also put in µ run in the ν anomalous dimension since it is a function of µ L . This is basically the same as A → (1 − ζ)A and γ RS → (1 − ζ)γ RS in the expressions eqs. (3.5) and (3.6) for the rapidity evolution factor V Γ inside I b in eq. (3.2). Thus, using these profile scales simply modifies the definitions in eq. (3.6) to: and Ω, χ in eq. (3.6) remain unchanged (with the exception that µ L → µ run ). The final expressions for I 0 b and I k b given by eqs. (3.37) and (3.46) are modified accordingly. So what we are doing is smoothly taking the limit A → 0 (and γ RS → 0) as we enter the high q T region. We already know from the previous section (see eq. (3.53)) that in this limit I 0 b goes to 0 smoothly, which means the resummation factor is 0, and each I k b goes to the appropriate fixed order limit as well.
The central profiles we choose for the scales are: µ H is chosen as iQ, which implements the resummation of enhanced π 2 terms at each order in α s , which improves the perturbative convergence [38,39]. While implementing the hard function running, we take the absolute value of U H . We make six scale variations for µ, ν by a factor of 2 about their central values for each of the profiles shown in figure 8: to estimate higher order terms as shown in figure 9. In the first two variations in eq. (3.71) µ L and ν L are each varied independently, while in the last one, they are varied simulatneously. The anticorrelated variation ((µ L /2, 2ν L ) ← (µ L , ν L ) → (2µ L , ν L /2)) is not included to avoid double counting [32].

JHEP04(2018)149
We take the largest envelope in µ, ν and q 0 variations as our estimate of the error band. In principle, we can make variations at either end of the running (i.e. at the high scales µ H and ν H , or the low scales µ L and ν L ). In practice, since the value of α s is larger at lower scales, the scale variation at the low scale of running yields the largest error band. There is implicit q T dependence in z(q T ), µ run , and ν run . Now we are ready to give results for our final resummed cross section matched to the fixed-order QCD result. We give an explicit expression of the transverse spectrum which can be written as sum of resummed and nonsingular parts. We match the resummed and perturbative QCD results such that final result is valid in both small are large q T regions. This can be done by adding the resummed result to nonsingular terms where all logarithmic singular terms reproduced by EFT results in eq. (A.37) are subtracted from the perturbative result σ = σ res + σ ns , σ ns (µ) = σ pert (µ) − σ sing (µ) , (3.72) where the differential in q 2 T and rapidity y is implied. The fixed-order σ ns is evaluated at a scale µ, which we choose to be equal to |µ H |, the high scale to which the profile eq. (3.65a) goes for large q T . In this section we give explicit expressions for σ res and σ sing , and appendix C gives σ pert .
The resummed cross section σ res is given by eq.
while the two-loop coefficients F (2) k remain zero at NNLL accuracy. In the pure fixed order limit ζ = 1, the resummation is turned off as µ run = ν run = |µ H | = ν H , and we see the term containing explicit µ L , ν L in eq. (3.75) vanish. The integrals I b,k take the fixed order JHEP04(2018)149 form in eq. (3.64) with µ L → µ H . In this limit, F (0,1) 0 actually does not contribute due to the fact that I 0 b → 0. Either by putting eq. (3.75) and eq. (3.64) together or by inverse transforming eq. (A.37) to momentum space, we obtain the singular part of the fixed-order cross section at O(α s ) as The coefficients in the singular piece of the cross section can be found in appendix A. Together with σ pert given in appendix C, eq. (3.76) and eq. (3.72) give the non-singular part of the cross section at O(α s ), which we add to eq. (3.52) to obtain the final resummed and matched cross section up to NNLL+O(α s ).

Final resummed cross section in momentum space
Then, the resummed cross section in momentum space eq. (3.1), using eqs. (3.5), (3.7), and (3.46) to express the integral I b , implementing the profile scales eq. (3.65a), and matching onto the fixed-order QCD cross section for large q T , is given by the expression: This fairly compact expression still has many pieces to it. We provide a roadmap to where to find them all: The basic cross sections σ 0 for Higgs production (gg → H) and for DY (qq → γ * ) are given by:  19) respectively and the one-loop constants given by eq. (A.14). The top matching coefficient C 2 t for Higgs production is given by eq. (A.17) while it is set to 1 for DY. The µ (virtuality RG) evolution kernel U (µ L , µ H , µ T ) is given in eq. (2.26), with the parts of the exponent K Γ,γ , η Γ defined and expanded up to NNLL accuracy in appendix A.1. These all contain dependence on the cusp anomalous dimension, whose coefficients Γ n for DY and Higgs are given by eq. (A.9).
The factor C(µ L , ν L , ν H ), which came from expressing the exponentiated "conformal" part of the rapidity evolution kernel V Γ in eq. (2.56) as a Gaussian in ln b, is defined in eq. (3.6), and expanded to NLL and NNLL accuracy in appendix A.6. Figure 10. Systematic improvement in the accuracy of Higgs (left) and DY (right) cross sections with increasing number of terms, differential in y (at y = 0) and q T (and Q 2 for DY, shown at Q = 125 GeV for comparison). Exact (red) gives resummed cross section without Hermite expansion (i.e. numerical b integration). N = 6 (blue) is the result with six terms in the Hermite expansion, three each for real and imaginary terms. N = 7 (black) is the result with seven terms, four for real and three for imaginary. Here we plot only the purely resummed result, i.e. with no matching to the fixed order cross section.

JHEP04(2018)149
The coefficients F (n) k in the expansion eq. (3.73) of the F defined in eq. (3.3) that contains the fixed-order terms in the soft function, TMDPDFs, and "non-conformal" part of the rapidity evolution kernel V β defined in eq. (2.54) have been given above in eqs. (3.74) and (3.75) up to the order to which we shall need them for NNLL accuracy.
The integrals I k b which we defined in eq. (3.7) are given in final evaluated form in eq. (3.46), the calculation of which formed the bulk of this section 3 and is one of the main results of this paper. That result is in terms of derivatives eq. (3.50) of the integrals H n of Hermite polynomials against the Gaussian in eq. (3.28) appearing in I 0 b in eq. (3.37), the explicit results for which are given by eq. (3.33). Those integrals depend on the parameters α, a 0 and β, b 0 that we used in the expansion of the function Γ(1−ix) 2 in terms of Gaussianweighted Hermite polynomials in eq. (3.23). For the numerical results in this paper, we used the values which were given in eq. (3.24), and values of α, β given by α 2 − a 0 = 4 and β 2 − b 0 = 4, which were given in eq. (3.26). These choices are by no means unique; other values can also be chosen, which would then give different coefficients in the Hermite expansion. Our choices allowed sufficiently accurate representation of the exact function Γ(1 − ix) 2 with an economical basis of a few terms each for the real and imaginary parts, the coefficients of which we gave in eq. (3.27). Figure 10 illustrates the agreement of our resummed cross section eq. (3.77) to NNLL accuracy with a total of six or seven (four real, three imaginary) terms in the Hermite expansion vs. the result of numerically integrating the exact b integrand in eq. (3.1). The truncation error lies well within the perturbative NNLL uncertainty band. Table 1 gives the orders to which the anomalous dimensions and fixed-order pieces of eq. (3.52) are to be truncated to achieve NLL, NNLL, etc. perturbative accuracy in the resummed cross section. The central values of the resummation scales we choose in eq. (3.77) are given in eq. (3.70), wherein the central values for µ L = ν L are given by the solution of eq. (2.58) and illustrated in figure 5. The perturbative uncertainties are then estimated by performing the variations eq. (3.71).

JHEP04(2018)149
The final piece dσ ns /dq 2 T dy is defined by eq. (3.72) and can be obtained from the fixed-order results in appendix C, subtracting off the singular terms eq. (3.76).
It is good to remind ourselves at this point that the final formula eq. (3.77) is a threefold expansion as described in section 1.3: a perturbative expansion in α s and resummed logs, an additional fixed-order expansion of the non-conformal piece V β of the rapidity evolution inside F , and an expansion in Hermite polynomials in evaluating each I k b . It is thus quintessentially a "formula" according to a delightful definition we recently encountered: an expression given by an = sign with a controlled error of known parametric form. 5 4 Remarks on nonperturbative region of q T The final result eq. (3.52) for the resummed cross section is obtained by doing an integral eq. (3.2) over a wide range of b. It is a reasonable assumption that the perturbative resummation will hold as long as 1/b > Λ QCD ∼ 500 MeV. This is roughly the value at which we begin to see the Landau pole in the b space resummation scheme (see figure 11). Beyond this value of b, we expect that nonperturbative effects will play a nontrivial role. Although it is not the goal of this paper to include or advance any new method to account for these nonperturbative effects, we will freely make some loose observations here, mainly delineating where we do and do not need to worry about them. While they are certainly important, we will have to leave their incorporation into our results to future work.

Remarks on nonperturbative effects
As we approach such low values of 1/b, it is no longer possible to match perturbatively the beam functions onto PDFs as in eq. (A.27). Instead we need to retain the complete transverse momentum dependent beam function, which is now a fully nonperturbative function that need to be fit from data. In the b space resummation scheme, where µ ∼ 1/b, giving the result eq. (2.29), the running using the perturbative anomalous dimension will work only for 1/b Λ QCD . A sensible thing to do in this scheme is to freeze the resummation at 1/b ∼ 2Λ QCD and evaluate the beam function at this frozen low scale which is still perturbative. Interestingly, for the momentum resummation scheme leading to eq. (3.52), while we still need to replace the PDF's with the full nonperturbative TMDPDFs at low q T , the resummation is always in the perturbative regime since µ Λ QCD for any value of q T .
For the case of the soft function, since both resummation schemes use a b dependent value for ν, we need to freeze out the resummation at a perturbative value of ν L ∼ 1/b * > JHEP04(2018)149 Λ QCD and fit the soft function at this scale from data. All this seems rather complicated and it might appear that without complete information about the nonperturbative functions, it is not possible to give a prediction for the cross section. However, once again the nature of the resummed perturbative exponent comes to the rescue. Even before we reach a nonperturbative value, the double logarithmic term in b in the exponent completely damps out the integrand. Then the nonperturbative corrections become irrelevant since the region of b space in which they start contributing is heavily suppressed.
If we consider b space resummation, the damping which is provided by the resummed exponent depends mainly on the hard scale Q and the cusp anomalous dimension. For the Higgs, Q is fixed and the cusp anomalous dimension is large so the damping is always large. Increasing the center-of-mass energy only changes the x value where the PDFs are evaluated, but there is only a mild dependence on this factor. So, we can safely neglect any nonperturbative effects and rely completely on the perturbatively resummed cross section.
For the case of DY, the cusp anomalous dimension is much smaller and the value of Q is variable. So if we go to low Q, nonperturbative effects become important, see figure 11. As can be seen from this figure, b ∼ 3 GeV −1 is a rough estimate of the value beyond which nonperturbative effects become important where we begin observing the divergence due to the Landau pole in the b space resummation scheme. For low values of Q, several different ways of incorporating nonperturbative effects using model functions (which effectively cut off the Landau pole) in b space resummation have been proposed [33][34][35][36][37]. In this paper, we stick to showing results for larger values of Q where these effects are not as important. A detailed discussion of how to handle nonperturbative effects in our hybrid impact parameter-momentum space resummation scheme will be given in future work. We suspect, 6 among other things, that the nature of our asymptotic V β expansion in eqs. (2.44) and (2.54) will in fact give clues about the best way to include nonperturbative effects together with our perturbative resummation scheme. This follows from the fact that JHEP04(2018)149 in the b space resummation scheme, the Landau pole, related to the running of α s was the indicator of the onset of non-perturbative effects. In our scheme, we have expanded out the running of α s in the form of the V β function and it is natural that the breakdown of this expansion will dictate the onset of nonperturbative effects. (See [49] for a recent study of nonperturbative power corrections based on renormalon divergences of perturbative expansions of TMD functions.)

Remarks on perturbative low q T limit
The q T → 0 behavior of the perturbative resummed q T distribution has of course been extensively discussed, and is known to be affected by configurations of multiple large momentum emissions k i T that cancel vectorially i k i T = q i T ∼ 0, leading dσ/dq 2 T to go to a constant nonzero value e.g. [16,40,42,50]. We will not add anything to these discussions here, postponing consideration of this regime until we also include nonperturbative effects as remarked above.
We observe, however, that our scheme in section 3 applies practically only to the larger q T regime. Specifically, we are using an approximation for the ratio of gamma functions F (t) = Γ[−t]/Γ[1 + t] appearing in eq. (3.8) in t space. While the approximation that we are currently using is good enough in the perturbative regime (q T ≥ 2 GeV), below this q T scale, if we compare it to the CSS resummation then it shows exponential suppression as opposed to a constant behavior. However, the q T scale at which this deviation from the exact resummed cross section happens could be systematically lowered by improving our expansion of F (t) (i.e. adding more terms to our final formula eq. (3.37)). This is evident from the behavior of the integrand in t space eq. (3.12). As q T is lowered, both A and t 0 become large. Since A controls the suppression of the integrand which allows us to approximate the ratio of Γ functions as a series, as q T is lowered, this suppression becomes smaller and smaller which will force us to include more and more terms in our expansion in order to maintain the same accuracy. While we can get good accuracy in the perturbative regime q T ∼ 2 GeV with a few terms, below this scale it is impractical to continue using this series approximation. So in principle, to recover the behavior as q T → 0, we can no longer do an expansion in the Hermite polynomials.
We observe, however, that as A increases at lower q T , the integrand in b space (eq. (3.7)) is now highly suppressed at large b. This means that it would be more practical to now do an expansion of the Bessel function itself (which was not possible at larger q T ), rather than go to t space. For q T 2 GeV we would need a series expansion that approximates J 0 (bq T ) well just out to its first peak or so. It should be possible to find such an expansion similar to what we did for large q T above that still allows us to do the b space integrals analytically and accurately, which by construction would now display constant behaviour at low q T . We defer a presentation of the details of this procedure to future work.
So the Mellin-Barnes representation is in some sense the dual of the Bessel function, in that at perturbative q T , t space is more amenable to an expansion and hence an analytical result with a few terms, but fails as we move into the low-q T region where nonperturbative effects kick in, where expanding directly in b space makes more sense. Figure 12. Comparison of NNLL+NLO cross section (resummed cross section matched to O(α s ) fixed order cross section using profiles) in two schemes, b-space resummation eq. (2.29) and p-space resummation eq. (3.52). The overlap is a good cross-check of the accuracy of our method, and the improvement in the reliable estimation of uncertainties and computation time in our resummation scheme has been described in the text. The Higgs cross section is differential only in q T .

Comparison with previous formalisms
There have been numerous other techniques developed for implementing the resummation for transverse momentum spectra of gauge bosons. We will briefly comment on how ours compares to some of them, though we will not undertake any sort of in-depth comparison here and will not really do justice to any of these other methods. A more detailed discussion of them was given in [32] as well as [40].
The earliest one was the CSS formalism [10,31], which was applied in, e.g. [14,51], for computing DY and Higgs transverse momentum cross sections. The value of µ L is implicitly chosen to be 1/b 0 . There is no explicit independent scale ν, however a comparison of the resummed exponents reveals that the implicit choice for ν L is also 1/b 0 . So the central values agree with the b space resummation implemented in eq. (2.29) and an earlier paper [32].
The difference in the two approaches is two-fold. Firstly, the error analysis using scale variation is different in the absence of an independent scale ν. Varying only µ scales is likely to underestimate the uncertainty. Second, in the high q T region, the matching procedure is different. In the CSS formalism, there is no systematic way of turning off resummation while matching to the fixed order cross section, so that the predicted results differ in the high q T region. The Landau pole is handled by implementing a smooth cut-off in b space. However, as we have seen, as long as we are at high Q, this does not affect the prediction. An explicit comparison between the two schemes was given in [32].
In figure 12, we compare the b space resummation scheme for the implementation in [32] at NNLL+NLO accuracy for the Higgs and DY transverse spectrum with the hybrid b-space/momentum-space resummation scheme developed in this paper. This will serve transitively as a comparison also with other b space resummation schemes. We can deduce the following • The width of the error bands is comparable in the entire region of q T which is not JHEP04(2018)149 too surprising since the error analysis in both [32] and the present paper were based on the same variations eq. (3.71) around the respective central values.
• In the low q T region, the central value in our hybrid scheme is lower that the pure b space scheme even though it is within the error band. This is to be expected since the two schemes differ in subleading terms at a given resummation accuracy.
• In the high q T regime, the results agree exactly since in this range, the resummation has been turned off and the cross section is just the one-loop fixed order cross section in both schemes.
Another technique was implemented in [42,43] which again follows the CSS formalism with the implicit ν choice 1/b, but the µ choice is made in momentum space choosing µ ∼ q T + q * T , where q * T is chosen as 2 GeV for DY and 8 GeV for the Higgs. For the kinematics we chose to illustrate in section 2.3.5, we actually found very similar shifts at low q T based on our analysis of the scales which minimize the contributions of the residual fixed order logs, which in itself parallels the logic in [42,43], though we do not necessarily adopt the same physical interpretations. The matching procedure is again similar to the CSS case and hence differs from the scheme in [32] the high q T regime. Again a detailed discussion of the differences was provided in [32].
There also have been methods proposed for setting all renormalization scales in momentum space. The most recent [40] technique has been to solve the RG equations in momentum space directly. In momentum space the beam and soft functions are functions of plus distributions of the form 1/q 2 T ln n (q 2 T /Q 2 ) + and these terms are resummed directly in momentum space using a technique of distributional scale setting. This involves setting the scale under an integral of the plus distribution. The integral turns the plus distribution into ordinary logarithms which can be minimized by choosing a specific momentum scale. However, they also observe that for transverse momentum spectra of gauge bosons, a direct scale choice of µ, ν ∼ q T does not work since this scale choice gives a spurious contribution from highly energetic emissions (k T q T ) in the phase space and hence a scale that varies with the energy of emissions has to be used so that the region of phase space of energetic emissions is suppressed. This is reflective of the divergence observed in the soft resummation eq. (2.33) at low values of b, which, in our method we chose to cure by adding subleading terms in the cross section through eq. (2.36). Mathematically the solution proposed in [40] is quite elegant. It will be interesting to see its implementation numerically and to compare the results at NNLL.
Another method of obtaining the transverse momentum spectra has been proposed [16,41] that uses the coherent branching formalism. The cross section is given in terms of a convolution over independent emissions off the initial gluons (or quarks for DY). It then singles out the hardest emission which also sets the scale for ν which again suppresses the energetic emissions since all other emissions are by construction of lower energy. This differs mainly from [40] in this scale choice, as in [40] ν ∼ k i which follows the energy of each emission instead of just the hardest one.

JHEP04(2018)149
Amusingly, every proposal we know of so far (e.g. [16,40,41] and this paper) to implement TMD resummation in momentum space yields a result formally correct at a given order of logarithmic accuracy, but in terms of either an infinite sum or infinite nest of expressions (beyond the perturbative expansion itself) that must be truncated to yield a result that can be evaluated numerically. Refs. [16,40,41] obtain their final resummed results in terms of infinite sums over gluon emissions, which [16,41] implemented in a Monte Carlo routine. Our final result, on the other hand, contains the infinite sum in eq. (3.37), over Hermite polynomials in the basis expansion of the function Γ(t) 2 arising from the representation eq. (3.8) we used for the Bessel function in the inverse Fourier transform from impact parameter to momentum space. This is of course quite different from sums over gluon emissions. Truncating our formula corresponds to the level of numerical accuracy one attains for the Bessel function and resultant integral, rather than the number of gluons one includes in the emission amplitudes.
All the methods have their pros and cons in terms of the perturbative series obtained, error analysis, and how rigorously or easily nonperturbative effects can be included (see, e.g. [36]). Again, we leave it to future work to show how we do the latter in our method.

Conclusions
We took a fresh look at resummation for transverse momentum spectra of gauge bosons in momentum space. In contrast to the classic procedure which chooses both virtuality and rapidity scales µ, ν for resummation both in impact parameter space, we proposed a hybrid prescription for resummation, choosing the rapidity renormalization scale ν with impact parameter dependence and the virtuality scale µ in momentum space. We made a choice ν * L ∼ ν L (µ L b 0 ) −1+p for the low (soft) rapidity scale, and observe that with this choice, the integral over the b space rapidity resummation exponent is convergent. We stress that a well-defined power counting for ln(µ L b 0 ) is not possible before we have a stable soft exponent and that only when this exponent is in place, we can treat ln(µ L b 0 ) as a small log with an appropriate choice of µ L . We also give a prescription for obtaining the µ L scale in momentum space using the analysis of the b space integrand to justify our power counting ln(µ L b 0 ) ∼ 1, which shifts the scale up from the naïve momentum-space choice µ L ∼ q T .
We then use the idea that restricting the soft exponent in b space to be at most quadratic and thus Gaussian in ln(µ L b 0 ) allows us to obtain a semi-analytic formula for the cross section. Using the Mellin-Barnes representation of the Bessel function and the absence of the Landau pole in our resummation formalism, we are able, with certain approximations for the Bessel function appearing in the inverse Fourier transform that are independent of the details of the observable or kinematics, to give a closed-form analytic expression for the cross section at any order of resummation accuracy.
In brief, the main ideas and results of our paper are: • Exponentiation of quadratic fixed-order small logs of µ L b 0 from the soft function and rapidity evolution that are formally subleading at a given order of resummation accuracy, but automatically make the b integral in going to momentum space convergent JHEP04(2018)149 without an additional regulator or cutoff. This is formally achieved by the shifted scale choice ν L → ν * L in eq. (2.36).
• Division of the rapidity exponent V into an exponentiated part V Γ (ν * L , ν H ; µ L ) in eq. (3.5) that is quadratic and thus Gaussian in µ L b 0 , and a part V β in eq. (2.44) expanded in an asymptotic series, making the b integral eq. (3.2) doable.
• Use of the Mellin-Barnes representation eq. (3.8) of the Bessel function, transformation to the form eq. (3.18), and expansion of the pure function Γ(−c − ix) 2 appearing therein in a series of Hermite polynomials times Gaussians in eq. (3.23), which for NNLL accuracy in the cross section can be safely truncated to a few terms each in the real (even) and imaginary (odd) parts, each term of which gives rise to an integral over b (or x) which can be done be performed analytically, giving the result eq. (3.33).
• The above steps give rise to the final resummed cross section in momentum space, eq. (3.77), in which we can implement scale variations and profiles to reliably estimate theoretical uncertainty and match smoothly onto fixed-order predictions for large q T .
The final result eq. (3.77) represents a threefold expansion: perturbative expansion in α s and resummed logs in the RG and RRG evolution kernels and fixed-order hard, soft, and collinear functions; V β expansion, a fixed-order asymptotic expansion of the non-conformal part of the RRG evolution kernel to ensure a Gaussian rapidity kernel; and Hermite expansion, in number of terms in basis expansion of the pure Bessel function in the inverse Fourier transform between b and q T space. Each of these is systematically improvable.
We do not even claim that the particular methods, expansions, and strategies we implemented are the fastest or most accurate amongst all similar strategies. It is fast, and it is accurate. Keeping just a few terms in the Hermite expansion we obtain an error in the cross section at the percent level, much better than the NNLL perturbative accuracy to which we work in this paper, while obtaining the result with a ∼five-fold improvement in computation time in our tests. We hope our presentation provides a blueprint and an example to obtaining a faster, more accurate predictions for many TMD observables in momentum space, and is certainly open to further development and improvement.
We applied our results to obtain the transverse spectrum of the Higgs as well as the DY q T spectrum at NNLL, matched to fixed-order O(α s ) results at large q T . We give a comparison with results obtained using the CSS formalism and observe a very good agreement where they should agree, consistent within subleading terms which is observed from the overlapping of the error bands at both NLL and NNLL. We also gave cursory discussions of the relevance of nonperturbative effects in different kinematical regimes, and also of how our method compares with some recently proposed methods of resummation directly in momentum space for all renormalization scales.
The techniques we have proposed should be applicable to other observables that depend on a transverse momentum or are sensitive to "soft recoil," (e.g. [28,29]), and admit of a factorization of the form eq. (1.1) with a convolution between soft and collinear functions in (2-D) transverse momentum q T describing modes separated in rapidity as in figure 1.

JHEP04(2018)149
When a (semi-)analytic formula can be obtained as we have done, it should drastically cut down computation time and improve our understanding of the physical behavior of the cross section and its computational uncertainties as a function of the scales it depends on. We will perform a more detailed phenomenological study using our expressions with comparisons to data in the near future, and then also apply our techniques to other TMD observables.

Acknowledgments
We are grateful to D. Neill for many helpful conversations, especially for suggesting the use of an integral representation of the Bessel function to simplify the analytical computation of cross sections, and for a detailed review of a preliminary draft which (we believe) improved its presentation considerably, and to S. Fleming and O. Z. Labun for collaboration on early stages of this work and on related work. This work was supported by the U.S. Department of Energy through the Office of Science, Office of Nuclear Physics under Contract DE-AC52-06NA25396 and by an Early Career Research Award, through the LANL/LDRD Program, and within the framework of the TMD Topical Collaboration.

A.1 Evolution kernels
The evolution kernels K Γ (µ 0 , µ), η Γ (µ 0 , µ), K γ (µ 0 , µ) that appear in the RGE solutions for the hard and soft functions and TMDPDFs were defined in eqs. (2.5), (2.20a), and (2.20b). They can be rewritten in terms of integrals over the coupling α s (µ) via the relation Expanding the beta function and anomalous dimensions in powers of α s ,

JHEP04(2018)149
Similarly, Since η Γ is always multiplied by another large log (e.g. ln(Q/µ L ) or ln(ν H /ν L ) in eq. (2.24)), the first tower is again part of the LL series, the second the NLL series, etc. Alternatively, expanded in α s (µ), Finally, K γ is given by the same expansions as eqs. (A.8a) and (A.8b) with Γ i → γ i . But K γ is not multiplied by any additional large logs, so in this case the first column of terms in eqs. (A.8a) and (A.8b) for K γ begins at NLL, the second NNLL, etc.
In our numerical analysis we use the full NNLL expressions for K Γ,γ , η Γ in eq. (A.4), but to be consistent with the value of α s (µ) used in the NLO PDFs we only use the twoloop truncation of eq. (A.5), dropping the β 2 and β 2 1 terms, to obtain numerical values for α s (µ). Up to three loops, the coefficients of the beta function [52,53] and cusp anomalous dimension [54,55] in MS are T F n f , where C i is C F and C A for the quark and gluon, respectively.
where to O(α 2 s ), The one-loop constant terms I (1) ij (z) [29,63] are given by The two-loop terms I (2) ij (z) can be found from [21] (see also [20]). The 1-loop splitting functions defined in eq. (A.29) are given by 8 The anomalous dimensions γ i Rf that we need are given above in eq. (A.26).

A.5 TMD cross section
Combining the above fixed-order expansions of the hard and soft functions and TMDPDFs according to the factorization formula eq. (1.8), we obtain for the fixed-order expansion up dy y The third line of the O(α 2 s ) pieces cancels out the running of α s (µ) in the O(α s ) piece on the very first line. All the P (0,1) ij pieces cancel out the evolution of the PDFs f q (z i , µ). The remaining pieces on the first three lines that contain logs are all fixed by the RG evolution of the hard and soft functions and TMDPDFs in eq. (1.8).

A.6 Gaussian rapidity exponent
The pieces is in the exponentiated rapidity evolution kernel eq. (3.5) are given to all orders by eq. (3.6), and to NLL accuracy by:  which automatically resums the leading double logarithms of ln(µ L b 0 ). This scheme also partially resums single and double logarithms of the argument µ L b 0 at higher orders in α s . This is the simplest scheme that provides a stable b space kernel that respects the power counting that l = ln(µ L b 0 ) is small (see section 2.3). This scale choice is by no means unique. There is still a lot of space to play around with the choice of this scale, where each choice would differ from the other in exactly which set of small logs l get included in the exponent. All of these different schemes, therefore, would only differ from each other in subleading terms, and hence we would expect that each of these would lead to an overlapping error band at any given order in resummation.
In this section, we give a general prescription for the scale choice ν L , that covers all of these schemes, while still allowing us to obtain an analytical expression. So we still obey the constraint of putting terms at most of quadratic power in l in the exponent. The generalization that we propose isν * where we now expand out both r and s as a power series in α s . ζ 0 is a constant In practice, while resumming to a particular order, we truncate the series in r and s to that order in accuracy. The r i parameters will control the coefficient of both l and l 2 at each order in α s , while the s i parameters will control the coefficient of l. The s i parameters also induces constant terms in the exponent, which ideally, one would not find in an exponent, however, their effect at each order can be cancelled out by including the corresponding constant terms induced in the fixed order by this scale choice. We have checked that the effect of several different choices for the parameters r and s produce variations in the resummed cross sections smaller than the inherent perturbative uncertainty already present at each order seen in figure 6.

Γ(ix)
The integral in eq. (3.12) can be done numerically but by series expanding Γ(1 − ix)/Γ(ix) directly. We then apply the same strategy that was used in section 3.2, but now use it directly for f (t) = Γ(−t) Γ(1+t) . Near x = 0, we have the Taylor expansion as Note that the values ofã 0 andb 0 differ from those for a 0 , b 0 in eq. (3.24) by π 2 /6. Then, the series expansion can be written as 2n H 2n (βx) .

(D.2)
So the coefficients in eq. (D.2) are given bỹ Empirical tests imply the series converges well forα 2 −ã 0 andβ 2 −b 0 around 3 ∼ 5. Figure 7 shows the exact result and series expansion up to 4th order forα 2 −ã 0 = 4 and β 2 −b 0 = 4. Note that the deviations from the exact results above x = 1.5 is suppressed by the Gaussian kernel in eq. (3.12) and resulting error in the integral should be smaller than that appearing in figure 13. The coefficientsc 2n andd 2n are given bỹ Now it is straightforward to rewrite the integration in eq. (3.12) in terms of the basis integrations and to obtain the fixed order terms in the similar fashion to eqs. (3.28) and (3.50) in section 3.2.

D.2 A tailored basis for expanding Γ[−t] Γ[1−t]
While the weighted Hermite polynomial basis presented in the main text is a systematic expansion in an orthogonal basis, we can come up with a basis more closely tailored to the behavior of the function f (t) = Γ[−t] Γ [1−t] , although it is not as systematic in that it is not orthogonal and there is not a simple formula for the basis coefficients. This basis is: where N n are integers and a n and b n are complex constants. In the 2nd equality, we set t = c + ix and x 0 = i(1 + c). This form has been deliberately chosen in anticipation of our choice of c = −1.
Because we are not aware of a systematic expansion in terms of this basis unlike the weighted Hermite polynomial expansion, the values of a n , b n , and N n as well as c n in eq. (D.6) should be determined by fitting to the exact function f (t). The integration against the evolution kernel is given by = ω N (a, b) F 0 (a, b) , (D. 8) where ∂ L = ∂/(∂L) and ω N is determined by N th derivative of F 0 (a, b). The integral I b is rewritten as T n c n F Nn (a n , b n ) , (D.9) where the derivative on F N can be replaced by F 0 multiplied by a coefficient d N,k . Here, we show the result of using the basis in eq. (D.7) for N n =0 and 2. The real part of f app (t = −1 + ix), which we call f R (t) is even, while the imaginary part f I (t) is an odd function of x. We therefore write f R (t) = g 1 (e −g 2 x 2 − cos[g 3 x]) + g 4 x 2 e −g 5 x 2 = g 1 f (t; 0, g 2 , 0) − where t = −1 + ix. In practice we first find the fit in the first equality then, rewrite it in terms of our basis. The b space integral obtained by replacing f (t, N n , a n , b n ) by F Nn (a n , b n ). where we have defined L = ln 2Ω q T . The value of the parameters g i , h i shifts as we shift the contour via the value of c, so that the final result is independent of the contour chosen. In this paper we have made the following choice for the contour and hence the corresponding parameters c = −1, g 1 = 0.5532, g 2 = 1.77, g 3 = 2.465, g 4 = 0.4582, g 5 = 2.42, h 1 = 0.0525, h 2 = 4.09, h 3 = 0.98, h 4 = 0.793. The result of the fit with these parameters is shown in figure 14. It is to be stressed that once the contour is fixed, these parameters are also fixed and hence can be used for any observable in any kinematical regime. This is because the fitting is only done for the ratio of Gamma functions f (t) which, in no way involves the details of the specific observable or its kinematics. The only condition as we specified earlier that A be a small number to ensure adequate suppression.
Let us check if our functions eq. (D.11) satisfies the constraints in eq. This agrees with eq. (3.63) better than 1% that is acceptable at NNLL accuracy.
E Mathematical proofs E.1 A proof of the Mellin-Barnes identity for the Bessel function Here we present a short proof of the key identity eq. (3.8) we use in the b-space integral against the Bessel function: This identity can be found, e.g. in [76], section 10.9.22, which is given as valid for J α (z) for α > 0. We briefly verify that it works also for α = 0. For convenience let us change the integration variable in eq. (E.1) from t → −t. Then we have where now the contour lies to the right of all the poles of Γ(t), i.e. c > 0. The contour can be closed in the left half plane out at t → −∞. The value of the integrand falls rapidly in this limit, and the circular part of the contour contributes zero to the integral. Deforming the contour, we pick up the residues at all the poles t = −n of the Gamma function Γ(t), which is precisely the series representation of the Bessel function J 0 , proving the identity.

E.2 Integral of complex Gaussian
We can evaluate the integral over a complex Gaussian using the contour integral: where C is the contour shown in figure 15.

E.3 Proof of Gaussian integral of Hermite polynomials
Here we prove the result eq. (3.33) for the integrals H n given in eq. (3.31). Starting from the form of the result eq. (3.32), we have We need to identify the coefficient of each single power t n of t in order to read off the coefficients H n in eq. (3.30). Using the binomial theorem, (E.9) This is almost in the form eq. (3.30) where we can read off the coefficient of t n , but the order of summation needs to be flipped. As illustrated by figure 16, the following sums are equivalent: −2i(x+z 0 ) , (E.15) where we used z 0 = A( π 2 + iL). The first term in brackets then cancels with the z 0 term, and we have simply Now, we note that the factor −2x can be expressed as a derivative on the Gaussian: where in the last line we integrated the d/dx term by parts, using that the boundary terms are zero at x → ±∞. Now we can use the known recursion relation for derivatives of Hermite polynomials: H n (x) = 2nH n−1 (x) , (E. 19) and obtain ∂ L H n = 2iz 0 1 + a 0 A H n − 2inαA 1 + a 0 A H n−1 , (E.20) which proves eq. (3.49).

E.4.2 Second proof
Using the all orders result for H n in eq. (3.33). We then compute where we have used ∂ L H 0 (α, a 0 ) = 2iz 0 1 + a 0 A H 0 (α, a 0 ) (E. 22) For the second term in ∂ L H n (α, a 0 ), we can make the following observations • For even values of n, the term m = n/2 does not contribute to the derivative.
• For odd values of n, n/2 is the same as (n − 1)/2 .
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.