Momentum-space resummation for transverse observables and the Higgs $p_\perp$ at N$^3$LL+NNLO

We present an approach to the momentum-space resummation of global, recursive infrared and collinear safe observables featuring kinematic zeros away from the Sudakov limit. In the hadro-production of a generic colour singlet, we consider the family of inclusive observables which do not depend on the rapidity of the radiation, prime examples being the transverse momentum of the singlet, and $\phi^*$ in Drell-Yan pair production. We derive a resummation formula valid up to next-to-next-to-next-to-leading-logaritmic accuracy for the considered observables. This formula reduces exactly to the customary resummation performed in impact-parameter space in the known cases, and it also predicts the correct power-behaved scaling of the cross section in the limit of small value of the observable. We show how this formalism is efficiently implemented by means of Monte Carlo techniques in a fully exclusive generator that allows one to apply arbitrary cuts on the Born variables for any colour singlet, as well as to automatically match the resummed results to fixed-order calculations. As a phenomenological application, we present state-of-the-art predictions for the Higgs-boson transverse-momentum spectrum at the LHC at next-to-next-to-next-to-leading-logarithmic accuracy matched to fixed next-to-next-to-leading order.


Introduction
After the discovery of the Higgs boson [1,2], the precise measurements from Run 2 of the LHC programme have so far confirmed the Standard Model with remarkable precision. Given that signals of new physics will most likely be elusive, it is important to define and study observables that can be both experimentally measured and theoretically predicted with a few-percent uncertainty. In this scenario, a prominent role is played by processes featuring the production of a colour singlet of high invariant mass, for instance gluon-fusion Higgs and Drell-Yan, where quantities like the transverse momentum of the singlet or angular observables defined on its decay products have been studied with increasing accuracy in the last decades. The differential study of these processes not only is important from a purely phenomenological perspective, but also because it represents the ideal baseline for a more fundamental understanding of the underlying theory. Their structural simplicity indeed allows one to provide predictions that include several orders of perturbative corrections, hence probing in depth many non-trivial features of QCD.
In this paper, we consider the hadro-production of a heavy colour singlet, and we study the class of observables, henceforth denoted by the symbol v, which are both transverse (i.e. which do not depend on the rapidity of the radiation) and inclusive (i.e. that depend only upon the total momentum of the radiation). As such, they only depend on the total transverse momentum of the radiation. Specifically, we concentrate on the transverse-momentum distribution of a Higgs boson in gluon fusion, but we stress that the same formulae hold for the whole class of transverse and inclusive observables, for instance the φ * angle in Drell-Yan pair production. Moreover, although we limit ourselves to inclusive observables, the formalism presented in this work can be systematically extended to all transverse observables in colour-singlet hadro-production.
Inclusive and differential distributions for gluon-fusion Higgs production are nowadays known with very high precision. The inclusive cross section is now known at next-to-next-to-next-toleading-order (N 3 LO) accuracy in QCD [3,4] in the heavy top-quark limit. The N 3 LO correction amounts to a few percent of the total cross section, indicating that the perturbative series has started to manifest convergence and that missing higher-order corrections are now getting under theoretical control. Current estimates show that they are very moderate in size [5]. The state-ofthe-art results for the Higgs transverse-momentum spectrum in fixed-order perturbation theory are the next-to-next-to-leading-order (NNLO) computations of refs. [6][7][8][9], which have been obtained in the heavy top-quark limit. The impact of quark masses on differential distributions in the largetransverse-momentum limit is still poorly known beyond leading order, while in the moderate-p t region, next-to-leading-order (NLO) QCD corrections to the top-bottom interference contribution were recently computed [10][11][12].
Although fixed-order results are crucial to obtain reliable theoretical predictions away from the soft and collinear regions of the phase space (v ∼ 1), it is well known that regions dominated by soft and collinear QCD radiation -which give rise to the bulk of the total cross section -are affected by large logarithmic terms of the form α n s ln k (1/v)/v, with k ≤ 2n − 1, which spoil the convergence of the perturbative series at small v. In order to have a finite calculation in this limit, the subtraction of the infrared and collinear divergences requires an all-order resummation of the logarithmically divergent terms. The logarithmic accuracy is commonly defined in terms of the perturbative series of the logarithm of the cumulative cross section Σ as ln Σ(v) ≡ ln One refers to the dominant terms α n s ln n+1 (1/v) as leading logarithmic (LL), to terms α n s ln n (1/v) as next-to-leading logarithmic (NLL), to α n s ln n−1 (1/v) as next-to-next-to-leading logarithmic (NNLL), and so on.
The resummation of the p t spectrum of a heavy colour singlet was first analysed in the seminal work by Parisi and Petronzio [13], where it was shown that in the low-p t region the spectrum vanishes as dσ/dp t ∼ p t , instead of vanishing exponentially as suggested by Sudakov suppression. This power-law behaviour is due to configurations in which p t vanishes due to cancellations among the non-vanishing transverse momenta of all emissions. Around and below the peak of the distribution, this mechanism dominates with respect to kinematical configurations where p t becomes small due to all the emissions having a small transverse momentum, i.e. the configurations which would yield an exponential suppression. In order to properly deal with these two competing mechanisms, in ref. [14] it was proposed to perform the resummation in the impact-parameter (b) space, where both effects leading to a vanishing p t are handled through a Fourier transform.
Using the b-space formulation, the Higgs p t spectrum was resummed at NNLL accuracy in [15,16] using the formalism developed in [14,17], as well as in [18] by means of a soft-collinear-effectivetheory (SCET) approach [19,20]. A study of the related theory uncertainties in the SCET formulation was presented in ref. [21]. More recently, all the necessary ingredients for the N 3 LL resummation were computed [22][23][24][25][26], with the exception of the four-loop cusp anomalous dimension which is currently unknown. This paves the way to more precise predictions for transverse observables in the infrared region. The impact of both threshold and high-energy resummation on the small-transverse-momentum region was also studied in detail in refs. [27][28][29][30][31][32][33][34][35].
The problem of the resummation of the transverse momentum distribution in direct (p t ) space received substantial attention throughout the years [36][37][38], but remained unsolved until recently. Due to the vectorial nature of these observables, it is indeed not possible to define a resummed cross section at a given logarithmic accuracy in direct space that is simultaneously free of any subleading logarithmic contributions and of spurious singularities at finite values of p t > 0. Last year some of us proposed a solution to this problem by formulating a resummation formalism in direct space up to NNLL order [39], and used it to match the NNLL resummation to the NNLO Higgs p t spectrum. The problem of direct-space resummation for the transverse-momentum distribution was also considered more recently in ref. [40] following a SCET approach, where the renormalisationgroup evolution is addressed directly in momentum space. In this article we explain in detail the formalism introduced in [39]. Furthermore, we extend it to N 3 LL, and formulate it in general terms, so that a direct application at this logarithmic accuracy to all transverse, inclusive observables is possible. We point out that our final result lacks the contribution of the unknown four-loop cusp anomalous dimension, which is set to zero in the following.
The paper is structured as follows: in Section 2.1 we sketch the main features of our formalism, based on and extending the one developed in ref. [41], through the derivation of a simplified NLL formula relevant to the case of scale-independent parton densities. Section 2.2 discusses the choice of the resolution variable and kinematic ordering in the evolution of the radiation. In Section 2.3 we discuss the structure of higher-order corrections, and in particular in Section 2.3.2 we treat the inclusion of parton densities and of hard-collinear radiation, thereby making our formalism fully capable of dealing with initial-state radiation. In Section 2.4 we prove that our method is formally equivalent to the more common b-space formulation of transverse-momentum resummation. Section 3 shows how to evaluate our formula to N 3 LL order and in Section 3.2 we present a study of the scaling property of the differential distribution in the p t → 0 limit, and compare our findings to the classic result by Parisi and Petronzio [13]. Finally, in Section 4 we discuss the matching to NNLO, and in Section 4.4 we present N 3 LL accurate predictions for the Higgs-boson transverse momentum spectrum at the LHC, matched to NNLO.
In Appendix A we show that, at NLL, the approach used here is equivalent to a backwardevolution algorithm for this class of observables, while Appendix B collects some of the relevant equations used in the article.

Derivation of the master formula
We consider the resummation of a continuously global, recursive infrared and collinear (rIRC) safe [41] observable V in the reaction pp → B, B being a generic colourless system with high invariant mass M . It is instructive to work out in detail the case of NLL resummation first. This will be done in Section 2.1, where we assume that the parton densities are independent of the scale. We then discuss the inclusion of higher-order corrections in Section 2.3, and the correct treatment of the parton luminosity will be dealt with in Section 2.3.2. Finally, in Section 2.4, we discuss the connection to the impact-parameter space formulation for transverse-momentum resummation.

Cancellation of IRC divergences and NLL resummation
In the present subsection we assume that the parton densities are independent of the scale and set to one for the sake of simplicity. To set up the notation we work in the rest frame of the produced colour singlet, and we introduce two reference light-like momenta that will serve to parametrise the radiationp (1, 0, 0, 1) ,p 2 = M 2 (1, 0, 0, −1) , (2.1) where M is the invariant mass of the colour singlet with momentum p B that in this frame reads The directions of the two momenta in Eq. (2.1) coincide with the beam axis at the Born level.
Beyond the Born level, radiation of gluons and quarks takes place, so that the final state consists in general of n partons with outgoing momenta k 1 , . . . , k n , and of the colour singlet. Due to this radiation, the singlet acquires a transverse momentum with respect to the beam direction. We express the final-state momenta by means of the Sudakov parametrisation whereκ ti are space-like four-vectors, orthogonal to bothp 1 andp 2 . In the reference frame (2.1) eachκ ti has no time component, and can be written asκ ti = (0, k ti ), such thatκ 2 ti = −k 2 ti . Notice that since k i is masslessk In the chosen parametrisation, the emission's (pseudo-)rapidity η i in this frame is The observable V is in general a function of all momenta, and we denote it by V ({p}, k 1 , . . . , k n ); without loss of generality we assume that it vanishes in Born-like kinematic configurations. The transverse observables considered in this paper are those which obey the following general parametrisation for a single soft emission k collinear to leg : where k t is the transverse momentum with respect to the beam axis, g (φ) is a generic function of the angle φ that k t forms with a fixed reference vector n orthogonal to the beam axis, d is a normalisation factor, and a > 0 due to collinear and infrared safety. In particular, in this work we focus on the family of inclusive observables that will be defined in the next section. Examples of such observables are the transverse momentum of the colour-singlet system (corresponding to d = g (φ) = a = 1) 1 , and φ * [42] (corresponding to d = a = 1 , g (φ) = | sin(φ)|). In the latter case, the reference vector n is chosen along the direction of the dilepton system in the rest frame of the Z boson. The transverse momentum of the parametrisation (2.3) is related to the one relative to the beam axis, which enters the definition of the observable, by recoil effects due to hard-collinear emissions off the same leg . To find the relationship, we consider the radiation collinear top 1 . The momentum of the initial-state parton before any radiation p 1 is related to the latter as follows where the notation j ∈ 1 indicates all emissions k i radiated off leg 1. The above equation can be recast as We can use the above equation to expressp 1 as a function of p 1 . By plugging the resulting equation into Eq. (2.3), we find that the transverse momentum of emission k i with respect to p 1 is Generalising the above equation for k i emitted off any leg = 1, 2 we obtain where with the notation j ∈ we refer to partons that are emitted off the same legp as k i . When only one emission is present, the above relation reduces to In the soft approximation the two quantities coincide as y In the present section we work under the assumption of soft kinematics in order to introduce the notation and derive the NLL result. The treatment of hard-collinear emissions will be discussed in detail in Section 2.3.2, where we extend the results derived here to the general case of initial-state radiation.
The central quantity under study is the resummed cumulative cross section for V smaller than some value v, Σ(v), defined as In the infrared and collinear (IRC) limit, Σ(v) receives contributions from both virtual corrections and soft and/or collinear real emissions. The IRC divergences of the form factor exponentiate at all orders (see, for instance, refs. [43,44] and references therein), and we denote them by V(Φ B ) in the following discussion, where Φ B is the phase space of the underlying Born. Therefore we can recast Eq. (2.11) as follows where M is the matrix element for n real emissions (the case with n = 0 reduces to the Born matrix element M B ), and [dk i ] denotes the phase space for the emission k i . The Θ function represents the measurement function for the observable under consideration. Finally, to keep the notation concise, where dΦ n is the n-body phase space of the singlet system, and we have absorbed the partonic flux factor 1/(4p 1 ·p 2 ) into the squared amplitude |M | 2 (and analogously in |M B | 2 below). The renormalised squared amplitude for n real emissions (pp → B + n gluons) can be conveniently decomposed as 2 where we have introduced the n-particle correlated matrix elements squared |M (k a , . . . , k n )| 2 , which are defined recursively as follows and so on. These represent the contributions to the n-particle squared matrix element that vanish in strongly-ordered kinematic configurations, that can not be factorised in terms of lower-multiplicity squared amplitudes. Each of the correlated squared amplitudes admits a perturbative expansion where µ is a common renormalisation scale, and α s is the strong coupling constant in the MS scheme. The notation nPC in Eq. (2.13) stands for "n-particle correlated" and it will be used throughout the article. The rIRC safety of the observables considered here guarantees a hierarchy between the different blocks in the decomposition (2.13), in the sense that, generally, correlated blocks with n particles start contributing at one logarithmic order higher than correlated blocks with n−1 particles [41,45]. In the present article, we focus on the family of inclusive observables V for which (2. 16) In this case, we can integrate the nPC blocks for n > 1 inclusively prior to evaluating the observable. Hence, starting from Eq. (2.13) for the pure gluonic case, we can replace it with the following squared amplitude where Y abc... is the rapidity of the k a + k b + k c + . . . system in the centre-of-mass frame of the collision. We refer to this treatment of the squared amplitude as to the inclusive approximation. 3 With the above notation, we can rewrite Eq. (2.12) as Once the logarithmic counting for the squared amplitude has been set up, as a next step we need to discuss the cancellation of the exponentiated divergences of virtual origin against the real ones. At all perturbative orders at a given logarithmic accuracy, we need to single out the IRC singularities of the real matrix elements, which can again be achieved by exploiting [41,45,47] the rIRC safety of the observable V ({p}, k 1 , . . . , k n ) that we are computing.
We then order the inclusive blocks described by |M (k i )| 2 inc according to their contribution to the observable V (k i ), i.e. V (k 1 ) > V (k 2 ) > · · · > V (k n ). We consider configurations where the radiation corresponding to the first (hardest) block |M (k 1 )| 2 inc has occurred, where we use the fact that the contribution with n = 0 in Eq. (2.18) (which does not have any real emissions) vanishes since it is infinitely suppressed by the pure virtual corrections V(Φ B ) The rIRC safety of the observable allows us to introduce a resolution parameter 1 independent of the observable such that all inclusive blocks with V (k i ) < V (k 1 ) can be neglected in the computation of the observable up to power-suppressed corrections O( p V (k 1 )), that eventually will vanish once we take the limit → 0. Therefore, we classify inclusive blocks k as resolved if V (k) > V (k 1 ), and as unresolved if V (k) < V (k 1 ). This definition is collinear safe at all perturbative orders. With this separation Eq. (2.18) becomes The phase space of the unresolved real ensemble is now solely constrained by the upper resolution scale, since it does not contribute to the evaluation of the observable. As a consequence, it can be exponentiated directly in Eq. (2.19) and employed to cancel the divergences of the virtual corrections V(Φ B ).
We can now proceed with an explicit evaluation of Eq. (2.19) at NLL order. As we mentioned earlier, at different logarithmic orders the cross section will receive contributions from different classes of correlated blocks. This, for instance, means that double-logarithmic terms of the form α n s ln 2n (1/v) entirely arise from 1PC (0) blocks, in particular from their soft-collinear part. If one wants to control all the leading-logarithmic terms of order α n s ln n+1 (1/v) in ln (Σ(v)) (Eq. (1.1)) then the leading (soft-collinear) term of the 1PC (1) and 2PC (0) blocks must be included as well. In particular, within the inclusive approximation defined in Eq. (2.17) we find that where β 0 is the leading term of the QCD beta function (see Appendix B). Moreover, the QCD coupling is renormalised in the MS scheme. The contribution of the one-loop cusp anomalous dimension K, defined as enters at NLL order, and it will be considered later in this section. Up to, and including, the NLL term proportional to K in Eq. (2.20), one can integrate inclusively over the invariant mass of the 2PC (0) block, while keeping the bounds on the rapidity Y as computed from the massless kinematics. This approximation neglects terms which are at most NNLL, and are denoted by the ellipsis in the second line of Eq. (2.20). We notice that the leading soft-collinear terms proportional to β 0 in Eq. (2.20) can be entirely encoded in the running of the coupling of the single-emission squared amplitude 1PC (0) (k) through a proper choice of the scale µ at which the latter is evaluated. It is indeed easy to see from Eq. (2.20) that this is achieved by setting µ to the k t (equal tok t for soft radiation) of each emission k in the parametrisation (2.3) [48,49]. The inclusive matrix element squared and phase space controlling all α n s ln n+1 (1/v) terms are thus 22) where we use M sc (k) to denote the amplitude in the soft approximation. We denoted by C the Casimir relative to the emitting leg (C = C F for quarks, and C = C A for gluons).
For initial-state radiation, 1 − z ( ) is the fraction of the incoming momentum (entering the emission vertex) that is carried by the emitted parton. This will in general differ from the y ( ) fractions of the Sudakov parametrisation (2.3) when some emissions are not soft. In particular, while (1 − z ( ) ) ≤ 1, this is not true in general for the (1 − y ( ) ) appearing in our initial parametrisation. However, in the soft limit, the energy of the emission is much smaller than the singlet's mass M , which restricts y ( ) i to positive values in this limit.
For a single emission, the two variables are related by from which is clear that in the soft limit z ( ) 1 one has z ( ) y ( ) . The upper bound for z ( ) in the single-emission case can be worked out by imposing that y ( ) < 1 −k t /M , and subsequently relatingk t to k t relative to the beam axis. This yields To extend the above discussion to all NLL terms of order α n s ln n (1/v) in the logarithm of Σ(v), we must include the less singular part of the 1PC (1) and 2PC (0) blocks in the soft limit, that is the term proportional to K in Eq. (2.20) that was previously ignored. This simply amounts to replacing the inclusive (soft) matrix element in the r.h.s. of (2.22) with .
(2.25) The above operation is also known as the Catani-Marchesini-Webber (CMW) scheme [50] for the running coupling. 4 At this logarithmic order the cross section also receives contributions from the hard-collinear part of the 1PC (0) block, that we ignored so far. Thus, one has to modify Eq. (2.25) as where P (0) (z ( ) ) is the leading-order unregularised splitting function, reported in Appendix B. 5 At NLL order, the above hard-collinear contribution can be treated by neglecting the effect of recoil both in the phase-space boundaries of other emissions and in the observable, both of which enter at NNLL order. Therefore, also for this contribution we can use the soft kinematics derived in the first part of this section. Moreover, in colour-singlet production, we can use the azimuthally averaged splitting functions (see Appendix B) up to NNLL accuracy. At N 3 LL, corrections from azimuthal correlations arise [51], and they will be introduced in Section 2.3.3.
We insert Eq. (2.26) back into Eq. (2.19). At NLL accuracy, we can neglect the constant terms of the virtual corrections. The remaining singular structure of the virtual corrections only depends upon the invariant mass of the singlet M 2 The combination of unresolved real and virtual contributions is thus finite and gives rise to a Sudakov suppression factor 4 Although in the present article we are considering only inclusive observables, it can be shown [41,45,47] that for all rIRC safe observables (also non-inclusive ones) the inclusive approximation is accurate at NLL order. 5 For emissions off gluonic legs, P (0) receives contributions from both P gq , as it will be discussed in Sec. 2.3.3. In this case, we implicitly exploit the symmetry of P (0) gg under z ↔ 1 − z to recast it such that it has only a z → 1 singularity.
where R is the radiator which at this order reads [41,45] and . (2.31) The next and final step is to treat the resolved real blocks k i for which V (k i ) > V (k 1 ). It is therefore necessary to work out the kinematics and phase space in the presence of additional radiation, which modifies the relations (2.23) and (2.24) obtained in the single-emission case. For this we use the fact that the radiation is ordered in V (k i ). For a given inclusive block of total momentum k i , one then has 6 1 − y where emissions k 1 , k 2 , . . . , k i−1 have been radiated off the same hard leg before k i . In general, this implies that the phase space available for each emissions is changed by the previous resolved radiation. At the NLL order considered in this section, as already stressed, the real-radiation kinematics can be approximated with its soft limit [41,45]. This allows us to approximate y The squared matrix element (2.26) and phase space for a resolved real emission can be parametrised by introducing the functions and .
From the generic form of the rIRC safe observable V (k) (2.5), it is easy to verify that the R functions only depend upon the ratio v/(d g (φ)) up to regular terms, which are neglected [41,45]. Indeed, the only non-trivial integration in Eqs. (2.33) is the one over the rapidity of k, which can be performed inclusively since the observable V (k) does not depend on it (see Eq. (2.5)). Then the final integral only depends on the ratio of the two remaining scales, i.e. the invariant mass of the singlet M , and its transverse momentum that is set to (v/(d g (φ))) 1/a M by the constraint δ (v − V (k)). Upon inclusive integration over the rapidity of momentum k, by using Eq. (2.26), we can parametrise the inclusive squared amplitude and its phase space as With the above considerations, Eq. (2.19) finally becomes where we introduced the total Born cross section Eq. (2.36) resembles equation (2.34) of ref. [41] which after a number of approximations leads to the general NLL formula of the CAESAR method for global rIRC observables in processes with two hard legs. We remind the reader that additional corrections coming from the parton luminosities start at NLL order, and they will be discussed in Section 2.3.2.
Eq. (2.36) can be directly evaluated using Monte-Carlo (MC) techniques since it is finite in four dimensions. However, as it is formulated now it contains effects that are logarithmically subleading with respect to the formal NLL accuracy we are considering in this section. For observables that vanish only in the Sudakov limit, these subleading effects can be systematically disposed of by means of a few approximations, as described in ref. [41]. We now briefly review such approximations on Eq. (2.36), and show that in the case of observables that vanish away from the Sudakov region they lead to a divergent result, hence they cannot be trivially performed.
In order to neglect subleading corrections from Eq. (2.36), we need to consistently treat the resolved squared amplitude and the corresponding Sudakov radiator. In particular, with NLL accuracy, ref. [41] suggests to perform the following Taylor expansions in Eq. (2.36) This is motivated by the fact that at NLL the resolved real emissions are such that v i ∼ v 1 ∼ v, and hence the terms neglected in the above expansions are at most NNLL. Only by expanding consistently (i.e. to the same logarithmic order) the dependence in the Sudakov and in the resolved real emissions we are sure that the result is completely -independent. We observe that, since we expanded out the φ i dependence in R , we have dR(v)/d ln(1/v) = R (v) and Eq. (2.36) becomes At this stage, the integration over v 1 can be performed analytically, and Eq. (2.39) reproduces exactly the known CAESAR formula. 7 However, in order to perform the latter expansions about the observable's value v, one has to make sure that the ratio v i /v remains of order one in the real-emission phase space. rIRC safety ensures that emissions with v i v do not contribute to the observable, and are fully exponentiated and accounted for in the Sudakov radiator. Therefore, the condition v i /v ∼ 1 is fulfilled only if configurations in which v i v never occur. While the latter condition holds true for most rIRC observables, it is clearly violated for observables that vanish away from the Sudakov limit. An example is given by the transverse momentum of a colour singlet, which can vanish even in the presence of several emissions with a finite (nonzero) transverse momentum. In that case, as shown in ref. [39], Eq. (2.39) has a divergence at R (v) 2. For a different observable vanishing away from the Sudakov limit, the divergence will occur at a different, non-zero value of v.
For such observables, Eq. (2.36) cannot be expanded around v. As we will discuss in detail in Section 3.1, we suggest to perform the following alternative expansion about the observable's value of the hardest block v 1 In this way, the rIRC safety of the observable guarantees that v i ∼ v 1 (ζ i ∼ 1) and therefore the terms neglected in Eqs. (2.40) are at most NNLL. However, a class of higher-order terms still remains in Eq. (2.40) through the dependence of the considered terms on v 1 . These higher-order terms cannot be disposed of entirely, as they regularise the divergence discussed above. Therefore, while the resulting equation is finite and accurate at NLL order also for rIRC-safe observables that vanish away from the Sudakov limit, subleading corrections beyond NLL cannot be entirely removed.
The above approximations make the evaluation of Eq. (2.36) considerably simpler than its original form, as it will be shown in Section 3. Its implementation can be carried out efficiently with MC methods as described in detail in Section 4.3.

Choice of the resolution and ordering variable
The derivation that we carried out for the resummation formalism relies to a large extent on the introduction of a resolution variable that separates resolved real blocks from unresolved ones as discussed in the previous section. This resolution variable acts on the total momentum of each of the correlated blocks.
One has some freedom in choosing the resolution variable. In principle, the only necessary property for a good resolution variable is that it must guarantee, at all orders, the cancellation of the IRC divergences of the exponentiated virtual corrections, and hence has to be rIRC safe. A particular choice is motivated by convenience in formulating the calculation. For instance, choosing a variable that shares the same leading logarithms with the resummed observable allows for a much easier implementation of the all-order result, as it will be discussed in Section 3. A natural choice, 7 Some extra simplifications can be made at NLL: in the resolved real squared matrix elements R one can keep only the term proportional to M 2 sc as remaining terms are subleading. In order to guarantee the cancellation of the divergences in the regulator, the same approximation has to be made in the term coming from the expansion of the Sudakov radiator. Finally, the observable can be treated in its soft-collinear approximation given that, at NLL, the real emissions constitute an ensemble of soft-collinear gluons. which fulfils the above requirements, is the value of observable in its soft-collinear approximation, as discussed in refs. [41,45,46,52].
However, we note that for the whole class of transverse observables (that scale like Eq. (2.5) for a single emission), a more convenient choice for the resolution variable is V (k) = (k t /M ) a , k being the sum of the four-momenta in each correlated block. While this exactly coincides with the above prescription for observables with d = g (φ) = 1, it is a legitimate choice also for observables with d = 1, g (φ) = 1 since the dependence on d g (φ) first enters at NLL order, hence the leading logarithms of the resolution variable are the same as for the resummed observable.
The advantage of the latter choice, besides the simplifications in the implementation to be discussed in Section 3, is that it leads to a universal Sudakov radiator for all observables with the same a in the parametrisation (2.5), while the resolved real radiation will correctly encode the full observable dependence through the measurement function Θ (v − V ({p}, k 1 , . . . , k n+1 )). In the present article, we adopt this choice, and we present explicitly the case for a = 1. The generalisation to any a > 0 is straightforward following our derivation. With this choice, Eq. (2.36) reads where, with a little abuse of notation, we redefined ζ i = k ti /k t1 . As it will be described in Section 4.3, the above equation can be efficiently evaluated as a simplified shower of primary emissions off the initial-state legs, ordered in transverse momentum. This choice of the ordering variable is dictated by the choice of the resolution scale, that in turn leads to the Sudakov radiator for a k t ordered evolution in Eq. (2.41).

Structure of higher-order corrections
In deriving the main result of the previous section, Eq. (2.36), we made two approximations. Firstly, we ignored nPC correlated blocks with n > 2 in the squared amplitudes (2.17). Secondly, we did not specify a complete treatment of hard-collinear radiation. Indeed, the only hard-collinear contribution entering at NLL (in Eq. (2.26)) has been treated with soft kinematics. We discuss how to relax both approximations in the next two subsections.

Correlated blocks at higher-logarithmic order
Higher-order corrections require the inclusion of higher-multiplicity and higher-order blocks with respect to those relevant to Eq. (2.36). The relevant blocks necessary to a given order are summarised in Table 1. For instance, at NNLL, for the observables (2.16), one has to include 2PC (0) (i.e. the fully correlated double emission), and 1PC (1) both in the soft and in the hard-collinear limit, and 3PC (0) , 2PC (1) , and 1PC (2) blocks in the soft-collinear limit. Given the inclusive nature of the observables (2.16) that we are treating in this article, the inclusion of higher-order blocks can be done in a simple systematic way by adding more terms to the r.h.s. of Eq. (2.17). We remind the reader of the fact that, while at NLL the bounds for rapidity Y i of the inclusive block |M (k i )| 2 inc can be approximated with their massless limit (see Eq. (2.20) and comments below it), starting at NNLL the integration over the rapidity Y i must be performed exactly.

Hard-collinear emissions and treatment of recoil
In order to repeat the procedure that led to Eq. (2.36) at higher logarithmic accuracy, we need to handle the phase space in the multiple-emission kinematics. In the NLL case derived in the previous Logarithmic order Table 1. Blocks to be included in the squared-amplitude decomposition at a given logarithmic order. At each order, the higher-rank blocks are to be included in the soft-collinear limit ("sc" in the table).
section, indeed, all resolved real emissions are soft and collinear and therefore they do not modify each other's phase space. However, starting at NNLL one or more real emissions can be hard and collinear to the emitting leg and this changes the available phase space for subsequent real emissions.
More precisely, at NNLL we need to work out the corrections due to a single hard-collinear resolved emission within an ensemble of soft-collinear radiation. Similarly, at N 3 LL, one has to consider up to two resolved hard-collinear emissions embedded in an ensemble of soft-collinear radiation. The kinematics and the proper treatment of hard-collinear emissions, still missing in our formulation, will be discussed in this section.
To correctly include the evolution of the hard-collinear radiation in our formulation, we first consider how initial-state radiation modifies the real-emission kernels, illustrating this in the singleemission case for the sake of clarity. Throughout this section and in the rest of this article we use the tree-level splitting functions as reported in Appendix B.
We start by formulating the single-emission probability for a gluon-initiated process. For the sake of concreteness, all prefactors in this subsection are given under the assumption that the colour singlet is a single particle, e.g. a Higgs boson. We express the probability of emitting either a gluon or a quark off leg 1 (an analogous term can be written for an emission off leg 2), for an observable v, as is the gluon density renormalised in the MS scheme, evaluated at a factorisation scale µ F , andP denotes the regularised splitting function. SinceP For the virtual correction, we simply use the first-order expansion of the resummed form factor V(Φ B ) [43] expressed in terms of leading-order splitting functions, of which we take the limit in four dimensions. The unregulated soft and collinear divergences of the four-dimensional virtual corrections manifestly cancel against the ones in the real emissions at the integrand level. We stress once again that in colour-singlet production we can use the azimuthally averaged splitting functions (see Appendix B) up to NNLL accuracy. At N 3 LL, corrections from azimuthal correlations arise [51], and they will be introduced in Section 2.3.3.
In general, the upper bound of the z integration in the virtual corrections is different from the one in the real correction when more than one hard-collinear emission is present, since the available phase space for the real emissions is changed by the presence of the hard-collinear radiation. However, for the single-emission case treated in Eq. (2.42), the upper bound, derived in Eq. (2.24), is identical for the real and virtual contributions.
Eq. (2.42) also contains constant contributions arising from both the finite terms of the virtual form factor in MS, and the O(α s ) collinear coefficient functions. For the sake of simplicity, in the following discussion we neglect these NNLL constant terms, which we will however include in our final formula.
We now add and subtract the term By using the symmetry of the P gg splitting function under z ↔ 1 − z, one finds that (2.45) which allows us to recast the previous equation as Analogously, it is straightforward to show that the logarithmic part for a quark-initiated process with an emission off the leg 1 reads where we have setP qg (z). In Eqs. (2.46) and (2.47), the last integral from 1 − k t /M to 1 gives rise to regular terms and can therefore be neglected. As far as the remaining terms are concerned, we notice that the squared matrix element for an initial-state emission, which corresponds to the terms containing a Θ function in Eqs. (2.46) and (2.47), can be separated into two pieces: • The first one, encoded in the third line of Eqs. (2.46) and (2.47), modifies neither the flavour nor the momentum fraction of the incoming partons, and the bounds of the relative z integration are those of the corresponding virtual phase space. This contribution is fully analogous to the case treated in Sec. 2.3, that gives rise to R in Eq. (2.36). When evaluating this term explicitly, we can further split it, as done in Eq. (2.26), into a soft term and a hard-collinear contribution. The exact upper bound of the z integral is only relevant in the soft contribution, while it can be extended up to 1 in the hard-collinear term up to regular (non logarithmic) terms. In the following, we will refer to this term as the R contribution.
• The second one (second and fourth lines of Eqs. (2.46) and (2.47)) does modify both flavour and momentum fraction. This contribution corresponds to an exclusive step of DGLAP evolution. The corresponding z integration can be extended up to the soft limit (z = 1) as this limit is regularised by the plus distribution in the corresponding splitting function. We stress once again that the latter extension of the upper bound of the z integration in the hard-collinear radiation's phase space is correct up to regular terms that are ignored in our treatment. We will refer to this term as the exclusive DGLAP evolution step.
This decomposition is only a convenient way of expressing the squared amplitude and phase space for an initial-state emission, and only the sum of all logarithmic terms in Eqs. (2.46) and (2.47) is physically well defined. The considerations above will be useful in the rest of this section when the all-order kinematics is discussed.
As anticipated in the beginning of this subsection, in order to achieve N 3 LL accuracy, one has to consider configurations with up to two resolved hard-collinear emissions together with any number of soft-collinear partons in the final state. We therefore study how the presence of hard-collinear emissions affects the phase space of the remaining radiation in the all-order picture. 8 We consider again the emissions ordered according to their transverse momentum. In this picture, the relation between the z ( ) variable and the Sudakov variable y ( ) for a given emission k will be modified by the radiation that occurred before k as described in Eq. (2.32).
We consider the case of an ensemble of resolved emissions off a leg of which a single one is hard and collinear, while all the remaining radiation is soft. We can group the emissions into the following three sets: the soft emissions that occur before the hard-collinear parton is emitted (i.e. at larger transverse momenta), the hard-collinear emission itself, and the soft emissions that occur after the hard-collinear one (at smaller transverse momenta). The soft radiation emitted before the hard-collinear emission has z . Finally, soft emissions that occur after the hard-collinear one will again have k ti k ti but now From the above equation we see that the phase space of the soft radiation emitted after the hardcollinear emission is modified by the presence of the latter. However, the squared amplitude and phase space for emissions in the soft limit only depend on z . Therefore, using the relation dz and using the fact that k ti k ti for these emissions, we can replace the integral over z ( ) i with an integral over y ( ) i whose upper bound is given by This allows one to disentangle the phase space of all emissions in the considered kinematic configuration and, hence, to iterate the procedure at all orders. The remaining kinematic configuration to be considered in a N 3 LL resummation is given by an ensemble of soft-collinear emissions accompanied by two hard-collinear ones. We label the two hard collinear emissions by k hc 1 and k hc 2 and we assume, without any loss of generality, that k hc 1 is emitted before k hc 2 (hence it has a larger transverse momentum in our picture). The upper bounds of the corresponding z ( ) integrals for the real contribution will now be complicated functions of the transverse momenta k hc t1 and k hc t2 that can be obtained starting from Eqs. (2.9), (2.32). However, things are much simplified if we use the decomposition described in the first part of this section, as follows. We recall that the real matrix element can be decomposed as a sum of the R contribution (that does not modify the momentum fraction of the emitter, and whose kinematics is soft by construction), and an exclusive DGLAP step that modifies the momentum fraction of the emitting leg, as shown in Eqs. (2.46), (2.47). In the latter term, the upper bound of the z ( ) integration can be extended to 1 (hence it becomes independent of the kinematics of the rest of the event) since the soft limit is regularised by the plus prescription in the corresponding splitting functions. As for the R contributions relative to k hc 1 and k hc 2 , they can be further decomposed into a soft-collinear term and a term that contains the hard-collinear part of the matrix element (which however does not modify the momentum fraction of the emitting leg). Once again, in the latter contribution the z ( ) integration can be extended to 1, while in the soft-collinear contribution one can simply replace the z ( ) integral with an integral over y ( ) by means of Eq. (2.49). Moreover, using the fact that for a soft emissionk t k t , the corresponding upper bound of the y ( ) integral can be replaced by This procedure allows one to disentangle completely the phase space of the R contributions (whose kinematics is soft by construction) from that of the exclusive DGLAP evolution step which are by construction hard and collinear. The lower bounds in the z ( ) integrals of multiple resolved DGLAP evolution steps are entangled as each of them modifies significantly the momentum available for the subsequent hard-collinear ones, resulting in a convolution between the splitting kernels and the corresponding parton density.
The above treatment of the double-hard-collinear case is valid up to regular terms. In this section we neglected the constant terms that arise from the finite part of the renormalised form factor, and from the collinear coefficient functions, which are relevant already for a NNLL resummation. For inclusive observables considered in this article, the collinear coefficient functions factorise in front of the Sudakov factor and, for the processes considered here, they were computed to O(α 2 s ) in refs. [22][23][24]. These will be introduced in the following section when we iterate the arguments discussed here at all perturbative orders in α s .

Resummed formula for initial-state radiation
The arguments derived in the previous section can be used to formulate the structure of the cross section at all orders by iterating the single-emission picture defined above. Given the inclusive nature of the observables studied here, the inclusion of higher-order logarithmic corrections can be achieved in a simple way by just adding the relevant correlated blocks (as reported in Table 1) in the inclusive approximation (2.17). The contribution to the cross section from each inclusive block, in turn, can be split into an R -type contribution (which does not modify either the momentum fraction or the flavour of the emitting leg), and a DGLAP step (inclusive in the content of each correlated block, but differential in its transverse momentum), and hence it can be treated in a fully analogous way to what done for single emissions in the previous subsection. This simple prescription allows us to discuss the inclusion of the parton densities by referring to emissions (for the sake of simplicity), while keeping in mind that they are to be thought of as inclusive sums of correlated blocks as defined in Eq. (2.17).
To show how the parton densities are accounted for, we start by evaluating them at a scale µ 0 that is assumed to be smaller than all transverse momenta in the event. We consider the situation in which the emissions are ordered in transverse momentum, and the hardest (resolved) emission k 1 occurred. The phase-space diagram for any secondary emission k i with i > 1 is depicted in Fig. 1 in the ln(k t /M ) − η (Lund) plane, where now η denotes the rapidity in the centre-of-mass frame of the incoming partons which are extracted from the proton at a factorisation scale µ 0 , and the transverse momentum k t is taken with respect to the beam direction. As stated in Section 2.1, due to rIRC safety, only emissions that take place in the strip between k t1 and k t1 (labelled with "REAL EMISSIONS" in Fig. 1) modify the observable significantly and are resolved. The remaining unresolved real emissions (k ti < k t1 ) are combined with the virtual corrections, which populate the whole region below the two diagonal lines that denote the upper rapidity limits. The result of this combination is indeed the Sudakov form factor associated with the first emission that vetoes secondary emissions in the yellow region (labelled with "SUDAKOV SUPPRESSION" in Fig. 1) of the Lund plane. In addition, the combination of virtual and unresolved emissions gives also rise to a constant term that multiplies the Sudakov and encodes both the finite part of the virtual corrections and the constant contribution due to soft and/or collinear emissions exactly at the edges of their phase space, encoded in the collinear coefficient functions.
In the initial-state-radiation case at hand, hard-collinear emissions define the evolution of the parton densities. These emissions occur on a strip (labelled with "DGLAP" in Fig. 1) along the upper rapidity bounds, and their evolution is encoded in the DGLAP equations. In the unresolved region (k ti < k t1 ), the DGLAP evolution can be performed inclusively since emissions in this phasespace region do not affect the value of the observable. On the other hand, when k t1 > k ti > k t1 the corresponding hard-collinear emissions modify significantly the observable's value and therefore must be treated exclusively, namely unintegrated in k t . In addition to the parton densities, starting at O(α s ), one needs to include the coefficient functions that emerge from their renormalisation, and originate from emissions that occur at the edges of the phase space in Fig. 1. The coefficient functions contribute to the logarithmic structure only through the scale of their running coupling, which is the transverse momentum of the emission(s) they are associated with. As done for the parton densities, one can evaluate them initially at a scale µ 0 smaller than any transverse momentum in the event, and subsequently evolve them inclusively up to the resolution scale k t1 . Their evolution must be instead treated exclusively in the resolved strip k t1 > k ti > k t1 .
In order to introduce the all-order result, it is convenient to simplify the flavour structure of the evolution for the time being. We neglect real-emission kernels that modify the flavour of the emitting leg, namely those that do not have a soft singularity P qg and P gq . This ensures that the flavour of the initial parton densities is only modified by the coefficient functions and is conserved by the resolved real radiation. This approximation is made without any loss of generality, and for the only sake of simplicity. The extension to the full flavour case will be trivial once the final formula is obtained.
For the remaining part of the section, it is useful to introduce a matrix notation to simplify the structure of our expressions in flavour space. We define f as the array containing the 2n f + 1 partonic densities, where n f denotes the number of active flavours. To handle different Born configurations with different incoming flavours c , we then define the coefficient-function matrix C c as a (2n f + 1) × (2n f + 1) diagonal matrix in flavour space whose entries are where C ij are the collinear coefficient functions, c is the flavour of the leg entering the Born process, and f (a) is the flavour corresponding to the a-th entry of the parton-density array. For instance, we explicitly show the above convention in the case of Higgs production, considering only a single quark flavour q. By defining the array f = (f g , f q , fq) T , the matrix C g reads The evolution of (2.51) between two scales is entirely encoded in the evolution of the running coupling. By introducing the corresponding anomalous-dimension matrix Γ (C) we can write the Renormalisation-Group evolution (RGE) of the coefficient function matrix as In principle, the matrix Γ (C) should also explicitly carry a label c to specify that it evolves the coefficient function C c associated with the Born flavour c . We omit this label as the notation in what follows is unambiguous. We stress however that the flavour of the coefficient function is not modified by its RG evolution, indeed it is manifestly flavour diagonal. The iterative structure of the squared amplitudes appears more transparent if we work in Mellin space, where convolutions become products. We therefore introduce the Mellin transform of a function g(x) as (2.55) The DGLAP [53][54][55] evolution of the parton-density vector f can be conveniently written in Mellin space as In the previous equation P is the path-ordering symbol, and the matrix Γ is defined as whereP f (a)f (b) are the regularised splitting functions (see Appendix B). We stress that, within the simplifying assumption made above on flavour-conserving real-emission kernels, no splitting functions involving a real quark emission are included, therefore the matrix Γ is diagonal. Within this assumption, the path ordering in Eq. (2.56) can be lifted. With this notation, the hadronic cumulative cross section, differential with respect to the Born phase space Φ B , can be written as where the sum runs over all possible Born configurations and we employed a double inverse Mellin transform. The contours C 1 and C 2 are understood to lie along the imaginary axis to the right of all singularities of the integrand. In Eq. (2.58), and from now on, we define the notation where Ω B denotes any set of internal phase-space variables used to parametrise the colour-singlet system. The right-hand side differs from the squared amplitude |M B | 2 c1c2 simply by a jacobian factor.
The matrixΣ encodes the effect of the all-order radiation that evolves the partonic cross section and the corresponding parton densities. To write down an all-order expression forΣ for the observables (2.16), we need to iterate the single-emission probability derived in the previous section. Given that the phase space of the R contributions and the exclusive DGLAP evolution steps are completely disentangled in the resolved real radiation, this operation can be performed straightforwardly in Mellin space, yieldinĝ where now ζ i = k ti /k t1 since we are using the transverse momentum as a resolution and ordering variable. R is a diagonal matrix in flavour space: given the flavour c of the Born leg , it describes the flavour-conserving resolved radiation off leg . It is defined as and R is defined in Eq. (2.33). The Sudakov operator R is then defined as The terms proportional to R in Eq. (2.59) encode the contribution of the radiation which is flavour-diagonal, and does not modify the momentum fraction of the incoming partons. This is the analogue of what has been derived in Sec. 2.1 in the case of scale-independent parton densities. In addition, the real emission probability now involves the exclusive evolution for the parton densities and coefficient functions. The matricesΣ c1,c2 are diagonal in flavour space within the flavour assumption that we are making here. The first line of Eq. (2.59) contains the factor C c1;T N1 (α s (µ 0 ))H(µ R )C c2 N2 (α s (µ 0 )) that encodes the hard-virtual corrections to the form factor and the collinear coefficient functions. Explicit expressions for these quantities will be given later (see Sec. 3.1 and references therein). As discussed above, the coupling of the coefficient functions here is evaluated at µ 0 and subsequently evolved up to k t1 by the operator containing the diagonal matrix Γ (C) N in the second line of (2.59). Similarly, the parton densities are evolved from µ 0 up to k t1 . As it was shown in ref. [51], starting at a given order in perturbation theory one needs to include the contribution from the collinear coefficient functions G, that describe the azimuthal correlations with the initial-state gluons. Such a contribution starts at O(α 2 s ) (i.e. N 3 LL) for gluon-fusion processes, and at yet higher orders for quark-initiated ones. It is included in the above formulation by simply adding to Eq. (2.59) an analogous term where one makes the replacements and Γ N is defined analogously to Eq. (2.53), and the flavour structure of G is analogous to the one of the C matrix. In what follows this contribution, whenever not reported, is understood.
Eq. (2.59) has been derived by iterating the single-emission probability. As discussed above, higher-order logarithmic corrections are simply included by adding higher-order correlated blocks. Specifically, this amounts to including higher-order logarithmic corrections to the radiator R and its derivative R , as well as in the anomalous dimensions which drive the evolution of the parton densities and coefficient functions.
We conclude the discussion by pointing out that even if the all-order formulation has been conveniently obtained in Mellin space, it is possible to evaluate Eq. (2.58) directly in momentum space at any given logarithmic order. We will describe how to do this in Sec. 3.1. Eq. (2.59) holds for all inclusive observables (see definition in Sec. 2.3) that do not depend on the rapidity of the initial-state radiation. In the remaining part of this article we specialise to the study of the transverse-momentum case, but analogous conclusions will apply to other observables of the same class.

Equivalence with impact-parameter-space formulation
In this section we show how to relate our Eq. (2.58) to the impact-parameter-space formulation of [13]. We show the equivalence for the differential partonic cross section (2.59) in the case of the transverse momentum p t . An analogous proof can be carried out in the case of the φ * .
Our starting point is the differential partonic cross section, where we now set µ 0 = µ R = M without loss of generality: We transform the δ function into b-space as and we evaluate the azimuthal integrals, which simply amounts to replacing each of the factors e ±i b· kt with a Bessel function J 0 (bk t ). It is now straightforward to see that the sum in Eq. (2.64) gives rise to an exponential function, yielding .

(2.66)
We finally notice that we can set → 0 in the above formula, given that now the cancellation of divergences is manifest. The k t1 integrand is a total derivative and it integrates to one, leaving We now insert the resulting partonic cross section back into the definition of the hadronic cross section (2.58), and use the second and third terms in the exponent of Eq. (2.67) to evolve the parton densities and the coefficient functions down to b 0 /b, with b 0 = 2e −γ E . After performing the inverse Mellin transform, and neglecting N 4 LL corrections, we obtain (hereafter we simplify the notation for the parton densities by omitting their x 1 and x 2 dependence, which is determined by the Born kinematics Φ B ) where R CSS, and H CSS (M ) are the Sudakov and hard function commonly used for a b-space formulation [14]. As shown in ref. [51], and as already stressed above, both Eqs. (2.68) and (2.69) receive an extra contribution due to the azimuthal correlations which are parametrised by the G coefficient functions. We omit them in this comparison for the sake of simplicity, however it is clear that analogous considerations apply in that case. The comparison between Eqs. (2.68) and (2.69) allows us to extract the N 3 LL ingredients from the latter formulation as obtained in refs. [22,23,25,26], that will be reported in the next section. We start by using the relation 10 where we ignored N 4 LL terms. In the above formula the derivative in the second term of the right-hand-side is meant to act on the integral whose bounds are set by Θ(k t − b0 b ). This yields, at N 3 LL, The second term in the exponent of Eq. (2.71) starts at N 3 LL, so up to NNLL the two definitions (the one in terms of a J 0 and the one in terms of the theta function) are manifestly equivalent. To relate the two formulations we recall the definition of R in Eq. (2.60) and we express the Sudakov radiators as (2.61) (2.72) The anomalous dimensions A and B relative to leg and the hard function H admit an expansion in the strong coupling as (2.73) The relation between the coefficients that enter at N 3 LL can be deducted by equating Eqs. (2.68) and (2.69), obtaining 10 See appendix of ref. [57] for a derivation.
The above equations constitute the ingredients for our N 3 LL resummation. Physically, the extra terms proportional to ζ 3 arise from the fact that the O(α 2 s ) terms proportional to δ(1 − z) in the coefficient functions in momentum space differ from their b-space counterpart. This difference precisely amounts to the new contributions in Eqs. (2.74). We stress that only the combination of A (4) , B (3) , H (2) and C (2) is resummation-scheme invariant, hence our choice of absorbing the new terms into A (4) , B (3) , H (2) is indeed arbitrary. One could analogously define an alternative scheme in which the extra terms are directly absorbed into the O(α 2 s ) coefficient functions, thus leaving the two-loop form factor unchanged.

Evaluation up to N LL
In this section we evaluate our all-order master formulae (2.58) and (2.59) explicitly up to N 3 LL accuracy. The latter equations can already be evaluated as they are by means of Monte Carlo techniques; however, at any given logarithmic order it is possible, and convenient, to further manipulate them in order to evaluate them directly in momentum space, without the need of the Mellin transform.

Momentum-space formulation
We firstly focus on the partonic cross section (2.59). There are three main ingredients: the Sudakov radiator and its derivative, the block containing coefficient functions C(α s ) and hard-virtual corrections to the form factor H(µ R ), and the anomalous dimensions that rule the evolution of parton densities and coefficient functions.
For colour-singlet production, the coefficients entering the Sudakov radiator satisfy A (2) have been known for several years [19,58,59], and they are collected, for instance, in the appendix of ref. [57]. The N 3 LL coefficient B (3) can be extracted from the recent result [25,26]. For gluon processes it reads: while for quark processes The remaining N 3 LL anomalous dimension A (4) is currently incomplete given that the four-loop cusp anomalous dimension is still unknown. Here we compute A (4) according to Eq. (71) of ref. [19] or Eq. (4.6) of ref. [60], using the results of refs. [25,26] for the soft anomalous dimension, and setting the four-loop cusp anomalous dimension to zero. For gluon-initiated processes we get while for quark-initiated ones We have left the additional terms arising from Eq. (2.74) unexpanded to facilitate the comparison to the existing literature. The remaining quantities are evaluated with n f = 5. The expression of the Sudakov radiator is analogous to the b-space one, i.e.
and, as above, we define R as the logarithmic derivative of R where we defined In order to make the numerical evaluation of our master formula Eq. (2.59) more efficient, we can make a further approximation on the integrand without spoiling the logarithmic accuracy of the result. Before we describe the procedure in detail, we stress that this additional manipulation is not strictly necessary and one could in principle implement directly Eq. (2.59) in a Monte-Carlo program.
Since the ratios k ti /k t1 for all resolved blocks are of order 1, we can expand R and its derivative about k t1 , retaining terms that contribute at the desired logarithmic accuracy. At N 3 LL one has where the dots denote N 4 LL terms, and we have employed the usual notation ζ i = k ti /k t1 . We recall that the transverse momenta of blocks in the resolved ensemble are parametrically of the same order. This is because rIRC safety ensures that blocks k with k t k t1 do not contribute to the observable and are encoded in the Sudakov radiator. Therefore, since ln(1/ζ i ) in the above formula is the logarithm of an O(1) quantity, each term in the right-hand-side of Eq. (3.8) is logarithmically subleading with respect to the one to its left.
The logarithms ln(1/ ) in the first line of Eq. (3.8) are a parametrisation of the IRC divergences arising from the combination of real-unresolved blocks and virtual corrections, expanded at a given logarithmic order. The dependence exactly cancels against the corresponding terms in the resolved real corrections (denoted by the same-order derivative of R) upon integration over ζ i , as it will be shown below. This is a convenient way to recast the subtraction of IRC divergences at each logarithmic order in our formulation.
The terms proportional to R (k t1 ) are to be retained starting at NLL, those proportional to R (k t1 ) contribute at NNLL and, finally, the ones proportional to R (k t1 ) are needed at N 3 LL. Starting from the NLL ensemble, we note that correcting a single block with respect to its R (k t1 ) approximation (i.e. including for that block the subleading terms of Eq. (3.8)) gives rise at most to a NNLL correction of order O(α n s L n−1 ) in our counting. Modifying two blocks would lead to a relative correction of order O(α n s L n−2 ), i.e. N 3 LL, and so on. Therefore, at any given logarithmic order, it is sufficient to keep terms beyond the R (k t1 ) approximation only for a finite number of blocks (namely a single block at NNLL, two blocks at N 3 LL, and so forth). Consistently, one has to expand out the corresponding terms in the Sudakov that cancel the divergences of the modified real blocks to the given logarithmic order. This prescription has been derived and discussed in detail at NNLL in ref. [45], and will be used later in this section.
As a next step we address the evolution of the parton densities and relative coefficient functions encoded in Eq. (2.59), whose anomalous dimensions Γ N and Γ (C) N have been defined in Eqs. (2.56), and (2.54). Only a finite number of terms in their perturbative series needs to be retained at a given logarithmic accuracy: in particular, contributions from the O(α n s ) term in Γ N enter for a N n+1 LL resummation (we recall that the series of Γ N starts at O(α 0 s ), hence these terms start contributing at NLL). On the other hand, the contribution of the coefficient functions, and therefore of the corresponding anomalous dimension, starts at NNLL. Therefore the O(α m s ) term in Γ (C) N is necessary at N m+1 LL, since its expansion starts at O(α s ).
We can then perform the same expansion about k t1 for the terms in Eq. (2.59) containing Γ and Γ (C) . Up to N 3 LL we expand the exponent of the evolution operators as N (α s (k t )) ln 1 + . . . , (3.10) and the corresponding resolved real-emission kernels as where as usual L = ln(M/k t1 ). The first terms on the right-hand side of Eqs. (3.9), and (3.10) represent the evolution operator that runs the parton densities and the coefficient functions, respectively, from µ 0 up to k t1 . The remaining terms describe the exclusive evolution of the parton densities and of the coefficient functions in the resolved strip. In particular, the -dependent terms completely cancel against the corresponding terms in the real-emission kernel of Eqs. (3.11), and (3.12) upon integration over the resolved-radiation phase space. At NLL the coefficient functions are an identity matrix in flavour space, and therefore their evolution operator is trivial. The contribution of the Γ N in the exponent starts at NLL, while the exclusive evolution of the parton densities in the resolved strip starts at NNLL since it corresponds to emissions in the hard-collinear edge of the phase space. Therefore, at NLL one only needs to retain the first term in the right-hand side of Eq. (3.9), and ignore everything else in Eqs. (3.9), (3.10), (3.11), and (3.12), which corresponds to evaluating the parton densities at µ F = k t1 . At this order, the evolution can be carried out by means of the tree-level anomalous dimension γ N ). At this order also the coefficient functions start contributing with their inclusive evolution, therefore one needs to add the first term in the r.h.s. of Eq. (3.10). The corresponding exclusive evolution of the coefficient functions in the resolved strip, encoded in the r.h.s. of Eq. (3.12) only starts at N 3 LL. At higher orders, one simply needs to add subsequent terms from the above equations, and evaluate the anomalous dimensions at the appropriate perturbative accuracy.
As discussed above for the Sudakov radiator, at any given logarithmic order beyond NLL, it is sufficient to include the extra -dependent terms from Eqs. (3.9), (3.10) in the exponent, and the corresponding terms in the resolved real radiation from Eqs. (3.11), (3.12) only for a finite number of emissions, namely a single emission at NNLL, two emissions at N 3 LL, and so forth.
Finally, we need to deal with the block C c1;T N1 (α s (µ 0 ))H(µ R )C c2 N2 (α s (µ 0 )) in Eq. (2.59). As discussed in the previous section, for a generic process this block receives a contribution from the gluon collinear correlations G, as in Eq. (2.63). Since the contribution of the G functions starts at N 3 LL, at this order one can drop the dependence in their evolution; namely, in the analogue of Eq. (3.10) with Γ (C) N , only the first term on the right-hand side needs to be retained. This amounts to evaluating the coupling of the G coefficient functions at k t1 .
With the expansions detailed above, Eq. (2.59) becomeŝ Following the procedure of ref. [45], we can express the ln(1/ ) singularities in the exponent of Eq. (3.13) as integrals over dummy real emissions as follows (3.14) and subsequently expand out the divergent part of the exponent, retaining the terms necessary at a given logarithmic order. We further introduce the average of a function G({p}, {k i }) over the measure dZ where we simplified the notation by using The dependence on the regulator cancels exactly in Eq. (3.15).
We can plug Eq. (3.13) into the definition of the hadronic cross section (2.58). We define the derivatives of the parton densities by means of the DGLAP evolution equation whereP (z, α s (µ)) is the regularised splitting function (2) (z) + . . . (3.18) Moreover, we introduce the following parton luminosities 22) and Y is the rapidity of the colour singlet in the centre-of-mass frame of the collision at the Born level. |M B | 2 cc is the Born squared matrix element, and L = ln(1/v 1 ), with v 1 = k t1 /M , v = p t /M . We transform back to momentum space, thus abandoning the matrix notation used so far, by means of the following identities, valid up to N 3 LL where we defined ∂ L = d/dL. Since we evaluated explicitly the sum over the emitting legs i , the convolution of a regularised splitting kernelP (0) with the NLL parton luminosity is now defined aŝ The termP (0) ⊗P (0) ⊗ L NLL (k t1 ) is to be interpreted aŝ (3.25) Including terms up to N 3 LL, we can therefore recast Eqs. (3.13), (2.58) as Until now we have explicitly considered the case of flavour-conserving real emissions, for which we derived Eq. (3.26). We now turn to the inclusion of the flavour-changing splitting kernels, that enter purely in the hard-collinear limit and contribute to the DGLAP evolution.
We observe that at a given logarithmic order only a finite number of hard-collinear emissions are actually necessary. As we mentioned several times in the above sections, at N 3 LL one needs to account for the effect of up to two hard-collinear resolved partons. Therefore, the inclusion of the flavour-changing kernels can be done directly at the level of the splitting functions and parton luminosities in Eq. (3.26).
In the above expressions for the luminosity we have used the following expansions in powers of the strong coupling for the functions C, H and G, up to N 3 LL: ab (z), (3.29) where µ is the same scale at which the parton densities are evaluated, and µ R is the renormalisation scale.
The expressions for C (1) and H (1) have been known for a long time, and are collected, for instance, in the appendix of ref. [57]. The hard-virtual coefficient H(µ R ) is defined as the finite part of the renormalised QCD form factor in the MS renormalisation scheme, divided by the underlying Born squared matrix element. The hard coefficients for gluonic processes up to O(α 2 s ) evaluated at the invariant mass of the colour singlet H (1) (M ) and H (2) (M ) read [61][62][63] where the last term in H (2) g was deliberately left symbolic to stress its origin from Eq. (2.74). Analogously, for quark-initiated reactions one has [64][65][66] The renormalisation-scale dependence of the first two hard-function coefficients is given by where d B is the strong-coupling order of the Born squared amplitude (e.g. d B = 2 for Higgs production). The C (2) and G (1) functions for gluon-fusion processes are obtained in refs. [22,24], while for quark-induced processes they are derived in ref. [23]. In the present work we extract their expressions using the results of refs. [22,23]. For gluon-fusion processes, the C (2) gq and C (2) gg coefficients normalised as in Eq. (3.27) are extracted from Eqs. (30) and (32) of ref. [22], respectively, where we use the hard coefficients of Eqs. (3.30) without the new term proportional to β 0 in the H (2) g (M ) coefficient. 11 The coefficient G (1) is taken from Eq. (13) of ref. [22]. Similarly, for quark-initiated processes, we extract C (2) qg and C (2) qq from Eqs. (32) and (34) of ref. [23], respectively, where we use the hard coefficients from Eqs. (3.31) without the new term proportional to β 0 in the H Eq. (3.26) resums all logarithmic towers of ln(1/v) (with v = p t /M ) up to N 3 LL, therefore neglecting subleading-logarithmic terms of order α n s ln 2n−6 (1/v). Constant terms of order O(α 3 s ) relative to the Born will be extracted automatically from a matching to the N 3 LO cumulative cross section in Section 4. This will allow us to control all terms of order α n s ln 2n−6 (1/v) in the matched cross section, therefore neglecting terms O α n s ln 2n−7 (1/v) . We have split the result into a sum of three terms. The first term (first line of Eq. (3.26)) starts at LL and contains the full NLL corrections. The second term of Eq. (3.26) (second to fourth lines) is necessary to achieve NNLL accuracy, while the third term (fifth to ninth lines) is purely N 3 LL.
Since Eq. (3.26) still contains subleading-logarithmic terms (i.e. starting at N 4 LL in ln(M/p t )), one could, even if not strictly required, perform further expansions on each of the terms of Eq. (3.26) in order to neglect at least some of the corrections beyond the desired logarithmic order. For instance, for a N 3 LL resummation, the full N 3 LL radiator is necessary in the first term of Eq. (3.26), while the radiator can be evaluated at NNLL in the second term, and at NLL in third term. Analogously, for a NNLL resummation, the NLL radiator suffices in the second term of Eq. (3.26). Furthermore, at NNLL, one could split R (k t1 ) into the sum of a NLL termR (k t1 ) and a NNLL one δR (k t1 ), and expand Eq. (3.26) about the former retaining only contributions linear in δR (k t1 ). The last two considerations relate Eq. (3.26) to Eq. (9) of ref. [39] where this approach was first formulated at NNLL for the Higgs-boson transverse-momentum distribution.
Eq. (3.26) can be evaluated in its present form with fast Monte Carlo techniques, as we will discuss in Section 4.
We performed numerous tests to verify the correctness of Eq. (3.26). Firstly, we performed the expansion of Eq. (3.26) to O(α 3 s ) relative to the Born for the transverse momentum of the boson as well as for the φ * distribution in Drell-Yan production, and compared it to the corresponding result from the b-space formulation, finding full agreement for the N 3 LL terms. This is a highly non-trivial test of the logarithmic structure of Eq. (3.26). The differential O(α 2 s ) expansion for both observables was also compared to MCFM [67] and we found that the difference between the two predictions vanishes in the logarithmic region. Finally, we checked numerically that the coefficient of the scaling Σ(p t ) ∝ p 2 t in the small-p t limit of Eq. (3.26) agrees with the prediction obtained with the b-space formulation. The agreement of the NNLL prediction obtained using our formula (3.26) with the b-space result from the program HqT [16] across the spectrum was shown in ref. [39].

Perturbative scaling in the p t → 0 regime
In this section we show that our formulation of the transverse-momentum resummation of Eq. (3.26) reproduces the correct scaling in the p t → 0 limit as first observed in [13]. Moreover, we obtain a correspondence between the logarithmic accuracy and the perturbative accuracy in this limit. In the following we follow the approximations made in Ref. [13] to derive an analytic estimate for the p t → 0 scaling of the differential cross section. Such approximations are further discussed in Appendix C. To perform a comparison with the results of [13], we consider NLL resummation and neglect the evolution of the parton densities with the energy scale. However the same procedure can be easily extended to the general case. We have where Before proceeding to the evaluation of Eq. (3.36), a remark is in order. At NLL one would be tempted to perform the replacement (see Sec. 2.4) (3.37) and recast Eq. (3.36) as (3.38) The above result is singular for R (k t1 ) ≥ 2, owing to the fact that the integrand scales as b 1−R (kt1) in the b → 0 limit. This singular behaviour is however entirely due to the approximation in Eq. (3.37), where all power-suppressed terms are neglected, while Eq. (3.36) is regular, as the integral in its exponent vanishes as O(b 2 ) for small b. Therefore, when using Eq. (3.37) one must regularise the b → 0 limit, for instance by means of modified logarithms as in ref. [15]. In our formalism, instead, Eq. (3.36) is evaluated numerically without further approximations so that the b → 0 region is correctly described.
It is interesting to study the scaling of Eq. (3.36) in the small-p t limit. In this limit, the dominant mechanism that produces a vanishing p t involves several soft and collinear emissions with finite transverse momentum that mutually balance in the transverse plane.
In this kinematic configuration one has k t1 p t , thus expanding k t1 about p t in Eq. (3.36) is not allowed: such an operation would give rise to spurious singularities at R (p t ) ≥ 2, as reported several times in the literature [19,37,39,40,52,68].
We therefore evaluate the b integral of Eq. (3.36) and observe that in the limit where M k t1 namely it is constant in p t in first approximation. In this regime Eq. (3.36) becomes In order to directly compare with the result of ref. [13], we specialise to the case of the Drell-Yan process, and compute R(k t1 ) at the lowest order using the leading-order running coupling expressed in terms of the QCD scale Λ QCD (with n f = 4), .
We obtain (A (1) = 2C F in this case) We now integrate over k t1 in Eq. (3.40) from Λ QCD up to the invariant mass of the Drell-Yan pair, obtaining that reproduces the scaling of ref. [13]. 12 We stress that this power-like scaling is not due, by any means, to higher-order effects that one would be missing in performing the naive expansion of k t1 about p t , but rather to a collective kinematical effect that requires the presence of any number of emissions. Indeed, the expansion of Eq. (3.36) to any order in the strong coupling only gives rise to logarithmic effects and no terms scaling as O(p t ) arise. To reproduce the correct scaling an all-order treatment is necessary.
In order to study how this result is modified by the inclusion of higher-order logarithmic corrections, we evaluate Eq. (3.40) in the fixed-coupling-constant approximation. This is a simple toy model for the more complicated running coupling case. At lowest order one has with A (1) = 2C (with C = C A for gluons and C = C F for quarks), and L = ln M/k t1 . In the perturbative regime Eq. (3.40) therefore reads Eq. (3.43) shows that in the small-p t limit the differential spectrum features a non-perturbative scaling in α s (see also Eq. (2.12) of ref.
[13] 13 ). However, the coefficient of this scaling can be systematically improved in perturbation theory: the inclusion of NLL terms α n s L n in the right-hand side of Eq. (3.40) contributes an O(1) correction to the right-hand side of Eq. (3.43). Analogously, NNLL terms α n s L n−1 will produce an O(α s ) correction relative to the non-perturbative factor e π/(2Cαs) / √ 2Cα s , and so on. In particular, with our N 3 LL calculation we have control over the terms of relative order O(α 2 s ). From this scaling we deduce that the correspondence L ∼ 1/α s is still valid in the deep infrared regime. However, this does not mean that the above prediction is accurate in this limit: indeed non-perturbative effects due to soft-gluon radiation below Λ QCD , as well as due to the intrinsic transverse momentum of the partons in the proton, feature a similar scaling. This is because the colour singlet's transverse momentum is sensitive to non-perturbative dynamics only through kinematical recoil, that is the same mechanism that drives the scaling (3.41).

Numerical implementation
In order to have a prediction that is valid accross different kinematic regions of the spectrum, one needs to match the resummed calculation, valid in the small-v limit, to a fixed-order calculation that describes the hard (large-v) region. In this section we discuss the matching of the result described in the previous sections, in particular Eq. (3.26), to a fixed-order prediction that is NNLO accurate in the hard region of the phase space. We then describe how to evaluate Eq. (3.26) exactly using a Monte Carlo Markov process, and discuss the implementation in a parton-level generator that is fully differential in the Born kinematics. 12 In the last step we have neglected a factor of 1/Λ 2 QCD ln(M 2 /Λ 2 QCD ), as done in ref. [13]. 13 Please note that only the leading contribution for αs 1 is reported in the right-hand side of that equation.

Normalisation constraint and resummation-scale dependence
In order to match the resummed calculation to a fixed-order prediction one has to ensure that the hard region of the phase space receives no contamination from resummation effects. We therefore need to modify Eq. (3.26) so that at large v (v = p t /M in the transverse-momentum case) all resummation effects vanish. At N 3 LL, it reduces to where L N 3 LL is defined in Eq. (3.21). The normalisation constraint (4.1) can be implemented in several ways; in what follows we impose it by modifying the structure of the logarithms L everywhere in Eq. (3.26), as commonly done for this observable in the literature. Before defining the modified logarithms, it is convenient to have a way to estimate the resummation uncertainties due to higher-order logarithmic corrections that are not included in the calculation. To this aim, we introduce the dimensionless resummation scale x Q by using the identity and then we expand the right-hand side about ln(x Q /v 1 ) to the nominal logarithmic accuracy (in terms of ln(x Q /v 1 )), neglecting subleading corrections. In the transverse-momentum case one has v 1 = k t1 /M and x Q = Q/M , where Q, the resummation scale, has dimension of a mass. A variation of x Q will therefore provide an estimate of the size of higher-order logarithmic corrections. The normalisation constraint can now be imposed by replacing the resummed logarithms where the positive real parameter p is chosen in such a way that resummation effects vanish rapidly enough at v 1 ∼ x Q . Eq. (4.3) amounts to imposing unitarity by introducing in the resummed logarithms power-suppressed terms that scale as (x Q /v 1 ) p , which ultimately give rise to terms of order v −p in the cumulative cross section Σ(v). Given that the differential spectrum tends to zero with a power law (∼ v −n with positive n) at large v, it follows that one should have p ≥ n − 1 in order not to affect the correct fixed-order scaling at large v. However, since we are interested in turning off the resummation at transverse momentum values of the order of the singlet's mass, the relevant scaling n to be considered in the choice of p is the one relative to the differential distribution in this region. We stress, finally, that the prescription (4.3) is only one of the possible ways of turning off resummation effects in the hard regions of the spectrum. For instance one could, analogously, directly constrain the first block to have k t1 ≤ Q, which would naturally suppress radiation effects at large v. This solution would however lead to more complicated integrals in the expansion of the resummation formula used in the matching to fixed order. For this reason, we stick to prescription (4.3) while leaving the study of alternative solutions for future work. We notice that, with the prescription (4.3), the single-emission event in the first line of Eq. (3.26) is not a total derivative any longer. One can however restore this property by introducing the jacobian factor in all integrals over v 1 = k t1 /M in Eq. (3.26). This jacobian tends to one at small v 1 and therefore does not modify the logarithmic structure. Moreover, in the large-v region where the single-emission event dominates, this prescription prevents the proliferation of power-suppressed terms. The prescription (4.3) effectively maps the point at which the logarithms are turned off onto infinity. This also gives us the freedom to extend the upper bound of the integration over k t1 from M to ∞ in Eq. (3.26) without spoiling the logarithmic accuracy. We therefore implement the prescription (4.3) in the Sudakov radiator and its derivatives. We denote all modified quantities by a '∼' superscript. The expansion about ln(x Q /v) induces some constant terms in the Sudakov radiator that are expanded out up to O(α 2 s ) and included in the hard-function coefficients. The modified quantities in Eq. (3.26) arẽ where the functions g i are given in Appendix B. All derivatives of the R function are to be consistently replaced by derivatives ofR with respect toL. Notice that no constant terms are present in the radiator and therefore g i (0) = 0.
The same replacement must be consistently performed in the parton densities. In addition, it is convenient to have the latter evaluated at a common factorisation scale µ F at large v 1 , in order to match the fixed-order convention. Both steps can be implemented by expressing the parton densities f at the scale µ F e −L , and expanding out the difference between f (µ F e −L , x) and f (k t1 , x) neglecting regular terms as well as logarithmic terms beyond N 3 LL. The relevant terms in this expansion can be absorbed into a redefinition of the coefficient functions C (i) (z), thereby introducing an explicit dependence upon µ F and x Q . We obtaiñ Finally, we also approximate the strong coupling in the terms proportional to α 2 s (k t1 ) in Eq. (3.26), featuring the convolution of one and two splitting functions with the NLL luminosity, by retaining only terms relevant to N 3 LL as Summarising, the final formula that we employ in the matching to fixed order will be Eq. (3.26) with the following replacements: Moreover the coupling is treated according to Eq. (4.7) in the termsP (0) ⊗L NLL andP (0) ⊗P (0) ⊗ L NLL , and the upper bound of the k t1 integration in Eq. (3.26) is extended to infinity. The modified luminosity factors appearing in the previous equation are defined as (4.11)

Matching to fixed order
To match the above result to a fixed-order calculation we design a scheme belonging to the class of multiplicative matchings [68,69]. This, at present, is preferable to the more common additive R scheme [48], since the O(α 3 s ) constant terms of the cumulative cross section are currently unknown analytically (except for the three-loop corrections to the form factor that were computed in ref. [70][71][72]) and they can therefore be recovered numerically from our matching procedure. This ensures that our matched prediction controls all terms up to and including O(α n s ln 2n−6 (1/v)). Moreover, the multiplicative scheme has the feature of being less sensitive to numerical instabilities of the fixed-order prediction close to the infrared and collinear regions.
However, the multiplicative scheme in hadronic collisions can give rise to higher-order terms in the high-p t tail, due to the cross product of parton luminosities. These are effectively subleading and therefore they never spoil the perturbative accuracy, nevertheless they can be numerically non-negligible, especially for processes featuring large K factors like Higgs production. In order to suppress such spurious terms, we introduce a factor Z defined as where v 0 is the point at which the fixed-order is recovered, while h and u are positive parameters. h should be larger than two in order to avoid small kinks in the differential distribution. In our predictions below we set v 0 = 1/2 and h = 3, and check that the variations v 0 = 1 and h = 1, 2 do not produce sizeable differences. The parameter u will be discussed shortly. In what follows, with a slight abuse of notation, we denote by Σ(v, Φ B ) the generic exclusive cross section dΣ(v)/dΦ B . We therefore define the matched cross section as where Σ FO is the fixed-order cross section at order α n s differential in the Born kinematics, and Σ EXP is the expansion of the resummed cross section Σ RES to O(α n s ). The factor Z ensures that the resummation is smoothly turned off for v ≥ v 0 . We stress that at small v the factor Z leads to extra terms which are suppressed as (v/v 0 ) u . Therefore u can be chosen in order to make these terms arbitrarily small, although they are already very suppressed in the small-v region. In our case we simply set u = 1.
Up to N 3 LO we now express the fixed-order and the expanded cross sections as whereΣ (0) EXP (v, Φ B ) = σ (0) , and we defined σ (i) (Φ B ) = dσ (i) /dΦ B as the i-th order of the total cross section differential in the Born kinematics (4.15) With this notation, Eq. (4.13) becomes , (4.16) where terms contributing at different orders in α s are separated by an extra blank line in the above equation.
To work out the expansion, we start from the three contributions of Eq. (3.26) with the replacements discussed in Sec. 4.1. The first contribution starts with a single emission, the second features at least two emissions, and the third contributes to events with at least three emissions. The single-emission term can be worked out analytically, since the integrand is a total derivative, while the remaning terms can be expanded to O(α 3 s ) at the integrand level and integrated over the real-emission phase space. When the integrand is expanded out, one can safely set = 0 as the cancellation of all singularities is now manifest. The expanded result can be expressed as a linear combination in terms of the following three classes of integrals (we write them in terms of v 1 = k t1 /M ):

Event generation
Before presenting a phenomenological application of this formalism, we comment briefly on how Eq. (3.26) is implemented numerically using a Monte Carlo method. We follow a variant of the procedure used in refs. [41,45,52]. For the first emission we generate v 1 uniformly according to the integration measure dv 1 /v 1 J (v 1 /x Q , p), and assign it a weight in terms of the Sudakov radiator and parton luminosities. All the identical emissions belonging to the ensemble dZ[{R , k i }] are generated via a shower ordered in v i . This is done by expressing the term R (kt1) as with r being a random number extracted uniformly in the range [0, 1]. The above equation has no solution for ζ i > ζ i−1 , therefore this amounts to a shower ordered in ζ i (or, equivalently, in v i ). The procedure is stopped as soon as a ζ i < is generated. The azimuthal angles are generated uniformly in the range [0, 2π] for all emissions. Finally, the special emissions, denoted by the subscript s in Eq. (3.26), do not have an associated Sudakov suppression since their contribution is always finite in four dimensions. Therefore we generate them according to their phase-space measure and weight as they appear in the master formula. This recipe is sufficient to evaluate Eq. (3.26), and it can be implemented in a fast numerical code. We stress that it is an exact procedure, meaning that no truncation at any perturbative order is involved. The algorithm leads to the generation of an arbitrary number of emissions with ζ i > , while all unresolved emissions with ζ i < are accounted for analytically in the Sudakov radiator. This ensures that the whole singular part of the radiation phase space and all perturbative orders are treated exactly. We choose conservatively = e −20 for our tests, although we observe that a much larger value (e.g. ∼ e −7 ) can be chosen in practice given that emissions below this threshold will be very soft and/or collinear, hence improving slightly the efficiency of the event generation.
We generate Born events using the LO matrix elements and phase-space-integrator routines of MCFM [67], and we use HOPPET [73] to handle the evolution of the parton densities and the convolution with the various coefficient functions.
For each Born event we run the above algorithm to produce the initial-state radiation, and fill the histograms on the fly, thereby yielding dΣ RES (v)/dΦ B . As a byproduct, this allows us to have exclusive events with N 3 LL accuracy for the observables treated in this article. For each Born event we also generate a histogram filled with the expansion counterterm, which is computed as described in the previous section. After the generation, the two histograms are combined with the corresponding fixed-order cumulative distribution according to Eq. (4.16).
We point out that the Sudakov radiator has a singularity in correspondence of the Landau pole at 2α s (µ R )β 0L = 1 (see expressions in Appendix B). One could use different prescriptions to handle this singularity, all differing by power-suppressed terms in the perturbative expansion. We choose to set the result to zero below the singularity which, anyway, occurs at very small p t values. We stress that other schemes can be adopted, and that this choice has no consequences above the scale of the singularity.
The resummation and matching as described above are implemented in the program RadISH that can simulate the production of any colour singlet with arbitrary phase-space cuts on the Born kinematics. The code will be released in due course.

Predictions for Higgs-boson production at 13 TeV pp collisions
We now apply the method described in the previous sections to obtain the inclusive transversemomentum distribution of the Higgs boson at the LHC. We stress that the results shown in the following are to be considered as a proof of concept of our method, and a more detailed phenomenology discussion on the precise choice of the matching scheme as well as on the theory uncertainties will be the subject of a forthcoming publication.
We perform the calculation in the large-top-mass limit, and we match our N 3 LL result to the NNLO distribution that was computed in refs. [6,7,9]. In particular, here we use results obtained with the code of ref. [8] with a cut on the Higgs transverse momentum at 5 GeV. The matched distribution integrates to the inclusive N 3 LO cross section that is taken from ref. [3].
We consider 13 TeV collisions, and we use parton densities from the PDF4LHC15_nnlo_mc set [74][75][76][77][78][79]. The value of the parameter p appearing in the modified logarithmsL is chosen considering the scaling of the spectrum in the hard region, in order to make the matching to the fixed order smooth in this region. On the other hand, its value should not be too large, in order to prevent the peak of the distribution from being artificially pushed upwards due to the normalisation constraint. We therefore set p = 2 as our reference value, but nevertheless checked that the choice p = 3 induces negligible differences.
As central scales we employ µ R = µ F = m H , and x Q = Q/m H = 1/2. The perturbative uncertainty is estimated by performing a seven-scale variation of µ R , µ F by a factor of two in either direction, while keeping 1/2 < µ R /µ F < 2 and x Q = 1/2; moreover, for central µ R and µ F scales, x Q is varied around its central value in a range that we now turn to discuss. The total error is defined as the envelope of all above variations.
In the case of the transverse momentum k t1 of a colour singlet of mass M , the resummation scale Q is introduced by splitting the resummed logarithms as 20) and subsequently assuming that The latter condition is true at small k t1 , and it allows one to expand ln(M/k t1 ) about ln(Q/k t1 ), retaining only terms relevant to a given logarithmic accuracy. In this case, variations of Q give a handle to estimate the size of subleading-logarithmic terms in the region where all-order effects are important. However, in the matching region k t1 ∼ M/2, condition (4.21) is violated for k t1 Q 2 /M . In this regime, the variation of the resummation scale is physically meaningless, since the logarithmic hierarchy it is based upon is not valid at these scales. In particular, for Higgs production, a variation of Q by a factor of two around m H /2 can have a couple of drawbacks. On the one hand, for Q = m H /4, it leads to values of Q 2 /m H which are below the peak of the distribution, implying that the corresponding resummation-scale variation is technically reliable only to the left of the peak. On the other hand, for Q = m H , resummation effects are allowed to survive up to the Higgs scale, which is a fairly hard region of the phase space, where one expects to be predictive with the sole fixed-order calculation. In practice, however, in our matching procedure the resummed contribution is subtracted up to the perturbative order one is matching to, which ensures that the residual variations of Q away from the region of large logarithms induce effects that are numerically very small.
For these two reasons, we believe that a more suitable variation range is given by Q ∈ [m H /3, 3m H /4], which corresponds to a variation by a factor of 3/2 around the central value Q = m H /2. This range, that was already adopted in ref. [80], ensures that the resummation-scale variation is reliable in the peak region and that resummation effects are turned off well below the hard scale of the reaction, hence avoiding artifacts in the matched spectrum.
To study the impact of this choice, in the left panel of Figure 2 we show the comparison between the pure resummed N 3 LL normalised spectra with two uncertainty prescriptions: in the green coarse-textured band, Q is varied by a factor of two around m H /2, while the red fine-textured band involves the aforementioned reduced variation by a factor of 3/2; in both cases µ R and µ F undergo the seven-point variation described above. As expected, the choice Q ∈ [m H /3, 3m H /4] reduces the impact of the resummation-scale uncertainty in the matching region where the logarithms are not large, while leaving the uncertainty unchanged in the small-p t regime where the all-order treatment is necessary.
The right panel of Figure 2 shows the comparison between the two prescriptions for the matched N 3 LL+NLO distribution. 14 In the NLO matching, the resummed component is subtracted up to and including O(α 2 s ) terms relative to the Born. Therefore, in the region where the logarithms are moderate in size, the issues due to the large scale variation are suppressed by O(α 3 s ), and we indeed observe that the two bands differ negligibly at intermediate p t values.
We conclude that the resummation-scale variation by a factor of 3/2 still provides a wide enough variation range to probe the size of subleading-logarithmic corrections, while avoiding that some moderate resummation effects persist away from the region where the logarithms are large. We therefore adopt the modified variation in our prescription to estimate the perturbative uncertainty.  We next turn to the comparison with NNLL. The left panel of Figure 3 shows a comparison between the pure resummed predictions for the normalised spectrum at N 3 LL and NNLL. In this plot, the NNLL curve is normalised to the NLO total cross section, while the N 3 LL curve is normalised to the NNLO total cross section. The plot shows that the inclusion of the N 3 LL corrections leads to a reduction in the scale uncertainty of the resummed prediction compared to the NNLL result. 15 The right plot of Figure 3 shows the matching of the NNLL and N 3 LL predictions to NLO. Both curves are now normalised to the NNLO total cross section. We observe that at the matched level, the N 3 LL corrections amount to ∼ 10% around the peak of the spectrum, and they get slightly larger for smaller p t values ( 10 GeV). A substantial reduction of the total scale uncertainty is observed for p t 10 GeV.
We notice that, at the matched level, the impact of the N 3 LL corrections is reduced with respect to the sole resummation shown in the left plot of Figure 3. This is to a good extent due to the matching scheme that we chose here. Indeed, in a multiplicative scheme we include the O(α 2 s ) constant terms already at NNLL, although they are formally of higher-order accuracy. While these terms enter at N 3 LL, they are numerically sizeable and therefore their inclusion reduces the difference between the N 3 LL+NLO and the NNLL+NLO predictions.
To conclude this section, in Figure 4 we report the N 3 LL+NNLO prediction for the normalised distribution. The latter is compared both to NNLL+NNLO and to the pure NNLO result. All curves 14 Preliminary results at N 3 LL+NLO for this observable have been also shown at [81]. 15 An identical reduction in size is observed when varying Q by a factor of two around its central value.  in the plot are now normalised to the total N 3 LO cross section. When matched to NNLO, the N 3 LL corrections give rise to a few-percent shift of the central value with respect to the NNLL+NNLO prediction around the peak of the distributions, while they have a somewhat larger effect for p t 10 GeV. We recall that some of the N 3 LL effects are already included in the NNLL+NNLO prediction by means of the multiplicative matching scheme that we adopt here. As a consequence, this reduces the difference between the N 3 LL+NNLO and the NNLL+NNLO curves. We also observe that the matched N 3 LL and NNLL predictions are only moderately different in their theoretical-uncertainty bands. While this is of course expected in the hard region of the spectrum, we point out that, in the region p t 30 GeV, the latter feature is due (and increasingly so at smaller p t ) to numerical instabilities of the fixed-order runs with one of the scales (µ R or µ F ) set to m H /2. As we already observed at NLO, it is indeed necessary to have stable fixed-order predictions for p t < 10 GeV in order to benefit from the uncertainty reduction due to the higher-order resummation. We leave this for future work.

Conclusions
In this article we presented a formulation of the momentum-space resummation for global, recursive infrared and collinear safe observables that vanish far from the Sudakov limit because of kinematic cancellations implicit in the observable's definition. In particular, we studied the class of inclusive observables that do not depend on the rapidity of the QCD radiation. Members of this class are, among others, the transverse momentum of a heavy colour singlet and the φ * observable in Drell-Yan pair production. We obtained an all-order formula that is valid for all observables belonging to this class, and we explicitly evaluated it to N 3 LL up to effects due to the yet unknown fourloop cusp anomalous dimension. In the case of the transverse momentum of a colour singlet, we proved that our formulation is equivalent to the more common solution in impact-parameter space at this accuracy. This evidence is also supported by the numerous checks that we have documented. This equivalence allowed us to extract the ingredients necessary to compute the Sudakov radiator at N 3 LL using the recently computed B (3) coefficient [25,26]. The radiator is universal for all observables of this class [45], which can therefore be resummed to this accuracy with our approach.  The all-order result was shown to reproduce the correct power-like scaling in the small-p t limit, where the perturbative component of the coefficient of the intercept can be systematically improved by including higher-order logarithmic corrections. We implemented our results in the exclusive generator RadISH, which performs the resummation and the matching to fixed order, and allows the user to apply arbitrary kinematic cuts on the Born phase space. Although we explicitly treated the case of Higgs production, the code developed here can automatically handle any colour-singlet system. As a phenomenological application, we computed the Higgs transverse-momentum spectrum at the LHC. In comparison to the NNLL+NLO prediction, we find that N 3 LL+NLO effects are moderate in size, and lead to O(10%) corrections near the peak of the distribution and they are somewhat larger for p t 10 GeV. The scale uncertainty of the matched calculation is reduced by the inclusion of the N 3 LL corrections in the small transverse-momentum region. When matched to NNLO, the effect of the N 3 LL is pushed towards lower p t values, leading to a few percent correction to the previously known NNLL+NNLO prediction [39] around the peak, and to more sizeable effects at smaller p t values. In order to further improve the theoretical control in the small-medium transverse momentum region, it will be necessary to consider the deviations from the large-m t approximation. Recently, progress has been made in this respect by computing the NLO corrections to the top-bottom interference [12]. Higher-order effects due to the leading tower of logarithms of p t /m b were addressed in ref. [82] and were found to be moderate in size. The procedure for the inclusion of mass effects in the context of transverse-momentum resummation is a debated topic. While some prescriptions are available [83,84], further studies are necessary to estimate these effects in the logarithmic region at this level of accuracy. WB is supported by the European Research Council grant 614577 HICCUP (High Impact Cross Section Calculations for Ultimate Precision), and the work of ER is supported by a Marie Skłodowska-Curie Individual Fellowship of the European Commission's Horizon 2020 Programme under contract number 659147 PrecisionTools4LHC. PM would like to thank the Erwin Schrödinger Institute of Vienna for hospitality and support while part of this work was carried out. PT would like to thank CERN's Thoretical Physics Department for hospitality during the development of this work.
A Connection with the backward-evolution algorithm at NLL It is interesting to relate our formulation for the transverse-momentum resummation to a NLLaccurate backward-evolution algorithm [85][86][87]. We start from Eq. (2.59), that was deduced by considering only flavour-conserving real splitting kernels, for the sake of clarity. We briefly comment on the general flavour case below.
After neglecting the effect of the hard and coefficient functions, which starts at NNLL, we recast the NLL partonic cross section aŝ where 1 (c1,c2) enforces the flavour of the two parton densities to be identical to that entering the Born process, i.e. f T 1 (c1,c2) f = f c1 f c2 . At NLL order, the emission probabilities involve only treelevel splitting functions, whose coupling we evaluate in the CMW scheme, as discussed in Sec. 2.1: where K is defined in Eq. (2.21). In order to perform the inverse Mellin transform of Eq. (A.1), we observe that, when inverted into z space, each of the real-emission probabilities acts on a generic parton distribution f (x i ) as described in Section 2.3.3: We furthermore introduce the shower Sudakov form factor ∆(Q i ), that at NLL reads ∆(Q i ) = exp − As shown in the main text, in the all-order picture, the correct z ( ) bounds for each emission depend on the radiation that was emitted before it. Following the discussion of Section 2.1, however, we recall that these effects contribute beyond NLL accuracy, and therefore can be neglected in the present case. We then plug Eq. (A.1) into Eq. (2.58) and perform the inverse Mellin transform as just described, obtaining after the flavour-changing collinear emission has taken place, then it becomes quite cumbersome to determine the correct colour factor for the former. This is because coherence guarantees that a soft gluon feels the effective colour charge of the radiation at smaller angles, which now may involve combinations of different flavours. A correct solution to this problem requires to reformulate the evolution by ordering the radiation in angle. This ensures that the hard-collinear emissions contributing to the DGLAP evolution happen at last (see also the discussion in Appendix E.2 of ref. [41]), and the colour structure of the soft radiation is easily determined. It is possible to show that the backward-evolution algorithm reproduces the resulting evolution formula in that case as well, and it is therefore NLL accurate.

C Additional considerations on the small-p t scaling
In this Appendix we discuss further the analysis of the p t → 0 limit of the differential cross section carried out in Sec. 3.2. Specifically, following Ref. [13], we have made a number of approximations to derive the scaling of the integral (3.36) with Λ 2 QCD /M 2 . Such approximations, however, are too rough to capture the correct O(1) normalisation of the scaling, and in this appendix we quantify the difference from the correct result obtained by directly integrating Eq. (3.36).
For the sake of convenience, we define the quantity where d 2 Σ(v) ptdptdΦ B was obtained in Eq. (3.36). The p t → 0 limit simply corresponds to setting J 0 (p t b) = 1 + O(p 2 t ), leading to The result given in Eq.  To numerically quantify the difference between Eqs. (C.3) and (C.2), we consider the case of Z production with m Z = 91.1876 GeV. We evaluate numerically Eq. (C.2) and Eq. (C.3) and we compare them with the prediction obtained with the RadISH code. For the sake of simplicity, we use everywhere the LL expressions for R and its derivative R . To regulate the b → 0 (+∞) limits in Eq. (C.2) we set a lower (upper) limit of b 0 /m Z (b 0 /Q 0 ) in the b integral, where Q 0 is the Landau singularity of the integrand which reads Correspondingly, we integrate over k t1 in both Eqs. (C.2) and (C.3) between Q 0 and m Z and we use the same limits in the numerical results obtained with RadISH. The results are displayed in Fig. 5, where we observe that the numerical result obtained with RadISH converges in the p t → 0 limit to the result given in Eq. (C.2). Both results differ significantly from the approximated result Eq. (C.3), which overestimates the normalisation by more than a factor of three. We observe that such a difference can be also obtained by computing the ratio of Eq. (C.3) to the p t → 0 limit of the original b-space formulation. 18