N-jettiness Subtractions for NNLO QCD Calculations

We present a subtraction method utilizing the N-jettiness observable, Tau_N, to perform QCD calculations for arbitrary processes at next-to-next-to-leading order (NNLO). Our method employs soft-collinear effective theory (SCET) to determine the IR singular contributions of N-jet cross sections for Tau_N ->0, and uses these to construct suitable Tau_N-subtractions. The construction is systematic and economic, due to being based on a physical observable. The resulting NNLO calculation is fully differential and in a form directly suitable for combining with resummation and parton showers. We explain in detail the application to processes with an arbitrary number of massless partons at lepton and hadron colliders together with the required external inputs in the form of QCD amplitudes and lower-order calculations. We provide explicit expressions for the Tau_N-subtractions at NLO and NNLO. The required ingredients are fully known at NLO, and at NNLO for processes with two external QCD partons. The remaining NNLO ingredient for three or more external partons can be obtained numerically with existing NNLO techniques. As an example, we employ our method to obtain the NNLO rapidity spectrum for Drell-Yan and gluon-fusion Higgs production. We discuss aspects of numerical accuracy and convergence and the practical implementation. We also discuss and comment on possible extensions, such as more-differential subtractions, necessary steps for going to N3LO, and the treatment of massive quarks.


Contents
1 Introduction The precise knowledge of QCD corrections is a key ingredient for interpreting the data from collider experiments. In hadronic collisions, the inclusive QCD cross section for the production of a final state X can, if the hard scale Q associated with X is large enough, be obtained in terms of a perturbatively calculable partonic cross section convolved with parton distribution functions (PDFs). Perturbative calculations performed using the leading order (LO) term in α s typically suffer from large theoretical uncertainties due to missing higher-order perturbative corrections. Often, next-to-leading order (NLO) is the first order at which the normalization and in some cases the shape of cross sections can be considered reliable. As such, this level of accuracy has become standard for comparing with data from the LHC. For some processes the experimental uncertainties are becoming so small, or the perturbative uncertainties at NLO are still so large, that next-to-next-to-leading order (NNLO) computations are called for.
For many important benchmark processes, the required virtual amplitudes are known at NNLO. However, as is well known, the computation of the full cross sections beyond leading order is complicated by infrared (IR) divergences -explicit divergences in virtual amplitudes, and divergences in the phase-space integration over the real-emission amplitudes in regions where particles become soft or collinear to other particles. These divergences only cancel after integrating the real-emission amplitudes over the phase space of unresolved particles and adding the result to the virtual loop amplitudes order by order.
To handle these divergences in practice one typically makes use of some subtraction method. That is, one subtracts terms from the real emission contributions that reproduce the IR soft and collinear behaviour of the real emissions, which then allows the phase-space integral of the full amplitude minus the subtraction terms to be performed numerically in d = 4 dimensions, giving a finite result. The subtracted terms have to be sufficiently simple that they can be integrated over the phase space of emitted particles in d = 4 − 2 dimensions. They are then added back to the virtual contributions, where they cancel the explicit 1/ n IR poles.
The goal of typical NLO subtraction schemes like FKS subtractions [1][2][3] or CS subtractions [4][5][6] is to construct subtraction terms that reproduce the correct IR-singular behaviour of the full real-emission amplitude point-by-point in phase space. Over the past decade enormous effort has been devoted to extend such local subtraction methods to NNLO using different approaches . This extension is very involved due to the many overlapping singularities at NNLO, which have to be isolated by appropriate phase-space parameterizations. At the same time, the subtractions have to remain simple enough that the 1/ n IR poles can be extracted from the integrated subtractions.
The basic idea of our method, which we call N -jettiness subtractions, is to use a physical jet-resolution variable T N to control the infrared behaviour of the cross section.
The key point is that, if the (factorized) structure of the leading contribution to the T Ndifferential cross section in the IR limit T N → 0 is known, the singular part can often be determined analytically and used to construct an IR subtraction term. A major advantage of using a physical observable is that the differential and integrated subtraction terms are then equivalent to the singular limits of a physical cross section, which can indeed be significantly easier to calculate than the full cross section. A well-known example of such a physical subtraction scheme is the q T -subtraction method for color-singlet production in hadron collisions [39], which has been successfully applied to a variety of processes [40][41][42][43][44][45][46][47]. (It has also been suggested that this method can be applied to compute heavy-quark pair production at NNLO [48,49].) Our N -jettiness subtraction method generalizes this to arbitrary numbers of QCD partons in the initial and final state. It employs the N -jettiness global event shape [50] as the physical N -jet resolution variable. In this paper, we limit ourselves to massless quarks; the extension to massive quarks is in principle possible and commented on in section 5.
The key feature of N -jettiness is that it has very simple factorization properties in the singular limit. The factorization theorem for the N -jettiness cross section is known [50][51][52] from soft-collinear effective theory (SCET) [53][54][55][56][57][58]. It can be used to systematically compute the leading singular contributions (thus determining the subtraction terms) by performing standard fixed-order calculations of soft and collinear matrix elements in SCET. At NLO, all necessary ingredients have been known for some time, and by now, essentially all necessary NNLO ingredients are available. For processes with hadronic initial states a key ingredient that has become available recently are the two-loop quark and gluon beam functions [59,60].
The price one has to pay for using a single physical observable to describe the IR is that the subtraction does not act point-by-point in phase space, but only on a more global level after a certain amount of phase-space integration has been carried out. In essence, the large number of terms in a fully local subtraction method are projected onto a single, nonlocal subtraction term. In practice, this means that the numerical convergence may be slower than for the fully local case. However, this is compensated by the significant reduction in complexity of the subtractions. Furthermore, as we will discuss, it is possible to make the subtractions step-by-step more local by making the N -jettiness cross section more differential in additional variables. This is again possible by using SCET to factorize and calculate the singular contributions of more differential cross sections (see e.g. refs. [52,[61][62][63][64][65]).
There are several important benefits of using a physical observable as jet resolution variable, as already emphasized in refs. [66]. It allows one to directly reuse the existing NLO calculations for the corresponding N + 1-jet cross sections, and the resulting NNLO calculation is automatically fully-differential in the Born phase space. Moreover, the calculation will be in a form which makes it directly suitable to be combined with higher-order resummation as well as parton showers by using the general methods developed in refs. [66,67].
The idea of using N -jettiness as an N -jet resolution variable is not new. In fact, this is what largely motivated its invention in the first place. For example, it is utilized for precisely this purpose in the Geneva Monte-Carlo program [67]. For color-singlet production, the N -jettiness subtraction method reduces to an analogue of q T subtractions [39] using an alternative physical resolution variable. The differential version as a subtraction was used at NLO in ref. [68]. In its simplest form as a phase-space slicing method it has been successfully applied already to calculate the top quark decay rate at NNLO [69], and while this work was being finalized to the pp → W/H + jet cross sections at NNLO [70,71].
In this work we give a general description of how N -jet resolution variables, and specifically N -jettiness, can be used as subtraction terms to compute fixed-order cross sections. In section 2, we discuss how the IR singularities in QCD cross sections are encapsulated by an N -jet resolution variable. We demonstrate that this naturally leads to subtraction terms for fixed-order calculations, and show how these can be used in phase-space slicing and as differential subtractions. In section 3, we review the definition of N -jettiness and its general factorization theorem for N -jet production. We show how the subtraction terms are defined in terms of functions in the factorization theorem. We explicitly construct the subtraction terms at NLO and NNLO for generic N -parton processes. We also discuss the extension to N 3 LO and to more differential subtractions. In section 4, we discuss how these subtractions may be implemented in parton-level Monte-Carlo programs. We also show results for Drell-Yan and gluon-fusion Higgs production at NNLO and use these as an example to discuss some of the numerical aspects. We conclude in section 5.

Notation
We denote the N -jet cross section that we want to compute by σ(X). Here, X collectively stands for all differential measurements and kinematic cuts applied at Born level. In particular, it contains the definitions of the N identified signal jets in σ(X) and all cuts required to stay away from any IR-singularities in the N -parton Born phase space.
The cross section at leading order (LO) in perturbation theory can then be written as where the measurement function X(Φ N ) implements X on an N -parton final state. The Born contribution, B N (Φ N ), is given by the square of the lowest-order amplitude, A (0) , for the process we are interested in, 1 2) where Φ N denotes the complete dependence of the amplitude on the external state (including all dependence on momentum, spin, and partonic channel). For hadronic collisions, the PDFs f a,b are included in B N (Φ N ) and Φ N also includes the corresponding momentum fractions x a,b . Correspondingly, the integral over dΦ N in eq. (2.1) includes all phase-space integrals and sums over helicities and partonic channels. For simplicity, we also absorb into it flux, symmetry, and color and spin averaging factors. We use N to denote the number of strongly-interacting partons in the final state. There can also be a number of additional nonstrongly interacting final states at Born level, which are included in Φ N but we suppress for simplicity.

Singular and nonsingular contributions
Any N -jet cross section σ(X) can also be measured differential in a generic N -jet resolution variable T N , which we write as dσ(X)/dT N . Then σ(X) may be written as dividing the more differential cross section into a region around T N = 0 and a region away from T N = 0. For T N to be an N -jet resolution variable it must satisfy the following conditions: In words, T N must be a physical IR-safe observable that resolves all additional IR-divergent real emissions, such that the cross section dσ(X)/dT N is physical and IR finite for any T N > 0, and the IR singular limit corresponds to T N → 0. 2 Hence, we have We use the convention that T N is normalized to be a dimension-one quantity, and for convenience we also define the dimensionless quantities Here, Q is a typical hard-interaction scale of the Born process (whose precise choice however is unimportant). For example, canonical choices would be Q = E cm for e + e − → jets, Q = q 2 for Drell-Yan pp → V → , Q = m H for gg → H, and Q = p jet T for pp → dijets. We define the "singular" part of the T N spectrum to contain all contributions that are singular in the T N → 0 limit, i.e., all contributions which are either proportional to δ(T N ) or that behave as ln n (τ )/τ for τ → 0. It can be written as where the L n (τ ) are the usual plus distributions, This logarithmic structure of the singular contributions directly follows from the IR singular structure of QCD amplitudes, the KLN theorem, and the fact that T N is an IR-safe physical observable. Since the infrared limit of the QCD amplitudes, and hence the IR singularities, depends only on the Born phase space, the singular coefficients C n only depend on the underlying Φ N . That is, We can therefore consider the singular distributions directly as a function of the full Φ N and independently of the specific measurement X, (2.10) In the second line, we have expanded the singular coefficients in α s . At LO, the only nonzero coefficient is C so at LO the singular spectrum reproduces the LO cross section, consistent with eq. (2.5), At NLO, the coefficients C −1,0,1 (Φ N ) are nonzero, while at NNLO, the coefficients C −1,0,1,2,3 (Φ N ) contribute. Writing the singular spectrum in terms of plus distributions as in eqs. (2.7) and (2.10) precisely encodes the cancellation between real and virtual IR divergences. The C −1 coefficient contains the finite remnant of the virtual contributions after the real-virtual cancellation has taken place. By itself, it is not unique, but depends on the boundary conditions adopted in the definition of the plus distributions, which is encoded in the choice of τ (the choice of Q). Changing the boundary conditions is equivalent to rescaling the arguments of the plus distributions according to (see e.g. ref. [72]) While this rescaling moves contributions between different C n , it does not change the overall 1/T N scaling, which implies that the sum of all terms in eq. (2.7) is unique 3 and in fact independent of the choice of Q. 4 Once the singular spectrum is written in terms of distributions as in eq. (2.7), one can easily integrate it up to T N ≤ T cut N to obtain the singular cumulative distribution (or cumulant in short) (2.14) The "nonsingular" contributions are defined as the difference between total and singular contributions, They start at O(α s ) relative to σ LO (X) (which is part of dσ sing ). By definition of the singular terms, the nonsingular spectrum contains at most integrable singularities for T N → 0, the largest terms being dσ nons (X)/dT N ∼ α n s ln 2n (τ ). Equivalently, the nonsingular cumulant behaves for T cut N → 0 as Hence, also the underlying matrix-element contributions yielding the nonsingular terms can be safely integrated in the infrared.

T N -subtractions
Up to this point, the decomposition of a cross section into singular and nonsingular terms is just notation and holds for any T N . The key point of the T N -subtraction method is that if we have analytic control of the singular T N dependence, we can turn the singular spectrum dσ sing (X)/dT N and its integral σ sing (X, T cut N ) into subtractions, as discussed next. This requires that for some N -jet resolution variable T N , the underlying coefficients C n (Φ N ) in eq. (2.10) can be determined explicitly. 5 In particular, the ability to explicitly compute C −1 (Φ N ) is precisely equivalent to being able to compute the integrated subtractions in a classical subtraction method. All these conditions are satisfied for N -jettiness, as we will discuss in section 3.

T N -slicing
If the singular contributions for a given T N are known, we can use T cut N to divide the phase space into two regions: The actual physical scales appearing together with TN in the logarithms are set by the hard Born kinematics. The reason to think of Q as a typical hard scale is that this provides the natural power suppression of the nonsingular terms. 5 They do not necessarily have to be known fully analytically, and in general they will not be. All we really need is a sufficiently fast way to compute their numerical values for given ΦN to in principle any desired accuracy.
δ IR = T δ /Q is an (in-principle) arbitrarily small IR cutoff, the singular terms will numerically dominate the nonsingular for T N < T cut N . In fact, since the nonsingular cumulant σ nons (X, T δ ) is of O(T δ /Q) = O(δ IR ), we can neglect it in this limit. Hence, we get This is precisely a phase-space slicing method, which we will call T N -slicing. Calculating σ(X) to N n LO in this way requires determining σ sing (X, T δ ) to N n LO, which includes the N n LO virtual contributions. Beyond that, since the T N spectrum only starts at O(α s ) relative to σ(X), the problem is reduced to the N n−1 LO calculation for the cross section dσ(X)/dT N for T N > T δ . Furthermore, if an N n−1 LO calculation is available, the slicing only needs to be performed for the pure N n LO terms.

Differential T N -subtractions
It is instructive to rewrite the T N -slicing in eq. (2.17) in the form of a subtraction as follows, This reorganization shows that the integral of the singular spectrum acts as a global subtraction for the integrated full spectrum, while the cumulant σ sing (X, T off ) is the corresponding contribution of the virtual terms (sitting at T N = 0) plus the integrated subtraction. The value of T off is arbitrary and exactly cancels between the first and third terms. It determines the upper limit in T N up to which the subtractions are used. The subtraction term in this case is maximally nonlocal, as it is applied after all phase-space integrations. Hence, one would naively expect the numerical cancellations to be maximally bad. This also shows that T δ really is an IR cutoff below which only the singular (subtraction) terms are used, due to limited numerical precision. Looking at eq. (2.18), we can also move the singular spectrum underneath the T N integration, which turns the singular spectrum into an actual subtraction which is local (point-bypoint) in T N . It is of course still nonlocal in the remaining real radiation phase space. To use eq. (2.19), one now has to explicitly calculate the singular differential spectrum. This requires essentially no additional effort, since the required singular coefficients are the same as in σ sing (X, T cut N ). Writing it as in the second line of eq. (2.19) shows explicitly that the numerical integral over T N now only encounters an integrable singularity for T N → 0 since the integrand is precisely the nonsingular contribution. The integral is cut off by T δ , since the integrand is still given by the difference of two diverging integrands. Finally, we note that the neglected contributions due to the numerical IR cutoff T δ are precisely the same as in eq. (2.17) for the same value of T δ . The numerical error introduced by such a cutoff is discussed in the next section.
We stress that a technical IR cutoff analogous to δ IR must exist in any numerical fixedorder calculation using subtractions, since the QCD amplitudes (and their subtractions) become arbitrarily large in the IR. Below the cutoff, the full QCD amplitudes are always approximated by the subtraction terms, so that below the cutoff only the integral of the subtraction is used, while the nonsingular cross section below the cutoff is power suppressed by δ IR and neglected.

Estimating numerical accuracy
We can judge the numerical accuracy of the T N -slicing and differential T N -subtractions using some simple scaling arguments. First, it is important to quantify the effect of the IR cutoff δ IR . Using N -jettiness as an example, at N n LO relative to the Born cross section, the most dominant singular terms in the spectrum and the cumulant are, for a given partonic channel, (2.20) Here, Γ 0 = 4 is the one-loop coefficient of the cusp anomalous dimension, C i = C F for quarks and C i = C A for gluons, and the ellipsis denote terms with fewer powers of logarithms at each order in α s . 6 Correspondingly, the leading nonsingular term in the cumulant has the form nons is not known in general, but we take C (n) nons = 1 here, which is the correct value for 2-jettiness in e + e − (i.e. thrust).
We denote the missing nonsingular contribution due to approximating the full result by the singular contributions below T N < T δ by ∆σ IR (δ IR ) and expand it in α s as The size of the dominant nonsingular terms in eq. (2.21) at τ = δ IR is indicative of the size of ∆σ IR . For the production of a color singlet X in the pp → X and pp → X + jet channels, the missing terms at NLO and NNLO scale as (plugging in the relevant color factors): qq → X : ∆σ (2.23) To estimate the impact of these terms relative to the full NLO and NNLO contributions, we write the full result for the cross section as We assume that the K-factors at each order of perturbation theory for qq → X and qq → Xg, qg → Xq processes are 10%, so σ (n) /σ LO ≈ 10 n . For gg → X and gg → Xg processes, we assume the K-factors are 30%, so that σ (n) /σ LO ≈ 30 n for these cases. These factors roughly scale like the prefactors in eq. (2.23). Hence, a rough estimate of the relative size of the missing terms at each order is given by The dependence of these corrections on δ IR is plotted in figure 1, where we take a between 1/3 and 3. The dashed line shows the known exact NLO result for thrust. This implies that when working to NNLO, we need δ IR 10 −3 − 10 −4 to have a reasonable O(10%) determination of the α 2 s NNLO contribution to the cross section. For typical applications with Q ∼ O(100 GeV) this implies that T δ 0.1 − 0.01 GeV. To the extent that the NNLO terms are only a small part of the total cross section (as is the case for Drell-Yan, for example), a larger error on the NNLO terms might be tolerable. However, we stress that these estimates can only serve as an indication, and in practice one should carefully test the size of missing corrections, for example by studying the δ IR dependence as discussed in section 4.3.
An important comment concerns the fact that it is in principle possible and straightforward (though perhaps tedious in practice) to derive subleading factorization theorems for N -jettiness and other observables using SCET. These can then be used to systematically determine the next-to-singular O(τ ) corrections and include them in the same way in the subtractions. This would substantially reduce the size of the missing nonsingular corrections by one power of δ IR . A complete factorization theorem at subleading order for a single-jet process has been derived for semileptonic heavy quark decays in ref. [73]. For recent work in this direction for thrust in e + e − see e.g. refs. [74][75][76].
A second important aspect concerns the required numerical precision in a practical implementation. For both T N -slicing and differential T N -subtractions, the full QCD and singular cross sections are probed in regions of phase space with T N T δ , where there are significant numerical enhancements due to the nearby IR singularity at T N = 0. For δ IR ∼ 10 −4 , the cancellations between the full QCD and singular T N distributions can easily reach the O(10 4 ) level and only increase as δ IR is lowered further. Getting a result at O(10 −k ) relative numerical precision in this case demands at least an O(10 −(k+4) ) relative numerical precision in the evaluation of the squared QCD amplitudes.
For T N -slicing, the numerical cancellations only happen after the T N integration, which means that in the worst case the T N integral itself may have to be carried out to the same high precision. In practice, this will strongly depend on the process and the chosen T δ , since the numerical cancellations actually happen between the two terms in eq. (2.17) rather than the the last two terms in eq. (2.18). In any case, using Monte-Carlo integration to determine the integral of the unsubtracted full result down to T N ≥ T δ very accurately requires very high statistics and good phase-space sampling. Since NLO codes are usually not designed for this purpose, this strongly limits how low T δ can be taken.
For the T N -subtractions, the QCD amplitudes in the integrand still require the same high numerical precision at small T N to obtain an accurate result for the nonsingular spectrum. However, since the cancellations now happen already at the integrand level, the T N integration itself has to be carried out only to the nominal O(10 −k ) relative precision. Hence, the statistical requirements on the Monte-Carlo integration of the nonsingular spectrum in eq. (2.19) are much more modest compared to the T N -slicing. This also means that T δ can now be taken as low as the numerical precision in the integrand allows. The main nontrivial requirement now is that one must be able to sample phase-space for fixed T N , which we discuss further in section 4.

N -jettiness Subtractions
In this section, we now specify T N to be N -jettiness and explicitly construct the N -jettiness subtractions. We first discuss the Born kinematics and the definition of N -jettiness in section 3.1. In section 3.2 we review the factorization theorem for the singular contributions in T N and how the virtual QCD amplitudes enter into it. Then in section 3.3 we explicitly write out the T N subtractions at NLO and NNLO. Finally, in section 3.4 we discuss how the subtractions can be made more differential and thereby more local.

Born kinematics
We always use the indices a and b to label the initial states, and 1, . . . , N to label the final states. Unless otherwise specified, a generic index i always runs over a, b, 1, . . . , N . We denote the momenta of the QCD partons in the Φ N Born phase space by {q a , q b ; q 1 , . . . , q N } and the parton types (including their spin/helicity if needed) by where Φ L (q) denotes the phase space for any additional nonhadronic particles in the final state, whose total momentum is q. (For ep or ee collisions, one or both of the incoming momenta are considered part of Φ L (q).) We will mostly suppress the nonhadronic final state. For us, it is only relevant because it contributes to momentum conservation in Φ N , which reads When there is no ambiguity, we will associate κ i ≡ i (e.g., we use f a ≡ f κa ), and we use the collective label κ to denote the whole partonic channel, i.e., We write the massless Born momenta q i as In particular, for the incoming momenta we have where E cm is the total (hadronic) center-of-mass energy andẑ points along the beam axis. The x a,b are the light-cone momentum fractions of the incoming partons, and momentum conservation implies The total invariant mass-squared Q 2 and rapidity Y of the Born phase space are The complete dΦ N phase-space measure corresponds to where dΦ N (...) on the right-hand side denotes the standard Lorentz-invariant N -particle phase space, the sum over κ runs over all partonic channels, and s κ is the appropriate factor to take care of symmetry, flavor and spin averaging for each partonic channel.

N -jettiness
Given an M -particle phase space point with M ≥ N , N -jettiness is defined as [50] T where i runs over a, b, 1, . . . , N . (Here we use a dimension-one definition of T N following refs. [52,62].) For ep or ee collisions, one or both of the incoming directions are absent. The Q i are normalization factors, which are explained below. The p k are the M final-state parton momenta (so excluding the nonhadronic final state) of Φ M . The q i in eq. (3.9) are massless Born "reference momenta", and the corresponding directions n i = q i /| q i | are referred to as the N -jettiness axes. For later convenience we also define the normalized vectorsq The q i are obtained by projecting a given Φ M onto a corresponding Born point Φ N (Φ M ). For this purpose, any IR safe phase-space projection can be used. That is, in any IR singular limit where Φ M → Φ N , the Born projection has to satisfŷ including the proper flavor assignments. In particular, for M = N , we simply havê there is always at least one p k that cannot be exactly aligned with any of the q i , which means that T N (Φ M ) > 0. The minimization condition in eq. (3.9) ensures that for each p k the smallest distance to one of the q i enters the sum, which together with eq. (3.11) implies that T N (Φ M → Φ N ) → 0. Hence, N -jettiness satisfies all the criteria of an IR-safe N -jet resolution variable given in eq. (2.4). Some examples of suitable Born projections are discussed in section 3.1.3 below. Although the precise procedure to define the Born projection and the q i is part of the definition of N -jettiness, it is important that it does not actually affect the singular structure of the T N -differential cross section. Different choices only differ by power-suppressed effects, as explained in ref. [50], which means the precise choice only affects the nonsingular contributions. Hence, constructing the singular contributions and the subtraction terms does not actually require one to specify the Born projection, as they are constructed in the singular limit starting from a given Φ N . 7 This fact provides considerable freedom in the practical implementation, which we will come back to in section 4.
The singular structure of T N is determined by the minimization condition in eq. (3.9) and the choice of the Q i . The minimization effectively divides the Φ M phase space into N jet regions and up to 2 beam regions, where each parton in Φ M is associated ("clustered") with the q i it is closest to, where the Q i determine the relative distance measure between the different q i . We can then rewrite eq. (3.9) as follows, where the T i N are the contributions to T N from the ith region. The Q i can be chosen depending on the Born kinematics in Φ N (subject to the constraint that the resulting distance measure remains IR safe). A variety of possible choices are discussed in detail in refs. [52,62]. An "invariant-mass" measure is obtained by choosing common Q i = Q. In this case, the sum of the invariant masses of all emissions in each region will be minimized. A class of "geometric measures" is obtained by choosing Q i proportional to E i , which makes the value of T N itself independent of the E i , i.e., where the ρ i are dimensionless numbers which determine the relative size of the different regions. In this case, the sum of the small light-cone momenta, n i · p k , of all emissions relative to their associated N -jettiness axis are minimized. The singular structure of the cross section does explicitly depend on the distance measure. When discussing the singular contributions in the next section, we will keep the Q i arbitrary, thus enabling various choices to be explored using our results. [As discussed in ref. [50], one can generalize N -jettiness further to use any IR-safe distance measure d i (p k ) in eq. (3.9), which has been used for example in the application to jet substructure [77,78]. For our purposes, the canonical form d i (p k ) =q i · p k is suited well, because the simple linear dependence on p k simplifies the theoretical analysis and computations.]

Example Born projections
To construct a generic Born projection, it suffices to use any IR-safe jet algorithm to cluster the M -parton final state into N jets with momenta P i . One can then define massless finalstate q µ i = E i n µ i by taking (i = 1, . . . , N ) where any of the choices for E i can be used. To ensure that the total transverse momentum in the Born final state adds up to zero, one can then for example boost the hadronic system or recoil the leptonic final state in the transverse direction. Finally, the initialstate momenta q a and q b , which always lie along the beam directions as in eq. (3.5), are determined by momentum conservation from eq. (3.6). When using a geometric measure as in eq. (3.13), the canonical way to determine the N -jettiness axes n i is by an overall minimization of the total value of T N . Up to NNLO the relevant cases are M = N + 1 and M = N + 2, i.e., one and two extra emissions, in which case the overall minimization to find the N -jettiness axes is still fairly easy to work out explicitly.
Let us take ρ i = 1 for simplicity and consider the case of hadron-hadron collisions, such that we have N jet axes plus the two fixed beam axes n a,b = ±ẑ. When M = N + 1, it is easy to see that N − 1 axes must be aligned with N − 1 of the p k momenta. For the last axis, there are two possibilities, and the one which gives a smaller T N is selected: Either it is aligned with one of the two remaining p k (this occurs if the last p k momentum lies close enough to one of the beam directions), or it lies along the direction of the sum of the two remaining p k . The appropriate expression for T N for M = N + 1 is then: The first term in the overall minimization corresponds to the first case above (extra emission clustered to the beam), whilst the second term corresponds to the second case (extra emission clustered to a jet). When M = N + 2 there are two extra emissions. Now, N − 2 axes will always be aligned with N − 2 of the p k momenta, and there are four possible cases how the remaining two axes can be chosen based on the remaining four p k . The appropriate expression for The first term in the overall minimization corresponds to both extra particles being clustered to a beam direction. The second term corresponds to one particle being clustered to a beam, and two particles being clustered together in a jet. The third term corresponds to clustering three particles together in a jet, and the final term corresponds to clustering two sets of two particles into two separate jets. In all cases the remaining jet directions are set by the remaining unclustered p k momenta.

Factorization theorem
We start by writing the N -jettiness singular cross section differential in Φ N and all indi- The factorization of the N -jettiness cross section in the singular limit [for the linear measures defined by eq. (3.9)] was derived in SCET in refs. [50][51][52]. It takes the form The first argument(s) of the beam, jet, and soft functions B i , J i , and S κ determine the contributions to the T i N from the respective collinear and soft sectors. The beam function B a (t a , x a , µ) contains all collinear emissions (virtual and real) from the incoming parton a, and depends on the parton's flavor κ a and light-cone momentum fraction x a . The jet function J i (s, µ) contains all collinear emissions from the outgoing parton i, and depends on the parton's flavor κ i . The soft function S κ contains all soft emissions between all partons and depends on the directionsq i . It is a matrix acting in the color space of the partonic channel κ. More precisely, it acts in the color-conserving subspace of the full color space. The hard Wilson coefficient C(Φ N , µ) is a vector in the same color space, and C † (Φ N , µ) is its conjugate (see below). It contains the QCD amplitudes for the N -parton process and depends on the full N -parton phase space Φ N .
All functions in the factorized cross sections have an explicit µ dependence (due to their nonzero anomalous dimensions). This µ dependence exactly cancels between the different functions at each order. The remaining internal µ-dependence is the usual one due to the running of α s (µ) which cancels up to the order one is working at. In the general case, the µ dependence is used to resum the logarithms of T N to all orders in α s at a given order in logarithmic counting. For our purposes, we require the strict fixed-order expansion in α s (µ) at NLO and NNLO.
We note in passing that starting at N 4 LO, the partonic QCD cross section receives a contribution from noncancelling Glauber modes in graphs with the same structure as figure 5 in ref. [79]. Such contributions are not reproduced by eq. (3.18). However this is far beyond NNLO, which is the level we are concerned about here.

QCD amplitudes and color space
The hard coefficients C(Φ N ) contain the virtual N -parton amplitudes from QCD. They formally arise as the matching coefficients from QCD onto SCET. How this matching is performed in practice for generic processes using QCD helicity amplitudes is discussed extensively in refs. [80,81] (see also refs. [82][83][84][85]). We refer the reader there for details and only summarize the features relevant for our discussion here. The important point is that when working in pure dimensional regularization with MS, the coefficients C(Φ N ) are given by the infrared-finite part, A fin , of the full N -parton QCD amplitude after UV renormalization. 8 Hence, we have where we have explicitly written out the color indices {α a , . . . , α N } of the external partons.
(All remaining dependence on external helicities and momenta are contained in Φ N .) The color indices {α i } span the full color space for the partonic channel κ. We can now pick a complete basis of color structuresT αa···α N k , which span the color-conserving subspace. (For practical purposes, the basis can be overcomplete and does not have to be orthogonal.) For example, for κ = gqq the color-conserving subspace is still one-dimensional, since the only allowed color structure isT aαβ ≡ (T a αβ ). For κ = ggqq, one choice would bē Given a basisT αa···α N k , we write the hard coefficients in this basis as This is in one-to-one correspondence to choosing a particular color decomposition for the N -parton amplitude, and so the coefficients C are directly given by the IR-finite parts of the color-ordered (or color-stripped) amplitudes. The precise form of the amplitude's color decomposition is irrelevant for our discussion and any convenient color basis can be used. The conjugate C † of the vector C is defined by where the superscript T denotes the transpose and is the matrix of color sums for the basis chosen for the partonic channel κ. The typically used color bases are not orthonormal, in which case T κ is not equal to the identity operator 1 κ and C † is not just the naive complex conjugate transpose of C. We then have (3.24)

Leading order
It is instructive to see how the LO cross section arises from eq. (3.18). At LO, we have where the LO soft function is the identity operator in color space, Plugging this back into eq. (3.18) we get

Single-differential subtractions
We now project onto the single-differential N -jettiness T N . Equations (3.17) and (3.18) yield where the single-differential soft function is the projection of the multi-differential one appearing in eq. (3.18), see eq. (A.21). We expand this singular contribution to the Njettiness cross section as [cf. eq. (2.10)] The L n (τ ) are the usual plus distributions defined in eq. (2.8). Here we explicitly denote the dependence of the subtraction coefficients C (m) n (Φ N , ξ, µ) on the renormalization scale µ. The individual coefficients also depend on the arbitrary dimension-one parameter ξ, which drops out exactly in the sum of all coefficients at each order in α s . (In section 2.2 we used ξ ≡ Q.) Finally, the coefficients also depend on the N -jettiness measures Q i , which we suppress for simplicity.
To determine the subtraction coefficients, we simply expand all the functions in the factorization theorem eq. (3.28) in terms of α s (µ), plug these back, and collect all contributions to each order in α s and each power in ln T N . Explicit results for the jet, beam, and soft functions through O(α 2 s ) are given in Appendix A.

NLO subtractions
At NLO, the differential subtractions require the subtraction coefficients C The jet function contributions in the first line effectively correspond to collinear finalstate subtractions, while the beam function contributions in the second line effectively correspond to collinear initial-state subtractions. The soft function contribution in the last line effectively corresponds to a soft subtraction. As explained in section 2, the coefficient C −1 determines the integrated subtractions plus the virtual contributions, which becomes obvious when choosing ξ = T off . At NLO, we have The first line contains the IR-finite virtual one-loop amplitudes in C (1) (Φ N ). The remaining lines effectively correspond to the integrated collinear and soft subtractions. The NLO beam, jet, and soft function coefficients, B a,n (x, µ, µ F , λ), J i,n (λ), and S One can see that the structure of the subtraction terms has a close resemblance with FKS subtractions. The important difference is that here one does not have to divide up phase space in order to individually isolate all possible IR singular regions. Instead, all the singular regions are projected onto the single variable T N . An analogous phase-space division for the soft emissions now happens in the calculation of the N -jettiness soft function. It is also important to note that there are no overlaps (i.e. double counting) between the soft and collinear subtraction terms. In principle, such overlaps can exist and must be removed, which in SCET corresponds to removing so-called zero-bin contributions [86]. A nice feature of N -jettiness is that all such overlap contributions automatically vanish in pure dimensional regularization at all orders in perturbation theory.

NNLO subtractions
For simplicity of the presentation, we define the abbreviations where we use roman letters (B, J, S, C, f) to avoid any confusion with some of the coefficients listed in Appendix A. The NNLO coefficients J (2) i,n and B (2) i,n as well as the soft function coefficients S (2) n≥0 are all known analytically, see Appendix A. The N -jettiness soft function describes how the soft radiation is split into the different N -jettiness regions. Obtaining the two-loop soft constant S (2) κ,−1 (the coefficient of the δ(k), see eq. (A.23)) is the remaining principal challenge. It is known analytically for processes with two external partons, see eqs. (A.34) and (A.35), but it is currently unknown for generic N -jet processes. It can however be determined numerically by extending the NLO calculation in ref. [52] using existing NNLO results. A procedure to do so has been outlined recently in ref. [87], where numerical results for 1-jettiness in pp collisions were presented.
We conveniently denote the genuine m-loop contributions from jet, beam, and soft functions to the C for the convolution L m ⊗ L n according to eq. (3.42).
Using this notation, we write the two-loop cross terms related to real-virtual contributions and involving the one-loop virtual amplitudes in C (1) as Finally, the cross terms with two one-loop coefficients of jet, beam, or soft functions from the associated L n ⊗ L m convolution are denoted as b,n C †(0) S (1) m C (0) .

(3.36)
With these definitions, the NNLO subtraction coefficients read The δ(T N ) coefficient C They are given in table 1 for m, n ≤ 1. Their expression for general m, n can be found in Appendix B of ref. [72].

Toward N 3 LO subtractions
Using the notation introduced in the previous subsection it is straightforward to also write down the N 3 LO N -jettiness subtraction terms. Besides the genuine three-loop terms X , where the latter is associated with the convolution L n ⊗ L m ⊗ L l . The N 3 LO subtraction coefficients then schematically take the form The cross terms in eq. (3.43) are a linear combination of the above listed X's, whose numerical coefficients can be easily worked out by evaluating the relevant convolutions among the L n≤3 distributions in analogy to the NNLO case.
For processes with only two colored external partons, so e + e − → qq, DIS, or pp → color singlet, analytic expressions for all X (3) n≥0 are in fact available. This is because the threeloop anomalous dimensions of jet and beam functions, the PDFs, and the hard function are known [88][89][90][91][92][93][94][95], which also fixes the three-loop soft anomalous dimension. This means the complete set of O(α 3 s ) logarithmic (L n ) terms of the renormalized jet, beam, and soft functions are determined by their RGE. The only coefficient that is not fully known is C

Constructing more-differential subtractions
As mentioned already, the T N -subtractions we have defined thus far are nonlocal, in the sense that all the singular regions are projected onto the single variable T N and the subtraction acts only after the corresponding phase-space integrations. It is conceivable that in order to improve the numerical stability or convergence of the NNLO calculation one might wish to use a more local subtraction -indeed, many of the available NNLO subtraction schemes utilize highly local subtraction terms.
In our approach, it is straightforward, at least conceptually, to progressively increase the locality of the subtractions. All one needs to do is split T N up into further IR-safe observables that cover the phase space and which are sensitive to emissions in different regions, and/or introduce further observables that resolve the nature of emissions, e.g. allowing one to discriminate between double-real and single-real(+virtual) emissions in a given region. The subtraction is then given by the singular cross section differential in all of these observables. In practice, this requires the relevant factorization theorem for this more-differential cross section.
Let us demonstrate how this works for a simple example. The factorization theorem in eq. (3.18) is already differential in the individual N -jettiness contributions T i N . For simplicity, we take N = 0 and consider the X + 0j NNLO cross section. In this case, 0jettiness (aka beam thrust) effectively splits the event into two hemispheres (beam regions) a and b, whose N -jettiness axes are defined by the beam directions. The total 0-jettiness is given by T 0 = T a 0 + T b 0 , where T a 0 and T b 0 are the contributions from the two hemispheres [cf. eq. (3.12) and its discussion].
Following the procedure in section 3.3 we can use the total T 0 to construct a subtraction. However, instead of taking the sum, we can also consider T a ≡ T a 0 and T b ≡ T b 0 separately, and perform the subtraction differential in both of these observables. Each of them is then sensitive to a subset of the singular regions, namely, collinear (and soft) emissions closer to beam a will only affect T a , whilst emissions closer to beam b will only affect T b . Following the logic of section 2.3, we first write down the appropriate formula for the corresponding double-differential phase-space slicing: Here, we substitute in the double-differential singular cross section when both T a and T b are below the IR cutoff T δ , which is correct up to O(δ IR ). Having either T a or T b nonzero requires at least one additional emission, so the remaining three regions only require an NLO calculation. Of course, there are singularities in the second term as T b → 0 with nonzero T a (and similar singularities in the third term when T a → 0 with nonzero T b ), but these are handled as part of the NLO calculation. Performing the slicing method using both T a and T b has no clear advantage over the slicing method using T 0 alone, as in both methods one basically removes a small region of size T δ around T 0 = 0, and handles it using the singular cross section. However, let us rewrite eq. (3.44) as a subtraction by adding and subtracting the singular cross section for the shaded region in figure 2, arranged in the following way: Figure 2. Division of the T a , T b phase space for double-differential N -jettiness subtractions.
This equation is the two-variable analogue of eq. (2.19). The parameter T off controls again where we turn off the subtraction, and the dependence on it precisely cancels between all contributions. The total cumulant in the first term contains the two-loop virtual corrections together with the corresponding integrated subtraction terms. The cross sections in the second and third terms are differential in one of the variables and integrated in the other. Since one of the variables is nonzero, while the other is integrated, they require an NLO calculation with one additional resolved emission. These terms contain all the realvirtual contributions and the singular cross section acts as the corresponding real-virtual subtraction. The fourth term involves the double-differential cross sections and since both variables are nonzero only requires a LO calculation with two resolved emissions, one in each hemisphere. The double-differential singular cross section then acts as the corresponding double-real subtraction, which is point-by-point in both T a and T b . (Contributions with two real emissions in the same hemisphere are part of the NLO calculations in the second and third terms.) Hence, by considering separately T a and T b , one is able to disentangle different real-virtual and double-real contributions and also make the subtractions more local. The price one has to pay is that the double-differential singular cross section in the last term requires the double-differential NNLO soft function, which is more complicated.
(For the beam and jet functions this requires no additional effort.) A further important point to make is that T a and T b are defined such that requiring T a > T δ or T b > T δ forces the corresponding emission to be in hemisphere a or b. At NLO, there is only one real emission, so only one out of T a and T b can be nonzero. Then, the double-differential subtraction essentially splits the T 0 -subtraction into two pieces, acting in the two hemispheres. At NNLO, this splits the real-virtual contributions into the two pieces in the second and third lines of eq. (3.45). If this is undesired, one can instead consider the two variables T min = min{T a , T b } and T max = max{T a , T b }. This effectively folds the phase space in figure 2 in half along the diagonal where T a = T b , and combines the second and third terms in eq. (3.45) into one. Now let us return to the general case with N partons in the Born process. Then there are N + 2 contributions T a N , T b N , T 1 N , . . . , T N N , and one can consider the subtraction separately in all of them. At NLO, only one of them can be nonzero, while at NNLO at most two of them can be nonzero. This means that there will be many different contributions, where in each contribution only one or two of the T i N are differential and nonzero, while all the others are integrated over. Each T i N can only be nonzero when the corresponding emission is in the ith N -jettiness region. Hence, if desired, using the individual T i N as the resolution variables automatically yields a division of phase space into different singular regions around each of the N partons, very similar to the phase-space divisions encountered in traditional local subtraction methods. On the other hand, if the proliferation of phasespace regions is undesired, one can still have the same gain at NNLO as in eq. (3.45) by considering two combinations of all T i N , e.g. the minimum and maximum nonzero T i N , or the sum of all T i N together with the sum of all but the largest T i N . Instead of or in addition to splitting T N into its different components, one can also increase the locality of the subtraction by performing it differentially in both T N and another independent N -jet resolution variable. For example, one could look into each N -jettiness region i and compute the scalar sum of transverse momenta with respect to the corresponding N -jettiness axis E T i = k∈i |p T k |, performing the subtraction also differential in the E T i . Doing so resolves part of the radiation phase space, which would otherwise be integrated over when considering only T N by itself. For the X + 0j case one could for example consider T 0 together with the transverse momentum p T of the color-singlet final state X. The relevant factorization formulae differential in T 0 and p T have been discussed and written down in refs. [63,98] (see also refs. [99,100]), and the corresponding double-differential two-loop quark beam functions have been computed in ref. [101].
We have discussed several options how to extend the single-differential N -jettiness subtractions, but of course this is not an exhaustive list. Constructing such more-differential subtractions requires the appropriate singular cross section differential in all of the chosen jet resolution variables, and in order to experience the maximum advantage in terms of convergence, these differential cross sections should reproduce the correct singular behaviour in all of the relevant singular kinematic regimes. The factorization of multi-differential cross sections in SCET accurate in all relevant kinematic regimes is a topic that has received much interest recently, see e.g. refs. [52,[61][62][63][64][65], and it would be interesting to apply this work to the issue of calculating NNLO QCD cross sections.

Practical Considerations and Implementation
In this section, we discuss in more detail how the singular cross section in eq. (3.29) can be implemented in practice as a subtraction term following our general discussion in section 2.3. We first discuss the NLO case in section 4.1, where we also highlight the similarities to FKS subtractions, and then the NNLO case in section 4.2. In section 4.3, we discuss numerical aspects using the NNLO rapidity spectrum in Drell-Yan and gluon-fusion Higgs production as an example.

FKS subtractions
In the notation of eq. (2.1), the cross section at NLO is given by where V N is the N -parton virtual one-loop contribution and B N +1 is the N + 1-parton real-emission contribution, The additional α s in B N +1 compared to B N is contained in A (0) (Φ N +1 ). As indicated in eq. (4.1), the limit → 0 can only be taken in the sum of V N and integral over B N +1 . When implementing eq. (4.1) using FKS subtractions [1-3, 102, 103], the cross section is obtained as follows: Here, the phase space is first sampled over Φ N . For a fixed Φ N point, one then further samples over the radiation phase space Φ rad , where the sum over k runs over all the different IR-singular regions. The real-emission contribution and measurement ( indicates that B N +1 is divided up between the regions in such a way that it is precisely reproduced in the sum over all regions. The phase-space mapΦ k N +1 and the subtraction terms S k N +1 are specific to each singular region. The S k N +1 are directly constructed in the singular limit, meaning they are functions of Φ N and Φ rad only, and in particular do not depend on the actual mapΦ k N +1 . In practice, there is again a tiny IR cutoff δ IR required on the Φ rad integral due to limited numerical precision and the fact that B k N +1 and S k N +1 each individually diverge. The subtracted virtual, V S N , contains the finite remainder after combining the virtual contributions with the integral of the subtractions and cancelling all 1/ IR poles.

T N -subtractions
As discussed in section 2.3, the full cross section for X at T N > 0 only requires a lower-order calculation. At NLO, we need its LO expression given by where it is obvious that this is a LO quantity.
Since the subtractions are used up to the upper cutoff T N < T off , as seen in eq. (2.19), it is most convenient to set ξ = T off in the subtraction coefficients. For the singular spectrum at T N > 0, we can simply drop C −1 and replace L n (τ ) → ln n (τ )/τ . The subtraction terms at NLO are then where for convenience we included the θ(T N < T off ) in the singular spectrum. Using the above with eq. (2.17), the T N -slicing at NLO becomes This calculation is very easy from an implementation point of view, since it boils down to performing two LO phase-space integrals. As already eluded to, the main practical limitation are the large numerical cancellations between both terms, requiring the phasespace integrals to be evaluated to very high precision. Using eq. (4.4), the differential T N -subtraction in eq. (2.19) takes the form For a numerical implementation, one must be able to solve the δ function in the dΦ N +1 integral, which amounts to being able to sample over all of Φ N +1 that gives a fixed T N (Φ N +1 ). One option to do so is to decompose Φ N +1 as is precisely the Born projection used to define T N (Φ N +1 ), see section 3.1. The Ω rad ≡ Ω rad (Φ N +1 ) contains the remaining information needed to fully specify Φ N +1 , which includes the continuous angular radiation variables as well as the discrete information about flavor, spin, and in which N -jettiness region the additional emission goes. We can then rewrite eq. (4.7) as One now samples first over Φ N and then T N . For fixed Φ N and T N , one further samples over Ω rad and evaluates the real-emission contribution at the Φ N +1 point reconstructed from all of these. Being able to reconstruct Φ N +1 (Φ N , T N , Ω rad ) is equivalent to inverting the Born projection. Recall however, that the singular contributions dσ sing (Φ N )/dT N are independent of the Born projection. Therefore, one has the freedom to specifically choose the Born projection to facilitate this inversion, making it easily possible. Note that the Ω rad integral contains a discrete sum over all N -jettiness axis/regions. If one were to separate T N into its individual components T i N as discussed in section 3.4, this sum would become explicit and the single subtraction term dσ sing (Φ N )/dT N would effectively separate into different subtraction terms for each region.
We also note the close similarity of eq. (4.9) with the FKS subtraction in eq. (4.3). Basically, T N ⊗ Ω rad now acts as Φ rad , while the split up into singular regions is now determined by the definition of T N . The LO piece C

NNLO
At NNLO, the subtraction terms are As at NLO, we have chosen ξ = T off and included the θ(T N < T off ) in the singular spectrum.
The full T N -differential cross section at T N > 0 is now needed at NLO, where it is given by In the second equation we wrote it in the form of an NLO calculation with FKS-like subtractions, analogous to eq. (4.3), where now Φ k N +2 =Φ k N +2 (Φ N +1 , Φ rad ). In general, theΦ k N +2 map used in the N + 1-jet NLO calculation will not preserve T N , that is, ). This means we have to be careful in implementing the T δ cutoff, because in order for the neglected pieces to be nonsingular, the cutoff must be applied on the true T N (Φ N +2 ). We can do this by treating the cutoff analogous to the measurement X. That is, we define the NLO calculation which is fully-differential in Φ N +1 and satisfies (4.13) Using the above together with eq. (2.17), the T N -slicing at NNLO is given by This is again quite easy to implement, only requiring a LO phase-space integral for the first term and a NLO calculation for the second term. The practical limitation is the achievable numerical precision in the NLO calculation and the Φ N +1 integral, which strongly limits how low T δ can be pushed. From eq. (2.19), the differential T N -subtraction at NNLO takes the form To implement this numerically, one must be able to compute the T N spectrum dσ(X)/dT N to NLO for a given T N , which requires to solve the δ functions in eq. (4.11). For Φ N +1 we can use the same procedure as at NLO together with the N + 1-jet NLO cross section dσ NLO (X, T δ )/dΦ N +1 , giving This can be implemented like the NLO case in eq. (4.9), with the LO B N +1 (Φ N +1 ) replaced by dσ NLO (X, T δ )/dΦ N +1 . The subtraction in eq. (4.16) is not completely local in T N , since the Φ N +2 points being integrated over in dσ NLO (X, T δ )/dΦ N +1 will generically not have the correct T N value. However, a simple phase-space mapΦ k N +2 that approximately preserves T N might be sufficient in practice.
To achieve an exact point-by-point cancellation in T N , one also has to solve the δ(T N − T N (Φ N +2 ) constraint in eq. (4.11). This requires constructing a mapΦ k N +2 for the N +1-jet NLO calculation that preserves ) and is equivalent to inverting the Born projectionΦ N (Φ N +2 ) underlying the definition of T N (Φ N +2 ). This is quite a bit more challenging than at NLO. It has been achieved in ref. [67] for a slightly modified version of T N . Assuming, we have aΦ k N +2 map like this, we can pull the T δ cut out of the NLO calculation, such that The differential T N -subtraction then becomes The subtraction is now fully localized in T N , and the only nonlocality is in the dΩ rad variables.
In a practical implementation of eq. (4.16) or eq. (4.18), the subtraction terms are easily evaluated and the most nontrivial ingredient is in fact the NLO calculation of dσ NLO (X)/dΦ N +1 , for which one can use any existing FKS-like NLO calculation or one can iterate the N -jettiness subtractions and perform it using T N +1 -subtractions. Note that in all cases above the X measurement is performed inside dσ NLO (X)/dΦ N +1 . If theΦ k N +2 map preserves X, so X(Φ k N +2 ) = X(Φ N +1 ), then it can be pulled out of the N + 1-jet NLO calculation.

Example: NNLO rapidity spectrum for Drell-Yan and Higgs
To illustrate our method with a nontrivial example, we consider the rapidity distribution of the vector boson in Drell-Yan production, pp → Z/γ → + − , and of the Higgs boson in gluon fusion, gg → H, which are known to NNLO [39,40,[104][105][106][107][108]. Since the size of the perturbative corrections in the two cases are very different, they provide very useful and complementary test cases. In both cases, 0-jettiness T 0 is the resolution variable and all of the ingredients necessary to implement the T 0 -subtractions through NNLO are known. (We use the geometric measure with ρ i = 1, see eq. (3.13), which makes T 0 identical to beam thrust.) The results are obtained for the LHC with a center-of-mass energy of 13 TeV. We always use CT10 NNLO PDFs [109]. We choose common renormalization and factorization scales, µ R = µ F = µ, with µ = m Z for Drell-Yan production and µ = m H for Higgs production. For the latter we use m H = 125 GeV and work in the top EFT limit. For the Z + 1-jet and H + 1-jet NLO calculations we use MCFM [110,111].
An important validation of the N -jettiness subtractions is to confirm that the singular T N spectrum is correctly describing the T N → 0 singularities of the full QCD result. This is done by calculating the nonsingular T N spectrum as in eq. (2.15) as the difference of the full QCD and singular T N spectra. The decomposition of the T 0 spectrum into singular and nonsingular components is shown in figures 3 and 5 for Drell-Yan and Higgs production, respectively, where we separately show the O(α s ) (NLO) and O(α 2 s ) (pure NNLO) corrections, counted relative to the LO Born cross section. (We plot the magnitudes of the contributions on a logarithmic scale, and the dips at large T 0 and around T 0 = 1 GeV are due to the spectra going through 0. The small jitters in the pure NNLO nonsingular are due to numerical inaccuracies.) One can clearly see the large numerical cancellations between the full and singular results for small T 0 , where the nonsingular spectrum is several orders of magnitude smaller than the full and singular spectra.
As shown in eq. (2.19), the nonsingular spectrum is precisely the quantity that one integrates numerically when using the differential T N -subtractions. The fact that the nonsingular only contains integrable singularities is seen in figures 3 and 5 by its smaller slope toward T 0 → 0. To check explicitly that the subtractions work and the nonsingular does not contain any 1/T N singularities, we consider the distribution dσ nons /d ln τ = τ dσ nons /dτ , which must go to 0 in the T N → 0 limit. We plot it in figure 4 for Drell-Yan and in figure 6 for Higgs production, again separately for the NLO and pure NNLO corrections, and using τ = T 0 /m Z and τ = T 0 /m H , respectively. One can see that dσ nons /d ln τ → 0 for τ → 0, as it must. The error bars come from the statistical integration uncertainties in the full result obtained from MCFM. The numerical uncertainties in the singular result are negligible in comparison.
To obtain results for the NNLO rapidity spectrum, we use the simple T 0 -slicing in eq. (4.14). As explained earlier, the missing O(δ IR ) contributions due to the T δ cutoff are the same irrespective of how the subtractions are implemented. At NLO, we use Specifically, we use MCFM to compute the NLO cross section for T 0 > T δ in the second term in eq. (4.14), in bins of Y for the processes pp → Z/γ → + − + jet and pp → H + jet. Since there are no cuts on the final-state jets other than the requirement T 0 > T δ , for small T δ the calculation probes deep into the singular region and care must be taken to obtain reliable and numerically stable results. This is combined with our own implementation of the NNLO singular cross section for T 0 < T δ , σ sing (Y, T δ ), in the first term of eq. (4.14).
The results for the rapidity spectra are shown in figures 7 and 9 for Drell-Yan and Higgs production, respectively. The two contributions from T 0 < T δ and T 0 > T δ are shown in red and green and the total result given by their sum in black. The error bars here show the scale variations up and down by a factor of two. Note that the relative size of the two contributions and the degree of cancellation between them can change significantly as the scale (or T δ value) is changed. To validate the results from T 0 -slicing method, we compare to results from Vrap [104,105] for Drell-Yan and from HNNLO [39,108] for Higgs, which are shown by the blue line and band. For both processes we find excellent agreement.
In figures 8 and 10 we show the fractional difference of the T 0 -slicing results relative to Vrap and HNNLO, respectively. At NLO with T δ = 0.03 GeV, the agreement is excellent. For Drell-Yan at NNLO, there is a small offset between the two results visible in figure 8, representing ≈ 0.1% of the total cross section. A similar offset is also present in the Higgs case, but hardly visible because the scale variations are much larger. It is due to the missing O(δ IR ) nonsingular terms for T 0 < T δ = 0.1 GeV. A smaller value of T δ would be needed to reduce this effect. The size of the missing nonsingular terms we observe here is also consistent with our estimates in section 2.3.3. Nevertheless, it is actually encouraging to see that even with the simple T 0 -slicing we are able to obtain this level of agreement. We would expect that an implementation of the differential T 0 -subtractions will allow one to use δ IR values well below 10 −4 . We conclude this discussion by noting that it is important, particularly for more complex processes, to carefully quantify the size of the neglected O(δ IR ) nonsingular contributions. In particular, as already seen in figure 1, one cannot draw any conclusions for their possible size at NNLO from knowing their size at NLO. Also, the difference in the result when varying the T δ value is not necessarily a good estimate of the absolute size of the missing nonsingular terms, because as discussed in section 2.3.3, their scaling with δ IR for δ IR → 0 is much weaker than linear. A crucial check one should perform is to plot the nonsingular distribution as in figures 4 and 6 and check its convergence toward zero.

Conclusions
Higher-order computations in QCD require the use of some subtraction technique that allows one to extract the collinear and soft phase-space divergences from the real-emission diagrams, and cancel these against the explicit divergences from the virtual loop diagrams. We explained how a subtraction scheme can be constructed using an IR safe N -jet resolution variable to control the approach to the IR-singular limit. The N -jettiness observable T N is ideally suited for this task due to its known and simple factorization properties. Our resulting N -jettiness subtraction method is similar in spirit to the q T subtraction method introduced by Catani and Grazzini for color-singlet production, but may be applied to processes with arbitrarily many colored final-state partons (plus any color-singlet final state).
In our method, the subtraction term corresponds to the appropriate fixed-order expansion of the singular N -jettiness cross section, which can be efficiently computed using SCET. In this context, SCET allows the subtraction term to be broken down into various pieces (beam, jet, and soft functions) that are easier to compute, with the beam and jet functions being reusable for processes with any number of jets. The extension beyond NNLO is possible and requires the calculation of the beam, jet, and soft functions at three-loop order.
We discussed in depth the details of the subtraction procedure, giving explicitly the equations and ingredients needed to construct the T N -subtraction terms at NLO and NNLO. The only ingredient which is not explicitly known is the µ-independent constant term of the NNLO N -jet soft function for three or more N -jettiness axes. It can however be obtained relatively straightforward with existing technology. We also discussed how the N -jettiness subtractions can be implemented in practice. To demonstrate the method and study some of its numerical aspects, we presented NNLO results for the Drell-Yan and Higgs rapidity spectra computed using 0-jettiness subtractions in its simplest form as a slicing method.
We have also suggested and discussed several different ways in which the numerical convergence of the N -jettiness subtraction method can be systematically improved. One option would be to include the leading nonsingular terms in the subtraction. These corrections are described by subleading factorization theorems for N -jettiness and SCET offers a systematic framework to compute them. Another way to improve the numerical convergence would be to make the subtraction more local, by performing the subtraction differentially in additional observables (such as p T ) and/or splitting the total N -jettiness observable into its components in the jet and beam regions. Much of the recent work in SCET on deriving factorization formulae for multi-differential cross sections can be very useful in this direction.
We only explicitly discussed the case of massless partons here. The construction of analogous T N -subtractions for processes involving massive quarks is possible with the same techniques. For m q Q, one would consider a massive quark jet with its own N -jettiness axis making use of the tools in SCET developed for the treatment of massive collinear quarks [112][113][114][115][116][117]. For Q ∼ m q , e.g. tt pair production and similar processes, an analogous approach to refs. [48,49] can be used. This amounts to treating the heavy quarks as part of the hard interaction (without its own N -jettiness axis) together with a more complicated soft function to account for soft gluon emissions from the heavy quarks. We leave further development in this direction to future work.

A Subtraction Ingredients
We write the α s expansion of the QCD beta function and the cusp and noncusp anomalous dimensions as The coefficients of the MS beta function and cusp anomalous dimensions we need are and Γ q n = C F Γ n , Γ g n = C A Γ n , Γ 0 = 4 , For the quark jet and beam functions in MS we have [92,94] For the gluon jet and beam functions in MS we have [93,95,118]

A.1 Jet function
We write the α s expansion of the quark (i = q) and gluon (i = g) jet functions as The coefficients have the form where the L n (x) are plus distributions as defined in eq. (2.8). The jet function is naturally a distribution in s/µ 2 , and this is the only µ dependence of the coefficients. Rescaling the arguments of the distributions using eq. (2.13), we have where ξ is an arbitrary dimension-one parameter, which exactly cancels between the different rescaled coefficients and that we can choose at our convenience. The J (m) i,n (λ) are the coefficients appearing in the explicit expressions for the subtraction terms in sections 3.3.1 and 3.3.2.
The jet function coefficients in eq. (A.8) read up to two loops The δ(s) pieces for the quark jet function are [119,120] and for the gluon jet function they are [93,118,121]

A.2 Beam function
The beam function is given by [51,94] where f j (x, µ F ) are the standard PDFs and I ij (t, z, µ, µ F ) are perturbative matching coefficients. Here, we have explicitly separated the µ F dependence, which cancels between the matching coefficients and the PDFs, such that the beam function is µ F independent up to higher orders in α s (µ). (Usually, one takes µ F = µ in the fixed-order beam function, since these are not really formally distinct scales.) For our purposes, the µ F dependence in the beam function determines the complete µ F factorization scale dependence in the singular fixed-order cross section, while the µ dependence contributes to the usual renormalization scale dependence. We expand the beam function matching coefficients as (A.14) The perturbative coefficients have the structure ij,n z, where the L n (x) are the plus distributions defined in eq. (2.8). The beam function is naturally a distribution in t/µ 2 . Rescaling the arguments of the distributions using eq. (2.13), we have ij,n+k (z, λ F ) ln k λ , (A. 16) where ξ is an arbitrary dimension-one parameter, which exactly cancels between the different rescaled coefficients and that we can choose at our convenience. From these coefficients we also define the corresponding beam function coefficients as The NLO coefficients have been computed in refs. [51,94,95], and are given by ij,−1 (z, λ F ) = 2I (1) ij (z) + ln λ F 2P The NNLO coefficients have been computed in refs. [59,60], and read ij,−1 (z, λ F ) = 4I  ij (z) and I (2) ij (z) as well as the splitting functions P (0) ij (z) and P (1) ij (z) and all required convolutions between them can be found in refs. [59,60] in the same notation that we use here.

A.3 Single-differential soft function
The single-differential N -jettiness soft function is related to the one of eq. (3.18), which is multi-differential in the soft contributions to the T i N , by Recall that the subscript κ encodes the information on the Born partonic channel. For the soft function, it specifies the color space of the external partons in which it acts. We expand the soft function in α s (µ) as The dimension-one parameter ξ is again arbitrary and exactly cancels between the coefficients. The coefficients S (m) κ,n ({q i }, λ) are those appearing in the explicit expressions for the subtraction terms in sections 3.3.1 and 3.3.2. In the rest of this subsection the dependence on the jet axesq i of the soft function and its anomalous dimension is always understood and we often suppress the explicit {q i } argument.
The renormalization scale dependence of the soft function is subject to the renormalization group equation derived in ref. [52], Here, ∆ ij = 1 if the partons i and j are both incoming or both outgoing and ∆ ij = 0 if one of them is incoming and the other one outgoing. The invariant is always positive with our conventions and corresponds to an angular measure between any two partons (depending on the precise choice of the Q i ). Note that Γ cusp (α s ) here has the overall color factor removed, see eqs. (A.2) and (A.4). To write the last line in eq. (A.26), we defined the abbreviations Note that for ee and ep collisions, I is always proportional to 1 κ and can be ignored, as it drops out of eq. (A.25). Similarly, for pp collisions it can be ignored for 0-jet and 1-jet processes where the color space is still trivial. Up to two loops the noncusp soft anomalous dimension is given by The I 0 and I 1 are finite phase-space integrals, which are required for three or more Njettiness axes. They are not known fully analytically, but can be evaluated numerically for a given set {q i }. Their explicit expressions and an algorithm to reduce them to simple one-dimensional numerical integrals for arbitrary N is provided in ref. [52]. With only three N -jettiness axes, the integrals are still planar. Starting from four axes, the angles φ lm also enter, which are the azimuthal angles between theq m andq l axes in the plane transverse to theq i andq j axes. Iteratively solving the RGE in eq. (A.25), we obtain the two-loop coefficients For two external partons, κ = qq and κ = gg, the result for the two-loop constant is known analytically [122][123][124] and does not depend on the whether the partons are incoming or outgoing, i.e., it is the same for 0 → qq, q → q, and qq → 0, and similarly for two gluons [125], The two-loop constants S (2) κ (κ = ggg, qqg) required e.g. for 1-jettiness in pp collisions, have recently been computed numerically in ref. [87]. The two-loop constant for arbitrary N -jet processes can in principle be obtained numerically from known results for two-loop soft amplitudes as outlined in ref. [87].