Renormalization-group improved fully differential cross sections for top pair production

We extend approximate next-to-next-to-leading order results for top-pair production to include the semi-leptonic decays of top quarks in the narrow-width approximation. The new hard-scattering kernels are implemented in a fully differential parton-level Monte Carlo that allows for the study of any IR-safe observable constructed from the momenta of the decay products of the top. Our best predictions are given by approximate NNLO corrections in the production matched to a fixed order calculation with NLO corrections in both the production and decay subprocesses. Being fully differential enables us to make comparisons between approximate results derived via different (PIM and 1PI) kinematics for arbitrary distributions. These comparisons reveal that the renormalization-group framework, from which the approximate results are derived, is rather robust in the sense that applying a realistic error estimate allows us to obtain a reliable prediction with a reduced theoretical error for generic observables and analysis cuts.

An alternative approach to go beyond NLO -that is to some extent complementary to fixed-order NNLO calculations -is to apply threshold resummation and obtain renormalization-group (RG)-improved results. In this way, logarithmically-enhanced contributions related to the emission of soft gluons can be resummed. By re-expanding the resummed cross section to fixed-order in perturbation theory and assuming the logarithmicallyenhanced terms are dominant, it is also possible to obtain approximate higher-order expressions for the hard scattering kernels.
The precise definition of the soft parameter and hence the exact form of the logarithm to be resummed, depends on the kinematics which in turn depends on the particular observable that is analyzed. For the total cross section, it is possible to work in the limit of a tt pair nearly at rest. The small parameter in this case is the velocity of the top quarks, β ≡ 1 − 4m 2 t /s, where √ s is the partonic centre of mass energy and m t is the mass of the top quark. Apart from potentially large logarithms of the form α n s (ln β) m (with m ≤ 2n) relative to the Born term, there are also Coulomb contributions of the form (α s /β) n that can be resummed in this approach. This approach has been used to compute the total cross section up to next-to-next-to-leading logarithmic accuracy (NNLL) [34][35][36][37][38][39][40][41][42][43].
If one is interested in more detailed information on the final state it is advantageous to consider a different 'soft' kinematics that does not require the top pair to be nearly at rest. In pair-invariant mass (PIM) kinematics [44] the soft parameter is identified by the quantity √ s(1 − z), where z = M 2 /s and M is the invariant mass of the top pair. The soft parameter in single-particle inclusive (1PI) kinematics [45,46] is identified by the variable s 4 = (pt + k) 2 − m 2 t , where pt is the 4-momentum of the unobserved anti-top quark and k is the 4-momentum of the real radiation in the final state. The partonic threshold region is defined as a limit of the soft parameter (soft limit) which describes the emission of soft real radiation in the scattering process. In PIM the soft limit corresponds to z → 1 while in 1PI it corresponds to s 4 → 0.
Using PIM kinematics, theoretical predictions for the invariant mass distribution at NNLL have been obtained [47], whereas the transverse momentum and rapidity distributions of the (anti-)top quark have been computed at NNLL [48][49][50] applying 1PI kinematics. These results extend previous calculations obtained both in PIM and 1PI kinematics at NLL accuracy in [51][52][53][54][55][56][57]. Recently the special case of highly boosted top quarks has also been investigated in both kinematics [58][59][60]. Integrating the various differential cross sections (or using the small β approximation) also allows one to obtain the total cross section at NNLL. By re-expanding these results, approximate NNLO predictions for the total cross section were obtained prior to the completion of the full NNLO calculation. Whatever resummation is used, there will always be non-logarithmic terms that are missing in these approximate results. Since they differ by subleading terms, comparing results obtained by the various approaches is a useful tool to estimate the importance of the omitted contributions and, thereby, obtain a realistic estimate of the theoretical error. Of course, for the total cross section, the full NNLO result is now known and can be used to validate (or not) the procedure. Generally speaking, using a conservative estimate of the theoretical uncertainties, approximate NNLO predictions obtained from re-expanding RG-improved results are in reasonably good agreement with the exact NNLO result. A similar assessment of the quality of this method and the accompanying uncertainty can be made by comparing approximate NLO results to the complete NLO results for various distributions.
With this in mind, a natural next step is to generalize the RG-improved NNLL predictions for differential cross sections to a completely exclusive level. To this end, a fully differential NLO partonic Monte Carlo (MC) program, including corrections to the decay of the (anti-)top quark, can be improved, as a first step, by adding approximate NNLO corrections obtained from re-expanded RG-improved calculations. 1 In this article we present such a calculation. We have implemented PIM and 1PI kinematics and extended previous approximate NNLO results by adding the spin-correlated decay of the top quarks. Comparing 1PI and PIM implementations, that differ from each other by subleading terms, and studying a large variety of observables at various collider setups, allows us to critically assess the reliability of the method. Of course, such a calculation is not meant to be a replacement for a complete fully differential NNLO result including the decay of the top quarks. But, as we will argue, it does allow us to improve NLO predictions at the partonic level for a wide variety of observables including arbitrary cuts on the final state jets, leptons and missing energy. Unlike the total inclusive cross section, invariant-mass distributions or transverse momentum distributions, the exclusive observables our framework allows us to examine are much closer to the quantities actually measured in the experiments.
Apart from its direct phenomenological use for top pair production at hadron colliders, this work also addresses the important question on the reliability and universality of the soft gluon resummation procedure in obtaining approximate higher-order predictions for generic observables. In particular, we can study for example the transverse momentum distribution of the (anti-)top using the ('wrong') PIM kinematics and compare to the results obtained using the ('correct') 1PI kinematics. As we will see, the prominent features of resummation are rather robust, giving further assurance that the methods presented here provide us with a reasonable approximation and an improvement compared to fixed order NLO calculations.
The work presented here relies heavily on RG-improved calculations at NNLL using the Soft-Collinear Effective Theory (SCET) approach presented in Refs. [47,49]. Hence, we will start by presenting PIM and 1PI kinematics within the SCET framework in Section 2 and discuss the way in which the results of [47,49] have to be modified in order to include the decays of the top quarks. Here we also give details of our MC implementation of the approximate (N)NLO contributions obtained in this approach. In Section 3 we first critically assess the quality of the approximate results and discuss how we assign a reasonable theory error to our results. We then present our new approximate NNLO results for several phenomenologically interesting observables for the LHC (at 8 and 14 TeV) and the Teva- 1 We note that this is not an entirely new idea -using resummed results to improve fully differential results, in e + e − collisions and in Drell-Yan production, is a key concept of the Geneva MC [61]. tron for a typical cut-based analysis. Finally, in Section 4 we briefly discuss some realistic future improvements that can be made to our approach and end with some concluding remarks.

Theoretical setup
Initially, the standard technique to study RG-improved predictions for hard-scattering processes at hadron colliders has been to work in Mellin moment space. However, over the last few years an alternative method [62,63] based on SCET was developed. This formalism allows to resum, directly in momentum space, soft gluon emissions to all orders in the strong coupling constant α s . This method was successfully applied to several collider processes, and, in particular, to the pair production of heavy quarks at the Tevatron and LHC [47,49]. The formalism is rather flexible since it allows to obtain resummed predictions at the level of the differential distributions.
The SCET approach is based on the factorization of the partonic cross section in the soft limit, in which it can be written as a convolution of hard and soft functions. The hard functions contain the effects of the virtual corrections, while the soft functions describe the emission of the soft real radiation in the scattering processes. In fixed-order perturbation theory, partonic cross sections contain both regular terms and singular distributions. The singular pieces can be expressed as functions of a soft parameter which is closely related to the energy of the soft gluons emitted. The details of the soft limit are sensitive to the particular kinematic setup considered. In [47] the PIM kinematics was employed to investigate the invariant mass distribution of the top pair, while in [49] the 1PI kinematics was used to derive improved predictions for transverse momentum and rapidity distributions of the (anti-)top quark.
In [47] it was shown that after the convolution of the hard-scattering kernels with the parton luminosities, the singular terms in the threshold region generally contribute for more than 90% of the total NLO tt cross section. In the threshold region the contributions of the singular terms become large and therefore they eventually need to be resummed to all orders in perturbation theory. The enhancement of the singular terms in the cross section is known to be due to the steeply falling behaviour of the parton luminosities outside the threshold region. This effect is called dynamical threshold enhancement.
In this section we will summarize the main results of [47,49] which are important for our work and we will show how to extend these results to include the semi-leptonic decays of the top quarks. In [47,49] the authors studied the process where N 1 and N 2 are the incoming protons in the case of the LHC, or proton anti-proton in the case of the Tevatron and X is an inclusive hadronic final state. In these works the top quarks were always treated as stable on-shell partons and their decay was omitted. The NWA, where the heavy particles produced in the scattering process are allowed to decay to the physical final states whilst remaining on-shell, is the first natural step to go beyond the stable top approximation. In this limit the amplitudes describing the process factorize into a part describing the production of a pair of on-shell top quarks and parts describing the decay of an on-shell top and anti-top quark. In the NWA the radiative corrections to the process are given by the factorizable corrections to the production and decay. In this work we use the SCET approach to perturbatively improve the production subprocess beyond NLO, while keeping the accuracy of the decay subprocess at fixed NLO.
At LO in perturbation theory the two partonic channels contributing to the production subprocess in Eq. (1) are where the momenta of the incoming partons p i (i = 1, 2) are related to the momenta of the initial state hadrons via the relation p i = x i P i . We also introduce some kinematical invariants which are important for the description of the partonic scattering subprocesses:

PIM and 1PI kinematics
In this subsection we briefly summarize the main features of the PIM and 1PI kinematics setups. In the case of PIM kinematics it is convenient to introduce the quantities where the threshold region is identified by the limit z → 1. The doubly differential partonic cross section in M and in θ (the scattering angle of the top quark in the partonic centre of mass frame) takes the following form where µ f is the factorization scale and the sum is over all possible incoming partons (i, j). The functions C PIM, ij in Eq. (6) are usually called hard-scattering kernels and they can be computed perturbatively. In fact, independently of the kinematics, the kernels C ij have an expansion in powers of α s At LO in α s , the non-zero scattering kernels correspond to the quark anti-quark annihilation channel and to the gluon fusion channel, i, j ∈ {qq, gg}. At higher orders in α s one has to consider the virtual and real corrections to the Born approximation as well as the new partonic channels that open up at that order, for example gq → ttq at order α s . The hard-gluon emission and the additional production channel contribute to the NLO part of our calculation and are taken into account exactly via the matching procedure. Formally, near the partonic threshold, these contributions are suppressed by powers of the soft expansion parameter and therefore can be neglected in this region.
In [47] it was shown that in the limit of soft gluon emissions, z → 1, the hard-scattering kernels C ij factorize into a product of hard and soft functions which are matrices in colour space The hard functions H ij are obtained from the virtual corrections and they are ordinary functions of their arguments. The soft functions S PIM, ij describe the real emission of soft gluons and contain distributions which become singular in the threshold limit z → 1. At order α n s the soft functions depend on terms proportional to plus-distributions of the form and on terms proportional to the delta distribution δ(1 − z). The 1PI kinematics is commonly used to describe observables related to a single heavy particle rather than a pair. At the partonic level, 1PI kinematics allows one to write the top quark rapidity (y) and transverse momentum (p T ) distributions as As in the PIM case, in 1PI kinematics the hard-scattering kernels factorize into a product of hard and soft functions in the limit of soft gluon emissions s 4 → 0 where again H ij and S 1PI, ij are matrices in colour space. Whilst the hard functions in PIM and 1PI kinematics are identical, the soft functions are different. In particular, at order α n s the 1PI soft functions include terms proportional to singular plus-distributions which depend on

Inclusion of tree-level decays
Here we describe how we extend the formalism outlined in the previous subsection in order to include the tree-level semi-leptonic decays of the top quarks where Given that we are interested in keeping full spin-correlation effects, the spins of the top and anti-top quarks must not be summed over. Unfortunately the results available in the literature for the tt hard functions, H ij , are already summed over the spins of the external particles and do not allow for a direct inclusion of the spin-correlated top decays. For this reason the hard functions had to be computed anew keeping the spin information explicit. This was achieved using spinor-helicity methods, and in particular making use of the oneloop helicity amplitudes for the tt production process which were recently computed in [64]. Using the standard procedure for sewing together helicity amplitudes for the production of a heavy quark pair with the appropriate helicity amplitudes for the decay subprocess (and summing over the helicities of the heavy quarks), it was possible to construct the required hard functions where the spin-correlations are included. For each helicity configuration of the eight external particles, {λ} ≡ (λ 1 , . . . , λ 8 ), the amplitude can be expressed as a colour-decomposed function of eight external momenta {p i } 8 1 ≡ (p 1 , . . . , p 8 ) The (c ij I ) {a} are independent colour structures which represent orthogonal basis vectors in the space of colour-singlet amplitudes and {a} ≡ (a 1 , a 2 , a 3 , a 4 ) is the set of colour indices relative to the external coloured particles which can be in the fundamental or adjoint representation. In the NWA, when the decay is added at tree-level, the colour indices carried by the b-quarks are equal to the colour indices of the top quarks to which they are attached, therefore a 3 = a t and a 4 = at. Explicitly, the colour basis is chosen to be where the SU(N ) matrices t a satisfy the usual relations [t a , t b ] = if abc t c and {t a , t b } = d abc t c + δ ab /N . In order to construct the hard function matrices the infrared (IR) poles must be removed from the QCD amplitudes. This is done following the procedure in [65] which, in short, can be written as where the IR poles are removed by the matrices Z −1 ij ( ) [66,67]. The hard function matrices have a perturbative expansion in α s where d R = N for the qq channel and d R = N 2 − 1 for the gg channel. The matrix elements H IJ (for simplicity we drop the subscript ij labelling the production channel) can be expressed in terms of the renormalized colour-decomposed QCD amplitudes as where we have summed over all possible external helicities. After including the tree-level decays, the hard functions H ij ({p i } 8 1 , m t , µ f ) depend explicitly on the external momenta which are subject to the constraints in Eqs. (14) and (15). It would be straightforward to extend these results and compute the hard functions where the one-loop corrections to the decay are included in the NWA. This would be needed to implement some improved prediction at the decay level, but for the moment we focus on adding the decay corrections strictly at fixed order (LO/NLO). Hence the computation of these additional pieces is not needed for the present work.
Since we work in the NWA and the decay is added at fixed order, the soft functions S PIM and S 1PI are not changed by the inclusion of the tree-level top decays. Therefore, we make use of the results for the PIM and 1PI NLO soft functions which are available in the literature. The calculation of the PIM soft functions at NLO was carried out in [47] while the results for the 1PI soft functions can be found in [49].

Structure of the hard-scattering kernels
The hard H ij and soft S ij functions for the production subprocess satisfy renormalizationgroup equations (RGEs) whose precise form was derived in [47] for the PIM case and in [49] for 1PI. The relevant two-and three-loop anomalous dimensions which enter the RGEs can be found in [67][68][69]. Amongst these, the two-loop anomalous dimensions for massive partons [67] are a key ingredient of this approach. Given full knowledge of the NLO hard and soft functions, it is possible to exploit the information from the RGEs to derive fixed-order approximate O(α 2 s ) expressions for the hard-scattering kernels. The procedure to obtain approximate NNLO results from the NNLL resummation formula was described in detail in [47,49]. We do not repeat the derivation here and instead focus on the results of this procedure that are relevant for our case. In general the NNLO hard-scattering kernels have the following expansion in terms of hard and soft functions where currently only the first term on the r.h.s. of Eq. (22) is known exactly. As for the second term, only the logarithmic parts of S (2) ij are known completely and can be obtained by solving the RGE for the soft functions at O(α 2 s ). In principle, the scale-dependent parts of H (2) ij , relevant to the third term in the sum, can also be computed via RGE. However, the inclusion of these terms is problematic as this cannot be done in an unambiguous way and might introduce an artificial reduction of the scale dependence. Hence, we follow the choice made in [49] to drop these terms altogether.
In momentum space the NNLO hard-scattering kernels in Eq. (7), have the following structure in the case of PIM kinematics while in the case of 1PI kinematics we have The expressions of the coefficients multiplying the plus-distribution to O(α 2 s ), D PIM, m and D (2) 1PI, m where m = 0, . . . , 3, are known exactly. They depend on the momenta of the external particles {p i } 8 1 , on the variables z and s 4 for PIM and 1PI respectively, on the top mass m t and on the scale µ f . The coefficients of the δ-function are only partly known, with the missing terms coming from the constant pieces of the two-loop soft functions and the scale-independent parts of the two-loop virtual corrections. The former pieces are currently unknown while the situation has recently changed for the latter. Namely, the two-loop amplitudes for stable top production have become available in the form of a large numerical grid [70]. However, the results are not given in a helicity-amplitudestructure-like form, therefore are not (yet) suitable for a direct inclusion of the top decays. The functions R 1PI (s 4 , m t , µ f ) are finite in the limit z → 1 and s 4 → 0 respectively. These remainders do not include all the regular terms at O(α 4 s ) since the total regular contributions could only be obtained with a full calculation at NNLO accuracy.
Finally the hard-scattering kernels in Eqs. (23) and (24) can be integrated over the appropriate phase space to get the partonic cross sections contributions in PIM and 1PI kinematics at order O(α 4 s ) These expressions are in a form that is suitable for Monte Carlo integration.

Monte Carlo Implementation
In this subsection we briefly summarize the MC integration method and give details on the implementation of the approximate (N)NLO formulas in the MC framework. The starting point for a MC integration of a final-state phase space is the formula for the cross section where dΦ n is the usual final-state phase-space measure for n final state particles, |M| 2 is the squared matrix element for the process considered, f 1,2 are the parton distribution functions (PDFs), S is the hadronic centre-of-mass energy squared and F is the measurement function, which defines the (infrared (IR) safe) observable to be studied. The motivation to perform the above phase-space integrals numerically comes when these become prohibitively difficult to compute analytically. This can happen both when the final state consists of many particles and when one is interested in imposing conditions on the final state (i.e. when F is different from the identity, for example, to account for finite reaches of detectors, reconstructing jets out of final-state partons etc.). The most convenient way of performing such integrals numerically is via MC integration. Random numbers are generated and mapped to phase-space points {p i } n 1 which are then used to evaluate the matrix-element. This procedure yields a weight which can then be binned into histograms according to any distribution one wishes to study. The sum over all weights over the full range of possible bins of each observable is equal to the total cross section. It is clear that supplementing the phase space, Eq. (28), with conditions on the final state does not complicate the MC integration, in the sense that to achieve this it is simply necessary to throw away (not bin) the weights whose associated phase-space configuration does not satisfy the given conditions. 2 Beyond LO the numerical phase-space integration becomes a little more difficult to deal with due to the different phase spaces the virtual and real corrections live in (at NLO these are dΦ n and dΦ n+1 respectively). The integrations over these phase spaces are individually divergent, but it is well-known that divergences cancel in the sum. For NLO there exist standard methods that achieve this cancellation numerically and in our implementation of NLO corrections we have employed both the FKS [71,72] and Catani-Seymour [73,74] subtraction methods (the complete agreement of the results of the two methods is a strong check of our implementation).
The hard-scattering kernels described in Sections 2.1-2.3 can be used in Eq. (27) in a similar way as the standard LO or NLO hard-scattering matrix elements. In a MCintegration approach the only change with respect to the kernels of [47,49] being that we have now expressed them explicitly as functions of the outgoing particle momenta. We obtain approximate (N)NLO results using these hard-scattering kernels derived from both PIM and 1PI kinematics, which, apart from the kernels themselves, differ in precisely what approximations are made in the phase-space integration, dΦ n,PIM or dΦ n,1PI .

Monte Carlo and PIM
The PIM cross-section formula is where τ = (p t +pt) 2 /S and we have replaced the integration on M and cos θ in Eq. (6) with the full phase space of the tt decay products, dΦ tt . To make a connection between Eq. (29) and the general expression given in Eq. (27) we note that dΦ tt can be obtained by starting from dΦ n=3 for a t,t, g final state and integrating over the angular variables of the (soft) gluon. If the decay of the (anti-)top is to be included, dΦ tt also contains the corresponding angular integrations. For simplicity we think of the hard-scattering kernels C PIM as being multiplied by the measurement function F introduced in Eq. (27). It is important to note that in order to exactly reproduce the phase-space integration found in [47] (i.e. change the phase space from the 'full' to the 'approximate' one, dΦ tt → dΦ tt, PIM ) an additional factor of 1/z must be included in the z-integral. This comes about in the derivation of the factorization formula when two separate factors of z 1/2 have been set equal to 1, which, of course, is perfectly acceptable in the limit z → 1, in which the factorization formula is strictly true. When performing the integral using an MC method, introducing this additional factor of 1/z allows us to numerically reproduce the differential results in [47].
Using the change of variables x 1 = x, x 2 = τ /(x z) and v = z it is possible to rewrite the cross section in the form where we have explicitly extracted the integration variable v from the complete dΦ tt, PIM phase space and the remaining integrations dΦ tt [vs] now depend on the reduced partonic centre of mass energy squared vs. This way of writing the phase-space integration is perhaps more familiar for use in MCs, since this is the typical form of the standard NLO initial-state collinear contributions. We have made it clear above that the hard-scattering kernel is now evaluated using momentum configurations that depend on the variable v, which is the argument of the PIM plus-distributions of Eq. (9).

Monte Carlo and 1PI
For 1PI, the formula we have implemented has a somewhat different structure. The starting point for our MC implementation are Eqs. (42) and (43) of [49], which can be formulated in the following form once the dependence on p T (t) and y(t) is replaced by the phase space of the tt decay products, dΦ tt, 1PI . To obtain Eq. (31), a change of variables is performed by trading x 1 or x 2 in favour of s 4 . To do so we introduce the functions and a set of hadronic kinematical variables related to the partonic ones The upper bound on the s 4 integration is found to be The cross section in Eq. (31) is written as a sum of two terms (related via the exchange t 1 ↔ u 1 ) in order to preserve the symmetry of the gluon channel under the interchange y(t) → −y(t). Before performing the integrals using the MC method, it is important (in order to match our results with those of [49]), to make sure the same approximations are made in all relevant places. In [49] the choice was made to keep only the explicit dependence on s 4 non-zero in the first argument of the C 1PI -functions. This means that momenta (phase space) we generate and use to evaluate the hard-scattering kernels must be of 'Born-level' kinematics, i.e. we generate momenta satisfying respectively for the first and second term in the sum in Eq. (31) and we evaluate the hard-scattering kernels with these momenta as Note though, that where there is explicit dependence on the variable s 4 (for example, in the plus-distributions , Eq. (12)), this must of course be evaluated keeping s 4 = 0. Making the further choice of exchanging one of the kinematical variables inside dΦ tt, 1PI with the PDF momentum fraction x 2 makes it possible to explicitly extract this integration variable and rewrite the cross section in terms of the remaining phase-space integrations dΦ tt where the upper bounds on the s 4 integrations have to be expressed in terms of the new integration variables. Evaluating the 1PI cross section in Eq. (37) using MC methods, we find complete agreement with the distributions of [49]. It must be emphasised that there is significant freedom in the final form chosen to implement. As has been discussed in depth in [49], this depends heavily on where the s 4 = 0 approximation is enforced. In the MC approach this feeds into how the phase space is generated and thus how the hard-scattering kernels are evaluated. The various choices of course all differ by formally subleading, O(s 4 ), contributions which nonetheless can have significant numerical impact. For this work we have made the choice to treat the subleading corrections in a way such that we recover the differential results of [49].

Flexibility of Monte Carlo
It should be clear that Eqs. (30) and (37) enable us to obtain differential cross sections for top-quark pair production at approximate (N)NLO accuracy by using C PIM/1PI (C PIM/1PI ). Since we also wish to study the decay products of the top, the top-pair phase space is straightforwardly extended to the phase space of W + W − bb-production (in the NWA for the top quarks) for these approximate (N)NLO cross sections.
As has already been mentioned, the MC implementation naturally lends itself to imposing conditions on the final state, clustering final-state partons into jets etc. Thus, in this more flexible framework it is possible to study the predictions of PIM and 1PI under more realistic analysis setups. Strictly speaking, the PIM and 1PI results should only be used to make predictions for {M tt , cos θ} and {p T (t), y(t)} distributions, since it is only for these that the analytic formulas are formally valid. However, given that the MC approach deals with weights and momenta, it is possible to make a prediction for any distribution using the PIM and 1PI kernels. Of course, no factorization formula has been derived for any of the 'wrong' distributions we obtain or indeed any of the 'right' distributions, where analysis cuts have been imposed on the final state. However, importantly, we have strong checks on when and where to trust the RG-improved results and we will argue that a posteriori these new results are indeed sensible.

Structure of results
Given that we are working in the NWA, corrections to the production and decay of the pair of on-shell top quarks can be treated separately. The (potential) improvement to the current state-of-the-art NWA predictions [11][12][13] we explore here is the addition of an approximate NNLO correction in the production subprocess. To aid comprehension in the following section, we introduce the differential cross sections dσ xx prod and dσ xx full . The former includes the decays of top quarks at LO only, whilst the latter contains the NLO corrections to the decays. Here, xx will label the accuracy of the results, i.e. this will be either the normal fixed-order LO or NLO results or the approximate next-to-leading order (nLO) or approximate next-to-next-to leading order (nNLO) results. In detail, the sets of differential cross sections that we will examine are and Here, dσ t→l −ν lb are the standard LO and NLO differential production cross sections and partial widths. The approximate O(α 3 s ) NLO and O(α 4 s ) NNLO corrections to the production subprocess are denoted by dσ (1) tt and dσ (2) tt respectively. They can be obtained from integrating either the 1PI or PIM hard-scattering kernels. We point out that the differential cross sections, dσ  When including the NLO corrections to the top or anti-top decays as well as in the production, the width of the top ought to be treated in a way such that, when integrating over the full phase space of the top decay products, the inclusive cross section for the production of a tt-pair multiplied by a branching fraction is recovered. At NLO and nNLO we adopt the treatment detailed in [12,13] and expand the NLO top width, keeping only terms in the differential cross section that constitute a strict NLO correction. This means that in the above equations we replace Γ NLO t → Γ LO t in the prefactor and at the same time modify the Born-level differential cross section by where the factor of two comes from the fact that we have two top quarks decaying.

Checks
We have performed extensive checks on our results and MC implementations. Firstly, our newly computed hard functions for the production process are consistent with the hard functions computed in [47]. In addition, by tracing in colour space the matrix multiplications of the one-loop hard functions with the corresponding tree-level soft functions we fully reproduce the numerical results for the one-loop matrix elements, as expected. Secondly, regarding our implementation of the approximate (N)NLO kernels into a MC and the associated phase-space integration, when we are inclusive over the top decay products (i.e. no analysis cuts applied), we find complete agreement between the numbers produced by our code and n(N)LO inclusive cross sections results presented in [47,49,75]. Furthermore, again when we integrate over the top decay products, we fully reproduce the differential distributions presented in [47,49]. This is a strong check indicating that our MC implementation of the n(N)LO contributions is correct, both for PIM and 1PI kinematics. Finally, regarding the treatment of the top width, we have compared several observables for the exclusive setup of Eq. (45) with the publicly available code MCFM [13], with which we find agreement within MC errors (better than 1%) at NLO, both when NLO corrections are or are not included in the decay. This provides an excellent check of our implementation of the NLO corrections, both in production and decay as well as our treatment of the top width.

Total cross sections
We first assess the quality of the approximation by analyzing the tt total cross sections. We compare the n(N)LO vs the (N)NLO results taking into account the scale uncertainties. The NNLO cross sections are obtained using Top++ [25]. This analysis is not meant to be a systematic study of the tt total cross sections (for detailed studies we refer to [25,75]) but  Table 1   In Figure 3.1 we show a graphical representation of the cross sections reported in Table 1 including the error bands for the scale uncertainty. The red bands refer to the NLO results and the nLO results in PIM and 1PI kinematics. At nLO, both for PIM and 1PI, the central values are very close to the NLO results and are consistent with each other. The individual PIM and 1PI nLO error bands underestimate the theoretical uncertainty of the NLO results, in particular for the LHC. For this reason we choose to provide a more realistic estimate of the theoretical uncertainty by taking the envelope of the PIM and 1PI bands. We should also mention that the nLO results do not include the qg channel. Therefore a more direct comparison between nLO and NLO could be made individually for the qq and gg channels. We refer to the next subsections where this comparison is made at the level of the distributions.
The green bands refer to the NNLO results and to the nNLO results in PIM and 1PI kinematics. The PIM and 1PI nNLO central values are very close and consistent with each other independently of the collider. Unfortunately, the central values of both PIM and 1PI fall below the NNLO central cross sections. Nevertheless, at the LHC there is quite a large overlap between the NNLO lower uncertainty bands and the nNLO upper bands. At the Tevatron however the overlap is only marginal. The fact that the nNLO predictions are consistently lower than the NNLO ones is likely due to the fact that a large part of the corrections that contribute to the delta terms in Eqs. (23) and (24) is missing. In particular, as mentioned in Section 2.3, the finite contributions of the two-loop hard and soft functions are not included in our nNLO result. Comparing to the nLO predictions, where the delta terms are fully known, we observe a better agreement between nLO and NLO. When the delta contributions become available we expect that the comparison of nNLO vs NNLO will be as good as the nLO vs NLO case. The discrepancy between the approximate and full results may also partly originate from the phase space. Integrating the PIM and 1PI kernels over a full phase space instead of an approximate one could lead to a positive contribution to the cross sections. We do not investigate this second possibility here, but postpone this study to future work.
We deliberately refrain from adapting the treatment of subleading terms to optimize the agreement between the nNLO and NNLO results for the total cross section, because such an a posteriori justification for a particular treatment of subleading terms cannot necessarily be generalized to arbitrary distributions. From the comparison between the nNLO and NNLO results for the total cross section we conclude that our differential distributions are likely to be at the lower end of the NNLO predictions for the same observables. However, when the fully differential NNLO results will be available, we do expect to find a substantial overlap between the uncertainty bands of the nNLO and NNLO distributions.

Universality and uncertainties of approximate results
In this subsection we study the reliability of our method for distributions. To do so, we compare our (approximate) nLO results to full NLO results with the aim of obtaining a procedure that gives a reliable error estimate for our approximate results.
We will use the PIM and/or the 1PI kinematics to obtain approximate expressions for arbitrary observables. Thus, we have to investigate the universality of the approximate terms obtained using the two different kinematics. Furthermore, the impact of cuts that might be applied in a realistic analysis has to be considered as well.
To get an idea of the latter, we will apply a jet algorithm and require that the event has a b-jet, J b and ab-jet, Jb. As an example we have used the k T clustering algorithm with the resolution parameter set to R = 0.7, but obviously any other jet definition would be possible. All observables we study will be constructed from the momenta of the final-state objects J b , Jb and the decay products of the W bosons. We will only consider the decay W + → e + ν and W − → e −ν and sometimes assume the W bosons can be reconstructed fully from their decay products. In particular, the momentum of the reconstucted top is defined as p(t) ≡ p(W + ) + p(J b ) = p t , with an analogous expression for the reconstructed anti-top. We will also study the impact of applying some additional standard cuts on the transverse momenta p T , and transverse (missing) energies E T . The precise definition of these cuts is as follows These cuts are meant as an illustration only. They can easily be changed and, in particular, additional cuts on the rapidity can be applied.
To study the dependence of the predictions on the kinematics applied, we start by considering in the left panel of Figure 2 the distribution of the invariant mass of the reconstructed top pair M (t,t) ≡ M (W + , W − , J b , Jb). The appropriate kinematics for this observable is PIM, but we will also use the 'wrong' kinematics 1PI. We want to investigate how well the approximate nLO result reproduces the full NLO corrections to the production of the top pair. Note that the NLO corrections to the decay are not affected and, hence, will be left out in this comparison. Thus, in Figure 2 we show the comparison of the nLO results dσ nLO prod obtained with PIM (green) and 1PI (blue) to the full NLO result dσ NLO prod (red). In both cases the decay of the (anti-)top is included at LO only and we are using MSTW08NLO PDFs [76] and set µ f = µ R = m t . The main message is that the approximate results for both kinematics are in reasonably good agreement with the exact results and the application of cuts does not have a negative effect on this agreement. Actually in this case it turns out that the agreement is even better if cuts are applied (dashed curves).
As expected, for the invariant mass of the top pair the PIM results are slightly better than the 1PI results. We now repeat this comparison for the transverse momentum distribution of the reconstructed top quark p T (t) ≡ p T (W + , J b ), where 1PI is the appropriate kinematics. In the right panel of Figure 2 we show again the comparison of dσ nLO prod obtained with 1PI and PIM to dσ NLO prod . Once more, the approximations using both kinematics work well, with and without applying cuts. What is somewhat surprising is that the 'wrong' kinematics, PIM, gives better agreement with the full NLO results than the 'right' kinematics, 1PI. We have also investigated the rapidity of the top in an analogous manner. Again, both PIM and 1PI give very good approximations. If no cuts are applied, the 1PI result gives better agreement as one would expect. However PIM does slightly better than 1PI if cuts are applied.
To summarize these considerations we can state that the bulk of the NLO corrections to the production obtained using approximate results appears to be almost independent of the kinematics applied. The difference between PIM and 1PI results are of course due to the fact that different approximations have been made in the two kinematics. As for the total cross section, this difference has to be taken into account in addition to the usual scale dependence when giving a theoretical error for the approximate result for a generic observable.
In order to obtain a procedure for assigning a realistic error estimate of our approximate results, we start by considering a single partonic channel, the dominant gg initial state for the LHC at 8 TeV. Again we consider corrections to the production only and will take M (t,t) (left panel of Figure 3) and p T (t) (right panel of Figure 3) as our test observables. For the complete NLO result we use the conventional scale variation m t /2 ≤ µ ≤ 2m t and show the uncertainty as a red band in Figure 3. The upper (lower) bands are without (with) our standard cuts. The uncertainty band for our approximate results are obtained by taking the envelope of the scale variation and the variation over the implementations. These bands are shown as green bands in Figure 3.
Looking at the results shown in Figure 3 we see that for the bulk of the cross section, the bands obtained with our error estimate for the nLO results contains the error band of the full NLO result. This is the case whether or not we apply cuts. At the high-end tail of the p T distribution, the agreement between the full and approximate results gets somewhat worse. This is not surprising, as this region is outside the range of applicability of the approximations we make. However, this region will not have a large effect on generic distributions. We also note that it seems to be a common feature that the application of cuts improves the quality of the agreement between the full and approximate results.
Generally speaking, our procedure overestimates the error of the approximate results for a particular partonic channel. On the other hand, the approximate nLO results for an actual observable do not include all partonic channels. In fact, as mentioned previously, the nLO results do not include any contributions at all from the gq channels. Thus the overestimate of the uncertainty in the partonic channel gg (and qq) can be seen as a compensation for the complete omission of partonic channels that are not included in the approximate results.
We would like to stress that a reliable error estimate requires taking the envelope over the different kinematics. Simply using either PIM or 1PI, and considering only the scale variation usually underestimates the error. This is illustrated in Figure 4, where we show a comparison of the standard NLO bands with nLO bands obtained for a fixed kinematics. Contrary to Figure 3, in Figure 4 we include all partonic channels. The top panel shows the error obtained from using PIM and taking the envelope from scale variation only, whereas in the bottom panel 1PI only is used. It is clear that in particular the 1PI bands obtained solely from scale variation considerably underestimate the theoretical error. However, if we take the envelope over both kinematic implementations as well as the scale dependence, we get a more realistic indication of the uncertainty.
We have verified that this procedure works for generic observables in that the error band obtained this way has a large overlap with the standard error band from scale variation of the full NLO result. This is consistent with what we have found in Section 3.1 for the total cross section.

Differential distributions for LHC8, LHC14 and Tevatron
We will now consider generic distributions at different colliders where the decay of the tops is included at NLO and where we always apply our standard cuts Eq. (45). Our best predictions will be dσ nNLO full as defined in Eq. (43). This result contains the exact NLO corrections for the production and the decay and approximate NNLO corrections to the production. For LO, nLO and NLO results MSTW08NLO PDFs are used while MSTW08NNLO PDFs are used for the nNLO results. The band of these results (shown in green in the figures of this section) is obtained by taking the envelope of the standard scale dependence m t /2 ≤ µ ≤ 2m t and the variation over PIM and 1PI kinematics. We have seen in Section 3.1 that when using this procedure for the LHC, the total cross section gives bands that have an overlap with the full NNLO bands (obtained from standard scale variation) but are generally somewhat lower. For the Tevatron, the overlap is only marginal. Thus a certain care has to be taken when interpreting these bands as theoretical error.
To give a complete picture, we will also show the LO and NLO results dσ NLO full as light brown and red bands respectively, These bands have been obtained by varying m t /2 ≤ µ ≤ 2m t . Finally, in these plots we give an indication of the quality of the dσ nLO full approximation. We show the corresponding bands obtained in the same way as for dσ nNLO full as green dashed lines. These bands are to be compared to the red bands to assess the quality of the In Figure 5 we present our final results for our two standard variables M (W + , W − , J b , Jb) and p T (W + , J b ). As expected the scale dependence is reduced going from LO to NLO. There is also a very large overlap between the red dσ NLO full and green dashed dσ nLO full bands. Only for the large p T tail, where the approximation we make is not justified, the nLO results start to differ from the NLO results. Finally, the bands of the nNLO results, dσ nNLO full , are considerably smaller than and mostly within the NLO bands. This suggests that the perturbative expansion is under control and there are no unexpected large corrections in going from NLO to NNLO.
The picture is very much the same for other observables. As an example we consider the pseudorapidity of the reconstructed top, η(W + , J b ), in the top left panel of Figure 6. One would expect that 1PI is the appropriate kinematics for this observable, but as for the transverse momentum of the top, PIM kinematics gives very similar results. This is further evidence that the bulk of the corrections is independent of the precise details of the kinematics. We can also study generic observables that do not necessarily have a direct link to either the PIM or 1PI kinematics. For example, we show in Figure 6 the cosine of the angle between the two charged leptons, cos θ l + l − (top right panel) which is interesting in the study of angular correlations, the invariant mass of the lepton-jet system, M (J b , l + ) (lower left panel) which is a useful observable to measure the top mass, and the transverse momentum of the b-jet, p T (J b ) (lower right panel). For all these observables the general features are the same in that the nNLO bands are mostly within the NLO bands and that the nLO approximation has a very large overlap with the full NLO band.
In the upper tail of the invariant mass of the lepton-jet system, M (J b , l + ) the nNLO band is of the same size as the NLO band. This is as expected given that in the NWA this region of phase space only receives corrections from (hard) real emission corrections which are only included via our matching with the NLO. In any case, this region is not  reliably predicted since our results have been obtained with strictly on-shell top quarks and it is known that close to kinematic boundaries such as the upper edge of the M (J b , l + ) distribution, off-shell effects are sizeable [16,17].
The main features of the results are not changed at all if we consider the LHC at the larger energy of 14 TeV. To illustrate this we show some sample results for LHC14 in Figure 7 with the same analysis cuts as for LHC8. In the top two panels we show our standard observables, M (W + , W − , J b , Jb) and p T (W + , J b ). Again, applying PIM and 1PI kinematics gives very consistent results for these two observables apart from the high p T tail. This justifies the application of our method to other observables. As an example, in the lower left panel of Figure 7 we show η(l + ), the pseudorapidity of the charged lepton from the decay of the top. Once more we find a very consistent picture when comparing the various approximations and we have verified that this is that case for a large number of standard observables. We mention in passing that if we were to consider observables related to the decay products of the anti-top rather than the top quark it would be more appropriate to implement the 1PI kinematics where thet is singled out. This would only introduce a trivial change in the implementation. Of the many observables we have studied, we now consider one of the most problematic cases, where our approximation is less reliable: in the lower right panel of Figure 7 we consider H T (J b , Jb), the scalar sum of the transverse momenta of the two b-jets. There are notable differences between the nLO and NLO bands and the reduction in the error band from NLO to nNLO is less pronounced. This is an observable that combines information from the decay products of the t and thet. Hence the applicability of the 1PI kinematics could be problematic and, indeed, we have verified that it is the 1PI kinematics that leads to the enhancement in the peak region and to the undershoot in the tail region of the nLO result compared to the NLO result. If we were to take the envelope of the PIM results only, the nLO band would be perfectly consistent with the NLO band. This is in line with the general observation that the approximate results using PIM kinematics seem to lead to more consistent results. In this context we should also stress that whenever an observable is strongly affected by hard gluon radiation, our approximate results cannot be trusted.
Finally we consider distributions for the Tevatron. We have seen in Section 3.1 that in the case of the Tevatron the agreement between NNLO and nNLO for the total cross section is marginal, casting doubts on the reliability of the approximation we make. Furthermore, for the cuts we have chosen, the NLO corrections to the decay are particularly important at the Tevatron. Given this, and that we use a pure NLO calculation for the decay corrections we are not expecting any improvement beyond NLO in these circumstances. Let us illustrate this for the rapidity of the (reconstructed) top quark. In addition to the LO result, the results including the decay corrections, dσ nLO full , dσ NLO full and dσ nNLO full are shown in the left panel of Figure 8. While dσ nLO full is perfectly consistent with dσ NLO full , there is virtually no improvement in going from dσ NLO full to dσ nNLO full . This is to be contrasted with the results shown in the right panel of Figure 8, where only the corrections to the production have been taken into account. Firstly, comparing the red bands on the right panel (dσ NLO prod ) to the left panel (dσ NLO full ) we see the decisive impact of the NLO corrections to the decay. This is mainly due to the cuts imposed on the transverse momenta of the b-jets. Secondly, if only the corrections to the production are taken into account, the improvement in going from NLO (red band on right panel) to nNLO (green band on right panel) is apparent.
To sumarize, we have investigated a large number of observables for the LHC8, LHC14 and the Tevatron and the results we have found are well represented by the examples we have presented in this subsection. While the applicability of the method in its current implementation for the Tevatron is questionable, in the case of the LHC our nNLO results for distributions are very likely to include the bulk of the NNLO corrections to the production. The advantage of our results is that in addition to the corrections to the production, the NLO corrections to the decay are also included. Depending on the cuts that are applied, these corrections can be very important. At the same time, we reiterate that currently we have no improvements beyond NLO for the decay part.

Future improvements and conclusions 4.1 Possible improvements
While the generic set of differential results we have presented in the previous section paint a consistent picture as far as perturbative stability is concerned, there are clear hints that this is not the full picture and that our results can be improved in a number of ways.
Firstly, given that the nNLO inclusive cross section underestimates the full NNLO cross section, we are motivated to understand the reason behind this. Since the nLO cross section approximates the NLO cross section well, this points towards that reason being, to a large extent, due to the terms we miss from the two-loop soft and hard functions at NNLO. Once these become available it is easy to include them in our setup and we fully expect to see an improvement in the agreement of nNLO vs NNLO. As discussed in Section 2.4, the approximate results are obtained through an integration of the hard scattering kernels over an approximated phase space. The approximations made are perfectly valid in the soft limits, however it would be interesting to investigate the possibility of integrating the hard scattering kernels derived in [47,49] over the complete phase space instead of over an approximate one. Doing this could well lead to positive contributions moving the nNLO cross section closer to that of the NNLO.
Secondly, our exploration of scale variations for several distributions in Section 3.3 highlight the importance of higher-order corrections in the decay, when being exclusive in the top decay products. This is particularly evident in the Tevatron results of Figure 8. In order to obtain more reliable results and in particular a more reliable uncertainty estimate, it is clear that the accuracy to which the top decay is computed must ideally match that of the accuracy of the production subprocess. The NNLO QCD corrections to the top decay have been recently computed in [77,78] and a natural realistic next step is to include these in our framework.
Finally, in view of the increasing need to match the experimental precision on the measurement of the top quark mass parameter with an equally good theory interpretation, it is important to be able to describe as realistically as possible the spectra of the top decay products. In particular this involves relaxing the assumption of the NWA and treating the top quark as off-shell. This can straightforwardly be done, while including the nNLO corrections in the production, as an extension of the effective-theory approach of [17]. Furthermore, it is also important to study the effect on fully differential observables of using a renormalization scheme other than the pole scheme for the top mass, something which again is possible within the framework we have presented here. 3

Conclusions
In this work we have presented an extension of the approximate (N)NLO cross section for stable tt production, differential in (M and cos θ) or (p T and y) in PIM and 1PI kinematics respectively, computed in [47,49]. The spin-correlated LO decays of the top quarks have been included in a new calculation of the hard functions. Our best prediction consists of the approximate NNLO corrections to the production subprocess, matched to the full NLO results, where NLO corrections in the decay are also included in the NWA. We have implemented the approximate (N)NLO corrections in a fully differential parton-level Monte Carlo that allows us to study arbitrary (IR-safe) observables constructed out of the final state jet and lepton momenta.
The flexibility of having the approximate corrections implemented in a Monte Carlo made it possible to assess how well the approximate NLO results model the NLO for observables other than the ones they were designed for. From an examination across colliders and energies and for an example cut-based analysis, we empirically conclude that the approximate results for generic observables are largely independent of the specific softgluon kinematics on which they are built. It also appears that the hard kernels constructed in the PIM kinematics do a better job of approximating the NLO, even, surprisingly, for the case of the p T distribution where one would expect 1PI to do better.
While our results are very encouraging, in that in most cases they display good perturbative convergence, unfortunately for the inclusive cross section the approximate NNLO numbers undershoot the full NNLO. We have argued that the reason is likely due to missing terms in our approximate approach and also perhaps due to the treatment of (approximate) phase space. To account for this additional theory error in our results, we choose to take the envelope of PIM and 1PI scale uncertainty bands (as opposed to just PIM or just 1PI) and in doing so expect to find overlap with the NNLO bands once differential results become available for the latter.
Finally, we reiterate that there is a mismatch between the accuracy to which the production and decay subprocesses are each computed. This is reflected in some of the plots shown (or some regions of phase space) where the scale uncertainty does not decrease when going from NLO to nNLO. Once again, this underlines the importance of higher order corrections in the decay, when studying in detail observables sensitive to the top decay products. The inclusion of NNLO corrections to the decay, together with possible improvements for the production part mentioned above, offer a clear path for further improvements in our approach.