Top-quark pair production at the LHC: Fully differential QCD predictions at NNLO

We report on a new fully differential calculation of the next-to-next-to-leading-order (NNLO) QCD radiative corrections to the production of top-quark pairs at hadron colliders. The calculation is performed by using the $q_T$ subtraction formalism to handle and cancel infrared singularities in real and virtual contributions. The computation is implemented in the Matrix framework, thereby allowing us to efficiently compute arbitrary infrared-safe observables for stable top quarks. We present NNLO predictions for several single- and double-differential kinematical distributions in $pp$ collisions at the centre-of-mass energy $\sqrt{s}=13$ TeV, and we compare them with recent LHC data by the CMS collaboration.


Introduction
The production of top quarks at high-energy colliders is a process of utmost importance, both in testing the validity of the Standard Model (SM) and in the quest for new physics. Within the SM, the main source of top-quark events in hadronic collisions is top-quark pair (tt) production. The large data set delivered by the CERN LHC enables precise measurements of the tt production cross section as a function of the tt kinematics (see e.g. Refs. [1][2][3][4][5][6][7][8][9][10]), which can be compared with the SM predictions. At the same time these studies have a wider relevance, since tt production is a crucial background in many new-physics searches.
The calculation of the next-to-next-to-leading-order (NNLO) QCD corrections to the tt total cross section was completed a few years ago [35][36][37][38]. NNLO results for some differential distributions were presented in Refs. [39][40][41]. This calculation was recently combined with NLO EW corrections [42]. The tt charge asymmetry is known at NLO [43] and NNLO [44] in QCD, and also including NLO EW corrections [45]. First NNLO QCD results including top-quark decays are starting to appear [46].
In the present paper we deal with on-shell tt production in NNLO QCD. The calculation of the tt production cross section at this perturbative order requires tree-level contributions with two additional final-state partons, one-loop contributions with one additional parton and purely virtual contributions. The required tree-level and one-loop scattering amplitudes are known. They enter the NLO calculation of the associated production of a tt pair and one jet [47,48], but in the case of NNLO tt production they need to be accurately evaluated also in the infraredsingular regions where the jet becomes unresolved. The purely virtual contributions entail the square of one-loop scattering amplitudes and the two-loop scattering amplitudes. The squared one-loop amplitudes are known [49][50][51]. The complete computation of the two-loop amplitudes has been carried out numerically [52,53]. Partial results for these amplitudes are available in analytic form [54][55][56][57]. Recent progress in the computation of non-planar two-loop master integrals [58,59] indicates that the analytic calculation can be completed in the near future.
The implementation of the various scattering amplitudes in a (fully differential) NNLO calculation is definitely a non-trivial task because of the presence of infrared (IR) divergences at intermediate stages of the calculation. Various methods have been proposed and used to overcome these difficulties at the NNLO level (the interested reader can consult the list of references in Ref. [60]).
In a recent paper [69] we have presented a new calculation of the inclusive tt production cross section in NNLO QCD. This calculation completes a previous computation that was limited to the flavour off-diagonal partonic channels [70]. The calculation uses the q T subtraction formalism [71] to handle and cancel IR-singular contributions in real and virtual corrections, and it is now completely integrated into the Matrix framework [72]. This allows us to perform fast and efficient computations of fiducial cross sections and (multi-)differential kinematical distributions for the production of on-shell top quarks.
In the present paper we extend the results of Ref. [69] in various respects. We present NNLO QCD predictions for several differential distributions in the transverse momenta and rapidities of the top quarks, as well as in the invariant mass and the rapidity of the tt system, and we discuss the results obtained by using different scale choices. We compare these results with CMS measurements in the lepton+jets channel at the centre-of-mass energy √ s = 13 TeV [9]. We then consider double-differential distributions and compare our results with the corresponding measurements by CMS [9].
The paper is organized as follows. In Section 2 we illustrate the framework in which the calculation is performed. In Section 3 we present results for single-differential and doubledifferential distributions, and we compare them with the experimental measurements. Finally, in Section 4 we present our conclusions. In Appendix A we present a quantitative comparison of our NNLO differential results with those available in the literature.

Calculation within the MATRIX framework
Our fully differential NNLO computation of tt production is carried out within the Matrix [72] framework. Matrix features a completely automated implementation of the q T subtraction formalism [71] to compute NNLO corrections, and it is thus applicable to the production of an arbitrary set of colourless final-state particles in hadronic collisions [73], as long as the twoloop virtual corrections to the corresponding leading-order (LO) process are provided. With appropriate modifications of the NNLO subtraction counterterm and the explicit computation of additional soft contributions (see below), Matrix can now also deal with the production of heavy-quark pairs.
According to the q T subtraction method, the NNLO differential cross section dσ tt NNLO for the production process pp → tt + X can be written as where dσ tt+jet NLO is the tt+jet cross section at NLO accuracy.
The square bracket term of Eq. (1) is IR finite in the limit in which the transverse momentum of the tt pair, q T , vanishes. However, the individual contributions dσ tt+jet NLO and dσ tt, CT NNLO are separately divergent. The contribution dσ tt+jet NLO can be evaluated with any available NLO method to handle and cancel IR divergences. The IR subtraction counterterm dσ tt, CT NNLO is obtained from the NNLO perturbative expansion (see e.g. Refs. [70,74,75]) of the resummation formula of the logarithmically-enhanced contributions to the q T distribution of the tt pair [76][77][78]: the explicit form of dσ tt, CT NNLO is fully known. To complete the NNLO calculation, the second-order functions H tt NNLO in Eq. (1) are needed. These functions embody process-independent and process-dependent contributions. The process-independent contributions to H tt NNLO are analogous to those entering Higgs boson [71] and vector-boson [79] production, and they are explicitly known [73,[80][81][82][83]. Since in tt production both the gg and the qq partonic channels contribute at the same perturbative order, all these process-independent contributions are required. In the flavour off-diagonal channels the process-dependent contributions to H tt NNLO involve only amplitudes of the partonic processes qq → tt and gg → tt up to the one-loop level, and the explicit results on the NLO azimuthal-correlation terms in the transverse-momentum resummation formalism [78]. The computation of H tt NNLO in the flavour diagonal qq and gg channels additionally requires the two-loop amplitudes for qq → tt and gg → tt, and the evaluation of new contributions of purely soft origin. The two-loop amplitudes are available in a numerical form [53], and the corresponding grids have been implemented into Matrix through a suitable interpolation routine. The computation of the additional soft contributions has been completed by some of us [84] 1 , and it has been implemented into Matrix as well.
The core of the Matrix framework is the Monte Carlo program Munich 2 , which includes a fully automated implementation of the dipole-subtraction method for massless [86,87] and massive [88] partons, and an efficient phase-space integration. All the required (spin-and colour-correlated) tree-level and one-loop (squared) amplitudes are obtained by using Open-Loops 2 [89,90], except for the four-parton tree-level colour correlations that are based on an analytic implementation. OpenLoops 2 relies on its new on-the-fly tensor reduction [91] that guarantees stability all over the phase space, especially in the IR-singular regions, while scalar integrals from Collier [92,93] are used. To the purpose of validating our results for the real-virtual corrections, we have also used the independent matrix-element generator Recola [94,95], which employs tensor reduction and scalar integrals from Collier, and we find complete agreement.
The subtraction in the square brackets of Eq. (1) is not local, but the cross section is formally finite in the limit q T → 0. In practice, a technical cut on q T is introduced to render dσ tt+jet (N)LO and dσ CT (N)NLO separately finite. Therefore, in our actual implementation, the q T subtraction method is very similar to a phase-space slicing method. It turns out that a cut, r cut , on the dimensionless quantity r = q T /m tt (m tt denotes the invariant mass of the tt pair) is more convenient from a practical point of view. The absence of any residual logarithmic dependence on r cut is a strong evidence of the correctness of the computation, since any mismatch between the contributions would result in a divergence of the cross section in the limit r cut → 0. The remaining power-suppressed contributions vanish in that limit, and they can be controlled by monitoring the r cut dependence of the cross section.
The r cut → 0 extrapolation for the total cross section is carried out by using the approach introduced in Ref. [72]. A quadratic least χ 2 fit to the r cut dependent results is performed and repeated by varying the upper bound of the r cut interval. Finally, the result with the lowest χ 2 /degrees-of-freedom value is taken as the best fit, while the remaining results are used to estimate the extrapolation uncertainty. In addition to this analysis at the level of the total cross section, we have performed a similar bin-wise extrapolation in the computation of differential cross sections. We find that the results are in good agreement with those obtained by directly using a sufficiently low value of r cut (r cut 0.15%).

Results
To present our quantitative results, we consider pp collisions at √ s = 13 TeV, and we fix the pole mass m t of the top quark to the value m t = 173.3 GeV. We consider n F = 5 massless quark flavours, and we use the corresponding NNPDF31 [96] sets of parton distribution functions (PDFs) with α S (m Z ) = 0.118. In particular, N n LO (with n = 0, 1, 2) predictions are obtained by using PDFs at the corresponding perturbative order and the evolution of α S at (n + 1)-loop order, as provided by the PDF set.
QCD scale uncertainties are estimated through the customary procedure of independently varying the renormalization (µ R ) and factorization (µ F ) scales by a factor of two around their common central value µ 0 with the constraint 0.5 ≤ µ F /µ R ≤ 2, i.e. we use the standard 7-point scale variation.
We note that the LO and NLO results are not fully consistent within the corresponding uncertainties, indicating that, at least at LO, scale variations cannot be trusted as perturbative uncertainties. Similar features are shared by various other hard-scattering processes at hadron colliders. In contrast, the NLO and NNLO predictions are consistent, suggesting that scale variations can be used to estimate the size of perturbative contributions beyond NNLO.
The characteristic hard-scattering scale that controls the perturbative QCD behaviour of the total cross section σ tt is m t . In our calculation of σ tt , as reported in Eq. (2), we have used QCD scales (µ R and µ F ) at values of the order of m t . Differential cross sections are controlled by corresponding characteristic hard scales, and we use QCD scales of that order in our computation of these observables. The characteristic hard scale specifically depends on the differential cross section under consideration.
Having at our disposal a fully differential calculation we can also use dynamical QCD scales.
By dynamical we mean hard scales that refer to multi-differential cross sections eventually integrated over the phase space to obtain the specific differential cross section under consideration. The use of a dynamical scale produces practical simplifications since it allows us to compute several observables (e.g., differential cross sections) simultaneously, without changing the QCD scales on an observable-dependent basis. In practice, we use dynamical scales that are expected to be "effectively similar" to characteristic hard scales. Moreover, the study of dynamical scales is of interest independently of how we use them.
The default dynamical scale that we use throughout the paper is set to the central value µ 0 = H T /2, where H T is the sum of the transverse masses of the top and antitop quarks, with and p T,t and p T,t are the transverse momenta of the top and the antitop quark, respectively. We present differential cross sections that are obtained by using µ 0 = H T /2 and values of µ 0 of the order of the characteristic hard scale for that cross section. We also show results obtained by using central scales that are lowered by a factor of 1/2. A reduced central scale, such as H T /4, was considered in the studies of Ref. [97] on the basis of features of fastest perturbative convergence of some observables, and it was also already suggested in Ref. [21].
We have chosen the dynamical scale H T since it is expected to be parametrically of the same order as the characteristic hard scale of the observables that we examine in this paper. This a priori expectation is based on the kinematical features of these observables and on the general dynamical features of tt production. In the following paragraph we briefly comment about this. Independently of the expectation, throughout the paper we comment on the actual quantitative results that we obtain by using different QCD scales.
Owing to dynamics, the typical size of both p T,t and p T,t is of the order of m t (see, e.g., Figs. 1-3 and 7). Therefore, in the case of observables that are inclusive over p T,t and p T,t , such as the total cross section and the pair rapidity distribution in Fig. 5, H T /2 turns out to be of the same order as m t , which is the characteristic hard scale for these observables. Analogously, since p T,t ∼ p T,t , H T /2 turns out to be of the same order as the transverse masses, which are the characteristic hard scales for the differential cross sections in Figs. 1-3 and 7. The invariant mass m tt of the tt pair is the characteristic hard scale in the case of the differential cross sections in Figs. 4, 6 and 8. The invariant mass is of the same order as H T with the exception of the kinematical subregions where the transverse momentum p T,tt of the pair or the rapidity separation |y t − yt| between the top and the antitop quark are large. However, these subregions are dynamically suppressed, and therefore they give a minor contribution to the inclusive (over p T,tt and |y t − yt|) cross sections in Figs. 4, 6 and 8.
Our numerical results for differential cross sections are compared with the measurements of the CMS collaboration [9] (the data correspond to an integrated luminosity of 35.8 fb −1 ) in the lepton+jets channel at parton level. The extrapolation from particle to parton level is carried out by the CMS collaboration in the inclusive phase space, and therefore no kinematical cuts    are applied to obtain our theoretical predictions. To perform the comparison, our results are multiplied by the factor 0.292, which corresponds to the value 0.438 [98] of the semileptonic decay fraction of the tt pair, multiplied by a factor of 2/3 since Ref. [9] considers only the decay into electrons and muons.
In Ref. [9] the CMS data for single-and double-differential distributions are compared to theoretical results obtained with the NLO Monte Carlo event generators POWHEG [99][100][101], interfaced either to PYTHIA8 [102] or to HERWIG++ [103], and MG5 aMC@NLO [104] interfaced to PYTHIA8 [102] (using the FxFx method [105] to deal with multijet merging). In addition, some of the measured parton-level single-differential distributions, namely the transverse-momentum and rapidity distributions of the leptonically and hadronically decaying top quark and the invariant-mass and rapidity distribution of the tt pair, are also compared to the NNLO QCD+NLO EW results of Ref. [42]. None of the double-differential distributions in Ref. [9] are compared to theoretical results beyond NLO QCD.

Single-differential distributions
In this section we present LO, NLO and NNLO results for a selection of single-differential distributions and compare them with the CMS measurements from Ref. [9]. At each perturbative order the scale-uncertainty bands in the figures are computed as explained at the beginning of Section 3.
We start the presentation by considering the transverse-momentum distributions of the top and antitop quarks. For each event we classify the transverse momenta according to their maximum and minimum values, p T,t high and p T,t low .    The characteristic hard scale of a transverse-momentum distribution is the transverse mass m T . Therefore, in Fig. 1 (central and right) we also report the p T,t high distribution for central scales µ 0 = m T,t high and µ 0 = m T,t high /2, respectively, while in Fig. 2 (central and right) we consider p T,t low for µ 0 = m T,t low and µ 0 = m T,t low /2. The p T,t high distribution is peaked at p T,t high ∼ 100 GeV, while the p T,t low distribution is peaked at p T,t low ∼ 60 GeV.
We first discuss the p T,t high distribution and focus on the p T,t high → 0 region. If p T,t high is small, both top quarks are forced to have small transverse momenta. As a consequence, this kinematical region corresponds to a small transverse momentum of the top-quark pair, p T,tt . The small-p T,tt region exhibits Sudakov-type divergences [76][77][78]107] at fixed order in perturbation theory, from the strong unbalance of real and virtual contributions due to softcollinear emissions. In the computation of p T,t high , the unphysical fixed-order behaviour of p T,tt is smeared due to the integration over p T,t low , and the Sudakov-type perturbative divergences disappear, by leaving (possibly large) residual effects. The amount of smearing is controlled by the shape of the p T,t high distribution in the low-p T region at LO, which affects the unbalance between real and virtual contributions. The steeply rising LO distribution at low p T strongly suppresses real radiation, and the NLO radiative corrections to p T,t high tend to be large and negative as p T,t high decreases (a large and positive effect occurs at NNLO, and so forth). Accurate theoretical predictions of the detailed shape of the p T,t high distribution at small p T require studies of all-order resummation effects of Sudakov type. However, in the case of large p T bins (as is the case in Fig. 1), reliable predictions can be obtained by considering perturbation theory at a sufficiently high order.
Comparing the results in Fig. 1 for the scales µ 0 = H T /2 (left) and µ 0 = m T,t high (central) we see that they are rather similar, and that the NNLO prediction agrees with the data. The scale µ 0 = m T,t high /2 also leads to good agreement with the data, but the corresponding NNLO   We now discuss the p T,t low distribution. In the region p T,t low → 0, for LO kinematics both the top and the antitop quark are required to have small p T . At NLO, real corrections open up a phase-space region where one top quark has a small p T and the other one has a relatively large p T , thereby leading to large positive radiative contributions. The perturbative instability affecting the p T,t high → 0 behaviour is now spread over the entire region of transverse momenta since, contrary to the low-p T,t high region, small values of p T,t low do not constrain p T,tt to be small. The choices µ 0 = H T /2 and µ 0 = m T,t low lead to rather similar results. In both cases, at low and large p T,t low NLO and NNLO bands overlap, whereas they do not in the intermediate region where the NLO band shrinks, showing that NLO perturbative uncertainties are underestimated. We note that the scale µ 0 = m T,t low /2 makes the perturbative convergence worse and the scale uncertainties larger at both NLO and NNLO.
We next consider the distribution in the transverse momentum of the hadronically decaying top or antitop quark, p T,t had . Since our calculation refers to stable top quarks, a prediction for the p T,t had distribution can be obtained by computing the transverse-momentum spectra of the top and the antitop quark, and taking their average afterwards. 4 Discussing our theoretical predictions, we refer to this as the p T,tav distribution. The corresponding LO, NLO and NNLO results are depicted in Fig. 3 for three different scale choices. We show the predictions for our default choice µ 0 = H T /2 on the left. In the predictions for our natural scale choices, the top (antitop) p T distributions required to compute the average are evaluated at the corresponding transverse mass, µ 0 = m T,t(t) (central) and µ 0 = m T,t(t) /2 (right). We denote these scale choices as m T,tav and m T,tav /2, respectively. The p T,t had distribution has a maximum at p T,t had ∼ 80 GeV. The LO and NLO scale-uncertainty bands do not overlap, except for µ 0 = m T,tav /2. This is consistent with what happens for the corresponding total cross sections. The NLO and NNLO bands do overlap in the entire p T,t had range, suggesting a good convergence of the perturbative expansion. In Fig. 3 we also observe that the scale choices µ 0 = H T /2 (left) and µ 0 = m T,tav (central) give rather similar results. On the contrary, the choice µ 0 = m T,tav /2 suggests a faster convergence of the perturbative expansion [97]. However, we also note that with this scale choice the NLO scale dependence is similar to what we obtain with µ 0 = H T /2 and µ 0 = m T,tav , whereas the NNLO scale dependence is significantly smaller than at NLO, thereby suggesting a possible underestimation of the perturbative uncertainty at NNLO. We note that µ 0 = m T,tav /2 is also the scale used for the NNLO QCD+NLO EW prediction [42] to which the CMS data are compared in Ref. [9]. The data show that the measured p T,t had distribution is slightly softer than the NNLO prediction. This is noticed also by the CMS collaboration [9] and in previous comparisons between NNLO results and LHC measurements [1][2][3][4][5][6][7][8]. However Fig. 3 shows that the NNLO result and the data are consistent within the respective uncertainties. Our predictions for the p T spectrum of the leptonically decaying top quark are, of course, identical to those for p T,t had , and they are not shown here. The comparison with the data shows similar features.
We add a few comments on the perturbative behaviour of the p T,t high , p T,t low and p T,tav distributions presented in Figs. 1, 2 and 3. The three distributions are identical at LO, but their behaviour beyond LO is clearly very different. As we can see from Fig. 3, the shape of the p T,tav distribution is almost unchanged with respect to the LO prediction. This feature is somehow expected, since the transverse-momentum spectrum of the top (antitop) quark at higher orders is affected by recoiling hard multijet radiation, which leads to a partly softer spectrum only at quite high p T,tav (beyond the p T,t had range in Fig. 3), where hard multijet radiation is kinematically suppressed.
The physical shape of the p T,t high and p T,t low distributions is expected to be different from p T,tav at low and intermediate values of p T . There we roughly have p T,t high − p T,t low ∼ p T,tt . The p T,tt distribution, which is confined to p T,tt = 0 at LO, has an average value of about 50 GeV (which is already achieved at NLO), and it is localized in the small-p T,tt region with a peak around 10 GeV [76,107]. We thus expect that the physical shape of p T,t high (p T,t low ) is harder (softer) than its LO counterpart, with shape distortions of few tens of GeV as given by the size of p T,tt . Indeed, this is what we can observe from the comparison between the data and the LO prediction at small and intermediate values of p T in Figs. 1 and 2. This shape distortion has a smaller effect at high values of both p T,t high and p T,t low .
In view of these physical expectations, it is not surprising that the shape of the p T,t high and p T,t low distributions is strongly affected by beyond-LO contributions. As discussed before, their fixed-order perturbative features are a smoothened version of the corresponding features of the p T,tt distribution [107], the smoother behaviour being due to the smearing that is produced by We first comment on the convergence of the perturbative series for the three scale choices. In the cases µ 0 = H T /2 and µ 0 = m tt /2 we see that LO and NLO bands do not overlap, analogously to what was previously observed for the p T,t had distribution and for the total cross section in Eq. (2). In the case µ 0 = H T /2, the NNLO corrections enhance the NLO result by about 10% in the peak region, and their effect slightly increases with m tt up to about 15% in the highestm tt bin. Using µ 0 = m tt /2, the NNLO effect is similar in the peak region, but it increases to about 20% at high m tt . The NLO and NNLO bands do overlap using both µ 0 = H T /2 and µ 0 = m tt /2. As observed in Ref. [97], the choice µ 0 = H T /4 leads to a faster convergence of the perturbative series. However, in the region where m tt > 360 GeV we see that the size of the scale-variation band is very much reduced in going from the NLO to the NNLO result. This behaviour suggests that the central scale µ 0 = H T /4 is (accidentally) quite close to a region of (local) minimal sensitivity [108] of the scale dependence of the NNLO result. In view of this feature, we think that the NNLO scale variation band with µ 0 = H T /4 likely underestimates the perturbative uncertainty in this m tt region and, especially, in the intermediate mass range 400 GeV ∼ < m tt ∼ < 1 TeV.
We now comment on the comparison with the data. The first bin, 300 GeV < m tt < 360 GeV, deserves a separate discussion. The experimental result in this bin is significantly above the theoretical predictions, independently of the scale choice. This disagreement may have various origins. We first note that the result of our calculation for on-shell top quarks is non-vanishing only in the limited region where 2m t = 346.6 GeV < m tt < 360 GeV. In this region the NLO and NNLO effects are largest, regardless of the scale choice. Moreover, this region is particularly |y tt | ratio to NNLO sensitive to the value of the top-quark mass. If the top-quark mass is smaller 5 by a couple of GeV, the predicted cross section in this bin becomes larger, without significantly affecting the NNLO result in each of the higher-m tt bins. Another possible reason for the discrepancy may be the unfolding procedure [9] that is used to convert the data from particle to parton level. We expect such unfolding procedure to be particularly delicate in the threshold region: both the actual value of the top-quark mass used in the extrapolation and off-shell effects may have a significant impact. Further perturbative QCD effects beyond NNLO may also be relevant, but they are not expected to be large enough to explain the observed discrepancy with respect to the data. Overall, excluding the first bin, the NNLO result for the central scale µ 0 = m tt /2 leads to the best agreement with the data. The result for µ 0 = H T /2 is also consistent with the data within uncertainties.
We finally consider the distribution in the rapidity of the top-quark pair, y tt . A natural scale choice for this distribution is the top-quark mass, as in the case of the computation of the total cross section. In Fig. 5 we report the rapidity distribution of the tt system for our default choice µ 0 = H T /2 (left), for µ 0 = m t (central) and for µ 0 = H T /4 (right). We first observe that the choices µ 0 = H T /2 and µ 0 = m t lead to rather similar results. The LO and NLO bands marginally overlap, and the impact of NNLO corrections in the central rapidity region is about 10%, consistently with what we find for the total cross section. As previously observed for the m tt distribution, the choice µ 0 = H T /4 leads to a faster convergence of the perturbative expansion [97], and to a strong reduction of scale uncertainties at NNLO. The data nicely agree with the NNLO predictions for all the three scale choices.
We conclude this section with a few comments on the effects of EW corrections on the distributions considered so far. The effect of the complete EW corrections to top-quark pair production was studied in Ref. [32], and NNLO QCD+NLO EW predictions for some differential observables were presented in Ref. [42]. In the case of the p T,t had distribution, the EW corrections are negative and lead to percent-level effects at low transverse momenta, which increase up to about −5% in the highest-p T bins in Fig. 3. Such effects are of the order of (or smaller than) the residual perturbative uncertainties of the NNLO result, and significantly smaller than the current experimental uncertainties. An analogous conclusion can be drawn for the invariantmass distribution, which receives EW corrections of about +2% in the small-m tt region and −2% in the highest-m tt region in Fig. 4. The impact of EW corrections on the rapidity distributions is typically of O(1%) or smaller. Therefore, we conclude that the impact of EW corrections is not expected to significantly affect our comparison with the data. Similar conclusions can be drawn for the double-differential distributions that we present in the following. More precise data, with higher reach in transverse momenta, will definitely call for the inclusion of EW corrections [42].

Double-differential distributions
In this section we present results for double-differential distributions and compare them with the CMS measurements from Ref. [9]. We consider three double-differential distributions, namely • the y tt distribution in m tt intervals (see Fig. 6); • the p T,t had distribution in y t had intervals (see Fig. 7); • the m tt distribution in p T,t had intervals (see Fig. 8).
We present results for the default scale choice µ 0 = H T /2 and at an additional scale of the order of the characteristic hard scale of the corresponding double-differential distribution.
The y tt distribution in four m tt intervals is presented in Fig. 6. The central scales are µ 0 = H T /2 (upper) and µ 0 = m tt /2 (lower), and they lead to similar results, consistently with those for the single-differential cross sections in Figs. 4 and 5. The impact of the radiative corrections is relatively uniform in y tt and in m tt . The scale µ 0 = m tt /2 leads to slightly larger radiative corrections in the highest-m tt interval. In the first m tt interval, the NNLO prediction slightly undershoots the data, consistently with what is observed in the low-m tt region in Fig. 4. We note that, at variance with the first m tt bin in Fig. 4, here the first m tt interval extends up to 450 GeV, thereby leading to a better agreement between the NNLO prediction and the data. At high values of m tt , the scale µ 0 = m tt /2 leads to a slightly better agreement with the data. In the highest-y tt bin, for both central scales, the experimental result is below the NNLO prediction in all invariant-mass intervals except the first one.
In Fig. 7 we present the p T,t had distribution in y t had intervals, where y t had is the rapidity of the hadronically decaying top or antitop quark with transverse momentum p T,t had . The QCD results use the scales µ 0 = H T /2 (upper) and µ 0 = m T,tav (lower) as a natural scale, analogously to the p T,t had distribution in Fig. 3. We observe that the two scale choices lead to very similar results. The impact of the radiative corrections is rather uniform in the considered y t had intervals. Discussing the single-differential p T,t had spectrum in Fig. 3, we observed that the  NNLO prediction is slightly harder than the data. In Fig. 7 we observe that this feature holds in all the y t had intervals. Nonetheless, the overall agreement between the NNLO prediction and the data is good.
We finally discuss the m tt distribution in p T,t had intervals, which is shown in Fig. 8. We use the central scales µ 0 = H T /2 (upper) and µ 0 = m tt /2 (lower). In each p T,t had interval we have p T,min < p T,t had < p T,max . The introduction of kinematical cuts on p T,t had significantly affects the m tt distribution. In the presence of a cut, p T,max , on the maximum p T of the hadronically decaying top quark, high values of m tt can be reached by either increasing the transverse momentum of the top quark decaying to a leptonic final state, or increasing the rapidity separation |y t − yt|. These kinematical regions are both dynamically suppressed for heavy-quark production and, therefore, p T,max produces a faster suppression at high invariant masses. This effect can be observed by comparing the m tt single-differential distribution in Fig. 4 with the present double-differential distribution. 1.5 < |y thad | < 2.5 1.5 < |y thad | < 2.5 1.5 < |y thad | < 2.5 1.0 < |y thad | < 1.5 1.0 < |y thad | < 1.5 1.0 < |y thad | < 1.5 0.5 < |y thad | < 1.0 0.5 < |y thad | < 1. 1.5 < |y thad | < 2.5 1.5 < |y thad | < 2.5 1.5 < |y thad | < 2.5 1.0 < |y thad | < 1.5 1.0 < |y thad | < 1.5 1.0 < |y thad | < 1.5 0.5 < |y thad | < 1.0 0.5 < |y thad | < 1.  The presence of a lower cut, p T,min , on the minimum p T,t had has an even more drastic effect on the m tt distribution, as it imposes the constraint m tt > 2 m 2 t + p 2 T,min ≡ 2m T,min for LO kinematics. Below this unphysical threshold the LO result vanishes, and the NLO and NNLO results are effectively LO and NLO predictions, respectively. As a consequence, they suffer from larger theoretical uncertainties, which is reflected by the stronger scale dependence. Above this threshold, the LO distribution sharply increases up to a kinematical peak close to 2m T,min . Owing to this LO behaviour, soft-collinear radiation produces shape instabilities [109] in this m tt region at each subsequent perturbative order. The qualitative behaviour of these shape instabilities is completely analogous to that observed and discussed in Ref. [110] (see Figs. 10 and 20 and related comments therein) in the case of diphoton production in the presence of p T cuts. These perturbative instabilities are localized in a very narrow region around the LO threshold, and therefore their effect is smeared if a sufficiently large bin size is considered, as is the case for the differential distribution in Fig. 8.  The comparison with the data shows that in the first two p T,t had intervals the NNLO prediction undershoots the data in the first m tt bin. This discrepancy at low-m tt is not resolved in the two highest-p T,t had intervals, since the larger bin size (300 GeV < m tt < 430 GeV) renders the distribution less sensitive to the behaviour close to the physical threshold. These observations are consistent with the expectations from the behaviour of the single-differential distribution in Fig. 4 at low m tt . Excluding the narrower bins at low m tt , the NNLO prediction in Fig. 8 is in very good agreement with the experimental measurements. The results with µ 0 = H T /2 (upper) and µ 0 = m tt /2 (lower) turn out to be rather similar, consistently with our general expectation at the beginning of Section 3. The exception is the highest-p T,t had interval, where the scale µ 0 = m tt /2 leads to a quite large NLO scale dependence at low m tt , which is drastically reduced at NNLO. In this region of low m tt and high p T,t had , the invariant mass m tt is not the characteristic hard scale anymore: the scale choice µ 0 = m tt /2 is not expected to be optimal, and the scale µ 0 = H T /2 turns out to be more appropriate.

Summary and outlook
In this paper we have presented a new fully differential NNLO calculation of top-quark pair production at hadron colliders. The calculation is carried out by using the q T subtraction formalism to handle IR divergences from real and virtual contributions, and it is implemented in the Matrix framework. Our code enables fast and efficient calculations of fiducial cross sections and multi-differential distributions.
We have computed several single-and double-differential distributions of the top quarks, and we have compared our results with recent measurements performed by the CMS collaboration in the lepton+jets decay channel. We have considered several values of the renormalization and factorization scales to compute each of the distributions. We have used natural scales (i.e. m t , m tt /2 and the relevant transverse masses m T ) of the order of the characteristic hard scale of the computed distribution, and we have shown that the corresponding results are similar to what is obtained with the overall choice µ 0 = H T /2. We find that both the natural scales and µ 0 = H T /2 lead to a reasonable perturbative behaviour for all the distributions that we have examined. The NNLO corrections substantially reduce the uncertainties of the theoretical predictions, and they improve the overall agreement with the experimental measurements. The largest deviation between data and the NNLO result occurs close to the m tt threshold in singleand double-differential distributions. This discrepancy could be related to a variety of effects, including issues in the extrapolation of the data from particle to parton level, which is expected to be delicate in such threshold region. A lower value of the top-quark mass also has a significant impact close to the threshold.
The code that is used to perform these calculations is going to become public in a future Matrix release, providing a fast and flexible tool to compute (multi-)differential distributions with arbitrary cuts on the top-quark kinematical variables. The inclusion of NLO EW corrections and of top-quark decays is left to future work.

A Appendix
The calculation presented in this paper represents the first complete application of the q T subtraction formalism to the NNLO computation of (multi-)differential cross sections for the production of a colourful final state. As a validation of our results, in this appendix we carry out a detailed comparison of our predictions with those available for tt production at NNLO.
In Ref. [70], the q T subtraction results for the contributions of the flavour off-diagonal partonic channels to the tt total cross section were compared with those of the numerical program Top++ [111] by considering pp collisions at √ s = 2 TeV and pp collisions at √ s = 8 TeV. In Ref. [69], the comparison with Top++ results at the LHC energies √ s = 8 and 13 TeV was performed by considering all the partonic channels, and we found agreement within the numerical uncertainties. We have successfully repeated this comparison in pp collisions at Tevatron energies and in pp collisions with centre-of-mass energies √ s up to 100 TeV. We thus conclude that our calculation of the total cross section is in perfect agreement with the calculation of Refs. [35][36][37][38]. The typical computing time needed to obtain O(0.1%) precise inclusive cross sections (including scale uncertainties) with our program is about 1000 CPU days.
We point out that the only ingredient that is not computed independently in our implementation and in the calculation of Refs. [35][36][37][38] is the finite part of the two-loop virtual amplitudes [52,53]. This contribution, however, turns out to have a very small quantitative impact, namely O(0.1%) of the NNLO total cross section at √ s = 13 TeV.
In the following we present a comparison of our NNLO differential results with the analogous results from Ref. [97]. To this purpose, we exactly follow the setup therein: we consider pp collisions at √ s = 13 TeV, and we use CT14NNLO [112] PDFs. The QCD running of α S is evaluated at three-loop order with α S (m Z ) = 0.118, and the pole mass of the top quark is fixed to m t = 173.3 GeV. We consider four differential distributions: invariant mass (m tt ) and absolute rapidity (|y tt |) of the top-quark pair, and averages of the transverse momenta (p T,tav ) and of the absolute rapidities (|y tav |) of the top and the antitop quark. The values of the renormalization and factorization scales are fixed to different values for the different distributions. In case of the m tt , |y tt | and |y tav | distributions the scales are set to H T /4. In case of the p T,tav distribution, each of the p T,t and p T,t distributions is calculated with both scales set to half of the corresponding transverse mass. This scale choice corresponds to µ 0 = m T,tav /2 in the notation of Section 3.1.
The comparison of our results with those of Ref. [97] is illustrated in Fig. 9. In each case the upper panel shows the respective distribution, while the lower panel reports the ratio with respect to the corresponding result from Ref. [97] (CHM). Our results are stated with their numerical uncertainties. The computing time needed to obtain our results is about 2000 CPU days, mainly to achieve good statistical precision in the tails of the kinematical distributions. The results from Ref. [97] are quoted without an associated statistical uncertainty. However, Ref. [97] states that "the narrowest bins possible" are chosen in order to "keep the Monte Carlo integration error within about 1% in almost all bins". Correspondingly, the approximate uncertainty of ±1% is reported in the plots for reference. We see that the agreement for all  Figure 9: Comparison between the NNLO differential distributions obtained with Matrix (orange) and the results from Ref. [97] (CHM, grey). The orange bands indicate the numerical uncertainty of our results, which is typically well below 1%. The grey band approximates the statistical uncertainty expected for the results of Ref. [97], based on corresponding comments in the text therein. the considered distributions is excellent, also in the extreme kinematical regions, i.e. large rapidities, large transverse momenta and large invariant mass of the tt pair.