Top-quark pole mass extraction at NNLO accuracy from total, single- and double-differential cross sections for tt + X production at the LHC

Abstract: We extract the top-quark mass value in the on-shell renormalization scheme from the comparison of theoretical predictions for pp → tt + X at next-to-next-to-leading order (NNLO) QCD accuracy with experimental data collected by the ATLAS and CMS collaborations for absolute total, normalized single-differential and double-differential cross sections during Run 1, Run 2 and the ongoing Run 3 at the Large Hadron Collider (LHC). For the theory computations of heavy-quark pair production we use the MATRIX framework, interfaced to PineAPPL for the generation of grids of theory predictions, which can be used efficiently a posteriori during the fit, performed within xFitter. We take several state-of-the-art parton distribution functions (PDFs) as input for the fit and evaluate their associated uncertainties, as well as the uncertainties arising from renormalization and factorization scale variation. Fit uncertainties related to the datasets are also part of the extracted uncertainty of the top-quark mass and turn out to be of similar size as the combined scale and PDF uncertainty. Fit results from different PDF sets agree with each other within the 1σ uncertainty, whereas some datasets related to tt decay in different channels (dileptonic vs. semileptonic) point towards top-quark mass values in slight tension with each other, although still compatible within 2.5σ. Our results are compatible with the PDG 2022 top-quark pole-mass value. Our work opens the road towards more complex simultaneous NNLO fits of PDFs, the strong coupling α_s(M_Z) and the top-quark mass, using the currently most precise experimental data on tt + X total and multi-differential cross sections from the LHC.


Introduction
Top-quark measurements at the Large Hadron Collider (LHC) play a pivotal role in modern particle physics for a number of reasons. First, they are crucial for precisely extracting key parameters of the Standard Model (SM), helping to refine our general understanding of fundamental interactions. Second, these measurements provide critical insights into the electroweak symmetry breaking mechanism, shedding light on how particles acquire mass. Finally, top-quark studies are a vital component of searches for physics beyond the SM (BSM), both because tt production is one of the most important backgrounds and because new phenomena could potentially be uncovered, e.g. through anomalous couplings to top quarks. In this context, measurements of tt + X hadroproduction serve as a cornerstone of the LHC physics program.
The significance of top-quark physics in ongoing and forthcoming research at high-energy colliders has been widely acknowledged, see e.g. Refs. [1][2][3], and the need for accurate theoretical predictions accompanying experimental studies has been underlined and motivated again in Ref. [4], summarizing the recent Snowmass 2021 process. The present study aims at the determination of the top-quark mass, a topic which has been extensively reviewed, e.g., in Refs. [5][6][7][8]. The pole mass of the top quark is extracted from a comparison of inclusive and differential cross-section data for tt + X production, collected by the LHC experiments ATLAS and CMS, with theoretical predictions including higher-order corrections in quantum chromodynamics (QCD) and computed for the top-quark mass in the on-shell renormalization scheme.
From the theory point of view, predictions for pp → tt + X at next-to-leading order (NLO) QCD accuracy were presented many years ago, starting from the works of Refs. [9][10][11]. Total cross sections at next-to-next-to-leading order (NNLO) accuracy have been available for more than ten years [12], and public codes like, e.g., HATHOR [13] and Top++ [14] have long provided access to them. Predictions first at approximate NNLO (aNNLO) [15, 16] and then at approximate N3LO (aN3LO) [17] accuracy, obtained from fixed-order expansions of threshold-resummed results at next-to-next-to-leading logarithmic accuracy, have also been produced, targeting both total and differential cross sections.
Limiting the discussion to fixed-order calculations, NNLO QCD differential cross sections have been computed more recently than NNLO total cross sections. First computations in this direction were performed in Ref. [18] for pp̄ collisions and in Refs. [19, 20] for pp collisions, using the sector-improved residue subtraction scheme STRIPPER [21][22][23] for the cancellation of infrared divergences at NNLO, which combines ideas of FKS NLO subtraction [24] with those of sector decomposition [25][26][27]. Although the corresponding code is still private, part of the results of these computations are nowadays accessible to the whole HEP community through the HighTEA analyzer project [28], following a first release as fastNLO [29] grids [30]. In parallel, partial NNLO results, limited to the qq̄ channel [31][32][33], were also obtained by a group developing and using the antenna subtraction method [34], properly extended to deal with colorful initial states and colorful massive final states [35]. More recently, the q_T-subtraction method [36] was also extended and applied to the calculation of the cross sections for this process, starting from the off-diagonal channels in Ref. [37] and later presenting complete NNLO results for the total and differential cross sections in Refs. [38] and [39], respectively. The results were obtained and implemented within the MATRIX framework [40]. Those implementations adopt the on-shell top-quark mass renormalization scheme. As an alternative, following Ref. [41], predictions with the top-quark mass renormalized in the MSbar scheme were presented in Ref. [42]. The implementation using the on-shell top-quark mass renormalization scheme has been made available in the public version of MATRIX which, in principle, enables the whole HEP community to make predictions of single- and even double-differential cross sections by just installing and running the code after having specified some inputs. In this way it is possible to obtain predictions which can be directly compared to the experimental data released during Run 1, 2 and 3 at the LHC (see e.g. the experimental data published in Refs. [43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58]).
In practice, the number of computations for different sets of input parameters required to calculate the uncertainties on these predictions, in particular due to parton distribution functions (PDFs) and the top-quark mass, is quite large and very demanding in terms of computing resources. Strategies have been proposed and/or already developed to improve the speed of these computations, shortening the processor and memory usage. Saving the results in grids, which can be used a posteriori via interpolation for further analyses, e.g. fits of PDFs and/or the top-quark mass value, turns out to be an indispensable step, at least considering present computing resources. In Ref. [16] a PDF determination was carried out using predictions at aNNLO accuracy computed with DiffTop interfaced to fastNLO [29] and xFitter [59]. The importance of saving NNLO predictions in grids was discussed in Ref. [30], and a PDF fit using them for single-differential tt distributions was presented in Ref. [60], a study where NNLO results were mimicked by using NLO calculations interfaced to APPLgrid [61] and supplementing them with bin-by-bin K-factors. The latter were defined as the ratio of the NNLO to NLO predictions for a single PDF set, and the same K-factors were even applied to NLO predictions obtained with different PDFs, thus ignoring their implicit dependence on the PDFs. A simultaneous fit of the top-quark mass and the strong coupling α_s(M_Z) at NNLO using single-differential tt + X distributions was performed in Ref. [62]. The scale dependence of the top-quark mass in the MSbar renormalization scheme, i.e. its running through NNLO in QCD, has been investigated in Refs. [63, 64] using data from the CMS collaboration for the single-differential cross section in the invariant mass of the tt system.
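The bin-by-bin K-factor approximation discussed above can be sketched as follows (a toy illustration with invented numbers, not taken from Refs. [60, 61]): a K-factor is derived as the NNLO/NLO ratio with one PDF set and then applied to NLO predictions obtained with a different PDF set, thereby ignoring the PDF dependence of the higher-order correction.

```python
# Toy sketch of the bin-by-bin K-factor approximation (invented numbers).
nlo_pdf_A  = [10.0, 8.0, 5.0]   # hypothetical NLO cross sections, PDF set A
nnlo_pdf_A = [11.0, 8.6, 5.3]   # hypothetical NNLO cross sections, PDF set A
nlo_pdf_B  = [10.4, 7.9, 5.1]   # hypothetical NLO cross sections, PDF set B

# K-factor defined per bin from PDF set A only.
k_factors = [n2 / n1 for n1, n2 in zip(nlo_pdf_A, nnlo_pdf_A)]

# Approximate "NNLO" for PDF set B: the implicit PDF dependence of the
# NNLO/NLO ratio is ignored, which is the caveat mentioned in the text.
approx_nnlo_pdf_B = [k * n for k, n in zip(k_factors, nlo_pdf_B)]
```

Genuine NNLO grids, as used in this work, avoid this approximation altogether.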
In the current study, thanks to tools that we describe in more detail in Section 2, it was possible for us to use a customized version of MATRIX, optimized for the pp → tt + X process and interfaced to PineAPPL [65], for the computation of all NNLO QCD theory predictions with uncertainties (without utilizing K-factors or other approximations) for absolute total as well as normalized single- and double-differential cross sections at the LHC, which we have included in our fits of top-quark pole mass values at NNLO accuracy. We use experimental measurements of cross sections as a function of the invariant mass M(tt) and rapidity |y(tt)| of the tt pair. To the best of our knowledge, these are the first fits of the top-quark pole mass with NNLO accuracy using LHC double-differential tt + X data as input. For completeness, it is also worth noticing that some of the double-differential data considered in our work have been used in the extraction of PDFs, comparing them to theoretical predictions obtained using NNLO/NLO K-factors (see e.g. Refs. [66, 67]). Furthermore, the impact of the CMS double-differential production cross-section dataset at 8 TeV [55] on the gluon PDFs has been investigated in Ref. [68] by comparing them to genuine NNLO double-differential predictions in fastNLO tables obtained within the STRIPPER approach.
A comparison of our pre-fit predictions with the experimental data from the ATLAS and CMS collaborations considered in the fit is reported in Section 3, whereas the fit methodology and results are presented in Section 4. A summary and perspectives for future developments are presented in the Conclusions in Section 5.
2 Strategy for high-performance computations of NNLO (double-)differential cross sections for tt + X production at the LHC

2.1 q_T subtraction and the MATRIX framework

The q_T-subtraction formalism is a method for handling and cancelling the infrared divergences arising from the combination of real and virtual contributions in computations of total and differential cross sections for the production of massive final states in QCD at NLO and NNLO accuracy. It uses as a basis the q_T-resummation formalism, where q_T is the transverse momentum of the produced high-mass system. The latter allows one to compute the infrared subtraction counterterms, constructed by evaluating the q_T distribution of the massive final-state system in the limit q_T → 0. In the case of a colorless massive final state, the q_T distribution in this limit has a universal structure, known at N3LO accuracy from the expansion of the corresponding resummed result. q_T resummation and, as a consequence, q_T subtraction were first developed and applied to the inclusive hadro-production of Higgs [36] and vector [69] bosons, and then extended to the case of heavy-quark pair hadro-production [70][71][72][73]. In the case of final states containing heavy quarks, which are colored objects, additional soft singularities appear, absent in the case of colorless final states. No new collinear singularity enters the calculation, thanks to the heavy-quark mass. Recently, the methodology was used even for first NNLO computations of heavy-quark pair production in association with a Higgs [74] or a W boson [75, 76] at the LHC. The master formula for the differential (N)NLO partonic cross section for pp → tt + X production following this formalism can be written as

    dσ^{tt}_{(N)NLO} = H^{tt}_{(N)NLO} ⊗ dσ^{tt}_{LO} + [ dσ^{tt+jet}_{(N)LO} − dσ^{tt,CT}_{(N)NLO} ] ,    (2.1)

where dσ^{tt}_{LO} is the differential cross section at LO accuracy, dσ^{tt+jet}_{(N)LO} is the cross section for ttj + X production at (N)LO accuracy, built according to the standard LO and NLO formalisms, and dσ^{tt,CT}_{(N)NLO} are appropriate counterterms
capturing the singular behaviour of dσ^{tt+jet}_{(N)LO} in the limit q_T → 0, such that the content of the square bracket is infrared-safe in that limit. These counterterms are built from the (N)NLO fixed-order expansion of the q_T-resummation formula for tt production (see also Ref. [77]).
The form of these counterterms is nowadays fully known. In practice, the computation of the part in square brackets in eq. (2.1) is carried out introducing a cut-off r_0 = q_{T,min}/M(tt) on the dimensionless quantity r = q_T/M(tt), where M(tt) is the invariant mass of the tt pair. This renders both terms in the square brackets separately finite. Below r_0, which acts effectively as a slicing parameter, the two terms in the square brackets are assumed to coincide, which is correct up to power-suppressed contributions. These power corrections vanish in the limit r_0 → 0, and as a mitigation strategy, to control their impact on the cross section in eq. (2.1), MATRIX computes the quantity in square brackets for different discrete r_0 values in the same run. The results for different r_0 values are then fitted, and the extrapolation to r_0 = 0 is taken, to get information on the exact NNLO result. Extrapolation uncertainties are also accounted for and included, as explained in Ref. [40]. A phase-space slicing approach like the q_T-subtraction formalism can also be subject to linear power corrections from fiducial cuts on the final-state decay products (see studies for the Drell-Yan process in Refs. [78, 79]). So far, such effects have been treated in MATRIX only for color-singlet final states [80].
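The r_0 → 0 extrapolation described above can be sketched numerically. The following toy example (invented cross-section values; MATRIX performs this fit internally with its own procedure, see Ref. [40]) fits results computed at several finite r_0 cuts with a linear ansatz and reads off the intercept as the extrapolated result:

```python
# Toy r0 -> 0 extrapolation: fit sigma(r0) = a + b*r0 by least squares.
r0_values = [0.0005, 0.0010, 0.0015, 0.0020]       # slicing cuts
sigma_values = [833.8, 834.2, 834.6, 835.0]        # toy cross sections (pb)

n = len(r0_values)
sx = sum(r0_values)
sy = sum(sigma_values)
sxx = sum(r * r for r in r0_values)
sxy = sum(r * s for r, s in zip(r0_values, sigma_values))

# Slope and intercept of the straight-line fit; the intercept `a`
# is the r0 -> 0 extrapolation of the cross section.
b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a = (sy - b * sx) / n
```

In a realistic setting one would also propagate the statistical uncertainties of each r_0 point into an extrapolation uncertainty, as MATRIX does.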
The term H^{tt}_{(N)NLO} in eq. (2.1) includes information on the virtual corrections to the process at hand and contributions that compensate for the subtraction of the counterterms. H^{tt}_{(N)NLO} can be split into a process-independent and a process-dependent part. The process-independent part is the same entering Higgs- and vector-boson production, and is explicitly known [81][82][83][84]. The process-dependent part of H^{tt}_{NNLO} in the flavour-off-diagonal channels (i.e., all except for qq̄ and gg) originates from one-loop amplitudes for the partonic processes qq̄ → tt and gg → tt, plus the heavy-quark azimuthal correlation terms [72] in the q_T-resummation formalism. The process-dependent part of H^{tt}_{NNLO} in the flavour-diagonal channels (qq̄ and gg) requires in addition the knowledge of two-loop amplitudes for qq̄ → tt and gg → tt, for which the results of Refs. [85, 86] are adopted, plus the computation of additional soft contributions, completed in Ref. [87]. More details can be found in Refs. [37, 72, 87]. All these ingredients are available and have been combined together in the MATRIX framework, which, in turn, relies on the MUNICH code for the combination of real and virtual NLO contributions and for the evaluation of dσ^{tt,CT}_{(N)NLO}, on additional code for the evaluation of H^{tt}_{(N)NLO}, and on the OpenLoops code for the evaluation of all color-correlated and spin-correlated tree-level and one-loop amplitudes.
The MATRIX computer program has been kept quite general, and this allowed the implementation of a number of different processes in a comprehensive framework. We use a customized version of MATRIX, tailored to the tt + X case only and optimized for it. In particular, we started from the MATRIX version used for the computations presented in Ref. [39] and performed a number of optimizations in the program flow and execution, which include: 1) recycling of parts of computations already done, instead of recomputing identical pieces multiple times; 2) adaptation of the code in view of its execution on local multicore machines; 3) optimizations in distributing the computation on different machines/cores and in the job and job-failure handling; 4) optimization of the input/output information exchange with the computer cluster during remote job execution; 5) reduction of the memory usage and of the size of the stored output. These modifications lead to an overall gain in the speed of the computation, in the memory consumption and in the space allocation, without compromising the final results. The exact gain in speed, memory and disk-space consumption depends on the required accuracy of the cross section to be computed and on the number of parallel jobs. In our case, these modifications allowed us to reach the desired 0.2‰ accuracy for the total tt + X cross sections.
All our computations apply the on-shell renormalization scheme for the top-quark mass. The theoretical aspects, advantages and disadvantages of this choice are well known [8]. In particular, cross sections for tt + X hadro-production can be subject to another class of power corrections that originate from renormalons, see e.g. Ref. [88]. Such effects, or any estimates of their size, are not included in our analysis.
Also, electroweak corrections at NLO accuracy [89] are not included in our calculations. According to Ref. [90], and considering the fine binning in that work, their size varies from +2% to −4% as a function of M(tt), and stays within 1% as a function of |y(tt)|. However, we have checked that for the bins of the experimental measurements which we use in our work this effect does not exceed 1%. Thus, the missing electroweak corrections are expected to be covered by the assigned theoretical uncertainties described in Section 2.2.
In Fig. 1 we compare the differential cross sections at NNLO we have computed with MATRIX, after the customization and optimization steps described above, with those from Ref. [20]. For the MATRIX framework, we compare the differential cross sections computed using the cuts r_0 = 0.0015 and r_0 = 0.0005. For this calculation we use the NNPDF3.0 PDF set [92] and a top-quark pole mass value m_t^pole = 173.3 GeV. When computing the cross sections as a function of the transverse momentum of either the top or the antitop quark, the factorization and renormalization scales are set to µ_r = µ_f = sqrt(m_t² + p²_{T,t})/2, i.e. to half of the transverse mass of either the top or the antitop quark, while for all other observables the scales are set to µ_r = µ_f = H_T/4, where H_T is the sum of the transverse masses of the top and the antitop quarks, consistent with Ref. [20]. While no numerical uncertainties are provided for the results from Ref. [20], we roughly estimate them to be 1%, based on corresponding comments in the text therein, as done also in Ref. [39] for the comparison of the NNLO predictions against those of Ref. [20]. The results of the two computations agree within ≈ 1%. No trends are observed apart from a very small p_T(t) slope (< 1%) and the general normalization: the MATRIX r_0 = 0.0015 predictions are ≈ 0.5% above those from Ref. [20]. This could be related to the fact that, in the version of the code used in this study, the differential distributions are computed with finite cuts r_0 = 0.0015 and r_0 = 0.0005. For the total tt + X cross section MATRIX automatically performs an extrapolation to r_0 = 0, which results in a systematic correction of similar size, −0.5%, thus suggesting power corrections in q_T as a source of the observed difference. In Fig.
2 we present the comparison of MATRIX predictions for the double-differential cross section as a function of M(tt) and |y(tt)| obtained with r_0 = 0.0015 and r_0 = 0.0005, using the binning scheme of the experimental measurement [48]. The results agree within ≈ 0.5% for all distributions. For the phenomenological analysis this small difference does not create any issue, since an effect of the size of 0.5% is very well covered by, e.g., the scale variation uncertainties which we do take into account (see Section 3.3). Additionally, this effect partially cancels between numerator and denominator when considering normalized differential cross sections, which form the basis of our fits, as discussed in Sections 3 and 4.
Furthermore, in Fig. 3 we compare the NNLO differential cross sections as a function of M(tt) computed with MATRIX to the results obtained using the recently developed HighTEA project platform [28]. For this calculation we use the NNPDF4.0 PDF set [67], m_t^pole = 172.5 GeV and µ_r = µ_f = H_T/4. Unfortunately, the HighTEA theoretical predictions are currently accompanied by quite large numerical uncertainties, which vary from 1% to ∼ 100% at large M(tt) values. The two calculations agree within these uncertainties. We note that the current level of accuracy for differential tt + X production that can be reached through the HighTEA project platform is not sufficient for a phenomenological analysis of the present LHC data. We expect that this can indeed be improved by enlarging the event files stored there to include a higher number of events. Therefore, for the time being, in the rest of this work we limit ourselves to the use of MATRIX.

[Figure 1 caption fragment: differential cross sections, including d) the rapidity of the top-antitop quark pair (lower right), computed with MATRIX using the cuts r_0 = 0.0015 and r_0 = 0.0005, compared with the results from Ref. [20], dubbed CHM in the figure, with their numerical uncertainties.]

2.2 Interface to PineAPPL
A target precision of a few per mille requires the generation of several billions of tt + X NNLO events, which takes O(10^5) CPU hours. Repeating such a computation for many different PDF parametrizations and/or different scale choices is not feasible with the standard CPU computing resources presently available to the theory and experimental HEP communities. While the possibility to store the calculation results for a few different scale choices is already available within the MATRIX framework, in a single run there is neither the possibility to compute predictions using different PDF sets nor different error members within a single PDF set, at least for the time being. A general solution to this problem is to use interpolation grids, such as fastNLO [29], APPLgrid [61] or PineAPPL [65]. In these grids, partonic matrix elements are stored in such a way that they can be convoluted later with any PDF + α_s(M_Z) set. We choose the PineAPPL library, which is capable of generating grids and dealing with them in an accurate way to any fixed order in the strong and electroweak couplings, and which supports variations of µ_r and µ_f. For each event computed by MATRIX, we fill the PineAPPL grid for the given perturbative QCD contribution and parton luminosity with the corresponding event weight divided by the product of the two proton PDFs involved in the event-weight computation. Cross-section terms which have different µ_r and/or µ_f dependence are stored in separate grids. The PineAPPL grids are three-dimensional Lagrange-interpolation grids binned in variables which represent the longitudinal momentum fractions of the two incoming partons and µ_f. The values stored in each grid are independent of α_s and the PDFs. Further details can be found in Ref. [65].
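The logic of dividing each event weight by the PDFs, so that the stored coefficient can be re-convoluted a posteriori with any other PDF set, can be illustrated with a minimal toy example (hypothetical PDF shapes and weight, not the actual PineAPPL API):

```python
# Toy illustration of PDF-independent grid filling and a-posteriori convolution.
def toy_pdf_a(x):
    return (1.0 - x) ** 3 / x   # hypothetical PDF shape A

def toy_pdf_b(x):
    return (1.0 - x) ** 4 / x   # hypothetical PDF shape B

# One "event": momentum fractions and a weight computed with PDF set A.
x1, x2 = 0.1, 0.2
weight_with_a = 2.5 * toy_pdf_a(x1) * toy_pdf_a(x2)

# Store the PDF-independent coefficient in the grid cell ...
coefficient = weight_with_a / (toy_pdf_a(x1) * toy_pdf_a(x2))

# ... and re-convolute later with a different PDF set, without rerunning
# the expensive NNLO computation.
weight_with_b = coefficient * toy_pdf_b(x1) * toy_pdf_b(x2)
```

In the real grids the coefficients are accumulated on Lagrange-interpolation nodes in x_1, x_2 and µ_f rather than at exact event kinematics, which is the source of the small interpolation uncertainty discussed below.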
To produce the grids, we have chosen 30 bins for each internal variable of the grids. In order to validate our implementation of the interface to PineAPPL, in Fig. 4 we compare the genuine theoretical predictions from MATRIX with those obtained using the PineAPPL interpolation grids (derived by employing only the ABMP16 NNLO PDF + α_s(M_Z) fit [93] as input of the MATRIX computation) and either the ABMP16 or the CT18 NNLO [94] PDF + α_s(M_Z) sets. We perform this comparison in the M(tt) and |y(tt)| bins of the experimental measurement from Ref. [48], which, among all the experimental analyses producing the data which we use in this work, has the largest number of bins and covers the largest phase space. For the MATRIX calculation used to produce the grids, we require 0.2‰ accuracy for the total tt + X cross section. This results in numerical uncertainties which are also shown in Fig. 4. They vary from 0.1% to 0.3% for the bins of the experimental measurement of Ref. [48]. The typical difference between the original MATRIX predictions and those obtained using the PineAPPL grids turns out to be ≈ 1‰ for the ABMP16 set and a few ‰ for the CT18 set for each event. As expected, the PineAPPL interpolation uncertainty is very small if the same PDF set is used to produce the grids. The slight increase when using a different PDF set is related to the fact that the grids are organized in bins of finite size. On the other hand, when accumulating a large number of events, for the ABMP16 set the differences remain at the per-mille level, whereas for the other set (CT18) the differences never exceed 1% and are consistent with the MATRIX numerical uncertainties, since in this case two independent calculations are compared. The same level of agreement can be expected for any other reasonably smooth PDF set. Thus, the usage of interpolation grids does not deteriorate the accuracy of our MATRIX calculation with the current numerical uncertainties. Whenever necessary (e.g.
for a finer binning scheme), the level of agreement can be improved further by producing new interpolation grids with an increased number of bins for the internal variables, at the price, however, of increased computer memory and disk space needed for the grids (currently more than one GByte, varying depending on multiple factors, including not only the number of bins, but also the number of observables and parallel processes, considering that all runs are parallelized).
Based on these validation studies, we assign a 1% uncorrelated uncertainty in each bin of the predictions, in order to cover any numerical inaccuracy and possible small systematic effects which might be present either in the theoretical calculations (see Figs. 1 and 2) or in the usage of the interpolation grids (see Fig. 4). This uncertainty is always smaller than the experimental uncertainties of the existing single- and double-differential measurements of tt + X production at the LHC, which typically amount to a few percent (although the 1% additional uncertainty is sizeable, e.g., in some bins of the data from Ref. [48]).

3 Pre-fit comparison of theory predictions and LHC experimental data

3.1 Experimental data
For our analysis, we use measurements of the absolute total and normalized differential inclusive tt + X cross sections as a function of various top-quark-related observables O_i, (dσ/dO_i)/σ. We collect all available and up-to-date ATLAS and CMS measurements of total tt + X cross sections [43][44][45][46][47][48][49][50][51]. These are ten data points, which appear on the summary plot of the total tt + X cross sections by the LHC Working Group on Top Quark Physics as of June 2023 [95]; they are listed in Table 1.

Table 1. The measurements of total inclusive tt + X cross sections included in our analysis.
On the other hand, for differential measurements, we choose cross sections as a function of the invariant mass of the tt pair, or (if available) double-differential cross sections as a function of the invariant mass M(tt) and rapidity |y(tt)| of the tt pair. In particular, the tt + X cross sections as a function of M(tt) are very useful to constrain m_t^pole, while double-differential tt + X cross sections as a function of M(tt) and |y(tt)| impose constraints on the PDFs owing to the improved resolution of the parton momentum fractions x_1 and x_2. Indeed, at LO the invariant mass M(tt) and rapidity y(tt) of the tt pair are directly related to x_1 and x_2 as x_{1,2} = (M(tt)/√s) exp[±y(tt)]. Therefore, measurements of double-differential cross sections as a function of M(tt) and |y(tt)| are most sensitive to the PDFs and provide strong constraints both on m_t^pole and the PDFs. Furthermore, we choose only those differential datasets which satisfy all of the following criteria:
• measured cross sections should be defined at parton level in the full phase space, i.e. without any restrictions on the decay products of the top and antitop quarks, in order to be consistent with the parton-level NNLO calculations,
• normalized cross sections must be available, in order to avoid a complicated treatment of the common normalization with the datasets for the total tt + X cross sections,
• bin-by-bin correlations must be available.
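The LO relation between the tt kinematics and the parton momentum fractions quoted above is simple enough to evaluate directly; the following sketch (function name and example numbers are ours, for illustration only) shows the x_1 and x_2 probed by a given (M(tt), y(tt)) point:

```python
import math

# LO kinematic relation: x_{1,2} = (M(ttbar)/sqrt(s)) * exp(+-y(ttbar)).
def momentum_fractions(m_ttbar, y_ttbar, sqrt_s):
    """Return (x1, x2) for a ttbar pair of invariant mass m_ttbar and
    rapidity y_ttbar at centre-of-mass energy sqrt_s (same units)."""
    tau = m_ttbar / sqrt_s
    return tau * math.exp(y_ttbar), tau * math.exp(-y_ttbar)

# Example: a 500 GeV ttbar pair at y = 1 at the 13 TeV LHC.
x1, x2 = momentum_fractions(500.0, 1.0, 13000.0)
```

This makes explicit why double-differential measurements in M(tt) and |y(tt)| resolve the individual momentum fractions, while M(tt) alone only fixes the product x_1 x_2 = M(tt)²/s.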
Under these criteria, we use the nine datasets listed in Table 2.In total, they contain 112 data points, taking into account that one data point from each measurement should be discarded during the comparison of normalized data and theory, as explained later in Section 3.2.
In our work, we extract the top-quark pole mass, m_t^pole, by comparing the measured tt + X cross sections to NNLO theoretical predictions which depend on m_t^pole. In order to measure the tt + X cross sections, the experimental collaborations use an unfolding procedure. The aim of the unfolding procedure is to obtain the tt + X cross sections at parton level using as input various kinematic distributions at detector level. While the unfolding procedure uses MC simulations, by construction it is supposed to provide tt + X cross sections at parton level which do not explicitly depend on the details of the MC simulations. This is verified by various checks, such as comparisons of control distributions between the data and the MC simulations, as well as closure tests, in which, using toy distributions, one validates that the unfolding procedure restores the truth reasonably well, independently of the MC simulations used. Any remaining dependence on the MC simulations is supposed to be covered by systematic uncertainties, which are estimated e.g. by varying theoretical parameters in the MC simulations, including, but not limited to, the MC mass, and by adopting alternative models/tunes. With such a method, the measured parton-level cross sections are meant to be directly compared to fixed-order predictions, which is often done also by the experimental collaborations in their original publications (see e.g. Refs. [48, 52, 53, 55]). This is different from other determinations of the top-quark mass at the LHC that rely upon the reconstruction of the top quark from its decay products, usually dubbed "direct measurements" (see e.g.
the recent review of the top-quark mass measurements in CMS [96]). The results obtained in direct measurements of the MC top-quark mass are affected by ambiguities originating from theoretical uncertainties and limitations of the current MC simulations. Nevertheless, in the case of tt pairs produced in hadron-hadron collisions, where the underlying hard interactions unavoidably involve partons in non-singlet color configurations, the conceptual issues that affect the direct measurements eventually emerge for all top-quark mass measurement methods, once a precision of 0.5 GeV or better is reached [8]. In view of this, to characterize more precisely the extraction of top-quark mass values that we performed, we regard the present determination as the extraction of m_t^pole from the differential distributions at parton level obtained by the experimental collaborations from those at detector level using MC event generators. This extended formulation accounts for the fact that there might be a residual implicit dependence of our extracted top-quark mass value (merely dubbed m_t^pole throughout the text) on the MC top-quark mass, m_t^MC, although the experimental collaborations did their best to minimize the dependence of the unfolded data at the basis of our fits on m_t^MC. Further investigations concerning this dependence go beyond the scope of the present work and might require access to the detector-level experimental data, not publicly available, in order to study how to further improve the unfolding procedure.
Furthermore, one should distinguish a situation when some unfolded data distribution might strongly depend on a parameter of the MC simulations (such as m MC t ), as it happens to the differential cross section as a function of the invariant mass of the tt pair in the dilepton channel when using the so called full kinematic reconstruction (due to a peculiar aspect of the kinematic reconstruction, namely because of the need to use the top-quark mass constraint at the reconstruction level, m kin t , in order to deal with the unknown momenta of the neutrinos which escape the detector).This dependence is covered by changing the value of m Table 2.The measurements of differential inclusive tt + X cross sections included in our analysis.
The number of data points (n) reported does not include the last bin of each measurement, since the latter is discarded for all measurements in the data-to-theory comparison.
Since we had to limit our choice to measurements performed in the full phase space, we do not use any of the LHCb measurements of tt + X production [97-99], which provide tt + X cross sections with cuts on the tt decay products. In the future, any measurement of tt production at LHCb unfolded to the top-quark level without explicit cuts on the decay products would be useful.

χ 2 definition
The level of agreement between data and theory can be quantified using the χ² estimator. A χ² value is calculated by taking into account statistical and systematic experimental uncertainties as well as theoretical uncertainties:

    χ² = R_{N−1}^T Cov_{N−1}^{−1} R_{N−1} .    (3.1)

Here R_{N−1} is the column vector of residuals, calculated as the difference between the measured cross sections and the theoretical predictions after discarding one of the N bins (see below), and Cov_{N−1} is the (N−1) × (N−1) submatrix obtained from the full covariance matrix by discarding the corresponding row and column. The matrix Cov_{N−1} obtained in this way is invertible, while the original covariance matrix Cov is singular, because for normalised cross sections one degree of freedom is lost. The covariance matrix Cov is calculated as

    Cov = Cov^stat + Cov^syst + Cov^th + Cov^PDF ,    (3.2)

where Cov^stat and Cov^syst are the covariance matrices corresponding to the statistical and systematic uncertainties reported in the experimental papers, respectively, Cov^th consists of the numerical uncertainties of the theoretical predictions, and Cov^PDF comprises the PDF uncertainties. For some of the datasets [52, 55], a detailed breakdown of the systematic uncertainties into individual sources is reported in the corresponding paper instead of a covariance matrix. In such cases, the systematic covariance matrix Cov^syst is calculated as

    Cov^syst_{ij} = Σ_k (1/N_k) Σ_{l=1}^{N_k} C_{i,k,l} C_{j,k,l} ,    (3.3)

where C_{i,k,l} stands for the systematic uncertainty from variation l of source k in the i-th bin, and N_k is the number of variations for source k (most of the systematic uncertainty sources consist of positive and negative variations, i.e. N_k = 2); the sums run over all sources of systematic uncertainty and all corresponding variations. The covariance matrix Cov^th is a diagonal matrix which includes a 1% uncorrelated uncertainty on the predictions, assigned to cover any numerical inaccuracy and possible small systematic effects in the theoretical computations (see Section 2). In the same way, Cov^PDF is calculated from the variations of the theoretical predictions obtained using the individual PDF error members. All uncertainties are treated as additive, i.e. the relative uncertainties are used to scale the corresponding measured value in the construction of Cov^stat, Cov^syst, Cov^th and Cov^PDF. This treatment is consistent with the cross-section normalization procedure and makes the χ² independent of which of the N bins is excluded.
No correlations are assumed between individual datasets, because no such information is provided by the experiments 6. We deliberately opted to use normalized differential cross sections in order to minimize the impact of this missing information, since many correlated experimental systematic uncertainties cancel out for normalized cross sections.
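The χ² construction of eqs. (3.1)–(3.3) can be sketched in a few lines of code. This is a minimal illustration (the function and argument names are ours, not from the analysis code), covering the statistical, systematic and theory terms and omitting the PDF term for brevity:

```python
import numpy as np

def chi2_normalized(data, theory, rel_stat, syst_vars, rel_th=0.01, drop_bin=-1):
    """chi^2 for a normalized distribution, discarding one of the N bins.

    data, theory : arrays of length N (measured and predicted cross sections)
    rel_stat     : relative statistical uncertainties (length N)
    syst_vars    : list of (N_k, variations) per source; each variation is a
                   length-N array of relative shifts C_{i,k,l} as in eq. (3.3)
    All uncertainties are additive: relative uncertainties scale the
    measured values, as described in the text.
    """
    N = len(data)
    cov = np.diag((rel_stat * data) ** 2)          # Cov^stat
    cov += np.diag((rel_th * data) ** 2)           # Cov^th: 1% uncorrelated
    for nk, variations in syst_vars:               # Cov^syst, eq. (3.3)
        for c in variations:
            shift = c * data
            cov += np.outer(shift, shift) / nk
    keep = [i for i in range(N) if i != drop_bin % N]
    r = (data - theory)[keep]                      # residuals R_{N-1}
    cov_sub = cov[np.ix_(keep, keep)]              # submatrix Cov_{N-1}
    return float(r @ np.linalg.solve(cov_sub, r))  # eq. (3.1)
```

Because the uncertainties are relative to the measured values, the resulting χ² is indeed independent of which bin is dropped, as noted above.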

Comparison of the NNLO theoretical predictions with the experimental data
For our comparison of data and NNLO theoretical predictions, we use four state-of-the-art NNLO proton PDF sets as input for the theory computations: ABMP16 [93] and CT18 [94] 7, which we already used for the validation/comparison of purely theoretical predictions in Section 2, as well as MSHT20 [100] and NNPDF4.0 [67] 8. For each PDF set, we take the associated α_s(M_Z) value and α_s evolution via LHAPDF [101]. Namely, in the extraction of the ABMP16 PDFs the value α_s(M_Z) = 0.1147 ± 0.0008 was obtained in the fit, while for the extraction of the CT18, MSHT20 and NNPDF4.0 PDF sets a fixed value α_s(M_Z) = 0.118 was used. We recall that the ABMP16 fit incorporated data on the total tt + X production cross section at the LHC Run 1, while the CT18, MSHT20 and NNPDF4.0 fits have also incorporated some single- or double-differential distributions for this process. For one of the PDF + α_s(M_Z) sets (ABMP16), we consider different top-quark pole mass values (m_t^pole = 170, 172.5, 175 GeV), as well as the 7-point scale variation around the central renormalization and factorization scales, (µ_R, µ_F) = (ξ_R, ξ_F)(µ_R^0, µ_F^0) with (ξ_R, ξ_F) ∈ {(1, 1), (0.5, 0.5), (0.5, 1), (1, 0.5), (1, 2), (2, 1), (2, 2)}, and the nominal scales µ_R^0 = µ_F^0 = H_T/4, where H_T is the sum of the transverse masses of the top and antitop quarks [20, 39].
We start in Fig. 5 by comparing the absolute total tt + X cross sections at √s = 5.02, 7, 8, 13 and 13.6 TeV from Refs. [43-51] with the theoretical predictions. The first row of plots shows predictions obtained with different PDF + α_s(M_Z) sets at fixed m_t^pole = 172.5 GeV, the second row shows predictions for different m_t^pole values, and the third row shows predictions with ABMP16 for different (ξ_R, ξ_F) combinations (see the previous paragraph). For each comparison of data points with the corresponding theoretical predictions, a χ² value is reported. In addition, for each PDF + α_s(M_Z) set, an additional χ² value is provided (indicated in parentheses in the panels) that omits the PDF uncertainties. Such χ² values characterize the level of agreement between the data points and the theoretical predictions obtained using the central PDF member only, while the difference between these χ² values and the ones calculated with the PDF uncertainties provides an indication of the extent to which a given data sample could potentially constrain the PDF uncertainties of a particular PDF set. From the first row of plots one can see that, within the PDF uncertainties, all considered PDF sets describe the data well. The smallest PDF uncertainties occur for the NNPDF4.0 set, followed by the ABMP16, MSHT20 and CT18 sets. The sensitivity of the theoretical predictions to the PDF set decreases with increasing √s, since lower √s probe larger values of x, and the large-x region is characterized by a bigger PDF uncertainty, especially for the gluon PDF. From the second row of plots, one can conclude that in the case of ABMP16 the value of m_t^pole preferred by the data lies between 170 and 172.5 GeV, while the larger value m_t^pole = 175 GeV is clearly disfavoured. Note that for the comparison with theoretical predictions which use different values of m_t^pole or varied
7 In this work the PDF uncertainties of the CT18 set, evaluated at 90% confidence level, are rescaled to 68% confidence level for consistency with the other PDF sets.
8 For the NNPDF4.0 set, we use its variant with eigenvector uncertainties in order to be able to calculate the Cov^PDF matrix in analogy with the other PDF sets.
scales (i.e. for the second and third rows of plots), we do not show the PDF uncertainties on the plots for clarity, but we include them when calculating the χ² values displayed on these plots. As expected from kinematic considerations, the predicted tt + X cross section decreases with increasing m_t^pole. Furthermore, it is noticeable that the sensitivity of the predicted cross section to m_t^pole decreases slightly with increasing √s, since the mass then plays a smaller role in the kinematics of the process. The third row of plots shows the behaviour of the predicted cross sections under the different (ξ_R, ξ_F) combinations of the 7-point scale variation around the central dynamical scale µ_0 = H_T/4. One can see that the scale uncertainties are asymmetric, amount roughly to +3/−5%, and slightly decrease with increasing √s. Thus, the NNLO scale variation uncertainties for the total tt + X cross sections are larger than the experimental uncertainties of the most precise measurements of this process (e.g. in Ref. [51] the measured total tt + X cross section is reported with a 1.9% total uncertainty). No uncertainty associated with the scale dependence of the cross sections is included in the χ² calculation, because the scale variation uncertainty does not follow a Gaussian distribution; instead, for the extraction of m_t^pole the scale uncertainties are propagated directly (see Section 4).
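The asymmetric envelope obtained from the 7-point scale variation described above can be sketched as follows. The numbers are purely illustrative (chosen to mimic the ~+3/−5% pattern quoted in the text), not actual MATRIX output:

```python
# 7-point scale variation: (xi_R, xi_F) factors applied to the nominal
# scales mu_R^0 = mu_F^0 = H_T/4; by convention the extreme ratios
# (0.5, 2) and (2, 0.5) are excluded.
SEVEN_POINT = [(1, 1), (0.5, 0.5), (0.5, 1), (1, 0.5), (1, 2), (2, 1), (2, 2)]

def scale_envelope(xsec):
    """Asymmetric scale uncertainty from cross sections evaluated at the
    seven (xi_R, xi_F) points; xsec maps (xi_R, xi_F) -> sigma (in pb)."""
    central = xsec[(1, 1)]
    up = max(xsec[s] for s in SEVEN_POINT) - central
    down = central - min(xsec[s] for s in SEVEN_POINT)
    return up, down  # quoted as sigma(+up, -down)

# Illustrative (made-up) values for a total cross section of 800 pb:
toy = {(1, 1): 800.0, (0.5, 0.5): 824.0, (0.5, 1): 818.0, (1, 0.5): 810.0,
       (1, 2): 790.0, (2, 1): 764.0, (2, 2): 760.0}
up, down = scale_envelope(toy)  # up = 24.0 (+3%), down = 40.0 (-5%)
```

Since the envelope is a min/max over a fixed set of points rather than a Gaussian spread, it is kept out of the χ² and propagated directly to the extracted mass, as stated above.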
After the discussion of the absolute total inclusive cross sections, we present the comparison of the NNLO QCD theory predictions with normalized differential experimental data on inclusive tt + X hadroproduction. The comparisons are shown in Figs. 6-14, which refer to the single- or double-differential experimental data of Refs. [48, 52-54], obtained during Run 2 with the tt pair decaying in all possible channels (dileptonic, semileptonic, all-hadronic), and of Refs. [55-58], obtained during Run 1 9.
In Fig. 6 the absolute value of the rapidity distribution of the tt pair, |y(tt)|, is plotted in various tt invariant mass M(tt) bins, corresponding to different panels, and compared to the experimental data of Ref. [48], a CMS analysis with tt pairs decaying in the semileptonic channel. Considering the phase space of the measurement, the number of measured data points and their experimental uncertainties, this is presently the most precise LHC dataset for the double-differential tt + X production cross section as a function of M(tt) and |y(tt)|, employing unfolding from the final-state particle level to the final-state parton level. As in Fig. 5, the first row of plots shows predictions obtained with different PDF + α_s(M_Z) sets at fixed m_t^pole = 172.5 GeV, the second row shows predictions for different m_t^pole values, and the third row shows predictions for different (ξ_R, ξ_F) combinations. As for the absolute total tt + X cross sections, each comparison of data to the corresponding theoretical predictions is characterized by a χ² value. The first row clearly demonstrates that, when examining normalized cross sections, the best agreement between theoretical predictions and experimental data for the shape of the distributions is achieved with the ABMP16 PDFs. Predictions with CT18, MSHT20 and NNPDF4.0 show a similar trend among each other, but their shapes differ systematically from the experimental distribution at large |y(tt)|, overestimating it. This is particularly evident in the large-M(tt) bins. As in the case of the total tt + X cross sections, the PDF uncertainties are smallest for the NNPDF4.0 set, followed by the ABMP16, MSHT20 and CT18 sets. Sizeable differences among the χ² values are observed for the predictions obtained using different PDF sets, especially among those predictions which do not take into account the PDF uncertainties, providing a hint that these data are able to constrain the uncertainties of many of the modern PDF sets. At first sight, the better agreement of the ABMP16 predictions with the experimental data could look surprising, given that this fit includes only tt + X total cross-section data,
whereas the other fits also include some tt + X differential data. However, one should recall that ABMP16 is the only one among those fits in which the PDFs are fitted simultaneously with α_s(M_Z) and m_t, whereas the other PDFs are fitted at fixed m_t and α_s(M_Z) values. In the recent analysis assessing the impact of top-quark production at the LHC on global analyses of NNPDF4.0 PDFs in Ref. [102], it is claimed that this dataset cannot be reproduced by the NNLO SM predictions, resulting in a 22σ deviation, while here with the χ² values we demonstrate that the same dataset is perfectly consistent, e.g., with the ABMP16 PDF set (with χ²/dof = 20/34 and a p-value of 0.97) even using the nominal value m_t^pole = 172.5 GeV, and reasonably consistent with the NNPDF4.0 PDF set (χ²/dof = 55/34, p = 0.013, which corresponds to 2.5σ). The plots of the second row, all obtained with the ABMP16 PDFs, show that the larger the top-quark mass, varied in the range 170 GeV < m_t^pole < 175 GeV, the smaller the cross section at low M(tt), close to the threshold, while the opposite holds at high M(tt) > 420 GeV because of the cross-section normalization. From all panels of this row it is evident that the shape of the |y(tt)| distribution is almost insensitive to the top-quark pole mass value. The plots of the third row show the behaviour of the distribution under the different (ξ_R, ξ_F) combinations. One can see that the scale uncertainties increase at large M(tt), reaching up to ±3% in the highest M(tt) bin, values comparable to the data uncertainties in this kinematic region. Due to the cross-section normalization, the average size of the scale uncertainties for the normalized differential cross sections is a few times smaller than for the total cross section 10. The largest cancellation of the scale uncertainties between numerator and denominator of the normalized differential cross sections happens at small values of M(tt) and |y(tt)|, i.e. where the scale uncertainties in the numerator and denominator have similar size, while for high values M(tt) ≳ 800 GeV and |y(tt)| ≳ 1.5 the size of the scale uncertainties for the normalized differential cross sections approaches that for the total cross section. As in the case of the total tt + X cross sections, no uncertainty associated with the scale dependence of the cross section is included in the χ² calculation; these uncertainties are later propagated to the extracted values of m_t^pole (see Section 4). Fig. 7 presents a similar comparison, again for the |y(tt)| distribution in different M(tt) bins, but this time obtained in the CMS study of Ref.
[52] from the dileptonic decay channel of the top quarks. In this experimental work, triple-differential distributions were considered, in |y(tt)|, M(tt) and the number of additional jets N_j. Here we limit ourselves to double-differential distributions, because for a meaningful analysis of the N_j distribution, merged predictions for tt + X, ttj + X, ttjj + X, etc. should be considered.
NNLO QCD predictions, however, are at present only available for tt + X. In the first row of plots, results from different PDFs are compared, whereas in the second row, the effect of varying m_t^pole at fixed PDF + α_s(M_Z) is shown. The same trends as already noticed for the CMS data of Ref. [48] are observed in this case too, i.e. the shape of the |y(tt)| distribution is better reproduced by the ABMP16 fit than by the other PDF sets, and the use of different m_t^pole values does not have an impact on the shape, but only affects the normalization of the results. Theory predictions with all the considered central PDF sets slightly underestimate the data in the smallest M(tt) bin, but are still compatible with them within 2σ. As in Fig. 6, increasing the top-quark mass value decreases the prediction for the |y(tt)| distribution at small invariant masses, whereas at high M(tt), i.e. for M(tt) > 400 GeV, the trend is the opposite due to the cross-section normalization.
Fig. 8 presents the M(tt) distribution in different |y(tt)| bins, corresponding to different panels. Theory predictions with different PDF + α_s(M_Z) sets at fixed m_t^pole value (first row) and for different m_t^pole values in the case of the ABMP16 fit (second row) are compared to the ATLAS data of Ref. [53], with the tt pairs decaying in the semileptonic channel. Similar comments apply as for Fig. 6 and Fig. 7, i.e. at small M(tt) increasing the top-quark mass value decreases the predictions, and vice versa in the high-M(tt) tails. Additionally, from the four panels in the first row, it is clear that the theory predictions agree with the data within the larger experimental uncertainties even at small M(tt), differently from what has been observed for the CMS experimental data of Fig. 7 and Fig. 10 (see the plot in the first panel of the first row). However, among all the χ² values per degree of freedom for the theory-data comparisons in Figs. 6-14, the best ones vary within 0.25 ≲ χ²/dof ≲ 1, and the corresponding p-values indicate that the experimental uncertainties might be conservative. It would be interesting to understand the origin of this difference, which might be related to technical details of the analyses performed independently by the two collaborations (ATLAS and CMS). In particular, to some extent the experimentally measured cross sections always depend on the value of the top-quark mass parameter used in the Monte Carlo (MC) simulations. The MC simulations are a necessary ingredient to unfold the detector-level cross sections to the parton level. In the CMS analysis of Ref.
[52] such a dependence was studied, and a special analysis technique, called "loose kinematic reconstruction", was developed in order to avoid the dependence on the MC top-quark mass. In that analysis, the loose kinematic reconstruction was used to measure triple-differential cross sections as functions of M(tt), |y(tt)| and the extra-jet multiplicity. As a result, for this measurement the dependence of the measured tt + X cross sections on the MC top-quark mass was demonstrated to be negligible (see [52] and the discussion in Section 3.1). No such studies were reported in the papers describing the other CMS or ATLAS tt + X differential measurements.
Fig. 9 shows again the |y(tt)| distribution in M(tt) bins, this time for the ATLAS experimental data of Ref. [54], with the tt pairs decaying in the all-hadronic channel. Considering the larger data uncertainties, one can see a reasonable agreement between theory predictions and experimental data in all bins. The shape of the distributions with different PDFs is in qualitative agreement with the one already observed in the analyses of tt in the dileptonic and semileptonic decay channels. Theory predictions slightly overestimate the data in the two largest M(tt) bins at M(tt) > 700 GeV. Thus, this dataset agrees better with theoretical predictions using a lower value of m_t^pole; however, due to the very wide 0 < M(tt) < 700 GeV bin, the sensitivity of these data to m_t^pole is small. Fig. 10 presents a comparison between theory predictions and experimental data similar to those in Figs. 6, 7 and 9, but considering the double-differential data in |y(tt)| and M(tt) of Ref. [55], from an analysis of the CMS collaboration using the dileptonic tt decay channel at √s = 8 TeV (Run 1). From the comparison with the previous figures, all related to Run 2, one can observe compatible trends, with the ABMP16 PDFs well reproducing the shape and normalization of the |y(tt)| distribution for M(tt) > 400 GeV, whereas the theory/data agreement for M(tt) < 400 GeV is slightly worse, but still compatible within 2σ.
Figs. 11-14 refer to single-differential M(tt) distributions, compared to Run 1 ATLAS experimental data at √s = 7 and 8 TeV. Considering the reduced amount of information with respect to double-differential distributions, due to the integration over y(tt) in the full phase space for each M(tt) bin and the small number of data points, all these measurements agree well with the theoretical predictions and have limited sensitivity to PDFs and m_t^pole. For each of these datasets, the best χ² value is χ² < 2 for 4-6 degrees of freedom, suggesting that even for these ATLAS measurements of tt + X differential cross sections performed during Run 1, the experimental uncertainties might be too conservative.
Considering all the comparisons between experimental data and theory predictions together, both for the total and the differential inclusive tt + X cross sections, we report an overall good agreement between the NNLO QCD predictions and the data. For all datasets, the best value of χ² per degree of freedom does not exceed ≈ 1, whereas for several ATLAS datasets this value and the corresponding p-value hint at a possible overestimation of the experimental uncertainties. For the differential measurements, one can conclude that, apart from a few (|y(tt)|, M(tt)) bins, the experimental data from different analyses can be considered roughly compatible among each other.

NNLO fits of the top-quark pole mass value
We now focus on the fits of the top-quark pole mass value based on the NNLO description discussed in the previous section. The fits were performed using the xFitter framework [59], an open-source QCD fit framework initially developed for extracting PDFs, then extended to also extract SM parameters correlated with PDFs (e.g. heavy-quark masses) and, more recently, to constrain couplings in the SM effective field theory (SMEFT) [103]. To study the sensitivity to PDFs and α_s(M_Z), each fit was carried out using as input different PDF sets with their associated α_s(M_Z). The same state-of-the-art sets considered in the previous section (ABMP16, CT18, MSHT20 and NNPDF4.0) were used here as well. As we will see in the following, the m_t^pole values extracted with different PDF + α_s(M_Z) sets are well compatible among each other, which justifies the procedure. A more sophisticated procedure would involve a simultaneous fit of m_t, the PDFs and α_s(M_Z), in line with the procedure of Ref. [93] at NNLO and of, e.g., Refs. [52, 104, 105] at NLO. However, Ref. [93] includes only tt + X total cross sections, whereas in this work we also consider single- and double-differential distributions, which would make a simultaneous fit of the three quantities at NNLO quite a major effort. Additionally, because the differential cross sections are normalized, the degree of correlation between α_s(M_Z) and m_t, which is very large when considering absolute total cross-section data, is significantly reduced. We also note that, while the ABMP16 fit incorporated data on the total tt + X production cross section at the LHC Run 1, the CT18, MSHT20 and NNPDF4.0 fits have also included some single- and double-differential distributions from Run 1 and 2. No triple-differential tt + X distributions from Ref. [52] have been incorporated in the standard released version of any of these PDF fits, at least so far.
On the other hand, dedicated studies on the impact of top-quark production cross sections on various recent PDF fits have also been published. In particular, Ref. [106] builds upon the MMHT PDFs, Ref. [107] describes the impact of ATLAS and CMS single-differential data at √s = 8 TeV on the CTEQ-TEA fit, Ref. [66] explores the impact of single-differential data at √s = 13 TeV on the CT18 PDF fit, Ref. [108] shows the impact of including tt distributions in the CT14HERA2 PDFs, Ref. [102] assesses the impact of tt production at the LHC on the NNPDF PDFs and on Wilson coefficients in the SMEFT, and Ref. [109] describes efforts to constrain the top-quark mass within the global MSHT fit. Our work consistently employs NNLO predictions (instead of NLO ones rescaled by means of K-factors), considers a wider set of more specific (i.e. double-differential) state-of-the-art tt + X experimental data than most of the previous studies, and considers multiple modern PDF fits simultaneously, aiming to provide a more comprehensive overview.
In each fit, the value of m_t^pole is extracted by calculating the χ² from data and the corresponding theoretical predictions for a few values of m_t^pole, and approximating the dependence of the χ² on m_t^pole with a parabola. The minimum of the parabola is taken as the extracted m_t^pole value, while its uncertainty is derived from the Δχ² = 1 variation. This prescription assumes a linear dependence of the predicted cross sections on m_t^pole. As an example, Fig. 15 shows the χ² profiles of the m_t^pole scans for our most global fit, including total and differential cross-section data from both Run 1 and Run 2. The left panel shows that the profiles obtained using different input PDF sets all have a parabolic shape with a clear minimum and are quite similar among each other. We use the values m_t^pole = 170, 172.5 and 175 GeV to build each parabola, but we also show the χ² values obtained by repeating the computations with input values m_t^pole = 165, 167.5 and 177.5 GeV. The latter values agree with the parabola reasonably well, thus justifying the linear-dependence assumption even outside the range 170 < m_t^pole < 175 GeV. The top-quark mass value at the minimum of the χ² varies by up to 0.6 GeV between the PDF sets considered. For the ABMP16 PDF set, the value of χ² at the minimum is χ² = 101 for 121 dof (corresponding to a total of 122 data points). From the right panel, obtained using the ABMP16 PDFs, it is clear that the extracted top-quark mass value is quite stable with respect to the 7-point (µ_R, µ_F) scale variation by a factor of two around the central scale. Under this variation, the central value of the extracted top-quark mass changes by at most 0.2 GeV. On the other hand, the uncertainty on m_t^pole derived from the Δχ² = 1 variation, which accounts for the uncertainties of the ABMP16 PDFs, the data and the other sources included in the covariance matrix, see eq. (3.2), and thus excludes scale uncertainties, amounts to 0.3 GeV. We assign the maximum difference in the extracted m_t^pole values under the 7-point (µ_R, µ_F) scale variation as the scale variation uncertainty on the extracted m_t^pole value. When treated in this way, the scale variation uncertainty cannot be constrained by the data (see also the related comment in Section 3.3). These uncertainties are asymmetric, as a consequence of the fact that the changes of the cross sections under scale variations are asymmetric. The χ² values obtained in the fit for each dataset are summarized in Table 3.
In Figs. 16 and 17 the results of the m_t^pole extraction from various (groups of) experimental datasets are shown. In the upper left and upper right panels of Fig. 16, results related to Run 1 and Run 2 measurements of differential tt + X cross sections are shown separately, considering the fit to each individual Run 1 (Run 2) dataset and the fit to all Run 1 (Run 2) differential datasets. Fig. 17 reports the results of the fit to the Run 1 + Run 2 data on total cross sections only, as well as the results of the global fit, including both the total and the single/double-differential cross-section Run 1 + Run 2 data. In the right panel of Fig. 16 we do not show, as a separate plot, the results of the fit to the ATLAS measurement in the all-hadronic tt decay channel of Ref. [54], since it has a very low sensitivity to m_t^pole, with m_t^pole uncertainties of ∼ 5 GeV when using this dataset alone; this measurement is in any case included in both the Run 2 and the Run 1 + Run 2 global fits. By comparing the two panels in Fig.
16, one can see that for fits to individual datasets the fit uncertainty component due to data uncertainties 11 is larger in the case of Run 1 than of Run 2 datasets. In the Run 2 analyses, in general, more accurate systematic uncertainty estimates have been performed, sometimes leading to an increase of the latter with respect to the Run 1 cases. On the contrary, the statistical uncertainties in the Run 2 data are indeed smaller. The balance between these two trends reduces the data-related uncertainties by a factor ∼ 1.5. On the other hand, the data uncertainties in fits to tt data in the dileptonic channel can be larger or smaller than those in the semileptonic channel, depending on the details of the analysis. In the dileptonic channels, the neutrino and antineutrino are not detected, and their three-momenta (six unknowns) are reconstructed on the basis of kinematic considerations, whereas in the semileptonic channel only one of them needs to be reconstructed, which is an advantage when reconstructing differential distributions. However, jet energy scale uncertainties may become important in the semileptonic channel, due to the presence of two light jets (absent in the dileptonic channel). The dileptonic channel is less sensitive to pile-up and, when studying the (e±, µ∓) signature, becomes the most practical one for measuring total cross sections, due to the absence of Drell-Yan background. For more information on the peculiarities of each analysis and the kinematic reconstruction techniques, the reader can refer to the corresponding experimental papers (see the lists in Tables 1 and 2). Both for the Run 1 and for the Run 2 global fits, the m_t^pole central values obtained for each PDF set are compatible among each other and, although systematically slightly lower, also compatible with the value m_t^pole = 172.5 ± 0.7 GeV reported in the PDG [110], when considering the uncertainties. The results of the extraction using either differential or total tt + X cross sections agree with each other within ≈ 1σ for any PDF set. We consider the compatibility of the results as a sign of their robustness. The global fit to the ensemble of Run 2 datasets shows slightly smaller data-related uncertainties than the global fit to the ensemble of Run 1 datasets, whereas the scale uncertainties are similar (and slightly larger or smaller, depending on the PDF) in the two cases. One can also observe that data related to tt decays in the dileptonic channel (Refs. [52, 55]) point towards central m_t^pole values smaller than data related to decays in the semileptonic channel (Ref. [53] by ATLAS and Ref. [48] by CMS). To explore this further, in Fig. 18 we compare the values of m_t^pole extracted using the ABMP16 PDF set from differential measurements grouped either by experiment or by decay channel. The values extracted from all ATLAS and all CMS differential measurements are compatible within 2.5σ, and the same level of compatibility is observed for the results extracted from the measurements in the dileptonic and semileptonic tt decay channels. In both cases, the difference originates almost entirely from the CMS measurements of Refs. [52, 55]. As for the theoretical uncertainties on the extracted m_t^pole values, the PDF uncertainties are noticeably larger for the single-differential measurements than for the double-differential ones. We attribute this to the fact that the use of the |y(tt)| observable provides added value to constrain the PDF uncertainties and to decorrelate the gluon PDF and m_t^pole.
11 This uncertainty component (the only one visible) was added to the data uncertainties on these plots.
In general, all extractions from the differential cross-section measurements are dominated by the data uncertainties, while the PDF and scale variation uncertainties are a few times smaller. Even for the combined extraction from the Run 1 or Run 2 differential cross-section measurements, both the PDF and scale uncertainties are about a factor of two smaller than the data uncertainties. This suggests that a simultaneous NNLO fit of the PDFs, α_s(M_Z) and the top-quark mass, including the tt + X production data, is within reach. We plan to perform such a fit in future work, upgrading the precision and accuracy of the NLO fit results we presented in Ref. [105].
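The parabolic χ² scan used for the mass extraction can be sketched in a few lines. This is a minimal illustration with invented χ² values (the function name is ours); it fits χ²(m) = a m² + b m + c through the scanned points, takes the parabola minimum as the extracted mass, and obtains the uncertainty from the Δχ² = 1 rule:

```python
import numpy as np

def extract_mass(masses, chi2_values):
    """Parabolic chi^2 scan: return (m_min, uncertainty), where the
    uncertainty follows from chi^2(m_min +/- dm) - chi^2(m_min) = a*dm^2 = 1."""
    a, b, c = np.polyfit(masses, chi2_values, 2)  # parabola coefficients
    m_min = -b / (2.0 * a)                        # location of the minimum
    return m_min, 1.0 / np.sqrt(a)                # dm = 1/sqrt(a)

# Toy scan at the three mass points used in the text (chi^2 values invented):
masses = [170.0, 172.5, 175.0]
chi2s = [110.0, 101.0, 118.0]
m_fit, dm = extract_mass(masses, chi2s)
```

With three scan points the parabola is fixed exactly; the additional points at 165, 167.5 and 177.5 GeV mentioned above serve only to check that the quadratic shape, and hence the linear-dependence assumption, extrapolates well.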

Figure 1. Comparison between the NNLO differential cross sections as a function of a) the average transverse momentum of the top quark and antiquark (upper left), b) the average rapidity of the top quark and antiquark (upper right), c) the invariant mass of the top-antitop quark pair (lower left) and d) the rapidity of the top-antitop quark pair (lower right), computed with MATRIX using the cuts r_0 = 0.0015 and r_0 = 0.0005, and the results from Ref. [20], dubbed CHM in the figure, with their numerical uncertainties.

Figure 2. Comparison between the NNLO differential cross sections as a function of the rapidity of the top-antitop quark pair in different ranges of the invariant mass of the top-antitop quark pair, computed with MATRIX using the cuts r_0 = 0.0015 and r_0 = 0.0005.

Figure 3. Comparison of the NNLO differential cross sections as a function of M(tt) computed with MATRIX with the results obtained using the HighTEA project platform [28]. The error bars account for the numerical uncertainties accompanying the two computations.

Figure 4. Comparison of the NNLO differential cross sections with the ABMP16 (upper row) and the CT18 (lower row) PDFs as a function of |y(tt)|, computed with MATRIX with their numerical uncertainties, and the ones obtained from the PineAPPL grids. Each panel in a row refers to a different M(tt) region, as in the experimental measurement from Ref. [48].

Figure 5. Comparison of the experimental data on the total tt + X cross sections at different √s from Refs. [43-51] to the NNLO predictions obtained using different PDF sets (upper) and, for the ABMP16 central PDF member, different m_t^pole values (lower).

Figure 6. Comparison of the experimental data from Ref. [48] to the NNLO predictions obtained using different PDF sets (upper) and, for the ABMP16 central PDF member, different m_t^pole values (middle) and different scales (lower).

Figure 7. Comparison of the experimental data from Ref. [52] to the NNLO predictions obtained using different PDF sets (upper) and, for the ABMP16 central PDF member, different m_t^pole values (lower).

Figure 8. Comparison of the experimental data from Ref. [53] to the NNLO predictions obtained using different PDF sets (upper) and, for the ABMP16 central PDF member, different m_t^pole values (lower).

Figure 15. The m_t^pole extraction at NNLO from all experimental data using different PDF sets (left) and using the ABMP16 PDF set with different scale choices (right).

Figure 16. The m_t^pole values extracted at NNLO from Run 1 (left) and Run 2 (right) measurements of differential tt + X cross sections.

Figure 17. Summary of the m_t^pole values extracted from Run 1 and Run 2 measurements of differential and total tt + X cross sections (the first two insets are the same as the lowest insets in Fig. 16).
± 1 GeV, see e.g. Refs. [52, 55]). These variations are included in the systematic uncertainties accompanying the measured cross sections. If, however, the real value of these two parameters lies outside the range of the considered variations, this would introduce a bias. The m_t^kin constraint is not used in the loose kinematic reconstruction developed in the CMS measurement of Ref. [52]. As a consequence, for this measurement the dependence of the measured tt + X cross sections on m_t^MC was demonstrated to be negligible (see Fig. C.1 in Ref. [52]); cf. also Section 3.3.
Comparison of the experimental data from Ref. [56] to the NNLO predictions obtained using different PDF sets (left) and, for the ABMP16 central PDF member, different m_t^pole values (right). Comparison of the experimental data from Ref. [57] to the NNLO predictions obtained using different PDF sets (left) and, for the ABMP16 central PDF member, different m_t^pole values (right).

Table 3. The global and partial χ² values for each dataset, with its number of data points (n), obtained in the m_t^pole extraction using different PDF sets, together with the best-fit values of m_t^pole. An additional χ² value, which omits the PDF uncertainties, is indicated in parentheses.
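The partial χ² values per dataset follow the standard correlated-χ² definition, χ² = rᵀ C⁻¹ r, with residuals r between data and theory and covariance matrix C. A minimal sketch with toy numbers and a diagonal (uncorrelated) covariance, for illustration only, not the actual xFitter implementation, which also handles correlated systematic shifts via nuisance parameters:

```python
import numpy as np

def chi2(data, theory, cov):
    """Correlated chi-square for one dataset:
    chi2 = r^T C^{-1} r, with residuals r = data - theory."""
    r = np.asarray(data, dtype=float) - np.asarray(theory, dtype=float)
    # Solve C x = r instead of inverting C explicitly (more stable).
    return float(r @ np.linalg.solve(np.asarray(cov, dtype=float), r))

# Toy 3-bin example with uncorrelated uncertainties (illustrative only):
data   = [1.05, 0.98, 1.10]
theory = [1.00, 1.00, 1.00]
cov    = np.diag([0.05**2, 0.04**2, 0.06**2])
print(f"chi2/n = {chi2(data, theory, cov):.2f}/3")
```

Omitting a given uncertainty source (e.g. the PDF uncertainties, as in the parenthesized values) amounts to dropping its contribution from C before evaluating χ².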