1 Introduction

Measurements of top quark pair (\(\hbox {t}{\bar{\hbox {t}}}\)) production are important for checking the validity of the standard model (SM) and searching for new phenomena. In particular, the large data set delivered by the CERN LHC allows precise measurements of the \(\hbox {t}{\bar{\hbox {t}}}\) production cross section as a function of \(\hbox {t}{\bar{\hbox {t}}}\) kinematic observables. These can be used to check the most recent predictions of perturbative quantum chromodynamics (QCD) and to constrain input parameters, some of which are fundamental to the SM. At the LHC, top quarks are predominantly produced via gluon-gluon fusion. Using measurements of the production cross section in a global fit of parton distribution functions (PDFs) can help determine the gluon distribution at large values of x [1,2,3], where x is the fraction of the proton momentum carried by a parton. Furthermore, measurements of the cross section as a function of the \(\hbox {t}{\bar{\hbox {t}}}\) invariant mass, from the threshold to the TeV region, provide high sensitivity for constraining the top quark pole mass, \(m_{{\text {t}}}^{{\text {pole}}}\), which is defined as the pole of the top quark propagator (see e.g. Refs. [4,5,6]). At LHC energies, a large fraction of \(\hbox {t}{\bar{\hbox {t}}}\) events is produced with additional hard jets in the final state. Events containing such additional jets constitute important backgrounds for interesting but rare SM processes such as the associated production of a Higgs boson and \(\hbox {t}{\bar{\hbox {t}}}\), as well as for searches for new physics associated with \(\hbox {t}{\bar{\hbox {t}}}\) production, and must therefore be well understood. Within the SM, processes with extra jets can also be used to constrain the strong coupling strength, \(\alpha _{S}\), at the scale of the top quark mass. 
Furthermore, the production of \(\hbox {t}{\bar{\hbox {t}}}\) in association with extra jets provides additional sensitivity to \(m_{{\text {t}}}^{{\text {pole}}}\) since gluon radiation depends on \(m_{{\text {t}}}^{{\text {pole}}}\) through threshold and cone effects [7].

Differential cross sections for \(\hbox {t}{\bar{\hbox {t}}}\) production have been measured previously in proton-antiproton collisions at the Tevatron at a centre-of-mass energy of 1.96\(\,{\text {TeV}}\) [8, 9] and in proton-proton (\({\text {p}}{\text {p}}\)) collisions at the LHC at \(\sqrt{s} = 7\) \(\,{\text {TeV}}\) [10,11,12,13,14], 8\(\,{\text {TeV}}\) [14,15,16,17,18,19,20,21], and 13\(\,{\text {TeV}}\) [22,23,24,25,26,27]. A milestone was reached in three CMS analyses [20, 22, 23], where the \(\hbox {t}{\bar{\hbox {t}}}\) production dynamics was probed with double-differential cross sections. The first analysis [20] used data recorded at \(\sqrt{s}=8\) \(\,{\text {TeV}}\) by the CMS experiment in 2012. Only \(\hbox {t}{\bar{\hbox {t}}} \) decays in which, after the decay of each top quark into a bottom quark and a \({\text {W}}\) boson, both \({\text {W}}\) bosons decay leptonically were considered. Specifically, the \({\text {e}}^\pm {\upmu }^\mp \) decay mode (\({\text {e}}{\upmu }\)) was selected, requiring two oppositely charged leptons and at least two jets. Our present paper provides a new measurement, following the procedures of Ref. [20]. It is based on data taken by the CMS experiment in 2016 at \(\sqrt{s}=13\) \(\,{\text {TeV}}\), corresponding to an integrated luminosity of \(35.9 \pm 0.9{\,{\text {fb}}^{-1}} \). In addition to \({\text {e}}{\upmu }\), the decay modes \({\text {e}}^{+} {{\text {e}}}^{-} \) (\({\text {e}}{\text {e}}\)) and \({{\upmu }}^{+} {{\upmu }}^{-} \) (\({\upmu }{\upmu }\)) are also selected, thereby roughly doubling the total number of expected \(\hbox {t}{\bar{\hbox {t}}}\) signal events. The present measurement complements the analyses of Refs. [22, 23], which are also based on data taken at \(\sqrt{s}=13\) \(\,{\text {TeV}}\) but use \(\hbox {t}{\bar{\hbox {t}}}\) decays in the \(\ell {\text {+jets}}\,\) (\(\ell ={\text {e}},{\upmu }\)) final state.

As in the previous work [20], measurements are performed of the normalised double-differential \(\hbox {t}{\bar{\hbox {t}}} \) production cross section as a function of observables describing the kinematic properties of the top quark, top antiquark, and the \(\hbox {t}{\bar{\hbox {t}}}\) system: the transverse momentum of the top quark, \(p_{{\mathrm {T}}} ({\text {t}})\), rapidity of the top quark, \(y({\text {t}})\); the transverse momentum, \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\), the rapidity, \(y(\hbox {t}{\bar{\hbox {t}}})\), and invariant mass, \(M(\hbox {t}{\bar{\hbox {t}}})\), of the \(\hbox {t}{\bar{\hbox {t}}}\) system; the pseudorapidity difference between the top quark and antiquark, \({\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}})\), and the angle between the top quark and antiquark in the transverse plane, \({\varDelta }\phi ({\text {t}},{\bar{{\text {t}}}})\). When referring to the kinematic variables \(p_{{\mathrm {T}}} ({\text {t}})\) and \(y({\text {t}})\), we use only the parameters of the top quark and not of the top antiquark, to avoid double counting of events. In all, the double-differential \(\hbox {t}{\bar{\hbox {t}}}\) cross section is measured as a function of six different pairs of kinematic variables. As demonstrated in Ref. [20], the different combinations of kinematic variables are sensitive to different aspects of the QCD calculations.

For the first time at the LHC, the triple-differential cross section is measured as a function of \(M(\hbox {t}{\bar{\hbox {t}}})\), \(y(\hbox {t}{\bar{\hbox {t}}})\), and \(N_{{\text {jet}}}\), where \(N_{{\text {jet}}}\) is the number of extra jets not arising from the decay of the \(\hbox {t}{\bar{\hbox {t}}}\) system. For this purpose, the kinematic reconstruction algorithm is optimised to determine the invariant mass of the \(\hbox {t}{\bar{\hbox {t}}}\) system in an unbiased way. As will be shown below, the triple-differential measurements provide tight constraints on the parametrised gluon PDF, as well as on \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\). Previous studies of additional jet activity in \(\hbox {t}{\bar{\hbox {t}}}\) events at the LHC can be found in Refs. [23, 28, 29]. The \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) parameters were also extracted from measurements of the total inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production cross sections in Refs. [30,31,32,33,34,35].

The measurements are defined at parton level and must therefore be corrected for effects of hadronisation, and detector resolution and inefficiency. A regularised unfolding process is performed simultaneously in bins of the two or three variables in which the cross sections are measured. The normalised differential \(\hbox {t}{\bar{\hbox {t}}}\) cross section is determined by dividing the distributions by the measured total inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production cross section, where the latter is evaluated by integrating over all bins in the respective observables.

The parton-level results are compared with theoretical predictions obtained with the generators powheg (version 2) [36, 37] and mg5_amc@nlo [38], interfaced to pythia [39, 40] for parton showering, hadronisation, and multiple-parton interactions (MPIs). They are also compared to theoretical predictions obtained at next-to-leading-order (NLO) QCD using several sets of PDFs, after applying corrections for non-perturbative (NP) effects.

The structure of the paper is as follows: Sect. 2 contains a brief description of the CMS detector. Details of the event simulation are given in Sect. 3. The event selection, kinematic reconstruction, and comparison between data and simulation are described in Sect. 4. The unfolding procedure is detailed in Sect. 5, the method to determine the differential cross sections is presented in Sect. 6, and the assessment of the systematic uncertainties is discussed in Sect. 7. We show the results of the measurement and their comparison to theoretical predictions in Sect. 8. Section 9 presents the extraction of \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) from the measured \(\hbox {t}{\bar{\hbox {t}}}\) cross section, using several sets of PDFs, and Sect. 10 presents the simultaneous fit of the PDFs, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}}\) to the data. Finally, Sect. 11 provides a summary.

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of \(6\,{\text {m}} \) internal diameter, providing a magnetic field of \(3.8\,{\text {T}} \). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the \(\eta \) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionisation detectors embedded in the steel flux-return yoke outside the solenoid. Events of interest are selected using a two-tiered trigger system [41]. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around \(100\,{\text {kHz}} \) within a time interval of less than \(4\mu \hbox {s}\). The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimised for fast processing, and reduces the event rate to around \(1\,{\text {kHz}} \) before data storage. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [42].

3 Event simulation

Simulations of physics processes are performed with Monte Carlo (MC) event generators and serve three purposes. Firstly, they provide representative SM predictions of \(\hbox {t}{\bar{\hbox {t}}}\) production cross sections to be compared to the results of this analysis. Secondly, generated \(\hbox {t}{\bar{\hbox {t}}}\) signal events interfaced with a detector simulation are used to determine the corrections for the effects of hadronisation, reconstruction and selection efficiencies, and resolutions that are to be applied to the data. Thirdly, generated background processes interfaced with the detector simulation provide predictions for the backgrounds. All MC programs used in this analysis perform the event generation in several steps: matrix-element (ME) level, parton showering matched to ME, hadronisation, and underlying event, including multiparton interactions (MPIs). The \(\hbox {t}{\bar{\hbox {t}}}\) signal processes are simulated with ME calculations at NLO in QCD. For all simulations the proton structure is described by the NNPDF 3.0 NLO PDF set with \(\alpha _{S}(m_{{\text {Z}}}) = 0.118\) [43], where \(m_{{\text {Z}}} = 91\) \(\,{\text {GeV}}\) is the \({\text {Z}}\) boson mass [44], and the value of the top quark mass parameter is fixed to \(m_{{\text {t}}}^{{\text {MC}}} =172.5\,{\text {GeV}} \). For the default signal simulation, the powheg (version 2) [36, 45, 46] generator is taken. The \(h_{\text {damp}}\) parameter of powheg, which regulates the damping of real emissions in the NLO calculation when matching to the parton shower, is set to \(h_{\text {damp}} = 1.581 m_{{\text {t}}}^{{\text {MC}}} \) [47]. The pythia program (version 8.2) [40] with the CUETP8M2T4 tune [47,48,49] is used to model parton showering, hadronisation and MPIs. An alternative sample is generated using the mg5_amc@nlo (version 2.2.2) [38] generator, including up to two extra partons at the ME level at NLO. 
In this setup, referred to as mg5_amc@nlo + pythia, MadSpin [50] is used to model the decays of the top quarks while preserving their spin correlation. The events are matched to pythia using the FxFx prescription [51]. A second alternative sample is generated with powheg and interfaced with herwig++ (version 2.7.1) [52] using the EE5C tune [53].

The main background contributions originate from single top quarks produced in association with a \({\text {W}}\) boson (\({\text {t}}{\text {W}}\)), \({\text {Z}}/{\upgamma }^{*}\) bosons produced with additional jets (\({\text {Z}}\)+jets), \({\text {W}}\) boson production with additional jets (\({\text {W}}\)+jets) and diboson (\({\text {W}}\) \({\text {W}}\), \({\text {W}}\) \({\text {Z}}\), and \({\text {Z}}\) \({\text {Z}}\)) events. Other backgrounds are negligible. For all background samples, the NNPDF3.0 [43] PDF set is used and parton showering, hadronisation, and MPIs are simulated with pythia. Single top quark production is simulated with powheg (version 1) [37, 54] using the CUETP8M2T4 tune in pythia with the \(h_{\text {damp}}\) parameter set to 172.5\(\,{\text {GeV}}\) in powheg. The \({\text {Z}}\)+jets process is simulated at NLO using mg5_amc@nlo with up to two additional partons at ME level and matched to pythia using the FxFx prescription. The \({\text {W}}\)+jets process is simulated at leading order (LO) using mg5_amc@nlo with up to four additional partons at ME level and matched to pythia using the MLM prescription [55]. Diboson events are simulated with pythia. Predictions are normalised based on their theoretical cross sections and the integrated luminosity of the data sample. The cross sections are calculated to approximate next-to-NLO (NNLO) for single top quark in the \({\text {t}}\) \({\text {W}}\) channel [56], NNLO for \({\text {Z}}\)+jets and \({\text {W}}\)+jets [57], and NLO for diboson production [58]. 
The \(\hbox {t}{\bar{\hbox {t}}}\) simulation is normalised to a cross section of \(832\,^{+20}_{-29}\,{\text {(scale)}} \pm 35\,({{\mathrm {PDF}}} + \alpha _{S})\,{\text {pb}} \), calculated with the Top++ (version 2.0) program [59] at NNLO, including resummation of next-to-next-to-leading-logarithm (NNLL) soft-gluon terms, assuming \(m_{{\text {t}}}^{{\text {pole}}} = 172.5\,{\text {GeV}} \) and the proton structure described by the CT14 NNLO PDF set [60].

To model the effect of additional \({\text {p}}{\text {p}}\) interactions within the same bunch crossing (pileup), simulated minimum bias interactions are added to the simulated data. Events in the simulation are then weighted to reproduce the pileup distribution in the data, which is estimated from the measured bunch-to-bunch instantaneous luminosity assuming a total inelastic \({\text {p}}{\text {p}}\) cross section of 69.2\(\,{\text {mb}}\) [61].
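The pileup weighting described above can be illustrated with a minimal sketch. It assumes that the weight is the per-bin ratio of the normalised pileup distribution in data to that in simulation; the text does not spell out the formula, and the function name and the toy histograms are hypothetical.

```python
import numpy as np

def pileup_weights(data_hist, mc_hist):
    """Per-bin weights that map the simulated pileup distribution onto the data one.

    Assumption: the weight is the ratio of the normalised data and MC histograms.
    """
    data = np.asarray(data_hist, dtype=float)
    mc = np.asarray(mc_hist, dtype=float)
    data_pdf = data / data.sum()
    mc_pdf = mc / mc.sum()
    # Bins that are empty in simulation cannot be reweighted and get weight 0.
    return np.divide(data_pdf, mc_pdf, out=np.zeros_like(data_pdf), where=mc_pdf > 0)

# Toy histograms of the number of pileup interactions (hypothetical values).
data_hist = [10.0, 30.0, 40.0, 20.0]
mc_hist = [25.0, 25.0, 25.0, 25.0]
weights = pileup_weights(data_hist, mc_hist)
```

Each simulated event would then be weighted by the entry of `weights` that corresponds to its number of pileup interactions.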

In all cases, the interactions of particles with the CMS detector are simulated using Geant4 (version 9.4) [62].

4 Event selection and \(\hbox {t}{\bar{\hbox {t}}}\) kinematic reconstruction

The event selection procedure follows closely the one reported in Ref. [27]. Events are selected that correspond to the decay topology where both top quarks decay into a \({\text {W}}\) boson and a \({\text {b}}\) quark, and each of the \({\text {W}}\) bosons decays directly into an electron or a muon and a neutrino. This defines the signal, while all other \(\hbox {t}{\bar{\hbox {t}}}\) events, including those with at least one electron or muon originating from the decay of a \({\uptau }\) lepton, are regarded as background. The signal comprises three distinct final state channels: the same-flavour channels corresponding to two electrons (\({\text {e}}^{+} {{\text {e}}}^{-} \)) or two muons (\({{\upmu }}^{+} {{\upmu }}^{-} \)) and the different-flavour channel corresponding to one electron and one muon (\({\text {e}}^\pm {\upmu }^\mp \)). Final results are derived by combining the three channels.

At HLT level, events are selected either by single-lepton or dilepton triggers. The former require the presence of at least one electron or muon and the latter the presence of either two electrons, two muons, or an electron and a muon. For the single-electron and -muon triggers, \(p_{{\mathrm {T}}}\) thresholds of 27 and 24\(\,{\text {GeV}}\) are applied, respectively. The same-flavour dilepton triggers require either an electron pair with \(p_{{\mathrm {T}}} > 23\,{\text {GeV}} \) for the leading electron and \(p_{{\mathrm {T}}} > 12 \,{\text {GeV}} \) for the subleading electron or a muon pair with \(p_{{\mathrm {T}}} > 17 \,{\text {GeV}} \) for the leading muon and \(p_{{\mathrm {T}}} > 8 \,{\text {GeV}} \) for the subleading muon. Here leading and subleading refer to the electron or muon with the highest and second-highest \(p_{{\mathrm {T}}}\), respectively, in the event. The different-flavour dilepton triggers require either an electron with \(p_{{\mathrm {T}}} > 23 \,{\text {GeV}} \) and a muon with \(p_{{\mathrm {T}}} > 8 \,{\text {GeV}} \), or a muon with \(p_{{\mathrm {T}}} > 23 \,{\text {GeV}} \) and an electron with \(p_{{\mathrm {T}}} > 8 \,{\text {GeV}} \).

Events are reconstructed using a particle-flow (PF) algorithm [63], which aims to identify and reconstruct each individual particle in an event with an optimised combination of information from the various elements of the CMS detector. Charged hadrons from pileup are subtracted on an event-by-event basis. Subsequently, the remaining neutral-hadron component from pileup is accounted for through jet energy corrections [64].

Electron candidates are reconstructed from a combination of the track momentum at the main interaction vertex, the corresponding energy deposition in the ECAL, and the energy sum of all bremsstrahlung photons associated with the track [65]. The electron candidates are required to have \(p_{{\mathrm {T}}} > 25 \,{\text {GeV}} \) for the leading candidate and \(p_{{\mathrm {T}}} > 20 \,{\text {GeV}} \) for the subleading candidate, and \(|\eta | < 2.4\). Electron candidates with ECAL clusters in the region between the barrel and endcap (\(1.44<|\eta _{{\mathrm {cluster}}} |< 1.57\)) are excluded, because the reconstruction of an electron object in this region is not optimal. A relative isolation criterion \(I_{{\text {rel}}} < 0.06\) is applied, where \(I_{{\text {rel}}}\) is defined as the \(p_{{\mathrm {T}}}\) sum of all neutral hadron, charged hadron, and photon candidates within a distance of 0.3 from the electron in \(\eta \)-\(\phi \) space, divided by the \(p_{{\mathrm {T}}}\) of the electron candidate. In addition, electron identification requirements are applied to reject misidentified electron candidates and candidates originating from photon conversions. Muon candidates are reconstructed using the track information from the tracker and the muon system. They are required to have \(p_{{\mathrm {T}}} > 25\,{\text {GeV}} \) for the leading candidate and \(p_{{\mathrm {T}}} > 20 \,{\text {GeV}} \) for the subleading candidate, and \(|\eta |<2.4\). An isolation requirement of \(I_{{\text {rel}}} < 0.15\) is applied to muon candidates, including particles within a distance of 0.4 from the muon in \(\eta \)-\(\phi \) space. In addition, muon identification requirements are applied to reject misidentified muon candidates and candidates originating from in-flight decay processes. For both electron and muon candidates, a correction is applied to \(I_{{\text {rel}}}\) to suppress residual pileup effects.
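The definition of \(I_{{\text {rel}}}\) can be illustrated with a short sketch. The dictionary layout of the candidates and the function names are hypothetical, and the residual pileup correction mentioned above is not included because its form is not given here.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space, with the azimuthal difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def relative_isolation(lepton, candidates, cone):
    """Scalar pT sum of the candidates inside the cone, divided by the lepton pT."""
    pt_sum = sum(
        c["pt"]
        for c in candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )
    return pt_sum / lepton["pt"]

# Toy electron with one nearby and one distant PF candidate (hypothetical values).
electron = {"pt": 50.0, "eta": 0.0, "phi": 0.0}
pf_candidates = [
    {"pt": 2.0, "eta": 0.1, "phi": 0.1},  # inside the 0.3 cone
    {"pt": 5.0, "eta": 1.0, "phi": 0.0},  # outside the cone
]
iso = relative_isolation(electron, pf_candidates, cone=0.3)
electron_is_isolated = iso < 0.06
```

For muons the same function would be called with `cone=0.4` and the result compared with 0.15.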

Jets are reconstructed by clustering the PF candidates using the anti-\(k_{{\mathrm {T}}}\) clustering algorithm [66, 67] with a distance parameter \(R = 0.4\). The jet energies are corrected following the procedures described in Ref. [68] and applied to the data taken by the CMS experiment in 2016. After correcting for all residual energy depositions from charged and neutral particles from pileup, \(p_{{\mathrm {T}}}\)- and \(\eta \)-dependent jet energy corrections are applied to correct for the detector response. A jet is selected if it has \(p_{{\mathrm {T}}} > 30\,{\text {GeV}} \) and \(|\eta | < 2.4\). Jets are rejected if the distance in \(\eta \)-\(\phi \) space between the jet and the closest lepton is less than 0.4. Jets originating from the hadronisation of \({\text {b}}\) quarks (\({\text {b}}\) jets) are identified with an algorithm [69] that uses secondary vertices together with track-based lifetime information to construct a \({\text {b}}\) tagging discriminant. The chosen working point has a \({\text {b}}\) jet tagging efficiency of \({\approx }80\)–90% and a mistagging efficiency of \({\approx }10\%\) for jets originating from gluons, as well as \({\text {u}}\), \({\text {d}}\), or \({\text {s}}\) quarks, and \({\approx }30\)–40% for jets originating from \({\text {c}}\) quarks.

The missing transverse momentum vector \({\vec { p}}_{\text {T}}^{{\text {miss}}}\) is defined as the projection on the plane perpendicular to the beams of the negative vector sum of the momenta of all PF candidates in an event. Its magnitude is referred to as \(p_{{\mathrm {T}}} ^{\text {miss}}\). Jet energy corrections are propagated to improve the determination of \({\vec { p}}_{\text {T}}^{{\text {miss}}}\).
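As a minimal illustration of this definition, the sketch below computes the negative vector sum of the transverse momenta of a list of candidates. The candidate layout is hypothetical, and the propagation of the jet energy corrections is not included.

```python
import math

def missing_transverse_momentum(candidates):
    """Negative vector sum of the candidate momenta projected on the transverse plane."""
    px = -sum(c["pt"] * math.cos(c["phi"]) for c in candidates)
    py = -sum(c["pt"] * math.sin(c["phi"]) for c in candidates)
    return px, py, math.hypot(px, py)

# Two back-to-back toy candidates with unbalanced pT (hypothetical values, in GeV).
candidates = [{"pt": 30.0, "phi": 0.0}, {"pt": 10.0, "phi": math.pi}]
met_x, met_y, met = missing_transverse_momentum(candidates)
```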

Events are selected offline if they contain exactly two isolated electrons or muons of opposite electric charge. Furthermore, they need to contain at least two jets and at least one of these jets must be \({\text {b}}\) tagged. Events with an invariant mass of the lepton pair, \(M(\ell \ell )\), smaller than 20\(\,{\text {GeV}}\) are removed in order to suppress contributions from heavy-flavour resonance decays and low-mass Drell–Yan processes. Backgrounds from \({\text {Z}}\)+jets processes in the \({\text {e}}^{+} {{\text {e}}}^{-} \) and \({{\upmu }}^{+} {{\upmu }}^{-} \) channels are further suppressed by requiring \(|m_{{\text {Z}}}-M(\ell \ell ) | > 15\,{\text {GeV}} \), and \(p_{{\mathrm {T}}} ^{\text {miss}} > 40\,{\text {GeV}} \). The remaining background contributions from \({\text {t}}{\text {W}}\), \({\text {Z}}\)+jets, \({\text {W}}\)+jets, diboson and \(\hbox {t}{\bar{\hbox {t}}}\) events from decay channels other than that of the signal are estimated from the simulation.
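The offline requirements listed above can be summarised in a short sketch. It assumes that the lepton and jet lists already satisfy the object-level criteria (kinematic thresholds, isolation, identification), treats the leptons as massless, and uses an assumed numerical value for \(m_{{\text {Z}}}\); the data layout and function names are hypothetical.

```python
import math

M_Z = 91.19  # GeV; assumed value of the Z boson mass for the veto window

def dilepton_mass(lep1, lep2):
    """Invariant mass of two leptons, neglecting the lepton masses."""
    deta = lep1["eta"] - lep2["eta"]
    dphi = lep1["phi"] - lep2["phi"]
    m2 = 2.0 * lep1["pt"] * lep2["pt"] * (math.cosh(deta) - math.cos(dphi))
    return math.sqrt(max(m2, 0.0))

def passes_selection(leptons, jets, met):
    """Apply the dilepton event selection to pre-selected leptons and jets."""
    if len(leptons) != 2:
        return False
    lep1, lep2 = leptons
    if lep1["charge"] * lep2["charge"] >= 0:
        return False  # opposite electric charge required
    if len(jets) < 2 or not any(j["btag"] for j in jets):
        return False  # at least two jets, at least one b tagged
    m_ll = dilepton_mass(lep1, lep2)
    if m_ll < 20.0:
        return False  # low-mass resonances and Drell-Yan
    if lep1["flavour"] == lep2["flavour"]:
        # Z+jets suppression, applied in the same-flavour channels only
        if abs(M_Z - m_ll) <= 15.0 or met <= 40.0:
            return False
    return True

def lepton(flavour, pt, eta, phi, charge):
    return {"flavour": flavour, "pt": pt, "eta": eta, "phi": phi, "charge": charge}

jets = [{"btag": True}, {"btag": False}]
emu_event = [lepton("e", 40.0, 0.0, 0.0, +1), lepton("mu", 30.0, 0.0, math.pi, -1)]
ee_on_z = [lepton("e", 45.6, 0.0, 0.0, +1), lepton("e", 45.6, 0.0, math.pi, -1)]
ee_off_z = [lepton("e", 40.0, 0.0, 0.0, +1), lepton("e", 30.0, 0.0, math.pi, -1)]
```

In this toy setup the different-flavour event is accepted regardless of the missing transverse momentum, whereas the same-flavour events must also lie outside the \({\text {Z}}\) mass window and have \(p_{{\mathrm {T}}} ^{\text {miss}} > 40\,{\text {GeV}} \).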

In this analysis, the \(\hbox {t}{\bar{\hbox {t}}}\) production cross section is also measured as a function of the extra jet multiplicity, \(N_{{\text {jet}}}\). Extra jets (also referred to as additional jets) are jets arising primarily from hard QCD radiation and not from the top quark decays. At generator level, the extra jets are defined in dilepton \(\hbox {t}{\bar{\hbox {t}}}\) events as jets with \(p_{{\mathrm {T}}} > 30\,{\text {GeV}} \), \(|\eta | < 2.4\), built of particles except neutrinos using the anti-\(k_{{\mathrm {T}}}\) clustering algorithm [66, 67] with a distance parameter \(R = 0.4\), and isolated from the charged leptons (i.e. \({\text {e}}\) or \({\upmu }\)) and \({\text {b}}\) quarks originating from the top quark decays by a minimal distance of 0.4 in \(\eta \)-\(\phi \) space. The charged leptons and \({\text {b}}\) quarks are taken directly after \({\text {W}}\) and top quark decays, respectively. At reconstruction level the extra jets are defined in dilepton \(\hbox {t}{\bar{\hbox {t}}}\) candidate events as jets with \(p_{{\mathrm {T}}} > 30\,{\text {GeV}} \), \(|\eta | < 2.4\), and isolated from the leptons and \({\text {b}}\) jets originating from the top quark decays by the same minimal distance in \(\eta \)-\(\phi \) space.
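The counting of extra jets defined above can be sketched as follows. The object layout and the function names are hypothetical; `decay_objects` stands for the two charged leptons and the two \({\text {b}}\) quarks (generator level) or \({\text {b}}\) jets (reconstruction level) assigned to the top quark decays.

```python
import math

def delta_r(a, b):
    """Distance in eta-phi space, with the azimuthal difference wrapped to [-pi, pi]."""
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)

def count_extra_jets(jets, decay_objects, min_pt=30.0, max_abs_eta=2.4, min_dr=0.4):
    """Number of jets in acceptance that are isolated from all top quark decay objects."""
    n_extra = 0
    for jet in jets:
        if jet["pt"] <= min_pt or abs(jet["eta"]) >= max_abs_eta:
            continue
        if all(delta_r(jet, obj) >= min_dr for obj in decay_objects):
            n_extra += 1
    return n_extra

# Toy event (hypothetical values): two leptons and two b quarks from the decays.
decay_objects = [
    {"eta": 0.0, "phi": 0.0},
    {"eta": 0.5, "phi": 3.0},
    {"eta": -1.0, "phi": 1.5},
    {"eta": 1.2, "phi": -2.0},
]
jets = [
    {"pt": 80.0, "eta": -1.05, "phi": 1.45},  # overlaps with a b quark
    {"pt": 45.0, "eta": 2.0, "phi": 0.5},     # isolated and in acceptance
    {"pt": 25.0, "eta": 0.3, "phi": -1.0},    # below the pT threshold
    {"pt": 60.0, "eta": 2.8, "phi": 1.0},     # outside the eta acceptance
]
n_jet = count_extra_jets(jets, decay_objects)
```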

The \(\hbox {t}{\bar{\hbox {t}}}\) kinematic properties are determined from the four-momenta of the decay products using a kinematic reconstruction method [15]. The three-momenta of the neutrino (\({\upnu }\)) and of the antineutrino (\({{{\bar{\upnu }}}}\)) are not directly measured, but they can be reconstructed by imposing the following six kinematic constraints: the conservation in the event of the total transverse momentum vector, and the masses of the \({\text {W}}\) bosons, top quark, and top antiquark. The reconstructed top quark and antiquark masses are required to be 172.5\(\,{\text {GeV}}\). The \({\vec { p}}_{\text {T}}^{{\text {miss}}}\) in the event is assumed to originate solely from the two neutrinos in the top quark and antiquark decay chains. To resolve the ambiguity due to multiple algebraic solutions of the equations for the neutrino momenta, the solution with the smallest invariant mass of the \(\hbox {t}{\bar{\hbox {t}}}\) system is taken. The reconstruction is performed 100 times, each time randomly smearing the measured energies and directions of the reconstructed leptons and jets within their resolution. This smearing procedure recovers certain events that initially yield no solution because of measurement uncertainties. The three-momenta of the two neutrinos are determined as a weighted average over all smeared solutions. For each solution, the weight is calculated based on the expected true spectrum of the invariant mass of a lepton and a \({\text {b}}\) jet stemming from the decay of a top quark and taking the product of the two weights for the top quark and antiquark decay chains. All possible lepton-jet combinations in the event that satisfy the requirement on the invariant mass of the lepton and jet \(M_{\ell {{\text {b}}}} < 180\) \(\,{\text {GeV}}\) are considered. Combinations are ranked based on the presence of \({\text {b}}\)-tagged jets in the assignments, i.e. 
a combination with both leptons assigned to \({\text {b}}\)-tagged jets is preferred over those with one or no \({\text {b}}\)-tagged jet. Among assignments with an equal number of \({\text {b}}\)-tagged jets, the one with the highest sum of weights is chosen. Events with no solution after smearing are discarded. The efficiency of the kinematic reconstruction, defined as the number of events where a solution is found divided by the total number of selected \(\hbox {t}{\bar{\hbox {t}}}\) events, is studied in data and simulation and consistent results are observed. The efficiency is about 90% for signal events. After applying the full event selection and the kinematic reconstruction of the \(\hbox {t}{\bar{\hbox {t}}}\) system, 150,410 events are observed in the \({\text {e}}^\pm {\upmu }^\mp \) channel, 34,890 events in the \({\text {e}}^{+} {{\text {e}}}^{-} \) channel, and 70,346 events in the \({{\upmu }}^{+} {{\upmu }}^{-} \) channel. Combining all decay channels, the estimated signal fraction in data is 80.6%. Figure 1 shows the distributions of the reconstructed top quark and \(\hbox {t}{\bar{\hbox {t}}}\) kinematic variables and of the multiplicity of additional jets in the events. In general, the data are reasonably well described by the simulation; however, some trends are visible, in particular for \(p_{{\mathrm {T}}} ({\text {t}})\), where the simulation predicts a somewhat harder spectrum than that observed in data, as reported in previous differential \(\hbox {t}{\bar{\hbox {t}}}\) cross section measurements [15, 18, 20, 22, 23, 26, 27].

Fig. 1

Distributions of \(p_{{\mathrm {T}}} ({\text {t}})\) (upper left), \(y({\text {t}})\) (upper right), \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\) (middle left), \(y(\hbox {t}{\bar{\hbox {t}}})\) (middle right), \(M(\hbox {t}{\bar{\hbox {t}}})\) (lower left), and \(N_{{\text {jet}}}\) (lower right) in selected events after the kinematic reconstruction, at detector level. The experimental data with the vertical bars corresponding to their statistical uncertainties are plotted together with distributions of simulated signal and different background processes. The hatched regions correspond to the estimated shape uncertainties in the signal and backgrounds (as detailed in Sect. 7). The lower panel in each plot shows the ratio of the observed data event yields to those expected in the simulation

The \(M(\hbox {t}{\bar{\hbox {t}}})\) value obtained using the full kinematic reconstruction described above is highly sensitive to the value of the top quark mass used as a kinematic constraint. Since one of the objectives of this analysis is to extract the top quark mass from the differential \(\hbox {t}{\bar{\hbox {t}}}\) measurements, exploiting the \(M(\hbox {t}{\bar{\hbox {t}}})\) distribution in particular, an alternative algorithm is employed, which reconstructs the \(\hbox {t}{\bar{\hbox {t}}}\) kinematic variables without using the top quark mass constraint. This algorithm is referred to as the “loose kinematic reconstruction”. In this algorithm, the \({\upnu }{{\bar{\upnu }}}\) system is reconstructed rather than the \({\upnu }\) and \({{\bar{\upnu }}}\) separately. Consequently, it can only be used to reconstruct the total \(\hbox {t}{\bar{\hbox {t}}} \) system but not the top quark and antiquark separately. As in the full kinematic reconstruction, all possible lepton-jet combinations in the event that satisfy the requirement on the invariant mass of the lepton and jet \(M_{\ell {{\text {b}}}} < 180\) \(\,{\text {GeV}}\) are considered. Combinations are ranked based on the presence of \({\text {b}}\)-tagged jets in the assignments, but among combinations with equal number of \({\text {b}}\)-tagged jets, the ones with the highest-\(p_{{\mathrm {T}}}\) jets are chosen. The kinematic variables of the \({\upnu }{{\bar{\upnu }}}\) system are derived as follows: its \({\vec {p}}_{{\mathrm {T}}}\) is set equal to \({\vec { p}}_{\text {T}}^{{\text {miss}}}\), while its unknown longitudinal momentum and energy are set equal to the longitudinal momentum and energy of the lepton pair. 
Additional constraints are applied on the invariant mass of the neutrino pair, \(M({\upnu }{{\bar{\upnu }}}) \ge 0\), and on the invariant mass of the \({\text {W}}\) bosons, \(M({{\text {W}}}^{+}{{\text {W}}}^{-}) \ge 2M_{{\text {W}}}\), which have only minor effects on the performance of the reconstruction. The method yields similar \(\hbox {t}{\bar{\hbox {t}}}\) kinematic resolutions and reconstruction efficiency as for the full kinematic reconstruction. In this analysis, the loose kinematic reconstruction is exclusively used to measure triple-differential cross sections as a function of \(M(\hbox {t}{\bar{\hbox {t}}})\), \(y(\hbox {t}{\bar{\hbox {t}}})\), and extra jet multiplicity, which are exploited to determine QCD parameters, as well as the distributions used to cross-check the results. Figure 2 shows the distributions of the reconstructed \(\hbox {t}{\bar{\hbox {t}}}\) invariant mass and rapidity using the loose kinematic reconstruction. These distributions are similar to the ones obtained using the full kinematic reconstruction (as shown in Fig. 1). Towards forward rapidities \(|y(\hbox {t}{\bar{\hbox {t}}})|\ge 1.5\) a trend is visible in which the MC simulations predict more events than observed in the data. However, the differences between simulations and data are still compatible within the estimated shape uncertainties in the signal and backgrounds.
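The core of the loose kinematic reconstruction can be sketched as below, for one already chosen assignment of two leptons and two jets. The four-vector layout (E, px, py, pz) and the function names are hypothetical. The additional constraints on \(M({\upnu }{{\bar{\upnu }}})\) and \(M({{\text {W}}}^{+}{{\text {W}}}^{-})\) are deliberately left out, since the way they are enforced is not specified in the text.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) for a particle given in collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def add(*vectors):
    """Component-wise sum of four-vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def loose_ttbar(lep1, lep2, jet1, jet2, met_x, met_y):
    """Invariant mass and rapidity of the ttbar system without a top mass constraint."""
    e_ll, _, _, pz_ll = add(lep1, lep2)
    # Neutrino pair: transverse momentum from pT-miss, while the longitudinal
    # momentum and the energy are set equal to those of the lepton pair.
    nunu = (e_ll, met_x, met_y, pz_ll)
    e, px, py, pz = add(lep1, lep2, jet1, jet2, nunu)
    mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    rapidity = 0.5 * math.log((e + pz) / (e - pz))
    return mass, rapidity

# Toy central event (hypothetical values, in GeV).
lep1 = four_vector(50.0, 0.0, 0.0)
lep2 = four_vector(50.0, 0.0, math.pi)
jet1 = four_vector(80.0, 0.0, math.pi / 2)
jet2 = four_vector(80.0, 0.0, -math.pi / 2)
m_tt, y_tt = loose_ttbar(lep1, lep2, jet1, jet2, met_x=30.0, met_y=0.0)
```

Because no top quark mass enters the calculation, the reconstructed \(M(\hbox {t}{\bar{\hbox {t}}})\) in this sketch depends only on the measured leptons, jets, and \({\vec { p}}_{\text {T}}^{{\text {miss}}}\).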

Fig. 2

Distributions of \(y(\hbox {t}{\bar{\hbox {t}}})\) (left) and \(M(\hbox {t}{\bar{\hbox {t}}})\) (right) in selected events after the loose kinematic reconstruction. Details can be found in the caption of Fig. 1

5 Signal extraction and unfolding

The number of signal events in data is extracted by subtracting the expected number of background events from the observed number of events for each bin of the observables. All expected background numbers are obtained directly from the MC simulations (see Sect. 3) except for \(\hbox {t}{\bar{\hbox {t}}} \) final states other than the signal. The latter are dominated by events in which one or both of the intermediate \({\text {W}}\) bosons decay into \({\uptau }\) leptons with subsequent decay into an electron or muon. These events arise from the same \(\hbox {t}{\bar{\hbox {t}}}\) production process as the signal and thus the normalisation of this background is fixed to that of the signal. For each bin the number of events obtained after the subtraction of other background sources is multiplied by the ratio of the number of selected \(\hbox {t}{\bar{\hbox {t}}}\) signal events to the total number of selected \(\hbox {t}{\bar{\hbox {t}}}\) events (i.e. the signal and all other \(\hbox {t}{\bar{\hbox {t}}}\) events) in simulation.
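The per-bin signal extraction described above amounts to a background subtraction followed by a multiplication with the simulated signal fraction. A minimal sketch, with hypothetical array names and toy yields, reads:

```python
import numpy as np

def extract_signal(n_obs, n_bkg_non_tt, n_sig_mc, n_other_tt_mc):
    """Signal yield per bin: (observed - non-ttbar background) x signal fraction.

    The signal fraction is the simulated ratio of ttbar signal events to all
    selected ttbar events (signal plus other ttbar final states).
    """
    n_obs = np.asarray(n_obs, dtype=float)
    n_bkg_non_tt = np.asarray(n_bkg_non_tt, dtype=float)
    n_sig_mc = np.asarray(n_sig_mc, dtype=float)
    n_other_tt_mc = np.asarray(n_other_tt_mc, dtype=float)
    signal_fraction = n_sig_mc / (n_sig_mc + n_other_tt_mc)
    return (n_obs - n_bkg_non_tt) * signal_fraction

# Toy yields in two bins (hypothetical values).
n_signal = extract_signal(
    n_obs=[1000.0, 500.0],
    n_bkg_non_tt=[100.0, 50.0],
    n_sig_mc=[800.0, 400.0],
    n_other_tt_mc=[200.0, 100.0],
)
```

Using a ratio rather than subtracting the simulated yield of other \(\hbox {t}{\bar{\hbox {t}}}\) final states keeps this background tied to the normalisation of the signal in data.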

The numbers of signal events obtained after background subtraction are corrected for detector effects, using the TUnfold package [70]. The event yields in the \({\text {e}}^{+} {{\text {e}}}^{-} \), \({{\upmu }}^{+} {{\upmu }}^{-} \) and \({\text {e}}^\pm {\upmu }^\mp \) channels are added together before the unfolding is performed. It is verified that the measurements in the separate channels yield consistent results. The response matrix plays a key role in this unfolding procedure. An element of this matrix specifies the probability for an event originating from one bin of the true distribution to be observed in a specific bin of the reconstructed observables. The response matrix includes the effects of acceptance, detector efficiencies, and resolutions. The response matrix is defined such that the true level corresponds to the full phase space (with no kinematic restrictions) for \(\hbox {t}{\bar{\hbox {t}}}\) production at parton level. At the detector level, the number of bins used is typically a few times larger than the number of bins used at generator level. The response matrix is taken from the signal simulation. The generalised inverse of the response matrix is used to obtain the distribution of unfolded event numbers from the measured distribution by applying a \(\chi ^2\) minimisation technique. An additional \(\chi ^2\) term is included representing Tikhonov regularisation [71]. The regularisation reduces the effect of the statistical fluctuations present in the measured distribution on the high-frequency content of the unfolded spectrum. The regularisation strength is chosen such that the global correlation coefficient is minimal [72]. For the measurements presented here, this choice results in a small contribution from the regularisation term to the total \(\chi ^2\), on the order of a few percent. The unfolding of multidimensional distributions is performed by internally mapping the multi-dimensional arrays to one-dimensional arrays [70].
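For orientation, a Tikhonov-regularised least-squares unfolding of this kind minimises a function of the following general form. The symbols are introduced here for illustration only and are not taken from the text above: \({\mathbf {y}}\) is the vector of measured yields with covariance \({\mathbf {V}}_{y}\), \({\mathbf {A}}\) is the response matrix, \({\mathbf {x}}\) is the unfolded vector, \({\mathbf {L}}\) is the regularisation matrix, \({\mathbf {x}}_0\) is a bias vector, and \(\tau \) is the regularisation strength.

$$\begin{aligned} \chi ^2({\mathbf {x}}) = ({\mathbf {y}} - {\mathbf {A}}{\mathbf {x}})^{T} \, {\mathbf {V}}_{y}^{-1} \, ({\mathbf {y}} - {\mathbf {A}}{\mathbf {x}}) + \tau ^2 \, ({\mathbf {x}} - {\mathbf {x}}_0)^{T} \, {\mathbf {L}}^{T} {\mathbf {L}} \, ({\mathbf {x}} - {\mathbf {x}}_0). \end{aligned}$$

With \(\tau = 0\) this reduces to an ordinary least-squares fit. Increasing \(\tau \) suppresses the components of \({\mathbf {x}}\) that \({\mathbf {L}}\) penalises, which is how the regularisation damps bin-to-bin fluctuations. The regularisation matrix and bias vector used in the analysis are not specified in the text and are left generic here.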

6 Cross section determination

The normalised cross sections for \(\hbox {t}{\bar{\hbox {t}}}\) production are measured in the full \(\hbox {t}{\bar{\hbox {t}}}\) kinematic phase space at parton level. The number of unfolded signal events \({\hat{M}}^{\text {unf}}_{i}\) in bins i of kinematic variables is used to define the normalised cross sections as a function of several (two or three) variables

$$\begin{aligned} \frac{\sigma _i}{\sigma } = \frac{1}{\sigma } \, \frac{{\hat{M}}^{\text {unf}}_{i}}{{\mathcal {B}} \, {\mathcal {L}}}, \end{aligned}$$
(1)

where the total cross section \(\sigma \) is evaluated by summing \(\sigma _{i}\) over all bins, \({\mathcal {B}}\) is the branching ratio of \(\hbox {t}{\bar{\hbox {t}}}\) into \({\text {e}}^{+} {{\text {e}}}^{-} \), \({{\upmu }}^{+} {{\upmu }}^{-} \), and \({\text {e}}^\pm {\upmu }^\mp \) final states, and \({\mathcal {L}}\) is the integrated luminosity of the data sample. For presentation purposes, the measured cross sections are divided by the bin width of the first variable. They present single-differential cross sections as a function of the first variable in different ranges of the second or second and third variables and are referred to as double- or triple-differential cross sections, respectively. The bin widths are chosen based on the resolutions of the kinematic variables, such that the purity and the stability of each bin are generally above 20%. For a given bin, the purity is defined as the fraction of events in the \(\hbox {t}{\bar{\hbox {t}}}\) signal simulation that are generated and reconstructed in the same bin with respect to the total number of events reconstructed in that bin. To evaluate the stability, the number of events in the \(\hbox {t}{\bar{\hbox {t}}}\) signal simulation that are generated and reconstructed in a given bin is divided by the total number of reconstructed events generated in the bin.
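The definitions of purity, stability, and the normalisation in Eq. (1) can be illustrated with a short sketch. It assumes, for simplicity, a square migration matrix with identical binning at generator and detector level. The function names and all numbers are hypothetical.

```python
def purity_and_stability(migration):
    """migration[g][r] = simulated signal events generated in bin g and
    reconstructed in bin r (only reconstructed events are counted)."""
    n = len(migration)
    purity, stability = [], []
    for i in range(n):
        diagonal = migration[i][i]                             # generated and reconstructed in bin i
        n_rec_in_bin = sum(migration[g][i] for g in range(n))  # all events reconstructed in bin i
        n_gen_in_bin = sum(migration[i][r] for r in range(n))  # reconstructed events generated in bin i
        purity.append(diagonal / n_rec_in_bin)
        stability.append(diagonal / n_gen_in_bin)
    return purity, stability


def normalised_cross_sections(unfolded, branching_ratio, luminosity):
    """Eq. (1): sigma_i / sigma, with sigma the sum of sigma_i over all bins."""
    sigma_i = [m / (branching_ratio * luminosity) for m in unfolded]
    sigma = sum(sigma_i)
    return [s / sigma for s in sigma_i]


# Hypothetical 3x3 migration matrix and unfolded yields.
migration = [
    [60.0, 30.0, 10.0],
    [20.0, 50.0, 30.0],
    [10.0, 20.0, 70.0],
]
purity, stability = purity_and_stability(migration)
fractions = normalised_cross_sections([200.0, 300.0, 500.0],
                                      branching_ratio=0.05, luminosity=1000.0)
```

In the normalised cross sections the factors \({\mathcal {B}}\) and \({\mathcal {L}}\) cancel, so only the relative unfolded yields matter.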

The cross section determination based on the signal extraction and unfolding described in Sect. 5 has been validated with closure tests. Large numbers of pseudo-data sets were generated from the \(\hbox {t}{\bar{\hbox {t}}}\) signal MC simulations and analysed as if they were real data. The normalised differential cross sections are found to be unbiased and the confidence intervals based on the nominal measurements and the estimated \(\pm 1 \sigma \) uncertainties provide correct coverage probability. Any residual non-closure between generated and measured cross sections is found to be small compared to the statistical uncertainties of the measurements and is therefore neglected. A further closure test has been performed by unfolding pseudo-data sets generated using reweighted signal MCs for the detector corrections. The reweighting is performed as a function of the differential cross section kinematic observables and is used to introduce controlled shape variations, e.g. making the \(p_{{\mathrm {T}}} ({\text {t}})\) spectrum harder or softer. This test is sensitive to the stability of the unfolding with respect to the underlying physics model in the simulation. The effect on the unfolded cross sections is negligible for reweightings that lead to shape changes that are comparable to the observed differences between data and nominal MC distributions.

7 Systematic uncertainties

The systematic uncertainties in the measured differential cross sections are categorised into two classes: experimental uncertainties arising from imperfect modelling of the detector response, and theoretical uncertainties arising from the modelling of the signal and background processes. Each source of systematic uncertainty is assessed by changing in the simulation the corresponding efficiency, resolution, or scale by its uncertainty, using a prescription similar to the one followed in Ref. [27]. For each change made, the cross section determination is repeated, and the difference with respect to the nominal result in each bin is taken as the systematic uncertainty.

7.1 Experimental uncertainties

To account for the pileup uncertainty, the value of the total \({\text {p}}{\text {p}}\) inelastic cross section, which is used to estimate the mean number of additional \({\text {p}}{\text {p}}\) interactions, is varied by \(\pm \,4.6\%\), corresponding to the uncertainty in the measurement of this cross section [61].

The efficiencies of the dilepton triggers are measured with independent triggers based on a \(p_{{\mathrm {T}}} ^{\text {miss}}\) requirement. Scale factors, defined as the ratio of the trigger efficiencies in data and simulation, are calculated in bins of lepton \(\eta \) and \(p_{{\mathrm {T}}}\). They are applied to the simulation and varied within their uncertainties. The uncertainties from the modelling of lepton identification and isolation efficiencies are determined using the “tag-and-probe” method with \({\text {Z}}\)+jets event samples [73, 74]. The differences of these efficiencies between data and simulation in bins of \(\eta \) and \(p_{{\mathrm {T}}}\) are generally less than 10% for electrons, and negligible for muons. The uncertainty is estimated by varying the corresponding scale factors in the simulation within their uncertainties. An implicit assumption made in the analysis is that the scale factors derived from the \({\text {Z}}\)+jets sample are applicable for the \(\hbox {t}{\bar{\hbox {t}}}\) samples, where the efficiency for lepton isolation is reduced due to the typically larger number of jets present in the events. An additional uncertainty of 1% is added to take into account a possible violation of this assumption. This uncertainty is verified with studies with \(\hbox {t}{\bar{\hbox {t}}}\) enriched samples using a similar event selection as for the present analysis. In these studies the lepton isolation criteria are relaxed for one lepton and the efficiency for passing the criteria is measured both in data and simulation.

The uncertainty arising from the jet energy scale (JES) is determined by varying the twenty-six sources of uncertainty in the JES in bins of \(p_{{\mathrm {T}}}\) and \(\eta \) and taking the quadrature sum of the effects [68]. These uncertainties also include several sources related to pileup, which contribute a smaller part of all JES-related uncertainties. The JES variations are also propagated to the uncertainties in \({\vec { p}}_{\text {T}}^{{\text {miss}}}\). The uncertainty from the jet energy resolution (JER) is determined by the variation of the simulated JER by ± 1 standard deviation in different \(\eta \) regions [68]. An additional uncertainty in the calculation of \({\vec { p}}_{\text {T}}^{{\text {miss}}}\) is estimated by varying the energies of reconstructed particles not clustered into jets.

The uncertainty due to imperfect modelling of the \({\text {b}}\) tagging efficiency is determined by varying the measured scale factor for \({\text {b}}\) tagging efficiencies within its uncertainties [69].

The uncertainty in the integrated luminosity of the 2016 data sample recorded by CMS is 2.5% [75] and is applied simultaneously to the normalisation of all simulated distributions.

7.2 Theoretical uncertainties

The uncertainties of the modelling of the \(\hbox {t}{\bar{\hbox {t}}}\) signal events are evaluated with appropriate variations of the nominal simulation based on powheg + pythia and the CUETP8M2T4 tune (see Sect. 3 for details). The studies presented in [47] show that the nominal simulation provides a reasonable prediction of differential \(\hbox {t}{\bar{\hbox {t}}}\) production cross sections at \(\sqrt{s}=8\) \(\,{\text {TeV}}\) and \(\sqrt{s}=13\) \(\,{\text {TeV}}\), also for events with additional jets, and thus can be used as a solid basis for evaluating theoretical uncertainties in the present analysis.

The uncertainty arising from missing higher-order terms in the simulation of the signal process at ME level is assessed by varying the renormalisation, \(\mu _{{\mathrm {r}}}\), and factorisation, \(\mu _{{\mathrm {f}}}\), scales in the powheg simulation up and down by factors of two with respect to the nominal values. In the powheg sample, the nominal scales are defined as \({\mu _{{\mathrm {r}}} = \mu _{{\mathrm {f}}} =} \sqrt{\smash [b]{m^2_{{\text {t}}} + p^2_{{\mathrm {T}},{\text {t}}}}}\), where \(p_{\mathrm {T},{\text {t}}}\) denotes the \(p_{{\mathrm {T}}}\) of the top quark in the \(\hbox {t}{\bar{\hbox {t}}}\) rest frame. In total, three variations are applied: one with the factorisation scale fixed, one with the renormalisation scale fixed, and one with both scales varied up and down coherently together. The maximum of the resulting measurement variations is taken as the final uncertainty. In the parton-shower simulation, the corresponding uncertainty is estimated by varying the scale up and down by factors of 2 for initial-state radiation and \(\sqrt{2}\) for final-state radiation, as suggested in Ref. [49].

The uncertainty from the choice of PDF is assessed by reweighting the signal simulation according to the prescription provided for the CT14 NLO set [60]. An additional uncertainty is independently derived by varying the \(\alpha _{S}\) value within its uncertainty in the PDF set. The dependence of the measurement on the assumed top quark mass parameter \(m_{{\text {t}}}^{{\text {MC}}} \) value is estimated by varying \(m_{{\text {t}}}^{{\text {MC}}} \) in the simulation by ± 1\(\,{\text {GeV}}\) around the central value of 172.5\(\,{\text {GeV}}\).

The uncertainty originating from the scheme used to match the ME-level calculation to the parton-shower simulation is derived by varying the \(h_{\text {damp}}\) parameter in powheg in the range \(0.996m_{{\text {t}}}^{{\text {MC}}}< h_{\text {damp}} < 2.239m_{{\text {t}}}^{{\text {MC}}} \), according to the tuning results from Ref. [47].

The uncertainty related to modelling of the underlying event is estimated by varying the parameters used to derive the CUETP8M2T4 tune in the default setup. The default setup in pythia includes a model of colour reconnection based on MPIs with early resonance decays switched off. The analysis is repeated with three other models of colour reconnection within pythia: the MPI-based scheme with early resonance decays switched on, a gluon-move scheme [76], and a QCD-inspired scheme [77]. The total uncertainty from colour reconnection modelling is estimated by taking the maximum deviation from the nominal result.

The uncertainty from the knowledge of the \({\text {b}}\) quark fragmentation function is assessed by varying the Bowler–Lund function within its uncertainties [78]. In addition, the analysis is repeated using the Peterson model for \({\text {b}}\) quark fragmentation [79], and the final uncertainty is determined, separately for each measurement bin, as an envelope of the variations of the normalised cross section resulting from all variations of the \({\text {b}}\) quark fragmentation function. An uncertainty from the semileptonic branching ratios of \({\text {b}}\) hadrons is estimated by varying them according to the world average uncertainties [44]. As \(\hbox {t}{\bar{\hbox {t}}}\) events producing electrons or muons originating from the decay of \({\uptau }\) leptons are considered to be background, the measured differential cross sections are sensitive to the branching ratios of \({\uptau }\) leptons decaying into electrons or muons assumed in the simulation. Hence, an uncertainty is determined by varying the branching ratios by 1.5% [44] in the simulation.
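A per-bin envelope of several model variations, as used above for the \({\text {b}}\) quark fragmentation function, can be sketched as follows. The convention chosen here (largest upward and largest downward deviation from the nominal result in each bin) is one common reading and is an assumption. The function name and the numbers are hypothetical.

```python
def envelope(nominal, variations):
    """Per-bin envelope: largest upward and largest downward deviation
    of any variation from the nominal value (0 if there is none)."""
    up, down = [], []
    for i, nom in enumerate(nominal):
        deltas = [var[i] - nom for var in variations]
        up.append(max(0.0, max(deltas)))
        down.append(min(0.0, min(deltas)))
    return up, down


# Hypothetical normalised cross sections in two bins for three variations.
nominal = [1.0, 2.0]
variations = [
    [1.5, 1.75],
    [0.75, 2.25],
    [1.25, 2.5],
]
env_up, env_down = envelope(nominal, variations)
```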

Fig. 3

Comparison of the measured \([y({\text {t}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) cross sections to the theoretical predictions calculated using powheg + pythia (‘POW+PYT’), powheg + herwig++ (‘POW+HER’), and mg5_amc@nlo + pythia (‘MG5+PYT’) event generators. The inner vertical bars on the data points represent the statistical uncertainties and the full bars include also the systematic uncertainties added in quadrature. For each MC model, values of \(\chi ^2\) which take into account the bin-to-bin correlations and dof for the comparison with the data are reported. The hatched regions correspond to the theoretical uncertainties in powheg + pythia (see Sect. 7). In the lower panel, the ratios of the data and other simulations to the ‘POW+PYT’ predictions are shown

Fig. 4

Comparison of the measured \([M(\hbox {t}{\bar{\hbox {t}}}), y({\text {t}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 5

Comparison of the measured \([M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 6

Comparison of the measured \([M(\hbox {t}{\bar{\hbox {t}}}), {\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 7

Comparison of the measured \([M(\hbox {t}{\bar{\hbox {t}}}), {\varDelta }\phi ({\text {t}},{\bar{{\text {t}}}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 8

Comparison of the measured \([M(\hbox {t}{\bar{\hbox {t}}}), p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 9

Comparison of the measured \([M(\hbox {t}{\bar{\hbox {t}}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 10

Comparison of the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

Fig. 11

Comparison of the measured \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to the theoretical predictions calculated using MC event generators (further details can be found in the Fig. 3 caption)

The normalisations of all non-\(\hbox {t}{\bar{\hbox {t}}}\) backgrounds are varied up and down by \(\pm \,30\%\) taken from measurements as explained in Ref. [74].

The total systematic uncertainty in each measurement bin is estimated by adding all the contributions described above in quadrature, separately for positive and negative cross section variations. If a systematic uncertainty results in two cross section variations of the same sign, the largest one is taken, while the opposite variation is set to zero.
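The combination rule for one measurement bin can be sketched as follows. This is an illustration under the stated rule only. The function name and the input shifts are hypothetical, and each source is assumed to have exactly two variations.

```python
import math


def total_systematic(shifts):
    """shifts: one (shift_a, shift_b) pair per source, giving the change of
    the cross section in one bin for the two variations of that source.
    Returns (total_up, total_down) as positive magnitudes."""
    up2 = down2 = 0.0
    for a, b in shifts:
        up = max(a, b, 0.0)    # largest positive shift, or 0 if both are negative
        down = min(a, b, 0.0)  # largest negative shift, or 0 if both are positive
        up2 += up * up
        down2 += down * down
    return math.sqrt(up2), math.sqrt(down2)


# Hypothetical shifts (in percent) for three sources in one bin:
# opposite signs, both positive, both negative.
total_up, total_down = total_systematic([(3.0, -4.0), (4.0, 1.0), (-3.0, -1.0)])
```

For a source with two same-sign shifts, only the larger one enters the corresponding sum and the opposite direction receives no contribution, which matches the rule stated above.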

8 Results of the measurement

The normalised differential cross sections of \(\hbox {t}{\bar{\hbox {t}}}\) production are measured in the full phase space at parton level for top quarks (after radiation and before the top quark and antiquark decays) and at particle level for additional jets in the events, for the following variables:

  1. double-differential cross sections as a function of pairs of variables:

    • \(|y({\text {t}}) |\) and \(p_{{\mathrm {T}}} ({\text {t}})\),

    • \(M(\hbox {t}{\bar{\hbox {t}}})\) and \(|y({\text {t}}) |\),

    • \(M(\hbox {t}{\bar{\hbox {t}}})\) and \(|y(\hbox {t}{\bar{\hbox {t}}}) |\),

    • \(M(\hbox {t}{\bar{\hbox {t}}})\) and \({\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}})\),

    • \(M(\hbox {t}{\bar{\hbox {t}}})\) and \({\varDelta }\phi ({\text {t}},{\bar{{\text {t}}}})\),

    • \(M(\hbox {t}{\bar{\hbox {t}}})\) and \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\) and

    • \(M(\hbox {t}{\bar{\hbox {t}}})\) and \(p_{{\mathrm {T}}} ({\text {t}})\).

    These cross sections are denoted in the following as \([y({\text {t}}), p_{{\mathrm {T}}} ({\text {t}}) ]\), etc.

  2. triple-differential cross sections as a function of \(N_{{\text {jet}}}\), \(M(\hbox {t}{\bar{\hbox {t}}})\), and \(y(\hbox {t}{\bar{\hbox {t}}})\). These cross sections are measured separately using two (\(N_{{\text {jet}}} = 0\) and \(N_{{\text {jet}}} \ge 1\)) and three (\(N_{{\text {jet}}} = 0\), \(N_{{\text {jet}}} = 1\), and \(N_{{\text {jet}}} \ge 2\)) bins of \(N_{{\text {jet}}}\), for the particle-level jets. These cross sections are denoted as \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) and \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\), respectively.

The pairs of variables for the double-differential cross sections are chosen in order to obtain representative combinations that are sensitive to different aspects of the \(\hbox {t}{\bar{\hbox {t}}}\) production dynamics, mostly following the previous measurement [20]. The variables for the triple-differential cross sections are chosen in order to enhance sensitivity to the PDFs, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}}\). In particular, the combination of \(y(\hbox {t}{\bar{\hbox {t}}})\) and \(M(\hbox {t}{\bar{\hbox {t}}})\) variables provides sensitivity for the PDFs, as demonstrated in [20], the \(N_{{\text {jet}}}\) distribution for \(\alpha _{S}\) and \(M(\hbox {t}{\bar{\hbox {t}}})\) for \(m_{{\text {t}}}^{{\text {pole}}}\).

The numerical values of the measured cross sections and their uncertainties are provided in Appendix A. In general, the total uncertainties for all measured cross sections are about 5–10%, but exceed 20% in some regions of phase space, such as the last \(N_{{\text {jet}}}\) range of the \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) distribution. The total uncertainties are dominated by the systematic uncertainties, which receive similar contributions from the experimental and theoretical systematic sources. The largest experimental systematic uncertainty is associated with the JES. Both the JES and signal modelling systematic uncertainties are also affected by the statistical uncertainties in the simulated samples that are used for the evaluation of these uncertainties. The cross sections measured in the \({\text {e}}^{+} {{\text {e}}}^{-} \), \({{\upmu }}^{+} {{\upmu }}^{-} \) and \({\text {e}}^\pm {\upmu }^\mp \) channels separately are compatible with each other.

In Figs. 3, 4, 5, 6, 7, 8, 9, 10 and 11, the measured cross sections are compared to three theoretical predictions based on MC simulations: powheg + pythia (‘POW+PYT’), powheg + herwig++ (‘POW+HER’), and mg5_amc@nlo + pythia (‘MG5+PYT’). The ‘POW+PYT’ and ‘POW+HER’ theoretical predictions differ by the parton-shower method, hadronisation and event tune (\(p_{{\mathrm {T}}}\)-ordered parton showering, string hadronisation model and CUETP8M2T4 tune in ‘POW+PYT’, or angular ordered parton showering, cluster hadronisation model and EE5C tune in ‘POW+HER’), while the ‘POW+PYT’ and ‘MG5+PYT’ predictions adopt different matrix elements (inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production at NLO in ‘POW+PYT’, or \(\hbox {t}{\bar{\hbox {t}}}\) with up to two extra partons at NLO in ‘MG5+PYT’) and different methods for matching with parton shower (correcting the first parton shower emission to the NLO result in ‘POW+PYT’, or subtracting from the exact NLO result its parton shower approximation in ‘MG5+PYT’). For each comparison, a \(\chi ^2\) and the number of degrees of freedom (dof) are reported. The \(\chi ^2\) value is calculated taking into account the statistical and systematic data uncertainties, while ignoring uncertainties of the predictions:

$$\begin{aligned} \chi ^2 = {{\mathbf {R}}}^{T}_{N-1} {\mathbf {Cov}}^{-1}_{N-1} {{\mathbf {R}}}_{N-1}, \end{aligned}$$
(2)

where \({{\mathbf {R}}}_{N-1}\) is the column vector of the residuals calculated as the difference of the measured cross sections and the corresponding predictions obtained by discarding one of the N bins, and \({\mathbf {Cov}}_{N-1}\) is the \((N-1)\times (N-1)\) submatrix obtained from the full covariance matrix by discarding the corresponding row and column. The matrix \({\mathbf {Cov}}_{N-1}\) obtained in this way is invertible, while the original covariance matrix \({\mathbf {Cov}}\) is singular because for normalised cross sections one degree of freedom is lost, as can be deduced from Eq. (1). The covariance matrix \({\mathbf {Cov}}\) is calculated as:

$$\begin{aligned} {\mathbf {Cov}} = {\mathbf {Cov}}^{\text {unf}} + {\mathbf {Cov}}^{\text {syst}}, \end{aligned}$$
(3)

where \({\mathbf {Cov}}^{\text {unf}}\) and \({\mathbf {Cov}}^{\text {syst}}\) are the covariance matrices corresponding to the statistical uncertainties from the unfolding, and the systematic uncertainties, respectively. The systematic covariance matrix \({\mathbf {Cov}}^{\text {syst}}\) is calculated as:

$$\begin{aligned}&{\mathbf {Cov}}^{\text {syst}}_{ij} = \sum _{k,l} \frac{1}{N_k} C_{j,k,l}C_{i,k,l}, \nonumber \\&\qquad 1 \le i \le N, \quad 1 \le j \le N, \end{aligned}$$
(4)

where \(C_{i,k,l}\) stands for the systematic uncertainty from variation l of source k in the ith bin, and \(N_k\) is the number of variations for source k. The sums run over all sources of the systematic uncertainties and all corresponding variations. Most of the systematic uncertainty sources in this analysis consist of positive and negative variations and thus have \(N_k = 2\), whilst several model uncertainties (the model of colour reconnection and the \({\text {b}}\) quark fragmentation function) consist of more than two variations, a property which is accounted for in Eq. (4). All systematic uncertainties are treated as additive, i.e. the relative uncertainties are used to scale the corresponding measured value in the construction of \({\mathbf {Cov}}^{\text {syst}}\). This treatment is consistent with the cross section normalisation and makes the \(\chi ^2\) in Eq. (2) independent of which of the N bins is excluded. A multiplicative treatment of uncertainties has been tested as well, and consistent results were obtained. The cross section measurements for different multi-differential distributions are statistically and systematically correlated. No attempt is made to quantify the correlations between bins from different multi-differential distributions. Thus, quantitative comparisons between theoretical predictions and the data can only be made for each single set of multi-differential cross sections.
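Equations (2) and (4) can be illustrated with a small self-contained sketch. The helper names, the three-bin example, and all numbers are hypothetical. Only the systematic covariance of Eq. (4) is built here, whereas in the analysis the unfolding covariance is added according to Eq. (3). The example uses shifts that sum to zero over the bins, as expected for normalised cross sections.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def syst_covariance(sources):
    """Eq. (4): sources[k][l][i] is the shift C_{i,k,l} of bin i
    for variation l of source k; N_k = len(sources[k])."""
    n_bins = len(sources[0][0])
    cov = [[0.0] * n_bins for _ in range(n_bins)]
    for variations in sources:
        n_k = len(variations)
        for c in variations:
            for i in range(n_bins):
                for j in range(n_bins):
                    cov[i][j] += c[i] * c[j] / n_k
    return cov


def chi2_drop_bin(measured, predicted, cov, drop):
    """Eq. (2): chi2 from the N-1 bins left after discarding bin `drop`."""
    keep = [i for i in range(len(measured)) if i != drop]
    r = [measured[i] - predicted[i] for i in keep]
    sub = [[cov[i][j] for j in keep] for i in keep]
    y = solve(sub, r)  # y = Cov_{N-1}^{-1} R_{N-1}
    return sum(ri * yi for ri, yi in zip(r, y))


# Hypothetical normalised cross sections in three bins.
measured = [0.50, 0.30, 0.20]
predicted = [0.48, 0.31, 0.21]
sources = [
    # source with an up and a down variation (N_k = 2)
    [[0.010, -0.004, -0.006], [-0.010, 0.004, 0.006]],
    # model source with three variations (N_k = 3)
    [[0.002, 0.003, -0.005], [-0.004, 0.001, 0.003], [0.001, -0.002, 0.001]],
]
cov = syst_covariance(sources)
chi2_values = [chi2_drop_bin(measured, predicted, cov, k) for k in range(3)]
```

In this example the three entries of `chi2_values` agree up to rounding, illustrating that the result does not depend on which bin is discarded when both the residuals and the rows of the covariance matrix sum to zero.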

In Fig. 3, the \(p_{{\mathrm {T}}} ({\text {t}})\) distribution is compared in different ranges of \(|y({\text {t}}) |\) to predictions from ‘POW+PYT’, ‘POW+HER’, and ‘MG5+PYT’. The data distribution is softer than that of the predictions over the entire \(y({\text {t}}) \) range. Only ‘POW+HER’ describes the data well, while the other two simulations predict a harder \(p_{{\mathrm {T}}} ({\text {t}})\) distribution than measured in the data over the entire \(y({\text {t}}) \) range. The disagreement is strongest for ‘POW+PYT’.

Figures 4 and 5 illustrate the distributions of \(|y({\text {t}}) |\) and \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) in different \(M(\hbox {t}{\bar{\hbox {t}}})\) ranges compared to the same set of MC models. The shapes of the \(y({\text {t}})\) and \(y(\hbox {t}{\bar{\hbox {t}}})\) distributions are reasonably well described by all models, except for the largest \(M(\hbox {t}{\bar{\hbox {t}}})\) range, where all theoretical predictions are more central than the data for \(y({\text {t}})\) and less central for \(y(\hbox {t}{\bar{\hbox {t}}})\). The \(M(\hbox {t}{\bar{\hbox {t}}})\) distribution is softer in the data than in the theoretical predictions. The latter trend is the strongest for ‘POW+PYT’, being consistent with the disagreement for the \(p_{{\mathrm {T}}} ({\text {t}})\) distribution (as shown in Fig. 3). The best agreement for both \([M(\hbox {t}{\bar{\hbox {t}}}), y({\text {t}}) ]\) and \([M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections is provided by ‘POW+HER’.

In Fig. 6, the \({\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}}) \) distribution is compared in the same \(M(\hbox {t}{\bar{\hbox {t}}})\) ranges to the theoretical predictions. For all generators, there is a discrepancy between the data and simulation for the medium and high \(M(\hbox {t}{\bar{\hbox {t}}})\) bins, where the predicted \({\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}}) \) values are too low. The disagreement is the strongest for ‘MG5+PYT’.

Table 1 The \(\chi ^2\) values (taking into account data uncertainties and ignoring theoretical uncertainties) and dof of the measured cross sections with respect to the predictions of various MC generators

Fig. 12

Assessment of compatibility of various MC predictions with the data. The plot shows the p-values of \(\chi ^2\)-tests between data and predictions. Only the data uncertainties are taken into account in the \(\chi ^2\)-tests while uncertainties in the theoretical calculations are ignored. Points with \(p \le 0.001\) are shown at \(p = 0.001\)

Figures 7 and 8 illustrate the comparison of the distributions of \({\varDelta }\phi ({\text {t}},{\bar{{\text {t}}}})\) and \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\) in the same \(M(\hbox {t}{\bar{\hbox {t}}})\) ranges to the theoretical predictions. Both these distributions are sensitive to gluon radiation. All MC models describe the data well within uncertainties, except for ‘MG5+PYT’, which predicts a \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\) distribution in the last \(M(\hbox {t}{\bar{\hbox {t}}})\) bin of the \([M(\hbox {t}{\bar{\hbox {t}}}), p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections that is too hard.

In Fig. 9, the \(p_{{\mathrm {T}}} ({\text {t}})\) distribution is compared in different \(M(\hbox {t}{\bar{\hbox {t}}})\) ranges to the theoretical predictions. None of the MC generators is able to describe the data, generally predicting too hard a \(p_{{\mathrm {T}}} ({\text {t}})\) distribution. The discrepancy is larger at high \(M(\hbox {t}{\bar{\hbox {t}}})\) values where the softer \(p_{{\mathrm {T}}} ({\text {t}})\) spectrum in the data must be kinematically correlated with the larger \({\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}}) \) values (as shown in Fig. 6), compared to the predictions. The disagreement is the strongest for ‘POW+PYT’. While the ‘POW+HER’ simulation is able to reasonably describe the \(p_{{\mathrm {T}}} ({\text {t}})\) distribution in the entire range of \(y({\text {t}})\) (as shown in Fig. 3), it does not provide a good description in all ranges of \(M(\hbox {t}{\bar{\hbox {t}}})\), in particular predicting too hard a \(p_{{\mathrm {T}}} ({\text {t}})\) distribution at high \(M(\hbox {t}{\bar{\hbox {t}}})\).

Figures 10 and 11 illustrate the triple-differential cross sections as a function of \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) in different \(M(\hbox {t}{\bar{\hbox {t}}})\) and \(N_{{\text {jet}}}\) ranges, measured using two or three bins of \(N_{{\text {jet}}}\). For the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurement, all MC models describe the data well. For the \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurement, only ‘POW+PYT’ is in satisfactory agreement with the data. In particular, ‘POW+HER’ predicts too high a cross section for \(N_{{\text {jet}}} > 1\), while ‘MG5+PYT’ provides the worst description of the \(M(\hbox {t}{\bar{\hbox {t}}})\) distribution for \(N_{{\text {jet}}} = 1\).

All obtained \(\chi ^2\) values, ignoring theoretical uncertainties, are listed in Table 1. The corresponding p-values are visualised in Fig. 12. From these values one can conclude that none of the central predictions of the considered MC generators is able to provide predictions that correctly describe all distributions. In particular, for \([M(\hbox {t}{\bar{\hbox {t}}}), {\varDelta }\eta ({\text {t}},{\bar{{\text {t}}}}) ]\) and \([M(\hbox {t}{\bar{\hbox {t}}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) the \(\chi ^2\) values are relatively large for all MC generators. In total, the best agreement with the data is provided by ‘POW+PYT’ and ‘POW+HER’, with ‘POW+PYT’ better describing the measurements probing \(N_{{\text {jet}}}\) and radiation, and ‘POW+HER’ better describing the ones involving probes of the \(p_{{\mathrm {T}}}\) distribution.

9 Extraction of \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) from \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections using external PDFs

To extract \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\), the measured triple-differential cross sections are compared to fixed-order NLO predictions that do not have variable parameters, except for the factorisation and renormalisation scales. These predictions provide a simpler assessment of theoretical uncertainties than predictions from MC event generators. The latter complement fixed-order computations with parton showers, thus accounting for important QCD corrections beyond fixed order, but complicating thereby the interpretation of the extracted parameters, because the modelling of the showers can involve different PDFs and \(\alpha _{S}\) values. Furthermore, for PDF fits using these data (to be discussed in Sect. 10), fast computation techniques are required that are currently available only for fixed-order calculations.

Fixed-order theoretical calculations for fully differential cross sections for inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production are publicly available at NLO \(O(\alpha _{S}^3)\) in the fixed-flavour number scheme [80], and for \(\hbox {t}{\bar{\hbox {t}}} \) production with one (NLO \(O(\alpha _{S}^4)\)) [81] and two (NLO \(O(\alpha _{S}^5)\)) [82, 83] additional jets. These calculations are used in the present analysis. Furthermore, NLO predictions for \(\hbox {t}{\bar{\hbox {t}}} \) production with three additional jets exist [84], but are not used in this paper because the sample of events with three additional jets is not large enough to allow us to measure multi-differential cross sections. The exact fully differential NNLO \(O(\alpha _{S}^4)\) calculations for inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production have recently appeared in the literature [85, 86], but these predictions have not been published yet for multi-differential cross sections. The NNLO calculations for \(\hbox {t}{\bar{\hbox {t}}}\) production with additional jets have not been performed yet.

In the case of the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurement, cross sections for inclusive \(\hbox {t}{\bar{\hbox {t}}} \) and \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production in each bin of \(M(\hbox {t}{\bar{\hbox {t}}})\) and \(y(\hbox {t}{\bar{\hbox {t}}})\) are obtained in the following way. The cross sections for inclusive \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production are taken from the \(N_{{\text {jet}}} \ge 1\) bins of the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurements. The cross sections for inclusive \(\hbox {t}{\bar{\hbox {t}}} \) production are calculated from the sum of the cross sections in the \(N_{{\text {jet}}} =0\) and \(N_{{\text {jet}}} \ge 1\) bins. Statistical and systematic uncertainties and all correlations are obtained using error propagation. Finally, the cross sections obtained for inclusive \(\hbox {t}{\bar{\hbox {t}}} \) and \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production are compared to the NLO \(O(\alpha _{S}^3)\) and NLO \(O(\alpha _{S}^4)\) calculations, respectively. For these processes, the ratios of NLO over LO predictions are about 1.5 on average, and the requirement \(p_{{\mathrm {T}}} > 30\,{\text {GeV}} \) for the jets ensures that logarithms of the ratio \(p_{{\mathrm {T}}}/m_{{\text {t}}}^{{\text {pole}}} \) are not large, thereby demonstrating good convergence of the perturbation series. Similarly, cross sections for inclusive \(\hbox {t}{\bar{\hbox {t}}} \), \(\hbox {t}{\bar{\hbox {t}}} +1\), and \(\hbox {t}{\bar{\hbox {t}}} +2\) jet production are obtained using the \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurement and compared to the NLO \(O(\alpha _{S}^3)\), NLO \(O(\alpha _{S}^4)\), and NLO \(O(\alpha _{S}^5)\) calculations, respectively.
Thus, all cross sections are compared to calculations of the order in \(\alpha _{S}\) required for NLO accuracy. For presentation purposes, the cross sections are shown in Figs. 14, 15 and 16 in the \(N_{{\text {jet}}} =0\) and \(N_{{\text {jet}}} \ge 1\) bins used before for the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurements, one with \(N_{{\text {jet}}} =0\) and another with \(N_{{\text {jet}}} \ge 1\). The measured cross sections for \(N_{{\text {jet}}} \ge 1\) are compared to the NLO calculation for inclusive \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production, while those for \(N_{{\text {jet}}} = 0\) are compared to the difference of the NLO calculations for inclusive \(\hbox {t}{\bar{\hbox {t}}} \) and inclusive \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production. The normalisation cross section is evaluated by integrating the differential cross sections over all bins, i.e. it is given by the inclusive \(\hbox {t}{\bar{\hbox {t}}} \) cross section. As discussed below, \(\chi ^2\) values are calculated for the comparisons of data and NLO predictions and are also used for the extraction of parameter values. The total \(\chi ^2\) values obtained are identical for the comparisons based on inclusive \(\hbox {t}{\bar{\hbox {t}}} \) and \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production cross sections and the ones based on the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) results, shown in Figs. 14, 15 and 16, because the \(\chi ^2\) values are invariant under invertible linear transformations of the set of cross section values.
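The bin combination with error propagation, and the invariance of \(\chi ^2\) under an invertible linear transformation, can be illustrated with a minimal sketch. All numbers below (cross sections, predictions, covariance matrix) are hypothetical and chosen only for illustration. The normalisation of the cross sections is ignored here for simplicity.

```python
import numpy as np

# Hypothetical cross sections in two (M, y) bins, ordered as
# [Njet=0 bin 1, Njet=0 bin 2, Njet>=1 bin 1, Njet>=1 bin 2] (arbitrary units).
sigma = np.array([10.0, 6.0, 5.0, 4.0])   # "measured" values (illustrative)
pred = np.array([9.5, 6.4, 5.3, 3.7])     # "predicted" values (illustrative)

# Hypothetical covariance matrix of the measurement (symmetric, positive definite).
cov = np.array([
    [0.40, 0.05, -0.10, 0.00],
    [0.05, 0.30, 0.00, -0.08],
    [-0.10, 0.00, 0.25, 0.04],
    [0.00, -0.08, 0.04, 0.20],
])

# Invertible linear map to [inclusive tt, inclusive tt+1 jet] per (M, y) bin:
# inclusive tt = (Njet=0) + (Njet>=1); inclusive tt+1 jet = (Njet>=1).
A = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])


def chi2(data, theory, covariance):
    """Return r^T Cov^-1 r for the residual vector r = data - theory."""
    r = data - theory
    return float(r @ np.linalg.solve(covariance, r))


# Linear error propagation: values transform with A, covariance with A Cov A^T.
sigma_incl = A @ sigma
pred_incl = A @ pred
cov_incl = A @ cov @ A.T

chi2_njet = chi2(sigma, pred, cov)
chi2_incl = chi2(sigma_incl, pred_incl, cov_incl)
print(chi2_njet, chi2_incl)  # the two values agree up to rounding
```

The agreement follows from \((A r)^{\mathrm {T}} (A\,{\mathbf {Cov}}\,A^{\mathrm {T}})^{-1} (A r) = r^{\mathrm {T}}\,{\mathbf {Cov}}^{-1}\, r\) for any invertible \(A\).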

The NLO predictions are obtained using the mg5_amc@nlo framework running in the fixed-order mode. A number of the latest proton NLO PDF sets are used, namely: ABMP16 [87], CJ15 [88], CT14 [60], HERAPDF2.0 [89], JR14 [90], MMHT2014 [91], and NNPDF3.1 [92], available via the lhapdf interface (version 6.1.5) [93]. No \(\hbox {t}{\bar{\hbox {t}}}\) data were used in the determination of the CJ15, CT14, HERAPDF2.0 and JR14 PDF sets; only total \(\hbox {t}{\bar{\hbox {t}}}\) production cross section measurements were used to determine the ABMP16 and MMHT2014 PDFs, and both total and differential (from LHC Run 1) \(\hbox {t}{\bar{\hbox {t}}}\) cross sections were used in the NNPDF3.1 extraction. The number of active flavours is set to \(n_{{\mathrm {f}}} = 5\), an \(m_{{\text {t}}}^{{\text {pole}}} = 172.5\) \(\,{\text {GeV}}\) is used, and \(\alpha _{S}\) is set to the value used for the corresponding PDF extraction. The renormalisation and factorisation scales are chosen to be \(\mu _{{\mathrm {r}}} = \mu _{{\mathrm {f}}} = H'/2, H' = \sum _i {m_{{\text {t}},i}}\). Here the sum is running over all final-state partons (\({\text {t}}\), \({\overline{{\text {t}}}}\), and up to three light partons in the \(\hbox {t}{\bar{\hbox {t}}} + 2\) jet calculations) and \(m_{{\text {t}}}\) denotes a transverse mass, defined as \(m_{{\text {t}}} = \sqrt{\smash [b]{m^2 + p_{{\mathrm {T}}} ^2}}\). The theoretical uncertainty is estimated by varying \(\mu _{{\mathrm {r}}}\) and \(\mu _{{\mathrm {f}}}\) independently up and down by a factor of 2, with the additional restriction that the ratio \(\mu _{{\mathrm {r}}}/\mu _{{\mathrm {f}}}\) stays between 0.5 and 2 [94]. Additionally, an alternative scale choice \(\mu _{{\mathrm {r}}} = \mu _{{\mathrm {f}}} = H/2, H = \sum _i {m_{{\text {t}},i}}\), with the sum running only over \({\text {t}}\) and \({\overline{{\text {t}}}}\) [86], is considered. The scales are varied coherently in the predictions with different \(N_{{\text {jet}}}\). 
The final uncertainty is determined as an envelope of all scale variations on the normalised cross sections. This uncertainty is referred to hereafter as a scale uncertainty and is supposed to estimate the impact of missing higher-order terms. The PDF uncertainties are taken into account in the theoretical predictions for each PDF set. The PDF uncertainties of CJ15 [88] and CT14 [60], evaluated at 90% confidence level (\({\text {CL}}\)), are rescaled to the 68% \({\text {CL}}\) for consistency with other PDF sets. The uncertainties in the normalised \(\hbox {t}{\bar{\hbox {t}}}\) cross sections originating from \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) are estimated by varying them within \(\alpha _{S}(m_{{\text {Z}}}) = 0.118 \pm 0.001\) and \(m_{{\text {t}}}^{{\text {pole}}} = 172.5 \pm \, 1.0\,{\text {GeV}} \), respectively (for presentation purposes, in some figures larger variations of \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) by \(\pm \, 0.005\) and \(\pm \, 5.0\,{\text {GeV}} \), respectively, are shown).
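The scale variation prescription and the envelope can be sketched as follows. The function names are hypothetical and the numbers in the usage lines are illustrative. The factor 1.645 used for the 90% to 68% \({\text {CL}}\) rescaling assumes Gaussian uncertainties; the rescaling actually applied in the analysis is not specified in the text.

```python
import itertools
import numpy as np


def scale_variation_points(factors=(0.5, 1.0, 2.0)):
    """Return (k_r, k_f) scale multipliers with 0.5 <= k_r/k_f <= 2,
    excluding the nominal point (1, 1)."""
    points = []
    for k_r, k_f in itertools.product(factors, repeat=2):
        if (k_r, k_f) == (1.0, 1.0):
            continue
        if 0.5 <= k_r / k_f <= 2.0:
            points.append((k_r, k_f))
    return points


def envelope(nominal, variations):
    """Per-bin upward and downward envelope (both returned as non-negative
    numbers) of varied predictions around the nominal prediction."""
    diffs = np.asarray(variations, dtype=float) - np.asarray(nominal, dtype=float)
    up = np.clip(diffs.max(axis=0), 0.0, None)
    down = np.clip(-diffs.min(axis=0), 0.0, None)
    return up, down


def rescale_90_to_68(uncertainty_90cl):
    """Rescale a 90% CL uncertainty to 68% CL, assuming a Gaussian distribution."""
    return np.asarray(uncertainty_90cl, dtype=float) / 1.645


# Illustrative usage: two bins, two varied predictions (hypothetical values).
# An alternative scale choice can simply be appended to the list of variations.
points = scale_variation_points()
up, down = envelope([1.0, 2.0], [[1.1, 1.9], [0.95, 2.2]])
```

With the three factors 0.5, 1 and 2, the restriction on \(\mu _{{\mathrm {r}}}/\mu _{{\mathrm {f}}}\) removes the two opposite-direction combinations and leaves six variations around the nominal choice.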

To compare the measured cross sections to the NLO QCD calculations, the latter are further corrected from parton to particle level. The NLO QCD calculations are provided for parton-level jets and stable top quarks, therefore the corrections (further referred to as NP) are determined using additional powheg + pythia MC simulations for \(\hbox {t}{\bar{\hbox {t}}}\) production with and without MPI, hadronisation and top quark decays, and defined as:

$$\begin{aligned} {\mathcal {C}}_{{\text {NP}}} = \frac{\sigma ^{{\text {particle}}}_{{\text {isolated from}} \, {\text {t}}\rightarrow \ell , {\text {b}}}}{\sigma ^{{\text {parton}}}_{\text {no MPI, no had., no} \, \hbox {t}{\bar{\hbox {t}}} \, \text {decays}}}. \end{aligned}$$
(5)

Here \(\sigma ^{{\text {particle}}}_{{\text {isolated from}} \, {\text {t}}\rightarrow \ell ,{\text {b}}}\) is the cross section with MPI and hadronisation for jets built of particles excluding neutrinos and isolated from charged leptons and \({\text {b}}\) quarks from the top quark decays, as defined in Sect. 4, and \(\sigma ^{{\text {parton}}}_{\text {no MPI, no had., no} \, \hbox {t}{\bar{\hbox {t}}} \, \text {decays}}\) is the cross section without MPI and hadronisation for jets built of partons excluding \({\text {t}}\) and \({\bar{{\text {t}}}}\). Both cross sections are calculated at NLO matched with parton showers. The \({\mathcal {C}}_{{\text {NP}}}\) factors are used to correct the NLO predictions to particle level. The NP corrections are determined in bins of the triple-differential cross sections as a function of \(N_{{\text {jet}}}\), \(M(\hbox {t}{\bar{\hbox {t}}})\), and \(y(\hbox {t}{\bar{\hbox {t}}})\), even though they depend primarily on \(N_{{\text {jet}}}\) and have only weak dependence on the \(\hbox {t}{\bar{\hbox {t}}}\) kinematic properties. For the cross sections with up to two extra jets measured in this analysis, the estimated NP corrections are close to 1, within 5%. The dependence of the NP corrections on MC modelling was studied using MC samples with varied hadronisation model, underlying event tune, and ME and parton-shower scales, as detailed in Sect. 7. All resulting variations of \({\mathcal {C}}_{{\text {NP}}}\) were found to be \(\lesssim \)1%, therefore no uncertainties on the determined NP corrections are assigned. To compare to the measured cross sections, the normalised multi-differential cross sections of the theoretical predictions are obtained by dividing the cross sections in specific bins by the total cross section summed over all bins.
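As a minimal sketch of Eq. (5) and of the normalisation step, the following example computes bin-wise \({\mathcal {C}}_{{\text {NP}}}\) factors, applies them to a fixed-order prediction, and normalises the result. All cross section values and function names are hypothetical.

```python
import numpy as np


def np_correction(sigma_particle, sigma_parton):
    """Bin-wise ratio of particle-level to parton-level cross sections, Eq. (5)."""
    return np.asarray(sigma_particle, dtype=float) / np.asarray(sigma_parton, dtype=float)


def normalised_prediction(sigma_nlo_parton, c_np):
    """Correct the NLO prediction to particle level and divide each bin
    by the total cross section summed over all bins."""
    corrected = np.asarray(sigma_nlo_parton, dtype=float) * c_np
    return corrected / corrected.sum()


# Hypothetical MC cross sections in four bins (arbitrary units).
sigma_particle = np.array([52.0, 31.0, 20.5, 12.2])
sigma_parton = np.array([50.0, 30.0, 21.0, 12.0])
c_np = np_correction(sigma_particle, sigma_parton)

# Hypothetical fixed-order NLO prediction in the same bins.
sigma_nlo = np.array([48.0, 29.0, 22.0, 11.0])
normalised = normalised_prediction(sigma_nlo, c_np)
```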

The theoretical uncertainties for the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) and \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections are illustrated in Fig. 13. The CT14 PDF set with \(\alpha _{S}(m_{{\text {Z}}}) = 0.118\), \(m_{{\text {t}}}^{{\text {pole}}} = 172.5\,{\text {GeV}} \) is used as the nominal calculation. The contributions arising from the PDF, \(\alpha _{S}(m_{{\text {Z}}}) \) (± 0.005), and \(m_{{\text {t}}}^{{\text {pole}}} \) (\(\pm \, 1\) \(\,{\text {GeV}}\)) uncertainties are shown separately. The total theoretical uncertainties are obtained by adding the effects from PDF, \(\alpha _{S}(m_{{\text {Z}}})\), \(m_{{\text {t}}}^{{\text {pole}}}\), and scale variations in quadrature. On average, the total theoretical uncertainties are 5–10%. They receive similar contributions from PDF, \(\alpha _{S}(m_{{\text {Z}}})\), \(m_{{\text {t}}}^{{\text {pole}}}\), and scale variations. This shows that the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections can be used for reliable and precise extraction of the PDFs and QCD parameters. In this analysis the PDFs, \(\alpha _{S}(m_{{\text {Z}}}) \), and \(m_{{\text {t}}}^{{\text {pole}}}\) are extracted from the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections. These results are considered to be the nominal ones and are checked by repeating the analysis using the \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections.

Fig. 13

The theoretical uncertainties for \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) (upper) and \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) (lower) cross sections, arising from PDF, \(\alpha _{S}(m_{{\text {Z}}})\), and \(m_{{\text {t}}}^{{\text {pole}}}\) variations, as well as the total theoretical uncertainties, with their bin-averaged values shown in brackets. The bins are the same as in Figs. 10 and 11

In Figs. 14, 15 and 16 the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections are compared to the predictions obtained using different PDFs, \(\alpha _{S}(m_{{\text {Z}}}) \), and \(m_{{\text {t}}}^{{\text {pole}}}\) values. For each comparison, a \(\chi ^2\) is calculated, taking into account the uncertainties of the data but ignoring uncertainties of the predictions. For the comparison in Fig. 14, additional \(\chi ^2\) values are determined, taking also PDF uncertainties in the predictions into account, i.e. Eq. (3) becomes \({\mathbf {Cov}} = {\mathbf {Cov}}^{\text {unf}} + {\mathbf {Cov}}^{\text {syst}} + {\mathbf {Cov}}^{{\mathrm {PDF}}}\), where \({\mathbf {Cov}}^{\mathrm {PDF}}\) is a covariance matrix that accounts for the PDF uncertainties. Theoretical uncertainties from scale, \(\alpha _{S}(m_{{\text {Z}}})\), and \(m_{{\text {t}}}^{{\text {pole}}} \) variations are not included in this \(\chi ^2\) calculation. Sizeable differences of the \(\chi ^2\) values are observed for the predictions obtained using different PDFs. These differences can be attributed to the different input data and methodologies that were used to extract these sets of PDFs as discussed elsewhere [95, 96]. Among the PDF sets considered, the best description of the data is provided by the ABMP16 PDFs. This comparison also shows that the data prefer lower \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) values than in the nominal calculation using CT14. The largest sensitivity to \(m_{{\text {t}}}^{{\text {pole}}}\) is observed in the lowest \(M(\hbox {t}{\bar{\hbox {t}}})\) region close to the threshold, while the sensitivity in the other \(M(\hbox {t}{\bar{\hbox {t}}})\) bins occurs mainly because of the cross section normalisation.
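The effect of adding \({\mathbf {Cov}}^{\mathrm {PDF}}\) to the covariance matrix can be sketched as below. The construction of \({\mathbf {Cov}}^{\mathrm {PDF}}\) from symmetric per-eigenvector shifts is an assumption made for this illustration and is not necessarily the construction used in the analysis. All inputs are hypothetical. For normalised cross sections the full covariance matrix is singular, so in practice one bin has to be excluded; this detail is ignored here.

```python
import numpy as np


def pdf_covariance(eigenvector_shifts):
    """Assumed construction: Cov^PDF_ij = sum_k d_i^(k) d_j^(k), where d^(k)
    is the symmetric shift of the prediction for PDF eigenvector k."""
    d = np.asarray(eigenvector_shifts, dtype=float)  # shape (n_eigenvectors, n_bins)
    return d.T @ d


def chi2_total(data, theory, *covariances):
    """chi2 = r^T Cov^-1 r with Cov given by the sum of all supplied matrices."""
    cov = sum(np.asarray(c, dtype=float) for c in covariances)
    r = np.asarray(data, dtype=float) - np.asarray(theory, dtype=float)
    return float(r @ np.linalg.solve(cov, r))


# Hypothetical two-bin example.
data = [1.0, 2.0]
theory = [1.2, 1.7]
cov_unf = np.diag([0.01, 0.02])
cov_syst = np.diag([0.01, 0.01])
cov_pdf = pdf_covariance([[0.05, -0.02]])

chi2_data_only = chi2_total(data, theory, cov_unf, cov_syst)
chi2_with_pdf = chi2_total(data, theory, cov_unf, cov_syst, cov_pdf)
```

Because \({\mathbf {Cov}}^{\mathrm {PDF}}\) is positive semi-definite, including it can only lower the \(\chi ^2\) or leave it unchanged.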

Fig. 14

Comparison of the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to NLO predictions obtained using different PDF sets (further details can be found in Fig. 3). For each theoretical prediction, values of \(\chi ^2\) and dof for the comparison to the data are reported, while additional \(\chi ^2\) values that include PDF uncertainties are shown in parentheses

Fig. 15

Comparison of the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to NLO predictions obtained using different \(\alpha _{S}(m_{{\text {Z}}}) \) values (further details can be found in Fig. 3). For each theoretical prediction, values of \(\chi ^2\) and dof for the comparison to the data are reported

Fig. 16

Comparison of the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to NLO predictions obtained using different \(m_{{\text {t}}}^{{\text {pole}}}\) values (further details can be found in Fig. 3). For each theoretical prediction, values of \(\chi ^2\) and dof for the comparison to the data are reported

The values of \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) are extracted by calculating a \(\chi ^2\) between data and NLO predictions as a function of the input \(\alpha _{S}(m_{{\text {Z}}}) \) or \(m_{{\text {t}}}^{{\text {pole}}}\) value, and approximating the dependence with a parabola. The minimum of the parabola is taken as the extracted \(\alpha _{S}(m_{{\text {Z}}}) \) or \(m_{{\text {t}}}^{{\text {pole}}}\) value, while its uncertainty is estimated from the \({\varDelta }\chi ^2 = 1\) variation. This extraction is performed separately using different PDF sets, as well as different scale values. As for the additional \(\chi ^2\) values in Fig. 14, the PDF uncertainties in the predictions are taken into account in these \(\chi ^2\) calculations. Among the available PDF sets, only CT14, HERAPDF2.0, and ABMP16 provide PDF sets for enough different \(\alpha _{S}(m_{{\text {Z}}}) \) values and are suitable for \(\alpha _{S}(m_{{\text {Z}}}) \) extraction. Because the dependence of the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections on the \(m_{{\text {t}}}^{{\text {MC}}}\) value (as well as on PDFs and \(\alpha _{S}\)) is much smaller than the sensitivity of the theoretical predictions to \(m_{{\text {t}}}^{{\text {pole}}}\) (more details are given in Appendix C), it is not taken into account in the extraction procedure.
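The parabolic approximation of the \(\chi ^2\) scan can be sketched as follows. The function name and the scan points are hypothetical, and the least-squares fit via `numpy.polyfit` is one possible way to obtain the parabola, not necessarily the one used in the analysis.

```python
import numpy as np


def fit_parabola_minimum(param_values, chi2_values):
    """Fit chi2(p) = a (p - p0)^2 + b (p - p0) + c around the mean scan point
    and return (best value, Delta chi2 = 1 uncertainty, minimum chi2)."""
    p = np.asarray(param_values, dtype=float)
    y = np.asarray(chi2_values, dtype=float)
    p0 = p.mean()  # centre the scan points for numerical stability
    a, b, c = np.polyfit(p - p0, y, 2)
    if a <= 0.0:
        raise ValueError("chi2 scan is not convex; no minimum found")
    best = p0 - b / (2.0 * a)
    chi2_min = c - b * b / (4.0 * a)
    sigma = 1.0 / np.sqrt(a)  # chi2(best +/- sigma) = chi2_min + 1
    return best, sigma, chi2_min


# Hypothetical scan in the top quark pole mass (GeV), built from an exact parabola.
m_scan = np.array([167.5, 170.0, 172.5, 175.0, 177.5])
chi2_scan = 30.0 + ((m_scan - 171.0) / 0.8) ** 2
best, sigma, chi2_min = fit_parabola_minimum(m_scan, chi2_scan)
```

For a parabola \(\chi ^2 = a\,(p - p_{\text {min}})^2 + \chi ^2_{\text {min}}\), the condition \({\varDelta }\chi ^2 = 1\) gives an uncertainty of \(1/\sqrt{a}\).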

The \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) scans for different PDF sets are shown in Fig. 17. The extracted \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) values are reported in the plots. Furthermore, the \(\alpha _{S}(m_{{\text {Z}}}) \) (\(m_{{\text {t}}}^{{\text {pole}}}\)) scans were performed using altered scale and \(m_{{\text {t}}}^{{\text {pole}}} \) (\(\alpha _{S}(m_{{\text {Z}}})\)) settings and different PDF sets. For all input PDF sets, the impact of the scale variations is moderate and a weak positive correlation (\(\sim \)30%) between \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}} \) is observed (the distributions are shown in Figs. 25 and 26 in Appendix B).

Fig. 17

The \(\alpha _{S}(m_{{\text {Z}}}) \) (left) and \(m_{{\text {t}}}^{{\text {pole}}}\) (right) extraction at NLO from the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections using different PDF sets. The extracted \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) values are reported for each PDF set, and the estimated minimum \(\chi ^2\) value is shown in brackets. Further details are given in the text

The values of \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\), extracted at NLO, are compared in Fig. 18 to the world average [97] and reported in Tables 2 and 3. The contributions to the total uncertainty arising from the data and from the theoretical prediction due to PDF, scale, and \(m_{{\text {t}}}^{{\text {pole}}} \) or \(\alpha _{S}(m_{{\text {Z}}})\) uncertainties are shown separately. For the extraction of \(\alpha _{S}(m_{{\text {Z}}})\), the experimental, PDF, scale, and \(m_{{\text {t}}}^{{\text {pole}}} \) uncertainties are comparable in magnitude. The size of the PDF uncertainties varies significantly for different PDF sets, and the extracted \(\alpha _{S}(m_{{\text {Z}}})\) values depend on the input PDFs because of a strong correlation between \(\alpha _{S}\) and the gluon distribution. This illustrates that precise and reliable \(\alpha _{S}(m_{{\text {Z}}}) \) extractions from the observed data can be obtained only in a simultaneous PDF and \(\alpha _{S}(m_{{\text {Z}}}) \) fit. For the \(m_{{\text {t}}}^{{\text {pole}}}\) extraction, the total uncertainty is dominated by the data uncertainties. The world average [97] is computed based on extractions of \(m_{{\text {t}}}^{{\text {pole}}}\) from inclusive \(\hbox {t}{\bar{\hbox {t}}}\) cross sections at NNLO+NNLL and differential distributions at NLO, and dominated by the inclusive cross section measurement and a measurement from leptonic distributions. For the combination, correlations were not taken into account. The world average \(\alpha _{S}\) value [97] is based on (at least) full NNLO QCD predictions.

Fig. 18

The \(\alpha _{S}(m_{{\text {Z}}})\) (left) and \(m_{{\text {t}}}^{{\text {pole}}} \) (right) values extracted at NLO using different PDFs. The contributions to the total uncertainty arising from the data and from the theory prediction due to PDF, scale, and \(m_{{\text {t}}}^{{\text {pole}}} \) or \(\alpha _{S}(m_{{\text {Z}}}) \) uncertainties are shown separately. An additional theoretical uncertainty in the extracted \(m_{{\text {t}}}^{{\text {pole}}} \) (right) value of the order of \(+1 \,{\text {GeV}} \), due to missing Coulomb and soft-gluon resummation near the \(\hbox {t}{\bar{\hbox {t}}}\) production threshold, is not shown. The world average values \(\alpha _{S}(m_{{\text {Z}}}) = 0.1181 \pm 0.0011\) and \(m_{{\text {t}}}^{{\text {pole}}} = 173.1 \pm 0.9\,{\text {GeV}} \) from Ref. [97] are shown for reference

Table 2 The \(\alpha _{S}(m_{{\text {Z}}})\) values extracted at NLO using different PDFs, together with their fit, PDF, scale (\(\mu \)), and \(m_{{\text {t}}}^{{\text {pole}}}\) uncertainties
Table 3 The \(m_{{\text {t}}}^{{\text {pole}}}\) values extracted at NLO using different PDFs, together with their fit, PDF, scale (\(\mu \)), and \(\alpha _{S}\) uncertainties. An additional theoretical uncertainty of the order of \(+1 \,{\text {GeV}} \) due to missing Coulomb and soft-gluon resummation near the \(\hbox {t}{\bar{\hbox {t}}}\) production threshold is not shown

Near the mass threshold, relevant for the \(m_{{\text {t}}}^{{\text {pole}}}\) extraction, the fixed-order perturbation series should be improved with Coulomb and soft-gluon resummation that, however, is not available in the tools used to obtain theoretical predictions in this work. In Ref. [98] these effects are found to be relevant only very close to the threshold (within a few GeV) and give a correction of about \(+1\%\) to the total \(\hbox {t}{\bar{\hbox {t}}}\) cross section. A more recent study for the total cross section shows that these corrections are presently known only with a large relative uncertainty [99]. Attributing a \(+1\%\) total cross section correction to our first \(M(\hbox {t}{\bar{\hbox {t}}})\) interval and assuming it to be independent of \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) and \(N_{{\text {jet}}}\) leads to an increase of the predicted cross section in each of the lowest \(M(\hbox {t}{\bar{\hbox {t}}})\) bins by 5%. The effect of such a correction on the extraction of parameters has been tested for the simultaneous PDF, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}}\) fit described in Sect. 10. A shift of \(+0.7 \,{\text {GeV}} \) is observed for \(m_{{\text {t}}}^{{\text {pole}}}\) with respect to the nominal result listed in Eq. (8), while shifts of \(\alpha _{S}\) and PDFs are small. No attempt was made to further quantify the effects on the separate \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) extractions discussed in the present section. In the future, theoretical calculations should include gluon resummation effects to accurately extract \(m_{{\text {t}}}^{{\text {pole}}}\) from differential \(\hbox {t}{\bar{\hbox {t}}}\) cross sections. For the time being one can assume an additional theoretical uncertainty in the performed \(m_{{\text {t}}}^{{\text {pole}}}\) extraction of the order of \(+1 \,{\text {GeV}} \) due to neglecting these effects.

Furthermore, the impact of the parton shower was discussed in Ref. [7], where the predictions for \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production obtained at NLO and using \({\textsc {powheg}} \) NLO calculations matched with the pythia parton shower have been compared (as shown in Fig. 1 of Ref. [7]) and agreement between different approaches was found to be within 0.5\(\,{\text {GeV}}\) for the extracted \(m_{{\text {t}}}^{{\text {pole}}}\).

Moreover, electroweak corrections can be significant in some regions of phase space [100, 101], but for the analysis in this paper no electroweak corrections were applied. For inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production in the kinematic region of this analysis, electroweak corrections are calculated in Ref. [101] and found to be smaller than 2% for any region of the \(M(\hbox {t}{\bar{\hbox {t}}})\) distribution and smaller than 1% for the \(y(\hbox {t}{\bar{\hbox {t}}})\) distribution.

The \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) extraction is validated by repeating the procedure:

  1.

    Using single-differential \(N_{{\text {jet}}}\), \(M(\hbox {t}{\bar{\hbox {t}}})\), \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) cross sections. The plots are available in Appendix B. The largest sensitivity to \(\alpha _{S}\) is observed when using the \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) cross sections; the value for \(\alpha _{S}\) is, however, strongly dependent on the PDF set used. The \(N_{{\text {jet}}}\) distribution provides a smaller \(\alpha _{S}\) sensitivity, but with little dependence on the PDFs. For \(m_{{\text {t}}}^{{\text {pole}}}\), the largest sensitivity is observed when using the \(M(\hbox {t}{\bar{\hbox {t}}})\) cross sections. In fact, almost no sensitivity to \(m_{{\text {t}}}^{{\text {pole}}}\) is present in \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) or \(N_{{\text {jet}}}\) single-differential cross sections. All determinations using the single-differential cross sections yielded \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) values that are consistent with the nominal determination.

  2.

    Using triple-differential \([N^{0,1,2+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections. The distributions are available in Appendix B. The extracted \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) values are consistent with the nominal ones obtained using two \(N_{{\text {jet}}} \) bins and have similar precision with a slightly different uncertainty composition: smaller data uncertainties but larger scale uncertainties are present when using three \(N_{{\text {jet}}}\) bins. The different uncertainties are expected since more \(N_{{\text {jet}}}\) bins provide more sensitivity to \(\alpha _{S}\), while the NLO theoretical predictions for the last \(N_{{\text {jet}}}\) bins (two or more extra jets) have larger scale uncertainties compared to the other bins (as shown in Fig. 13). This shows that NLO QCD predictions are able to describe \(\hbox {t}{\bar{\hbox {t}}}\) data with up to two hard extra jets; however, higher-order calculations are desirable to match the experimental precision and achieve the most accurate \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) determination.

  3.

    Using triple-differential \([p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}}), M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections with two \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\) bins. The NLO calculations for inclusive \(\hbox {t}{\bar{\hbox {t}}}\) and \(\hbox {t}{\bar{\hbox {t}}} +1\) jet production with an appropriate jet \(p_{{\mathrm {T}}}\) threshold are used to describe the distribution in the two \(p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}})\) bins (see Appendix B.1 for further details). The extracted \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) values (the plots are available in Appendix B.1) are consistent with the nominal ones but have slightly larger experimental, PDF, and scale uncertainties compared to the nominal results based on the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections. Nevertheless, these results are an important cross-check, because the \([p_{{\mathrm {T}}} (\hbox {t}{\bar{\hbox {t}}}), M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections are provided at parton level and do not require non-perturbative corrections, which have to be applied for distributions involving \(N_{{\text {jet}}}\).

  4.

    Using unnormalised cross sections. Consistent \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) values are obtained, but with substantially larger experimental and scale uncertainties due to the increased scale dependence of the NLO predictions for the unnormalised cross sections and uncancelled normalisation uncertainties in the measured cross sections.

  5.

    Using NNLO/NLO factors as a function of \(M(\hbox {t}{\bar{\hbox {t}}})\). The ratios of NNLO over NLO calculations from Ref. [86] are used to multiply the NLO calculations. The NNLO/NLO corrections are obtained with the CT14, MMHT2014, and NNPDF3.0 PDF sets and \(m_{{\text {t}}}^{{\text {pole}}} =173.1\) GeV, and applied independently of \(y(\hbox {t}{\bar{\hbox {t}}})\) or \(N_{{\text {jet}}}\) (note that NNLO/NLO corrections for the \(y(\hbox {t}{\bar{\hbox {t}}})\) distribution are generally smaller than for \(M(\hbox {t}{\bar{\hbox {t}}})\), and no NNLO corrections for the \(N_{{\text {jet}}}\) distribution are available). The NNLO/NLO corrections for the \(M(\hbox {t}{\bar{\hbox {t}}})\) bins used in this analysis do not exceed \(2\%\), and the impact on the extracted \(m_{{\text {t}}}^{{\text {pole}}}\) and \(\alpha _{S}\) values is \(-\,0.2\,{\text {GeV}} \), \(-\,0.4\,{\text {GeV}} \), \(+\,0.01\,{\text {GeV}} \), and \(-\,0.0005\), \(-\,0.0006\), \(-\,0.0004\) using the CT14, MMHT2014, NNPDF3.0 PDFs, respectively, which is compatible with the uncertainties assigned to the NLO results. Because of the several assumptions explained above, this study should not be interpreted as an extraction of \(m_{{\text {t}}}^{{\text {pole}}}\) and \(\alpha _{S}\) at NNLO, but only as a test of the scale uncertainties assigned to the NLO results.

  6.

    Using an NLO calculation matched to parton showers. The powheg + pythia simulation with the NNPDF3.0 PDF set is used to determine the \(m_{{\text {t}}}^{{\text {pole}}}\) value. The extracted value is lower by 0.4 GeV than the nominal one determined using the NLO calculations, which is compatible with the uncertainties assigned to the NLO results.

10 Simultaneous PDF, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}} \) fit

The triple-differential normalised \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections are used in a simultaneous PDF, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}}\) fit at NLO (also referred to as a QCD analysis, or PDF fit), together with the combined HERA inclusive deep inelastic scattering (DIS) data [89]. The xFitter program (version 2.0.0) [102], an open-source QCD fit framework for PDF determination, is used. The precise HERA DIS data, obtained from the combination of individual H1 and ZEUS results, are directly sensitive to the valence and sea quark distributions and probe the gluon distribution through scaling violations. Therefore, these data form the core of all PDF fits. The measured \(\hbox {t}{\bar{\hbox {t}}}\) cross sections are included in the fit to constrain \(\alpha _{S}\), \(m_{{\text {t}}}^{{\text {pole}}}\), and the gluon distribution at high values of x, where x is the fraction of the proton momentum carried by a parton. The typical probed x values can be estimated using the LO kinematic relation

$$\begin{aligned} x = \frac{M(\hbox {t}{\bar{\hbox {t}}})}{\sqrt{s}}{\text {e}}^{\pm y(\hbox {t}{\bar{\hbox {t}}})}. \end{aligned}$$
(6)

The present measurement is expected to be mostly sensitive to x values in the region \(0.01 \lesssim x \lesssim 0.1\), as estimated using the highest or lowest \(|y(\hbox {t}{\bar{\hbox {t}}}) |\) or \(M(\hbox {t}{\bar{\hbox {t}}})\) bins and taking the low or high bin edge where the cross section is largest.
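The LO relation of Eq. (6) can be evaluated with a short sketch. The centre-of-mass energy of 13 TeV is an assumption of this example (it is not stated in this section), and the kinematic values in the usage lines are illustrative rather than actual bin edges of the measurement.

```python
import math


def probed_x(m_ttbar_gev, y_ttbar, sqrt_s_gev=13000.0):
    """Return the two parton momentum fractions from Eq. (6),
    x = M(ttbar) / sqrt(s) * exp(+-y(ttbar)).
    The default sqrt(s) = 13 TeV is an assumption of this sketch."""
    ratio = m_ttbar_gev / sqrt_s_gev
    return ratio * math.exp(y_ttbar), ratio * math.exp(-y_ttbar)


# Illustrative kinematic points (hypothetical, not the analysis binning).
x_central = probed_x(400.0, 0.0)   # central production near threshold
x_forward = probed_x(400.0, 1.0)   # same mass, boosted ttbar system
```

At \(y(\hbox {t}{\bar{\hbox {t}}}) = 0\) both partons carry the same momentum fraction, while a non-zero rapidity raises one fraction and lowers the other, with their product fixed at \(M(\hbox {t}{\bar{\hbox {t}}})^2/s\).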

10.1 Details of the QCD analysis

The scale evolution of partons is calculated through DGLAP equations [103,104,105,106,107,108,109] at NLO, as implemented in the qcdnum program [110] (version 17.01.14). The Thorne–Roberts [111,112,113] variable-flavour number scheme at NLO is used for the treatment of the heavy-quark contributions. The number of flavours is set to 5, with \({\text {c}}\) and \({\text {b}}\) quark mass parameters \(M_{{\text {c}}}= 1.47\) \(\,{\text {GeV}}\) and \(M_{{\text {b}}} = 4.5\) \(\,{\text {GeV}}\) [89]. For the DIS data \(\mu _{{\mathrm {r}}}\) and \(\mu _{{\mathrm {f}}}\) are set to Q, which denotes the four-momentum transfer. The \(Q^2\) range of the HERA data is restricted to \(Q^2 > Q^2_{\text {min}} = 3.5\,{\text {GeV}} ^2\) [89]. The theoretical predictions for the \(\hbox {t}{\bar{\hbox {t}}}\) cross sections are calculated as described in Sect. 9 and are included in the fit using the mg5_amc@nlo (version 2.6.0) [38] framework, interfaced with the aMCfast (version 1.3.0) [114] and ApplGrid (version 1.4.70) [115] programs. The \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) are left free in the fit. Technically, ApplGrid tables have been produced for fixed values of \(m_{{\text {t}}}^{{\text {pole}}}\), while the theoretical predictions as a function of \(m_{{\text {t}}}^{{\text {pole}}}\) were obtained by linear interpolation between two predictions using different \(m_{{\text {t}}}^{{\text {pole}}}\) values. The results do not depend significantly on which particular \(m_{{\text {t}}}^{{\text {pole}}}\) values are used for linear interpolation. Consistent results were also obtained using a cubic spline interpolation.
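The linear interpolation in \(m_{{\text {t}}}^{{\text {pole}}}\) between two precomputed predictions can be sketched as follows. The function name, mass points, and cross section values are hypothetical; in the analysis the two predictions would come from ApplGrid tables produced at fixed \(m_{{\text {t}}}^{{\text {pole}}}\) values.

```python
import numpy as np


def interpolate_prediction(m_t, m_lo, pred_lo, m_hi, pred_hi):
    """Linearly interpolate (or extrapolate) bin-wise predictions computed
    at two fixed mass values m_lo and m_hi to the requested mass m_t."""
    w = (m_t - m_lo) / (m_hi - m_lo)
    return (1.0 - w) * np.asarray(pred_lo, dtype=float) + w * np.asarray(pred_hi, dtype=float)


# Hypothetical predictions in three bins at two mass points (GeV).
pred_170 = np.array([0.30, 0.45, 0.25])
pred_175 = np.array([0.26, 0.47, 0.27])
pred_1725 = interpolate_prediction(172.5, 170.0, pred_170, 175.0, pred_175)
```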

The procedure for the determination of the PDFs follows the approach of HERAPDF2.0 [89]. The parametrised PDFs are the gluon distribution \(x{\text {g}}(x)\), the valence quark distributions \(x{\text {u}}_{{\mathrm {v}}}(x)\) and \(x{\text {d}}_{{\mathrm {v}}}(x)\), and the \({\text {u}}\)- and \({\text {d}}\)-type antiquark distributions \(x\overline{{\mathrm {U}}}(x)\) and \(x\overline{{\mathrm {D}}}(x)\). At the initial QCD evolution scale \(\mu _{{\mathrm {f0}}}^2 = 1.9\,{\text {GeV}} ^2\), the PDFs are parametrised as:

$$\begin{aligned} \begin{aligned} x{\text {g}}(x)&= A_{{\text {g}}} x^{B_{{\text {g}}}}\,(1-x)^{C_{{\text {g}}}}\, (1 + E_{{\text {g}}} x^2) - A'_{{\text {g}}} x^{B'_{{\text {g}}}}\,(1-x)^{C'_{{\text {g}}}},\\ x{\text {u}}_{{\mathrm {v}}}(x)&= A_{{\text {u}}_{{\mathrm {v}}}}x^{B_{{\text {u}}_{{\mathrm {v}}}}}\, (1-x)^{C_{{\text {u}}_{{\mathrm {v}}}}}\,(1+D_{{\text {u}}_{{\mathrm {v}}}}x),\\ x{\text {d}}_{{\mathrm {v}}}(x)&= A_{{\text {d}}_{{\mathrm {v}}}}x^{B_{{\text {d}}_{{\mathrm {v}}}}}\,(1-x)^{C_{{\text {d}}_{{\mathrm {v}}}}},\\ x\overline{{\mathrm {U}}}(x)&= A_{\overline{{\mathrm {U}}}}x^{B_{\overline{{\mathrm {U}}}}}\, (1-x)^{C_{\overline{{\mathrm {U}}}}}\, (1+D_{\overline{{\mathrm {U}}}}x), \\ x\overline{{\mathrm {D}}}(x)&= A_{\overline{{\mathrm {D}}}}x^{B_{\overline{{\mathrm {D}}}}}\, (1-x)^{C_{\overline{{\mathrm {D}}}}}, \end{aligned} \end{aligned}$$
(7)

assuming the relations \(x\overline{{\mathrm {U}}}(x) = x{\bar{{\text {u}}}}(x)\) and \(x\overline{{\mathrm {D}}}(x) = x{\bar{{\text {d}}}}(x) + x{\bar{{\text {s}}}}(x)\). Here, \(x{\bar{{\text {u}}}}(x)\), \(x{\bar{{\text {d}}}}(x)\), and \(x{\bar{{\text {s}}}}(x)\) are the up, down, and strange antiquark distributions, respectively. The sea quark distribution is defined as \(x{\varSigma }(x)=x{\bar{{\text {u}}}}(x)+x{\bar{{\text {d}}}}(x)+x{\bar{{\text {s}}}}(x)\). The normalisation parameters \(A_{{\text {u}}_{{\mathrm {v}}}}\), \(A_{{\text {d}}_{{\mathrm {v}}}}\), and \(A_{{\text {g}}}\) are determined by the QCD sum rules. The B and \(B'\) parameters determine the PDFs at small x, and the C parameters describe the shape of the distributions as \(x\,{\rightarrow }\,1\). The parameter \(C'_{{\text {g}}}\) is fixed to 25 such that the term does not contribute at large x [89, 116]. Additional constraints \(B_{\overline{{\mathrm {U}}}} = B_{\overline{{\mathrm {D}}}}\) and \(A_{\overline{{\mathrm {U}}}} = A_{\overline{{\mathrm {D}}}}(1 - f_{{\text {s}}})\) are imposed to ensure the same normalisation for the \(x{\bar{{\text {u}}}}\) and \(x{\bar{{\text {d}}}}\) distributions as \(x \rightarrow 0\). The strangeness fraction \(f_{{\text {s}}} = x{\bar{{\text {s}}}}/( x{\bar{{\text {d}}}}+ x{\bar{{\text {s}}}})\) is fixed to \(f_{{\text {s}}}=0.4\) as in the HERAPDF2.0 analysis [89]. This value is consistent with the determination of the strangeness fraction when using the CMS measurements of \({\text {W}}\)+\({{\text {c}}}\) production [117].
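As an illustration of the functional forms in Eq. (7) and of the constraints just described, the following minimal sketch evaluates the gluon parametrisation and checks numerically that \(A_{\overline{{\mathrm {U}}}} = A_{\overline{{\mathrm {D}}}}(1 - f_{{\text {s}}})\) with \(B_{\overline{{\mathrm {U}}}} = B_{\overline{{\mathrm {D}}}}\) gives \(x{\bar{{\text {u}}}} \approx x{\bar{{\text {d}}}}\) at small x. All numerical parameter values are hypothetical and are not the fitted ones, and the \(D_{\overline{{\mathrm {U}}}}\) term is set to zero for simplicity.

```python
def x_gluon(x, A, B, C, E, A_p, B_p, C_p=25.0):
    """Gluon form of Eq. (7): main term minus the flexible negative term."""
    main = A * x**B * (1.0 - x)**C * (1.0 + E * x**2)
    negative = A_p * x**B_p * (1.0 - x)**C_p
    return main - negative


def x_sea(x, A, B, C):
    """Simple sea form A x^B (1-x)^C (the D term is set to zero here)."""
    return A * x**B * (1.0 - x)**C


# Hypothetical parameter values, for illustration only.
F_S = 0.4                        # strangeness fraction, as fixed in the text
A_DBAR, B_SEA = 0.25, -0.15      # shared small-x power: B_Ubar = B_Dbar
A_UBAR = A_DBAR * (1.0 - F_S)    # constraint on the normalisations

x_small = 1e-6
x_ubar = x_sea(x_small, A_UBAR, B_SEA, 7.0)
x_Dbar = x_sea(x_small, A_DBAR, B_SEA, 5.0)
x_dbar = (1.0 - F_S) * x_Dbar    # since xDbar = xdbar + xsbar
ratio = x_ubar / x_dbar          # tends to 1 as x -> 0

# With C'_g = 25 the negative gluon term is already tiny at x = 0.5.
suppression = (1.0 - 0.5) ** 25
```

The factor \((1-x)^{25}\) is about \(3\times 10^{-8}\) at \(x = 0.5\), which shows why the negative gluon term is irrelevant at large x.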

The D and E parameters are added for some distributions in order to provide a more flexible functional form. The parameters in Eq. (7) are selected by first fitting with all D and E parameters set to zero, and then including them independently one at a time in the fit. The improvement in the \(\chi ^2\) of the fit is monitored and the procedure is stopped when no further improvement is observed. This leads to a 15-parameter fit. The \(\chi ^2\) definition used for the HERA DIS data follows that of Eq. (32) in Ref. [89]. It includes an additional logarithmic term that is relevant when the estimated statistical and uncorrelated systematic uncertainties in the data are rescaled during the fit [118]. For the \(\hbox {t}{\bar{\hbox {t}}}\) data presented here, a \(\chi ^2\) definition without such a logarithmic term is employed. The treatment of the experimental uncertainties in the \(\hbox {t}{\bar{\hbox {t}}}\) double-differential cross section measurements follows the prescription given in Sect. 8. The correlated systematic uncertainties are treated through nuisance parameters. For each nuisance parameter a Gaussian probability density function is assumed and a corresponding penalty term is added to the \(\chi ^2\). The treatment of the experimental uncertainties for the HERA DIS data follows the prescription given in Ref. [89].
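The role of the nuisance parameters and their penalty terms can be sketched as follows. This is a simplified illustration that assumes uncorrelated uncertainties per bin, additive correlated shifts, and no logarithmic term. The function and argument names are hypothetical, and the definitions actually used are those of Sect. 8 and Ref. [89].

```python
def chi2_with_nuisances(data, theory, sigma_unc, beta, b):
    """Simplified chi2 with Gaussian nuisance parameters.

    data, theory : measured and predicted values, one per bin
    sigma_unc    : uncorrelated uncertainty per bin
    beta[i][k]   : shift of bin i for a one-sigma change of source k
    b[k]         : nuisance parameter value for source k
    """
    chi2 = 0.0
    for i, (d, t, s) in enumerate(zip(data, theory, sigma_unc)):
        shift = sum(beta[i][k] * b[k] for k in range(len(b)))
        chi2 += ((d - t - shift) / s) ** 2
    # One unit-width Gaussian penalty term per nuisance parameter.
    chi2 += sum(bk ** 2 for bk in b)
    return chi2


# Toy numbers (hypothetical): two bins, one correlated source.
data = [10.0, 20.0]
theory = [9.0, 21.0]
sigma_unc = [1.0, 2.0]
beta = [[1.0], [-1.0]]
chi2_no_shift = chi2_with_nuisances(data, theory, sigma_unc, beta, [0.0])
chi2_shifted = chi2_with_nuisances(data, theory, sigma_unc, beta, [0.5])
```

In this toy case a shift of half a standard deviation lowers the \(\chi ^2\) from 1.25 to 0.5625, because the improvement in the bin residuals outweighs the penalty of 0.25.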

The uncertainties are estimated according to the general approach of HERAPDF2.0 [89] in which the fit, model, and parametrisation uncertainties are taken into account. Fit uncertainties are determined using the criterion of \({\varDelta }\chi ^2 = 1\). Model uncertainties arise from the variations in the values assumed for the \({\text {c}}\) quark mass parameter of \(1.41\le M_{{\text {c}}}\le 1.53\,{\text {GeV}} \), the strangeness fraction \(0.3 \le f_{{\text {s}}} \le 0.5\), and the value of \(Q^2_{{\text {min}}}\) imposed on the HERA data. The latter is varied within \(2.5 \le Q^2_{\text {min}} \le 5.0\,{\text {GeV}} ^2\), following Ref. [89]. The parametrisation uncertainty is estimated by varying the functional form in Eq. (7) of all parton distributions, with D and E parameters added or removed one at a time. Additional parametrisation uncertainties are considered by using two other functional forms in Eq. (7): with \(A'_{{\text {g}}} = 0\) and \(E_{{\text {g}}} = 0\), since the \(\chi ^2\) in these variants of the fit are only a few units worse than that with the nominal parametrisation. Furthermore, \(\mu _{{\mathrm {f0}}}^2\) is changed from 1.9 to 1.6 and \(2.2\,{\text {GeV}} ^2\). The parametrisation uncertainty is constructed as an envelope, built from the maximal differences between the PDFs or QCD parameters resulting from the central fit and all parametrisation variations. For the PDFs, this uncertainty is valid in the x range covered by the PDF fit to the data. The total uncertainty is obtained by adding the fit, model, and parametrisation uncertainties in quadrature. For \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) extraction, the scale uncertainties in the theoretical predictions for \(\hbox {t}{\bar{\hbox {t}}}\) production are also considered.
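The envelope construction for the parametrisation uncertainty can be sketched as below. The central value and the variation results are hypothetical numbers chosen only for illustration.

```python
def envelope(central, variations):
    """Return the (up, down) envelope: the largest positive and the
    largest negative difference between any variation and the central value."""
    up = max((v - central for v in variations if v > central), default=0.0)
    down = max((central - v for v in variations if v < central), default=0.0)
    return up, down


# Hypothetical results of three parametrisation variations of one quantity.
central = 1.00
variations = [1.03, 0.99, 1.01]
up, down = envelope(central, variations)
```

Unlike a quadrature sum, the envelope does not grow with the number of variations considered, since only the largest deviation in each direction is kept.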

A cross-check is performed using the MC method [119, 120]. It is based on analysing a large number of pseudo-data sets called replicas. For this cross-check, 1000 replicas are created by taking the data and fluctuating the values of the cross sections randomly within their statistical and systematic uncertainties taking correlations into account. All uncertainties are assumed to follow Gaussian distributions. The central values for the fitted parameters and their uncertainties are estimated using the mean and RMS values over the replicas. The obtained values of the PDF parameters, \(\alpha _{S}(m_{{\text {Z}}})\), and \(m_{{\text {t}}}^{{\text {pole}}}\) and their fit uncertainties are in agreement with the nominal results.
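The replica procedure can be sketched as follows. In this simplified illustration the statistical fluctuations are treated as independent between bins, each correlated systematic source shifts all bins coherently, and the function `fit_parameter` is a hypothetical stand-in for the full QCD fit. All input numbers are toy values.

```python
import random
import statistics


def make_replica(data, stat, syst, rng):
    """Create one pseudo-data set by Gaussian fluctuation of the data.

    stat[i]    : statistical uncertainty of bin i (independent per bin here)
    syst[k][i] : one-sigma shift of bin i from correlated source k
    """
    shifts = [rng.gauss(0.0, 1.0) for _ in syst]  # one draw per source
    replica = []
    for i, value in enumerate(data):
        fluct = rng.gauss(0.0, stat[i])
        fluct += sum(s * source[i] for s, source in zip(shifts, syst))
        replica.append(value + fluct)
    return replica


def fit_parameter(pseudo_data):
    """Hypothetical stand-in for the fit: the average over the bins."""
    return sum(pseudo_data) / len(pseudo_data)


rng = random.Random(12345)     # fixed seed for reproducibility
data = [1.00, 1.10, 0.95]      # toy measurements
stat = [0.02, 0.02, 0.03]
syst = [[0.01, 0.01, 0.01]]    # one fully correlated source

results = [fit_parameter(make_replica(data, stat, syst, rng))
           for _ in range(1000)]
central = statistics.mean(results)   # central value over the replicas
rms = statistics.pstdev(results)     # spread, taken as the fit uncertainty
```

For this toy setup the spread over the replicas should be close to the analytically propagated uncertainty of about 0.017, which is the kind of agreement with the nominal result that the cross-check looks for.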

Table 4 The individual contributions to the uncertainties for the \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) determination
Table 5 The global and partial \(\chi ^2\)/dof values for all variants of the QCD analysis. The variant of the fit that uses the HERA DIS data only is denoted as ‘Nominal fit’. For the HERA measurements, the energy of the proton beam, \(E_{{\text {p}}}\), is listed for each data set, with the electron energy being \(E_{{\text {e}}}=27.5\,{\text {GeV}} \), and CC and NC standing for charged and neutral current, respectively. The correlated \(\chi ^2\) and the log-penalty \(\chi ^2\) entries refer to the \(\chi ^2\) contributions from the nuisance parameters and from the logarithmic term, respectively, as described in the text
Fig. 19

Comparison of the measured \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections to the NLO predictions using the parameter values from the simultaneous PDF, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}}\) fit (further details can be found in Fig. 3). Values of \(\chi ^2\) and dof are reported

Fig. 20

Comparison of the measured \([y({\text {t}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) cross sections to the NLO predictions using the parameter values from the simultaneous PDF, \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) fit of the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections, as well as the predictions obtained using the NNPDF3.1 and ABMP16 PDF sets with different values of \(m_{{\text {t}}}^{{\text {pole}}}\) (see Fig. 3 for further details). In the lower panel, the ratios of the data and theoretical predictions to the predictions from the fit are shown. For each theoretical prediction, values of \(\chi ^2\) and dof for the comparison to the data are reported

10.2 Fit results

The resulting values of \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) extracted using NLO calculations are:

$$\begin{aligned} \alpha _{S}(m_{{\text {Z}}})&= 0.1135 \pm 0.0016\,({\text {fit}}) {}^{+0.0002}_{-0.0004}\,({\text {model}}) \nonumber \\&\quad {}^{+0.0008}_{-0.0001}\,({\text {param}}) {}^{+0.0011}_{-0.0005}\,({\text {scale}})\nonumber \\&= 0.1135{}^{+0.0021}_{-0.0017}\,({\text {total}}),\nonumber \\ m_{{\text {t}}}^{{\text {pole}}}&= 170.5 \pm 0.7\,({\text {fit}}) \pm 0.1\,({\text {model}}) {}^{+0.0}_{-0.1}\,({\text {param}})\nonumber \\&\quad \pm 0.3\,({\text {scale}})\,{\text {GeV}} \nonumber \\&= 170.5 \pm 0.8 ({\text {total}})\,{\text {GeV}}. \end{aligned}$$
(8)

Here ‘fit’, ‘model’ and ‘param’ denote the fit, model and parametrisation uncertainties discussed above. The uncertainties arising from the scale variations are estimated by repeating the fit with altered values of the scales as described in Sect. 9 and taking the differences with respect to the nominal result. The individual contributions to the uncertainties are listed in Table 4. The extracted \(\alpha _{S}(m_{{\text {Z}}})\) and \(m_{{\text {t}}}^{{\text {pole}}}\) values have only a weak positive correlation, \(\rho (\alpha _{S}(m_{{\text {Z}}}),m_{{\text {t}}}^{{\text {pole}}}) = 0.3\), where the correlation was obtained from the data uncertainties propagated to the fit. This shows that the two SM parameters can be simultaneously determined from these data to high precision with only weak correlation between them. As discussed in Sect. 9, one expects an additional theoretical uncertainty in the extracted \(m_{{\text {t}}}^{{\text {pole}}}\) value of the order of \(+1 \,{\text {GeV}} \) due to gluon resummation corrections that are missing in the NLO calculation.
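The total uncertainties quoted in Eq. (8) can be reproduced by adding the individual contributions in quadrature, separately for the upward and downward directions. The short check below uses only the numbers given in Eq. (8); the function name is an illustrative choice.

```python
import math


def total_uncertainty(components):
    """Quadrature sum of (up, down) uncertainty components."""
    up = math.sqrt(sum(u ** 2 for u, _ in components))
    down = math.sqrt(sum(d ** 2 for _, d in components))
    return up, down


# (up, down) contributions from Eq. (8): fit, model, param, scale.
alpha_s_parts = [(0.0016, 0.0016), (0.0002, 0.0004),
                 (0.0008, 0.0001), (0.0011, 0.0005)]
m_t_parts = [(0.7, 0.7), (0.1, 0.1), (0.0, 0.1), (0.3, 0.3)]  # in GeV

as_up, as_down = total_uncertainty(alpha_s_parts)
mt_up, mt_down = total_uncertainty(m_t_parts)
```

Rounded to the quoted precision, this gives \({}^{+0.0021}_{-0.0017}\) for \(\alpha _{S}(m_{{\text {Z}}})\) and \(\pm 0.8\,{\text {GeV}} \) for \(m_{{\text {t}}}^{{\text {pole}}}\), in agreement with the totals in Eq. (8). The fit uncertainty is the dominant contribution in both cases.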

The global and partial \(\chi ^2\) values of the fit are listed in Table 5, illustrating the consistency of the input data with the fit model. In particular, the \(\hbox {t}{\bar{\hbox {t}}}\) data are well described in the fit. The DIS data show \(\chi ^2\)/dof values slightly larger than unity, similar to what is observed and investigated in Ref. [89]. For the \(\hbox {t}{\bar{\hbox {t}}}\) data, the full \(\chi ^2\) (including uncorrelated and correlated data uncertainties) is 20 for 23 degrees of freedom. The \(\hbox {t}{\bar{\hbox {t}}}\) cross sections are compared to the NLO predictions obtained after the fit in Fig. 19. Furthermore, in Fig. 20 the \([y({\text {t}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) cross sections (which were not used in the fit) are compared to NLO predictions obtained using the fitted PDFs, \(\alpha _{S}\), and \(m_{{\text {t}}}^{{\text {pole}}}\), as well as other global PDF sets. The data are in satisfactory agreement with the predictions obtained in this analysis. In particular, these predictions or predictions obtained using the ABMP16 PDF set describe the slope of \(p_{{\mathrm {T}}} ({\text {t}})\) considerably better than the predictions obtained using the NNPDF3.1 PDF set, while the difference in the \(\chi ^2\) values is less significant. Additionally, the predicted \(p_{{\mathrm {T}}} ({\text {t}})\) slope is sensitive to the \(m_{{\text {t}}}^{{\text {pole}}}\) values used in the calculations.

Fits were performed for a series of \(\alpha _{S}(m_{{\text {Z}}})\) values ranging from \(\alpha _{S}(m_{{\text {Z}}}) = 0.100\) to \(\alpha _{S}(m_{{\text {Z}}}) = 0.130\) using only HERA DIS data, or HERA and \(\hbox {t}{\bar{\hbox {t}}} \) data. The results are shown in Fig. 21. A shallow \(\chi ^2\) dependence on \(\alpha _{S}(m_{{\text {Z}}})\) is present when using only the HERA DIS data, similar to the findings of the HERAPDF2.0 analysis [89]. Once the \(\hbox {t}{\bar{\hbox {t}}}\) data are included in the fit, a distinctly sharper minimum in \(\chi ^2\) is observed which coincides with the one found in the simultaneous PDF and \(\alpha _{S}(m_{{\text {Z}}})\) fit given in Eq. (8).
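A scan of this kind is commonly summarised by interpolating the \(\chi ^2\) values near the minimum with a parabola and reading off the position of the minimum and the \({\varDelta }\chi ^2 = 1\) half-width. The sketch below illustrates this on a toy \(\chi ^2\) curve that is constructed by hand as an exact parabola. The curve, the grid spacing, and the function names are assumptions for illustration and do not correspond to the values shown in Fig. 21.

```python
import math


def parabola_through(xs, ys):
    """Coefficients (a, b, c) of y = a x^2 + b x + c through three points."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    d01 = (y1 - y0) / (x1 - x0)      # first divided differences
    d12 = (y2 - y1) / (x2 - x1)
    a = (d12 - d01) / (x2 - x0)      # second divided difference
    b = d01 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c


def scan_minimum(alphas, chi2s):
    """Minimum position, chi2 at the minimum, and Delta chi2 = 1 half-width."""
    i = min(range(len(chi2s)), key=chi2s.__getitem__)
    i = min(max(i, 1), len(chi2s) - 2)   # keep a neighbour on each side
    a, b, c = parabola_through(alphas[i - 1:i + 2], chi2s[i - 1:i + 2])
    if a <= 0.0:
        raise ValueError("no minimum: the scan is not convex here")
    best = -b / (2.0 * a)
    chi2_min = c - b * b / (4.0 * a)
    return best, chi2_min, 1.0 / math.sqrt(a)


# Toy scan (hypothetical): exact parabola on a grid from 0.100 to 0.130.
alphas = [0.100 + 0.005 * k for k in range(7)]
chi2s = [1000.0 + ((a - 0.1135) / 0.0016) ** 2 for a in alphas]
best, chi2_min, sigma = scan_minimum(alphas, chi2s)
```

A shallow \(\chi ^2\) dependence corresponds to a small curvature and hence a large half-width, whereas a sharper minimum gives a smaller uncertainty.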

Fig. 21

\({\varDelta }\chi ^2 = \chi ^2 - \chi ^2_{{\text {min}}}\) as a function of \(\alpha _{S}(m_{{\text {Z}}})\) in the QCD analysis using the HERA DIS data only, or HERA and \(\hbox {t}{\bar{\hbox {t}}}\) data

Fig. 22

The PDFs with their total uncertainties in the fit using the HERA DIS data only, and the HERA DIS and \(\hbox {t}{\bar{\hbox {t}}}\) data. The results are normalised to the PDFs obtained using the HERA DIS data only

Fig. 23

The relative total PDF uncertainties in the fit using the HERA DIS data only, and the HERA DIS and \(\hbox {t}{\bar{\hbox {t}}}\) data

Both the \(\hbox {t}{\bar{\hbox {t}}}\) and the HERA DIS data are sensitive to the \(\alpha _{S}(m_{{\text {Z}}})\) value in the fit: while the constraints from the \(\hbox {t}{\bar{\hbox {t}}}\) data seem to be dominant, the residual dependence of \(\alpha _{S}(m_{{\text {Z}}})\) on the HERA DIS data may remain nonnegligible. There is no way to assess the latter quantitatively because the HERA DIS data cannot be removed from the PDF fit. However, as was investigated in the HERAPDF2.0 analysis [89], when using only the HERA DIS data the minima are strongly dependent on the \(Q^2_{\text {min}}\) threshold. As a cross-check, the extraction of \(\alpha _{S}(m_{{\text {Z}}})\) was repeated for a larger threshold variation \(2.5 \le Q^2_{\text {min}} \le 30.0\,{\text {GeV}} ^2\). In contrast to the results of Ref. [89] obtained using only HERA DIS data, when adding the \(\hbox {t}{\bar{\hbox {t}}}\) data the extracted values of \(\alpha _{S}(m_{{\text {Z}}})\) show no systematic dependence on \(Q^2_{\text {min}}\) and are consistent with the nominal result of Eq. (8) within the total uncertainty.

To demonstrate the added value of the \(\hbox {t}{\bar{\hbox {t}}}\) cross sections, the QCD analysis is first performed using only the HERA DIS data. In this fit, \(\alpha _{S}(m_{{\text {Z}}}) \) is fixed to the value extracted from the fit using the \(\hbox {t}{\bar{\hbox {t}}}\) data, \(\alpha _{S}(m_{{\text {Z}}}) = 0.1135\), and the \(\alpha _{S}(m_{{\text {Z}}}) \) uncertainty of \(\pm 0.0016\) is added to the fit uncertainties. Then the \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) measurement is added to the fit. The global and partial \(\chi ^2\) values for the two variants of the fit are listed in Table 5.

The corresponding PDFs are compared in Fig. 22. The largest impact of the \(\hbox {t}{\bar{\hbox {t}}}\) data is observed at \(x \gtrsim 0.1\). In this region the gluon distribution lacks direct constraints in the fit using the HERA DIS data only. The impact on the valence and sea quark PDFs is expected because of the correlations between the different distributions in the fit arising in the PDF evolution and from the momentum sum rule.

In Fig. 23 the total PDF uncertainties are shown for the two variants of the fits. A reduction of uncertainties is observed for the gluon distribution, especially at \(x \sim 0.1\) where the included \(\hbox {t}{\bar{\hbox {t}}}\) data are expected to provide constraints, while the improvement at \(x \lesssim 0.1\) originates mainly from the reduced correlation between \(\alpha _{S}(m_{{\text {Z}}})\) and the gluon PDF. A smaller uncertainty reduction is observed for other PDFs as well (valence and sea quark distributions), because of the correlations between the PDF distributions in the fit, as explained above. In addition to the fit uncertainty reduction, the \(\hbox {t}{\bar{\hbox {t}}}\) data constrain the large asymmetric model uncertainty of the gluon PDF at high x. This uncertainty originates from the variation of \(Q^2_{\text {min}}\) in the fit, using the HERA DIS data only, because of a lack of direct constraints from these data.

In Fig. 24 the extracted \(\alpha _{S}\), \(m_{{\text {t}}}^{{\text {pole}}}\), and gluon PDF at the scale \(\mu _{{\mathrm {f}}}^2 = 30{,}000\,{\text {GeV}} ^2\) for several values of x are shown, together with their correlations. For this plot, the asymmetric \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) uncertainties are symmetrised by taking the largest deviation, and the correlation of the fit uncertainties is assumed for the total uncertainties as well. The evolution of PDFs involves \(\alpha _{S}(m_{{\text {Z}}})\); therefore, PDFs always depend on the \(\alpha _{S}(m_{{\text {Z}}})\) assumed during their extraction. When using only the HERA DIS data, the largest dependence on \(\alpha _{S}(m_{{\text {Z}}})\) is observed for the gluon distribution. The \(\hbox {t}{\bar{\hbox {t}}}\) data reduce this dependence, because they provide constraints on both the gluon distribution and \(\alpha _{S}\), reducing their correlation. In addition, the multi-differential \([N^{0,1+}_{{\text {jet}}}, M(\hbox {t}{\bar{\hbox {t}}}), y(\hbox {t}{\bar{\hbox {t}}}) ]\) cross sections provide constraints on \(m_{{\text {t}}}^{{\text {pole}}}\). As a result, the gluon PDF, \(\alpha _{S}(m_{{\text {Z}}})\), and \(m_{{\text {t}}}^{{\text {pole}}}\) can be determined simultaneously and their fitted values depend only weakly on each other. This makes future PDF fits at NNLO, once corresponding theoretical predictions for inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production with additional jets become available, very interesting.

Fig. 24

The extracted values and their correlations for \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\) (upper left), \(\alpha _{S}\) and gluon PDF (lower left), and \(m_{{\text {t}}}^{{\text {pole}}}\) and gluon PDF (lower right). The gluon PDF is shown at the scale \(\mu _{{\mathrm {f}}}^2 = 30\,000\,{\text {GeV}} ^2\) for several values of x. For the extracted values of \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\), the additional uncertainties arising from the dependence on scale are shown (see Eq. (8) and Table 4). The correlation coefficients \(\rho \) are also displayed. Furthermore, values of \(\alpha _{S}\) (\(m_{{\text {t}}}^{{\text {pole}}}\), gluon PDF) extracted using fixed values of \(m_{{\text {t}}}^{{\text {pole}}}\) (\(\alpha _{S}\)) are displayed as dashed, dotted, or dash-dotted lines. The world average values \(\alpha _{S}(m_{{\text {Z}}}) = 0.1181 \pm 0.0011\) and \(m_{{\text {t}}}^{{\text {pole}}} = 173.1 \pm 0.9\,{\text {GeV}} \) from Ref. [97] are shown for reference

11 Summary

A measurement was presented of normalised multi-differential \(\hbox {t}{\bar{\hbox {t}}}\) production cross sections in \({\text {p}}{\text {p}} \) collisions at \(\sqrt{s}=13\,{\text {TeV}} \), performed using events containing two oppositely charged leptons (electron or muon). The analysed data were recorded in 2016 with the CMS detector at the LHC, and correspond to an integrated luminosity of \(35.9{\,{\text {fb}}^{-1}} \). The normalised \(\hbox {t}{\bar{\hbox {t}}}\) cross section is measured in the full phase space as a function of different pairs of kinematic variables that describe either the top quark or the \(\hbox {t}{\bar{\hbox {t}}}\) system. None of the central predictions of the tested Monte Carlo models is able to correctly describe all the distributions. The data exhibit softer transverse momentum \(p_{{\mathrm {T}}} ({\text {t}})\) distributions than given by the theoretical predictions, as was reported in previous single-differential and double-differential \(\hbox {t}{\bar{\hbox {t}}}\) cross section measurements. The effect of the softer \(p_{{\mathrm {T}}} ({\text {t}})\) spectra in the data relative to the predictions is enhanced at larger values of the invariant mass of the \(\hbox {t}{\bar{\hbox {t}}}\) system. The predicted \(p_{{\mathrm {T}}} ({\text {t}})\) slopes are strongly sensitive to the parton distribution functions (PDFs) and the top quark pole mass \(m_{{\text {t}}}^{{\text {pole}}}\) value used in the calculations, and the description of the data can be improved by changing these parameters.

The measured \(\hbox {t}{\bar{\hbox {t}}}\) cross sections as a function of the invariant mass and rapidity of the \(\hbox {t}{\bar{\hbox {t}}}\) system, and the multiplicity of additional jets, have been incorporated into two specific fits of QCD parameters at next-to-leading order, after applying corrections for nonperturbative effects, together with the inclusive deep inelastic scattering data from HERA. When fitting only \(\alpha _{S}(m_{{\text {Z}}}) \) and \(m_{{\text {t}}}^{{\text {pole}}}\) to the data, using external PDFs, the two parameters are determined with high accuracy and rather weak correlation between them; however, the extracted \(\alpha _{S}(m_{{\text {Z}}}) \) values depend on the PDF set. In a simultaneous fit of \(\alpha _{S}\), \(m_{{\text {t}}}^{{\text {pole}}}\), and PDFs, the inclusion of the new multi-differential \(\hbox {t}{\bar{\hbox {t}}}\) measurements has a significant impact on the extracted gluon PDF at large values of x, where x is the fraction of the proton momentum carried by a parton, and at the same time allows an accurate determination of \(\alpha _{S}\) and \(m_{{\text {t}}}^{{\text {pole}}}\). The values \(\alpha _{S}(m_{{\text {Z}}}) = 0.1135{}^{+0.0021}_{-0.0017}\) and \(m_{{\text {t}}}^{{\text {pole}}} = 170.5 \pm 0.8 \,{\text {GeV}} \) are obtained, which account for experimental and theoretical uncertainties.

The extraction of \(m_{{\text {t}}}^{{\text {pole}}}\) performed in this paper exploits the sensitivity of the \(\hbox {t}{\bar{\hbox {t}}}\) invariant mass distribution to the value of \(m_{{\text {t}}}^{{\text {pole}}}\). The highest sensitivity comes from the region of low \(\hbox {t}{\bar{\hbox {t}}}\) masses. Threshold corrections from Coulomb and soft-gluon resummation are expected to affect this region. In Ref. [98] an estimate of these effects is provided, showing an expected increase of the total inclusive \(\hbox {t}{\bar{\hbox {t}}}\) production cross section by about 1%. A more recent study for the total cross section shows that these corrections are presently known only with a large relative uncertainty [99]. Threshold corrections are neglected in the \(m_{{\text {t}}}^{{\text {pole}}}\) extraction performed in the present analysis. A rough estimate shows that the inclusion of these corrections according to the size estimated in Ref. [98] could lead to an increase of the extracted \(m_{{\text {t}}}^{{\text {pole}}}\) value by up to \(+0.7\) GeV. In the future, once precise calculations including threshold corrections are available for differential \(\hbox {t}{\bar{\hbox {t}}}\) cross sections, such corrections should be included for an improved \(m_{{\text {t}}}^{{\text {pole}}}\) extraction. For the time being one can assume an additional theoretical uncertainty in the extracted \(m_{{\text {t}}}^{{\text {pole}}}\) value of the order of \(+1 \,{\text {GeV}} \) due to neglected gluon resummation effects.

Table 6 The measured \([y({\text {t}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) cross sections, along with their relative statistical and systematic uncertainties
Table 7 The correlation matrix of statistical uncertainties for the measured \([y({\text {t}}), p_{{\mathrm {T}}} ({\text {t}}) ]\) cross sections. The values are expressed as percentages. For bin indices see Table 6