1 Introduction

Measurements of the top quark–antiquark pair cross section \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) in proton–proton (pp) collisions provide important tests of the standard model (SM). At the CERN LHC, measurements with increasing precision have been performed by the ATLAS and CMS Collaborations in several different decay channels and at four pp collision energies [1,2,3,4,5]. Precise theoretical predictions of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) have been performed in perturbative quantum chromodynamics (QCD) at next-to-next-to-leading order (NNLO) [6,7,8,9]. The calculations depend on several fundamental parameters: the top quark mass \(m_\mathrm {\mathrm {t}}\), the strong coupling constant \(\alpha _S \), and the parton distribution functions (PDFs) of the proton. The measurements of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) have been used to determine the top quark pole mass [1, 4, 10,11,12], \(\alpha _S \)  [4, 13], and the PDFs [14,15,16,17].

The value of \(m_\mathrm {\mathrm {t}}\) significantly affects the prediction for many observables, either directly or via radiative corrections. It is a key input to electroweak precision fits [18] and, together with the value of the Higgs boson mass and \(\alpha _S \), it has direct implications for the SM predictions for the stability of the electroweak vacuum [19]. In QCD calculations beyond leading order, \(m_\mathrm {\mathrm {t}}\) depends on the renormalization scheme. In the context of the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) predictions, the pole (on-shell) definition for the top quark mass \(m_\mathrm {\mathrm {t}} ^{\text {pole}}\) has wide applications; however, it suffers from the renormalon problem that introduces a theoretical ambiguity in its definition. The modified minimal subtraction (\(\mathrm {\overline{MS}}\)) renormalization scheme has been shown to have a faster convergence than other schemes [20]. The relation between the pole and \(\mathrm {\overline{MS}}\) masses is known to the four-loop level in QCD [21]. Experimentally, the most precise measurements of the top quark mass are obtained in so-called direct measurements performed at the Tevatron and LHC [22,23,24,25]. Except for a few cases such as Ref. [26], the measurements rely on Monte Carlo (MC) generators to provide the relation between the top quark mass and an experimental observable. Current MC generators implement matrix elements at leading or next-to-leading order (NLO), while higher orders are simulated through parton showering. Studies suggest that the top quark mass parameter \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\), as implemented in current MC generators, corresponds to \(m_\mathrm {\mathrm {t}} ^{\text {pole}}\) to within an uncertainty of the order of 1\(\,\text {Ge}\text {V}\) [27, 28]. A theoretically well-defined mass can be determined by comparing the measured \({\mathrm {t}\overline{\mathrm {t}}}\) cross section to the fixed-order theoretical predictions [1, 4, 10,11,12].

With the exception of the quark masses, \(\alpha _S \) is the only free parameter in the QCD Lagrangian. While the renormalization group equation predicts the energy dependence of \(\alpha _S \), i.e. it gives a functional form for \(\alpha _S (Q)\), where Q is the energy scale of the process, actual values of \(\alpha _S \) can only be obtained from experimental data. By convention and to facilitate comparisons, \(\alpha _S \) values measured at different energy scales are typically evolved to \(Q = m_\mathrm {\mathrm {Z}} \), the mass of the \(\mathrm {Z}\) boson. The current world-average value for \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is \(0.1181 \pm 0.0011\) [29]. In spite of this relatively precise result, the uncertainty in \(\alpha _S \) still contributes significantly to many QCD predictions, including cross sections for top quark or Higgs boson production. Very few measurements allow \(\alpha _S \) to be tested at high Q, and the precision on the world-average value for \(\alpha _S (Q)\) is driven by low-Q measurements. A determination of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) was used by the CMS Collaboration to extract the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) at NNLO for the first time [11]. In the prediction for \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \), \(\alpha _S \) appears not only in the expression for the parton-parton interaction but also in the QCD evolution of the PDFs. Varying the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) in the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) calculation therefore requires a consistent modification of the PDFs. The full correlation between the gluon PDF, \(\alpha _S \), and \(m_\mathrm {\mathrm {t}}\) in the prediction for \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) has to be accounted for.

The analysis uses events in the dileptonic decay channels in which the two \(\mathrm {W}\) bosons from the electroweak decays of the two top quarks each produce an electron or a muon, leading to three event categories: \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\), \(\mathrm {\mu ^+}\mathrm {\mu ^-}\), and \(\mathrm {e}^+\mathrm {e}^-\). The data set was recorded by CMS in 2016 at a centre-of-mass energy of 13\(\,\text {Te}\text {V}\), corresponding to an integrated luminosity of \(35.9{\,\text {fb}^{-1}} \). The measurement is performed using a maximum-likelihood fit in which the sources of systematic uncertainty are treated as nuisance parameters. Distributions of observables are chosen as input to the fit so as to further constrain the uncertainties. The fitting procedure largely follows the approach of Ref. [4]. In this analysis, the number of events is significantly larger than in previous data sets, thus providing tighter constraints. The dominant uncertainties come from the integrated luminosity and the efficiency to identify the two leptons. The correlation between the three decay channels is used to constrain the overall lepton identification uncertainty to that of the better-constrained lepton, which is the muon.

Experimentally, the measured value of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) has a residual dependence on the value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) used in the simulation to estimate the detector efficiency and acceptance. In contrast, the experimental dependence of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) on the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) used in the simulation is negligible [11]. For the extraction of a theoretically well-defined \(m_\mathrm {\mathrm {t}}\), the dependence of the cross section on the assumption of a \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) value can be reduced by including \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) as an additional free parameter in the fit [30]. In this paper, the cross section \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) is first measured for a fixed value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.5 \,\text {Ge}\text {V} \), and then determined simultaneously with \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\). In the simultaneous fit, input distributions sensitive to the top quark mass are introduced in order to constrain \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\). For the measured parameter \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\), the same systematic uncertainties are taken into account as in Ref. [31]. Finally, the measured value of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) at the experimentally constrained value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is used to extract \(\alpha _S (m_\mathrm {\mathrm {Z}})\) and \(m_\mathrm {\mathrm {t}}\) in the \(\mathrm {\overline{MS}}\) scheme, using different PDF sets. For \(m_\mathrm {\mathrm {t}}\), the pole mass scheme is also considered.

The paper is structured as follows. After a brief description of the CMS experiment and the MC event generators in Sect. 2, the event selection is presented in Sect. 3. The event categories and the maximum-likelihood fit are explained in Sect. 4. The systematic uncertainties in the measurement are discussed in Sect. 5. The result of the cross section measurement at a fixed value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.5 \,\text {Ge}\text {V} \) is presented in Sect. 6, and the simultaneous measurement of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is presented in Sect. 7. The extraction of \(m_\mathrm {\mathrm {t}}\) and \(\alpha _S \) in the \(\mathrm {\overline{MS}}\) scheme and the top quark pole mass are described in Sects. 8 and 9, respectively, and a summary is given in Sect. 10.

2 The CMS detector and Monte Carlo simulation

The central feature of the CMS apparatus [32] is a superconducting solenoid of 6\(\,\text {m}\) internal diameter, providing a magnetic field of 3.8\(\,\text {T}\). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. These are used to identify electrons, photons, and jets. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. The detector is nearly hermetic, providing reliable measurement of the momentum imbalance in the plane transverse to the beams. A two-level trigger system selects interesting events for offline analysis [33]. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [32].

The POWHEG v2 [34,35,36] NLO MC generator is used to simulate \({\mathrm {t}\overline{\mathrm {t}}}\) events [37] and to evaluate the dependence of its modelling on \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\), the PDFs [37], and the renormalization and factorization scales, \(\mu _\mathrm {r} =\mu _\mathrm {f} =m_{\mathrm {T}} =\sqrt{\smash [b]{m_\mathrm {\mathrm {t}} ^2+p_{\mathrm {T}} ^2}}\), where \(m_\mathrm {\mathrm {t}}\) is the pole mass and \(p_{\mathrm {T}}\) is the transverse momentum of the top quark. The PDF set NNPDF3.0 [38] is used to describe the proton structure. The parton showers are modelled using PYTHIA 8.2 [39] with the CUETP8M2T4 underlying event (UE) tune [40, 41]. In this analysis, \({\mathrm {t}\overline{\mathrm {t}}}\) events are split into a signal and a background component. The signal consists of dilepton events and includes contributions from leptonically decaying \(\mathrm {\tau }\) leptons. All other \({\mathrm {t}\overline{\mathrm {t}}}\) events are considered as background.

Contributions to the background include single top quark processes (\(\mathrm {t}\mathrm {W}\)), Drell–Yan (DY) events (\(\mathrm {Z}/\gamma ^*\)+jets), and \(\mathrm {W}\)+jets production, as well as diboson (VV) events (including \(\mathrm {W}\) \(\mathrm {W}\), \(\mathrm {W}\) \(\mathrm {Z}\), and \(\mathrm {Z}\) \(\mathrm {Z}\)) with multiple jets, while the contribution from QCD multijet production is found to be negligible. The DY and \(\mathrm {t}\mathrm {W}\) processes are simulated in POWHEG v2 [42,43,44] with the NNPDF3.0 PDF and interfaced to PYTHIA 8.202 with the UE tune CUETP8M2T4 [45] for hadronization and fragmentation. The \(\mathrm {W}\)+jets events are generated at NLO using MadGraph5_aMC@NLO 2.2.2 [46, 47] with the NNPDF3.0 PDF and PYTHIA 8.2 with the UE tune CUETP8M1. Events with \(\mathrm {W}\) \(\mathrm {W}\), \(\mathrm {W}\) \(\mathrm {Z}\), and \(\mathrm {Z}\) \(\mathrm {Z}\) diboson processes are generated at leading order using PYTHIA 8.2 with the NNPDF2.3 PDF and the CUETP8M1 tune.

To model the effect of additional pp interactions within the same or nearby bunch crossing (pileup), simulated minimum bias interactions are added to the simulated data. Events in the simulation are then weighted to reproduce the pileup distribution in the data, which is estimated from the measured bunch-to-bunch instantaneous luminosity, assuming a total inelastic pp cross section of 69.2\(\,\text {mb}\)  [48].
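
The pileup reweighting described above can be illustrated with a minimal sketch. It assumes that the pileup distributions in data and simulation are available as binned histograms with identical binning; the function name `pileup_weights` and the example numbers are hypothetical and are not taken from the analysis software.

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin weights that reshape the simulated pileup distribution to the data one.

    Both inputs are lists of bin contents with identical binning (an assumption
    of this sketch). Each histogram is normalized to unit area before the ratio
    is taken, so the weights change the shape but not the overall normalization.
    """
    data_total = float(sum(data_hist))
    mc_total = float(sum(mc_hist))
    weights = []
    for d, m in zip(data_hist, mc_hist):
        if m > 0:
            weights.append((d / data_total) / (m / mc_total))
        else:
            # No simulated events in this bin: nothing to reweight.
            weights.append(0.0)
    return weights


# Hypothetical example: a flat simulated distribution reshaped to the data.
data = [10.0, 30.0, 40.0, 20.0]
mc = [25.0, 25.0, 25.0, 25.0]
weights = pileup_weights(data, mc)
reweighted_mc = [m * w for m, w in zip(mc, weights)]
```

In this toy case the reweighted simulated histogram reproduces the data histogram bin by bin, which is the defining property of the weights.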

For comparison with the measured distributions, the event yields in the simulated samples are normalized to their cross section predictions. These are obtained from calculations at NNLO (for \(\mathrm {W}\)+jets and \(\mathrm {Z}/\gamma ^*\)+jets [49]), NLO plus next-to-next-to-leading logarithms (NNLL) (for \(\mathrm {t}\mathrm {W}\) production [50]), and NLO (for diboson processes [51]). For the simulated \({\mathrm {t}\overline{\mathrm {t}}}\) sample, the full NNLO+NNLL calculation, performed with the Top++  2.0 program, is used [52]. The proton structure is described by the CT14nnlo [53] PDF set, where the PDF and \(\alpha _S \) uncertainties are estimated using the prescription by the authors. These are added in quadrature to the uncertainties originating from the scale variation \(m_\mathrm {\mathrm {t}}/2<\mu _\mathrm {r}, \mu _\mathrm {f} <2m_\mathrm {\mathrm {t}} \). The cross section prediction is \(\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {theo}} = 832~^{+20}_{-29}\, (\text {scale}) \pm 35 ~(\text {PDF}+\alpha _S ) \,\text {pb} \), assuming a top quark pole mass of 172.5\(\,\text {Ge}\text {V}\).
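
As a purely arithmetic illustration of the quadrature sum mentioned above, the following sketch combines the quoted scale and PDF+\(\alpha _S \) uncertainties into a single asymmetric uncertainty. The helper name `combine_in_quadrature` is hypothetical, and the combined numbers are computed here for illustration only; they are not quoted in the text.

```python
import math


def combine_in_quadrature(*components):
    """Square root of the sum of the squared components."""
    return math.sqrt(sum(c * c for c in components))


# Uncertainties quoted in the text, in pb.
scale_up, scale_down = 20.0, 29.0
pdf_alphas = 35.0

total_up = combine_in_quadrature(scale_up, pdf_alphas)
total_down = combine_in_quadrature(scale_down, pdf_alphas)
print(f"832 +{total_up:.1f} -{total_down:.1f} pb")
```

The upward and downward variations are combined separately because the scale uncertainty is asymmetric.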

3 Event selection

Events with at least two leptons (electron or muon) of opposite charge are selected. In events with more than two leptons, the two leptons of opposite charge with the highest \(p_{\mathrm {T}}\) are used. An event sample of three mutually exclusive event categories \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\), \(\mathrm {\mu ^+}\mathrm {\mu ^-}\), and \(\mathrm {e}^+\mathrm {e}^-\) is obtained.

A combination of single and dilepton triggers is used to collect the events. Each event is required to pass at least one of the triggers described below. Events in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel are required to contain either one electron with \(p_{\mathrm {T}} > 12\,\text {Ge}\text {V} \) and one muon with \(p_{\mathrm {T}} > 23\,\text {Ge}\text {V} \), or one electron with \(p_{\mathrm {T}} > 23\,\text {Ge}\text {V} \) and one muon with \(p_{\mathrm {T}} > 8\,\text {Ge}\text {V} \). Events in the same-flavour channels are required to have \(p_{\mathrm {T}} > 23\,(17)\,\text {Ge}\text {V} \) for the electron (muon) with the higher \(p_{\mathrm {T}}\), referred to in the following as the leading lepton, and \(p_{\mathrm {T}} > 12\,(8) \,\text {Ge}\text {V} \) for the other electron (muon), referred to as the subleading lepton. For all channels, single-lepton triggers with one electron (muon) with \(p_{\mathrm {T}} > 27\,(24)\,\text {Ge}\text {V} \)  are also used.

The particle-flow (PF) algorithm aims to reconstruct and identify each individual particle in an event, and to form PF candidates by combining information from the various components of the CMS detector [54]. The reconstructed vertex with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\) is taken to be the primary pp interaction vertex.

Electron and muon candidates are identified through their specific signatures in the detector [55, 56]. Lepton candidates are required to have \(p_{\mathrm {T}} > 25\,(20)\,\text {Ge}\text {V} \) for the leading (subleading) lepton, in the range \(|\eta | < 2.4\). Electron candidates in the transition region between the barrel and endcap calorimeters, corresponding to \(1.4442< |\eta |< 1.5660\), are rejected because the reconstruction of electrons in this region is not optimal.
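
The kinematic lepton requirements above can be written as a short sketch. The dictionary layout (keys `flavour`, `pt`, `eta`) and the function name `passes_lepton_kinematics` are assumptions made for illustration; the thresholds are those quoted in the text.

```python
def passes_lepton_kinematics(lepton_a, lepton_b):
    """Check the pT and eta requirements for the two selected leptons.

    Each lepton is a dict with keys 'flavour' ('e' or 'mu'), 'pt' in GeV,
    and 'eta'. This layout is an assumption of the sketch.
    """
    for lep in (lepton_a, lepton_b):
        abs_eta = abs(lep["eta"])
        if abs_eta >= 2.4:
            return False
        # Electrons in the barrel-endcap transition region are rejected.
        if lep["flavour"] == "e" and 1.4442 < abs_eta < 1.5660:
            return False
    leading, subleading = sorted(
        (lepton_a, lepton_b), key=lambda lep: lep["pt"], reverse=True
    )
    return leading["pt"] > 25.0 and subleading["pt"] > 20.0


# Hypothetical candidates.
electron = {"flavour": "e", "pt": 40.0, "eta": 0.5}
muon = {"flavour": "mu", "pt": 22.0, "eta": -1.5}
gap_electron = {"flavour": "e", "pt": 40.0, "eta": 1.5}
```

Note that a muon at \(|\eta | = 1.5\) is accepted, whereas an electron at the same pseudorapidity falls in the excluded transition region.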

Lepton isolation requirements are based on the ratio of the scalar sum of the \(p_{\mathrm {T}}\) of neighbouring PF candidates to the \(p_{\mathrm {T}}\) of the lepton candidate, which is referred to as the lepton isolation variable. These PF candidates are the ones falling within a cone of size \(\varDelta R=0.3 \,(0.4)\) for electrons (muons), centred on the lepton direction, excluding the contribution from the lepton candidate itself. The cone size \(\varDelta R\) is defined as the square root of the quadrature sum of the differences in the azimuthal angle and pseudorapidity. The value of the isolation variable is required to be smaller than 6% for electrons and 15% for muons. Events with dilepton invariant mass \(m_{\ell \ell } < 20\,\text {Ge}\text {V} \) (\(\ell =\mathrm {e},\mathrm {\mu }\)) are rejected to suppress backgrounds due to QCD multijet production and decays of low mass resonances. Additionally, leptons are required to be consistent with originating from the primary interaction vertex.
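
A minimal sketch of the isolation variable defined above is given below. It assumes that the lepton and the neighbouring PF candidates are represented as dictionaries with keys `pt`, `eta`, and `phi`, and that the lepton itself has already been removed from the candidate list; the function names are hypothetical.

```python
import math

CONE_SIZE = {"e": 0.3, "mu": 0.4}            # Delta R cone per lepton flavour
MAX_ISOLATION = {"e": 0.06, "mu": 0.15}      # upper limits on the isolation variable


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def relative_isolation(lepton, candidates, cone):
    """Scalar pT sum of candidates inside the cone, divided by the lepton pT.

    'candidates' must not contain the lepton itself (assumption of this sketch).
    """
    sum_pt = sum(
        c["pt"]
        for c in candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )
    return sum_pt / lepton["pt"]


def is_isolated(lepton, candidates):
    flavour = lepton["flavour"]
    iso = relative_isolation(lepton, candidates, CONE_SIZE[flavour])
    return iso < MAX_ISOLATION[flavour]


# Hypothetical example: one nearby candidate and one far away.
neighbours = [
    {"pt": 5.0, "eta": 0.1, "phi": 0.1},
    {"pt": 10.0, "eta": 1.0, "phi": 1.0},
]
muon = {"flavour": "mu", "pt": 50.0, "eta": 0.0, "phi": 0.0}
electron = {"flavour": "e", "pt": 50.0, "eta": 0.0, "phi": 0.0}
```

With these inputs only the nearby candidate enters the sum, giving an isolation value of 0.1, which satisfies the muon requirement but not the tighter electron requirement.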

Jets are reconstructed from the PF candidates using the anti-\(k_{\mathrm {T}}\) clustering algorithm with a distance parameter of 0.4 [57, 58]. The jet momentum is determined from the vectorial sum of all particle momenta in the jet, and is found from simulation to be within 5 to 10% of the true momentum over the relevant phase space of this analysis [59]. Pileup interactions can contribute additional tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, charged particles identified as originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions [59]. The jet energy corrections are determined from measurements of the energy balance in dijet, multijet, photon+jet, and leptonically decaying \(\mathrm {Z}\)+jets events, and are applied as a function of the jet \(p_{\mathrm {T}}\) and \(\eta \) to both data and simulated events [59]. For this measurement, jets are selected if they fulfill the criteria \(p_{\mathrm {T}} > 30\,\text {Ge}\text {V} \) and \(|\eta |< 2.4\).

Jets originating from the hadronization of \(\mathrm {b}\) quarks (\(\mathrm {b}\) jets) are identified (\(\mathrm {b}\) tagged) using the combined secondary vertex [60] algorithm, which combines lifetime information from tracks and secondary vertices. To achieve high purity, a working point is chosen such that the fraction of light-flavour jets with \(p_{\mathrm {T}} > 30 \,\text {Ge}\text {V} \) that are falsely identified as \(\mathrm {b}\) jets is \(0.1\%\), resulting in an average efficiency of about 41% for genuine \(\mathrm {b}\) jets and 2.2% for \(\mathrm {c}\) jets [60].

In the same-flavour channels, \(\mathrm {\mu ^+}\mathrm {\mu ^-}\) and \(\mathrm {e}^+\mathrm {e}^-\), DY events are suppressed by rejecting events with a dilepton invariant mass in the \(\mathrm {Z}\) boson mass window, \(76< m_{\ell \ell } < 106\,\text {Ge}\text {V} \). In these channels, events are also required to contain at least one \(\mathrm {b}\)-tagged jet.
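
The dilepton mass requirements can be summarized in a short sketch. It treats the leptons as massless, which is an approximation introduced here, and uses a hypothetical dictionary layout with keys `flavour`, `pt`, `eta`, and `phi`.

```python
import math


def dilepton_mass(l1, l2):
    """Invariant mass in GeV of two leptons treated as massless (approximation)."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(m2, 0.0))


def passes_dilepton_requirements(l1, l2, n_btag):
    """Low-mass veto for all channels; Z veto and b tag for same-flavour pairs."""
    mll = dilepton_mass(l1, l2)
    if mll < 20.0:
        return False
    if l1["flavour"] == l2["flavour"]:
        if 76.0 < mll < 106.0:   # Z boson mass window is excluded
            return False
        if n_btag < 1:           # at least one b-tagged jet is required
            return False
    return True


def back_to_back(flavour_1, flavour_2, pt):
    """Hypothetical helper: two central leptons with opposite azimuth."""
    return (
        {"flavour": flavour_1, "pt": pt, "eta": 0.0, "phi": 0.0},
        {"flavour": flavour_2, "pt": pt, "eta": 0.0, "phi": math.pi},
    )
```

For two back-to-back central leptons the invariant mass is twice the lepton \(p_{\mathrm {T}}\), so a pair with 45.6\(\,\text {Ge}\text {V}\) each lands inside the excluded window in the same-flavour channels but is kept in the \(\mathrm {e}\mathrm {\mu }\) channel.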

Distributions of the leading and subleading lepton \(p_{\mathrm {T}}\) and \(\eta \), and the jet and \(\mathrm {b}\)-tagged jet multiplicities in events fulfilling the above selection criteria are shown in Figs. 1, 2 and 3 for the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\), \(\mathrm {\mu ^+}\mathrm {\mu ^-}\), and \(\mathrm {e}^+\mathrm {e}^-\) channels, respectively. The event yields in the simulations are normalized to the corresponding cross section predictions, as explained in Sect. 2. Selected events include a very small contribution from \({\mathrm {t}\overline{\mathrm {t}}}\) processes in the lepton+jets decay channel (referred to as “\({\mathrm {t}\overline{\mathrm {t}}}\) other” in the figures) in which one of the charged leptons originates from heavy-flavour hadron decay, misidentified hadrons, muons from light-meson decays, or electrons from unidentified photon conversions. Such leptons also lead to dilepton background in this analysis via \(\mathrm {W}\)+jets processes.

In all categories, the simulation is found to describe the data well within the systematic uncertainties, indicated by the bands in the figures.

Fig. 1

Distributions of the transverse momentum (left) and pseudorapidity (right) of the leading (upper) and subleading (middle) leptons in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel after the event selection for the data (points) and the predictions for the signal and various backgrounds from the simulation (shaded histograms). The lower row shows the jet (left) and \(\mathrm {b}\)-tagged jet (right) multiplicity distributions. The vertical bars on the points represent the statistical uncertainties in the data. The hatched bands correspond to the systematic uncertainty in the \({\mathrm {t}\overline{\mathrm {t}}}\) signal MC simulation. The uncertainties in the integrated luminosity and background contributions are not included. The ratios of the data to the sum of the predicted yields are shown in the lower panel of each figure. Here, the solid gray band represents the contribution of the statistical uncertainty in the MC simulation

Fig. 2

The same distributions as in Fig. 1, but for the \(\mathrm {\mu ^+}\mathrm {\mu ^-}\) channel

Fig. 3

The same distributions as in Fig. 1, but for the \(\mathrm {e}^+\mathrm {e}^-\) channel

4 Event categories and fit procedure

The measurement is performed using a template fit to multidifferential distributions, divided into distinct event categories using the \(\mathrm {b}\)-tagged jet multiplicity, similar to the method utilized in a previous measurement [4]. In each of the same-flavour channels, two categories are defined, corresponding to events having exactly one or exactly two \(\mathrm {b}\)-tagged jets. Events with zero \(\mathrm {b}\)-tagged jets are not included since they are dominated by the DY background process. In the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel, three categories are defined, corresponding to events having exactly one, exactly two, or either zero or \(\ge \) 3 \(\mathrm {b}\)-tagged jets. The templates describing the distributions for the signal and background events are taken from simulation. Categorizing the events by their \(\mathrm {b}\)-tagged jet multiplicity allows the efficiency \(\epsilon _{\mathrm {b}}\) for selecting and identifying a \(\mathrm {b}\) jet to be constrained. Previous measurements that used a template fit with dilepton events were restricted to the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel [1, 4]. In this analysis, the decay channels with two electrons and two muons are also included in the fit. In this way, additional constraints on the lepton identification efficiencies are obtained.

First, a visible \({\mathrm {t}\overline{\mathrm {t}}}\) cross section \(\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}}\), defined for a phase space corresponding to the experimentally accessible fiducial volume, as described in Sect. 6, is determined. For the visible cross section, the fit is used to constrain the systematic uncertainties from the data. Using the relation

$$\begin{aligned} \sigma _{\mathrm {t}\overline{\mathrm {t}}} = \frac{\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}}}{A_{\ell \ell }}, \end{aligned}$$
(1)

the measured visible cross section is then extrapolated to the full phase space to obtain \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \). Here, \(A_{\ell \ell }\) denotes the acceptance, which is defined as the fraction of \({\mathrm {t}\overline{\mathrm {t}}}\) events that fulfill the selection criteria for the visible cross section. The acceptance incorporates the combined branching fraction for the \(\mathrm {t}\) and \(\overline{\mathrm {t}}\) quarks to decay to two charged leptons [29]. Apart from the free parameter of interest \(\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}}\), the parameters of the fit are the J nuisance parameters \(\mathbf {\lambda } = (\lambda _1, \lambda _2, \ldots , \lambda _J)\) corresponding to the various sources of systematic uncertainty, discussed in detail in Sect. 5.

The likelihood function L is based on Poisson statistics:

$$\begin{aligned} L = \prod _{i} \frac{\mathrm {e}^{ -\nu _i } \nu _i^{n_i}}{n_i!}\, \prod _{j} \pi (\lambda _j), \end{aligned}$$
(2)

where i denotes the bin of the respective final-state distribution, and \(\nu _i\) and \(n_i\) are the expected and observed number of events in bin i, respectively. The symbol \(\pi (\lambda _j)\) denotes a penalty term for the deviation of the nuisance parameter \(\lambda _j\) from its nominal value according to its prior density distribution. A Gaussian prior density distribution is assumed for all nuisance parameters. The expectation values \(\nu _i\) can be written as

$$\begin{aligned} \nu _i = s_i(\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}},\mathbf {\lambda }) + \sum _{k} b_{k,i}^{\mathrm {MC}}(\mathbf {\lambda }). \end{aligned}$$
(3)

Here, \(s_i\) denotes the expected number of \({\mathrm {t}\overline{\mathrm {t}}}\) signal events in bin i and the quantity \(b_{k,i}^{\mathrm {MC}}\) represents the prediction of the number of background events in bin i from source k. The Minuit program [61] is used to minimize \(-2 \ln {(L)}\) with L given in Eq. (2), and the Minos  [61] algorithm is used to estimate the uncertainties.
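
The structure of Eqs. (2) and (3) can be illustrated with a toy evaluation of \(-2 \ln {(L)}\). This is a sketch under simplifying assumptions and not the fit used in the analysis: it uses a single nuisance parameter with a unit-width Gaussian prior, a hypothetical linear dependence of the signal yield on that parameter (`rel_shift`), and it only evaluates the function rather than minimizing it.

```python
import math


def expected_yields(sigma_vis, lam, signal_per_pb, background, rel_shift=0.03):
    """Toy version of Eq. (3): signal scaled by the cross section plus background.

    'rel_shift' is a hypothetical relative change of the signal yield per unit
    of the nuisance parameter 'lam'.
    """
    return [
        s * sigma_vis * (1.0 + rel_shift * lam) + b
        for s, b in zip(signal_per_pb, background)
    ]


def neg2_log_likelihood(observed, expected, nuisances):
    """-2 ln L for Poisson-distributed bins with unit Gaussian penalty terms."""
    nll = 0.0
    for n, nu in zip(observed, expected):
        # -ln of the Poisson probability; lgamma(n + 1) equals ln(n!).
        nll += nu - n * math.log(nu) + math.lgamma(n + 1)
    # Gaussian priors, dropping the constant normalization term.
    nll += 0.5 * sum(lam * lam for lam in nuisances)
    return 2.0 * nll


# Hypothetical two-bin example in which the observation matches sigma_vis = 10.
signal_per_pb = [5.0, 3.0]
background = [10.0, 4.0]
observed = [60, 34]


def scan(sigma_vis, lam):
    nu = expected_yields(sigma_vis, lam, signal_per_pb, background)
    return neg2_log_likelihood(observed, nu, [lam])
```

In this toy setup the function is smallest at the parameter values used to construct the observation, and it increases both when the cross section is changed and when the nuisance parameter is pulled away from its nominal value.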

For the determination of the \(\mathrm {b}\) tagging efficiencies, multinomial probabilities are used to describe the expected number of signal events with one \(\mathrm {b}\)-tagged jet, \(s_{1\mathrm {b}}\), two \(\mathrm {b}\)-tagged jets, \(s_{2\mathrm {b}}\), and zero or more than two \(\mathrm {b}\)-tagged jets, \(s_{\text {other}}\):

$$\begin{aligned} s_{1\mathrm {b}}= & {} \mathcal {L} \sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}} \epsilon _{\ell \ell } 2 \epsilon _{\mathrm {b}}(1-C_{\mathrm {b}}\epsilon _{\mathrm {b}}), \end{aligned}$$
(4)
$$\begin{aligned} s_{2\mathrm {b}}= & {} {\mathcal {L} \sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}} \epsilon _{\ell \ell } \epsilon _{\mathrm {b}}^2 C_{\mathrm {b}} },\end{aligned}$$
(5)
$$\begin{aligned} s_{\text {other}}= & {} \mathcal {L} \sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}} \epsilon _{\ell \ell } (1-2\epsilon _{\mathrm {b}}(1-C_{\mathrm {b}} \epsilon _{\mathrm {b}})-\epsilon _{\mathrm {b}}^2C_{\mathrm {b}}), \end{aligned}$$
(6)

where \(\mathcal {L}\) denotes the integrated luminosity and \(\epsilon _{\ell \ell }\) is the efficiency for events in the visible phase space to pass the full selection described in Sect. 3. The quantity \(C_{\mathrm {b}}\) corrects for any small correlations between the tagging of two \(\mathrm {b}\) jets in an event, expressed as \( C_{\mathrm {b}} = 4 s_{\text {all}} s_{2\mathrm {b}}/ (s_{1\mathrm {b}}+2 s_{2\mathrm {b}})^2\), where \(s_{\text {all}}\) denotes the total number of signal events. The values for \(\epsilon _{\ell \ell }\), \(\epsilon _{\mathrm {b}}\), and \(C_{\mathrm {b}}\) are directly determined from the \({\mathrm {t}\overline{\mathrm {t}}}\) signal simulation, expressing \(\epsilon _{\mathrm {b}}\) as \((s_{1\mathrm {b}} + 2 s_{2\mathrm {b}})/(2 s_{\text {all}})\). The values of these parameters for the nominal signal simulation in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel are \(\epsilon _{\mathrm {e}\mathrm {\mu }} = 0.49\), \(\epsilon _{\mathrm {b}} = 0.30\), and \(C_{\mathrm {b}} = 1.00\).
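
The definitions of \(\epsilon _{\mathrm {b}}\) and \(C_{\mathrm {b}}\) are constructed such that Eqs. (4) to (6) reproduce the simulated event fractions exactly. The following sketch checks this closure numerically. The event counts are hypothetical and were chosen only so that they give \(\epsilon _{\mathrm {b}} = 0.30\) and \(C_{\mathrm {b}} = 1.00\); they are not taken from the simulation.

```python
def btag_parameters(s_1b, s_2b, s_all):
    """Return (eps_b, C_b) from the signal counts, as defined in the text."""
    eps_b = (s_1b + 2.0 * s_2b) / (2.0 * s_all)
    c_b = 4.0 * s_all * s_2b / (s_1b + 2.0 * s_2b) ** 2
    return eps_b, c_b


def category_fractions(eps_b, c_b):
    """Fractions of signal events with one, two, and zero or more than two
    b-tagged jets, i.e. Eqs. (4) to (6) divided by the common prefactor."""
    f_1b = 2.0 * eps_b * (1.0 - c_b * eps_b)
    f_2b = eps_b ** 2 * c_b
    f_other = 1.0 - f_1b - f_2b
    return f_1b, f_2b, f_other


# Hypothetical counts of simulated signal events.
s_1b, s_2b, s_all = 420.0, 90.0, 1000.0
eps_b, c_b = btag_parameters(s_1b, s_2b, s_all)
f_1b, f_2b, f_other = category_fractions(eps_b, c_b)
```

Inserting the two definitions into Eqs. (4) and (5) gives \(s_{1\mathrm {b}}/s_{\text {all}}\) and \(s_{2\mathrm {b}}/s_{\text {all}}\) identically, so the fractions returned by `category_fractions` match the input counts divided by `s_all`.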

The overall selection efficiency \(\epsilon _{\ell \ell }\) is a linear combination of the efficiencies \(\epsilon _{\mathrm {e}\mathrm {\mu }}\), \(\epsilon _{\mathrm {e}\mathrm {e}}\), and \(\epsilon _{\mathrm {\mu }\mathrm {\mu }}\), in the three different dilepton channels, each given by the product of the two efficiencies for identifying a single lepton of the respective flavour. Prior to the fit, the muon identification uncertainty is smaller than that for electrons. By fitting the three dilepton decay channels simultaneously, the ratio of single-lepton efficiencies \(\epsilon _\mathrm {e}\) and \(\epsilon _\mathrm {\mu }\) is constrained. In the fit, the electron identification uncertainty is constrained to that for muons.

The values for \(\epsilon _{\ell \ell }\), \(\epsilon _{\mathrm {b}}\), \(C_{\mathrm {b}}\), the number of signal events in each category, and the background rates depend on the nuisance parameters \(\mathbf {\lambda }\). The dependence on the parameter \(\lambda _j\) is modelled by a second-order polynomial that describes the quantity at the three values \(\lambda _j=0,1,-1\), corresponding to the nominal value of the parameter and to a variation by +1 and \(-\,1\) standard deviation, respectively. If a variation is only possible in one direction, a linear function is used to model the dependence on \(\lambda _j\).
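
The polynomial dependence on a nuisance parameter can be sketched as follows. The parabola through the three points at \(\lambda _j = -1, 0, +1\) is unique, so the coefficients below follow directly from the three template values; the function names and the example numbers are hypothetical.

```python
def quadratic_response(nominal, up, down):
    """Second-order polynomial through (0, nominal), (+1, up), and (-1, down)."""
    slope = 0.5 * (up - down)
    curvature = 0.5 * (up + down) - nominal

    def response(lam):
        return nominal + slope * lam + curvature * lam * lam

    return response


def linear_response(nominal, varied):
    """Linear model for a source that can only be varied in one direction."""
    def response(lam):
        return nominal + (varied - nominal) * lam

    return response


# Hypothetical example: an efficiency of 0.30 that moves to 0.31 and 0.285
# under variations of +1 and -1 standard deviation.
eff = quadratic_response(0.30, 0.31, 0.285)
one_sided = linear_response(0.30, 0.32)
```

By construction `eff(0)`, `eff(1)`, and `eff(-1)` return the three input values, and intermediate values of the nuisance parameter are interpolated smoothly.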

The events are further categorized by the number of additional non-\(\mathrm {b}\)-tagged jets in the event. Each of the seven previously described event categories is further divided by grouping together events with 0, 1, 2, or \(\ge \) 3 additional non-\(\mathrm {b}\)-tagged jets, thus producing 28 disjoint event categories. For those categories that have events with at least one additional non-\(\mathrm {b}\)-tagged jet, the smallest \(p_{\mathrm {T}}\) among those jets is used as the observable in the fit. For those categories containing events with zero additional non-\(\mathrm {b}\)-tagged jets, the total number of events in the category is used as the observable in the fit. The further division of events into these categories and the observable distributions from each category provide the sensitivity to constrain the modelling systematic uncertainties, such as those coming from variations in the scales for the matrix element (ME) and parton shower (PS) matching.
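
The counting of the 28 categories can be made explicit with a short enumeration. The labels below are hypothetical names chosen for this sketch, and the sketch does not encode any per-category exceptions to the choice of observable.

```python
from itertools import product

# b-tagged jet multiplicity categories per dilepton channel (seven in total).
B_TAG_CATEGORIES = {
    "emu": ["1b", "2b", "0b_or_ge3b"],
    "mumu": ["1b", "2b"],
    "ee": ["1b", "2b"],
}
# Number of additional non-b-tagged jets.
ADDITIONAL_JET_BINS = ["0j", "1j", "2j", "ge3j"]


def build_categories():
    """Return a list of (channel, b-tag bin, jet bin, observable) tuples."""
    categories = []
    for channel, btag_bins in B_TAG_CATEGORIES.items():
        for btag, jets in product(btag_bins, ADDITIONAL_JET_BINS):
            if jets == "0j":
                observable = "event yield"
            else:
                observable = "smallest additional-jet pT"
            categories.append((channel, btag, jets, observable))
    return categories


categories = build_categories()
print(len(categories))
```

Seven b-tag categories times four additional-jet bins give the 28 disjoint categories, of which the seven without additional jets enter the fit through their event yield.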

The statistical uncertainty in the templates from simulation is taken into account by using pseudo-experiments. At each iteration, templates are varied within their statistical uncertainty. Templates created from different simulations are treated as statistically uncorrelated, while templates derived by varying weights in the simulation are treated as correlated. The template dependencies are rederived and the fit to data is repeated. Repeating this 30,000 times yields an approximately Gaussian distribution of the fitted value of the \({\mathrm {t}\overline{\mathrm {t}}}\) cross section (and of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) in the combined fit) and of the vast majority of the nuisance parameters. The root-mean-square of each distribution is considered as an additional uncertainty from the event counts in the simulated samples for the corresponding nuisance parameter.
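
The pseudo-experiment procedure can be illustrated with a strongly simplified toy. Instead of repeating the full template fit, this sketch uses a single-bin counting estimator as a stand-in for the fit; the estimator, the Gaussian fluctuation of the templates, and all numbers are assumptions made for illustration.

```python
import random
import statistics


def toy_cross_section(n_obs, signal_per_pb, background):
    """Stand-in for the fit: a single-bin counting estimate of the cross section."""
    return (n_obs - background) / signal_per_pb


def mc_stat_uncertainty(n_obs, signal, background, n_toys=30000, seed=1):
    """Spread of the estimate when the templates fluctuate statistically.

    'signal' and 'background' are (value, statistical uncertainty) tuples.
    The two templates are varied independently, as for templates taken from
    different simulated samples.
    """
    rng = random.Random(seed)
    fitted = []
    for _ in range(n_toys):
        s = rng.gauss(signal[0], signal[1])
        b = rng.gauss(background[0], background[1])
        fitted.append(toy_cross_section(n_obs, s, b))
    # Root-mean-square deviation of the fitted values around their mean.
    return statistics.pstdev(fitted)


# Hypothetical inputs: 600 observed events, 50 +- 0.5 signal events per pb,
# and 100 +- 2 background events.
rms = mc_stat_uncertainty(600, (50.0, 0.5), (100.0, 2.0))
```

For these inputs, linear error propagation gives a spread of about 0.11 pb, which the toy result should reproduce closely; in the analysis the same idea is applied to every fitted parameter.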

The input distributions to the fit are shown in Figs. 4, 5 and 6, where the data are compared to the signal and background distributions resulting from the fit to the data. In the top row, the number of events without additional non-\(\mathrm {b}\)-tagged jets is displayed. For events with at least one additional non-\(\mathrm {b}\)-tagged jet, the \(p_{\mathrm {T}}\) distribution of the non-\(\mathrm {b}\)-tagged jet with the smallest \(p_{\mathrm {T}}\) in the respective category is considered, except for the category corresponding to events with 2 \(\mathrm {b}\)-tagged jets and at least three additional non-\(\mathrm {b}\)-tagged jets, where the statistical uncertainty of the simulation is high. This distribution is chosen in order to constrain the jet energy scale at lower jet \(p_{\mathrm {T}}\), where the corresponding systematic uncertainty is larger [59]. Good agreement is found between the data and the simulation.

Fig. 4

Distributions in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel after the fit to the data. In the left column events with zero or three or more \(\mathrm {b}\)-tagged jets are shown. The middle (right) column shows events with exactly one (two) \(\mathrm {b}\)-tagged jets. Events with zero, one, two, or three or more additional non-\(\mathrm {b}\)-tagged jets are shown in the first, second, third, and fourth row, respectively. The hatched bands correspond to the total uncertainty in the sum of the predicted yields including all correlations. The ratios of the data to the sum of the simulated yields after the fit are shown in the lower panel of each figure. Here, the solid gray band represents the contribution of the statistical uncertainty in the MC simulation

Fig. 5

Distributions in the \(\mathrm {\mu ^+}\mathrm {\mu ^-}\) channel after the fit to the data. The left (right) column shows events with exactly one (two) \(\mathrm {b}\)-tagged jets. Events with zero, one, two, or three or more additional non-\(\mathrm {b}\)-tagged jets are shown in the first, second, third, and fourth row, respectively. The hatched bands correspond to the total uncertainty in the sum of the predicted yields including all correlations. The ratios of the data to the sum of the simulated yields after the fit are shown in the lower panel of each figure. Here, the solid gray band represents the contribution of the statistical uncertainty in the MC simulation

Fig. 6

Same distributions as in Fig. 5, but in the \(\mathrm {e}^+\mathrm {e}^-\) channel

5 Systematic uncertainties

The contributions from each source of systematic uncertainty are represented by nuisance parameters (see Sect. 4). For each uncertainty, the simulation is used to construct template histograms that describe the expected signal and background distributions for a given nuisance parameter variation. In the fit of the templates to the data, the best values for \(\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}}\) (and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) in the case of the combined fit) and all nuisance parameters are determined, as described in Sect. 4. The prior probability density functions for the nuisance parameters have a Gaussian shape. Table 1 lists the contributions of the individual uncertainties after the fit.

Table 1 The relative uncertainties in \(\sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}}\) and \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and their sources, as obtained from the template fit. The uncertainty in the integrated luminosity and the MC statistical uncertainty are determined separately. The individual uncertainties are given without their correlations, which are however accounted for in the total uncertainties. Extrapolation uncertainties only affect \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \). For these uncertainties, the ± notation is used if a positive variation produces an increase in \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \), while the \({\mp }\) notation is used otherwise

Most of the experimental uncertainties are determined from ancillary measurements in which data and simulation are compared and small corrections to the simulation, referred to as scale factors (SFs), are determined. To assess the impact of the uncertainty in these corrections, the SFs are varied within their uncertainty and the analysis is repeated.

The trigger efficiencies are determined using multiple independent methods, which show agreement within 0.3%. An additional statistical uncertainty arises because the SFs are determined from the data in intervals of \(p_{\mathrm {T}}\) and \(\eta \).

The uncertainty in the SFs of the lepton identification efficiency is typically 1.5% for electrons and 1.2% for muons, with a small dependency on the lepton \(p_{\mathrm {T}}\) and \(\eta \). The uncertainties in the calibration of the muon and electron momentum scales are included as nuisance parameters for each lepton separately. Their impact on the measurement is negligible.

The impact of the jet energy scale (JES) uncertainties is estimated by varying the jet momenta within the JES uncertainties, split into 18 contributions [59]. To account for the jet energy resolution (JER), the SFs are varied within their \(|\eta |\)-dependent uncertainties [62].

The uncertainties associated with the \(\mathrm {b}\) tagging efficiency are determined by varying the related corrections for the simulation of \(\mathrm {b}\) jets and light-flavour jets, split into 16 orthogonal contributions for \(\mathrm {b}\) jets. These uncertainties depend on the \(p_{\mathrm {T}}\) of each jet and amount to approximately 1.5% for \(\mathrm {b}\) jets in \({\mathrm {t}\overline{\mathrm {t}}}\) signal events [60].

The uncertainty in the modelling of the number of pileup events is obtained by changing the inelastic pp cross section, which is used to model the pileup in simulation, by ±4.6% [48].

The integrated luminosity uncertainty is not included in the fit as a nuisance parameter, but treated as an external uncertainty. It is estimated to be 2.5% [63].

The ME scale uncertainties for the simulation of the \({\mathrm {t}\overline{\mathrm {t}}}\) and DY processes are assessed by varying the renormalization and factorization scale choices in powheg by factors of two up and down independently [64, 65], avoiding cases where \(\mu _\mathrm {f}/\mu _\mathrm {r} = 1/4\) or 4.
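
The set of scale choices implied by this prescription can be enumerated explicitly. The sketch below assumes that the multipliers 0.5, 1, and 2 are applied independently to \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\); the function name is hypothetical.

```python
from itertools import product

def scale_variations(factors=(0.5, 1.0, 2.0)):
    """Return (k_r, k_f) multipliers for mu_r and mu_f, excluding mu_f/mu_r = 1/4 or 4."""
    points = []
    for k_r, k_f in product(factors, repeat=2):
        ratio = k_f / k_r
        if ratio in (0.25, 4.0):
            continue  # the two opposite-direction extremes are dropped
        points.append((k_r, k_f))
    return points

variations = scale_variations()
for k_r, k_f in variations:
    print(f"mu_r x {k_r}, mu_f x {k_f}")
```

Of the nine possible combinations, seven remain, including the nominal choice with both multipliers equal to one.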

To estimate the uncertainty due to the NLO generator, the powheg \({\mathrm {t}\overline{\mathrm {t}}}\) signal sample is replaced by a \({\mathrm {t}\overline{\mathrm {t}}}\) sample generated using the MadGraph 5_amc@nlo program with FxFx matching [66]. This uncertainty is only included in the combined measurement of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) (Sect. 7) in order to compare with the latest direct top quark mass measurement from CMS in the lepton+jets channel [31].

The PDF uncertainty is estimated using the 28 orthogonal Hessian eigenvectors of the CT14 [53] PDF, which are used as independent inputs to the fit.

Differential measurements of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) at \(\sqrt{s} = 13 \,\text {Te}\text {V} \) have demonstrated that the \(p_{\mathrm {T}}\) distribution of the top quark is softer than predicted by the powheg simulation [67,68,69]. An additional uncertainty, referred to as “Top quark \(p_{\mathrm {T}}\)”, is estimated by reweighting the simulation. This nuisance parameter has a one-sided prior distribution.

The uncertainty due to the matching of the ME to the PS in simulation is estimated by varying the \(h_{\text {damp}}\) parameter in powheg, as described in Ref. [40]. The uncertainty due to the assumptions in the UE tune is estimated by varying the tuning parameters [40]. The impact of the PS scale uncertainty is estimated by varying the initial-state radiation (ISR) and the final-state radiation (FSR) scales by a factor of two up and down [41], similar to the case of renormalization and factorization scales.

The uncertainties due to the assumed \(\mathrm {b}\) hadron branching fraction (BF) and fragmentation are taken into account following the procedures described in Ref. [31]. For the fragmentation, variations of the Bowler–Lund fragmentation function [70] and the comparison to the Peterson fragmentation function [71] are considered.

The effects of colour reconnection (CR) processes on the top quark final state are estimated by enabling early resonance decays (ERD) in pythia. In the nominal sample, ERD are turned off. Alternative colour reconnection models are considered, such as “gluon move” [72] and “QCD inspired” [73], since they were found to potentially have relevant effects for the measurement of the top quark mass [31].

For the uncertainties related to the background contributions, prior normalization uncertainties of 30% are assumed [74]. The contributions of these uncertainties are small and/or strongly constrained in the fit. For the DY background, separate nuisance parameters are used for each \(\mathrm {b}\)-tagged jet category in order to remove the dependence of the fit result on the prediction of the \(\mathrm {b}\)-tagged jet multiplicity distribution by the DY MC simulation. Similarly, the DY background is given an additional uncertainty of 5, 10, 30, and 50% for events with exactly 0, 1, 2, and 3 or more jets, respectively. The first three numbers are estimated by performing scale variations in \(\mathrm {W}\)+jets predictions with NLO precision, whereas the last one is assigned conservatively.

In total, 103 uncertainty sources are used in the fit. In Fig. 7, the normalized pulls and constraints for the nuisance parameters related to the modelling uncertainties are shown. For each nuisance parameter, the normalized pull is defined as the difference between the best-fit and the input values, normalized to the pre-fit uncertainty, and the constraint is defined as the ratio of the post-fit to the pre-fit uncertainty. The vast majority of the nuisance parameters lie within one standard deviation of their priors, reflecting the good agreement of the nominal simulation with the data. Most \({\mathrm {t}\overline{\mathrm {t}}}\) signal uncertainties show significant constraints with respect to their prior uncertainty, illustrating the strength of the analysis ansatz. The nuisance parameter for the \(p_{\mathrm {T}}\) distribution of the top quarks is pulled by one standard deviation. This is expected since it is known that the observed \(p_{\mathrm {T}}\) distribution of the top quark is softer than predicted by the simulation [68, 69].
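
The two quantities defined above can be written out directly. The following is a minimal sketch; the function names and the example numbers are hypothetical and do not correspond to any nuisance parameter of the analysis.

```python
def normalized_pull(best_fit, input_value, prefit_unc):
    """Difference between best-fit and input values in units of the pre-fit uncertainty."""
    return (best_fit - input_value) / prefit_unc

def constraint(postfit_unc, prefit_unc):
    """Ratio of the post-fit to the pre-fit uncertainty (values below 1 mean constrained)."""
    return postfit_unc / prefit_unc

# Hypothetical nuisance parameter: input 0 with unit pre-fit uncertainty,
# fitted to 0.5 with a post-fit uncertainty of 0.4.
pull = normalized_pull(best_fit=0.5, input_value=0.0, prefit_unc=1.0)
constr = constraint(postfit_unc=0.4, prefit_unc=1.0)
print(f"pull = {pull:+.2f}, constraint = {constr:.2f}")
```

A pull within \(\pm 1\) indicates that the fitted value stays within the pre-fit uncertainty, while a constraint well below one indicates that the data determine the parameter more precisely than its prior.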

Fig. 7

Normalized pulls and constraints of the nuisance parameters related to the modelling uncertainties for the cross section fit. The markers denote the fitted values, while the inner vertical bars represent the constraint and the outer vertical bars denote the additional uncertainty as determined from pseudo-experiments. The constraint is defined as the ratio of the post-fit uncertainty to the pre-fit uncertainty of a given nuisance parameter, while the normalized pull is the difference between the post-fit and the pre-fit values of the nuisance parameter normalized to its pre-fit uncertainty. The horizontal lines at \(\pm \,1\) represent the pre-fit uncertainty

6 Cross section measurement

The visible cross section is defined for \({\mathrm {t}\overline{\mathrm {t}}}\) events in the fiducial region with two oppositely charged leptons (electron or muon). Contributions from leptonically decaying \(\mathrm {\tau }\) leptons are included. The leading lepton is required to have \(p_{\mathrm {T}} > 25\,\text {Ge}\text {V} \), and the subleading lepton must have \(p_{\mathrm {T}} >20\,\text {Ge}\text {V} \). Both leptons have to be in the range \(|\eta | < 2.4\). From the likelihood fit, described in Sect. 4, the visible cross section is measured to be

$$\begin{aligned} \sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}}= & {} 25.61 \pm 0.05 \,\text {(stat)} \pm 0.75 \,\text {(syst)} \pm 0.64 \,\text {(lumi)} \,\text {pb} . \end{aligned}$$

Here, the uncertainties denote the statistical uncertainty, the systematic uncertainty, and that coming from the uncertainty in the integrated luminosity. The full list of uncertainties is presented in Table 1.

The total cross section \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) is obtained by extrapolating the measured visible cross section to the full phase space. As explained in Sect. 4, the extrapolation is described by a multiplicative acceptance correction factor \(A_{\ell \ell }\) (see Eq. (1)). The extrapolation uncertainty is determined for each relevant model systematic source j as described in the following: all nuisance parameters except the one under study are fixed to their post-fit values; the nuisance parameter \(\lambda _j\) is set to values \(+1\) and \(-\,1\), and the variations of \(A_{\ell \ell }\) are recorded. The resulting variations of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) with respect to the nominal value, obtained with the post-fit value of \(\lambda _j\), are taken as the additional extrapolation uncertainties. The individual uncertainties in \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) from these sources are summed in quadrature to estimate the total systematic uncertainty, as summarized in Table 1. A fixed value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.5 \,\text {Ge}\text {V} \) is chosen in the simulation, and no uncertainty is assigned.
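
The extrapolation step can be sketched as follows. The code assumes that the extrapolation has the form \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} = \sigma _{{\mathrm {t}\overline{\mathrm {t}}}}^{\text {vis}} / A_{\ell \ell }\); the acceptance values, the source names, and the symmetrization by the larger shift are hypothetical choices made for illustration only.

```python
import math

def extrapolate(sigma_vis, acceptance):
    # Assumed form of the extrapolation: sigma_tt = sigma_vis / A_ll
    return sigma_vis / acceptance

def extrapolation_shifts(sigma_vis, acc_nominal, acc_variations):
    """acc_variations maps a source name to (A_ll at lambda=+1, A_ll at lambda=-1)."""
    nominal = extrapolate(sigma_vis, acc_nominal)
    shifts = {}
    for name, (acc_up, acc_down) in acc_variations.items():
        shift_up = extrapolate(sigma_vis, acc_up) - nominal
        shift_down = extrapolate(sigma_vis, acc_down) - nominal
        shifts[name] = (shift_up, shift_down)
    return nominal, shifts

def quadrature_sum(values):
    return math.sqrt(sum(v * v for v in values))

# Hypothetical acceptances (not analysis values)
sigma_vis = 25.61  # pb, measured visible cross section
nominal, shifts = extrapolation_shifts(
    sigma_vis,
    acc_nominal=0.0319,
    acc_variations={
        "source_a": (0.0320, 0.0318),    # larger acceptance at lambda = +1
        "source_b": (0.03185, 0.03195),  # smaller acceptance at lambda = +1
    },
)
# Symmetrize each source by its larger shift, then add the sources in quadrature
total = quadrature_sum(max(abs(up), abs(down)) for up, down in shifts.values())
print(f"sigma_tt = {nominal:.1f} pb, extrapolation uncertainty = {total:.1f} pb")
```

In this toy, a positive variation of `source_a` increases the acceptance and therefore decreases \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \), which corresponds to the \({\mp }\) notation of Table 1.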

The total cross section \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) is measured to be

$$\begin{aligned} \sigma _{{\mathrm {t}\overline{\mathrm {t}}}}= & {} 803 \pm 2 \,\text {(stat)} \pm 25 \,\text {(syst)} \pm 20 \,\text {(lumi)} \,\text {pb} . \end{aligned}$$

As shown in Table 1, in comparison to the fiducial cross section, the relative systematic uncertainty in the total cross section is marginally increased. The result is in good agreement with the theoretical calculation at NNLO+NNLL, which predicts a \({\mathrm {t}\overline{\mathrm {t}}}\) cross section of \(832~^{+20}_{-29}\, (\text {scale}) \pm 35 ~(\text {PDF}+\alpha _S ) \,\text {pb} \), as described in Sect. 2.
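
The level of agreement can be gauged with a simple estimate. The sketch below combines all uncertainty components in quadrature and treats the scale uncertainty of the prediction as Gaussian; both are simplifying assumptions made here for illustration and are not part of the analysis.

```python
import math

def quadrature(*terms):
    return math.sqrt(sum(t * t for t in terms))

# Measured total cross section in pb: stat, syst, lumi components from the text
measured = 803.0
meas_unc = quadrature(2.0, 25.0, 20.0)

# NNLO+NNLL prediction in pb; the measurement lies below it, so the downward
# scale uncertainty (29 pb) is combined with the PDF+alpha_S uncertainty (35 pb).
predicted = 832.0
theo_unc_down = quadrature(29.0, 35.0)

difference = predicted - measured
n_sigma = difference / quadrature(meas_unc, theo_unc_down)
rel_meas_unc = meas_unc / measured

print(f"measurement uncertainty: {meas_unc:.1f} pb ({100 * rel_meas_unc:.1f}%)")
print(f"difference: {difference:.0f} pb, about {n_sigma:.2f} combined standard deviations")
```

Under these assumptions the difference of 29 pb corresponds to roughly half of the combined uncertainty.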

An independent cross section measurement is performed using a simple event-counting method and a more restrictive event selection, following closely the analysis of Ref. [75]. The analysis uses events in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel with at least two jets, at least one of which is \(\mathrm {b}\) tagged. The cross section is measured to be \( \sigma _{\mathrm {t}\overline{\mathrm {t}}} = 804 \pm 2 \,\text {(stat)} \pm 31 \,\text {(syst)} \pm 20 \,\text {(lumi)} \,\text {pb} \), in good agreement with the main result.

7 Simultaneous measurement of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\)

The analysis is designed such that the dependence of the measured \({\mathrm {t}\overline{\mathrm {t}}}\) cross section on \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is small. However, because of the impact of the top quark mass on the simulated detector efficiency and acceptance, the measurement is expected to have a residual dependence on the chosen value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\). In previous measurements, this dependence was determined by repeating the analysis with varied mass values.

Here, the approach proposed in Refs. [5, 30] is followed. The value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is introduced in the fit as an additional free parameter. In the simultaneous fit, \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) are directly constrained from the data. The resulting \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and its uncertainty therefore account for the dependence on \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) and can be used, e.g. for the extraction of \(m_\mathrm {\mathrm {t}}\) and \(\alpha _S \) using fixed-order calculations. The value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\), in turn, can be compared to the results of direct measurements using, e.g. kinematic fits [31].

In contrast to the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) measurement presented in Sect. 6, the sensitivity of the simultaneous fit to \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is maximized by introducing a new observable: the minimum invariant mass \(m_{\ell \mathrm {b}}^{\text {min}}\), which is defined as the smallest invariant mass found when combining the charged leptons with the \(\mathrm {b}\) jets in an event. To minimize the impact from background, only the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) sample is used. The simultaneous fit of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is performed in 12 mutually exclusive categories, according to the number of \(\mathrm {b}\)-tagged jets and of additional non-\(\mathrm {b}\)-tagged jets in the event. The same observables as in Fig. 4 are used as input to the fit, where the jet \(p_{\mathrm {T}}\) spectrum is replaced by the \(m_{\ell \mathrm {b}}^{\text {min}}\) distribution in categories with at least one \(\mathrm {b}\)-tagged jet, as shown in Fig. 8.
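
The definition of \(m_{\ell \mathrm {b}}^{\text {min}}\) can be made concrete with a short sketch. Four-vectors are assumed to be given as (E, px, py, pz) tuples in GeV; the function names and the toy event are hypothetical.

```python
import math
from itertools import product

def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors (E, px, py, pz), in GeV."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding errors

def m_lb_min(leptons, b_jets):
    """Smallest lepton + b jet invariant mass over all pairings, or None if undefined."""
    if not leptons or not b_jets:
        return None
    return min(invariant_mass(lep, jet) for lep, jet in product(leptons, b_jets))

# Toy event with massless objects (hypothetical values)
leptons = [(50.0, 50.0, 0.0, 0.0), (40.0, 0.0, 40.0, 0.0)]
b_jets = [(60.0, 0.0, 0.0, 60.0), (30.0, 0.0, 0.0, -30.0)]

mlb = m_lb_min(leptons, b_jets)
print(f"m_lb_min = {mlb:.2f} GeV")
```

For events without a \(\mathrm {b}\)-tagged jet the observable is undefined, which is consistent with its use only in categories with at least one \(\mathrm {b}\)-tagged jet.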

To construct the templates describing the dependence of the final-state distributions on \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\), separate MC simulation samples of \({\mathrm {t}\overline{\mathrm {t}}}\) and \(\mathrm {t}\mathrm {W}\) production are used in which \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is varied in the range \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.5 \pm 3 \,\text {Ge}\text {V} \). The data and MC samples, the event selection, the modelling of the systematic uncertainties, and the fit procedure are identical to those described in Sect. 4. In the simultaneous fit, the same systematic uncertainties are included as in a previous CMS measurement [31] of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\). The results of the two measurements are thus directly comparable.

Comparisons of the data and the prediction from the MC simulation before and after the fit are presented in Figs. 8 and 9, respectively. Good agreement is found in both cases.

Fig. 8

Comparison of data (points) and pre-fit distributions of the expected signal and backgrounds from simulation (shaded histograms) used in the simultaneous fit of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel. In the left column events with zero or three or more \(\mathrm {b}\)-tagged jets are shown. The middle (right) column shows events with exactly one (two) \(\mathrm {b}\)-tagged jets. Events with zero, one, two, or three or more additional non-\(\mathrm {b}\)-tagged jets are shown in the first, second, third, and fourth row, respectively. The hatched bands correspond to the total uncertainty in the sum of the predicted yields. The ratios of data to the sum of the predicted yields are shown in the lower panel of each figure. Here, the solid gray band represents the contribution of the statistical uncertainty

Fig. 9

Comparison of data (points) and post-fit distributions of the expected signal and backgrounds from simulation (shaded histograms) used in the simultaneous fit of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel. In the left column events with zero or three or more \(\mathrm {b}\)-tagged jets are shown. The middle (right) column shows events with exactly one (two) \(\mathrm {b}\)-tagged jets. Events with zero, one, two, or three or more additional non-\(\mathrm {b}\)-tagged jets are shown in the first, second, third, and fourth row, respectively. The hatched bands correspond to the total uncertainty in the sum of the predicted yields and include the contribution from the top quark mass (\(\varDelta m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} \)). The ratios of data to the sum of the predicted yields are shown in the lower panel of each figure. Here, the solid gray band represents the contribution of the statistical uncertainty

The result of the fit is found to be stable against the choice of the fit distributions, and the introduction of the \(m_{\ell \mathrm {b}}^{\text {min}}\) distribution was confirmed not to alter the final result on \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) or the behaviour with respect to the nuisance parameters. The procedure is calibrated by performing fits where the data are replaced by simulations with different \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) hypotheses: full closure of the method is obtained and no additional correction is applied. The effect of the statistical uncertainty in the simulation on the fit results is estimated as explained in Sect. 4 and is considered as an additional uncertainty. The results for \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) are

$$\begin{aligned} \sigma _{\mathrm {t}\overline{\mathrm {t}}}&= 815 \pm 2 \,\text {(stat)} \pm 29 \,\text {(syst)} \pm 20 \,\text {(lumi)} \,\text {pb} , \\ m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}&= 172.33 \pm 0.14 \,\text {(stat)} \,^{+0.66}_{-0.72} \,\text {(syst)} \,\text {Ge}\text {V} . \end{aligned}$$

The value for the cross section is in good agreement with the result obtained for a fixed value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.5 \,\text {Ge}\text {V} \), reported in Sect. 6. The correlation between the two parameters is found to be \(12\%\).

The results of the simultaneous fit to \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) are summarized in Tables 2 and 3, respectively, together with the contribution of each systematic uncertainty to the total uncertainty. Normalized pulls and constraints of the nuisance parameters related to modelling uncertainties are shown in Fig. 10. The nuisance parameters displayed in this figure show similar trends to those in Fig. 7, described above. Here, the constraints on the nuisance parameters tend to be less stringent because only data in the \(\mathrm {e}\) \(^{\pm }\) \(\mathrm {\mu }\) \(^{{\mp }}\) channel are used to determine the two parameters of interest, using mostly the \(m_{\ell \mathrm {b}}^{\text {min}}\) spectra in place of the jet \(p_{\mathrm {T}}\) distributions within the jet and \(\mathrm {b}\)-tagged jet categories.

Fig. 10

Normalized pulls and constraints of the nuisance parameters related to the modelling uncertainties for the simultaneous fit of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\). The markers denote the fitted value, while the inner vertical bars represent the constraint and the outer vertical bars denote the additional uncertainty as determined from pseudo-experiments. The constraint is defined as the ratio of the post-fit uncertainty to the pre-fit uncertainty of a given nuisance parameter, while the normalized pull is the difference between the post-fit and the pre-fit values of the nuisance parameter normalized to its pre-fit uncertainty. The horizontal lines at \(\pm 1\) represent the pre-fit uncertainty

As a cross-check, a measurement of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is performed by fitting a single \(m_{\ell \mathrm {b}}^{\text {min}}\) distribution containing all events with at least one \(\mathrm {b}\)-tagged jet. The resulting value is \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) = \(171.92 \pm 0.13 \,\text {(stat)} \,^{+0.76}_{-0.77} \,\text {(syst)} \,\text {Ge}\text {V} \). Since the uncorrelated uncertainty with respect to the main result is estimated to be at least \(0.54 \,\text {Ge}\text {V} \), which is larger than the difference between the two measurements, the two results are in good agreement.
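
The compatibility argument amounts to a short piece of arithmetic, sketched below with the central values quoted in the text. The variable names are hypothetical, and expressing the difference in units of the uncorrelated uncertainty is an illustrative choice.

```python
# Central values in GeV, taken from the text
m_main = 172.33         # simultaneous fit
m_cross_check = 171.92  # single m_lb_min distribution
uncorrelated_unc = 0.54  # lower bound on the uncorrelated uncertainty

difference = abs(m_main - m_cross_check)
# Because 0.54 GeV is a lower bound, this ratio is an upper bound on the significance
max_significance = difference / uncorrelated_unc

print(f"difference = {difference:.2f} GeV")
print(f"at most {max_significance:.2f} standard deviations")
```

The difference of 0.41 GeV is smaller than the uncorrelated uncertainty, so the two results differ by less than one standard deviation.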

Table 2 The same as Table 1, but for the simultaneous fit of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\)

Table 3 The absolute uncertainties in \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) and their sources, from the simultaneous fit of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\). The MC statistical uncertainty is determined separately. The individual uncertainties are given without their correlations, which are however accounted for in the total uncertainties

8 Extraction of \(m_\mathrm {\mathrm {t}}\) and \(\alpha _S (m_\mathrm {\mathrm {Z}})\) in the \(\mathrm {\overline{MS}}\) scheme

The cross section value obtained in the simultaneous fit to \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is used to extract \(\alpha _S (m_\mathrm {\mathrm {Z}})\) and \(m_\mathrm {\mathrm {t}}\) in the \(\mathrm {\overline{MS}}\) renormalization scheme. For this purpose, the measured and the predicted cross sections are compared via a \(\chi ^2\) minimization. The \(\chi ^2\) fit is performed using the open-source QCD analysis framework xFitter  [76] and a \(\chi ^2\) definition from Ref. [77]. The method to determine \(m_\mathrm {\mathrm {t}}\) and \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is very similar to the one used in earlier CMS analyses to extract \(\alpha _S (m_\mathrm {\mathrm {Z}})\) using jet cross section measurements, e.g. in Ref. [78].

It is assumed that the measured \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) is not affected by non-SM physics. The SM theoretical prediction for \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) at NNLO [6,7,8,9] is calculated using the Hathor  2.0 [79] program, interfaced with xFitter. This is the only available calculation to date that provides the \(m_\mathrm {\mathrm {t}}\) definition in the \(\mathrm {\overline{MS}}\) scheme. The top quark mass in the \(\mathrm {\overline{MS}}\) scheme is denoted by \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\), following the convention of quoting the value of a running quantity at a fixed scale. In the calculation, the renormalization and factorization scales, \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\), are set to \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\). These are varied by a factor of two up and down, independently, avoiding cases where \(\mu _\mathrm {f}/\mu _\mathrm {r} = 1/4\) or 4, in order to estimate the uncertainty due to the missing higher-order corrections (referred to in the following as the scale variation uncertainty).

The values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) and \(m_\mathrm {\mathrm {t}}\) cannot be determined simultaneously, since both parameters alter the predicted \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) in such a way that any variation of one parameter can be compensated by a variation of the other. In the presented analysis, the values of \(m_\mathrm {\mathrm {t}}\) and \(\alpha _S (m_\mathrm {\mathrm {Z}})\) are therefore determined at fixed values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) and \(m_\mathrm {\mathrm {t}}\), respectively.

The four most recent PDF sets available [80] at NNLO are used: ABMP16nnlo [17], CT14nnlo [53], MMHT14nnlo [81], and NNPDF3.1nnlo [82]. While CT14nnlo does not use any \({\mathrm {t}\overline{\mathrm {t}}}\) data as input, the PDF sets ABMP16nnlo and MMHT14nnlo use measurements of inclusive \({\mathrm {t}\overline{\mathrm {t}}}\) cross sections at the Tevatron and LHC, and NNPDF3.1nnlo makes use of all available inclusive and differential \({\mathrm {t}\overline{\mathrm {t}}}\) cross section measurements. Using the currently available \({\mathrm {t}\overline{\mathrm {t}}}\) measurements has only a marginal effect on a global PDF and \(\alpha _S (m_\mathrm {\mathrm {Z}})\) fit [17, 53]. The details of the PDFs relevant for this analysis are summarized in Table 4. In the MMHT14nnlo, CT14nnlo, and NNPDF3.1nnlo PDFs, the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is assumed to be 0.118. In ABMP16nnlo, \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is fitted simultaneously with the PDFs. The ABMP16nnlo PDF employs the \(\mathrm {\overline{MS}}\) scheme for the heavy-quark mass treatment in its determination. Similar to the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\), the value of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) in the ABMP16nnlo set is obtained in a simultaneous fit with the PDFs. For the other PDFs, the values of \(m_\mathrm {\mathrm {t}} ^{\text {pole}}\) are assumed, as listed in Table 4. Since the analysis is performed in the \(\mathrm {\overline{MS}}\) scheme, the assumed \(m_\mathrm {\mathrm {t}} ^{\text {pole}}\) of each PDF is converted into \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) using the RunDec  [83, 84] code, according to the prescription by the corresponding PDF group.

Table 4 Values of the top quark pole mass \(m_\mathrm {\mathrm {t}} ^{\text {pole}}\) and strong coupling constant \(\alpha _S (m_\mathrm {\mathrm {Z}})\) used in the different PDF sets. Also shown are the corresponding \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) values obtained using the RunDec  [83, 84] conversion, the number of loops in the conversion, and the \(\alpha _S \) range used to estimate the PDF uncertainties

For each PDF set used, a series of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) values is provided. The PDF uncertainties for all sets correspond to a 68% confidence level (\(\text {CL}\)), whereby the uncertainties in the CT14nnlo PDF set are scaled down from 90% \(\text {CL}\).
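
A rescaling between confidence levels can be sketched as follows, assuming Gaussian-distributed uncertainties so that the interval width scales with the two-sided normal quantile. The function names and the exact 68.27% target are assumptions made for illustration and do not represent the prescription of any PDF group.

```python
from statistics import NormalDist

def cl_scale_factor(cl_from, cl_to=0.6827):
    """Ratio of Gaussian interval half-widths at cl_from and cl_to (two-sided)."""
    z = NormalDist().inv_cdf
    return z(0.5 * (1.0 + cl_from)) / z(0.5 * (1.0 + cl_to))

def rescale_uncertainty(unc, cl_from, cl_to=0.6827):
    """Convert an uncertainty quoted at cl_from into one at cl_to."""
    return unc / cl_scale_factor(cl_from, cl_to)

factor_90 = cl_scale_factor(0.90)  # about 1.645
factor_95 = cl_scale_factor(0.95)  # about 1.960
print(f"90% -> 68%: divide by {factor_90:.3f}")
print(f"95% -> 68%: divide by {factor_95:.3f}")
```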

Because of the strong correlation between \(\alpha _S \) and \(m_\mathrm {\mathrm {t}}\) in the prediction of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \), for the \(m_\mathrm {\mathrm {t}}\) extraction, the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) in the theoretical prediction is set to that of the particular PDF set. Similarly, in the theoretical prediction of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) used for the \(\alpha _S (m_\mathrm {\mathrm {Z}})\) determination, the value of \(m_\mathrm {\mathrm {t}}\) is the one used in the PDF evaluation. The correlation of the values of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\), \(\alpha _S (m_\mathrm {\mathrm {Z}})\), and the proton PDFs in the prediction of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) is also studied.

To extract the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) from \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \), the measured cross section is compared to the theoretical prediction, and for each \(\alpha _S (m_\mathrm {\mathrm {Z}})\) member of each PDF set, the \(\chi ^2\) is evaluated. In the case of ABMP16nnlo and NNPDF3.1nnlo, the complete set of PDF uncertainties is provided for each member of the \(\alpha _S (m_\mathrm {\mathrm {Z}})\) series and is accounted for in the analysis. The uncertainties in the CT14nnlo and MMHT14nnlo PDFs are evaluated only for the central \(\alpha _S (m_\mathrm {\mathrm {Z}})\) value of 0.118 and are used for each \(\alpha _S (m_\mathrm {\mathrm {Z}})\) variant in the fit. The optimal value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is subsequently determined from a parabolic fit of the form

$$\begin{aligned} \chi ^2 (\alpha _S )=\chi ^2 _\text {min}+\left( \frac{\alpha _S - \alpha _S ^\text {min}}{\delta (\alpha _S ^\text {min})}\right) ^2 \end{aligned}$$
(7)

to the \(\chi ^2 (\alpha _S )\) values. Here, \(\chi ^2 _\text {min}\) is the \(\chi ^2\) value at \(\alpha _S = \alpha _S ^\text {min} \) and \(\delta (\alpha _S ^\text {min})\) is the fitted experimental uncertainty in \(\alpha _S ^\text {min}\), which also accounts for the PDF uncertainty. The \(\chi ^2 (\alpha _S ) \) scan is illustrated in Fig. 11 for the PDF sets used, demonstrating a clear parabolic behaviour. To estimate the scale variation uncertainties, this procedure is repeated with \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\) being varied, and the largest deviations of the resulting values of \(\alpha _S ^\text {min}\) from that of the central scale choice are taken as the corresponding uncertainties. The values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) obtained using different PDFs are listed in Table 5 and shown in Fig. 11. The uncertainties in the measured \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) and the PDF contribute about equally to the resulting \(\alpha _S (m_\mathrm {\mathrm {Z}})\) uncertainty.
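
The parabolic fit of Eq. (7) can be illustrated with a short least-squares sketch. The scan points below are generated from a hypothetical parabola and are not results of the analysis; the function names are likewise assumptions, and the actual fit is performed within xFitter.

```python
import math

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / a[r][r]
    return x

def fit_parabola(alphas, chi2s):
    """Least-squares fit of Eq. (7); returns (alpha_min, delta, chi2_min)."""
    x0 = sum(alphas) / len(alphas)  # centre the scan for numerical stability
    xs = [a - x0 for a in alphas]
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, chi2s)) for k in range(3)]
    # Normal equations for chi2 = c0 + c1 * x + c2 * x^2
    c0, c1, c2 = solve3([[s[0], s[1], s[2]],
                         [s[1], s[2], s[3]],
                         [s[2], s[3], s[4]]], t)
    if c2 <= 0.0:
        raise ValueError("scan has no minimum (non-positive curvature)")
    alpha_min = x0 - c1 / (2.0 * c2)
    delta = 1.0 / math.sqrt(c2)  # Eq. (7): curvature equals 1 / delta^2
    chi2_min = c0 - c1 * c1 / (4.0 * c2)
    return alpha_min, delta, chi2_min

# Hypothetical scan generated from alpha_min = 0.115, delta = 0.004, chi2_min = 0.2
alphas = [0.110 + 0.002 * i for i in range(7)]
chi2s = [0.2 + ((a - 0.115) / 0.004) ** 2 for a in alphas]

alpha_min, delta, chi2_min = fit_parabola(alphas, chi2s)
print(f"alpha_S^min = {alpha_min:.4f} +- {delta:.4f}, chi2_min = {chi2_min:.2f}")
```

With this parametrization, \(\delta (\alpha _S ^\text {min})\) is the shift in \(\alpha _S \) that raises \(\chi ^2\) by one unit above its minimum.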

Fig. 11

Left: \(\chi ^2\) versus \(\alpha _S \) obtained from the comparison of the measured \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) value to the NNLO prediction in the \(\mathrm {\overline{MS}}\) scheme using different PDFs (symbols of different styles). Right: \(\alpha _S (m_\mathrm {\mathrm {Z}})\) obtained from the comparison of the measured \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) value to the theoretical prediction using different PDF sets in the \(\mathrm {\overline{MS}}\) scheme. The corresponding value of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) is given for each PDF set. The inner horizontal bars on the points represent the experimental and PDF uncertainties added in quadrature. The outer horizontal bars show the total uncertainties. The vertical line displays the world-average \(\alpha _S (m_\mathrm {\mathrm {Z}})\) value [29], with the hatched band representing its uncertainty

Table 5 Values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) with their uncertainties obtained from a comparison of the measured \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) value to the NNLO prediction in the \(\mathrm {\overline{MS}}\) scheme using different PDF sets. The first uncertainty is the combination of the experimental and PDF uncertainties, and the second is from the variation of the renormalization and factorization scales

The values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) obtained using different PDF sets are consistent with each other and agree with the world-average value [29] within the uncertainties, although they suggest a smaller value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\). The value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is also in good agreement with the recent result of the H1 experiment [85], obtained from an analysis of jet production in deep-inelastic scattering using NNLO calculations, and is of comparable precision.

The same procedure is used to extract \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) by fixing \(\alpha _S (m_\mathrm {\mathrm {Z}})\) to the nominal value at which each PDF set is evaluated. The fit is performed by varying \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) in a 5-\(\text {Ge}\text {V}\) range around the central value used in each PDF. The uncertainties related to the variation of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) in the PDF are estimated by repeating the fit using the PDF eigenvectors with \(\alpha _S (m_\mathrm {\mathrm {Z}})\) varied within its uncertainty, as provided by NNPDF3.1nnlo, MMHT2014nnlo, and CT14nnlo. In the case of ABMP16nnlo, the value of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) is a free parameter in the PDF fit and its uncertainty is implicitly included in the ABMP16nnlo PDF uncertainty eigenvectors. The resulting \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) values are summarized in Table 6, where the fit uncertainty corresponds to the precision of the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) measurement. The results obtained with different PDF sets are in agreement, although the ABMP16nnlo PDF set yields a systematically lower value. This difference is expected and originates from the larger value of \(\alpha _S (m_\mathrm {\mathrm {Z}}) = 0.118\) assumed in the NNPDF3.1, MMHT2014, and CT14 PDFs.

Table 6 Values of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) obtained from the comparison of the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) measurement with the NNLO predictions using different PDF sets. The first uncertainty shown comes from the experimental, PDF, and \(\alpha _S (m_\mathrm {\mathrm {Z}})\) uncertainties, and the second from the variation in the renormalization and factorization scales

The values of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) are in agreement with those originally used in the evaluation of each PDF set. The results are shown in Fig. 12 for the four different PDFs used.

Fig. 12

Values of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) obtained from comparing the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) measurement to the theoretical NNLO predictions using different PDF sets. The inner horizontal bars on the points represent the quadratic sum of the experimental, PDF, and \(\alpha _S (m_\mathrm {\mathrm {Z}})\) uncertainties, while the outer horizontal bars give the total uncertainties

Fig. 13

Values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) obtained in the comparison of the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) measurement to the NNLO prediction using different PDFs, as a function of the \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) value used in the theoretical calculation. The results from using the different PDFs are shown by the bands with different shadings, with the band width corresponding to the quadratic sum of the experimental and PDF uncertainties in \(\alpha _S (m_\mathrm {\mathrm {Z}})\). The resulting measured values of \(\alpha _S (m_\mathrm {\mathrm {Z}})\) are shown by the different style points at the \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) values used for each PDF. The inner vertical bars on the points represent the quadratic sum of the experimental and PDF uncertainties in \(\alpha _S (m_\mathrm {\mathrm {Z}})\), while the outer vertical bars show the total uncertainties

The dependence of the \(\alpha _S (m_\mathrm {\mathrm {Z}})\) result on the assumed value of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) is investigated for each PDF by performing the \(\chi ^2 (\alpha _S )\) scan for ten values of \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) ranging from 160.5 to 165.0\(\,\text {Ge}\text {V}\). A linear dependence is observed, as shown in Fig. 13.
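How such a linear dependence could be quantified is sketched below. Ten equally spaced mass points between 160.5 and 165.0\(\,\text {Ge}\text {V}\) correspond to steps of 0.5\(\,\text {Ge}\text {V}\). The \(\alpha _S ^\text {min}\) values, the slope, and the reference mass of 163.0\(\,\text {Ge}\text {V}\) are synthetic placeholders chosen for illustration; in practice each point would come from a separate \(\chi ^2 (\alpha _S )\) scan.

```python
import numpy as np

# Ten mass points from 160.5 to 165.0 GeV, i.e. steps of 0.5 GeV.
mt_values = np.linspace(160.5, 165.0, 10)

# Hypothetical alpha_S minima, one per mass point (synthetic, exactly linear).
reference_mass = 163.0      # GeV, arbitrary expansion point
assumed_slope = 0.0010      # per GeV, illustrative only
assumed_offset = 0.1150     # alpha_S at the reference mass, illustrative only
alpha_s_min = assumed_offset + assumed_slope * (mt_values - reference_mass)

# Straight-line fit: alpha_S = slope * (mt - reference_mass) + offset.
slope, offset = np.polyfit(mt_values - reference_mass, alpha_s_min, 1)

# Residuals indicate how well the linear hypothesis describes the points.
residuals = alpha_s_min - (slope * (mt_values - reference_mass) + offset)
print(f"slope = {slope:.5f} per GeV, max |residual| = {np.abs(residuals).max():.2e}")
```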

9 Extraction of \(m_\mathrm {\mathrm {t}}\) in the pole mass scheme

The extraction of \(m_\mathrm {\mathrm {t}}\) is repeated in the pole mass scheme using the Top++ 2.0 program [52], which employs the calculation of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) at NNLO, improved by the NNLL soft-gluon resummation. The results are summarized in Table 7. The scale variation uncertainties are estimated in the same way as in the case of the \(m_\mathrm {\mathrm {t}} (m_\mathrm {\mathrm {t}})\) extraction. These uncertainties are larger than those determined in the \(\mathrm {\overline{MS}}\) scheme, reflecting the better convergence of the perturbative series when the \(\mathrm {\overline{MS}}\) renormalization scheme is used in the calculation of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \).
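The scale variation uncertainty, taken as the largest deviation of the varied results from the central one, can be sketched as an envelope calculation. The set of \((\mu _\mathrm {r}, \mu _\mathrm {f})\) variations, the helper name `scale_envelope`, and every numerical value below are hypothetical; they only illustrate the bookkeeping and are not results of this analysis.

```python
def scale_envelope(central, varied):
    """Return the largest upward and downward shifts relative to the central value."""
    shifts = [value - central for value in varied]
    up = max((s for s in shifts if s > 0), default=0.0)
    down = min((s for s in shifts if s < 0), default=0.0)
    return up, down

# Hypothetical fit results (in GeV) for the central scale choice and for
# variations of (mu_r, mu_f) by factors of two; the numbers are made up.
central_value = 170.5
varied_values = {
    (2.0, 1.0): 171.1,
    (0.5, 1.0): 169.8,
    (1.0, 2.0): 170.7,
    (1.0, 0.5): 170.2,
}

up, down = scale_envelope(central_value, varied_values.values())
print(f"scale uncertainty: +{up:.1f} / {down:.1f} GeV")
```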

Table 7 Values of \(m_\mathrm {\mathrm {t}} ^{\text {pole}}\) obtained by comparing the \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) measurement with predictions at NNLO+NNLL using different PDF sets

10 Summary

A measurement of the top quark–antiquark pair production cross section \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) by the CMS Collaboration in proton–proton collisions at a centre-of-mass energy of 13\(\,\text {Te}\text {V}\) is presented, corresponding to an integrated luminosity of \(35.9{\,\text {fb}^{-1}} \). Assuming a top quark mass in the simulation of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.5 \,\text {Ge}\text {V} \), a visible cross section is measured in the fiducial region using dilepton events (\(\mathrm {e}^{\pm }\mathrm {\mu }^{\mp }\), \(\mathrm {\mu ^+}\mathrm {\mu ^-}\), \(\mathrm {e}^+\mathrm {e}^-\)) and then extrapolated to the full phase space. The total \({\mathrm {t}\overline{\mathrm {t}}}\) production cross section is found to be \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} = 803 \pm 2 \,\text {(stat)} \pm 25 \,\text {(syst)} \pm 20 \,\text {(lumi)} \,\text {pb} \). The measurement is in good agreement with the theoretical prediction calculated to next-to-next-to-leading order in perturbative QCD, including soft-gluon resummation to next-to-next-to-leading logarithm.
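For orientation, the overall precision of the quoted cross section can be estimated by adding the three uncertainty components in quadrature. This short sketch assumes the statistical, systematic, and luminosity uncertainties are uncorrelated, which is an assumption made here for illustration and is not stated in this summary.

```python
import math

# Central value and uncertainty components quoted in the text, in pb.
sigma_ttbar = 803.0
stat, syst, lumi = 2.0, 25.0, 20.0

# Quadrature sum, assuming the three components are uncorrelated.
total = math.sqrt(stat**2 + syst**2 + lumi**2)
relative = total / sigma_ttbar

print(f"total uncertainty = {total:.1f} pb ({100 * relative:.1f}%)")
```

Under this assumption the total uncertainty is about 32 pb, or roughly 4% of the measured value.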

The measurement is repeated including the top quark mass in the POWHEG simulation as an additional free parameter in the fit. The sensitivity to \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}}\) is maximized by fitting the minimum invariant mass found when combining the charged leptons with the \(\mathrm {b}\) jets in an event. This yields a cross section of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} = 815 \pm 2 \,\text {(stat)} \pm 29 \,\text {(syst)} \pm 20 \,\text {(lumi)} \,\text {pb} \) and a value of \(m_\mathrm {\mathrm {t}} ^{\mathrm {MC}} = 172.33 \pm 0.14 \,\text {(stat)} \,^{+0.66}_{-0.72} \,\text {(syst)} \,\text {Ge}\text {V} \), in good agreement with previous measurements. The value of \(\sigma _{\mathrm {t}\overline{\mathrm {t}}} \) obtained in the simultaneous fit is further used to extract the values of the top quark mass and the strong coupling constant at next-to-next-to-leading order in the minimal subtraction renormalization scheme, as well as the value of the top quark pole mass for different sets of parton distribution functions.