Measurement of the inclusive $\mathrm{t\bar{t}}$ production cross section in proton-proton collisions at $\sqrt{s} =$ 5.02 TeV

The top quark pair production cross section is measured in proton-proton collisions at a center-of-mass energy of 5.02 TeV. The data were collected in a special LHC low-energy and low-intensity run in 2017, and correspond to an integrated luminosity of 302 pb$^{-1}$. The measurement is performed using events with one electron and one muon of opposite charge, and at least two jets. The measured cross section is 60.7 $\pm$ 5.0 (stat) $\pm$ 2.8 (syst) $\pm$ 1.1 (lumi) pb. To reduce the statistical uncertainty, a combination with the result in the single lepton + jets channel, based on data collected in 2015 at the same center-of-mass energy and corresponding to an integrated luminosity of 27.4 pb$^{-1}$, is then performed. The resulting measured value is 63.0 $\pm$ 4.1 (stat) $\pm$ 3.0 (syst+lumi) pb, in agreement with the standard model prediction of 66.8 $^{+2.9}_{-3.1}$ pb.


Introduction
The top quark is the most massive elementary particle in the standard model (SM). The study of its production and properties is one of the core elements of the CERN LHC physics program. At the LHC, top quarks are primarily produced in pairs (tt), and the tt production cross section is sensitive to the gluon parton distribution function (PDF) of the proton [1] and to the top quark pole mass [2]. The ATLAS, CMS, and LHCb Collaborations have performed several cross section measurements with increasing precision in a variety of decay channels at four proton-proton (pp) collision energies [3][4][5][6][7][8][9][10][11][12], as well as in proton-nucleus [13] and nucleus-nucleus [14] collisions.
The first measurement of the tt production cross section, σ tt , in pp collisions at a center-of-mass energy of 5.02 TeV, was performed by the CMS experiment analyzing events with one or two leptons (ℓ = electron or muon) and at least two jets, using a data sample taken in 2015 that corresponds to an integrated luminosity of 27.4 pb −1 . The measurement of σ tt = 69.5 ± 6.1 (stat) ± 5.6 (syst) ± 1.6 (lumi) pb was obtained from the combination of the results in the dilepton and single lepton decay channels [3].
During the year 2017, the LHC delivered a subset of pp collisions at √ s = 5.02 TeV and CMS collected a data sample corresponding to 302 pb −1 , an increase in integrated luminosity of more than an order of magnitude compared to the data set recorded in 2015. A distinct feature of this data sample is the low number of additional interactions per bunch crossing (pileup) with respect to the standard operating conditions of the LHC. We present here a measurement of σ tt using events with two opposite-charge different-flavor leptons, i.e., one electron and one muon (e ± µ ∓ ), and at least two jets. The cross section is extracted using a counting experiment and the result is then combined with the measurement in the ℓ+jets final state contained in Ref. [3]. This paper is organized as follows. A brief description of the CMS detector and of the Monte Carlo (MC) simulation samples is given in Section 2, followed by the object and event selection in Section 3. The background estimation methods are covered in Section 4 and the systematic uncertainties in Section 5. Results are discussed in Section 6 and the summary is given in Section 7. Tabulated results are provided in HEPData [15].

The CMS detector and Monte Carlo simulation
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [16].
Simulated event samples are used to define the analysis strategy, to estimate the background contribution, and to evaluate efficiencies and uncertainties. The samples used in the analysis are summarized in Table 1. The propagation of the generated particles through the CMS detector and the modeling of the detector response are performed using GEANT4 [17].
Simulated tt events are generated at next-to-leading order (NLO) in quantum chromodynamics (QCD) using POWHEG (v2) [18][19][20], assuming a top quark mass m t of 172.5 GeV. The events are then interfaced with PYTHIA 8 (v230) [21] with the "CP5" tune [22] for parton showering, hadronization, and the underlying event description. For the study of the acceptance dependence on m t , alternative generator-level samples have been used with m t = 166.5 and 178.5 GeV. The NNPDF3.1 [23] next-to-next-to-leading-order (NNLO) PDFs are used. A similar setup is used for the simulation of the single top quark production in association with a W boson (tW).
The MADGRAPH5 aMC@NLO (v2.4.2) generator [24], interfaced with PYTHIA 8 for parton showering, is used to simulate W boson production with additional jets (W+jets), and Drell-Yan (DY) quark-antiquark annihilation into lepton-antilepton pairs through Z boson or virtual-photon exchange. The simulation is performed at NLO in QCD and includes up to two extra partons at the matrix element (ME) level. The FxFx matching scheme [25] is used to merge jets from the ME calculations and the parton shower (PS). Diboson (VV, with V = W or Z) events are simulated at NLO in QCD with POWHEG (v2). When available, higher-order cross sections are used instead of those of the generator, as shown in Table 1. The SM prediction for σ tt at 5.02 TeV is 66.8 +1.9 −2.3 (scale) ± 1.7 (PDF) +1.4 −1.3 (α S (m Z )) pb for m t = 172.5 GeV and a strong coupling at the Z boson mass, α S (m Z ), of 0.118 ± 0.001 [30]. This prediction is calculated with the TOP++ program [26] at NNLO in perturbative QCD including soft-gluon resummation at next-to-next-to-leading-log (NNLL) approximation [27]. The first uncertainty reflects variations in the factorization (µ F ) and renormalization (µ R ) scales. The second and third uncertainties are associated with possible choices of PDFs and the α S value, respectively, using the NNPDF3.1 [23] NNLO PDF sets that include top quark measurements. The expected integrated event yields for signal in all figures and tables are normalized to the predicted cross section.
The simulated samples include multiple pp collisions occurring in the same bunch crossing (pileup), with a distribution that matches that observed in data, with an average of about two pileup collisions per bunch crossing.

Object reconstruction and event selection
Events of interest are selected online using a two-tiered trigger system [31, 32]. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of less than 4 µs. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage. Only events that fired at least one of the single-lepton triggers with transverse momentum (p T ) thresholds greater than 17 (12) GeV in the case of electrons (muons) are considered.
Events may contain multiple primary vertices, corresponding to pileup collisions. The candidate vertex with the largest value of summed physics-object $p_\mathrm{T}^2$ is taken to be the primary pp interaction vertex. The physics objects are the jets, clustered using the jet finding algorithm [33, 34] using tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets.
The particle-flow algorithm [35] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of electrons is determined from a combination of the electron momentum as measured by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.
Jets are clustered from these reconstructed particles using the anti-k T algorithm [33, 34] with a distance parameter of 0.4. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5 to 10% of the true momentum over the whole p T spectrum and detector acceptance. Jet energy corrections are derived from simulation studies so that the average measured energy of jets becomes identical to that of particle-level jets. Measurements of the momentum balance are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made [36]. These corrections were derived using the full low-pileup data set of pp collisions at 5.02 TeV. Additional selection criteria are applied to remove jets potentially dominated by instrumental effects or reconstruction failures [37].
Electron candidates are required to satisfy |η| < 2.5 and p T > 10 GeV. To identify electrons, requirements are placed on a multivariate discriminant based on the shower shape and track quality of the electron candidates [38]. Electron candidates that are matched to a secondary vertex consistent with a photon conversion, or have a missing hit in the inner layer of the tracker are vetoed.
Reconstructed muon candidates are required to have |η| < 2.4 and p T > 10 GeV, and must fulfill criteria on the geometrical matching between the tracks reconstructed by the silicon tracker and the muon system, and on the quality of the global fit [39].
Lepton candidates must be consistent with originating from the primary vertex, which is ensured by requiring that the transverse (longitudinal) impact parameter does not exceed 0.05 (0.10) cm. Furthermore, the significance of the three-dimensional impact parameter must be smaller than 8. Electrons and muons must also satisfy a requirement on their relative isolation (I rel ), defined as the scalar p T sum of all the particles inside a cone around the lepton direction, excluding the lepton itself, divided by the lepton p T . The cone size, defined as $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, where φ is the azimuthal angle, varies with the lepton p T : ∆R = 0.2 for p T ≤ 50 GeV, ∆R = (10 GeV)/p T for 50 GeV < p T < 200 GeV, and ∆R = 0.05 for p T ≥ 200 GeV. Electrons (muons) must satisfy the condition I rel < 0.085 (0.325).
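The p T -dependent cone size and the relative-isolation requirement described above can be sketched as follows. This is a minimal illustration of the selection logic; the function names are ours, not taken from the CMS software.

```python
def isolation_cone_radius(pt):
    """p_T-dependent isolation cone size Delta R, as defined in the text.

    pt is the lepton transverse momentum in GeV.
    """
    if pt <= 50.0:
        return 0.2
    if pt < 200.0:
        return 10.0 / pt  # shrinks as 10 GeV / p_T in the intermediate range
    return 0.05


def passes_isolation(lepton_pt, sum_pt_in_cone, is_electron):
    """Relative isolation I_rel = (scalar p_T sum in cone) / (lepton p_T),
    required to be below 0.085 for electrons and 0.325 for muons."""
    i_rel = sum_pt_in_cone / lepton_pt
    return i_rel < (0.085 if is_electron else 0.325)
```

The tighter electron threshold reflects the larger misidentification rate for electrons compared to muons.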
To distinguish leptons produced in the decay of the electroweak bosons ("prompt") from those originating from hadron decays or misidentified leptons ("nonprompt"), a gradient boosted decision tree (BDT), trained using MC simulation, is used [40]. This BDT uses the properties of the jet containing the lepton, as returned by the jet clustering algorithm: its b tagging score, the ratio of the lepton p T to that of the jet, and the momentum of the jet transverse to the lepton direction. Other input variables are the lepton p T , η, I rel , longitudinal and transverse impact parameters, and the significance of the three-dimensional impact parameter. In addition, the previously mentioned multivariate discriminant for electrons and the muon segment compatibility for muons are used as input variables [39]. To further suppress nonprompt leptons originating from b quark decays, leptons associated with a jet satisfying the loose working point of the DeepCSV b tagging algorithm [41] are rejected.
The tt candidate events are required to have at least two leptons (one electron and one muon) with opposite charge and at least two jets. Only jets with p T > 25 GeV, |η| < 2.4, and containing no selected leptons are considered. To ensure efficient triggering of the events, the leading lepton is required to have p T > 20 GeV. In addition, events must have a dilepton invariant mass above 20 GeV to reduce the background from photon conversions and low-mass resonances.

Background estimation
Background events arise mainly from tW, DY, and VV production in which at least two prompt leptons emerge from the Z or W boson decays. The tW and VV contributions are estimated from simulation.
The DY event yield is estimated from data using the R out/in method [6], where events with same-flavor leptons are used to normalize the yield of different-flavor pairs from DY production of τ lepton pairs. A data-to-simulation normalization factor is estimated from the number of events in data within a 15 GeV window around the Z boson mass and extrapolated to the number of events outside the Z boson mass window, with corrections applied using control regions enriched in DY events in data. This factor is measured to be 0.91 ± 0.01. The stability of the method against a potential mismodeling of the jet multiplicity is checked and found to be within 30%; this value is taken as an additional systematic uncertainty in this background estimate.
Other residual background sources, such as tt events in which only one of the W bosons decays leptonically, or W+jets events, may contaminate the signal sample when a jet is misreconstructed as a lepton, or contains a lepton from a b or c hadron decay that is incorrectly identified as a prompt lepton. These events are grouped into the nonprompt lepton category, together with meson decays and photon conversions; their contribution is estimated with simulated tt events with at least one W boson decaying into jets and W+jets events.

Figure 1 shows the p T of the two leptons and of the leading jet, and the jet multiplicity of the selected events. The data are compared to the sum of the expected signal and background distributions for the tt signal and individual backgrounds, which are derived either from simulated samples or from data, as described above. The expected distributions describe the data within the experimental uncertainties.

Systematic uncertainties
The measurement of σ tt is affected by sources of systematic uncertainty related to detector effects or theoretical assumptions. Each source of systematic uncertainty is evaluated by repeating the σ tt extraction with variations of the input parameters by ±1 standard deviation (experimental uncertainties) or with dedicated simulation samples with different settings (theoretical uncertainties). The experimental uncertainties mostly affect the efficiency, while the modeling uncertainties mostly affect the acceptance. Simulated events are corrected to match the lepton and jet performance measured in data, and the uncertainty in these corrections is propagated to σ tt by varying each correction within its associated uncertainty.

Scale choice:
The uncertainty related to the missing higher-order diagrams in POWHEG is estimated by varying the default µ F and µ R choices independently by factors of 2 and 1/2. The maximum difference of each variation from the nominal value, excluding variations of the two scales in opposite directions, is assigned as the uncertainty in the signal acceptance.

Parton shower scale:
The effect of the choice of PS scale is studied by changing the scale used for the initial- and final-state radiation by factors of 2 and 1/2 with respect to its default value. The maximum variation with respect to the central sample is taken as the uncertainty.

Matrix element and PS matching (h damp ):
The impact of the ME and PS matching, which is parameterized by the POWHEG generator as h damp , with a nominal value of (1.4 +0.9 −0.5 )m t [22], is calculated by varying this parameter within the uncertainties, using dedicated samples. The variation with respect to the central value of the signal acceptance at particle level is considered as the uncertainty in the σ tt extraction.

Parton distribution functions:
The uncertainty due to the proton PDFs is evaluated by reweighting simulated signal events using the replicas of the NNPDF3.1 set [23]. The variations consist of a central PDF and 100 replicas, for which the root mean square of all differences of the resulting σ tt with respect to the central value is taken as the uncertainty. Two extra variations corresponding to different α S (m Z ) choices are added in quadrature [42].
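The replica-based prescription above can be sketched as a short computation: the root mean square of the replica-to-central differences, combined in quadrature with an α S term. The specific combination of the two α S variations shown here (half their spread) is our assumption, not a detail given in the text.

```python
import math


def pdf_uncertainty(sigma_central, sigma_replicas, sigma_alphas_up, sigma_alphas_down):
    """RMS of the replica cross sections around the central value,
    combined in quadrature with an alpha_S uncertainty taken here
    as half the spread of the two alpha_S variations (our assumption)."""
    n = len(sigma_replicas)
    rms = math.sqrt(sum((s - sigma_central) ** 2 for s in sigma_replicas) / n)
    d_alphas = 0.5 * abs(sigma_alphas_up - sigma_alphas_down)
    return math.sqrt(rms ** 2 + d_alphas ** 2)
```

In the analysis, sigma_replicas would hold the 100 values of σ tt obtained by reweighting the signal sample with each NNPDF3.1 replica.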

Underlying event tune:
The parameters of the PYTHIA underlying event tune are adjusted to model measurements of the underlying event [22]. The uncertainty is calculated by varying these parameters within their uncertainties in dedicated simulated samples. The variation with respect to the central value of the signal acceptance is taken as the uncertainty.

Background normalization:
The uncertainty in the tW and VV cross sections is taken to be 20 and 30%, respectively, based on the theoretical uncertainties and the effect of the finite size of the simulated samples. A 50% uncertainty is assigned to the nonprompt background estimate to account for possible mismodeling of the data in simulation. As explained in Section 4, a 30% uncertainty is considered for the DY background normalization.

Pileup and integrated luminosity:
The uncertainty assigned to the number of pileup events in simulation is calculated by varying the total inelastic pp cross section by 4.6% [43]. The impact of this uncertainty on the result is negligible, as the number of pileup events is also small. The uncertainty in the measurement of the integrated luminosity is estimated to be 1.9% [44]. Table 2 summarizes the sources of systematic and statistical uncertainties in the measured σ tt , as obtained using Eq. (1), explained in the next section. The result is dominated by the statistical uncertainty, while the uncertainty in the JES and the DY background estimate constitute the largest systematic uncertainties.

Results
The tt production cross section is extracted via the expression

$$\sigma_{\mathrm{t\bar{t}}} = \frac{N - N_{\mathrm{bkg}}}{B \, A \, \varepsilon \, L}, \qquad (1)$$

where N is the number of observed events, N bkg is the number of estimated background events, L is the integrated luminosity, B = 3.19% is the SM value [30] of the branching fraction of a W boson pair to e ± µ ∓ , including decays through τ leptons, A is the total acceptance, defined as the fraction of all generated tt → e ± µ ∓ events fulfilling the aforementioned kinematic selection criteria, and ε is the reconstruction efficiency. The acceptance is estimated from simulation and is found to be 0.54 ± 0.01. The efficiency is estimated from simulation, after applying all the correction factors for leptons and jets to match the performance of the data, and is measured to be 0.53 ± 0.02. Table 3 shows the total number of events observed in data together with the total number of expected signal and background events. The measured inclusive cross section for a top quark mass of 172.5 GeV is σ tt = 60.7 ± 5.0 (stat) ± 2.8 (syst) ± 1.1 (lumi) pb.
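Numerically, Eq. (1) amounts to a simple counting computation. The sketch below uses the luminosity, branching fraction, acceptance, and efficiency quoted in the text; the event counts are hypothetical placeholders (the actual observed and background yields are given in Table 3, not reproduced here).

```python
def ttbar_cross_section(n_obs, n_bkg, branching, acceptance, efficiency, lumi_pb):
    """sigma_tt = (N - N_bkg) / (B * A * eps * L), cf. Eq. (1)."""
    return (n_obs - n_bkg) / (branching * acceptance * efficiency * lumi_pb)


# Quantities quoted in the text; the event counts below are hypothetical.
B, A, EPS, LUMI = 0.0319, 0.54, 0.53, 302.0  # branching, acceptance, efficiency, pb^-1
sigma = ttbar_cross_section(n_obs=200, n_bkg=33, branching=B,
                            acceptance=A, efficiency=EPS, lumi_pb=LUMI)
# With these placeholder counts the result lands near the measured 60.7 pb,
# i.e. a background-subtracted yield of roughly 170 events.
```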
The fiducial cross section (σ fid tt ) is measured for events containing one electron and one muon with p T > 10 GeV and |η| < 2.4, invariant mass of the pair of at least 20 GeV, a leading lepton p T of at least 20 GeV, and at least two jets with p T > 25 GeV and |η| < 2.4. For the fiducial cross section measurement, an estimate of the uncertainties similar to that shown in Table 2 is made. The resulting value is σ fid tt = 1.05 ± 0.09 (stat) ± 0.05 (syst) ± 0.02 (lumi) pb. The acceptance has been predicted for m t = 166.5 and 178.5 GeV and is parameterized as a linear function of m t . The cross section varies by ∓0.30 pb when the top quark mass changes by ±0.5 GeV.
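The quoted mass dependence corresponds to a slope of about −0.6 pb per GeV around the reference mass. A one-line sketch of this linear parameterization, encoding the numbers given in the text:

```python
def sigma_tt_at_mass(mt_gev, sigma_ref=60.7, mt_ref=172.5, slope_pb_per_gev=-0.6):
    """Linear dependence of the measured cross section on the assumed top mass:
    the text quotes a -/+0.30 pb change per +/-0.5 GeV, i.e. about -0.6 pb/GeV."""
    return sigma_ref + slope_pb_per_gev * (mt_gev - mt_ref)
```

The slope is negative because a heavier top quark increases the acceptance, which enters the denominator of Eq. (1).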
The result is combined with that obtained in the ℓ+jets decay channel of Ref. [3], corresponding to an integrated luminosity of 27.4 pb −1 . The result obtained in the dilepton decay channel of Ref. [3] was not added to the combination, as its contribution would be negligible. We determine the combined σ tt using the best linear unbiased estimator (BLUE) method [45, 46]. The 2015 measurement in the ℓ+jets channel yielded a cross section of σ tt = 68.9 pb with a total uncertainty of 13%, dominated by the statistical uncertainty. Most sources of experimental uncertainty are considered as uncorrelated, given that the data sets and background estimation methods are different, with the exception of the uncertainties in the tW background and the scale choice, which are considered as fully correlated. The modeling uncertainties are taken as fully correlated. The resulting cross section is σ comb tt = 63.0 ± 4.1 (stat) ± 3.0 (syst+lumi) pb, where the total uncertainty of 8.0% is the quadrature sum of the individual sources of uncertainty. The weights of the individual measurements, to be understood in the sense of Ref. [46], are 27 and 73% for the ℓ+jets measurement [3] and the one presented in this paper, respectively. This result is in agreement with the SM prediction.
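For two measurements, the BLUE combination reduces to closed-form weights. The sketch below is a generic two-measurement BLUE, not the combination machinery actually used; the example inputs (total uncertainties added in quadrature, zero correlation) are a rough simplification, yet they already reproduce weights close to the quoted 27%/73% split.

```python
def blue_combine_two(x1, s1, x2, s2, rho):
    """Best linear unbiased estimator for two measurements x1 +- s1 and
    x2 +- s2 with correlation coefficient rho between their uncertainties."""
    c12 = rho * s1 * s2                      # covariance term
    denom = s1 ** 2 + s2 ** 2 - 2.0 * c12
    w1 = (s2 ** 2 - c12) / denom             # weight of measurement 1
    w2 = (s1 ** 2 - c12) / denom             # weights sum to 1
    combined = w1 * x1 + w2 * x2
    variance = w1 ** 2 * s1 ** 2 + w2 ** 2 * s2 ** 2 + 2.0 * w1 * w2 * c12
    return combined, variance ** 0.5, (w1, w2)


# Illustrative inputs: 68.9 pb with a 13% total uncertainty (2015 l+jets),
# and 60.7 pb with its stat/syst/lumi uncertainties summed in quadrature.
# rho = 0 is a simplification: some sources are in fact correlated.
value, unc, (w1, w2) = blue_combine_two(68.9, 0.13 * 68.9,
                                        60.7, (5.0**2 + 2.8**2 + 1.1**2) ** 0.5,
                                        0.0)
```

With nonzero rho the less precise measurement can even acquire a negative weight, which is why the correlation assumptions are varied as a robustness check.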
The combined result is found to be robust by performing an iterative variant of the BLUE method [47] and varying some assumptions on the correlations of different combinations of systematic uncertainties. Also, the correlations between the nuisance parameters in both channels have been checked and found to have a negligible impact. Figure 2 presents a summary of the CMS measurements [2, 6, 7, 10, 11] of σ tt in pp collisions at different √ s in the +jets and dilepton channels, including the one presented in this paper, compared to the NNLO+NNLL prediction using the NNPDF3.1 NNLO PDF set with α S (m Z ) = 0.118 and m t = 172.5 GeV. In the inset, the results from this analysis at √ s = 5.02 TeV are also compared to the predictions from the MSHT20 [48], CT18 [49], and ABMP16 [50] NNLO PDF sets, with the latter using α S (m Z ) = 0.115 and m t = 170.4 GeV. Theoretical predictions using different PDF sets have comparable values and uncertainties, once consistent values of α S (m Z ) and m t are associated with the respective PDF set.
The impact of the combined σ tt measurement at √ s = 5.02 TeV on the knowledge of the proton PDFs is tested following the MC methodology of Ref.

Summary
A measurement of the top quark pair production cross section at a center-of-mass energy of 5.02 TeV is presented for events with one electron and one muon of opposite charge, and at least two jets, using proton-proton collisions collected by the CMS experiment in 2017 and corresponding to an integrated luminosity of 302 pb −1 . The measured cross section is found to be σ tt = 60.7 ± 5.0 (stat) ± 2.8 (syst) ± 1.1 (lumi) pb. A combination with the single lepton + jets measurement, using a data set collected in 2015 at the same center-of-mass energy and corresponding to an integrated luminosity of 27.4 pb −1 , is performed. A measurement of 63.0 ± 4.1 (stat) ± 3.0 (syst+lumi) pb is obtained, in agreement with the prediction from the standard model of 66.8 +2.9 −3.1 pb.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the follow-