Search for the production of four top quarks in the single-lepton and opposite-sign dilepton final states in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search for the standard model production of four top quarks (pp $\to$ $\mathrm{t\bar{t}t\bar{t}}$) is reported using single-lepton plus jets and opposite-sign dilepton plus jets signatures. Proton-proton collisions are recorded with the CMS detector at the LHC at a center-of-mass energy of 13 TeV in a sample corresponding to an integrated luminosity of 35.8 fb$^{-1}$. A multivariate analysis exploiting global event and jet properties is used to discriminate $\mathrm{t\bar{t}t\bar{t}}$ from $\mathrm{t\bar{t}}$ production. No significant deviation is observed from the predicted background. An upper limit is set on the cross section for $\mathrm{t\bar{t}t\bar{t}}$ production in the standard model of 48 fb at 95% confidence level. When combined with a previous measurement by the CMS experiment from an analysis of other final states, the observed signal significance is 1.4 standard deviations, and the combined cross section measurement is 13 $^{+11}_{-9}$ fb. The result is also interpreted in the framework of effective field theory.


Introduction
Many models of physics beyond the standard model (BSM) predict enhanced or modified couplings of top quarks to other particles. This is particularly relevant for processes that have small production cross sections and, therefore, are yet to be observed, such as the production of four top quarks, tttt. There is considerable interest in the measurement of the tttt cross section because of its sensitivity to BSM physics, including supersymmetry [1,2], composite models [3], top quark compositeness [4], two-Higgs-doublet models [5][6][7], and models with extra spatial dimensions [8,9]. Within the effective field theory (EFT) framework, the contribution of any BSM process to tttt production can be parameterized in terms of nonrenormalizable effective couplings of the standard model (SM) fields, if the characteristic energy scale, Λ, of the BSM physics is much larger than the typical energy scale of tttt production at the LHC. A generic interpretation of the tttt production can be done using the EFT predictions [10].
The production of four top quarks from proton-proton (pp) interactions pp → tttt has not yet been observed. The SM predicts a cross section, at next-to-leading order (NLO), with electroweak corrections (EWK), of σ SM tt tt of 12.0 fb at the center-of-mass energy of 13 TeV [11]. To facilitate comparison with published ATLAS and CMS analyses using comparable data sets, the NLO quantum chromodynamics (QCD) calculation with a value of σ SM tt tt = 9.2 fb is used [12,13]. Consequently, the experiments at the CERN LHC may be just approaching sensitivity to the process, provided that it can be separated from the overwhelming background from SM tt events. The lowest-order Feynman diagrams illustrating typical contributions to SM four top quark production in pp collisions are shown in Fig. 1. This paper presents a new search in the single-lepton (SL) (µ, e)+jets and opposite-sign dilepton (DL) (µ + µ − , µ ± e ∓ , or e + e − )+jets tttt decay channels using pp collisions at 13 TeV collected by the CMS experiment in 2016 and corresponding to an integrated luminosity of 35.8 fb −1 . For this analysis, only final states containing one or two leptons are considered, which constitute about 40% of all tttt decays. Compared to the previous analysis [20], we have implemented a number of important changes which combine to give a much improved analysis sensitivity. The training process and selection of the input variables for the event-discriminating MVA's (Section 4.2) in both the SL and OS dilepton channels has been re-optimized. A new categorization of the signal sensitive regions at large jet and b-tag multiplicities has been introduced, and a revised binning scheme is used to decrease the statistical uncertainties, and improve the signal sensitivity. The categorisation provides additional discrimination against the rare tt+boson (H, Z, W, WW/WZ/ZZ) backgrounds. Lastly, a much larger simulated tt data set is used to populate the discriminant bins with high jet multiplicity and high b-tag multiplicity.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage (η) provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid, in the range |η| < 2.4. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [24].

Simulated samples
The acceptance for the SM pp → tttt process is estimated using samples simulated at NLO precision by the MADGRAPH5 aMC@NLO 2.2.2 generator [13,25]. Only diagrams arising from quantum chromodynamics interactions were taken into account in the simulation. The cross section used to normalize the simulation is the NLO calculation of 9.2 +2.9 −2.4 fb [13], where the quoted uncertainty incorporates the variation of factorization and renormalization scales used in the calculation of the matrix elements (ME), and the dependence on the choice of parton distribution functions (PDFs). The signal model includes MADSPIN [26] and uses the default dynamic scale choice in MADGRAPH5 aMC@NLO, defined as µ R,F = 1 2 Σ t m T . This is the sum of m T over each outgoing parton (the four top quarks), divided by two, where m T = m 2 + p 2 T , in which m is the mass of the parton, and p T is the transverse momentum.
The most important background process is top quark pair production with additional jets (tt +jets), that comprises over 90-95% of the background. Next in importance are single top (ST) quark processes including t-channel and tW production. These are followed by Z+jets and W+jets electroweak processes (EW), where only the leptonic decay modes of the bosons are considered. Next are rare processes involving the production of a top quark-antiquark pair and a Z, W, or Higgs bosons, namely, tt+Z,W,H. Finally, tt production in association with dibosons, ttWW, ttWZ, ttZZ, ttWH, ttZW, ttHH, and triple top quark production (ttt+jets and tttW) are considered, processes we collectively denote as ttXY. Based on their signature resemblance and comparability of production rates to the tttt signal, tt+Z and tt+H are grouped together while tt+W and ttXY are grouped together in the simulation.
Several Monte Carlo (MC) event generators are used to simulate these processes. The tt +jets process is simulated using the POWHEG-BOX v2 generator [27][28][29][30][31] at NLO accuracy for the tt ME, but the tt cross section is normalized to its predicted value at next-to-next-to-leading order (NNLO), which includes soft-gluon corrections, at next-to-next-to-leading-logarithm accuracy, obtained with TOP++ 2.0 [32][33][34][35][36][37][38]. The POWHEG-BOX simulations are interfaced with PYTHIA 8.212 using the CUETP8M2T4 tune [39][40][41]. Recent calculations [38] suggest that nextto-next-to-leading-order effects have an important consequence on the shape of the top quark p T spectrum that NLO ME generators are unable to reproduce. To allow for this, a parton-level reweighting of the tt simulation has been applied to match the predictions to the data [42,43]. The correction is applied as a function of the transverse momenta of the parton-level top quark and antiquark after initial-and final-state radiation. Specifically for this result, additional dedicated samples were created that populate the tails in high multiplicity with a factor of 10 more events.
Single top quark tW processes are simulated with the POWHEG-BOX v1 generator [44], while tchannel processes are simulated with POWHEG-BOX v2. Both are interfaced with PYTHIA 8.212 using the CUETP8M2T4 tune, with the cross sections normalized to the NLO calculations [45,46]. The analysis has been shown [20] to be insensitive to other ST quark production processes, such as s-channel production.
Events with massive gauge bosons and no top quarks (Z+jets, W+jets) are simulated using MADGRAPH5 aMC@NLO [13] at leading-order (LO) accuracy, with up to four additional partons in the ME calculations, and using the MLM matching scheme [47]. The tune CUETP8M1 is used for the parton shower (PS) and underlying event (UE) modeling. These samples are normalized to their NNLO cross sections [48].
The production of a tt pair in association with a W, Z and up to one extra parton is simulated using the MADGRAPH5 aMC@NLO generator at LO accuracy and matched with the PS predictions using the MLM matching scheme. Top quark pair production in association with a Higgs boson, ttH, is modeled using POWHEG-BOX v2, interfaced with PYTHIA 8.212 with the CUETP8M2T4 tune. In this sample, only the dominant H → bb decays are taken into account. These three samples are normalized to the NLO cross sections [49]. Top quark pair production in association with one or two massive bosons is simulated using the LO ME in the MAD-GRAPH5 aMC@NLO generator, and the CUETP8M2T4 tune of PYTHIA 8.212 to provide the PS. The cross sections are scaled to their LO values [49].
For the samples with NLO MEs, the NNPDF3.0NLO [50] PDFs are used, while for LO MEs, the corresponding NNPDF3.0LO PDFs are used. The parton shower, hadronization, and underlying event models implemented in PYTHIA 8.212 [51] are used to simulate higher-order processes and nonperturbative aspects of pp collisions. The NLO simulations use strong coupling constant values of α S (M Z ) = 0.137 and α S (M Z ) = 0.113 for the ME and PS modeling, and the LO simulations use α S (M Z ) = 0.130 for the ME. In all simulations involving the top quark, a mass m t of 172.5 GeV is used.
The PYTHIA CUETP8M2T4 tune [39][40][41] currently provides the best description of the tt data [52,53]. The POWHEG-BOX calculation describes the high-multiplicity tail when this tune is used. The uncertainties cover the differences due to alternative choices of the PS and hadronization models [54].
All of the simulated samples include an estimate of the additional pp interactions per bunch crossing (pileup), modeled with the PYTHIA 8.212 program. Corrections are applied to make the simulation of the number of additional interactions representative of that observed in the data. The simulated events are propagated through a simulation of the CMS detector based on GEANT4 (v.9.4) [55] and reconstructed using the same algorithms as for the collider data.

Event selection
The final states considered in this analysis are the single-lepton channel with exactly one muon or electron, (µ, e)+jets, and the opposite-sign dilepton channel, (µ + µ − , µ ± e ∓ , e + e − )+jets. In all cases, the leptons are expected to originate from the W bosons arising from top quark decays and thus tend to be isolated, unlike the leptons produced in the decay of unstable hadrons within jets.
Single-lepton events were recorded using a trigger [56] that required at least one isolated muon with p T > 24 GeV and |η| < 2.4, or one isolated electron with p T > 32 GeV and |η| < 2.1. Dilepton events were recorded using either single-lepton or dilepton triggers. In the case of dilepton triggers, the p T thresholds for the leading and subleading leptons for the dimuon triggers are 17 and 8 GeV, respectively, 23 and 12 GeV for dielectron triggers, and 23 and 8 GeV for muon-electron triggers, regardless of lepton flavor. Dilepton triggers require |η| < 2.4 for muons and |η| < 2.5 for electrons. The single-lepton triggers were also used in the dilepton channel to increase the efficiency, while retaining the orthogonality of the selections addressing the two final states.
Offline event reconstruction uses the CMS particle-flow (PF) algorithm [57] for particle reconstruction and identification. Single-lepton events are required to have exactly one isolated muon with p T > 26 GeV or one isolated electron with p T > 35 GeV, either within |η| < 2.1. In the dilepton channel, events are required to contain exactly two isolated leptons of opposite sign with p T > 25 GeV for the leading and p T > 20 GeV for the subleading lepton, within |η| < 2.4. Muons must satisfy the criteria described in Ref.
[58] and have a relative isolation, I rel < 0.15. Electron candidates must satisfy stringent identification criteria, including I rel , which are described in Ref. [59]. The I rel is defined as the scalar p T sum of the additional particles consistent with the same vertex as the lepton, within a cone of angular radius ∆R = √ (∆η) 2 + (∆φ) 2 = 0.4 around the lepton, divided by the p T of the lepton, where ∆η and ∆φ (in radians) are the differences in pseudorapidity and azimuthal angle, respectively, between the directions of the lepton and the additional particle. The sum is corrected for the neutral particle contribution from pileup on an event-by-event basis [58,59]. To suppress background events from decays of low-mass resonances and Z bosons, the lepton pairs are required to have an invariant mass greater than 20 GeV and be outside of a 30 GeV window centered on the Z boson mass in both the µ + µ − and e + e − channels. Events containing additional muons with looser relative isolation, I rel < 0.25, or isolated electrons are vetoed.
Each event is required to contain at least one reconstructed vertex. The reconstructed vertex with the largest value of the quadratic sum of the p T of its associated tracks is considered the primary pp interaction vertex. Jets are reconstructed from the PF candidates using the infrared-and collinear-safe anti-k T algorithm [60, 61] with a distance parameter of 0.4. Pileup interactions can contribute tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, charged particles identified as originating from pileup vertices are discarded and the jet is corrected for the remaining contributions [62, 63]. Jet energy corrections are derived from simulations to bring the measured response of jets to that of particle level jets on average. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to account for any residual differences in jet energy scale between real and simulated data [64]. The jet energy resolution is typically 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV. The missing transverse momentum vector p miss T is computed as the negative vector sum of p T of all the PF candidates in an event, and its magnitude is denoted as p miss A minimum of seven jets for the single-muon and eight jets for the single-electron channel are required, each of which must have p T > 30 GeV and |η| < 2.5. The difference in the jet multiplicity is motivated by the need to reduce the residual contamination from multijet QCD background in the electron channel due to a higher lepton misidentification rate. In the selected events, at least two jets must be tagged as originating from the hadronization of bottom quarks (b jets) using the combined secondary vertex (CSVv2) algorithm at its medium working point [66]. Additional b jet candidates are identified using the CSVv2 algorithm at its loose working point. The two working points, loose and medium, provide different levels of purity and efficiency. The loose working point gives a misidentification rate of approximately 10% for light-quark and gluon jets, with a b tagging efficiency of about 80%. The medium working point has a misidentification rate of about 1% with a b tagging efficiency of about 68%. The efficiency to tag c quarks is 12%. To suppress the small residual QCD background, p miss T is required to be larger than 50 GeV. Studies on the estimation of non-prompt leptons from QCD multijet background by inverting lepton isolation selection criteria have verified that this background is negligible after applying the selection requirements. In addition, a requirement on the scalar sum of the p T of all jets, H T > 500 GeV, is applied. The H T requirement is used to suppress the tt background, while having little effect on the signal acceptance [20].
In the dilepton channels, a minimum of four jets is required, each with |η| < 2.4. Of these, at least two must be b-tagged using the same CSVv2 algorithm with medium working point as was used in the single-lepton channel. While the p T threshold for non-tagged jets is 30 GeV (as for the single-lepton channel), the threshold for b-tagged jets is lowered to 25 GeV to increase the acceptance for events with multiple b jets. The H T > 500 GeV requirement is also applied to the dilepton channels.

Multivariate discriminants
Boosted decision trees (BDTs) [68,69] are used in two roles in this analysis: to identify the top quarks and to improve the discrimination between signal and background. The jet multiplicity, jet properties and the number of the b jets, as well as associated kinematic variables, feature strongly in the choice of BDT input variables. The method is based on the strategies developed for the previous 13 TeV CMS analyses in the single-lepton and opposite-sign dilepton final states [20]. All BDTs are trained using the ADABOOST algorithm [70], as implemented in the TMVA package [71], and return a discriminant as output.
The BDT for identifying hadronically decaying top quarks classifies combinations of three jets (trijet) on how consistent they are with the trijet originating from the all-hadronic decay of a top quark, rather than from other sources such as initial-state radiation (ISR) or final-state radiation (FSR). Its input variables consist of the invariant dijet and trijet masses, the b tagging information for the jet not associated to the dijet, and the angles between the three jets. This BDT is trained to distinguish between the three jets from a hadronically decaying top quark and any other permutation of 3-jet combinations using the ME information in tt+jets simulations.
Because of the high jet multiplicity in both signal and background events, many three-jet combinations are possible. The trijet permutations for each event are ranked according to their discriminant value, from highest to lowest. In the single-lepton channel, each tt background event contains a genuine hadronic top quark decay, so the jets included in the first-ranked trijet (T trijet1 ) are removed and the highest-ranked discriminant using the remaining jets (T trijet2 ) is used. In the dilepton channels, the tt background contains no hadronic top quark decays, so only the output for T trijet1 is used as the discriminant.
The BDTs, yielding the discriminants for the single-lepton channel (D SL tt tt ) and for the dilepton channel (D DL tt tt ), use the discriminant from the trijet associations, described above, as one of its input variables. In the single-lepton channel, D SL tt tt is trained separately for each jet multiplicity, and inclusively over the number of b-tagged jets. In the dilepton channel, the training is done unitarily for all jet multiplicities while separately in µ + µ − , µ ± e ∓ , and e + e − states. The choice of input variables is optimized separately for the two channels and is based on the characteristics of the lepton and jet activity in the events. The resulting variable lists are different for the two channels. The variables can be grouped into three categories: event activity, event topology, and b quark multiplicity. Although many of the input variables are correlated, each one contributes some additional discrimination between the tt background and the tttt signal.
Studies of the differences between the simulated tt and tttt events have led to the selection of the following variables describing the hadronic activity in the event: 1. The number of jets present in the event, N j . 5. The transverse momenta of the jets with the third-and fourth-largest p T in the event, p j3 T and p j4 T . 6. The reduced event mass, M h red , defined as the invariant mass of the system comprising all the jets in the reduced event, where the reduced event is constructed by removing the jets contained in T trijet1 in single-lepton events. In tt events, the reduced event will typically only contain the b jet from the semileptonic top quark decay and jets arising from ISR and FSR. Conversely, a reduced tttt event can contain up to two hadronically decaying top quarks and, as a result, a relatively high reduced event mass.
7. The reduced event H T , H x T , is defined as the H T of all jets in the single-lepton event selection excluding those contained in T trijet1 .
The event topology is characterized by the two variables: 1. Event sphericity, S, [72], calculated from all of the jets in the event in terms of the normal- where α and β refer to the three-components of the momentum of the ith jet. The sphericity is defined as S = (3/2)(λ 2 + λ 3 ), where λ 2 and λ 3 are the two smallest eigenvalues of M αβ . The sphericity in tttt events should differ from that in background tt events of the same energy, since the jets in tt events will be less isotropically distributed because of their recoil from sources such as ISR.
2. Hadronic centrality, C, defined as the value of H T divided by the sum of the energies of all jets in the event.
Since all these variables rely only on the hadronic information in the event, sensitivity to the lepton information is provided through the p T and η of the highest p T lepton (or the only lepton for the single-lepton channel) p 1 T , η 1 and the angular difference (∆R ) between the leptons in dilepton events. The b jet multiplicity is characterized in terms of the number of b jets tagged by the CSVv2 algorithm operating at its loose (N l tags ) and medium (N m tags ) operating points, and the angular separation ∆R bb between the b-tagged jets with the highest CSVv2 discriminants. Finally, the third-and fourth-highest b tagging discriminant values are used as they allow separation between tt +light jets, and genuine additional heavy-flavor jets, as present in tttt events.
The training variables were not changed as a function of final state or jet multiplicity. In the single-lepton channel, the optimal variable set, listed in the order of their discriminating power, the third-and fourth-highest CSVv2 discriminants, and the p T of those tagged jets; the p T for the first, second, fifth, and sixth jet. In the dilepton channel, the optimal variable set, listed in the order of their discriminating power, is T , p 1 T and η 1 . The MC modeling of the individual observables utilized in the discriminants D SL tt tt and D DL tt tt was verified using samples of tt events and found to be in agreement with the data for all the jet and b jet multiplicities.

Systematic uncertainties
The systematic uncertainties that affect this analysis can change the shape, or the normalization, or both, of the D SL tt tt and D DL tt tt discriminants. The uncertainties are characterized in Table 1. Each of the systematic uncertainty sources is modeled by one nuisance parameter. The normalization-dependent terms account for the uncertainties in the background yields, while the effect of the shape-dependent terms is evaluated using discriminant distributions whose shape has been modified by each of the uncertainties.
The experimental uncertainties considered are: • Integrated luminosity: A 2.5% normalization uncertainty on the integrated luminosity [73].
• Pileup modeling: The number of pileup events in the simulation is matched to that of the data. The uncertainty due to this correction is estimated by using two sets of alternative weights derived with a variation of ±4.6% on the total inelastic pp cross section [74].
• Lepton reconstruction and identification: The uncertainties in lepton identification, isolation, trigger efficiencies, and tracking efficiencies were examined. After a comparison between data and simulations, we assign a normalization uncertainty of 3% to take into account these effects.
Pred. Pred.    • b tagging: The uncertainty in the b tagging discriminant shape is estimated by varying the shape of the discriminant distribution according to its one standard deviation uncertainties in terms of the p T , η, and flavor of the jets [67]. The variations correspond to uncertainties in the jet energy scale, background contamination of the samples used to derive them, and statistical uncertainties of these data samples.
Sources of systematic uncertainties originating from theory are listed below.
Events / 100 GeV 1 − Pre-fit unc.  • Renormalization and factorization scales: In order to estimate the uncertainty arising from missing higher-order terms in the calculation of the signal and background cross sections, renormalization and factorization scales are each modified, independently, up and down by a factor of two relative to their nominal values. The cases in which the two scales are varied in opposite directions are excluded. This is estimated for both the tt and tttt processes.
• Parton shower scales: The evolution scales in the initial-and final-state PSs are separately varied by a factor of 2 and √ 2, respectively, up and down relative to their nominal values, in order to estimate the uncertainty attributed to the shower model. This is estimated for both the tt and tttt processes.
• ME-PS matching: The uncertainty resulting from this source is estimated by varying the POWHEG-BOX PS scale parameter, h damp , that controls the ME and PS matching Pred. Pred.

fb CMS
Pre-fit unc.  . This is estimated for the tt process. • Underlying event: The uncertainty from the UE tune of tt event generator is evaluated by using simulations with varied parameters that are related to the CUETP8M2T4 tune [39]. This is estimated for the tt process.

fb CMS
• Jet multiplicity correction: The modeling of tt +jets production in POWHEG-BOX is insufficient to describe the data in the regions of large jet multiplicity. To allow for this, scale factors are determined from fits to the single-lepton data in the signal depleted regions (N j = 8,9, and N m tags = 2,3), and propagated to the signal sensitive regions. The scale factors determined in the single-lepton channel are also used in the dilepton channel taking into account the difference in the jet multiplicity between

fb CMS
Pre-fit unc.  • Parton distribution functions: The PDF uncertainty [75] in tt production is estimated by evaluating the shape difference between the nominal simulation and simulations based on the NNPDF [50], MMHT14 [76], and CT10 [77] PDF sets. This is estimated for the tt process.
• Top quark p T reweighting: The tt simulation is corrected to match the observed spectra [78,79]. The uncertainty from the corrections made to the shape of the top quark p T distribution is estimated by allowing the correction function to vary within a ±1 standard deviation uncertainty. This is estimated for the tt process. • Heavy-flavor reweighting: To correctly model the rate of additional heavy-flavor jets in tt production, the uncertainty in the rate of tt+bb is taken from the ±1 standard deviation uncertainty in the measured value [80]. This is estimated for the tt process. As a cross-check, an independent uncertainty on tt+cc production was added. The resulting effect on the expected sensitivity of the search was found to be negligible.
• Rare processes: Uncertainties from the cross sections of rare processes of tt pair production in association with one or two massive gauge bosons and triple top quark production are taken into account by allowing them to vary within 50% of their SM value [21].
The simulated samples used to evaluate the PS, ME-PS and UE uncertainties are statistically limited, so these uncertainties are estimated conservatively by assigning the larger value between the statistical uncertainty of these simulated samples and the rate change of these simulated sample from the nominal simulation as uncertainty, independently for different jet multiplicities.

Results
A simultaneous binned maximum-likelihood template fit to the single-lepton, dilepton, and combined experimental results was used to determine the signal strength parameter, which is defined as the ratio of the observed and predicted SM tttt cross sections, µ = σ obs tt tt /σ SM tt tt . To increase the sensitivity of the analysis, events are categorized depending on their jet and btagged jet multiplicities. In the single-lepton channel these categories are: N j = 7, 8, 9, and ≥10 and N m tags = 2, 3, and ≥4 in each jet multiplicity region. In the dilepton channel these are N j = 4-5, 6-7, and ≥8 and N m tags = 2, and ≥3 in each jet multiplicity region. In each category the binning was chosen to ensure at least 4 predicted background events per bin.
The likelihood function incorporates each of the systematic uncertainties in the signal and background D DL tt tt and D SL tt tt templates as nuisance parameters in the fit. The systematic uncertainties attributed to the trigger or specific to the jet or lepton reconstruction were treated as fully correlated among the different final states. The normalization uncertainties are included assuming a log-normal distribution for the nuisance parameters, while the shape uncertainties are included as Gaussian-distributed parameters.
All of the post-fit nuisance parameter values were found to be consistent with their initial values to well within their quoted uncertainties, indicating the consistency of the fit model with the observed data. Two of the post-fit nuisance parameters are significantly constrained by the fit. These correspond to the heavy-flavor reweighting and initial-state parton-shower radiation scale, which are reduced by 65% and 30%, respectively. The sensitivity of the analysis is affected almost equally by the statistical uncertainty and the combined systematic uncertainties. The leading sources of systematic uncertainty are the tt+heavy-flavor production reweighting, the jet multiplicity correction, and the PS and UE modeling in tt simulation.      − Data Pre-fit unc. Post-fit unc.  − Data Pre-fit unc. Post-fit unc.

fb
Pre-fit unc. Post-fit unc.  Table 2: Maximum-likelihood signal strength, µ, and cross section estimates, as well as the expected and observed significance of SM tttt production. Both µ and σ tt tt are constrained to be positive. The results for the two analyses from this paper are shown separately and combined. The results from a previous CMS multilepton measurement are also given [21]. The values quoted for the uncertainties on the signal strengths and cross sections are the one standard deviation (s.d.) values and include all statistical and systematic uncertainties. The expected significance is calculated assuming that the data are distributed according to the prediction of the SM with nominal tttt production cross section value σ SM tt tt , which corresponds to the assumed signal strength modifier value µ = 1.

Channel
Best and the best fit value of the signal strength parameter are given together with the tttt cross section in Table 2. In order to quantify the experimental sensitivity of the search, the median expected significance is calculated assuming that the data are distributed according to the SM prediction with a nominal tttt production cross section value σ SM tt tt , corresponding to the signal strength modifier value µ = 1. An upper limit on the tttt production cross section is derived using the asymptotic approximation of the CL s method [81][82][83][84][85]. The observed and expected 95% confidence level (CL) upper limits from the two analyses and their combination are listed in Table 3. The expected upper limit on the tttt production is calculated under assumption of a background-only hypothesis, corresponding to the signal strength modifier µ = 0.

Combination with the same-sign dilepton and multileptons channels
An independent search for the SM tttt production has been performed previously in same-sign dilepton and multilepton channels [21]. This search is characterized by a different background composition which, in contrast to the single-lepton and opposite-sign dilepton searches, is composed of mainly the ttZ and ttW processes. In order to exploit the complementarity of this analysis, a combination of the results from single-lepton, opposite-sign, same-sign and multilepton channels has been performed. The combination is based on a binned likelihood function equal to a product of likelihood terms over all search regions considered in single-lepton, opposite-sign and same-sign dilepton and multilepton channels.
Because of different origins of the dominant background processes, the main systematic uncertainties in the three analyses are independent and can be treated as uncorrelated. Nevertheless, the stability of the combination with respect to the assumption on the correlations between common sources of systematic uncertainty was tested by repeating the fit with and without correlations between the corresponding nuisance parameters. The resulting changes in the signal strength and expected limit were found to be less than 1% of the corresponding total uncertainties and were therefore not included. The combined expected and observed 95% CL upper limits on the tttt production are 20 + 10 − 6 fb and 33 fb, respectively, which is about a 10% improvement on the precision of the measurement with respect to the multilepton analysis alone. A summary of upper limit determinations from the individual analyses and their combination is provided in Table 3.

Effective field theory interpretation
New physics may manifest itself as modified interactions of SM fields, even if the associated particles are too heavy to be directly probed at the LHC. Such interactions can be modeled by extending the SM Lagrangian with terms involving composite operators of SM fields. Assuming that these terms preserve the gauge symmetries of the SM, possible new interactions can be classified according to their scaling dimension and the SM fields content [86][87][88]. The EFT Lagrangian reads where L SM is the SM Lagrangian, while O (n) k and C (n) k denote dimension-n (dim-n) composite operators and their coupling parameters, respectively. Each term in the sum is suppressed by Λ n−4 , where Λ is an energy scale that characterizes the new physics and n is the scaling dimension of the corresponding operator. The energy scale, Λ, is the scale below which onshell effects of BSM physics can be neglected and is typically related to the mass scale of the hypothetical BSM states. The EFT approach is generic and, in principle, experimental constraints obtained within the EFT framework can be recast into bounds on parameters of any ultraviolet-complete new physics model.
The production of four top quarks is a unique signature that provides information about models that predict enhanced interactions of the third generation quarks, such as four-fermion tttt coupling. The dim-5 operators do not contribute to tttt production because they do not couple to top quarks [89]. A minimal basis of composite dim-6 operators contributing in Eq. (1) was derived in Ref. [87]. Only a small subset of these operators lead to four top quark production at LO in the EFT perturbation series. In a restricted scenario [10,90], assuming that new physics couples predominantly to the left-handed doublet and right-handed up-type quark singlet of the third generation, only four operators are expected to contribute significantly to tttt production, namely, where Q L and t R denote the left-handed third generation quark doublet and the right-handed top quark singlet, respectively. The 4-fermion ttbb operators were not included because of the negligible b quark parton density in the proton. Leading order predictions for the pp → tttt cross section can be parameterized using the equation where the linear terms, C k σ (1) k , represent the interference of the SM production with the dim-6 EFT contribution, while the quadratic terms include two components: the square of the diagrams containing one EFT operator, and the interference term for two diagrams, each with one EFT operator. Representing C k as a column-vector, C, Eq. (3) can be expressed in a matrix form as In order to find σ (1) and σ (2) , a system of linear equations has to be solved. It is obtained by substituting linearly-independent vectors C into Eq. (4). In the cross section calculation, the EFT interactions are implemented in the FEYNRULES [90,91] package and interfaced with MADGRAPH5 aMC@NLO [13]. The NNPDF3.0LO [50] PDF set and α S (M Z ) = 0.138 were used in the calculation. In the EFT predictions, the SM contribution, σ SM tt tt in Eqs. (3) and (4), was rescaled to the NLO cross section of 9.2 fb for the collision energy of 13 TeV. The linear and quadratic coefficients, σ (1) k and σ (2) j,k , in Eq. (3) can be found in Table 4. Table 4: Linear (left) and quadratic (right) parameterization coefficients, σ (1) k and σ (2) j,k , of Eq. (3). The coefficients σ (1) k are in units (fb TeV 2 ), while the coefficients σ (2) j,k are in units (fb TeV 4 ). The observed limit of 3.6σ SM tt tt , with a corresponding expected limit of 3.2σ SM tt tt (assuming µ = 1), from the combined experimental results, is used to constrain possible contributions of EFT operators. Since the data are only sensitive to the ratios, C k /Λ 2 , the constraints are presented only for such ratios. In the limit setting, SM kinematics of the tttt final state were assumed and only rate information was utilized to calculate the constraints. Besides the NLO scale uncertainty from the SM tttt NLO prediction, no further scale uncertainties were added because other uncertainties on tttt production are already included in the experimental limit.
Independent limits were obtained under the assumption that only one operator contributes to the tttt cross section with the coefficients of the other operators set to zero. The intervals obtained are summarized in Table 5. More conservative estimates were obtained by marginalizing the contribution of other operators within the interval C k /Λ 2 ∈ [−4π, 4π], defined by the stability of perturbation series. The corresponding limits are listed in Table 6. The results obtained are only slightly weaker than independent constraints because of the small correlations between the operators.

Summary
A search for standard model tttt production has been performed in final states with one or two oppositely signed muons or electrons plus jets. The observed yields attributed to tttt production are consistent with the background predictions. An upper limit at 95% confidence level of 48 fb is set on the cross section for tttt production. Combining this result with a previous same-sign dilepton and multilepton search [21] the resulting cross section is 13 + 11 − 9 fb with an observed significance of 1.4 standard deviations. The combined result constitutes one of the most stringent constraints from CMS on the production of four top quarks and can be used for phenomenological reinterpretation of a wide range of new physics models. The experimental results are interpreted in the effective field theory framework and yield limits on dimension-6 four-fermion operators coupling to third generation quarks competitive with the latest ATLAS interpretation [18].

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. [13] J. Alwall et al., "The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations", JHEP 07 (2014) 079, doi:10.1007/JHEP07(2014)079, arXiv:1405.0301.
[54] CMS Collaboration, "Investigations of the impact of the parton shower tuning in Pythia 8 in the modelling of tt at √ s = 8 and 13 TeV", CMS Physics Analysis Summary CMS-PAS-TOP-16-021, 2016.