Search for direct top squark pair production in events with one lepton, jets, and missing transverse momentum at 13 TeV with the CMS experiment

A search for direct top squark pair production is presented. The search is based on proton-proton collision data at a center-of-mass energy of 13 TeV recorded by the CMS experiment at the LHC during 2016, 2017, and 2018, corresponding to an integrated luminosity of 137 fb$^{-1}$. The search is carried out using events with a single isolated electron or muon, multiple jets, and large transverse momentum imbalance. The observed data are consistent with the expectations from standard model processes. Exclusions are set in the context of simplified top squark pair production models. Depending on the model, exclusion limits at 95% confidence level for top squark masses up to 1.2 TeV are set for a massless lightest supersymmetric particle, assumed to be the neutralino. For models with top squark masses of 1 TeV, neutralino masses up to 600 GeV are excluded.


Introduction
Supersymmetry (SUSY) [1][2][3][4][5][6][7][8] is an attractive extension of the standard model (SM), characterized by the presence of SUSY partners for every SM particle. These partner particles have the same quantum numbers as their SM counterparts, except for the spin, which differs by one-half unit. In models with R-parity conservation [9], the lightest supersymmetric particle (LSP) is stable and, if neutral, could be a dark matter candidate [10]. The extended particle spectrum in SUSY scenarios allows for the cancellation of quadratic divergences arising from quantum corrections to the Higgs boson mass [11][12][13][14][15]. Scenarios realizing this cancellation often contain top squarks ($\tilde{t}$), SUSY partners of the SM top quark (t), and higgsinos, SUSY partners of the SM Higgs boson, with masses near the electroweak scale. The $\tilde{t}$ pair production cross section is expected to be large at the CERN LHC.
In this paper, a search is presented for top squark pair production in events from pp collisions at $\sqrt{s} = 13$ TeV, collected between 2016 and 2018 by the CMS experiment, corresponding to an integrated luminosity of 137 fb$^{-1}$. Two top squark decay modes are considered: the decay to a top quark and the lightest neutralino ($\tilde{\chi}_1^0$), which is taken to be the LSP, or the decay to a bottom quark (b) and the lightest chargino ($\tilde{\chi}_1^\pm$). In the latter scenario, it is assumed that the $\tilde{\chi}_1^\pm$ decays to a W boson and the $\tilde{\chi}_1^0$. The mass of the chargino is chosen to be $(m_{\tilde{t}} + m_{\tilde{\chi}_1^0})/2$. The corresponding diagrams are given in Fig. 1. The common experimental signature for pair production with these decay modes is $\mathrm{WW}^{(*)} + b\bar{b} + \tilde{\chi}_1^0 \tilde{\chi}_1^0$. The analysis is based on events where one of the W bosons decays leptonically and the other hadronically. This results in the event selection of one isolated lepton, at least two jets, and large missing transverse momentum ($p_T^{\text{miss}}$) from the two neutralinos and the neutrino.
Dedicated searches for top squark pair production in 13 TeV proton-proton (pp) collision events have been carried out by both the ATLAS [16][17][18][19][20][21][22][23][24][25] and CMS [26][27][28][29][30][31][32][33][34][35][36][37][38] Collaborations. The search presented here improves on the previous one [29] by adding the data collected in 2017 and 2018, resulting in approximately a factor of four increase in the size of the data sample. In addition, new search regions have been added, which are sensitive to scenarios where the mass of the top squark is close to the sum of the masses of either the $\tilde{\chi}_1^0$ and the top quark, or the $\tilde{\chi}_1^0$ and the W boson. These scenarios are referred to as compressed mass scenarios hereafter. Furthermore, a method has been implemented to identify top quarks that decay hadronically, and the background estimation techniques have been improved. The paper is organized as follows: Sections 2 and 3 describe the CMS detector and the simulated samples used in this analysis. The object reconstruction and search strategy are presented in Section 4. The background prediction methods are described in Section 5, and the relevant systematic uncertainties are discussed in Section 6. Results and interpretations are detailed in Section 7, and a summary is presented in Section 8.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
Events of interest are selected using a two-tier trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events in a fixed time interval of less than 4 µs. The second level, called the high-level trigger, further decreases the event rate from around 100 kHz to less than 1 kHz before data storage. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Refs. [39,40]. The pixel tracker was upgraded before the start of the data-taking period in 2017, providing one additional layer of measurements compared to the older tracker [41].

Simulated samples
Monte Carlo (MC) simulation is used to design the search, to aid in the estimation of SM backgrounds, and to evaluate the sensitivity of the analysis to top squark pair production. Samples of SM $t\bar{t}$, W + jets, Z + jets, and γ + jets events, and of simplified SUSY top squark pair production models, are generated at leading order (LO) in quantum chromodynamics (QCD) using the MADGRAPH5_aMC@NLO 2 (versions 2.2.2 or 2.4.2) generator [42]. The MADGRAPH5_aMC@NLO generator at next-to-LO (NLO) in QCD is used to generate samples of $t\bar{t}Z$, WZ, and $t\bar{t}W$ events, while single top quark events are generated at NLO in QCD using the POWHEG 2.0 [43][44][45][46] program. Samples of W + jets, $t\bar{t}$, and SUSY events are generated with four, three, and two additional partons included in the matrix element calculations, respectively.
Since the data used for this search were collected in three distinct periods (2016, 2017, and 2018), different detector MC simulations are used to reflect the running conditions. In addition, in some cases, the generator settings also differ, as described below.
The NNPDF3.0 [47,48] parton distribution functions (PDFs) are used to generate all 2016 MC samples, while NNPDF3.1 [49] is used for 2017 and 2018 samples. The parton shower and hadronization are modeled with PYTHIA 8.2 (8.205 or 8.230) [50]. The MLM [51] and FxFx [52] prescriptions are employed to match partons from the matrix element calculation to those from the parton showers, for the LO and NLO samples, respectively.
The 2016 MC samples are generated with the CUETP8M1 [53] PYTHIA tune. For the later running periods, the CP5 [54] tune is used for SM samples, and the SUSY samples use LO PDFs, combined with tune CP2, in order to avoid large negative weights that arise from PDF interpolations at very large energies. The differences in jet kinematics for the different PYTHIA tunes are within 5% of each other. The GEANT4 [55] package is used to simulate the response of the CMS detector for all SM processes, while the CMS fast simulation program [56,57] is used for SUSY samples.
Cross section calculations performed at next-to-NLO (NNLO) in QCD are used to normalize the MC samples of W + jets [58] and single top quark [59,60] events. The $t\bar{t}$ samples are normalized to a cross section determined at NNLO in QCD that includes the resummation of the next-to-next-to-leading logarithmic (NNLL) soft-gluon terms [61][62][63][64][65][66][67]. Monte Carlo samples of other SM background processes are normalized to cross sections obtained from the MC event generators at either LO or NLO in QCD. The SUSY cross sections are computed at approximately NNLO plus NNLL precision with all other SUSY particles assumed to be heavy and decoupled [68][69][70][71][72][73][74].
To improve the modeling of the multiplicity of additional jets either from initial-state radiation (ISR) or final-state radiation (FSR), simulated SM and SUSY events are reweighted so as to make the jet multiplicity agree with data. The reweighting is applied to all SUSY samples but only to 2016 SM samples. No reweighting is applied for 2017 and 2018 SM simulation because of the improved tuning of the MC generators mentioned above. The procedure is based on a comparison of the light-flavor jet multiplicity in dilepton $t\bar{t}$ events in data and simulation. The comparison is performed after selecting events with two leptons and two b-tagged jets, which are jets identified as originating from the fragmentation of bottom quarks. The reweighting factors obtained vary from 0.92 to 0.51 for one to six additional jets. The uncertainties in the reweighting factors are evaluated as half of the deviation from unity. These uncertainties cover the data-simulation differences observed in $t\bar{t}$-enriched validation samples obtained by selecting events with an eµ pair and at least one b-tagged jet.
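The uncertainty prescription for the jet-multiplicity reweighting can be illustrated with a short Python sketch. The function name is hypothetical, and only the two endpoint factors quoted in the text (0.92 and 0.51) are used, since the intermediate factors are not given here.

```python
def isr_weight_uncertainty(weight):
    """Uncertainty in a jet-multiplicity reweighting factor,
    taken as half of its deviation from unity."""
    return 0.5 * abs(1.0 - weight)


# Endpoint factors quoted in the text for one and six additional jets.
for n_extra_jets, weight in [(1, 0.92), (6, 0.51)]:
    unc = isr_weight_uncertainty(weight)
    print(f"{n_extra_jets} additional jet(s): {weight:.2f} +/- {unc:.3f}")
```

With this prescription, the factor for six additional jets carries a relative uncertainty of almost 50%, which is why the effect matters most in high-multiplicity regions.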
The $p_T^{\text{miss}}$ and its vector ($\vec{p}_T^{\,\text{miss}}$), defined in Section 4, are key ingredients of the analysis. The modeling of their resolution in the simulation is studied in γ + jets samples for each data-taking period. Based on these studies, the simulated $p_T^{\text{miss}}$ resolution is corrected with scale factors, the magnitudes of which are around 10% for the 2018 data and up to 15% for the later subset of the 2017 data. The correction factors for the earlier subset of the 2017 data and for the entire 2016 data set are close to unity.

Event reconstruction and search strategy
The overall strategy of the analysis follows that of the search presented in Ref. [29]. Three categories of search regions are defined. The "standard selection" is designed to be sensitive to the majority of the top squark scenarios under consideration with $\Delta m(\tilde{t}, \tilde{\chi}_1^0) > m_t$. In this paper we use the symbol $\Delta m(a, b)$ to indicate the mass difference between particles a and b, and $m_a$ to denote the mass of a. Two additional sets of signal regions are used to target decays of the top squark to a top quark and a neutralino with mass splittings between these particles of either $\Delta m(\tilde{t}, \tilde{\chi}_1^0) \sim m_t$ or $\Delta m(\tilde{t}, \tilde{\chi}_1^0) \sim m_W$.

Event reconstruction
The events used in this analysis are selected using triggers that require either large $p_T^{\text{miss}}$ or the presence of an isolated electron or muon. The $\vec{p}_T^{\,\text{miss}}$ is first computed as the negative vector sum of the $\vec{p}_T$ of all particle-flow candidates, described below. The $p_T^{\text{miss}}$ trigger selects events with $p_T^{\text{miss}} > 120$ GeV. The minimum requirement on the lepton $p_T$ varies between 27 and 35 GeV for electrons, and between 24 and 27 GeV for muons, depending on the data-taking period. The combined trigger efficiency, measured with a data sample of events with a large scalar sum of jet $p_T$, is greater than 99% for events with $p_T^{\text{miss}} > 250$ GeV and lepton $p_T > 20$ GeV.
The CMS event reconstruction is based on a particle-flow (PF) algorithm [75]. The algorithm combines information from all CMS subdetectors to identify charged and neutral hadrons, photons, electrons, and muons, collectively referred to as PF candidates.
Each event must contain at least one reconstructed pp interaction vertex. The reconstructed vertex with the largest value of the summed $p_T^2$ of physics objects is taken to be the primary vertex (PV). The physics objects are the objects reconstructed by the anti-$k_T$ jet finding algorithm [76][77][78] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum ($H_T^{\text{miss}}$), taken as the negative vector sum of the $p_T$ of those jets.
Events with possible contributions from beam halo interactions or anomalous noise in the calorimeter are rejected using dedicated filters [79]. For the 2017 and 2018 data-taking periods, the ratio of the scalar sums of jet $p_T$ within |η| < 5.0 and of jet $p_T$ within |η| < 2.4 is required to be smaller than 1.5 to reject events with significant $p_T^{\text{miss}}$ arising from noise in the ECAL endcap forward region. Additionally, during part of the 2018 data-taking period, two sectors of the HCAL endcap detector experienced a power loss. The affected data sample size is about 39 fb$^{-1}$. As the identification of both electrons and jets depends on correct energy fraction measurements, events from the affected data-taking periods containing an electron or a jet in the region −2.4 < η < −1.4 and azimuthal angle −1.6 < φ < −0.8 radians are rejected.
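The two event-level vetoes described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the tuple-based event format are hypothetical, no jet $p_T$ threshold is applied because none is quoted for this filter, and the treatment of events without central jets is a choice made for the sketch.

```python
def passes_forward_noise_filter(jets, max_ratio=1.5):
    """jets: list of (pt, eta) tuples.
    Require HT(|eta| < 5.0) / HT(|eta| < 2.4) < max_ratio."""
    ht_wide = sum(pt for pt, eta in jets if abs(eta) < 5.0)
    ht_central = sum(pt for pt, eta in jets if abs(eta) < 2.4)
    if ht_central == 0.0:
        # Assumption of this sketch: the ratio is undefined without
        # central jet activity, so the event is rejected.
        return False
    return ht_wide / ht_central < max_ratio


def in_hcal_power_loss_region(eta, phi):
    """Region affected by the HCAL endcap power loss (phi in radians)."""
    return -2.4 < eta < -1.4 and -1.6 < phi < -0.8


def passes_hcal_veto(objects, in_affected_period):
    """objects: list of (eta, phi) tuples for electrons and jets."""
    if not in_affected_period:
        return True
    return not any(in_hcal_power_loss_region(eta, phi) for eta, phi in objects)
```

The veto is applied only to the affected part of the 2018 data, which is why the period is passed in as an explicit flag rather than inferred inside the function.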
After these initial requirements, we apply an event preselection summarized in Table 1 and described below. Selected events are required to have exactly one electron [80] or muon [81] originating from the PV and isolated from other activity in the event. Leptons are identified as isolated if the scalar sum of the $p_T$ of all PF candidates in a cone around the lepton, excluding the lepton itself, is less than 10% of the lepton $p_T$. Typical lepton selection efficiencies are approximately 85% for electrons and 95% for muons, depending on $p_T$ and η.
The PF candidates are clustered into jets using the anti-$k_T$ algorithm with a distance parameter of 0.4. Jet energies are corrected for contributions from multiple interactions in the same or adjacent beam crossing (pileup) [82,83] and to account for nonuniformity in the detector response. These jet energy corrections are propagated to the calculation of $p_T^{\text{miss}}$ [84,85].
Jets in the analysis are required to have $p_T > 30$ GeV and |η| < 2.4, and the number of these jets ($N_j$) is required to be at least two. Jets overlapping with the selected lepton within a cone radius of ∆R = 0.4 are not counted. The distribution of the number of jets after the preselection requirements is shown in Fig. 2 (upper right). The jet multiplicity is used to define the signal region bins to optimize sensitivity for a variety of signal models and SUSY particle masses, as shown in this figure. After these requirements, jets originating from a bottom quark fragmentation are identified as b-tagged jets by the combined secondary vertex algorithm using a deep neural network (DeepCSV) [86]. The preselection requires at least one b-tagged jet with either a medium or tight working point. The threshold on the discriminator value corresponding to the medium (tight) working point is chosen so that the tagging rate for light-flavor jets is about 1% (0.1%), corresponding to an efficiency to identify a jet originating from a bottom-flavored hadron of 65-80 (40-65)%, for jet $p_T$ of 30-400 GeV.
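The jet counting described above, including the removal of jets that overlap with the selected lepton, can be sketched as follows. The dictionary-based jet and lepton format and the function names are assumptions made for illustration.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi]."""
    dphi = phi1 - phi2
    while dphi > math.pi:
        dphi -= 2.0 * math.pi
    while dphi < -math.pi:
        dphi += 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def select_jets(jets, lepton, min_pt=30.0, max_abs_eta=2.4, min_dr=0.4):
    """Keep jets with pT > 30 GeV and |eta| < 2.4 that do not overlap
    with the selected lepton within Delta R = 0.4."""
    return [
        jet for jet in jets
        if jet["pt"] > min_pt
        and abs(jet["eta"]) < max_abs_eta
        and delta_r(jet["eta"], jet["phi"], lepton["eta"], lepton["phi"]) >= min_dr
    ]
```

The preselection requirement on the jet multiplicity then corresponds to `len(select_jets(jets, lepton)) >= 2`.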
To enhance sensitivity to signal scenarios with a compressed mass spectrum, we use a secondary vertex (SV), not associated with jets or leptons, to identify soft b hadrons [30] with $p_T > 1$ GeV and |η| < 2.5. The SV is reconstructed by the inclusive vertex finding algorithm [87]. At least two tracks must be associated with the SV, and the sum of the transverse momenta of all the associated tracks is required to be below 20 GeV. The distance between the SV and the PV must be <3 cm and the significance of this distance is required to be >4. The cosine of the pointing angle defined by the scalar product between the distance vector, $\overrightarrow{(\mathrm{SV},\mathrm{PV})}$, and $\vec{p}_{\mathrm{SV}}$, where $\vec{p}_{\mathrm{SV}}$ is the total three-momentum of the tracks associated with the SV, must be >0.98. These requirements help suppress background from light-flavor hadrons and jets. Events containing objects that pass these selections are said to contain a "soft b object". These requirements result [...] As summarized in Table 1, the preselection requires the presence of at least one soft b object in the signal regions dedicated to the compressed mass spectra.
Figure 2: Distributions of $p_T^{\text{miss}}$ (upper left) and $N_j$ (upper right) after applying the preselection requirements of Table 1, including the requirement on the variable shown, and the distributions of $M_T$ (lower left) and min ∆φ($j_{1,2}$, $\vec{p}_T^{\,\text{miss}}$) (lower right) after applying the preselection requirements, excluding the requirement on the variable shown, with the green dashed vertical line marking the location of the requirement. The stacked histograms for the SM background contributions (categorized as described in Section 5) are from the simulation to illustrate the discriminating power of these variables. The gray hashed region indicates the statistical uncertainty of the simulated samples. The last bin in each distribution includes the overflow events. The expectations for three signal hypotheses are overlaid, and the corresponding numbers in parentheses in the legends refer to the masses of the top squark and neutralino, respectively. For models with $b\tilde{\chi}_1^\pm$ decays, the mass of the chargino is chosen to be $(m_{\tilde{t}} + m_{\tilde{\chi}_1^0})/2$.
Table 1: Summary of the event preselection requirements. The magnitude of the vector sum of the $p_T$ of all jets and leptons in the event is denoted by $H_T^{\text{miss}}$.
The background processes relevant for this search are semileptonic or dileptonic $t\bar{t}$ ($t\bar{t} \to 1\ell + X$ or $t\bar{t} \to 2\ell + X$), single top quark production (mostly in the tW channel), W + jets, and processes containing a Z boson decaying into a pair of neutrinos ($Z \to \nu\bar{\nu}$), such as $t\bar{t}Z$ or WZ. Contributions to the background from semileptonic $t\bar{t}$ and W + jets are heavily suppressed by requiring in the preselection that the transverse mass ($M_T$) be greater than 150 GeV and the $p_T^{\text{miss}}$ be greater than 250 GeV, as shown in Fig. 2 (lower left and upper left, respectively). The $M_T$ is defined as
$$M_T = \sqrt{2\, p_T^{\ell}\, p_T^{\text{miss}} \left[1 - \cos(\Delta\phi)\right]},$$
with $p_T^{\ell}$ denoting the lepton $p_T$, and ∆φ the azimuthal separation between the lepton direction and $\vec{p}_T^{\,\text{miss}}$.
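As a numerical illustration of the transverse mass requirement, the sketch below assumes the standard form for a massless lepton, $M_T = \sqrt{2 p_T^{\ell} p_T^{\text{miss}} [1 - \cos(\Delta\phi)]}$. The function name and the example values are hypothetical.

```python
import math


def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Transverse mass of the lepton + missing-momentum system (GeV),
    assuming a massless lepton."""
    cos_dphi = math.cos(lepton_phi - met_phi)
    # max() guards against tiny negative values from rounding.
    return math.sqrt(max(0.0, 2.0 * lepton_pt * met * (1.0 - cos_dphi)))


# A 50 GeV lepton back-to-back with 300 GeV of missing momentum.
m_t = transverse_mass(50.0, 0.0, 300.0, math.pi)
print(f"M_T = {m_t:.1f} GeV, passes M_T > 150 GeV: {m_t > 150.0}")
```

For a lepton and neutrino from a single on-shell W boson, $M_T$ cannot exceed the W boson mass up to resolution effects, which is why a 150 GeV threshold suppresses the one-lepton backgrounds so strongly.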
In addition, to suppress background from processes with two leptonically decaying W bosons, primarily $t\bar{t}$ and tW, we also reject events containing either an additional lepton passing a loose selection (denoted as "veto lepton" in Table 1) or an isolated track. Further rejection is achieved by requiring that the minimum angle in the transverse plane between the $\vec{p}_T^{\,\text{miss}}$ and the directions of the two leading-$p_T$ jets in the event (denoted as $j_{1,2}$), min ∆φ($j_{1,2}$, $\vec{p}_T^{\,\text{miss}}$), is greater than 0.8 or 0.5, depending on the signal region. The distribution of min ∆φ($j_{1,2}$, $\vec{p}_T^{\,\text{miss}}$) after applying the rest of the preselection requirements is shown in Fig. 2 (lower right).
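The angular variable used for this rejection can be computed as in the following sketch. The tuple-based jet format and the function names are assumptions; only the two jets with the highest $p_T$ enter the minimum.

```python
import math


def abs_delta_phi(phi1, phi2):
    """Absolute azimuthal separation, in [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def min_dphi_j12_met(jets, met_phi):
    """jets: list of (pt, phi) tuples with at least one entry.
    Returns the minimum |delta phi| between the missing momentum
    and the two leading-pT jets."""
    leading = sorted(jets, key=lambda jet: jet[0], reverse=True)[:2]
    return min(abs_delta_phi(phi, met_phi) for _, phi in leading)


jets = [(100.0, 0.0), (80.0, 3.0), (50.0, 1.0)]
print(min_dphi_j12_met(jets, met_phi=1.0) > 0.8)
```

In the example the third jet is aligned with the missing momentum but is ignored, since only the two leading jets are considered.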
In addition to the preselection requirements, we also use two deep neural networks (DNNs) to categorize events based on the identification of hadronically decaying top quarks.
One DNN, referred to as the resolved tagger, uses the DeepResolved algorithm to identify hadronically decaying top quarks with a moderate Lorentz boost. The decay products of these objects result in three separate jets (resolved top quark decay). The DeepResolved algorithm identifies top quarks whose decay products form three anti-$k_T$ jets of distance parameter 0.4. The three jets ($p_T$ > 40, 30, 20 GeV) of each candidate must have an invariant mass between 100 and 250 GeV, no more than one of the jets can be identified as a b-tagged jet, and the three jets must all lie within a cone of ∆R < 3.14 of the trijet centroid.
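The trijet candidate requirements listed above can be sketched as follows. This is a simplified illustration, not the DeepResolved implementation: the jet format and function names are hypothetical, and the trijet centroid is assumed here to be the direction of the summed trijet momentum, which the text does not specify.

```python
import math
from itertools import combinations


def four_vector(jet):
    """(px, py, pz, E) from a jet's pt, eta, phi, and mass."""
    px = jet["pt"] * math.cos(jet["phi"])
    py = jet["pt"] * math.sin(jet["phi"])
    pz = jet["pt"] * math.sinh(jet["eta"])
    energy = math.sqrt(px * px + py * py + pz * pz + jet["mass"] ** 2)
    return px, py, pz, energy


def delta_r(eta1, phi1, eta2, phi2):
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)


def is_trijet_candidate(jets):
    """jets: three dicts with keys pt, eta, phi, mass, btag."""
    pts = sorted((jet["pt"] for jet in jets), reverse=True)
    if not (pts[0] > 40.0 and pts[1] > 30.0 and pts[2] > 20.0):
        return False
    if sum(1 for jet in jets if jet["btag"]) > 1:
        return False
    px, py, pz, energy = (sum(c) for c in zip(*(four_vector(j) for j in jets)))
    mass = math.sqrt(max(0.0, energy**2 - px**2 - py**2 - pz**2))
    if not 100.0 < mass < 250.0:
        return False
    # Assumption: centroid = direction of the summed trijet momentum.
    pt_sum = math.hypot(px, py)
    if pt_sum == 0.0:
        return False
    centroid_eta = math.asinh(pz / pt_sum)
    centroid_phi = math.atan2(py, px)
    return all(
        delta_r(j["eta"], j["phi"], centroid_eta, centroid_phi) < 3.14 for j in jets
    )


def trijet_candidates(jets):
    """All three-jet combinations passing the candidate requirements."""
    return [combo for combo in combinations(jets, 3) if is_trijet_candidate(combo)]
```

Candidates passing these requirements would then be scored by the neural network; the sketch covers only the candidate building step.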
A neural network is used to distinguish trijet combinations that match a top quark from those that do not. The network uses high-level information such as the invariant mass of the trijet system and of the individual dijet pairs, as well as kinematic information from each jet. This includes its Lorentz vector, DeepCSV heavy-flavor discriminator values, jet shape variables, and detector-level particle multiplicity and energy fraction variables. The network is trained using $t\bar{t}$ and QCD simulation, together with data, as training inputs. The simulation is used to define the examples of signal and background. The signal is defined as any trijet passing the preselection requirements, where each jet is matched to a generator-level daughter of a top quark within a cone of ∆R < 0.4 and the overall trijet system is matched to the generator-level top quark within a cone of ∆R < 0.6. The background category is defined as any trijet combination that is not categorized as signal. This includes trijet combinations for which some, but not all, of the jets match top quark decay products. The data are included in the training to inhibit the network from learning features of the MC simulation that are not present in data. This is achieved through a technique called domain adaptation via gradient reversal [88]. With this method, an additional output is added to the neural network to distinguish between trijet candidates from QCD simulation and those from a QCD-enriched data sample. The main network is then restricted to minimize its ability to discriminate simulation from data. This yields a network with good separation between signal and background while minimizing overfitting on features that exist only in simulation. Before the final selection of trijets as top quarks can be made, any trijet candidates that share jets with another candidate must be removed. This is achieved by always favoring the candidate with the higher top quark discriminator value as determined by the neural network. The final list of reconstructed top quarks is then found by placing a requirement on the neural network discriminator. At the chosen working point, the efficiency to select a hadronically decaying top quark with the resolved tagger is 45% and the mistagging rate is 10% for dileptonic $t\bar{t}$ events. An event has a resolved top quark tag if at least one top quark candidate has a discriminator value above a threshold.
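The removal of trijet candidates that share jets, always favoring the higher discriminator value, amounts to a greedy selection. A minimal sketch is given below, assuming that each candidate is represented as a (score, jet indices) pair; the representation, the function names, and the threshold argument are hypothetical, since the threshold value is not quoted in the text.

```python
def remove_overlapping_candidates(candidates):
    """candidates: list of (score, jet_indices) pairs.
    Keep candidates in order of decreasing score, dropping any
    candidate that shares a jet with one already kept."""
    kept, used_jets = [], set()
    for score, jet_indices in sorted(candidates, key=lambda c: c[0], reverse=True):
        if used_jets.isdisjoint(jet_indices):
            kept.append((score, jet_indices))
            used_jets.update(jet_indices)
    return kept


def has_resolved_top_tag(candidates, threshold):
    """True if any surviving candidate exceeds the discriminator threshold."""
    return any(
        score > threshold for score, _ in remove_overlapping_candidates(candidates)
    )


candidates = [(0.9, (0, 1, 2)), (0.8, (2, 3, 4)), (0.7, (3, 4, 5))]
print(remove_overlapping_candidates(candidates))
```

In the example the middle candidate is dropped because it shares jet 2 with the best candidate, which in turn frees jets 3 and 4 for the third candidate.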
The second DNN, referred to as the merged tagger, uses the DeepAK8 [89] algorithm to identify top quarks with large boost, where the decay products are merged into a single jet (merged top quark decay). The identification of this boosted top quark signature is based on anti-$k_T$ jets clustered with a distance parameter of 0.8. The efficiency for lepton + hadronic-top events is 40% and the mistagging rate is 5% for dileptonic $t\bar{t}$ events.

Search strategy
The signal regions for the standard search are summarized in Table 2, and are defined by categorizing events passing the preselection requirements based on $N_j$, the number of identified hadronic top quarks, $p_T^{\text{miss}}$, the invariant mass ($M_{\ell b}$) of the lepton and the closest b-tagged jet in ∆R, and a modified version of the topness variable [90], $t_{\text{mod}}$ [27], which is defined as
$$t_{\text{mod}} = \ln(\min S), \qquad S(\vec{p}_W, p_{z,\nu}) = \frac{\left(m_W^2 - (p_\nu + p_\ell)^2\right)^2}{a_W^4} + \frac{\left(m_t^2 - (p_b + p_W)^2\right)^2}{a_t^4},$$
with $a_W = 5$ GeV and $a_t = 15$ GeV. The $t_{\text{mod}}$ variable is a $\chi^2$-like variable that discriminates signal from leptonically decaying $t\bar{t}$ events: an event with a small value of $t_{\text{mod}}$ is likely to be a dilepton $t\bar{t}$ event, while signal events tend to have larger $t_{\text{mod}}$ values. The first term in its definition corresponds to the top quark decay containing the reconstructed lepton, and the second term corresponds to the top quark decay containing the missing lepton. The minimization of the variable S is done with respect to all three components of the three-momentum $\vec{p}_W$ and the component of the three-momentum $\vec{p}_\nu$ along the beam line, with the constraints that $\vec{p}_T^{\,\text{miss}} = \vec{p}_{T,W} + \vec{p}_{T,\nu}$ and $p_W^2 = m_W^2$. The distribution of $t_{\text{mod}}$ for events passing the preselection is shown in Fig. 3 (upper left). The $t_{\text{mod}}$ distribution is split into three bins, each sensitive to a different mass splitting of the top squark and neutralino.
In events containing a leptonically decaying top quark, the invariant mass of the lepton and the bottom quark jet from the same top quark decay is bound by
$$M_{\ell b} \leq m_t \sqrt{1 - \frac{m_W^2}{m_t^2}}.$$
This bound does not apply to either W + jets events or signal events where the top squark decays to a bottom quark and a chargino.
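For orientation, the kinematic endpoint of the lepton plus b jet invariant mass in a top quark decay, $m_t \sqrt{1 - m_W^2/m_t^2}$ for a massless lepton and b quark, can be evaluated numerically. The mass values used below (172.5 and 80.4 GeV) are illustrative assumptions and are not taken from the text.

```python
import math


def max_m_lb(m_top=172.5, m_w=80.4):
    """Upper bound (GeV) on the lepton + b invariant mass in t -> bW -> b l nu,
    neglecting the lepton and b quark masses. Default masses are
    illustrative values, not taken from the text."""
    return m_top * math.sqrt(1.0 - (m_w / m_top) ** 2)


print(f"M_lb endpoint: {max_m_lb():.0f} GeV")
```

Events from $t\bar{t}$ therefore populate mostly the low $M_{\ell b}$ region, whereas processes not subject to the bound can extend to higher values.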
Hadronic top quark taggers are used in signal regions sensitive to SUSY scenarios with hadronically decaying top quarks when most of the expected SM background does not contain such a top quark decay. Therefore, the hadronic top taggers are deployed in the low $M_{\ell b}$, $t_{\text{mod}} \geq 0$, and relatively modest $p_T^{\text{miss}}$ signal regions. Events containing two or three jets and $p_T^{\text{miss}} \leq 600$ GeV, or at least four jets and $p_T^{\text{miss}} \leq 450$ GeV, are categorized according to the presence of a merged top quark tag. The resolved top quark tagger is used to further categorize events with four or more jets. If an event contains both merged and resolved top quark tags, it is placed in the merged top category, while events containing neither are categorized as untagged. Distributions of the discriminant of the merged and resolved top quark taggers in the signal regions are also shown in Fig. 3 (lower left and lower right, respectively).
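The categorization by top quark tags can be summarized in a small decision function. This is a sketch under the assumption that the event has already been assigned to a low $M_{\ell b}$, $t_{\text{mod}} \geq 0$ region; the function name and the returned labels are hypothetical.

```python
def top_tag_category(n_jets, pt_miss, has_merged_tag, has_resolved_tag):
    """Return 'merged', 'resolved', or 'untagged', or None if the event
    falls outside the regions where the top taggers are used.
    Assumes the low-M_lb and t_mod >= 0 requirements are already applied."""
    if n_jets in (2, 3) and pt_miss <= 600.0:
        # Only the merged tagger is used for two- and three-jet events.
        return "merged" if has_merged_tag else "untagged"
    if n_jets >= 4 and pt_miss <= 450.0:
        if has_merged_tag:
            # The merged category takes precedence when both tags are present.
            return "merged"
        if has_resolved_tag:
            return "resolved"
        return "untagged"
    return None


print(top_tag_category(5, 400.0, has_merged_tag=True, has_resolved_tag=True))
```

Checking the merged tag first encodes the rule that events with both tags are placed in the merged category.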
The small mass splitting in SUSY models with a compressed mass spectrum results in soft decay products. High values of $p_T^{\text{miss}}$ can only be caused by large boost from ISR. As a result, in signal regions targeting these models the jet with the highest $p_T$ is expected to be from ISR and therefore it is required to not be identified as a bottom quark jet. We also impose an upper bound on the lepton $p_T$ relative to the $p_T^{\text{miss}}$, since this requirement provides an additional handle to reject SM W + jets and $t\bar{t}$ backgrounds. Regions targeting signal scenarios with $\Delta m(\tilde{t}, \tilde{\chi}_1^0) \sim m_t$ require at least five jets and at least one b-tagged jet based on the DeepCSV medium working point. For signal scenarios with $\Delta m(\tilde{t}, \tilde{\chi}_1^0) \sim m_W$, the bottom quarks are expected to have low $p_T$. Therefore, in these regions the $N_j$ selection is relaxed to $N_j \geq 3$ and instead of requiring the presence of a b-tagged jet we require the presence of a soft b object. Note that soft b objects are included in the jet count in these regions. The requirements for the two sets of signal regions targeting compressed mass spectrum SUSY scenarios are summarized in Table 3.

Background estimation
Three categories of SM backgrounds remain after the selection requirements described in Section 4.
• The lost-lepton background consists of events with two W bosons decaying leptonically, where one of the leptons is either not reconstructed or not identified. This background arises primarily from $t\bar{t}$ events, with a smaller contribution from single top quark processes. It is the dominant background in regions with low values of $M_{\ell b}$, no top quark tag, or $N_j \geq 5$. This background is estimated using a dilepton control sample.
• The one-lepton background consists of events with a single W boson decaying leptonically and without any additional source of genuine $p_T^{\text{miss}}$. The requirements of $p_T^{\text{miss}} > 250$ GeV and $M_T > 150$ GeV heavily suppress this background. The one-lepton background is estimated from simulation when it originates from top quark decays (i.e., semileptonic $t\bar{t}$). Background events not originating from top quark decays, instead mainly from direct W production, are estimated using a control sample of events with no b-tagged jets.
• The $Z \to \nu\bar{\nu}$ background consists of events with a single leptonically decaying W boson and a Z boson that decays to a pair of neutrinos, i.e., pp → $t\bar{t}Z$ or WZ. This background is estimated using simulation.

Lost-lepton background
The lost-lepton background in each of the signal regions is estimated from corresponding dilepton control samples. Each dilepton control sample is obtained with the signal selections except for the requirement of a second isolated lepton with $p_T > 10$ GeV and the removal of the lepton, track, and tau vetoes. The estimated background in each search region is obtained from the yield of data events in the corresponding control sample and a transfer factor obtained from simulation, $R_{\text{MC}}^{\text{lost-}\ell/2\ell}$. The transfer factor is defined as the ratio of the expected lost-lepton yield in the signal region and the yield of dilepton SM events in the control sample. Corrections obtained from studies of samples of $Z \to \ell\ell$ and $J/\psi \to \ell\ell$ events are applied to account for differences in lepton reconstruction and selection efficiencies between data and simulation.
When defining the $p_T^{\text{miss}}$ in this control sample, the trailing lepton $p_T$ is added to the $p_T^{\text{miss}}$ to enhance the data statistics, and all $p_T^{\text{miss}}$-related quantities are recalculated. The distribution of $p_T^{\text{miss}}$ after this addition is shown in Fig. 4 (left) for an inclusive selection. Some control samples only contain a small number of events. These samples, corresponding to multiple $p_T^{\text{miss}}$ bins, are combined into a single control sample until the expected yield in simulation is at least five events, as detailed in Table 4. The number of data events in the combined control sample is used to estimate the sum of expected background events in the corresponding signal regions. This sum is then distributed across $p_T^{\text{miss}}$ bins according to the expectation from simulation using an extrapolation factor $k(p_T^{\text{miss}})$.
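The lost-lepton estimate with a combined control sample can be written schematically as the data yield in the control sample, multiplied by the simulated transfer factor and by the simulated fraction $k(p_T^{\text{miss}})$ of the signal-region yield falling in each $p_T^{\text{miss}}$ bin. The sketch below illustrates this arithmetic only; the function name, argument names, and example yields are hypothetical, and data-to-simulation corrections and uncertainties are omitted.

```python
def lost_lepton_estimate(n_data_cr, mc_lost_lepton_sr, mc_dilepton_cr):
    """Predicted lost-lepton yields per pTmiss bin.

    n_data_cr:         observed yield in the (combined) dilepton control sample
    mc_lost_lepton_sr: simulated lost-lepton yields in the signal-region
                       pTmiss bins that share this control sample
    mc_dilepton_cr:    simulated dilepton yield in the control sample
    """
    mc_sr_total = sum(mc_lost_lepton_sr)
    transfer_factor = mc_sr_total / mc_dilepton_cr
    # Extrapolation factors k(pTmiss): simulated share of each bin.
    k_factors = [y / mc_sr_total for y in mc_lost_lepton_sr]
    return [n_data_cr * transfer_factor * k for k in k_factors]


# Made-up yields: 20 data events in the control sample, two pTmiss bins.
print(lost_lepton_estimate(20, [3.0, 1.0], 16.0))
```

When a control sample is not combined, the list of signal-region yields has a single entry and the extrapolation factor is one.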
The dominant uncertainties in the transfer factors are the statistical uncertainties in the simulated samples, the uncertainties in the lepton efficiencies, and the uncertainties in the jet energy scale. These uncertainties are in the ranges 3-68%, 2-20%, and 1-16%, respectively. For the regions in Table 4, the dominant uncertainty associated with the $p_T^{\text{miss}}$ extrapolation is the statistical uncertainty in the simulated samples (5-60%). Uncertainties in the b tagging efficiency and in the choices of the renormalization and factorization scales are small. The total uncertainty in the transfer factor is 6-100%, depending on the region. The uncertainty in the transfer factor is typically comparable to the statistical uncertainty in the control sample yield.

One-lepton background
The one-lepton ($1\ell$) background is suppressed by the $p_T^{\text{miss}} > 250$ GeV and $M_T > 150$ GeV requirements. This suppression is more effective for events with a W boson originating from a top quark decay than for direct W boson production (W + jets). In the case of a top quark decay, the mass of the top quark sets a bound on the mass of the lepton-neutrino system. As a result, the contribution of semileptonic $t\bar{t}$ events to the tail of the $M_T$ distribution is caused by $p_T^{\text{miss}}$ resolution effects, while in the case of W + jets events the contribution from off-shell W bosons is dominant.
The semileptonic $t\bar{t}$ background is taken from simulation. The W + jets background is estimated from a control sample with neither b-tagged jets nor soft b objects (0b sample), obtained by inverting the b tagging requirement. Figure 4 (right) shows the $M_{\ell b}$ distribution in the 0b control sample, where this quantity is computed from the jet with the highest value of the DeepCSV discriminant. The modeling of this distribution in simulation is validated by comparing simulation and data in a W + jets enriched control sample obtained by selecting events with 1-2 jets and $60 < M_T < 120$ GeV.
The W + jets background estimate in each search region is obtained from the yield in the corresponding control sample and a transfer factor determined from simulation. The transfer factor, defined as the ratio of the expected one-lepton (not from t) yield in the signal region and the yield of events in the 0b control sample, accounts for the acceptance and b tagging efficiency. As in the case of the lost-lepton background estimate, multiple control samples are combined into a single control sample until the expected yield in simulation is at least five events, as detailed in Table 5. Studies with simulated samples indicate that the contribution to the total background from semileptonic $t\bar{t}$ events is less than 10% in most search regions, except in a few regions with ≥1 top quark tags, where the contribution becomes as large as 30%. An uncertainty of 100% is assigned to cover the impact of the uncertainties in the $p_T^{\text{miss}}$ resolution.

Background from events containing $Z \to \nu\bar{\nu}$
The third category arises from $t\bar{t}Z$, WZ, and other rare multiboson processes. In all these processes, events from a leptonically decaying W boson, and one or more Z bosons decaying to neutrinos, enter the search regions. In most search regions, $t\bar{t}Z$ is the most important process contributing to this category. These backgrounds are estimated from simulation. The contribution from $t\bar{t}Z$ is normalized using the measured value of the cross section [91]. This normalization results in a rescaling of the theoretical cross section by $1.17^{+0.10}_{-0.09}$, where the uncertainty is taken from the statistical uncertainty in the measurement.

Systematic uncertainties
The contributions to the total uncertainty in the estimated backgrounds and expected signal yields are summarized in Table 6. The total uncertainty is generally larger at higher $p_{\mathrm{T}}^{\mathrm{miss}}$ or when yields in the control samples become small. Of the uncertainties quoted, the theoretical uncertainties are treated as correlated across the data-taking periods because they do not depend on the period. The uncertainties in the lepton efficiency are also assumed to be fully correlated, while the other experimental uncertainties are taken as uncorrelated between the data-taking years.
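The practical effect of the correlation assumptions can be illustrated with a simple combination of per-year absolute uncertainties on a summed yield. This is a sketch under stated assumptions: the function name and numbers are hypothetical, and in a likelihood fit the correlations would normally be encoded through shared or separate nuisance parameters rather than computed by hand.

```python
import math


def combine_years(abs_uncertainties, correlated):
    """Combine per-year absolute uncertainties on a yield summed over years."""
    if correlated:
        # Fully correlated sources shift all years together: linear sum.
        return sum(abs_uncertainties)
    # Uncorrelated sources fluctuate independently: sum in quadrature.
    return math.sqrt(sum(u * u for u in abs_uncertainties))


# Hypothetical per-year uncertainties (in events) for 2016, 2017, 2018.
per_year = [3.0, 4.0, 12.0]
correlated_total = combine_years(per_year, correlated=True)
uncorrelated_total = combine_years(per_year, correlated=False)
```

A fully correlated source therefore contributes more to the combined uncertainty than an uncorrelated source of the same per-year size.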
Theoretical uncertainties affect all quantities derived from simulation, such as the signal acceptance, the transfer factors used in the estimate of the lost-lepton and one-lepton backgrounds, and the estimate of the $Z \to \nu\bar{\nu}$ background. The uncertainty resulting from missing higher-order corrections is estimated by varying the renormalization and factorization scales by a factor of two [92,93], with the two scales taken to be the same in each variation. The effect of the uncertainties in the parton distribution functions is estimated using 100 variations provided with the NNPDF sets, and the effect of the uncertainty in the value of the strong coupling constant is estimated by varying the value $\alpha_S(m_Z) = 0.1180$ by ±0.0015 [94].
The $p_{\mathrm{T}}^{\mathrm{miss}}$ lineshape is corrected to account for mismodeling effects from the $p_{\mathrm{T}}^{\mathrm{miss}}$ resolution and $N_{\mathrm{j}}^{\mathrm{ISR/FSR}}$. The uncertainty in these corrections results in a 1-50% uncertainty in the estimated backgrounds, depending on the signal region. The uncertainty in the $N_{\mathrm{j}}^{\mathrm{ISR/FSR}}$ rescaling also affects the signal acceptance. The effect is small in most search regions, but can be noticeable in signal scenarios with a compressed mass spectrum.
The effect of the uncertainty in the jet energy scale is 1-34% in the estimated backgrounds and up to 24% in the signal acceptance. Variations in the efficiency of the b jet and soft b object identification typically affect the estimated signal and background yields by 0.1% and 3%, with a full range up to 10%.
The uncertainty in the cross section of W + jets events with jets containing b quarks is an important source of uncertainty in the estimation of the W + jets background. A comparison of the multiplicity of b-tagged jets between data and simulation is performed in a W + jets enriched control sample obtained with the same selection as for the $M_{\ell b}$ validation test, with the additional requirement of $p_{\mathrm{T}}^{\mathrm{miss}} > 250$ GeV. From this study, we estimate a 50% uncertainty in the W + b($\bar{b}$) cross section, resulting in a 20-40% uncertainty in the W + jets background estimate.
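As an illustration of how such variations are commonly turned into a relative uncertainty on a simulated yield, the sketch below takes the envelope of the scale variations and the spread over the PDF variations. The function names and numbers are hypothetical, and the use of the standard deviation over the variations assumes a Monte Carlo replica PDF set; a Hessian set would instead require a sum in quadrature of the deviations from the central value.

```python
import math


def scale_envelope(nominal, varied_yields):
    """Largest relative deviation among the scale-varied yields."""
    return max(abs(v - nominal) / nominal for v in varied_yields)


def pdf_replica_uncertainty(replica_yields):
    """Relative standard deviation over replica yields
    (Monte Carlo replica convention, assumed here)."""
    n = len(replica_yields)
    mean = sum(replica_yields) / n
    variance = sum((y - mean) ** 2 for y in replica_yields) / (n - 1)
    return math.sqrt(variance) / mean


# Hypothetical yields: nominal, then both scales doubled and both halved.
scale_unc = scale_envelope(100.0, [110.0, 95.0])
# Two replicas only, to keep the example short; the text uses 100 variations.
pdf_unc = pdf_replica_uncertainty([98.0, 102.0])
```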

Results and interpretation
The event yields and the SM predictions in the search regions are summarized in Tables 7 and 8. These results are also illustrated in Fig. 5. The observed yields are consistent with the estimated SM backgrounds. Isolated fluctuations are observed in a few signal region bins. The data events in these signal region bins were inspected carefully to determine if any detector or reconstruction effects were the source of the high $p_{\mathrm{T}}^{\mathrm{miss}}$. No such issues were detected.
Table 7: The observed and expected yields in the standard search regions. For the top quark tagging categories, we use the abbreviations U for untagged, M for merged, and R for resolved.
Results are interpreted in the context of the top squark pair production models described in Section 1. For a given model, limits on the production cross sections are derived as a function of the mass of the SUSY particles by combining the search regions using a modified frequentist approach, employing the CL$_s$ criterion and an asymptotic formulation [95][96][97][98]. When computing the limit, the expected signal yields are corrected for the possible contributions of signal events to the control samples. These corrections are typically around 5-10%.
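The idea behind the CL$_s$ criterion can be illustrated with a single-bin counting experiment. This is a deliberately simplified sketch: it uses the observed event count as the test statistic and ignores systematic uncertainties, whereas the analysis combines many search regions with an asymptotic formulation. All function names and numbers are hypothetical.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, background, signal):
    """CLs = CL(s+b) / CL(b) for a single counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def excluded(n_obs, background, signal, alpha=0.05):
    """A signal hypothesis is excluded at 95% CL when CLs < 0.05."""
    return cls(n_obs, background, signal) < alpha


# With zero observed events, CLs reduces to exp(-signal), independent of the
# background expectation.
value = cls(n_obs=0, background=1.0, signal=3.0)
```

Dividing by CL$_b$ protects against excluding signals to which the search has little sensitivity when the data fluctuate below the background expectation.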
For the models in which both top squarks decay to a top quark and a $\tilde{\chi}^0_1$, the limits are derived from the $\Delta m(\tilde{t}, \tilde{\chi}^0_1) \sim m_W$ search regions when $100 \leq \Delta m(\tilde{t}, \tilde{\chi}^0_1) \leq 150$ GeV, and from the $\Delta m(\tilde{t}, \tilde{\chi}^0_1) \sim m_t$ search regions when $150 \leq \Delta m(\tilde{t}, \tilde{\chi}^0_1) \leq 225$ GeV. For all other models, the cross section limits are obtained from the standard search regions.
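The choice of search regions for a given mass point can be written as a small lookup. This is a sketch with hypothetical names; because the two quoted mass-difference ranges share the boundary value of 150 GeV, that point is assigned here to the $\sim m_W$ regions, which is an assumption.

```python
def search_region_set(m_stop, m_lsp, both_decay_to_top_neutralino=True):
    """Return the set of search regions used for a mass point (masses in GeV)."""
    delta_m = m_stop - m_lsp
    if both_decay_to_top_neutralino:
        if 100 <= delta_m <= 150:
            return "compressed, delta_m ~ m_W"
        if 150 < delta_m <= 225:  # boundary at 150 GeV assigned above (assumption)
            return "compressed, delta_m ~ m_t"
    # All other models and mass splittings use the standard search regions.
    return "standard"


choice = search_region_set(m_stop=400.0, m_lsp=225.0)
```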
In the case of $\Delta m(\tilde{t}, \tilde{\chi}^0_1) \sim m_W$, the specially designed signal regions improve the cross section sensitivity by up to a factor of five with respect to the results that would have been obtained with the standard search regions. The corresponding improvements from the signal regions designed for $\Delta m(\tilde{t}, \tilde{\chi}^0_1) \sim m_t$ are typically of the order of 10-20%. In the high-mass region, the expected limit on the top squark mass extends about 200 GeV beyond that of Ref. [29].
The 95% confidence level (CL) upper limits on cross sections for the $pp \to \tilde{t}\bar{\tilde{t}} \to t\bar{t}\tilde{\chi}^0_1\tilde{\chi}^0_1$ process, as a function of sparticle masses and assuming that the top quarks are not polarized, are shown in Fig. 6. In this figure we also show the excluded region of parameter space based on the expected cross section for top squark pair production. We exclude the existence of top squarks with masses up to 1.2 TeV for a massless neutralino, and neutralinos with masses up to 600 GeV for $m_{\tilde{t}} = 1$ TeV. The white band corresponds to the region $|m_{\tilde{t}} - m_t - m_{\tilde{\chi}^0_1}| < 25$ GeV, $m_{\tilde{t}} < 275$ GeV, where the selection acceptance for top squark pair production changes rapidly. In this region the acceptance is very sensitive to the details of the simulation, and therefore no interpretation is performed.
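The white band can be expressed as a simple mask on the mass plane. In this sketch the function name is hypothetical, and the top quark mass of 172.5 GeV is an assumed value that is not stated in the text.

```python
TOP_MASS_GEV = 172.5  # assumed top quark mass; not stated in the text


def in_uninterpreted_band(m_stop, m_lsp, m_top=TOP_MASS_GEV):
    """True if a mass point (in GeV) lies in the band where no limit is set."""
    near_top_threshold = abs(m_stop - m_top - m_lsp) < 25.0
    low_stop_mass = m_stop < 275.0
    return near_top_threshold and low_stop_mass


masked = in_uninterpreted_band(m_stop=200.0, m_lsp=25.0)
```

Both conditions must hold, so points close to the line $m_{\tilde{t}} = m_t + m_{\tilde{\chi}^0_1}$ but with $m_{\tilde{t}} \geq 275$ GeV remain interpreted.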
Figures 7 and 8 display the equivalent limits for the $pp \to \tilde{t}\bar{\tilde{t}} \to b\bar{b}\tilde{\chi}^{\pm}_1\tilde{\chi}^{\mp}_1$ and $pp \to \tilde{t}\bar{\tilde{t}} \to tb\tilde{\chi}^{\pm}_1\tilde{\chi}^0_1$ scenarios, respectively. These models are characterized by three mass parameters (for the top squark, the chargino, and the neutralino). In the mixed decay scenario of Fig. 8, we have assumed a compressed mass spectrum for the neutralino-chargino pair, which is theoretically favored if the $\tilde{\chi}^{\pm}_1$ and the $\tilde{\chi}^0_1$ are higgsinos. The search has very poor sensitivity for models with this mass spectrum when both top squarks decay to charginos. Therefore, in the case of Fig. 7, we have chosen a larger mass splitting between the $\tilde{\chi}^{\pm}_1$ and the $\tilde{\chi}^0_1$.

Summary
A search for direct top squark pair production is performed using events with one lepton, jets, and significant missing transverse momentum. The search is based on proton-proton collision data at a center-of-mass energy of 13 TeV recorded by the CMS experiment at the LHC during 2016-2018 and corresponding to an integrated luminosity of 137 fb$^{-1}$. The leading backgrounds in this analysis, mainly dileptonic $t\bar{t}$ decays, where one of the leptons is not reconstructed or identified, and W + jets production, are estimated from data control regions. The semileptonic $t\bar{t}$ and $Z \to \nu\bar{\nu}$ backgrounds are taken from simulation. No significant deviations from the standard model expectations are observed. Limits on pair-produced top squarks are established in the context of supersymmetry models conserving R-parity. Exclusion limits at 95% CL for top squark masses up to 1.2 TeV are set for a massless neutralino. For models with a top squark mass of 1 TeV, neutralino masses up to 600 GeV are excluded.

Figure 6: Exclusion limits at 95% CL for the $pp \to \tilde{t}\bar{\tilde{t}} \to t\bar{t}\tilde{\chi}^0_1\tilde{\chi}^0_1$ scenario. The colored map illustrates the 95% CL upper limits on the product of the production cross section and branching fraction. The area enclosed by the thick black curve represents the observed exclusion region, and that enclosed by the thick, dashed red curve represents the expected exclusion. The thin dotted (red) curves indicate the region containing 68% of the distribution of limits expected under the background-only hypothesis. The thin solid (black) curves show the change in the observed limit by varying the signal cross sections within their theoretical uncertainties. The white band excluded from the limits corresponds to the region $|m_{\tilde{t}} - m_t - m_{\tilde{\chi}^0_1}| < 25$ GeV, $m_{\tilde{t}} < 275$ GeV, where the selection acceptance for top squark pair production changes rapidly and is therefore very sensitive to the details of the simulation.
Figure 7: Exclusion limits at 95% CL for the scenario in which both top squarks decay to $b\tilde{\chi}^{\pm}_1$, with $\tilde{\chi}^{\pm}_1 \to W\tilde{\chi}^0_1$. The mass of the chargino is chosen to be $(m_{\tilde{t}} + m_{\tilde{\chi}^0_1})/2$. The colored map illustrates the 95% CL upper limits on the product of the production cross section and branching fraction. The area enclosed by the thick black curve represents the observed exclusion region, and that enclosed by the thick, dashed red curve represents the expected exclusion. The thin dotted (red) curves indicate the region containing 68% of the distribution of limits expected under the background-only hypothesis. The thin solid (black) curves show the change in the observed limit by varying the signal cross sections within their theoretical uncertainties.

Figure 8: Exclusion limits at 95% CL for the $pp \to \tilde{t}\bar{\tilde{t}} \to tb\tilde{\chi}^{\pm}_1\tilde{\chi}^0_1$, $\tilde{\chi}^{\pm}_1 \to W^{*}\tilde{\chi}^0_1$ scenario. The mass difference between the $\tilde{\chi}^{\pm}_1$ and the $\tilde{\chi}^0_1$ is taken to be 5 GeV. The colored map illustrates the 95% CL upper limits on the product of the production cross section and branching fraction. The area enclosed by the thick black curve represents the observed exclusion region, and that enclosed by the thick, dashed red curve represents the expected exclusion. The thin dotted (red) curves indicate the region containing 68% of the distribution of limits expected under the background-only hypothesis. The thin solid (black) curves show the change in the observed limit by varying the signal cross sections within their theoretical uncertainties.

Figure 1: Diagrams for top squark pair production, with each $\tilde{t}$ decaying either to $t\tilde{\chi}^0_1$ or to $b\tilde{\chi}^{\pm}_1$. For the latter decay, the $\tilde{\chi}^{\pm}_1$ decays further into a W boson and a $\tilde{\chi}^0_1$.

Figure 2: The distributions of $p_{\mathrm{T}}^{\mathrm{miss}}$ (upper left) and $N_{\mathrm{j}}$ (upper right) are shown after applying the preselection requirements of Table 1, including the requirement on the variable shown, and the distributions of $M_{\mathrm{T}}$ (lower left) and $\min\Delta\phi(j_{1,2}, p_{\mathrm{T}}^{\mathrm{miss}})$
The symbols $p_{\mathrm{T}}$ and $\eta$ correspond to the transverse momentum and pseudorapidity of the lepton. The symbol $p_{\mathrm{T}}^{\mathrm{sum}}$ is the scalar sum of the $p_{\mathrm{T}}$ of all (charged) PF candidates in a cone around the lepton (track), excluding the lepton (track) itself. Finally, $N_{b,\mathrm{med}}$ and $N_{b,\mathrm{soft}}$ are the multiplicities of b-tagged jets (medium working point) and soft b objects, respectively.
120 GeV and $H_{\mathrm{T}}^{\mathrm{miss}} > 120$ GeV, or isolated μ(e) with $p_{\mathrm{T}} > 24(25)$ GeV
Trigger (2017, 2018): $p_{\mathrm{T}}^{\mathrm{miss}} > 120$ GeV and $H_{\mathrm{T}}^{\mathrm{miss}} > 120$ GeV, or isolated μ(e) with $p_{\mathrm{T}} > 27(35)$ GeV
$p_{\mathrm{T}}^{\mathrm{sum}}$ cone size: for μ or e, $\Delta R = \min[\max(0.05, 10\,\mathrm{GeV}/p_{\mathrm{T}}), 0.2]$; for track, $\Delta R = 0.3$
Lepton: μ(e) with $p_{\mathrm{T}} > 20$ GeV, $|\eta| < 2.4\,(1.44)$, $p_{\mathrm{T}}^{\mathrm{sum}} < 0.1 \times p_{\mathrm{T}}$
Veto lepton: μ or e with $p_{\mathrm{T}} > 5$ GeV, $|\eta| < 2.4$
$\min\Delta\phi(j_{1,2}, p_{\mathrm{T}}^{\mathrm{miss}})$: > 0.8 radians for standard search, > 0.5 radians for compressed scenarios
in a 40-55 (2-5)% efficiency to select a soft b object originating from a soft bottom-flavor (light-flavor) hadron. As listed in Table
To maintain acceptance to a broad range of signal scenarios, rather than requiring a selection on $M_{\ell b}$, events are placed into low- or high-$M_{\ell b}$ categories if the value of $M_{\ell b}$ is less or greater than 175 GeV, respectively. In signal regions with $M_{\ell b} > 175$ GeV, at least one jet is required to satisfy the tight b tagging working point of the DeepCSV discriminator to suppress the background from W + jets events. The distribution of $M_{\ell b}$ in the signal regions is shown in Fig. 3 (upper right). As seen from this figure, the low-$M_{\ell b}$ regions are more sensitive to $t\tilde{\chi}^0_1$ decays and the regions with $M_{\ell b} > 175$ GeV are more sensitive to $b\tilde{\chi}^{\pm}_1$ decays.

Figure 3: The distributions of $t_{\mathrm{mod}}$ (upper left), $M_{\ell b}$ (upper right), the merged top quark tagging discriminant (lower left), and the resolved top quark tagging discriminant (lower right) are shown after the preselection requirements. The green, dashed vertical lines mark the locations of the binning or tagging requirements. The stacked histograms showing the SM background contributions (categorized as described in Section 5) are from the simulation to illustrate the discriminating power of these variables. The gray hashed region indicates the statistical uncertainty of the simulated samples. Events outside the range of the distributions shown are included in the first or last bins. The expectations for three signal hypotheses are overlaid, and the corresponding numbers in parentheses in the legends refer to the masses of the top squark and neutralino, respectively. For models with $b\tilde{\chi}^{\pm}_1$ decays, the mass of the chargino is chosen to be $(m_{\tilde{t}} + m_{\tilde{\chi}^0_1})/2$.

Figure 4: Distributions of kinematic variables in the inclusive control samples used for the background estimation. The gray hashed region indicates the statistical uncertainty of the simulated samples. The distributions for data are shown as points with error bars corresponding to the statistical uncertainty. The stacked histograms show the expected SM background contributions from simulation, normalized to the number of events observed in data. The last bin in each distribution also includes the overflow. Left: Distribution of $p_{\mathrm{T}}^{\mathrm{miss}}$ in the dilepton control sample. Right: Distribution of $M_{\ell b}$ in the 0b control sample.
The lost-lepton background in each signal region, $N^{\mathrm{SR}}_{\mathrm{lost}\text{-}\ell}$, is obtained by scaling the number of events in the control region, $N^{\mathrm{CR}}_{2\ell}$, using the transfer factor $R^{\mathrm{lost}\text{-}\ell/2\ell}_{\mathrm{MC}}$

Figure 5: The observed and expected yields in Tables 7 and 8 and their ratios are shown as stacked histograms. The lost-lepton and $1\ell$ (not from t) backgrounds are estimated from data-driven methods, while the $1\ell$ (from t) and $Z \to \nu\bar{\nu}$ backgrounds are taken from simulation. The uncertainties consist of statistical and systematic components summed in quadrature and are shown as shaded bands.

Table 2: The 39 signal regions of the standard selection, where each neighboring pair of values in the $p_{\mathrm{T}}^{\mathrm{miss}}$ bins column defines a single signal region. At least one b-tagged jet selected using the medium (tight) working point is required for search regions with $M_{\ell b}$ lower (higher) than 175 GeV. For the top quark tagging categories, we use the abbreviations U for untagged, M for merged, and R for resolved.

Table 3: Definitions of the 10 search regions targeting signal scenarios with a compressed mass spectrum. Search regions for the $\Delta m(\tilde{t}, \tilde{\chi}^0_1) \sim m_t$ and $\sim m_W$ scenarios are labeled with the letters I and J, respectively. The symbol $p_{\mathrm{T}}$ denotes the transverse momentum of the lepton.

Table 4: Dilepton control samples that are combined when estimating the lost-lepton background.

Table 6: Summary of major systematic uncertainties. The ranges of values reflect their impact on the estimated backgrounds and signal yields in different signal regions. A 100% uncertainty is assigned to the $1\ell$ (from t) background estimated from simulation.

Table 8: The observed and expected yields for signal regions targeting scenarios of top squark production with a compressed mass spectrum.