Search for top squark pair production in pp collisions at sqrt(s) = 13 TeV using single lepton events

A search for top squark pair production in pp collisions at sqrt(s) = 13 TeV is performed using events with a single isolated electron or muon, jets, and a large transverse momentum imbalance. The results are based on data collected in 2016 with the CMS detector at the LHC, corresponding to an integrated luminosity of 35.9 inverse femtobarns. No significant excess of events is observed above the expectation from standard model processes. Exclusion limits are set in the context of supersymmetric models of pair production of top squarks that decay either to a top quark and a neutralino or to a bottom quark and a chargino. Depending on the details of the model, we exclude top squarks with masses as high as 1120 GeV. Detailed information is also provided to facilitate theoretical interpretations in other scenarios of physics beyond the standard model.


Introduction
Supersymmetry (SUSY) [1][2][3][4][5][6][7][8] is an extension of the standard model (SM) that postulates the existence of a superpartner for every SM particle with the same gauge quantum numbers but differing by one half-unit of spin. The search for a low-mass top squark, the scalar partner of the top quark, is of particular interest following the discovery of a Higgs boson [9][10][11], as it would substantially contribute to the cancellation of the divergent loop corrections to the Higgs boson mass, providing a possible solution to the hierarchy problem [12][13][14]. We present results of a search for top squark pair production in the final state with a single lepton (ℓ = e or µ) with high transverse momentum (p T ), jets, and significant p T imbalance. Dedicated top squark searches have been carried out by the ATLAS [15] and CMS [16,17] collaborations based on 13 TeV proton-proton (pp) collisions at the CERN LHC, with data sets corresponding to integrated luminosities of 3.2 and 2.3 fb −1 , respectively. In this paper we report on an extension of the search of Ref. [16] in the single-lepton final state that exploits the data sample collected with the CMS detector [18] in 2016, corresponding to the much larger integrated luminosity of 35.9 fb −1 . We find no evidence for an excess of events above the expected background from SM processes, and interpret the results as limits on simplified models [19][20][21][22] of the pair production of top squarks (t̃) decaying into top quarks and neutralinos (χ̃ 0 1 ) and/or bottom quarks and charginos (χ̃ ± 1 ), as shown in Fig. 1. We take the χ̃ 0 1 to be the lightest supersymmetric particle (LSP) and to be stable.
Figure 1: Simplified-model diagrams corresponding to top squark pair production, followed by the specific decay modes targeted in this paper. (a) pp → t̃ t̃ → t χ̃ 0 1 t χ̃ 0 1 ; (b) pp → t̃ t̃ → b χ̃ + 1 b χ̃ − 1 ; (c) pp → t̃ t̃ → b χ̃ + 1 t χ̃ 0 1 . Charge-conjugate decays are implied.
objects is summarized in Table 1 and is described in more detail below.
The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects are the objects returned by a jet finding algorithm [35,36] applied to all charged tracks associated with the vertex, plus the corresponding associated missing transverse momentum.
Selected events are required to have exactly one electron [37] or muon [38] with p T > 20 GeV and |η| < 1.4442 or |η| < 2.4, respectively. The lepton needs to be consistent with originating from the primary interaction vertex and isolated from other activity in the event. Typical lepton selection efficiencies are approximately 85% for electrons and 95% for muons within the selection acceptance criteria, with variations at the level of a few percent depending on the p T and η of the lepton.
Jets are formed by clustering neutral and charged PF objects using the anti-k T algorithm [35] with a distance parameter of 0.4. The charged PF objects are required to be consistent with originating from the primary vertex. Jet energies are corrected for contributions from multiple interactions in the same or adjacent beam crossings (pileup) [36], and to account for nonuniformity in the detector response [39]. Jets overlapping with the selected lepton within a cone of ∆R = √((∆η)^2 + (∆φ)^2) = 0.4 are not considered. We select events with two or more jets with p T > 30 GeV and |η| < 2.4, at least one of which is required to be consistent with containing the decay of a heavy-flavor hadron. These jets, referred to as b-tagged jets, are identified using two different working points (medium and tight WP) of the CSVv2 tagging algorithm [40,41]. The jet corrections described above are propagated consistently as a correction to the missing transverse momentum vector ( p miss T ), defined as the negative vector p T sum of all PF objects. We denote the magnitude of this vector as E miss T in the discussion below. Events with possible contributions from beam halo processes or anomalous noise in the calorimeter are rejected using dedicated filters [42].
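The jet-lepton overlap removal can be illustrated with a minimal Python sketch. The dictionary layout of the jets and lepton, the function names, and the treatment of a jet lying exactly at ∆R = 0.4 are assumptions made for illustration only; they are not taken from the analysis software.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def remove_lepton_overlap(jets, lepton, cone=0.4):
    """Drop jets closer than `cone` in Delta R to the selected lepton.

    `jets` and `lepton` are hypothetical dicts with "eta" and "phi" keys.
    Keeping a jet exactly at Delta R == cone is an assumption of this sketch.
    """
    return [
        jet for jet in jets
        if delta_r(jet["eta"], jet["phi"], lepton["eta"], lepton["phi"]) >= cone
    ]
```

The wrapping in `delta_phi` matters for objects on either side of φ = ±π, which would otherwise appear far apart.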
Background events originating from tt decays with only one top quark decaying leptonically (tt → 1ℓ), W+jets, and single top quark processes are suppressed by the requirements on the E miss T and the transverse mass (M T ) of the lepton-p miss T system. For signal, higher values of E miss T than for background are expected due to the presence of additional unobserved particles, the LSPs. Similarly, the M T distribution has a Jacobian edge around the W boson mass for background events, whereas for signal events no such edge exists due to the presence of the LSPs. We require M T to be greater than 150 GeV. After these requirements, the largest contribution of SM background events is from processes with two leptons in the final state, such as tt (tt → 2ℓ), where the second lepton does not pass the selection requirements for the leading lepton. Additional rejection is achieved by vetoing events containing a second lepton or isolated track passing looser identification and isolation requirements than those used for the leading lepton. We also require the angle min ∆φ(J 1,2 , E miss T ) in the azimuthal plane between the p miss T and the direction of the closest of the two leading p T jets in the event (J 1 and J 2 ) to be greater than 0.8 radians. This requirement is motivated by the fact that background tt → 2ℓ events tend to have high-p T top quarks, and thus objects in these events tend to be collinear in the transverse plane, resulting in smaller values of min ∆φ(J 1,2 , E miss T ) than is typical for signal events.
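As a hedged illustration of the two kinematic requirements above, the sketch below computes M T with the standard massless-lepton expression M T = √(2 p T E miss T (1 − cos ∆φ)) and the quantity min ∆φ(J 1,2 , E miss T ). The function names and the jet dictionary layout are hypothetical, and the use of the massless-lepton formula is an assumption of this sketch rather than a statement about the analysis code.

```python
import math

def abs_delta_phi(phi1, phi2):
    """Absolute azimuthal separation, folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi

def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """M_T of the lepton + missing-momentum system (massless-lepton approximation)."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi)))

def min_dphi_j12_met(jets, met_phi):
    """Smallest |delta phi| between the missing momentum and the two leading-pT jets.

    `jets` is a hypothetical list of dicts with "pt" and "phi" keys.
    """
    leading = sorted(jets, key=lambda jet: jet["pt"], reverse=True)[:2]
    return min(abs_delta_phi(jet["phi"], met_phi) for jet in leading)

def passes_kinematic_preselection(lep_pt, lep_phi, met, met_phi, jets,
                                  mt_min=150.0, min_dphi=0.8):
    """Apply the M_T > 150 GeV and min delta phi > 0.8 rad requirements."""
    return (transverse_mass(lep_pt, lep_phi, met, met_phi) > mt_min
            and min_dphi_j12_met(jets, met_phi) > min_dphi)
```

For a lepton back to back with the missing momentum, M T reaches its maximum of 2√(p T E miss T ), which is how events from a single W boson decay populate the region below the W boson mass while signal events can extend beyond it.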

Signal regions
We define two sets of signal regions. The first set ("standard") is designed to be sensitive to most of the range of ∆m(t̃, χ̃ 0 1 ), the mass difference between the top squark and the LSP. The second set ("compressed") is designed to enhance sensitivity to the decay mode in Fig. 1(a) when ∆m(t̃, χ̃ 0 1 ) ∼ m t , the top quark mass. While the signal regions within each set are mutually exclusive, there is overlap across the signal regions of the two sets.
Both sets have been optimized to have a high signal sensitivity for different decay modes and mass hypotheses using simulation of the SM background processes and the simplified model topologies shown in Fig. 1.
For the first set, signal regions are defined by categorizing events based on the number of jets (N J ), the E miss T , the invariant mass (M ℓb ) of the lepton and the closest b-tagged jet in ∆R, and a modified version of the topness variable [43], t mod :
$$ t_{\mathrm{mod}} = \ln(\min S), \quad \text{with} \quad S(\vec{p}_{W}, p_{z,\nu}) = \frac{\left(m_{W}^{2} - (p_{\nu} + p_{\ell})^{2}\right)^{2}}{a_{W}^{4}} + \frac{\left(m_{t}^{2} - (p_{b} + p_{W})^{2}\right)^{2}}{a_{t}^{4}}, $$
with the constraint p miss T = p T,W + p T,ν . The first term corresponds to the leptonically decaying top quark, the second term to the hadronically decaying top quark. The calculation uses resolution parameters a W = 5 GeV and a t = 15 GeV. The exact choices of objects used in this variable together with a more detailed motivation can be found in Ref. [16].
In models with t̃ decays containing a χ̃ ± 1 that is almost mass degenerate with the χ̃ 0 1 , the SM decay products of the χ̃ ± 1 are very soft. The final state for these signals can contain a small number of jets, while in signal models without this mass degeneracy at least four jets are expected.
The M ℓb distribution has a sharp endpoint at about √(m_t^2 − m_W^2) for events containing a leptonically decaying top quark such as tt events or signals containing at least one top quark in the decay chain. On the other hand, the M ℓb distribution does not have this endpoint for the subdominant background of W+jets as well as signal models with top squark decays to a b quark and a χ̃ ± 1 . The t mod variable tests for compatibility with the tt → 2ℓ hypothesis when one of the leptons is not reconstructed. Very high values of t mod imply that an event is not compatible with the tt → 2ℓ hypothesis. Signal models with large ∆m(t̃, χ̃ 0 1 ) result in such values. On the other hand, negative values of t mod are a property of tt → 2ℓ events. As signal models with a small mass splitting between t̃ and χ̃ 0 1 also have low values of t mod , we keep events with negative t mod to retain sensitivity for these signal models.
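For orientation, the kinematic endpoint of the lepton plus b jet invariant mass in a top quark decay, √(m_t^2 − m_W^2) when the b quark and lepton masses are neglected, can be evaluated numerically. The mass values below are assumed round numbers chosen for illustration and are not quoted from this paper.

```python
import math

# Assumed illustrative masses in GeV (not taken from the paper).
M_TOP = 172.5
M_W = 80.4

def mlb_endpoint(m_top=M_TOP, m_w=M_W):
    """Endpoint of the lepton + b invariant mass in t -> b W -> b l nu,
    neglecting the b-quark and lepton masses."""
    return math.sqrt(m_top ** 2 - m_w ** 2)

print(f"M_lb endpoint ~ {mlb_endpoint():.0f} GeV")
```

With these inputs the endpoint lies near 153 GeV, below the 175 GeV boundary that separates the low and high invariant-mass search regions.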
The requirements for the standard signal regions are summarized in Table 2.
The compressed signal regions are designed to select events with a high-p T jet from initial-state radiation (ISR), which is needed to provide the necessary boost to the system to obtain large E miss T and large M T . Thus, we require at least five jets in the event, with the highest p T jet failing the medium WP of the b tagging algorithm. Additionally, we reject events if the selected lepton has p T > 150 GeV, as we expect the lepton to be soft in the compressed region. We also require the angle between the lepton direction and p miss T in the azimuthal plane to be less than 2 radians, because the ISR selection results in boosted top squarks with decay products typically close to each other. Finally, we relax the min ∆φ(J 1,2 , E miss T ) requirement in the preselection from 0.8 to 0.5 to increase the signal acceptance. The selection requirements for the compressed signal regions are summarized in Table 3.
Table 3: Summary of the compressed selection and the requirements for the four corresponding signal regions. The symbol ∆φ(E miss T , ℓ) denotes the angle between p miss T and the p T of the lepton, and J 1 denotes the highest p T jet.
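The compressed-region requirements described above can be collected into a single predicate, sketched here under stated assumptions: the function name and argument names are hypothetical, the angular thresholds are taken to be in radians, and only the requirements spelled out in the text are encoded (the E miss T binning of Table 3 is not reproduced).

```python
def passes_compressed_selection(n_jets, leading_jet_btag_medium, lep_pt,
                                dphi_met_lep, min_dphi_j12_met):
    """Compressed-region requirements as described in the text.

    n_jets                  -- number of selected jets
    leading_jet_btag_medium -- True if the highest-pT jet passes the medium b-tag WP
    lep_pt                  -- lepton pT in GeV
    dphi_met_lep            -- azimuthal angle between missing momentum and lepton (rad)
    min_dphi_j12_met        -- min azimuthal angle between missing momentum and J1, J2 (rad)
    """
    return (
        n_jets >= 5                      # at least five jets
        and not leading_jet_btag_medium  # ISR-like leading jet must fail the medium WP
        and lep_pt <= 150.0              # events with lepton pT > 150 GeV are rejected
        and abs(dphi_met_lep) < 2.0      # lepton close to the missing momentum
        and min_dphi_j12_met > 0.5       # relaxed from 0.8 in the standard selection
    )
```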

Background estimation
Three categories of background from SM processes remain after the selection requirements described in Sections 4 and 5.
• Lost-lepton background: events with two leptonically decaying W bosons in which one of the leptons is not reconstructed or identified. This background arises primarily from tt events, with a smaller contribution from single top quark processes. It is the dominant background in the M ℓb < 175 GeV and N J ≥ 4 search regions, and is estimated using a dilepton control sample.
• One-lepton background: events with a single leptonically decaying W boson and no additional source of genuine E miss T . This background is strongly suppressed by the preselection requirements of E miss T > 250 GeV and M T > 150 GeV. The suppression is much more effective for events with a W boson originating from the decay of a top quark than for direct W boson production (W+jets), as the mass of the top quark imposes a bound on the mass of the charged lepton-neutrino system. As a result, the tail of the M T distribution in tt → 1ℓ events is dominated by E miss T resolution effects, while in W+jets it extends further and is largely driven by the width of the W boson. The W+jets background estimate is obtained from a control sample of events with no b-tagged jets. The subleading tt → 1ℓ background is modeled from simulation. One-lepton events are the dominant background in the M ℓb ≥ 175 GeV search regions.
• Z → νν background: events with exactly one leptonically decaying W boson and a Z boson that decays to a pair of neutrinos, e.g., ttZ or WZ. This background is estimated from simulation, after normalizing the simulated event yield to the observed data counts in a control region obtained by selecting events with three leptons, two of which must be consistent with the Z decay hypothesis.
These three types of background are discussed below. More details about the validity of the background estimation methods for the first two categories can be found in Ref. [16].

Lost-lepton background
The lost-lepton background is estimated from a dilepton control sample obtained with the same selection requirements as the signal sample, except for requiring the presence of a second isolated lepton with p T > 10 GeV. For each signal region, a corresponding control region is constructed, with an exception as noted below. In defining the control regions, the p T of the second lepton is added to the p miss T and all relevant event quantities are recalculated. The estimated background in each search region is then obtained from the yield of data events in the control region and a transfer factor defined as the ratio of the expected SM event yields in the signal and control regions, as determined from simulation. Corrections obtained from studies of Z/γ* → ℓℓ events are applied to account for small differences in lepton reconstruction and selection efficiencies between data and simulation.
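The transfer-factor method described above amounts to scaling the data yield in the control region by the simulated ratio of signal-region to control-region yields. The sketch below illustrates the arithmetic only; the function name, the √N approximation for the statistical uncertainty of the data count, and the single relative uncertainty on the transfer factor are simplifying assumptions, not the treatment used in the analysis.

```python
import math

def lost_lepton_estimate(n_cr_data, n_sr_sim, n_cr_sim, tf_rel_unc=0.0):
    """Predict the signal-region background from a control-region data yield.

    n_cr_data  -- observed data events in the dilepton control region
    n_sr_sim   -- simulated SM yield in the signal region
    n_cr_sim   -- simulated SM yield in the control region
    tf_rel_unc -- assumed relative systematic uncertainty on the transfer factor

    Returns (prediction, statistical uncertainty, systematic uncertainty).
    """
    transfer_factor = n_sr_sim / n_cr_sim
    prediction = n_cr_data * transfer_factor
    # Simplification: sqrt(N) for the Poisson uncertainty of the data count.
    stat_unc = math.sqrt(n_cr_data) * transfer_factor
    syst_unc = prediction * tf_rel_unc
    return prediction, stat_unc, syst_unc

# Hypothetical numbers for illustration only.
pred, stat, syst = lost_lepton_estimate(25, 4.0, 20.0, tf_rel_unc=0.10)
print(f"{pred:.1f} +- {stat:.1f} (stat) +- {syst:.1f} (syst)")
```

Because the prediction scales with the observed control-region count, mismodeling of the overall tt normalization in simulation largely cancels in the ratio.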
Because of the limited number of events, the two or three highest E miss T bins of Table 2 are combined, resulting in the control regions listed in Table 4, and the simulation, after the correction described below, is used to determine the expected distribution of SM events as a function of E miss T . The correction is based on a study of the E miss T distribution in a top quark enriched control region of eµ events with at least one b-tagged jet, as shown in Fig. 2. The ratio of data to simulation yields as a function of E miss T in the eµ sample is taken as a bin-by-bin correction for the expected E miss T distribution in the simulation of tt and tW events with a lost lepton. The uncertainty in each bin is taken to be one half of the deviation from unity.
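The bin-by-bin correction and its uncertainty can be sketched as follows. The function names and the input yields are hypothetical; the only rules taken from the text are that the correction is the data-to-simulation ratio in each E miss T bin and that its uncertainty is half the deviation of that ratio from unity.

```python
def met_shape_corrections(data_yields, sim_yields):
    """Per-bin (correction, uncertainty) pairs from an e-mu control sample.

    correction  = data / simulation in the bin
    uncertainty = 0.5 * |correction - 1|
    """
    corrections = []
    for n_data, n_sim in zip(data_yields, sim_yields):
        ratio = n_data / n_sim
        corrections.append((ratio, 0.5 * abs(ratio - 1.0)))
    return corrections

def apply_corrections(sim_lost_lepton_yields, corrections):
    """Scale simulated lost-lepton yields bin by bin."""
    return [n_sim * ratio
            for n_sim, (ratio, _unc) in zip(sim_lost_lepton_yields, corrections)]

# Hypothetical yields in two E_T^miss bins, for illustration only.
corr = met_shape_corrections([90.0, 44.0], [100.0, 40.0])
print(corr)
print(apply_corrections([10.0, 5.0], corr))
```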
The dominant uncertainties on the transfer factors arise from the statistical uncertainties in the simulated samples and the uncertainties in the lepton efficiency. These range from 5-100% and 5-15%, respectively. The uncertainties on the lepton efficiency are derived from studies of samples of leptonically-decaying Z bosons. For the regions of Table 4, there are also uncertainties associated with the E miss T distribution. These are also dominated by the statistical precision of the simulated samples, and range between 10 and 100%. Uncertainties due to the jet energy scale and the b tagging efficiency are evaluated by varying the correction factors for simulation by their uncertainties, and the uncertainties due to the choices of renormalization and factorization scale used in the generation of SM samples are assessed by varying the scales by a factor of 2. All these uncertainties are found to be small. The resulting systematic uncertainties on the transfer factors are 10-100%, depending on the region. These are generally smaller than the statistical uncertainties from the data yield in the corresponding control regions that are used, in conjunction with the transfer factors, to predict the SM background in the signal regions.

One-lepton background
As discussed previously, the one-lepton background receives contributions from processes where the leptonically decaying W boson is produced directly or from the decay of a top quark. The background from direct W boson production is estimated in each search region using a control region obtained with the same selection as the signal region except that the b tagging requirement is inverted to enrich the sample in W+jets events. The estimate in each search region is then obtained using a transfer factor determined from simulated samples that accounts for the b quark jet acceptance and tagging efficiency. The estimate is corrected for small differences in the performance of the b tagging algorithm between data and simulation.
In the control sample, the M ℓb variable is constructed using the selected lepton and the jet in the event with the highest value of the b-tag discriminator. The M ℓb distribution is validated in a control sample enriched in W+jets events, obtained by selecting events with 1 or 2 jets, 60 < M T < 120 GeV, E miss T > 250 GeV, and either 0 or ≥1 jet passing the medium WP of the b tagging algorithm. Figure 3(a) shows the M ℓb distribution in both data and simulation for the control samples with 0 and ≥1 b-tagged jets. The bottom panel shows the good agreement between data and simulation in the extrapolation factor from the 0 b-tagged jets sample to the sample with ≥1 b-tagged jets.
The largest uncertainty in the transfer factor comes from the limited event counts of the simulated samples, followed by the uncertainty on the heavy-flavor fraction of jets in W+jets events. A comparison of the multiplicity of b-tagged jets between data and simulation is performed in a W+jets enriched region obtained with the same selection as for the M ℓb distribution, as shown in Fig. 3(b). The difference between data and simulation is covered by a 50% uncertainty on the heavy-flavor component of W+jets events, and is indicated by the shaded band in the figure.
Variations of the jet energy scale and b tagging efficiency within their measured uncertainties each result in a 10% uncertainty in the background estimate. The total uncertainty in the estimate of the W+jets background varies from 20 to 80%, depending on signal region.
Simulation studies indicate that in all signal regions the contribution from tt → 1ℓ events is expected to be smaller than 10% of the total background. This estimate is sensitive to the correct modeling of the E miss T resolution, since this affects the M T tail. The modeling of the E miss T resolution is studied using data and simulated samples of γ+jets events. The photon p T spectrum is reweighted to match that of the neutrino in simulated tt → 1ℓ events after first reweighting the photon p T spectrum in simulation to match that observed in data. We then add the photon p T to the E miss T , and compare the resulting spectra. Differences of up to 40% in the E miss T shape between data and simulated events are observed, as shown in Fig. 3(c) for a selection with at least 2 jets. Corrections for these differences are applied to the tt → 1ℓ simulation and a resulting 100% uncertainty is assigned to the estimate of this background.

Background from events containing Z → νν decays
The third and last category of background arises from ttZ, WZ, and other rare multiboson processes, all with a leptonically decaying W boson and one or more Z bosons decaying to neutrinos. Within this category, the contribution from WZ events is dominant in the low-N J bins, whereas in events with higher N J , 60-80% of this background is due to ttZ processes.
The background from these processes is estimated from simulation with normalization obtained from a data control sample containing three leptons. For this sample, two leptons must form an opposite-charge, same-flavor pair having an invariant mass between 76 and 106 GeV. The normalization of the WZ and ttZ processes is determined by performing a template fit to the distribution of the number of b-tagged jets in this sample. The result of this fit yields scale factors of 1.21 ± 0.11 and 1.14 ± 0.30 to be applied to the simulated samples of WZ and ttZ events, respectively.
We also assess all relevant theoretical and experimental uncertainties that can affect the shapes of the kinematic distributions of our signal region definitions by recomputing acceptances after modifying the various kinematical quantities and reconstruction efficiencies within their respective uncertainties. The experimental uncertainties are obtained by variations of the simulation correction factors within their measured uncertainties. The largest contributions are due to the uncertainties in the jet energy scale and to the choices of the renormalization and factorization scales used in the MC generation of SM samples. The latter is obtained by varying the scales by a factor of 2. Other uncertainties are due to the lepton and b tagging efficiencies, the modeling of additional jets in the parton shower, pileup, the value of the strong coupling constant α S , and the PDF sets. The uncertainty on the PDF sets is evaluated by using replicas of the NNPDF3.0 set [25].
The total uncertainty in the Z → νν background is 17-78%, depending on the search region.

Results and interpretation
The event yields in data in the 31 search regions defined in Tables 2 and 3 are statistically compatible with the estimated backgrounds from SM processes. They are summarized in Table 5 and Fig. 4 and are interpreted in the context of the simplified models of top squark pair production described in Section 1. Further information on the experimental results to facilitate reinterpretations for beyond-the-SM models not considered here is given in Appendix A. For a given model, limits on the production cross section are derived as a function of the masses of the SUSY particles by combining search regions using a modified frequentist approach, employing the CL s criterion in an asymptotic formulation [44][45][46][47]. These limits are turned into exclusion regions in the m(t̃)-m(χ̃ 0 1 ) plane using the calculation of the cross section from Ref. [48] and are shown in Figs. 5, 6, and 7. Limits are obtained by combining the 27 regions from the standard selection defined in Table 2, except for the model of Fig. 1(a) in the compressed region 100 ≤ ∆m(t̃, χ̃ 0 1 ) ≤ 225 GeV, where we use the four compressed search regions listed in Table 3. This approach improves the expected cross section upper limit in the compressed mass region by ∼15-30%. When computing the limits, the expected signal yields are corrected for possible contamination of SUSY events in the data control regions. These corrections are typically around 5-10%.
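To illustrate the idea behind the CL s criterion, the sketch below evaluates it for a single counting experiment with known background and no systematic uncertainties, using the observed event count as the test statistic. This is a deliberately simplified stand-in: the analysis combines many search regions with an asymptotic formulation, which this example does not attempt to reproduce. All function names and numbers are hypothetical.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def cls_value(n_obs, background, signal):
    """CL_s = CL_{s+b} / CL_b for a single-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b

def excluded_at_95(n_obs, background, signal):
    """A signal hypothesis is excluded at 95% CL when CL_s < 0.05."""
    return cls_value(n_obs, background, signal) < 0.05

# Hypothetical example: 0 observed events, 1 expected background event.
print(cls_value(0, 1.0, 3.0), excluded_at_95(0, 1.0, 3.0))
```

Dividing by CL b protects against excluding signals to which the search has little sensitivity when the data fluctuate below the background expectation.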
A summary of the uncertainties in the signal efficiency is shown in Table 6. They are evaluated in the same manner as done in the background estimation methods, described in Section 6. The largest uncertainties are due to the limited size of the simulated signal samples, the b tagging efficiency, and the jet energy scale. For model points with a small mass splitting, the ISR uncertainty described in Section 3 is also significant. Since new physics signals are simulated using the CMS fast simulation program, additional uncertainties are assigned to the correction of the lepton and b tagging efficiencies, as well as to cover differences in E miss T resolution between the fast simulation and the full GEANT4-based model of the CMS detector. The latter uncertainty is small in the bulk of the model space, but may reach up to 25% in scenarios with a compressed mass spectrum. Uncertainties due to the integrated luminosity, ISR modeling, E miss T resolution, and b tagging and lepton efficiencies are treated as fully correlated across search regions.
Figure 5 shows the 95% confidence level (CL) upper limit on pp → t̃ t̃ → t χ̃ 0 1 t χ̃ 0 1 , assuming unpolarized top quarks in the decay chain, together with the upper limit at 95% CL on the signal cross section. We exclude top squark masses up to 1120 GeV for a massless LSP and LSP masses up to 515 GeV for a 950 GeV top squark mass. The white band corresponds to the region |m t̃ − m t − m χ̃ 0 1 | < 25 GeV, m t̃ < 275 GeV, where the selection efficiency of top squark events changes rapidly and becomes very sensitive to details of the model and the simulation. No cross section limit is established in that region.
Figure 6 shows the 95% CL upper limit for pp → t̃ t̃ → bb χ̃ ± 1 χ̃ ± 1 , χ̃ ± 1 → W χ̃ 0 1 , together with the upper limit at 95% CL on the excluded signal cross section. The mass of the chargino is chosen to be (m t̃ + m χ̃ 0 1 )/2. We exclude top squark masses up to 1000 GeV for a massless LSP and LSP masses up to 450 GeV for an 800 GeV top squark mass.
Figure 7 shows the 95% CL upper limit for pp → t̃ t̃ → tb χ̃ ± 1 χ̃ 0 1 , χ̃ ± 1 → W * χ̃ 0 1 , together with the upper limit at 95% CL on the excluded signal cross section. The mass splitting of the chargino and neutralino is fixed to 5 GeV. We exclude top squark masses up to 980 GeV for a massless LSP and LSP masses up to 400 GeV for an 825 GeV top squark mass.

Summary
We have reported on a search for top squark pair production in pp collisions at √ s = 13 TeV in events with a single isolated electron or muon, jets, and large missing transverse momentum using data collected with the CMS detector during the 2016 run of the LHC, corresponding to an integrated luminosity of 35.9 fb −1 . The event yields in data are consistent with the expectations from SM processes. The results are interpreted as exclusion limits in the context of supersymmetric models with pair production of top squarks that decay either to a top quark and a neutralino or to a bottom quark and a chargino. Assuming both top squarks decay to a top quark and a neutralino, we exclude at 95% CL top squark masses up to 1120 GeV for a massless neutralino and neutralino masses up to 515 GeV for a 950 GeV top squark mass. For a scenario where both top squarks decay to a bottom quark and a chargino, with the chargino mass equal to the average of the masses of the neutralino and top squark, we exclude at 95% CL top squark masses up to 1000 GeV for a massless neutralino and neutralino masses up to 450 GeV for an 800 GeV top squark mass. For the mixed decay scenario, with the mass splitting between the chargino and neutralino fixed to be 5 GeV, we exclude at 95% CL top squark masses up to 980 GeV for a massless neutralino and neutralino masses up to 400 GeV for an 825 GeV top squark mass.
Figure 6: Exclusion limits at 95% CL for pp → t̃ t̃ → bb χ̃ ± 1 χ̃ ± 1 , χ̃ ± 1 → W χ̃ 0 1 . The mass of the chargino is chosen to be (m t̃ + m χ̃ 0 1 )/2. The interpretation is done in the two-dimensional space of m t̃ vs. m χ̃ 0 1 . The color indicates the 95% CL upper limit on the cross section times branching fraction at each point in the m t̃ vs. m χ̃ 0 1 plane. The area between the thick black curves represents the observed exclusion region at 95% CL assuming 100% branching fraction, while the dashed red lines indicate the expected limits at 95% CL and their ±1σ experimental standard deviation uncertainties. The thin black lines show the effect of the theoretical uncertainties (σ theory ) in the signal cross section.
Figure 7: Exclusion limits at 95% CL for pp → t̃ t̃ → tb χ̃ ± 1 χ̃ 0 1 , χ̃ ± 1 → W * χ̃ 0 1 . The mass splitting of the chargino and neutralino is fixed to 5 GeV. The interpretation is done in the two-dimensional space of m t̃ vs. m χ̃ 0 1 . The color indicates the 95% CL upper limit on the cross section at each point in the m t̃ vs. m χ̃ 0 1 plane. The area between the thick black curves represents the observed exclusion region at 95% CL, while the dashed red lines indicate the expected limits at 95% CL and their ±1σ experimental standard deviation uncertainties. The thin black lines show the effect of the theoretical uncertainties (σ theory ) in the signal cross section.

The yields and background predictions of this search can be used to confront scenarios for physics beyond the standard model (BSM) not considered in this paper. To facilitate such reinterpretations, in Table 7 we provide results for a small number of inclusive aggregated signal regions. The background expectation, the event count, and the expected BSM yield in any one of these regions can be used to constrain BSM hypotheses in a simple way. In addition, we provide the correlation matrix for the background predictions in the full set of search regions (Figs. 8 and 9). This information can be used to exploit the full power of the analysis by constructing a simplified likelihood for a BSM model as described in Ref. [49].
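A reader using the correlation matrix of the background predictions will typically need the corresponding covariance matrix, obtained by scaling each correlation coefficient by the uncertainties of the two regions involved. The sketch below shows this conversion; the function name and the numerical inputs are hypothetical, and how the covariance enters a likelihood is left to the prescription of Ref. [49].

```python
def covariance_from_correlation(correlation, uncertainties):
    """Build cov[i][j] = rho[i][j] * sigma[i] * sigma[j].

    correlation   -- square matrix (list of lists) of correlation coefficients
    uncertainties -- per-region total background uncertainties (same ordering)
    """
    n = len(uncertainties)
    if len(correlation) != n or any(len(row) != n for row in correlation):
        raise ValueError("correlation matrix must be n x n")
    return [
        [correlation[i][j] * uncertainties[i] * uncertainties[j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical two-region example, for illustration only.
cov = covariance_from_correlation([[1.0, 0.5], [0.5, 1.0]], [2.0, 4.0])
print(cov)
```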