Search for direct pair production of scalar top quarks in the single- and dilepton channels in proton-proton collisions at √s = 8 TeV

Abstract: Results are reported from a search for the top squark t 1 , the lighter of the two supersymmetric partners of the top quark. The data sample corresponds to 19.7 fb −1 of proton-proton collisions at √ s = 8 TeV collected with the CMS detector at the LHC. The search targets the t 1 → b χ ± 1 and t 1 → t ( * ) χ 0 1 decay modes, where χ ± 1 and χ 0 1 are the lightest chargino and neutralino, respectively. The reconstructed final state consists of jets, b jets, missing transverse energy, and either one or two leptons. Leading backgrounds are determined from data. No significant excess in data is observed above the expectation from standard model processes. The results exclude, at 95% CL, a region of the two-dimensional (m( t 1 ), m( χ 0 1 )) plane of possible t 1 and χ 0 1 masses. The results of the single-lepton and dilepton searches are combined for maximal sensitivity; the sensitivity depends on the decay mode and on the (m( t 1 ), m( χ 0 1 )) signal point. The highest excluded t 1 and χ 0 1 masses are about 700 GeV and 250 GeV, respectively.

The CMS collaboration

Introduction
Theories of supersymmetry (SUSY) predict the existence of a scalar partner for each standard model (SM) left-handed and right-handed fermion. When the symmetry is broken, the scalar partners acquire a mass different from their SM counterparts, the mass splitting between scalar mass eigenstates being dependent on the mass of the SM fermion. Because of the large mass of the top quark, the splitting between its chiral supersymmetric partners is potentially the largest among all supersymmetric quarks (squarks). As a result the lighter supersymmetric scalar partner of the top quark, the top squark ( t 1 ), could be the lightest squark. The search for a low mass top squark is of particular interest following the discovery of a Higgs boson [1][2][3], as a top squark with a mass in the TeV range would contribute substantially to the cancellation of the divergent loop corrections to the Higgs boson mass. SUSY scenarios with a neutralino ( χ 0 1 ) as lightest supersymmetric particle (LSP) and a nearly degenerate-mass t 1 provide one theoretically possible way to produce the observed relic abundance of dark matter [4,5]; this further motivates the search for the t 1 at the LHC.
In this paper we report two searches for direct top squark pair production with the CMS detector at √ s = 8 TeV, a single-lepton search and a dilepton search, with integrated luminosities of 19.5 fb −1 and 19.7 fb −1 , respectively. Each search is based on the two decay modes shown in figure 1. The decay modes and the nomenclature we will use to refer to them are as follows:
pp → t 1 t 1 → t ( * ) t ( * ) χ 0 1 χ 0 1 → bbW + W − χ 0 1 χ 0 1 (the "tt" decay mode);
pp → t 1 t 1 → bb χ + 1 χ − 1 → bbW +( * ) W −( * ) χ 0 1 χ 0 1 (the "bbWW" decay mode).
The tt and bbWW events both contain bottom quark jets (henceforth called b jets) and may contain charged leptons and neutrinos from W ( * ) decay. The search strategies are therefore tailored to require either one lepton or two leptons, as well as at least one b jet and a minimum amount of transverse momentum imbalance. Throughout this paper the term "lepton" refers only to e ± and µ ± . Previous searches for low mass top squarks in leptonic final states have been conducted by the D0, CDF, CMS, and ATLAS collaborations [6][7][8][9][10][11][12]. As shown in table 1, we categorize the decays of the t 1 as 2-body or 3-body processes, according to the masses of the particles involved. In all cases we take the lightest neutralino χ 0 1 to be the LSP. For each decay mode we fix the corresponding t 1 branching fraction to unity; the search is in all other respects designed to be as independent as possible of the details of any specific SUSY model. We explore a range of signal mass points for each decay mode under consideration. In the decay mode tt, the unknown masses are those of the t 1 and the χ 0 1 , while in the case of bbWW, a third unknown is the mass of the lightest chargino ( χ ± 1 ). In the latter case we consider three possible mass assignments, labeled by the parameter x = 0.25, 0.50, 0.75; x is defined by

\[ m(\tilde{\chi}^{\pm}_{1}) = m(\tilde{\chi}^{0}_{1}) + x \left[ m(\tilde{t}_{1}) - m(\tilde{\chi}^{0}_{1}) \right]. \tag{1.1} \]
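As a small worked illustration of eq. (1.1) and of the kinematic categories of table 1, the sketch below computes the chargino mass for the three x values and classifies a tt-mode mass point as a 2-body or 3-body decay. The function names and the SM mass values are assumptions chosen for illustration only; they are not quoted in this paper.

```python
# Approximate SM masses in GeV. These are illustrative assumptions,
# not values taken from the paper.
M_TOP, M_W, M_B = 173.0, 80.4, 4.8


def chargino_mass(m_stop, m_lsp, x):
    """Chargino mass from eq. (1.1): m_lsp + x * (m_stop - m_lsp)."""
    return m_lsp + x * (m_stop - m_lsp)


def tt_decay_type(m_stop, m_lsp):
    """Classify a tt-mode mass point following the conditions of table 1."""
    if m_stop >= M_TOP + m_lsp:
        return "2-body"   # on-shell top quark
    if m_stop >= M_B + M_W + m_lsp:
        return "3-body"   # off-shell top quark, on-shell W boson
    return None           # outside the region covered by table 1


if __name__ == "__main__":
    for x in (0.25, 0.50, 0.75):
        print(x, chargino_mass(400.0, 100.0, x))
    print(tt_decay_type(300.0, 100.0), tt_decay_type(250.0, 100.0))
```

For a hypothetical point with m( t 1 ) = 400 GeV and m( χ 0 1 ) = 100 GeV, the three x values place the chargino one quarter, one half, and three quarters of the way between the LSP and top squark masses.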

JHEP07(2016)027
Table 1. Kinematic conditions for the t 1 decay modes explored in this paper.
Kinematic conditions | Type of decay | Decay mode
m(b) + m(W) + m( χ 0 1 ) ≤ m( t 1 ) and m( t 1 ) < m(t) + m( χ 0 1 ) | 3-body decays (tt) | t 1 (→ t * χ 0 1 ) → bW χ 0 1
m(t) + m( χ 0 1 ) ≤ m( t 1 ) | 2-body decays (tt) | t 1 → t χ 0 1 , with t → bW
m(b) + m(W) + m( χ 0 1 ) ≤ m( t 1 ) and m( χ 0 1 ) < m( χ ± 1 ) < m( t 1 ) − m(b) | 2-body decays (bbWW) | t 1 → b χ ± 1 , with χ ± 1 → W ( * ) χ 0 1

Because the SM background dominates the signal by several orders of magnitude and often has similar distributions for individual discriminating variables, a multivariate approach has been developed to exploit differences in the correlations among discriminating variables for signal and SM background. The background determination method has also been improved compared to ref. [12] in order to better control and correct the tail of the key transverse mass distribution. In addition to the single-lepton search, we also report on a search in the dilepton mode, where the key discriminating variable is an M T2 variable [13]. The final result is based on a combination of the single-lepton and dilepton searches.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid that provides an axial magnetic field of 3.8 T for charged-particle tracking. Trajectories of charged particles are measured by a silicon pixel and strip tracker, covering 0 < φ < 2π in azimuth and |η| < 2.5, where the pseudorapidity η is defined as η = − ln[tan(θ/2)]; θ is the polar angle of the trajectory of the particle with respect to the counterclockwise beam direction. A crystal electromagnetic calorimeter and a brass/scintillator hadronic calorimeter surround the tracking detectors. The calorimeter measures the energy and direction of electrons, photons, and hadronic jets. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. The detector is nearly hermetic, allowing for momentum balance measurements in the plane transverse to the beam axis. Events are selected online by a two-level trigger system [14]. A more detailed description of the CMS detector can be found in ref. [15].

Samples and trigger requirements
Events used for this search are selected initially by single-lepton and dilepton triggers. For the single-electron final state, the online selection requires the electron be isolated and have transverse momentum p T > 27 GeV; in subsequent offline analysis the reconstructed electron p T is required to exceed 30 GeV. For the single-muon final state, two triggers are used, which both require |η(µ)| < 2.1: a purely leptonic trigger requiring an isolated muon with p T > 24 GeV; and an additional mixed trigger requiring an isolated muon of p T > 17 GeV together with three jets, each having p T > 30 GeV. The first trigger suffices for muons whose offline reconstructed p T exceeds 26 GeV, while the second trigger allows the analysis to use muons with reconstructed p T as low as 20 GeV; the additional jets are required in the analysis in any case. The dilepton triggers require either ee, µµ, or eµ pairs. In each case, one lepton must satisfy p T > 17 GeV and the other lepton must satisfy p T > 8 GeV. Trigger efficiencies are measured in data and applied to simulated events. The integrated luminosity, after data quality requirements, is 19.5 ± 0.5 fb −1 for the single-lepton states and 19.7 ± 0.5 fb −1 for the dilepton final states [16].
To ensure agreement with data, simulated events are weighted so that the distribution of the number of proton-proton interactions per beam crossing agrees with that seen in data; they are additionally weighted by the trigger efficiency and the lepton identification and isolation efficiencies. For simulated tt samples, a p T -dependent weight is applied to match the shape of the dσ(pp → tt + X)/dp T distribution observed in data. Signal events are weighted to account for the effect of initial state radiation [12].

Object reconstruction
In this search, all particle candidates are reconstructed with the particle-flow (PF) algorithm [31,32], and additional criteria are then applied to select electrons, muons, jets, and b jets; the criteria are applied to both collision data and simulation samples.

The identification and measurement of the p T of muons use information provided by the silicon detector and the muon system [33]. We require the muon to have a 'tight' identification [33] with pseudorapidity |η| < 2.1 and |η| < 2.4 for the single-lepton and dilepton searches, respectively. The identification and energy measurement of the electrons use information provided by the tracker and the electromagnetic calorimeter. Electron candidates are reconstructed in the tracker with the Gaussian-sum filter algorithm [34]. We require the electron to have a 'medium' identification [34] with pseudorapidity |η| < 1.44 and |η| < 2.5 for the single-lepton and dilepton searches, respectively. Both muon and electron identification demand that the lepton be isolated from the hadronic components of the event. We define an isolation variable for the leptons based on a scalar sum of transverse momenta, Σ | p T |, where the sum is taken over all PF candidates within a cone about the lepton of ∆R ≡ √((∆φ)² + (∆η)²) = 0.3, excluding the transverse momentum of the lepton itself, p T (ℓ). In the single-lepton search we impose an upper limit on the absolute isolation, Σ | p T | < 5 GeV; for both searches we impose an upper limit on the relative isolation, Σ | p T | / p T (ℓ) < 0.15.
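The isolation requirement can be sketched as follows. This is a minimal illustration, assuming that particle candidates are represented as dictionaries with hypothetical keys `pt`, `eta`, and `phi`; it ignores the pileup correction to the isolation sum and is not the collaboration's reconstruction code.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def isolation_sum(lepton, candidates, cone=0.3):
    """Scalar pT sum of all candidates inside the cone, excluding the lepton."""
    return sum(
        c["pt"]
        for c in candidates
        if c is not lepton
        and delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )


def passes_isolation(lepton, candidates, single_lepton_search):
    """Relative isolation < 0.15; absolute isolation < 5 GeV (single-lepton only)."""
    iso = isolation_sum(lepton, candidates)
    if iso / lepton["pt"] >= 0.15:
        return False
    if single_lepton_search and iso >= 5.0:
        return False
    return True


if __name__ == "__main__":
    lep = {"pt": 40.0, "eta": 0.0, "phi": 0.0}
    cands = [
        lep,
        {"pt": 1.5, "eta": 0.1, "phi": 0.1},   # inside the cone
        {"pt": 2.5, "eta": 0.0, "phi": 6.2},   # inside after phi wrap-around
        {"pt": 50.0, "eta": 1.0, "phi": 1.0},  # outside the cone
    ]
    print(isolation_sum(lep, cands), passes_isolation(lep, cands, True))
```

Wrapping ∆φ matters for candidates close to the φ = 0 / 2π boundary, which would otherwise be wrongly placed outside the cone.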
Jets are constructed by clustering all the PF candidates with the anti-k T jet clustering algorithm [35], using a distance parameter R = 0.5. Contamination from additional pp interactions (pileup) is mitigated by discarding charged PF candidates that are incompatible with having originated from the estimated proton-proton collision point. The average pileup energy associated with neutral hadrons is computed event-by-event and subtracted from the jet energy and from the energy used when computing lepton isolation, i.e., a measure of the activity around the lepton. The energy subtracted is the average pileup energy per unit area (in ∆η × ∆φ) times the jet or isolation cone area [36,37]. Candidate jets must be separated from selected leptons by ∆R > 0.4. Relative and absolute jet energy corrections are applied to the raw jet momenta to establish a uniform jet response in |η| and a calibrated response in jet p T . We require the jets pass p T > 30 GeV and |η| < 2.4. To tag jets originating from the hadronization of b quarks, we utilize the combined secondary vertex algorithm at its 'medium' operating point [38] with a corresponding efficiency for b jets of 65% and a mistag rate for light jets of 1%. Scale factors are applied to simulation samples to reproduce the efficiencies measured in the data.
As the decays of t 1 are expected to yield neutralinos and neutrinos in their decay chain, genuine missing transverse momentum is expected in the final state of signal events. We define the missing transverse momentum as the negative vector sum of the transverse momenta of all PF candidates, p miss T = −Σ p T .

The figure shows clearly the evolution of the kinematic distributions as the mass difference between the lightest top squark and the LSP is varied. Differences in kinematic distributions may also be seen when comparing the tt and bbWW signal decay modes, and when varying the choice of x (x = 0.25, 0.50, 0.75) in the bbWW decay mode. In figure 4 we show distributions of some discriminating variables at the preselection level (but without the restriction on M T ) for both e and µ final states in data and simulated events. The figure shows good agreement between data and the total simulated background, within the uncertainties of the simulated events, which include the statistical uncertainty in the simulation samples added in quadrature to the systematic uncertainty in the jet energy scale (JES).

As expected from the distributions shown in figure 3, different selection variables exhibit different degrees of discriminating power, depending on the decay mode (tt or bbWW) and the relevant mass parameters (∆m or x) of the signal. To find the most discriminating variables, we test different sets of candidate BDT input variables, maximizing a figure of merit that compares the expected signal yield to the quadratic sum of the statistical and systematic uncertainties in the expected background yield. To keep the selection tool simple, a new variable is incorporated into the set of input variables only if it leads to a substantial increase in the figure of merit. The training of the BDT, together with this procedure for selecting variables, is then carried out separately for the different decay modes tt and bbWW (x = 0.25, 0.50, 0.75), and for ∆m regions defined as: ∆m = (150±25), (250±25), (350±25), (450±25), (550±25), and (650±25) GeV. This partitioning allows us to take into account the evolution of the signal kinematics across the (m( t 1 ), m( χ 0 1 )) plane. The different BDT trainings are numbered from 1 to 6 to reflect the ∆m regions in which they are trained.
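The variable-selection procedure can be sketched as a greedy forward search. The code below is an illustration under stated assumptions: the statistical uncertainty in the background is taken as √B, the systematic uncertainty as a fixed relative fraction of B, and "substantial increase" is modeled by a hypothetical `min_gain` threshold. The `evaluate` callback, which in practice would train a BDT on the trial variable set and return its figure of merit, is left to the user.

```python
import math


def figure_of_merit(signal, background, rel_syst):
    """Signal yield over the quadratic sum of stat. and syst. background errors.

    Assumes a Poisson statistical uncertainty sqrt(B) and a systematic
    uncertainty rel_syst * B; both are illustrative choices.
    """
    stat = math.sqrt(background)
    syst = rel_syst * background
    return signal / math.sqrt(stat ** 2 + syst ** 2)


def select_variables(candidates, evaluate, min_gain=0.05):
    """Greedy forward selection of input variables.

    `evaluate(variables)` must return the figure of merit obtained with the
    given list of variables. A variable is kept only if it raises the best
    figure of merit by more than the relative `min_gain`.
    """
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        trial = {v: evaluate(selected + [v]) for v in remaining}
        winner = max(trial, key=trial.get)
        if trial[winner] <= best * (1.0 + min_gain):
            break
        selected.append(winner)
        best = trial[winner]
        remaining.remove(winner)
    return selected, best


if __name__ == "__main__":
    # Toy stand-in for a BDT training: each variable adds a fixed gain.
    gains = {"a": 1.0, "b": 0.5, "c": 0.01}
    print(select_variables(gains, lambda vs: sum(gains[v] for v in vs)))
    print(figure_of_merit(10.0, 100.0, 0.1))
```

In the toy example the third variable is rejected because its gain is below the threshold, which mirrors the stated aim of keeping the input set as small as possible.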
The final sets of variables retained as input to the BDT are reported in table 2. Having been chosen with a quantitative assessment of the discriminating power of each variable, these represent the smallest sets of input variables that remain effective, for each decay mode and kinematic region. This represents a new feature of this search compared to ref. [12], where the BDT was trained with the same set of variables across different kinematic regions. Once the input variables to the BDT are determined, different BDTs are trained in each of the benchmark kinematic regions to build selection tools adapted to a kinematically varying signal. The simulation samples used for finding the best set of variables and training the BDT are statistically independent. This procedure is done for the tt and bbWW (different x values) decay modes. Using a more systematic approach for the definition of signal regions (SRs) than in ref. [12], we first consider which training is the best performing one in the (m( t 1 ), m( χ 0 1 )) plane. We observe that some BDTs are the best over a very limited part of the (m( t 1 ), m( χ 0 1 )) plane, so to simplify the final selection we retain BDT trainings that are observed to be the best performing over a large portion of the mass plane. The resulting SRs, defined as the chosen BDT training in the (m( t 1 ), m( χ 0 1 )) plane, are shown for all considered decay modes in figure 5. With these SRs determined, the final selection is made by applying a minimum threshold to each BDT output as shown in figure 6. The thresholds are determined by minimizing the expected upper limit cross section (σ exp 95 ) obtained from events remaining above the threshold, taking into account the predicted background (section 4.2).

Table 2. Final selection variables chosen as input for the BDT training, as functions of the decay modes bbWW and tt, and kinematic regions. Column headings ∆R and ∆φ refer to ∆R(ℓ, b 1 ) and ∆φ(j 1,2 , p miss T ).
The final BDT trainings and selections are reported in figure 5 for all decay modes; within some SRs, the same BDT training is used with different threshold values, thus leading to different selections. On average the BDT selection suppresses the SM background by a factor of ∼10³ while reducing the signal only by a factor of ∼10; the performance improves monotonically with increasing ∆m.

Background estimation
The SM background processes in the single-lepton search can be divided into four categories. At preselection, the dominant contribution (∼66% of the total) is tt production with one lepton; we include single top quark production in this category and call the combination the "tt → 1ℓ" component. The second most significant background (23%) comes from tt events with two leptons, where one lepton escapes detection; we will call this the "tt → ℓℓ" component. The third background (7%) is the production of a W boson in association with jets, which we will denote "W+jets". Other backgrounds are labeled as "rare". We use data to estimate the event yields of the first three categories, starting with distributions obtained from simulation, and normalizing these with scale factors (SF) determined in control regions. The background is estimated using the formulae:

\[
\begin{aligned}
N_{\text{tail}}(t\bar{t}\to\ell\ell) &= SF_{\ell\ell}\; N^{\text{MC}}_{\text{tail}}(t\bar{t}\to\ell\ell),\\
N_{\text{tail}}(t\bar{t}\to 1\ell) &= SF_{0}\; SFR_{1\ell}\; N^{\text{MC}}_{\text{tail}}(t\bar{t}\to 1\ell),\\
N_{\text{tail}}(W{+}\text{jets}) &= SF_{0}\; SFR_{W}\; N^{\text{MC}}_{\text{tail}}(W{+}\text{jets}),\\
N_{\text{tail}}(\text{rare}) &= N^{\text{MC}}_{\text{tail}}(\text{rare}).
\end{aligned}
\]

The subscript "tail" refers to the region M T > 100 GeV. The simulation yields at the final selection level (N MC tail ) are corrected by normalization scale factors SF ℓℓ and SF 0 (defined in eqs. (4.5) and (4.6)), determined in the M T peak region 50 < M T < 80 GeV. The additional scale factor ratios, denoted SFR 1ℓ and SFR W , are used to correct the tail of the M T distribution, and are determined using a control region with zero b jets. The procedure accounts for the possibility of signal contamination in the different control regions. At the final selection level the tt → ℓℓ process represents an approximately constant proportion of the total background at ∼60%, while the tt → 1ℓ and W+jets processes have varying proportions across the different selections within the remaining ∼40%. Signal contamination is important only at low ∆m, where it alters the background determination by up to 25%.

Normalization in the M T peak
The scale factors SF ℓℓ and SF 0 are estimated to correct the normalization in the M T peak region, after the final selection on the output of the BDT. To calculate SF 0 we further require the second lepton veto, while SF ℓℓ is obtained without this veto. SF ℓℓ fixes the tt → ℓℓ background normalization, while SF 0 sets the tt → 1ℓ and W+jets background normalizations. The scale factors are computed according to eqs. (4.5) and (4.6), in which the inclusion of the N MC (signal) term accounts for possible signal contamination. At preselection we have: SF ℓℓ = (1.06 ± 0.01) and SF 0 = (1.05 ± 0.01). At the final selection level, the deviation of these scale factors from unity is always within 10%.
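To make the role of the signal term concrete, the following sketch computes the two peak-region scale factors from event counts. The explicit expressions are an assumption, since the defining equations are not reproduced in this text: the simulated rare and signal yields are subtracted from the data, and the remainder is divided by the simulated yield of the backgrounds being normalized. All dictionary keys and function names are hypothetical.

```python
def normalization_scale_factors(pre_veto, post_veto):
    """Return (sf_ll, sf_0) from yields in the M_T peak region.

    `pre_veto` and `post_veto` are dicts of event counts before and after the
    second-lepton veto, with keys "data", "rare", "signal", "tt_1l", "tt_ll",
    and "wjets". All keys except "data" are simulated yields.
    Assumed form (not taken verbatim from the paper):
      sf_ll = (data - rare - signal) / (tt_1l + tt_ll + wjets)        [pre-veto]
      sf_0  = (data - rare - signal - sf_ll * tt_ll) / (tt_1l + wjets) [post-veto]
    """
    sf_ll = (pre_veto["data"] - pre_veto["rare"] - pre_veto["signal"]) / (
        pre_veto["tt_1l"] + pre_veto["tt_ll"] + pre_veto["wjets"]
    )
    sf_0 = (
        post_veto["data"]
        - post_veto["rare"]
        - post_veto["signal"]
        - sf_ll * post_veto["tt_ll"]
    ) / (post_veto["tt_1l"] + post_veto["wjets"])
    return sf_ll, sf_0


if __name__ == "__main__":
    # Hypothetical yields, for illustration only.
    pre = {"data": 990.0, "rare": 20.0, "signal": 10.0,
           "tt_1l": 600.0, "tt_ll": 250.0, "wjets": 80.0}
    post = {"data": 900.0, "rare": 18.0, "signal": 5.0,
            "tt_1l": 580.0, "tt_ll": 200.0, "wjets": 75.0}
    print(normalization_scale_factors(pre, post))
```

Under this assumed form, a larger hypothesized signal yield in the peak region lowers both scale factors, which is how signal contamination propagates into the background prediction at each signal mass point.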

Correction for the tail in the M T distribution
To study the tail of the M T distribution for different backgrounds, we enrich the sample in W+jets events by inverting the b-tagging criterion of the preselection. The left plot of figure 7 compares the data with background simulation, and shows some disagreement between the two for M T > 100 GeV. To correct this, we follow an approach based on template fits, which allows us to extract different correction factors for the tt → 1ℓ and the W+jets backgrounds, rather than assuming them to be equal as in ref. [12]. The template fit is performed using the invariant mass of the lepton and the jet with the highest b-tag discriminator. This variable, M ℓb , is well modeled by the background simulation (see figure 8, left) and exhibits discriminating power between W+jets and tt → 1ℓ (figure 8, right). The contributions of the tt → ℓℓ background, the rare backgrounds, and the signal are taken from simulation, and their normalizations are constrained within a 20% uncertainty during the template fit. The normalizations of the tt → 1ℓ and W+jets backgrounds are free parameters expressed in terms of scale factors SF. The fit is performed in a control region with zero b-tagged jets, in two separate regions of the M T distribution: the peak, defined by 50 < M T < 80 GeV, and the tail, defined by M T > 100 GeV. We then extract the normalization-independent ratios SFR = SF tail /SF peak for tt → 1ℓ and for W+jets. Without any BDT signal selection and for a case of negligible signal contamination, the fit yields: SFR 1ℓ = (1.04 ± 0.16) and SFR W = (1.33 ± 0.10). The right plot of figure 7 confirms the effectiveness of this correction.
Due to the low yields after the final selections, we loosen the requirements on the output of the BDT to keep 25% of the total yield when we extract the SFR values. The SFR ratios obtained for the different signal regions within a given decay mode (tt or bbWW) agree well with each other. We therefore set the final SFR factor for each decay mode to the average over the signal regions for that mode. The resulting SFR values for the tt and bbWW decay modes differ from one another, and also vary across the (m( t 1 ), m( χ 0 1 )) mass plane: SFR 1ℓ increases from 1.0 to 1.4 with increasing top squark mass, while SFR W is stable around a mean value of ∼1.2 everywhere. In addition to the extraction of tail correction factors, we check in the control region with zero b jets that the distributions of all input variables in data are well described by the predicted background.
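As a worked illustration of how the normalization scale factors and the tail correction ratios combine, the sketch below scales simulated tail yields component by component: the dilepton tt component by the dilepton scale factor, the single-lepton tt and W+jets components by the post-veto scale factor times their tail ratio, and the rare component taken directly from simulation. The simulated yields are hypothetical; the scale factors are the preselection-level values quoted in the text.

```python
def predict_tail(mc_tail, sf_ll, sf_0, sfr_1l, sfr_w):
    """Data-corrected background prediction in the M_T tail (M_T > 100 GeV).

    `mc_tail` maps "tt_ll", "tt_1l", "wjets", and "rare" to simulated yields.
    """
    prediction = {
        "tt_ll": sf_ll * mc_tail["tt_ll"],
        "tt_1l": sf_0 * sfr_1l * mc_tail["tt_1l"],
        "wjets": sf_0 * sfr_w * mc_tail["wjets"],
        "rare": mc_tail["rare"],  # not corrected with data
    }
    total = sum(prediction.values())
    prediction["total"] = total
    return prediction


if __name__ == "__main__":
    # Hypothetical simulated tail yields, for illustration only.
    mc = {"tt_ll": 100.0, "tt_1l": 50.0, "wjets": 20.0, "rare": 10.0}
    print(predict_tail(mc, sf_ll=1.06, sf_0=1.05, sfr_1l=1.04, sfr_w=1.33))
```

With these inputs the W+jets component receives the largest relative correction, reflecting the larger tail ratio extracted for that background.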

Systematic uncertainties
The sensitivity of this search is limited by uncertainties in both the background prediction and the acceptance and efficiency of the signal at the mass points under consideration.

Background
For systematic uncertainties affecting the predicted background:
• We study the impact of limited simulation statistics, generator scale variations, and JES uncertainty in the template fit method in the control region with zero b jets and no BDT selection. This leads to a global absolute uncertainty of 0.6 in SFR 1ℓ and 0.4 in SFR W .
• The quality of the tt → ℓℓ background modeling is checked in two different control regions. The first uses events with exactly two leptons in the final state and a lower jet multiplicity (N (jets) ≥ 2) than that employed in the preselection; the second uses events with exactly one lepton, and an isolated track or τ h candidate. The simulation prediction is compared with data in the M T tail region of these control regions for each BDT selection. The comparison shows overall agreement, and deviations are used to derive a relative systematic uncertainty, ranging from 20 to 80% depending on the selection.
• We check the modeling of the N (jets) distribution in the tt background with a control region defined to have exactly two leptons and no requirement on M T . The data/simulation scale factors are observed to be compatible with unity; therefore, no correction factor is used, but the deviations from unity are taken as systematic uncertainty. This leads to a flat 2% uncertainty, used for all the BDT selections.
• A 6% uncertainty for the modeling of the isolated track veto is applied to the fraction of tt dilepton background events that have a second e/µ or a one-prong τ h decay in the acceptance. A 7% uncertainty for the modeling of the hadronic τ veto is only applied to the fraction of tt dilepton background events that have a τ h in the acceptance.
• The SF ℓℓ and SF 0 normalization factors are varied within their statistical uncertainties and the variations are propagated as systematic uncertainties to the M T peak regions.
• The statistical uncertainties in the simulation background samples are propagated to the systematic uncertainties in the backgrounds.
• The cross sections of the W+jets and rare backgrounds are conservatively varied by 50%, affecting the prediction of other background processes through SF ℓℓ and SF 0 (see equations of section 4.2); the cross section of the tt process is varied by 10%.

Table 3. Summary of the relative systematic uncertainties in the total background, at the preselection level, and the range of variation over the BDT selections.

Signal
The statistical uncertainties in the signal samples are taken into account. The integrated luminosity is known [16] to a precision of 2.6% and the efficiencies of triggers (section 3.1) applied to the signal yield are known with a precision of 3%. The efficiencies for the identification and isolation of leptons are observed to be consistent within 5% for data and simulation; we take this difference as an uncertainty. The b-tagging efficiency has been varied within its uncertainties for b, c, and light flavor jets, leading to final yield uncertainties within 3% for all signal mass points. The systematic uncertainty in signal yield that is associated with the JES [41] is obtained by varying the jet energy scale within its uncertainty; the final uncertainties for all signal mass points are within 10%. Systematic uncertainties in the signal efficiency due to PDFs have been calculated [42][43][44], and are constant at ∼5%. The effect of the systematic uncertainty due to the modeling of ISR jets by the simulation is studied by deriving data/simulation scale factors that depend on N (jets). The maximum size of these uncertainties varies between 8 and 10% for different decay modes.

Summary of the single-lepton search
We develop a ∆m-dependent signal selection tool with BDTs for the tt and bbWW decay modes. For each BDT selection shown in figure 5 we provide in […] the bbWW x = 0.75 decay mode, tt → ℓℓ is the dominant background (45-65%) for BDT1 to BDT3, while rare processes dominate for BDT5 (47-61%). In figure 6 we show the distribution of the BDT output for data and the predicted background (without signal contamination) for two trainings of the bbWW x = 0.75 case. Signal contamination is taken into account by recomputing the background estimate under the signal hypothesis (see eqs. (4.5) and (4.6)); this is done separately at each signal mass point in the (m( t 1 ), m( χ 0 1 )) plane, and for each of the signal regions defined in figure 5. For the calculation of limits (see section 6), the number of observed events in data and the expected signal remain the same, while the expected background is modified to correct for signal contamination in the control regions. While the effect of this contamination is observed to be almost negligible at high ∆m, it can modify the background estimate by up to 25% at low ∆m.

Selection
For the three dilepton final states considered in this search (eµ, ee, and µµ), we define the preselection as follows:
• At least two oppositely charged leptons.
• For the leading and sub-leading lepton, we require p T > 20 and p T > 10 GeV, respectively.
• If more than two lepton pairs are found that satisfy the above three requirements, the pair with the highest p T is chosen.
At the preselection level, tt production with two leptons represents ∼90% of the total expected background.
In this search we separate the signal from the dileptonic tt background by constructing a transverse mass variable M T2 as defined in eq. (5.1). We begin with the two selected leptons ℓ 1 and ℓ 2 . Under the assumption that the p miss T originates only from two neutrinos, we partition the p miss T into two hypothetical neutrinos with transverse momenta p miss T1 and p miss T2 . We calculate the transverse mass M T of the pairings of these hypothetical neutrinos with their respective lepton candidates and record the maximum of these two M T values. This process is repeated with other viable partitions of the p miss T until the minimum of these maximal M T values is reached; this minimum is the M T2 for the event [13,45]:

\[ M_{T2} = \min_{\vec{p}^{\,\text{miss}}_{T1} + \vec{p}^{\,\text{miss}}_{T2} = \vec{p}^{\,\text{miss}}_{T}} \left( \max\left[ M_{T}(\ell_{1}, \vec{p}^{\,\text{miss}}_{T1}),\; M_{T}(\ell_{2}, \vec{p}^{\,\text{miss}}_{T2}) \right] \right). \tag{5.1} \]

When constructed in this fashion, M T2 has the property that its distribution in tt → ℓℓ events has a kinematic endpoint at m(W). The presence of additional invisible particles in signal events breaks the assumption that the p miss T arises from only two neutrinos; consequently, M T2 in dileptonic top squark events does not necessarily have an endpoint at m(W). The value of m(W) therefore dictates the primary demarcation between the control region, M T2 < 80 GeV, and the general signal region, M T2 > 80 GeV. The left plot of figure 9 shows the distribution of M T2 at the preselection level, where we observe its discriminating power for two representative signal mass points. The distribution of M T2 in top squark events, however, depends upon the signal mass point (m( t 1 ), m( χ 0 1 )), as can be observed in the right plot of figure 9.
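The minimization over partitions of the missing transverse momentum can be illustrated numerically. The sketch below is a simple coarse-to-fine grid scan over the transverse momentum assigned to the first hypothetical neutrino, treating leptons and neutrinos as massless. It is an illustrative stand-in rather than the algorithm used in the analysis (refs. [13,45] describe the variable itself), and the grid parameters are arbitrary choices.

```python
import math


def transverse_mass(lep, nu):
    """M_T of a massless lepton and a massless neutrino, each given as (px, py)."""
    lep_pt = math.hypot(lep[0], lep[1])
    nu_pt = math.hypot(nu[0], nu[1])
    mt_sq = 2.0 * (lep_pt * nu_pt - (lep[0] * nu[0] + lep[1] * nu[1]))
    return math.sqrt(max(mt_sq, 0.0))


def mt2(lep1, lep2, pmiss, n_grid=21, n_iter=30):
    """Minimize max(M_T1, M_T2) over all splittings q + (pmiss - q) = pmiss.

    Each iteration scans an n_grid x n_grid square of q values, then halves
    the square and recenters it on the best point found so far.
    """
    cx, cy = 0.5 * pmiss[0], 0.5 * pmiss[1]
    half = max(math.hypot(*pmiss), math.hypot(*lep1), math.hypot(*lep2), 1.0)
    best = None
    for _ in range(n_iter):
        step = 2.0 * half / (n_grid - 1)
        best_point = (cx, cy)
        for i in range(n_grid):
            for j in range(n_grid):
                qx = cx - half + i * step
                qy = cy - half + j * step
                value = max(
                    transverse_mass(lep1, (qx, qy)),
                    transverse_mass(lep2, (pmiss[0] - qx, pmiss[1] - qy)),
                )
                if best is None or value < best:
                    best, best_point = value, (qx, qy)
        cx, cy = best_point
        half *= 0.5
    return best


if __name__ == "__main__":
    # Two collinear 30 GeV leptons recoiling against 60 GeV of missing momentum.
    print(mt2((30.0, 0.0), (30.0, 0.0), (-60.0, 0.0)))
```

Because the squared transverse mass is a convex function of the neutrino momentum in the massless case, the maximum of the two is also convex, so a shrinking scan of this kind is a reasonable, if slow, way to approach the minimum.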
The optimal threshold on M T2 for the final selection is thus dependent on the supersymmetric particle masses: using the background predictions from section 5.2 for the M T2 signal region, we iterate in 10 GeV steps through possible M T2 thresholds, from 80 GeV to 120 GeV; for each (m( t 1 ), m( χ 0 1 )) signal mass point, we pick the threshold that yields the lowest expected upper limit for the top squark production cross section, σ exp 95 .

Background prediction
For the M T2 signal regions used in this search, the dominant background is tt. Other backgrounds also contribute, including DY, single-lepton events with an additional misidentified lepton (see section 5.2.3), and rare processes. The rare processes include single top quarks produced in association with a W boson; diboson production, including W or Z production with an associated photon; triple vector boson production; and tt production in association with one or two vector bosons. The normalization of the tt and DY backgrounds, and the normalization and shape of the misidentified lepton backgrounds, are evaluated from data

tt estimation
The tt → ℓℓ background represents about 90% of the events in the control region M T2 < 80 GeV (see figure 9 left). We can therefore use this region to determine the normalization of the expected SM tt contribution in the signal region. To accomplish this, we first count the number of data events in the control region and subtract the simulated contributions of all non-tt backgrounds; we then divide the result by the simulated tt yield in the control region. This procedure yields a scale factor of 1.024 ± 0.005. In this control region, the signal contamination relative to the expected tt contribution depends upon the ∆m considered: while being completely negligible at high ∆m, it can take values between 5% and 40% at low ∆m, depending on ∆m as well as the considered top squark decay mode.
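The normalization step described above amounts to a one-line calculation, sketched here with hypothetical event counts. The uncertainty estimate assumes only the Poisson fluctuation of the data count and is an illustrative simplification.

```python
import math


def ttbar_scale_factor(n_data, n_other_mc, n_ttbar_mc):
    """Scale factor for simulated ttbar in the M_T2 < 80 GeV control region.

    Returns (scale_factor, statistical_uncertainty), where the uncertainty
    accounts only for the Poisson fluctuation of the data count.
    """
    scale = (n_data - n_other_mc) / n_ttbar_mc
    error = math.sqrt(n_data) / n_ttbar_mc
    return scale, error


if __name__ == "__main__":
    # Hypothetical counts chosen so that ttbar is about 90% of the region.
    print(ttbar_scale_factor(n_data=1124.0, n_other_mc=100.0, n_ttbar_mc=1000.0))
```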

Estimation of the Drell-Yan background
To estimate the contribution of DY events in the selected events, we use the Z-boson mass resonance in the M ℓ+ℓ− distribution for opposite-charge, same-flavor dilepton events.

From comparisons with data, we find that our simulation accurately models the Z mass line shape within systematic uncertainties. We can therefore calculate a normalization scale factor for simulated DY events by comparing the observed number of events inside the Z-veto region (N ℓ+ℓ− in ) against the expected number of DY events calculated from the simulation (N DY in ), where the number of events with different flavor (N eµ in ) is subtracted to account for non-DY processes contaminating N ℓ+ℓ− in . The k-factors in eq. (5.2) account for different reconstruction efficiencies for electrons and muons. Using eq. (5.2), we calculate a scale factor of (1.43 ± 0.04) for µµ events and (1.46 ± 0.04) for ee events. To account for the contribution of eµ events originating from Z → τ + τ − decays, we estimate a scale factor of (1.44 ± 0.04) for eµ events by taking the geometric average of the scale factors for the same-flavor channels.
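The structure of this estimate can be sketched as follows. The explicit formula is an assumption, since eq. (5.2) is not reproduced in this text: it takes the commonly used form in which half of the different-flavor count, weighted by an efficiency k-factor, is subtracted from the same-flavor count inside the Z window. The function names and the event counts are hypothetical; the geometric average uses the two same-flavor scale factors quoted above.

```python
import math


def flavor_k_factors(n_ee, n_mumu):
    """Efficiency k-factors (k_ee, k_mumu), assumed to be sqrt(N_ee / N_mumu)
    and its inverse, from same-flavor counts in the Z window."""
    k_ee = math.sqrt(n_ee / n_mumu)
    return k_ee, 1.0 / k_ee


def dy_scale_factor(n_in_ll, n_in_emu, k, n_in_dy_mc):
    """Assumed form: (N_in^ll - 0.5 * k * N_in^emu) / N_in^DY(simulation)."""
    return (n_in_ll - 0.5 * k * n_in_emu) / n_in_dy_mc


def geometric_mean(a, b):
    """Geometric average used for the different-flavor channel."""
    return math.sqrt(a * b)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(dy_scale_factor(n_in_ll=1500.0, n_in_emu=100.0, k=1.0, n_in_dy_mc=1000.0))
    # Same-flavor scale factors quoted in the text.
    print(round(geometric_mean(1.43, 1.46), 2))
```

The geometric average of 1.43 and 1.46 rounds to 1.44, consistent with the eµ scale factor quoted in the text.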

Misidentified lepton background estimation
The misidentified lepton background consists of events in which non-prompt leptons pass the identification criteria. The largest contributions to this category are semileptonic tt events and leptonically decaying W events in which a jet, or a lepton within a jet, is misreconstructed as an isolated prompt lepton.
To estimate this background from data, we first measure the lepton misidentification rate, which is the probability for a non-prompt lepton to pass the requirements of an isolated lepton. This is done by counting the rate at which leptons with relaxed identification ("loose" leptons) pass the "tight" selection requirements (see section 3.2). The measurement is performed in a data sample dominated by multijet events.
We then measure the prompt lepton rate, which is the efficiency for isolated and prompt leptons to pass selection requirements, in a data sample enriched in Z → ℓ+ℓ− events. As with the misidentification rate, the prompt rate is determined by counting the rate at which loose leptons pass tight selection requirements.
Both the lepton misidentification rate and the prompt lepton rate are measured as functions of lepton p T and |η|. For each dilepton event where both selected leptons pass at least the loose selection requirements, the measured misidentification and prompt rates directly translate into a weight for the event. These weights depend upon whether neither, one, or both loose leptons also pass the tight selection requirements. The shape and normalization of the misidentified lepton background are then extracted by first applying these derived weights to the data sample where both selected leptons pass at least the loose selection requirements, and then calculating the weighted distribution of relevant variables such as M T2 . Once the background is determined, the number of events falling into the M T2 signal regions is found.
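The text does not spell out how the rates translate into event weights. The Python sketch below shows one common construction, a per-lepton matrix inversion, and should be read as an assumption rather than the exact weights used in this analysis. The function names, the rates p and f, and the toy yields are all hypothetical. In practice the rates would be looked up per lepton as functions of p T and |η|.

```python
def inverse_rates(p, f):
    """Invert the per-lepton matrix mapping true (prompt P, non-prompt F)
    loose leptons onto observed (tight T, loose-not-tight L) categories."""
    det = p - f
    if det <= 0:
        raise ValueError("prompt rate must exceed misidentification rate")
    return {
        ("P", "T"): (1.0 - f) / det, ("P", "L"): -f / det,
        ("F", "T"): -(1.0 - p) / det, ("F", "L"): p / det,
    }


def misid_weight(cat1, cat2, p1, f1, p2, f2):
    """Weight of one loose-loose event (observed categories cat1, cat2 in
    {"T", "L"}) towards the misidentified background with two tight leptons."""
    inv1, inv2 = inverse_rates(p1, f1), inverse_rates(p2, f2)
    tight1, tight2 = {"P": p1, "F": f1}, {"P": p2, "F": f2}
    weight = 0.0
    for x in "PF":
        for y in "PF":
            if x == "P" and y == "P":
                continue  # two prompt leptons: not part of this background
            weight += tight1[x] * tight2[y] * inv1[(x, cat1)] * inv2[(y, cat2)]
    return weight


# Closure check on hypothetical true yields and rates (illustration only).
p, f = 0.9, 0.2
true_yields = {("P", "P"): 1000.0, ("P", "F"): 50.0, ("F", "P"): 30.0, ("F", "F"): 10.0}


def pass_prob(observed, truth):
    rate = p if truth == "P" else f
    return rate if observed == "T" else 1.0 - rate


observed_yields = {
    (a, b): sum(n * pass_prob(a, x) * pass_prob(b, y) for (x, y), n in true_yields.items())
    for a in "TL" for b in "TL"
}
estimate = sum(n * misid_weight(a, b, p, f, p, f) for (a, b), n in observed_yields.items())
expected = p * f * 50.0 + f * p * 30.0 + f * f * 10.0
print(f"estimated: {estimate:.3f}, expected: {expected:.3f}")
```

Filling a histogram of M T2 with these weights, instead of summing them, would give the shape of the estimated background as well as its normalization.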

Checks of the M T2 shape
The search in the dileptonic final states requires a good understanding of the M T2 shape. In this section we provide a number of validation studies performed with simulation, with comparisons to data in control regions. One of the main factors determining the M T2 shape is the intrinsic resolution and energy scale of the input objects used in the M T2 calculation. From studies using Z → ℓℓ events, we confirm that the Gaussian core of the E miss T resolution function is sufficiently well-modeled by the simulation. These studies also confirm that the resolution and scale of the lepton p T are both well-modeled in the simulation.
The intrinsic width of the intermediate W bosons in dileptonic tt events drives the shape of the M T2 distribution near the kinematic edge at 80 GeV. Comparisons of events with different generated W widths (between 289 MeV and 2.1 GeV) show that any systematic uncertainty in the W boson width has a negligible effect in the selected signal regions.
The final notable effect driving the M T2 shape is the category of events populating the tails of the E miss T resolution function. To confirm that this class of events is modeled in simulation with reasonable accuracy, we perform comparisons between data and simulation in a control region enriched in Z → ℓℓ events; this control region is obtained by inverting the Z boson veto and requiring zero reconstructed b jets. Figure 10 shows the M T2 distribution in this control region, illustrating that the data distribution, including expected events in the tail, is well-modeled by the simulation.

Systematic uncertainties
We present the dominant systematic uncertainties affecting the dilepton search.

Systematic uncertainties affecting the background and signal
The E miss T measurement, and subsequently the shape of the M T2 distribution, is affected by uncertainties in the lepton energy scale, the JES, the jet energy resolution, and the scale of the unclustered energy (objects with p T < 10 GeV) in the event. We vary the four-vector momenta of the lepton and jets within their systematic uncertainties, and propagate the shifted p T back into the E miss T and M T2 calculations. For the jet energy resolution uncertainty, we vary it within its uncertainty and propagate it back into the E miss T calculation. For the unclustered energy scale, we scale the total p T of the unclustered energy by ±10% and propagate it back into the E miss T calculation. As with the single-lepton search (see section 4.3), we also apply systematic uncertainties to account for the intrinsic statistical uncertainty in the simulation samples as well as any mismodeling by the simulation of the b-tagging efficiency, the lepton trigger efficiency, the lepton ID and isolation, and the limited modeling of ISR jets by the simulation. No substantial correlation has been observed between the value of M T2 and the size of these four systematic uncertainties.
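The propagation of the unclustered energy scale into E miss T can be sketched in Python as follows. This assumes that E miss T is the negative vector sum of the visible transverse momenta, so that scaling the unclustered component by (1 + δ) moves the E miss T vector by −δ times the unclustered p T vector. The function name and the input vectors are hypothetical.

```python
import math


def shift_met_for_unclustered(met_xy, unclustered_xy, rel_shift):
    """Return the E_T^miss vector after scaling the unclustered energy.

    Scaling the unclustered pT vector U by (1 + rel_shift) changes the
    negative vector sum by -rel_shift * U.
    """
    mx = met_xy[0] - rel_shift * unclustered_xy[0]
    my = met_xy[1] - rel_shift * unclustered_xy[1]
    return mx, my


# Hypothetical event (GeV), illustration only.
met = (30.0, 40.0)
unclustered = (10.0, -5.0)
for shift in (+0.10, -0.10):
    mx, my = shift_met_for_unclustered(met, unclustered, shift)
    print(f"shift {shift:+.0%}: E_T^miss = {math.hypot(mx, my):.2f} GeV")
```

The shifted E miss T vector would then be fed back into the M T2 calculation, which is not shown here.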

Systematic uncertainties affecting only the background
For the two background normalizations (tt and DY), we account for the statistical uncertainty in the normalization. For the misidentified lepton background (see section 5.2.3), the two primary sources of systematic uncertainty are the statistical uncertainty in the measured rates of prompt and misidentified leptons, and any systematic uncertainty in the measurement of the misidentification rate. Combining these in quadrature yields a total systematic uncertainty of ∼75% for the considered signal regions. For the diboson background processes, which are estimated from the simulation, we apply a conservative cross section uncertainty of 50%. Table 5 displays the magnitude of the effect of the aforementioned systematic uncertainties (sections 5.4.1 and 5.4.2) on the background estimate for each of the considered signal regions.

Systematic uncertainties affecting only the signal
As in the single-lepton search, we account for the effect of PDF uncertainties in the signal efficiency. The resulting uncertainty in signal efficiency is found to be ∼4% across all signal mass points.

Summary of the dilepton search
We have developed a signal selection based on the M T2 distribution. Table 6 presents the predicted backgrounds as well as the number of observed data events for all signal regions; we do not observe any excess of data events compared to the predicted total background. Top quark pair production dominates the composition of the total predicted background in the four signal regions with the lowest M T2 threshold, decreasing from 91% to 45% with increasing threshold, while DY dominates in the last region (∼38%). As with the single-lepton search, the signal contamination is also taken into account in the final interpretation of the results.

Table 6. Data yields and background expectation for five different M T2 threshold values. The asymmetric uncertainties quoted for the background indicate the total systematic uncertainty, including the statistical uncertainty in the background expectation.

Combination and final results
After applying all selections for the single-lepton and dilepton data sets, no evidence for direct top squark production is observed (see tables 4 and 6). We proceed to combine the results of the two searches. In this combination, no overlap is expected in the event selections of the two searches, and none is observed. Since the background predictions are primarily based on data in the two searches, the corresponding systematic uncertainties are taken to be uncorrelated. Systematic uncertainties affecting the expected signal, as well as those due to luminosity, b tagging, PDF, JES, and lepton identification and isolation, are treated as 100% correlated between the two searches.
We interpret the absence of excess in both single-lepton and dilepton searches in terms of a 95% confidence level (CL) exclusion of top squark pair production in the (m( t 1 ), m( χ 0 1 )) plane. A frequentist CL S method [46][47][48] with a one-sided profile is used, taking into account the predicted background and observed number of data events, and the expected signal yield for all signal points. In this method, Poisson likelihoods are assigned to each of the single-lepton and dilepton yields, for each (m( t 1 ), m( χ 0 1 )) signal point, and multiplied to give the combined likelihood for both observations. The final yields of each analysis are taken from the signal region corresponding to the considered signal point. Systematic uncertainties are included as nuisance parameter distributions. A test statistic defined to be the likelihood ratio between the background-only and signal-plus-background hypotheses is used to set exclusion limits on top squark pair production; the distributions of this test statistic are constructed using simulated experiments. When interpreting the results for the tt and bbWW decay modes, we make the hypothesis of unit branching fractions, B( t 1 → t ( * ) χ 0 1 ) = 1 and B( t 1 → b χ ± 1 ) = 1, respectively. The expected and observed limits, for which we combine the results of both searches and account for signal contamination, are reported in figure 11; the experimental uncertainties are reported on the expected contour, while the PDF uncertainty in the signal cross section, added in quadrature to the systematic uncertainty from varying the renormalization scale µ r of the top squark pair production cross section to 2µ r and µ r /2, is reported on the observed contour.
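To make the structure of this combination concrete, the Python sketch below evaluates a CL S value for two counting channels whose Poisson likelihoods are multiplied, with the test-statistic distributions built from simulated experiments. It is a simplified toy under stated assumptions: nuisance parameters and their profiling are omitted, the function names are hypothetical, and the yields are invented for illustration rather than taken from tables 4 and 6.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method
    (adequate for the modest means used here)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def q_stat(counts, signal, background):
    """q = -2 ln [L(s+b) / L(b)] for a product of Poisson terms.
    Larger q means the counts look more background-like."""
    return 2.0 * sum(s - n * math.log(1.0 + s / b)
                     for n, s, b in zip(counts, signal, background))


def cls(observed, signal, background, n_toys=2000, seed=1):
    """CL_s = CL_{s+b} / CL_b, with both tail probabilities from toys."""
    rng = random.Random(seed)
    q_obs = q_stat(observed, signal, background)

    def tail(means):
        hits = 0
        for _ in range(n_toys):
            toy = [poisson(m, rng) for m in means]
            if q_stat(toy, signal, background) >= q_obs:
                hits += 1
        return hits / n_toys

    cl_sb = tail([s + b for s, b in zip(signal, background)])
    cl_b = tail(background)
    # If no background toy is as background-like as the data, claim no exclusion.
    return cl_sb / cl_b if cl_b > 0 else 1.0


# Hypothetical yields: (single-lepton, dilepton) channels.
background = (10.0, 5.0)
observed = (10, 5)
cls_large = cls(observed, (20.0, 15.0), background)
cls_small = cls(observed, (0.1, 0.1), background)
print(f"CLs (large signal): {cls_large:.3f}")
print(f"CLs (small signal): {cls_small:.3f}")
```

A signal point would be excluded at 95% CL when its CL S value falls below 0.05; in this toy the large-signal hypothesis is excluded and the small-signal one is not.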
For the tt decay mode, we reach sensitivity up to m( t 1 ) ∼ 700 GeV for χ 0 1 mass up to ∼250 GeV; there is a loss of sensitivity along the line ∆m = m(t), which delineates two different scenarios within the tt decay mode (see table 1) and where the signal acceptance drops dramatically. For the bbWW decay mode, the sensitivity reached in this study ranges from 600 to ∼700 GeV in m( t 1 ), depending on the values of m( χ 0 1 ) and m( χ ± 1 ); the sensitivity is greater in the case of a large m( χ ± 1 ) − m( χ 0 1 ) mass difference as for x = 0.75, where the decay products of the two produced W bosons are more energetic. In the case of x = 0.50, there is a drop in sensitivity for m( χ ± 1 ) − m( χ 0 1 ) ∼ m(W), which corresponds to the limit in which the W boson is virtual. Because of the rather low threshold achievable in lepton p T , sensitivity extends down to the kinematic limit ∆m ∼ m(b) + m(W) for the bbWW x = 0.50 and 0.75 cases.
The final results are dominated by the single-lepton search, where the selection is based on a multivariate selection with new discriminating variables, which is adapted to the kinematics of expected signal events, and where the discriminating power of selection variables is quantitatively assessed. The new signal selection presented in this paper leads to the strengthening and further improvement of the results of ref. [12]. We now account for systematic uncertainties due to PDFs, and more thoroughly assess the effects of signal contamination. The combination with the dilepton search extends the sensitivity by ∼25 GeV in the tt decay mode in the ∆m m(t) region, and in the bbWW (x = 0.50) decay mode across the m( χ ± 1 ) − m( χ 0 1 ) = m(W) region; it very moderately extends the sensitivity in the bbWW (x = 0.75) at both high t 1 and χ 0 1 masses; no gain of sensitivity is observed in the bbWW (x = 0.25) case where the search is limited by the small m( χ ± 1 ) − m( χ 0 1 ) mass difference, leaving a rather limited phase space to the decay products of the W boson. The signal contamination (see section 4.4) reduces the sensitivity of the search by 0-30 GeV depending on the decay mode and signal point under consideration. The limits are rather insensitive to the choice of hypothesis for the polarization of the interaction in the t χ 0 1 and W χ 0 1 χ ± 1 couplings for the tt and bbWW decay modes, respectively.

Conclusions
Using up to 19.7 fb⁻¹ of pp collision data taken at √s = 8 TeV, we search for direct top squark pair production in both single-lepton and dilepton final states. In both searches the standard model background, dominated by the tt process, is predicted using control samples in data. In the single-lepton search, we improve the results of ref. [12] by employing an upgraded multivariate tool for signal selection, fed by both kinematic and topological variables and specifically trained for different decay modes and kinematic regions. This systematic approach to the signal selection, where the discriminating power of each selection variable is quantitatively assessed, is a key feature of the single-lepton search. The background determination method has also been improved compared to ref. [12]. In the dilepton search the signal selection is based on the M T2 variable. In both searches, the effect of the signal contamination is accounted for. No excess above the predicted background is observed in either search. Simplified models (figure 1) are used to interpret the results in terms of a region in the (m( t 1 ), m( χ 0 1 )) plane, excluded at 95% CL. We combine the results of both searches for maximal sensitivity; the sensitivity depends on the decay mode, and on the (m( t 1 ), m( χ 0 1 )) signal point. The highest excluded t 1 and χ 0 1 masses are about 700 GeV and 250 GeV, respectively.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: the Austrian Federal Ministry of Science, Research and Economy and the Austrian Science