Search for direct pair production of scalar top quarks in the single- and dilepton channels in proton-proton collisions at sqrt(s) = 8 TeV

Results are reported from a search for the top squark, the lighter of the two supersymmetric partners of the top quark. The data sample corresponds to 19.7 inverse femtobarns of proton-proton collisions at sqrt(s) = 8 TeV collected with the CMS detector at the LHC. The search targets top squark to b chi+/- and top squark to t(*) chi0 decay modes, where chi+/- and chi0 are the lightest chargino and neutralino, respectively. The reconstructed final state consists of jets, b jets, missing transverse energy, and either one or two leptons. Leading backgrounds are determined from data. No significant excess in data is observed above the expectation from standard model processes. The results exclude a region of the two-dimensional plane of possible top squark and chi0 masses. The highest excluded top squark and chi0 masses are about 700 GeV and 250 GeV, respectively.


Introduction
Theories of supersymmetry (SUSY) predict the existence of a scalar partner for each standard model (SM) left-handed and right-handed fermion. When the symmetry is broken, the scalar partners acquire a mass different from their SM counterparts, the mass splitting between scalar mass eigenstates being dependent on the mass of the SM fermion. Because of the large mass of the top quark, the splitting between its chiral supersymmetric partners is potentially the largest among all supersymmetric quarks (squarks). As a result, the lighter supersymmetric scalar partner of the top quark, the top squark ($\tilde{t}_1$), could be the lightest squark. The search for a low mass top squark is of particular interest following the discovery of a Higgs boson [1][2][3], as a top squark with a mass in the TeV range would contribute substantially to the cancellation of the divergent loop corrections to the Higgs boson mass. SUSY scenarios with a neutralino ($\tilde{\chi}^0_1$) as the lightest supersymmetric particle (LSP) and a nearly degenerate-mass $\tilde{t}_1$ provide one theoretically possible way to produce the observed relic abundance of dark matter [4,5]; this further motivates the search for the $\tilde{t}_1$ at the LHC.
In this paper we report two searches for direct top squark pair production with the CMS detector at $\sqrt{s} = 8$ TeV, with integrated luminosities of 19.5 fb$^{-1}$ and 19.7 fb$^{-1}$. Each search is based on the two decay modes shown in Fig. 1. The decay modes and the nomenclature we will use to refer to them are as follows: $pp \to \tilde{t}_1 \tilde{t}_1^* \to t^{(*)} \bar{t}^{(*)} \tilde{\chi}^0_1 \tilde{\chi}^0_1$ (the "tt" decay mode); $pp \to \tilde{t}_1 \tilde{t}_1^* \to b\bar{b} \tilde{\chi}^+_1 \tilde{\chi}^-_1 \to b\bar{b} W^{+(*)} W^{-(*)} \tilde{\chi}^0_1 \tilde{\chi}^0_1$ (the "bbWW" decay mode). The tt and bbWW events both contain bottom quark jets (henceforth called b jets) and may contain charged leptons and neutrinos from $W^{(*)}$ decay. The search strategies are therefore tailored to require either one lepton or two leptons, as well as at least one b jet and a minimum amount of transverse momentum imbalance. Throughout this paper the term "lepton" refers only to $e^\pm$ and $\mu^\pm$. Previous searches for low mass top squarks in leptonic final states have been conducted by the D0, CDF, CMS, and ATLAS collaborations [6][7][8][9][10][11][12].
As shown in Table 1, we categorize the decays of the $\tilde{t}_1$ as 2-body or 3-body processes and as a function of the masses of the involved particles. In all cases we take the lightest neutralino $\tilde{\chi}^0_1$ to be the LSP. For each decay mode we fix the corresponding $\tilde{t}_1$ branching fraction to unity; the search is in all other respects designed to be as independent as possible of the details of any specific SUSY model. We explore a range of signal mass points for each decay mode under consideration. In the decay mode tt, the unknown masses are those of the $\tilde{t}_1$ and the $\tilde{\chi}^0_1$, while in the case of bbWW, a third unknown is the mass of the lightest chargino ($\tilde{\chi}^\pm_1$). In the latter case we consider three possible mass assignments, labeled by the parameter x = 0.25, 0.50, 0.75; x is defined by $m(\tilde{\chi}^\pm_1) = m(\tilde{\chi}^0_1) + x\,[m(\tilde{t}_1) - m(\tilde{\chi}^0_1)]$.
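The chargino mass assignment for the bbWW decay mode can be illustrated with a short Python sketch. The function name and the example mass point (400 GeV top squark, 100 GeV neutralino) are hypothetical choices made only for illustration; the relation itself is the definition of x given above.

```python
def chargino_mass(m_stop, m_lsp, x):
    """Return m(chargino) = m(LSP) + x * [m(stop) - m(LSP)], all masses in GeV."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x is expected to lie between 0 and 1")
    return m_lsp + x * (m_stop - m_lsp)


# Hypothetical mass point, used only to show how x interpolates between the masses
for x in (0.25, 0.50, 0.75):
    print(f"x = {x:.2f}: m(chargino) = {chargino_mass(400.0, 100.0, x):.0f} GeV")
```

Small x places the chargino close in mass to the neutralino, while large x places it close to the top squark.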
In this paper we expand the result of our previous search in the single-lepton final state [12] by improving key aspects of the signal selection. Since the SM background dominates the signal by several orders of magnitude and often has similar distributions for individual discriminating variables, a multivariate approach has been developed to exploit differences in the correlations among discriminating variables for signal and SM background. The background determination method has also been improved compared to Ref. [12] in order to better control and correct the tail of the key transverse mass distribution. In addition to the single-lepton search, we also report on a search in the dilepton mode, where the key discriminating variable is an $M_{T2}$ variable [13]. The final result is based on a combination of the single-lepton and dilepton searches.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid that provides an axial magnetic field of 3.8 T for charged-particle tracking. Trajectories of charged particles are measured by a silicon pixel and strip tracker, covering $0 < \phi < 2\pi$ in azimuth and $|\eta| < 2.5$, where the pseudorapidity $\eta$ is defined as $\eta = -\ln[\tan(\theta/2)]$; $\theta$ is the polar angle of the trajectory of the particle with respect to the counterclockwise beam direction. A crystal electromagnetic calorimeter and a brass/scintillator hadronic calorimeter surround the tracking detectors. The calorimeter measures the energy and direction of electrons, photons, and hadronic jets. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. The detector is nearly hermetic, allowing for momentum balance measurements in the plane transverse to the beam axis. Events are selected online by a two-level trigger system [14]. A more detailed description of the CMS detector can be found in Ref. [15].
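As a small illustration of the pseudorapidity definition, the sketch below converts between the polar angle and $\eta$. The function names are illustrative and not taken from any CMS software.

```python
import math


def pseudorapidity(theta):
    """eta = -ln[tan(theta/2)] for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


# Polar angle at the edge of the tracker acceptance quoted in the text, |eta| = 2.5
theta_edge = polar_angle(2.5)
print(f"theta at eta = 2.5: {math.degrees(theta_edge):.1f} degrees")
```

A particle emitted perpendicular to the beam has $\eta = 0$, and $|\eta|$ grows without bound as the trajectory approaches the beam axis.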

Samples and trigger requirements
Events used for this search are selected initially by single-lepton and dilepton triggers. For the single-electron final state, the online selection requires the electron be isolated and have transverse momentum $p_T > 27$ GeV; in subsequent offline analysis the reconstructed electron $p_T$ is required to exceed 30 GeV. For the single-muon final state, two triggers are used, which both require $|\eta(\mu)| < 2.1$: a purely leptonic trigger requiring an isolated muon with $p_T > 24$ GeV; and an additional mixed trigger requiring an isolated muon of $p_T > 17$ GeV together with three jets, each having $p_T > 30$ GeV. The first trigger suffices for muons whose offline reconstructed $p_T$ exceeds 26 GeV, while the second trigger allows the analysis to use muons with reconstructed $p_T$ as low as 20 GeV; the additional jets are required in the analysis in any case. The dilepton triggers require either ee, $\mu\mu$, or $e\mu$ pairs. In each case, one lepton must satisfy $p_T > 17$ GeV and the other lepton must satisfy $p_T > 8$ GeV. Trigger efficiencies are measured in data and applied to simulated events. The integrated luminosity, after data quality requirements, is 19.5 ± 0.5 fb$^{-1}$ for the single-lepton final states and 19.7 ± 0.5 fb$^{-1}$ for the dilepton final states [16].
To ensure agreement with data, simulated events are weighted so that the distribution of the number of proton-proton interactions per beam crossing agrees with that seen in data; they are additionally weighted by the trigger efficiency and the lepton identification and isolation efficiencies. For simulated $t\bar{t}$ samples, a $p_T$-dependent weight is applied to match the shape of the $d\sigma(pp \to t\bar{t} + X)/dp_T$ distribution observed in data. Signal events are weighted to account for the effect of initial state radiation [12].

Object reconstruction
In this search, all particle candidates are reconstructed with the particle-flow (PF) algorithm [31,32], and additional criteria are then applied to select electrons, muons, jets, and b jets; the criteria are applied to both collision data and simulation samples.
The identification and measurement of the $p_T$ of muons uses information provided by the silicon detector and the muon system [33]. We require the muon to have a 'tight' identification [33] with pseudorapidity $|\eta| < 2.1$ and $|\eta| < 2.4$ for the single-lepton and dilepton searches, respectively. The identification and energy measurement of the electrons uses information provided by the tracker and the electromagnetic calorimeter. Electron candidates are reconstructed in the tracker with the Gaussian-sum filter algorithm [34]. We require the electron to have a 'medium' identification [34] with pseudorapidity $|\eta| < 1.44$ and $|\eta| < 2.5$ for the single-lepton and dilepton searches, respectively. Both muon and electron identification demand that the lepton be isolated from the hadronic components of the event. We define an isolation variable for the leptons based on a scalar sum of transverse momenta, $P \equiv \sum |\vec{p}_T|$, where the sum is taken over all PF candidates within a cone about the lepton of $\Delta R \equiv \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.3$, excluding the transverse momentum of the lepton itself, $p_T(\ell)$. In the single-lepton search we impose an upper limit on the absolute isolation, $P < 5$ GeV; for both searches we impose an upper limit on the relative isolation, $P/p_T(\ell) < 0.15$.
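The isolation requirement described above can be sketched as follows. This is a minimal illustration, not the CMS reconstruction code: the dictionary layout of the lepton and PF candidates, the function names, and the convention that the caller has already removed the lepton from the candidate list are all assumptions. Pileup corrections to the isolation sum are not included.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(dphi^2 + deta^2), with dphi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def isolation_sum(lepton, candidates, cone=0.3):
    """Scalar pT sum (GeV) of candidates inside the cone.

    Assumption: `candidates` does not contain the lepton itself.
    """
    return sum(
        c["pt"]
        for c in candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )


def is_isolated(lepton, candidates, single_lepton_search):
    """Apply the absolute (single-lepton search only) and relative isolation limits."""
    p_sum = isolation_sum(lepton, candidates)
    if single_lepton_search and p_sum >= 5.0:  # absolute isolation, GeV
        return False
    return p_sum / lepton["pt"] < 0.15  # relative isolation
```

For a high-$p_T$ lepton the absolute limit is the tighter of the two requirements, which is why the same lepton can pass in the dilepton search and fail in the single-lepton search.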
Jets are constructed by clustering all the PF candidates with the anti-$k_T$ jet clustering algorithm [35], using a distance parameter R = 0.5. Contamination from additional pp interactions (pileup) is mitigated by discarding charged PF candidates that are incompatible with having originated from the estimated proton-proton collision point. The average pileup energy associated with neutral hadrons is computed event-by-event and subtracted from the jet energy and from the energy used when computing lepton isolation, i.e., a measure of the activity around the lepton. The energy subtracted is the average pileup energy per unit area (in $\Delta\eta \times \Delta\phi$) times the jet or isolation cone area [36,37]. Candidate jets must be separated from selected leptons by $\Delta R > 0.4$. Relative and absolute jet energy corrections are applied to the raw jet momenta to establish a uniform jet response in $|\eta|$ and a calibrated response in jet $p_T$. We require that the jets satisfy $p_T > 30$ GeV and $|\eta| < 2.4$. To tag jets originating from the hadronization of b quarks, we utilize the combined secondary vertex algorithm at its 'medium' operating point [38], with a corresponding efficiency for b jets of 65% and a mistag rate for light jets of 1%. Scale factors are applied to simulation samples to reproduce the efficiencies measured in the data.
As the decays of $\tilde{t}_1$ are expected to yield neutralinos and neutrinos in their decay chain, genuine missing transverse momentum is expected in the final state of signal events. We define the missing transverse momentum by a sum over the transverse momenta of all PF candidates, $\vec{p}_T^{miss} \equiv -\sum \vec{p}_T$. All calibration corrections [39] have been applied to candidates used in the sum. The magnitude of the $\vec{p}_T^{miss}$ vector is denoted $E_T^{miss}$.

Single-lepton search
In the single-lepton search, we consider only final states containing one lepton (e or µ only) and several jets.

Event selection
The preselection criteria are defined as follows:
• Exactly one identified and isolated lepton satisfying $p_T(\mu) > 20$ GeV or $p_T(e) > 30$ GeV;
• A veto is applied against the presence of a second lepton by requiring that no additional isolated tracks or hadronically decaying τ lepton ($\tau_h$) candidates [12] are present;
• The number of jets and number of b jets must satisfy N(jets) ≥ 4 and N(b jets) ≥ 1;
• $E_T^{miss} > 80$ GeV;
• $M_T > 100$ GeV.
The transverse mass variable is defined by $M_T \equiv \sqrt{2\, E_T^{miss}\, p_T(\ell)\, (1 - \cos\Delta\phi)}$, where $p_T(\ell)$ is the transverse momentum of the selected lepton and $\Delta\phi$ is the angular difference between the lepton $\vec{p}_T(\ell)$ and $\vec{p}_T^{miss}$. The requirement on this variable suppresses events in which the source of the lepton and $\vec{p}_T^{miss}$ is $W^\pm$ decay.
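The transverse mass of the lepton and missing-momentum system can be evaluated with a few lines of Python. This is a minimal sketch assuming the standard massless form $M_T = \sqrt{2 E_T^{miss} p_T(\ell)(1 - \cos\Delta\phi)}$; the function and argument names are illustrative.

```python
import math


def transverse_mass(pt_lep, phi_lep, met, phi_met):
    """M_T in GeV from the lepton pT, the missing transverse energy, and their azimuths."""
    dphi = phi_lep - phi_met
    return math.sqrt(2.0 * pt_lep * met * (1.0 - math.cos(dphi)))


# Back-to-back lepton and missing momentum give the largest M_T for fixed magnitudes
print(transverse_mass(pt_lep=40.0, phi_lep=0.0, met=40.0, phi_met=math.pi))
```

For a lepton and neutrino from a single on-shell W decay, $M_T$ cannot exceed the W mass up to resolution and width effects, which is why a lower threshold on $M_T$ suppresses such events.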
At the preselection level, the $t\bar{t}$ and W+jets backgrounds represent 90% and 7%, respectively, of the total expected background (see Section 4.2). For the signal selection, we use a boosted decision tree (BDT) [40] to take advantage of the correlations among variables that discriminate between signal and background; Fig. 2 illustrates how a pair of kinematic variables correlate differently for a background process and signal. Compared to the approach of Ref. [12], the signal selection is characterized mainly by the use of new variables, and a systematic search for the most reduced set of best-performing variables to be used as input to the BDT. Furthermore, because the discriminating power of each input varies across the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ mass plane, the latter is partitioned and a unique BDT is trained in each partition. The full list of variables (not all used in every BDT) is given below:
• $E_T^{miss}$: The presence of missing transverse momentum signals the possible production of a stable unseen object, such as the $\tilde{\chi}^0_1$.

• $p_T(\ell)$: The correlation between the missing transverse momentum $E_T^{miss}$ and the lepton transverse momentum $p_T(\ell)$ differs between signal, where genuine $E_T^{miss}$ is due to two missing objects ($\tilde{\chi}^0_1$), and $t\bar{t}$ and W+jets backgrounds, where the $E_T^{miss}$ is due to a single missing object (ν).
• N(jets), $p_T(j_1)$, $p_T(b_1)$: These describe the multiplicity of selected jets and the $p_T$ of the highest $p_T$ jet and highest $p_T$ b jet, respectively.
• $M_{T2}^W$: The distribution of this variable shows an edge at the top quark mass for $t\bar{t}$ events where both W bosons decay leptonically and one of the leptons is lost. It is defined by the following minimization over possible momentum vectors $\vec{p}_{T1}$ and $\vec{p}_{T2}$: $M_{T2}^W = \min\{\, m_y \text{ consistent with: } \vec{p}_{T1} + \vec{p}_{T2} = \vec{p}_T^{miss},\ p_1^2 = 0,\ (p_1 + p_\ell)^2 = p_2^2 = M_W^2,\ (p_1 + p_\ell + p_{b_1})^2 = (p_2 + p_{b_2})^2 = m_y^2 \,\}$ (2). Here $p_1$ is the momentum of the neutrino associated with a successfully reconstructed lepton in one $W \to \ell\nu$ decay, and $p_2$ corresponds to an unreconstructed W whose two decay products (the lost lepton and the neutrino) escape detection. The momenta $p_{b_1}$ and $p_{b_2}$ are those of the b jets with the highest (leading) and second-highest (sub-leading) $p_T$ values, respectively. Including $M_{T2}^W$ in the BDT reduces the contribution of the $t\bar{t}$ dilepton background.
• $H_T$: The scalar sum $H_T \equiv \sum |\vec{p}_T|$, summed over all jets with $p_T > 30$ GeV, characterizes the hadronic component of the event. A related variable $H_T^{frac}$ is defined by $H_T^{frac} \equiv \sum |\vec{p}_T| / H_T$, where the terms in the numerator are restricted to jets of $p_T > 30$ GeV that lie in the same hemisphere as $\vec{p}_T^{miss}$.
• $\Delta R(\ell, b_1)$, $\Delta\phi(j_{1,2}, \vec{p}_T^{miss})$: Two topological variables, $\Delta R(\ell, b_1)$ and $\Delta\phi(j_{1,2}, \vec{p}_T^{miss})$, are defined as follows: $\Delta R$ is the distance between the lepton and the leading b jet; and $\Delta\phi$ is the minimal angular difference between the $\vec{p}_T^{miss}$ vector and either the leading or sub-leading jet.
• $\chi^2_{had}$: To characterize the kinematics of $t\bar{t}$ events we build a $\chi^2$ variable comparing the invariant masses of the three- and two-jet systems to the mass of the top quark and W boson, respectively. It is defined as: $\chi^2_{had} = \frac{(M_{j_1 j_2 j_3} - m(t))^2}{\sigma^2_{j_1 j_2 j_3}} + \frac{(M_{j_1 j_2} - m(W))^2}{\sigma^2_{j_1 j_2}}$, where $M_{j_1 j_2 j_3}$ and $M_{j_1 j_2}$ are, respectively, the invariant mass of the three-jet system from the top quark and of the two jets posited to originate from W boson decay; $\sigma_{j_1 j_2 j_3}$ and $\sigma_{j_1 j_2}$ are the uncertainties of these invariant masses. The $M_{j_1 j_2 j_3}$ value is calculated after imposing a $M_{j_1 j_2} = m(W)$ constraint by kinematic fit, while $M_{j_1 j_2}$ is the two-jet invariant mass before the fit. The jet assignments are made according to the b tag information: $j_3$ must be tagged as a b jet if there are at least two b jets in the event, and $j_1$ and $j_2$ cannot be tagged unless there are at least three b jets in the event. This variable is used for the signal selection in the tt decay mode.
• M(3 jet), M(ℓb): Finally, to kinematically disentangle the signal from the $t\bar{t}$ background, we construct two new invariant-mass variables that characterize the process where one $\tilde{t}_1$ decays into 3 jets and $\tilde{\chi}^0_1$ while the other decays into a b quark, lepton, neutrino, and $\tilde{\chi}^0_1$. In the case of the bbWW decay mode and the tt decay mode where no on-shell top quark is produced, i.e. $m(\tilde{t}_1) - m(\tilde{\chi}^0_1) < m(t)$, the M(ℓb) distribution discriminates between $t\bar{t}$ and signal. The quantity M(3 jet) is the invariant mass of the 3 jets among the 4 highest $p_T$ jets which are the most back-to-back (according to angular difference) to the lepton. In the case of $t\bar{t}$ background, M(3 jet) reconstructs the mass of the top quark having decayed into 3 jets, modulo the limitations of the jet association. For the bbWW decay mode of the signal, it reconstructs an invariant mass different from m(t), as no top quark is present in the final state. The quantity M(ℓb) is defined as the invariant mass of the lepton and the b jet closest to it in $\Delta R$.
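Among the variables listed above, $H_T$ and $H_T^{frac}$ are simple enough to sketch directly. The snippet below is an illustration only: the jet dictionary layout is hypothetical, and "same hemisphere as the missing momentum" is assumed to mean an azimuthal separation of less than π/2, which the text does not spell out.

```python
import math


def ht_and_fraction(jets, phi_met, pt_min=30.0):
    """Return (H_T, H_T^frac) for jets given as dicts with 'pt' (GeV) and 'phi' (rad)."""
    selected = [j for j in jets if j["pt"] > pt_min]
    ht = sum(j["pt"] for j in selected)
    # Assumed hemisphere definition: |dphi(jet, pTmiss)| < pi/2
    same_side = sum(
        j["pt"]
        for j in selected
        if abs(math.remainder(j["phi"] - phi_met, 2.0 * math.pi)) < math.pi / 2.0
    )
    return ht, (same_side / ht if ht > 0.0 else 0.0)
```

A value of $H_T^{frac}$ near zero means the hadronic activity recoils against the missing momentum, while a value near one means the jets and the missing momentum point the same way.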
Figure 3 shows clearly the evolution of the kinematic distributions as the mass difference between the lightest top squark and the LSP is varied. Differences in kinematic distributions may also be seen when comparing the tt and bbWW signal decay modes, and when varying the choice of x (x = 0.25, 0.50, 0.75) in the bbWW decay mode. In Fig. 4 we show distributions of some discriminating variables at the preselection level (but without the restriction on $M_T$) for both e and µ final states in data and simulated events. The figure shows good agreement between data and the total simulated background, within the uncertainties of the simulated events, which include the statistical uncertainty in the simulation samples added in quadrature to the systematic uncertainty in the jet energy scale (JES).
As expected from the distributions shown in Fig. 3, different selection variables will exhibit different degrees of discriminating power, depending on the decay mode (tt or bbWW) and the relevant mass parameters (∆m or x) of the signal. To find the most discriminating variables, we test different sets of candidate BDT input variables, maximizing a figure of merit that compares the expected signal yield to the quadratic sum of the statistical and systematic uncertainties in the expected background yield. To keep the selection tool simple, a new variable is incorporated into the set of input variables only if it leads to a substantial increase in the figure of merit. The training of the BDT, together with this procedure for selecting variables, is then carried out separately for the different decay modes tt and bbWW (x = 0.25, 0.50, 0.75), and across six benchmark kinematic regions, defined as: ∆m = (150 ± 25), (250 ± 25), (350 ± 25), (450 ± 25), (550 ± 25), and (650 ± 25) GeV. This partitioning allows us to take into account the evolution of the signal kinematics across the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane. The different BDT trainings are numbered from 1 to 6 to reflect the ∆m regions in which they are trained.
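The variable-selection criterion can be sketched as follows. The text specifies only that the figure of merit compares the signal yield to the quadratic sum of the statistical and systematic background uncertainties; the specific form below (Poisson statistical uncertainty, flat relative systematic uncertainty) and the 5% acceptance threshold are assumptions made for illustration.

```python
import math


def figure_of_merit(n_signal, n_background, rel_syst):
    """Signal yield over the quadrature sum of background uncertainties."""
    stat = math.sqrt(n_background)   # assumed Poisson statistical uncertainty
    syst = rel_syst * n_background   # assumed flat relative systematic uncertainty
    return n_signal / math.hypot(stat, syst)


def keep_variable(fom_before, fom_after, min_gain=0.05):
    """Accept a new BDT input only if the relative gain exceeds `min_gain`.

    `min_gain` is a hypothetical stand-in for "a substantial increase".
    """
    return (fom_after - fom_before) / fom_before > min_gain
```

Requiring a minimum gain before adding a variable keeps each BDT input list short, at the cost of discarding variables that carry only marginal information.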
The final sets of variables retained as input to the BDT are reported in Table 2. Having been chosen with a quantitative assessment of the discriminating power of each variable, these represent the most reduced, yet effective, sets of input variables to the BDT for each decay mode and kinematic region. This represents a new feature of this search compared to Ref. [12], where the BDT was trained with the same set of variables across different kinematic regions. Once the input variables to the BDT are determined, different BDTs are trained in each of the benchmark kinematic regions to build selection tools adapted to a kinematically varying signal. The simulation samples used for finding the best set of variables and training the BDT are statistically independent. This procedure is done for the tt and bbWW (different x values) decay modes. Using a more systematic approach for the definition of signal regions (SRs) than in Ref. [12], we first consider which training is the best performing one in the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane. We observe that some BDTs are the best over a very limited part of the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane. To simplify the final selection we retain BDT trainings that are observed to be the best performing over a large portion of the mass plane. The resulting SRs, defined as the chosen BDT training in the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane, are shown for all considered decay modes in Fig. 5. With these SRs determined, the final selection is made by applying a minimum threshold to each BDT output as shown in Fig. 6. The thresholds are determined by minimizing the expected upper limit cross section ($\sigma^{exp}_{95}$) obtained from events remaining above the threshold, taking into account the predicted background (Section 4.2). The final BDT trainings and selections are ).

Background estimation
The SM background processes in the single-lepton search can be divided into four categories. At preselection, the dominant contribution (∼66% of the total) is $t\bar{t}$ production with one lepton; we include single top-quark production in this category and call the combination the "$t\bar{t} \to 1\ell$" component. The second most significant background (23%) comes from $t\bar{t}$ events with two leptons, where one lepton escapes detection; we will call this the "$t\bar{t} \to \ell\ell$" component. The third background (7%) is the production of W bosons in association with jets, which we will denote "W+jets". Other backgrounds are labeled as "rare". We use data to estimate the event yields of the first three categories, starting with distributions obtained from simulation, and normalizing these with scale factors (SF) determined in control regions. The background is estimated using the formulae:
$N_{tail}(t\bar{t} \to \ell\ell) = SF_{\ell\ell} \times N^{MC}_{tail}(t\bar{t} \to \ell\ell)$,
$N_{tail}(t\bar{t} \to 1\ell) = SF_0 \times SFR_{1\ell} \times N^{MC}_{tail}(t\bar{t} \to 1\ell)$,
$N_{tail}(W+jets) = SF_0 \times SFR_W \times N^{MC}_{tail}(W+jets)$.
The subscript "tail" refers to the region $M_T > 100$ GeV. The simulation yields at the final selection level ($N^{MC}_{tail}$) are corrected by normalization scale factors $SF_{\ell\ell}$ and $SF_0$ (defined in Eqs. (6) and (7)), determined in the $M_T$ peak region $50 < M_T < 80$ GeV. The additional scale factor ratios, denoted $SFR_{1\ell}$ and $SFR_W$, are used to correct the tail of the $M_T$ distribution, and are determined using a control region with zero b jets. The procedure accounts for the possibility of signal contamination in the different control regions. At the final selection level the $t\bar{t} \to \ell\ell$ process represents an approximately constant proportion of the total background at ∼60%, while the $t\bar{t} \to 1\ell$ and W+jets processes have varying proportions across the different selections within the remaining ∼40%. Signal contamination is important only at low ∆m, where it alters the background determination by up to 25%.
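The way the scale factors enter the tail prediction can be sketched in Python. The structure follows the description above ($SF_{\ell\ell}$ for the dilepton component, $SF_0$ times the tail ratio for the single-lepton top and W+jets components, rare processes taken from simulation); the function name, the dictionary keys, and the simulated yields used in the example are hypothetical.

```python
def predict_tail_background(mc_tail, sf_ll, sf_0, sfr_1l, sfr_w):
    """Scale simulated M_T-tail yields into a data-driven background prediction."""
    prediction = {
        "ttbar_ll": sf_ll * mc_tail["ttbar_ll"],
        "ttbar_1l": sf_0 * sfr_1l * mc_tail["ttbar_1l"],
        "wjets": sf_0 * sfr_w * mc_tail["wjets"],
        "rare": mc_tail["rare"],  # taken directly from simulation
    }
    prediction["total"] = sum(prediction.values())
    return prediction


# Hypothetical simulated yields; the scale factors are the preselection-level
# values quoted in the text (SF_ll = 1.06, SF_0 = 1.05, SFR = 1.04 and 1.33).
example = predict_tail_background(
    {"ttbar_ll": 60.0, "ttbar_1l": 25.0, "wjets": 10.0, "rare": 5.0},
    sf_ll=1.06, sf_0=1.05, sfr_1l=1.04, sfr_w=1.33,
)
print({name: round(value, 2) for name, value in example.items()})
```

Keeping the peak normalization and the tail correction as separate factors lets the two be measured in different control regions.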

Normalization in the $M_T$ peak
The scale factors $SF_{\ell\ell}$ and $SF_0$ are estimated to correct for the normalization in the $M_T$ peak region and after the final selection on the output of the BDT. To calculate $SF_0$ we further require the second lepton veto, while $SF_{\ell\ell}$ is obtained without this veto. $SF_{\ell\ell}$ fixes the $t\bar{t} \to \ell\ell$ background normalization, while $SF_0$ sets the $t\bar{t} \to 1\ell$ and W+jets background normalizations. The scale factors are computed as follows: The inclusion of the $N^{MC}$(signal) term accounts for possible signal contamination. At preselection we have: $SF_{\ell\ell} = (1.06 \pm 0.01)$ and $SF_0 = (1.05 \pm 0.01)$. At the final selection level, the deviation of these scale factors from unity is always within 10%.

Correction for the tail in the $M_T$ distribution
To study the tail of the $M_T$ distribution for different backgrounds, we enrich the data with the W+jets contribution by inverting the b-tagging criterion of the preselection. The left plot of Fig. 7 compares the data with background simulation, and shows some disagreement between the two for $M_T > 100$ GeV. To correct this, we follow an approach based on template fits, which allows us to extract different correction factors for the $t\bar{t} \to 1\ell$ and the W+jets backgrounds, rather than assuming them to be equal as in Ref. [12].
The template fit is performed using the invariant mass of the lepton and the jet with the highest b-tag discriminator. This variable, $M_{\ell b}$, is well modeled by the background simulation (see Fig. 8, left) and exhibits discriminating power between W+jets and $t\bar{t} \to 1\ell$ (Fig. 8, right). The contributions of the $t\bar{t} \to \ell\ell$ background, the rare backgrounds, and the signal are taken from simulation, and their normalizations are constrained within a 20% uncertainty during the template fit. The normalizations of the $t\bar{t} \to 1\ell$ and W+jets backgrounds are free parameters expressed in terms of scale factors SF. The fit is performed in a control region with zero b-tagged jets, in two separate regions of the $M_T$ distribution: the peak defined by $50 < M_T < 80$ GeV, and the tail defined by $M_T > 100$ GeV. We then extract the normalization-independent ratios $SFR = SF_{tail}/SF_{peak}$ for $t\bar{t} \to 1\ell$ and for W+jets. Without any BDT signal selection and for a case of negligible signal contamination, the fit yields: $SFR_{t\bar{t} \to 1\ell} = (1.04 \pm 0.16)$ and $SFR_W = (1.33 \pm 0.10)$. The right plot of Fig. 7 confirms the effectiveness of this correction. Due to the low yields after the final selections, we loosen the requirements on the output of the BDT to keep 25% of the total yield when we extract the SFR values. The SFR ratios obtained for the different signal regions within a given decay mode (tt or bbWW) agree well with each other. We therefore set the final SFR factor for each decay mode to the average over the signal regions for that mode. The resulting SFR values for the tt and bbWW decay modes differ from one another, and also vary across the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ mass plane: $SFR_{1\ell}$ increases from 1.0 to 1.4 with increasing top squark mass, while $SFR_W$ is stable around a mean value of ∼1.2 everywhere. In addition to the extraction of tail correction factors, we check in the control region with zero b jets that the distributions of all input variables in data are well described by the predicted background.

Systematic uncertainties
The sensitivity of this search is limited by uncertainties in both the background prediction and the acceptance and efficiency of the signal at the mass points under consideration. The uncertainties are listed below.

Background
For systematic uncertainties affecting the predicted background:
• We study the impact of limited simulation statistics, generator scale variations, and JES uncertainty in the template fit method in the control region with zero b jets and no BDT selection. This leads to a global absolute uncertainty of 0.6 in $SFR_{1\ell}$ and 0.4 in $SFR_W$.
• The quality of the $t\bar{t} \to \ell\ell$ background modeling is checked in two different control regions. The first uses events with exactly two leptons in the final state and a lower jet multiplicity (N(jets) ≥ 2) than that employed in the preselection; the second uses events with exactly one lepton, and an isolated track or $\tau_h$ candidate. The simulation prediction is compared with data in the $M_T$ tail region of these control regions for each BDT selection. The comparison shows overall agreement, and deviations are used to derive a relative systematic uncertainty, ranging from 20 to 80% depending on the selection.
• We check the modeling of the N(jets) distribution in the $t\bar{t}$ background with a control region defined to have exactly two leptons and no requirement on $M_T$. The data/simulation scale factors are observed to be compatible with unity; therefore, no correction factor is used, but the deviations from unity are taken as a systematic uncertainty. This leads to a flat 2% uncertainty, used for all the BDT selections.
• A 6% uncertainty for the modeling of the isolated track veto is applied to the fraction of $t\bar{t}$ dilepton background events that have a second e/µ or a one-prong $\tau_h$ decay in the acceptance. A 7% uncertainty for the modeling of the hadronic τ veto is only applied to the fraction of $t\bar{t}$ dilepton background events that have a $\tau_h$ in the acceptance.
• The $SF_{\ell\ell}$ and $SF_0$ normalization factors are varied within their statistical uncertainties and the variations are propagated as systematic uncertainties to the $M_T$ peak regions.
• The statistical uncertainties in the simulation background samples are propagated to the systematic uncertainties in the backgrounds.
• The cross sections of the W+jets and rare backgrounds are conservatively varied by 50%, affecting the prediction of other background processes through $SF_{\ell\ell}$ and $SF_0$ (see equations of Section 4.2); the cross section of the $t\bar{t}$ process is varied by 10%.
Table 3 gives a summary of the relative systematic uncertainties in the predicted total background yield at the preselection level, as well as their range of variation over the different top squark decay modes and BDT selections.

Signal
The statistical uncertainties in the signal samples are taken into account. The integrated luminosity is known [16] to a precision of 2.6% and the efficiencies of triggers (Section 3.1) applied to the signal yield are known with a precision of 3%. The efficiencies for the identification and isolation of leptons are observed to be consistent within 5% for data and simulation; we take this difference as an uncertainty. The b-tagging efficiency has been varied within its uncertainties for b, c, and light flavor jets, leading to final yield uncertainties within 3% for all signal mass points. The systematic uncertainty in signal yield that is associated with the JES [41] is obtained by varying the jet energy scale within its uncertainty; the final uncertainties for all signal mass points are within 10%. Systematic uncertainties in the signal efficiency due to PDFs have been calculated [42][43][44], and are constant at ∼5%. The effect of the systematic uncertainty due to the modeling of ISR jets by the simulation is studied by deriving data/simulation scale factors that depend on N(jets). The maximum size of these uncertainties varies between 8 and 10% for different decay modes.

Summary of the single-lepton search
We develop a ∆m-dependent signal selection tool with BDTs for the tt and bbWW decay modes.
For each BDT selection shown in Fig. 5 we provide in Table 4 the predicted background yield (without signal contamination) as well as the number of observed data events. We do not observe any excess of data events compared to the predicted total background. The background composition varies as a function of the different SRs of the various decay modes.
The signal contamination is taken into account by calculating a new estimate of the background in the presence of signal contamination (see Eqs. (6) and (7)); this is done separately at each signal mass point in the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane, and for each of the signal regions defined in Fig. 5. For the calculation of limits (see Section 6), the number of observed events in data and the expected signal remain the same, while the expected background is modified to correct for signal contamination in the control regions. While the effect of this contamination is observed to be almost negligible at high ∆m, it can modify the background estimate by up to 25% at low ∆m.

Selection
For the three dilepton final states considered in this search (eµ, ee, and µµ), we define the preselection as follows:
• At least two oppositely charged leptons.
• For the leading and sub-leading lepton, we require $p_T > 20$ and $p_T > 10$ GeV, respectively.
• For all lepton flavors: $M_{\ell^+\ell^-} > 20$ GeV.
• If more than two lepton pairs are found that satisfy the above three requirements, the pair with the highest $p_T$ is chosen.
At the preselection level, $t\bar{t}$ production with two leptons represents ∼90% of the total expected background.
In this search we separate the signal from the dileptonic $t\bar{t}$ background by constructing a transverse mass variable $M_{T2}$ as defined in Eq. (8). We begin with the two selected leptons $\ell_1$ and $\ell_2$. Under the assumption that the $\vec{p}_T^{miss}$ originates only from two neutrinos, we partition the $\vec{p}_T^{miss}$ into two hypothetical neutrinos with transverse momenta $\vec{p}^{miss}_{T1}$ and $\vec{p}^{miss}_{T2}$. We calculate the transverse mass $M_T$ of the pairings of these hypothetical neutrinos with their respective lepton candidates and record the maximum of these two $M_T$. This process is repeated with other viable partitions of the $\vec{p}_T^{miss}$ until the minimum of these maximal $M_T$ values is reached; this minimum is the $M_{T2}$ for the event [13,45]: $M_{T2} = \min_{\vec{p}^{miss}_{T1} + \vec{p}^{miss}_{T2} = \vec{p}^{miss}_T} \left( \max\left[ M_T(\vec{p}_T^{\,\ell_1}, \vec{p}^{miss}_{T1}),\ M_T(\vec{p}_T^{\,\ell_2}, \vec{p}^{miss}_{T2}) \right] \right)$ (8). When constructed in this fashion, $M_{T2}$ has the property that its distribution in $t\bar{t} \to \ell\ell$ events has a kinematic endpoint at m(W). The presence of additional invisible particles for the signal breaks the assumption that the $\vec{p}_T^{miss}$ arises from only two neutrinos; consequently, $M_{T2}$ in dileptonic top squark events does not necessarily have an endpoint at m(W). The value of m(W) therefore dictates the primary demarcation between the control region $M_{T2} < 80$ GeV, and the general signal region $M_{T2} > 80$ GeV. The left plot of Fig. 9 shows the distribution of $M_{T2}$ at the preselection level, where we observe its discriminating power for two representative signal mass points. The distribution of $M_{T2}$ in top squark events, however, depends upon the signal mass point $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$, as can be observed on the right plot of Fig. 9. The optimal threshold on $M_{T2}$ for the final selection is thus dependent on the supersymmetric particle masses: using the background predictions from Section 5.2 for the $M_{T2}$ signal region, we iterate in 10 GeV steps through possible $M_{T2}$ thresholds, from 80 GeV to 120 GeV; for each $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ signal mass point, we pick the threshold that yields the lowest expected upper limit for the top squark production cross section, $\sigma^{exp}_{95}$.
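The min-max construction of $M_{T2}$ can be illustrated with a brute-force scan over partitions of the missing transverse momentum. This is a pedagogical sketch only, not the algorithm used in the analysis: leptons and neutrinos are treated as massless, and the scan range and grid granularity are arbitrary choices, so the result is an approximation whose minimum may lie at or beyond the grid boundary for some configurations.

```python
import numpy as np


def mt2_grid_scan(lep1, lep2, pmiss, n_grid=201):
    """Approximate M_T2 (GeV) from transverse vectors (px, py) by scanning partitions."""
    l1, l2, pm = (np.asarray(v, dtype=float) for v in (lep1, lep2, pmiss))
    scale = np.linalg.norm(l1) + np.linalg.norm(l2) + np.linalg.norm(pm)
    axis = np.linspace(-scale, scale, n_grid)
    px, py = np.meshgrid(axis, axis)
    nu1 = np.stack([px, py], axis=-1)   # trial transverse momentum of neutrino 1
    nu2 = pm - nu1                      # the remainder goes to neutrino 2

    def mt(lep, nu):
        # Massless case: M_T^2 = 2 (|p_lep| |p_nu| - p_lep . p_nu)
        m2 = 2.0 * (np.linalg.norm(lep) * np.linalg.norm(nu, axis=-1) - nu @ lep)
        return np.sqrt(np.clip(m2, 0.0, None))

    return float(np.min(np.maximum(mt(l1, nu1), mt(l2, nu2))))
```

Dedicated minimizers are normally preferred in practice because a fixed grid scales poorly and limits the precision, but the scan makes the "maximum over the pair, minimum over partitions" logic explicit.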

Background prediction
For the $M_{T2}$ signal regions used in this search, the dominant background is $t\bar{t}$. Other backgrounds also contribute, including DY, single-lepton events with an additional misidentified lepton (see Section 5.2.3), and rare processes. The rare processes include single top quarks produced in association with a W boson; diboson production, including W or Z production with an associated photon; triple vector boson production; and $t\bar{t}$ production in association with one or two vector bosons. The normalization of the $t\bar{t}$ and DY backgrounds, and the normalization and shape of the misidentified lepton backgrounds, are evaluated from data using control samples. The shapes of the $t\bar{t}$ and DY backgrounds, and the normalization and shapes of less common processes, are all estimated from the simulation. We perform a number of checks to validate the modeling of the $M_{T2}$ distribution in our simulation (see Section 5.3). For the background processes estimated from simulation, we apply the corrective scale factors mentioned in Section 3.2.

$t\bar{t}$ estimation
The $t\bar{t} \to \ell\ell$ background represents about 90% of the events in the control region $M_{T2} < 80$ GeV (see Fig. 9 left). We can therefore use this region to determine the normalization of the expected SM $t\bar{t}$ contribution in the signal region. To accomplish this, we first count the number of data events in the control region and subtract the simulated contributions of all non-$t\bar{t}$ backgrounds; we then divide the result by the simulated $t\bar{t}$ yield in the control region. This procedure yields a scale factor of 1.024 ± 0.005. In this control region, the signal contamination relative to the expected $t\bar{t}$ contribution depends upon the $\Delta m$ considered: it is completely negligible at high $\Delta m$, but ranges between 5% and 40% at low $\Delta m$, depending on $\Delta m$ as well as on the top squark decay mode considered.
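The normalization procedure amounts to a single ratio, sketched below. The function name and the input yields are hypothetical: the yields are invented so that the arithmetic gives a scale factor of similar size to the quoted one, and they are not the control-region yields of the analysis. The uncertainty shown is an assumption that accounts only for the Poisson fluctuation of the data count.

```python
import math

def ttbar_scale_factor(n_data_cr, n_non_tt_mc_cr, n_tt_mc_cr):
    """Scale factor = (data - simulated non-ttbar) / simulated ttbar, in the control region."""
    sf = (n_data_cr - n_non_tt_mc_cr) / n_tt_mc_cr
    # Assumption: only the statistical uncertainty of the data count is propagated.
    sf_err = math.sqrt(n_data_cr) / n_tt_mc_cr
    return sf, sf_err

# Invented yields, for illustration only.
sf, sf_err = ttbar_scale_factor(n_data_cr=44000.0, n_non_tt_mc_cr=4064.0, n_tt_mc_cr=39000.0)
```

The simulated $t\bar{t}$ yield in the signal region would then be multiplied by `sf`.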

Estimation of the Drell-Yan background
To estimate the contribution of DY events in the selected events, we use the Z-boson mass resonance in the $M_{\ell^+\ell^-}$ distribution for opposite-charge, same-flavor dilepton events. From comparisons with data, we find that our simulation accurately models the Z mass line shape within systematic uncertainties. We can therefore calculate a normalization scale factor for simulated DY events by comparing the observed number of events inside the Z-veto region ($N^{\ell^+\ell^-}_{\mathrm{in}}$) against the expected number of DY events calculated from the simulation ($N^{\mathrm{DY}}_{\mathrm{in}}$), where the number of events with different flavor ($N^{e\mu}_{\mathrm{in}}$) is subtracted to account for non-DY processes contaminating $N^{\ell^+\ell^-}_{\mathrm{in}}$. The k-factors in Eq. (9) account for different reconstruction efficiencies for electrons and muons. Using Eq. (9), we calculate a scale factor of (1.43 ± 0.04) for µµ events and (1.46 ± 0.04) for ee events. To account for the contribution of eµ events originating from $Z \to \tau^+\tau^-$ decays, we estimate a scale factor of (1.44 ± 0.04) for eµ events by taking the geometric average of the scale factors for the same-flavor channels.
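The arithmetic of this estimate can be sketched as follows. Eq. (9) is not reproduced in the text above, so the form of `dy_scale_factor` is an assumption: it follows a commonly used convention in which half of the different-flavor yield, weighted by a k-factor, is subtracted from the same-flavor yield before dividing by the simulated DY yield. The function names and the example yields are hypothetical. The geometric average, by contrast, uses the two scale factors quoted in the text.

```python
import math

def dy_scale_factor(n_ll_in, n_emu_in, n_dy_mc_in, k_ll):
    """Assumed form: (N_in^ll - 0.5 * k_ll * N_in^emu) / N_in^DY(simulation)."""
    return (n_ll_in - 0.5 * k_ll * n_emu_in) / n_dy_mc_in

def geometric_mean(a, b):
    """Geometric average of two positive scale factors."""
    return math.sqrt(a * b)

# Scale factors quoted in the text for the mumu and ee channels.
sf_emu = geometric_mean(1.43, 1.46)

# Invented yields, for illustration only.
sf_demo = dy_scale_factor(n_ll_in=1500.0, n_emu_in=100.0, n_dy_mc_in=1000.0, k_ll=1.0)
```

Rounded to two decimals, `sf_emu` reproduces the value of 1.44 quoted for eµ events.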

Misidentified lepton background estimation
The misidentified lepton background consists of events in which non-prompt leptons pass the identification criteria. The largest category of events in this group comprises semileptonic $t\bar{t}$ events and leptonically decaying W events in which a jet, or a lepton within a jet, is misreconstructed as an isolated prompt lepton.
To estimate this background from data, we first measure the lepton misidentification rate, which is the probability for a non-prompt lepton to pass the requirements of an isolated lepton. This is done by counting the rate at which leptons with relaxed identification ("loose" leptons) pass the "tight" selection requirements (see Section 3.2). The measurement is performed in a data sample dominated by multijet events.
We then measure the prompt lepton rate, which is the efficiency for isolated and prompt leptons to pass the selection requirements, in a data sample enriched in $Z \to \ell^+\ell^-$ events. As with the misidentification rate, the prompt rate is determined by counting the rate at which loose leptons pass the tight selection requirements.
Both the lepton misidentification rate and the prompt lepton rate are measured as functions of lepton $p_T$ and $|\eta|$. For each dilepton event where both selected leptons pass at least the loose selection requirements, the measured misidentification and prompt rates directly translate into a weight for the event. These weights depend upon whether neither, one, or both loose leptons also pass the tight selection requirements. The shape and normalization of the misidentified lepton background are then extracted by first applying these derived weights to the data sample where both selected leptons pass at least the loose selection requirements, and then calculating the weighted distribution of relevant variables such as $M_{T2}$. Once the background is determined, the number of events falling into the $M_{T2}$ signal regions is found.
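One way such per-event weights can be built is the standard "matrix method", sketched below. The explicit formula is an assumption, since the text does not spell out the weights; the function names and the rates and yields in the closure check are hypothetical. Here `p` is the prompt rate and `f` the misidentification rate of a lepton, which in practice would be looked up as functions of its $p_T$ and $|\eta|$.

```python
def prompt_coefficient(is_tight, p, f):
    """Coefficient that unfolds one lepton's tight / not-tight status into its prompt component."""
    if is_tight:
        return (1.0 - f) / (p - f)
    return -f / (p - f)

def fake_weight(tight1, p1, f1, tight2, p2, f2):
    """Weight of a loose-loose event in the estimate of tight-tight events
    that contain at least one non-prompt lepton."""
    both_tight = 1.0 if (tight1 and tight2) else 0.0
    prompt_prompt = (p1 * p2
                     * prompt_coefficient(tight1, p1, f1)
                     * prompt_coefficient(tight2, p2, f2))
    # Total tight-tight contribution minus the prompt-prompt part.
    return both_tight - prompt_prompt

def closure_check(p=0.9, f=0.2, n_pp=100.0, n_pf=20.0, n_fp=10.0, n_ff=5.0):
    """Build expected category yields from invented truth yields, then apply the weights."""
    truth = {(True, True): n_pp, (True, False): n_pf,
             (False, True): n_fp, (False, False): n_ff}

    def pass_prob(is_prompt, is_tight):
        rate = p if is_prompt else f
        return rate if is_tight else 1.0 - rate

    estimate = 0.0
    for tight1 in (True, False):
        for tight2 in (True, False):
            # Expected number of events in this tight / not-tight category.
            n_cat = sum(n * pass_prob(pr1, tight1) * pass_prob(pr2, tight2)
                        for (pr1, pr2), n in truth.items())
            estimate += n_cat * fake_weight(tight1, p, f, tight2, p, f)
    # True tight-tight yield with at least one non-prompt lepton.
    expected = p * f * n_pf + f * p * n_fp + f * f * n_ff
    return estimate, expected

estimate, expected = closure_check()
```

In the closure check the weighted sum over the four categories reproduces the tight-tight yield that contains at least one non-prompt lepton, which is the quantity the background estimate targets. Filling a histogram of a variable such as $M_{T2}$ with these weights gives the shape as well as the normalization.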

Checks of the $M_{T2}$ shape
The search in the dileptonic final states requires a good understanding of the $M_{T2}$ shape. In this section we provide a number of validation studies performed with simulation, with comparisons to data in control regions.
One of the main factors determining the $M_{T2}$ shape is the intrinsic resolution and energy scale of the input objects used in the $M_{T2}$ calculation. From studies using $Z \to \ell\ell$ events, we confirm that the Gaussian core of the $E_T^{\mathrm{miss}}$ resolution function is sufficiently well-modeled by the simulation. These studies also confirm that the resolution and scale of the lepton $p_T$ are both well-modeled in the simulation.
The intrinsic width of the intermediate W bosons in dileptonic $t\bar{t}$ events drives the shape of the $M_{T2}$ distribution near the kinematic edge at 80 GeV. Comparisons of events with different generated W widths (between 289 MeV and 2.1 GeV) show that any systematic uncertainty in the W boson width has a negligible effect in the selected signal regions.
The final notable effect driving the $M_{T2}$ shape is the category of events populating the tails of the $E_T^{\mathrm{miss}}$ resolution function. To confirm that this class of events is modeled in simulation with reasonable accuracy, we perform comparisons between data and simulation in a control region enriched in $Z \to \ell\ell$ events; this control region is obtained by inverting the Z boson veto and requiring zero reconstructed b jets. Figure 10 shows the $M_{T2}$ distribution in this control region, illustrating that the data distribution, including expected events in the tail, is well-modeled by the simulation.

Systematic uncertainties
We present the dominant systematic uncertainties affecting the dilepton search.

Systematic uncertainties affecting the background and signal
The $E_T^{\mathrm{miss}}$ measurement, and subsequently the shape of the $M_{T2}$ distribution, is affected by uncertainties in the lepton energy scale, the JES, the jet energy resolution, and the scale of the unclustered energy (objects with $p_T < 10$ GeV) in the event. We vary the four-momenta of the leptons and jets within their systematic uncertainties, and propagate the shifted $p_T$ back into the $E_T^{\mathrm{miss}}$ and $M_{T2}$ calculations. We vary the jet energy resolution within its uncertainty and propagate the change back into the $E_T^{\mathrm{miss}}$ calculation. For the unclustered energy scale, we scale the total $p_T$ of the unclustered energy by ±10% and propagate the change back into the $E_T^{\mathrm{miss}}$ calculation.
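As an illustration of the last variation, the sketch below shifts the unclustered energy and recomputes the missing transverse momentum. It assumes that the missing transverse momentum is the negative vector sum of the leptons, the jets, and the unclustered component, so that the unclustered part can be obtained by subtraction; the function names and the momenta are hypothetical, and the implementation used in the analysis is not described in the text.

```python
def unclustered_sum(met, leptons, jets):
    """Unclustered transverse momentum, assuming met = -(leptons + jets + unclustered)."""
    sum_x = met[0] + sum(l[0] for l in leptons) + sum(j[0] for j in jets)
    sum_y = met[1] + sum(l[1] for l in leptons) + sum(j[1] for j in jets)
    return (-sum_x, -sum_y)

def shift_met_unclustered(met, leptons, jets, rel_shift):
    """Missing transverse momentum after scaling the unclustered part by (1 + rel_shift)."""
    ux, uy = unclustered_sum(met, leptons, jets)
    return (met[0] - rel_shift * ux, met[1] - rel_shift * uy)

# Illustrative (px, py) values in GeV, not taken from the analysis.
met = (10.0, -20.0)
leptons = [(30.0, 0.0)]
jets = [(-50.0, 10.0)]
met_up = shift_met_unclustered(met, leptons, jets, +0.10)
met_down = shift_met_unclustered(met, leptons, jets, -0.10)
```

The shifted vectors would then be fed into the $M_{T2}$ calculation to obtain the varied distributions.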
As with the single-lepton search (see Section 4.3), we also apply systematic uncertainties to account for the intrinsic statistical uncertainty in the simulation samples as well as any mismodeling by the simulation of the b-tagging efficiency, the lepton trigger efficiency, the lepton ID and isolation, and the limited modeling of ISR jets by the simulation. No substantial correlation has been observed between the value of $M_{T2}$ and the size of these four systematic uncertainties.

Systematic uncertainties affecting only the background
For the two background normalizations ($t\bar{t}$ and DY), we account for the statistical uncertainty in the normalization. For the misidentified lepton background (see Section 5.2.3), the two primary sources of systematic uncertainty are the statistical uncertainty in the measured rates of prompt and misidentified leptons, and any systematic uncertainty in the measurement of the misidentification rate. Combining these in quadrature yields a total systematic uncertainty of ∼75% for the considered signal regions. For the diboson background processes, which are estimated from the simulation, we apply a conservative cross section uncertainty of 50%.
Table 5 displays the magnitude of the effect of the aforementioned systematic uncertainties (Sections 5.4.1 and 5.4.2) on the background estimate for each of the considered signal regions.

Systematic uncertainties affecting only the signal
As in the single-lepton search, we account for the effect of PDF uncertainties in the signal efficiency. The resulting uncertainty in signal efficiency is found to be ∼4% across all signal mass points.

Summary of the dilepton search
We have developed a signal selection based on the $M_{T2}$ distribution. Table 6 presents the predicted backgrounds as well as the number of observed data events for all signal regions; we do not observe any excess of data events compared to the predicted total background. Top quark pair production dominates the composition of the total predicted background in the four signal regions with the lowest $M_{T2}$ threshold, decreasing from 91% to 45% with increasing threshold, while DY dominates in the last region (∼38%). As with the single-lepton search, the signal contamination is also taken into account in the final interpretation of the results.

Combination and final results
After applying all selections for the single-lepton and dilepton data sets, no evidence for direct top squark production is observed (see Tables 4 and 6). We proceed to combine the results of the two searches. In this combination, no overlap is expected in the event selections of the two searches, and none is observed. Since the background predictions are primarily based on data in the two searches, the corresponding systematic uncertainties are taken to be uncorrelated. Systematic uncertainties affecting the expected signal, as well as those due to luminosity, b tagging, PDF, JES, and lepton identification and isolation, are treated as 100% correlated between the two searches.
We interpret the absence of excess in both single-lepton and dilepton searches in terms of a 95% confidence level (CL) exclusion of top squark pair production in the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane. A frequentist $\mathrm{CL}_S$ method [46][47][48] with a one-sided profile is used, taking into account the predicted background and observed number of data events, and the expected signal yield for all signal points. In this method, Poisson likelihoods are assigned to each of the single-lepton and dilepton yields, for each $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ signal point, and multiplied to give the combined likelihood for both observations. The final yields of each analysis are taken from the signal region corresponding to the considered signal point. Systematic uncertainties are included as nuisance parameter distributions. A test statistic defined to be the likelihood ratio between the background-only and signal-plus-background hypotheses is used to set exclusion limits on top squark pair production; the distributions of these test statistics are constructed using simulated experiments. When interpreting the results for the tt and bbWW decay modes, we make the hypothesis of unit branching fractions, $\mathcal{B}(\tilde{t}_1 \to t^{(*)} \tilde{\chi}^0_1) = 1$ and $\mathcal{B}(\tilde{t}_1 \to b \tilde{\chi}^\pm_1) = 1$, respectively. The expected and observed limits, for which we combine the results of both searches and account for signal contamination, are reported in Fig. 11; the experimental uncertainties are reported on the expected contour, while the PDF uncertainty for the signal cross section,
Figure 11: Exclusion limit at 95% CL obtained with a statistical combination of the results from the single-lepton and dilepton searches, for the tt (top left), bbWW x = 0.25 (top right), bbWW x = 0.50 (bottom left) and bbWW x = 0.75 (bottom right) decay modes. The red and black lines represent the expected and observed limits, respectively; the dotted lines represent in each case the ±1σ variations of the contours. For all decay modes, we show the kinematic limit $m(\tilde{t}_1) = m(b) + m(W) + m(\tilde{\chi}^0_1)$ on the left side of the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane; for the tt decay mode, we show the $\Delta m = m(t)$ line; and for the bbWW decay mode, we show the $m(\tilde{\chi}^\pm_1) - m(\tilde{\chi}^0_1) = m(W)$ line.
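The combined-likelihood $\mathrm{CL}_S$ construction described in this section can be sketched for a simplified case. The sketch makes several assumptions that differ from the analysis: it omits the nuisance parameters, it enumerates the Poisson outcomes exactly instead of generating simulated experiments, and the function names and yields are hypothetical. The test statistic is taken as $q = -2\ln[L(s+b)/L(b)]$ for the product of the two Poisson terms, so that larger $q$ is more background-like.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of observing n events for a mean mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def q_stat(counts, sig, bkg):
    """q = -2 ln [ L(s+b) / L(b) ] for a product of independent Poisson terms."""
    return 2.0 * sum(s - n * math.log(1.0 + s / b)
                     for n, s, b in zip(counts, sig, bkg))

def cls_two_channels(n_obs, sig, bkg, n_max=100):
    """CL_s = CL_{s+b} / CL_b for two counting channels, by exact enumeration."""
    q_obs = q_stat(n_obs, sig, bkg)
    pmf_sb = [[poisson_pmf(n, s + b) for n in range(n_max)] for s, b in zip(sig, bkg)]
    pmf_b = [[poisson_pmf(n, b) for n in range(n_max)] for b in bkg]
    p_sb = 0.0  # P(q >= q_obs | signal + background)
    p_b = 0.0   # P(q >= q_obs | background only)
    for n1 in range(n_max):
        for n2 in range(n_max):
            if q_stat((n1, n2), sig, bkg) >= q_obs - 1e-9:
                p_sb += pmf_sb[0][n1] * pmf_sb[1][n2]
                p_b += pmf_b[0][n1] * pmf_b[1][n2]
    return p_sb / p_b

# Invented yields (single-lepton channel, dilepton channel), for illustration only.
cls_value = cls_two_channels(n_obs=(10, 5), sig=(15.0, 10.0), bkg=(10.0, 5.0))
excluded_at_95 = cls_value < 0.05
```

A signal point would be regarded as excluded at 95% CL when the resulting value falls below 0.05; scanning the signal points in the mass plane with this criterion traces out an exclusion contour.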

Figure 1: Top squark direct pair production at the LHC. Left: tt decay mode. Right: bbWW decay mode.

T
vector is denoted by $E_T^{\mathrm{miss}} \equiv |\vec{p}_T^{\,\mathrm{miss}}|$. We reject events where known detector effects or noise lead to anomalously large $E_T^{\mathrm{miss}}$ values.

Figure 2: Distribution of the transverse momentum of the lepton versus the missing transverse energy at the preselection, for the simulated $t\bar{t}$ background (left) and for the bbWW decay mode (x = 0.50) of the signal with $m(\tilde{t}_1) - m(\tilde{\chi}^0_1) \geq 625$ GeV (right).
Distributions of some of the variables used for the bbWW (x = 0.75) decay mode are illustrated in Fig. 3. The figure shows the distributions for both $t\bar{t}$ and signal samples; in the latter case four different kinematic possibilities are illustrated, distinguished by values of $\Delta m$:
$$\Delta m \equiv m(\tilde{t}_1) - m(\tilde{\chi}^0_1). \qquad (4)$$

Figure 3: Distribution of some discriminating variables for the bbWW (x = 0.75) decay mode at the preselection level, for the main $t\bar{t}$ background and benchmark signal mass points grouped in bands of constant width $\Delta m$ = (150 ± 25), (350 ± 25), (550 ± 25), and (750 ± 25) GeV. Distributions are normalized to the same area. From left to right and from top to bottom: $M^W_{T2}$, M(3 jet), $M(\ell b)$ and N(jets).

Figure 4: Distributions of different variables in both data and simulation, for both e and µ final states at the preselection level without the $M_T$ requirement. From left to right: $M^W_{T2}$, M(3 jet), $M(\ell b)$ and N(jets). The hatched region represents the quadratic sum of statistical and JES simulation uncertainties. The lower panel shows the ratio of data to total simulation background, with the red band representing the uncertainties mentioned in the text. Two signal mass points of the bbWW decay mode (x = 0.75) are represented by open histograms, dashed and solid, with their cross sections scaled by 100; the two mass points $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ are (300, 200) and (500, 200) GeV.

Figure 5: Signal regions (SRs) defined as functions of the chosen BDT trainings in the $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ plane for tt (top left), bbWW x = 0.25 (top right), 0.50 (bottom left), and 0.75 (bottom right) decay modes. The SRs are delimited by continuous red lines, and the final selections within the different SRs are delimited by dashed red lines. The attributes "low / high $m(\tilde{\chi}^0_1)$" and "low / high $\Delta m$" indicate that in these regions different thresholds are applied for the same BDT training.

Figure 7: Full $M_T$ distribution in the control region with zero b jets, without any extra signal selection. Left: without the tail correction factors applied; right: with $\mathrm{SFR}_W$ and $\mathrm{SFR}_1$ corrections applied. The plots on the bottom represent the ratio of data over the predicted background.

Figure 8: Left: Comparison of data and simulation in the $M_{\ell b}$ distributions for events with $50 < M_T < 80$ GeV and zero b jets. Right: Shape comparison between $t\bar{t} \to 1\ell$ and W+jets for $M_T > 100$ GeV.

Figure 9: Left: Data, expected background, and signal contributions in the $M_{T2}$ distribution at the preselection level. Background processes are estimated as in Section 5.2. The uncertainty bands are calculated from the full list of uncertainties discussed in Section 5.4. The same signal mass point $(m(\tilde{t}_1), m(\tilde{\chi}^0_1))$ = (400, 50) GeV is represented for the tt and bbWW (x = 0.75) decay modes. Right: $M_{T2}$ distribution for the $t\bar{t}$ background and different signal mass points of the tt decay mode regrouped in constant $\Delta m$ bands; distributions are normalized to the same area.

Figure 10: Data and expected background contributions for the $M_{T2}$ distribution in a control region enriched in $Z \to \ell\ell$ events. This control region is similar to the preselection, except that the Z boson veto and b jet requirements have been inverted. Background processes are estimated as in Section 5.2. The uncertainty bands are calculated from the full list of uncertainties discussed in Section 5.4.

Table 5: The relevant sources of systematic uncertainty in the background estimate for each signal region used in the limit setting. From left to right, the systematic uncertainty sources are: lepton energy scale ($\ell$ES), jet energy scale (JES), unclustered energy scale (Uncl.), $E_T^{\mathrm{miss}}$ energy resolution from jets (JER), uncertainty in b tagging scale factors (b tag), lepton selection efficiency ($\ell$ eff.), ISR reweighting (ISR), the misidentified lepton estimate (ML), and the combined normalization uncertainty in the $t\bar{t}$, DY, and other electroweak backgrounds (σ).

Table 1: Kinematic conditions for the $\tilde{t}_1$ decay modes explored in this paper.
plane, so to

Table 2: Final selection variables chosen as input for the BDT training, as functions of the decay modes bbWW and tt, and kinematic regions. Column headings $\Delta R$ and $\Delta\phi$ refer to $\Delta R(\ell, b_1)$ and $\Delta\phi(j_{1,2}, \vec{p}_T^{\,\mathrm{miss}}$

Table 3: Summary of the relative systematic uncertainties in the total background, at the preselection level, and the range of variation over the BDT selections.

Table 4: Background prediction without signal contamination and observed data for the BDT selections. The total systematic uncertainties are reported for the predicted background.

Table 6: Data yields and background expectation for five different $M_{T2}$ threshold values. The asymmetric uncertainties quoted for the background indicate the total systematic uncertainty, including the statistical uncertainty in the background expectation.