Search for direct pair production of supersymmetric top quarks decaying to all-hadronic final states in pp collisions at sqrt(s) = 8 TeV

Results are reported from a search for the pair production of top squarks, the supersymmetric partners of top quarks, in final states with jets and missing transverse momentum. The data sample used in this search was collected by the CMS detector and corresponds to an integrated luminosity of 18.9 inverse femtobarns of proton-proton collisions at a centre-of-mass energy of 8 TeV produced by the LHC. The search features novel background suppression and prediction methods, including a dedicated top quark pair reconstruction algorithm. The data are found to be in agreement with the predicted backgrounds. Exclusion limits are set in simplified supersymmetry models with the top squark decaying to jets and an undetected neutralino, either via a top quark or through a bottom quark and chargino. Models with the top squark decaying via a top quark are excluded for top squark masses up to 755 GeV in the case of neutralino masses below 200 GeV. For decays via a chargino, top squark masses up to 620 GeV are excluded, depending on the masses of the chargino and neutralino.


Introduction
The standard model (SM) of particle physics is an extremely powerful framework for the description of the known elementary particles and their interactions. Nevertheless, the existence of dark matter [1][2][3] inferred from astrophysical observations, together with a wide array of theoretical considerations, all point to the likelihood of physics beyond the SM. New physics could be in the vicinity of the electroweak (EW) scale and accessible to experiments at the CERN LHC [4]. In addition, the recent discovery of a Higgs boson [5][6][7] at a mass of 125 GeV [8][9][10] has meant that the hierarchy problem, also known as the 'fine-tuning' or 'naturalness' problem [11][12][13][14][15][16], is no longer hypothetical.
A broader theory that can address many of the problems associated with the SM is supersymmetry (SUSY) [17][18][19][20][21], which postulates a symmetry between fermions and bosons. In particular, a SUSY particle (generically referred to as a 'sparticle' or 'superpartner') is proposed for each SM particle. A sparticle is expected to have the same couplings and quantum numbers as its SM counterpart with the exception of spin, which differs by a half-integer. Spin-1/2 SM fermions (quarks and leptons) are thus paired with spin-0 sfermions (the squarks and sleptons). There is a similar, but slightly more complicated pairing for bosons; SUSY models have extended Higgs sectors that contain neutral and charged higgsinos that mix with the SUSY partners of the neutral and charged EW gauge bosons, respectively. The resulting mixed states are referred to as neutralinos $\tilde{\chi}^0$ and charginos $\tilde{\chi}^\pm$.
Supersymmetry protects the mass of the Higgs boson against divergent quantum corrections associated with virtual SM particles by providing cancellations via the corresponding corrections for virtual superpartners [22][23][24][25]. Since no sparticles have been observed to date, they are generally expected to be more massive than their SM counterparts. On the other hand, sparticle masses cannot be arbitrarily large if they are to stabilise the Higgs boson mass without an unnatural level of fine-tuning. This is particularly important for the partners of the third generation SM particles that have large Yukawa couplings to the Higgs boson [26][27][28][29]. The top and bottom squarks ($\tilde{t}$ and $\tilde{b}$) are expected to be among the lightest sparticles and potentially the most accessible at the LHC, especially when all other constraints are taken into consideration [27,30]. With conservation of R-parity [31,32], SUSY particles are produced in pairs and the lightest SUSY particle (LSP) is stable. If the lightest weakly interacting neutralino ($\tilde{\chi}^0_1$) is the stable LSP, it is a leading candidate for dark matter [33]. Based upon these considerations, it is of particular interest at the LHC to look for evidence of the production of $\tilde{t}\bar{\tilde{t}}$ pairs with decay chains of the $\tilde{t}$ and $\bar{\tilde{t}}$ ending in SM particles and LSPs. The latter do not interact with material in the detector and so must have their presence inferred from missing transverse momentum $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$, which in each event is defined as the projection of the negative vector sum of the momenta of all reconstructed particles onto the plane perpendicular to the beam line. Its magnitude is referred to as $E_{\mathrm{T}}^{\mathrm{miss}}$.
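The definition of the missing transverse momentum given above can be illustrated with a minimal sketch. It is not the CMS reconstruction code; the function name and the representation of each particle as a (pT, φ) pair are assumptions made only for this example.

```python
import math

def missing_transverse_momentum(particles):
    """Return (px_miss, py_miss, magnitude) for a list of (pt, phi) pairs.

    The beam line is taken along z, so only the transverse components of
    each reconstructed particle enter the negative vector sum.
    """
    px_miss = -sum(pt * math.cos(phi) for pt, phi in particles)
    py_miss = -sum(pt * math.sin(phi) for pt, phi in particles)
    return px_miss, py_miss, math.hypot(px_miss, py_miss)

# Two back-to-back particles with unequal pT leave a net imbalance of 60 GeV.
px, py, met = missing_transverse_momentum([(100.0, 0.0), (40.0, math.pi)])
```

An event whose visible particles balance exactly in the transverse plane gives a magnitude of zero in this sketch; undetected LSPs or neutrinos would show up as a non-zero value.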
Within the Simplified Model Spectra (SMS) framework [34][35][36] the study presented here considers two broad classes of signals that lead to a bbqqqq + $E_{\mathrm{T}}^{\mathrm{miss}}$ final state via decay modes denoted T2tt and T2bW. These are defined, respectively, as (i) $\tilde{t}$ decay to a top quark: $\tilde{t} \to t\tilde{\chi}^0_1 \to bW^{+}\tilde{\chi}^0_1$, and (ii) $\tilde{t}$ decay via a chargino: $\tilde{t} \to b\tilde{\chi}^{+} \to bW^{+}\tilde{\chi}^0_1$. Figure 1 shows the diagrams representing these two simplified models. The two decay modes are not mutually exclusive, and it is possible for one of the top squarks to decay as in T2tt and the other as in T2bW. However, such a scenario is not considered in the analysis presented here.
Only the lightest $\tilde{t}$ mass eigenstate is assumed to be involved, although the results are equivalent for the heavier eigenstate. The polarization of the $\tilde{t}$ decay products depends on the properties of the SUSY model, such as the left and right $\tilde{t}$ mixing [37,38]. Instead of choosing a specific model, each SMS scenario is assumed to have unpolarized decay products and has a 100% branching ratio to the final state under consideration. As such, the results can be interpreted, with appropriately rescaled branching fractions, in the context of any SUSY model in which these decays are predicted to occur. With event characteristics of these signals in mind, we have developed a search for pair production of top squarks with decays that result in a pair of LSPs in the final state in addition to SM particles. Two selection criteria address the desire to extract a potentially very small signal from a sample dominated by top quark pair events. The first criterion comes from the $E_{\mathrm{T}}^{\mathrm{miss}}$ signature associated with the LSPs, which motivates the focus on all-hadronic final states, as this eliminates large sources of SM background events with genuine $E_{\mathrm{T}}^{\mathrm{miss}}$ from neutrinos in leptonic W decays. The all-hadronic final state with $E_{\mathrm{T}}^{\mathrm{miss}}$ constitutes 45% of the signal because W bosons decay to quarks with a 67% branching ratio. For the same reason this final state makes up an even higher proportion of the subset of events with high jet multiplicity including many jets with high transverse momentum, $p_{\mathrm{T}}$, that is often required in SUSY searches to eliminate SM backgrounds. The second criterion relies upon the identification of top quark decay products to eliminate such backgrounds as SM production of W bosons in association with jets.
Together, these criteria define a preselection region consisting of events that pass stringent vetoes on the presence of charged leptons, and are required to have large E miss T , two tagged b quark jets, and four additional jets from the hadronisation and decay of light quarks.
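The 45% all-hadronic fraction quoted above follows from requiring both W bosons in the event to decay to quarks. The short calculation below makes the arithmetic explicit; it assumes that the two W decays are independent, and the names used are illustrative only.

```python
BR_W_TO_QUARKS = 0.67  # hadronic W branching ratio quoted in the text

def all_hadronic_fraction(br_hadronic=BR_W_TO_QUARKS):
    """Fraction of events in which both W bosons decay to quarks,
    assuming the two decays are independent."""
    return br_hadronic ** 2

print(f"{all_hadronic_fraction():.3f}")  # prints 0.449, i.e. about 45%
```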
In spite of these stringent requirements, the low production cross sections of new physics signals mean that they are easily overwhelmed by SM backgrounds. In the case of SUSY, for example, the cross section for the production of top squark pairs with $m_{\tilde{t}}$ = 800 GeV is predicted to be nearly five orders of magnitude smaller than that of top quark pairs [39]. For this reason, this analysis focuses heavily on background suppression, employing several new methods that improve sensitivity to signal. The relevant SM processes contributing to this analysis fall into four main categories: (i) top quark and W boson events where the W decays leptonically, thereby contributing genuine $E_{\mathrm{T}}^{\mathrm{miss}}$, but the lepton is not successfully reconstructed or identified, or it is outside the acceptance of the detector; (ii) invisible decays of the Z boson when produced in association with jets, Z+jets with Z → νν; (iii) QCD multijet production, which, due to its very high rate, can produce events with substantial $E_{\mathrm{T}}^{\mathrm{miss}}$ in the very rare cases of either extreme mismeasurements of jet momenta or the leptonic decay of heavy-flavour hadrons with large neutrino $p_{\mathrm{T}}$; and (iv) ttZ production (with Z → νν), which is an irreducible background to signals with top squark decays via on-shell top quarks. The ttZ process has a small cross section that has been measured by ATLAS and CMS to be $176^{+58}_{-52}$ fb [40] and $242^{+65}_{-55}$ fb [41], respectively.
The first step in developing the search is the construction of a set of optimised vetoes for all three lepton flavours that reduce SM backgrounds for both signal types. Next, specific features of each signal type are exploited by combining several variables in a multivariate analysis (MVA) based upon Boosted Decision Trees (BDT). For T2tt, a high performance hadronic top quark decay reconstruction algorithm is developed and used to facilitate discrimination of signal from background by using details of top quark kinematics.
This paper is organised as follows: Section 2 describes the CMS detector, while Section 3 discusses event reconstruction, event selection, and Monte Carlo (MC) simulations of signal and background. The top quark pair reconstruction algorithm and lepton vetoes are described in Sections 4 and 5, respectively. The search regions are discussed in Section 6, and the evaluation of backgrounds is presented in Section 7 along with a discussion of the method of MC reweighting. Final results and their interpretations are presented in Section 8, followed by a summary in Section 9.

CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid.
The silicon tracker measures charged particles within the range |η| < 2.5. Isolated particles of $p_{\mathrm{T}}$ = 100 GeV emitted with |η| < 1.4 have track resolutions of 2.8% in $p_{\mathrm{T}}$ and 10 (30) µm in the transverse (longitudinal) impact parameter [69]. The ECAL and HCAL measure energy deposits in the range |η| < 3. Quartz-steel forward calorimeters extend the coverage to |η| = 5. The HCAL, when combined with the ECAL, measures jets with a resolution $\Delta E/E \approx 100\%/\sqrt{E\,[\mathrm{GeV}]} \oplus 5\%$ [70]. Muons are measured in the range |η| < 2.4. Matching muons to tracks measured in the silicon tracker results in a relative $p_{\mathrm{T}}$ resolution for muons with 20 < $p_{\mathrm{T}}$ < 100 GeV of 1.3-2.0% in the barrel and better than 6% in the endcaps. The $p_{\mathrm{T}}$ resolution in the barrel is better than 10% for muons with $p_{\mathrm{T}}$ up to 1 TeV [71].
Particles reconstructed with the CMS particle-flow (PF) algorithm are clustered into jets by the anti-$k_{\mathrm{T}}$ algorithm [77,78] with a distance parameter of 0.5 in the η-φ plane. For a jet, the momentum is determined as the vectorial sum of all associated particle momenta and is found from MC simulated data to be within 5-10% of the true momentum of the generated particle from which the jet originates over the whole $p_{\mathrm{T}}$ spectrum and detector acceptance. An offset correction determined for each jet via the average $p_{\mathrm{T}}$ density per unit area and the jet area is applied to jet energies to take into account the contribution from pileup, defined as the additional proton-proton interactions within the same or adjacent bunch crossings [70]. Jet energy corrections are derived from simulated events and are confirmed with in situ measurements of the energy balance in dijet and photon+jet events. Additional selection criteria are applied to each event to remove spurious jet-like features originating from isolated noise patterns in certain HCAL regions [79].
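The pileup offset correction described above can be sketched as a subtraction of the product of the event pT density and the jet area. This is only an illustration of the idea under stated assumptions: the function name is hypothetical, the correction is applied to the jet pT rather than to the full four-momentum, and clamping the result at zero is a choice made for this sketch.

```python
def pileup_corrected_pt(pt, rho, area):
    """Subtract the expected pileup contribution rho * area from a jet pT.

    pt   : uncorrected jet transverse momentum (GeV)
    rho  : average pileup pT density per unit area in the event (GeV)
    area : jet area in the eta-phi plane
    The result is clamped at zero (an assumption of this sketch).
    """
    return max(pt - rho * area, 0.0)

# A 50 GeV jet of area 0.8 in an event with rho = 10 GeV loses 8 GeV.
corrected = pileup_corrected_pt(50.0, 10.0, 0.8)
```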
Jets referred to as 'picky jets' are the input to the Comprehensively Optimised Resonance Reconstruction ALgorithm (CORRAL) for top quark reconstruction. The picky jet reconstruction algorithm is not constrained to any fixed characteristic width or cutoff and therefore is optimized for clustering the particles associated with the b quark and quarks from the W boson. This leads to an improvement in the reconstruction of top quark decays with a wide range of Lorentz boosts, as expected in signal events. The CORRAL and picky jet algorithms are described in Section 4.
Jets are identified as originating from the hadronisation of a bottom quark (b-tagged) by means of the CMS combined secondary vertex (CSV) tagger [80,81]. The standard CMS "tight" operating point for the CSV tagger is used [80], which has approximately 50% b tagging efficiency, 0.1% light flavour jet misidentification rate, and an efficiency of 5% for c quark jets.
Several simulated data samples based on MC event generators are used throughout this analysis. Signal samples are produced using the MADGRAPH (version 5.1.3.30) [82] event generator with CTEQ6L [83] parton distribution functions (PDFs). For both the T2tt and T2bW signals, the top squark mass ($m_{\tilde{t}}$) is varied from 200 to 1000 GeV, while the LSP mass ($m_{\tilde{\chi}^0_1}$) is varied from 0 to 700 GeV for T2tt and 0 to 550 GeV for T2bW. The masses are varied in steps of 25 GeV in all cases. For the T2bW sample the chargino mass is defined via the fraction x applied to the top squark and neutralino masses as follows: $m_{\tilde{\chi}^\pm} = x\, m_{\tilde{t}} + (1 - x)\, m_{\tilde{\chi}^0_1}$. We consider three fractions for x: 0.25, 0.50, and 0.75.
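As a worked illustration of the chargino mass definition, the sketch below evaluates the interpolation formula for the three values of x at one example mass point. The function name and the chosen mass point are assumptions made for this example and do not correspond to a specific point highlighted in the analysis.

```python
def chargino_mass(m_stop, m_lsp, x):
    """Chargino mass (GeV) interpolated between the neutralino (x = 0)
    and top squark (x = 1) masses."""
    return x * m_stop + (1.0 - x) * m_lsp

# Example point: top squark mass 600 GeV, neutralino mass 100 GeV.
for x in (0.25, 0.50, 0.75):
    print(x, chargino_mass(600.0, 100.0, x))
```

For this example point the three fractions give chargino masses of 225, 350, and 475 GeV, so larger x places the chargino closer in mass to the top squark.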
Standard model backgrounds are generated with MADGRAPH, POWHEG (version 1.0 r1380) [84][85][86][87][88], PYTHIA (version 6.4.26) [89], or MC@NLO (version 3.41) [90,91]. The MADGRAPH generator is used for the generation of Z and W bosons accompanied by up to three additional partons as well as for diboson and ttW processes, while the single top quark and tt processes are generated with POWHEG. Multijet QCD events are produced in two samples, one generated with PYTHIA and the other with MADGRAPH. Two ttZ event samples are used. One is generated with MC@NLO and the other with MADGRAPH. The decays of τ leptons are simulated with TAUOLA (version 27.121.5) [92].
The PYTHIA generator is subsequently used to perform parton showering for all signal and background samples, except for the MC@NLO ttZ sample, which uses HERWIG (version 6.520) [93]. The detector response for all background samples is simulated with GEANT4 [94], while the CMS fast simulation package [95] is used for producing signal samples in the grid of mass points described earlier. Detailed cross checks are performed to ensure that the results obtained with the fast simulation are in agreement with those obtained with the GEANT-based full simulation.
Events are selected online by a trigger that requires $E_{\mathrm{T}}^{\mathrm{miss}}$ > 80 GeV and the presence of two central (|η| < 2.4) jets with $p_{\mathrm{T}}$ > 50 GeV. Offline, a preselection of events common to all search samples used in the analysis has the following requirements:
• There must not be any isolated electrons, muons, or tau leptons in the event. This requirement is intended mainly to suppress backgrounds with genuine $E_{\mathrm{T}}^{\mathrm{miss}}$ that arise from W boson decays. The high efficiency lepton selection criteria used in the definitions of the lepton vetoes are described in detail in Section 5.
• There must be $E_{\mathrm{T}}^{\mathrm{miss}}$ > 175 GeV and at least two jets with $p_{\mathrm{T}}$ > 70 GeV and |η| < 2.4, such that the online selection is fully efficient.
• The azimuthal angular separation between each of the two highest $p_{\mathrm{T}}$ jets and $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ must satisfy |∆φ| > 0.5, while for the third leading jet, the requirement is |∆φ| > 0.3. These criteria suppress rare QCD multijet events with severely mismeasured high-$p_{\mathrm{T}}$ jets.
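The angular requirement in the last item of the list above can be sketched as follows. This is an illustration rather than the analysis code: the function names are hypothetical, jets are assumed to be supplied in order of decreasing pT, and events with fewer than three jets are simply tested on the jets that are present.

```python
import math

def delta_phi(phi1, phi2):
    """Absolute azimuthal separation wrapped into the range [0, pi]."""
    dphi = math.fmod(phi1 - phi2, 2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    elif dphi < -math.pi:
        dphi += 2.0 * math.pi
    return abs(dphi)

def passes_delta_phi(jet_phis, met_phi):
    """Apply |dphi| > 0.5 to the two leading jets and |dphi| > 0.3 to the third.

    jet_phis: jet azimuthal angles ordered by decreasing pT.
    zip() stops after three jets, so softer jets are not tested.
    """
    thresholds = (0.5, 0.5, 0.3)
    return all(delta_phi(phi, met_phi) > cut
               for phi, cut in zip(jet_phis, thresholds))
```

An event in which a leading jet points along the missing transverse momentum, as expected when that jet is badly mismeasured, fails this selection.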
Baseline selections for the two targeted signal types are then defined by the following additional requirements. The T2tt baseline selection requires one or more b-tagged picky jets with p T > 30 GeV and |η| < 2.4, and at least one pair of top quarks reconstructed by the CORRAL algorithm. The T2bW baseline selection requires at least five jets (p T > 30 GeV and |η| < 2.4) of which at least one must be b-tagged. SM background yields, estimated as described in Section 7, and signal yields after the baseline selections are shown in Table 1. The trigger efficiency is measured to be greater than 95% for events passing these baseline selections.
Table 1: Estimated SM background yields as obtained with the methods described in Section 7, and the observed data yields for the T2tt and T2bW baseline selections. The T2bW yield corresponds to the simplified model point
A number of data control samples are used to derive corrections to reconstructed quantities and to estimate SM backgrounds. There are four control samples involving at least one well-identified lepton and two that are high purity QCD multijet samples. The leptonic control samples are used to understand tt and vector boson plus jets backgrounds and are named accordingly, as indicated below. The data are drawn from samples collected online with triggers that require the presence of at least one charged lepton. The standard CMS lepton identification algorithms operating at their tightest working points [71,76] are then applied offline. Each event must have at least one selected muon with $p_{\mathrm{T}}$ > 28 GeV and |η| < 2.1 or a selected electron with $p_{\mathrm{T}}$ > 30 GeV and |η| < 2.4. Additional leptons must have $p_{\mathrm{T}}$ > 15 GeV and |η| < 2.4.
• The inclusive tt control sample: At least one identified lepton and three or more jets, of which at least one must be b-tagged.
• The high purity tt control sample: This is the subset of the inclusive tt control sample for which the selected lepton is a muon and there are at least two b-tagged jets.
• The inclusive W+jets control sample: There must be one identified muon. In addition, the transverse mass $m_{\mathrm{T}}$ formed from $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ and the muon momentum is required to be ≥ 40 GeV in order to reduce QCD multijet contamination.
• The inclusive Z+jets control sample: There must be two identified leptons of the same flavour with an invariant mass in the range 80 < $m_{\ell\ell}$ < 100 GeV, consistent with the mass of the Z boson.
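The transverse mass used in the W+jets control sample is not written out in the text. The sketch below uses the conventional definition for a massless lepton, $m_{\mathrm{T}} = \sqrt{2\, p_{\mathrm{T}}^{\mu} E_{\mathrm{T}}^{\mathrm{miss}} (1 - \cos\Delta\phi)}$, which is assumed here rather than taken from the source; the function name is likewise illustrative.

```python
import math

def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Transverse mass (GeV) of a massless lepton and the missing
    transverse momentum, using the conventional definition."""
    return math.sqrt(
        2.0 * lepton_pt * met * (1.0 - math.cos(lepton_phi - met_phi))
    )

# A 40 GeV muon back-to-back with 40 GeV of missing momentum gives 80 GeV,
# which satisfies the m_T >= 40 GeV requirement.
mt = transverse_mass(40.0, 0.0, 40.0, math.pi)
```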
The two additional data control samples selected to be pure in QCD multijet events are defined as follows:
• The inclusive QCD multijet control sample: Events are required to have $H_{\mathrm{T}}$ > 340 GeV, where $H_{\mathrm{T}}$ is the scalar sum of jet $p_{\mathrm{T}}$, and are collected with a set of $H_{\mathrm{T}}$ triggers.
• The high $E_{\mathrm{T}}^{\mathrm{miss}}$ QCD multijet control sample: Events are selected with the same trigger used for the baseline selection. All events must satisfy $E_{\mathrm{T}}^{\mathrm{miss}}$ > 175 GeV and have at least two jets with $p_{\mathrm{T}}$ > 70 GeV in order to be fully efficient with respect to the online selection. The QCD multijet purity is increased by vetoing any events with isolated electrons, muons, or tau leptons and by inverting the baseline selection requirement on the angular separation between the three leading jets and $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$.

Top quark pair reconstruction for the T2tt simplified model
The T2tt and T2bW signal modes involve the same final-state particles but differ in that only T2tt involves the decays of on-shell top quarks. The only SM background with potentially large E miss T and a visible component that is identical to that of T2tt is ttZ, with the tt pair decaying hadronically and the Z boson decaying invisibly to neutrinos. Efficient identification of a pair of hadronically decaying top quarks in events with large E miss T provides an important means of suppressing most other backgrounds. As mentioned in the previous section, we developed the CORRAL dedicated top quark reconstruction algorithm for this purpose. Kinematic properties of the top quark candidates reconstructed with CORRAL are exploited to further improve the discrimination of signal from background.
Top quark taggers are typically characterized by high efficiencies for the reconstruction of all-hadronic decays of top quarks that have been Lorentz boosted to sufficiently high momentum for their final state partons and associated showers to form a single collimated jet. Such taggers are not ideal for the regions of parameter space targeted by this search because the top quarks from top squark decays can experience a wide range of boosts in these regions and it is not uncommon for one of the top quarks to have a boost that is too low to produce such a coalescence of final-state objects. An additional problem arises with traditional jet algorithms that do not always distinguish two separate clusters of particles whose separation is smaller than their fixed distance parameter or cone radius. In addition, for low-$p_{\mathrm{T}}$ jets and those originating from hadronisation of b quarks, it is not unusual for algorithms with fixed distance metrics to miss some of the particles that should be included in the jet. These issues are addressed by making use of a variable jet-size clustering algorithm that is capable of successfully resolving six jets in the decays of top quark pairs with an efficiency ranging from 25% in the case of signal with compressed mass splitting ($m_{\tilde{t}}$ = 400 GeV ≈ $m_{t} + m_{\tilde{\chi}^0_1}$ + 75 GeV) to 40% in the case of large mass splitting ($m_{\tilde{t}}$ = 750 GeV ≈ $m_{t} + m_{\tilde{\chi}^0_1}$ + 550 GeV).
The algorithm starts by clustering jets with the Cambridge-Aachen algorithm [96,97] with a distance parameter of 1.0 in the η-φ plane to produce what will be referred to as proto-jets. Studies based on MC simulation show that this parameter value is large enough to capture partons with p T as low as 20 GeV. Each proto-jet is then considered for division into a pair of subjets. The N-subjettiness metric [98], τ N , is used to determine the relative compatibility of particles in a proto-jet with a set of "N" jet axes. It is defined as the p T -weighted sum of the distances of proto-jet constituents to the nearest jet axis, resulting in lower values when the particles are clustered near jet axes and higher values when they are more widely dispersed. As discussed in Ref. [98], the exclusive two-jet k T algorithm [99,100] can be used to find an initial pair of subjet axes in the proto-jet that approximately minimizes the τ 2 metric. The exclusive two-jet algorithm differs from the inclusive k T algorithm in that it does not have a distance parameter. It simply clusters a specified set of particles into exactly two jets. In our case, the axes are varied in the vicinity of the initial set until a local minimum in the value of τ 2 is found. This defines the final set of axes and each particle in the proto-jet is then associated with the closest of the two axes, resulting in two candidate subjets.
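The N-subjettiness metric as defined in the text, a pT-weighted sum of distances to the nearest axis, can be sketched as below. The function names and the (pT, η, φ) tuple layout are assumptions of this example, and no overall normalisation is applied, so only the relative size of the values for different sets of axes is meaningful here.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane with the azimuthal difference wrapped."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)

def tau_n(constituents, axes):
    """pT-weighted sum of the distance from each constituent to its nearest axis.

    constituents: iterable of (pt, eta, phi) tuples
    axes        : list of N (eta, phi) candidate subjet axes
    """
    return sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )

# Two well separated particles: one central axis fits poorly, two axes fit exactly.
particles = [(10.0, 0.0, 0.0), (10.0, 1.0, 0.0)]
tau1 = tau_n(particles, [(0.5, 0.0)])
tau2 = tau_n(particles, [(0.0, 0.0), (1.0, 0.0)])
```

In this toy case the two-axis value is much smaller than the one-axis value, which is the pattern that favours dividing a proto-jet into two subjets.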
An MVA 'picky' metric is then used to determine if it is more appropriate to associate the particles with two subjets than with the original proto-jet. The input variables include the $\tau_1$ and $\tau_2$ subjettiness metrics, the mass of the proto-jet, the (η,φ) separation of the two subjets, and a profile of the proto-jet's energy deposition. An MVA discriminator working point is defined as the threshold value at which the efficiency to correctly split proto-jets into distinct constituent subjets of top quark decays is 95%, while incorrectly splitting fewer than 10% of jets that are already distinct constituents. If the discriminator value does not meet or exceed the threshold, the proto-jet is treated as a single jet and added to the final jet list; otherwise the two subjets enter the proto-jet list to be considered for possible further division. The algorithm runs recursively until there are no remaining proto-jets, yielding a collection of variable-size jet clusters known as 'picky' jets.
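The control flow of the recursive division can be summarised in a short sketch. It is not the picky jet implementation: the `split` and `score` callables stand in for the τ2 axis minimisation and the MVA discriminator, respectively, and both are hypothetical placeholders supplied by the caller.

```python
def picky_cluster(proto_jets, split, score, threshold):
    """Repeatedly divide proto-jets until no division passes the threshold.

    proto_jets: initial list of proto-jets (any object type)
    split     : callable, jet -> (subjet_a, subjet_b), or None if the jet
                cannot be divided (hypothetical placeholder)
    score     : callable, (jet, subjet_a, subjet_b) -> discriminator value
                (hypothetical placeholder for the MVA)
    threshold : working point; a division is accepted if score >= threshold
    """
    pending = list(proto_jets)
    final_jets = []
    while pending:
        jet = pending.pop()
        subjets = split(jet)
        if subjets is not None and score(jet, *subjets) >= threshold:
            # Accepted division: both subjets are reconsidered later.
            pending.extend(subjets)
        else:
            # Rejected division: the proto-jet becomes a final jet.
            final_jets.append(jet)
    return final_jets
```

An explicit work list is used instead of recursive function calls, which gives the same result while avoiding deep call stacks for proto-jets that divide many times.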
The efficiency to correctly cluster W bosons (top quarks) into two (three) picky jets satisfying the basic acceptance requirements of $p_{\mathrm{T}}$ > 20 GeV and |η| < 2.4 is shown in Fig. 2 as a function of generated particle (top quark or W boson) $p_{\mathrm{T}}$ in all-hadronic T2tt events with $m_{\tilde{t}}$ = 600 GeV and $m_{\tilde{\chi}^0_1}$ = 50 GeV. In each event the six quarks arising from the hadronic decays of the two top quarks are matched to reconstructed picky jets by means of ghost association [101]. This technique associates particles produced in the fragmentation and hadronization of the quark prior to detector response simulation. The 'generator-level' particles are clustered together with the full reconstructed particles used to form the picky jets as described above, but the momentum of each of the generator-level particles is scaled by a very small number so that the picky jet collection is not altered by their inclusion. A quark is then determined to be matched to the picky jet that contains the largest fraction of the quark's energy if it is greater than 15% of the quark's total energy. In the case that two or more quarks are associated with the same picky jet, the picky jet is matched to the quark with the largest clustered energy in that jet.
Figure 2: Efficiencies in all-hadronic T2tt events with $m_{\tilde{t}}$ = 600 GeV and $m_{\tilde{\chi}^0_1}$ = 50 GeV. Left: The efficiency to correctly cluster final state particles from each W boson and top quark decay into two and three picky jets, respectively, as a function of particle (top quark or W boson) $p_{\mathrm{T}}$. Right: The efficiency at each stage of the CORRAL algorithm to reconstruct a hadronically decaying top quark pair as a function of the average $p_{\mathrm{T}}$ of the two top quarks. The stages are the efficiency to correctly cluster the final state particles from top quark decays into six picky jets, labelled "Picky jet clustering"; the efficiency to both carry out picky jet clustering and reconstruct the top quark pair with these six picky jets, labelled "Top pair reconstruction"; and finally the efficiency to carry out picky jet clustering, top pair reconstruction, and then correctly select the reconstructed top quark pair for use in the analysis, labelled "Correct pair selection".
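The matching rule applied after ghost association can be sketched as follows. The data layout (a dictionary of clustered energies per quark and per jet) and the function name are assumptions of this example. The text does not say what happens to a quark that loses a shared jet, so this sketch leaves it unmatched.

```python
def match_quarks_to_jets(clustered, quark_energy, min_fraction=0.15):
    """Return {jet: quark} following the matching rule described in the text.

    clustered[q][j] : energy of quark q's ghost particles clustered into jet j
    quark_energy[q] : total energy of quark q
    A quark is a candidate for the jet holding the largest share of its
    energy, provided that share exceeds min_fraction of its total energy.
    If several quarks select the same jet, the one with the largest
    clustered energy in that jet is kept.
    """
    best = {}  # jet -> (clustered energy, quark)
    for quark, per_jet in clustered.items():
        if not per_jet:
            continue
        jet, energy = max(per_jet.items(), key=lambda item: item[1])
        if energy <= min_fraction * quark_energy[quark]:
            continue
        if jet not in best or energy > best[jet][0]:
            best[jet] = (energy, quark)
    return {jet: quark for jet, (energy, quark) in best.items()}
```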
The energy of each resulting picky jet is corrected for pileup by subtracting the measured energy associated with pileup on a jet-by-jet basis by means of a trimming procedure similar to the one discussed in Ref. [102]. The procedure involves reclustering of the particles associated with the jet into subjets of radius 0.1 in η-φ and then ordering them by decreasing $p_{\mathrm{T}}$. The lowest $p_{\mathrm{T}}$ subjets are removed one-by-one until the summed momentum and mass of the remaining subjets have minimal differences with the same quantities after subtracting an estimate of the pileup contribution [103]. The reconstructed W boson and top quark masses as a function of the number of reconstructed primary vertices are shown in Fig. 3 in all-hadronic T2tt events with $m_{\tilde{t}}$ = 600 GeV and $m_{\tilde{\chi}^0_1}$ = 50 GeV. The reconstructed mass values are seen to have no pileup dependence after the trimming procedure is applied. No additional jet energy scale corrections, other than those mentioned below, have been derived to remove the remaining 5-10% bias in the reconstructed mass values. The CORRAL algorithm is optimized for the uncorrected top quark and W boson mass values.
Figure 3: Masses of the top quarks and W bosons reconstructed with picky jets that are matched at particle level in simulation, as discussed in the text, in all-hadronic T2tt events with $m_{\tilde{t}}$ = 600 GeV and $m_{\tilde{\chi}^0_1}$ = 50 GeV. The labels "before PU corr." and "after PU corr." refer to results obtained before and after application of the trimming procedure used to correct for pileup effects.

The p T spectra of picky jets in MC data are corrected to match those observed in data in the inclusive tt and Z+jets control samples by rescaling of individual picky jet p T values. The rescaling factors are derived separately for each of the two processes and for the flavour of parton that initiated the jet. They are found to be within 2-3% of unity. Picky jets can also be b-tagged with the CSV algorithm by considering the tracks that have been used in their formation.
A candidate for a hadronically decaying top quark pair is a composite object constructed from six picky jets that passes every step of the CORRAL algorithm, which will now be described. To reduce the number of jet combinations that must be considered, the algorithm involves several stages, with progressively tighter selection criteria at each stage. First, BDTs are trained to discriminate the highest p T jet coming from a top quark decay from all other jets in the event using input variables related to jet kinematics, b tagging discrimination and jet composition information. Jets are labelled as seed jets if they have an associated discriminator value that exceeds a high efficiency cutoff value. Three-jet top quark candidates are then constructed from all combinations of three jets in the event that include at least one seed jet. High quality top quark candidates are those that pass one of two MVA working points chosen to identify 97-99% of those cases in which the jets are correctly matched to top quark decays and to reject 60-80% of the candidates that are not correctly matched. The most important input variables are the W boson and top quark invariant masses and the picky jet b tagging discriminator value. Other variables such as the angular separations of the jets are included for additional discrimination. A final list of top quark pairs contains all combinations of two high quality top quark candidates with distinct sets of three jets. The final reconstructed top quark pair used in the analysis is the one with the highest discriminator value from a BDT that is trained with variables similar to those used in the candidate selection but also including information on the correlations between the top quark candidates.
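The combinatorial part of the procedure described above can be sketched with jets represented by their indices. This is a simplified illustration, not the CORRAL implementation: the function names are hypothetical, the MVA selection of high quality candidates is omitted, and `pair_score` is a placeholder for the final BDT discriminator.

```python
from itertools import combinations

def top_candidates(n_jets, seeds):
    """All three-jet combinations (index tuples) that contain a seed jet."""
    seeds = set(seeds)
    return [c for c in combinations(range(n_jets), 3) if seeds.intersection(c)]

def top_pairs(candidates):
    """All pairs of candidates built from distinct sets of three jets."""
    return [(a, b) for a, b in combinations(candidates, 2)
            if not set(a) & set(b)]

def best_pair(pairs, pair_score):
    """Pair with the highest discriminator value, or None if there is none.

    pair_score is a hypothetical callable standing in for the pair BDT.
    """
    return max(pairs, key=pair_score, default=None)

# Six jets with jets 0 and 3 labelled as seeds.
candidates = top_candidates(6, {0, 3})
pairs = top_pairs(candidates)
```

With six jets and two seed jets, 16 of the 20 possible three-jet combinations contain a seed, and 6 pairs of non-overlapping candidates remain, which illustrates how the seed requirement reduces the number of combinations to be evaluated.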
The CORRAL algorithm reconstructs at least one top quark pair in nearly every event that has six or more picky jets. However, CORRAL is not strictly a top quark tagger that must distinguish events with top quarks from events without top quarks. It is designed to reconstruct top quark pairs in data samples that are predominantly made up of top quark events, as is the case for the T2tt part of this analysis. In Fig. 2, the efficiency for correctly resolving the top quark pair is shown at each stage of the algorithm. These efficiencies are calculated for T2tt events with m t = 600 GeV and m χ 0 1 = 50 GeV, but they do not depend strongly on the signal mass parameters. The two hadronic top quark decays are each resolved into three distinct picky jets in 15-70% of events, depending on the boost of the quarks. In nearly all of these events the correct six jets pass the CORRAL jet seeding and top quark candidate selection requirements and are used to form the correct top quark pair among a number of top quark pairs found in the event. The correct pair is then chosen to be used in the analysis in 30-80% of events.
Properties of the reconstructed top quark pairs used in the analysis are compared to true top quark pair quantities in Fig. 4 for signal events with at least one reconstructed top quark pair. The events in which the true top quark pair is chosen are categorized separately in the figure.
In the fully resolved and selected case the reconstructed separation in φ between the two top quarks agrees with the true separation within 0.1 in over 80% of events. Even in the case of the reconstructed top quark pair not being fully resolved or selected, there is reasonable agreement because the top quark pair is constructed with five of the six correct jets in the majority of these events.
The signal discrimination that is achieved by exploiting differences in the kinematics of the reconstructed top quark pairs in simulated signal samples and those in simulated SM background samples is illustrated in Fig. 5. The left plot shows the minimum separation in the η-φ plane between any two jets in the reconstructed top quark candidate with the highest discriminator value, labelled t 1 . The separation tends to be smaller in T2tt signal events because the top quarks with the highest discriminator value are more likely to be boosted. Similarly, the right plot shows the distribution for the separation in φ between the jet direction and p miss T for the jet with the smallest such separation from the sub-leading reconstructed top quark, labelled t 2 . The distribution for the semileptonic tt background, involving tt events in which one W boson decays leptonically, is shifted to low values of ∆φ because the t 2 top quark candidates in tt events typically use the b jet from the leptonically decaying top quark, which is correlated in angle with the p miss T from the leptonically decaying W boson.

Rejection of isolated leptons
The main backgrounds for this analysis arise from events with lost or misidentified leptons. Sensitivity to signal is therefore improved by identifying and rejecting events with charged leptons originating from prompt W boson decays as efficiently as possible. On the other hand, signal events often contain charged leptons that arise from decays of heavy flavour hadrons or charged hadrons that have been misidentified as charged leptons. It is advantageous to retain these events in order to achieve high signal efficiency. In events with E miss T > 175 GeV and five or more jets, the standard CMS lepton identification algorithms operating at their tightest working points [71,76] can identify semileptonic tt events with efficiencies of 54% and 60% for final states involving electrons and muons, respectively. This analysis makes use of MVA techniques to achieve higher efficiencies for the identification and rejection of semileptonic tt events, while retaining high signal efficiency.
Figure 4: Reconstructed top quark pair properties compared with true top quark pair quantities in signal events. The label "Correct pair selection" corresponds to events in which the two top quark decays are each resolved into three distinct picky jets and these jets are used to reconstruct the two top quarks. The label "Incorrect clustering or pair selection" is used for all other events. The top two figures show comparisons of the angular separation between the two top quarks in rapidity and in azimuthal angle φ. The bottom figure compares the relative p T of the two top quarks. In all cases, t 1 refers to the top quark with the highest p T .
The MVAs used here combine a number of moderately discriminating quantities into a single metric that can be used for electron and muon identification. Electrons and muons must have p T > 5 GeV and |η| < 2.4, and are required to satisfy the conditions for the loose working point of the standard CMS identification algorithms, for which the efficiencies for electrons and muons in the tracker acceptance are above 90%. The discriminating variables used in the training of the muon identification BDT are the p T of the muon, its track impact parameter information, relative isolation in terms of charged and neutral particles, and the properties of the jet nearest to the muon. Isolation in terms of charged and neutral hadrons is defined by means of separate sums of the p T of charged and neutral PF particles, respectively, in a region near the lepton, divided by the lepton p T . The properties of the nearest jet that are used include the separation from the lepton in the η-φ plane, the momentum of the lepton relative to the jet axis, and the CSV b tagging discriminator value for the jet. For electron identification, the variables include all of those used for the muon, plus several electron-specific variables that are used in the standard CMS electron identification MVA [76].
Figure 5: The left plot shows the minimum separation in the η-φ plane between any two jets in the leading reconstructed top quark, defined as the one with the highest discriminator value, while the right plot shows the separation in φ between p miss T and the jet in the sub-leading reconstructed top quark for which this separation is the smallest. Both variables are inputs to the T2tt search region BDT discriminators, which are described in Section 6.
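The relative isolation described above can be sketched as follows. This is a minimal illustration, not the analysis code: the cone radius of 0.3, the dictionary layout of the particles, and the function names are all assumptions made for the example, since the text only specifies "a region near the lepton".

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def relative_isolation(lepton, particles, cone=0.3):
    """Return (charged, neutral) relative isolation of a lepton.

    `lepton` and each entry of `particles` are dicts with keys
    "pt", "eta", "phi"; particles also carry a boolean "charged".
    The cone radius is a hypothetical choice for illustration.
    """
    charged = 0.0
    neutral = 0.0
    for p in particles:
        if delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone:
            if p["charged"]:
                charged += p["pt"]
            else:
                neutral += p["pt"]
    # Separate sums, each divided by the lepton pT.
    return charged / lepton["pt"], neutral / lepton["pt"]

lepton = {"pt": 20.0, "eta": 0.0, "phi": 0.0}
particles = [
    {"pt": 2.0, "eta": 0.1, "phi": 0.1, "charged": True},
    {"pt": 1.0, "eta": 0.0, "phi": -0.2, "charged": False},
    {"pt": 50.0, "eta": 1.0, "phi": 2.0, "charged": True},  # outside the cone
]
iso_charged, iso_neutral = relative_isolation(lepton, particles)
```

Keeping the charged and neutral sums separate, rather than adding them, mirrors the text, where the two enter the BDT as distinct inputs.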
The BDTs are trained using simulated event samples with electrons or muons. In particular, single-lepton tt events are the source of prompt leptons, while electrons or muons in all-hadronic tt events are used for non-prompt leptons. The non-prompt lepton selection efficiency in signal events is similar to that in tt events. The left plot in Fig. 6 shows the selection efficiency, by lepton type, for non-prompt leptons as a function of that for prompt leptons in the BDT training samples. The curves are obtained by varying the cutoff on the corresponding BDT discriminator value above which events are accepted. In this analysis, the discriminator values that are chosen have efficiencies of 98% for events with electrons and muons from W boson decays that pass the preselection requirements, while incorrectly selecting no more than 5% of all-hadronic tt events. The latter gives some indication of the expected loss of all-hadronic top squark signal events. Upon including reconstruction and acceptance inefficiencies, these requirements eliminate 80% of single-electron and single-muon tt events with E miss T > 175 GeV and five or more jets.
A similar approach is used to identify hadronically decaying tau leptons originating from semileptonic tt decays. The τ identification algorithm focuses on decays involving a single charged hadron in conjunction with neutral hadrons because the majority of hadronic τ decays are to final states of this type, which are often referred to as 'one-prong' decays. No attempt is made to specifically reconstruct the sub-dominant 'three-prong' decays. A τ candidate is thus defined by a track and a nearby electromagnetic cluster produced by the photons from π 0 → γγ decay, if present, in order to include more of the visible energy from the τ lepton decay. Since every charged particle with p T > 5 GeV and |η| < 2.4 could be considered to be a τ candidate, we reduce the pool of candidates by using m T calculated from p miss T and the momentum of each candidate. As seen in the right plot in Fig. 6, the m T distribution for genuine τ candidates has an endpoint at the mass of the W boson for semileptonic tt events, reflecting the fact that the neutrinos associated with W boson and τ lepton decays are the largest source of E miss T in these events. Fully hadronic signal events with large E miss T do not have this constraint, and so each τ candidate is required to have m T < 68 GeV.
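The m T requirement on τ candidates can be illustrated with a short sketch. It assumes the usual massless-particle form of the transverse mass, m T = sqrt(2 p T E miss T (1 − cos ∆φ)), which the text does not spell out; the function names are hypothetical, and only the 68 GeV cutoff is taken from the text.

```python
import math

def transverse_mass(pt, phi, met, met_phi):
    """Transverse mass of a candidate and the missing transverse momentum.

    Assumes both objects are massless, which is the common convention
    and an assumption of this sketch.
    """
    mt_squared = 2.0 * pt * met * (1.0 - math.cos(phi - met_phi))
    return math.sqrt(max(0.0, mt_squared))

def passes_mt_requirement(pt, phi, met, met_phi, mt_max=68.0):
    """True if the candidate satisfies m_T < mt_max (68 GeV in the text)."""
    return transverse_mass(pt, phi, met, met_phi) < mt_max

# A candidate back to back with the missing momentum has a large m_T,
# while a collinear one has m_T close to zero.
mt_back_to_back = transverse_mass(40.0, 0.0, 40.0, math.pi)
mt_collinear = transverse_mass(40.0, 0.3, 200.0, 0.3)
```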
The variables used in a BDT discriminator for the identification of the τ candidate are the track p T , |η|, and distance of closest approach to the primary vertex, as well as the isolation quantities and general properties of the jet in which the τ candidate is contained. The isolation variables include the separate sums of the transverse momenta of charged and neutral PF particles, in cones of radii 0.1, 0.2, 0.3, and 0.4 centered on the candidate, and the distance between the candidate and the nearest track. The jet variables used are the separation in the η-φ plane between the track and the jet axis, and the b tagging discriminator value for the jet. This BDT is trained with hadronically decaying τ candidates originating from semileptonic tt decays in MC simulation for prompt candidates, while all τ candidates in all-hadronic T2tt events with a top squark mass of 620 GeV and a neutralino mass of 40 GeV are used for the non-prompt candidates. The samples produced with these T2tt mass parameters are not included in the final array of T2tt samples used in the later stages of this analysis. The T2bW baseline selection is applied to all events in order to have training samples whose kinematic selection criteria are consistent with those used to select the data samples used for the search. The m T cutoff value and the BDT discriminator value are chosen to keep losses below 10% in the all-hadronic signal samples targeted by this analysis. The efficiency for correctly selecting the background of semileptonic tt events with hadronically decaying tau leptons is 65%. This efficiency is defined relative to events for which the τ lepton decay products include at least one reconstructed charged particle with p T > 5 GeV.
The efficiencies for selecting leptons in simulation are corrected to match those measured in data after applying the T2bW baseline selection criteria. The multiplicative correction factors applied to the simulated electron and muon selection efficiencies for this purpose are 0.95 ± 0.03 and 1.01 ± 0.03, respectively. The corrections to the simulated τ selection efficiency are 1.30 ± 0.10 for τ candidates with p T < 10 GeV and 0.98 ± 0.04 for all other candidates.

Search regions
As discussed above, this analysis makes use of MVA techniques based on BDTs to achieve sensitivity to direct production of top squark pairs in the all-hadronic final states of the T2tt and T2bW simplified models in the presence of three main classes of much more copiously produced SM backgrounds. The signal space of the T2tt simplified model is parameterised by the masses of the top squark and the neutralino. The T2bW simplified model also includes an intermediate chargino, and is therefore parameterised by three masses. For each model, a large set of simulated event samples is prepared, corresponding to a grid of mass points in two dimensions for T2tt, and in three dimensions for T2bW. A large set of moderately to strongly discriminating variables, discussed in more detail below, serves as input to each BDT to yield a single discriminator value ranging between −1.0 and +1.0 for each event considered. Events with values closer to 1 (−1) are more like signal (background).
Since there are potentially significant differences in the kinematic characteristics of signal samples at different points in the mass grids described above, it is not known a priori what is the minimum number of distinct BDTs that are needed to achieve the near optimal coverage of the signal spaces. To this end, a minimum number of BDTs that provides sufficient coverage of each signal space is selected from a larger superset that includes BDTs that are each uniquely trained on grid points separated by ≈100 GeV in top squark mass and ≈50 GeV in neutralino mass for both signal types. For T2bW, there are also 3 different values of chargino mass that are considered, corresponding to x = 0.25, 0.5, and 0.75. Sensitivity to signal is probed by varying discriminator thresholds from 0.5 to 1.0 in steps of 0.01. Ultimately it is determined that four BDTs for T2tt and five for T2bW are adequate to cover the largest possible parameter space with near optimal signal sensitivity. Each BDT tends to cover a specific portion of signal space, referred to as a search region. The optimisation of the overall search does not depend strongly on the specific signal points that are used to train individual BDTs. Moreover, adding more regions is not found to increase the sensitivity of the analysis. Table 2 lists the search regions for both signal types, the mass parameter points used to train each BDT, and the optimal BDT discriminator cutoffs that are used to define the final samples. Figure 7 displays the most sensitive search regions in T2tt and selected T2bW mass planes. The colour plotted in any given partition of the plane corresponds to the search region BDT with the strongest expected limit on the signal production cross section.
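The threshold scan described above, from 0.5 to 1.0 in steps of 0.01, can be sketched as below. The analysis ranks thresholds by the expected limit on the signal cross section; this sketch swaps in the much simpler s/sqrt(b) figure of merit purely for illustration, and the event lists and function name are hypothetical.

```python
import math

def scan_thresholds(signal, background, start=0.5, stop=1.0, step=0.01):
    """Scan discriminator cutoffs and return (best_cut, best_fom).

    `signal` and `background` are lists of (discriminator, weight) pairs.
    The figure of merit s/sqrt(b) is a stand-in for the expected-limit
    calculation used in the analysis (assumption of this sketch).
    """
    n_steps = round((stop - start) / step)
    best = None
    for i in range(n_steps + 1):
        cut = round(start + i * step, 2)  # avoid accumulated float drift
        s = sum(w for d, w in signal if d > cut)
        b = sum(w for d, w in background if d > cut)
        if b <= 0.0:
            continue  # figure of merit undefined without background
        fom = s / math.sqrt(b)
        if best is None or fom > best[1]:
            best = (cut, fom)
    return best

signal = [(0.90, 1.0), (0.95, 1.0), (0.60, 1.0)]
background = [(0.55, 10.0), (0.70, 10.0), (0.92, 1.0)]
best_cut, best_fom = scan_thresholds(signal, background)
```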
For the T2tt search a total of 24 variables are used. They can be divided into variables that do or do not rely upon top quark pair reconstruction by the CORRAL algorithm. The latter include E miss T , jet multiplicity, and m T calculated with p miss T and the p T of the b-tagged picky jet that is closest to p miss T in φ. Of these, the most important variables for tt suppression are E miss T and m T . The m T distribution is peaked near the top quark mass for semileptonic tt events because nearly all of the E miss T originates from the leptonic W decay, and the corresponding lepton is usually soft. On the other hand, there is no peak in the distribution for fully hadronic signal events. One variable suppresses SM background by exploiting the higher probability for jets in SM events, particularly Z+jets and W+jets, to originate from gluons. It is the product of the quark-gluon likelihood values [104] that are computed for each jet in the event. Two additional variables, the η of the peak in jet activity and the ∆η between two peaks in jet activity, provide a measure of the centrality of the event activity. They are obtained by a kernel density estimate (KDE) [105,106] of the one dimensional jet p T density. The KDE uses the jet η as input with a jet p T weighted gaussian kernel function and a bandwidth parameter optimized on an event by event basis such that two peaks in the KDE are found. Another variable counts the number of unique combinations of jets that can form reconstructed top quark pairs. The remaining seventeen variables are all built with information pertaining to the candidate top quark pair obtained from CORRAL. The invariant mass of the top quark pair and the relative p T of the two reconstructed top quarks are used to take into account correlations between the two top quark candidates that generally differ for signal and background. 
The degree of boost or collimation of each top quark candidate is measured with three variables, including the minimum cone size in the η-φ plane that contains all of the reconstructed particles from the top quark decay. Two variables use the CORRAL discriminator value for each of the two top quarks as a measure of the quality of the reconstruction. Two other variables measure the angular correlation with p miss T for the lower-quality member of the top quark pair. The last eight variables are the p T values for the six jets in the top quark pair and two CSV b jet discriminator values that each correspond to the highest b tagging discriminator value obtained for the three jets that make up each of the two top quark candidates. While the properties of the reconstructed top quark pairs differ between signal events with two hadronic top decays and all SM background events with one or no hadronic top decays, the variables measuring the quality of the reconstruction are particularly useful for the suppression of Z+jets and W+jets since no reconstructed top quark candidates originate from hadronic top decays. A similar situation occurs for the variables utilizing b jet discriminator values since these processes typically have fewer jets that originate from b quarks than signal processes. As explained in Section 4, the kinematics of the reconstructed top quarks, such as their angular correlation with p miss T , are used for tt suppression.
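The two jet-activity variables built from the kernel density estimate can be illustrated with a small sketch. It evaluates a jet-p T-weighted Gaussian KDE in η on a grid and looks for local maxima. The per-event bandwidth optimisation described in the text is not reproduced: a fixed bandwidth is passed in by hand, and the grid, bandwidth value, and function names are assumptions of the example.

```python
import math

def jet_pt_kde(jets, bandwidth, grid):
    """pT-weighted Gaussian KDE of jet activity as a function of eta.

    `jets` is a list of (pt, eta) pairs; the result is the density
    evaluated at each point of `grid`, normalised to unit area.
    """
    total_pt = sum(pt for pt, _ in jets)
    norm = total_pt * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(pt * math.exp(-0.5 * ((x - eta) / bandwidth) ** 2) for pt, eta in jets) / norm
        for x in grid
    ]

def find_peaks(grid, density):
    """Local maxima of the sampled density, highest first, as (eta, value)."""
    peaks = [
        (grid[i], density[i])
        for i in range(1, len(grid) - 1)
        if density[i] > density[i - 1] and density[i] >= density[i + 1]
    ]
    return sorted(peaks, key=lambda peak: -peak[1])

jets = [(100.0, -1.0), (60.0, 1.5)]
grid = [-3.0 + 0.05 * i for i in range(121)]
peaks = find_peaks(grid, jet_pt_kde(jets, bandwidth=0.3, grid=grid))
eta_of_main_peak = peaks[0][0]
delta_eta_between_peaks = abs(peaks[0][0] - peaks[1][0])
```

In a fuller treatment the bandwidth would be adjusted for each event until exactly two peaks are found, as the text describes.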
There are 14 variables used to train the BDTs that target the T2bW final state, half of which are the same or very similar to those used for the T2tt final state. Four of these are commonly used to distinguish SM background from SUSY signals. They are E miss T , jet multiplicity, multiplicity of jets passing the CSV b tagger medium working point, and the azimuthal separation of the third-leading jet from p miss T . Variables that are sensitive to correlations between b jets and the rest of the event are the invariant mass formed with the two highest p T b-tagged jets; m T formed with p miss T and the nearest b-tagged jet; and the standard deviation of the separation in pseudorapidity between the b-tagged jet with the highest p T and all other jets in the event. Three additional variables make use of quark-gluon likelihood values for the jets in the event, and a further set of three make use of jet kinematics. Of the latter, the most important is the scalar sum over p T of jets whose transverse momenta are within π/2 of the direction of p miss T (i.e. ∆φ( p jet T , p miss T ) < π/2), divided by the corresponding sum for all jets that do not meet this criterion. This variable is particularly useful for suppression of Z+jets and W+jets since the jets and p miss T in these events are typically opposite in φ. This is not the case for signal events, for which the direction of p miss T and hadronic activity is less correlated. For the calculation of the final variable, jets are first grouped into unique pairs by requiring the smallest separation distances in η-φ space. Of these, the invariant mass of the pair with the highest vector sum p T is found in simulation to have a high probability to correspond to the decay of a W boson and is used to suppress Z+jets events with Z → νν.
Table 2: Search regions for the T2tt and T2bW channels. The table lists the SUSY particle masses used for the training of the BDTs, the cutoff on the BDT output, and the efficiency for the signal to pass the BDT selection relative to the baseline selection. The event counts of the T2bW discriminator training samples are limited and so four nearby mass points were used. They are the four combinations of the two top squark and two neutralino masses listed. The signal efficiency in each row of the table is then that of the best case of the four, which in every case is the point with the largest top squark mass and the smallest neutralino mass of those indicated.
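The jet-kinematics ratio used for the T2bW BDTs, the scalar p T sum of jets within π/2 of p miss T in φ divided by the sum for all other jets, can be sketched as follows. The input layout and function names are hypothetical, and returning infinity when no jets lie in the opposite hemisphere is a choice made here, not something stated in the text.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def met_hemisphere_pt_ratio(jets, met_phi):
    """Scalar pT sum of jets near p_T^miss over that of the remaining jets.

    `jets` is a list of (pt, phi) pairs. A jet counts as near when
    |delta phi(jet, p_T^miss)| < pi/2.
    """
    near = 0.0
    away = 0.0
    for pt, phi in jets:
        if abs(delta_phi(phi, met_phi)) < math.pi / 2.0:
            near += pt
        else:
            away += pt
    # Edge-case handling is an assumption of this sketch.
    return near / away if away > 0.0 else float("inf")

jets = [(100.0, 3.0), (50.0, 0.2), (30.0, -2.5)]
ratio = met_hemisphere_pt_ratio(jets, met_phi=0.0)
```

For Z+jets and W+jets events, where the jets typically recoil against p miss T, this ratio tends to be small, which is the behaviour the text exploits.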

Estimation of SM backgrounds
We divide the important SM backgrounds into three classes. The first class, referred to as EW backgrounds, includes semileptonic and dileptonic decays of tt, W+jets, single top, and Z+jets with Z → νν. The second class of backgrounds originates from high-E miss T QCD multijet processes, and the third arises from associated production of ttZ with Z → νν and both top quarks decaying to hadrons. The latter produces a final state that is extremely similar to that of the signal but is fortunately very rare. The diboson contributions to search regions are studied in simulation and found to be negligible.
The estimation of the EW and QCD multijet backgrounds is based on MC samples in which the events have been reweighted by scale factors with values that are generally within a few percent of unity. As discussed in Section 7.1, the scale factors are extracted from data-MC comparisons in control regions. The reweighting of the events assures that the simulation samples match data samples with regard to distributions of quantities that are relevant to the selection of events in the signal regions. However, it is important to note that the reweighted MC samples are not used directly to estimate backgrounds in the signal region. Rather, the search region yields and uncertainties are estimated by comparing the reweighted MC samples to data in background-specific control regions that differ from the search regions only in that they are obtained with selection criteria that simultaneously increase the purity of a single background and reduce any potential signal contamination. In the case of the EW backgrounds the control regions are selected by requiring one or more isolated leptons, while for the QCD multijet background it is selected by requiring p miss T to be aligned with one of the leading jets.

EW and QCD background estimates with MC reweighting
This analysis uses MC samples as the basis for the estimation of SM backgrounds in signal regions. These simulations have been extensively tested and tuned in CMS since the start of LHC data taking in 2009. As a result, they accurately reproduce effects related to the detailed geometry and material content of the apparatus, as well as those related to physics processes such as initial-state and final-state radiation. Nevertheless, the MC samples are not assumed to be perfect, discrepancies being observed with data in some kinematic regions. Comparisons between data and MC simulation are therefore performed to derive scale factors in order to reduce the observed discrepancies.
The scale factors fall into two conceptually different categories. The first category involves effects associated with detector modelling and object reconstruction that are manifested as discrepancies in jet and E miss T energy scales and resolutions, lepton and b jet reconstruction efficiencies, and trigger efficiencies. The second category corresponds to discrepancies associated with theoretical modelling of the physics processes as represented by differential cross sections in collision events. The scale factors in this category are estimated separately for each SM background process. The main sources of discrepancy here are finite order approximations in matrix element calculations and phenomenological models for parton showering and hadronisation. Scale factors are parameterised as a function of generator-level quantities controlling post-simulation event characteristics relevant to the final selection criteria used in the analysis. The scale factors are derived by comparing distributions of variables after full reconstruction that are particularly sensitive to these generator-level quantities, as seen in comparisons of MC with data. D'Agostini unfolding with up to four iterations [107], implemented with RooUnfold [108], is used to determine the correct normalization of the generator-level quantities such that the distributions agree after full reconstruction. The scale factors are defined as the ratio of the corrected values of generator-level quantities to their original values. The MC events are reweighted by these scale factors, thereby eliminating any observed discrepancies with data. The scale factors are generally found to be close to unity as a result of the high quality of the MC simulation. The inclusive kinematic scale factors lead to no more than 10% shifts in any regions of the distributions of H T and number of jets that are relevant to this analysis.

Detector modelling and object reconstruction effects
The detector modelling and object reconstruction scale factors are grouped into the following categories: lepton identification efficiency, jet flavour, jet p T , and p miss T .
For the lepton identification efficiency, the event yields of simulated data passing the lepton vetoes in the search regions are corrected by scale factors as described in Section 5. The associated uncertainties in the search region predictions are denoted as "MVA lepton sel. scale factors" in Tables 3 and 4. Similarly, in the control regions defined by the presence of a single lepton as described in Section 3, scale factors are applied to the simulated electron and muon reconstruction, identification, and trigger efficiencies. These scale factors are measured by applying a "tag-and-probe" technique to the pairs of leptons coming from Z boson decays [71,76,109].
Identification of jet type via b tagging is important for the CORRAL top reconstruction algorithm and the signal discriminator used in the T2tt search. Both use the CSV b tagging algorithm output values directly rather than setting a particular cutoff value as is done for standard CMS loose, medium, and tight working points [80]. It is therefore important that the CSV discriminator output distributions in simulated event samples match those seen in corresponding data samples. To this end, the CSV discriminator output of each picky jet is corrected so that the CSV output distributions for simulated tt and Z+jets event samples match those observed in the inclusive tt and Z+jets control samples, respectively. Similarly, the quark-gluon likelihood distribution for jets is corrected to match data. The jet energy scale is corrected as described in Section 3, and the simulated picky jet p T spectrum is corrected as described in Section 4.
The rejection of SM backgrounds in this analysis is very much dependent on the measurement of p miss T and its resolution, which is not modelled perfectly in simulation. Corrections are therefore applied to MC simulated samples of EW and QCD multijet processes in order to obtain good agreement with data in search region variables that depend on the correlation of event activity with p miss T . There are three separate corrections [110] applied for EW processes that are derived from a control sample of Z+jets events with Z → ℓ+ℓ− where, by conservation of energy and momentum, the reconstructed Z boson provides an accurate measure of the energy associated with all other activity in the event as measured in the transverse plane. Sources of genuine E miss T such as neutrinos in these events are rare and have a negligible effect on the derived corrections. The corrections are based upon comparisons of data to simulation in the inclusive Z+jets control sample in which p miss T is decomposed into components parallel and perpendicular to the direction of the Z boson p T . The components and their resolutions are then investigated as a function of a variety of quantities to look for systematic trends and biases that can then be corrected. In this way, an E miss T scale correction of order 1% is obtained as a function of both the boson p T and the distribution of hadronic energy in the event relative to the energy of the boson. The second and third corrections involve an increase in the jet resolution by 9% and a smearing of the p miss T in both the directions parallel to the boson and perpendicular to it by approximately 4.5 GeV. The measured resolutions of the components of p miss T along and perpendicular to the boson direction as obtained in simulation match those found in the data control regions after these corrections are applied.
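The decomposition of p miss T into components parallel and perpendicular to the Z boson p T amounts to a projection onto a unit vector and its normal. A minimal sketch, with hypothetical function and argument names and an arbitrary sign convention for the perpendicular axis:

```python
import math

def decompose_met(met_x, met_y, z_px, z_py):
    """Project p_T^miss onto axes parallel and perpendicular to the Z boson pT.

    Returns (parallel, perpendicular). The perpendicular axis is the
    boson direction rotated by +90 degrees, a convention chosen here.
    """
    z_pt = math.hypot(z_px, z_py)
    ux, uy = z_px / z_pt, z_py / z_pt      # unit vector along the boson
    parallel = met_x * ux + met_y * uy
    perpendicular = -met_x * uy + met_y * ux
    return parallel, perpendicular

# Boson along +x: the components are simply the x and y components.
par_x, perp_x = decompose_met(-10.0, 5.0, 50.0, 0.0)
# Boson along +y.
par_y, perp_y = decompose_met(3.0, 4.0, 0.0, 40.0)
```

The scale and resolution studies described in the text would then be carried out on these two components separately.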
For the EW backgrounds the p miss T corrections are parameterised in such a way that the corrected MC samples are consistent with data in p miss T -related quantities, such as the reconstructed W boson m T . In contrast, for the discrimination between QCD multijet events and SUSY signal events, the angular correlations between p miss T and the p T of leading jets in the event are the most important variables. Corrections are therefore obtained expressly for this background process with the inclusive QCD multijet control sample. The corrected simulation samples provide a good match to the angular correlations between p miss T and the leading jets in data.

Corrections to the theoretical modelling of EW background processes
The kinematic distributions of simulated EW processes are validated and corrected with three control samples having charged leptons in the final state: the high purity tt, the inclusive Z+jets, and the inclusive W+jets control samples. Based on the physically reasonable assumption that the kinematics of the rest of the event should be largely independent of the boson decay(s) in these processes, the control samples are used in conjunction with corresponding MC samples to extract scale factors described below that are parameterised by generator-level quantities. They are then applied to MC samples in the search regions to estimate background contributions.
The scale factors are extracted as functions of the p T of the boson in the case of W+jets and Z+jets or of the momenta of the top quarks in the case of tt. They also depend on the multiplicity and flavour of radiated jets as well as H T . Because the control samples have finite sizes, the scale factors are organised into subsets that are derived and used sequentially. That is, prior to each derivation step, the scale factors extracted in the previous derivation steps are applied. For example, scale factors for correcting the tt jet multiplicity and top quark spectra are obtained and applied prior to calculating those used to correct the production of Z bosons in conjunction with heavy-flavour jets, since as much as 60% of the events in the Z control sample are tt events.
There is no suitable control region to accurately measure corrections to the theoretical modelling of the single top process. However, a precise modelling of this process is not important as its contribution in the search regions is much smaller than that of tt. A 50% systematic uncertainty on the single top yield, estimated with simulation, is therefore used. It appears under the label "Single top kinematics" in Tables 3 and 4.

Estimation of EW background
The corrections to the MC event samples based on scale factors, as discussed above, result in an agreement between MC and data distributions that is typically within 10% for all control samples, including samples that were not used to extract the scale factors. This level of agreement is also found for distributions of many kinematic variables for which no corrections were explicitly applied. There are a few regions in which kinematic distributions disagree at the level of 20%, but these disagreements have been found to have a negligible impact on the search region predictions. A bootstrapping procedure is used [111] to take into account statistical uncertainties in the derived scale factors for distributions of kinematic quantities and their correlations. The corresponding statistical uncertainty in the search region predictions is labelled "Kinematics reweighting" in Tables 3 and 4. While the corrected MC and data distributions are found to agree in many control regions, the corrected MC is not used to directly estimate the background in the search regions. Instead, corrections specific to each search region are derived in addition to the more general scale factors previously described.
After correcting MC simulation samples for detector, reconstruction, and kinematic discrepancies, a closure correction and its uncertainty are measured, where closure is defined as the largest residual data-MC difference seen in a number of kinematic distributions. To this end, data-MC comparisons are performed in a variety of leptonic control regions for which the kinematic distributions under study are as similar as possible to those in the search regions as seen for MC samples that pass the signal selection criteria. The leptonic control samples used for the closure tests are obtained by applying the full set of baseline requirements, with the exception of the lepton vetoes. The control samples used to correct the tt, W+jets and single top processes, referred to as the "1ℓ closure samples," are subsets of the inclusive tt control sample, in which exactly one charged lepton has been identified. The charged lepton is removed from the list of physics objects in the event, leading to an additional component of p miss T that simulates the case in which the W boson decay has a large invisible component, which is common for events passing the search region selection. As a result, many events with low intrinsic E miss T pass the search region selection criteria, thereby enhancing the data statistics and significantly reducing the closure uncertainty. For similar reasons, this procedure also reduces potential contamination by semileptonic signal events to negligible levels. Likewise, "2ℓ closure samples" are subsets of the inclusive Z+jets control sample and are used to correct the Z+jets process. The charged leptons are removed from the event, altering the p miss T to simulate the case in which the Z boson decays to neutrinos.
Comparisons of the BDT discriminator outputs for data and corrected MC simulation for the 1ℓ closure samples, after removal of the single identified charged lepton in each event, are shown in Figs. 8 and 9, with the first ten bins in each plot covering the full BDT discriminator range. The closure is quantified by comparing the predicted event counts in MC simulation to those found in data in a 'validation region', defined as the region containing the events with a single lepton that pass all of the final signal selection criteria after the lepton is removed, and in two control regions that extend the final search region to lower BDT discriminator values. The latter are defined by doubling and tripling the difference between unity and the discriminator cutoff value used for the final search region. These two additional regions are needed because the search region is statistically limited in some cases. The results for the signal region and the two extended regions are shown in the last three bins in Figs. 8 and 9, for the four T2tt and five T2bW BDT discriminators, respectively. The differences seen in the event counts for data and MC simulation in the extended regions are in general statistically compatible with the difference seen in the search region. Therefore, the data over simulation ratio in the first extended region is used as a correction for any potential residual bias in the event counts obtained with MC samples in which the events pass all of the signal region selection criteria, now including the lepton veto requirements. The uncertainty in the correction is taken to be the statistical uncertainty in the data over simulation ratio in the last bin, which we have referred to as the validation region. This choice assures that the uncertainty covers any potential unknown differences between the search region and the first extended search region.
For the four separate T2tt search regions, the largest correction is 1.08 ± 0.13 in the medium-mass region, with the closure uncertainties ranging from ±0.08 in the low-mass region to ±0.24 in the very-high-mass region. For the five separate T2bW search regions, the largest correction is 0.85 ± 0.20, and the uncertainties in the corrections range from ±0.09 to ±0.25. This uncertainty in the search region predictions is denoted as "Closure (1ℓ)" in Tables 3 and 4.

The histogram labelled "MC without corr." in the bottom pane of each figure plots the ratio whose numerator is the total MC event count before corrections and whose denominator is the event count for the corrected MC shown in the upper pane. The other histograms indicate the contributions of the various background processes. The "LF" and "HF" labels denote the subsets of the W+jets process in which the boson is produced in association with light and heavy flavour (b) quark jets, respectively.
Figure 9: Comparisons of BDT discriminator (D) outputs for data and corrected MC simulation for the 1ℓ closure samples, with leptons removed, for the five T2bW validation regions. The three bins at the far right in each plot are used to validate the MC performance in the signal region and its two extensions. The points with error bars represent the event yields in data. The histogram labelled "MC without corr." in the bottom pane of each figure plots the ratio whose numerator is the total MC event count before corrections and whose denominator is the event count for the corrected MC shown in the upper pane. The other histograms indicate the contributions of the various background processes. The "LF" and "HF" labels denote the subsets of the W+jets process in which the boson is produced in association with light and heavy flavour (b) quark jets, respectively.

The simulated data are similarly compared to data in the 2ℓ closure samples in Figs. 10 and 11. No statistically significant lack of closure is observed for any of the T2tt and T2bW search regions. However, the small sample size makes it impossible to probe comparisons close to the search regions. An uncertainty is therefore obtained by measuring the largest data-MC discrepancy for each individual MVA input variable in the kinematic phase space of the search regions. This is defined for each input variable and search region as the ratio of event yields in data relative to MC simulation after reweighting both distributions. The weights that are used come from MC simulated distributions of the input variables after applying the MVA discriminator cutoff that is used for the search region. The distributions are normalised to unit area and the normalised bin contents are the final weights. The weights are applied to binned events in both samples before taking the data/MC ratio in the control region where we measure the uncertainty.
The uncertainty in the Z+jets background prediction is then taken to be the difference with respect to unity of this ratio for the variable with the largest degree of nonclosure, defined as $|(\mathrm{Data}/\mathrm{MC}) - 1|/\sigma$, where $\sigma$ is the statistical uncertainty in the ratio. This closure test is repeated with successively tighter MVA discriminator cutoffs to check if the extracted closure uncertainty has any potential systematic trend related to discriminator cutoff. No significant trend is observed. To be conservative, the nonclosure is measured for an MVA discriminator value greater than or equal to 0.0 (−0.5) for T2tt (T2bW) search regions. These cutoff values are the highest ones for which the magnitude of the statistical uncertainty is smaller than the measured level of nonclosure. The uncertainties, denoted as "Closure (2ℓ)" in Tables 3 and 4, are found to range between 16% and 39%.
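The reweighting and the choice of the variable with the largest nonclosure can be illustrated with a short Python sketch. It is a simplified reading of the procedure, not the analysis implementation: the function names and the histogram representation (plain lists of bin contents) are hypothetical, and the statistical uncertainty in the ratio is assumed to come from the Poisson fluctuations of the data alone.

```python
import math


def shape_weights(mc_search_hist):
    """Normalise the search-region MC distribution of one input variable to
    unit area; the normalised bin contents are the per-bin weights."""
    total = sum(mc_search_hist)
    return [content / total for content in mc_search_hist]


def reweighted_ratio(data_hist, mc_hist, weights):
    """Weighted data/MC ratio in the control region and its statistical
    uncertainty (assumption: Poisson uncertainty in the data only)."""
    num = sum(w * d for w, d in zip(weights, data_hist))
    den = sum(w * m for w, m in zip(weights, mc_hist))
    sigma = math.sqrt(sum(w * w * d for w, d in zip(weights, data_hist))) / den
    return num / den, sigma


def worst_nonclosure(ratios):
    """Pick the variable with the largest |ratio - 1| / sigma and return its
    name together with |ratio - 1|, which is taken as the uncertainty.

    `ratios` maps a variable name to a (ratio, sigma) pair."""
    name = max(ratios, key=lambda k: abs(ratios[k][0] - 1.0) / ratios[k][1])
    return name, abs(ratios[name][0] - 1.0)
```

Ranking by significance rather than by the raw deviation prevents a variable with a large but statistically meaningless fluctuation from setting the uncertainty.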
A separate control sample, which is similar to the baseline selection but with relaxed jet and b-tag requirements, is studied as an independent check of the Z+jets and W+jets processes. Discrepancies of roughly 5% in the event counts relative to those predicted are observed for both the Z+jets and W+jets processes. The full magnitude of this discrepancy is taken as an additional uncertainty in the event counts for these background processes and it is included as "Closure (relaxed baseline)" in Tables 3 and 4.
While the efficiencies for selecting electrons and muons in simulation are relatively well matched to what is seen in data, the efficiency for selecting τ leptons is observed to be significantly higher in simulation than in data for high values of some of the T2bW search region discriminators. The discrepancy is traced to a mismodelling of $m_T$, which, as discussed in Section 5, is used for a preselection requirement of the τ lepton veto. The mismodelling of $m_T$ is due to the angular component of $\vec{p}_T^{\,\mathrm{miss}}$ and is uncorrelated with its magnitude. To address this, a correction and associated uncertainty are determined by means of a control region made up of modified events that is safe from signal contamination. The control region is defined by applying the full search region selection criteria to events in which search region discriminator values are calculated with an $E_T^{\mathrm{miss}}$ value that is randomly selected from the distribution of $E_T^{\mathrm{miss}}$ values obtained for the search region in MC simulation. A τ lepton veto efficiency is then obtained separately in data and simulation by taking the ratio of the number of events that pass the full set of signal region selection criteria but fail the τ lepton veto to the total number of events that pass the selection criteria prior to applying the τ lepton veto. The ratio of the τ lepton efficiency in data to the efficiency in simulation is then used to correct the efficiency for the simulated background samples with τ leptons from W boson decays in the signal region. This correction reduces the data-MC discrepancy to a level that is not statistically significant and decreases the simulated τ lepton efficiency by a maximum of 29% in all cases considered, with an uncertainty of 13%. This uncertainty is included with the other lepton selection scale factor uncertainties under the label of "MVA lepton sel. scale factors" in Tables 3 and 4.
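As a minimal illustration of the efficiency definition above, the following Python sketch computes the τ lepton veto efficiency in data and in simulation and forms the data-to-simulation scale factor. The function names and the event counts in the usage note are hypothetical; uncertainties are not propagated here.

```python
def veto_efficiency(n_fail_veto, n_before_veto):
    """Fraction of events passing the full selection (before the tau veto)
    that are then rejected by the tau lepton veto."""
    return n_fail_veto / n_before_veto


def tau_scale_factor(data_counts, mc_counts):
    """Ratio of the efficiency in data to that in simulation.

    Each argument is a (n_fail_veto, n_before_veto) pair measured in the
    control region."""
    return veto_efficiency(*data_counts) / veto_efficiency(*mc_counts)
```

With illustrative counts of 71 out of 1000 events in data and 100 out of 1000 in simulation, `tau_scale_factor((71, 1000), (100, 1000))` gives a scale factor of about 0.71, that is, a 29% reduction of the simulated efficiency.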
The predictions in all search regions together with a breakdown of the various contributions to their uncertainties are provided in Tables 3 and 4. After applying all corrections described in this section to the MC simulated data, no statistically significant discrepancies with data are observed in any bin of search region discriminator value for any search region.

The points with error bars represent the event yields in data. The histogram labelled "MC without corr." in the bottom pane of each figure plots the ratio whose numerator is the total MC event count before corrections and whose denominator is the event count for the corrected MC shown in the upper pane. The other histograms provide the contributions of the various background processes. The "LF" and "HF" labels denote the subsets of the Z+jets process in which the boson is produced in association with light and heavy flavour (b) quark jets, respectively.

Estimation of the QCD multijet background
Kinematic distributions obtained with the inclusive QCD multijet control sample are compared to those found in QCD multijet MC simulation. The same method of deriving a series of scale factors parameterised by generator-level quantities that was used in the estimation of the EW processes is applied here, but distributions of different quantities are used. In particular, the jet $p_T$ spectrum and angular correlations among jets in the event are the quantities that provide the most power in the identification of QCD background. We also consider the distributions of quantities related to heavy-flavour production and the relative momenta of jets in the event.
Figure 11: Comparisons of BDT discriminator (D) outputs for data and corrected MC simulation for the 2ℓ closure samples, with leptons removed. All five T2bW validation regions are plotted. The points with error bars represent the event yields in data. The histogram labelled "MC without corr." in the bottom pane of each figure plots the ratio whose numerator is the total MC event count before corrections and whose denominator is the event count for the corrected MC shown in the upper pane. The other histograms provide the contributions of the various background processes. The "LF" and "HF" labels denote the subsets of the Z+jets process in which the boson is produced in association with light and heavy flavour (b) quark jets, respectively.

After all corrections are applied, good closure is obtained: discrepancies between data and simulation are less than 10% in distributions used to determine reweighting scale factors. The one quantity that does, however, require special consideration is $E_T^{\mathrm{miss}}$. Most of the QCD multijet background is eliminated by high-$E_T^{\mathrm{miss}}$ requirements. The events that are not eliminated largely originate from the extreme tails of very broad distributions associated with two mechanisms. Namely, in order to produce large $E_T^{\mathrm{miss}}$, a QCD multijet event must either involve production of a heavy-flavour hadron that decays leptonically, or involve one or more jets that are poorly resolved, leading to severe underestimates of their momenta.
The simulation of these sources of $E_T^{\mathrm{miss}}$, particularly for the rare cases in which the events survive all selection requirements for the search regions, is not well understood, and it is difficult to study these mechanisms directly in data. This means that the QCD multijet background cannot be estimated precisely and so a reliable upper bound is found instead. This is sufficient because the QCD multijet contribution is small compared to other backgrounds. To this end, simulation samples having sources of large $E_T^{\mathrm{miss}}$ are compared with $E_T^{\mathrm{miss}}$-triggered data in control regions to obtain scale factors and associated uncertainties that are used to reweight simulated events. The resulting weights are then applied to simulation samples in the signal region. Additional systematic uncertainties are applied to cover the uncertainties in the extrapolations of these corrections into the search regions.
The high-$E_T^{\mathrm{miss}}$ QCD multijet control sample, which is defined with the requirement that $\vec{p}_T^{\,\mathrm{miss}}$ be aligned with one of the jets to a degree that is consistent with expectations for either of the two sources of $E_T^{\mathrm{miss}}$ discussed above, is used to derive scale factors. The jet with which $\vec{p}_T^{\,\mathrm{miss}}$ is aligned is referred to as the probe jet in such events. The negative vector sum of momenta of all jets in the event, other than the probe jet, provides an alternative estimate of the probe jet momentum, since $p_T$ is conserved, within uncertainties, in the absence of other severe mismeasurements. The recoil response, defined as the ratio of the momentum of the probe jet to that of the rest of the activity in the event, $p_{T,\mathrm{probe}}/p_{T,\mathrm{recoil}}$, is a very good estimator for the true response of the probe jet, $p_{T,\mathrm{probe}}/p_{T,\mathrm{true}}$, in the tails of the distribution, where mismeasurement of the probe jet momentum dominates over the mismeasurement of the recoil momentum. It is therefore used to derive separate scale factors for the jet resolution, parameterised by jet $p_T$, for each of the two sources of $E_T^{\mathrm{miss}}$. These scale factors range between 0.6 and 1.8.
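The recoil response can be made concrete with a small Python sketch. It assumes, for illustration only, that each jet is given by its transverse momentum components (px, py) in GeV and that jets are the only objects in the event; the function names are hypothetical.

```python
import math


def missing_pt(jets):
    """Missing transverse momentum as the negative vector sum of all jets
    (assumption: no other objects contribute)."""
    return -sum(px for px, _ in jets), -sum(py for _, py in jets)


def recoil_response(jets, probe_index):
    """Ratio of the measured probe-jet pT to the magnitude of the recoil,
    where the recoil is the negative vector sum of all other jets."""
    probe_px, probe_py = jets[probe_index]
    recoil_x = -sum(px for i, (px, _) in enumerate(jets) if i != probe_index)
    recoil_y = -sum(py for i, (_, py) in enumerate(jets) if i != probe_index)
    return math.hypot(probe_px, probe_py) / math.hypot(recoil_x, recoil_y)
```

For example, with jets `[(40.0, 0.0), (-60.0, 30.0), (-40.0, -30.0)]` the other two jets recoil against a 100 GeV object along +x, so a probe jet measured at 40 GeV has a recoil response of 0.4, and the resulting missing transverse momentum points along the probe jet.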
The central values of the QCD background predictions are taken to be the MC simulation yields in the signal regions after applying all of the corrections defined above. The various statistical and systematic uncertainties are highly asymmetric and in many cases non-Gaussian. Therefore, in each search region an MC integration procedure is used to properly combine the uncertainties. As expected from the central limit theorem, the combination of uncertainties can be approximated by a Gaussian distribution, the parameters of which are listed in Tables 3 and 4 under the label of "Integrated uncertainty band." Two shape uncertainties are assigned to the QCD multijet estimation in each search region. The first is a systematic uncertainty associated with the search region MVA discriminator distribution, denoted as "MVA discriminator shape" in Tables 3 and 4. It is obtained from a comparison of the distribution in MC simulation to that in data for the high-$E_T^{\mathrm{miss}}$ QCD multijet control sample after also requiring that events pass the baseline selection criteria, with the exception of the requirements on the angular separation between the leading jets and $\vec{p}_T^{\,\mathrm{miss}}$. Dropping these criteria leads to a significant increase in the contribution of QCD multijet events to the final sample relative to all other backgrounds or signal. A second systematic uncertainty, labelled "∆φ shape upper and lower bounds" in Tables 3 and 4, is obtained from the same samples by comparing the MC distribution of the angle between $\vec{p}_T^{\,\mathrm{miss}}$ and the leading jets to that for data for a variety of discriminator cutoffs. The distributions are found to differ increasingly with rising b-tagged jet multiplicity. The bias is eliminated by smearing the $\phi$ values of the $\vec{p}_T$ of b jets with a Gaussian having a standard deviation of about 0.02.
The upper bound on the QCD background is then obtained by increasing the width of the Gaussian until there is a larger number of MC events predicted to pass the selection criteria than is observed in data. The upper bounds found in this way are different for different search regions as a result of variations in statistics and contributions of other SM processes. The values of the Gaussian width that are found to cover all cases are 0.07 in the case of T2tt and 0.05 in the case of T2bW.
Finally, the QCD multijet simulated data are generated in discrete bins of $H_T$ in the case of MADGRAPH and in bins of quark and gluon $p_T$ in the case of PYTHIA. The effective integrated luminosity for some of the samples in particular bins can be much smaller than the 18.9 fb$^{-1}$ of integrated luminosity collected in proton-proton collision data. A systematic uncertainty is therefore applied to each QCD background prediction to cover a possible underprediction that could be the result of a lack of events in these highly weighted bins. It is denoted as "Low luminosity bins upper bound" in Tables 3 and 4.

Estimation of the ttZ background
Standard model ttZ production is a rare process (σ ∼ 0.2 pb) that becomes an important background in CORRAL-based search regions for the T2tt signal model where general tt backgrounds have been greatly suppressed. There are no sufficiently populated and uncontaminated data control regions in which to perform careful studies of this rare SM process. The simulated data are studied instead, making use of variations in the parameters that control the generation and parton showering to establish systematic uncertainties in the estimated event counts in the signal regions. In addition, the relative difference in yields between the default MC@NLO sample, with parton showering by HERWIG, and a separate MADGRAPH sample, with parton showering by PYTHIA, is used to estimate a systematic uncertainty associated with MC generators. This uncertainty, listed in Tables 3 and 4 with the label "MC simulation," ranges between 3% and 26% depending on the search region.
The uncertainty in the ttZ production cross section is estimated from a data control sample with three reconstructed charged leptons drawn from a larger event sample that has been collected with a set of dilepton triggers used for multilepton SUSY searches [112]. The two charged leptons picked up by these triggers most often originate from the decay of a Z boson and are thus oppositely charged, same-flavour leptons. The third lepton can arise via the semileptonic decay of a W coming from the decay of a top quark in ttZ events. The selection of events for this control sample thus includes the requirement that two of the reconstructed leptons must be consistent with the expectations for leptons from Z boson decay in flavour, charge, and the invariant mass of the pair. To reduce the contamination from other SM backgrounds and increase the relative contribution of the ttZ process, events are also required to have at least three jets, at least six picky jets, and one or more b jets tagged with the medium CSV working point [80].
With a contribution of approximately 10%, diboson production is a leading SM process in this region after ttZ. Thus, a diboson-enriched control region is established that makes use of the same selection criteria described above for the ttZ control region, except that the b tagging requirement is inverted to form a corresponding b-tag veto. This sample is used to normalise the overall diboson process in MC simulation to that observed in data.
The ttZ and the diboson processes in the enriched control regions described above have estimated event yields that are statistically consistent with the event yields predicted by simulation samples. In view of this, the data-MC scale factors are taken to have a central value of unity, and no correction is applied. The statistical uncertainty in the ttZ scale factor is 31%. This is adopted as a systematic uncertainty in the estimated yield of this background source and is denoted as "MC normalisation" in Tables 3 and 4. A final systematic uncertainty takes into account differences observed between the kinematic distributions in MC simulation and data. To this end, we make use of the closure uncertainties in the W+jets (including tt and single top) and Z+jets background predictions that have been derived in the lepton control regions as necessitated by the lack of an appropriate ttZ data control sample. The maximum estimated uncertainty found for either of the two processes is taken to be the uncertainty in the modelling of the kinematics for the ttZ process. This uncertainty ranges between 16% and 39%, depending on the signal sample, and is included under the label of "Kinematic closure" along with the ttZ prediction and all other associated uncertainties in Tables 3 and 4.

Results and interpretation
The predicted distributions of discriminator values for the various T2tt and T2bW searches described earlier are shown in Figs. 12 and 13. Event yields in data are plotted with their statistical uncertainties and compared to the SM background predictions. The latter are represented by the coloured histograms in the upper pane. Error bars on the ratios of the observed to predicted event yields in the bottom pane include only statistical uncertainties. The filled band in the lower pane of each plot represents the relative systematic uncertainty in the background predictions. A vertical dashed red line near the right edge in the lower pane of each plot marks the MVA discriminator value that is used to define the lower boundary of the search region. Note that these figures are for illustrative purposes only, and so some minor uncertainties in event yields in the more inclusive regions did not receive the detailed treatment applied to the uncertainties in the final search region yields.
The line in the lower pane of each plot in Figs. 12 and 13 labelled "MC without corr." represents the sum of the MC contributions, relative to the prediction, prior to weighting by the corrective scale factors discussed in the preceding sections. There are no statistically significant differences observed upon comparing the data with the uncorrected (or corrected) MC samples. Figures 14 and 15 provide a completely equivalent set of plots to those just described, but in this case, no lepton vetoes have been included in the selection of events. The event yields therefore are much higher in these cases. These data are used to provide a useful cross-check of the tt, W+jets, and single top kinematic closure test. They also allow for a check of the agreement in event kinematics between MC simulation and data, without any potential biases that might arise in association with the application of the lepton vetoes to the simulation. Only those data with discriminator values less than 0.4 are used for these cross-checks because potential signal contamination could be non-negligible for larger discriminator values. Data and simulation agree within ±20% for all search regions.
The predicted and observed yields in the T2tt and T2bW search regions are summarised in Tables 5 and 6. No statistically significant excess in data is observed. We therefore use these results to set upper bounds on the production cross sections for the T2tt and T2bW families of signal models.

The signal yields and their corresponding efficiencies are estimated by applying the event selection criteria to simulated data samples. Systematic uncertainties in the signal selection efficiencies are assessed as a function of the $\tilde{t}$ and $\tilde{\chi}_1^0$ masses, and as a function of the mass splitting parameter $x$ in the case of the T2bW signal. The uncertainty in the jet energy scale (JES) has the largest impact on signal yield, followed by the b tagging efficiency uncertainty. The uncertainty associated with the parton distribution functions is evaluated by following the recommendation of the PDF4LHC group [113][114][115][116][117]. Uncertainties in the jet energy resolution, initial-state radiation, and integrated luminosity [73] are also included. For the T2tt channel, we assign three additional uncertainties. The first accounts for the difference observed in the performance of the CORRAL algorithm between the standard CMS full and fast detector simulations. This difference decreases with increasing top quark $p_T$ and so depends on the difference between $m_{\tilde{t}}$ and $m_{\tilde{\chi}_1^0}$, reaching 20% for cases where $m_{\tilde{\chi}_1^0}$ is close to $m_{\tilde{t}}$. The other two uncertainties each have a magnitude of 5% and cover the differences observed in parton shower (PS) algorithms (PYTHIA versus HERWIG) and top quark reconstruction efficiencies in data versus simulation. Table 7 lists the magnitude of each systematic uncertainty in signal points for which this search has sensitivity. For T2tt, the total systematic uncertainty is less than 15% for $m_{\tilde{t}} - m_{\tilde{\chi}_1^0} > 300$ GeV.
In the absence of any significant observed excesses of events over predicted backgrounds in the various search regions, the modified frequentist CL$_\mathrm{S}$ method [118][119][120] with a one-sided profile likelihood ratio test statistic is used to define 95% confidence level (CL) upper limits on the production cross section for both the T2tt and T2bW simplified models as a function of the masses of the SUSY particles involved. Statistical uncertainties related to the observed numbers of events are modelled as Poisson distributions. Systematic uncertainties in the background predictions and signal selection efficiencies are assumed to be multiplicative and are modelled with log-normal distributions.
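To give a feel for how a CL$_\mathrm{S}$ upper limit arises, the following Python sketch treats a single-bin counting experiment in a deliberately simplified form. It is not the procedure used in this analysis: the observed event count itself serves as the test statistic instead of a profile likelihood ratio, and no systematic uncertainties (and hence no log-normal nuisance parameters) are included. All function names are hypothetical.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls(n_obs, s, b):
    """CLs = CL(s+b) / CL(b), with the event count as the test statistic."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, alpha=0.05, s_max=1000.0, tol=1e-6):
    """Signal yield s at which CLs falls to alpha, found by bisection.

    CLs decreases monotonically with s, so signal yields above the returned
    value are excluded at the (1 - alpha) confidence level."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Dividing by CL(b) is what protects against excluding signals to which the search has no sensitivity: for zero observed events the limit on the signal yield is $-\ln(0.05) \approx 3.0$ events regardless of the expected background. A limit on the signal yield converts into a cross section limit after dividing by the signal efficiency and the integrated luminosity.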
For each choice of SUSY particle masses, the search region with the highest expected sensitivity (Fig. 7) is chosen to calculate an upper limit for the production cross section. The expected and observed upper limits on the production cross section for both the T2tt and T2bW topologies in the $m_{\tilde{t}}$-$m_{\tilde{\chi}_1^0}$ plane are displayed in Fig. 16. For the T2tt topology this search is sensitive to models with $m_{\tilde{t}} < 775$ GeV, or 755 GeV when conservatively subtracting one standard deviation of the theoretical uncertainty, and provides the most stringent limit to date for proton-proton collisions at $\sqrt{s} = 8$ TeV on this simplified model for $m_{\tilde{t}} > 600$ GeV. Sensitivity extends to models with $m_{\tilde{\chi}_1^0} < 290$ GeV and this search is especially sensitive to the case of large $m_{\tilde{t}}$ and low $m_{\tilde{\chi}_1^0}$, for which events typically have both large $E_T^{\mathrm{miss}}$ and a high CORRAL top pair reconstruction efficiency. In contrast, the analysis has no sensitivity to models with $m_{\tilde{t}} - m_{\tilde{\chi}_1^0} < 200$ GeV despite the large cross section of some signal scenarios.
This search is considerably less sensitive to the T2bW topology because that model does not feature on-shell top quark decays. The sensitivity in this case applies to scenarios with $m_{\tilde{t}} < 650$ GeV, with the strongest results for large $x$ models for which $m_{\tilde{\chi}^\pm}$ is closer to $m_{\tilde{t}}$ than $m_{\tilde{\chi}_1^0}$, resulting in a harder $E_T^{\mathrm{miss}}$ spectrum. For scenarios with $x = 0.25$ the search has less sensitivity to models with $m_{\tilde{\chi}_1^0} \approx 0$ GeV than to those with moderate $m_{\tilde{\chi}_1^0}$. In the former case the $\tilde{\chi}^\pm$ and W boson are close in mass and the signal has a low efficiency to pass the baseline selection's $E_T^{\mathrm{miss}}$ criterion. The search also has less sensitivity to models with $m_{\tilde{\chi}_1^0} + m_W \approx m_{\tilde{\chi}^\pm}$ because in this scenario the signal has a low efficiency to pass the baseline selection's jet-multiplicity criterion.

Summary
We report a search for the direct pair production of top squarks in an all-hadronic final state containing jets and large missing transverse momentum. Two decay channels for the top squarks are considered. In the first channel, each top squark decays to a top quark and a neutralino, whereas in the second channel they each decay to a bottom quark and a chargino, with the chargino subsequently decaying to a W boson and a neutralino. A dedicated top quark pair reconstruction algorithm provides efficient identification of hadronically decaying top quarks. The search is carried out in several search regions based on the output of multivariate discriminators, where the standard model background yield is estimated with corrected simulation samples and validated in data control regions. The observed yields are statistically compatible with the standard model estimates and are used to restrict the allowed parameter space for these two signal topologies. The search is particularly sensitive to the production of top squarks that decay via an on-shell top quark. For models predicting such decays, a 95% CL lower limit of 755 GeV is found for the top squark mass when the neutralino is lighter than 200 GeV, extending the current limits based on Run 1 searches at the LHC on these models by 50-100 GeV. In models with top squarks that decay via a chargino, scenarios with a top squark mass up to 620 GeV are excluded.

Figure 16: Observed and expected 95% CL limits on the top squark pair production cross section and exclusion areas in the $m_{\tilde{t}}$-$m_{\tilde{\chi}_1^0}$ plane for the T2tt (top left) and T2bW signal topologies (with $x$ = 0.25, 0.50, 0.75). In the rare cases in which a statistical fluctuation leads to zero signal events for a particular set of masses, the limit is taken to be the average of the limits obtained for the neighbouring bins. The ±1σ theory lines indicate the variations in the excluded region due to the uncertainty in the theoretical prediction of the signal cross section.
[45] ATLAS Collaboration, "Search for direct top squark pair production in final states with one isolated lepton, jets, and missing transverse momentum in $\sqrt{s} = 7$ TeV pp collisions using 4.7 fb$^{-1}$ of ATLAS data", Phys. Rev. Lett
[54] CMS Collaboration, "Search for top squark and higgsino production using diphoton Higgs boson decays", Phys. Rev. Lett
[65] D0 Collaboration, "Search for pair production of the scalar top quark in the electron+muon final state", Phys. Lett
[69] CMS Collaboration, "Description and performance of track and primary-vertex reconstruction with the CMS tracker", JINST 9 (2014) P10009, doi:10.1088/1748-0221/9/10/P10009, arXiv:1405.6569.