Search for a scalar partner of the top quark in the jets plus missing transverse momentum final state at $\sqrt{s}$=13 TeV with the ATLAS detector

A search for pair production of a scalar partner of the top quark in events with four or more jets plus missing transverse momentum is presented. An analysis of 36.1 fb$^{-1}$ of $\sqrt{s}$=13 TeV proton-proton collisions collected using the ATLAS detector at the LHC yields no significant excess over the expected Standard Model background. To interpret the results, a simplified supersymmetric model is used where the top squark is assumed to decay via $\tilde{t}_1 \rightarrow t^{(*)} \tilde\chi^0_1$ and $\tilde{t}_1\rightarrow b\tilde\chi^\pm_1 \rightarrow b W^{(*)} \tilde\chi^0_1$, where $\tilde\chi^0_1$ ($\tilde\chi^\pm_1$) denotes the lightest neutralino (chargino). Exclusion limits are placed in terms of the top-squark and neutralino masses. Assuming a branching ratio of 100% to $t \tilde\chi^0_1$, top-squark masses in the range 450-950 GeV are excluded for $\tilde\chi^0_1$ masses below 160 GeV. In the case where $m_{\tilde{t}_1}\sim m_t+m_{\tilde\chi^0_1}$, top-squark masses in the range 235-590 GeV are excluded.

The ATLAS Collaboration

Introduction
Supersymmetry (SUSY) [1][2][3][4][5][6] is an extension of the Standard Model (SM) that can resolve, for example, the gauge hierarchy problem [7][8][9][10] by introducing supersymmetric partners of the known bosons and fermions. The SUSY partner to the top quark, the top squark ($\tilde{t}$), plays an important role in cancelling potentially large top-quark loop corrections to the Higgs boson mass. The superpartners of the left- and right-handed top quarks, $\tilde{t}_L$ and $\tilde{t}_R$, mix to form the two mass eigenstates $\tilde{t}_1$ and $\tilde{t}_2$, where $\tilde{t}_1$ is the lighter one. Throughout this paper it is assumed that the analysis is only sensitive to $\tilde{t}_1$.
In R-parity-conserving SUSY models [11], the supersymmetric partners are produced in pairs. Top squarks are produced by strong interactions through quark-antiquark ($q\bar{q}$) annihilation or gluon-gluon fusion, and the cross section of direct top-squark pair production is largely decoupled from the specific choice of SUSY model parameters [12][13][14][15]. The decay of the top squark depends on the mixing of the superpartners of left- and right-handed top quarks, the masses of the top superpartner, and the mixing parameters of the fermionic partners of the electroweak and Higgs bosons. The mass eigenstates of the partners of electroweak gauge and Higgs bosons (binos, winos, higgsinos) are collectively known as charginos, $\tilde\chi^\pm_i$, $i = 1, 2$, and neutralinos, $\tilde\chi^0_i$, $i = 1, ..., 4$, where $\tilde\chi^0_1$ is assumed to be the lightest supersymmetric particle (LSP), which is stable and a dark-matter candidate [16,17]. For the models considered, either $\tilde\chi^0_2$ or $\tilde\chi^\pm_1$ is assumed to be the next lightest supersymmetric particle (NLSP). Three different decay scenarios are considered in this search: (a) both top squarks decay via $\tilde{t}_1 \rightarrow t^{(*)} \tilde\chi^0_1$, (b) at least one of the top squarks decays via $\tilde{t}_1 \rightarrow b\tilde\chi^\pm_1$ [...] [18][19][20] where only one or two decay steps are allowed. In the case with two allowed decays, referred to later in this paper as a natural SUSY-inspired mixed grid, the mass splitting between the $\tilde\chi^\pm_1$ and the $\tilde\chi^0_1$, $\Delta m(\tilde\chi^\pm_1, \tilde\chi^0_1)$, is assumed to be 1 GeV. A grid of signal samples is generated across the plane of the top-squark and $\tilde\chi^0_1$ masses with a grid spacing of 50 GeV across most of the plane, assuming maximal mixing between the partners of the left- and right-handed top quarks. In both the one- and two-step decay scenarios the LSP is considered to be a pure bino state. Additionally, results are interpreted in two slices of phenomenological MSSM (pMSSM) [21,22] models, referred to as wino-NLSP and well-tempered neutralino pMSSM models in the remainder of this paper.
The pMSSM models are based on the more general MSSM [23,24] but with the additional requirements of no new sources of CP violation and flavour-changing neutral currents, as well as first- and second-generation sfermion mass and trilinear coupling degeneracy. Finally, results are also interpreted in a simplified model which is inspired by the pMSSM and is referred to as non-asymptotic higgsino. Details of the models that are used in the various interpretations are given in Section 9.
In addition to direct pair production, top squarks can be produced indirectly through gluino decays, as shown in Figure 1(d). This search considers models where the mass difference between the top squark and the neutralino is small, i.e. $\Delta m(\tilde{t}_1, \tilde\chi^0_1) = 5$ GeV. In this scenario, the jets originating from the $\tilde{t}_1$ decays have momenta below the experimental acceptance, resulting in a signature nearly identical to that of $\tilde{t}_1 \rightarrow t \tilde\chi^0_1$ signal models (Figure 1(a)).
This paper presents the search for top-squark pair production using a time-integrated luminosity of 36.1 fb$^{-1}$ of proton-proton (pp) collision data provided by the Large Hadron Collider (LHC) at a centre-of-mass energy of $\sqrt{s}$ = 13 TeV. The data were collected by the ATLAS detector in 2015 and 2016.
Figure 1: The decay topologies of the signal models considered with experimental signatures of four or more jets plus missing transverse momentum. Decay products that have transverse momenta below detector thresholds are designated by the term "soft".
All-hadronic final states with at least four jets and large missing transverse momentum$^{1}$ ($\mathbf{p}_T^{\mathrm{miss}}$, whose magnitude is referred to as $E_T^{\mathrm{miss}}$) are considered, and the results are interpreted according to a variety of signal models as described above. Signal regions are defined to maximize the experimental sensitivity over a large region of kinematic phase space. Sensitivity to high top-squark masses $\sim 1000$ GeV (as in Figure 1(a)) and top squarks produced through gluino decays (as in Figure 1(d)) is achieved by exploiting techniques designed to reconstruct top quarks that are Lorentz-boosted in the lab frame. The dominant SM background process for this kinematic region is $Z \rightarrow \nu\bar\nu$ produced in association with jets initiated by heavy-flavour quarks (heavy-flavour jets). The sensitivity to the decay into $b\tilde\chi^\pm_1$ is enhanced by vetoing events containing hadronically decaying top-quark candidates to reduce the $t\bar{t}$ background, leaving $Z \rightarrow \nu\bar\nu$ as the largest SM background. Sensitivity to the region where $m_{\tilde{t}_1} - m_{\tilde\chi^0_1} \sim m_t$, which typically has relatively low-$p_T$ final-state jets and low $E_T^{\mathrm{miss}}$, is achieved by exploiting events in which high-$p_T$ jets from initial-state radiation (ISR) boost the di-top-squark system in the transverse plane. For this regime, $t\bar{t}$ production gives the dominant background contribution. Similar searches based on $\sqrt{s}$ = 8 TeV and $\sqrt{s}$ = 13 TeV data collected at the LHC have been performed by both the ATLAS [25-28] and CMS [29-33] collaborations.
$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln\tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$. The transverse momentum is the momentum component in the transverse plane.
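The coordinate definitions above can be illustrated with a minimal sketch. The function names are hypothetical, and wrapping the azimuthal difference into $[-\pi, \pi)$ is an assumption about how $\Delta\phi$ is meant to be evaluated; it is the usual convention but is not stated in the text.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2) for a polar angle theta in radians, 0 < theta < pi."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Signed azimuthal difference wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

For example, a particle emitted perpendicular to the beam ($\theta = \pi/2$) has $\eta = 0$, and two objects at $\phi = 3.1$ and $\phi = -3.1$ are close in azimuth once the wrap-around is taken into account.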

ATLAS detector
The ATLAS experiment [34] at the LHC is a multi-purpose particle detector with a cylindrical forward-backward and $\phi$-symmetric geometry and an approximate $4\pi$ coverage in solid angle. It consists of an inner tracking detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer. The inner tracking detector covers the pseudorapidity range $|\eta| < 2.5$. It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. The newly installed innermost layer of pixel sensors [35] was operational for the first time during the 2015 data-taking. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity. A hadron (steel/scintillator-tile) calorimeter covers the central pseudorapidity range ($|\eta| < 1.7$). The end-cap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to $|\eta| = 4.9$. The muon spectrometer surrounds the calorimeters and features three large air-core toroidal superconducting magnets with eight coils each, providing coverage up to $|\eta| = 2.7$. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. The muon spectrometer includes a system of precision tracking chambers and fast detectors for triggering.

Trigger and data collection
The data were collected from August to November 2015 and April to October 2016 at a pp centre-of-mass energy of 13 TeV with 25 ns bunch spacing. A two-level trigger system [36] is used to select events. The first-level trigger is implemented in hardware and uses a subset of the detector information to reduce the event rate to at most 100 kHz. This is followed by a software-based trigger that reduces the accepted event rate to 1 kHz for offline storage.
In all search regions, a missing transverse momentum trigger, which is fully efficient for offline calibrated $E_T^{\mathrm{miss}} > 250$ GeV in signal events, was used to collect data events.
Data samples enriched in the major sources of background were collected with electron or muon triggers. The electron trigger selects events based on the presence of clusters of energy in the electromagnetic calorimeter, with a shower shape consistent with that expected for an electron, and a matching track in the tracking system. The muon trigger selects events containing one or more muon candidates based on tracks identified in the muon spectrometer and inner detector. The electron and muon triggers used are more than 99% efficient for isolated electrons and muons with $p_T$ above 28 GeV.
Triggers based on the presence of high-$p_T$ jets were used to collect data samples for the estimation of the multijet and all-hadronic $t\bar{t}$ background. The jet $p_T$ thresholds ranged from 20 to 400 GeV. In order to stay within the bandwidth limits of the trigger system, only a fraction of the events passing these triggers was recorded to permanent storage.

Simulated event samples and signal modelling
Simulated events are used to model the SUSY signal and to aid in the description of the background processes. Signal models were all generated with MG5_aMC@NLO 2.2-2.4 [37] interfaced to PYTHIA8 [38] for the parton showering (PS) and hadronization and with EvtGen 1.2.0 [39] for the b- and c-hadron decays. The matrix element (ME) calculation was performed at tree level and includes the emission of up to two additional partons for all signal samples. The parton distribution function (PDF) set used for the generation of the signal samples is NNPDF2.3LO [40] with the A14 [41] set of tuned underlying-event and shower parameters (UE tune). The ME-PS matching was performed with the CKKW-L [42] prescription, with a matching scale set to one quarter of the mass of the $\tilde{t}_1$, or of the $\tilde{g}$ for the gluino pair production model. All signal cross sections were calculated to next-to-leading order in the strong coupling constant, adding the resummation of soft-gluon emission at next-to-leading-logarithm accuracy (NLO+NLL) [12][13][14]. The nominal cross section and the uncertainty were taken from an envelope of cross-section predictions using different PDF sets and factorization and renormalization scales, as described in Ref. [15]. For pMSSM models, the sparticle mass spectra were calculated with Softsusy 3.7.3 [43,44] [...] The simulated events were reweighted to match the distribution of the number of pp interactions per bunch crossing in data. Corrections were applied to the simulated events to account for differences between data and simulation for the lepton-trigger and reconstruction efficiencies, momentum scale, energy resolution, isolation, and for the efficiency of identifying jets containing b-hadrons, together with the probability for mis-tagging jets containing only light-flavour and charm hadrons.

Event reconstruction
Events are required to have a primary vertex [62] reconstructed from at least two tracks with $p_T > 400$ MeV. Among the vertices found, the vertex with the largest summed $p_T^2$ of the associated tracks is chosen.
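The vertex choice described above can be sketched as follows. This is an illustration only: the function name and the input layout (one list of track $p_T$ values per vertex, in GeV) are hypothetical, and applying the 400 MeV threshold to the tracks entering the sum is an assumption.

```python
def select_primary_vertex(vertices, min_tracks=2, min_track_pt=0.4):
    """Return the index of the vertex with the largest sum of track pT^2.

    vertices: list of lists of track pT values in GeV (hypothetical layout).
    Vertices with fewer than `min_tracks` tracks above `min_track_pt` are skipped.
    Returns None if no vertex qualifies.
    """
    best_index, best_sum = None, -1.0
    for index, track_pts in enumerate(vertices):
        good = [pt for pt in track_pts if pt > min_track_pt]
        if len(good) < min_tracks:
            continue
        sum_pt2 = sum(pt * pt for pt in good)
        if sum_pt2 > best_sum:
            best_index, best_sum = index, sum_pt2
    return best_index
```

Because the sum is quadratic in $p_T$, a vertex with a few hard tracks is preferred over one with many soft tracks.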
Jets are reconstructed from three-dimensional topological clusters of noise-suppressed calorimeter cells [63] using the anti-$k_t$ jet algorithm [64, 65] with a radius parameter R = 0.4. An area-based correction is applied to account for energy from additional pp collisions based on an estimate of the pile-up activity in a given event [66]. Calibrated [67] jet candidates are required to have $p_T > 20$ GeV and $|\eta| < 2.8$. Events containing jets arising from non-collision sources or detector noise [68] are removed ("no bad jets" requirement). Additional selections based on track information are applied to jets with $p_T < 60$ GeV and $|\eta| < 2.4$ to reject jets that originate from pile-up interactions [69].
Jets that contain b-hadrons and are within the inner detector acceptance ($|\eta| < 2.5$) are identified (as b-tagged jets) with a multivariate algorithm that exploits the impact parameters of the charged-particle tracks, the presence of secondary vertices, and the reconstructed flight paths of b- and c-hadrons inside the jet [70][71][72]. The output of the multivariate algorithm is a single b-tagging weight which signifies the likelihood of a jet containing b-hadrons. The average identification efficiency of jets containing b-hadrons is 77% as determined in simulated $t\bar{t}$ events. A rejection factor of approximately 130 is reached for jets initiated by light quarks and gluons and 6 for jets initiated by charm quarks.
Electron candidates are reconstructed from clusters of energy deposits in the electromagnetic calorimeter that are matched to a track in the inner detector. They are required to have $|\eta| < 2.47$, $p_T > 7$ GeV and must pass a variant of the "very loose" likelihood-based selection [73,74]. The electromagnetic shower of an electron can also be reconstructed as a jet, so a procedure is required to resolve this ambiguity. In the case where the separation between an electron candidate and a non-b-tagged (b-tagged) jet is $\Delta R < 0.2$, the candidate is considered to be an electron (b-tagged jet). If the separation between an electron candidate and any jet satisfies $0.2 < \Delta R < 0.4$, the candidate is considered to be a jet, and the electron candidate is removed.
Muons are reconstructed by matching tracks in the inner detector to tracks in the muon spectrometer and are required to have $|\eta| < 2.7$ and $p_T > 6$ GeV. If the separation between a muon and any jet is $\Delta R < 0.4$, the muon is omitted. Events containing muons identified as originating from cosmic rays ($|d_0| > 0.2$ mm and $|z_0| > 1$ mm) or as poorly reconstructed ($\sigma(q/p)/|(q/p)| > 0.2$) are removed ("cosmic and bad muon" requirement). Here, $d_0$ is the transverse impact parameter of a track with respect to the primary vertex, $z_0$ is the distance of this point from the primary vertex projected onto the z-axis, and $\sigma(q/p)/|(q/p)|$ provides a measure of the momentum uncertainty for a particle with charge q.
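The electron-jet and muon-jet ambiguity resolution described above can be sketched as below. The function names and the dictionary layout (`eta`, `phi`, `btag`) are hypothetical. The order of the steps, and the choice that jets removed in the first step no longer take part in the later ones, are assumptions that the text does not spell out.

```python
import math

def delta_r(a, b):
    """Angular distance between two objects given as dicts with 'eta' and 'phi'."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def remove_overlaps(electrons, muons, jets):
    """Return (electrons, muons, jets) after overlap removal (illustrative sketch)."""
    kept_electrons = []
    dropped_jets = set()
    # Step 1: electron-jet pairs with Delta R < 0.2.
    for ele in electrons:
        close = [i for i, jet in enumerate(jets) if delta_r(ele, jet) < 0.2]
        if any(jets[i]["btag"] for i in close):
            continue  # the candidate is treated as a b-tagged jet
        dropped_jets.update(close)  # the candidate is treated as an electron
        kept_electrons.append(ele)
    jets_out = [jet for i, jet in enumerate(jets) if i not in dropped_jets]
    # Step 2: surviving electrons within Delta R < 0.4 of a surviving jet are removed.
    electrons_out = [e for e in kept_electrons
                     if all(delta_r(e, j) >= 0.4 for j in jets_out)]
    # Step 3: muons within Delta R < 0.4 of a surviving jet are removed.
    muons_out = [m for m in muons
                 if all(delta_r(m, j) >= 0.4 for j in jets_out)]
    return electrons_out, muons_out, jets_out
```

After the first step no surviving electron has a surviving jet within $\Delta R < 0.2$, so the second step only needs the upper bound of 0.4.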
The $\mathbf{p}_T^{\mathrm{miss}}$ vector is the negative vector sum of the $p_T$ of all selected and calibrated electrons, muons, and jets in the event. An extra term is added to account for small energy depositions in the event that are not associated with any of the selected objects. This "soft" term is calculated from inner detector tracks with $p_T > 400$ MeV that are not associated with physics objects and that are matched to the primary vertex, which makes the term resilient to pile-up contamination [75]. The missing transverse momentum from the tracking system (denoted by $\mathbf{p}_T^{\mathrm{miss,track}}$, with magnitude $E_T^{\mathrm{miss,track}}$) is computed from the vector sum of the reconstructed inner detector tracks with $p_T > 400$ MeV, $|\eta| < 2.5$, that are associated with the primary vertex in the event. The $\mathbf{p}_T^{\mathrm{miss,track}}$ and $E_T^{\mathrm{miss,track}}$ are used to reject events with large calorimeter-based $E_T^{\mathrm{miss}}$ due to pile-up contamination or jet energy mismeasurements. These events, where the $\mathbf{p}_T^{\mathrm{miss,track}}$ tends not to be aligned with the $\mathbf{p}_T^{\mathrm{miss}}$ and the $E_T^{\mathrm{miss}}$ tends to be much larger than the $E_T^{\mathrm{miss,track}}$, are rejected by requiring that the $\Delta\phi$ between the $\mathbf{p}_T^{\mathrm{miss}}$ and $\mathbf{p}_T^{\mathrm{miss,track}}$ is less than $\pi/3$ and that $E_T^{\mathrm{miss,track}} > 30$ GeV.
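The track-based cleaning requirement above amounts to two cuts, which can be sketched as follows. The function name and the representation of both vectors as `(px, py)` pairs in GeV are hypothetical choices made for illustration.

```python
import math

def passes_track_met_cleaning(met_xy, track_met_xy,
                              max_dphi=math.pi / 3.0, min_track_met=30.0):
    """Check |dphi(p_T^miss, p_T^miss,track)| < pi/3 and E_T^miss,track > 30 GeV."""
    met_phi = math.atan2(met_xy[1], met_xy[0])
    track_phi = math.atan2(track_met_xy[1], track_met_xy[0])
    # Wrap the difference into [-pi, pi) before taking the absolute value.
    dphi = abs((met_phi - track_phi + math.pi) % (2.0 * math.pi) - math.pi)
    track_met = math.hypot(track_met_xy[0], track_met_xy[1])
    return dphi < max_dphi and track_met > min_track_met
```

An event whose track-based vector points opposite to the calorimeter-based one, or whose track-based magnitude is small, fails the requirement.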
The requirements on electrons and muons are tightened for the selection of events in background control regions (described in Section 7) containing leptons. Electron and muon candidates are required to have $p_T > 20$ GeV ($p_T > 28$ GeV) for regions using the $E_T^{\mathrm{miss}}$ (lepton) triggers and to satisfy $p_T$-dependent track- and calorimeter-based isolation criteria. The calorimeter-based isolation is determined by taking the ratio of the sum of energy deposits in a cone of R = 0.2 around the electron or muon candidate and the energy deposits associated with the electron and muon. The track-based isolation is estimated in a similar way but using a variable cone size with a maximum value of R = 0.2 for electrons and R = 0.3 for muons. An isolation requirement is made that is 95% efficient for electron or muon candidates with $p_T = 25$ GeV and 99% for candidates with $p_T = 60$ GeV.
Electron candidates are required to pass a "tight" likelihood-based selection [73]. The impact parameter of the electron in the transverse plane with respect to the reconstructed event primary vertex is required to be less than five times the impact parameter uncertainty ($\sigma_{d_0}$). The impact parameter along the beam direction, $|z_0 \times \sin\theta|$, is required to be less than 0.5 mm. Further selection criteria are also imposed on reconstructed muons: muon candidates are required to pass a "medium" quality selection [76]. In addition, the requirements $|d_0| < 3\sigma_{d_0}$ and $|z_0 \times \sin\theta| < 0.5$ mm are imposed for muon candidates.

Signal region definitions
The main experimental signature for all signal topologies is the presence of multiple jets (two of which are b-tagged), no muons or electrons, and significant missing transverse momentum.
Five sets of signal regions (SRA-E) are defined to target each topology and kinematic regime. SRA (SRB) is sensitive to production of high-mass $\tilde{t}_1$ pairs with large (intermediate) $\Delta m(\tilde{t}_1, \tilde\chi^0_1)$. Both SRA and SRB employ top-mass reconstruction techniques to reject background. SRC is designed for the highly compressed region with $\Delta m(\tilde{t}_1, \tilde\chi^0_1) \sim m_t$. In this signal region, initial-state radiation (ISR) is used to improve sensitivity to these decays. SRD is targeted at $\tilde{t}_1 \rightarrow b\tilde\chi^\pm_1$ decays, where no top-quark candidates are reconstructed. SRE is optimized for scenarios with highly boosted top quarks that can occur in gluino-mediated top-squark production.
A common preselection is defined for all signal regions. At least four jets are required, of which at least one must be b-tagged. The four leading jets (ordered in $p_T$) must satisfy $p_T^{0-3} > 80, 80, 40, 40$ GeV due to the tendency for signal events to have more energetic jets than background. Events containing reconstructed electrons or muons are vetoed. The $E_T^{\mathrm{miss}}$ trigger threshold motivates the requirement $E_T^{\mathrm{miss}} > 250$ GeV and rejects most of the background from multijet and all-hadronic $t\bar{t}$ events. In order to reject events with mismeasured $E_T^{\mathrm{miss}}$ originating from multijet and hadronic $t\bar{t}$ decays, an angular separation between the azimuthal angle of the two highest-$p_T$ jets and the $\mathbf{p}_T^{\mathrm{miss}}$ [...]
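The fully stated parts of the common preselection can be collected in a short sketch. The function name and input layout are hypothetical, and the angular-separation requirement between the leading jets and the missing transverse momentum is deliberately left out because its threshold is not given in the text above.

```python
def passes_preselection(jets, n_electrons, n_muons, met):
    """Apply the stated preselection cuts (illustrative sketch).

    jets: list of (pt_GeV, is_btagged) tuples; met: E_T^miss in GeV.
    """
    if n_electrons + n_muons > 0:          # lepton veto
        return False
    if met <= 250.0:                       # E_T^miss > 250 GeV
        return False
    if len(jets) < 4:                      # at least four jets
        return False
    if not any(is_b for _, is_b in jets):  # at least one b-tagged jet
        return False
    leading = sorted((pt for pt, _ in jets), reverse=True)[:4]
    thresholds = (80.0, 80.0, 40.0, 40.0)  # pT thresholds for the four leading jets
    return all(pt > thr for pt, thr in zip(leading, thresholds))
```

Sorting the jets inside the function means the caller does not need to provide them in $p_T$ order.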

Signal Regions A and B
SRA and SRB are targeted at direct top-squark pair production where the top squarks decay via $\tilde{t}_1 \rightarrow t\tilde\chi^0_1$ [...] The decay products of the $t\bar{t}$ system in the all-hadronic decay mode can often be reconstructed as six distinct R = 0.4 jets. The transverse shape of these jets is typically circular with a radius equal to this radius parameter, but when two of the jets are less than 2R apart in $\eta$-$\phi$ space, the one-to-one correspondence of a jet with a top-quark daughter may no longer hold. Thus, the two hadronic top candidates are reconstructed by applying the anti-$k_t$ clustering algorithm [64] to the R = 0.4 jets, using reclustered radius parameters of R = 0.8 and R = 1.2. Two R = 1.2 reclustered jets are required; the mass of the highest-$p_T$ R = 1.2 reclustered jet is shown in Figure 2(a). The events are divided into three categories based on the resulting R = 1.2 reclustered jet masses ordered in $p_T$, as illustrated in Figure 3: the "TT" category includes events with two top candidates, i.e. with masses $m^0_{\mathrm{jet},R=1.2} > 120$ GeV and $m^1_{\mathrm{jet},R=1.2} > 120$ GeV; the "TW" category contains events with one top candidate and a W candidate, i.e. where $m^0_{\mathrm{jet},R=1.2} > 120$ GeV and $60 < m^1_{\mathrm{jet},R=1.2} < 120$ GeV; and the "T0" category represents events with only one top candidate, i.e. where $m^0_{\mathrm{jet},R=1.2} > 120$ GeV and $m^1_{\mathrm{jet},R=1.2} < 60$ GeV. Since the signal-to-background ratio is different in each of these categories, they are optimized individually for SRA and SRB.
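The TT/TW/T0 categorization above maps directly onto a small function. The function name is hypothetical, and the treatment of events with a subleading mass exactly at 60 or 120 GeV is an assumption, since the text uses strict inequalities on both sides.

```python
def top_category(m0, m1):
    """Categorize an event by its two R = 1.2 reclustered jet masses (GeV).

    m0, m1: masses of the leading and subleading (in pT) reclustered jets.
    Returns "TT", "TW", "T0", or None if the leading jet is not a top candidate.
    """
    if m0 <= 120.0:
        return None
    if m1 > 120.0:
        return "TT"   # two top candidates
    if m1 > 60.0:
        return "TW"   # one top candidate and a W candidate
    return "T0"       # only one top candidate
```

For instance, masses of (180, 85) GeV give "TW", while (180, 30) GeV give "T0".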
The most powerful discriminating variable against SM $t\bar{t}$ production is the $E_T^{\mathrm{miss}}$ value, which for the signal results from the undetected $\tilde\chi^0_1$ neutralinos. Substantial $t\bar{t}$ background rejection is provided by additional requirements that reject events in which one W boson decays via a charged lepton plus neutrino. The first requirement is that the transverse mass ($m_T$) calculated from the $E_T^{\mathrm{miss}}$ and the b-tagged jet with minimum distance in $\phi$ to the $\mathbf{p}_T^{\mathrm{miss}}$ direction, denoted $m_T^{b,\min}$, is above 200 GeV, since its upper bound (ideally, without consideration of resolution effects) is below the top-quark mass for the $t\bar{t}$ background, as illustrated in Figure 2(b). An additional requirement is made on the mass of the leading (in $p_T$) R = 0.8 reclustered jet to be consistent with a W candidate: $m^0_{\mathrm{jet},R=0.8} > 60$ GeV. Additionally, requirements on the stransverse mass ($m_{T2}^{\chi^2}$) [77, 78] are made which are especially powerful in the T0 category where a $\chi^2$ method is applied to reconstruct top quarks with lower momenta where reclustering was suboptimal. The $m_{T2}^{\chi^2}$ variable is constructed from the direction and magnitude of the $\mathbf{p}_T^{\mathrm{miss}}$ vector in the transverse plane as well as the direction of two top-quark candidates reconstructed using a $\chi^2$ method. The minimization in this method is done in terms of a $\chi^2$-like penalty function, $\chi^2 = (m_{\mathrm{cand}} - m_{\mathrm{true}})^2 / m_{\mathrm{true}}$, where $m_{\mathrm{cand}}$ is the candidate mass and $m_{\mathrm{true}}$ is set to 80.4 GeV for W candidates and 173.2 GeV for top candidates. Initially, single or pairs of R = 0.4 jets form W candidates which are then combined with additional b-tagged jets in the event to construct top candidates. The top candidates selected by the $\chi^2$ method are only used for the momenta in $m_{T2}^{\chi^2}$ while the mass hypotheses for the top quarks and the invisible particles are set to 173.2 GeV and 0 GeV, respectively. Finally, a "$\tau$-veto" requirement is applied to reject semi-hadronically decaying $\tau$-lepton candidates likely to have originated from a $W \rightarrow \tau\nu$ decay.
Here, events that contain a non-b-tagged jet within $|\eta| < 2.5$ with fewer than four associated charged-particle tracks with $p_T > 500$ MeV, and where the $\Delta\phi$ between the jet and the $\mathbf{p}_T^{\mathrm{miss}}$ is less than $\pi/5$, are vetoed. The systematic uncertainties for this requirement are found to be negligible [25]. In SRB, additional discrimination is provided by $m_T^{b,\max}$ and $\Delta R(b, b)$. The former quantity is analogous to $m_T^{b,\min}$ except that the transverse mass is computed with the b-tagged jet that has the largest $\Delta\phi$ with respect to the $\mathbf{p}_T^{\mathrm{miss}}$ direction. The latter quantity provides additional discrimination against background where the two jets with highest b-tagging weights originate from a gluon splitting. Table 1 summarizes the selection criteria that are used in these two signal regions. The categories are statistically combined within SRA and SRB to maximize the sensitivity to signal.

Signal Regions C
SRC is optimized for direct top-squark pair production where $\Delta m(\tilde{t}_1, \tilde\chi^0_1) \approx m_t$, a regime in which the signal topology is similar to SM $t\bar{t}$ production. In the presence of high-momentum ISR, which can be reconstructed as multiple jets forming an ISR system, the di-top-squark system is boosted in the transverse plane. The ratio of the $E_T^{\mathrm{miss}}$ to the $p_T$ of the ISR system in the centre-of-mass (CM) frame of the entire (ISR plus di-top-squark) system ($p_T^{\mathrm{ISR}}$), defined as $R_{\mathrm{ISR}}$, is proportional to the ratio of the $\tilde\chi^0_1$ and $\tilde{t}_1$ masses [79, 80]:
$$R_{\mathrm{ISR}} \equiv \frac{E_T^{\mathrm{miss}}}{p_T^{\mathrm{ISR}}} \sim \frac{m_{\tilde\chi^0_1}}{m_{\tilde{t}_1}}.$$
A "recursive jigsaw reconstruction technique", as described in Ref. [81], is used to divide each event into an ISR hemisphere and a sparticle hemisphere, where the latter consists of the pair of candidate top squarks, each of which decays via a top quark and a $\tilde\chi^0_1$. Objects are grouped together based on their proximity in the lab frame's transverse plane by minimizing the reconstructed transverse masses of the ISR system and sparticle system simultaneously over all choices of object assignment. Kinematic variables are then defined based on this assignment of objects to either the ISR system or the sparticle system. This method is equivalent to grouping the event objects according to the axis of maximum back-to-back $p_T$ in the event's CM frame where the $p_T$ of all accepted objects sums vectorially to zero. In events with a high-$p_T$ ISR gluon, the axis of maximum back-to-back $p_T$, also known as the thrust axis, approximates the direction of the ISR and sparticles' back-to-back recoil.
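The grouping by the axis of maximum back-to-back $p_T$ can be illustrated with a brute-force sketch. This is not the recursive jigsaw implementation of Ref. [81]: the angular scan, the function names, and the input layout (a list of `(px, py)` pairs that includes the missing transverse momentum so that the vectors sum to zero) are all assumptions made for illustration. The sketch also does not decide which of the two hemispheres is the ISR one.

```python
import math

def transverse_thrust_axis(momenta, n_steps=1800):
    """Scan angles in [0, pi) and return the one maximizing sum |p . n|.

    momenta: list of (px, py) transverse momenta, assumed to sum to zero.
    """
    best_angle, best_value = 0.0, -1.0
    for step in range(n_steps):
        angle = math.pi * step / n_steps
        nx, ny = math.cos(angle), math.sin(angle)
        value = sum(abs(px * nx + py * ny) for px, py in momenta)
        if value > best_value:
            best_angle, best_value = angle, value
    return best_angle

def split_hemispheres(momenta, n_steps=1800):
    """Label each object +1 or -1 according to its side of the thrust axis."""
    angle = transverse_thrust_axis(momenta, n_steps)
    nx, ny = math.cos(angle), math.sin(angle)
    return [1 if px * nx + py * ny >= 0.0 else -1 for px, py in momenta]
```

With one hard object recoiling against three softer ones, the hard object ends up alone in its hemisphere, which is the configuration expected for a single high-$p_T$ ISR jet.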
The selection criteria for this signal region are summarized in Table 2. The events are divided into five windows (SRC1-5) defined by non-overlapping ranges of the reconstructed $R_{\mathrm{ISR}}$, which target different top-squark and $\tilde\chi^0_1$ masses: e.g., SRC2 is optimized for $m_{\tilde{t}_1} = 300$ GeV and $m_{\tilde\chi^0_1} = 127$ GeV, and SRC4 is optimized for $m_{\tilde{t}_1} = 500$ GeV and $m_{\tilde\chi^0_1} = 327$ GeV. At least five jets must be assigned to the sparticle hemisphere of the event ($N^{S}_{\mathrm{jet}}$), and at least one of those jets ($N^{S}_{b\text{-jet}}$) must be b-tagged. Transverse-momentum requirements on $p_T^{\mathrm{ISR}}$, the highest-$p_T$ b-jet in the sparticle hemisphere ($p_{T,b}^{0,S}$), and the fourth-highest-$p_T$ jet in the sparticle hemisphere ($p_T^{4,S}$) are applied. The transverse mass formed by the sparticle system and the $E_T^{\mathrm{miss}}$, defined as $m_S$, is required to be > 300 GeV. The ISR system is also required to be separated in azimuth from the $\mathbf{p}_T^{\mathrm{miss}}$ in the CM frame; this variable is defined as $\Delta\phi(\mathrm{ISR}, \mathbf{p}_T^{\mathrm{miss}})$. Similarly to the categories defined for SRA and SRB, the individual SRCs are statistically combined to improve signal sensitivity.

Table 2: Selection criteria for SRC, in addition to the common preselection requirements described in the text. The signal regions are separated into windows based on ranges of $R_{\mathrm{ISR}}$.

[...] = 100 GeV, respectively. Tighter leading and sub-leading jet $p_T$ requirements are made for SRD-high, as summarized in Table 3.

Table 3: Selection criteria for SRD, in addition to the common preselection requirements described in the text.

Signal Region E
SRE is designed for models which have highly boosted top quarks. Such signatures can arise from direct pair production of high-mass top partners, or from the gluino-mediated compressed $\tilde{t}_1$ scenario with large $\Delta m(\tilde{g}, \tilde{t}_1)$. In this regime, reclustered jets with R = 0.8 are utilized to optimize the experimental sensitivity to these highly boosted top quarks. In this signal region, at least two jets out of the four or more required jets must be b-tagged. Additional discrimination is provided by the $E_T^{\mathrm{miss}}$ [...] Table 4.

Background estimation
The main SM background process in SRA, SRB, SRD, and SRE is $Z \rightarrow \nu\bar\nu$ production in association with heavy-flavour jets. The second most significant background is $t\bar{t}$ production where one W boson decays via a lepton and neutrino and the lepton (particularly a hadronically decaying $\tau$ lepton) is either not identified or is reconstructed as a jet. This process gives the major background contribution in SRC and an important background in SRB, SRD and SRE as well. Other important background processes are $W \rightarrow \ell\nu$ plus heavy-flavour jets, single top quark, and the irreducible background from $t\bar{t} + Z$, where the Z boson decays into two neutrinos.
The main background contributions are estimated primarily from comparisons between data and simulation outside the signal regions (SRs). Control regions (CRs) are designed to enhance a particular background process, and are orthogonal to the SRs while probing a similar event topology. The CRs are used to normalize the simulation to data, but extrapolation from the CR to the SR is taken from simulation. Sufficient data are needed to avoid large statistical uncertainties in the background estimates, and the CR definitions are chosen to be kinematically as close as possible to all SRs, to minimize the systematic uncertainties associated with extrapolating the background yield from the CR to the SR. Where CR definitions are farther from the SR definition, validation regions are employed to cross-check the extrapolation. In addition, control-region selection criteria are chosen to minimize potential contamination from signal that could shadow contributions in the signal regions. The signal contamination is below 8% in all CRs for all signal points that have not been excluded by previous ATLAS searches. No significant difference in the background estimates was found between the case where only SM backgrounds were considered and the case where signal is included in the estimation. As the CRs are not 100% pure in the process of interest, the cross-contamination between CRs from other processes is estimated. The normalization factors and the cross-contamination are determined simultaneously for all regions using a fit described below.

Table 4: Selection criteria for SRE in addition to the common preselection requirements described in the text.
Detailed CR definitions are given in Tables 5, 6, and 7. They are used for the Z (CRZs), $t\bar{t}$ (CRTs), W (CRW), single top (CRST), and $t\bar{t}+Z$ (CRTTGamma) background estimation. The $\Delta\phi(\mathrm{jet}^{0,1,2}, \mathbf{p}_T^{\mathrm{miss}})$ and $m_T(\ell, E_T^{\mathrm{miss}})$ requirements are designed to reduce contamination from SM multijet processes. The number of leptons (from this point on, lepton is used to mean electron or muon) is indicated by $N_\ell$ and the transverse momentum of the lepton is indicated by $p_T^{\ell}$. In all one-lepton CRs, once the trigger and minimum $p_T^{\ell}$ selection are applied, the lepton is treated as a non-b-tagged jet (to emulate the hadronic $\tau$ decays in the SRs) in the computation of all jet-related variables. In the two-lepton CRZs, a lepton-$p_T$ requirement of at least 28 GeV is made to ensure the trigger selection is fully efficient. The invariant mass of the two oppositely charged leptons, denoted by $m_{\ell\ell}$, must be consistent with the leptons having originated from a Z boson. The transverse momenta of these leptons are then vectorially added to the $\mathbf{p}_T^{\mathrm{miss}}$ to mimic the $Z \rightarrow \nu\bar\nu$ decays in the SRs, forming the quantity $E_T^{\mathrm{miss}\prime}$ [...]. Requirements such as the maximum $m_T(\ell, E_T^{\mathrm{miss}})$ and the minimum $\Delta R$ between the two highest-weight b-tagged jets and the lepton, $\Delta R(b, \ell)_{\min}$, are used to enforce orthogonality between CRT, CRW, and CRST. In CRST, the requirement on the $\Delta R$ between the two highest-weight b-tagged jets, $\Delta R(b, b)$, is used to reject $t\bar{t}$ contamination from the control region enriched in single-top events. Finally, the normalization of the $t\bar{t}+W/Z$ background in the signal region, which is completely dominated by $t\bar{t} + Z(\rightarrow \nu\bar\nu)$, is estimated with a $t\bar{t} + \gamma$ control region in a way similar to the method described in Ref. [27]. The same lepton triggers and lepton-$p_T$ requirements are used for the $t\bar{t} + \gamma$ control region as in the CRZs.
Additionally, the presence of an isolated photon with p T > 150 GeV is required and it is used to model the Z decay in the signal regions because of the similarity between the diagrams for photon and Z production. Similarly to the Z control region, the photon is used in the estimation of E miss T -related variables.
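The vector addition used to emulate invisible $Z$ decays in the two-lepton and photon control regions can be sketched as follows. This is a minimal illustration, not the analysis code: the function name `add_to_ptmiss`, the (pT, φ) tuple representation, and the example event values are all assumptions made here.

```python
import math

def add_to_ptmiss(ptmiss, visible_objects):
    """Vectorially add the pT of visible objects to the missing pT.

    ptmiss and each entry of visible_objects are (pt, phi) pairs,
    in GeV and radians. Returns the modified (magnitude, phi).
    """
    px = ptmiss[0] * math.cos(ptmiss[1])
    py = ptmiss[0] * math.sin(ptmiss[1])
    for pt, phi in visible_objects:
        px += pt * math.cos(phi)
        py += pt * math.sin(phi)
    return math.hypot(px, py), math.atan2(py, px)

# Hypothetical Z -> ll candidate: little genuine missing pT, two hard leptons.
ptmiss = (20.0, 0.0)
leptons = [(150.0, 0.3), (120.0, -0.2)]
met_prime, phi_prime = add_to_ptmiss(ptmiss, leptons)

# The same helper applies to the photon in a ttbar+gamma-like selection.
photon = [(180.0, 1.0)]
met_gamma, phi_gamma = add_to_ptmiss(ptmiss, photon)
```

Treating the leptons (or the photon) as invisible in this way lets the control-region events populate the same high-$E_{\mathrm{T}}^{\mathrm{miss}}$ phase space as the signal regions.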
To estimate the $Z$+jets and $t\bar{t}$ backgrounds in the different kinematic regions of the signal regions, individual control regions are designed for all signal regions where possible. Control regions are merged to form one control region for multiple signal regions only if their statistical power is low. In the case of CRST, CRW, and CRTTGamma, this results in the use of one common CR for all signal regions. Distributions from the $Z$+jets, $t\bar{t}$, $W$+jets, single-top, and $t\bar{t}\gamma$ control regions are shown in Figure 4.
Contributions from all-hadronic $t\bar{t}$ and multijet production are found to be negligible. These are estimated from data using a procedure described in Ref. [82]. The procedure determines the jet response from simulated dijet events, and then uses this response function to smear the jet response in low-$E_{\mathrm{T}}^{\mathrm{miss}}$ events. The jet response is cross-checked with data where the $E_{\mathrm{T}}^{\mathrm{miss}}$ can be unambiguously attributed to the mismeasurement of one of the jets. Diboson production, which is also subdominant, is estimated directly from simulation.
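The idea behind the jet-smearing estimate can be illustrated with a toy sketch. It is not the procedure of Ref. [82]: the function `smear_event`, the representation of the response function as a plain list of reco/true $p_{\mathrm{T}}$ ratios, and the seed-event values are hypothetical stand-ins.

```python
import math
import random

def smear_event(jets, ptmiss_xy, responses, rng):
    """Smear each jet pT by a sampled response and update the missing pT.

    jets: list of (pt, phi) for a well-measured, low-MET seed event.
    ptmiss_xy: (px, py) of the seed event's missing transverse momentum.
    responses: sample of reco/true pT ratios (stand-in response function).
    rng: random.Random instance used to draw the responses.
    """
    mpx, mpy = ptmiss_xy
    smeared = []
    for pt, phi in jets:
        new_pt = rng.choice(responses) * pt
        # A change in a jet's pT is balanced by the missing momentum.
        mpx -= (new_pt - pt) * math.cos(phi)
        mpy -= (new_pt - pt) * math.sin(phi)
        smeared.append((new_pt, phi))
    return smeared, (mpx, mpy)

# Hypothetical balanced dijet seed event and a toy response sample.
rng = random.Random(1)
seed_jets = [(300.0, 0.0), (300.0, math.pi)]
toy_responses = [0.6, 0.9, 1.0, 1.0, 1.1]
pseudo_jets, (mpx, mpy) = smear_event(seed_jets, (0.0, 0.0), toy_responses, rng)
pseudo_met = math.hypot(mpx, mpy)
```

Repeating the smearing many times per seed event would build up a pseudo-data sample whose high-$E_{\mathrm{T}}^{\mathrm{miss}}$ tail comes from jet mismeasurement alone.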

Simultaneous fit to determine SM background
The observed numbers of events in the various control regions are included in a binned profile likelihood fit [83] to determine the SM background estimates for $Z$, $t\bar{t}$, $W$, single top, and $t\bar{t}+Z$ in each signal region. The normalizations of these backgrounds are determined simultaneously to best match the observed data in each control region, taking contributions from all backgrounds into account. A likelihood function is built as the product of Poisson probability density functions, describing the observed and expected numbers of events in the control regions [84]. This procedure takes common systematic uncertainties (discussed in Section 8) between the control and signal regions and their correlations into account, as they are treated as nuisance parameters in the fit and are modelled by Gaussian probability density functions. The contributions from all other background processes (dibosons and multijets) are fixed at the values expected from the simulation, using the most accurate theoretical cross sections available, as described in Section 4, while their uncertainties are used as nuisance parameters in the fit.
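A stripped-down version of such a fit, with two control regions, two freely normalized backgrounds, and cross-contamination between them, can be sketched as below. The sketch omits all nuisance parameters and their Gaussian constraints, and the yields are invented for illustration; the function names are likewise assumptions.

```python
import math

def expected(mu, templates, fixed):
    """Expected yield per region: sum_b mu_b * MC_b plus fixed backgrounds."""
    return [sum(m * t for m, t in zip(mu, row)) + f
            for row, f in zip(templates, fixed)]

def poisson_nll(mu, observed, templates, fixed):
    """Negative log of a product of Poisson terms, one per control region."""
    nll = 0.0
    for n, lam in zip(observed, expected(mu, templates, fixed)):
        nll += lam - n * math.log(lam) + math.lgamma(n + 1)
    return nll

def solve_two_regions(observed, templates, fixed):
    """Normalization factors for two backgrounds from two control regions.

    With as many regions as free normalizations and no nuisance
    parameters, the likelihood is maximal where the expected yields
    equal the observed ones, so a 2x2 linear solve (Cramer's rule) suffices.
    """
    (a, b), (c, d) = templates
    n1 = observed[0] - fixed[0]
    n2 = observed[1] - fixed[1]
    det = a * d - b * c
    return ((n1 * d - b * n2) / det, (a * n2 - c * n1) / det)

# Invented MC yields: rows are regions (CRT-like, CRW-like),
# columns are backgrounds (ttbar, W+jets); 'fixed' holds minor backgrounds.
templates = [(80.0, 10.0), (20.0, 90.0)]
fixed = [5.0, 5.0]
observed = [102, 108]

mu_tt, mu_w = solve_two_regions(observed, templates, fixed)  # approx. (1.1, 0.9)
```

Because each region contains both backgrounds, the two normalization factors cannot be extracted region by region; they have to be solved for together, which is what the simultaneous fit does in the general case.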
Zero-lepton VRs (VRZAB, VRZD, VRZE) are designed to validate the background estimate for $Z$+jets in the signal regions. No VRZ is designed for SRC due to the negligible contribution of the $Z$ background in this region. The definitions of the VRZs, after the common zero-lepton preselection discussed in Section 6 is applied, are shown in Table 8. To provide orthogonality to the signal regions, the requirement on one or more variables, including $\Delta R(b, b)$, is inverted.
A one-lepton validation region for the $W$+jets background (VRW) is used to test the $W$ background estimates in all SRs. This validation region is based on the definition of CRW. The requirement that differs from CRW is $\Delta R(b_{0,1}, \ell)_{\min}$, which must be greater than 1.8 in the validation region. Two additional requirements are included in the definition of VRW, namely $m_{\mathrm{T}}^{b,\min} > 150$ GeV and $m^{0}_{\mathrm{jet},R=1.2} < 70$ GeV.
Signal contamination in all the validation regions was also checked for all considered signals that have not yet been excluded. The largest contamination found is ∼25% and occurs in the VRTs for top-squark masses below 350 GeV and in VRZD and VRZE near top-squark masses of 700 GeV. The simultaneous fit procedure is repeated with the VRs used as test signal regions; the result for each VR is shown in Figure 5, which displays agreement between data and MC predictions.

Systematic uncertainties
Experimental and theoretical systematic uncertainties in the SM predictions and signal predictions are included in the profile likelihood fit described in Section 7.
Statistical uncertainties dominate the total uncertainties of the background predictions in all SRs except SRB. The dominant systematic uncertainties for SRA and SRB are shown in Table 9, while the systematic uncertainties for the remaining SRs are shown in Table 10. The uncertainties are expressed relative to the total background estimate. The main sources of detector-related systematic uncertainty in the SM background estimates are the jet energy scale (JES) and jet energy resolution (JER), b-tagging efficiency, $E_{\mathrm{T}}^{\mathrm{miss}}$ soft term, and pile-up. The effect of the JES and JER uncertainties on the background estimates in the signal regions can reach 17%. The uncertainty in the b-tagging efficiency is nowhere more than 9%. All jet- and lepton-related uncertainties are propagated to the calculation of the $E_{\mathrm{T}}^{\mathrm{miss}}$, and additional uncertainties in the energy and resolution of the soft term are also included [75]. The uncertainty in the soft term of the $E_{\mathrm{T}}^{\mathrm{miss}}$ is most significant in SRC5 at 15%. An uncertainty due to the pile-up modelling is also considered, with a contribution up to 14%. Lepton reconstruction and identification uncertainties are also considered but have a small impact.
The uncertainty in the combined 2015+2016 integrated luminosity is 3.2%. It is derived, following a methodology similar to that detailed in Ref. [85], from a preliminary calibration of the luminosity scale using x-y beam-separation scans performed in August 2015 and May 2016.
Theoretical uncertainties in the modelling of the SM background are estimated. For the $W/Z$+jets background processes, the modelling uncertainties are estimated using SHERPA samples by varying the renormalization and factorization scales, and the merging and resummation scales (each varied up and down by a factor of two). PDF uncertainties were found to have a negligible impact. The resulting impact on the total background yields from the $Z$+jets theoretical uncertainties is up to 3%, while the uncertainties from the $W$+jets sample variations are less than 3%. […] final states (comparing Powheg-Box+PYTHIA vs HERWIG++ and SHERPA). More details are given in Ref. [54]. The largest impact of the $t\bar{t}$ theory systematic uncertainties on the total background yields arises for SRC, where it varies from 11% to 71% as the $R_{\mathrm{ISR}}$ requirement is tightened. For the $t\bar{t}+W/Z$ background, the theoretical uncertainty is estimated through variations, in both $t\bar{t}+W/Z$ and $t\bar{t}\gamma$ MC simulation, including the choice of renormalization and factorization scales (each varied up and down by a factor of two), the choice of PDF, as well as a comparison between MC@NLO and OpenLoops+SHERPA generators, resulting in a maximum uncertainty of 2% in SRA-TT. The single-top background is dominated by the $Wt$ subprocess. Uncertainties are estimated for the choice of parton-showering model (PYTHIA vs HERWIG++) and for the emission of additional partons in the initial- and final-state radiation. A 30% uncertainty is assigned to the single-top background estimate to account for the effect of interference between single-top-quark and $t\bar{t}$ production. This uncertainty is estimated by comparing yields in the signal and control regions for a sample that includes resonant and non-resonant $WW+bb$ production with the sum of the yields of resonant $t\bar{t}$ and single-top+$b$ production. The final single-top uncertainty relative to the total background estimate is up to 12%.
The detector systematic uncertainties are also applied to the signal samples used for interpretation. Theoretical uncertainties in the signal cross section, as described in Section 4, are treated separately, and limits on top-squark and neutralino masses are given for the ±1σ values as well as the central cross section.
Signal systematic uncertainties due to detector and acceptance effects are taken into account. The main sources of these uncertainties are the JER, ranging from 3% to 6%, the JES, ranging from 2% to 5.7%, pile-up, ranging from 0.5% to 5.5%, and the b-tagging efficiency, ranging from 3% to 5.5%. Uncertainties in the acceptance due to theoretical variations are taken into consideration. These originate from variations of the QCD coupling constant $\alpha_{\mathrm{s}}$, the renormalization and factorization scales, the CKKW matching scale at which the parton-shower description and the matrix-element description are separated, and the parton-shower tune (each varied up and down by a factor of two). These uncertainties range across the SRs between 10% and 25% for the $\tilde{t}_1 \to t^{(*)}\tilde{\chi}^0_1$ grid, the mixed grid, the non-asymptotic higgsino grid, and the $\tilde{g} \to t\tilde{t}_1 \to t\tilde{\chi}^0_1$+soft grid. For the wino-NLSP model, they range from 15% to 20%, and for the well-tempered neutralino pMSSM model they range from 10% to 35%. Finally, the uncertainty in the estimated number of signal events which arises from the cross-section uncertainties for the various processes is taken into account by calculating two additional limits considering a ±1σ change in cross section. The cross-section uncertainty is ∼15-20% for direct top-squark production and ∼15-30% for gluino production [12-15], depending on the top-squark and gluino masses.
Table 9: Dominant systematic uncertainties (greater than 1% for at least one SR) for SRA and SRB in percent relative to the total background estimates. The uncertainties due to the normalization from a control region for a given signal region and background are indicated by $\mu_{t\bar{t}+Z}$, $\mu_{t\bar{t}}$, $\mu_{Z}$, $\mu_{W}$, and $\mu_{\text{single top}}$. The theory uncertainties are the total uncertainties for a given background. Additionally, the uncertainty due to the number of MC events in the background samples is shown as "MC statistical".

Results and interpretation
The observed event yields are compared to the expected total number of background events in Tables 11, 12, 13, and Figure 6. The total background estimate is determined from a simultaneous fit to all control regions, based on a procedure described in Section 7 but including the corresponding signal regions as well as control regions. Figure 7 shows the distribution of $E_{\mathrm{T}}^{\mathrm{miss}}$, $m$, $m_{\mathrm{T}}$, $R_{\mathrm{ISR}}$, and $H_{\mathrm{T}}$ for the various signal regions, with $R_{\mathrm{ISR}}$ being shown combining SRC1-5. In these distributions, the background predictions are scaled to the values determined from the simultaneous fit. No significant excess above the SM prediction is observed in any of the signal regions. The smallest p-values, which express the probability that the background fluctuates to the data or above, are 27%, 27%, and 29% for SRB-T0, SRD-high, and SRA-TT, respectively. The largest deficit in the data can be found in SRC4, where one event is observed while 7.7 background events were expected. The 95% confidence level (CL) upper limits on the number of beyond-the-SM (BSM) events in each signal region are derived using the CL$_{\mathrm{s}}$ prescription [86, 87] and calculated from asymptotic formulae [83]. Model-independent limits on the visible BSM cross sections, defined as $\sigma_{\mathrm{vis}} = S^{95}_{\mathrm{obs}} / \int L\,\mathrm{d}t$, where $S^{95}_{\mathrm{obs}}$ is the 95% CL upper limit on the number of signal events, are given in Table 14.
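The relation between an upper limit on the signal yield, the visible cross section, and a p-value's equivalent significance can be illustrated with a toy calculation. The sketch below deliberately swaps the profile-likelihood test statistic and asymptotic formulae used in the analysis for a single-bin Poisson counting experiment with no systematic uncertainties, so its numbers are not comparable to those in the tables; all function names are assumptions.

```python
import math
from statistics import NormalDist

def poisson_cdf(n, lam):
    """P(N <= n) for a Poisson distribution with mean lam."""
    term = math.exp(-lam)
    total = term
    for k in range(1, n + 1):
        term *= lam / k
        total += term
    return total

def cls(n_obs, s, b):
    """CLs = CL_{s+b} / CL_b for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def s95(n_obs, b, tol=1e-6):
    """Smallest signal yield excluded at 95% CL (CLs <= 0.05), by bisection."""
    lo, hi = 0.0, 10.0
    while cls(n_obs, hi, b) > 0.05:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > 0.05:
            lo = mid
        else:
            hi = mid
    return hi

def visible_xsec_fb(s95_obs, lumi_fb):
    """sigma_vis = S95_obs / integrated luminosity, in fb."""
    return s95_obs / lumi_fb

def p_to_z(p):
    """One-sided Gaussian significance equivalent to a p-value."""
    return NormalDist().inv_cdf(1.0 - p)

# With zero observed events CLs reduces to exp(-s), so S95 = -ln(0.05).
limit_events = s95(0, 3.0)
limit_xsec = visible_xsec_fb(limit_events, 36.1)
# A p-value of 27% corresponds to well under one standard deviation.
z_smallest_p = p_to_z(0.27)
```

In this toy setting the limit on the yield tightens when fewer events are observed than expected, which is why the CL$_{\mathrm{s}}$ ratio, rather than CL$_{s+b}$ alone, is used to avoid excluding signals to which the search has no sensitivity.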
The profile-likelihood-ratio test statistic is used to set limits on direct pair production of top squarks. The signal strength parameter is allowed to float in the fit [84], and any signal contamination in the CRs is taken into account. Again, limits are derived using the CL$_{\mathrm{s}}$ prescription and calculated from asymptotic formulae. Orthogonal signal subregions, such as SRA-TT, SRA-TW, and SRA-T0, are statistically combined by multiplying their likelihood functions. A similar procedure is performed for the signal subregions in SRB and SRC. For the overlapping signal regions defined for SRD (SRD-low and SRD-high), the signal region with the smallest expected CL$_{\mathrm{s}}$ value is chosen for each signal model. Once the signal subregions are combined or chosen, the signal region with the smallest expected CL$_{\mathrm{s}}$ is chosen for each signal model in the $\tilde{t}_1$-$\tilde{\chi}^0_1$ signal grid. The nominal event yield in each SR is set to the mean background expectation to determine the expected limits; contours that correspond to ±1σ uncertainties in the background estimates ($\sigma_{\mathrm{exp}}$) are also evaluated. The observed event yields determine the observed limits for each SR; these are evaluated for the nominal signal cross sections as well as for ±1σ theory uncertainties in those cross sections, denoted by $\sigma^{\mathrm{SUSY}}_{\mathrm{theory}}$.
Figure 8 shows the observed (solid red line) and expected (solid blue line) exclusion contours at 95% CL in the $\tilde{t}_1$-$\tilde{\chi}^0_1$ mass plane for 36.1 fb$^{-1}$. The data exclude top-squark masses between 450 and 1000 GeV for $\tilde{\chi}^0_1$ masses below 160 GeV, extending Run-1 limits from the combination of zero- and one-lepton channels by 260 GeV. Additional constraints are set in the case where $m_{\tilde{t}_1} \approx m_t + m_{\tilde{\chi}^0_1}$ […] is set to 0%, 25%, 50%, and 75% and yield the limits shown in Figure 9.
[…] where the average comes from possibly multiple production channels and on the number of signal events ($S^{95}_{\mathrm{obs}}$). The third column ($S^{95}_{\mathrm{exp}}$) shows the 95% CL upper limit on the number of signal events, given the expected number (and ±1σ excursions of the expected number) of background events. The last two columns indicate the CL$_B$ value, i.e. the confidence level observed for the background-only hypothesis, and the discovery p-value ($p$) and the corresponding significance ($z$).
[…] assumed, which correspond to scenarios with $m_{q_{L3}} < m_{t_R}$ (regardless of the choice of $\tan\beta$) and $m_{t_R} < m_{q_{L3}}$ with $\tan\beta = 20$, respectively. Here $m_{q_{L3}}$ represents the left-handed third-generation mass parameter and $m_{t_R}$ is the mass parameter of the superpartner to the right-handed top quark. Limits in the $m_{\tilde{t}_1}$ and $m_{\tilde{\chi}^0_1}$ plane are shown in Figure 10.
[…] Figure 11. Only bottom- and top-squark production are considered in this interpretation. Allowed decays in the top-squark production scenario are $\tilde{t}_1 \to t\tilde{\chi}^0_2 \to h/Z\,\tilde{\chi}^0_1$, at a maximum branching ratio of 33%, and $\tilde{t}_1 \to b\tilde{\chi}^\pm_1$. Whether the $\tilde{\chi}^0_2$ dominantly decays into an $h$ or a $Z$ is determined by the sign of $\mu$. Along the diagonal region, the $\tilde{t}_1 \to t\tilde{\chi}^0_1$ decay with 100% branching ratio is also considered. The equivalent decays in bottom-squark production are $\tilde{b} \to t\tilde{\chi}$ […]
The SRE results are interpreted for indirect top-squark production through gluino decays in terms of the $\tilde{t}_1$ vs $\tilde{g}$ mass plane with $\Delta m(\tilde{t}_1, \tilde{\chi}^0_1) = 5$ GeV. Gluino masses up to $m_{\tilde{g}} = 1800$ GeV with $m_{\tilde{t}_1} < 800$ GeV are excluded, as shown in Figure 13.
Figure 12: Observed (solid line) and expected (dashed line) exclusion contours at 95% CL as a function of $\tilde{t}_1$ and $\tilde{\chi}^0_1$ masses for the $\tilde{t}_L$ scan (red) as well as for the $\tilde{t}_R$ scan (blue) in the well-tempered pMSSM model. Uncertainty bands correspond to the ±1σ variation of the expected limit.

Conclusions
Results from a search for top squark pair production based on an integrated luminosity of 36.1 fb$^{-1}$ of $\sqrt{s}$ = 13 TeV pp collision data recorded by the ATLAS experiment at the LHC in 2015 and 2016 are presented. Top squarks are searched for in final states with high-$p_{\mathrm{T}}$ jets and large missing transverse momentum. In this paper, direct top squark production is studied assuming top squarks decay via $\tilde{t}_1 \to t^{(*)}\tilde{\chi}^0_1$ with large or small mass differences between the top squark and the neutralino $\Delta m(\tilde{t}_1, \tilde{\chi}^0_1)$ […]
[74] ATLAS Collaboration, Electron identification measurements in ATLAS using $\sqrt{s}$ = 13 TeV data with 50 ns bunch spacing, (2015), url: https://cds.cern.ch/record/2048202.
[77] C. Lester and D. Summers, Measuring masses of semiinvisibly decaying particles pair produced at hadron colliders, Phys. Lett. B 463 (1999) 99, arXiv: hep-ph/9906349.