Search for supersymmetry in events with $b$-tagged jets and missing transverse momentum in $pp$ collisions at $\sqrt{s}=13$ TeV with the ATLAS detector

A search for the supersymmetric partners of the Standard Model bottom and top quarks is presented. The search uses 36.1 fb$^{-1}$ of $pp$ collision data at $\sqrt{s}=13$ TeV collected by the ATLAS experiment at the Large Hadron Collider. Direct production of pairs of bottom and top squarks ($\tilde{b}_{1}$ and $\tilde{t}_{1}$) is searched for in final states with $b$-tagged jets and missing transverse momentum. Distinctive selections are defined with either no charged leptons (electrons or muons) in the final state, or one charged lepton. The zero-lepton selection targets models in which the $\tilde{b}_{1}$ is the lightest squark and decays via $\tilde{b}_{1} \rightarrow b \tilde{\chi}^{0}_{1}$, where $\tilde{\chi}^{0}_{1}$ is the lightest neutralino. The one-lepton final state targets models where bottom or top squarks are produced and can decay into multiple channels, $\tilde{b}_{1} \rightarrow b \tilde{\chi}^{0}_{1}$ and $\tilde{b}_{1} \rightarrow t \tilde{\chi}^{\pm}_{1}$, or $\tilde{t}_{1} \rightarrow t \tilde{\chi}^{0}_{1}$ and $\tilde{t}_{1} \rightarrow b \tilde{\chi}^{\pm}_{1}$, where $\tilde{\chi}^{\pm}_{1}$ is the lightest chargino and the mass difference $m_{\tilde{\chi}^{\pm}_{1}}- m_{\tilde{\chi}^{0}_{1}}$ is set to 1 GeV. No excess above the expected Standard Model background is observed. Exclusion limits at 95\% confidence level on the mass of third-generation squarks are derived in various supersymmetry-inspired simplified models.


Introduction
Supersymmetry (SUSY) [1][2][3][4][5][6] provides an extension of the Standard Model (SM) that solves the hierarchy problem [7][8][9][10] by introducing partners of the known bosons and fermions. In the framework of R-parity-conserving models, SUSY particles are produced in pairs and the lightest supersymmetric particle (LSP) is stable, providing a possible candidate for dark matter [11,12]. In a large variety of models the LSP is the lightest neutralino (χ 0 1 ). Naturalness considerations [13,14] suggest that the supersymmetric partners of the third-generation SM quarks are the lightest coloured supersymmetric particles. This may lead to the lightest bottom squark (b 1 ) and top squark (t 1 ) mass eigenstates 1 being significantly lighter than the other squarks and the gluinos. As a consequence,b 1 andt 1 could be pair-produced with relatively large cross-sections at the Large Hadron Collider (LHC). This paper presents a search for the direct pair production of bottom and top squarks decaying into final states with jets, two of them originating from the fragmentation of b-quarks (b-jets), and missing transverse momentum (p miss T , whose magnitude is referred to as E miss T ). The dataset analysed corresponds to 36.1 fb −1 of proton-proton (pp) collisions data at inspired by the minimal supersymmetric extension of the SM (MSSM) [15][16][17], where theb 1 exclusively decays asb 1 → bχ 0 1 or where two decay modes for the bottom (top) squark are allowed and direct decays to the LSP,b 1 → bχ 0 1 (t 1 → tχ 0 1 ) compete with decays via an intermediate chargino (χ ± 1 ) state, b 1 → tχ ± 1 (t 1 → bχ ± 1 ). In this case it is assumed that theχ ± 1 is the next-to-lightest supersymmetric particle (NLSP) and is almost degenerate withχ 0 1 , such that other decay products are too low in momentum to be efficiently reconstructed. The first set of models lead to final-state events from bottom squark pair production characterized by the presence of two b-jets, E miss T and no charged leptons ( = e, µ), referred to as the zero-lepton channel (Figure 1(a)). For mixed decays (direct or through an intermediate stage), the final state of bottom or top squark pair production depends on the branching ratios of the competing decay modes. If the decay modes are equally probable, a large fraction of the signal events are characterized by the presence of a top quark, a bottom quark, and neutralinos. Hadronic decays of the top quark are targeted by the zero-lepton channel, whilst novel dedicated selections requiring one charged lepton, two b-jets and E miss T are developed for semi-leptonic decays of the top quark, referred to as the one-lepton channel (Figure 1(b)). A statistical combination of the two channels is performed when interpreting the results in terms of exclusion limits on the third-generation squark masses.
Previous searches for the exclusive decayb 1 → bχ 0 1 with the √ s = 13 TeV LHC Run-2 dataset at ATLAS and CMS have set exclusion limits at 95% confidence level (CL) onb 1 masses in such scenarios [18,19]. Searches in the context of mixed-decay models were performed only by ATLAS using the Run-1 √ s = 8 TeV dataset and resulted in exclusion limits on the third-generation squark mass that depend on the branching ratios of the competing decay modes [20].

ATLAS detector
The ATLAS detector [21] is a multi-purpose particle physics detector with a forward-backward symmetric cylindrical geometry and nearly 4π coverage in solid angle. 2 The inner tracking detector consists of pixel and silicon microstrip detectors covering the pseudorapidity region |η| < 2.5, surrounded by a transition radiation tracker which enhances electron identification in the region |η| < 2.0. Between Run 1 and Run 2, a new inner pixel layer, the insertable B-layer [22], was added at a mean sensor radius of 3.3 cm. The inner detector is surrounded by a thin superconducting solenoid providing an axial 2 T magnetic field and by a fine-granularity lead/liquid-argon (LAr) electromagnetic calorimeter covering |η| < 3.2. A steel/scintillator-tile calorimeter provides hadronic coverage in the central pseudorapidity range (|η| < 1.7). The endcap and forward regions (1.5 < |η| < 4.9) of the hadronic calorimeter are made of LAr active layers with either copper or tungsten as the absorber material. An extensive muon spectrometer with an air-core toroidal magnet system surrounds the calorimeters. Three layers of highprecision tracking chambers provide coverage in the range |η| < 2.7, while dedicated fast chambers allow triggering in the region |η| < 2.4. The ATLAS trigger system consists of a hardware-based level-1 trigger followed by a software-based high-level trigger [23].

Data and simulated event samples
The data used in this analysis were collected by the ATLAS detector in pp collisions at the LHC with a centre-of-mass energy of 13 TeV and a 25 ns proton bunch crossing interval during 2015 and 2016. The full dataset corresponds to an integrated luminosity of 36.1 fb −1 after requiring that all detector subsystems were operational during data recording. The uncertainty in the combined 2015+2016 integrated luminosity is 3.2%. It is derived following a methodology similar to that detailed in Ref. [24] from a preliminary calibration of the luminosity scale using x-y beam-separation scans performed in August 2015 and May 2016. Each event includes on average 13.7 and 24.9 inelastic pp collisions ("pile-up") in the same bunch crossing in the 2015 and 2016 dataset, respectively. In the zero-lepton channel, events are required to pass an E miss T trigger [25]. This trigger is fully efficient for events passing the preselection defined in Section 5, which requires the offline reconstructed E miss T to exceed 200 GeV. Events in the onelepton channel, as well as events used for control regions, are selected online by a trigger requiring the presence of one electron or muon. The online selection thresholds are such that a plateau of the efficiency is reached for charged-lepton transverse momenta of 27 GeV and above.
Monte Carlo (MC) samples of simulated events are used to model the signal and to aid in the estimation of SM background processes, except multijet processes, which are estimated from data only.
All simulated samples were produced using the ATLAS simulation infrastructure [26] using GEANT4 [27], or a faster simulation [28] based on a parameterization of the calorimeter response and GEANT4 for the other detector systems. The simulated events are reconstructed with the same algorithm as that used for data. 2 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point in the centre of the detector.
The positive x-axis is defined by the direction from the interaction point to the centre of the LHC ring, with the positive y-axis pointing upwards, while the beam direction defines the z-axis. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity η is defined in terms of the polar angle θ by η = − ln tan(θ/2). Rapidity is defined as y = 0.5 ln[(E + p z )/(E − p z )] where E denotes the energy and p z is the component of the momentum along the beam direction. Several samples produced without detector simulation are employed to estimate systematic uncertainties associated with the specific configuration of the MC event generators used for the nominal SM background samples. They include variations of the renormalization and factorization scales, the CKKW-L matching scale, as well as different PDF sets and fragmentation/hadronization models. Details of the MC modelling uncertainties are discussed in Section 7.

Event reconstruction
The search for pair production of bottom and top squarks is based on two distinct selections of events with b-jets and large missing transverse momentum, with either no charged leptons in the final state, or requiring exactly one electron or muon (for details, see Section 5). For the zero-lepton channel selection, events containing charged leptons are explicitly vetoed in the signal and validation regions. Events characterized by the presence of exactly one electron or muon with transverse momentum above 27 GeV are retained in the one-lepton selection and are also used to define control regions for the zero-lepton channel. Finally, same-flavour opposite-sign (SFOS) two-lepton (electron or muon) events with dilepton invariant mass near the Z boson mass are used for control regions employed to aid in the estimation of the Z+jets background for the zero-lepton channel. The details of the reconstruction and selection, as well as the overlap removal procedure are given below.
Selected events are required to have a reconstructed primary vertex consistent with the beamspot envelope and to consist of at least two tracks in the inner detector with p T > 0.4 GeV. When more than one such vertex is found, the one with the largest sum of the squares of transverse momenta of associated tracks [55] is chosen.
Jet candidates are reconstructed from three-dimensional energy clusters [56] in the calorimeter using the anti-k t jet algorithm [57, 58] with a radius parameter of 0.4. The reconstructed jets are then calibrated to the particle level by the application of a jet energy scale (JES) derived from √ s = 13 TeV data and simulation [59]. Quality criteria are imposed to identify jets arising from non-collision sources or detector noise, and any event containing such a jet is removed [60]. Further track-based selections are applied to reject jets with p T < 60 GeV and |η| < 2.4 that originate from pile-up interactions [61], and the jet momentum is corrected by subtracting the expected average energy contribution from pile-up using the jet area method [62]. Jets are classified as "baseline" and "signal". Baseline jets are required to have p T > 20 GeV and |η| < 4.8. Signal jets, selected after resolving overlaps with electrons and muons, are required to pass the stricter requirement of p T > 35 GeV and |η| < 2.8.
Jets are identified as b-jets if tagged by a multivariate algorithm which uses information about the impact parameters of inner detector tracks matched to the jet, the presence of displaced secondary vertices, and the reconstructed flight paths of b-and c-hadrons inside the jet [63]. The b-tagging working point with a 77% efficiency, as determined in a sample of simulated tt events, was chosen as part of the optimization procedure. The corresponding rejection factors against jets originating from c-quarks and from light quarks and gluons at this working point are 6.2 and 134, respectively [64]. To compensate for differences between data and MC simulation in the b-tagging efficiencies and mis-tag rates, correction factors are derived from data and applied to the samples of simulated events [63]. Candidate b-jets are required to have p T > 20 GeV and |η| < 2.5.
Electron candidates are reconstructed from energy clusters in the electromagnetic calorimeter matched to a track in the inner detector and are required to satisfy a set of "loose" quality criteria [65][66][67]. They are also required to lie within the fiducial volume |η| < 2.47. Muon candidates are reconstructed by matching tracks in the inner detector with tracks in the muon spectrometer. Events containing one or more muon candidates that have a transverse (longitudinal) impact parameter with respect to the primary vertex larger than 0.2 mm (1 mm) are rejected to suppress muons from cosmic rays. Muon candidates are also required to satisfy "medium" quality criteria [68] and have |η| < 2.5. All electron and muon candidates must have p T > 10 GeV. Lepton candidates remaining after resolving overlaps with baseline jets (see next paragraph) are called "baseline" leptons. In the control and signal regions where lepton identification is required, "signal" leptons are chosen from the baseline set with p T > 27 GeV to ensure full efficiency of the trigger and are required to be isolated from other activity in the detector using a criterion designed to accept at least 95% of leptons from Z boson decays as detailed in Ref. [69]. The angular separation between the lepton and the b-jet arising from a semi-leptonically decaying top quark narrows as the top quark's p T increases. This increased collimation is accounted for by varying the radius of the isolation cone as max(0.2, 10 GeV/p lep T ), where p lep T is the lepton p T . Signal electrons are further required to satisfy "tight" quality criteria. Electrons (muons) are matched to the primary vertex by requiring the transverse impact parameter (d 0 ) to satisfy |d 0 |/σ(d 0 ) < 5 (3), and the longitudinal impact parameter (z 0 ) to satisfy |z 0 sin θ| < 0.5 mm for both the electrons and muons. The MC events are corrected to account for differences in the lepton trigger, reconstruction and identification efficiencies between data and MC simulation.
The sequence to resolve overlapping electrons, muons and jets begins by removing electron candidates sharing an inner detector track with a muon candidate. Next, jet candidates within ∆R = (∆y) 2 + (∆φ) 2 = 0.2 of an electron candidate are discarded, unless the jet is b-tagged, in which case the electron is discarded since it is likely to originate from a semileptonic b-hadron decay. Electrons are discarded if they lie within ∆R = 0.4 of a jet. Muons with p T below (above) 50 GeV are discarded if they lie within ∆R = 0.4 (∆R = 0.04 + 10 GeV/p T ) of any remaining jet, except for the case where the number of tracks associated with the jet is less than three.
The missing transverse momentum is defined as the negative vector sum of the p T of all selected and calibrated physics objects (electrons, muons and jets) in the event, with an extra term added to account for soft energy in the event which is not associated with any of the selected objects. This soft term is calculated from inner detector tracks with p T above 0.4 GeV matched to the primary vertex to make it more robust against pile-up contamination [70,71].
Reconstructed photons are not used in the main signal event selections but are selected in the regions employed in one of the alternative methods used to estimate the Z+jets background, as explained in Section 6. Photon candidates are required to have p T > 10 GeV and |η| < 2.37, whilst being outside the transition region 1.37 < |η| < 1.52, to satisfy the tight photon shower shape and electron rejection criteria [72]. The photons used in this analysis are further required to have p T > 145 GeV and to be isolated.

Event selection
Two sets of signal regions (SRs) are defined and optimized to target different third-generation squark decay modes and mass hierarchies of the particles involved. The zero-lepton channel SRs (b0L) are designed to maximize the efficiency to retain bottom-squark pair production events whereb 1 → bχ 0 1 . The one-lepton channel selections (b1L) target SUSY models where bottom squarks decay with a significant branching ratio asb 1 → tχ ± 1 and the lightest chargino is almost degenerate with the lightest neutralino. With these assumptions, the final decay products of the off-shell W boson fromχ ± 1 →χ 0 1 W * are too soft to be detected. If the branching ratios of the two competing decay modes (bχ 0 1 , tχ ± 1 ) are around 50%, the final state for the largest fraction of signal events is characterized by the presence of a top quark, a bottom quark, and neutralinos escaping the detector. Similarly,t 1 pair production can lead to an equivalent final state if thet 1 → tχ 0 1 andt 1 → bχ ± 1 decay modes compete.

Discriminating variables
Several kinematic variables and angular correlations, built from the physics objects defined in the previous section, are employed to discriminate SUSY from SM background events and are reported below. In the following, signal jets are used and are ordered according to decreasing p T .
• ∆φ • H T : This is defined as the scalar sum of the p T of all jets in the event where the number of jets involved depends on the signal region. In addition, the modified form of H T , referred to as the H T4 variable, is used to reject events with extra-jet activity in signal regions targeting models characterized by small mass-splitting between the bottom squark and the neutralino. In H T4 the sum starts with the fourth jet (if any).
• m eff : This is defined as the scalar sum of the p T of the jets and the E miss T , i.e.: The m eff observable is correlated with the mass of the pair-produced SUSY particles and is employed as a discriminating variable in some of the zero-lepton and one-lepton channel selections, as well as in the computation of other composite observables.
• E miss T /m eff , E miss T / √ H T : The first ratio is the E miss T divided by the m eff , while the second emulates the global E miss T significance, given that the E miss T resolution scales approximately with the square root of the total hadronic energy in the event. Events with low values for these variables are rejected as it is most probable that E miss T arises from jets mismeasurements, caused by instrumental and resolution effects.
• m j j : This variable is calculated as the invariant mass of the leading two jets. In events where at least one of the leading jets is b-tagged, this variable aids in reducing the contamination from tt events. It is referred to as m bb for events where the two leading jets are b-tagged.  The minimum invariant mass of the lepton and one of the two b-jets is defined as: This variable is bound from above by m 2 t − m 2 W for tt production, and it is used to distinguish tt contributions from Wt-channel single-top-quark events in the one-lepton control regions.
• Contransverse mass (m CT ) [73]: This is the main discriminating variable in most of the zero-lepton channel signal regions [74]. It is used to measure the masses of pair-produced semi-invisibly decaying heavy particles. For identical decays of two heavy particles (e.g. the bottom squarks decaying exclusively asb 1 → bχ 0 1 ) into two visible particles v 1 and v 2 (the b-quarks), and two invisible particles X 1 and X 2 (theχ 0 1 for the signal), m CT is defined as with E T = p 2 T + m 2 , and it has a kinematical endpoint at m max CT = (m 2 i − m 2 X )/m i where i is the initially pair-produced particle. This variable is extremely effective in suppressing the top-quark pair production background (i = t, X = W), for which the endpoint is at 135 GeV.
• m min T (jet 1−4 , E miss T ) : This is the minimum of the transverse masses calculated using any of the leading four jets and the E miss T in the event. For signal scenarios with low values of m max CT , this kinematic variable is an alternative discriminating variable to reduce the tt background.
• am T2 : The asymmetric transverse mass [75, 76] is a kinematic variable which can be used to separate processes in which two decays giving missing transverse momentum occur, and it is the main discriminating observable in the one-lepton channel signal regions. The am T2 definition is based on the stransverse mass (m T2 ) [77]: where p T (v i ) are reconstructed transverse momenta vectors and q (i) T represent the missing transverse momenta from the two decays, with a total missing transverse momentum, p miss T ; χ is a free parameter representing the unknown mass of the invisible particles -here assumed to be zero. The a in am T2 indicates that the two visible decay legs are asymmetric, i.e. not composed of the same particles.
In the case of events with one lepton (electron or muon) and two b-jets, the m T2 variable is calculated for different values of p T (v 1 ) and p T (v 2 ), by grouping the lepton and the two b-jets into two visible objects v 1 and v 2 . The lepton needs to be paired with one of the two b-jets and the choice is driven by the value of m b (n) -the invariant mass of the n th b-tagged jet and the lepton. If the two particles are correctly associated, this value has an upper bound given by the top quark mass. The value of am T2 is thus computed accordingly: -If m b (1) and m b (2) are both > 170 GeV, neither of the two associations is compatible with the b-jet and the lepton originating from a top decay, so the event is rejected since all control, validation and signal regions require the smaller value of m b to be < 170 GeV. - This is done because only the first pairing is compatible with a top quark decay.
-If m b (1) and m b (2) are both < 170 GeV, am T2 is calculated in both configurations and its value is taken to be the smaller of the two. This must be done because, according to the m b check, both pairings would be acceptable.
• A : This is the p T asymmetry of the leading two jets and is defined as: The A variable is employed in scenarios where the mass-splitting between the bottom squark and the neutralino is small (< 20 GeV) and the selection exploits the presence of a high-momentum jet from initial-state radiation (ISR).

Zero-lepton channel selections
The selection criteria for the zero-lepton channel SRs are summarized in Table 1 and have the main requirement of no baseline leptons with p T > 10 GeV and two b-tagged jets. To exploit the kinematic properties over the large range ofb 1 andχ 0 1 masses explored, three sets of SRs are defined. The b0L-SRA regions are optimized to be sensitive to models with large mass-splitting between theb 1 and theχ 0 1 , ∆m(b 1 ,χ 0 1 ) > 250 GeV. Incremental thresholds are imposed on the main discriminating variable, m CT , resulting in three overlapping regions (m CT >350, 450 and 550 GeV). Only events with E miss T > 250 GeV are retained to ensure full efficiency of the trigger and comply with the expected signal topology. The two leading jets are required to be b-tagged whilst contamination from backgrounds with high jet multiplicity, particularly tt production, is suppressed by vetoing events with a fourth jet with p T > 50 GeV. To discriminate against multijet background, events where E miss T is aligned with a jet in the transverse plane are rejected by requiring min[∆φ(jet 1−4 , E miss T )] > 0.4, and E miss T /m eff > 0.25. A selection on the invariant mass of the two b-jets (m bb > 200 GeV) is applied to further enhance the signal yield over the SM background contributions.
The b0L-SRB region targets intermediate mass-splitting betweenb 1 andχ 0 1 , 50 < ∆m(b 1 ,χ 0 1 ) < 250 GeV. In these scenarios, the selections based on the m CT and m bb variables are no longer effective and the variable m min T (jet 1−4 , E miss T ) is employed to reduce SM background contributions from tt production, with events selected if m min T (jet 1−4 , E miss T ) > 250 GeV. No more than four signal jets are allowed, to reduce additional hadronic activity in the selected events. As opposed to the b0L-SRA criteria, no veto based on the fourth jet p T is applied. A series of selections on the azimuthal angle between the two b-tagged jets and the E miss Finally, the b0L-SRC region targets events where a bottom squark pair is produced in association with a jet from ISR. This selection provides sensitivity to models with a small mass difference between thẽ b 1 and theχ 0 1 , ∆m(b 1 ,χ 0 1 ) < 50 GeV, such that a boosted bottom squark pair would satisfy the trigger requirements. To efficiently suppress tt and W+jets backgrounds, events are selected with one high-p T non-b-tagged jet and E miss T > 500 GeV such that ∆φ( j 1 , E miss T ) > 2.5. Stringent requirements on the minimum azimuthal angle between the jets and E miss T are not suited for these scenarios where b-jets have softer momenta and are possibly aligned with E miss T . A large asymmetry A is required to reduce the multijet background while loosening the selection on the minimum azimuthal angle between the jets and E miss T to min[∆φ(jet 1−2 , E miss T )] >0.2, and relaxing the p T threshold on signal jets to 20 GeV. Table 1: Summary of the event selection in each signal region for the zero-lepton channel. For SRA, the "x" denotes the m CT selection used. The term lepton is used in the table to refer to baseline electrons and muons. Jets ( j 1 , j 2 , j 3 , j 4 and j 5 ) are labelled with an index corresponding to their decreasing order in p T .

One-lepton channel selections
The selection criteria for the one-lepton channel SRs are summarized in Table 2. Events are required to have exactly one signal electron or muon and no additional baseline leptons, two b-tagged jets and a large E miss T . Similarly to the zero-lepton channel, three sets of SRs are defined to maximize the sensitivity depending on the mass hierarchy betweenb 1 (t 1 ) andχ ± 1 ≈χ 0 1 . The b1L-SRA regions are optimized for models with large ∆m(b 1 ,χ 0 1 ): events are required to have large E miss T and E miss T / √ H T and ∆φ j min above 0.4 to reduce the multijet background contributions to negligible levels. Requirements on the m T and am T2 variables to be above 140 GeV and 250 GeV, respectively, are set to reject W+jets and tt events whilst the selection on the invariant mass of the two b-jets (m bb > 200 GeV) is applied to further enhance the signal yield over the SM background contributions. Two incremental thresholds are finally imposed on m eff (600 and 750 GeV) to define two overlapping signal regions.
The b1L-SRB region is designed to be sensitive to compressed mass spectra, hence low m bb is expected, and the selections on the m T and am T2 variables must be relaxed to avoid loss of signal events.
The min[m T (b-jet, E miss T )] is employed to discriminate signal from tt events, which is the dominant SM background contribution.
A third region, referred to as b1L-SRA300-2j, is defined similarly to the b1L-SRAs but requiring no extra jets beside the two b-jets and m eff above 300 GeV. Such a selection also targets SUSY models characterized by compressed mass spectra. It is kinematically similar to the signal region in the Run-1 analysis [20] with a veto requirement on the number of jets with p T > 50 GeV. Table 2: Summary of the event selection in each signal region for the one-lepton channel. For SRA, the "x" denotes the m eff selection used. The term lepton is used in the table to refer to signal electrons and muons. Jets ( j 1 , j 2 ) are labelled with an index corresponding to their decreasing order in p T .

Background estimation
Monte Carlo simulation is used to estimate the background yield in the signal regions. The MC prediction for the major backgrounds is normalized to data in control regions (CR) constructed to enhance a particular background and to be kinematically similar but mutually exclusive to the signal regions. The control regions are defined by explicitly requiring the presence of one or two leptons (electrons or muons) in the final state together with further selection criteria similar to those of the corresponding signal region. To ensure that the b0L and b1L analyses can be statistically combined, the CRs associated with b0L and b1L SRs are mutually exclusive, with the exception of the single-top CR, where the same CR is used for both channels.
The expected SM backgrounds are determined separately for each SR with a profile likelihood fit [78], referred to as the background-only fit. The fit uses as a constraint the observed event yields in a set of associated CRs to adjust the normalization of the main backgrounds, assuming that no signal is present. The inputs to the fit for each SR include the number of events observed in its associated CRs and the number of events predicted by simulation in each region for all background processes. The latter are described by Poisson statistics. The systematic uncertainties in the expected values are included in the fit as nuisance parameters. They are constrained by Gaussian distributions with widths corresponding to the sizes of the uncertainties and are treated as correlated, when appropriate, between the various regions. The product of the various probability density functions forms the likelihood, which the fit maximizes by adjusting the background normalization and the nuisance parameters. Finally, the reliability of the MC extrapolation of the SM background estimate outside of the control regions is evaluated in several validation regions (VRs).

Background estimation in the zero-lepton signal regions
The main SM background in the b0L signal regions is from the production of Z+jets followed by invisible decays of the Z boson. The production of top quark pairs, single top quarks and W+jets also results in important backgrounds, with their relative contributions depending on the specific SR considered. Full details of the CR definitions are given in Tables 3 and 4.
< 50 < 50 < 50 < 50 --b-jets j 1 and j 2 j 1 and j 2 j 1 and j 2 j 1 any 2 any 2 any 2 E miss Three same-flavour opposite-sign (SFOS) two-lepton (electron or muon) control regions with dilepton invariant mass near the Z boson mass (76 < m < 106 GeV) and two b-tagged jets provide data samples dominated by Z boson production. Signal leptons are considered, with the threshold for the second lepton p T loosened to 20 GeV. For these control regions, labelled in the following as b0L-CRzA, b0L-CRzB and b0L-CRzC, the p T of the leptons is added vectorially to the p miss T to mimic the expected missing transverse momentum spectrum of Z → νν events, and is indicated in the following as E miss,cor T (lepton corrected). In addition, a selection is applied to the uncorrected E miss T of the event, in order to further enhance the Z boson contribution. Table 4: Summary of the event selection in each control region corresponding to b0L-SRC. The term lepton is used in the table to refer to signal electrons and muons. Jets ( j 1 , j 2 , j 3 and j 4 ) and leptons ( 1 and 2 ) are labelled with an index corresponding to their decreasing order in p T .

b0L-CRzC CRttC CRwC
Number of leptons ( = e, µ) > 250 > 500 > 500 b-jets j 2 and ( j 3 or j 4 ) j 2 and ( j 3 or j 4 ) Events with one charged lepton in the final state are used to define control regions dominated by W+jets and top quark production by requiring either one or two b-tagged jets, respectively. Selections on the variable m T are used to ensure that the lepton originates from a W decay. For the CRs corresponding to b0L-SRA, the contribution from tt and single top quark production are separated by applying the selection m bb < 200 GeV and m bb > 200 GeV, respectively. To further enhance the single-top-quark contribution, a selection on the minimum invariant mass of the lepton and one of the b-jets, m min b > 170 GeV is applied. For the CRs corresponding to b0L-SRB, selections on the azimuthal angle between the b-jets and the E miss T value are applied to enhance the tt and W+jets contributions, while the single-top-quark background is estimated from MC simulation. The CRs corresponding to the b0L-SRC are defined with one or two b-jets to enhance the tt and W+jets contributions, respectively. Finally, the single top quark production is estimated using the MC normalization.
The contributions from dibosons (WW, WZ, ZZ), tt production associated with W and Z bosons, and other rare backgrounds are estimated from MC simulation for both the signal and the control regions and included in the fit procedure, and are allowed to vary within their normalization uncertainty. The background from multijet production is estimated from data using a procedure described in detail in Ref.
[79] and modified to account for the heavy flavour of the jets. The contribution from multijet production in all regions is found to be negligible.
In total, four CRs are defined for the b0L-SRA to estimate the contributions from W+jets, Z+jets, tt and single top quark production independently, while three CRs are defined for each of the b0L-SRB and b0L-SRC to estimate W+jets, Z+jets and tt. The E miss T distribution in b0L-CRwA and b0L-CRzC is shown in Figures 2(a) and 2(b), where good agreement with the SM prediction is achieved after the backgroundonly fit. The yields in all these CRs are shown in Figure 3 and compared to the MC predictions before the likelihood fit is performed, including only the statistical uncertainty of the MC samples. The bottom panel shows the value of the normalization factors, µ, used for each of the backgrounds fitted and given taking into account statistical and detector-related systematic uncertainties.
As a further validation, two alternative methods are used to estimate the Z+jets contribution. The first method exploits the similarity of the Z+jets and γ+jets processes [79]. For a photon with p T significantly larger than the mass of the Z boson, the kinematics of γ+jets events strongly resemble those of Z+jets events. A set of dedicated control regions is defined by requiring one isolated photon with p T > 145 GeV. The p T of the photon is vectorially added to the p miss T , and the magnitude of this sum is used to replace the E miss T -based selections. The yields are then propagated to the SRs using a reweighting factor derived using the MC simulation. This factor takes into account the different kinematics of the two processes and residual effects arising from the different geometrical acceptance and reconstruction efficiency for photons. In the second alternative method, applied to b0L-SRA only, the MC simulation is used to verify that the shape of the m CT distribution for events with no b-tagged jets is compatible with the shape of the m CT distribution for events where two b-tagged jets are present. A new highly populated Z+jets CR is defined, selecting Z → events with no b-tagged jets. The m CT distribution in this CR is constructed using the two leading jets and is used to estimate the shape of the m CT distribution in the b0L-SRA, whilst the normalization in SRA is rescaled based on the ratio in data of Z → events with no b-tagged jets to events with two b-tagged jets. Additional MC-based corrections are applied to take into account the two-lepton selection in this CR. The two alternative methods are in agreement within uncertainties with the estimates obtained with the profile likelihood fit to the control regions. Experimental and theoretical systematic uncertainties in the estimates from the nominal and alternative methods are taken into account (see Section 7).

Background estimation in the one-lepton signal regions
The main SM background in the b1L signal regions is the production of tt and single-top-quark events in the Wt channel. Two control regions (b1L-CRttA and b1L-CRttB) where the tt production is enhanced are defined by inverting the am T2 selection. In the case of b1L-CRttA the m bb selection is also inverted, while for b1L-CRttB the min[m T (b-jet, E miss T )] requirement is inverted. To allow a statistical combination of the results from the b0L-SRA and b1L-SRA regions the corresponding tt CRs are defined to be orthogonal via the m CT selection. The single-top-quark contribution is estimated with the same CR employed by the b0L analysis. In the case of b1L-SRB the production of W+jets is no longer negligible, and is estimated by using a dedicated control region b1L-CRwB, where only one b-tagged jet is required. In total, two CRs are used to estimate the event yields in b1L-SRA and three CRs to estimate the yields in b1L-SRB. Full details of the CR selections are given in Table 5 The yields in all these CRs are also shown in Figure 3 and compared to the direct MC prediction before the likelihood fit is performed. The normalization parameters reported for each SR and SM background process include the statistical and detector-related systematic uncertainties. The decrease of the µ tt parameter from SRA to SRC is related to mismodelling in the description of tt processes by Powheg +Pythia 6 MC samples. Previous analyses [80] also found normalization factors considerably smaller than unity for tt background processes in similar regions of phase space. The W+jets and Z+jets normalization factors are larger than unity. This is possibly related to the fact that in the default Sherpa 2.2.1 the heavy-flavour production fractions are not consistent with the measured values [81]. Table 5: Summary of the event selection in each control region corresponding to the b1L signal regions. The term lepton is used in the table to refer to signal electrons and muons. Jets ( j 1 , j 2 , j 3 and j 4 ) are labelled with an index corresponding to their decreasing order in p T .

Validation regions
The results of the background-only fit to the CRs are extrapolated to a set of VRs defined to be similar to the SRs, with some of the selection criteria modified to enhance the background contribution, while maintaining a small signal contribution. For each SR, one or more VRs are defined starting from the SR definition and inverting or changing some of the selections as summarized in Table 6.
The number of events predicted by the background-only fit is compared to the data in the upper panel of Figure 4. The pull, defined by the difference between the observed number of events (n obs ) and the predicted background yield (n pred ) divided by the total uncertainty (σ tot ), is shown for each region in the lower panel. No evidence of significant background mismodelling is observed in the VRs.

Systematic uncertainties
Several sources of experimental and theoretical systematic uncertainty in the signal and background estimates are considered in these analyses. Their impact is reduced through the normalization of the dominant backgrounds in the control regions defined with kinematic selections resembling those of the corresponding signal region (see Section 6). Experimental and theoretical uncertainties are included as nuisance parameters with Gaussian constraints in the likelihood fits, taking into account correlations between different regions. Uncertainties due to the numbers of events in the CRs are also introduced in the fit for each region. The dominant contributions are summarized in Table 7.  Tables 1 and 2, with a few selection requirements changed (right column) to ensure the selection has low efficiency for the expected signal.      Uncertainties in the modelling of the SM background processes from MC simulation and their theoretical cross-section uncertainties are also taken into account. The dominant uncertainty arises from Z+jets MC modelling for b0L-SRs and tt and single-top modelling (collectively referred to as "Top production" in Table 7) for b1L-SRs. The Z+jets (as well as W+jets) modelling uncertainties are estimated by considering different merging (CKKW-L) and resummation scales using alternative samples, PDF variations from the NNPDF30NNLO replicas [46], as well as an envelope formed from seven-point scale variations of the renormalization and factorization scales. The various components are added in quadrature. A 40% uncertainty [83] is assigned to the heavy-flavour jet content in W+jets background, which is estimated from MC simulation in the one-lepton channel control regions. For b0L-SRA, b0L-SRC and b1L-SRB the uncertainty accounts for the different requirements on b-jets between the signal regions and the corresponding control regions.  Theoretical and modelling uncertainties of the top quark pair and single-top-quark (Wt) backgrounds are computed as the difference between the prediction from nominal samples and those from additional samples differing in generator or parameter settings. Hadronization and PS uncertainties are estimated using samples generated using Powheg-Box v2 and showered by Herwig++ v2.7.1 [84] with the UEEE5 [85] underlying-event tune. Uncertainties related to initial-and final-state radiation modelling, PS tune and (for tt only) choice of h damp parameter in Powheg-Box v2 are estimated using alternative settings of the event generators. Finally, an alternative generator MadGraph5_aMC@NLO with showering by Herwig++ v2.7.1 is used to estimate the event generator uncertainties. One additional uncertainty stems from the modelling of the interference between the tt and Wt processes at NLO. Predictions from an inclusive WWbb sample generated at LO using MadGraph5_aMC@NLO are compared with the sum of the tt and Wt predictions, and differences from the nominal predictions are taken as systematic uncertainties.
Uncertainties in backgrounds such as diboson and ttV are also estimated by comparisons of the nominal sample with alternative samples differing in generator or parameter settings (Powheg v2 with showering by Pythia v8.210 for dibosons; renormalization and factorization scale and A14 tune variations for ttV) and contribute less than 5% to the total uncertainty. The cross-sections used to normalize the MC yields to the highest order available are varied according to the scale uncertainty of the theoretical calculation. The cross-section uncertainties are 5% for W boson, Z boson and top quark pair production, 6% for dibosons, and 13% and 12% for ttW and ttZ, respectively. Finally, a conservative 100% systematic uncertainty associated to the multijet background estimate is considered and found to have a negligible effect.
For the SUSY signal processes, both the experimental and theoretical uncertainties in the expected signal yield are considered. Experimental uncertainties are found to be between 15% and 30% across theb 1 -χ 0 1 mass plane for exclusiveb 1 → bχ 0 1 decays and between 10% and 25% for models where bottom squarks decay with a significant branching ratio asb 1 → tχ ± 1 , assuming the one-lepton channel selection. In all SRs, they are largely dominated by the uncertainty in the b-tagging efficiency. Theoretical uncertainties in the NLO+NLL cross-section are calculated for each SUSY signal scenario and are dominated by the uncertainties in the renormalization and factorization scales, followed by the uncertainty in the PDF. They vary between 15% and 25% for bottom squark masses in the range between 400 GeV and 1100 GeV. Additional uncertainties in the acceptance and efficiency due to the modelling of initial-state radiation and scale variations in SUSY signal MC samples are also taken into account and contribute up to about 10%. Tables 8 and 9 report the observed number of events and the SM prediction after the background-only fit for each signal region in the zero-lepton and one-lepton channels, respectively. The background-only fit results are compared to the pre-fit predictions based on MC simulation. The largest background contribution in b0L-SRs arises from Z → νν produced in association with b-quarks followed by W+jets production, whilst top quark and W+jets production dominates SM predictions for b1L-SRs. The results are also summarized in Figure 5, where the pulls for each of the SRs are also presented. No significant excess above the expected Standard Model background yield is observed, although b1L-SRA300-2j presents a discrepancy between data and SM predictions of about 1.5σ. Figure 6 shows the comparison between the observed data and the SM predictions for some relevant kinematic distributions for the b0L and b1L selections. For illustrative purposes, the distributions expected for scenarios with different bottom squark and neutralino masses depending on the SR considered are shown.

Results and interpretation
The results are translated into upper limits on contributions from physics beyond the SM (BSM) for each signal region. The CL s method [86,87] is used to derive the confidence level of the exclusion; signal models with a CL s value below 0.05 are said to be excluded at 95% CL. The profile-likelihood-ratio test statistic is used to exclude the signal-plus-background hypothesis for specific signal models. S 95 obs (S 95 exp ) is the observed (expected) upper limit at 95% CL on the number of events from BSM phenomena for each signal region. These limits, when normalized by the integrated luminosity of the data sample, may be interpreted as upper limits on the visible cross-section of BSM physics, σ vis , defined as the product of the production cross-section, the acceptance and the selection efficiency of a BSM signal. Table 10 summarizes S 95 obs , S 95 exp , and σ vis for all SRs, together with the p 0 -values, which represent the probability of the SM background alone to fluctuate to the observed number of events or higher.
Exclusion limits are obtained assuming two types of SUSY particle mass hierarchy such that the lightest bottom squark decays either exclusively viab 1 → bχ 0 1 or into multiple channels,b 1 → bχ 0 1 andb 1 → tχ ± 1 , assuming a 50% branching ratio and ∆m(χ ± 1 ,χ 0 1 ) ∼ 1 GeV. The first set of scenarios is targeted by the zerolepton channel SRs only. For models with mixed decays, the expected limits from the SRs are compared and the observed limits are obtained by statistically combining the most sensitive zero-lepton SR with the Table 8: Fit results in the b0L signal regions. The background normalization parameters are obtained from the fit in the control regions and are applied to the SRs. Smaller backgrounds such as diboson, ttV, multijet and rare processes are indicated as "Others". The individual uncertainties, including statistical, detector-related and theoretical systematic components, are symmetrized and can be correlated. They do not necessarily add in quadrature to the total systematic uncertainty.   most sensitive one-lepton SR. In all cases, the fit procedure takes into account correlations in the yield predictions between control and signal regions due to common background normalization parameters and systematic uncertainties. The experimental systematic uncertainties in the signal are taken into account for this calculation and are assumed to be fully correlated with those in the SM background.
For the exclusiveb 1 → bχ 0 1 decay mode, at each point of the parameter space the SR with the best expected sensitivity is used. Sensitivity to scenarios with the largest mass difference between theb 1 and theχ 0 1 is achieved with the most stringent m CT threshold (b0L-SRA550). Sensitivity to scenarios with intermediate and small mass differences is obtained with the dedicated b0L-SRB and b0L-SRC selections, respectively. For the mixed-decays scenarios, a statistical combination is computed with the results of the zero-lepton and one-lepton channels as explained above. A combined fit is performed simultaneously on the control and signal regions of the two analyses. The best sensitivity to regions of the (b 1 ,χ 0 1 ) mass plane close to the kinematic boundaries is obtained with the combination of the b0L-SR450 and b1L-SRB regions whilst stringent constraints on models with large mass differences are achieved with the results from the combination of zero-lepton and one-lepton SRs with the most stringent m CT and m eff thresholds, respectively.     Figure 7: (a) Observed and expected exclusion contours at 95% CL, as well as ±1σ variation of the expected limit, in theb 1 -χ 0 1 mass plane. The SR with the best expected sensitivity is adopted for each point of the parameter space. The yellow band around the expected limit (dashed line) shows the impact of the experimental and SM background theoretical uncertainties. The dotted lines show the impact on the observed limit of the variation of the nominal signal cross-section by ±1σ of its theoretical uncertainties. (b) Same, for scenarios where multiple decay modes are considered for bottom squarks. The statistical combination of b0L-SRs and b1L-SRs is used in this case.

Conclusion
The results of a search for pair production of bottom and top squarks are reported. The analysis uses 36.1 fb −1 of pp collisions at √ s = 13 TeV collected by the ATLAS experiment at the Large Hadron Collider in 2015 and 2016. Third-generation squarks are searched for in events containing large missing transverse momentum and jets, exactly two of which are identified as b-jets. Selections are defined with either no charged leptons (electrons and muons) in the final state, or one charged lepton. Zero-lepton channel signal regions target R-parity-conserving models in which theb 1 is the lightest squark and is assumed to decay exclusively viab 1 → bχ 0 1 , whereχ 0 1 is the lightest neutralino. One-lepton channel signal regions target models where bottom or top squarks are produced and can decay into multiple channels,b 1 → bχ 0 1 andb 1 → tχ ± 1 , ort 1 → tχ 0 1 andt 1 → bχ ± 1 , whereχ ± 1 is the lightest chargino and the mass difference mχ± 1 − mχ0 1 is set to 1 GeV. No significant excess above the expected Standard Model background is found and exclusion limits at 95% confidence level are placed on the visible cross-section and on the mass of the bottom (or top) squark. Bottom squark masses up to 950 GeV are excluded forχ 0 1 masses below 420 GeV in models with exclusive decay modes. Bottom or top squark masses up to 860 GeV are excluded forχ 0 1 masses below 250 GeV in models with mixed decay modes with equal branching ratios. The results significantly improve upon previous Run-1 and Run-2 searches at the ATLAS experiment and strengthen the constraints on bottom and top squark masses.
(Taiwan), RAL (UK) and BNL (USA), the Tier-2 facilities worldwide and large non-WLCG resource providers. Major contributors of computing resources are listed in Ref. [88].