Search for phenomena beyond the Standard Model in events with large b-jet multiplicity using the ATLAS detector at the LHC

A search is presented for new phenomena in events characterised by high jet multiplicity, no leptons (electrons or muons), and four or more jets originating from the fragmentation of b-quarks (b-jets). The search uses 139 fb$^{-1}$ of $\sqrt{s}$ = 13 TeV proton–proton collision data collected by the ATLAS experiment at the Large Hadron Collider during Run 2. The dominant Standard Model background originates from multijet production and is estimated using a data-driven technique based on an extrapolation from events with low b-jet multiplicity to the high b-jet multiplicities used in the search. No significant excess over the Standard Model expectation is observed, and 95% confidence-level limits that constrain simplified models of R-parity-violating supersymmetry are determined. The exclusion limits reach 950 GeV in top-squark mass in the models considered.


Introduction
Events with a large number of high-transverse-momentum ($p_T$) jets originating from the fragmentation of b-quarks (b-jets) are rarely produced by Standard Model (SM) processes in proton–proton (pp) collisions at the LHC. As a result, this signature can provide sensitivity to certain phenomena beyond the SM (BSM) [1][2][3]. Event signatures with five or more b-jets, no leptons (electrons or muons) and without any requirements on missing transverse momentum are not covered by existing searches at the LHC. Supersymmetry (SUSY) provides an extension to the SM by introducing partners of the known bosons and fermions. It predicts the existence of superpartner states (with different statistics) associated with each of the SM particles and fields. The lightest among such superpartners (LSP) may or may not be stable, depending on the conservation of R-parity [4][5][6]. Final states with high leptonic or hadronic multiplicity are commonly predicted by R-parity-violating (RPV) SUSY. Models of RPV SUSY do not provide stable superpartners, and they give rise to a wide variety of experimental signatures whose nature depends on which of the many RPV couplings are non-zero.
In the analysis presented here, a particular benchmark model is considered in order to interpret the measurements in the different jet and b-jet multiplicity regions. The process under consideration is the pair production of the top squark as the lightest of the coloured SUSY partners. The existence of light SUSY partners of third-generation quarks, bottom squarks ($\tilde{b}$) and top squarks ($\tilde{t}$), is favoured by naturalness considerations [7,8]. The scenario assumes the LSP to be a triplet of two neutralino ($\tilde{\chi}^0_1$, $\tilde{\chi}^0_2$) and one chargino ($\tilde{\chi}^\pm_1$) states that are mass-degenerate and carry dominantly higgsino components (in the following collectively referred to as "higgsinos"). The top squark decays either into a chargino, $\tilde{\chi}^\pm_1$, and a bottom quark or into a neutralino, $\tilde{\chi}^0_{1,2}$, and a top quark. The chargino and neutralino decay, respectively, to bbs and tbs quark triplets, as shown in Fig. 1; this decay is mediated through their higgsino components via the non-zero baryon-number-violating RPV coupling $\lambda''_{323}$ [9,10].
Previous searches targeting RPV SUSY models of pair-produced top squarks decaying through the coupling $\lambda''_{323}$ have been carried out by the ATLAS and CMS collaborations. Those searches already exclude top-squark masses in the ranges 100 GeV ≤ $m_{\tilde{t}}$ ≤ 470 GeV and 480 GeV ≤ $m_{\tilde{t}}$ ≤ 610 GeV (ATLAS [12]), and 80 GeV ≤ $m_{\tilde{t}}$ ≤ 270 GeV, 285 GeV ≤ $m_{\tilde{t}}$ ≤ 340 GeV and 400 GeV ≤ $m_{\tilde{t}}$ ≤ 505 GeV (CMS [13]) in scenarios where the top squark is the LSP and decays directly via $\tilde{t} \to bs$. For direct top-squark production with $\lambda''_{323}$-mediated decays in higgsino-LSP scenarios, ATLAS has excluded top-squark masses up to 1.10 TeV, depending on the higgsino mass considered, in the region where $m_{\tilde{t}} - m_{\tilde{\chi}^0_{1,2},\tilde{\chi}^\pm_1} \geq m_{\text{top}}$, by analysing lepton-plus-jets events [11]. CMS has excluded top-squark masses between 100 and 720 GeV for top-squark decays into four quarks in boosted topologies and with the mass of the higgsinos set to 75% of the squark mass [14].
This analysis considers events with six or more jets, of which at least four are identified as b-jets (b-tagged). There must be no identified electron or muon, and no requirement is made on the missing transverse momentum. In this channel, the dominant background is the non-resonant production of multijet events, referred to as 'multijet' in the following, and a data-driven method is applied to estimate its yield. Other backgrounds arise from top-quark pair production accompanied by extra b-jets or by a Z or Higgs boson decaying into a b-quark pair. Results are reported as 95% confidence level (CL) exclusion limits on the top-squark mass in the benchmark models described above. Model-independent limits on the possible contribution of BSM physics are also evaluated at large jet and b-tagged jet multiplicities.

ATLAS detector
The ATLAS experiment [15] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and a near 4π coverage in solid angle.¹ It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer (MS). The inner tracking detector covers the pseudorapidity range |η| < 2.5. It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. An additional innermost layer of the silicon pixel tracker, the insertable B-layer [16,17], was installed in 2014 at an average radial distance of 3.3 cm from the beam-line to improve track reconstruction and flavour identification of quark-initiated jets. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic energy measurements with high granularity. A steel/scintillator-tile calorimeter provides hadronic energy measurements and covers the central pseudorapidity range (|η| < 1.7). The endcap and forward regions are instrumented with LAr calorimeters for both the electromagnetic and hadronic energy measurements up to |η| = 4.9. The muon spectrometer surrounds the calorimeters and is based on three large air-core toroidal superconducting magnets with eight coils each. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector.
The muon spectrometer includes a system of precision tracking chambers and fast detectors for triggering. A two-level trigger system is used to select events to be recorded. The first-level trigger is implemented in hardware and uses a subset of the detector information to accept events at a rate of at most 100 kHz. This is followed by a software-based high-level trigger (HLT) that reduces the accepted event rate to ∼1.2 kHz, on average.

¹ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle θ as η = −ln tan(θ/2). Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
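As an illustration of the coordinate conventions used throughout, the following minimal Python sketch evaluates the pseudorapidity from the polar angle and an angular distance between two directions. The angular distance is assumed here to be the conventional $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$; the function names are illustrative and are not taken from any ATLAS software.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2) for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2, wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, assumed to be sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

if __name__ == "__main__":
    # A direction perpendicular to the beam has eta close to 0.
    print(pseudorapidity(math.pi / 2.0))
    # The acceptance edge |eta| = 2.5 corresponds to a polar angle of about 9.4 degrees.
    print(math.degrees(2.0 * math.atan(math.exp(-2.5))))
```

The wrapping in `delta_phi` matters for objects on either side of φ = ±π, which would otherwise appear artificially far apart.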

Data collection and simulated event samples
This search is based on 139 fb$^{-1}$ of pp collision data at a centre-of-mass energy of $\sqrt{s}$ = 13 TeV, collected between 2015 and 2018, that satisfy beam, detector and data-quality criteria. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [18], obtained using the LUCID-2 detector [19] for the primary luminosity measurements. The average number of interactions ($\langle\mu\rangle$) in the same and nearby bunch crossings (pile-up) varies from $\langle\mu\rangle$ = 13.4 (2015 dataset) to $\langle\mu\rangle$ = 36.1 (2018 dataset), with a highest $\langle\mu\rangle$ = 37.8 (2017 dataset) and an average $\langle\mu\rangle$ = 33.7. Data were collected using a four-jet trigger which, in the HLT, requires four jets each having |η| < 2.5, with $p_T$ > 100 GeV for the 2015-2016 data period and $p_T$ > 120 GeV for the 2017-2018 data period. Data events used for the validation of the data-driven multijet background were collected using the lowest unprescaled single-lepton triggers; the lowest trigger $p_T$ threshold used for muons is 20 (26) GeV in 2015 (2016-2018), while for electrons the trigger $p_T$ threshold is 24 (26) GeV in 2015-2017 (2018).
Monte Carlo (MC) simulations are used to model the SUSY signals, as well as to aid in the description of the background processes. In the remainder of this section, the simulation of the signal and of the main background processes contributing to the selected events in data is described. For all the simulated physics processes, the top-quark mass is assumed to be $m_{\text{top}}$ = 172.5 GeV and the Higgs boson mass is taken to be $m_H$ = 125 GeV. The generation of the simulated event samples includes the effect of multiple pp interactions in the same and neighbouring bunch crossings, as well as the effect of pile-up on the detector response. These interactions were produced using Pythia 8.230 [20].

MC samples for multijet production were generated using Pythia 8.230 with leading-order matrix elements for dijet production and a $p_T$-ordered parton shower. EvtGen v1.6.0 was used for bottom and charm hadron decays. The renormalisation and factorisation scales were set to the geometric mean of the squared transverse masses of the two outgoing partons, $\sqrt{(p_{T,1}^2 + m_1^2)(p_{T,2}^2 + m_2^2)}$.

The production of $t\bar{t}$ events (referred to as $t\bar{t}$ + jets) was modelled using the Powheg-Box v2 [30-33] generator at next-to-leading order (NLO) with the NNPDF3.0 NLO [34] PDF set and with the $h_{\text{damp}}$ parameter² set to 1.5 $m_{\text{top}}$ [35]. Pythia 8.230 was used for the parton shower and EvtGen v1.6.0 for bottom and charm hadron decays. The $t\bar{t}$ + jets sample was generated inclusively in the number of jets using fast simulation.

² The $h_{\text{damp}}$ parameter is a resummation damping factor and one of the parameters that controls the matching of Powheg matrix elements to the parton shower and thus effectively regulates the high-$p_T$ radiation against which the $t\bar{t}$ system recoils.
The MC sample cross-section is corrected to the theory prediction at next-to-next-to-leading order (NNLO) in QCD including resummation of next-to-next-to-leading logarithmic (NNLL) soft gluon terms by means of the Top++ (v2.0) program [36][37][38][39][40][41][42]. The generated events may have jets which do not originate from the decay of the $t\bar{t}$ system. These additional jets are used to categorise the events depending on the flavour of the matching parton. Particle jets are reconstructed from all stable particles generated in the event (excluding muons and neutrinos) using the anti-$k_t$ algorithm [43] with a radius parameter R = 0.4 and are required to have $p_T$ > 15 GeV and |η| < 2.5. Events having at least one such particle jet, matched within $\Delta R$ < 0.3 to a generated b-hadron having $p_T$ > 5 GeV and not originating from a top-quark decay, are labelled as $t\bar{t}$+≥1b events. Similarly, events which are not already categorised as $t\bar{t}$+≥1b, and where at least one particle jet is matched to a c-hadron not originating from a W boson decay, are labelled as $t\bar{t}$+≥1c events. Events labelled as either $t\bar{t}$+≥1b or $t\bar{t}$+≥1c are referred to as $t\bar{t}$ + HF events (HF for 'heavy flavour'). The remaining events, including those with no additional jets, are labelled as $t\bar{t}$ + light events (light for 'light flavour').
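The labelling logic described above can be summarised in a short Python sketch. It assumes that the hadron matching has already been done upstream and is exposed through two hypothetical boolean flags per particle jet, `extra_b` and `extra_c`; these names, and the dictionary layout, are illustrative only.

```python
def label_ttbar_event(particle_jets):
    """Return the flavour label of a simulated ttbar event.

    particle_jets: list of dicts with keys (all hypothetical names)
      'pt'      : transverse momentum in GeV
      'eta'     : pseudorapidity
      'extra_b' : True if matched (dR < 0.3) to a b-hadron with pT > 5 GeV
                  that does not originate from a top-quark decay
      'extra_c' : True if matched to a c-hadron that does not originate
                  from a W-boson decay
    """
    # Particle-jet acceptance quoted in the text: pT > 15 GeV and |eta| < 2.5.
    selected = [j for j in particle_jets
                if j["pt"] > 15.0 and abs(j["eta"]) < 2.5]
    # The b label takes precedence over the c label.
    if any(j["extra_b"] for j in selected):
        return "tt+>=1b"
    if any(j["extra_c"] for j in selected):
        return "tt+>=1c"
    # Everything else, including events with no additional jets.
    return "tt+light"

def is_heavy_flavour(label):
    """tt+HF groups the >=1b and >=1c categories."""
    return label in ("tt+>=1b", "tt+>=1c")
```

The ordering of the two `if` statements encodes the requirement that an event is only considered for the ≥1c category when it has not already been labelled ≥1b.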
The Wt single-top-quark background was generated at NLO in QCD by Powheg-Box v2 with the NNPDF3.0 NLO PDF set. Overlaps between the $t\bar{t}$ and Wt final states were removed using the 'diagram removal' scheme [44]. Pythia 8.230 was used for the parton shower and EvtGen v1.6.0 for bottom and charm hadron decays. Samples of single-top events are normalised to the cross-section calculated at NLO in QCD with NNLL soft gluon corrections [45,46]. The production of $t\bar{t}H$ events was modelled using the Powheg-Box v2 generator at NLO with the NNPDF3.0 NLO PDF set. Pythia 8.230 was used for the parton shower and EvtGen v1.6.0 for bottom and charm hadron decays. The cross-sections are calculated at NLO QCD and NLO electroweak accuracy using the generator MadGraph5_aMC@NLO [48].
Signal events were produced using the MadGraph5_aMC@NLO v2.3.3 generator at NLO with the NNPDF2.3 LO PDF, and the fast simulation of the detector response. Pythia 8.230 was used for the parton shower and EvtGen v1.6.0 for bottom and charm hadron decays. Signal cross-section calculations include approximate next-to-next-to-leading-order (NNLO$_{\text{Approx}}$) supersymmetric QCD corrections and the resummation of soft gluon emission at NNLL accuracy [49]. The nominal cross-section and its uncertainty are taken from an envelope of predictions using different PDF sets as well as different factorisation and renormalisation scales. Top-squark masses between 600 GeV and 1 TeV and higgsino masses between 100 GeV and 950 GeV are considered.

Event reconstruction
Events are required to have a primary vertex reconstructed from at least two tracks with transverse momentum $p_T$ > 500 MeV. When several vertices are found in a given bunch crossing, the vertex with the largest summed $p_T^2$ of the associated tracks is selected as the primary vertex.
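A minimal Python sketch of this vertex choice is given below. It assumes each vertex candidate is represented simply by the list of its track $p_T$ values in GeV, and that only tracks above the 500 MeV threshold enter the $p_T^2$ sum; both the data layout and that second point are assumptions made for illustration.

```python
def select_primary_vertex(vertices, min_track_pt=0.5, min_tracks=2):
    """Return the index of the primary vertex, or None if no candidate qualifies.

    vertices: list of vertex candidates, each a list of track pT values in GeV.
    A candidate needs at least `min_tracks` tracks with pT > `min_track_pt`;
    among the qualifying candidates, the one with the largest sum of pT^2 wins.
    """
    best_index, best_sum = None, -1.0
    for index, track_pts in enumerate(vertices):
        good = [pt for pt in track_pts if pt > min_track_pt]
        if len(good) < min_tracks:
            continue  # too few tracks above threshold
        sum_pt2 = sum(pt * pt for pt in good)
        if sum_pt2 > best_sum:
            best_index, best_sum = index, sum_pt2
    return best_index
```

For example, for the candidates `[[0.6, 0.7, 0.8], [5.0, 0.4], [3.0, 2.0]]` the second candidate is rejected because only one of its tracks exceeds 500 MeV, and the third is chosen because its summed $p_T^2$ (13 GeV²) exceeds that of the first (1.49 GeV²).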
Electrons are reconstructed from energy deposits (clusters) in the electromagnetic calorimeter matched to tracks reconstructed in the ID [50,51] and are required to have $p_T$ > 10 GeV and |η| < 2.47. Candidates in the calorimeter barrel-endcap transition region (1.37 < |η| < 1.52) are excluded. Electron tracks must match the primary vertex of the event: the longitudinal impact parameter³ is required to satisfy $|z_0|$ < 0.5 mm, while the transverse impact parameter is required to satisfy $|d_0|/\sigma_{d_0}$ < 5, where $\sigma_{d_0}$ represents the uncertainty in the measured $|d_0|$ values. Loose electrons are identified using the 'Medium' identification criterion provided by a likelihood-based discriminant [52]. Tight elec-

³ The transverse impact parameter ($d_0$) is defined as the distance of closest approach in the transverse plane between a track and the beam-line. The longitudinal impact parameter ($z_0$) corresponds to the z-coordinate difference between the point along the track at which the transverse impact parameter is defined and the primary vertex.
Muons are reconstructed by matching either track segments or full tracks in the MS to tracks in the ID [53]. Combined tracks are then re-fitted using information from both detector systems. Muon tracks must match the primary vertex of the event: the longitudinal impact parameter is required to satisfy $|z_0|$ < 0.5 mm, while the transverse impact parameter is required to satisfy $|d_0|/\sigma_{d_0}$ < 3. Loose muons are those that pass the 'Loose' muon selection [53] and have $p_T$ > 10 GeV and |η| < 2.5, and Tight muons are those that pass the 'Medium' muon selection [53], satisfy the 'FixedCutTightTrackOnly' isolation criterion [53], and have $p_T$ > 27 GeV.
Jets are reconstructed from three-dimensional topological energy clusters [54] in the calorimeter using the anti-$k_t$ jet algorithm [43] with a radius parameter of 0.4. Reconstructed jets are then corrected to the particle level by the application of a jet energy scale calibration that is derived from simulation and by in situ corrections obtained from 13 TeV data [55]. Jets used in this analysis are required to have $p_T$ > 25 GeV and |η| < 2.5 after calibration.
To avoid selecting jets from pile-up, low-$p_T$ ($p_T$ < 120 GeV) jets in the central (|η| < 2.5) region of the detector are required to satisfy the jet-vertex tagger [56] configured such that it has an efficiency of approximately 92% to identify jets from a primary vertex. This requirement is applied to both data and simulation. Quality criteria are imposed to identify jets arising from non-collision sources or detector noise (using the BadLoose operating point [57]), and any event containing at least one such jet is removed. This removal produces a negligible loss of efficiency for signal events.
The b-jets are identified via a b-tagging algorithm that uses multivariate techniques to combine information from the impact parameters of displaced tracks as well as topological properties of secondary and tertiary decay vertices reconstructed within the jet. This analysis uses the MV2c10 tagger [58], trained on a hybrid sample of simulated $t\bar{t}$ and Z events statistically enriched at high $p_T$ in order to discriminate b-jets from a background consisting of light- (93%) and c-labelled (7%) jets [29]. A weight is calculated corresponding to the probable presence of a b-quark or a c-quark, and jets are considered b-tagged if they satisfy a minimum requirement on the MV2c10 b-tagging weight corresponding to an average efficiency in $t\bar{t}$ events of 60% for b-jets, 4% for c-jets and a rejection factor of approximately 1200 for light-flavour jets across the jet $p_T$ range.
An overlap removal procedure is carried out to resolve ambiguities between jets and lepton candidates. To prevent treating electron energy deposits as jets, the closest jet within $\Delta R_y = \sqrt{(\Delta y)^2 + (\Delta\phi)^2} = 0.2$ of a selected electron is removed.⁴ If the nearest jet surviving that selection is within $\Delta R_y$ = 0.4 of the electron, the electron is discarded. To reduce the background from heavy-flavour decays inside jets, muons are removed if they are separated from the nearest jet by $\Delta R_y$ < 0.4. However, if that jet has fewer than three associated tracks, the muon is kept and the jet is removed instead.
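The three steps of the overlap removal can be sketched as follows. This is a simplified illustration, not the analysis implementation: objects are assumed to be dictionaries with hypothetical keys `y` (rapidity), `phi` and, for jets, `n_tracks`, and the order in which electrons and muons are processed is an assumption.

```python
import math

def delta_r_y(a, b):
    """Rapidity-based angular distance sqrt(dy^2 + dphi^2)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["y"] - b["y"], dphi)

def nearest_jet(obj, jets):
    """Closest jet to obj, or None if there are no jets."""
    return min(jets, key=lambda j: delta_r_y(obj, j), default=None)

def overlap_removal(electrons, muons, jets):
    """Return (kept_electrons, kept_muons, kept_jets)."""
    jets = list(jets)
    # Step 1: drop the closest jet within 0.2 of each electron.
    for el in electrons:
        jet = nearest_jet(el, jets)
        if jet is not None and delta_r_y(el, jet) < 0.2:
            jets = [j for j in jets if j is not jet]
    # Step 2: discard electrons within 0.4 of a surviving jet.
    kept_el = []
    for el in electrons:
        jet = nearest_jet(el, jets)
        if jet is None or delta_r_y(el, jet) >= 0.4:
            kept_el.append(el)
    # Step 3: muons within 0.4 of a jet are dropped, unless the jet has
    # fewer than three tracks, in which case the jet is dropped instead.
    kept_mu = []
    for mu in muons:
        jet = nearest_jet(mu, jets)
        if jet is None or delta_r_y(mu, jet) >= 0.4:
            kept_mu.append(mu)
        elif jet["n_tracks"] < 3:
            jets = [j for j in jets if j is not jet]
            kept_mu.append(mu)
    return kept_el, kept_mu, jets
```

Jets are removed by identity (`is not`) rather than by value so that two jets with identical kinematics are never removed together by accident.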

Analysis strategy
Events selected for further analysis are required to have at least five jets, of which at least two must be b-tagged. The four highest-$p_T$ jets are required to be on the trigger efficiency plateau, namely to have $p_T$ > 120 GeV or $p_T$ > 140 GeV, depending on the jet-$p_T$ trigger requirement in 2015-2016 or 2017-2018, respectively, and to have |η| < 2.5. All other jets present in the event are required to have $p_T$ > 25 GeV and |η| < 2.5. A lepton veto is applied: events that contain loose muons or electrons with $p_T$ > 10 GeV, whether isolated or non-isolated, are discarded.
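The preselection can be expressed compactly as a single predicate. The sketch below is illustrative: the dictionary keys are hypothetical, and the association of the 120 GeV plateau threshold with 2015-2016 and the 140 GeV threshold with 2017-2018 is read off from the order in which the values are quoted in the text.

```python
def passes_preselection(jets, n_loose_leptons, year):
    """Apply the jet, b-tag, plateau and lepton-veto requirements.

    jets: list of dicts with hypothetical keys 'pt' (GeV), 'eta', 'btag' (bool).
    n_loose_leptons: number of loose electrons or muons with pT > 10 GeV.
    year: data-taking year, used to pick the plateau threshold (assumed mapping).
    """
    plateau = 120.0 if year in (2015, 2016) else 140.0
    # Baseline jet acceptance, ordered by decreasing pT.
    good = sorted((j for j in jets if j["pt"] > 25.0 and abs(j["eta"]) < 2.5),
                  key=lambda j: j["pt"], reverse=True)
    if n_loose_leptons > 0:
        return False  # lepton veto
    if len(good) < 5:
        return False  # at least five jets
    if any(j["pt"] <= plateau for j in good[:4]):
        return False  # four leading jets must be on the trigger plateau
    return sum(1 for j in good if j["btag"]) >= 2  # at least two b-tags
```

An event with jets of 200, 180, 160, 130 and 30 GeV, two of them b-tagged, would pass with `year=2016` but fail with `year=2017`, because the fourth jet falls below the higher plateau threshold.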
After the selections described above, the largest background contribution to the measurement is from non-resonant multijet production from light-quark and gluonic final states. The next largest is from $t\bar{t}$+jets production. Other small background contributions originate from the production of a single top quark and from the production of a $t\bar{t}$ pair in association with either a vector boson or a Higgs boson. The estimation of the multijet background using a data-driven method and the validation of this estimate without significant bias from potential signal contamination are the main challenges for this analysis.
To probe top-squark pair production and estimate the contribution of signal top squarks in data, a model-dependent fit of the yield of events with jet multiplicity $N_j$ = 6, 7, 8 and ≥ 9 and b-tagged jet multiplicity $N_b$ = 4 and ≥ 5 is performed. These ($N_j$, $N_b$) regions are indicated as SRt in Table 1. The signal contribution predicted for different values of $m_{\tilde{t}}$ and $m_{\tilde{\chi}^0_{1,2},\tilde{\chi}^\pm_1}$ is considered in all bins and is scaled by one common signal-strength parameter ($\mu_{\tilde{t}\tilde{t}^*}$). For the model considered here, the product of acceptance and reconstruction efficiency ($A \times \varepsilon$) is of order 5 × 10$^{-2}$ for $N_j$ ≥ 9 and $N_b$ ≥ 5. Figure 2 shows the number of signal events obtained from the model as a function of $N_j$ and $N_b$ compared to the estimated backgrounds. Their evaluation is described in Sect. 6. The signal yields are concentrated at high jet and b-tagged jet multiplicity, while the backgrounds are concentrated at low b-tagged jet multiplicity. To validate the background estimates, regions with $N_j$ = 6, 7, 8 and ≥ 9, and $N_b$ = 3 and 4, subsequently referred to as VR-MJ, are used. In these, a region-dependent selection is applied, based on a maximum accepted value of the centrality mass ($C_{\text{mass}}$), defined as

$$C_{\text{mass}} = \frac{H_T}{m_{\text{jets}}},$$

i.e. the ratio of the scalar sum of all jet $p_T$ in the event ($H_T$) to the invariant mass of the set of observed jets ($m_{\text{jets}}$). The signal-to-background ratio decreases monotonically with decreasing $C_{\text{mass}}$ for all $N_j$ and $N_b$ values. The maximum value of $C_{\text{mass}}$ ($C^{\max}_{\text{mass}}$) is chosen such that the signal-to-background ratio is less than 5%. The $C^{\max}_{\text{mass}}$ values used are listed in Table 1.

⁴ The rapidity is defined as $y = \frac{1}{2} \ln \frac{E + p_z}{E - p_z}$, where E is the energy and $p_z$ is the longitudinal component of the momentum along the beam-line.
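For concreteness, the centrality mass described above, the scalar sum of jet $p_T$ divided by the invariant mass of the jet system, can be computed as in the following sketch. Jets are assumed to be given as ($p_T$, η, φ, m) tuples in GeV; this input layout and the function name are illustrative choices.

```python
import math

def centrality_mass(jets):
    """Ratio of H_T (scalar sum of jet pT) to the invariant mass of all jets.

    jets: list of (pt, eta, phi, mass) tuples, with pt and mass in GeV.
    """
    h_t = sum(pt for pt, _, _, _ in jets)
    e_tot = px_tot = py_tot = pz_tot = 0.0
    for pt, eta, phi, mass in jets:
        px = pt * math.cos(phi)
        py = pt * math.sin(phi)
        pz = pt * math.sinh(eta)
        e_tot += math.sqrt(px * px + py * py + pz * pz + mass * mass)
        px_tot += px
        py_tot += py
        pz_tot += pz
    m2 = e_tot ** 2 - (px_tot ** 2 + py_tot ** 2 + pz_tot ** 2)
    # Guard against tiny negative values from floating-point rounding.
    m_jets = math.sqrt(max(m2, 0.0))
    return h_t / m_jets
```

Two massless back-to-back jets at η = 0 give a value of 1, while moving them to η = ±1 lowers the value to 1/cosh(1) ≈ 0.65, which illustrates why central, high-$p_T$ topologies populate large values of this variable.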
A separate, model-independent test is used to search for, and to set generic exclusion limits on, potential contributions from a hypothetical BSM signal by comparing the observed number of events with background predictions in two dedicated signal regions, one with $N_j$ ≥ 9 and $N_b$ ≥ 5 and the other with $N_j$ ≥ 8 and $N_b$ ≥ 5 (labelled SR$_{\text{discovery}}$ in Table 1), that were not explored in previous searches at the LHC.

Multijet background estimation
The predominant multijet background is estimated via a data-driven method, subsequently referred to as the tag-rate function method for multijet events (TRF$_{\text{MJ}}$) [59,60]. The aim is to extrapolate the b-tag multiplicity distributions from $N_j$ = 5, where the signal contamination for models not already excluded by other LHC searches is negligible, to larger $N_j$ values. The TRF$_{\text{MJ}}$ method uses a tag-rate function to quantify the experimental probability of b-tagging an additional jet in samples of events with at least two, or at least three, b-tagged jets. This per-jet probability is then used to estimate the shape of the multijet b-tag multiplicity distribution for each $N_j$ value.
Events that satisfy the selection criteria described in Sect. 5 and that have exactly five jets, of which at least two are b-tagged, are used to determine the b-tagging probability. The data are first corrected by subtracting the expected non-multijet background found in simulation, approximately 5% of the total. After excluding the two jets in each event with the highest b-tagging weight, the probability that each remaining jet is b-tagged, denoted $\varepsilon_2$, is calculated. A similar procedure is used to calculate the probability $\varepsilon_3$ of additional b-tagged jets in events with at least three b-tagged jets. These ε probabilities are parameterised as a function of both the $p_T$ of the remaining jet divided by $H_T$, and the minimum $\Delta R$ between that jet and the two (for $\varepsilon_2$) or three (for $\varepsilon_3$) jets with the largest b-tagging weight in the event ($\Delta R_{\min}$). This choice of variables for the parameterisation is made to minimise the residual differences (non-closure) between the TRF$_{\text{MJ}}$ prediction and the number of events obtained when selecting b-jets directly in the most sensitive signal regions in the multijet events simulated by MC.

Table 1 The strategy of the analysis. For the model-dependent fit, the signal regions (SRt) consist of events with $N_j$ = 6, 7, 8 and ≥ 9 jets and $N_b$ = 4 and ≥ 5. These are used independently in the final fit. For the model-independent fit, two dedicated signal regions (SR$_{\text{discovery}}$), with ($N_j$ ≥ 9, $N_b$ ≥ 5) and ($N_j$ ≥ 8, $N_b$ ≥ 5), are used. The validation regions (VR-MJ), which are based on a maximum value of the centrality mass, $C^{\max}_{\text{mass}}$, introduced for the description of the VRs in Sect. 5, are also indicated.

Fig. 2 Predicted numbers of events as a function of jet multiplicity, $N_j$, and b-tagged jet multiplicity, $N_b$, for (a) SM background (multijet and top-quark production) and (b) top-squark pair production in the $\tilde{t} \to b\tilde{\chi}^+_1$ ($\tilde{\chi}^+_1 \to bbs$) (and c.c.) channel, for $m_{\tilde{t}}$ = 1000 GeV and $m_{\tilde{\chi}^\pm_1}$ = 950 GeV.
The dependence of $\varepsilon_2$ and $\varepsilon_3$ on both $p_T/H_T$ and $\Delta R_{\min}$ is shown in Fig. 3. The rapid variation with $\Delta R_{\min}$ is consistent with the dependence expected from multi-b-jet production due to gluon splitting. The $p_T/H_T$ dependence, more visible at small $\Delta R_{\min}$, reflects the variation of the b-tagging efficiency with jet $p_T$.
Following the methods of Ref. [61], in the second step of the TRF$_{\text{MJ}}$ method the expected number of events with each different number of b-tagged jets is estimated for each $N_j$ value by weighting all events with $N_b$ ≥ 2 by the event probability of having $N_b$ = 2, 3, 4 and ≥ 5, respectively. After subtracting the non-multijet background contribution [59], the event probabilities are estimated using both $\varepsilon_2$ and $\varepsilon_3$, after first excluding the two jets with the highest b-tagging weight. For $N_b$ = 2 the event probabilities are estimated directly from $\varepsilon_2$, treating the tagging probability for each jet as independent. For $N_b$ = 3, 4 and ≥ 5, a two-step procedure is employed. First, a 'pseudo-data sample' with $N_b$ ≥ 3 is emulated, using $\varepsilon_2$ in events with $N_b$ ≥ 2. The additional emulated b-tagged jet is chosen randomly from the remaining $N_j$ − 2 jets by using their probability-dependent b-tagging weights [60]. This emulated sample is then used to estimate the event probabilities, this time relying on $\varepsilon_3$. The probability of finding $N_b$ = 4 and $N_b$ ≥ 5 is estimated using the emulated $N_b$ ≥ 3 sample via $\varepsilon_3$. Because there are too few events in the control sample from which the $\varepsilon_2$ and $\varepsilon_3$ values are extracted, it is not possible to estimate the probability of b-tagging an additional jet in a sample of events with at least four b-tagged jets.
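The role of the per-jet probabilities can be illustrated with a deliberately simplified Python sketch. It treats the tagging of each remaining jet as independent, with a single set of per-jet probabilities, and computes the probability of finding exactly k additional tags. This corresponds to the independent-jet treatment described for $N_b$ = 2, but it does not reproduce the two-step procedure with an emulated $N_b$ ≥ 3 sample and $\varepsilon_3$; the function names and the use of one probability set for all multiplicities are assumptions made for illustration.

```python
def extra_tag_distribution(eps):
    """Probability of exactly k additional b-tags among the remaining jets.

    eps: per-jet tagging probabilities for the jets left after removing the
    two with the highest b-tagging weight. Jets are treated as independent.
    Returns a list p with p[k] = P(k additional tags), k = 0 .. len(eps).
    """
    dist = [1.0]
    for e in eps:
        new = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            new[k] += p * (1.0 - e)   # this jet is not tagged
            new[k + 1] += p * e       # this jet is tagged
        dist = new
    return dist

def nb_weights(eps):
    """Event weights for N_b = 2, 3, 4 and >= 5 (simplified, single-step)."""
    dist = extra_tag_distribution(eps)
    dist = dist + [0.0] * max(0, 4 - len(dist))  # pad so indices 0..3 exist
    return {2: dist[0], 3: dist[1], 4: dist[2], "5+": sum(dist[3:])}
```

For a five-jet event with three remaining jets of tagging probability 0.1 each, `nb_weights([0.1, 0.1, 0.1])` assigns a weight of 0.729 to $N_b$ = 2, and the four weights always sum to one, so the method redistributes events among b-tag multiplicities without changing the total yield.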

Validation of TRF MJ method
The TRF$_{\text{MJ}}$ method is validated using two different comparisons with data: in the VR-MJ regions defined in Sect. 5, and in a separate set of Z + jets-enriched events. Figure 4 shows

An independent test of the method is performed in Z + jets-enriched events, referred to as 'VR-ZJ', where additional jets are produced by radiation and where $b\bar{b}$ pairs arise from gluon splitting. In order to select events where a Z boson decays into pairs of electrons or muons, events are required to pass a single-lepton trigger. Two opposite-sign, same-flavour tight electrons or muons are required, each with $p_T$ > 27 GeV, and with a pair invariant mass larger than 60 GeV. Events are required to have at least five jets with $p_T$ > 25 GeV and |η| < 2.5, of which at least two must be b-tagged. The tagging probabilities $\varepsilon_2$ and $\varepsilon_3$ are derived from five-jet VR-ZJ events and used to predict the number of events with $N_j$ = 6, 7, 8, ≥ 9 and $N_b$ = 4, ≥ 5. As shown in Fig. 5, this statistically limited test further validates the TRF$_{\text{MJ}}$ method.

Systematic uncertainties
Several sources of systematic uncertainty are considered that can affect the overall normalisation of signal and background samples and their relative contribution for different values of $N_j$ and $N_b$. In estimating the dominant multijet background from the data, systematic uncertainties arise from the assumptions made in obtaining the TRF$_{\text{MJ}}$ background estimates. Uncertainties related to the theoretical modelling and due to the description of the detector response in simulated events are relevant only for the signal and background MC samples.
The main assumption of the TRF$_{\text{MJ}}$ method is that it is possible to define per-jet b-tagging probabilities ($\varepsilon_2$ and $\varepsilon_3$) in events with at least two or at least three b-tagged jets and, in particular, that the variables used for the parameterisation are sensitive to the heavy-flavour composition of the jet sample. A second assumption is that the per-jet probabilities are independent of the jet multiplicity and, therefore, may be derived in a specific region, namely that with exactly five jets, and applied to regions with $N_j$ = 6, 7, 8 and ≥ 9 jets. The validity of these assumptions is verified using MC simulations. The TRF$_{\text{MJ}}$ method is applied to Pythia 8 MC dijet events, and the larger of (a) the residual non-closure and (b) the statistical uncertainty in the number of events with a given b-tagged jet multiplicity is symmetrised and taken to be the systematic uncertainty associated with the method. Table 2 shows the final TRF$_{\text{MJ}}$ systematic uncertainty in the multijet background estimation in each ($N_j$, $N_b$) region. For $N_b$ = 4 the TRF$_{\text{MJ}}$ uncertainties are dominated by the non-closure component, while for $N_b$ ≥ 5, the statistical component dominates. The TRF$_{\text{MJ}}$ uncertainties are the source of the largest systematic uncertainty for the analysis.
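The rule for the method uncertainty, taking the larger of the non-closure and the statistical uncertainty and applying it symmetrically, can be written as a short Python sketch. It assumes that both quantities are expressed relative to the yield obtained by tagging b-jets directly in the simulated sample; that normalisation choice and the function name are assumptions.

```python
def trf_relative_uncertainty(n_predicted, n_direct, stat_unc_direct):
    """Symmetric relative uncertainty assigned to the TRF prediction.

    n_predicted     : yield predicted by the tag-rate method in simulation
    n_direct        : yield obtained by applying b-tagging directly
    stat_unc_direct : absolute statistical uncertainty on n_direct
    The result is to be applied as +/- the returned fraction.
    """
    non_closure = abs(n_predicted - n_direct) / n_direct
    statistical = stat_unc_direct / n_direct
    return max(non_closure, statistical)
```

With a prediction of 110 events against 100 ± 5 from direct tagging, the non-closure (10%) dominates; with 102 against 100 ± 8, the statistical term (8%) is taken instead, mirroring the pattern described above for low and high b-tag multiplicities.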
The second largest contribution to the total systematic uncertainty arises from the modelling of the $t\bar{t}$+jets background. The diagrams that contribute to $t\bar{t}$+≥1b, $t\bar{t}$+≥1c, and $t\bar{t}$+light production are different, and the associated uncertainties may affect these processes differently in different regions. As a result, all uncertainties in $t\bar{t}$+jets background modelling, except the uncertainty in the inclusive cross-section, are considered to be uncorrelated among $t\bar{t}$+≥1b, $t\bar{t}$+≥1c, and $t\bar{t}$+light. The uncertainty in the inclusive $t\bar{t}$ NNLO+NNLL production cross-section is taken to be ±6% [42]. This uncertainty includes effects from varying the factorisation and renormalisation scales, the PDF, $\alpha_S$, and the top-quark mass. The normalisations of the $t\bar{t}$+≥1c and $t\bar{t}$+≥1b yields are taken from their fractional contribution to the nominal $t\bar{t}$+jets sample as generated using the Powheg-Box program. In addition to the uncertainty in the inclusive $t\bar{t}$ cross-section, an additional uncertainty of 50%, based on the measurement of the $t\bar{t}$+≥1b and $t\bar{t}$+≥1c normalisation factors reported in Ref.
The impact of the parton shower and hadronisation model uncertainties on the $t\bar{t}$+jets, $t\bar{t}H$ and Wt single-top-quark yields is evaluated by comparing the sample from the nominal generator set-up with a sample produced with the NLO Powheg-Box v2 generator using the NNPDF3.0 NLO PDF set. The latter events are interfaced with Herwig 7.04 [63,64], using the H7UE set of tuned parameters [64] and the MMHT2014LO PDF set [65], and processed using fast simulation of the detector response. The difference between the two predictions of the $t\bar{t}$+≥1b event yield ranges from 20% (33%) for $N_j$ = 6 and $N_b$ = 4 (≥ 5) to 46% (60%) in the region with $N_j$ ≥ 9 and $N_b$ = 4 (≥ 5).

Table 2 Systematic uncertainties in the data-driven estimation of the multijet background using the TRF$_{\text{MJ}}$ method. The uncertainties are assessed using Pythia 8 MC dijet events for each value of jet multiplicity ($N_j$) and b-tagged jet multiplicity ($N_b$) used in the final fit.
To assess the uncertainty due to the choice of matching scheme, the Powheg-Box sample is compared with a sample produced by MadGraph5_aMC@NLO and Pythia 8. For the calculation of the hard scattering, MadGraph5_aMC@NLO v2.6.0 with the NNPDF3.0 NLO PDF set is used. The events are processed with Pythia 8.230, using the A14 set of tuned parameters and the NNPDF2.3 LO PDF set, and the fast simulation of the detector response. The uncertainty, which is obtained from the difference in yield between the two models and is symmetrised, affects both the normalisation and the $N_j$- and $N_b$-dependence of background rates. It is largest for large values of the jet and b-tagged jet multiplicities. For $t\bar{t}$+≥1b, it reaches 25% for $N_j$ = 8, ≥ 9 and $N_b$ = 4, and 41% (32%) for $N_j$ = 8 (≥ 9) and $N_b$ ≥ 5.
The effect of renormalisation and factorisation scale uncertainties and PDF uncertainties is evaluated for $t\bar{t}H$ and $t\bar{t}V$ events. For the former, the scales are varied simultaneously by common factors of 2.0 and 0.5. For the latter, the envelope of the 100 variations for NNPDF3.0 NLO [34] is taken into account. An uncertainty of ±5% is assigned to the total cross-section for single-top production [45,66,67]. For both the $t\bar{t}H$ and single-top events, additional uncertainties due to initial- and final-state radiation and the choice of generator are evaluated in a manner similar to that used for $t\bar{t}$ + jets. The uncertainty in the amount of interference between Wt and $t\bar{t}$ production at NLO is assessed by comparing samples using the default 'diagram removal' scheme with those using an alternative 'diagram subtraction' scheme [44]. All modelling uncertainties from non-$t\bar{t}$+jets simulated backgrounds are, after investigation, found to be negligible.
The uncertainties assigned to the expected signal yield for the SUSY benchmark processes considered include the experimental uncertainties related to the luminosity and to the detector modelling, which are dominated by the modelling of the jet energy scale and the b-tagging efficiencies. For example, for the $\tilde{t} \to b\tilde{\chi}_1^{+}$ ($\tilde{\chi}_1^{+} \to \bar{b}\bar{b}\bar{s}$ and c.c.) signal model, the b-tagging uncertainties in the region N j ≥ 9 and N b = 4 are approximately 10%, and the jet-related uncertainties of the signal yields are in the range of 3-5%. The uncertainties in the signal yields related to the modelling of additional jet radiation are studied by varying the factorisation, renormalisation, and jet-matching scales as well as the parton-shower tune in the simulation. The corresponding uncertainties are small for most of the signal parameter space and are largest for small top-squark masses, where they reach 7%. The uncertainty in the signal cross-section ranges between 8% and 11% for a top-squark mass in the range 600-1000 GeV.

Results
The events are allocated to (N j , N b ) regions with different signal-to-background ratios in order to constrain systematic uncertainties and to improve the separation of signal and background. Then, in each region, the total signal and background yields, shown in Tables 3 and 4, are used in combination as the input for the statistical analysis to extract the final results.
Hypothesis testing is performed using a modified frequentist method as implemented in RooStats [68] and is based on a profile likelihood which takes into account the systematic uncertainties as nuisance parameters. This procedure minimises the impact of systematic uncertainties on the search sensitivity by taking advantage of the highly populated, background-dominated (N j , N b ) regions included in the likelihood fit. The signal-strength parameter, $\mu_{\tilde{t}\tilde{t}^{*}}$, defined for positive values and corresponding to the signal normalisation, is unconstrained in the profile-likelihood fit. The normalisation of each component of the background and $\mu_{\tilde{t}\tilde{t}^{*}}$ are determined simultaneously from the fit to the data.
Individual sources of systematic uncertainty are taken as uncorrelated. Contributions from tt + ≥ 1b, tt + ≥ 1c, tt + light, tt + V , tt H and single-top-quark backgrounds are constrained by the uncertainties of the respective theoretical calculations, the uncertainty in the luminosity (described in Sect. 3), and experimental data. The TRF MJ uncertainty is taken as uncorrelated across regions because of its large statistical component. In all cases, the profile-likelihood-ratio test is used to establish 95% confidence intervals using the CLs [69] prescription. The likelihood is configured differently for the model-independent and model-dependent hypothesis tests.
For the model-independent test, a profile-likelihood fit is performed independently in the two SR discovery regions with (N j ≥ 8, N b ≥ 5) and (N j ≥ 9, N b ≥ 5). This test is used to search for, and to compute generic exclusion limits on, the potential contribution from a hypothetical BSM signal in the given SR discovery regions.
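As an illustration of the CLs prescription in its simplest form, the sketch below treats one discovery region as a single-bin counting experiment. It is not the RooStats profile-likelihood fit used in the analysis: systematic uncertainties are ignored, the event count itself serves as the test statistic, and the function names, the observed count and the background expectation are all hypothetical.

```python
import math


def poisson_cdf(n_obs: int, mean: float) -> float:
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)  # probability of observing zero events
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k    # recurrence: P(k) = P(k-1) * mean / k
        total += term
    return total


def cls(n_obs: int, background: float, signal: float) -> float:
    """CLs = CL_{s+b} / CL_b, with the event count as test statistic."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def upper_limit(n_obs: int, background: float, alpha: float = 0.05) -> float:
    """Signal yield at which CLs falls to alpha (95% CL for alpha = 0.05)."""
    lo, hi = 0.0, 10.0
    while cls(n_obs, background, hi) > alpha:  # expand until the limit is bracketed
        hi *= 2.0
    for _ in range(100):                       # bisection; CLs decreases with signal
        mid = 0.5 * (lo + hi)
        if cls(n_obs, background, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs, NOT taken from the paper's tables.
n95 = upper_limit(n_obs=5, background=4.2)
print(f"95% CL upper limit on the number of signal events: {n95:.2f}")
```

Dividing by CL_b protects against excluding signals to which the measurement has no sensitivity: with zero observed events the limit stays at −ln(0.05) ≈ 3.0 events whatever the background expectation.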
For the model-dependent test, assuming a specific top-squark model with variable mass values, tests of the signal-plus-background hypothesis, i.e. $\mu_{\tilde{t}\tilde{t}^{*}} = 1$, are performed for a series of values of $m_{\tilde{t}}$ and $m_{\tilde{\chi}^{0}_{1,2},\tilde{\chi}^{\pm}_{1}}$. These are used to derive exclusion limits for the specific top-squark model. The full set of regions, N j = 6, 7, 8 and ≥ 9 and N b = 4 and ≥ 5, is employed in the likelihood. The expected signal contribution, as predicted by the given model, is considered in all regions and is scaled by $\mu_{\tilde{t}\tilde{t}^{*}}$.
Figure 6 shows the observed numbers of data events compared with the fitted background model. The likelihood fit is configured using the model-dependent set-up where all bins are input to the fit, and $\mu_{\tilde{t}\tilde{t}^{*}}$ is set to zero. This configuration is also referred to as the background-only fit and includes no free-floating parameters, only nuisance parameters with Gaussian constraints. An example signal model is also shown in the figure to illustrate the separation between the signal and the background.
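The region layout entering the model-dependent likelihood, and the way the signal strength scales the predicted signal in each region, can be sketched as below. The function names and the yields are hypothetical; only the bin boundaries (N j = 6, 7, 8, ≥ 9 and N b = 4, ≥ 5) are taken from the text.

```python
from typing import Optional, Tuple


def assign_region(n_jets: int, n_bjets: int) -> Optional[Tuple[str, str]]:
    """Map an event to its (N_j, N_b) region, or None if it is outside the fit."""
    if n_jets < 6 or n_bjets < 4:
        return None
    nj_label = str(n_jets) if n_jets <= 8 else ">=9"  # 6, 7, 8 exclusive; 9+ inclusive
    nb_label = "4" if n_bjets == 4 else ">=5"         # 4 exclusive; 5+ inclusive
    return (nj_label, nb_label)


def expected_yield(mu: float, signal: float, background: float) -> float:
    """Expected events in one region: signal scaled by mu, plus background."""
    return mu * signal + background


# Hypothetical event and yields, NOT values from the paper.
print(assign_region(n_jets=10, n_bjets=6))
print(expected_yield(mu=1.0, signal=12.0, background=30.0))  # signal-plus-background
print(expected_yield(mu=0.0, signal=12.0, background=30.0))  # background-only
```

Setting `mu` to zero reproduces the background-only configuration, and `mu = 1` corresponds to the nominal signal-plus-background hypothesis.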

Model-independent interpretation
The model-independent results are calculated from the observed number of events and the background predictions in the two SR discovery regions. The observed number of events and the backgrounds obtained from the fits are shown for both SR discovery regions in Table 5.
Fig. 6 Expected background and observed number of events in different jet and b-tag multiplicity bins. The background is estimated by including all bins in a background-only fit and is plotted separately for each contribution. An example signal yield for $\tilde{t} \to b\tilde{\chi}_1^{+}$ ($\tilde{\chi}_1^{+} \to \bar{b}\bar{b}\bar{s}$ and c.c.) production with $m_{\tilde{t}}$ = 600 GeV and $m_{\tilde{\chi}_1^{\pm}}$ = 550 GeV is overlaid. The bottom panel displays the ratios of data to the total prediction; the uncertainty bars are statistical only. All uncertainties, which can be correlated across bins, are included in the error bands (hatched regions)
Table 5 Fitted background yields in (N j ≥ 8, N b ≥ 5) and (N j ≥ 9, N b ≥ 5) signal regions. The individual background uncertainties can be larger than the total uncertainty due to correlations between parameters
Model-independent 95% CL upper limits on the expected and observed number of BSM events, $N^{95}_{\mathrm{exp}}$ and $N^{95}_{\mathrm{obs}}$, that may contribute to the signal regions are computed from the observed number of events and the fitted background. Normalising these results by the integrated luminosity, L, of the data sample allows them to be interpreted as upper limits on the visible BSM cross-section $\sigma^{95}_{\mathrm{obs}}$, defined as:
$$\sigma^{95}_{\mathrm{obs}} = \sigma_{\mathrm{prod}} \times A \times \epsilon = \frac{N^{95}_{\mathrm{obs}}}{L},$$
where $\sigma_{\mathrm{prod}}$ is the production cross-section, A the acceptance and ε the selection efficiency. The resulting limits are presented in Table 6. In addition, the p 0 values, which quantify the probability that a background-only hypothesis results in a fluctuation giving an event yield equal to or larger than the one observed in the data, are calculated, as are the corresponding Gaussian significance values Z .
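The two conversions used in this interpretation, from an event-count limit to a visible cross-section and from a p 0 value to a Gaussian significance, can be sketched as follows. The function names and the example event-count limit are hypothetical; the integrated luminosity of 139 fb⁻¹ is the value quoted for this data sample, and the significance is taken to be the one-sided Gaussian quantile, which is the usual convention.

```python
from statistics import NormalDist


def visible_cross_section_fb(n95: float, luminosity_fb_inv: float) -> float:
    """Upper limit on the visible cross-section (fb) from a limit on the event count."""
    return n95 / luminosity_fb_inv


def significance(p0: float) -> float:
    """One-sided Gaussian significance Z such that 1 - Phi(Z) = p0, for 0 < p0 < 1."""
    return NormalDist().inv_cdf(1.0 - p0)


# The event-count limit and p0 below are hypothetical, NOT values from Table 6.
sigma_vis = visible_cross_section_fb(n95=13.9, luminosity_fb_inv=139.0)
z = significance(p0=0.5)
print(f"visible cross-section limit: {sigma_vis:.3f} fb, Z = {z:.2f}")
```

A p 0 of 0.5 corresponds to Z = 0, that is, an observation in exact agreement with the background expectation.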

Model-dependent interpretation
For each signal model probed, the fit is configured using the model-dependent set-up, as detailed in the first part of Sect. 8. Figure 7 shows exclusion limits at the 95% confidence level in the top-squark production model when $B(\tilde{t} \to b\tilde{\chi}_1^{+})$ is assumed to be unity. For this model, top-squark masses are excluded up to 950 GeV for chargino masses close to the kinematic threshold for producing this final state. For lower values of the chargino mass, the limit weakens such that for chargino masses of around 200 GeV, the top-squark mass is constrained to be more than 800 GeV. In this phase-space region, the signal is concentrated at lower N j and N b values where the background is larger.
The limits for higgsino LSPs are shown in Fig. 8. In the region $m_{\tilde{t}} - m_{\tilde{\chi}^{0}_{1,2},\tilde{\chi}^{\pm}_{1}} \geq m_{\mathrm{top}}$, the sensitivity of the analysis is lower than in the pure $\tilde{t} \to b\tilde{\chi}_1^{\pm}$ case because contributions to the signal that have one leptonically decaying top quark fail the lepton-veto requirement. The large contribution of the multijet background reduces the present sensitivity relative to a previous ATLAS search that analysed events characterised by the presence of a lepton plus jets [11].

Conclusion
A search for physics beyond the Standard Model in events with high jet multiplicity and a large number of b-tagged jets is described in this paper. The search uses 139 fb$^{-1}$ of $\sqrt{s}$ = 13 TeV proton-proton collision data collected by the ATLAS experiment at the LHC. In contrast to many previous [...]
Table 6 [...] on the number of excess events. The limits are determined for two signal regions, (N j ≥ 8, N b ≥ 5) and (N j ≥ 9, N b ≥ 5). The p 0 value quantifies the probability that the background-only hypothesis would result in a fluctuation that gives an event yield equal to or larger than the one observed in the data, and Z is the corresponding Gaussian significance [70].

Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: "All ATLAS scientific output is published in journals, and preliminary results are made available in Conference Notes. All are openly available, without restriction on use by external parties beyond copyright law and the standard conditions agreed by CERN. Data associated with journal publications are also made available: tables and data from plots (e.g. cross section values, likelihood profiles, selection efficiencies, cross section limits, ...) are stored in appropriate repositories such as HEPDATA (http://hepdata.cedar.ac.uk/). ATLAS also strives to make additional material related to the paper available that allows a reinterpretation of the data in the context of new theoretical models. For example, an extended encapsulation of the analysis is often provided for measurements in the framework of RIVET (http://rivet.hepforge.org/)." This information is taken from the ATLAS Data Access Policy, which is a public document that can be downloaded from http://opendata.cern.ch/record/413.]
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Funded by SCOAP3.