Search for supersymmetry in pp collisions at sqrt(s) = 13 TeV in the single-lepton final state using the sum of masses of large-radius jets

Results are reported from a search for supersymmetric particles in proton-proton collisions in the final state with a single, high transverse momentum lepton; multiple jets, including at least one b-tagged jet; and large missing transverse momentum. The data sample corresponds to an integrated luminosity of 2.3 fb$^{-1}$ at $\sqrt{s}$ = 13 TeV, recorded by the CMS experiment at the LHC. The search focuses on processes leading to high jet multiplicities, such as gluino pair production with $\tilde{g} \to t\bar{t}\tilde{\chi}^0_1$. The quantity $M_J$, defined as the sum of the masses of the large-radius jets in the event, is used in conjunction with other kinematic variables to provide discrimination between signal and background and as a key part of the background estimation method. The observed event yields in the signal regions in data are consistent with those expected for standard model backgrounds, estimated from control regions in data. Exclusion limits are obtained for a simplified model corresponding to gluino pair production with three-body decays into top quarks and neutralinos. Gluinos with a mass below 1600 GeV are excluded at a 95% confidence level for scenarios with low $\tilde{\chi}^0_1$ mass, and neutralinos with a mass below 800 GeV are excluded for a gluino mass of about 1300 GeV. For models with two-body gluino decays producing on-shell top squarks, the excluded region is only weakly sensitive to the top squark mass.


Introduction
Supersymmetry (SUSY) [1][2][3][4][5][6][7][8] is an extension of the standard model (SM) of particle physics that is motivated by several considerations, including the gauge hierarchy problem [9][10][11][12][13][14], the existence of astrophysical dark matter [15][16][17], and the possibility of gauge coupling constant unification at high energy [18][19][20][21][22]. In SUSY models, each SM particle has a corresponding supersymmetric partner (or partners) whose spin differs by one-half, such that fermions are mapped to bosons and vice versa. Gauge quantum numbers are preserved by this symmetry, and to preserve degrees of freedom, a SM spin-1/2 Dirac particle, such as the top quark, has two spin-0 partners, the top squarks. The SUSY partner of the (spin-1) gluon, the massless mediator of the strong interactions in the SM, is the spin-1/2 gluino. In R-parity-conserving models [23,24], SUSY particles are produced in pairs, and the lightest supersymmetric particle (LSP) is stable. If the LSP is the lightest neutralino ($\tilde{\chi}^0_1$), an electrically neutral mixture of the SUSY partners of the neutral electroweak gauge and Higgs bosons, then it has weak interactions only and can in principle account for some or all of the dark matter.
The gauge hierarchy problem has become more urgent with the discovery of the Higgs boson [25][26][27][28][29][30]. Although the SM is conceptually complete, the Higgs boson mass, together with the electroweak scale, is unstable against enormous corrections from loop processes, which pull the Higgs mass to the cutoff scale of the theory, for example, the Planck scale. This outcome can be avoided within the framework of the SM only with extreme fine tuning of the bare Higgs mass parameter, a situation that is regarded as unnatural, although not excluded. This problem suggests that additional symmetries and associated degrees of freedom may be present that ameliorate these effects. So-called natural SUSY models [31][32][33][34], in which sufficiently light SUSY partners are present, are a major focus of current new physics searches at the CERN LHC. In natural models, several of the SUSY partners are constrained to be light [33]: both top squarks, $\tilde{t}_L$ and $\tilde{t}_R$, which have the same electroweak couplings as the left- (L) and right- (R) handed top quarks, respectively; the bottom squark with L-handed couplings ($\tilde{b}_L$); the gluino ($\tilde{g}$); and the Higgsinos ($\tilde{h}$). While the gluino mass is not constrained by naturalness considerations as strongly as the mass of the lighter top squark mass eigenstate, $\tilde{t}_1$, the cross section for gluino pair production is substantially larger than that for top squark pair production, for a given mass. As a consequence, the two types of searches can have comparable sensitivity to these models. Both types of searches are currently of intense interest, and CMS and ATLAS data taken at $\sqrt{s}$ = 8 TeV have provided significant constraints [35] on natural SUSY scenarios.
This study uses the first LHC proton-proton collision data taken by the CMS experiment at $\sqrt{s}$ = 13 TeV to search for gluino pair production. Searches targeting this process in the single-lepton final state using 8 TeV data have been performed by both ATLAS [36,37] and CMS [38]. For $m_{\tilde{g}}$ = 1.5 TeV, somewhat above the highest gluino masses excluded at $\sqrt{s}$ = 8 TeV, the cross section for gluino pair production increases dramatically with center-of-mass energy, from about 0.4 fb at $\sqrt{s}$ = 8 TeV to about 14 fb at $\sqrt{s}$ = 13 TeV [39]. In contrast, the cross section for the dominant background, $t\bar{t}$ production, increases much more slowly, from about 248 pb at $\sqrt{s}$ = 8 TeV to 816 pb at $\sqrt{s}$ = 13 TeV [40]. As a consequence, the sensitivity of this search can be significantly extended with respect to searches performed at $\sqrt{s}$ = 8 TeV, even though the 13 TeV data sample has an integrated luminosity of only 2.3 fb$^{-1}$, roughly one-tenth of that acquired at 8 TeV.
Figure 1: Gluino pair production and decay for the simplified models T1tttt (left) and T5tttt (right). In T1tttt, the gluino undergoes the three-body decay $\tilde{g} \to t\bar{t}\tilde{\chi}^0_1$ via a virtual intermediate top squark. In T5tttt, the gluino decays via the sequential two-body process $\tilde{g} \to \tilde{t}_1\bar{t}$, $\tilde{t}_1 \to t\tilde{\chi}^0_1$. Because gluinos are Majorana particles, each one can decay to $\tilde{t}_1\bar{t}$ and to the charge conjugate final state $\bar{\tilde{t}}_1 t$.
The scenario in which the gluino decays through an on-shell top squark is shown in Fig. 1 (right) and will be denoted by T5tttt. (For this scenario, the small contribution from the direct production of top squark pairs is also taken into account.) Regardless of whether the top squark is produced on or off mass shell, the final state is characterized by a large number of jets, four of which are b jets from top quark decays. Depending on the decay modes of the accompanying W bosons, a range of lepton multiplicities is possible; we focus here on the single-lepton final state, where the lepton is either an electron or a muon. Because the two neutralinos ($\tilde{\chi}^0_1$) are undetected, their production in SUSY events typically gives rise to a large amount of missing (unobserved) momentum, whose value in the direction transverse to the beam axis can be inferred from the momenta of the observed particles. The missing transverse momentum, $\vec{p}_T^{\,\text{miss}}$, is a key element of searches for R-parity-conserving SUSY, and its magnitude is denoted by $E_T^{\text{miss}}$.
A challenge in performing searches for SUSY particles is obtaining sufficient sensitivity to the signal, while at the same time understanding the background contribution from SM processes in a robust manner. This analysis is designed such that the background in the signal regions arises largely from a single process, dilepton $t\bar{t}$ production, in which both W bosons from $t \to bW^+$ decay leptonically, but only one lepton satisfies the criteria associated with identification, the minimum transverse momentum ($p_T$) requirement, and isolation from other energy in the event. The search signature is characterized not only by the presence of high-$p_T$ jets and b-tagged jets, an isolated high-$p_T$ lepton, and large $E_T^{\text{miss}}$, but also by additional kinematic variables. Apart from resolution effects, the transverse mass of the lepton + $\vec{p}_T^{\,\text{miss}}$ system, $m_T$, is bounded above by $m_W$ for events with a single leptonically decaying W, and this variable is very effective in suppressing the otherwise dominant single-lepton $t\bar{t}$ background. The quantity $M_J$, the scalar sum of the masses of large-radius jets, is used both to characterize the mass and energy scale of the event, providing discrimination between signal and background, and as a key part of the background estimation. A property of $M_J$ exploited in this analysis is that, for the dominant background, this variable is nearly uncorrelated with $m_T$. Because of the absence of correlation between $M_J$ and $m_T$, the background shape at high $m_T$, including the signal region, can be measured to a very good approximation using a low-$m_T$ control sample.
The quantity $M_J$ was first discussed in phenomenological studies, for example, in Refs. [45][46][47]. Similar variables have been used by ATLAS for SUSY searches in all-hadronic final states using 8 TeV data [48,49]. We have presented studies of basic $M_J$ properties and performance using early 13 TeV data [50].
This paper is organized as follows. Section 2 gives a brief overview of the CMS detector. Section 3 discusses the simulated event samples used in the analysis. The event reconstruction is discussed in Section 4, while Section 5 describes the trigger and event selection. Section 6 presents the methodology used to predict the SM background from the event yields in control regions in data. The associated systematic uncertainties are also discussed. The event yields observed in the signal regions are presented in Section 7. These yields are compared with background predictions and used to obtain exclusion regions for the gluino pair production models shown in Fig. 1. Finally, Section 8 presents a summary of the methodology and the results.

Detector
The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are the tracking and calorimeter systems. The tracking system, composed of silicon-pixel and silicon-strip detectors, measures charged particle trajectories within the pseudorapidity range $|\eta| < 2.5$, where $\eta \equiv -\ln[\tan(\theta/2)]$ and $\theta$ is the polar angle of the trajectory of the particle with respect to the counterclockwise proton beam direction. A lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections, provide energy measurements up to $|\eta| = 3$. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors up to $|\eta| = 5$. Muons are identified and measured within the range $|\eta| < 2.4$ by gas-ionization detectors embedded in the steel magnetic flux-return yoke outside the solenoid. The detector is nearly hermetic, permitting the accurate measurement of $\vec{p}_T^{\,\text{miss}}$. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, is given in Ref. [51].
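The pseudorapidity definition quoted above can be illustrated with a short sketch. The function names and the default acceptance value passed to the helper are illustrative choices, not part of the CMS software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians.

    theta must lie strictly between 0 and pi (the beam axis itself
    corresponds to infinite pseudorapidity).
    """
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must satisfy 0 < theta < pi")
    return -math.log(math.tan(theta / 2.0))


def in_acceptance(theta: float, eta_max: float = 2.5) -> bool:
    """Check |eta| < eta_max; 2.5 is the tracker coverage quoted in the text."""
    return abs(pseudorapidity(theta)) < eta_max


if __name__ == "__main__":
    # A particle emitted perpendicular to the beam has eta close to 0.
    print(pseudorapidity(math.pi / 2.0))
    # A very forward particle falls outside the tracker acceptance.
    print(in_acceptance(0.01))
```

Passing `eta_max=2.4` gives the corresponding check for the muon system.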

Simulated event samples
The analysis makes use of several simulated event samples for modeling the SM background and signal processes. While the background estimation in the analysis is performed largely from control samples in the data, simulated event samples provide correction factors, typically near unity. The equivalent integrated luminosity of the simulated event samples is at least six times that of the data, and at least 100 times that of the data in the case of $t\bar{t}$ and signal processes.
The production of $t\bar{t}$+jets, W+jets, Z+jets, and QCD multijet events is simulated with the Monte Carlo (MC) generator MADGRAPH5_aMC@NLO 2.2.2 [52] in leading-order (LO) mode. Single top quark events are modeled at next-to-leading order (NLO) with MADGRAPH5_aMC@NLO for the s-channel and POWHEG v2 [53,54] for the t-channel and W-associated production. Additional small backgrounds, such as $t\bar{t}$ production in association with bosons, diboson processes, and $t\bar{t}t\bar{t}$, are similarly produced at NLO with either MADGRAPH5_aMC@NLO or POWHEG. All events are generated using the NNPDF 3.0 [55] set of parton distribution functions (PDF). Parton showering and fragmentation are performed with the PYTHIA 8.205 [56] generator with the underlying event model based on the CUETP8M1 tune detailed in Ref. [57]. The detector simulation is performed with GEANT4 [58]. The cross sections used to scale simulated event yields are based on the highest order calculation available. For $t\bar{t}$, in addition to using the next-to-next-to-leading order + next-to-next-to-leading logarithmic cross section calculation [40], the modeling of the event kinematics is improved by reweighting the top quark $p_T$ spectrum to match the data [59], keeping the overall normalization fixed.
Signal events for the T1tttt and T5tttt simplified models are generated in a manner similar to that for the SM backgrounds, with the MADGRAPH5_aMC@NLO 2.2.2 generator in LO mode using the NNPDF 3.0 PDF set and followed with PYTHIA 8.205 for showering and fragmentation. The detector simulation is performed with the CMS fast simulation package [60] with scale factors applied to account for any differences with respect to the full simulation used for backgrounds. Event samples are generated for a representative set of model scenarios by scanning over the relevant mass ranges for the $\tilde{g}$ and $\tilde{\chi}^0_1$, and the yields are normalized to the NLO + next-to-leading-logarithmic cross section [39,[61][62][63][64].
Throughout this paper, two T1tttt benchmark models are used to illustrate typical signal behavior. The T1tttt(1500,100) model, with masses $m_{\tilde{g}}$ = 1500 GeV and $m_{\tilde{\chi}^0_1}$ = 100 GeV, corresponds to a scenario with a large mass splitting (referred to as non-compressed, or NC) between the gluino and the neutralino. This mass combination probes the sensitivity of the analysis to a low cross section (14 fb) process that has a hard $E_T^{\text{miss}}$ spectrum, which results in a relatively high signal efficiency. The T1tttt(1200,800) model, with masses $m_{\tilde{g}}$ = 1200 GeV and $m_{\tilde{\chi}^0_1}$ = 800 GeV, corresponds to a scenario with a small mass splitting (referred to as compressed, or C) between the gluino and the neutralino. Here the cross section is much higher (86 fb) because the gluino mass is lower than for the T1tttt(1500,100) model, but the sensitivity suffers from a low signal efficiency due to the soft $E_T^{\text{miss}}$ spectrum.
Finally, to model the presence of additional proton-proton collisions from the same or adjacent beam crossing as the primary hard-scattering process ("pileup" interactions), the simulated events are overlaid with multiple minimum bias events, which are also generated with the PYTHIA 8.205 generator with the underlying event model based on the CUETP8M1 tune. The distribution of the number of overlaid minimum bias events is broad and peaks in the range 10-15.
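To put the benchmark cross sections in context, the number of gluino pairs produced in the data sample is simply the cross section times the integrated luminosity. The sketch below uses only the rounded values quoted in the text (14 fb, 86 fb, 2.3 fb$^{-1}$); the names are illustrative, and no selection efficiency is applied.

```python
# Integrated luminosity of the 13 TeV data sample, in inverse femtobarns.
LUMI_FB_INV = 2.3

# Approximate gluino pair production cross sections quoted in the text, in fb.
BENCHMARK_XSEC_FB = {
    "T1tttt(1500,100)": 14.0,  # non-compressed (NC) benchmark
    "T1tttt(1200,800)": 86.0,  # compressed (C) benchmark
}


def produced_events(xsec_fb: float, lumi_fb_inv: float = LUMI_FB_INV) -> float:
    """Expected number of produced events, N = sigma * L, before any selection."""
    return xsec_fb * lumi_fb_inv


if __name__ == "__main__":
    for model, xsec in BENCHMARK_XSEC_FB.items():
        print(f"{model}: about {produced_events(xsec):.0f} gluino pairs produced")
```

With these inputs, roughly 32 and 198 gluino pairs are produced for the NC and C benchmarks, respectively, which is why the signal efficiency is decisive for the sensitivity.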

Event reconstruction
The reconstruction of physics objects in an event proceeds from the candidate particles identified by the particle-flow (PF) algorithm [65,66], which uses information from the tracker, calorimeters, and muon systems to identify the candidates as charged or neutral hadrons, photons, electrons, or muons. Charged particle tracks are required to originate from the event primary vertex (PV), defined as the reconstructed vertex, located within 24 cm (2 cm) of the center of the detector in the direction along (perpendicular to) the beam axis, that has the highest value of $p_T^2$ summed over the associated charged particle tracks. The charged PF candidates associated with the PV and the neutral PF candidates are clustered into jets using the anti-$k_T$ algorithm [67] with distance parameter R = 0.4, as implemented in the FASTJET package [68]. The estimated pileup contribution to the jet $p_T$ from neutral PF candidates is removed with a correction based on the area of the jet and the average energy density of the event [69]. The jet energy is calibrated using $p_T$- and $\eta$-dependent corrections; the resulting calibrated jet is required to satisfy $p_T$ > 30 GeV and $|\eta| \leq 2.4$. Each jet must also meet loose identification requirements [70] to suppress, for example, calorimeter noise. Finally, jets that have PF constituents matched to an isolated lepton, as defined below, are removed from the jet collection.
A subset of the jets are "tagged" as originating from b quarks using the combined secondary vertex (CSV) algorithm [71,72]. For the CSV medium working point chosen for this analysis, the signal efficiency for b jets in the range $p_T$ = 30 to 50 GeV is 60-67% (51-57%) in the barrel (endcap), increasing with $p_T$. Above $p_T \approx 150$ GeV the b tagging efficiency decreases. The probability to misidentify jets arising from c quarks is 13-15% (11-13%) in the barrel (endcap), while the misidentification probability for light-flavor quarks or gluons is 1-2%.
Throughout this paper, quantities related to the number of jets ($N_{\text{jets}}$) or to the number of b-tagged jets ($N_b$) are based only on small-R jets, not on the large-R jets discussed below.
Electrons are reconstructed by associating a charged particle track with an ECAL supercluster [73]. The resulting candidate electrons are required to have $p_T$ > 20 GeV and $|\eta| < 2.5$, and to satisfy identification criteria designed to remove light-parton jets, photon conversions, and electrons from heavy flavor hadron decays. Muons are reconstructed by associating tracks in the muon system with those found in the silicon tracker [74]. Muon candidates are required to satisfy $p_T$ > 20 GeV and $|\eta| < 2.4$.
To preferentially select leptons that originate in the decay of W bosons, leptons are required to be isolated from other PF candidates. Isolation is quantified using an optimized version of the "mini-isolation" variable originally suggested in Ref. [75], in which the transverse energy of the particles within a cone in $\eta$-$\phi$ space surrounding the lepton momentum vector is computed using a cone size that scales as $1/p_T^{\ell}$, where $p_T^{\ell}$ is the transverse momentum of the lepton. In this analysis, mini-isolation, $I^{\text{rel}}_{\text{mini}}$, is defined as the transverse energy of particles in a cone of radius $R_{\text{mini-iso}}$ around the lepton, divided by $p_T^{\ell}$. The transverse energy is computed as the scalar sum of the $p_T$ values of the charged hadrons from the PV, neutral hadrons, and photons. The neutral hadron and photon contributions to this sum are corrected for pileup. The cone radius $R_{\text{mini-iso}}$ varies with $p_T^{\ell}$ according to
$$R_{\text{mini-iso}} = \begin{cases} 0.2, & p_T^{\ell} \leq 50\ \text{GeV}, \\ (10\ \text{GeV})/p_T^{\ell}, & 50 < p_T^{\ell} < 200\ \text{GeV}, \\ 0.05, & p_T^{\ell} \geq 200\ \text{GeV}. \end{cases} \tag{1}$$
The $1/p_T^{\ell}$ dependence is motivated by considering a two-body decay of a massive parent particle with mass M and large $p_T$, for which the angular separation of the daughter particles is roughly $\Delta R_{\text{daughters}} \approx 2M/p_T$. The $p_T$-dependent cone size reduces the rate of accidental overlaps between the lepton and jets in high-multiplicity or highly Lorentz-boosted events, particularly overlaps between b jets and leptons originating from a boosted top quark. The cone remains large enough to contain b-hadron decay products for non-prompt leptons across a range of $p_T$ values. Muons (electrons) must satisfy $I^{\text{rel}}_{\text{mini}}$ < 0.2 (0.1). The combined efficiency for the electron reconstruction and isolation requirements is about 50% at a $p_T$ of 20 GeV, increasing to 65% at 50 GeV and reaching a plateau of 80% above 200 GeV. The combined reconstruction and isolation efficiencies for muons are about 70% at a $p_T$ of 20 GeV, increasing to 80% at 50 GeV and reaching a plateau of 95% at 200 GeV.
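A minimal sketch of the $p_T$-dependent cone size and the isolation requirement may help make the definition concrete. The break points (50 and 200 GeV) and the 10 GeV numerator are taken here as assumptions following the commonly used mini-isolation definition; the function names and the pre-summed cone energy input are illustrative.

```python
def mini_iso_radius(lepton_pt: float) -> float:
    """Cone radius that scales as 1/pT between two fixed plateaus.

    Assumed form: R = 10 GeV / min(max(pT, 50 GeV), 200 GeV),
    i.e. 0.2 at low pT and 0.05 at high pT.
    """
    if lepton_pt <= 0.0:
        raise ValueError("lepton pT must be positive")
    return 10.0 / min(max(lepton_pt, 50.0), 200.0)


# Upper thresholds on relative mini-isolation quoted in the text.
MINI_ISO_MAX = {"muon": 0.2, "electron": 0.1}


def passes_mini_isolation(lepton_pt: float, cone_et_sum: float, flavor: str) -> bool:
    """Apply I_rel_mini = (transverse energy in cone) / pT < threshold.

    cone_et_sum is assumed to be the pileup-corrected scalar pT sum of
    charged hadrons from the PV, neutral hadrons, and photons inside the
    cone of radius mini_iso_radius(lepton_pt).
    """
    return cone_et_sum / lepton_pt < MINI_ISO_MAX[flavor]


if __name__ == "__main__":
    for pt in (20.0, 100.0, 500.0):
        print(pt, mini_iso_radius(pt))
    print(passes_mini_isolation(40.0, 6.0, "muon"))      # 0.15 < 0.2
    print(passes_mini_isolation(40.0, 6.0, "electron"))  # 0.15 is not < 0.1
```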
We cluster R = 0.4 ("small-R") jets and the isolated leptons into R = 1.2 ("large-R") jets using the anti-$k_T$ algorithm. The mass of the large-R jets retains angular information about the clustered objects, as well as their $p_T$ and multiplicity. Clustering small-R jets instead of PF candidates incorporates the jet pileup corrections, thereby reducing the dependence of the mass on pileup. The variable $M_J$ is defined as the sum of all large-R jet masses:
$$M_J = \sum_{J_i \,\in\, \text{large-}R\ \text{jets}} m(J_i). \tag{2}$$
The technique of clustering small-R jets into large-R jets has been used previously by ATLAS in, for example, Ref. [76]. Leptons are included in the large-R jets to include the full kinematics of the event, and the choice R = 1.2 optimizes the background rejection power of $M_J$ while retaining signal efficiency. Larger distance parameters were found to offer no significant additional discriminating power, while smaller parameters decrease the background rejection by up to a factor of two for models with small mass splittings between the gluino and neutralino.
Figure 2: Distributions of $M_J$, normalized to the same area, from simulated event samples with a small ISR contribution (left) and a significant ISR contribution (right). These components are defined according to whether the $p_T$ of the $t\bar{t}$ system (or, in the case of signal events, that of the $\tilde{g}\tilde{g}$ system) is <10 GeV or >100 GeV, respectively. The T1tttt(NC) signal model (dashed red line) is described in Section 3; the first model parameter in parentheses corresponds to $m_{\tilde{g}}$ and the second to $m_{\tilde{\chi}^0_1}$, both in units of GeV. The events satisfy the requirements $E_T^{\text{miss}}$ > 200 GeV and $H_T$ > 500 GeV and have at least one reconstructed lepton.
For $t\bar{t}$ events with a small contribution from initial-state radiation (ISR), the $M_J$ distribution has an approximate cutoff at twice the mass of the top quark, as shown in Fig. 2 (left). In contrast, the $M_J$ distribution for signal events extends to larger values. The presence of a significant amount of ISR generates a high-$M_J$ tail in the $t\bar{t}$ background, as shown in Fig. 2 (right).
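The definition of $M_J$ as a sum of large-R jet masses can be sketched as follows. The example assumes the large-R jets have already been clustered (for instance with the anti-$k_T$ algorithm in FastJet) and are supplied as four-momenta (E, px, py, pz) in GeV; the function names and the input values are hypothetical.

```python
import math
from typing import Iterable, Tuple

FourVector = Tuple[float, float, float, float]  # (E, px, py, pz) in GeV


def invariant_mass(p4: FourVector) -> float:
    """Invariant mass m = sqrt(E^2 - |p|^2), clipped at zero against rounding."""
    e, px, py, pz = p4
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))


def sum_of_jet_masses(large_r_jets: Iterable[FourVector]) -> float:
    """M_J: the sum of the masses of all large-R jets in the event."""
    return sum(invariant_mass(jet) for jet in large_r_jets)


if __name__ == "__main__":
    # Two hypothetical large-R jets with masses 4 GeV and 12 GeV.
    jets = [(5.0, 3.0, 0.0, 0.0), (13.0, 0.0, 5.0, 0.0)]
    print(sum_of_jet_masses(jets))
```

Because each large-R jet is built from calibrated small-R jets and leptons, the four-momentum passed in would be the vector sum of those constituents.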
The missing transverse energy, $E_T^{\text{miss}}$, is given by the magnitude of $\vec{p}_T^{\,\text{miss}}$, the negative vector sum of the transverse momenta of all PF candidates [65,66]. Correspondence to the true undetectable energy in the event is improved by replacing the contribution of the PF candidates associated with a jet by the calibrated four-momentum of that jet. To separate backgrounds characterized by the presence of a single W boson decaying leptonically but without any other source of missing energy, the lepton and the $E_T^{\text{miss}}$ are combined to obtain the transverse mass, $m_T$, defined as
$$m_T = \sqrt{2\, p_T^{\ell}\, E_T^{\text{miss}} \left[1 - \cos\left(\Delta\phi_{\ell,\vec{p}_T^{\,\text{miss}}}\right)\right]}, \tag{3}$$
where $\Delta\phi_{\ell,\vec{p}_T^{\,\text{miss}}}$ is the difference between the azimuthal angles of the lepton momentum vector and the missing momentum vector, $\vec{p}_T^{\,\text{miss}}$. Finally, we define the quantity $H_T$ as the scalar sum of the transverse momenta of all the small-R jets passing the selection.
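The two event-level quantities defined above are simple to compute once the lepton, the missing transverse momentum, and the selected jets are known. The following sketch uses the standard massless-lepton form of the transverse mass; the function names and example inputs are illustrative.

```python
import math
from typing import Iterable


def transverse_mass(lep_pt: float, lep_phi: float, met: float, met_phi: float) -> float:
    """m_T = sqrt(2 * pT(lepton) * ETmiss * [1 - cos(delta phi)])."""
    dphi = lep_phi - met_phi  # cos() is periodic, so no wrapping is needed
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))


def scalar_ht(selected_jet_pts: Iterable[float]) -> float:
    """H_T: scalar sum of the pT of the small-R jets that pass the selection."""
    return sum(selected_jet_pts)


if __name__ == "__main__":
    # Lepton and missing momentum back to back: m_T reaches its maximum.
    print(transverse_mass(40.0, 0.0, 40.0, math.pi))
    # Lepton and missing momentum collinear: m_T vanishes.
    print(transverse_mass(40.0, 0.0, 40.0, 0.0))
    print(scalar_ht([250.0, 120.0, 80.0, 45.0, 35.0, 31.0]))
```

For a single leptonically decaying W, the neutrino is the only source of missing momentum, so this quantity cannot exceed $m_W$ apart from resolution effects.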

Trigger and event selection
The data sample used in this analysis was obtained with triggers that require $H_T$ > 350 GeV and at least one electron or muon with $p_T$ > 15 GeV, where these variables are computed with online (trigger-level) quantities and typically have somewhat poorer resolution than the corresponding offline variables. To ensure high trigger efficiency with respect to the offline definition of lepton isolation described in the previous section (mini-isolation), we designed these triggers with very loose lepton isolation requirements and fixed the isolation cone size to R = 0.2. For events passing the offline selection, the total trigger efficiencies, measured in data control samples that are independently triggered, are found to be (95.1 ± 1.1)% for the muon channel and (94.1 ± 1.2)% for the electron channel and are independent of the analysis variables within the uncertainties. These efficiencies are applied to the simulation as a correction.
Table 1: Event yields obtained from simulated event samples, as the event selection criteria are applied. The category Other includes Drell-Yan, $t\bar{t}H(\to b\bar{b})$, $t\bar{t}t\bar{t}$, WZ, and WW. The yields for $t\bar{t}$ events in fully hadronic final states are included in the QCD multijet category. The category $t\bar{t}V$ includes $t\bar{t}W$, $t\bar{t}Z$, and $t\bar{t}\gamma$. The benchmark signal models, T1tttt(NC) and T1tttt(C), are described in Section 3. The event selection requirements listed above the horizontal line in the middle of the table are defined as the baseline selection. The background estimates before the $H_T$ requirement are not specified because some of the simulated event samples do not extend to the low $H_T$ region. Given the size of the MC samples described in Section 3, rows with zero yield have statistical uncertainties of at most 0.16 events, and below 0.05 events in most cases.
The offline event selection is summarized in Table 1, which lists the event yields expected from simulation for both SM background processes and for the two benchmark T1tttt signal models. We select events with exactly one isolated charged lepton (an electron or a muon), $H_T$ > 500 GeV, $E_T^{\text{miss}}$ > 200 GeV, and at least six jets, at least one of which is b-tagged. After this set of requirements, referred to in the following as the baseline selection, more than 80% of the remaining SM background arises from $t\bar{t}$ production. The contributions from events with a single top quark or a W boson in association with jets are each about 6-7%. The background from QCD multijet events after the baseline selection is negligible due to the combination of lepton, $E_T^{\text{miss}}$, and $N_{\text{jets}}$ requirements.
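The baseline selection amounts to a conjunction of five requirements, which can be written as a simple predicate. The `Event` container below is a hypothetical summary record introduced for illustration only; the thresholds are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical per-event summary; energies and momenta in GeV."""
    n_leptons: int  # isolated electrons or muons passing the selection
    ht: float       # scalar sum of small-R jet pT
    met: float      # magnitude of the missing transverse momentum
    n_jets: int     # small-R jets with pT > 30 GeV and |eta| <= 2.4
    n_b: int        # b-tagged small-R jets


def passes_baseline(event: Event) -> bool:
    """Exactly one lepton, HT > 500, ETmiss > 200, >= 6 jets, >= 1 b tag."""
    return (
        event.n_leptons == 1
        and event.ht > 500.0
        and event.met > 200.0
        and event.n_jets >= 6
        and event.n_b >= 1
    )


if __name__ == "__main__":
    print(passes_baseline(Event(n_leptons=1, ht=900.0, met=350.0, n_jets=8, n_b=2)))
    print(passes_baseline(Event(n_leptons=2, ht=900.0, met=350.0, n_jets=8, n_b=2)))
```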
After the baseline selection requirements are applied, events are binned in several other kinematic variables, both to increase the signal sensitivity and to define control regions, as described in Section 6.1. To illustrate the effect of additional requirements, Table 1 lists the expected yields for examples of event selection requirements on $M_J$, $m_T$, $N_{\text{jets}}$, and $N_b$. The events satisfying the baseline selection are divided in the $M_J$-$m_T$ plane into a signal region, defined by the additional requirements $M_J$ > 400 GeV and $m_T$ > 140 GeV, and three control samples, bounded by $M_J$ > 250 GeV, that are used in the background estimation. Approximately 37% of signal T1tttt events are selected with the single-lepton requirement only. In non-compressed spectrum models, for which $m_{\tilde{g}}$ is significantly larger than $m_{\tilde{\chi}^0_1}$, more than half of the events passing the lepton requirement lie in the signal region. For compressed spectrum models, where $m_{\tilde{\chi}^0_1} \approx m_{\tilde{g}} - 2m_t$, the $M_J$, $H_T$, and $E_T^{\text{miss}}$ spectra become much softer and, as a result, only 5-10% of the single-lepton signal events are selected.
As shown in Fig. 3, backgrounds with a single W boson decaying leptonically are strongly suppressed after the $m_T$ > 140 GeV requirement, so the total SM background in the signal region is dominated by dilepton $t\bar{t}$ events. This dilepton background falls into two categories, which make roughly equal contributions. The first involves an identified electron or muon and a hadronically decaying $\tau$ from W decay. The second source involves two leptons, each of which is an electron or a muon. One of the leptons fails to satisfy the lepton selection criteria, which include the $p_T$ and isolation requirements. This missed lepton can be produced either directly or indirectly in W decay, where in the indirect case the lepton is the daughter of a $\tau$.

Background estimation

Method
The prediction of the background yields in each of the signal bins takes advantage of the fact that the $M_J$ and $m_T$ distributions of events with a significant amount of ISR are largely uncorrelated. The correlation coefficients for the single-lepton and dilepton $t\bar{t}$ events in the $M_J$-$m_T$ plane after the baseline selection (as shown in Fig. 4) are small, in the range 0.03 to 0.05. The absence of a substantial correlation allows us to measure the $M_J$ distribution of the background at low $m_T$ with good statistical precision, and extrapolate it to high $m_T$. The underlying explanation for this behavior is not immediately obvious, given that low-$m_T$ events originate mainly from $t\bar{t}$ events where only one of the top quarks decays leptonically ($1\ell\ t\bar{t}$), while the high-$m_T$ regions are dominated by dilepton $t\bar{t}$ events ($2\ell\ t\bar{t}$). In particular, as shown in Fig. 2 (left), in the absence of significant ISR, the dileptonic $t\bar{t}$ events have a softer $M_J$ spectrum than single-lepton $t\bar{t}$ events, simply because the reconstructed mass of a leptonically decaying top quark does not include the undetected neutrino.
In events with substantial ISR, however, the contributions to $M_J$ from the accidental overlap of jets can dominate the contributions due to the intrinsic mass of the top quarks. This effect is illustrated in Fig. 5, which compares the $N_{\text{jets}}$ and $M_J$ distributions of single-lepton and dilepton $t\bar{t}$ events at high and low $m_T$ after the baseline selection is applied. Since we require at least 6 jets, single-lepton $t\bar{t}$ events must have at least 2 ISR jets and dilepton $t\bar{t}$ events must have at least 4. In this regime, the probability of additional ISR jets is similar for events with a given number of partons of similar momenta, and, as a result, the number of objects contributing to $M_J$ (jets plus the reconstructed lepton) is comparable in $1\ell$ and $2\ell$ $t\bar{t}$ events. When these ISR jets overlap with the top quark decay products, the masses of the resulting large-R jets are dominated by the accidental overlap and, thus, the shapes of the $M_J$ distributions of $1\ell$ and $2\ell$ $t\bar{t}$ events become more similar. This is the case for $M_J$ > 250 GeV, where Fig. 5 (right) shows that the distributions of the $1\ell$ and $2\ell$ $t\bar{t}$ backgrounds have nearly the same shape, and the low-$m_T$ to high-$m_T$ extrapolation is warranted.
Figure 4: Distribution of simulated single-lepton $t\bar{t}$ events (dark-blue triangles), dilepton $t\bar{t}$ events (light-blue inverted triangles), and T1tttt(1500,100) events (red squares) in the $M_J$-$m_T$ plane after the baseline selection. Each marker represents one expected event at 2.3 fb$^{-1}$. Overflow events are placed on the edge of the plot. The values of the correlation coefficients $\rho$ for each background process are given in the legend. Region R4 is the nominal signal region, while R1, R2, and R3 serve as control regions. The small signal contributions in the control regions are taken into account in one of the global fits, as discussed in the text.
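We thus divide the $M_J$-$m_T$ plane into four regions, three control regions (CR) and one signal region (SR):
R1 (CR): $m_T \leq 140$ GeV, $250 < M_J \leq 400$ GeV
R2 (CR): $m_T \leq 140$ GeV, $M_J > 400$ GeV
R3 (CR): $m_T > 140$ GeV, $250 < M_J \leq 400$ GeV
R4 (SR): $m_T > 140$ GeV, $M_J > 400$ GeV
These regions are further subdivided into 10 bins of $E_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$ to increase signal sensitivity:
$200 < E_T^{\text{miss}} \leq 400$ GeV: $\{6 \leq N_{\text{jets}} \leq 8,\ N_{\text{jets}} \geq 9\} \times \{N_b = 1,\ N_b = 2,\ N_b \geq 3\}$
$E_T^{\text{miss}} > 400$ GeV: $\{6 \leq N_{\text{jets}} \leq 8,\ N_{\text{jets}} \geq 9\} \times \{N_b = 1,\ N_b \geq 2\}$
where the multiplication indicates that the binning is two dimensional in $N_{\text{jets}}$ and $N_b$. Given that the main background processes have two or fewer b quarks, the total SM contribution to the $N_b \geq 3$ bins is very small and is driven by the b-tag fake rate. Signal events in the T1tttt and T5tttt models are expected to populate primarily the bins with $N_b \geq 2$, while bins with $N_b = 1$ mainly serve to test the method in a background-dominated region.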
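The partition of the $M_J$-$m_T$ plane can be expressed as a small classification function. The boundaries at $m_T$ = 140 GeV and $M_J$ = 250 and 400 GeV follow the signal and control region definitions given in the text; the treatment of events lying exactly on a boundary is an assumption of this sketch, as is the function name.

```python
from typing import Optional

MT_CUT = 140.0   # GeV, separates low and high m_T
MJ_LOW = 250.0   # GeV, lower bound of the control regions
MJ_HIGH = 400.0  # GeV, separates low and high M_J


def abcd_region(mj: float, mt: float) -> Optional[str]:
    """Return 'R1'..'R4' for an event passing the baseline selection.

    R1: low m_T, low M_J    R2: low m_T, high M_J
    R3: high m_T, low M_J   R4: high m_T, high M_J (signal region)
    Events with M_J <= 250 GeV are outside all four regions (None).
    """
    if mj <= MJ_LOW:
        return None
    high_mj = mj > MJ_HIGH
    high_mt = mt > MT_CUT
    if high_mt:
        return "R4" if high_mj else "R3"
    return "R2" if high_mj else "R1"


if __name__ == "__main__":
    for mj, mt in [(300.0, 80.0), (600.0, 80.0), (300.0, 200.0), (600.0, 200.0), (200.0, 200.0)]:
        print(mj, mt, abcd_region(mj, mt))
```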
To obtain an estimate of the background rate in each of the signal bins, a modified version of an "ABCD" method is used. Here, the symbols A, B, C, and D refer to four regions in a two-dimensional space in the data, where one of the regions is dominated by signal and the other three by backgrounds. In a standard ABCD method, the background rate in the signal region is estimated from the yields in three control regions with the expression
$$\mu^{\text{bkg}}_{R4} = \frac{N_{R2}\, N_{R3}}{N_{R1}}, \tag{4}$$
where the labels on the regions correspond to those shown in Fig. 4. The background prediction is unbiased in the limit that the two variables that define the plane (in this case, $M_J$ and $m_T$) are uncorrelated. The effect of any residual correlation is corrected with factors $\kappa$ that can be obtained from simulated event samples:
$$\kappa = \frac{N^{\text{MC,bkg}}_{R4} / N^{\text{MC,bkg}}_{R3}}{N^{\text{MC,bkg}}_{R2} / N^{\text{MC,bkg}}_{R1}}. \tag{5}$$
When the two ABCD variables are uncorrelated or nearly so, the $\kappa$ factors are close to unity. This procedure ignores potential signal contamination in the control regions, which is accounted for by incorporating the constraints in Eqs. 4 and 5 into a fit that includes both signal and background components, as described in Section 6.2.
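The ABCD estimate with its simulation-based correction can be sketched in a few lines. The yields in the example are invented for illustration and do not correspond to any numbers in the analysis; the function names are likewise hypothetical.

```python
def kappa_factor(mc_r1: float, mc_r2: float, mc_r3: float, mc_r4: float) -> float:
    """Correlation correction from simulated background yields.

    kappa = (N4 / N3) / (N2 / N1); equals 1 when the two variables
    defining the plane are uncorrelated.
    """
    return (mc_r4 / mc_r3) / (mc_r2 / mc_r1)


def abcd_prediction(n_r1: float, n_r2: float, n_r3: float, kappa: float = 1.0) -> float:
    """Predicted background in R4: kappa * N(R2) * N(R3) / N(R1)."""
    if n_r1 <= 0.0:
        raise ValueError("the R1 yield must be positive")
    return kappa * n_r2 * n_r3 / n_r1


if __name__ == "__main__":
    # Invented simulated yields, chosen so that kappa is 1.1.
    kappa = kappa_factor(mc_r1=1000.0, mc_r2=200.0, mc_r3=100.0, mc_r4=22.0)
    # Invented data yields in the three control regions.
    print(kappa, abcd_prediction(n_r1=100.0, n_r2=20.0, n_r3=10.0, kappa=kappa))
```

The plain product-over-ratio form ignores signal contamination in the control regions, which is the motivation for embedding the same constraints in a likelihood fit.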
In principle, the background in the 10 signal bins could be estimated by applying this procedure in 10 independent planes. However, this procedure would incur large statistical uncertainties in some bins due to low numbers of events in R3. This problem is especially important in bins with a high number of jets, where the $M_J$ spectrum shifts to higher values and the number of background events expected in R4 can exceed the background in R3.
To alleviate this problem, we exploit the fact that, after the baseline selection, the background is dominated by just one source ($t\bar{t}$ events), and the shapes of the $N_{\text{jets}}$ distributions are nearly identical for the single-lepton and dilepton components (due to the large amounts of ISR). As a result, the $m_T$ distribution is approximately independent of $N_{\text{jets}}$ and $N_b$. We study this behavior with the ratio of the number of events at high to low $m_T$:
$$R(m_T) \equiv \frac{N(m_T > 140\ \text{GeV})}{N(m_T \leq 140\ \text{GeV})}. \tag{6}$$
Because, as seen in Fig. 6, the values of $R(m_T)$ do not vary substantially across $N_{\text{jets}}$ and $N_b$ bins, the predicted value of $R(m_T)$ is not sensitive to the modeling of the distributions of those quantities. We exploit this result by integrating the yields of the low-$M_J$ regions (R1 and R3) over the $N_{\text{jets}}$ and $N_b$ bins for each $E_T^{\text{miss}}$ bin. This procedure increases the statistical power of the ABCD method but also introduces a correlation among the predictions (Eq. 4) for the $N_{\text{jets}}$ and $N_b$ bins associated with a given $E_T^{\text{miss}}$ bin. Figure 7 shows the $\kappa$ factors for the 10 signal bins after summing over $N_{\text{jets}}$ and $N_b$ in R1 and R3. In all cases, their values are close to unity.

Implementation
The method outlined in Section 6.1 is implemented with a likelihood function that incorporates the statistical and systematic uncertainties in κ, accounts for correlations arising from the common R1 and R3 yields, and corrects for signal contamination in the control regions.
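The SM background contribution for each region is described as follows. We define $\mu^{\text{bkg}}_{Ri}$ as the estimated (Poisson) mean background in each region $Ri$, with $i = 1, 2, 3, 4$. Then, in an ABCD background calculation, these four rates can be expressed in terms of three floating fit parameters $\mu$, $R(m_T)$, and $R(M_J)$, and the correlation correction factor $\kappa$, as

$$\mu^{\text{bkg}}_{R1} = \mu, \qquad \mu^{\text{bkg}}_{R2} = \mu\, R(M_J), \qquad \mu^{\text{bkg}}_{R3} = \mu\, R(m_T), \qquad \mu^{\text{bkg}}_{R4} = \kappa\, \mu\, R(M_J)\, R(m_T). \tag{7}$$

Here, $\mu$ is the background rate fit parameter for R1, $R(M_J)$ is the ratio of the R2 to R1 rates, and $R(m_T)$ is the ratio of the R3 to R1 rates. The quantity $\kappa$ is given by Eq. 5 after replacing the yields $N$ with the corresponding estimated mean rates. We define $N^{\text{data}}_{Ri}$ as the observed data yield in each region, $\mu^{\text{MC,sig}}_{Ri}$ as the expected signal rate in each region, and $r$ as the parameter quantifying the signal strength relative to the expected yield across all analysis regions. We can then write the likelihood function as

$$\mathcal{L} = \mathcal{L}^{\text{data}}_{\text{ABCD}}\, \mathcal{L}^{\text{MC}}_{\kappa}\, \mathcal{L}^{\text{MC}}_{\text{sig}}, \tag{8}$$

with the individual terms given in Eqs. 9-11. The indices $k$ run over each of the $E_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$ bins defined in the previous section; these indices were suppressed in Eq. 7 for simplicity. Given the integration over $N_{\text{jets}}$ and $N_b$ at low $M_J$, $N_{\text{bins}}(R1) = N_{\text{bins}}(R3) = 2$, while $N_{\text{bins}}(R2) = N_{\text{bins}}(R4) = 10$.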
In Eq. 8, $\mathcal{L}^{\text{data}}_{\text{ABCD}}$ accounts for the statistical uncertainty in the observed data yield in the four ABCD regions, and $\mathcal{L}^{\text{MC}}_{\kappa}$ and $\mathcal{L}^{\text{MC}}_{\text{sig}}$ account for the uncertainty in the computation of the $\kappa$ correction factor and signal shape, respectively, due to the finite size of the MC samples.
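The explicit form of the data term is not reproduced in the text here. As a hedged sketch, assuming it is a product of independent Poisson terms over the four regions and their bins (an assumption suggested by the later references to the "first product" and the "Poisson terms" in Eq. 9, not a confirmed transcription), it could be written as

```latex
\mathcal{L}^{\text{data}}_{\text{ABCD}}
  = \prod_{i=1}^{4} \; \prod_{k=1}^{N_{\text{bins}}(Ri)}
    \text{Poisson}\!\left( N^{\text{data}}_{Ri,k} \,\middle|\,
      \mu^{\text{bkg}}_{Ri,k} + r\, \mu^{\text{MC,sig}}_{Ri,k} \right)
```

In this form, each observed yield is compared with the sum of the expected background and the signal scaled by $r$. The two MC terms would presumably be built analogously from the simulated yields; their exact form is not assumed here.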
The systematic uncertainties in $\kappa$ and the signal efficiency are described in the following sections. These effects are incorporated in the likelihood function as log-normal constraints with a nuisance parameter for each uncorrelated source of uncertainty. For simplicity, these terms are not explicitly shown in the likelihood function above.
The likelihood function defined in Eqs. 8-11 is employed in two separate types of fits that provide complementary but compatible background estimates based on an ABCD model. The first type of fit, which we call the predictive fit, allows us to more easily establish the agreement of the background predictions and the observations in the null (i.e., the background-only) hypothesis. We do this by excluding the observations in the signal regions from the likelihood (that is, by truncating the first product in Eq. 9 at $i = 3$) and fixing the signal strength $r$ to 0. This procedure leaves as many unknowns as constraints: three floating data parameters ($\mu$, $R(M_J)$, and $R(m_T)$) and three observations ($N^{\text{data}}_{Ri,k}$ with $i = 1, 2, 3$) for each ABCD plane. In the likelihood function there are additional floating parameters associated with MC quantities, which have small uncertainties. As a result, the estimated background rates in regions R1, R2, and R3 converge to the observed values in those bins, and we obtain predictions for the signal regions that do not depend on the observed $N^{\text{data}}_{R4}$. The predictive fit thus converges to the standard ABCD method, and the likelihood machinery becomes just a convenient way to solve the system of equations and propagate the various uncertainties.
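Because the predictive fit has as many unknowns as constraints, its solution for a single plane can be written down directly. The sketch below is a simplified illustration that ignores the MC nuisance parameters and treats $\kappa$ as a fixed number; the function name and yields are hypothetical.

```python
def predictive_fit(n_r1, n_r2, n_r3, kappa):
    """Solve for mu, R(M_J), R(m_T) from R1-R3 with r = 0 and predict R4."""
    mu = float(n_r1)        # mu_R1 = mu reproduces the R1 yield
    r_mj = n_r2 / n_r1      # mu_R2 = mu * R(M_J) reproduces the R2 yield
    r_mt = n_r3 / n_r1      # mu_R3 = mu * R(m_T) reproduces the R3 yield
    mu_r4 = kappa * mu * r_mj * r_mt
    return {"mu": mu, "R(M_J)": r_mj, "R(m_T)": r_mt, "mu_R4": mu_r4}


# Hypothetical yields and kappa, for illustration only.
fit = predictive_fit(n_r1=100, n_r2=20, n_r3=10, kappa=1.1)
print(fit)
```

The R4 rate obtained this way equals $\kappa N_{R2} N_{R3} / N_{R1}$, which is the sense in which the predictive fit converges to the standard ABCD method.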
Additionally, we implement a global fit which, by making use of the observations in the signal regions, can provide an estimate of the signal strength $r$, while allowing for signal events to populate the control regions. This is achieved by including all four observations, $N^{\text{data}}_{Ri,k}$ with $i = 1, 2, 3, 4$, in the likelihood function. Since there are four observations and three floating background parameters in each ABCD plane, there are enough constraints for the signal strength also to be determined in the fit.
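To illustrate how the fourth observation constrains the background, the sketch below solves the background-only ($r = 0$) version of the global fit for one isolated ABCD plane with a fixed $\kappa$. It is a simplification made here for illustration: the actual fit shares R1 and R3 across bins and includes nuisance parameters. For this reduced Poisson model the maximum-likelihood rates preserve the observed low/high $m_T$ and low/high $M_J$ totals, which leaves a quadratic equation for the R4 rate. All names and yields are hypothetical.

```python
import math


def global_fit_background_only(yields, kappa):
    """Maximum-likelihood rates (mu1..mu4) for one plane with r = 0, fixed kappa."""
    n1, n2, n3, n4 = yields
    high_mt = n3 + n4   # observed R3 + R4 total, preserved by the fit
    high_mj = n2 + n4   # observed R2 + R4 total, preserved by the fit
    # Quadratic a*x^2 + b*x + c = 0 for x = mu4, from mu4*mu1 = kappa*mu2*mu3.
    a = 1.0 - kappa
    b = (n1 - n4) + kappa * (high_mt + high_mj)
    c = -kappa * high_mt * high_mj
    # Stable form of the root lying between 0 and min(high_mt, high_mj).
    mu4 = -2.0 * c / (b + math.sqrt(b * b - 4.0 * a * c))
    mu3 = high_mt - mu4
    mu2 = high_mj - mu4
    mu1 = n1 - n4 + mu4
    return mu1, mu2, mu3, mu4


# Hypothetical yields, for illustration only.
rates = global_fit_background_only((100, 20, 10, 5), kappa=1.0)
print("fitted rates for R1..R4:", [round(x, 3) for x in rates])
```

With $\kappa = 1$ this reduces to the familiar independence estimate for a 2x2 table, in which the fitted R4 rate is the product of the high-$m_T$ and high-$M_J$ totals divided by the grand total.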

Systematic uncertainties
This section describes the systematic uncertainties in the background prediction, which are incorporated into the analysis as an uncertainty in the $\kappa$ correction. Because the dominant background arises from dilepton $t\bar{t}$ events, we use a control sample with two reconstructed leptons to validate our background estimation procedure and to quantify the associated uncertainty. The resulting uncertainty is augmented with simulation-based studies of effects that are not covered by this dilepton test. Table 2 summarizes all of the uncertainties in the background prediction.
The ability of the ABCD method to predict the dilepton $t\bar{t}$ background is studied using a modified ABCD plane, in which the high-$m_T$ regions, R3 and R4, are replaced with regions D3 and D4, which have two reconstructed leptons. These regions have low and high $M_J$, respectively, just like R3 and R4. The events in D3 and D4 pass the same selection as those in R3 and R4, except for the following changes: the $N_{\text{jets}}$ bin boundaries are lowered by one to keep the number of large-R jet constituents the same as in the single-lepton samples; the $m_T$ requirement is not applied; and events with $N_b = 0$ are included to increase the size of the event sample, while events with $N_b \geq 3$ are excluded to avoid signal contamination. We perform this test only for low $E_T^{\text{miss}}$ to further avoid the potentially large signal contribution in the high-$E_T^{\text{miss}}$ region. The low-$M_J$ regions (R1 and D3) are integrated over $N_{\text{jets}}$, while the high-$M_J$ regions (R2 and D4) are binned in low and high $N_{\text{jets}}$. The predictive fit is then used to predict the D4 event yields for both $N_{\text{jets}}$ bins. We predict 11.0 ± 2.3 (1.5 ± 0.5) events for the low (high) $N_{\text{jets}}$ bin, and we observe 12 (2) events. Given the good agreement between prediction and observation, the statistical precision of the test is taken as a systematic uncertainty in $\kappa$. These uncertainties are 37% and 88% for the low- and high-$N_{\text{jets}}$ regions, respectively.
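The quoted 37% and 88% can be reproduced, to the stated precision, by combining in quadrature the relative uncertainty in the prediction with the relative Poisson fluctuation expected for the predicted yield. This recipe is an assumption made here for illustration; the text does not spell out the exact formula used.

```python
import math


def closure_precision(predicted, predicted_unc):
    """Relative precision of a prediction-versus-observation closure test."""
    rel_prediction = predicted_unc / predicted   # uncertainty in the prediction
    rel_poisson = 1.0 / math.sqrt(predicted)     # expected spread of the observed count
    return math.hypot(rel_prediction, rel_poisson)


low_njets = closure_precision(11.0, 2.3)
high_njets = closure_precision(1.5, 0.5)
print(f"low N_jets: {low_njets:.0%}, high N_jets: {high_njets:.0%}")
```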
Since the event composition of regions D3 and D4 is not fully representative of that in R3, we perform studies on potential additional sources of systematic uncertainty in the simulation. We find that the main source of single-lepton $t\bar{t}$ events in the high-$m_T$ region is jet energy mismeasurement. We study the impact of mismodeling the size of this contribution by smearing the jet energies by an additional 50% with respect to the jet energy resolution measured in data [70] and calculating the corresponding shift in $\kappa$. To ensure that there are no further significant differences between the $M_J$ shapes of events reconstructed with one or two leptons, we also calculate the shift in $\kappa$ due to jet energy corrections, potential ISR $p_T$ and top quark $p_T$ mismodeling, as well as the amount of non-$t\bar{t}$ background. Even though each of these can alone have a significant effect on the $M_J$ shape, the $\kappa$ factor, as a double ratio, remains largely unaffected (Table 2). Including these uncertainties in the likelihood fit produces a negligible contribution to the total uncertainty.

Results and interpretation
Figure 8 shows the two-dimensional distribution of the data in the $m_T$-$M_J$ plane after the baseline selection, but with the additional requirement $N_b \geq 2$. The baseline requirements include $E_T^{\text{miss}} > 200$ GeV and $N_{\text{jets}} \geq 6$, but no further event selection is applied. For comparison, the plot also shows the expected total SM background based on simulation, as well as a particular sample of the expected signal distribution. The overall distribution of events in data is consistent with the background expectation, where the majority of events are concentrated at low $m_T$ and $M_J$. In R4, the nominal signal region, we observe only two events in data, while, as shown in Table 3, the predicted SM background is about 5 events. The T1tttt(1500,100) (NC) model would be expected to contribute 5 additional events to R4.
The validity of the central assumption of the background estimation method can be checked in the nearly signal-free $N_b = 1$ region by comparing the $M_J$ shapes observed in the high- and low-$m_T$ regions in data. Figure 9 (left) shows the $M_J$ shapes in the $N_b = 1$ sample, integrating over the $N_{\text{jets}}$ and $E_T^{\text{miss}}$ bins. The low-$m_T$ data have been normalized to the overall yields in the corresponding high-$m_T$ data. The shapes of the $M_J$ distributions for the high- and low-$m_T$ regions are consistent. Figure 9 (right) shows that the corresponding distributions in the $N_b \geq 2$ sample are also consistent, as expected in the absence of signal.

Table 3 summarizes the observed event yields, the fitted backgrounds, and the expected signal yields for the two T1tttt benchmark model points. Two background estimates are given: the predictive fit (PF), which uses only the yields in regions R1, R2, and R3, and the global fit (GF), which also incorporates region R4, as described in Section 6. In both versions of the fit, the signal strength $r$ is fixed to zero, giving results that are model independent. (When setting limits on individual models, we allow $r$ to float, as discussed below.) The rows labeled R4 give the results for each of the ten signal regions, as well as the corresponding $\kappa$ factors.
In the absence of signal, the predictive fit and the version of the global fit performed under the null hypothesis, $r = 0$, should be consistent with each other. However, because the global fit incorporates more information, specifically the yields in R4, this fit has a smaller uncertainty. The regions with $N_b = 1$ have small expected contributions from signal. Summing over all four such signal regions (R4), the numbers of estimated background events from the PF and GF are 6.1 ± 2.2 and 5.5 ± 1.3, respectively, compared with 8 events observed in data. The consistency between the two predictions and between the predicted and observed yields in the R4 regions with $N_b = 1$, where the signal contribution is expected to be small, serves as a further check on the background estimation method. Summing the yields over the six signal bins with $N_b \geq 2$, the numbers of estimated background events from the PF and GF are 5.6 ± 1.6 and 4.9 ± 1.0, respectively. In data, we observe 2 events, which is lower than the prediction but consistent with the background-only hypothesis.
Given the absence of any significant excess, the results are interpreted first as exclusion limits on the production cross section for T1tttt model points as a function of $m_{\tilde{g}}$ and $m_{\tilde{\chi}^0_1}$. Table 4 shows the ranges for the systematic uncertainties associated with predictions for the expected signal yields, including those on the signal efficiency. The largest uncertainties arise from the jet energy corrections and from the modeling of ISR. These uncertainties are generally in the range 10-20% but can increase to about 30% as the mass splitting between the gluino and LSP decreases [77]. The uncertainty associated with the renormalization and factorization scales is determined by varying the scales independently up and down by a factor of two; these variations are applied only as an uncertainty in the signal shape, i.e., the cross section is held constant. The uncertainty associated with the b tagging efficiency is in the range 1-15%. Uncertainties due to pileup, luminosity [78], lepton selection, and trigger efficiency are found to be ≤ 5%. Uncertainties for each particular source are treated as fully correlated across bins.
A 95% confidence level (CL) upper limit on the production cross section is estimated using the modified frequentist CL$_s$ method [79][80][81], with a one-sided profile likelihood ratio test statistic. For this test, we perform the global fit under the background-only and background-plus-signal ($r$ floating) hypotheses. The statistical uncertainties from data counts in the control regions are modeled by the Poisson terms in Eq. 9. All systematic uncertainties are multiplicative and are treated as log-normal distributions. Exclusion limits are also estimated for ±1σ variations on the production cross section based on the NLO+NLL calculation [39].
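The idea behind the CL$_s$ criterion can be illustrated with a much simpler single-bin counting experiment without uncertainties. This toy is not the profile-likelihood procedure used in the analysis; it only shows how the ratio CL$_{s+b}$/CL$_b$ is formed and compared with 0.05. The inputs are the rounded R4 numbers quoted earlier for the $N_b \geq 2$ sample (2 observed events, about 5 background events, 5 signal events for T1tttt(1500,100)), used purely as illustrative values; the function names are hypothetical.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for n in range(1, n_obs + 1):
        term *= mean / n
        total += term
    return total


def cls_counting(n_obs, background, signal):
    """CL_s = CL_{s+b} / CL_b for one counting bin, ignoring all uncertainties."""
    cl_sb = poisson_cdf(n_obs, background + signal)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


cls = cls_counting(n_obs=2, background=5.0, signal=5.0)
excluded_at_95 = cls < 0.05
print(f"CL_s = {cls:.3f}, excluded at 95% CL in this toy: {excluded_at_95}")
```

Dividing by CL$_b$ protects against excluding a signal merely because the data fluctuated below the background expectation, which is relevant here since the observed yield is below the prediction.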
Figure 10 shows the corresponding excluded region at a 95% CL for the T1tttt model in the $m_{\tilde{g}}$-$m_{\tilde{\chi}^0_1}$ plane. At low $\tilde{\chi}^0_1$ mass we exclude gluinos with masses of up to 1600 GeV. The highest limit on the $\tilde{\chi}^0_1$ mass is 800 GeV, attained for $m_{\tilde{g}}$ of approximately 1300 GeV. The observed limits are within the 1σ uncertainty in the expected limits. The central value is slightly higher because the observed event yield is less than the SM background prediction, as shown in Table 3.
In the context of natural SUSY models, it is important to extend the interpretation to scenarios in which the top squark is lighter than the gluino. Rather than considering a large set of models with independently varying top squark masses, we consider the extreme case in which the top squark has approximately the smallest mass consistent with two-body decay, $m_{\tilde{t}_1} \approx m_t + m_{\tilde{\chi}^0_1}$, for a range of gluino and neutralino masses. The decay kinematics for such extreme, com- [...], because the top quark and the $\tilde{\chi}^0_1$ are produced at rest in the top squark frame. As a consequence, the excluded signal cross section for fixed values of $m_{\tilde{g}}$ and $m_{\tilde{\chi}^0_1}$ and with $m_{\tilde{g}} > m_{\tilde{t}_1} \geq m_t + m_{\tilde{\chi}^0_1}$ is minimized around this extreme model point. For physical consistency, the signal model used in this study, both in the fit procedure and in the theoretical cross section used to obtain mass limits, includes not only gluino pair production, but also direct $\tilde{t}_1 \bar{\tilde{t}}_1$ production. However, the effect of the direct top squark contribution on the results is small: 2% for $m_{\tilde{\chi}^0_1} > 400$ GeV and up to 20% for low values of $m_{\tilde{\chi}^0_1}$.

Table 3: Observed and predicted event yields for the signal regions (R4) and background regions (R1-R3) in data (2.3 fb$^{-1}$). Expected yields for the two SUSY T1tttt benchmark scenarios are also given. The results from two types of fits are reported: the predictive fit (PF) and the version of the global fit (GF) performed under the assumption of the null hypothesis ($r = 0$). The predictive fit uses the observed yields in regions R1, R2, and R3 only and is effectively just a propagation of uncertainties. The global fit uses all four regions. The values of $\kappa$ obtained from the simulation fit are also listed. The first uncertainty in $\kappa$ is statistical, while the second corresponds to the total systematic uncertainty. The benchmark signal models, T1tttt(NC) and T1tttt(C), are described in Section 3.
Figure 11 shows the excluded region in the $m_{\tilde{g}}$-$m_{\tilde{\chi}^0_1}$ plane for this combined model with both gluino-mediated top squark production and direct top squark pair production. The top squark mass is assumed to be 175 GeV above that of the neutralino. For most of the excluded region, the boundary is close to that obtained for the T1tttt model, showing that there is only a weak sensitivity to the value of the top squark mass. The uncertainty on the boundary of the excluded region for the T5tttt model is similar to that shown for the T1tttt model in Fig. 10. For $m_{\tilde{\chi}^0_1} > 150$ GeV, the excluded value of $m_{\tilde{g}}$ is typically within 60 GeV of that excluded for T1tttt. Models that have low values of $m_{\tilde{\chi}^0_1}$ show a reduced sensitivity because the neutralino carries very little momentum, reducing the value of $m_T$. In this kinematic region, the sensitivity to the signal is dominated by the events that have at least two leptonic W boson decays, which produce additional $E_T^{\text{miss}}$, as well as a tail in the $m_T$ distribution. Although such dilepton events are nominally excluded in the analysis, a significant number of these signal events escape the dilepton veto. These events include both W decays to τ leptons that decay hadronically, and W decays to electrons or muons that are below kinematic thresholds or are outside of the detector acceptance.

Summary
Using a sample of proton-proton collisions at $\sqrt{s} = 13$ TeV with an integrated luminosity of 2.3 fb$^{-1}$, a search for supersymmetry is performed in the final state with a single lepton, b-tagged jets, and large missing transverse momentum. The search focuses on final states resulting from the pair production of gluinos, which subsequently decay via $\tilde{g} \to t\bar{t}\tilde{\chi}^0_1$, leading to high jet multiplicities.
Figure 11: Excluded region (95% CL) in the $m_{\tilde{g}}$-$m_{\tilde{\chi}^0_1}$ plane for a model combining T5tttt, i.e., gluino pair production followed by gluino decay to an on-shell top squark, together with a model for direct top squark pair production. The top squarks decay via the two-body process $\tilde{t}_1 \to t\tilde{\chi}^0_1$. The neutralino and top squark masses are related by the constraint $m_{\tilde{t}_1} = m_{\tilde{\chi}^0_1} + 175$ GeV. For comparison, the excluded region (95% CL) from Fig. 10 for the T1tttt model, which has three-body gluino decay, is shown in red. The small difference between the two boundary curves shows that the limits for the scenarios with two-body gluino decay have only a weak dependence on the top squark mass.

A key feature of the analysis is the use of the variable $M_J$, the sum of the masses of large-R jets, which are formed by clustering anti-$k_T$ $R = 0.4$ jets and leptons. Used in conjunction with the variable $m_T$, the transverse mass of the system consisting of the lepton and the missing transverse momentum vector, $M_J$ provides a powerful background estimation method that is well suited to this high jet multiplicity search.
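As a hedged sketch of the two variables, the code below computes the sum of large-R jet masses from constituent four-vectors and the transverse mass from the lepton and missing transverse momentum, using the standard massless-lepton formula. The input format, function names, and numbers are assumptions chosen for illustration, and the clustering step that builds the large-R jets is not shown.

```python
import math


def jet_mass(constituents):
    """Invariant mass of a jet from constituent four-vectors (E, px, py, pz) in GeV."""
    e = sum(c[0] for c in constituents)
    px = sum(c[1] for c in constituents)
    py = sum(c[2] for c in constituents)
    pz = sum(c[3] for c in constituents)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))   # guard against small negative rounding


def sum_of_jet_masses(large_r_jets):
    """M_J: sum of the masses of all large-R jets in the event."""
    return sum(jet_mass(jet) for jet in large_r_jets)


def transverse_mass(lepton_pt, met, delta_phi):
    """m_T of the lepton plus missing-momentum system, lepton mass neglected."""
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(delta_phi)))


# Hypothetical event: one two-constituent jet and one single-constituent jet.
jets = [
    [(50.0, 50.0, 0.0, 0.0), (50.0, 0.0, 50.0, 0.0)],
    [(100.0, 0.0, 0.0, 100.0)],
]
m_j = sum_of_jet_masses(jets)
m_t = transverse_mass(lepton_pt=100.0, met=100.0, delta_phi=math.pi)
print(f"M_J = {m_j:.1f} GeV, m_T = {m_t:.1f} GeV")
```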
After the baseline selection is applied, signal (R4) and control regions (R1, R2, and R3) are defined in the $M_J$-$m_T$ plane, which are further divided into bins of $E_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$ to provide additional sensitivity. In regions R3 and R4, the requirement $m_T > 140$ GeV provides strong suppression of the single-lepton $t\bar{t}$ background, so that dilepton $t\bar{t}$ events dominate over all other background sources. For these dilepton events to enter a signal region, however, they must contain a substantial amount of initial-state radiation (ISR). For this extreme range of ISR jet momentum and multiplicity, the single-lepton and dilepton $t\bar{t}$ events have very similar kinematic properties. The variables $M_J$ and $m_T$ are nearly uncorrelated, even though different processes dominate the low- and high-$m_T$ regions. As a consequence, the low-$m_T$ regions (R1 and R2) can be used to measure the background shape for the $M_J$ distribution at high $m_T$. A correction factor, near unity, is taken from simulation and is used to account for a possible correlation between $M_J$ and $m_T$.
The observed event yields in the signal regions are consistent with the predictions for the SM background contributions, and exclusion limits are set on the gluino pair production cross sections in the $m_{\tilde{g}}$-$m_{\tilde{\chi}^0_1}$ plane, as described by the simplified models T1tttt and T5tttt, where the latter is augmented with a model of direct top squark pair production for consistency. In the T1tttt model, gluinos decay via the three-body process $\tilde{g} \to t\bar{t}\tilde{\chi}^0_1$, which proceeds via a virtual top squark in the intermediate state. Under the assumption of a 100% branching fraction to this final state, the cross section limit for each model point is compared with the theoretical cross section to determine the excluded particle masses. Gluinos with a mass below 1600 GeV are excluded at a 95% CL for scenarios with low $\tilde{\chi}^0_1$ mass, and neutralinos with a mass below 800 GeV are excluded for a gluino mass of about 1300 GeV. In the T5tttt model, the top squark is lighter than the gluino, which therefore decays via a two-body process. The boundary of the excluded region in the $m_{\tilde{g}}$-$m_{\tilde{\chi}^0_1}$ plane for T5tttt is found to be only weakly sensitive to the top squark mass. These results significantly extend the sensitivity of single-lepton searches based on data at $\sqrt{s} = 8$ TeV.

Figure 1: Gluino pair production and decay for the simplified models T1tttt (left) and T5tttt (right). In T1tttt, the gluino undergoes three-body decay $\tilde{g} \to t\bar{t}\tilde{\chi}^0_1$ via a virtual intermediate top squark. In T5tttt, the gluino decays via the sequential two-body process $\tilde{g} \to \tilde{t}_1 \bar{t}$, $\tilde{t}_1 \to t\tilde{\chi}^0_1$. Because gluinos are Majorana particles, each one can decay to $\tilde{t}_1 \bar{t}$ and to the charge conjugate final state $\bar{\tilde{t}}_1 t$.

Figure 3: Distribution of $m_T$ in data and simulated event samples after the baseline selection is applied. The background contributions shown here are from simulation, and their total yield is normalized to the number of events observed in data. The signal distributions are normalized to the expected cross sections. The dashed vertical line indicates the $m_T > 140$ GeV threshold that separates the signal regions from the control samples.

Figure 5: Comparison of $N_{\text{jets}}$ and $M_J$ distributions, normalized to the same area, in simulated $t\bar{t}$ events with two true leptons at high $m_T$ and one true lepton at low $m_T$, after the baseline selection is applied. The shapes of these distributions are similar. These two contributions are the dominant backgrounds in their respective $m_T$ regions. The dashed vertical line on the right-hand plot indicates the $M_J > 400$ GeV threshold that separates the signal regions from the control samples. The shaded region corresponding to $M_J < 250$ GeV is not used in the background estimation.

Figure 6: The ratio $R(m_T)$ of high-$m_T$ (R3 and R4) to low-$m_T$ (R1 and R2) event yields for the simulated SM background, as a function of $N_{\text{jets}}$ and $N_b$. The baseline selection requires $N_{\text{jets}} \geq 6$. The uncertainties shown are statistical only.

Figure 7: Values of the double ratio $\kappa$ in each of the 10 signal bins, calculated using the simulated SM background. The $\kappa$ factors are close to unity, indicating the small correlation between $M_J$ and $m_T$. The uncertainties shown are statistical only.

Figure 8: Two-dimensional distributions for data and simulated event samples in the variables $m_T$ and $M_J$ in the $N_b \geq 2$ region after the baseline selection. The distributions integrate over the $N_{\text{jets}}$ and $E_T^{\text{miss}}$ bins. The black dots are the data; the colored histogram is the total simulated background, normalized to the data; and the red dots are a particular signal sample drawn from the expected distribution for gluino pair production in the T1tttt model with $m_{\tilde{g}} = 1500$ GeV and $m_{\tilde{\chi}^0_1} = 100$ GeV for 2.3 fb$^{-1}$. Overflow events are shown on the edges of the plot. The definitions of the signal and control regions are the same as those shown in Fig. 4.

Figure 9: Comparison of the $M_J$ distributions for low and high $m_T$ in data with $N_b = 1$ (left) and $N_b \geq 2$ (right) after the baseline selection. The expected $M_J$ distributions of the two benchmark T1tttt scenarios for $m_T > 140$ GeV are overlaid. The distributions integrate over the $N_{\text{jets}}$ and $E_T^{\text{miss}}$ bins. The low-$m_T$ distribution is normalized to the number of events in the high-$m_T$ region. The dashed vertical lines indicate the $M_J > 400$ GeV threshold that separates the signal regions from the control samples.
Figure 10: Interpretation of results in the T1tttt model. The colored regions show the upper limits (95% CL) on the production cross section (in pb) for $pp \to \tilde{g}\tilde{g}$, $\tilde{g} \to t\bar{t}\tilde{\chi}^0_1$ in the $m_{\tilde{g}}$-$m_{\tilde{\chi}^0_1}$ plane. The curves show the expected and observed limits on the corresponding SUSY particle masses obtained by comparing the excluded cross section with theoretical cross sections.


Table 2: Summary of uncertainties in the background predictions. All entries in the table except for data sample size correspond to a relative uncertainty on $\kappa$. The ranges indicate the spread of each uncertainty across the signal bins. Uncertainties from a particular source are treated as fully correlated across bins, while uncertainties from different sources are treated as uncorrelated.

Table 4: Typical values of the signal-related systematic uncertainties. Uncertainties due to a particular source are treated as fully correlated between bins, while uncertainties due to different sources are treated as uncorrelated.