) Search for pair production of scalar leptoquarks decaying into first-or second-generation leptons and top quarks in proton-proton collisions at √s = 13 TeV with the ATLAS detector

A search for pair production of scalar lepto-quarks, each decaying into either an electron or a muon and a top quark, is presented. This is the ﬁrst leptoquark search using ATLAS data to investigate top-philic cross-generational couplings that could provide explanations for recently observed anomalies in B meson decays. This analysis targets high leptoquark masses which cause the decay products of each resultant top quark to be contained within a single high-p T large-radius jet. The full Run 2 dataset is exploited, consisting of 139 fb − 1 of data collected from proton–proton collisions at √ s = 13 TeV from 2015 to 2018 with the ATLAS detector at the CERN Large Hadron Collider. In the absence of any signiﬁcant deviation from the background expectation, lower limits on the leptoquark masses are set at 1480 GeV and 1470 GeV for the electron and muon channel, respectively.


Introduction
The quark and lepton sectors of the Standard Model (SM) are interestingly similar, motivating one to hypothesize a fundamental symmetry between the two sectors.Such a symmetry can be found in many grand unified theories, such as grand unified SU (5) [1], the Pati-Salam model based on SU(4) [2], or R-parity-violating (RPV) supersymmetry (SUSY) models [3].These models predict a new class of bosons carrying both lepton and baryon number, called leptoquarks (LQs).LQs are hypothetical colour-triplet bosons which couple directly to quarks and leptons.They can be of either scalar or vector nature, and carry fractional electric charge.The production cross section of vector LQs could be enhanced relative to that of scalar LQs due to the existence of a massive gluon partner in the minimal set of vector companions [4].
LQs have recently gained attention as they provide an attractive explanation of the recent hints of possible lepton-flavour-universality violation from the observed  meson decay anomalies in BaBar [5], Belle [6] and LHCb [7][8][9].Single scalar (S 3 ) or vector (U 3 ) LQ triplet models [10], as well as a mixed model of a doublet and a singlet LQ (R 2 + U 1 ) [11] are possible solutions to the flavour-changing neutral current  anomaly.LQs are also motivated by a long-standing deviation from the SM in the anomalous muon magnetic dipole moment measured with the E821 experiment at Brookhaven National Laboratory [12,13].At the Large Hadron Collider (LHC) [14], LQs could be produced in pairs, or singly in association with a lepton.This analysis targets LQ pair production, which is dominated by strong interactions and largely insensitive to the Yukawa coupling at a LQ-lepton-quark vertex.The lowest-order Feynman diagrams are shown in Figure 1.Gluon-initiated processes dominate for LQ masses less than 1.5 TeV.The t-channel lepton exchange process contributes to the cross section at the 10% level, and is thus neglected in this analysis [15,16].Only scalar LQ production is considered because this is less model dependent than vector LQ production.The LQ-lepton-quark couplings are determined by two parameters: a model parameter  and the coupling parameter .The coupling to charged leptons is given by √ , and the coupling to neutrinos by √︁ 1 − .
Most previous searches have assumed that leptoquarks couple to quarks and leptons of the same generation.
Recently, there have been dedicated searches at the LHC for LQ pair production in the LQ → ℓ, LQ → ℓ, LQ → ℓ, and LQ →  channels using the full Run 2 proton-proton ( ) collision dataset collected at √  = 13 TeV [17,18].The results presented here pertain to the search for cross-generational leptoquarks with decays into a top quark and an electron or a top quark and a muon, in which both top quarks decay hadronically.It is optimized for LQ masses larger than 1 TeV, for which the top quarks tend to be boosted.Therefore, the signature considered is a pair of same-flavour opposite-sign leptons and a pair of large-radius (large-) jets.Simultaneous couplings of LQs to the first-and second-generation leptons are tightly constrained by the measurements of rare lepton-flavour-violating decays [19], and thus not considered in this paper.A boosted decision tree (BDT) approach, based on kinematic variables and jet substructure variables, is applied to classify events as originating from the signal or background processes in the signal region.Dedicated control regions are constructed to control the normalization of the dominant backgrounds:  t and  + jets production.The extraction of the signal strength is performed through a simultaneous likelihood fit to the BDT discriminant distribution and the control region yields.The LQ→   and LQ→  channels have not been examined previously in ATLAS.The CMS Collaboration has published a search using 35.9 fb −1 of data collected in 2015-2016 that excluded masses below 1420 GeV for scalar LQs decaying exclusively into   [20].

ATLAS detector
The ATLAS detector [21][22][23] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry that covers nearly the entire solid angle around the collision point.It consists of an inner detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer.The inner detector covers the pseudorapidity range1 || < 2.5.It consists of a silicon pixel detector, including the insertable B-layer installed after Run 1 of the LHC, and a silicon microstrip detector surrounding the pixel detector, followed by a transition radiation straw-tube tracker.Lead/liquid-argon sampling calorimeters provide electromagnetic energy measurements with high granularity and a steel/scintillator-tile hadron calorimeter covers the central pseudorapidity range (|| < 1.7).The endcap and forward regions are instrumented with liquid-argon calorimeters for both the electromagnetic and hadronic energy measurements up to || = 4.9.The outer part of the detector consists of a muon spectrometer (MS) with high-precision tracking chambers for coverage up to || = 2.7, fast detectors for triggering over || < 2.4, and three large superconducting toroid magnets with eight coils each.The ATLAS detector has a two-level trigger system to select events for offline analysis [24].

Data and simulation samples
The data utilized in this search correspond to 139 fb −1 of integrated luminosity from   collisions at √  = 13 TeV collected with the ATLAS detector.Only data collected during stable beam conditions with all ATLAS detector subsystems operational are considered.Simulated events with pair-produced scalar LQs were generated at next-to-leading order (NLO) in quantum chromodynamics (QCD) [16] with M G 5_aMC@NLO 2.6.0 [25] using the NNPDF3.0nlo[26] parton distribution function (PDF) set and  S = 0.118.M G was interfaced with P 8.230 [27] using the A14 set of tuned parameters (tune) [28] and the NNPDF2.3loset of PDFs [29] for the underlyingevent description, parton showering, and hadronization.The LQ pair-production cross section was obtained from the calculation of direct top-squark pair production, which was computed at next-to-next-to-leading order (NNLO) in QCD with resummation of next-to-next-to-leading logarithmic (NNLL) soft gluon terms, as they have the same production modes [30][31][32][33].Theoretical uncertainties were evaluated from variations of factorization and renormalization scales,  S , and PDFs.Only LQs coupling to the third-generation quarks and either exclusively to the first-generation leptons or exclusively to the second-generation leptons were considered.To ensure that LQs decay promptly, the coupling parameter  was set to give a LQ width of about 0.2% of its mass.M S [34,35] was used to decay top quarks while preserving the spin-correlation and finite-width effects.For this analysis, signal samples were produced for LQ mass values from 900 GeV to 2000 GeV, with a 100 GeV step size in general and a finer 50 GeV step size near the expected LQ mass exclusion limits, and  = 1.0 with fully hadronic top decays.
The dominant backgrounds in this search are  + jets and  t production, with two leptons in the final state.Sources of smaller backgrounds considered include single top quark,  t ( = , ), diboson ( ,  , ), and  + jets production.The background contribution from multi-jet production was found to be negligible and is not considered in this search.
The  + jets,  + jets and diboson samples were generated using S 2.2.1 [36] with the NNPDF3.0nnloPDF set.The  + jets and  + jets samples were normalized to the NNLO cross sections calculated with FEWZ [37].Matrix elements were calculated for up to two partons at NLO and four partons at leading order (LO) using C [38] and O L [39][40][41] matrix-element generators, and merged with the S parton shower [42] using the ME+PS@NLO prescription [43][44][45][46].For the diboson samples, matrix elements were calculated for up to one parton at NLO and three partons at LO using C and O L matrix-element generators, and merged with the S parton shower using the ME+PS@NLO prescription.
The nominal  t and single-top event samples in the -, t-and s-channels were simulated with P -B v2 [47][48][49][50][51][52] which provides matrix elements at NLO in  S with the NNPDF3.0nloPDF set.The P -B event generator was interfaced with P 8.230 for the parton shower and hadronization, using the A14 tune and the NNPDF2.3loPDF set.The NLO radiation factor, ℎ damp , was set to 1.5 times the mass of the top quark,  top .The diagram removal (DR) method was used to remove the interference between -channel single-top production and  t production [53].The related uncertainty is estimated by comparison with an alternative sample generated using the diagram subtraction (DS) scheme [53,54].The  t samples were normalized to the NNLO cross section with soft-gluon resummation to NNLL accuracy using T ++ 2.0 [55][56][57][58][59][60][61].The single-top cross sections for the t-and s-channels are normalized to their NLO predictions using H 2.1 [62,63], while for the -channel the cross section is normalized to its NLO+NNLL prediction [64,65].To estimate the modelling uncertainties from the choice of generator and parton shower, alternative samples were generated at NLO for both the  t and single-top events using M G 5_aMC@NLO 2.6.0 interfaced to P 8.230, and P -B v2 interfaced to H 7.04 [66,67], respectively.
The  t samples were simulated using M G 5_aMC@NLO v2.3.3 [25] at NLO in  S with the NNPDF3.0nloPDF set.M G was interfaced with P 8.210 [27] using the A14 tune and NNPDF2.3loPDF set for parton showering and hadronization.The cross sections of the samples were calculated at NLO QCD and NLO EW accuracy using M G 5_aMC@NLO as reported in Ref. [68].
In the case of  tℓℓ the cross section is additionally scaled by an off-shell correction estimated at one-loop level in  S .
The  t events used E G 1.2.0 [69] to simulate the modelling of -and -hadron decays, and all other simulated events, except those generated by S , used E G 1.6.0.
All simulated event samples for the nominal predictions were passed through the ATLAS simulation infrastructure [70], using the full G 4 [71] simulation of the ATLAS detector.The alternative  t and single-top generator samples were processed with a fast simulation [72] of the ATLAS detector with parameterized showers in the calorimeters.Simulated events were then reconstructed using the same software as used for the data, and overlaid with additional   collisions in the same or nearby bunch crossings (pile-up) simulated using the soft QCD processes of the P 8.186 [73] generator with the NNPDF2.3loPDF set and the A3 tune [74].The Monte Carlo samples were reweighted to match the distribution of the number of pile-up interactions to the data.

Analysis object selection
A set of physics objects (electrons, muons, jets, and missing transverse momentum) are reconstructed using an optimized combination of information from the various subsystems of the ATLAS detector.The reconstructed primary vertex of the event is required to have at least two associated ID tracks with  T > 0.5 GeV.If more than one primary vertex candidate is reconstructed in an event, the vertex with the largest  2  T of all associated tracks is considered as the hard-scatter vertex.Electron candidates are reconstructed from clusters of energy deposits in the electromagnetic calorimeter associated with a charged-particle track reconstructed in the ID.To ensure that electron candidates originate from the primary vertex, they are required to possess | 0 |/  0 < 5 and | 0 sin | < 0.5 mm, where  0 ( 0 ) is the transverse (longitudinal) impact parameter relative to the primary vertex and   0 is the uncertainty in  0 .The electron candidates are required to satisfy the Tight likelihood identification criteria for high purity [75].High-purity candidates must fulfil the Loose isolation criteria with fixed cuts on isolation variables to further suppress background contributions from hadrons that are misidentified as electrons [75].Additionally, the electrons are required to have  T > 30 GeV and pseudorapidity || < 2.47, while excluding those in the barrel-endcap transition region (1.37 < || < 1.52) of the electromagnetic calorimeters.
Muon candidates are reconstructed from a combined measurement of tracks in the inner detector and the muon spectrometer.The associated tracks must point to the primary vertex by satisfying | 0 |/  0 < 3 and | 0 sin | < 0.5 mm.The muon candidates are required to satisfy the Medium muon identification selection criteria [76] if the leading muon's  T is below 800 GeV; otherwise, tighter High-Pt muon identification requirements [76] are applied to guarantee the best muon resolution and removal of poorly measured tracks in the high- T regime.To reject background from muons originating from hadron decays, the FixedCutTightTrackOnly track-based isolation criterion is applied, with a wider isolation cone used for  T > 50 GeV [76].The muons are required to have  T > 30 GeV and || < 2.5.
Small-radius (Small-) jets are reconstructed using the anti-  algorithm [77,78] with a radius parameter of  = 0.4 and with particle-flow objects [79,80] as inputs.These particle-flow objects are typically either charged-particle tracks that originate from the hard-scatter vertex and are matched to a set of topo-clusters in the calorimeters [81], or the remaining calorimeter energy clusters after the subtraction of calorimeter energy associated with those charged-particle tracks.Small- jets are considered if they satisfy  T > 25 GeV and || < 2.5.For small- jets with  T < 60 GeV and || < 2.4, a multivariate jet vertex tagger is employed to reduce contamination by jets coming from pile-up [82].Small- jets are only used for the object overlap removal discussed below and for the event kinematic reconstruction discussed in Section 5.2.
Large- jets are reconstructed from topo-clusters of energy deposits in the calorimeters using the anti-  algorithm with a radius parameter of  = 1.0.To remove contributions from pile-up, the   -based trimming algorithm [83][84][85][86] is employed to recluster jet constituents into subjets with a finer -parameter value of 0.2 and discard subjets with energy less than 5% of the large- jet's energy [87].Trimmed large- jets are required to have  T > 200 GeV, || < 2.0 and jet mass  > 50 GeV.To identify large- jets that are likely to have originated from the hadronic decay of a top quark, jet substructure information is exploited as inputs to the BDT model in the muon channel, as discussed in Section 5.2, using the N-subjettiness ratio  32 [88,89], the splitting measure √  23 [90] and the   variables [91].
The missing transverse momentum,  miss T , in a given reconstructed event is computed as the magnitude of the negative vector sum of the  T of all reconstructed leptons and small- jets.A track-based soft term is also included in the  miss T calculation to account for the 'soft' energy from inner detector tracks that are not matched to any of the selected objects but are consistent with originating from the primary vertex [92,93].
To avoid double counting of the same object in different reconstructed object types, an overlap removal procedure is applied to specific pairs of objects that either share a track or have small separation in Δ.Electron candidates are discarded if they are found to share a track with a more energetic electron or a muon.For overlapping small- jets and electrons, small- jets within Δ = 0.2 of a reconstructed electron are removed.If the nearest surviving small- jet is within Δ = 0.4 of the electron, then the electron is discarded.To reject hadronic jet candidates produced by bremsstrahlung from very energetic muons, the jet is required to have at least three associated tracks if it lies within a cone of Δ = 0.2 around a muon candidate.However, if a surviving jet is separated from the nearest muon with transverse momentum   T by Δ < 0.04 + 10 GeV/  T up to a maximum of 0.4, the small- jet is kept and the muon is removed instead; this reduces the background contributions due to muons from hadron decays.No dedicated overlap-removal procedure between large- and small- jets is performed.As high- T electrons could deposit significant amounts of energy in the calorimeter to form large- jets, the electron energy is removed from any overlapping large- jets before the jet momentum requirements are applied to avoid double counting the electrons as large- jets.This approach has a 20% better signal efficiency compared to rejecting large- jets that overlap with a reconstructed electron.

Event selection
In the signal region (SR), events were recorded using either a set of single-electron triggers or a set of single-muon triggers.The single-electron triggers imposed a  T threshold of 26 GeV (24 GeV in 2015) and isolation requirements, or a  T threshold of 60 GeV and no isolation requirements [94].The single-muon triggers accepted an isolated muon with  T > 26 GeV (20 GeV in 2015) or any muon with  T > 50 GeV [95].Events with exactly two opposite-sign, same-flavour leptons with  T above 100 GeV are considered.Events must also have at least two large- jets.In addition, events containing a lepton pair with invariant mass below 120 GeV are removed to reduce background contributions from low-mass resonances.In the SR, the dominant backgrounds are from the  t and  + jets processes.The LQ signal,  t and  + jets events which satisfy these SR criteria are used to train a BDT for signal and background classification.
Dedicated control regions (CRs) are defined in order to extract the normalization of the  t and  + jets backgrounds from data.For the  t-enriched CR, the selection criteria are the same as in the SR, except that either a single-electron trigger or a single-muon trigger must be satisfied and events must contain exactly one opposite-sign electron-muon pair.The  + jets-enriched CR is kept orthogonal to the SR by selecting data in a dilepton invariant mass window 70 <  ℓℓ < 110 GeV around the  boson mass.A summary of the event selections for the signal and control regions is given in Table 1.
The expected numbers of events in the SR for the background processes and signal hypothesis with mass  LQ = 1500 GeV are shown in Table 2.For a signal model with  = 1 and a fully hadronic top-quark final state, the acceptance times efficiency of the SR selection, for LQ masses from  LQ = 900 to 2000 GeV, ranges from 32% to 49% in the electron channel, and from 36% to 43% in the muon channel.

Signal region BDT classification
A BDT classifier is trained in the SR to further separate the signal from the backgrounds.A gradient boosting approach is used with the XGB framework [96] as the back end for mathematical computations.
The gradient boosting algorithm contains at most 1000 trees with a maximal tree depth of 3, while early stopping is employed if no improvement in the classification is found after 10 iterations of the trees.To avoid overtraining the classifier, nested cross validation [97] was performed to obtain an unbiased evaluation of the classifier performance.The classifier produces an output score referring to the predicted probability that the event contains LQs, which is then used as the final discriminant to separate LQ signal events from the SM backgrounds.
A natural basis of kinematic observables can be created, utilizing Lorentz symmetry to reduce unnecessary duplication of observables, in the rest frames of intermediate particle states, conditioned on the hypotheses of LQ pair, dileptonic  t or  + jets decay processes.A suite of such discriminating variables is constructed using the recursive jigsaw reconstruction technique [98], and is provided as inputs to the classifier.The dileptonic  t reconstruction scheme is based on the 'min Δ top approach' of the recursive jigsaw reconstruction technique, in which the two leading small- jets are used as the -quark candidates from the top-quark decays.Variables related to hadronic and leptonic activity, missing transverse momentum and jet substructure are also used to provide additional separation power.Large- jet substructure variables are only used in the muon channel.As discussed in Section 4, the energy of electrons overlapping with large- jets is subtracted from the jet four-momentum to avoid double counting.Such kinematic modification of large- jets is incompatible with the use of substructure variables.In total, 29 inputs are used in the BDT classifier in the electron channel and 32 in the muon channel.The top five discriminating variables are the dilepton invariant mass, the scalar  T sum of the two leptons, the two large- jet masses, and the reconstructed LQ mass. Figure 2 shows the distributions of the dilepton invariant mass in the  + jets CR of the muon channel, and the reconstructed  mass based on a dileptonic  t hypothesis in the  t CR of the electron channel.In general, the kinematic variables show good agreement between data and the background expectation in the CRs.A complete list of the input variables is provided in Table 3.In order to maximize the sensitivity of the BDT over a wide mass range, and ensure a smooth interpolation of the signal efficiency between the mass points where it was trained, a parameterized machine-learning approach [99] is implemented.The inputs to the BDT classifier are expanded to include the theoretical LQ mass, resulting in a single BDT classifier that smoothly provides optimized discrimination across a range of masses, from 900 GeV to 2000 GeV.The parameterized BDT was trained with large samples of simulated signal events at  LQ values from 900 GeV to 1900 GeV, with a 200 GeV step size.The modelling of the BDT distribution of the main backgrounds is validated using events in the  + jets and  t control regions, as shown in Figure 3.The number of bins and their boundaries in the SR are optimized to maximize the expected scalar leptoquark sensitivity while ensuring a minimum of three background events in the highest BDT bin.It was found that having three bins in the SR was optimal, which are defined as the low, mid and high BDT SR.Of the signal events which enter the signal region, over 94% fall into the high BDT SR while only 1% and 8% of the  t and  + jets background do so, respectively.

Systematic uncertainties
The systematic uncertainties are broken down into three broad categories: luminosity and cross-section uncertainties, detector-related experimental uncertainties, and modelling uncertainties in simulated background processes.The uncertainty from each source is treated as a Gaussian-distributed or log-normal nuisance parameter in a profile-likelihood fit of the CR normalizations and BDT output score distributions, and shape effects are taken into account where relevant.Due to the tight selection criteria applied and resultant statistical limitation to the sensitivity, the systematic uncertainties only mildly degrade the sensitivity of the search.

Luminosity and normalization uncertainties
The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [100], obtained using the LUCID-2 detector [101] for the primary luminosity measurements.
Theoretical cross-section uncertainties are applied to the various simulated samples.For the LQ signal, PDF,  S and scale uncertainties are considered in the approximate NNLO + NNLL calculation of the cross section.The PDF and  S uncertainties are estimated from the PDF4LHC15 error set [102].The effect of uncertainties in the renormalization and factorization scales is estimated from variations by a factor of two about the central scales.The overall uncertainty ranges from 10% at low LQ masses to 25% at 2 TeV [30][31][32][33].This cross-section uncertainty is not included in the profile-likelihood fit, but represented by an uncertainty band around the theoretical prediction in the cross-section limit plots in Section 8.The uncertainties for +jets and diboson production are both assumed to be 50% [103,104].For single top quark and  t production, the uncertainties are taken as 7% [62,63] and 30% [105], respectively.The normalizations of  t and  + jets are determined from data via unconstrained normalization parameters.

Detector-related uncertainties
The dominant sources of detector-related uncertainties in the signal and background yields relate to the lepton identification efficiency scale factors that are used to correct for the difference between the Monte Carlo simulation and data.These uncertainties have an impact on the fitted signal yield of roughly 12% and 5% in the electron and muon channel respectively.Additional uncertainties to account for the degradation of the muon momentum resolution due to the impact of possible misalignment between layers of the MS, as well as between the MS and the ID, were estimated to be 5%.
Uncertainties in the small- and large- jet energy scales and resolutions are also considered.The small- and large- jet energy scales and their uncertainties are derived by combining information from test-beam data, LHC collision data and simulation [106].The uncertainties in the jet energy scale have an impact of up to ∼4% on the fitted signal yield.Moreover, in the case where an electron overlaps with a large- jet, the impact on the jet energy scale calibration due to the analysis-specific removal of the electron energy from the large- jets was evaluated.The jet axis shift and the fraction of calibrated jet energy contributed by the overlapping electrons were studied in simulated events.These additional jet systematic uncertainties have an impact of < 3% on the signal yield.
Other detector-related uncertainties come from uncertainties in the large- jet mass scales and resolutions; lepton isolation and reconstruction; lepton trigger efficiencies, energy scales, and resolutions; the  miss T reconstruction; pile-up modelling; and the jet-vertex-tagger requirement.Uncertainties in the object momenta are propagated to the  miss T measurement, and additional uncertainties in  miss T arising from the 'soft' energy are also considered.These all have negligible impact on the fitted signal yield (<3% each).

Generator modelling uncertainties
Modelling uncertainties are estimated for the signal as well as +jets,  t and single-top-quark backgrounds.The modelling uncertainties are estimated by comparing simulated samples generated with different configurations, described in Section 3.
For the LQ signal, in addition to the cross-section uncertainties, the impact on the acceptance due to variations of the QCD scales, PDF and shower parameters was studied.These uncertainties were estimated from the envelope of independent pairs of renormalization and factorization scale variations by a factor of 0.5 and 2, by propagating the PDF and  S uncertainties following the PDF4LHC15 prescription, and by considering two alternative samples generated with settings that increase or decrease the amount of QCD radiation.Both the PDF and scale variations have an impact below 15% for all bins considered, while variations of the underlying-event modelling have only a 1-2% effect.
For the +jets backgrounds, scale, PDF and  S variations are considered and their effects are evaluated within the S event generator.Seven variations are considered for the renormalization and factorization scales, with the maximum shift within the envelope of those variations taken to estimate the effect of the scale uncertainty.The PDF variations include the variation of the nominal NNPDF3.0nnloPDF as well as the central values of two other PDF sets, MMHT2014nnlo68cl [107] and CT14nnlo [108].The intra-PDF uncertainty is estimated as the standard deviation of the 100 variations of the NNPDF3.0nnloset.The envelope of the differences between the nominal and alternative PDF sets is used as an additional nuisance parameter.The effect of varying  S from its nominal value of 0.118 by ±0.001 is also considered.The dominant effect is from the renormalization and factorization scale variations and is about 6% of the signal yield.
For the  t background, four sources of modelling uncertainties are considered.The uncertainty in the matrix-element calculation is estimated by comparing events generated with two different Monte Carlo generators, M G 5_aMC@NLO and P -B , while keeping the same parton shower model.The uncertainty in the fragmentation, hadronization and underlying-event modelling is estimated by comparing two different parton shower models, P and H , while keeping the same hard-scatter matrix-element calculation.The effects of extra initial-and final-state gluon radiation are estimated by comparing simulated samples generated with enhanced or reduced initial-state radiation, doubling the ℎ damp parameter, and using different values of the radiation parameters [54].The PDF uncertainty is estimated from the PDF4LHC15 error set.The dominant effect is from the final-state radiation estimation uncertainty and is about 6% of the signal yield.
In this analysis, the single-top-quark background comes mainly from the -channel and is a minor background.Similarly to  t, uncertainties in the hard-scatter generation, the fragmentation and hadronization, the amount of additional radiation, and the PDF are considered.In addition, the uncertainty due to the treatment of the overlap between -channel single top quark production and  t production is considered by comparing samples using the DS and DR methods (see Section 3).The dominant effect is from the uncertainty in the fragmentation and hadronization and is about 7% of the signal yield.

Statistical interpretation
The binned distributions of the BDT score in the SR and the overall number of events in the  t and  + jets CR are used to test for the presence of a signal.Hypothesis testing is performed using a modified frequentist method as implemented in R S [109,110] and is based on a profile likelihood that takes into account the systematic uncertainties as nuisance parameters that are fitted to the data.A simultaneous fit is performed in the SR and the two CRs, but done separately for the electron and muon channel.As the  t CR is built requiring an electron and muon, the same events are considered in the independent electron and muon channel fits.
The statistical analysis is based on a binned likelihood function L (, ) constructed as a product of Poisson probability terms over all bins considered in the search.This function depends on the signal strength parameter , a multiplicative factor applied to the theoretical signal production cross section, and , a set of nuisance parameters that encode the effect of systematic uncertainties in the signal and background expectations and are implemented in the likelihood function as Gaussian and log-normal constraints.Uncertainties in each bin due to the finite size of the simulated samples are also taken into account via dedicated constrained fit parameters.There are enough events in the CRs and the lowest BDT bin in the SR, where the signal contribution is small, to obtain a data-driven estimate of the  t and  + jets normalizations and hence the normalizations of those two backgrounds are included as unconstrained nuisance parameters,   t and   .Nuisance parameters representing systematic uncertainties are only included in the likelihood if either of the following conditions are met: the overall impact on the normalization in a given region is larger than 3%, or any single bin within the region has at least a 3% uncertainty.This is done separately for each region and for each template (signal or background).When the bin-by-bin statistical variation of a given uncertainty is significant, a smoothing algorithm is applied.
The test statistic   is defined as the profile likelihood ratio,   = −2ln(L (, θ )/L ( μ, θ)), where μ and θ are the values of the parameters that maximize the likelihood function, and θ are the values of the nuisance parameters that maximize the likelihood function for a given value of .The compatibility of the observed data with the background-only hypothesis is tested by setting  = 0 in the profile likelihood ratio:  0 = −2ln(L (0, θ0 )/L ( μ, θ)).Upper limits on the signal production cross section for each of the signal scenarios considered are derived by using   in the so-called CL s method [111,112].For a given signal scenario, values of the production cross section (parameterized by ) yielding CL s < 0.05, where CL s is computed using the asymptotic approximation [113], are excluded at ≥ 95% confidence level (CL).

Likelihood fit results
The expected and observed event yields in the signal and control regions before and after fitting the background-only hypothesis to data, including all uncertainties, are listed in Table 2.The total uncertainty shown in the table is the uncertainty obtained from the full fit, and is therefore not identical to the sum in quadrature of each component, due to the correlations between the fit parameters.A comparison of the post-fit agreement between data and prediction for the signal and control regions is shown in Figure 4.In the electron (muon) channel, the ratio of the  t total post-fit yield over the pre-fit yield is 0.90 ± 0.25 (0.84 ± 0.24).The ratio of the  + jets total post-fit yield over the pre-fit yield is 0.95 ± 0.20 (0.87 ± 0.10).None of the individual uncertainties are significantly constrained by data.
The probability that the data is compatible with the background-only hypothesis is estimated by integrating the distribution of the test statistic, approximated using the asymptotic formulae, above the observed value of  0 .2This value is computed for each signal scenario considered, defined by the assumed mass of the leptoquark.The lowest local -value is found to be ∼11% (10%), for a LQ mass of 1450 (1600) GeV in the electron (muon) channel.Thus no significant excess above the background expectation is found.
Table 2: Event yields in the signal and control regions before and after the background-only fit to data in the electron and muon channel.The quoted uncertainties include statistical and systematic uncertainties; for the  t and  + jets backgrounds no cross-section uncertainty is included since it is a free parameter of the fit.The contributions from single top,  t, diboson and  + jets production are included in the 'Others' category.In the post-fit case, the uncertainties in the individual background components can be larger than the uncertainty in the sum of the backgrounds, due to the correlations between the fit parameters.Both signal models correspond to  LQ = 1500 GeV assuming 100% branching ratio into a hadronically decaying top quark and a charged lepton.

Limits on LQ pair production
Upper limits at the 95% CL on the LQ pair-production cross section, for an assumed value of  = 1, are set as a function of the LQ mass  LQ and compared with the theoretical prediction (Figure 5).The resulting lower limit on  LQ is determined using the central value of the theoretical NNLO+NNLL cross-section prediction.The observed (expected) lower limits on  LQ are found to be 1480 (1560) GeV and 1470 (1540) GeV for the electron and muon channel respectively.The sensitivity of the analysis is limited by the statistical uncertainty of the data.Including all systematic uncertainties degrades the expected mass limits by only around 10 GeV, and for a mass of 1.5 TeV the cross-section limits increase by less than 7% in both the electron and muon channel.
Exclusion limits on LQ pair production are also obtained for different values of  LQ as a function of the branching ratio (B) into a charged lepton and a top quark (Figure 6).The theoretical cross section was scaled by the branching ratio, and then used to obtain the corresponding limit.The full statistical interpretation is performed for each 0.1 step in B, covering the full plane.Figure 5: Upper limits at 95% CL on the cross section of LQ pair production as a function of LQ mass, assuming a branching ratio B (LQ → ℓ ± ) = 1, for the electron (left) and muon (right) channel.Observed limits are shown as a black solid line and expected limits as a black dashed line.The green and yellow shaded bands correspond to ±1 and ±2 standard deviations, respectively, around the expected limit.The red curve and band show the nominal theoretical prediction and its ±1 standard deviation uncertainty.Figure 6: Lower exclusion limits on the leptoquark mass for scalar leptoquark pair production as a function of the branching ratio into a top quark and an electron (left) or a muon (right) at 95% CL.The observed nominal limits are indicated by a black solid curve, with the surrounding red dotted lines obtained by varying the signal cross section by uncertainties from PDFs, renormalization and factorization scales, and the strong coupling constant  S .Expected limits are indicated with a black dashed curve, with the yellow and green bands indicating the ±1 standard deviation and ±2 standard deviation excursions due to experimental and modelling uncertainties.

Conclusion
A search for pair production of scalar leptoquarks, each decaying into a top quark and either an electron or a muon has been presented, targeting the high-mass region in which the decay products of each top quark are contained within a single large-radius jet.The analysis is based on tight selection criteria to reduce the SM backgrounds.The normalizations of the dominant  + jets and  t backgrounds were determined simultaneously in a profile likelihood fit to the binned output score of a boosted decision tree in the signal region and two dedicated control regions.The data used in this search correspond to an integrated luminosity of 139 fb −1 of   collisions with a centre-of-mass energy √  = 13 TeV recorded by the ATLAS experiment in the whole of Run 2 of the LHC.The observed data distributions are compatible with the expected Standard Model background and no significant excess is observed.Lower limits on the leptoquark masses are set at 1480 GeV and 1470 GeV for the electron and muon channel, respectively.

Appendix
Table 3: The discriminating variables used in the signal-background discrimination training can be classified into five different groups.The first three groups include kinematic variables that are physics-based rather than detector-based, conditioned on different physics process hypotheses: LQ, dileptonic  t and leptonic  decay hypothesis.These physics-based kinematic variables include the invariant masses and the momenta of intermediate and final-state particles in their parent's rest frame.In the dileptonic  t decay hypothesis, the combinatoric ambiguity in the small--jet-lepton pairing is resolved using the 'min Δ top approach' of the recursive jigsaw reconstruction technique [98].The reconstructed hemisphere of the decay process associated with the leading (subleading) lepton is labelled with 1 (2).The fourth group of variables is detector-based and defined in the lab frame.These variables are related to the event-level activity of visible objects or missing transverse momentum.The last group of variables is used to identify the three-prong jet structure of hadronic top-quark decays and is used only in the muon channel.

Input variables LQ hypothesis
LQLQ Invariant mass of LQ pair system, reconstructed from two leptons and two large- jets  max ℓ1,J1 The higher mass of the two LQ candidates, with the lepton-jet pair labelled as ℓ1 and J1. min ℓ2,J2 The lower mass of the two LQ candidates, with the lepton-jet pair labelled as ℓ2 and J2.The ratio of  3 to  2 , where N-subjettiness variable   is defined as   = 1 0  ∈jet constituents  T, × min( 1 , ...,    ) with  0 =  ∈jet constituents  T, × , where  is the radius parameter of the jet, and   is the distance between the subjet j and the constituent .WTA denotes the winner-take-all (WTA) recombination scheme [114] used in subjet reconstruction.

𝑄 w
The minimum invariant mass of the two subjets in the second-to-last reclustering step of the   algorithm, applied to a large- jet MVA parameterization  LQ, hypo Set to the test mass point at which the model is utilized.In the training phase, this parameter is set to the corresponding LQ mass for the signal samples, and a uniformly distributed random value from the training set of LQ mass points for the background samples.

Figure 1 :
Figure 1: The lowest-order Feynman diagrams for LQ pair production.In this paper, the t-channel lepton exchange diagram is ignored.

Figure 2 :
Figure 2: Distributions of the reconstructed  mass associated with the leading lepton assuming a dileptonic hypothesis in the  t CR after the simultaneous background-only fit of the electron channel CRs (left), and the dilepton invariant mass  ℓℓ in the  + jets CR after the simultaneous background-only fit of the muon channel CRs (right).The bottom panels show the ratio of data to expected background.The hatched band represents the total uncertainty.

Figure 3 :
Figure 3: Distributions of the BDT output score in the  + jets and  t CRs for the electron (top row) and muon (bottom row) channel after the simultaneous background-only fit of the CRs.The bottom panels show the ratio of data to expected background.The hatched band represents the total uncertainty.All BDT scores correspond to the theoretical LQ mass parameter  LQ, hypo set to 1.5 TeV.The first bin contains all underflow events.

Figure 4 :
Figure 4: Fit results (background-only) for the binned BDT output score distribution in the signal region of the electron (left) and muon (right) channel, and the overall number of events in the  t and  + jets control regions.The lower panel shows the ratio of data to the fitted background yields.The band represents the systematic uncertainty after the maximum-likelihood fit.
ℓ2,J1 Invariant mass of lepton-jet pair ℓ2 and J1  ℓ1,J2 Invariant mass of lepton-jet pair ℓ1 and J2  J1 Invariant mass of large- jet J1  J2 Invariant mass of large- jet J2  LQ ℓ1 Energy of lepton ℓ1 in its LQ parent's rest frame  LQ ℓ2 Energy of lepton ℓ2 in its LQ parent's rest frame  LQ 1 Energy of large- jet J1 in its LQ parent's rest frame  LQ 2 Energy of large- jet J2 in its LQ parent's rest frame Dilepton  t hypothesis   Invariant mass of  t system, reconstructed from two leptons, two resolved jets and  miss T  1 Invariant mass of top quark 1 , reconstructed from  boson 1 and -quark 1  2 Invariant mass of top quark 2 , reconstructed from  boson 2 and -quark 2  1 , swapped Invariant mass of top quark 1 , with its -quark child 1 swapped with that of top quark 2  2 , swapped Invariant mass of top quark 2 , with its -quark child 2 swapped with that of top quark 1   1 Invariant mass of  boson 1 , reconstructed from the leading lepton ℓ1 and  miss T   2 Invariant mass of  boson 2 , reconstructed from the subleading lepton ℓ2 and  miss T   1 Energy of small- jet j1 as -quark candidate 1 in its top quark parent (1 ) rest frame   2 Energy of small- jet j2 as -quark candidate 2 in its top quark parent (2 ) rest frame   ℓ1 Energy of the leading lepton ℓ1 in its  boson parent (1 ) rest frame   ℓ2 Energy of the subleading lepton ℓ2 in its  boson parent (2 ) rest frame  → ℓℓ hypothesis  ℓℓ Invariant mass of the dilepton system  lab T,ℓℓ Transverse momentum of the dilepton system in the lab frame Detector-based  T Scalar  T sum of the two leptons  T Scalar  T sum of the two leading large- jets  T Scalar  T sum of the two leptons and the two leading large- jets  miss T Missing transverse momentum  miss T sig.Missing transverse momentum significance, defined as  miss T / √  T Jet substructure sd 23   splitting scale for the 2nd and 3rd subjet, defined as sd 23 = min(  T,2 ,  T,3 ) × Δ 23  WTA 32