Search for leptoquarks decaying into the bτ final state in pp collisions at √s = 13 TeV with the ATLAS detector

A search for leptoquarks decaying into the bτ final state is performed using Run 2 proton-proton collision data from the Large Hadron Collider, corresponding to an integrated luminosity of 139 fb⁻¹ at √s = 13 TeV recorded by the ATLAS detector. The benchmark models considered in this search are vector leptoquarks with electric charge of 2/3e and scalar leptoquarks with an electric charge of 4/3e. No significant excess above the Standard Model prediction is observed, and 95% confidence level upper limits are set on the cross-section times branching fraction of leptoquarks decaying into bτ. For the vector leptoquark production two models are considered: the Yang-Mills and Minimal coupling models. In the Yang-Mills (Minimal coupling) scenario, vector leptoquarks with a mass below 1.58 (1.35) TeV are excluded for a gauge coupling of 1.0 and below 2.05 (1.99) TeV for a gauge coupling of 2.5. In the case of scalar leptoquarks, masses below 1.28 (1.53) TeV are excluded for a Yukawa coupling of 1.0 (2.5). Finally, an interpretation of the results with minimal model dependence is performed for each of the signal region categories, and limits on the visible cross-section for beyond the Standard Model processes are provided.


Introduction
The existing similarities between the structure of the quark and lepton sectors in the Standard Model (SM) suggest the possibility of a new underlying symmetry in particle physics. Leptoquarks (LQs) that couple to both quarks and leptons, with non-zero baryon and lepton numbers, and fractional electric charges are predicted by several beyond the SM theories that attempt to unify the fundamental interactions, such as technicolour [1][2][3], composite models [4], and grand unification [5][6][7].
Recent results reported by BaBar [8,9], Belle [10] and LHCb [11] show hints of deviations from lepton-flavour universality in B-meson decays into final states with D(*) mesons, which could be caused by the existence of LQs. The 4.2 standard deviation disagreement with respect to the SM prediction observed in the anomalous muon magnetic moment measurement [12], though significantly reduced when updated lattice quantum chromodynamics (QCD) calculations [13] are considered, could be caused by LQ contributions to the muon magnetic moment [14].
In light of the lepton-flavour universality anomalies observed in the B-meson decays into D(*)τν final states, the couplings of LQs to third-generation quarks and leptons are expected to be large [15]. At the LHC, third-generation LQs can be produced singly via quark-gluon fusion and quark-gluon scattering or in pairs via the gluon-gluon fusion process, as shown in the Feynman diagrams in figure 1. The search presented in this paper is optimised for the single production of third-generation LQ via the bg → LQτ → bττ channel, while LQ pair and non-resonant production processes are also considered since they can also contribute to the bττ final state. The single LQ production contribution where T^a = λ^a/2 with λ^a (a = 1, ..., 8) are the Gell-Mann matrices, g_s is the QCD coupling, q_L (ℓ_L) denotes the left-handed quark (lepton) doublets and d_R (e_R) denotes the right-handed down-type quark (charged-lepton) singlets. The i and j indices represent the flavour generation. A summation over the colour indices is performed and omitted for clarity. The term −i g_s (1 − κ) U_1µ^† T^a U_1ν G^{aµν} describes the interaction between U_1 leptoquarks and SM gluon gauge fields G^{aµν}. In this analysis, two vector LQ scenarios are considered: the Yang-Mills (U_1^YM) coupling scenario, κ = 0, and the Minimal (U_1^MIN) coupling scenario, κ = 1. The β_L^{ij} and β_R^{ij} parameters describe the coupling between U_1 leptoquarks and left-handed or right-handed charged leptons and quarks, respectively. In the framework of the U_1^YM and U_1^MIN scenarios, the probability to decay into the b-quark and τ-lepton final state is predicted to be the same as to decay into the top-quark and neutrino final state. Hence, the branching fraction B of the LQ decays into a b-quark and a

JHEP10(2023)001
τ-lepton is set to 0.5. In this search, all of β_R^{ij} are set to zero, β_L^{33} is set to one and other β_L^{ij} are set to zero, such that each LQ decays into a b-quark and a τ-lepton or into a top-quark and a neutrino. Due to these choices, the gauge coupling (λ) between U_1 leptoquarks and third-generation charged leptons and quarks can be written as λ = g_U β_L^{33}/√2.
The scalar LQ model S_1 is also considered, with F = 3B + L = −2 and electric charge of 4/3e [20,21]. There are three parameters in this model: the branching fraction B into charged leptons, the LQ to τb Yukawa coupling parameter λ, and the mass term of the LQ. Following ref. [21], the Lagrangian terms for S_1 LQs related to this analysis are: where C in the superscript stands for the charge conjugation operation. The terms e_R and d_R are the right-handed charged leptons and down-type quarks and λ^{ij} represents the Yukawa couplings between S_1, charged leptons, and quarks, where the ij refers to the generations of the quark and charged lepton. In the framework of the S_1 model, the only non-zero Yukawa coupling considered in this paper is the coupling to a b-quark and a τ-lepton. Since only the coupling to the third-generation charged lepton and quark is considered, λ^{33} = λ is assumed to be different from zero, while the rest of the λ^{ij} are set to zero.
Most of the previous searches for LQs performed by the ATLAS [22-28] and CMS [29-32] collaborations have been conducted on different final states compared to this search. For third-generation LQs, the CMS collaboration has recently published results of searches for LQs decaying into tν and bτ [33]. The ATLAS collaboration performed a search for pair-produced scalar LQs in bτbτ final states with 36 fb⁻¹ of proton-proton collision data at √s = 13 TeV that excluded scalar LQs with masses below 1 TeV, assuming a LQ to bτ branching fraction equal to one [34].
The analysis described in this paper is the first search by the ATLAS collaboration for singly produced LQs decaying into bτ. The search is performed over a LQ mass (m_LQ) in the range of 0.4 TeV to 2.5 TeV. The λ range is chosen to be between 0.5 and 2.5 to cover possible regions where LQs could explain the anomalies observed in the B-meson decay and is extended to large λ where the single LQ production channel provides a significant contribution compared to the pair-production process. In the case of the vector LQ production, the contribution from LQs decaying into tν is neglected.
The analysis starts from the selection of a pair of oppositely charged τ-leptons produced in association with a jet identified as containing a b-hadron (b-jet). The main backgrounds to the search are the tt̄ and tW production processes. Two signatures are considered, containing either a τ_lep τ_had or τ_had τ_had pair, where τ_had (τ_lep) refers to a τ-lepton decaying into hadrons and a neutrino (two neutrinos and an electron or a muon). In each of these two analysis channels, events are classified, based on the transverse momentum (p_T) of the b-jet, in two categories of low and high b-jet p_T. The search for LQs is only performed in the high b-jet p_T category, where the effect from the interference of non-resonant LQ production with SM processes is expected to be small [35]. The non-resonant contribution can be significantly modified by the interference contribution, which depends on the signal model parameters [35,36]. The effect of the interference with SM diagrams, such as those

with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range of |η| < 2.4 with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.
Interesting events are selected by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [38]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz.
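The rate reduction quoted above can be checked with simple arithmetic. The following is a minimal sketch that uses only the three rates given in the text; the constant and function names are illustrative and are not part of any ATLAS software.

```python
# Rates quoted in the text, in Hz.
BUNCH_CROSSING_RATE_HZ = 40e6   # 40 MHz bunch crossings
L1_OUTPUT_RATE_HZ = 100e3       # upper bound on the first-level trigger output
HLT_OUTPUT_RATE_HZ = 1e3        # approximate rate recorded to disk


def rejection_factor(rate_in_hz, rate_out_hz):
    """Return the number of input events per accepted event."""
    if rate_out_hz <= 0:
        raise ValueError("output rate must be positive")
    return rate_in_hz / rate_out_hz


# Minimum rejection at the first level (its output rate is an upper bound).
l1_rejection = rejection_factor(BUNCH_CROSSING_RATE_HZ, L1_OUTPUT_RATE_HZ)
# Additional rejection in the high-level trigger.
hlt_rejection = rejection_factor(L1_OUTPUT_RATE_HZ, HLT_OUTPUT_RATE_HZ)
# Overall reduction from bunch crossings to recorded events.
total_rejection = rejection_factor(BUNCH_CROSSING_RATE_HZ, HLT_OUTPUT_RATE_HZ)
```

With these inputs roughly one bunch crossing in 40 000 is written to disk.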
An extensive software suite [39] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Data and Monte Carlo samples
The data were collected using unprescaled single-lepton and single-τ_had triggers. A more detailed description of the triggers used in the analysis for each data-taking period is given in section 5. Quality criteria are applied to events to ensure that the data were not affected by any hardware- or software-related issues [40].
Monte Carlo (MC) simulated events of single LQs decaying into bτ were produced for masses ranging from 0.4 TeV to 2.5 TeV. The signal samples were produced at leading order (LO) in QCD in the five-flavour scheme using the MadGraph5_aMC@NLO 2.8.1 [41] generator with the NNPDF3.0nnlo [42] parton distribution function (PDF) followed by parton shower (PS) and hadronisation with Pythia 8.244 [43] using the A14 set of tuned parameters (tune) [44] and the NNPDF2.3lo PDF set. Single scalar and vector LQ signal samples were produced with coupling parameters λ from 0.5 to 2.5. The intrinsic width of the LQs increases quadratically with λ and linearly as a function of m_LQ. In the considered range of parameters, the LQ width is 16% or less of the LQ mass. The simulated signal events do not include interference effects with the SM processes. For the vector LQ signal two samples were produced for each λ, one with Yang-Mills coupling (κ = 0) and the other with minimal coupling (κ = 1). The implementation of the signal model is based on that described in refs. [20,21,45].
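The scaling of the width with λ and the LQ mass can be illustrated numerically. This is a sketch under a strong assumption: the proportionality constant below is not taken from the paper but is chosen so that the relative width reaches the quoted 16% bound at the largest coupling in the scan, λ = 2.5. The true constant is model dependent and may differ between the scalar and vector cases.

```python
MAX_WIDTH_FRACTION = 0.16   # quoted upper bound on width / mass
MAX_COUPLING = 2.5          # largest coupling in the scan

# ASSUMPTION: hypothetical normalisation, fixed so that width/mass = 16%
# at coupling 2.5. It is not a value given in the text.
WIDTH_CONSTANT = MAX_WIDTH_FRACTION / MAX_COUPLING**2


def lq_width_tev(mass_tev, coupling):
    """Width that grows quadratically with the coupling and linearly with the mass."""
    return WIDTH_CONSTANT * coupling**2 * mass_tev


def relative_width(coupling):
    """Width divided by mass; independent of the mass for this scaling."""
    return WIDTH_CONSTANT * coupling**2


# Scan the corners of the parameter range quoted in the text.
for mass_tev in (0.4, 2.5):
    for coupling in (0.5, 1.0, 2.5):
        width = lq_width_tev(mass_tev, coupling)
        print(f"m_LQ = {mass_tev} TeV, lambda = {coupling}: width = {width:.4f} TeV")
```

Because the width over mass depends only on λ for this scaling, the 16% bound is set entirely by the largest coupling considered.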
Simulated events with pair-produced scalar LQs were generated at next-to-leading order (NLO) in QCD with MadGraph5_aMC@NLO 2.6.0, using the LQ model described in ref. [46], which adds PS to the fixed-order NLO QCD calculations [47,48] interfaced to Pythia 8.230 for the PS and hadronisation. Parton luminosities are provided by the five-flavour scheme NNPDF3.0nlo PDF set with a value of the strong coupling constant α_s = 0.118, and the underlying event was modelled with the A14 tune. The LQ pair-production cross-sections were obtained from the calculation of direct top-squark pair production assuming that all other supersymmetric particles are heavier, since the production modes of this process are the same as the LQ pair production. The cross-sections were computed at approximate next-to-next-to-leading order (NNLO) in QCD with resummation of next-to-next-to-leading logarithmic (NNLL) soft gluon terms [49-52]
hadronisation model [83] were used. They employed the dedicated set of tuned parameters developed by the Sherpa authors and the NNPDF3.0nnlo PDF set [58]. The NLO matrix elements for a given jet multiplicity were matched to the PS using a colour-exact variant of the MC@NLO algorithm [84]. Different jet multiplicities were then merged into an inclusive sample using an improved CKKW matching procedure [85,86] that is extended to NLO accuracy using the MEPS@NLO prescription [87]. Diboson production was simulated with the Sherpa 2.2.1 or 2.2.2 generator depending on the process. Fully leptonic final states and semileptonic final states, where one boson decays leptonically and the other hadronically, were generated using matrix elements at NLO accuracy in QCD for up to one additional parton emission and at LO accuracy for up to three additional parton emissions. Samples for the loop-induced processes gg → VV were generated using LO-accurate matrix elements for up to one additional parton emission for both the fully leptonic and semileptonic final states.
The matrix element calculations were matched and merged with the Sherpa PS based on Catani-Seymour dipole factorisation using the MEPS@NLO prescription. The virtual QCD corrections were provided by the OpenLoops library. The NNPDF3.0nnlo set of PDFs was used, along with the dedicated set of tuned PS parameters developed by the Sherpa authors. The samples were normalised to an NLO prediction.
A summary of all the features used for the simulation of the signal and background processes is shown in

Object reconstruction and identification
Tracks measured in the ID are used to reconstruct the interaction vertices [92]. The primary vertex of the hard interaction is chosen as the proton-proton vertex candidate with the highest sum of the squared transverse momenta of the associated tracks. Electrons are reconstructed from topological clusters of energy deposits in the electromagnetic calorimeter that are matched to a track reconstructed in the ID [93]. In the τ_lep τ_had (τ_had τ_had) final state, the selected (rejected) electrons are required to satisfy the 'medium' ('loose') identification criteria and have p_T > 20 GeV (15 GeV). Moreover, electrons are required to be within |η_cluster| < 2.47 with the exclusion of the region between the barrel and endcap calorimeters (1.37 < |η_cluster| < 1.52). An additional 'loose' isolation criterion [93] is also required, which has an efficiency of 90% for candidates with p_T > 15 GeV, increasing to more than 98% for candidates with p_T > 30 GeV.
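The kinematic part of the electron requirements can be written compactly. The sketch below covers only the p_T and pseudorapidity conditions stated above; identification and isolation decisions are not modelled, and the channel labels `"lephad"` and `"hadhad"` are hypothetical names introduced for illustration.

```python
# Excluded transition region between barrel and endcap calorimeters.
CRACK_LOW, CRACK_HIGH = 1.37, 1.52
MAX_ABS_ETA = 2.47

# p_T thresholds in GeV: selected electrons in the tau_lep tau_had channel,
# vetoed electrons in the tau_had tau_had channel. Channel keys are hypothetical labels.
PT_THRESHOLD_GEV = {"lephad": 20.0, "hadhad": 15.0}


def in_eta_acceptance(eta_cluster):
    """True if the cluster pseudorapidity is inside the acceptance and outside the crack."""
    abs_eta = abs(eta_cluster)
    in_crack = CRACK_LOW < abs_eta < CRACK_HIGH
    return abs_eta < MAX_ABS_ETA and not in_crack


def passes_electron_kinematics(pt_gev, eta_cluster, channel):
    """Apply the channel-dependent p_T threshold and the pseudorapidity acceptance."""
    return pt_gev > PT_THRESHOLD_GEV[channel] and in_eta_acceptance(eta_cluster)
```

An 18 GeV central electron, for example, is below the selection threshold of the τ_lep τ_had channel but above the veto threshold of the τ_had τ_had channel.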
Muons are reconstructed from signals in the MS matched with tracks inside the ID. In the τ_lep τ_had final state, the selected muons are required to satisfy the 'medium' identification criteria with an average efficiency of 97%, and have p_T > 25 GeV and |η| < 2.5. In the τ_had τ_had channel, muons having p_T > 7 GeV are rejected if they satisfy the 'loose' identification criteria. A 'tight' isolation criterion [94] based on track information and having an average efficiency of 89% is also applied.
Jets are reconstructed with a particle-flow algorithm, which combines energy deposits in the calorimeter with ID tracks
The τ_had decays are composed of a neutrino and a set of visible decay products, most frequently one or three charged pions and up to two neutral pions. The visible decay products of the τ_had decay are denoted by τ_had-vis. The reconstruction of the τ_had-vis is seeded by jets reconstructed by the anti-k_t algorithm [96], using calibrated topological clusters [101] as inputs, with a radius parameter of R = 0.4 [102]. Reconstructed tracks are matched to τ_had-vis candidates and a multivariate discriminant is used to assess whether these tracks are likely to have been produced by the charged τ_had decay products, rejecting tracks originating from other interactions, nearby jets, photon conversions or misreconstructed tracks. The τ_had-vis objects are required to have one or three associated charged-particle tracks selected by this discriminant. Their charge (q) is defined as the sum of the measured electric charges of these associated tracks and is required to be |q| = 1. The τ_had-vis objects are also required to satisfy p_T > 20 GeV and |η| < 2.5, excluding the region 1.37 < |η| < 1.52. To separate the τ_had-vis candidates produced by hadronic τ-lepton decays from those due to jets initiated by quarks or gluons, a recurrent neural network (RNN) identification algorithm [103] (τ_had-ID) is constructed using information from reconstructed charged-particle tracks and calorimeter-energy clusters associated with τ_had-vis candidates. This analysis uses two τ_had-ID working points: 'medium', which has a 75% (60%) acceptance efficiency and a background rejection of 35 (240), and 'loose', which has an 85% (75%) acceptance efficiency and a background rejection of 21 (90) for τ_had with one (three) charged-particle tracks. A 'very loose' working point, having a 95% acceptance efficiency, is also used for background estimation.
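The track-multiplicity, charge and kinematic requirements on τ_had-vis candidates can be summarised in a few lines. This is a minimal sketch of those stated cuts only; the track-classification discriminant and the RNN identification are not modelled, and the function names are illustrative.

```python
def tau_charge(track_charges):
    """Charge of the candidate: sum of the charges of its associated tracks."""
    return sum(track_charges)


def passes_tau_had_vis_selection(pt_gev, eta, track_charges):
    """Apply the one-or-three-track, unit-charge, p_T and pseudorapidity requirements."""
    if len(track_charges) not in (1, 3):
        return False
    if abs(tau_charge(track_charges)) != 1:
        return False
    abs_eta = abs(eta)
    in_crack = 1.37 < abs_eta < 1.52
    return pt_gev > 20.0 and abs_eta < 2.5 and not in_crack
```

A three-track candidate with charges (+1, +1, −1) is kept, while one with charges (+1, +1, +1) fails the |q| = 1 requirement.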
A separate boosted decision tree discriminant ('eBDT') is also used to reject backgrounds arising from electrons misidentified as τ_had-vis. This discriminant is built using information from the calorimeter and the ID, most notably transition radiation information from the TRT system and variables sensitive to the ratio of the energy deposited in the calorimeter to the visible momentum measured from the reconstructed tracks. The reconstructed objects used in this analysis are not built from disjoint sets of tracks or calorimetric clusters. It is therefore possible that two different objects share most of their constituents. An overlap removal procedure is applied to resolve this ambiguity. This procedure is summarised in table 2.
The missing transverse momentum vector, E⃗_T^miss, is reconstructed as the negative vector sum of the transverse momenta of leptons, τ_had-vis and jets, and a "soft-term" [104]. The soft-term is calculated as the vectorial sum of the p⃗_T of tracks matched to the primary vertex but not associated with a reconstructed lepton, τ_had-vis or jet. The magnitude of E⃗_T^miss is referred to as the missing transverse energy, E_T^miss.
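The definition above amounts to a two-dimensional vector sum. The following sketch assumes that each object is represented by a hypothetical `(pt, phi)` tuple, with p_T in GeV and the azimuthal angle in radians; calibration and object-overlap handling are not modelled.

```python
import math


def missing_transverse_momentum(hard_objects, soft_tracks=()):
    """Return (magnitude, azimuth) of the negative vector sum of all transverse momenta.

    hard_objects: (pt, phi) tuples for leptons, tau_had-vis candidates and jets.
    soft_tracks:  (pt, phi) tuples for primary-vertex tracks not associated
                  with any of those objects (the soft term).
    """
    px = 0.0
    py = 0.0
    for pt, phi in list(hard_objects) + list(soft_tracks):
        px -= pt * math.cos(phi)
        py -= pt * math.sin(phi)
    return math.hypot(px, py), math.atan2(py, px)


# Two objects at right angles: 30 GeV along x and 40 GeV along y.
met, met_phi = missing_transverse_momentum([(30.0, 0.0), (40.0, math.pi / 2)])
```

In this toy case the two momenta form a 3-4-5 triangle, so the missing transverse energy is 50 GeV and points opposite to their sum.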

Event selection
Events are required to contain at least one primary vertex with at least two associated tracks.
For the τ_lep τ_had channel, events were selected by single-lepton triggers. In 2015, single-electron triggers were simultaneously active with p_T thresholds of 24, 60 and 120 GeV. The trigger thresholds were raised to keep the trigger rates sufficiently low as the luminosity was increased. The lowest p_T threshold electron and muon triggers also have an isolation requirement. The lepton isolation and identification requirements loosen as the trigger p_T thresholds increase. Events must contain at least one τ_had candidate and exactly one electron or one muon. The electron or muon must be isolated and satisfy the medium lepton identification. Events with more than one lepton satisfying the medium identification are rejected, considering electrons (muons) with a p_T greater than 15 (7) GeV. This helps to reject Z/γ* → ee/µµ events and Z/γ* → τ_lep τ_lep. Furthermore, the electron and muon candidates are required to have p_T > 30 GeV, and be matched to the trigger object that caused the event to be selected. The τ_had candidate is required to have p_T > 50 GeV, satisfy the medium τ_had-ID selection and have |η| < 2.3. The pseudorapidity selection requirement rejects events with τ_had candidates in a region with a higher background contamination and large uncertainties in the determination of the rate of electrons misidentified as τ_had.
In the τ_had τ_had channel, events were selected by a single-τ_had trigger [107]. For 2015 and 2016, three single-τ_had triggers were available with p_T thresholds of 80, 125 and 160 GeV. In 2017 and 2018, due to higher instantaneous luminosity, only the p_T > 160 GeV trigger threshold was used. The τ_had identification requirements become less stringent as the trigger p_T thresholds rise. Events must contain at least two τ_had candidates where the leading τ_had candidate in p_T must be matched to the trigger within an angular distance of ∆R = 0.2 and have p_T that is at least 5 GeV above the trigger threshold. The subleading-p_T τ_had candidate is required to have p_T > 65 GeV. Identification requirements are applied to both τ_had candidates; the leading-p_T τ_had must satisfy the medium selection and the subleading-p_T τ_had the loose selection. Events that contain any electron or muon that satisfies the loose identification requirements are rejected, which ensures orthogonality to the τ_lep τ_had channel.
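The trigger-matching requirement on the leading τ_had candidate combines an angular distance with a p_T offset. The sketch below assumes the usual definition ∆R = √(∆η² + ∆ϕ²) with ∆ϕ wrapped into [−π, π); the dictionary keys and function names are hypothetical and only serve the illustration.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the pseudorapidity-azimuth plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def leading_tau_passes_trigger(tau, trigger_object, threshold_gev):
    """Require a match within Delta R = 0.2 and p_T at least 5 GeV above threshold."""
    matched = delta_r(tau["eta"], tau["phi"],
                      trigger_object["eta"], trigger_object["phi"]) < 0.2
    return matched and tau["pt"] >= threshold_gev + 5.0


# Example: candidate and trigger object on either side of the phi = +-pi boundary.
tau = {"pt": 170.0, "eta": 0.50, "phi": 3.1}
trigger_object = {"eta": 0.52, "phi": -3.1}
is_matched = leading_tau_passes_trigger(tau, trigger_object, threshold_gev=160.0)
```

Wrapping ∆ϕ matters here: without it the two objects in the example would appear to be separated by more than 6 radians.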
Events passing the previous requirements are then selected with criteria that are similar between the two channels. The two τ_had or the electron/muon (denoted by ℓ) and τ_had must have opposite electric charges and at least one b-tagged jet is required. The invariant mass of the visible decay products of the two τ-leptons, m_vis(ℓ, τ_had) or m_vis(τ_had, τ_had), is required to be above 100 GeV, which is effective at reducing the Z/γ* → ττ background. An additional requirement ∆ϕ(ℓ, E_T^miss) < 1.5 is applied in the τ_lep τ_had channel to reduce single-top and tt̄ events.
The variable S_T is defined as the scalar p_T sum of the two τ_had-vis (or ℓ and τ_had-vis) and the leading-p_T b-jet. A minimum requirement of S_T > 300 GeV is applied, as there is almost no improvement in the expected results of the analysis, discussed in section 8, by adding events with lower S_T values.
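The common selection can be expressed as a short sequence of cuts. The sketch below encodes only the thresholds stated in the text (opposite charge, at least one b-tagged jet, visible mass above 100 GeV, S_T above 300 GeV and, in the τ_lep τ_had channel, ∆ϕ below 1.5); all function and argument names are illustrative.

```python
def scalar_pt_sum(pt_first, pt_second, pt_leading_bjet):
    """S_T: scalar sum of the two visible tau-decay p_T values and the leading b-jet p_T."""
    return pt_first + pt_second + pt_leading_bjet


def passes_common_selection(charge_first, charge_second, n_bjets,
                            m_vis_gev, s_t_gev, delta_phi_lep_met=None):
    """Apply the cuts shared by both channels.

    delta_phi_lep_met is given only for the tau_lep tau_had channel;
    pass None for the tau_had tau_had channel, where the cut is not applied.
    """
    if charge_first * charge_second >= 0:   # opposite electric charges required
        return False
    if n_bjets < 1:                         # at least one b-tagged jet
        return False
    if m_vis_gev <= 100.0:                  # suppresses Z/gamma* -> tau tau
        return False
    if s_t_gev <= 300.0:
        return False
    if delta_phi_lep_met is not None and abs(delta_phi_lep_met) >= 1.5:
        return False
    return True


s_t = scalar_pt_sum(120.0, 100.0, 150.0)
selected = passes_common_selection(+1, -1, 1, 150.0, s_t, delta_phi_lep_met=1.0)
```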
The selection criteria described above define the signal region (SR) of the analysis. The signal acceptance times efficiency of the event selection varies between 3% and 10%, depending on the LQ mass and coupling. The efficiency is defined as the ratio of events passing the selection in each channel with respect to the signal events of the bτ_lep τ_had and bτ_had τ_had final states, respectively. Events in the SR of each channel are assigned to two categories of low (< 200 GeV) and high (> 200 GeV) transverse momentum of the leading-p_T b-jet. The two categories are called low and high b-jet p_T SRs, respectively. The high b-jet p_T SR is found to perform better for low-mass singly produced LQs, where the resonant contribution is dominant. Conversely, the low b-jet p_T SR has a better acceptance for high-mass signals, where the non-resonant contribution is dominant for signals with m_LQ ≥ 0.9 TeV. This split into two categories improves the expected results of the analysis, discussed in section 8, by up to 30%. Alternative selections define the control regions (CR), used to evaluate the contribution of the main background processes in the SR, and the validation regions (VR), used to verify the good modelling of the backgrounds. The selection requirements used for the signal regions are summarised in table 3. The use of the control and validation regions in the background estimation methods is discussed in section 6.
Table 3. Definition of signal regions (SR) used in the τ_lep τ_had and τ_had τ_had channels. The symbol ℓ represents the selected electron or muon candidate and τ_had-vis represents the leading τ_had-vis candidate. The symbol τ_1 (τ_2) represents the leading (sub-leading) τ_had-vis candidate.

τ_lep τ_had channel
Control and validation regions are used in the analysis to estimate and study the modelling of the main background processes. The selection requirements used for the CR and VR in the τ_lep τ_had channel are summarised in table 4.
In the τ_lep τ_had channel the dominant background contributions are from tt̄ and single top-quark events. Processes involving top quarks can produce real τ-leptons, or jets that are misidentified as τ_had, and are estimated by using simulation with data-driven corrections. The tt̄ and tW contributions are treated as one combined top-quark background due to their similar kinematics and final states. In the low (high) b-jet p_T SR, tt̄ accounts for 90% (86%) of all top-quark processes and 96% (97%) of the single top-quark events are from tW. To ensure that this background is accurately modelled, a top-quark control region (Top-CR) is defined. With respect to the SR selection, the requirements on the leading b-jet p_T and the S_T are removed, and the condition ∆ϕ(ℓ, E_T^miss) < 1.5 is replaced by ∆ϕ(ℓ, E_T^miss) > 2.5. This results in a region with a purity of 91% in top-quark processes and negligible signal contamination. Out of all top-quark events in the Top-CR, 91% are from tt̄ processes and ∼97% of the single top-quark events are from tW, which is compatible with the composition of the SRs. A discrepancy between the data and simulation prediction is observed in the Top-CR, with the simulation overestimating the background contribution. Recent measurements of differential cross-sections have demonstrated that the current simulations of tt̄ processes overestimate the upper tail of the top-quark p_T spectrum [108, 109]. The discrepancy varies depending on S_T; for this reason, a correction is derived as a function of S_T in this region based on the ratio between data and simulation. A top-quark correction scale factor is defined in eq. (6.1) and is applied to all tt̄ and single top-quark simulated events. The comparison between data and the background prediction in the Top-CR and the derived correction as a function of S_T are shown in figure 2, where tt̄ and tW events with a generated lepton reconstructed as a lepton (a 'true' lepton) and a jet misidentified as a τ_had are included under the Jet → τ fake contribution. This demonstrates that the Top-CR is dominated by tt̄ and tW events with true leptons and τ_had in the final state, thus this correction does not account for mismodelling due to jets being misidentified as τ_had. The modelling of events with a true lepton and jet misidentified as a τ_had in the final state is discussed later in this section.
Table 4. Definition of the background-enriched control regions (CR) and validation regions (VR) used in the τ_lep τ_had channel. The symbol ℓ represents the selected electron or muon candidate and τ_had-vis represents the leading τ_had-vis candidate.
The top-quark correction scale factor is defined as a function of S_T:
SF_Top(S_T) = (N_data − N_non-Top) / N_Top,    (6.1)
where N_data and N_Top represent respectively the number of data events and of tt̄ plus single top-quark events predicted by simulation, N_Top includes events with both true and misidentified τ_had in the final state and N_non-Top includes all the other backgrounds estimated by using simulation. The resulting correction is well fitted by a linear function, which is used to derive the correction scale factors. The correction is also derived with an alternative logarithmic function, SF_Top(S_T) = a ln(S_T) + b, and the difference between the two corrections is taken as an uncertainty on the correction. Additional uncertainties related to the cross-section and acceptance of the top-quark processes, as well as the statistical and cross-section uncertainties related to the subtraction of the contribution from the other processes, are applied to account for the slight difference between the fractions of top-quark events that are due to tt̄ production in the Top-CR and SRs and for the extrapolation to the SR. The scale factor is applied at the per-event level to the tt̄ plus single top-quark events passing the selections of the signal, control or validation regions. The total uncertainty in the scale factor varies between 4% and 7% for S_T in the range of 300-700 GeV. Different binning choices for the S_T distribution were also considered, but the impact on the SF_Top(S_T) uncertainty is found to be below 5% and therefore not considered as an additional source of uncertainty.
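The derivation of the correction and of its functional-form uncertainty can be sketched as follows. The sketch assumes that the per-bin scale factor is the ratio (N_data − N_non-Top) / N_Top suggested by the symbol definitions above, and it uses an unweighted least-squares fit for simplicity; the fit procedure actually used is not specified in the text. All yields are invented toy numbers, not values from the analysis.

```python
import math


def top_scale_factor(n_data, n_non_top, n_top):
    """Ratio of non-top-subtracted data to simulated top-quark events (assumed form)."""
    return (n_data - n_non_top) / n_top


def fit_straight_line(x_values, y_values):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    s_xx = sum((x - mean_x) ** 2 for x in x_values)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# HYPOTHETICAL toy yields per S_T bin (bin centres in GeV).
s_t_centres = [350.0, 450.0, 550.0, 650.0]
n_data = [1000.0, 600.0, 300.0, 150.0]
n_non_top = [90.0, 60.0, 30.0, 15.0]
n_top = [950.0, 580.0, 295.0, 152.0]

sf_values = [top_scale_factor(d, o, t) for d, o, t in zip(n_data, n_non_top, n_top)]

# Nominal form: linear in S_T. Alternative form: a * ln(S_T) + b.
lin_slope, lin_intercept = fit_straight_line(s_t_centres, sf_values)
log_a, log_b = fit_straight_line([math.log(s) for s in s_t_centres], sf_values)


def sf_nominal(s_t):
    return lin_slope * s_t + lin_intercept


def sf_shape_uncertainty(s_t):
    """Absolute difference between the linear and logarithmic parameterisations."""
    return abs(sf_nominal(s_t) - (log_a * math.log(s_t) + log_b))
```

Fitting the logarithmic form as a straight line in ln(S_T) keeps both fits linear in their parameters, so the same helper serves both.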
Another source of background in the τ_lep τ_had channel stems from multi-jet events, where jets can mimic both the τ_lep and τ_had. This type of background from multi-jet events is estimated via a data-driven fake-factor method by deriving a lepton fake-factor. The lepton fake-factor is measured in the multi-jet control region (multijet-CR) that is enriched in multi-jet events, but is similar kinematically to the SR. The events are still required to satisfy the single lepton trigger and to have exactly one b-jet, but the identification algorithm to reject jets misidentified as τ_had is instead used to select multi-jet events by requiring an extremely low value for the τ_had RNN identification score (corresponding to only 1% acceptance for true τ_had). Additional selection criteria on m_T(ℓ, E_T^miss) < 30 GeV and E_T^miss < 50 GeV are applied to increase the purity of multi-jet events relative to other backgrounds. The fake-factor is measured with a requirement on the leading b-jet p_T > 25 GeV and is defined as:
FF_lep(p_T(τ_lep), η(τ_lep)) = (N_data − N_MC)^isolated / (N_data − N_MC)^non-isolated.
The variable N_data is the total number of data events and N_MC is the number of background events predicted by simulation that contain a true τ_lep. Events are split between the numerator and denominator based on whether the τ_lep satisfied the lepton isolation requirement or not. The fake-factor is parameterised as a function of the τ_lep p_T and split into central (|η| < 1.52) and forward (|η| > 1.52) regions. The statistical uncertainty on the FF_lep(p_T(τ_lep), η(τ_lep)) fake-factor and simulation-related uncertainties on N_MC are considered as systematic uncertainties and propagated to the estimate of the multi-jet background. The uncertainties are in the range of 6-230% as a function of the τ_had-vis η and p_T. A control region is defined by having the same selection as the SR, except that the lepton isolation requirement is inverted.
Applying the fake-factor at the per-event level, the multi-jet estimate in each SR is then obtained by scaling the distribution in the corresponding control region where the isolation criteria are not satisfied.
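The per-event application of the fake-factor can be sketched as a weighted sum over the inverted-isolation region. This is a minimal illustration under stated assumptions: simulated events with a true lepton are subtracted with negative weight (the text does not spell out this step for the application region), and the p_T bin edges, fake-factor values and event records are invented toy inputs.

```python
import bisect


def eta_region(eta):
    """Central or forward region, split at |eta| = 1.52 as in the text."""
    return "central" if abs(eta) < 1.52 else "forward"


def lookup_fake_factor(table, pt_edges, pt_gev, eta):
    """Find the fake-factor for a lepton; out-of-range p_T uses the nearest bin."""
    index = bisect.bisect_right(pt_edges, pt_gev) - 1
    index = min(max(index, 0), len(pt_edges) - 2)
    return table[eta_region(eta)][index]


def multijet_yield(events, table, pt_edges):
    """Sum fake-factor weights over events failing the isolation requirement.

    Data events enter with weight +FF; simulated true-lepton events enter
    with weight -FF * mc_weight (ASSUMPTION: subtraction in the application region).
    """
    total = 0.0
    for event in events:
        ff = lookup_fake_factor(table, pt_edges, event["pt"], event["eta"])
        sign = 1.0 if event["is_data"] else -1.0
        total += sign * event["weight"] * ff
    return total


# HYPOTHETICAL toy inputs.
pt_edges = [30.0, 50.0, 100.0, 1000.0]
fake_factors = {"central": [0.20, 0.10, 0.05], "forward": [0.30, 0.15, 0.10]}
events = [
    {"pt": 40.0, "eta": 0.5, "is_data": True, "weight": 1.0},
    {"pt": 60.0, "eta": 2.0, "is_data": True, "weight": 1.0},
    {"pt": 40.0, "eta": 0.5, "is_data": False, "weight": 0.5},
]
estimate = multijet_yield(events, fake_factors, pt_edges)
```

Filling a histogram with these per-event weights instead of summing them would give the multi-jet prediction for any distribution in the SR.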
An additional source of background consists of events where a lepton is produced in association with a jet that is misidentified as a τ_had (Jet → τ fake). These contribute approximately 20% to the expected background in the SR and are mostly from tt̄ with contributions from W+jets, Z+jets, and diboson events. To ensure that these are well modelled, a 'same-sign' control region (SS-CR) is defined by taking the same selection as the SR, but with a light lepton with the same electric charge as τ_had. The requirements on ∆ϕ(ℓ, E_T^miss), S_T and the leading b-jet p_T are removed to increase the number of events in the CR. The top-quark correction scale factor derived in eq. (6.1) is applied to top-quark events in this region (approximately 81% of the total). As the Top-CR used to derive that scale factor is dominated by tt̄ and tW events with true τ-leptons in the final state, it does not correct for mismodelling of jets that are misidentified as a τ_had. As a difference between the simulation prediction and the data is still observed, another scale factor is derived to account for any remaining differences from those backgrounds with a lepton and misidentified τ_had (approximately 60% of the events in this region). The remaining events contain true τ_had and are subtracted, before calculating the scale factor, by applying the top-quark correction scale factor. Then, the scale factor for events with a lepton and a jet misidentified as a τ_had is defined as:
SF_fake-τ = (N_data − N_true-τ) / N_fake-τ,
where N_true-τ is the total number of events predicted by simulation where both the τ_had and τ_lep are true and N_fake-τ is the number of predicted events with a jet misidentified as a τ_had and a true τ_lep. The scale factor is parameterised as a function of p_T(τ_had-vis) and the number of charged-particle tracks (n_track). It is applied to any MC background event with a true lepton and a jet misidentified as a τ_had. The correction is derived in the SS-CR of the τ_µ τ_had channel and then applied to both τ_e τ_had and τ_µ τ_had, because the τ_e τ_had SS-CR contains events with misidentified electrons, which are not well modelled by simulation. The SF_fake-τ correction values are in the range of 1-1.2 (1-1.5) for τ_had with one (three) charged-particle tracks. The statistical uncertainty on the SF_fake-τ correction and the simulation-related uncertainties affecting N_true-τ and N_fake-τ are propagated to the background estimate as systematic uncertainties. As a function of the τ_had-vis p_T, the uncertainties amount to 15-20% (22-140%) for τ_had-vis candidates with one (three) associated charged-particle tracks.
To validate the background modelling in a region depleted in signal, high and low b-jet p T validation regions (high and low b-jet p T VR) are defined by applying the SR requirements, except that the ∆ϕ(ℓ, E miss T ) < 1.5 and S T > 300 GeV criteria are replaced by 1.5 < ∆ϕ(ℓ, E miss T ) < 2.5 and 300 < S T < 600 GeV. The low (high) b-jet p T VR consists of 82% (80%) tt events, 8% (10%) single top-quark events, and 9% (9%) events where a jet is misidentified as a τ had . Good modelling of the background is found in the validation regions; the background estimate agrees with data within the total uncertainty, as shown in figure 3.

τ had τ had channel
In the τ had τ had channel, the main background sources are Z/γ * → τ had τ had events, as well as top-quark processes, with W bosons decaying to τ had , or to electrons, muons or jets misidentified as τ had . Both Z/γ * → τ had τ had and top-quark backgrounds are estimated using simulation with data-driven corrections, which are discussed further below. The selection requirements used to define the CR and VR in the τ had τ had channel are summarised in table 5.
As the mismodelling of the kinematic distributions observed in the τ lep τ had channel originates from the underlying top-quark process rather than the τ had decay, it is also expected to be present in the τ had τ had channel. However, due to the small number of events in the τ had τ had channel, the statistical uncertainty in the top-quark processes is comparable with the expected mismodelling. This makes it difficult to select a τ had τ had -only control region to quantify this mismodelling. Therefore, the S T -dependent top-quark correction scale factor from eq. (6.1) derived in the τ lep τ had channel is also applied to the τ had τ had channel. The shape of the S T distribution is checked and found to be compatible between the τ lep τ had and τ had τ had channels.
The Z/γ * → τ had τ had background is also modelled using simulation. Due to a known discrepancy in the simulation compared with the data for Z(→ τ τ ) + heavy-flavour jets with at least one b- or c-jet (Z+HF) [110], a correction factor for the normalisation of this background is derived in the τ lep τ had b-jet Z-CR, defined in table 4, which has a purity of around 60% for the Z+HF processes and is inclusive in p T of the leading b-jet. A comparison between the data and the prediction from the simulation in the b-jet Z-CR before deriving the correction factor is shown in figure 4. The scale factor is derived by subtracting backgrounds estimated from simulation that are not from the Z+HF process (N non−ZHF ):

$$ \mathrm{SF}_{Z+\mathrm{HF}} = \frac{N_{\text{data}} - N_{\text{non-ZHF}}}{N_{\text{ZHF}}}, $$

where N data is the number of observed data events in the b-jet Z-CR and N ZHF is the number of Z+HF events predicted by simulation.

τ had τ had Control/Validation Regions | Selection | Purpose
Dijet-CR | Satisfy SR except: τ 1 and τ 2 satisfy very loose τ had -ID, τ 1 fail medium τ had -ID | Measure τ had-vis fake-factor
CR-1 | Satisfy SR except: τ 2 fail loose τ had -ID | Apply τ had-vis fake-factor
SS-VR | Satisfy SR except: q(τ 1 ) × q(τ 2 ) > 0 | Multijet modelling check
Z+light flavour jets VR | Satisfy SR except: 0 b-jets, ∆ϕ(τ 1 , τ 2 ) > 0.25, m vis (τ 1 , τ 2 ) < 100 GeV, E miss T > 60 GeV | Z+light jets modelling

Table 5. Definition of the background-enriched control regions (CR) and validation regions (VR) used in the τ had τ had channel. The symbol τ 1 (τ 2 ) represents the leading (sub-leading) τ had-vis candidate.

The scale factor is applied as a normalisation to the total Z+HF contribution, with a value of 1.13 ± 0.23 obtained from the control region. The uncertainty includes the statistical uncertainty, the uncertainty in the subtraction of the simulation events and the extrapolation uncertainty from the control region to the SRs. The extrapolation uncertainty is obtained by repeating the scale factor calculation in the τ lep τ had channel using selection criteria for the control and signal regions equivalent to the ones used in the τ had τ had channel.
For Z(→ τ τ ) + light-flavour jets (Z+LF, no b- or c-jets), the modelling is validated in a b-veto region (Z+LF VR). This region has the same event selection as the SR, except that zero b-jets are required. In addition, the requirements m vis < 100 GeV, E miss T > 60 GeV and ∆ϕ(τ, τ ) > 0.25 are applied to ensure a high Z+LF purity. The data are found to be in agreement with the simulation within the statistical uncertainty in the data.

Finally, the background originating from multi-jet events, where jets are misidentified as τ had , is estimated by using a data-driven fake-factor method. A control region dominated by multi-jet events, called Dijet-CR, is defined by taking events that satisfy one of the single-jet triggers (with thresholds between 15 and 420 GeV). The leading τ had-vis candidate is required to not satisfy the medium τ had identification and the subleading τ had-vis candidate is used as a probe. Both τ had-vis candidates are still required to pass the very loose τ had identification requirement. As for the τ lep τ had channel, the fake-factor is measured inclusively in the leading b-jet p T . The leading and subleading τ had-vis candidates are required to have opposite electric charges and have a p T > 65 GeV. At least one additional b-tagged jet is required, but no selection is made on the leading b-jet p T . Then, the fake-factor is defined as:

$$ f_{\tau_{\text{had}}\text{-ID}}(p_{\text{T}}, N_{\text{track}}) = \frac{N_{\text{data}}^{\text{pass }\tau_{\text{had}}\text{-ID}} - N_{\text{MC}}^{\text{pass }\tau_{\text{had}}\text{-ID}}}{N_{\text{data}}^{\text{fail }\tau_{\text{had}}\text{-ID}} - N_{\text{MC}}^{\text{fail }\tau_{\text{had}}\text{-ID}}}, $$

where N data is the number of observed data events in the Dijet-CR and N MC includes all simulated background events. The pass or fail τ had -ID superscript refers to whether the subleading τ had-vis candidate satisfies the loose τ had -ID or not, while still satisfying the very loose requirement. For each of the SRs, the multi-jet estimate is then obtained from a control region, called CR-1, composed of events in which the subleading τ had-vis candidate fails the loose τ had -ID, using the fake-factor:

$$ N_{\text{multi-jet}}^{\text{SR}} = \left( N_{\text{data}}^{\text{CR-1}} - N_{\text{MC}}^{\text{CR-1}} \right) \times f_{\tau_{\text{had}}\text{-ID}}(p_{\text{T}}, N_{\text{track}}). $$

The fake-factor is parameterised as a function of the p T and number of charged-particle tracks of the subleading τ had-vis candidate. The statistical uncertainty on the f τ had -ID (p T , N track ) fake-factor is taken as a systematic uncertainty, and it varies in the range of 4-15% as a function of the τ had-vis p T , for τ had-vis candidates with both one and three associated charged-particle tracks. For this method to be accurate, it is important that the fail-τ had -ID and multi-jet control regions have a similar composition of quark- and gluon-initiated jets. This is ensured by inspecting the shape of the subleading τ had identification scores in the two regions, which depends on the quark-gluon fraction. A lower threshold than the very loose requirement is applied on this score, which ensures that the shapes of the distributions are compatible.
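The two steps of the fake-factor method (measurement in the Dijet-CR, application in CR-1) can be sketched as follows. This is only an illustration under the assumption that the fake-factor is a pass-to-fail ratio of simulation-subtracted data yields and that it multiplies the simulation-subtracted CR-1 yield in each (p T , N track ) bin. The function names, bin labels and yields are hypothetical.

```python
def fake_factor(n_data_pass, n_mc_pass, n_data_fail, n_mc_fail):
    """Pass/fail ratio of simulation-subtracted data yields in one bin."""
    denominator = n_data_fail - n_mc_fail
    if denominator <= 0:
        raise ValueError("fail-ID yield after subtraction must be positive")
    return (n_data_pass - n_mc_pass) / denominator


def multijet_estimate(cr1_yields, fake_factors):
    """Sum over bins of (N_data - N_MC) in CR-1 times the bin's fake-factor.

    cr1_yields maps a bin key to (n_data, n_mc); fake_factors maps the
    same bin keys to the measured fake-factor.
    """
    return sum(
        (n_data - n_mc) * fake_factors[bin_key]
        for bin_key, (n_data, n_mc) in cr1_yields.items()
    )


# Hypothetical Dijet-CR yields: (data pass, MC pass, data fail, MC fail).
dijet_cr = {
    ("low_pt", 1): (120.0, 20.0, 1000.0, 200.0),
    ("low_pt", 3): (30.0, 6.0, 900.0, 100.0),
}
fake_factors = {key: fake_factor(*vals) for key, vals in dijet_cr.items()}

# Hypothetical CR-1 yields: (data, simulated non-multi-jet background).
cr1 = {("low_pt", 1): (80.0, 16.0), ("low_pt", 3): (50.0, 10.0)}
n_multijet_sr = multijet_estimate(cr1, fake_factors)
```

Keeping the measurement and the application in separate functions mirrors the separation between the two control regions, so that the same fake-factor table can be reused for each SR.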
After the selection, the multi-jet contribution to the SR is expected to be small. The modelling of the multi-jet background is verified in the same-sign validation regions (SS-VR). The SS-VR has the same selection as the SR, but the electric charges of the τ had candidates are required to be the same. For key distributions in the low b-jet p T category, the data are found to agree with the background prediction, with approximately half of the events being from tt and half from multi-jet background. The selection in the high b-jet p T SS-VR leaves few events, with 2.8 expected events (mostly from tt) and 4 observed data events.

Systematic uncertainties
Systematic uncertainties arise from the reconstruction of the various physics objects and from theoretical or modelling uncertainties affecting the predictions for both the backgrounds and signals. These uncertainties manifest themselves in both the overall yield and shape of the final observable, and can be divided into two main groups: the experimental uncertainties and the modelling uncertainties.
The experimental uncertainties include the uncertainties related to the trigger, reconstruction, calibration and identification of electrons [93], muons [94], taus [102] and jets [98,99,111]; for electrons and muons, additional uncertainties in the lepton isolation are considered. Uncertainties related to background with misidentified τ -leptons are described in section 6. Another source of experimental uncertainty is the luminosity measurement, whose primary measurement is obtained using the LUCID-2 detector [112]. An uncertainty of 1.7% [113] is assigned to the combined 2015-2018 integrated luminosity.
Among the experimental uncertainties, the ones with the highest impact on the analysis sensitivity are the τ had-vis related uncertainties, with an impact on the results in the range of 30-40% depending on the LQ coupling and mass values considered. The uncertainties in the τ had-vis identification efficiency are in the range of 2% to 6%, while the eBDT efficiency uncertainties are of the order of 1% to 2%. These uncertainties are parameterised as a function of the τ had-vis p T and the number of associated tracks for the τ had-vis identification efficiency, and as a function of the τ had decay mode for the eBDT efficiency. In both cases, the uncertainties are derived in dedicated tag and probe measurements [102]. The τ had-vis reconstruction efficiency uncertainty is derived from comparisons between simulations using different detector geometries or Geant4 physics lists; this uncertainty is parameterised as a function of true τ had-vis p T and is between 1% and 1.5%. For the τ had-vis energy scale, the total uncertainty is in the range of 1% to 4% of the τ had-vis p T , arising from a combination of measurements: a direct measurement with Z → τ τ → µτ had-vis + 3ν events, measurements of the calorimeter response to single particles, and comparisons between simulations using different detector geometries or Geant4 physics lists. This uncertainty is also parameterised as a function of the τ had-vis p T and the number of associated tracks.
The uncertainties in the background modelling include uncertainties in the top-quark, Z+jets and diboson backgrounds, as well as multijet events in which quark- or gluon-initiated jets are misidentified as a τ had . Among the background modelling uncertainties, the ones related to the top-quark background have the largest impact on the analysis sensitivity, with an impact on the results in the range of 40-50% depending on the LQ coupling and mass values. This uncertainty is extracted by comparing nominal and alternative tt and single top-quark MC samples in the phase space of the SR and Top-CR. For each sample, a dedicated data-driven S T -dependent correction is applied before the comparison. The difference between the nominal and alternative samples in the S T distribution is taken as the uncertainty in the top-quark processes. The alternative samples have variations of the initial/final-state radiation, matrix element and PS compared to the nominal sample. To derive the initial/final-state radiation uncertainty, the generator parameters used to produce the nominal samples are varied. The matrix element to PS NLO matching uncertainty is derived by comparing the MadGraph5_aMC@NLO and Powheg predictions while keeping the same generator for the PS component. For the PS, the uncertainty is derived by a comparison with an alternative sample generated by using Herwig for the PS while keeping the same generator for the hard-scattering simulation component. The uncertainties in the background modelling originating from the PDF and α S uncertainties are found to be less than 1% and are neglected. Finally, an uncertainty in the tW interference for the single top-quark background is estimated by comparing the nominal sample, where the diagram removal scheme is applied, to an alternative sample that uses the diagram subtraction scheme [114].
The uncertainties in the signal modelling include those from the signal cross-section and acceptance due to renormalisation scale (µ R ) and factorisation scale (µ F ) variations, PDF and α S . The µ R and µ F uncertainties are estimated through an envelope of the variations obtained from scaling µ R and µ F by a factor between 0.5 and 2, while keeping their ratio between 0.5 and 2. The uncertainties due to the NNPDF3.0nlo PDF set and α S are evaluated following the PDF4LHC recommendation [115].
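The scale-variation envelope described above can be illustrated with a short sketch. It assumes the conventional discrete choice of scaling factors 0.5, 1 and 2 for µ R and µ F (the text only states that the factors lie between 0.5 and 2), which together with the ratio constraint gives the usual seven points including the nominal one. The function names and the yields are hypothetical.

```python
from itertools import product


def scale_variation_points(factors=(0.5, 1.0, 2.0)):
    """Return (mu_R, mu_F) scaling pairs whose ratio stays within [0.5, 2]."""
    return [
        (k_r, k_f)
        for k_r, k_f in product(factors, repeat=2)
        if 0.5 <= k_r / k_f <= 2.0
    ]


def envelope(nominal, varied):
    """Relative (up, down) uncertainties from the extreme varied yields."""
    up = max(max(varied) - nominal, 0.0) / nominal
    down = max(nominal - min(varied), 0.0) / nominal
    return up, down


points = scale_variation_points()

# Hypothetical signal yields for the six non-nominal scale choices.
nominal_yield = 100.0
varied_yields = [92.0, 105.0, 110.0, 97.0, 95.0, 103.0]
up_unc, down_unc = envelope(nominal_yield, varied_yields)
```

The ratio constraint removes the two anti-correlated combinations, (0.5, 2) and (2, 0.5), from the nine possible pairs.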

Results
The distribution of the S T variable for the events of the signal regions defined in table 3 for the τ lep τ had and τ had τ had channels is used as the final discriminant between the leptoquark signal and the background. The statistical analysis of the data is performed using the profile likelihood ratio method [116], to test whether a model can be rejected given the observed data. As the model under test is a signal plus background hypothesis, the chosen parameter of interest is the signal strength, µ, defined as the ratio of the observed to the predicted value of the signal cross-section times branching fraction. The likelihood function L(µ, θ) is then constructed as a product of Poisson probability terms for each bin of the distributions. It depends on µ and on the nuisance parameters θ, which encode systematic uncertainties that can affect the signal and background distributions and are constrained using Gaussian probability density functions. The asymptotic approximation is used when constructing the test statistic $\tilde{q}_{\mu}$ [117] from the likelihood ratio, defined as $\tilde{q}_{\mu} = -2\ln\left(L(\mu, \hat{\hat{\theta}})/L(\hat{\mu}, \hat{\theta})\right)$, where $\hat{\mu}$ and $\hat{\theta}$ are the parameter values that maximise the likelihood globally and $\hat{\hat{\theta}}$ are the nuisance parameter values that maximise the likelihood for a given value of µ. To ensure a reliable estimation of the backgrounds in the fit model, the binning in S T is optimised so that sufficient background events (greater than 10) are present in each bin of the pre-fit distribution. No significant excess above the background expectations is observed and corresponding limits on production cross-sections of the LQ signals are set.
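The structure of the test statistic described above can be illustrated with a deliberately simplified sketch. It assumes a binned Poisson likelihood with no nuisance parameters, so no profiling is performed, and it restricts the best-fit signal strength to be non-negative and sets the statistic to zero when the best fit exceeds the tested µ. All function names and yields are hypothetical, and the background in every bin is assumed to be positive.

```python
import math


def log_likelihood(mu, observed, signal, background):
    """Poisson log-likelihood summed over bins (constant terms dropped)."""
    total = 0.0
    for n_i, s_i, b_i in zip(observed, signal, background):
        expected = mu * s_i + b_i
        total += n_i * math.log(expected) - expected
    return total


def best_fit_mu(observed, signal, background, mu_max=50.0, iterations=200):
    """Maximise the likelihood over 0 <= mu <= mu_max by bisection."""

    def slope(mu):
        # Derivative of the log-likelihood; it decreases monotonically in mu.
        return sum(
            n_i * s_i / (mu * s_i + b_i) - s_i
            for n_i, s_i, b_i in zip(observed, signal, background)
        )

    if slope(0.0) <= 0.0:
        return 0.0  # maximum sits at the physical boundary mu = 0
    if slope(mu_max) >= 0.0:
        return mu_max
    low, high = 0.0, mu_max
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if slope(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def q_tilde(mu, observed, signal, background):
    """One-sided likelihood-ratio statistic for upper limits on mu."""
    mu_hat = best_fit_mu(observed, signal, background)
    if mu_hat > mu:
        return 0.0  # upward fluctuations do not count against the tested mu
    return -2.0 * (
        log_likelihood(mu, observed, signal, background)
        - log_likelihood(mu_hat, observed, signal, background)
    )
```

For example, `q_tilde(1.0, [10], [5.0], [10.0])` tests the nominal signal hypothesis in a single bin where the data match the background expectation. In a full analysis the nuisance parameters would be re-fitted for every tested value of µ.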
For the LQ results interpretation, only the high b-jet p T signal regions from the τ lep τ had and τ had τ had channels are considered and fit simultaneously. For the high b-jet p T signal regions, the contribution from the non-resonant LQ production process is small. Therefore the interference between the LQ non-resonant processes and the SM processes is not expected to be substantial in these SRs, and it is neglected. By setting µ = 0 in the profile likelihood ratio, the test statistic can be used to check for compatibility with the background-only hypothesis. The data are first fit under the background-only hypothesis and the resulting post-fit distributions are found to be in good agreement with the data, as shown in figure 5. Table 6 shows the yields for the τ lep τ had and τ had τ had channels, respectively.

Yield | τ lep τ had | τ had τ had
Data | 1053 | 29

Table 6. Post-fit background yields in the high b-jet p T signal region of τ lep τ had and τ had τ had channels. 'Jet → τ fake' indicates the events with a true lepton and a quark- or gluon-initiated jet misidentified as a τ had . The 'Two jet → τ fake' indicates the events where two jets are misidentified as τ had . 'Others' in τ lep τ had includes Z(→ τ τ )+LF jets, diboson, W +jets and Z(→ ee, µµ)+jets while 'Others' in τ had τ had includes Z(→ τ τ )+LF jets, diboson and W +jets. The results are extracted from a fit assuming the background-only hypothesis.
As good agreement is found between the data and the background expectation, upper limits are set on the cross-section assuming that the branching fraction B(LQ → bτ ) is 100% in the case of the S 1 model and 50% for the U YM 1 and U MIN 1 scenarios. This is performed using the frequentist CL s method [116]. A production cross-section for a given signal scenario is excluded at the 95% confidence level (CL) when CL s < 0.05.
The results are interpreted considering all LQ production modes in the U 1 model. Several values of the coupling λ are considered, each one with a value of the coupling parameter κ of 0 or 1. The exclusion limits for the single plus non-resonant plus pair vector LQ production are shown in figure 6. The behaviour of the upper limits as a function of m LQ reflects the signal acceptance times efficiency of the analysis. Figure 7 shows the vector LQ limits in the λ − m LQ plane for each of the κ coupling values considered.
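The CL s exclusion criterion quoted above can be illustrated for the simplest possible case. The sketch assumes a single-bin counting experiment without systematic uncertainties, where CL s is the ratio of the Poisson probabilities of observing at most the observed count under the signal-plus-background and background-only hypotheses. This differs from the binned, profiled, asymptotic treatment used in the analysis, and the function names and yields are hypothetical.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, signal, background):
    """CL_s = CL_{s+b} / CL_b for a single-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def is_excluded(n_obs, signal, background, alpha=0.05):
    """A signal yield is excluded at 95% CL when CL_s < 0.05."""
    return cls(n_obs, signal, background) < alpha
```

With zero observed events, CL s reduces to exp(−s) regardless of the background, so a signal yield of 3.0 events is excluded at 95% CL while 2.9 events is not. Dividing by CL b is what protects against excluding signals to which the measurement has little sensitivity when the data fluctuate below the background expectation.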
The same procedure is used to interpret the results for single, non-resonant and pair production of scalar LQs from the S 1 model. The single S 1 production and the combined single plus non-resonant plus pair S 1 production are considered. The 95% CL s limits on the S 1 production cross-section are derived as a function of LQ mass for various assumptions on the coupling λ value. The single plus non-resonant plus pair S 1 production result is shown in figure 8. The exclusion limits in the λ − m LQ plane are shown in figure 9.
The observed and expected limits on the LQ mass for the various signal production modes considered are reported in   [118]. The interference of the non-resonant LQ production with SM processes is expected to be small in the high b-jet p T category and it is neglected.

Figure 9. The two-dimensional 95% CL exclusion limits in the λ − m LQ plane for singly plus non-resonant produced S 1 (green lines) and for the sum, referred to as Total, of single plus non-resonant plus pair vector LQ production (blue lines). Regions to the left of the lines are excluded. The interference of the non-resonant LQ production with SM processes is expected to be small in the high b-jet p T category and it is neglected.

The τ had τ had channel is more sensitive than the τ lep τ had channel due to the smaller background and the larger signal to background ratio in the last S T bin. This difference arises from the larger tt background for the bτ lep τ had final state in the most sensitive part of the S T spectrum.
An additional model-independent search considering both the high and low b-jet p T signal regions in the τ lep τ had and τ had τ had channels is performed. For each of these regions, the events with S T < 600 GeV form a sideband region, while a signal region is defined by counting the number of events with an S T value above a variable threshold. First the four sideband regions, one for each channel and for each b-jet p T signal region, are fit simultaneously considering the background-only hypothesis. Then, the fit results are used to scale the predicted background contribution in the signal regions. In each signal region, the signal is obtained by counting the number of observed data events after subtracting the background prediction. Figure 10 shows the post-fit distributions of the S T variable in the sideband region and the background composition in each channel as a function of the S T lower bound threshold used to define the signal regions.
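The counting procedure of the model-independent search can be sketched for one region as follows. As a strong simplification, a single normalisation ratio between data and simulation in the sideband stands in for the simultaneous background-only fit of the four sidebands. The function names, event lists and weights are hypothetical.

```python
SIDEBAND_MAX_GEV = 600.0


def sideband_scale(data_st, background):
    """Data-to-prediction ratio for events with S_T below the sideband edge.

    data_st is a list of S_T values (GeV); background is a list of
    (S_T, weight) pairs from simulation.
    """
    n_data = sum(1 for st in data_st if st < SIDEBAND_MAX_GEV)
    n_bkg = sum(w for st, w in background if st < SIDEBAND_MAX_GEV)
    if n_bkg <= 0:
        raise ValueError("sideband background prediction must be positive")
    return n_data / n_bkg


def excess_above(data_st, background, threshold):
    """Observed events above the S_T threshold minus the scaled background."""
    if threshold < SIDEBAND_MAX_GEV:
        raise ValueError("threshold must not overlap with the sideband")
    scale = sideband_scale(data_st, background)
    n_data = sum(1 for st in data_st if st > threshold)
    n_bkg = scale * sum(w for st, w in background if st > threshold)
    return n_data - n_bkg


# Hypothetical events: S_T values in GeV for data, (S_T, weight) for simulation.
data_events = [350.0, 420.0, 500.0, 580.0, 650.0, 700.0, 980.0]
background_events = [(400.0, 2.0), (550.0, 3.0), (620.0, 1.0), (900.0, 0.5)]

excesses = {
    thr: excess_above(data_events, background_events, thr)
    for thr in (600.0, 950.0)
}
```

Scanning the threshold in this way yields one counting experiment per threshold value, each of which can then be tested against a generic signal.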
Since no significant excess is observed in any of the signal regions, a signal-plus-background fit is performed considering a generic signal in the signal region. As in the LQ search, the parameter of interest of the statistical analysis is the signal strength µ. The results are translated into upper limits on the number of signal events which, divided by the integrated luminosity, give upper limits on the visible cross-section, σ vis . Figure 11 shows the limit values of the visible cross-section as a function of the S T lower bound threshold in each signal category. The visible cross-section limits can be reinterpreted as limits on specific physics models as long as the selection efficiency and acceptance of the model (including any uncertainties in these values) for a specific signal region definition used in this analysis are known. By dividing the visible cross-section limits given here by this efficiency and acceptance, upper limits on the cross-section can be derived.
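The two conversions described above (event-count limit to visible cross-section, and visible cross-section to model cross-section) amount to simple divisions, sketched here. The integrated luminosity of 139 fb−1 is taken from the text; the event-count limit and the acceptance times efficiency are hypothetical values chosen only for illustration, as are the function names.

```python
LUMINOSITY_FB_INV = 139.0  # integrated luminosity quoted in the text


def visible_cross_section_limit(n_signal_upper, luminosity=LUMINOSITY_FB_INV):
    """Upper limit on sigma_vis (fb) from an upper limit on signal events."""
    return n_signal_upper / luminosity


def model_cross_section_limit(sigma_vis_limit, acceptance_times_efficiency):
    """Upper limit on a model's cross-section (fb) for a given A x efficiency."""
    if not 0.0 < acceptance_times_efficiency <= 1.0:
        raise ValueError("acceptance times efficiency must be in (0, 1]")
    return sigma_vis_limit / acceptance_times_efficiency


# Hypothetical inputs: a limit of 23.63 signal events and A x efficiency of 5%.
sigma_vis = visible_cross_section_limit(23.63)
sigma_model = model_cross_section_limit(sigma_vis, 0.05)
```

Any uncertainty in the acceptance and efficiency of the reinterpreted model would have to be propagated separately, since it is not part of the visible cross-section limit.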

Conclusion
A search for scalar and vector leptoquarks is performed in the bτ τ final state using pp collision data at √ s = 13 TeV recorded by the ATLAS detector at the LHC from 2015 to 2018, corresponding to an integrated luminosity of 139 fb −1 . Final states including one leptonic and one hadronic τ -lepton decay or two hadronic τ -lepton decays are considered. In each of these two final states, events are classified, based on the p T of the b-jet, into two signal regions of low and high b-jet p T . The benchmark model is U 1 for vector leptoquarks in the Yang-Mills or Minimal coupling scenarios with λ between 0.5 and 2.5. For scalar leptoquarks the benchmark model is S 1 , with values of the λ parameter ranging between 0.5 and 2.5.
Upper limits at 95% CL are set on the cross-section for leptoquarks that decay into bτ and are produced either via single plus non-resonant production or via all production modes (including pair production). The results have been extracted considering only the high b-jet p T signal region and the combination of both final states. For the Yang-Mills coupling, the observed (expected) lower limits on the leptoquark mass are 1. An interpretation of the results in a model-independent scenario is also performed for each of the signal region categories. The 95% confidence level limits on the visible cross-section vary between 0.17 fb and 4.8 × 10^−2 fb as the threshold on the event variable S T ranges from S T > 600 GeV to S T > 950 GeV.