Search for long-lived heavy neutral leptons with lepton flavour conserving or violating decays to a jet and a charged lepton

A search for long-lived heavy neutral leptons (HNLs) is presented, which considers the hadronic final state and coupling scenarios involving all three lepton generations in the 2-20 GeV HNL mass range for the first time. Events comprising two leptons (electrons or muons) and jets are analyzed in a data sample of proton-proton collisions, recorded with the CMS experiment at the CERN LHC at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 138 fb$^{-1}$. A novel jet tagger, based on a deep neural network, has been developed to identify jets from an HNL decay using various features of the jet and its constituent particles. The network output can be used as a powerful discriminating tool to probe a broad range of HNL lifetimes and masses. Contributions from background processes are determined from data. No excess of events in data over the expected background is observed. Upper limits on the HNL production cross section are derived as functions of the HNL mass and the three coupling strengths $V_{\ell\mathrm{N}}$ to each lepton generation $\ell$ and presented as exclusion limits in the coupling-mass plane, as lower limits on the HNL lifetime, and on the HNL mass. In this search, the most stringent limit on the coupling strength is obtained for pure muon coupling scenarios; values of $\lvert V_{\mu\mathrm{N}}\rvert^{2}$ $\gt $ 5 (4) $\times$ 10$^{-7}$ are excluded for Dirac (Majorana) HNLs with a mass of 10 GeV at a confidence level of 95% that correspond to proper decay lengths of 17 (10) mm.


Introduction
The observation of neutrino oscillations [1][2][3], a mechanism by which neutrinos can change their flavour, implies that at least two of the standard model (SM) neutrinos must have a nonzero mass and that neutral lepton flavour number is not a conserved quantity.Upper limits on the sum of all neutrino masses have been derived from cosmological considerations, whereas the study of particle decays via charged currents (e.g.β-decay of tritium [4]) has allowed upper limits to be set on the mass per lepton generation [5].The smallness of the neutrino masses compared to those of the other SM particles and to the vacuum expectation value of the Higgs potential suggests that there might exist another mechanism, beyond the SM, to explain the fundamental nature of neutrinos.
In this context, a minimal extension to the SM is given by a "seesaw" mechanism [6][7][8][9][10][11], which postulates additional neutral leptons that are singlets with respect to the SM gauge groups and hence can possess an arbitrary mass.Through coupling to the Higgs field, these new particles mix with the SM neutrinos after electroweak symmetry breaking.The resulting mass eigenstates correspond to the three SM neutrinos with small masses and to three nearly sterile heavy neutral leptons (HNLs) that couple to the W and Z bosons in a similar way as the SM neutrinos.However, their coupling strengths are suppressed because of small mixing angles.An overview of the phenomenology of HNLs at the GeV mass scale can be found in Ref. [12].
A common model that incorporates HNLs via the seesaw mechanism is the neutrino minimal standard model [13,14].Models incorporating HNLs are further motivated since, beside neutrino oscillations, they can explain additional phenomena that are not described by the SM: the baryonic asymmetry of the universe and the nature of dark matter [15,16].Phenomenologically, an HNL can be either a Dirac or Majorana particle, N, that is characterized by its mass m N and coupling strengths V eN , V µN , and V τN to electrons, muons, and tau leptons, respectively.The HNL can be long-lived for sufficiently low coupling strengths since its partial three-body decay widths are proportional to m 5 N |V ℓN | 2 , where ℓ ∈ {e, µ, τ} [12].In general, lepton flavour number is not a conserved quantity in HNL interactions.It is only conserved in the special case where only one V ℓN coupling is nonzero.
Searches for HNLs are pursued at various experiments covering a wide range of masses from a few keV up to several TeV [17,18].At the CERN LHC, searches for HNLs are performed employing their production in the decay of mesons or through on-shell W or Z bosons, where the most stringent limits on V eN and V µN can be derived starting at the GeV mass scale.Corresponding analyses have been conducted by the ATLAS, CMS, and LHCb Collaborations [19][20][21][22][23][24][25][26][27][28][29][30][31][32], which mainly target signatures arising from the production of HNLs and their prompt decay to only one lepton generation (e or µ), whereas long-lived HNLs and the possibility of cross-generation lepton couplings are only recently being explored.Various of these experimental results have been statistically combined using the GAMBIT tool [33].
The two most common three-body decay modes of HNLs involve a virtual W boson and result in final states comprising two leptons and a neutrino or one lepton and two quarks with branching fractions of approximately 25 and 50% for m N > 5 GeV, respectively.A signature considered in this search is depicted in Fig. 1, which shows the Feynman diagram for Dirac HNL production at the LHC via an on-shell W boson for m N < m W , and the subsequent HNL decay to a lepton and two quarks.Alternatively, HNLs can also be produced in decays of b hadrons for m N < m b at the LHC, which is not considered in this search.For long-lived HNLs, the decay products are spatially separated (i.e.displaced) from the point where the HNL is produced.Thus, the process depicted in Fig. 1 results in a signature comprising one prompt lepton ℓ 1 , a displaced lepton ℓ 2 , and at least one displaced jet j ⋆ , a collimated set of particles originating from the hadronization of the two quarks.If the HNL is highly Lorentz-boosted, a single jet can also be formed encapsulating all displaced decay products from the HNL, a case that is also considered in this analysis.For Dirac HNLs, the final-state leptons have opposite electrical charges, while for Majorana HNLs they can have either the same or opposite electrical charges with equal probability.
Figure 1: Born-level Feynman diagram for Dirac HNL production and decay via a chargedcurrent interaction.Corresponding diagrams exist also for Dirac anti-HNL and Majorana HNL production and decay.In this search, the HNL decay products, encapsulated by the boxes, can be found collimated within a single jet j ⋆ .
In this paper, a search for long-lived Dirac and Majorana HNLs is performed without assuming that their coupling is exclusive to only one lepton generation.Instead, HNL production is investigated in the full coupling space spanned by V eN , V µN , and V τN for 2 < m N < 20 GeV, where an HNL proper lifetime τ 0 of up to cτ 0 = 10 4 mm can be probed.A data set of protonproton (pp) collisions recorded by the CMS detector at a centre-of-mass energy of 13 TeV and corresponding to an integrated luminosity of 138 fb −1 is analyzed.Events containing two leptons (e or µ) and jets are selected to target the hadronic HNL decay channel.Several event categories are defined to provide sensitivity to Dirac and Majorana HNL signals, both prompt and displaced, while contributions from background processes are estimated from data.A central feature of the analysis is the identification of displaced jets (with or without an overlapping displaced lepton) using a deep neural network (DNN) that does not explicitly require the reconstruction of displaced vertices.The DNN uses a domain adaptation technique to ensure accurate performance of the resulting classifier in data.The presented search explores an orthogonal phase space compared to Ref. [28], where similar parameter ranges in m N and cτ 0 are probed but events with three leptons and no jets have been investigated instead.
An outline of the paper is as follows: a brief description of the CMS detector and the reconstruction of events is given in Section 2. The real and simulated data sets are detailed in Section 3, followed by a description of the event selection and categorization in Section 4. In Section 5, details of the identification of displaced jets based on a DNN are presented including its architecture and training setup.The determination of the number of events from background processes is described in Section 6.A list of the sources of systematic uncertainties affecting the expected number of signal and background events can be found in Section 7. The results are presented in Section 8, followed by a summary in Section 9.
Tabulated results are provided in the HEPData record for this analysis [34].

The CMS detector and event reconstruction
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintilla-tor hadron calorimeter (HCAL), each composed of a barrel and two endcap sections.Forward calorimeters extend the pseudorapidity η coverage provided by the barrel and endcap detectors.Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [35].
The particle-flow (PF) algorithm [36] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector.The energy of photons is obtained from the ECAL measurement.The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track.The energy of muons is obtained from the curvature of the corresponding track.The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers.Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.The PF algorithm is able to reconstruct electrons and muons with efficiencies of about 80 and 95%, respectively, for transverse displacements of less than 1 cm.At larger displacements, the efficiencies gradually reduce and approach zero at 40 cm and 1 m, respectively, as determined from simulation.
The missing transverse momentum vector ⃗ p miss T is defined as the projection onto the plane perpendicular to the beams of the negative vector momentum sum of all PF candidates in an event.Its magnitude is referred to as p miss T .Jets are reconstructed from PF candidates and clustered by applying the anti-k T algorithm [37] with a distance parameter of 0.4.The reconstruction of jets is fully efficient with transverse displacements up to the edge of the inner tracking volume.After this, the reconstruction efficiency degrades as displaced charged particles can only be identified through their calorimetric energy deposits.
The primary vertex (PV) is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [38].
Events with spuriously high p miss T , originating from a variety of reconstruction failures, detector malfunctions, or noncollision backgrounds, are vetoed by event filters with negligible impact on events with genuine p miss T [39].
Events of interest are selected using a two-tiered trigger system.The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of 4 µs [40].The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [41].

Data set and simulated samples
The analyzed sample of pp collision data was recorded at √ s = 13 TeV with the CMS detector during 2016, 2017, and 2018.The subsamples by year correspond to integrated luminosities of 36.3, 41.5, and 59.7 fb −1 , respectively.The 2016, 2017, and 2018 data sets are analyzed independently and only combined for the final result to account for differences related to the data-taking conditions and to apply proper corrections and calibrations per year.Events were triggered by requiring at least one electron or muon candidate within |η| < 2.4.Electron candidates must have a transverse momentum p T of at least 27 (32) GeV in 2016 (2017, 2018), and pass additional identification requirements [42].Muon candidates have to fulfil p T > 24 GeV (2016, 2018) or p T > 27 GeV (2017), and have to be isolated from other activity in the event [43].
Samples of simulated signal events are used to evaluate the detector resolution, efficiency, and acceptance for HNL production processes.Additionally, samples of simulated background events are used for understanding the composition of the data and to optimize the analysis strategy.Samples of simulated events are furthermore used to train machine learning algorithms.However, the final results are obtained by determining the contribution from background processes using data in sideband regions.
Samples of HNL production with various values of m N ranging from 2 to 20 GeV, and with 10 −5 < cτ 0 < 10 4 mm, are generated in the five-flavour scheme at leading order (LO) accuracy (i.e.tree-level) using the MADGRAPH5 aMC@NLO v2.6.5 event generator [44].The parton distribution function (PDF) set used is NNPDFv3.1 [45].Parton showering is performed with PYTHIA 8.230 [46] using the underlying event tunes CUETP8M1 [47] (2016) or CP5 [48] (2017, 2018).Each sample is generated assuming an equal coupling strength for the three lepton generations, V eN = V µN = V τN .The resulting events are reweighted to also probe different combinations of coupling strengths.For these, the HNL displacement, determined by the total HNL decay width Γ N , has to be kept constant such that the subsequent detector simulation and event reconstruction do not need to be rerun.The HNL decay width can be written as where the factors A ℓ , calculated by MADGRAPH5 aMC@NLO, describe the dependence of the partial widths per lepton flavour Γ ℓ on m N .Equation (1) describes an ellipsoid spanned by V eN , V µN , and V τN for a given Γ N and m N .Thus, the generated events of a particular sample can be reweighted to any coupling combination on the ellipsoid surface.The barycentric coordinates f e , f µ , and f τ with ∑ f ℓ = 1 are introduced, which denote the relative ratios between the coupling strengths squared |V eN | 2 : |V µN | 2 : |V τN | 2 ≡ f e : f µ : f τ .In total, 66 equidistant weights in f ℓ space are used, which have been calculated by the event generator at the matrix element level [49].The modelling of low-p T jets originating from HNL decays has been validated by using the average charged particle multiplicity and comparing it to previous measurements in e + e − annihilation [50].The HNL production cross sections are calculated at next-to-leading order (NLO) in the strong coupling constant, using the same generator settings, and are found to be approximately 10% higher than the cross sections calculated at tree level for all samples.
Further details on the HNL Lagrangian and the implementation of models for both Dirac and Majorana HNLs can be found in Refs.[51][52][53][54].
The major background processes are the production of vector bosons in association with jets (W+jets, Z/γ * +jets, and Vγ * +jets, where V ∈ {W, Z}), top quark production (tt, single t), and multijet events consisting of jets produced through the strong interaction.Samples of simulated background events are generated at NLO accuracy using the MADGRAPH5 aMC@NLO or POWHEG [55][56][57] 2.0 event generators with the exception of multijet events that are generated at LO with PYTHIA.The NNPDFv3.1 set is used and parton showering is performed by PYTHIA with the same underlying event tunes as used for the signal samples.
The simulated events are overlaid with additional simulated minimum bias collisions ("pileup") according to the distributions inferred from the data per year.All generated events un-dergo a full GEANT4 [58] simulation of the detector response and are subjected to the same event reconstruction as the data.

Event selection and categorization
Events containing two leptons (ℓ ∈ {e, µ}) and at least one jet are analyzed.This signature targets the production and decay of HNLs via charged currents, pp → ℓ 1 N, N → ℓ 2 qq ′ .The first lepton ℓ 1 is expected to originate from the production vertex, while the second lepton ℓ 2 and the jet(s) from the hadronization of the two quarks can be displaced with respect to the PV.The hadronic HNL decay channel targeted by this analysis has a branching fraction of about 50% for m N > 5 GeV, which is approximately twice as large compared to the leptonic decay channel, N → ℓ ± ℓ ∓ ν.Other decay channels occur via neutral currents and result in νqq or ννν final states.Decays to other BSM particles, e.g.lighter HNLs, are not considered in the model used for generating the signal samples.In scenarios where m N ≲ 8 GeV, the HNL is found to be sufficiently Lorentz-boosted with respect to the laboratory frame such that its decay products are collimated inside a single jet.The selected events also include leptonic τ decays to e or µ.Tau lepton decays to hadrons are not considered in this analysis because efficient event trigger strategies with sufficiently low p T thresholds on τ candidates were not available.In the following, the selection criteria for the physics objects are described.

Physics object selection
Two types of candidates are defined for electrons and muons depending on whether they are considered for ℓ 1 or ℓ 2 .The former is used to trigger the event and thus has to fulfil more stringent, tight identification criteria, while the latter is required to pass only relaxed, loose identification criteria instead.This increases the signal selection efficiency in cases when ℓ 2 is displaced and/or clustered inside a jet.
Tight electron candidates are required to have a p T of at least 29 (34) GeV and be within |η| < 2.4 for the 2016 (2017, 2018) data-taking periods.A boosted decision tree (BDT) discriminant is used to identify genuine electrons with an efficiency of ≈90% as measured in Drell-Yan events [42].This BDT is trained using several features of the electron track and of the associated energy deposits in the ECAL, including the relative isolation parameter I rel (e).This parameter is defined as the scalar p T sum of the charged and neutral PF candidates within a cone of radius ∆R = √ (∆η) 2 + (∆ϕ) 2 = 0.3 around the electron candidate, excluding the candidate itself, divided by the electron p T .Here, ∆η and ∆ϕ are, respectively, the pseudorapidity and azimuthal angle measured in radians and relative to the electron direction.The relative contribution from pileup is estimated as A eff ρ and subtracted from the isolation parameter, where A eff denotes an η-dependent effective area, and ρ is the median of the transverse energy density in a δη-δϕ region calculated using the charged particle tracks associated with the pileup vertices.Furthermore, the electron candidate has to originate from the vicinity of the PV by requiring a transverse and longitudinal impact parameter of |d xy | < 0.05 (0.10) cm and |d z | < 0.1 (0.2) cm, respectively, in the ECAL barrel (endcap) region.
Loose electron candidates need to fulfil p T > 5 GeV, |η| < 2.4, and pass a series of loose selection criteria that have been optimized to enhance the acceptance for displaced electrons that can also be clustered within nearby jets, as defined below.The criteria are a modified version of the standard identification requirements [42] but allow for any number of missing electron track hits in the inner tracker.Furthermore, no requirements on the compatibility with the PV or on I rel (e) are imposed.
Tight muon candidates must be within |η| < 2.4, have a p T of at least 26, 29, 26 GeV for the 2016, 2017, 2018 data-taking periods, respectively, and fulfil additional identification requirements optimized for the selection of genuine muons that result in efficiencies of >96% as determined from a data sample of Drell-Yan events [59].In particular, muon candidates are required to be within |d xy | < 0.01 cm and |d z | < 0.05 cm of the PV.Furthermore, the candidates must be isolated with a relative isolation parameter I rel (µ) < 15%, which is defined as the scalar p T sum of PF photons and charged and neutral PF hadrons within a cone of radius of ∆R = 0.4, divided by the muon p T .Contributions from pileup are estimated from the energy deposited by charged hadrons within the isolation cone that are associated with pileup vertices and subtracted from the isolation parameter.
Loose muon candidates are required to pass p T > 3 GeV, |η| < 2.4, and fulfil only minimal requirements necessary for being reconstructed by the PF algorithm [59].Similar to loose electron candidates, loose muon candidates are allowed to be inside jets and do not have to be compatible with the PV.
Events with exactly two leptons of any charge and flavour combination are considered for further analysis, where at least one needs to fulfil the tight candidate requirements.This results in event categories, denoted "ℓ 1 ℓ 2 ", where the first lepton is the tight candidate that is also matched to a corresponding event trigger and has to have p T (ℓ 1 ) > p T (ℓ 2 ) to avoid the double counting of events where both leptons are responsible for a positive decision by the trigger system.
The momentum of each jet is defined as the vectorial momentum sum over all its constituent PF candidates.To mitigate the contribution of additional tracks and calorimetric energy depositions from pileup interactions, PF candidates identified as originating from pileup vertices are discarded and an offset correction is applied to account for remaining contributions [60].Jet energy corrections are derived from simulation so that their average measured energy becomes identical to that of particle level jets.In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made.The corrections are propagated to the measured ⃗ p miss T .
The analysis considers jets within |η| < 2.4 with a calibrated p T greater than 20 GeV.Jets that overlap with ℓ 1 within ∆R < 0.4 are ignored, whereas the jet p T needs to be at least 30 GeV if ℓ 2 is within ∆R < 0.4 from the jet.Jets close to ℓ 2 are also ignored if I rel (ℓ 2 ) < 15% to reject jets that are primarily clustered from isolated leptons.The jet that is the closest in ∆R to ℓ 2 is interpreted as the jet from the (displaced) HNL decay and is hereafter referred to as j ⋆ .In simulation this procedure assigns a reconstructed jet correctly to the HNL decay with an efficiency of larger than 95%.Events with more than one displaced jet from the HNL decay passing the event selection constitute a negligible fraction and are hence not explicitly targeted by the selection.

Event categories
The invariant mass m ℓℓj ⋆ is calculated from the summed four momenta of the two leptons and j ⋆ .In signal events, m ℓℓj ⋆ approximates the mass of the W boson.The signal region (SR) is defined for selected events with an invariant dilepton mass of 20 < m ℓℓ < 80 GeV, p miss T < 60 GeV, and consisting of 1-4 jets, to reject Z/γ * +jets, W+jets, and high-p T processes (e.g.tt) with many jets that can additionally be produced in association with neutrinos.Additionally, j ⋆ must be within ∆R < 1.3 of ℓ 2 since in signal events both are expected to be spatially close together and even overlap when the HNL is sufficiently boosted.To suppress contributions from low-p T leptons produced during hadronization, which are often found in W+jets and multijet events, the ratio p T (j ⋆ )/p T (ℓ 2 ) must be smaller than 2-4 depending on the data-taking period and category, since the p T threshold for leptons also varies per data-taking period.Additionally, µ ± µ ∓ events must have m ℓℓ < 70 GeV to further reject the overwhelming Z/γ * +jets background in these categories.After applying all object and event selection criteria, the following approximate fractions of signal events remain when summed over all lepton flavour combinations: 2.5% for cτ 0 ≤ 1 mm, 1.5% for cτ 0 = 10 mm, 0.9% for cτ 0 = 100 mm, and less than 0.1% for scenarios with larger cτ 0 values.These low efficiencies are primarily due to HNLs decaying into low-p T leptons that often do not satisfy the p T thresholds applied as part of the event selection.
A control region (CR), enriched in background events (mainly Z/γ * +jets), is obtained by inverting the SR selection on m ℓℓ , i.e. to be larger than 80 GeV.This region is used for validating various aspects of the analysis strategy such as the modelling of the simulation and the displaced jet tagger (Section 5).
Events in the SR are categorized through a series of criteria to be simultaneously sensitive to various HNL mass and coupling scenarios.First, events are separated depending on the lepton flavour combination (ee, eµ, µe, µµ) for probing different coupling scenarios and on the lepton charge combination, i.e. opposite-or same-sign (OS, SS) events, to be sensitive to Dirac and Majorana HNL production, simultaneously.Furthermore, events are classified as either "boosted" if ℓ 2 overlaps with j ⋆ (i.e.∆R < 0.4) or "resolved" otherwise.Finally, events are split into three subcategories depending on the significance of the transverse displacement d Distributions of m ℓℓj ⋆ in the SR are presented in Fig. 2 for events with OS and SS lepton pairs.The data are found to be described well by the prediction from simulation normalized to the integrated luminosity.A representative signal scenario for Majorana HNL production assuming m N = 10 GeV and cτ 0 = 1 mm with V eN = V µN = V τN is overlaid.A resonance is expected in the m ℓℓj ⋆ distribution for signal events at the W boson mass, while a broader spectrum is found for background processes.The background in events with OS lepton pairs consists primarily of Z/γ * +jets and W+jets events, while Vγ * +jets, tt, single top quark, and multijet processes are subdominant.A peak around the Z boson mass is seen in Z/γ * +jets events since m ℓℓj ⋆ is mainly calculated from the two leptons momenta when the hadronic activity in j ⋆ is low.In SS lepton pair events, multijet and W+jets events get selected because of misidentified leptons and compose the majority of the background.The number of multijet events appears to be slightly overestimated by the simulation resulting in an overall mismatch in normalization between data and prediction, which is however found to be covered by uncertainties.For further analysis, only events in the signal-enhanced region, defined by 70 < m ℓℓj ⋆ < 90 GeV, are kept, whereas the sideband region in m ℓℓj ⋆ is used to estimate the background contributions from data (Section 6).Overall, the total background is of comparable size in both categories and since the Majorana and Dirac HNL production cross sections are identical, a similar sensitivity for both HNL types is expected.

Identification of displaced jets
Displaced jets that originate from the HNL decay products are identified using a jet tagger based on a DNN.The tagger is trained using jets originating from HNL decays with a transverse displacement L xy between 100 µm and 1 m, where the lower threshold corresponds to a factor 2-4 times the position resolution of the PV [61].The DNN is trained on various features of the selected jets and parameterized with respect to L xy , resulting in an optimized multiclass classifier that can be applied for a large variety of jets and displacements.The network is based on the developments reported in Ref. [62], but its setup has been extended in several aspects to account for the specific properties of jets from HNL decays, as detailed in the following.
The DNN architecture consists of approximately 330 000 parameters.The set of input features includes global jet features, as well as specific features for each type of jet constituent.The global jet features are the jet p T , |η|, mass, and area; the fraction of energy carried by the constituents within cones of ∆R = 0.1, 0.2, 0.3, 0.4 with respect to the jet axis; the number of constituents carrying 60 or 90% of the jet energy; the charged, electromagnetic, muon, and electron energy fractions; the variables used in the combined secondary vertex b tagging algorithm [63]; the 1-, 2-, and 3-subjettiness variables [64]; and various event shape observables that are calculated from the constituents in the rest frame of the jet to probe its substructure.In total, 47 global jet features are used.
The features per constituent are grouped into blocks according to their type as follows: • Charged PF candidates: These candidates have an associated track and can thus provide information about the potential displacement of the jet within the tracking volume.The features encapsulate various properties of the track, its displacement with respect to the PV, and whether the track is also linked to a secondary vertex or a PF electron or muon.Up to 25 charged PF candidates are considered per jet, which are sorted according to the significance of their displacement in the transverse plane with respect to the PV.In total, 36 features are used per charged PF candidate.
• Neutral PF candidates: Features from up to 25 neutral constituents are used as input, which are important for identifying jets from HNL decays that occur at the outer edge of the tracking volume, where the track reconstruction efficiency is low.
The neutral candidate features include the likelihood that the candidate stems from pileup interactions as calculated by the PUPPI algorithm [65].The candidates are sorted by either their distance in ∆R to a secondary vertex or by their p T in cases where no secondary vertex has been reconstructed.In total, 9 features are used per neutral PF candidate.
• Secondary vertices: Features from up to four secondary vertices, reconstructed with the inclusive secondary vertex finder algorithm [66], that are found within ∆R < 0.4 of the jet axis and use at least one jet constituent in their reconstruction, are provided as input to the DNN as well.They are sorted by the significance of their displacement in the transverse plane with respect to the PV.In total, 14 features are used per secondary vertex.
• PF electrons/muons: If a charged PF candidate is linked to a PF electron or muon, features that are specific to their reconstruction are used as additional inputs.For electrons, the properties of the Gaussian-sum-filter track [67] and matching clusters in the ECAL are used.For muons, the features include details of the global track fit and the hit pattern in the muon chambers.Up to two electrons and muons are considered per jet, which are sorted according to the significance of their displacement in the transverse plane with respect to the PV.In total, 78 (37) features are used per electron (muon) candidate.
In the first stage of the DNN, the various jet constituent features are compressed through a series of one-dimensional convolutions with varying filter sizes using a kernel size of one.The last convolutional layer can be seen as a bottleneck that forces the DNN to extract only the most discriminating information from the larger number of input features, which is then passed on to the next stage.The compressed features of each block are flattened and concatenated with global jet features and L xy , where the latter acts as a parameter of the DNN [68].The result is then passed through two dense layers with 200 nodes.At this stage, the network splits into one branch to predict the jet class and into another branch to predict the jet domain (i.e.simulation or data), where each branch consists of another series of dense layers.The following jet classes are considered: jets from the hadronization of gluons or quarks; jets from pileup interactions; jets clustered primarily from prompt e, µ, photons, or hadronically decaying τ leptons; displaced jets from the HNL decay with or without displaced e, µ, or hadronically decaying τ leptons.
In general, the intricate modelling of displaced signatures is less well studied compared to prompt ones.Hence, features of displaced jets can possess a different trend in data compared to simulation.The distribution of output discrimination scores from a classifier, trained solely on simulated events, can differ significantly when evaluated on simulation or data.In a previous study, differences of up to 50% have been seen between the selection efficiencies of data and simulation [62].In this search, a domain adaptation technique is used to mitigate such differences and effectively calibrate the response of the tagger for simulated events.Technically, a layer is inserted into the domain branch that reverses the gradients of the loss function with respect to the DNN parameters in the preceding layers.During the training, this forces the DNN to retain only discriminating features from the inputs that are domain-invariant [69], i.e. features that are well described by the simulation.The loss function is chosen to approximate the earth mover distance [70], which yields well-defined gradients and thus stabilizes the ad-versarial training.The relative strength of the domain loss is chosen such that a trade-off is achieved between obtaining a discriminant with approximately equal efficiencies in data and simulation while simultaneously retaining high discrimination between the various jet classes.
The class branch is trained using jets from independent samples of simulated events from various processes.The jets are resampled to obtain a balanced set with equal class proportions as a function of the jet p T and η.Random L xy values are assigned to background jets that are sampled from the distribution of L xy values of true HNL jets [68].The domain branch is trained on jets from data and simulated background events in the CR, where the latter are weighted according to the expected cross section per process.
The best discrimination against SM jets is found by calculating the ratios for each displaced jet likelihood over the sum of the SM background likelihoods.The ratios are monotonically mapped to the range 0-1 for simplicity.Per displaced jet, the L xy parameter is chosen in the range 10 −1 -10 3 mm to maximize the resulting score for each class.For true displaced jets, the score will be the largest when L xy matches the displacement of the HNL, whereas it is randomly distributed for SM jets.The resulting scores are used as follows.For resolved events, the score for displaced jets with and without hadronically decaying τ leptons, P q (j ⋆ ), is evaluated.On the other hand, for boosted events the flavour of ℓ 2 determines whether the score for displaced jets with electrons, P e (j ⋆ ), or with muons, P µ (j ⋆ ), is evaluated instead.Distributions of the resulting scores are shown in Fig. 3 for resolved and boosted categories in the CR and SR.In general, the prediction from simulation, normalized to the integrated luminosity, describes the data well in all regions, except for a slight overestimation of multijet events as seen before in Fig. 2. Selections on the resulting displaced-jet tagger score can reduce the SM backgrounds by several orders in magnitude.For example, a selection of P(j ⋆ ) > 0.6 reduces the background yield by a factor of about 3000 (800) while retaining 10 (17%) of signal events for boosted (resolved) events considering the signal scenario in Fig. 3.In the analysis, optimized thresholds on P q,ℓ (j ⋆ ) are applied ranging from 0.2-0.7 depending on the event category to achieve the best sensitivity, as described later in Section 8.

Estimation of background processes
Various background sources contribute to the SR, with different relative fractions depending on the category.A substantial background contribution originates from Z/γ * +jets production in events with two genuine OS leptons of same flavour.In events with either SS leptons or OS leptons of different flavour, the dominant source of background events is misidentified leptons, which primarily originate from W+jets, Vγ * +jets, and multijet production, whereas events from tt production are less prominent.For final states containing electrons, "chargeflip" backgrounds, where the charge of one of the electrons is misidentified, are also possible.
Contributions from background processes are determined from data to accurately account for the reconstruction effects, selection efficiencies, and object calibrations of background events in the final event categories.This procedure has the additional advantage that no large samples of background events have to be simulated and subsequently calibrated for modelling the background contribution in the SR.
The yields in data, n α X , from three background-enriched sideband regions α ∈ {A, B, C} in each of the 48 categories X per data-taking period are used to determine the background contribution, b D X , in a signal-enriched region.The regions are defined by selecting events to be either above or below optimized thresholds on the P q,ℓ (j ⋆ ) score or on the m ℓℓj ⋆ observable, which are found to be independent from studies of simulated events.These selections partition the SR            Figure 4: Schematic of the orthogonal regions in the (m ℓℓj ⋆ ,P q,ℓ (j ⋆ )) plane, that are used to determine the background in region D from data through an ABCD method.The threshold P opt.depends on the category.Regions with P q,ℓ (j ⋆ ) < 0.1 (0.2) for resolved (boosted) events are not considered.The three VRs that are subsets of regions A and C are indicated in lighter shades.The figure is not to scale.
A so-called matrix ("ABCD") method is used, where the pass-fail ratio of an event passing the threshold on P q,ℓ (j ⋆ ) is measured in m ℓℓj ⋆ sidebands (50-70 GeV or 90-110 GeV) and then ap- plied to the observed yield found within 70-90 GeV.The yields b D X are determined through a maximum likelihood fit to data using the following model to properly account for non-Gaussian uncertainties in case of small event yields: where n α ijk is the number of events observed in region α with i denoting the data-taking period, j denoting the event topology, and k denoting the subcategory {ee, eµ, µe, µµ} ⊗ {OS,SS} ⊗ {d sig xy (ℓ 2 ) bins}.Nuisance parameters ⃗ θ that control the effects of systematic uncertainties are constrained through prior probability density functions p( ⃗ θ).A possible signal contamination of r s α ijk in the sidebands is explicitly considered, where r modifies the expected signal yields s α ijk from simulation for the given HNL scenario under consideration.
In addition to the signal strength r, fit parameters κ ij are introduced, for which the fitted values will be found to diverge from unity in the presence of a bias from the ABCD method.Separate κ ij parameters are used per data-taking period and per event topology.In summary, the parameters of this fit are r, b α ijk , ⃗ θ, and κ ij .The validity of the background estimation method is tested by predicting the data yields in three background-dominated validation regions (VRs) that are defined as follows.The first two VRs comprise events within m ℓℓj ⋆ ranges of 65-70 GeV or 90-100 GeV together with selections on the P q,ℓ (j ⋆ ) scores.For the third VR, events within a m ℓℓj ⋆ range of 70-90 GeV that are within an interval in the P q,ℓ (j ⋆ ) scores below P opt.are used.The three VRs in the (m ℓℓj ⋆ ,P q,ℓ (j ⋆ )) plane are sketched in Fig. 4. Another test is performed in an additional VR, defined using a narrow m ℓℓj ⋆ window within 5 GeV of region D, comprising subsets of events from both VR1 and VR2, to test the background estimation method close to region D.
Per VR, each κ ij parameter is determined through a simultaneous fit to 24 independent event categories.Overall, the determined event yields agree well with the observations in all VRs and uncertainties of 10-25% are assigned on b D ijk , chosen based on the most significant deviations of κ ij from unity.These uncertainties are implemented in fits of the SR through log-normal priors on the κ ij parameters.

Systematic uncertainties
The expected numbers of signal and background events can be affected by several sources of systematic uncertainties.Their impact is modelled through nuisance parameters that change the expected events yields and are correlated across all categories and years, where applicable.Most uncertainties only impact the signal prediction since it is obtained from simulation, whereas the number of expected background events is determined from data in sideband regions, as detailed in Section 6.
A set of systematic uncertainties related to the selection efficiencies and calibration of leptons, jets, and p miss T is evaluated for simulated events as described below.These are propagated through the displaced jet tagger by recalculating the scores on the modified inputs where applicable.
Systematic uncertainties in the prompt lepton reconstruction, identification, and trigger efficiencies are accounted for by varying p T and η-dependent normalization factors, derived from data-to-simulation ratios, within their uncertainties.Systematic uncertainties in the identification efficiency of the displaced lepton candidates are taken into account by varying additional scale factors within their uncertainties.These have been derived in J/ψ → µµ events and in asymmetric γ → ee conversion events.Furthermore, an uncertainty in the lepton track reconstruction efficiency is considered, derived from tracks in tt-enriched events.
The jet energy scale in simulated events is modified for all jets using the uncertainty associated with the p T and η-dependent scale factors.The resulting differences are propagated to p miss T .The jet energy resolution is modified in simulated events by either increasing or decreasing the difference in p T with respect to the matched jet at the generator level.Random smearing is used when no generator-level jet is found within ∆R < 0.4.The component of p miss T that is not clustered inside jets, and thus remains uncorrected by the jet energy scale, is varied within its uncertainty.
An uncertainty of 5% in the total inelastic cross section [71] is used when calculating the expected distribution of the number of pileup interactions in data.
An uncertainty in the displaced track reconstruction and jet tagging efficiency, which accounts for differences between data and simulation, is determined using tracks matched to secondary vertices in events from a tt-enriched region.These differences are parameterized as a function of the track |d xy |.A correction and the corresponding uncertainty in the displaced jet tagging efficiency is derived by varying the efficiency of the three most significantly displaced tracks within j ⋆ .These are found to be less than 10% for jets with prompt-like tracks (d xy < 0.1 cm) and in the range 10-20% at larger displacements.Uncertainties in the mistagging efficiency for jets from SM backgrounds are accounted for through the uncertainties in the background estimation.
The factorization and renormalization scales are varied by factors of 2 and 0.5 in simulated signal samples using precomputed weights by the event generator [72].Only residual changes after the selection are taken as uncertainties while the overall effect on the normalization is considered part of the theoretical uncertainty instead.Uncertainties due to the PDF choice are determined by reweighting each event according to the prescribed variations of the corresponding PDF set.
The integrated luminosities of the 2016, 2017, and 2018 data-taking periods are individually known with uncertainties in the 1.2-2.5% range [73][74][75], while the total 2016-2018 integrated luminosity has an uncertainty of 1.6%, the improvement in precision reflecting the (uncorrelated) time evolution of some systematic effects.
Uncertainties resulting from the finite size of the simulated samples are incorporated into the likelihood model using the Barlow-Beeston method [76].
The background yields are determined in-situ using the likelihood model defined in Eq. ( 2), which properly accounts for the statistical uncertainties in the data yields in the sidebands.Additional uncertainties in the range of 10-25% in the background yield are considered to account for the difference in the background estimation per data-taking period and for additional differences between resolved and boosted categories.
In summary, the statistical uncertainties from the sideband regions in the background yields are found to have the largest impact on the determined signal yield for the signal+background hypothesis, which can be as large as 10% depending on the category.Smaller impacts originate from the loose muon reconstruction (7%), the jet tagging efficiency (6%), the loose electron reconstruction (4%), and the jet energy scale and resolution (2%).

Results
The thresholds on the P q,ℓ (j ⋆ ) scores are optimized per category to yield the best discovery sensitivity [77] to the various HNL scenarios as follows.The expected yields and uncertainties from background processes are determined from data in sideband regions and the expected signal yields are taken from simulation, such that no bias with respect to the observed data can occur.The optimal thresholds are found to be within the range 0.2-0.7,depending on the event category.The resulting background yields are compared to the observed number of events in Fig. 5.The events in data are found to agree well with the estimated number of background events in each category.
Upper limits on the HNL production cross sections at 95% confidence level (CL) are determined as functions of m N and the absolute values of the couplings strengths squared |V eN | 2 , |V µN | 2 , and |V τN | 2 .A modified frequentist approach is employed, which uses the profile likelihood ratio as the test statistic [78], the CL s criterion [79,80], and the asymptotic formulae [81] to approximate the distributions of the test statistic under the background-only and signal-plusbackground hypotheses.
Two-dimensional exclusion limits on Dirac and Majorana HNL production are shown in Figs. 6,  7, and 8 for scenarios of pure coupling, of mixed coupling, and of democratic couplings to all lepton generations, respectively.A broad parameter space in mass and coupling values is excluded by this search.For m N ≲ 8 GeV, most exclusion limits roughly follow a constant cτ 0

fb
Figure 5: Observed number of events and predicted number of background events per category for (left) resolved and (right) boosted categories.The bin label denotes the flavour of the prompt (ℓ 1 ) and displaced (ℓ 2 ) lepton as ℓ 1 ℓ 2 .Two representative signal scenarios for Majorana HNL production with equal coupling to all lepton generations are overlaid.The lower panels show the ratio of the data to the predicted background.The hatched band shows the total systematic uncertainty in the predicted background.value.At higher masses, the cross section becomes too low and, at the same time, sensitivity to displaced scenarios is lost causing the limit curve to turn.At m N ≳ 14 GeV, the limits have transitioned into a prompt regime and follow a constant cross section value with negligible dependence on m N for m ℓ,q ≪ m N ≪ m W .The sensitivity rapidly deteriorates towards the pure τ lepton coupling scenario since ττ+jets events are only selected in the analysis in the case of leptonic τ decays to e or µ, whereas τ lepton decays to hadrons are not considered.No limits can be derived for scenarios with m N ≲ 3 GeV that are close to the pure τ lepton coupling case, because of the tau mass.For some HNL scenarios, the results are compared to an orthogonal search exploiting N → ℓ ± ℓ ∓ ν decays, where the two leptons form a displaced vertex [28], which prevents sensitivity to prompt scenarios, resulting in a different trend in the limits at high masses.In this search, the sensitivity to prompt scenarios is achieved by the displaced jet tagger, in part due to the absence of any requirement on the presence (or otherwise) of reconstructed secondary vertices.The best limits are obtained for pure muon coupling scenarios due to the excellent muon identification performance.Coupling strength values as low as |V µN | 2 > 5 (4) × 10 −7 are excluded for Dirac (Majorana) HNLs with m N = 10 GeV at 95% CL that correspond to proper decay lengths of 17 (10) mm.Limits on the HNL mass and proper lifetime are presented in Figs. 9 and 10, respectively, as functions of the relative coupling strengths to the three lepton generations, defined as ).The results are obtained by interpolating the logarithmic value of the limits linearly between neighbouring points in f ℓ -space.The most stringent limits are found for pure muon coupling scenarios.
The results obtained from the hadronic HNL decay show a comparable sensitivity as previous HNL searches targeting the orthogonal leptonic final state by the ATLAS, CMS and LHCb Collaborations [20,21,25,26,28,31,32].In particular, the presented results for the pure electron coupling scenario are found to be more stringent for m N ≲ 4 GeV.These are the first results reported that involve a prompt or long-lived Majorana or Dirac HNL that couples to all three lepton generations in the 2-20 GeV mass range.

Summary
A search is presented for long-lived heavy neutral leptons (HNLs), which are predicted in extensions of the standard model of particle physics through a seesaw mechanism.Dirac or Majorana HNLs can mix with the three standard model lepton generations in a nontrivial way, resulting in lepton flavour number violation.Hence, a complete set of coupling scenarios involving all three lepton generations is considered.A novelty of this analysis is the simultaneous sensitivity to prompt and displaced scenarios through the use of a dedicated displaced jet tagger that does not explicitly rely on the presence of secondary vertices.
A data sample of proton-proton collision events recorded with the CMS experiment at the CERN LHC at a centre-of-mass energy of 13 TeV is analyzed, corresponding to an integrated luminosity of 138 fb −1 .Events containing two leptons (electrons or muons) and jets are selected and categorized according to the lepton flavour and electrical charge, the displacement of the lower-momentum lepton, and whether the lepton overlaps with a nearby jet.Jets originating from the decay of a long-lived HNL are identified through a deep neural network using various features of the jet and its constituent particles.The network training relies on simulated event samples that cover a broad range of HNL masses and lifetimes, and acts as a powerful discriminant for jets that are spatially separated from the luminous region, even in the absence of reconstructed secondary vertices.
Contributions from background processes are determined from data in sideband regions.No excess of events in the data over the expected background is observed.Upper limits on the HNL production cross section are determined as functions of the HNL mass and the three coupling strengths to each lepton generation.Exclusion limits are presented in the couplingmass plane, as lower limits on the HNL lifetime, and on the HNL mass.Results are provided for Dirac and Majorana HNL production considering various coupling combinations.The most stringent limit on the coupling strength is obtained for pure muon coupling scenarios, with values of |V µN | 2 > 5 (4) × 10 −7 excluded for Dirac (Majorana) HNLs with a mass of 10 GeV at a confidence level of 95% that correspond to proper decay lengths of 17 (10) mm.This analysis is the first HNL search at the LHC that targets long-lived and hadronically decay- sig xy of ℓ 2 with respect to the PV; d sig xy is defined as d xy /σ xy , where d xy and σ xy are the transverse impact parameter and its uncertainty, respectively, as determined by the lepton track fit.The d xy /σ xy parameter is chosen because potential mismeasurements in d xy and σ xy cancel each other out in the ratio and its distribution in the CR is found to be well described by the simulation.Events are grouped by the d sig xy values into prompt-like (d sig xy < 3), displaced (3 < d sig xy < 10), and very displaced (d sig xy > 10) subcategories that probe a wide range of HNL lifetime scenarios.In total, this results in 48 independent event categories.

Figure 2 :
Figure2: Distributions of m ℓℓj ⋆ for events with (left) OS leptons and (right) SS leptons in the signal region.A representative signal scenario for Majorana HNL production with equal coupling to all lepton generations is overlaid with its expected cross section scaled up as indicated in parentheses.The hatched band shows the total experimental systematic uncertainty in the simulated background prediction including the uncertainty from the finite sample size.The bottom panel shows the ratio of data over the prediction.

Figure 3 :
Figure 3: Distributions of the displaced jet tagging score for (left) resolved and (right) boosted categories: (upper row) control region; (middle row) signal region with OS leptons; (lower row) signal region with SS leptons.A representative signal scenario for Majorana HNL production with equal coupling to all lepton generations is overlaid with its expected cross section scaled up as indicated in parentheses.The hatched band shows the total experimental systematic uncertainty in the simulated background prediction including the uncertainty from the finite sample size.The bottom panel shows the ratio of data over the prediction.

Figure 6 :
Figure 6: Expected and observed 95% CL limits on (left) Majorana and (right) Dirac HNL production as functions of the HNL mass and coupling strengths for (upper row) pure electron, (middle row) pure muon, and (lower row) pure τ lepton coupling scenarios.For the last coupling case no limits are derived for m N ≲ 3 GeV because of the tau mass.The dashed-dotted line shows the result of an orthogonal CMS analysis targeting HNL decays to leptons.

Figure 7 :
Figure 7: Expected and observed 95% CL limits on (left) Majorana and (right) Dirac HNL production for various coupling scenarios as functions of the HNL mass and coupling strengths: (upper row) mixed e-µ couplings; (middle row) mixed e-τ couplings; (lower row) mixed µ-τ couplings.The relative ratios of the coupling per lepton generation are indicated in the plots.

Figure 8 :
Figure 8: Expected and observed 95% CL limits on (left) Majorana and (right) Dirac HNL production for the democratic couplings to all lepton generation scenario as functions of the HNL mass and coupling strengths.

Figure 9 :Figure 10 :
Figure 9: Observed 95% CL lower limits on the (left column) Majorana and (right column) Dirac HNL mass as functions of the relative couplings to the three lepton generations considering a fixed proper decay length of (upper row) 0.1 mm and (lower row) 1 mm.The limits are determined for m N > 3 GeV.ingHNLs in the 2-20 GeV mass range, with inclusive coupling to all three lepton generations.The results show comparable sensitivity to an orthogonal analysis that targets long-lived HNLs decaying to leptons.