Soft displaced leptons at the LHC

Soft displaced leptons are representative collider signatures of compressed dark sectors with feeble couplings to the standard model. Prime targets are dark matter scenarios where co-scattering or co-annihilation sets the relic abundance upon freeze-out. At the LHC, searches for soft displaced leptons are challenged by a large background from hadron or tau lepton decays. In this article, we present an analysis tailored for displaced leptons with a low transverse momentum threshold of 20 GeV. Using a neural network, we perform a comprehensive analysis of the event kinematics, including a study of the expected detection efficiencies and backgrounds at small momenta. Our results show that weak-scale mediators to a compressed dark sector with decay lengths between 1 mm and 1 m can be probed with LHC Run 2 data. This motivates the design of dedicated triggers that maximize the sensitivity to displaced soft leptons.


Introduction
Collider searches for long-lived particles open a new dimension in the hunt for new physics. Lifetime measurements are particularly sensitive to hidden sectors of new particles with tiny couplings to the Standard Model (SM), well below the strength of weak interactions [1,2]. If the mediator of such an interaction can be produced at a sizeable rate, its decay is suppressed by the tiny coupling to the hidden sector and visible decay products appear displaced from the collision point. Such signatures are naturally predicted in dark matter scenarios beyond the thermal WIMP (for an overview see Ref. [3]). The potential to discover feebly coupled dark matter through long-lived mediators at colliders is unique, as potential signals in direct and indirect detection experiments are often suppressed by the tiny interaction.
Tiny couplings not only affect the collider phenomenology, but also the thermal history of dark matter. In scenarios where the relic abundance is produced via thermal freeze-out, pair annihilation is suppressed and no longer efficient around the freeze-out temperature. Instead, the relic abundance is set by processes of co-annihilation [4] or co-scattering [5] with a mediator particle, followed by efficient mediator annihilation. Freeze-out through co-annihilation or co-scattering is generally realized for mediators $M$ with small couplings to dark matter $\chi$ and sizeable couplings to SM particles. In addition, the number densities of dark matter and mediator particles around the freeze-out temperature need to be comparable, which typically results in a mass difference of about 10% [4]. These conditions set a clear target for collider searches: small dark matter couplings lead to displaced mediator decays $M \to \chi f$, with soft visible decay products $f$ because of the compressed mass spectrum. For mediator masses between 100 GeV and 1 TeV, the transverse momenta of the visible decay products range around 10–30 GeV (see for example Ref. [6]). The decay lengths of the mediator can vary between zero and a few centimeters for co-annihilation and up to a meter for co-scattering [5,7–12], depending on the realization in concrete models. For even smaller dark matter couplings, dark matter is never in thermal equilibrium and the relic abundance can be produced through freeze-in [13] or the decay of thermal mediators [14], which results in even larger decay lengths at colliders [11,15,16].
At the LHC, searches for soft displaced objects face substantial experimental challenges. Signals with small momenta and small displacements are produced with a large background from displaced decays of $B$ hadrons, commonly referred to as heavy flavor (HF). In current searches, this background is rejected by trigger settings or other selection criteria, which means that these searches also reject signals with small transverse momenta. In this work, we show how to overcome these hurdles to search for signals with soft displaced leptons.
Despite the challenges, existing searches for soft or displaced particles probe co-annihilating or co-scattering dark matter in parts of the predicted signal space. Searches for prompt soft leptons by ATLAS [17] and CMS [18] trigger on events with a high-energy jet from the initial state. They probe co-annihilation in scenarios with promptly decaying mediators, for instance, supersymmetric electroweakinos. In scenarios with charged mediators, lifetimes of a few nanoseconds can be probed by searches for disappearing charged tracks [19–21]. Due to the detector properties, these searches are only sensitive to long decay lengths and require that the charged decay products escape the detector. The applicability of disappearing track searches to dark sectors is therefore model-dependent and limited to this specific region of the signal space.
Two searches for displaced leptons have been performed by CMS, one at 8 TeV [22] and an updated search at 13 TeV [23]. However, the 8 TeV analysis is not sensitive to compressed dark sectors with electroweak production rates due to the limited data set. The 13 TeV analysis uses a lepton trigger with a hard momentum threshold of 40 GeV, which drastically reduces the sensitivity to dark sectors with small mass splittings. Both searches are sensitive to hard displaced leptons from mediators with decay lengths between about 1 cm and 1 m.
In this work, we perform a first analysis making use of soft displaced leptons at the 13 TeV LHC. We show that a search for leptons with $p_T > 20$ GeV can probe weak-scale mediators with decay lengths from 1 mm to 2 m and a mass splitting around 20 GeV. This search is optimal to probe co-annihilating and co-scattering dark matter by covering the previously unexplored phase space region.
The article is organized as follows. In Sec. 2, we discuss the phenomenology of co-annihilation and co-scattering dark matter and define benchmark scenarios for soft displaced leptons at the LHC. In Sec. 3, we analyze the phenomenology of these signal benchmarks and perform a detailed analysis of the expected background from heavy flavor decays at low transverse momenta. In Sec. 4, we describe our multivariate analysis and show the gain in performance compared to a simple cut-based analysis. Our predictions for the LHC based on Run 2 and Run 3 data are presented in Sec. 5. By comparing the sensitivity of our proposed analysis with existing searches, we demonstrate that a dedicated search for soft displaced leptons covers the entire phase-space region predicted by compressed hidden sectors that is currently still unexplored. We conclude in Sec. 6 with suggestions for the experimental realization of future searches for soft displaced leptons.

Soft displaced leptons from dark sectors
Signals with soft particles and missing energy are generally predicted from compressed dark sectors. A minimal version of a compressed dark sector consists of a stable neutral dark matter candidate $\chi^0$ with mass $m_0$ and a mediator $\chi^+$ with mass $m_c$ and a small mass splitting
$$\Delta m = m_c - m_0 . \tag{2.1}$$
If the mediator carries electroweak charge, its production at the LHC proceeds through weak interactions and leads to signatures like
$$pp \to \chi^+ \chi^- \to (\chi^0 f)(\chi^0 f') . \tag{2.2}$$
Due to the small mass splitting $\Delta m$ among the dark states, the standard-model decay products $f$ carry little momentum, provided that the mediator is produced at moderate boost. Depending on the decay, $f$ can be one or several particles, at least one of them carrying electric charge. A well-known example of such a process is the decay of supersymmetric charginos into neutralinos and leptons or jets [24–27].

Co-annihilating and co-scattering dark matter
Scenarios of feebly interacting dark matter, where the dark matter freeze-out in the early universe is driven by co-annihilation or co-scattering, necessarily require a compressed spectrum. We focus on dark states around the weak scale, which are usually non-relativistic at freeze-out, so that their number densities $n_0$ and $n_c$ scale exponentially with the freeze-out temperature $T_f$. As a consequence, the relative number density of $\chi^\pm$ and $\chi^0$ scales exponentially with the mass difference as
$$\frac{n_c}{n_0} \sim \exp\left(-\frac{\Delta m}{T_f}\right) . \tag{2.3}$$
For efficient dark matter interactions with a mediator, $\Delta m$ should not exceed the freeze-out temperature $T_f \approx 0.1\, m_0$. This condition results in a compressed spectrum of dark states,
$$\frac{\Delta m}{m_0} \lesssim 0.1 . \tag{2.4}$$
For thermal relics around $m_0 \approx 200$ GeV, this corresponds to a mass difference of about
$$\Delta m \approx 20\ \text{GeV} . \tag{2.5}$$
Which processes determine the relic abundance upon freeze-out mostly depends on the coupling of dark matter to the mediator. For sizeable couplings, pair annihilation $\chi^0 \chi^0 \to f \bar{f}$ is efficient around the freeze-out temperature and sets the relic abundance. When successively decreasing the coupling, pair annihilation becomes inefficient and co-annihilation processes $\chi^0 \chi^+ \to f \bar{f}'$ set the relic abundance instead. For even smaller couplings, co-annihilation also becomes inefficient at freeze-out. The relic abundance is then set by co-scattering of dark matter with particles $d$ from the thermal bath, followed by efficient mediator annihilation,
$$\chi^0 d \to \chi^+ d' , \qquad \chi^+ \chi^- \to f \bar{f} . \tag{2.6}$$
For weak-scale dark sectors, freeze-out via co-annihilation or co-scattering typically occurs at dark matter couplings to bath particles in the range of Eq. (2.7). At even smaller couplings, dark matter leaves chemical or even kinetic equilibrium well before freeze-out, and a different mechanism has to be invoked to explain the observed relic abundance.
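As a rough numerical illustration of the freeze-out condition, the following Python sketch evaluates the Boltzmann suppression of the mediator number density relative to dark matter. It assumes a pure exponential scaling in $\Delta m / T_f$ with $T_f = 0.1\, m_0$ and ignores all order-one prefactors (degrees of freedom, mass ratios); the function names are illustrative only.

```python
import math

def freeze_out_temperature(m0_gev, x_f=10.0):
    """T_f = m_0 / x_f; x_f = 10 mirrors T_f ≈ 0.1 m_0 quoted in the text."""
    return m0_gev / x_f

def relative_density(delta_m_gev, m0_gev, x_f=10.0):
    """Boltzmann factor exp(-Δm / T_f).

    Assumption: order-one prefactors are neglected, so this is only the
    exponential part of the ratio n_c / n_0.
    """
    return math.exp(-delta_m_gev / freeze_out_temperature(m0_gev, x_f))

if __name__ == "__main__":
    m0 = 200.0  # GeV, weak-scale thermal relic as in the text
    for delta_m in (10.0, 20.0, 40.0):
        print(f"Δm = {delta_m:5.1f} GeV -> n_c/n_0 ~ {relative_density(delta_m, m0):.3f}")
```

For $m_0 = 200$ GeV the suppression reaches $e^{-1}$ at $\Delta m = 20$ GeV, which is the mass splitting singled out in the text; larger splittings deplete the mediator population exponentially.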

Signal characteristics
Remarkably, co-annihilating and co-scattering dark matter predicts signatures with soft displaced particles within the acceptance of the LHC detectors. Our analysis focuses on final states with soft displaced leptons and missing energy, produced via
$$pp \to \chi^+ \chi^- \to (\chi^0 \ell^+ \nu)(\chi^0 \ell'^- \bar{\nu}) , \tag{2.8}$$
see Fig. 1. The signal is characterized by the three parameters
$$\{ m_c ,\ \Delta m ,\ c\tau_c \} , \tag{2.9}$$
where $c\tau_c$ is the nominal decay length of the mediator. The mediator mass $m_c$ determines the $\chi^+ \chi^-$ production rate. The mass splitting $\Delta m$ is set by the freeze-out condition from Eq. (2.4). In Fig. 2 we show the transverse momentum distribution of the lepton $\ell^+$ for $m_c = 324$ GeV and various $\Delta m$, without applying any kinematic cuts. For $\Delta m = 20$ GeV, as predicted by co-annihilation or co-scattering at this mass scale, the distribution peaks at $p_T(\ell) \approx 7$ GeV. The bulk of the events therefore falls well below the threshold of conventional lepton triggers used by ATLAS and CMS [29–31], as indicated by the dashed line. The main goal of our analysis is to lower the threshold to $p_T > 20$ GeV (black line) to be sensitive to these dark matter scenarios. Notice that the mass scale of the mediator has little impact on the spectrum at low momenta. In the relevant mass range $m_c = 100 - 500$ GeV, it only mildly affects the tail of the distribution.

Figure 2. Transverse momentum distribution of the lepton at 13 TeV for $m_c = 324$ GeV and various mass splittings $\Delta m$, normalized to the total cross section. The distribution at $\Delta m = 20$ GeV (thick green) is typical for weak-scale dark matter. The dashed line shows the lepton momentum threshold in [23,28]; the solid line indicates the threshold in our analysis. Based on event generation at parton level using MadGraph5_aMC@NLO.

The nominal decay length $c\tau_c$ is determined by the dark matter coupling $g_\chi$, see Eq. (2.7). For $\Delta m \ll m_0$, the three-body partial decay width can be estimated as [24]
$$\Gamma_c \sim g^2 \ldots \tag{2.10}$$
where we have chosen typical parameters for weak-scale co-scattering as a reference. Assuming that no other decay channels are accessible, these reference parameters correspond to a nominal decay length of the mediator around the value quoted in Eq. (2.11). More generally, co-scattering predicts mediator decay lengths between a centimeter and a few meters. For co-annihilation the decay length is typically shorter, due to the larger dark matter coupling.

The nominal decay length is related to the decay length in the lab frame as
$$d = (\beta\gamma)\, c\tau_c , \tag{2.12}$$
where $(\beta\gamma)$ is the Lorentz boost of the mediator. Due to the exponential decay probability, the number of mediators decaying within a sphere of radius $d$ around the production point is given by
$$N(d) = N(0) \left[ 1 - \exp\left( -\frac{d}{(\beta\gamma)\, c\tau_c} \right) \right] , \tag{2.13}$$
where $N(0)$ is the number of mediators produced at the collision point. The decay length $d$ is not directly observable at the LHC. To describe the displacement of the leptons, we employ the widely used unsigned transverse impact parameter $d_0$, defined as the distance of closest approach of the lepton track from the collision point in the azimuthal plane. Following Ref. [23], identification of displaced leptons requires that the transverse impact parameter lie within the range
$$200\ \mu\text{m} < d_0 < 10\ \text{cm} . \tag{2.14}$$
The range of nominal decay lengths $c\tau_c$ that can be probed within this range of $d_0$ depends on the overall boost of the mediator, which determines the decay length $d$ (see Eq. (2.12)), and on the transverse component of the boost. For $m_c < 500$ GeV, most mediators are produced with boosts within $0.2 < (\beta\gamma) < 5$, with a peak around $(\beta\gamma) \approx 1$. Highly boosted, i.e., light mediators tend to be emitted along the beam line, leading to a smaller transverse decay length $d_0 < d$. In a typical scenario with $m_c = 220$ GeV and $\Delta m = 20$ GeV, the mediator has a transverse momentum of $p_T(\chi^\pm) \approx 100$ GeV and the decay lepton carries $p_T(\ell) \approx 10$ GeV. In this case the observable range of $d_0$ from Eq. (2.14) roughly corresponds to
$$1\ \text{mm} \lesssim c\tau_c \lesssim 20\ \text{cm} . \tag{2.15}$$
By comparing with Eq. (2.11), we see that the LHC detectors are well suited to probe co-scattering dark matter, as well as co-annihilation with sufficiently small dark matter couplings.
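The relation between the nominal decay length, the boost, and the fraction of mediators decaying within a given distance can be made concrete with a short Python sketch. It assumes only the exponential decay law and the boost factor described above; the function names and the example values are illustrative, and any consistent length unit can be used.

```python
import math

def lab_decay_length(ctau, beta_gamma):
    """Mean decay length in the lab frame: d = (βγ) · cτ."""
    return beta_gamma * ctau

def fraction_decayed_within(distance, ctau, beta_gamma):
    """Fraction N(d)/N(0) of mediators decaying within `distance`.

    Follows from the exponential decay probability:
    1 - exp(-distance / (βγ · cτ)).
    """
    return 1.0 - math.exp(-distance / lab_decay_length(ctau, beta_gamma))

if __name__ == "__main__":
    # Illustrative values in cm: cτ = 1 cm, boosts spanning the range quoted in the text.
    for beta_gamma in (0.2, 1.0, 5.0):
        frac = fraction_decayed_within(10.0, 1.0, beta_gamma)
        print(f"βγ = {beta_gamma}: fraction decaying within 10 cm = {frac:.4f}")
```

Note that this gives the three-dimensional decay distance; the transverse impact parameter $d_0$ of the lepton is smaller and additionally depends on the decay kinematics.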

Benchmarks
Depending on the underlying model, the production cross section, lifetime and decay channels of the mediator may vary. For concreteness, we consider a specific model for co-scattering dark matter in which we can predict the three characteristic variables $(m_c, \Delta m, c\tau_c)$ in terms of fundamental parameters. The dark sector consists of two fermion fields, transforming under weak interactions as a singlet and an adjoint triplet with vector-like couplings. Upon electroweak symmetry breaking the neutral components of these fields mix through an effective scalar interaction with the Higgs field. The mixing $\theta$ leads to a dark sector with three mass eigenstates $\chi^0_\ell$, $\chi^\pm$, $\chi^0_h$. The lightest state $\chi^0_\ell = \chi^0$ is a stable dark matter candidate and the charged states $\chi^\pm$ act as mediators. Co-scattering is realized for $(m_c - m_0)/m_0 \approx 0.1$ and small $\theta$. For details we refer the reader to Ref. [10]. The setup is similar to the bino-wino scenario in supersymmetry with decoupled higgsinos [32]. Motivated by this model, we define specific scenarios with soft displaced lepton signals as benchmarks for our analysis. In Tab. 1, benchmarks 1 and 2 correspond to dark matter scenarios, where $\Delta m$ and $\theta$ are chosen such that the observed relic abundance is obtained for mediator lifetimes in reach of the LHC, see Eq. (2.15).
In our dark matter model the mediator decays via weak interactions, leading to final states with leptons and hadrons with branching ratios determined by their gauge quantum numbers. The branching ratio into a specific lepton flavor can be estimated from the partial decay widths $\Gamma_\ell$ and $\Gamma_\pi$ into leptons and pions, with $N_c = 3$ the number of colors and CKM mixing approximated by $V_{ud} = V_{cs} = 1$. In our analysis we use numerical predictions for the three-body partial decay rates, see Sec. 2.2. Two-body decays can play a role if $\chi^+$ is part of a larger $SU(2)$ multiplet. In our model, the decay $\chi^+ \to \chi^0_h \pi^+$ can open up for small mixing $\theta \lesssim 10^{-5}$, due to electroweak corrections [33]. The partial decay rate is given in Ref. [10]; it depends on the mass difference $\Delta m_{hc} = m_h - m_c \sim 140$ MeV between $\chi^0_h$ and $\chi^\pm$ and on the pion decay constant $f_\pi \approx 130$ MeV. Decays into pions are relevant in benchmark 2.
To explore the full discovery potential of the LHC, we extend the scope of our analysis beyond this model and cover a broader range of displaced soft lepton signals. To this end, we fix the mediator mass to $m_c = 220$ GeV and vary the lifetime (benchmarks 3 to 6) and the mass splitting (benchmark 7). Smaller mediator masses can be probed down to the LEP bound of $m_c \gtrsim 100$ GeV. For larger masses the sensitivity at the LHC is statistically limited as the cross section rapidly decreases.
In benchmarks 3 to 7 we assume that the mediator decays exclusively into electrons and muons. Such an assumption is realistic in many scenarios with hidden sectors that directly couple to leptons. Examples are feebly coupled leptophilic dark matter [11,34–36] or models with heavy neutral leptons [37,38]. This assumption maximizes the expected signal rate. It also makes the results of our analysis less model-dependent and easier to reinterpret in other scenarios.

LHC signals of soft displaced leptons
Three kinds of searches are potentially sensitive to the models with co-annihilation or co-scattering described above: searches for prompt soft leptons from compressed spectra, searches for displaced leptons, and searches for disappearing tracks. In searches for prompt soft leptons [17,18], a high-energy jet from initial-state radiation was used for triggering, as well as to enhance the amount of missing energy and the boost of the visible final-state particles [39]. Such requirements, however, come at the cost of reducing the signal rate, which in the context of dark matter decreases the search sensitivity for heavy mediators. The current limit for promptly decaying weak triplet mediators with a mass splitting of $\Delta m = 20$ GeV stands at 220 GeV [9,10]. As both searches require leptons with transverse impact parameters $d_0 \lesssim 0.1$ mm, they are largely insensitive to long-lived mediators.
Searches for disappearing tracks probe the other end of the lifetime range [19–21]. The latest CMS search excludes charged mediators with decay lengths larger than a few centimeters [21], while ATLAS is sensitive only to much larger $c\tau$ because of the different detector geometry. However, these bounds only apply if the decay products of the original charged particle are invisible in the detector. The presence of an extra lepton in the final state may degrade the sensitivity, either by causing the original track to fail the isolation criteria, or because the kink from the lepton track modifies the reconstruction of the original mediator track. This reduces the sensitivity to dark matter scenarios predicting soft displaced particles in the final state.
The sensitivity gap between searches for prompt soft leptons and disappearing charged tracks corresponds to decay lengths $c\tau_c$ from about 0.1 mm to several centimeters. In this region, any leptons produced in mediator decays are not identified as prompt leptons, and the track of the mediator is not long enough to be classified even as a "disappearing" track. With our analysis we cover this gap by looking for decay products that are displaced, i.e., have a transverse impact parameter larger than the detector resolution, $d_0 \gtrsim 200\ \mu$m.

Existing displaced lepton searches
Several searches for displaced leptons have already been performed at the LHC [22,23,28,40,41]. Of these, the analyses in Refs. [28,40,41] rely on reconstructing two leptons from the same vertex; therefore they are not sensitive to signals with two single leptons as in our scenarios.
We design our search for two individual displaced leptons without any vertex requirement, which covers a broader class of event topologies. Refs. [22,23] describe searches for such individual displaced leptons from event topologies close to our dark matter scenarios, see Eq. (2.8). The searches at 8 TeV [22] and 13 TeV [23] differ mainly in the requirement of the lepton transverse momenta and therefore will be described together.
Both analyses have been performed in a largely model-independent way, in particular without imposing many kinematic restrictions on the leptons or requiring further activity in the event. They select events with one electron and one muon of opposite charge, produced inside the active detection region of the CMS experiment with pseudorapidity $|\eta(\ell)| < 2.4$, $\ell = e, \mu$. At 13 TeV the transverse momentum requirements are $p_T(e) > 42$ GeV and $p_T(\mu) > 40$ GeV; at 8 TeV both leptons satisfy $p_T(\ell) > 25$ GeV. In the 13 TeV incarnation, the strong selection requirement on the lepton transverse momenta is necessary due to the trigger used for the analysis.
The displacement of the leptons is observable through the unsigned impact parameter $d_0$, defined as the distance of closest approach in the azimuthal plane of the lepton track to the collision point. Based on the impact parameter, three exclusive signal regions (SR) are defined as

SR III: both leptons satisfying $1\ \text{mm} < d_0 < 10\ \text{cm}$
SR II: at least one lepton failing SR III and both with $d_0 > 500\ \mu\text{m}$    (3.1)
SR I: at least one lepton failing SR II and both with $d_0 > 200\ \mu\text{m}$
We will adopt these signal regions for our analysis.
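The nested definition of the three exclusive signal regions can be written as a small classification routine. The Python sketch below is one possible reading, not the CMS implementation: it assumes that the upper bound $d_0 < 10$ cm applies to both leptons in every region, and the function name and the choice of millimeters as unit are illustrative.

```python
def signal_region(d0_e_mm, d0_mu_mm, d0_upper_mm=100.0):
    """Assign an e-mu pair to SR I, II or III from the unsigned |d0| values in mm.

    Returns None if the pair falls into no signal region.
    Assumption: the 10 cm upper bound is imposed on both leptons in all regions.
    """
    d0_low = min(d0_e_mm, d0_mu_mm)
    d0_high = max(d0_e_mm, d0_mu_mm)
    if d0_high >= d0_upper_mm:
        return None
    if d0_low > 1.0:   # both leptons above 1 mm
        return "SR III"
    if d0_low > 0.5:   # fails SR III, both above 500 µm
        return "SR II"
    if d0_low > 0.2:   # fails SR II, both above 200 µm
        return "SR I"
    return None

if __name__ == "__main__":
    for pair in [(2.0, 3.0), (0.7, 5.0), (0.3, 5.0), (0.1, 5.0)]:
        print(pair, signal_region(*pair))
```

Because the regions are exclusive, the lepton with the smaller impact parameter decides the assignment, which is why only the minimum of the two $d_0$ values enters the thresholds.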
Since neither search observed an excess of events over the expected background, 95% C.L. upper limits on the signal were obtained. To compare our results with the previous analyses, we use the Poisson log-likelihood ratio $Q$ of Eq. (3.2), where $N_i$ is the observed number of events in signal region $i$, $S_i$ and $B_i$ are the expected signal and background events in region $i$, and $N = \sum_i N_i$ is the total number of observed events. The 95% C.L. upper limit for one measurement in 3 signal regions corresponds to $Q = 5.99$. We define the ratio $R_{95}$ in Eq. (3.3) so that $R_{95} = 1$ corresponds to the exclusion limit at the 95% C.L. We checked explicitly that the likelihood ratio $Q$ reproduces the exclusion limits from Ref. [22] within $1\sigma$.
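The threshold value $Q = 5.99$ coincides numerically with the 95% quantile of a chi-square distribution with two degrees of freedom. The Python sketch below reproduces this number from the closed-form quantile; that the threshold was obtained in exactly this way is an assumption on our side, and the function name is illustrative.

```python
import math

def chi2_quantile_2dof(confidence_level):
    """Quantile of a chi-square distribution with two degrees of freedom.

    For 2 d.o.f. the cumulative distribution is 1 - exp(-Q/2),
    so the quantile has the closed form Q = -2 ln(1 - CL).
    """
    return -2.0 * math.log(1.0 - confidence_level)

if __name__ == "__main__":
    print(round(chi2_quantile_2dof(0.95), 2))  # 5.99
```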
In scenarios with compressed dark sectors the leptons are typically much softer than the transverse momentum requirements in previous analyses, as we illustrated in Fig. 2. If we were to apply the analysis of Ref. [23] directly to compressed dark sectors with $\Delta m \approx 20 - 30$ GeV, the requirement $p_T(\ell) > 40$ GeV would eliminate nearly all events and make the analysis essentially insensitive to compressed sectors. Even with the 8 TeV search, the situation remains unsatisfactory. We demonstrate the loss of sensitivity of this search at small mass splittings in Fig. 3. The reach of the search is calculated as a function of the mediator decay length $c\tau_c$ using the log-likelihood ratio $R_{95}$ defined in Eq. (3.3). As is clearly visible, the sensitivity at $\Delta m = 25$ GeV is already severely reduced, and for $\Delta m = 20$ GeV the search cannot exclude any range of $c\tau_c$ at 95% C.L. This observation clearly motivates a search that specifically targets soft and displaced kinematics.
For the background estimation, we will rely on the 13 TeV search [23]. In each of the signal regions from Eq. (3.1), background estimates were determined using side bands in data. The major background in displaced lepton searches is multijet production with heavy flavor jets, which produce $B$ hadrons that decay semi-leptonically at a distance from the production point. Due to statistical fluctuations in the number of hadrons produced during hadronization, as well as reconstruction artefacts, the lepton from the $B$ decay is occasionally misidentified as an isolated displaced lepton. The estimate is derived in a region with moderate displacements, the so-called displaced control region, and extrapolated to the three signal regions from Eq. (3.1) using a transfer factor method. Using the $d_0$ distribution in orthogonal $b\bar{b}$-enriched data, CMS sets 95% C.L. upper limits on the number of background events in each signal region in 2.6 fb$^{-1}$ of data at $\sqrt{s} = 13$ TeV. In each region, multijet events dominated by $b\bar{b}$ production are the dominant background. All other SM backgrounds, for instance from top-antitop production or electroweak processes, are orders of magnitude smaller.

Figure 3. Reach of the displaced lepton search of Ref. [22] for different compression $\Delta m \sim 20 - 40$ GeV. The signal corresponds to pair production via $pp \to \chi^+ \chi^-$, followed by $\chi^+ \to \chi^0 \ell \nu$ decays. Maximal bounds expected from the most recent disappearing track search [21] are shown in yellow.

Simulation of signal and background events
We describe our signal model using FeynRules [42] and use the UFO [43] interface to MadGraph5_aMC@NLO 2.6.6 [44] to generate $2 \cdot 10^6$ signal events of the process $pp \to \chi^+ \chi^-$ with $\chi^\pm \to \chi^0 \ell^\pm \nu$, Eq. (3.5), for each of the various benchmarks from Tab. 1. Initial and final state showers as well as hadronization are modelled with Pythia v8.243 [45]. As in the CMS analysis [23], we require one electron and one muon of opposite charge in the final state. The cross section $\sigma(m_c)$ for $pp \to \chi^+ \chi^-$ is identical to pair production of supersymmetric wino-like charginos. For each benchmark, we rescale our event simulations by the corresponding LHC prediction for $\sqrt{s} = 13$ TeV at NLO+NLL precision [46,47],
$$\sigma(220\ \text{GeV}) = 903 \pm 54\ \text{fb} , \qquad \sigma(324\ \text{GeV}) = 127.7 \pm 9.5\ \text{fb} . \tag{3.6}$$
The detector effects for both signal and background samples have been emulated by passing the events through Delphes 3.4.1 [48], using the CMS detector response.
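Rescaling a simulated sample to a predicted cross section amounts to assigning each generated event a weight. The Python sketch below illustrates this arithmetic with the cross sections quoted above and an integrated luminosity of 140 fb$^{-1}$, the value used later for the event yields. The function and the `branching_fraction` parameter are hypothetical helpers, and selection efficiencies are not included.

```python
# Central values of the NLO+NLL cross sections quoted in the text, in fb.
XSEC_FB = {220: 903.0, 324: 127.7}

def event_weight(xsec_fb, lumi_ifb, n_generated, branching_fraction=1.0):
    """Weight per simulated event: sigma * BR * L / N_generated.

    Assumption: all generated events belong to the final state described by
    `branching_fraction`; detector and selection efficiencies are applied later.
    """
    return xsec_fb * branching_fraction * lumi_ifb / n_generated

if __name__ == "__main__":
    n_gen = 2_000_000  # generated signal events per benchmark
    lumi = 140.0       # fb^-1
    for mass, xsec in XSEC_FB.items():
        print(f"m_c = {mass} GeV: produced pairs = {xsec * lumi:.0f}, "
              f"weight per event = {event_weight(xsec, lumi, n_gen):.5f}")
```

For $m_c = 220$ GeV this corresponds to about $1.26 \times 10^5$ produced mediator pairs before branching fractions and efficiencies.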
Electrons and muons are required to fulfill $|\eta(\ell)| < 2.4$. A pre-selection cut of $p_T(\ell) > 15$ GeV is required for all leptons. The lepton isolation criteria are $\text{Iso}(e) < 0.12$ and $\text{Iso}(\mu) < 0.15$, where isolation is defined as the sum of the $p_T$ of all reconstructed particles within a cone of $R = 0.2$ around the lepton, divided by the $p_T$ of the lepton. At analysis level, we set a selection cut of $p_T(\ell) > 20$ GeV. Delphes also provides us with information about the lepton transverse impact parameter $d_0$ and includes a veto for leptons that are created outside the CMS tracking volume.
As the dominant background for our study is from heavy flavor production, the final background will be estimated using the data-driven CMS calculations in Ref. [23], as described in detail in the following section. However, the shape of the $p_T(\ell)$ spectrum of the background leptons is estimated from Monte Carlo events. To this end we have simulated a large sample of $b\bar{b}$ events at NLO QCD with MadGraph5_aMC@NLO 2.6.6 [44], followed by hadronization using Pythia v8.243 [45]. In order to efficiently generate events with leptonic $B$ decays, we have turned off purely hadronic decay modes of hadrons containing $b$ and $c$ quarks, such that the decay into muons and electrons is dominant. We confirm that the shapes of the $p_T(\ell)$ and $d_0$ distributions of the leptons remain unchanged for our region of interest. Following this procedure, we generate $2 \times 10^7$ di-lepton events via $pp \to b\bar{b} \to \ell\ell + X$, of which about $3 \times 10^5$ events contain a muon or electron with $p_T(\ell) > 15$ GeV. Just like in the analogous method in Ref. [23], lepton isolation is not applied in the determination of the transfer factors and was verified not to affect the $p_T(\ell)$ and $d_0$ spectra. This procedure improves the fraction of di-lepton events from $b\bar{b}$ events by at least three orders of magnitude.

Background extrapolation to low momenta
We now describe our procedure for estimating the heavy flavor background for relaxed $p_T(\ell)$ requirements. For the leptons originating from $b\bar{b}$ decays we observe no large correlation between $p_T(\ell)$ and $d_0$. A similar feature has been observed in Ref. [22]. This allows us to extrapolate the background predictions from Ref. [23] to regions with lower $p_T(\ell)$ independently from $d_0$. The overall normalisation of the background is set by scaling the number of simulated events in the control regions to the event rates in Eq. (3.4) from Ref. [23].
In Fig. 4 we show the transverse momentum distribution of all isolated electrons and muons from HF background processes. The distributions are parameterized with a double-exponential fit to allow extrapolation from high to low transverse momenta. We define extrapolation transfer factors $\kappa_e(p_T)$ and $\kappa_\mu(p_T)$ for electrons and muons in Eq. (3.7). For the electrons and muons from decays of hadrons produced in $b\bar{b}$ production, the $p_T$ distributions are largely independent from each other. This allows us to define the total transfer factor for the event rate as
$$\kappa_{e\mu}(p_T) = \kappa_e(p_T)\, \kappa_\mu(p_T) . \tag{3.8}$$
Using this transfer factor we derive upper limits on the expected background yield $B_i$ in each signal region in terms of the event numbers, $N_i$, provided by CMS. The background yield in each signal region $i$ is finally given by
$$B_i = \kappa_{e\mu}(p_T)\, N_i \tag{3.9}$$
and is listed in Tab. 3.
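The extrapolation from high to low momentum thresholds can be sketched in Python as follows. The sketch assumes that each single-lepton transfer factor is the ratio of the fitted double-exponential spectrum integrated above the new threshold to the same spectrum integrated above the reference threshold of the 13 TeV search; this definition, the function names, and the fit parameters are illustrative assumptions and are not taken from the analysis.

```python
import math

def tail_integral(pt_min, a1, tau1, a2, tau2):
    """Integral of a1*exp(-pT/tau1) + a2*exp(-pT/tau2) from pt_min to infinity."""
    return a1 * tau1 * math.exp(-pt_min / tau1) + a2 * tau2 * math.exp(-pt_min / tau2)

def transfer_factor(pt_new, pt_ref, params):
    """Assumed definition: events above pt_new divided by events above pt_ref."""
    return tail_integral(pt_new, *params) / tail_integral(pt_ref, *params)

def background_yield(n_limit, kappa_e, kappa_mu):
    """Scale a background limit by the product of the two lepton transfer factors."""
    return kappa_e * kappa_mu * n_limit

# Hypothetical fit parameters (a1, tau1 [GeV], a2, tau2 [GeV]); placeholders only.
PARAMS_E = (1.0, 5.0, 0.1, 15.0)
PARAMS_MU = (1.0, 6.0, 0.1, 18.0)

if __name__ == "__main__":
    kappa_e = transfer_factor(20.0, 42.0, PARAMS_E)    # reference threshold for electrons
    kappa_mu = transfer_factor(20.0, 40.0, PARAMS_MU)  # reference threshold for muons
    print(f"kappa_e = {kappa_e:.1f}, kappa_mu = {kappa_mu:.1f}, "
          f"kappa_emu = {kappa_e * kappa_mu:.1f}")
```

Multiplying the two single-lepton factors relies on the approximate independence of the electron and muon spectra; with steeply falling spectra the combined factor grows quickly as the threshold is lowered.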

Event yields for signal and background
We can now estimate the total number of expected background events from $b\bar{b}$ production by multiplying the upper limits for different regions in $d_0$ provided by Ref. [23] with the $\kappa_{e\mu}(p_T)$ transfer factors, see Eq. (3.9). The resulting background rates in each signal region are listed in Tab. 3. As expected, once the $p_T$ threshold for the leptons is lowered, the number of expected background events from $b\bar{b}$ production increases exponentially.

Table 4. Expected number of events for the signal benchmarks and 95% C.L. upper limit on the HF background events in the three signal regions from Eq. (3.1).
To obtain realistic signal and background yields, the deterioration of the lepton reconstruction efficiency with increasing $d_0$ must be taken into account. To model this effect, we make use of the published identification efficiency parametrization from the 8 TeV search [28] up to the limit $d_0 = 2$ cm and linearly extrapolate to 10 cm to cover the full range of the 13 TeV search [23]. After verification that this estimates a realistic signal selection efficiency, the signal cross sections from Sec. 2.3 are used to obtain signal event yields.
Finally, the expected signal and background events for an integrated luminosity of 140 fb$^{-1}$ are listed in Tab. 4. Aside from benchmark 7, which corresponds to the case of large $\Delta m$, none of the benchmarks shows a good signal-to-background ratio. To preserve the sensitivity for the signal when moving the $p_T(\ell)$ threshold to lower momenta, it is therefore essential to further reject the $b\bar{b}$ background using additional techniques.

Multi-variate analysis
To assess the expected sensitivity to soft displaced leptons at the LHC during Run 3, we perform a multi-variate analysis using a neural network. In what follows we discuss the topology and kinematic distributions of the leptons in signal and background. The use of neural networks is crucial for successfully extracting a signal with soft displaced leptons from LHC data. We show that an analysis with basic kinematic cuts alone does not provide a good signal-background discrimination. However, with a straightforward neural network architecture we achieve a good sensitivity to our signal. This demonstrates a realistic discovery potential of a dark sector through soft displaced leptons at the LHC.

Kinematic distributions
As discussed in Secs. 2 and 3, in scenarios of co-annihilation and co-scattering leptons from mediator decays $\chi^+ \to \chi^0 \ell \nu$ can be very soft. For $\Delta m = 20$ GeV their transverse momentum distribution resembles the HF background from Fig. 4; for $\Delta m = 40$ GeV the signal leptons tend to larger $p_T(\ell)$ than in the background. For signals with $\Delta m = 20$ GeV or lower, which produce soft leptons, the transverse momentum of the leptons is thus not a good discriminator between signal and background. We therefore systematically explore other kinematic features that characterize the signal and background. In Sec. 3 we introduced three signal regions distinguished by the transverse impact parameter $d_0$, which is one of the main discriminants in our analysis. In Fig. 5, we show normalized distributions of $d_0$ for the benchmarks defined in Tab. 1 and for the HF background. For reference, we indicate the boundaries of the signal regions from Eq. (3.1) as dashed lines. SR I is sensitive to mediator decay lengths up to $c\tau_c \approx 2$ mm, while SR II and SR III probe larger decay lengths up to 1 m. The background peaks around $d_0 = 0.1$ mm, as suggested by the decay length of a $B$ meson from the background, $c\tau_B \approx 0.5$ mm. For comparison, in benchmark 3: (220, 20, 0.1) (light red), the mediator has a decay length of $c\tau_c = 1$ mm. All other signal distributions are shifted to larger displacements and allow for a better discrimination from the background.
The impact parameter scales logarithmically with the decay length, reflecting the exponential decay of the mediator, see Eq. (2.13). Therefore the $d_0$ distributions for benchmarks with exponentially increasing $c\tau_c$ are equally separated, as is apparent from Fig. 5, right. The normalized distributions include the detector acceptance of selecting two leptons that are identifiable in the fiducial tracking volume of the CMS detector. The decrease of detector acceptance at larger decay length is responsible for the shape change that can be observed at impact parameters larger than 50 cm.
In the center-of-mass frame, the mediators χ + and χ − are produced back-to-back. Due to the overall boost in proton-proton collisions, light mediators are emitted closer to the beamline and with a smaller angular separation. Leptons inheriting the mediator's boost are produced with smaller transverse impact parameters d 0 than for heavy mediators. This effect is even more pronounced for the very boosted B hadrons from the background. For the benchmarks shown in Fig. 5, the mediator boost is moderate, resulting in a significant event rate at central pseudorapidity. An important discriminator that exploits the back-to-back topology is the angular separation between the signal leptons,

\Delta R(e,\mu) = \sqrt{\Delta\eta^2(e,\mu) + \Delta\phi^2(e,\mu)} . \qquad (4.1)

In Fig. 6, left, we show the ∆R(e, µ) distributions for three selected signal benchmarks and for the HF background. Soft leptons from heavy mediator decays tend to be emitted into opposite directions, leading to a maximum around ∆R(e, µ) = π. On the other hand, background leptons are likely to be collimated (anti-collimated) when produced from the same jet (two back-to-back jets). The angular distribution of the background therefore peaks twice and has a different shape than that of the signal leptons.
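The angular separation of Eq. (4.1) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the analysis; the function names `delta_phi` and `delta_r` are hypothetical. The only subtlety is that the azimuthal difference has to be wrapped into [−π, π] before it enters the sum in quadrature.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2, wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

# Two leptons at equal pseudorapidity, emitted in opposite azimuthal directions
back_to_back = delta_r(0.0, 0.0, 0.0, math.pi)
```

For a back-to-back pair at equal pseudorapidity the function returns π, the position of the signal maximum quoted in the text.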
Additional rejection power can be obtained using combinations of missing momentum, leptons, and any reconstructed jets, if present. We consider the following kinematic variables: sphericity and spherocity [50,51]; H T ; the missing-momentum significance /p T / √H T , where both /p T and H T are calculated using only the lepton and jet objects; the transverse mass m T (ℓ 1 , /p T ) calculated with the leading lepton in p T ; the azimuthal angle between the leading lepton and the missing transverse momentum, ∆φ(ℓ 1 , /p T ); and the lepton imbalance α T = p T (ℓ 2 )/m T (ℓ 1 , ℓ 2 ) [52], where ℓ 2 is the sub-leading lepton and m T (ℓ 1 , ℓ 2 ) is the transverse mass, calculated from the Lorentz vector sum of the leptons in the transverse plane. Each of these variables is known to be robust in a hadron collider environment, and we have verified that each is almost completely independent of d 0 . We have tested several other kinematic variables commonly used in collider studies, but found them less suitable for signal-background discrimination.
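To make the definitions above concrete, the following Python sketch computes H T , the missing-momentum significance, the transverse mass, the azimuthal angle and α T from a list of leptons and jets. It rests on assumptions that are not spelled out in the text: all objects are treated as massless, H T is taken as the scalar sum of the lepton and jet transverse momenta, and the missing transverse momentum is the negative vector sum of the same objects. The function names are hypothetical, and sphericity and spherocity are omitted.

```python
import math

def wrapped_dphi(phi_a, phi_b):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    return abs((phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi)

def massless_mt(pt_a, phi_a, pt_b, phi_b):
    """Transverse mass of two massless objects (assumed convention)."""
    return math.sqrt(2.0 * pt_a * pt_b * (1.0 - math.cos(phi_a - phi_b)))

def event_variables(leptons, jets):
    """leptons, jets: lists of (pt [GeV], phi [rad]); needs two leptons."""
    lep = sorted(leptons, reverse=True)  # leading lepton first
    objects = lep + list(jets)
    ht = sum(pt for pt, _ in objects)    # scalar sum of transverse momenta
    # Missing transverse momentum: negative vector sum of leptons and jets
    px = -sum(pt * math.cos(phi) for pt, phi in objects)
    py = -sum(pt * math.sin(phi) for pt, phi in objects)
    met, met_phi = math.hypot(px, py), math.atan2(py, px)
    (pt1, phi1), (pt2, phi2) = lep[0], lep[1]
    return {
        "HT": ht,
        "met_significance": met / math.sqrt(ht),
        "mT_l1_met": massless_mt(pt1, phi1, met, met_phi),
        "dphi_l1_met": wrapped_dphi(phi1, met_phi),
        # Undefined for exactly collinear leptons, where mT(l1, l2) = 0
        "alpha_T": pt2 / massless_mt(pt1, phi1, pt2, phi2),
    }

# Hypothetical event: two back-to-back leptons and no jets
example = event_variables([(20.0, math.pi), (30.0, 0.0)], [])
```

In this toy event the missing momentum of 10 GeV points opposite to the leading lepton, so ∆φ(ℓ 1 , /p T ) is π and the transverse mass is maximal for the given momenta.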
We also confirmed that our signal benchmarks for ∆m = 20 GeV show only modest kinematic differences.
None of the variables by itself has enough discriminating power to reject the background sufficiently to compensate for the lowered threshold of p T (ℓ) > 20 GeV. To optimize the sensitivity, we combine them in a multivariate discriminant. Each variable contributes additional sensitivity, and the minor correlations between them can be exploited by, for example, a neural network to increase the discrimination between the signal and background hypotheses.

Neural network structure
The nine kinematic variables described in Sec. 4.1 are combined in a neural network. Ranked by performance, these are {…, sphericity, α T , spherocity}.
To avoid any possible effects of unrealistic modelling of the transverse impact parameter d 0 , variables that depend on the displacement or on d 0 itself are not used in the training. The neural network is implemented with scikit-learn and TensorFlow, including Keras [53,54], and has 2 hidden layers, each with 18 hidden nodes. The hidden layers use rectified linear unit activation [55], while the output layer uses sigmoid activation, so that the output lies between 0 and 1 and can be interpreted as a signal probability. For the training procedure, we use the dark matter model with m c = 324 GeV, ∆m = 20 GeV and cτ c = 2 cm, listed as benchmark 1 in Tab. 1, as the signal, together with the HF background sample. The training was performed on 80% of the HF background sample, and with an equal number of signal events. As the number of training events is limited by the size of the HF background sample (about 4100 events), dropout regularization is used, with a dropout of 20% applied in every hidden layer. We verified that the results are not affected by over-training, using the remaining 20% of events that were reserved for independent testing.
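The architecture described above can be illustrated with a dependency-free forward pass in Python. This is only a sketch of the layer structure (9 inputs, two hidden layers of 18 ReLU nodes with 20% dropout, one sigmoid output), not the Keras implementation used in the analysis; the weights are random and untrained, and the initialization, the zero biases and all function names are assumptions made for illustration.

```python
import math
import random

N_INPUTS, N_HIDDEN = 9, 18

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative, untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer):
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dropout(values, rate, rng):
    """Inverted dropout: zero a fraction `rate` of nodes and rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def forward(x, layers, rng=None, dropout_rate=0.2):
    """Network output in (0, 1); pass `rng` to enable dropout (training mode)."""
    h = x
    for layer in layers[:-1]:
        h = relu(dense(h, layer))
        if rng is not None:
            h = dropout(h, dropout_rate, rng)
    return sigmoid(dense(h, layers[-1])[0])

rng = random.Random(0)
layers = [init_layer(N_INPUTS, N_HIDDEN, rng),
          init_layer(N_HIDDEN, N_HIDDEN, rng),
          init_layer(N_HIDDEN, 1, rng)]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
score = forward([0.1 * i for i in range(N_INPUTS)], layers)
```

With this layout the network has 541 trainable parameters, a small number compared to typical deep networks, which is in line with the limited size of the training sample.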

Performance
The neural network output discriminant, NN, is shown in Fig. 7, left, quantifying the discrimination between the HF background and different signal models. The output of the neural network is largely independent of the signal benchmark, as can be observed in the ROC curves in Fig. 7, right. This behaviour is expected, because the input variables in Fig. 6 were chosen to display only minor kinematic differences between the benchmarks. The observed similarity suggests a good sensitivity to other benchmarks with soft displaced leptons, without the need to retrain the network. Fig. 8 shows the neural network output for an integrated luminosity of 140 fb −1 in signal region SR III, which has the highest discrimination power between signal and background for most benchmarks. For scenarios with moderate decay lengths, additional sensitivity is gained by combining SR III with the other signal regions. The background rejection achieved by the neural network makes it possible to distinguish between signal and background at high NN values, thus providing sensitivity to the hidden-sector benchmark points. For benchmarks 3 to 6 with leptonically decaying mediators and ∆m = 20 GeV (Fig. 8 left), a visible excess at high NN values can be observed. Applying a hard cut of NN ≥ 0.9 reduces the background to less than 1% of the original value, while retaining a large fraction of the signal. The retained numbers of events are listed in Tab. 5.
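The effect of a cut on the network output, and the ROC curve built from a scan over such cuts, can be sketched as follows. The scores in the example are made-up toy numbers, not values from Fig. 7 or Tab. 5, and the function names are hypothetical.

```python
def fraction_passing(scores, threshold=0.9):
    """Fraction of events with network output NN >= threshold."""
    if not scores:
        raise ValueError("need at least one event")
    return sum(1 for s in scores if s >= threshold) / len(scores)

def roc_points(signal_scores, background_scores, thresholds):
    """(background efficiency, signal efficiency) for each threshold."""
    return [(fraction_passing(background_scores, t),
             fraction_passing(signal_scores, t)) for t in thresholds]

# Toy network outputs, for illustration only
toy_signal = [0.95, 0.50, 0.92, 0.99]
toy_background = [0.10, 0.30, 0.05, 0.91, 0.20, 0.15, 0.40, 0.02, 0.25, 0.60]

signal_eff = fraction_passing(toy_signal)          # 3 of 4 events pass
background_eff = fraction_passing(toy_background)  # 1 of 10 events passes
curve = roc_points(toy_signal, toy_background, [0.0, 0.5, 0.9])
```

A single cut corresponds to one point on the ROC curve; a binned fit to the full NN distribution would use the information from all thresholds at once.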
For comparison, we also show benchmark 7 with ∆m = 40 GeV (Fig. 8 left). The signal-to-background ratio in this scenario is higher than for ∆m = 20 GeV at all NN values, because the signal leptons generally carry more transverse momentum than those from HF decays. For the dark matter benchmarks 1 and 2 (Fig. 8 right) the excess is small, due to the suppressed mediator decays into leptons, see Tab. 1. Further sensitivity could be obtained by performing a binned likelihood fit of the NN output instead of a cut of NN ≥ 0.9. Particularly for the second dark matter benchmark scenario, where over 5 events are expected in SR III before the cut on the NN, such an approach would have potential.

Figure 9. Projected signal exclusion limits at the LHC as a function of the mediator decay length cτ c . The expected sensitivity obtained from a neural network analysis is shown for different integrated luminosities (left) and in comparison with a simple kinematic cut-based analysis for 140 fb −1 (right) at √ s = 13 TeV. A likelihood ratio R 95 > 1 indicates that a certain decay length is excluded at more than 95% C.L. For comparison, we show existing upper bounds from a displaced lepton search at √ s = 8 TeV (plain black curve) [22].

The presented results are based on individual analyses of the NN for the soft lepton benchmarks 3-6, which we have interpolated to obtain bounds for intermediate decay lengths.
With 140 fb −1 of data, compressed dark sectors with mediator mass m c = 220 GeV and decay lengths in the range 1 mm < cτ c < 1 m can be excluded at the 95% C.L. This covers essentially the entire range of decay lengths predicted by co-annihilating or co-scattering dark matter, see Sec. 2.2. The maximum sensitivity is reached at cτ c = 3 cm. For benchmarks with smaller or larger cτ c , fewer events fall into the signal region, resulting in a reduced sensitivity at both extremes. The good sensitivity at small decay lengths very close to the detector resolution of 200 µm confirms that the background rejection by our analysis is efficient. The sensitivity at large cτ c depends on the experimental efficiency to detect displaced leptons.
In Fig. 9, right, we compare the neural network results with a simple cut-based analysis where only the p T (ℓ) requirement is lowered from 40 to 20 GeV. It is evident that a multivariate analysis of the event kinematics is crucial to overcome the HF background and obtain a good sensitivity to soft leptons.
We also show the current 95% C.L. upper bounds from the previous displaced lepton search at √ s = 8 TeV with p T (ℓ) > 25 GeV [22], which is not sensitive to displaced soft leptons from mediators produced in weak interactions. The much higher sensitivity of our analysis is due to a combination of the larger data set, the use of a neural network, and the lower lepton threshold. While the 8 TeV analysis was restricted to d 0 < 2 cm, the sensitivity in our analysis extends to cτ c = 1 m, thus covering the entire tracking region of the CMS detector. This explains why our analysis is sensitive to larger decay lengths, reaching its maximum at cτ c = 3 cm.
In Fig. 9, left, we include our projections for Run 3 (in red), assuming the same detection efficiencies and scaling our event numbers from Run 2. With the higher luminosity of 300 fb −1 , the expected reach can be slightly extended towards both smaller and larger decay lengths. It is also instructive to scale our results down to the data set of 2.6 fb −1 (in blue), which was used in the previous displaced lepton search [23]. We see that even with such a small data set the sensitivity to soft leptons reaches the 95% C.L. around cτ c = 3 cm and exceeds the very limited reach of the 8 TeV search.
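The luminosity scaling behind these projections amounts to simple arithmetic, sketched below in Python. The linear scaling of event numbers at fixed efficiency follows the procedure described above; the figure of merit s/√b is an assumption used here only to show how the sensitivity grows with luminosity, and it is not the likelihood ratio R 95 of Fig. 9. The event counts in the example are hypothetical.

```python
import math

def scale_yield(n_events, lumi_from, lumi_to):
    """Scale an expected event count linearly with integrated luminosity."""
    return n_events * lumi_to / lumi_from

def naive_significance(n_signal, n_background):
    """Simple s / sqrt(b) figure of merit (illustrative assumption)."""
    return n_signal / math.sqrt(n_background)

# Hypothetical expected yields at 140 fb^-1
s_run2, b_run2 = 5.0, 4.0

s_run3 = scale_yield(s_run2, 140.0, 300.0)
b_run3 = scale_yield(b_run2, 140.0, 300.0)
gain = naive_significance(s_run3, b_run3) / naive_significance(s_run2, b_run2)
```

Under these assumptions the figure of merit improves by a factor √(300/140) ≈ 1.46 from 140 fb −1 to 300 fb −1 , which is consistent with a reach that is only slightly extended.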
To reliably reproduce this analysis at the LHC, the data will need to be collected with triggers that have acceptance for leptons with lower transverse momenta. Compared to the triggers used in Ref. [23], it will be essential to lower the p T ≥ 40 (42) GeV requirement on reconstructed muons (electrons). Suitable triggers could place additional requirements on the angle between the leptons, or use information from missing transverse momentum or the presence of additional objects such as jets. Such cross triggers should allow the rejection of more HF background at trigger level, so that the data collection rates remain at an acceptable level while soft leptons can still be collected. Another interesting option is to explore mediator production from vector boson fusion with dedicated dijet triggers [56].

Conclusions and outlook
We have performed an analysis of soft displaced leptons with p T > 20 GeV, produced in association with missing energy at the 13 TeV LHC. This signature is typically predicted from compressed hidden sectors and is particularly well motivated by co-annihilation and co-scattering dark matter. Our analysis builds on a previous CMS search for displaced leptons with p T > 40 GeV. By lowering the lepton threshold we drastically improve the sensitivity to soft displaced leptons, despite the enhanced background from heavy flavor decays.
Overcoming this background requires a dedicated study of the signal and background kinematics beyond a simple cut-and-count analysis. To this end we have examined seven benchmarks: two of them motivated by dark matter, four that explore the sensitivity to the displacement for a fixed mediator mass m c = 220 GeV and mass splitting ∆m = 20 GeV in the hidden sector, and one with a larger mass splitting that covers less compressed scenarios. We have trained a neural network with nine selected kinematic variables to maximize the signal sensitivity. For the two dark matter scenarios our search predicts a low sensitivity, due to the small event rates of leptons from electroweak mediator decays. In turn, we find that mediators with purely leptonic decays in the range 1 mm < cτ c < 1 m can be explored with 140 fb −1 of current Run 2 data. With 300 fb −1 of data collected during Run 3, the sensitivity could be extended to 0.8 mm < cτ c < 2 m. Our search is also sensitive to heavier mediators. For a mass splitting of ∆m = 20 GeV, we expect a 95% C.L. sensitivity to mediator masses up to 500 GeV with 140 fb −1 and 640 GeV with 300 fb −1 of data. Signatures of this kind have so far been entirely unobservable in existing searches.
Beyond the scenarios considered in this work, our search strategy applies generally to models that predict new particles in the 100 GeV range decaying into displaced leptons and missing energy. Specific examples are leptophilic dark matter or models with heavy neutral leptons, which produce the same final state, but predict different event kinematics.
Optimized searches for some specific scenarios exist, but are too model-dependent to be sensitive to co-scattering dark matter and similar models. Our proposed search extends the discovery potential for new physics with displaced leptons beyond existing searches.
To realize our analysis at the LHC, dedicated triggers need to be designed to overcome the enhanced event rate at low lepton momenta. One possibility is the use of cross triggers with additional jets or missing energy together with soft leptons, to compensate for the lower lepton threshold. A further possibility may be to use the angular separation between the leptons to reduce the HF background events that pass the trigger. The broad applicability and the promising results of our analysis strengthen the motivation to develop such triggers for ATLAS and CMS.
It would be very interesting to extend the reach of our analysis to longer mediator decay lengths, which would allow us to explore hidden sectors with even smaller couplings. For di-muon signatures, information on the charged track from the inner detector could be combined with the observation of displaced muons in the muon chambers. Combinations of this kind could access decay lengths of a few meters, which are well motivated for instance in freeze-in dark matter scenarios. We leave these ideas for future research and look forward to the first searches for soft displaced leptons at the LHC.