1 Introduction

To date, no evidence of particles beyond the standard model (BSM) has been found by any experiment, including those at the CERN LHC. However, the vast majority of LHC searches assume the lifetimes of new particles are short enough that their decay products are prompt, i.e., their decay products are consistent with originating at the primary proton–proton (\({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \)) interaction. Search strategies are rarely optimized for particles with measurable lifetimes whose decays produce detector signatures that are displaced from the primary \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) interaction. Therefore, new phenomena with displaced signatures can escape such searches. Particles in BSM scenarios can have long lifetimes for the same reasons as standard model (SM) particles, namely, small couplings between the long-lived particles (LLPs) and lighter states, approximate symmetries, heavy intermediate states, or limited phase space availability for decays.

While the majority of searches are only sensitive to models that predict new phenomena with prompt signatures, the ATLAS, CMS, and LHCb Collaborations have performed dedicated searches for decays of BSM particles with long lifetimes. Direct search strategies include finding BSM particles via their anomalous energy loss and/or low velocity [1, 2] or via a disappearing track signature [3,4,5]. There are also numerous indirect searches targeting the decay products of LLPs, such as nonprompt final-state jets [6,7,8,9,10,11,12], photons [13, 14], leptons [15,16,17,18,19,20,21], or combinations thereof [22,23,24,25]. These searches are often complementary, providing sensitivity to different ranges of particle lifetimes and masses.

The CMS Collaboration has previously performed a search for signatures with one displaced electron and one displaced muon, using \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) collision data recorded at \(\sqrt{s}=8\,\text {Te}\text {V} \) and corresponding to an integrated luminosity of 19.7\(\,\text {fb}^{-1}\) [26]. Both this previous analysis and the one described in this paper, which uses data recorded at \(\sqrt{s}=13\,\text {Te}\text {V} \), are optimized for the phase space just beyond the sensitivity of prompt searches but with smaller displacements than other searches for long-lived BSM signatures. As a result, this search is sensitive to LLPs with proper decay lengths (\(c\tau _0\)) between approximately \(10^{-3}\) and \(10^{3}\,\text {cm} \), where \(c\) is the speed of light and \(\tau _0\) is the proper lifetime. These two analyses are unique in that they allow but do not require the displaced final-state particles to originate from a common vertex. Such a vertex is often required in other searches under the assumption that an LLP will decay to multiple leptons. In contrast, here we perform an inclusive search that is sensitive to one LLP that decays to multiple leptons, two LLPs that each decay to at least one lepton, and any other topology whose final state includes at least two displaced leptons. The search described in this paper uses data taken with the CMS detector during 2016, 2017, and 2018. We search for events with one electron and one muon, two electrons, or two muons in the final state, where both leptons are displaced from the beam axis. With respect to the previous search, this search is based on about a factor of 6 larger integrated luminosity, is performed at a higher \(\sqrt{s}\), and adds two same-flavor channels corresponding to the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) and \({\upmu } {\upmu } \) final states.

This search is designed to be model independent and to be sensitive to as many event topologies as possible. Consequently, the event selection focuses exclusively on a displaced, isolated dilepton signature and does not try to identify signal events using hadronic activity or missing transverse momentum from undetected particles. In this way, we retain sensitivity to any model that can produce leptons with displacements on the order of 0.01 to 10\(\,\text {cm}\) and with sufficiently high momenta, regardless of whether these leptons are accompanied by jets, missing transverse momentum, or other kinematic features.

We interpret the search results in the context of several models that produce final states with displaced leptons, the Feynman diagrams for which are shown in Fig. 1. The first model introduces R-parity-violating (RPV) terms in the superpotential of the minimally supersymmetric (SUSY) SM [27], allowing the lightest SUSY particle (LSP) to decay to SM particles. Only lepton-number-violating operators are considered because of constraints from measurements of the lifetime of the proton [28]. With sufficiently small couplings for these operators, the LSP has a long enough lifetime that its decay products are measurably displaced from the \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) interaction region. We focus on the case in which the LSP is the top squark (\(\widetilde{\mathrm{t}}\)), the superpartner of the top quark. At the LHC, the top squark would be dominantly pair produced, and we consider its decay through an RPV vertex to a \({\mathrm{d}}_{\mathrm{}}^{\mathrm{}}\) (\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)) quark and a charged lepton \(\ell \) via \(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell \) (\(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell \)). With respect to the previous analysis [26], we have added the \(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell \) decay to facilitate reinterpreting the results in a wider range of BSM scenarios, although we find that this decay mode produces similar results to those from the \(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell \) decay. We expect that the \(\widetilde{\mathrm{t}} \rightarrow {{\mathrm{s}}_{\mathrm{}}^{\mathrm{}}} \ell \) decay mode will also produce similar results, although we did not explore it. For simplicity, we assume lepton universality in the top squark decay vertex, so that the branching fraction to any lepton flavor, namely, an electron, muon, or tau lepton, is equal to one-third. 
We also interpret the results with a gauge-mediated SUSY breaking (GMSB) model in which the next-to-lightest SUSY particle (NLSP) is long lived because of its small gravitational coupling to the LSP gravitino \(\widetilde{\mathrm{G}}\), which is nearly massless [29]. In this model, the NLSP is the superpartner of a lepton (slepton \({\widetilde{\ell }}\)). We consider selectrons \(\widetilde{\mathrm{e}}\), smuons \(\widetilde{\upmu }{}{}\), and staus \(\widetilde{{\uptau }}{}{}\) separately, as well as together, as co-NLSPs. The sleptons would be pair produced at the LHC and each would decay to a lepton (\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\), \(\upmu \), or \(\uptau \)) of the same flavor and a gravitino. In addition, we consider a model that produces BSM Higgs bosons (\({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\)) with a mass of 125\(\,\text {Ge}\text {V}\) through gluon–gluon fusion [30]. The \({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\) decays to two long-lived scalars \({\mathrm{S}}_{\mathrm{}}^{\mathrm{}}\), each of which decays to two oppositely charged and same-flavor leptons, and the probabilities of the lepton pair being \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) or \({\upmu } {\upmu } \) are assumed to be the same. For the scenarios where the long-lived top squarks and sleptons decay to tau leptons, events in which the tau leptons both subsequently decay into electrons or muons are considered.

Tabulated results are provided in the HEPData record for this analysis [31].

Fig. 1 Feynman diagrams for \(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell \) (upper left), \(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell \) (upper right), \({\widetilde{\ell }} \rightarrow \ell \widetilde{\mathrm{G}} \) (lower left), and \({{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} \), \({{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} \rightarrow \ell ^{+}\ell ^{-}\) (lower right)

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6\(\,\text {m}\) internal diameter, providing a magnetic field of 3.8\(\,\text {T}\). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. The electron momentum is estimated by combining the energy measurement in the ECAL with the momentum measurement in the tracker. Forward calorimeters extend the pseudorapidity \(\eta \) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid using three technologies: drift tubes (DTs) in the barrel, cathode strip chambers (CSCs) in the endcaps, and resistive-plate chambers in both the barrel and the endcaps. Each of the four muon detector planes provides reconstructed hits on several detection layers, which are combined into local track segments, forming the basis of muon reconstruction inside the muon system.

The silicon tracker measures charged particles with \(|\eta | < 3.0\). During the 2016 LHC run, the silicon tracker consisted of 1440 silicon pixel and 15 148 silicon strip detector modules. The pixel detector was then upgraded, such that in the 2017 and 2018 LHC runs, it consisted of 1856 pixel modules. After the upgrade, the number of pixel layers increased from three, with radii between 4.4 and 10.2\(\,\text {cm}\) from the interaction region, to four, with radii between 2.9 and 16.0\(\,\text {cm}\)  [32]. In the 2017 and 2018 LHC runs, for nonisolated particles within the transverse momentum range \(1< p_{\mathrm {T}} < 10\,\text {Ge}\text {V} \) and \(|\eta | < 3.0\), the average track \(p_{\mathrm {T}}\) resolution is 1.5%. The transverse impact parameter (\(d_0\)) is defined as the distance of closest approach in the transverse plane of the helical trajectory of the track with respect to the beam axis [33, 34]. The sign of \(d_0\) is given by the sign of the scalar product between the lepton momentum and the vector from the beam axis to the lepton track reference point. The track \(d_0\) resolution improves from 75 to 20\(\,\upmu \text {m}\) as the \(p_{\mathrm {T}}\) increases, for nonisolated particles with \(1< p_{\mathrm {T}} < 10\,\text {Ge}\text {V} \) and \(|\eta | < 3.0\) in 2017 and 2018. With the upgraded silicon pixel tracker, the \(d_0\) resolution is approximately 25% better than in earlier data sets. The efficiency to reconstruct tracks as a function of \(|d_0 |\) is given in Ref. [34].

Events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100\(\,\text {kHz}\) within a fixed latency of about 4\(\,\upmu \text {s}\)  [35]. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1\(\,\text {kHz}\) before data storage [36, 37].

A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [38].

3 Data and Monte Carlo simulation

The data correspond to an integrated luminosity of 118 (113)\(\,\text {fb}^{-1}\) in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel (\({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) and \({\upmu } {\upmu } \) channels), 16\(\,\text {fb}^{-1}\) of which were collected in 2016 before the pixel detector upgrade. The difference in integrated luminosity in the different channels is due to the availability of the triggers, which will be described in Sect. 5. In addition, we use events containing muons from cosmic ray showers that were recorded with dedicated triggers for a control sample to evaluate the tracking efficiency of displaced particles, as will be described later.

In the Monte Carlo (MC) simulation of background and signal processes, minimum-bias interactions are superimposed on each event to simulate the effect of additional interactions within the same or neighboring bunch crossings (pileup). The frequency distribution of the additional interactions is adjusted to match that observed in data. The average number of pileup interactions was 23 (32) in 2016 (2017 and 2018). For the 2016 samples, the NNPDF3.0 [39] parton distribution function (PDF) set at next-to-leading order (NLO) is used, while for the samples describing the 2017 and 2018 data, the NNPDF3.1 PDF set computed at next-to-next-to-leading order [40] is used. The pythia generator is used to simulate the parton showering and hadronization for all processes. The modeling of the underlying event uses pythia 8.226 [41] with the CUETP8M1 [42] tune and pythia 8.230 with the CP5 tune [43] for simulated samples corresponding to the 2016 and 2017–2018 data sets, respectively. The MC-generated events are then processed through a detailed simulation of the CMS detector based on Geant4 [44] and are reconstructed with the same algorithms used for data.

While the background is estimated using data control samples, simulated background samples are produced to perform basic checks such as comparisons of data and simulation in control regions (CRs). Samples of simulated \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\)+jets, \({\mathrm{W}}_{\mathrm{}}^{\mathrm{}}\)+jets, and \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}\) production are generated at leading order (LO) using MadGraph 5_amc@nlo v2.2.2 (v2.4.2) for the 2016 (2017 and 2018) samples [45] and the MLM merging scheme between jets from matrix element calculations and parton showers [46]. Samples of diboson and single top quark events are simulated at NLO with powheg v2.0 [47,48,49,50,51]. Quantum chromodynamics (QCD) multijet events, which give rise to a background from SM events composed uniquely of jets produced through the strong interaction, are simulated with pythia, selecting events that are enriched in muons. Samples of the signal process \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \rightarrow \widetilde{\mathrm{t}} {\overline{\widetilde{\mathrm{t}}}} \), with the top squarks decaying via \(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell \) or \(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell \), are generated using pythia at LO. The top squarks can form strongly produced hadronic states called R-hadrons, which are generated with pythia. In the samples used in this analysis, the interactions of the R-hadrons with matter are not simulated in Geant4. However, such interactions are not expected to have a significant impact because these particles do not encounter a significant number of interaction lengths before decaying. 
Nevertheless, we study the impact of the R-hadron interactions using the “cloud model,” which assumes that the top squark is surrounded by a cloud of colored, light constituents that interact during scattering [52, 53], and find the effect on the signal efficiency to be negligible. The GMSB sleptons are generated at LO using MadGraph 5_amc@nlo v2.6.5 and pythia, and the slepton decay via \({\widetilde{\ell }} \rightarrow \ell \widetilde{\mathrm{G}} \) is simulated using Geant4, which ignores information about the chirality of the slepton. Thus, we do not present results for left- and right-handed sleptons separately. The signal process \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} \), \({{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} \rightarrow \ell ^{+}\ell ^{-}\) is generated using powheg and pythia at NLO.

4 Analysis strategy

We perform an inclusive search for displaced leptons by selecting events with well-reconstructed electrons and muons, and by rejecting background events from SM processes that produce displaced leptons, as will be described in Sect. 5. We use the lepton \(|d_0 |\), which is the main discriminating variable in the analysis, to define the signal regions (SRs). Using data in a prompt CR, we correct the \(|d_0 |\) distributions in the background and signal simulations to account for alignment and resolution effects that are not fully modeled in the simulations; this procedure will be presented in Sect. 6. Section 7 describes how we perform a background estimate based on control samples in data, using the lepton \(|d_0 |\) to separate signal-like events from background-like ones. Because the lepton \(|d_0 |\) distributions for the signal processes are modeled with simulation, we validate the modeling of the displaced tracking efficiency in data using displaced muons from cosmic ray events. Section 8 describes how, from these studies, we derive systematic uncertainties that are applied to the signal.

5 Event reconstruction and selection

Events were recorded with different triggers in each of the three channels. Because standard CMS electron triggers are not designed to recognize displaced tracks, we use photon triggers instead to ensure efficiency for finding displaced electrons. In fact, photon HLT paths efficiently select electrons as well, as these triggers do not veto electrons or charged particle tracks. In the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel, the trigger requires at least one muon candidate that is not constrained to the primary \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) interaction vertex and without any maximum \(|d_0 |\) requirement, and at least one photon candidate. In 2016 data, the muon and photon candidates are both required to have \(p_{\mathrm {T}} >28\,\text {Ge}\text {V} \) if the muon candidate \(|d_0 |\) is greater than 0.01\(\,\text {cm}\), and \(p_{\mathrm {T}} >38\,\text {Ge}\text {V} \) otherwise. In 2017 and 2018 data, the muon and photon candidates are both required to have \(p_{\mathrm {T}} >43\,\text {Ge}\text {V} \); the \(p_{\mathrm {T}}\) threshold was increased between the 2016 and 2017 data-taking periods to mitigate the effects of the increased pileup. In the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel, the events are required to pass at least one of two HLT paths. The first HLT path simply requires at least two photon candidates with \(p_{\mathrm {T}} >60\) (70)\(\,\text {Ge}\text {V}\) in 2016 (2017 and 2018) data. The second HLT path, which is included to partially recover events with lower \(p_{\mathrm {T}}\) electrons, requires the highest \(p_{\mathrm {T}}\) photon candidate to have \(p_{\mathrm {T}} >30\,\text {Ge}\text {V} \) and the second-highest \(p_{\mathrm {T}}\) photon candidate to have \(p_{\mathrm {T}} >18\) (22)\(\,\text {Ge}\text {V}\) in 2016 (2017 and 2018) data. 
The photon candidates passing this second trigger must satisfy requirements based on calorimeter cluster shape, isolation, and the ratio of the hadronic to electromagnetic energy, and the diphoton invariant mass must be \({>}90\,\text {Ge}\text {V} \). In the \({\upmu } {\upmu } \) channel, the trigger requires at least two muon candidates without any primary vertex constraints and without any maximum requirement on the impact parameter. In 2016 data, the muon candidates are required to have \(p_{\mathrm {T}} >23\,\text {Ge}\text {V} \) if the muon candidate \(|d_0 |\) is greater than 0.01\(\,\text {cm}\), and \(p_{\mathrm {T}} >33\,\text {Ge}\text {V} \) otherwise. In 2017 and 2018 data, the muon candidates are required to have \(p_{\mathrm {T}} >43\,\text {Ge}\text {V} \). For the masses and lifetimes we consider, the efficiency for signal events to pass these triggers is 20–40%, depending on mass, lifetime, analysis channel, and year.
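The trigger thresholds above can be summarized in a short sketch. This is only an illustration of the quoted \(p_{\mathrm {T}}\) and mass requirements, not the HLT implementation: the function names, the channel labels, and the single boolean `passes_photon_id` (standing in for the cluster shape, isolation, and hadronic-to-electromagnetic requirements of the second diphoton path) are assumptions made for this example.

```python
def muon_trigger_pt_threshold(channel, year, abs_d0_cm):
    """Trigger-level pT threshold in GeV for the emu and mumu channels.

    In the emu channel the same threshold applies to the photon candidate.
    The channel labels "emu" and "mumu" are hypothetical names.
    """
    if year in (2017, 2018):
        return 43.0
    displaced = abs_d0_cm > 0.01  # muon candidate |d0| above 0.01 cm (2016 only)
    if channel == "emu":
        return 28.0 if displaced else 38.0
    if channel == "mumu":
        return 23.0 if displaced else 33.0
    raise ValueError(f"unknown channel: {channel}")


def passes_diphoton_trigger(lead_pt, sublead_pt, mass, year, passes_photon_id):
    """Logical OR of the two diphoton paths used in the ee channel (GeV units)."""
    is_2016 = year == 2016
    # First path: two photon candidates above a single high threshold.
    high = 60.0 if is_2016 else 70.0
    path_one = lead_pt > high and sublead_pt > high
    # Second path: lower asymmetric thresholds, plus identification and mass cuts.
    low = 18.0 if is_2016 else 22.0
    path_two = (lead_pt > 30.0 and sublead_pt > low
                and mass > 90.0 and passes_photon_id)
    return path_one or path_two
```

The second path only recovers events when the diphoton mass requirement is met, so low-\(p_{\mathrm {T}}\) electron pairs with small invariant mass are still lost.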

After requiring that the events pass the triggers described above, we preselect well-reconstructed electrons and muons in each channel. Electrons and muons are reconstructed by associating a track reconstructed in the tracking detectors with either a cluster of energy in the ECAL [54, 55] or a track in the muon system [56]. The leptons used in this search are reconstructed with the particle-flow (PF) algorithm [57], which aims to reconstruct and identify each individual particle in an event with an optimized combination of information from the various elements of the CMS detector. The primary \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) interaction vertex is taken to be the candidate vertex with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\). The physics objects used for this determination are the jets, clustered using the jet finding algorithm [58, 59] with the tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the \(p_{\mathrm {T}}\) of those jets.

In the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel, we preselect events with at least one PF electron and at least one PF muon, while in the same-flavor channels, we preselect events with at least two PF electrons or muons. To retain sensitivity to signals that could produce leptons with the same charge, we make no requirements on the charge of the leptons. In the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel, we require the electrons to have \(p_{\mathrm {T}} >42\) (45)\(\,\text {Ge}\text {V}\) and the muons to have \(p_{\mathrm {T}} >40\) (45)\(\,\text {Ge}\text {V}\) in 2016 (2017 and 2018). In the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel, we require the electrons to have \(p_{\mathrm {T}} >65\) (75)\(\,\text {Ge}\text {V}\), and in the \({\upmu } {\upmu } \) channel, we require the muons to have \(p_{\mathrm {T}} >35\) (45)\(\,\text {Ge}\text {V}\). These \(p_{\mathrm {T}}\) requirements ensure that the trigger efficiencies do not depend on \(p_{\mathrm {T}}\). In all three channels, we require the electrons and muons to have \(|\eta |<1.5\) in order to remove leptons with poorly measured \(|d_0 |\), which would be predominantly at high \(|\eta |\). Since the signal leptons in the benchmark models are predominantly central, this requirement has a minimal impact on the signal efficiencies. In addition, electrons in the ECAL transition region between the barrel and endcap detectors are rejected because the electron reconstruction in this region is not optimal. This criterion effectively means that electrons are required to have \(|\eta | < 1.44\). We also reject electrons and muons in certain regions of the \(\eta \)-\(\phi \) plane, where \(\phi \) is the azimuthal angle, because two layers of the pixel tracker were not fully functional during certain data-taking periods, resulting in increased \(|d_0 |\) mismeasurements. 
The rejected regions are \(1.0<\eta <1.5\), \(2.7 <\phi \le \pi \) in 2017 and \(0.3<\eta <1.2\), \(0.4<\phi <0.8\) in 2018. This requirement reduces the relative signal efficiency by \({<}1\%\), and so we apply it for the entire 2017 and 2018 data-taking periods.
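As a rough illustration of the kinematic preselection and the \(\eta \)-\(\phi \) veto described above, the following sketch encodes the quoted thresholds. The function names, channel labels, and data layout are hypothetical, and the ECAL transition-region rejection is approximated here by the effective \(|\eta | < 1.44\) requirement for electrons.

```python
import math

# Offline pT thresholds in GeV: (2016 value, 2017 and 2018 value).
# Keys are (channel, lepton flavor); the labels are hypothetical names.
PT_MIN = {
    ("emu", "e"): (42.0, 45.0),
    ("emu", "mu"): (40.0, 45.0),
    ("ee", "e"): (65.0, 75.0),
    ("mumu", "mu"): (35.0, 45.0),
}

# Rejected regions: (eta_lo, eta_hi, phi_lo, phi_hi, phi_hi_inclusive).
VETO = {
    2017: (1.0, 1.5, 2.7, math.pi, True),
    2018: (0.3, 1.2, 0.4, 0.8, False),
}


def in_veto_region(eta, phi, year):
    """True if the lepton falls in the rejected eta-phi region for that year."""
    if year not in VETO:
        return False
    eta_lo, eta_hi, phi_lo, phi_hi, inclusive = VETO[year]
    in_eta = eta_lo < eta < eta_hi
    in_phi = (phi_lo < phi <= phi_hi) if inclusive else (phi_lo < phi < phi_hi)
    return in_eta and in_phi


def passes_kinematics(channel, flavor, pt, eta, phi, year):
    """Apply the pT, |eta|, and eta-phi veto requirements to one lepton."""
    pt_2016, pt_later = PT_MIN[(channel, flavor)]
    pt_min = pt_2016 if year == 2016 else pt_later
    eta_max = 1.44 if flavor == "e" else 1.5
    return pt > pt_min and abs(eta) < eta_max and not in_veto_region(eta, phi, year)
```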

Identification requirements, based on energy deposits in the calorimeters and on hit information in the tracker and muon systems, are imposed on the electrons and muons at the “tight” working point [54,55,56]. Included in these identification requirements is the criterion that at least one of the first two functional pixel layers traversed by the electron must register a hit. The identification requirements for muons include that each muon track must have at least one hit in the pixel detector, at least six tracker layer hits, and segments with hits in two or more muon detector stations.

To ensure that electrons and muons are isolated from other particles, we calculate the scalar \(p_{\mathrm {T}}\) sum of all other particles around the electron (muon) within a cone of \(\varDelta R\equiv \sqrt{\smash [b]{(\varDelta \eta )\,^2 + (\varDelta \phi )\,^2}}< 0.3\) (0.4), correct this sum for contributions from pileup, and define the relative isolation as the ratio of this sum to the electron (muon) \(p_{\mathrm {T}}\). For each lepton, the pileup correction term is calculated as the area of the lepton’s isolation cone multiplied by the average energy per unit area in \(\eta \)-\(\phi \) space in the event. By including objects from any vertex in the isolation sum, we allow for the possibility that the displaced lepton is associated with the wrong primary vertex. We require that the relative isolation is \({<}0.10\) for muons, \({<}0.0588\) for electrons in 2016, and \({<} 0.0287 + 0.506\,\text {Ge}\text {V}/p_{\mathrm {T}} \) for electrons in 2017 and 2018.
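The isolation requirement can be sketched as follows. This is a minimal illustration under stated assumptions: the cone area is taken to be the geometric \(\pi R^2\), the corrected sum is clamped at zero, and the function and argument names are hypothetical. The thresholds are those quoted above.

```python
import math


def relative_isolation(sum_pt, lepton_pt, rho, cone_radius):
    """Pileup-corrected relative isolation.

    sum_pt      -- scalar pT sum of other particles in the cone (GeV)
    lepton_pt   -- lepton pT (GeV)
    rho         -- average energy per unit area in eta-phi space (GeV)
    cone_radius -- 0.3 for electrons, 0.4 for muons
    """
    pileup = rho * math.pi * cone_radius ** 2  # assumption: geometric cone area
    corrected = max(0.0, sum_pt - pileup)      # assumption: no negative isolation
    return corrected / lepton_pt


def passes_isolation(flavor, year, sum_pt, lepton_pt, rho):
    """Apply the quoted relative isolation thresholds for muons and electrons."""
    if flavor == "mu":
        return relative_isolation(sum_pt, lepton_pt, rho, 0.4) < 0.10
    iso = relative_isolation(sum_pt, lepton_pt, rho, 0.3)
    if year == 2016:
        return iso < 0.0588
    return iso < 0.0287 + 0.506 / lepton_pt  # 2017 and 2018, pT in GeV
```

Because the 2017 and 2018 electron threshold contains a term proportional to \(1/p_{\mathrm {T}}\), it approaches 0.0287 for high-\(p_{\mathrm {T}}\) electrons.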

Besides these object-level selections, we also impose several event-level selections. To remove cosmic ray muons in the \({\upmu } {\upmu } \) and \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channels, we require that there are zero pairs of muons with \(\cos {\alpha }<-0.99\), where \(\alpha \) is the three-dimensional angle between the muons. This selection removes back-to-back muons, which is how cosmic ray muons from the Earth’s atmosphere appear in the detector. In addition, we require that the relative time between the leading (largest \(p_{\mathrm {T}}\)) two muons is not consistent with the timing of cosmic ray muons. To determine the time associated with each muon, we propagate the signal times as measured in the DTs and CSCs to the beam axis assuming the muons are outgoing. We then determine which muon is above the other based on their \(\phi \) measurements and find \(\varDelta t\), the time of the lower muon subtracted from the time of the upper muon. Since cosmic ray muons traverse the detector from top to bottom, the lower muon appears later in time than the upper muon, assuming the muons are outgoing from the beam axis, making \(\varDelta t\) negative for cosmic ray muons. On the other hand, muons from the \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) collision have similar times as they are both outgoing from the beam axis, and so they have a \(\varDelta t\) distribution centered at 0\(\,\text {ns}\). We reject events with \(\varDelta t< -20\,\text {ns} \) if there are at least eight independent time measurements to determine the time of flight of each muon. We also require that the relevant leptons in each channel are separated by \(\varDelta R>0.2\). This loose requirement is sufficient to significantly reduce the contribution of the heavy-flavor background from cascade decays of \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) or \({\mathrm{c}}_{\mathrm{}}^{\mathrm{}}\) hadrons. 
To remove pairs of leptons from material interactions, we reject events where the candidate leptons form a good displaced vertex that overlaps with the tracker material, which is measured in Ref. [60]. The vertices are reconstructed with the Kalman vertex fitter [61, 62], and a “good” vertex is one with a position uncertainty of less than 10\(\,\upmu \text {m}\) and a \(\chi ^{2}/N_{\text {dof}}< 20\), where \(N_{\text {dof}}\) is the number of degrees of freedom of the fit.
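The cosmic ray rejection described above combines an angular and a timing requirement for a pair of muons. The sketch below illustrates the logic for one pair. It assumes that the upper muon is the one with the larger \(\sin \phi \) (that is, a vertical axis pointing upward), which is a simplification introduced for this example, and all function names are hypothetical.

```python
import math


def cos_alpha(p1, p2):
    """Cosine of the three-dimensional angle between momenta (px, py, pz)."""
    dot = sum(a * b for a, b in zip(p1, p2))
    norm1 = math.sqrt(sum(a * a for a in p1))
    norm2 = math.sqrt(sum(b * b for b in p2))
    return dot / (norm1 * norm2)


def delta_t(t1, phi1, t2, phi2):
    """Time of the upper muon minus time of the lower muon, in ns.

    Assumption: the muon with the larger sin(phi) is the upper one.
    """
    if math.sin(phi1) > math.sin(phi2):
        return t1 - t2
    return t2 - t1


def is_cosmic_like(p1, p2, t1, phi1, n_meas1, t2, phi2, n_meas2):
    """True if the muon pair is back to back or has cosmic-like timing."""
    back_to_back = cos_alpha(p1, p2) < -0.99
    enough_timing = n_meas1 >= 8 and n_meas2 >= 8
    cosmic_timing = enough_timing and delta_t(t1, phi1, t2, phi2) < -20.0
    return back_to_back or cosmic_timing
```

The timing requirement is only applied when both muons have at least eight independent time measurements, so poorly timed muons are rejected by the angular requirement alone.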

The events satisfying the preselection criteria for each channel are further categorized using the \(|d_0 |\) of the selected leptons. We define a “prompt CR” by requiring the electrons and muons to have \(|d_0 | <50\,\upmu \text {m} \). This region is dominated by SM processes with prompt leptons and is used to check that the background simulation accurately reproduces the behavior of the data. The “inclusive SR” is defined by requiring the electrons and muons to have \(100\,\upmu \text {m}<|d_0 | <10\,\text {cm} \). We do not select leptons that are displaced by more than 10\(\,\text {cm}\) to ensure that the leptons originate within the pixel tracker, since the lepton identification criteria require hits in at least one pixel layer.
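The region definitions above amount to a simple classification on the two lepton \(|d_0 |\) values. A minimal sketch, with hypothetical function names and \(|d_0 |\) expressed in \(\upmu \text {m}\), could look like this:

```python
PROMPT_MAX_UM = 50.0        # prompt CR: |d0| < 50 um
DISPLACED_MIN_UM = 100.0    # inclusive SR: |d0| > 100 um
DISPLACED_MAX_UM = 1.0e5    # inclusive SR: |d0| < 10 cm = 100000 um


def classify_lepton(abs_d0_um):
    """Label a lepton as 'prompt', 'displaced', or 'other' from its |d0|."""
    if abs_d0_um < PROMPT_MAX_UM:
        return "prompt"
    if DISPLACED_MIN_UM < abs_d0_um < DISPLACED_MAX_UM:
        return "displaced"
    return "other"


def classify_event(abs_d0_a_um, abs_d0_b_um):
    """Return 'prompt CR', 'inclusive SR', or None for the two selected leptons."""
    labels = {classify_lepton(abs_d0_a_um), classify_lepton(abs_d0_b_um)}
    if labels == {"prompt"}:
        return "prompt CR"
    if labels == {"displaced"}:
        return "inclusive SR"
    return None
```

Events in which only one lepton is displaced fall into neither region in this sketch.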

To remove overlaps among the three channels, events that pass the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) inclusive SR selection are rejected from the same-flavor channels.

Table 1 shows the cumulative signal efficiency for several choices of \(\widetilde{\mathrm{t}}\) mass and \(c\tau _0\). The efficiencies are similar for each year of data taking. The larger signal efficiency in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel relative to either of the same-flavor channels occurs primarily because two top squarks decaying with equal probability to an electron, a muon, or a tau lepton will produce an \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) final state twice as often as either of the same-flavor final states.
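The factor of two can be checked by enumerating the lepton flavors from the two top squark decays under the lepton universality assumption. The short sketch below does only this counting and ignores subsequent leptonic tau decays and any selection effects. The variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

FLAVORS = ("e", "mu", "tau")
BRANCHING = Fraction(1, 3)  # equal branching fraction to each lepton flavor

# Sum the probability of each unordered flavor pair over both decay orderings.
final_state_prob = defaultdict(Fraction)
for first, second in product(FLAVORS, repeat=2):
    final_state_prob[tuple(sorted((first, second)))] += BRANCHING * BRANCHING

ratio_emu_to_ee = final_state_prob[("e", "mu")] / final_state_prob[("e", "e")]
```

Each mixed-flavor pair is reached by two orderings and each same-flavor pair by one, so the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) probability is 2/9 and each same-flavor probability is 1/9.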

Table 1 The cumulative efficiencies for \(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell \) signal events to pass the 2018 inclusive SR selection, for several choices of \(\widetilde{\mathrm{t}}\) mass (columns) and \(c\tau _0\) (rows). For each entry, the numerator is the weighted number of events passing the SR selection, and the denominator is the total weighted number of generated signal events. The corrections described in Sect. 6 are applied

6 Corrections to the simulation

Several corrections are applied to the MC simulations in order to account for the known differences between simulation and data. The simulation is corrected so that its distribution of pileup interactions matches that of 2016, 2017, and 2018 data. In addition, the trigger efficiency is measured in simulation and in an independent data sample recorded with missing \(p_{\mathrm {T}}\) triggers, for the three data-taking years and for each analysis channel separately. The ratio of the trigger efficiency in data to that in simulation is applied as a “scale factor” to each event in the simulated samples. Averaged over the years, the trigger scale factors for the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \), \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \), and \({\upmu } {\upmu } \) channels are \(0.94 \pm 0.01\), \(1.00\pm 0.17\), and \(0.88\pm 0.01\), respectively. While these trigger scale factors are derived with samples dominated by prompt events, we obtain consistent scale factors when the leptons in these samples are required to be displaced. Scale factors are also applied as functions of lepton \(p_{\mathrm {T}}\) and \(\eta \) in order to correct the performance of the lepton identification and isolation algorithms [54,55,56].

Corrections are also applied to each lepton in order to make the simulated lepton \(|d_0 |\) distributions match those in data, in prompt CRs. These corrections are derived by fitting the simulated background and the data lepton \(d_0\) distributions with Gaussian functions in each channel’s prompt CR, and by then comparing the widths of the Gaussian functions. The width of each Gaussian function is largely set by the lepton \(d_0\) resolution, and the discrepancy between the data and background MC simulation lepton \(d_0\) distributions is largely due to an overly optimistic alignment in the simulation, which creates an unrealistically precise \(d_0\) resolution. Therefore, we define \(\sigma _{\text {data}}^2 = \sigma _{\text {bkg}}^2 + \sigma _{\text {correction}}^2\), where \(\sigma _{\text {data}}\) is the width of the function fit to data, \(\sigma _{\text {bkg}}\) is the width in background simulation, and \(\sigma _{\text {correction}}\) is the additional piece that is needed to make the background simulation and data agree in each channel’s prompt CR. We find \(\sigma _{\text {data}}\) and \(\sigma _{\text {bkg}}\) from the fits, and compute \(\sigma _{\text {correction}}\). Because the fit results are similar in each channel, we average the \(\sigma _{\text {correction}}\) derived in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) and \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channels for electrons, and in the \({\upmu } {\upmu } \) and \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channels for muons. We then smear the simulated \(d_0\) distribution by picking random values within a Gaussian distribution centered at 0 and with a width of the average \(\sigma _{\text {correction}}\), and applying these random values as corrections to each lepton \(d_0\). 
The average \(\sigma _{\text {correction}}\) is \(14.8\pm 0.4\) (\(9.2\pm 0.4\))\(\,\upmu \text {m}\) for electrons and \(7.6\pm 0.1\) (\(8.1\pm 0.1\))\(\,\upmu \text {m}\) for muons in 2017 (2018). No \(|d_0 |\) correction is found to be needed for the simulation with 2016 data-taking conditions, as the simulation for 2016 data already has refined alignment conditions that match the data. The corrections are applied to both background and signal MC simulation.
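The smearing procedure can be sketched as follows, under the assumption that the fitted widths are available as plain numbers. The widths used here (25 and 20 in arbitrary micrometre units) are hypothetical and chosen so that the arithmetic is easy to check; the function names are likewise illustrative.

```python
import math
import random
import statistics


def sigma_correction(sigma_data, sigma_bkg):
    """Solve sigma_data^2 = sigma_bkg^2 + sigma_correction^2 for sigma_correction."""
    diff = sigma_data**2 - sigma_bkg**2
    if diff <= 0.0:
        return 0.0  # simulation is already at least as wide as data: no smearing
    return math.sqrt(diff)


def smear_d0(d0_values, sigma_corr, rng):
    """Add an independent Gaussian offset (mean 0, width sigma_corr) to each d0."""
    return [d0 + rng.gauss(0.0, sigma_corr) for d0 in d0_values]


# Hypothetical fitted widths in micrometres.
sigma_corr = sigma_correction(sigma_data=25.0, sigma_bkg=20.0)

rng = random.Random(12345)
simulated_d0 = [rng.gauss(0.0, 20.0) for _ in range(50000)]
smeared_d0 = smear_d0(simulated_d0, sigma_corr, rng)

# After smearing, the simulated width should approach the data width.
smeared_width = statistics.pstdev(smeared_d0)
```

Because independent Gaussian widths add in quadrature, the smeared simulation has a width close to the data width by construction.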

7 Background estimation

Most leptons resulting from SM processes originate in prompt particle decays. However, displaced leptons can arise from several different sources. Displaced leptons resulting from cosmic ray muons, material interactions, and long-lived SM hadrons are largely rejected by the analysis selection criteria. The tight isolation criterion is particularly useful in rejecting the vast majority of the heavy-flavor background. Nevertheless, leptons from mismeasurements of prompt tracks or from decays of tau leptons, which have a mean \(c\tau _0\) value of 87\(\,\upmu \text {m}\), and long-lived hadrons such as \({\mathrm{B}}_{\mathrm{}}^{\mathrm{}}\) and \({\mathrm{D}}_{\mathrm{}}^{\mathrm{}}\) mesons, which have mean \(c\tau _0\) values of 500 and \({<}100\,\upmu \text {m} \), respectively, may still appear in the SRs.

7.1 The ABCD method

To estimate the number of background events in the SRs, we employ an “ABCD method” based on control samples in data that account for all three significant background sources: mismeasurements, tau lepton decays, and heavy-flavor decays. First, we categorize the events that pass the preselection criteria into four regions (A, B, C, and D) based on each lepton \(|d_0 |\) measurement, namely, \(|d^{\text {a}}_0 |\) and \(|d^{\text {b}}_0 |\), as shown in Fig. 2. For the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel, \(|d^{\text {a}}_0 |\) is defined as the leading electron \(|d_0 |\) and \(|d^{\text {b}}_0 |\) is defined as the leading muon \(|d_0 |\). For the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) (\({\upmu } {\upmu } \)) channel, \(|d^{\text {a}}_0 |\) is defined as the leading electron (muon) \(|d_0 |\) and \(|d^{\text {b}}_0 |\) is defined as the subleading electron (muon) \(|d_0 |\). We use the number of background events in regions A, B, and C to estimate the expected background in region D, which is the SR. We start from the assumption that \(N_{\text {B}}/N_{\text {A}}=N_{\text {D}}/N_{\text {C}}\) so that the number of background events in D is \(N_{\text {B}}N_{\text {C}}/N_{\text {A}}\), where \(N_{\text {X}}\) is the number of background events in the given region. This assumption is valid if \(|d^{\text {a}}_0 |\) and \(|d^{\text {b}}_0 |\) are statistically independent. We identify deviations from this assumption using the closure tests described in Sect. 7.3 and correct for them using the procedure described in Sect. 7.4.
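The basic ABCD relation above can be written as a short sketch. The counts are hypothetical, and the statistical uncertainty uses simple Poisson error propagation, which is an assumption of this example and not necessarily the treatment used in the final fit.

```python
import math


def abcd_estimate(n_a, n_b, n_c):
    """Estimate N_D = N_B * N_C / N_A and its statistical uncertainty.

    Assumes independent Poisson counts and strictly positive inputs.
    """
    if min(n_a, n_b, n_c) <= 0:
        raise ValueError("this sketch requires positive counts in A, B, and C")
    estimate = n_b * n_c / n_a
    rel_err = math.sqrt(1.0 / n_a + 1.0 / n_b + 1.0 / n_c)
    return estimate, estimate * rel_err


# Hypothetical control region counts.
n_d, n_d_err = abcd_estimate(n_a=10000, n_b=200, n_c=150)
```

The estimate is only unbiased if the two \(|d_0 |\) values are independent, which is why the closure tests and the correction described later in this section are needed.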

Fig. 2

A diagram of the ABCD method, shown for illustration on simulated background events passing the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) preselection with 2018 conditions. In each \(|d^{\text {a}}_0 |\)-\(|d^{\text {b}}_0 |\) bin, the number of events divided by the bin area is plotted. A, B, and C are CRs. SRs I–IV are described in Sect. 7.2

7.2 Signal regions

We subdivide the inclusive SR to define nonoverlapping SRs in \(|d_0 |\):

  • SR I: \(100 < \) both \(|d_0 | <500\,\upmu \text {m} \)

  • SR II: \(100< |d^{\text {a}}_0 | <500\,\upmu \text {m} \), \(500\,\upmu \text {m}< |d^{\text {b}}_0 | <10\,\text {cm} \)

  • SR III: \(500\,\upmu \text {m}< |d^{\text {a}}_0 | <10\,\text {cm} \), \(100< |d^{\text {b}}_0 | <500\,\upmu \text {m} \)

  • SR IV: \(500\,\upmu \text {m} < \) both \(|d_0 | <10\,\text {cm} \)
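The SR definitions listed above amount to a simple classification in the two \(|d_0 |\) values. In the sketch below, the function name is illustrative, and assigning a value of exactly 500 µm to the upper band is an assumption of the example, since the text uses strict inequalities on both sides of that boundary.

```python
UM_PER_CM = 10_000.0

LOWER_UM = 100.0
MIDDLE_UM = 500.0
UPPER_UM = 10.0 * UM_PER_CM  # 10 cm expressed in micrometres


def d0_band(abs_d0_um):
    """Return 'low' (100-500 um), 'high' (500 um-10 cm), or None if outside both."""
    if LOWER_UM < abs_d0_um < MIDDLE_UM:
        return "low"
    if MIDDLE_UM <= abs_d0_um < UPPER_UM:  # boundary choice is an assumption
        return "high"
    return None


def signal_region(abs_d0_a_um, abs_d0_b_um):
    """Map (|d0^a|, |d0^b|) in micrometres to 'I', 'II', 'III', 'IV', or None."""
    regions = {
        ("low", "low"): "I",
        ("low", "high"): "II",
        ("high", "low"): "III",
        ("high", "high"): "IV",
    }
    return regions.get((d0_band(abs_d0_a_um), d0_band(abs_d0_b_um)))
```

For example, `signal_region(200.0, 2000.0)` returns `"II"`, while an event with one lepton below 100 µm returns `None` because it falls outside the inclusive SR.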

We also split SR I, which has the largest number of expected background events, into two bins. In the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) and \({\upmu } {\upmu } \) channels, these bins are in the leading muon \(p_{\mathrm {T}}\), and in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel, these bins are in the leading electron \(p_{\mathrm {T}}\). The bin boundary is at 90 (140)\(\,\text {Ge}\text {V}\) for the 2016 (2017 + 2018) \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel, at 300 (400)\(\,\text {Ge}\text {V}\) for the 2016 (2017 + 2018) \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel, and at 100\(\,\text {Ge}\text {V}\) for all years in the \({\upmu } {\upmu } \) channel. The \(p_{\mathrm {T}}\) bins are chosen such that the bin with higher \(p_{\mathrm {T}}\) contains less than one expected background event, which maximizes the sensitivity to short lifetimes.

Table 2 Closure test results in data, background simulation, and background simulation with the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events removed in the 100–500\(\,\upmu \text {m}\) region. The extrapolated ratios of the actual yield to the estimated yield (averaged over the two one-prompt/one-displaced sidebands) and their statistical uncertainties are given

Dividing the inclusive SR in this way separates the expected contribution of different background sources into individual SRs, and gives loose SRs with some amount of background contamination but high signal efficiency and tight SRs with little background contamination but also smaller signal efficiency. The signal efficiency in each SR depends on the lifetime of the LLP, so dividing the inclusive SR into multiple SRs also helps the search to be sensitive to a wide range of LLP lifetimes. Because they are nonoverlapping, we can use these SRs simultaneously in the signal extraction procedure.

Table 3 Closure test results in data, background simulation, and background simulation with the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events removed in the 500\(\,\upmu \text {m}\)–10\(\,\text {cm}\) region. The ratios of the actual yield to the estimated yield (averaged over the two one-prompt/one-displaced sidebands) and their statistical uncertainties are given

We perform a separate ABCD estimate for each SR. When performing the estimates, we subdivide regions B and C into 100–500\(\,\upmu \text {m}\) and 500\(\,\upmu \text {m}\)–10\(\,\text {cm}\) regions to match the SR definitions, and subdivide region A and the 100–500\(\,\upmu \text {m}\) subregions of B and C in \(p_{\mathrm {T}}\) to match the binning of SR I.

7.3 Closure tests in one-prompt/one-displaced sidebands

The background estimation method starts from the premise that the lepton \(|d_0 |\) values are independent. The preselection criteria remove one possible source of dependence between the \(|d_0 |\) values by ensuring that leptons that share a common displaced vertex do not contribute meaningfully to the SRs, but dependence between the \(|d_0 |\) values may still arise from processes in which the leptons originate from the same type of parent particle. Specifically, we find that \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events in which each tau lepton decays to an electron or muon lead to a dependence between the lepton \(|d_0 |\) values, since each electron or muon is produced in the decay of a long-lived tau lepton. In principle, processes that produce pairs of \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) or \({\mathrm{c}}_{\mathrm{}}^{\mathrm{}}\) hadrons could introduce dependence between the \(|d_0 |\) values through this same mechanism, but the lepton isolation criteria ensure a negligible SR contribution from events in which both leptons are produced in \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) or \({\mathrm{c}}_{\mathrm{}}^{\mathrm{}}\) hadron decays. We therefore expect the degree of \(|d_0 |\) dependence to increase with the fraction of events from \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \). Studies with simulation show that leptons from tau lepton decays contribute significantly from about 100–500\(\,\upmu \text {m}\) and peak around 200\(\,\upmu \text {m}\), so we expect the \(|d_0 |\) dependence to appear in this range and peak accordingly. Thus, the background contribution from leptons from tau lepton decays is confined to SR I. The ability to measure such dependence between the \(|d_0 |\) values depends on the quality of the \(|d_0 |\) measurement. 
Because \(|d_0 |\) resolution is better for muons than for electrons and is better in the 2017 and 2018 data-taking periods relative to the 2016 data-taking period, we also expect the dependence between the \(|d_0 |\) values to be more apparent in 2017 and 2018 and to increase with the number of muons in the final state. We observe this dependence between the \(|d_0 |\) values in the closure tests described below and correct for it using the procedure described in Sect. 7.4.

We perform closure tests in sideband regions that are orthogonal to the SRs, in data and background simulation. These sideband regions have one prompt and one displaced lepton. We first perform these closure tests in the region where the prompt (displaced) lepton has a displacement of 30–100 (100–500)\(\,\upmu \text {m}\). We use the estimated and actual yields in several subregions of each sideband region to estimate the ratio of the actual yield to the estimated yield in SR I. This procedure will be described in more detail in Sect. 7.4. Table 2 shows the resulting ratios in data, background simulation, and background simulation with the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events from the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\)+jets samples removed. As expected, the ratios in data are frequently greater than unity, which indicates nonclosure of the ABCD method and positive \(|d_0 |\) dependence. The data ratios generally agree with those of the full background simulation, which indicates that the source of nonclosure is modeled in the background simulation. When the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events are removed from the simulation, we find that the ratios are consistent with unity. Because the full simulation successfully models the observed nonclosure in data and because removing the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events results in closure, we conclude that \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events are indeed the only meaningful source of \(|d_0 |\) dependence. We note that the superior \(|d_0 |\) resolution of the upgraded pixel detector frequently results in fewer events in the sideband regions in 2017 and 2018 data than in 2016 data, which can lead to larger uncertainties in the 2017 + 2018 closure test results shown in Table 2, notably in the \({\upmu } {\upmu } \) channel.

We next perform closure tests in sideband regions where one lepton is prompt (30–100\(\,\upmu \text {m}\)) and the other is even more displaced (500\(\,\upmu \text {m}\)–10\(\,\text {cm}\)). Table 3 shows the ratios of the actual yield to the estimated yield averaged over the two one-prompt/one-displaced sidebands in data, background simulation, and background simulation with the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events removed. In each case, the resulting ratios are consistent with unity, and data and background simulation agree. Thus, in contrast to the 100–500\(\,\upmu \text {m}\) region tests, the 500\(\,\upmu \text {m}\)–10\(\,\text {cm}\) region tests show no evidence of \(|d_0 |\) dependence. We also note that removing the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {\uptau } {\uptau } \) events from the background simulation does not significantly affect the results, which provides further evidence that the background from tau lepton decays is small in this region. We therefore conclude that \(|d_0 |\) dependence is significant in the 100–500\(\,\upmu \text {m}\) region and insignificant in the 500\(\,\upmu \text {m}\)–10\(\,\text {cm}\) region.

7.4 Background estimate correction and systematic uncertainty

We now define a procedure to account for the \(|d_0 |\) dependence in the ABCD method and assign a systematic uncertainty in the estimate. First, as is done in the closure tests, we divide each one-prompt/one-displaced sideband into two subregions in the displaced lepton \(|d_0 |\): (1) the 100–500\(\,\upmu \text {m}\) region, where we find \(|d_0 |\) dependence from tau leptons to be significant, and (2) the 500\(\,\upmu \text {m}\)–10\(\,\text {cm}\) region, where we find \(|d_0 |\) dependence from tau leptons to be insignificant. The 100–500\(\,\upmu \text {m}\) (500\(\,\upmu \text {m}\)–10\(\,\text {cm}\)) sideband region is used as a CR for SR I (SRs II–IV). We perform closure tests in data in each sideband subregion and use the ratio of the actual to the estimated number of events as a measure of nonclosure.

From the 500\(\,\upmu \text {m}\)–10\(\,\text {cm}\) region tests, we take the largest deviation of the ratio from unity as a systematic uncertainty in SRs II–IV, and apply no correction. This approach produces systematic uncertainties between about 40 and 200% in the background estimates, whose central values range from 0.003 to 3.6 events.

When the displaced lepton \(|d_0 |\) is between 100 and 500\(\,\upmu \text {m}\), we fit the ratio as a function of the prompt lepton \(|d_0 |\) with a straight line, where the slope and y intercept are allowed to vary. We extrapolate the prompt lepton fit to 200\(\,\upmu \text {m}\) (within SR I), which is the value where simulation indicates we should expect the largest contribution from tau lepton decays. The mean lepton \(|d_0 |\) of the 100–500\(\,\upmu \text {m}\) bin in the background simulation is also approximately 200\(\,\upmu \text {m}\). We determine the uncertainty in each extrapolated ratio by simultaneously varying the fit parameters according to their 68% confidence interval, while accounting for correlations between the parameters. We take the average of the two extrapolated ratios (one from each one-prompt/one-displaced sideband) and derive a correction and its systematic uncertainty from this average ratio. If the average is greater than unity, we use the average as a multiplicative correction to the background estimate, and we use the uncertainty in the average as a systematic uncertainty in the background estimate. The uncertainty in the average is obtained through simple uncertainty propagation. In this case, we also vary the 200\(\,\upmu \text {m}\) extrapolation point by \({\pm }50\,\upmu \text {m} \), which is the approximate range of the tau lepton contribution as a function of \(|d_0 |\), and apply the variation in the resulting correction as an additional systematic uncertainty in the background estimate. If the average is less than or equal to unity, we set the correction equal to unity and use the uncertainty in the average as a symmetric systematic uncertainty about unity. The size of the correction varies between \(1.0\pm 0.6\) and \(4.2\pm 1.8\), depending on channel and year.
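The correction procedure for SR I can be sketched as below. This is a simplified stand-in: it uses a weighted least-squares line fit with linear error propagation through the parameter covariance, whereas the text describes varying the fit parameters within their 68% confidence interval. The input points are hypothetical and lie exactly on straight lines so that the result is easy to verify, and taking the larger of the two shifts as the extrapolation-point systematic uncertainty is an assumption.

```python
import math


def weighted_line_fit(xs, ys, errs):
    """Weighted least-squares fit of y = a + b*x.

    Returns (a, b, var_a, var_b, cov_ab).
    """
    ws = [1.0 / e**2 for e in errs]
    s = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = s * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / delta
    b = (s * sxy - sx * sy) / delta
    return a, b, sxx / delta, s / delta, -sx / delta


def extrapolate(fit, x0):
    """Evaluate the fitted line at x0, including the intercept-slope correlation."""
    a, b, var_a, var_b, cov_ab = fit
    variance = var_a + x0**2 * var_b + 2.0 * x0 * cov_ab
    return a + b * x0, math.sqrt(variance)


def derive_correction(r1, e1, r2, e2):
    """Average two extrapolated ratios; the correction is never below unity."""
    average = 0.5 * (r1 + r2)
    error = 0.5 * math.hypot(e1, e2)
    return (average if average > 1.0 else 1.0), error


# Hypothetical closure ratios versus prompt-lepton |d0| (micrometres).
prompt_d0 = [35.0, 45.0, 55.0, 65.0, 75.0, 85.0, 95.0]
errors = [0.2] * len(prompt_d0)
fit_1 = weighted_line_fit(prompt_d0, [1.0 + 0.005 * x for x in prompt_d0], errors)
fit_2 = weighted_line_fit(prompt_d0, [0.8 + 0.004 * x for x in prompt_d0], errors)

r1, e1 = extrapolate(fit_1, 200.0)
r2, e2 = extrapolate(fit_2, 200.0)
correction, correction_err = derive_correction(r1, e1, r2, e2)

# Vary the extrapolation point by +-50 micrometres.
shifted = [
    derive_correction(extrapolate(fit_1, x)[0], e1, extrapolate(fit_2, x)[0], e2)[0]
    for x in (150.0, 250.0)
]
extrapolation_syst = max(abs(c - correction) for c in shifted)
```

With these inputs the two extrapolated ratios are 2.0 and 1.6, so the correction is their average, 1.8.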

7.5 Closure tests in SRs

To test the full background estimation procedure, we perform closure tests in background simulation in the four SRs, with all corrections and systematic uncertainties derived from background simulation. In these tests, both leptons are displaced. The results of these closure tests in the SRs are shown in Table 4 with the 2016 and 2017+2018 yields combined in each channel. In this table, the actual and estimated yields are given, as opposed to Tables 2 and 3, which display the ratios. The actual yields are compatible with the estimated yields, which indicates that the correction performs as expected and the systematic uncertainties are sufficient to cover any unforeseen dependency.

Table 4 Closure test results in background simulation in the SRs, with the corrections applied. The estimated numbers of events, the actual numbers of events, and their total uncertainties (statistical plus systematic) are given. In cases where the actual number of events is zero, the uncertainty is given by the product of the average background simulation event weight and the upper bound of the 68% confidence level Poisson interval given by a single observation of zero events
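The zero-event uncertainty mentioned in the caption of Table 4 can be illustrated with a short calculation. The caption does not state which interval convention is used, so the sketch below supports two common choices as an assumption: a central interval (half of the excluded probability in each tail) and a one-sided upper bound. The average event weight is a hypothetical number.

```python
import math


def poisson_upper_bound_zero(cl=0.6827, central=True):
    """Upper bound on the Poisson mean when zero events are observed.

    For n = 0, P(0 | mu) = exp(-mu), so the bound solves exp(-mu) = tail.
    """
    alpha = 1.0 - cl
    tail = alpha / 2.0 if central else alpha
    return -math.log(tail)


def zero_yield_uncertainty(mean_event_weight, cl=0.6827, central=True):
    """Scale the zero-event upper bound by the average simulated event weight."""
    return mean_event_weight * poisson_upper_bound_zero(cl, central)


upper_central = poisson_upper_bound_zero()
upper_one_sided = poisson_upper_bound_zero(central=False)

# Hypothetical average event weight.
uncertainty = zero_yield_uncertainty(mean_event_weight=0.05)
```

The central convention gives an upper bound of about 1.84 events and the one-sided convention about 1.15 events, before scaling by the event weight.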

7.6 Additional studies

In addition, we perform several studies to check for other potential sources of background in the SRs. First, to check the significance of material interactions, we invert the criterion that rejects material interactions. After doing so, we find no events in the SRs in data, across all channels and years. Thus, we conclude that there is no significant background from material interactions after the full selection criteria are applied. Second, to test for the presence of cosmic ray muon events, we invert the cosmic ray muon rejection criteria in the \({\upmu } {\upmu } \) channel and scale the number of events by the efficiency of cosmic ray muon events to survive the cosmic ray muon rejection criteria, which is found to be \({<}0.03\%\) from a dedicated data sample of cosmic ray muon events. With this study, we find a negligible number of cosmic ray muon events in data.

To estimate an upper limit on the amount of heavy-flavor background in the SR and to check that this background is covered within the nominal background prediction, several studies are performed. First, we perform the nominal ABCD method while additionally requiring at least one \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)-tagged jet, using the medium working point of the combined secondary vertices algorithm (version 2) [63]. In the \({\upmu } {\upmu } \) channel, which has the smallest relative SR contribution from mismeasurements and thus the most sensitivity to heavy-flavor backgrounds, we find that the estimate with at least one \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)-tagged jet is negligible. In the second study of the heavy-flavor background in the SR, we look at samples in which we invert the isolation criterion for events that pass the \({\upmu } {\upmu } \) preselection, for data and simulated background from muon-enriched QCD multijet events. These samples are dominated by muons from decays of \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) hadrons, and the QCD multijet simulation describes the data well in the region outside of the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson peak in the invariant mass distribution. We perform a naive ABCD estimate with the QCD multijet simulation and find no evidence for \(|d_0 |\) dependence, which indicates that the nominal background estimation already accounts for the heavy-flavor background. In the last heavy-flavor background study, we estimate this particular background in the SRs from the ratios of the number of events in each SR to the number of events in the prompt CR, from the QCD multijet simulation in the nonisolated region. We multiply these ratios by a normalization factor obtained from the number of QCD multijet simulated events that pass the nominal \({\upmu } {\upmu } \) preselection. 
Using this approach, we estimate that the heavy-flavor background is about 2 (20)% of the nominal background estimate in SR I (IV), which is well covered within the nominal prediction uncertainties.

To investigate possible SR contributions from displaced leptonic decays of SM hadrons, we examine 2018 data and QCD multijet simulation in the \({\upmu } {\upmu } \) channel with both the muon isolation and \(\varDelta R\) requirements inverted. When the SR muon displacement criteria are applied, this region is dominated by events with dimuon invariant masses near those of the \({\mathrm{J/\uppsi }}_{\mathrm{}}^{\mathrm{}}\) and \({{\mathrm{\uppsi }}_{\mathrm{}}^{\mathrm{}}} (2\text {S})\) mesons. Such events likely result from \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) hadron decays and are therefore already covered by the studies described in the previous paragraph, but we nevertheless estimate an upper bound on their SR contribution as an independent check. Using data in this inverted-isolation, inverted-\(\varDelta R\) region, we find the ratios of the number of events in each SR to the number of events in the prompt CR. We multiply these ratios by a normalization factor found from QCD multijet simulated events and find that the SM hadron contribution is less than 0.2% of the nominal prediction in SR I, which is negligible, and about 17% of the nominal prediction in SR IV, which is well covered by the large systematic uncertainty in this SR.

8 Systematic uncertainties in the signal efficiency

The systematic uncertainty in the background estimation method is the most significant one in the analysis: varying the nuisance parameter by one standard deviation shifts the best fit signal strength by about 5%. The following paragraphs describe the systematic uncertainties that are applied to the signal efficiency.

The integrated luminosities of the 2016, 2017, and 2018 data-taking periods are individually known with uncertainties in the 1.2–2.5% range [64,65,66]. When combined for the data set used in this analysis, the total uncertainty is 1.8%; the improvement in precision reflects the uncorrelated time evolution of some systematic effects.

The simulation of pileup events assumes a total inelastic pp cross section of 69.2\(\,\text {mb}\), with an associated uncertainty of 5% [67]. The systematic uncertainty arising as a result of the modeling of pileup events is estimated by varying the cross section of the minimum-bias events by 5% when generating the target pileup distributions. The pileup weights are recomputed with these new distributions and applied to the simulated events to obtain the variation in the yields in the inclusive SR. The average uncertainty is \({<} 1\%\). We treat these uncertainties as 100% correlated across the three years of data taking.
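A minimal sketch of the pileup reweighting variation is given below. The target pileup distributions for the nominal and varied cross sections are taken as inputs; how they are produced from the cross section is not modeled here. All histograms are hypothetical, and only the cross section value and its 5% uncertainty come from the text.

```python
NOMINAL_XSEC_MB = 69.2   # total inelastic pp cross section quoted in the text
REL_UNCERTAINTY = 0.05   # 5% uncertainty quoted in the text

xsec_up = NOMINAL_XSEC_MB * (1.0 + REL_UNCERTAINTY)
xsec_down = NOMINAL_XSEC_MB * (1.0 - REL_UNCERTAINTY)


def normalize(hist):
    """Scale a histogram to unit area."""
    total = float(sum(hist))
    return [h / total for h in hist]


def pileup_weights(target_hist, sim_hist):
    """Per-bin weights that map the simulated pileup shape onto the target shape."""
    target = normalize(target_hist)
    sim = normalize(sim_hist)
    return [t / s if s > 0.0 else 0.0 for t, s in zip(target, sim)]


def weighted_yield(weights, selected_hist):
    """Yield of selected simulated events after applying per-bin pileup weights."""
    return sum(w * n for w, n in zip(weights, selected_hist))


# Hypothetical histograms binned in the number of pileup interactions.
sim_pileup = [100, 200, 400, 200, 100]
target_nominal = [80, 220, 400, 220, 80]   # assumed to correspond to 69.2 mb
target_up = [60, 200, 400, 240, 100]       # assumed to correspond to xsec_up
selected_sim = [2, 8, 20, 12, 8]           # simulated events in the inclusive SR

yield_nominal = weighted_yield(pileup_weights(target_nominal, sim_pileup), selected_sim)
yield_up = weighted_yield(pileup_weights(target_up, sim_pileup), selected_sim)
relative_shift = (yield_up - yield_nominal) / yield_nominal
```

The same calculation with a target distribution for `xsec_down` would give the downward variation.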

The trigger efficiency systematic uncertainty is given by the uncertainty in the measured trigger efficiency scale factors. These uncertainties are about 1% for the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) and \({\upmu } {\upmu } \) channels and 10–19% for the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel. The uncertainty is larger for the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel relative to the other two channels because there are fewer events available for the efficiency measurement in this channel. In addition, to cover the systematic variation observed in the muon trigger efficiency in signal simulation over the full \(|d_0 |\) range, we assign an additional 20% uncertainty. We treat these uncertainties as 100% correlated across the three data-taking years.

Table 5 Systematic uncertainties in the signal efficiency, for all three years and the three channels. For many sources of uncertainty, a range indicating the 68% confidence level of the spread is given. Uncertainties in the same row are treated as correlated among the data-taking years, except for the displaced tracking and pixel detector hit efficiencies for muons, where the 2016 uncertainty is treated as uncorrelated with the 2017 and 2018 uncertainties, as explained in the text

The efficiency to reconstruct displaced, isolated, high-\(p_{\mathrm {T}}\) muons can be measured using cosmic ray muon events, as they also have these properties. The tracking efficiency of displaced muons is measured using cosmic ray muon events in simulation and data, and this efficiency is also used as a proxy for the tracking efficiency for displaced electrons. We take the difference in the mean efficiency between data and simulation as a systematic uncertainty in the signal yield. This uncertainty is 2–14%, depending on the data-taking year. The 2017 and 2018 systematic uncertainties are treated as fully correlated, while the 2016 uncertainty is treated as uncorrelated with the 2017 and 2018 uncertainties, since the pixel detector was upgraded after the 2016 data taking. The choice of how to correlate the uncertainties does not significantly affect the results.

One selection within the muon identification could have some dependence on \(|d_0 |\), namely, the requirement that the muons have at least one pixel detector hit. We find the efficiency of this criterion in simulated cosmic ray muon events and cosmic ray muon events from data, and we apply the difference in mean efficiency between data and simulation as a systematic uncertainty in the signal yield. The average uncertainty is about 16 (32)% in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) (\({\upmu } {\upmu } \)) channel. The 2017 and 2018 systematic uncertainties are treated as fully correlated, while the 2016 uncertainty is treated as uncorrelated with the 2017 and 2018 uncertainties, since the pixel detector was upgraded after the 2016 data taking. Although we apply a similar pixel detector hit requirement on electrons, we do not apply a systematic uncertainty for electrons because it would require a sample of displaced electrons in data, which is difficult to obtain and verify. We checked that adding such a systematic uncertainty would not significantly affect the results.

For the two systematic uncertainties derived with cosmic ray muon events, the largest uncertainty is in 2016, compared with relatively smaller uncertainties in 2017 and 2018. This is in part because the 2016 cosmic data sample is much smaller, meaning that the statistical uncertainties are larger in this year. In addition, the uncertainty is reduced in 2017 and 2018 due to the upgrade of the pixel tracker, which allows for more precise tracking measurements.

To find the systematic uncertainty associated with the corrections to the lepton identification and isolation, we fluctuate the lepton scale factors up and down by their uncertainties and observe the change in the simulated event yields in the inclusive SR. The average uncertainty for electrons is about 3% in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel and about 7% in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel, while the average uncertainty for muons is \({<} 1\%\). We treat these uncertainties as 100% correlated across the three data-taking years.

To find the systematic uncertainty associated with the corrections to the lepton \(|d_0 |\), we fluctuate the lepton \(|d_0 |\) corrections up and down by their uncertainties and observe the change in the simulated event yields in the inclusive SR. We find that this uncertainty is negligible in 2017 and 2018, and there is no \(|d_0 |\) correction needed for the 2016 simulation.

The systematic uncertainties in the signal efficiency are summarized in Table 5.

9 Results

Figure 3 shows the expected number of background events and the observed data, with a representative signal overlaid, in each \(p_{\mathrm {T}}\) bin and each SR, for each channel. Because the electron \(|d_0 |\) values are measured less precisely than those of muons, the background estimates are generally greatest in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel, despite its stricter \(p_{\mathrm {T}}\) requirements. Furthermore, the 2016 data contributes relatively more background events than the 2017–2018 data because of the improved \(|d_0 |\) resolution after the pixel detector upgrade. We also note that the background predictions are lowest in SR IV, especially when there is at least one muon in the final state.

The observed numbers of events are consistent with the predicted background. All data points deviate by less than 2 standard deviations from the expected SM background, for each analysis channel as well as for the channel combination.

Fig. 3

The number of observed and estimated background events in each channel and SR, with a representative signal overlaid. The lower panel shows the fractional difference between the data and the background. For each background estimate and signal yield, the total uncertainty (statistical plus systematic) is given. For each observed yield, the 68% Poisson confidence interval is given. The distributions shown are those obtained before the final maximum likelihood fit to the data

In the high-\(p_{\mathrm {T}}\) SR I bin, which is the most sensitive bin for large top squark masses and small \(c\tau _0\) values, particularly \(c\tau _0 \lesssim 1\,\text {cm} \), the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel has the largest signal yield relative to the other two channels. As described above, this is because there are twice as many chances to have one electron and one muon, since the top squarks decay to each lepton flavor with equal probability. In this bin, the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) and \({\upmu } {\upmu } \) channel signal yields are similar. In SR IV, which is the most sensitive bin for large top squark masses and long lifetimes, the \({\upmu } {\upmu } \) channel has the largest signal yield for \(c\tau _0 \gtrsim 10\,\text {cm} \), relative to the other two channels. This is because the muon reconstruction and selection efficiency is better than that of electrons, which is particularly true at large \(|d_0 |\) values. In this bin, the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) and \({\upmu } {\upmu } \) channels have similar amounts of background, and the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) channel has the smallest signal yield out of the three channels for \(c\tau _0 \gtrsim 10\,\text {cm} \). Therefore, for large top squark masses, the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel is the most sensitive for \(c\tau _0 \lesssim 10\,\text {cm} \) and the \({\upmu } {\upmu } \) (\({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \)) channel is the most (least) sensitive for \(c\tau _0 \gtrsim 10\,\text {cm} \). For a given \(c\tau _0\) and mass, the relative distribution of signal events across SRs is similar for all benchmark signals we consider.

We perform a simultaneous counting experiment in each SR bin for most of the interpretations we consider. However, the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) and \({\upmu } {\upmu } \) channels are fit individually to calculate limits on GMSB models with a \(\widetilde{\mathrm{e}}\) or \(\widetilde{\upmu }{}{}\) NLSP. We set 95% confidence level (\(\text {CL}\)) upper limits on the product of the signal production cross section and branching fraction to leptons (\(\sigma \mathcal {B}\)), using a modified frequentist method [68,69,70,71,72]. This approach uses the profile likelihood ratio determined by pseudo-experiments as the test statistic [71] and the \(\text {CL}_\text {s}\) criterion. The expected and observed upper limits are evaluated through the use of pseudodata sets. Potential signal contributions to event counts in the SRs and CRs are taken into consideration, as are correlated statistical uncertainties that arise when CR event counts are used to predict the number of background events in multiple SRs. The systematic uncertainties and their correlations are incorporated in the likelihood as nuisance parameters with log-normal probability density functions. The statistical uncertainties in the signal and background estimates are modeled with gamma functions. By comparing the expected and observed cross section limits to the theoretical cross sections at NLO, mass and \(c\tau _0\) exclusion limits are set for each of the models we consider.
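To illustrate the \(\text {CL}_\text {s}\) criterion, the sketch below treats a single-bin counting experiment with a known background and no systematic uncertainties, using the observed count itself as the test statistic. This is a deliberately simplified stand-in and not the procedure of this analysis, which uses a profile likelihood ratio with nuisance parameters over many bins. All names and inputs are illustrative.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls_value(n_obs, signal, background):
    """CLs = CL_{s+b} / CL_b for a single counting bin."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)


def upper_limit(n_obs, background, cl=0.95, tol=1e-6):
    """Signal yield at which CLs falls to 1 - cl, found by bisection."""
    target = 1.0 - cl
    lo, hi = 0.0, 1.0
    while cls_value(n_obs, hi, background) > target:
        hi *= 2.0  # expand until the upper edge is excluded
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls_value(n_obs, mid, background) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


limit_zero_observed = upper_limit(n_obs=0, background=0.5)
limit_three_observed = upper_limit(n_obs=3, background=1.0)
```

When zero events are observed, this simplified \(\text {CL}_\text {s}\) reduces to \(e^{-s}\), so the 95% \(\text {CL}\) limit is about 3 signal events regardless of the background expectation.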

Figures 4, 5, and 6 show the limits for the top squarks, sleptons, and exotic Higgs bosons, respectively. The top squark limits assume either \(\mathcal {B}(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell )\) or \(\mathcal {B}(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell )\) is 100%, and each lepton has an equal probability of being an electron, a muon, or a tau lepton. The slepton limits assume that the superpartners of the left- and right-handed leptons are degenerate in mass. The Higgs boson limits assume that the mass of \({\mathrm{S}}_{\mathrm{}}^{\mathrm{}}\) is 30 or 50\(\,\text {Ge}\text {V}\), \(\mathcal {B}({{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}})=100\%\), and each \({\mathrm{S}}_{\mathrm{}}^{\mathrm{}}\) has a 50% probability of decaying to two electrons or two muons. The \({\mathrm{S}}_{\mathrm{}}^{\mathrm{}}\) masses are chosen as two kinematically accessible benchmark values. In Figs. 4 and 5, the area to the left of the solid curves represents the observed exclusion region, and the dashed lines indicate the expected limits. The maximum sensitivity of this search occurs at \(c\tau _0 = 2\,\text {cm} \), where \(\widetilde{\mathrm{t}}\) masses up to 1500\(\,\text {Ge}\text {V}\) are excluded. The sensitivity degrades for \(\widetilde{\mathrm{t}}\) masses and \(c\tau _0\) values above and below this point. The previous CMS analysis [26], which was performed only in the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu } \) channel and at \(\sqrt{s}=8\,\text {Te}\text {V} \), excluded \(\widetilde{\mathrm{t}}\) masses up to 790\(\,\text {Ge}\text {V}\) at a \(c\tau _0\) of 2\(\,\text {cm}\). As a result of the higher \(\sqrt{s}\) and integrated luminosity, as well as the addition of the same-flavor channels, the mass exclusion limits for this search improve upon the previous CMS analysis by approximately a factor of 2.
Furthermore, this search can be directly compared with the search described in Ref. [16], which looks for displaced leptons with the ATLAS detector at \(\sqrt{s}=13\,\text {Te}\text {V} \). The smaller \(|d_0 |\) lower bound of the SRs enables the present analysis to have greater sensitivity to shorter slepton lifetimes than the ATLAS analysis.

Fig. 4

The observed 95% \(\text {CL}\) upper limits on the long-lived top squark production cross section, in the \(c\tau _0\)-mass plane, for the three channels combined. The \(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell \) (upper) and \(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell \) (lower) processes are shown. These limits assume either \(\mathcal {B}(\widetilde{\mathrm{t}} \rightarrow \mathrm{d}\ell )\) or \(\mathcal {B}(\widetilde{\mathrm{t}} \rightarrow \mathrm{b}\ell )\) is 100%, and each lepton has an equal probability of being an electron, a muon, or a tau lepton. The area to the left of the black curve represents the observed exclusion region, and the dashed red lines indicate the expected limits and their 68% confidence intervals

Fig. 5

The 95% \(\text {CL}\) constraints on the long-lived slepton \(c\tau _0\) and mass. The \(\widetilde{{\uptau }}{}{}\) and co-NLSP limits are shown for the three channels combined, while the \(\widetilde{\mathrm{e}}\) and \(\widetilde{\upmu }{}{}\) NLSP limits are shown for the \({{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) and \({\upmu } {\upmu } \) channels, respectively. These limits assume that the superpartners of the left- and right-handed leptons are degenerate in mass and \(\mathcal {B}({\widetilde{\ell }} \rightarrow \ell \widetilde{\mathrm{G}})\) is 100%. The area to the left of the solid curves represents the observed exclusion region, and the dashed lines indicate the expected limits

Fig. 6

The 95% \(\text {CL}\) upper limits on the \({{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} \), \({{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} \rightarrow \ell ^{+}\ell ^{-}\) branching fraction as a function of \(c\tau _0\), for a Higgs boson with a mass of 125\(\,\text {Ge}\text {V}\) and a long-lived scalar with a mass of 30\(\,\text {Ge}\text {V}\) or 50\(\,\text {Ge}\text {V}\), for the three channels combined. These limits assume that \(\mathcal {B}({{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{S}}_{\mathrm{}}^{\mathrm{}}})=100\%\) and each \({\mathrm{S}}_{\mathrm{}}^{\mathrm{}}\) has a 50% probability of decaying to two electrons or two muons. The area above the solid (dashed) curve represents the observed (expected) exclusion region

10 Summary

A search has been presented for long-lived particles decaying to displaced leptons in proton–proton collisions at \(\sqrt{s}=13\,\text {Te}\text {V} \) at the LHC. With collision data recorded in 2016, 2017, and 2018, and corresponding to an integrated luminosity of 113–118\(\,\text {fb}^{-1}\), no excess above the estimated background has been observed. Exclusion limits have been set at 95% confidence level. Top squarks with masses between 100 and at least 460\(\,\text {Ge}\text {V}\) have been excluded for \(0.01<c\tau _0 <1000\,\text {cm} \), with a maximum exclusion of 1500\(\,\text {Ge}\text {V}\) occurring at \(c\tau _0 =2\,\text {cm} \), where \(c\tau _0\) is the proper decay length. These exclusions assume that 100% of the top squarks decay to a lepton and a \({\mathrm{d}}_{\mathrm{}}^{\mathrm{}}\) or \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) quark, where the lepton has an equal probability of being an electron, muon, or tau lepton. The following exclusions assume that the superpartners of the left- and right-handed leptons are mass degenerate. Electron superpartners with masses of at least 50\(\,\text {Ge}\text {V}\) have been excluded for \(0.007<c\tau _0 <70\,\text {cm} \), with a maximum exclusion of 610\(\,\text {Ge}\text {V}\) occurring at \(c\tau _0 =0.7\,\text {cm} \). Muon superpartners with masses of at least 50\(\,\text {Ge}\text {V}\) have been excluded for \(0.005<c\tau _0 <265\,\text {cm} \), with a maximum exclusion of 610\(\,\text {Ge}\text {V}\) occurring at \(c\tau _0 =3\,\text {cm} \). Tau lepton superpartners with masses of at least 50\(\,\text {Ge}\text {V}\) have been excluded for \(0.015<c\tau _0 <20\,\text {cm} \), with a maximum exclusion of 405\(\,\text {Ge}\text {V}\) occurring at \(c\tau _0 =2\,\text {cm} \). 
In the case that electron, muon, and tau lepton superpartners are mass degenerate, lepton superpartners with masses between 50 and at least 270\(\,\text {Ge}\text {V}\) have been excluded for \(0.005<c\tau _0 <265\,\text {cm} \), with a maximum exclusion of 680\(\,\text {Ge}\text {V}\) occurring at \(c\tau _0 =2\,\text {cm} \). For sleptons with \(c\tau _0 <0.8\,\text {cm} \), these are the most sensitive results published to date. For \(0.10<c\tau _0 <12\,\text {cm} \), branching fractions greater than 0.03% have been excluded for 125\(\,\text {Ge}\text {V}\) Higgs bosons decaying to two long-lived scalar particles, assuming each has a mass of 30\(\,\text {Ge}\text {V}\) and decays with equal probability to electrons or muons. For scalar particles with \(0.1< c\tau _0 < 1000\,\text {cm} \) that decay to any final state, these are the most sensitive results published to date.