1 Introduction

Supersymmetry (SUSY) is one of the most popular extensions of the Standard Model (SM): it can solve the hierarchy problem, unify the gauge couplings at the Planck scale and provide dark matter candidates. The minimal supersymmetric extension of the Standard Model (MSSM) is the simplest phenomenologically viable realisation of SUSY [1, 2]. The present study addresses a subset of models featuring massive long-lived particles (LLPs) with a measurable flight distance [3, 4], decaying semileptonically. Long-lived particles decaying semileptonically with displaced jets composed of SM particles have been studied by the experiments at the LHC [5,6,7,8,9]. Additional information on searches for LLPs at collider experiments can be found in Refs. [10,11,12].

This analysis uses proton-proton (pp) collision data at a centre-of-mass energy \(\sqrt{s} =13\text {\,TeV} \) collected by the LHCb experiment at the LHC, corresponding to a total integrated luminosity of 5.4\(\text {\,fb} ^{-1}\). It extends the analysis of Ref. [9] on data collected at \(\sqrt{s} =7\) and 8\(\text {\,TeV}\). The adopted theoretical framework is inspired by minimal SUper GRAvity (mSUGRA) with R-parity violation (RPV) [13], in which the neutralino can decay into a muon and two quarks: \(\tilde{\chi }^{0}_{1} \rightarrow \mu ^{+} q_{i} q_{j}( \mu ^{-} \bar{q_{i}} \bar{q_{j}})\). Neutralinos can be produced by a variety of processes. In this paper the analysis has been performed assuming the two mechanisms depicted in Fig. 1. In the first process, a Higgs-like particle, \(h^0\), is produced by gluon fusion and decays into two LLPs. The analysis covers \(h^0\) masses from 30 to 200\(\text {\,GeV\!/}c^2\), LLP lifetimes from 5 to 200 ps and LLP mass values from 10\(\text {\,GeV\!/}c^2\) up to about one half the \(h^0\) mass. The second mode is a direct LLP production from quark interactions. The LLP lifetime range considered is from 5 to 200 ps and the mass range from 10 to 90\(\text {\,GeV\!/}c^2\). The LLP lifetime range begins at 5 ps, well above the typical b-hadron lifetime, and extends up to 200 ps, where most of the vertices are still within the LHCb vertex locator (VELO). The mass range avoids the region of the SM b-quark states, but also takes into account the forward acceptance of the LHCb detector within which the decay products of relatively light LLPs can be efficiently detected.
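To give a feel for the lifetime range quoted above, the mean laboratory flight distance of a particle of mass m, momentum p and proper lifetime τ is βγcτ = (p/m)cτ. The following Python sketch evaluates it; the 100 GeV/c momentum and the function name are illustrative assumptions and are not taken from the analysis.

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm per ps


def mean_flight_distance_mm(mass_gev, momentum_gev, lifetime_ps):
    """Mean lab-frame flight distance, beta*gamma*c*tau, in mm."""
    beta_gamma = momentum_gev / mass_gev  # p / (m c) in natural units
    return beta_gamma * C_MM_PER_PS * lifetime_ps


# Hypothetical example: a 40 GeV/c^2 LLP carrying 100 GeV/c of momentum.
for tau_ps in (5, 10, 200):
    d = mean_flight_distance_mm(40.0, 100.0, tau_ps)
    print(f"tau = {tau_ps} ps -> mean flight distance {d:.1f} mm")
```

Under these assumed kinematics the mean flight distance grows from a few millimetres at 5 ps to more than ten centimetres at 200 ps.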

The LLP signature is a displaced vertex made of charged particle tracks accompanied by an isolated muon with high transverse momentum with respect to the proton beam direction, \(p_{\mathrm {T}}\). This study benefits from the excellent vertex reconstruction provided by the VELO, and by the low \(p_{\mathrm {T}}\) threshold of the muon trigger, compared to the other LHC experiments. In addition, the LHCb experiment is probing a rapidity region only partially accessible by other LHC experiments. These properties allow the LHCb experiment to be complementary to similar analyses performed by the two central detectors at the LHC and even explore regions of the theoretical parameter space where these experiments are limited by their low efficiency to reconstruct highly boosted LLPs.

2 Detector description and simulation

The LHCb detector [14, 15] is a single-arm forward spectrometer covering the pseudorapidity range \(2<\eta <5\), designed for the study of particles containing \(b \) or \(c \) quarks. The detector includes a high-precision tracking system consisting of the VELO which is a silicon-strip detector surrounding the pp interaction region [16], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about \(4{\mathrm {\,Tm}}\), and three stations of silicon-strip detectors and straw drift tubes [17, 18] placed downstream of the magnet. The tracking system provides a measurement of the momentum, \(p\), of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200\(\text {\,GeV\!/}c\). The minimum distance of a track to a primary pp collision vertex (PV), the impact parameter, is measured with a resolution of \((15+29/p_{\mathrm {T}})\,\upmu \text {m} \), where \(p_{\mathrm {T}}\) is in  \(\text {\,GeV\!/}c\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [19]. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic (ECAL) and a hadronic calorimeter (HCAL) [20]. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [21]. The online event selection is performed by a trigger [22], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. During data taking an alignment and calibration of the detector is performed in near real-time and used in the software trigger [23]. The same alignment and calibration information is propagated to the offline reconstruction.
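As a small worked example, the impact-parameter resolution formula quoted above can be evaluated directly. The function name is an assumption introduced for illustration only.

```python
def ip_resolution_um(pt_gev):
    """Impact-parameter resolution, (15 + 29/pT) micrometres, with pT in GeV/c."""
    if pt_gev <= 0:
        raise ValueError("pT must be positive")
    return 15.0 + 29.0 / pt_gev


# Resolution for a soft track and for a muon at 12 GeV/c.
for pt in (1.0, 12.0):
    print(f"pT = {pt} GeV/c -> {ip_resolution_um(pt):.1f} um")
```

The resolution approaches 15 μm at high transverse momentum, where the 1/pT term becomes small.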

Fig. 1

LLP production processes considered in this paper, where the \(\tilde{\chi }^{0}_{1}\) represents the LLP: (a) di-LLP production via a scalar particle \(h^0\); (b) non-resonant, direct LLP production from quark interactions, where X is a stable particle, with mass identical to the LLP. The LLP decays into a muon and two quarks: \(\tilde{\chi }^{0}_{1} \rightarrow \mu ^{+} q_{i} q_{j}( \mu ^{-} \bar{q_{i}} \bar{q_{j}})\)

Simulation is used to model the effects of the detector acceptance and the imposed selection requirements. In the simulation, pp collisions are generated using Pythia 8 [24, 25] with a specific LHCb configuration [26] and with parton density functions taken from CTEQ6L [27]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [28, 29] as described in Ref. [30]. The simulation includes pileup events with an average of 1.1 pp visible interactions per bunch crossing.

Several sets of signal events have been produced assuming the processes illustrated in Fig. 1, where the \(\tilde{\chi }^{0}_{1}\) plays the role of a long-lived particle. For the first process considered, two \(\tilde{\chi }^{0}_{1}\) particles are obtained from the decay of the Higgs-like boson produced by gluon fusion, \( gg \rightarrow h^0 \rightarrow \tilde{\chi }^{0}_{1} \tilde{\chi }^{0}_{1} \). For the second process, the LLP is produced in a non-resonant mode, \( q\bar{q} \rightarrow \tilde{\chi }^{0}_{1} X \). Here X is a stable neutral particle with the same mass as that of the \(\tilde{\chi }^{0}_{1}\) state. The production of an LLP in association with a stable particle X is included to probe the sensitivity to this topology, in which the signal LLP recoils against such a particle.

The LLP decays into a muon and two quarks; the branching ratio of \(\tilde{\chi }^{0}_{1} \rightarrow \mu ^{+} q_{i} q_{j}( \mu ^{-} \bar{q_{i}} \bar{q_{j}})\) is set to be equal for each quark combination (\(q_{i} = u,c\) and \(q_{j} = \bar{d},\bar{s},\bar{b}\)), with an equal proportion of \(\mu ^{+}\) and \(\mu ^{-}\).

In the following, the model name is indicated by the values of \(m_{h^0}\), \(m_{\tilde{\chi }^{0}_{1}}\) and \(\tau _{\tilde{\chi }^{0}_{1}}\); h125-chi40-10ps, for example, corresponds to \(m_{h^0} =125\text {\,GeV\!/}c^2 \), \(m_{\tilde{\chi }^{0}_{1}}=40\text {\,GeV\!/}c^2 \), \(\tau _{\tilde{\chi }^{0}_{1}}=10\text {\,ps} \). For the direct production, the Higgs mass is omitted from this notation, such as for example in chi30-10ps.
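The naming convention can be made concrete with a small parser. This is a sketch only: the function name, the dictionary keys and the assumption that all values are integers are illustrative choices, not part of the analysis software.

```python
import re

# Optional "h<mass>-" prefix, then "chi<mass>-<lifetime>ps".
_MODEL_RE = re.compile(r"^(?:h(?P<mh>\d+)-)?chi(?P<mchi>\d+)-(?P<tau>\d+)ps$")


def parse_model_name(name):
    """Split a model name into h0 mass, LLP mass (GeV/c^2) and lifetime (ps)."""
    match = _MODEL_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised model name: {name!r}")
    mh = match.group("mh")
    return {
        "m_h0": int(mh) if mh is not None else None,  # None for direct production
        "m_chi": int(match.group("mchi")),
        "tau_ps": int(match.group("tau")),
    }


print(parse_model_name("h125-chi40-10ps"))
print(parse_model_name("chi30-10ps"))
```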

The most relevant background in this analysis is from events containing heavy quarks. The background from heavy quarks directly produced in pp collisions, as well as from \({W} \), \({Z} \), Higgs boson and top quark decays, is studied using the simulation. Simulating inclusive \({b} {\overline{{b}}} \) and \({c} {\overline{{c}}} \) events is too inefficient to produce a sample large enough to cover the relevant high-\(p_{\mathrm {T}}\) muon kinematic region. Hence, a dedicated sample of \(20 \times 10^6\) (\(1 \times 10^6)\) simulated \({b} {\overline{{b}}} \) (\({c} {\overline{{c}}} \)) events has been produced with a minimum parton \(\hat{p}_\mathrm{T}\) of 20\(\text {\,GeV\!/}c\) and requiring a muon with \(p_{\mathrm {T}} >12\text {\,GeV\!/}c \) and \(1.5<\eta <5.0\). All the simulated background species are strongly suppressed by the multivariate analysis presented in the next section, so too few simulated events survive to model the remaining background. Therefore, a data-driven approach is employed for the final background estimation.

3 Signal selection

Signal events are selected by requiring a vertex displaced from any PV in the event and containing one isolated, high-\(p_{\mathrm {T}}\) muon. Due to the relatively high LLP mass, the muons from the LLP decay are expected to be more isolated than muons from hadron decays. The events from pp collisions are selected online by a trigger requiring muons with \(p_{\mathrm {T}} >10\text {\,GeV\!/}c \). The offline analysis requires that the triggering muon has an impact parameter, \(\mathrm{IP^{\mu }}\), with respect to any PV, larger than \(0.25\text {\,mm} \) and a transverse momentum, \({p^{\mu }_{\mathrm {T}}}\), larger than \(12\text {\,GeV\!/}c \). Primary and displaced vertices are reconstructed offline from charged particle tracks [31]. Genuine PVs are identified by a small radial distance from the beam axis, \(R_\mathrm{xy} <0.3\) mm. Once the set of PVs is identified, all the other vertices are candidates for the decay position of LLPs. An LLP candidate is formed by requiring three or more tracks including the muon and having an invariant mass above 4.5\(\text {\,GeV\!/}c^2\). There is no requirement for the reconstructed momentum to point to a specific PV. Particles interacting with the detector material are an important source of background. Therefore, a geometric veto is used to reject candidates with vertices in regions occupied by detector material [32]. The event preselection requires at least one PV in the event and at least one LLP candidate.
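The offline requirements listed above can be summarised in a short sketch. The class, field and function names are hypothetical, and the material veto is reduced to a boolean flag; the real analysis works on reconstructed LHCb objects rather than on this simplified record.

```python
from dataclasses import dataclass


@dataclass
class LLPCandidate:
    muon_pt_gev: float   # transverse momentum of the triggering muon
    muon_ip_mm: float    # smallest muon impact parameter w.r.t. any PV
    n_tracks: int        # tracks in the displaced vertex, muon included
    mass_gev: float      # invariant mass of the vertex tracks
    in_material: bool    # True if the vertex lies in a vetoed material region


def is_primary_vertex(r_xy_mm):
    """Genuine PVs lie close to the beam axis."""
    return r_xy_mm < 0.3


def is_llp_candidate(c):
    """Apply the muon, vertex-multiplicity, mass and material requirements."""
    return (
        c.muon_pt_gev > 12.0
        and c.muon_ip_mm > 0.25
        and c.n_tracks >= 3
        and c.mass_gev > 4.5
        and not c.in_material
    )


def passes_preselection(n_pvs, candidates):
    """At least one PV and at least one LLP candidate in the event."""
    return n_pvs >= 1 and any(is_llp_candidate(c) for c in candidates)
```

For example, `passes_preselection(1, [LLPCandidate(15.0, 0.4, 4, 6.2, False)])` evaluates to `True`, while the same candidate with `in_material=True` is rejected.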

Figure 2 compares the distributions from data and from the simulated \({b} {\overline{{b}}} \) events for the relevant observables, after preselection. For illustration the shapes of simulated h125-chi40-10ps events are also superimposed. The effect of the geometric veto is visible in the \(R_\mathrm{xy}\) distribution, for candidates with \(R_\mathrm{xy}\) above 5\(\text {\,mm}\). From simulation, the veto introduces a loss of efficiency of 3% (27%) for the detection of LLPs with a 50\(\text {\,GeV\!/}c^2\) mass and a 10 ps (200 ps) lifetime, for \(m_{h^0} =125\text {\,GeV\!/}c^2 \). The muon-isolation variable is defined as the sum of the energy of tracks surrounding the muon direction, including the muon itself, in a cone of radius \(R_{\eta \phi }= 0.3\) in the pseudorapidity-azimuthal \((\eta , \phi )\) space, divided by the energy of the muon track. The radius is reduced to \(R_{\eta \phi }= 0.2\) when the theoretical hypothesis assumes an LLP mass of 10\(\text {\,GeV\!/}c^2\), to account for the reduced aperture of the jet of particles produced by the LLP decay. A muon-isolation value of unity denotes a fully isolated muon. In simulation the muon from the signal is found to be more isolated than the hadronic background. The variables \(\sigma _R\) and \(\sigma _Z\) are the vertex uncertainties in the radial direction and in the z direction respectively.
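The muon-isolation definition above can be sketched as follows. Tracks are represented here as plain (energy, η, φ) tuples, and the function names are assumptions made for illustration.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in (eta, phi) space, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def muon_isolation(muon, other_tracks, cone_radius=0.3):
    """Energy in the cone around the muon (muon included) divided by the muon energy.

    `muon` and each entry of `other_tracks` are (energy, eta, phi) tuples;
    `other_tracks` must not contain the muon itself.
    """
    e_mu, eta_mu, phi_mu = muon
    cone_energy = e_mu + sum(
        e for e, eta, phi in other_tracks
        if delta_r(eta, phi, eta_mu, phi_mu) < cone_radius
    )
    return cone_energy / e_mu


muon = (50.0, 3.0, 0.0)
tracks = [(10.0, 3.1, 0.1), (20.0, 4.0, 2.0)]  # only the first lies inside the cone
print(muon_isolation(muon, tracks))                    # wider cone, R = 0.3
print(muon_isolation(muon, tracks, cone_radius=0.1))   # no neighbour inside: 1.0
```

A value of exactly 1 corresponds to no other track in the cone, matching the statement that unity denotes a fully isolated muon.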

Fig. 2

Distributions from data compared to simulated \({b} {\overline{{b}}} \) events (blue) and the simulated signal h125-chi40-10ps (red), after preselection. From a to j: muon transverse momentum; muon impact parameter; muon isolation; the calorimetric energy, \(\text {E}_{\mathrm{calorimeters}}\), associated with the muon normalised by the muon energy, \(\hbox {E}_{\mathrm{muon}}\); the number of tracks used to reconstruct the LLP vertex including the muon; the radial distance to the beam line of the reconstructed vertex; longitudinal and radial vertex fit errors, \(\sigma _Z\) and \(\sigma _R\); reconstructed transverse momentum and mass of the LLP candidate. The distributions from simulated events are normalised to the data

The reconstructed vertex mass distribution is very broad and does not peak at the neutralino mass, because the reconstructed vertex misses some of the charged-particle tracks and all of the neutral particles produced in the LLP decay.

The shapes of the distributions in Fig. 2 are all consistent with a dominant \({b} {\overline{{b}}} \) composition of the background. This is confirmed by comparing the yields in data and simulation: after preselection and requiring the isolation parameter below 1.2, the total number of LLP candidates in data is \(148\times 10^3\). The predicted background yields from \({b} {\overline{{b}}} \) and \({c} {\overline{{c}}} \) events are \((120 \pm 20)\times 10^3\) and \((14 \pm 4)\times 10^3\), respectively. Small contributions are expected from processes with \(W \), \(Z \) bosons plus jets, top and Standard Model Higgs events: 260, 20, 2, and 1 candidates, respectively. The \({b} {\overline{{b}}} \) and \({c} {\overline{{c}}} \) prediction uses the cross-sections measured by the LHCb experiment at 13 TeV [33, 34]. The acceptance of this analysis is computed with MadGraph5-aMC@NLO [35] and the detection efficiency is obtained from simulated events. As already stated, these background estimations are only used for cross-checks.
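The yield comparison above can be checked with simple arithmetic. The sketch below sums the quoted predictions and, as an assumption not stated in the text, treats the \(b\bar{b}\) and \(c\bar{c}\) uncertainties as uncorrelated and neglects all other uncertainties.

```python
import math

n_data = 148e3                      # LLP candidates in data
n_bb, err_bb = 120e3, 20e3          # predicted b-bbar yield and uncertainty
n_cc, err_cc = 14e3, 4e3            # predicted c-cbar yield and uncertainty
n_minor = 260 + 20 + 2 + 1          # W, Z, top and SM Higgs contributions

n_pred = n_bb + n_cc + n_minor
err_pred = math.hypot(err_bb, err_cc)   # quadrature sum (assumed uncorrelated)
pull = (n_data - n_pred) / err_pred

print(f"predicted: {n_pred:.0f} +- {err_pred:.0f}")
print(f"data - prediction = {n_data - n_pred:.0f} ({pull:.2f} sigma)")
```

Under these assumptions the data exceed the summed prediction by less than one standard deviation, consistent with the statement that the background is dominated by \(b\bar{b}\) events.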

A multivariate analysis (MVA) based on a boosted decision tree [36, 37] is used to further purify the data sample. Ten MVA input variables are selected to optimise the signal–background separation. They are: \({p^{\mu }_{\mathrm {T}}}\) and \(\mathrm{IP^{\mu }}\), the ratio of the energies associated with the muon measured in ECAL and HCAL normalised to the muon energy, the LLP candidate \(p_{\mathrm {T}}\), its pseudorapidity, the number of tracks forming the LLP, the vertex uncertainties \(\sigma _R\) and \(\sigma _Z\), and the vertex \(R_\mathrm{xy}\) distance.

Larger vertex uncertainties are expected on the vertices of candidates from \({b} {\overline{{b}}} \) events compared to signal LLPs. The former are more boosted and produce more collimated tracks, while the relatively heavier signal LLPs decay into more divergent tracks. This effect decreases when the mass of the LLP approaches the mass of \(b \)-quark hadrons. The selection based on the energy deposit in the calorimeters efficiently suppresses the background due to kaons or pions punching through the calorimeters and being misidentified as muons. The muon-isolation variable and the reconstructed mass of the long-lived particles are not included in the classifier; the discrimination power of these two variables is subsequently exploited for the signal determination.

The signal MVA training samples are provided by simulation. The background training sample is obtained from data, based on the hypothesis that the fraction of signal in the data after preselection is small. This automatically includes all possible background sources, with the correct relative abundance.

The training is performed independently for each simulated model. The MVA classifier is subsequently applied to the data and to the simulated signal. For each model, the optimal MVA cut value is chosen by an iterative minimisation procedure to give the best expected cross-section upper limit, while keeping at least ten candidates to allow the invariant-mass fit to work properly.

The classifier can be biased by the presence of signal in data used as background training set. To quantify the potential bias, the MVA training is performed adding a fraction of simulated signal events (up to 5%) to the background set. This test demonstrates a negligible effect on the MVA performance for all the signal models.

Fig. 3

Reconstructed invariant mass of the LLP candidates. Subfigures a, c, and e correspond to the signal selections which assume the models h80-chi30-10ps, h200-chi20-10ps, and the non-resonant model chi30-10ps, respectively. Subfigures b, d, and f are the corresponding distributions for candidates selected in the background region. The results of the fits are superimposed

4 Determination of the signal yield

The signal yield is determined with an unbinned extended maximum-likelihood fit to the distribution of the reconstructed LLP mass. The shape of the signal component is taken from the simulated models, and a background component is added. After the MVA selection, no simulated background survives, therefore the background shape is determined by a data-driven method, which also avoids potential simulation mismodeling of the reconstructed mass. The data candidates are separated into a signal region with muon isolation below 1.2 and a background region with isolation values from 1.4 to 2.0. The signal-region selection accepts more than 80% of the signal for all the models considered (see e.g. Fig. 2). Any potential signal yield in the background region is considered negligible. The reconstructed mass distribution obtained from the background candidates is used to constrain an empirical probability density function (PDF) consisting of the sum of two negative-slope exponential functions, one of them convolved with a Gaussian function. Shape parameters and amplitudes are left to vary in the fit. It is possible that the mass distribution obtained after selection of the background region does not represent exactly the background component in the signal region. Hence, a correction is applied before performing the fit: the mass distribution selected in the background region is weighted with weights deduced from the comparison of the candidate mass distributions of signal and background regions obtained from data with a relaxed MVA selection. This relaxed selection is required to have sufficiently populated samples and to minimise the correlation with the final distributions from which signal yields are obtained. The consistency of this procedure is tested on \({b} {\overline{{b}}} \) simulated events.
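The empirical background shape described above, two falling exponentials with one of them convolved with a Gaussian, can be written in closed form, since an exponential convolved with a Gaussian is the exponentially modified Gaussian. The sketch below is one possible parametrisation and not the one used in the analysis: the variable x (a mass measured from an arbitrary origin), the mixing fraction, the zero-mean Gaussian and all parameter values are assumptions, and the functions are normalised on the full real line rather than on the fit range.

```python
import math


def exp_pdf(x, lam):
    """Falling exponential, lam * exp(-lam * x), for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0


def exp_conv_gauss_pdf(x, lam, sigma):
    """Falling exponential convolved with a zero-mean Gaussian of width sigma."""
    arg = (lam * sigma * sigma - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(0.5 * lam * (lam * sigma * sigma - 2.0 * x)) * math.erfc(arg)


def background_pdf(x, frac, lam1, lam2, sigma):
    """Mixture of a plain exponential and a Gaussian-smeared exponential."""
    return frac * exp_pdf(x, lam1) + (1.0 - frac) * exp_conv_gauss_pdf(x, lam2, sigma)


# Hypothetical parameter values, for illustration only.
for x in (0.0, 5.0, 20.0):
    print(x, background_pdf(x, frac=0.4, lam1=0.3, lam2=0.1, sigma=1.0))
```

In an actual fit the shape parameters and the fraction would be left free and the function normalised over the fitted mass range.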

Table 1 Signal detection efficiency, in percent, after preselection, \({\varepsilon } _{presel}\), including the geometrical acceptance, and after MVA selection, \({\varepsilon } \), and numbers of fitted signal and background events in the signal region, \(N_\mathrm{s}\) and \(N_\mathrm{b}\), for the different signal hypotheses with resonant LLP production (masses are in \(\text {\,GeV\!/}c^2\), lifetimes in \(\text {\,ps}\)). The last column gives the value of \(\chi ^2\) per degree of freedom, ndf, from the fit
Table 2 Signal detection efficiency, in percent, after preselection, \({\varepsilon } _{presel}\), including the geometrical acceptance, and after MVA selection, \({\varepsilon } \), and numbers of fitted signal and background events in the signal region, \(N_\mathrm{s}\) and \(N_\mathrm{b}\), for the different signal hypotheses with non-resonant LLP production (masses are in \(\text {\,GeV\!/}c^2\), lifetimes in \(\text {\,ps}\)). The last column gives the value of \(\chi ^2\) per degree of freedom, ndf, from the fit

Examples of the invariant mass of the selected LLP candidates are shown in Fig. 3 for the signal and background regions. The invariant-mass fit is performed simultaneously on LLP candidates from the signal and from the background regions. In the former, the numbers of signal and background events are free parameters of the fit. The results of the fit are shown in the figure. The sensitivity of the fit procedure is studied by adding a small number of simulated signal events to the data according to a given signal model. The fitted yields are on average consistent with the numbers of added events. The fitted signal yields, given in Tables 1 and 2, are compatible with the background-only hypothesis for all the theoretical models.

5 Detection efficiency and systematic uncertainties

The detection efficiency required in the calculation of the signal yield is estimated from the simulated signal events. The efficiencies after preselection and after MVA selection are shown in Tables 1 and 2, for the considered models of resonant and non-resonant LLP productions, respectively. The values include the geometrical acceptance. Several phenomena compete to determine the detection efficiency. In general the efficiency after preselection increases with the LLP mass because more particles are produced in the decay of heavier LLPs. There is a loss of particles outside the spectrometer acceptance, especially when the LLPs are produced from the decay of heavier states, such as the Higgs-like particle. In addition, the lower boost of heavier LLPs results in a shorter average flight length, which is disfavoured by the requirement of a minimum \(R_\mathrm{xy}\) value. With increasing LLP lifetimes a larger portion of the decays falls into the material region and is vetoed. Finally, a drop of sensitivity is expected for LLPs with a lifetime close to the \(b \)-hadron lifetimes, where the contamination from \({b} {\overline{{b}}} \) events becomes even more important, especially for low-mass LLPs. The detection efficiency is reduced by up to one order of magnitude after the optimised MVA selection while the background is reduced by 3–4 orders of magnitude.

Table 3 Contributions to the relative systematic uncertainties. The indicated ranges cover the theoretical models considered. The contributions from the signal and background models used in the LLP mass fit are treated separately

A breakdown of the relative systematic uncertainties is shown in Table 3. The uncertainties of the partonic luminosity depend upon the process considered; they are estimated following the procedure explained in Refs. [38, 39] and vary from 3% up to 6%, the latter being found for the gluon fusion process. The integrated luminosity [40] contributes an uncertainty of 2%. The statistical precision of the efficiencies determined from simulation is in the range 2–4% for the different models. Different sources of systematic uncertainty arising from discrepancies between data and simulation have been considered. The size of those discrepancies for the relevant observables is inferred from a comparison of the distributions obtained from data and from \({b} {\overline{{b}}} \) simulated events, which describe the data almost completely, or from other calibration processes.

Table 4 Upper limits at 95% CL on the production cross-section times branching ratio for signal models with a resonant production via a Higgs-like boson. Masses are given in \(\text {\,GeV\!/}c^2\), lifetimes in \(\text {\,ps}\), cross-sections in \(\text {\,pb}\)

The muon detection efficiency, including trigger, tracking, and muon identification efficiencies, is studied by a tag-and-probe technique applied to muons from \(J/\psi \rightarrow {\mu ^+\mu ^-} \), \({\varUpsilon } (1S) \rightarrow {\mu ^+\mu ^-} \) and \({Z} \rightarrow {\mu ^+\mu ^-} \) decays. The corresponding systematic effects due to differences between data and simulation are estimated to be between 2 and 3.7%, depending on the theoretical model considered.

A comparison of the simulated and observed \(p_{\mathrm {T}}\) distributions of muons from \({Z} \rightarrow {\mu ^+\mu ^-} \) decays shows a maximum difference of 0.2\(\text {\,GeV\!/}c\) in the selected region; this difference is propagated to the LLP analysis by shifting the muon \(p_{\mathrm {T}}\) threshold by the same amount. The corresponding systematic uncertainty is below 1% for all models under consideration.

Table 5 Upper limits at 95% CL on the production cross-section times branching ratio for signal models with a non-resonant production. Masses are given in \(\text {\,GeV\!/}c^2\), lifetimes in \(\text {\,ps}\), cross-sections in \(\text {\,pb}\)
Fig. 4

Expected (open dots and \(1\sigma \) and \(2\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits (95% CL) as a function of \(\hbox {m}_{\tilde{\chi }^{0}_{1}}\) for the resonant production processes with \(m_{h^0} =125\text {\,GeV\!/}c^2 \), and, from a to g, \(\tau _{\tilde{\chi }^{0}_{1}}\) of 5, 10, 20, 30, 50, 100, and 200\(\text {\,ps}\)

Fig. 5

Expected (open dots and 1\(\sigma \) and 2\(\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits (95% CL) as a function of \(\tau _{\tilde{\chi }^{0}_{1}}\) for the resonant production with \(m_{h^0} =125\text {\,GeV\!/}c^2 \), and, from a to e, \(m_{\tilde{\chi }^{0}_{1}}\) of 20, 30, 40, 50, and 60\(\text {\,GeV\!/}c^2\)

Fig. 6

Expected (open dots and 1\(\sigma \) and 2\(\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits (95% CL) as a function of \(m_{h^0}\) and, from a to f, \(m_{\tilde{\chi }^{0}_{1}}\) of 10, 20, 30, 40, 50, and 60\(\text {\,GeV\!/}c^2\)

Fig. 7

Expected (open dots and 1\(\sigma \) and 2\(\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits (95% CL), a: as a function of \(\tau _{\tilde{\chi }^{0}_{1}}\) with \(m_{\tilde{\chi }^{0}_{1}}=30\text {\,GeV\!/}c^2 \), b: as a function of \(m_{\tilde{\chi }^{0}_{1}}\) with \(\tau _{\tilde{\chi }^{0}_{1}}=10\text {\,ps} \). The processes are from direct, non-resonant, LLP production

The muon impact-parameter distribution is also studied from \({Z} \) decays and shows a discrepancy between data and simulation of about 10\(\,\upmu \text {m}\) close to the \({p^{\mu }_{\mathrm {T}}}\) threshold. By changing the minimum \(\mathrm{IP^{\mu }}\) requirement by this amount, the change in the detection efficiency is below 1% for all the models.

The vertex reconstruction efficiency has a complicated spatial structure due to the geometry of the VELO and the material veto. Uncertainties in the estimated vertex-finding efficiency are due to the per-track efficiency, track resolution, and differences in the contribution from background tracks due to the underlying interaction and pile-up. In the material-free region, \(R_\mathrm{xy} < 4.5\) mm, the efficiency as a function of the flight distance has been studied in the context of lifetime measurements [41], showing that the simulation reproduces the data within 1%. In the region \(R_\mathrm{xy} > 4.5\) mm a deviation of less than 6% is inferred from the study of inclusive \({b} {\overline{{b}}} \) events in data and simulation. By altering the efficiency in the simulation program as a function of the true vertex position, the effect on the LLP detection efficiency is estimated to be 1–2%. A second method to determine this contribution uses vertices from \(B^0 \rightarrow {{J /\psi }} {{K} ^{*0}} \) decays with \({{J /\psi }} \!\rightarrow {\mu ^+\mu ^-} \) and \({{K} ^{*0}} \rightarrow K^+ \pi ^-\). For this process the vertex detection efficiencies in data and simulation agree within 10%. This result, obtained from a process with four final-state particles, is propagated to the LLP decay into a larger number of charged particle tracks and a detection threshold of three tracks. A discrepancy of at most 2% between the LLP efficiency in data and simulation is found, which is adopted as a contribution to the systematic detection uncertainty.

The uncertainty on the position of the beam line in the transverse plane is less than 20\(\,\upmu \text {m}\)  [16]. It can affect the secondary-vertex selection, mainly via the requirement on \(R_\mathrm{xy}\). By altering the PV position in simulated signal events, the effect is estimated to be below 1%.

The effect of the imperfect modelling on the observables used in the MVA training is estimated with pseudoexperiments. As previously stated, the bias on each input variable is determined by comparing simulated and experimental distributions of muons and LLP candidates from \(Z \) and \(W \) events, as well as from \({b} {\overline{{b}}} \) events. At the MVA test stage, each input variable is modified by a scale factor randomly selected from a Gaussian distribution of width equal to the corresponding bias. The standard deviation of the signal efficiency distribution is taken as a systematic uncertainty.
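The pseudoexperiment procedure can be illustrated with a deliberately simplified toy. In the sketch below a single threshold cut stands in for the MVA classifier, only one input variable is smeared, the Gaussian scale factor is assumed to be centred on 1, and the toy signal sample and all names are hypothetical.

```python
import random
import statistics


def efficiency(values, threshold):
    """Fraction of events passing a simple threshold (stand-in for the MVA cut)."""
    return sum(v > threshold for v in values) / len(values)


def efficiency_spread(values, threshold, relative_bias, n_toys=500, seed=1):
    """Standard deviation of the efficiency over pseudoexperiments.

    In each pseudoexperiment the variable is multiplied by a scale factor
    drawn from a Gaussian of mean 1 and width `relative_bias`.
    """
    rng = random.Random(seed)
    effs = []
    for _ in range(n_toys):
        scale = rng.gauss(1.0, relative_bias)
        effs.append(efficiency([v * scale for v in values], threshold))
    return statistics.pstdev(effs)


# Toy signal sample: muon pT values drawn uniformly between 10 and 30 GeV/c.
toy_rng = random.Random(0)
toy_pt = [toy_rng.uniform(10.0, 30.0) for _ in range(1000)]
print("nominal efficiency:", efficiency(toy_pt, 12.0))
print("systematic spread :", efficiency_spread(toy_pt, 12.0, relative_bias=0.05))
```

The returned spread plays the role of the systematic uncertainty on the signal efficiency; in the full procedure every MVA input is varied and the classifier response is recomputed.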

The signal and background samples are obtained through a selection on the muon isolation parameter. By a comparison of data and muons from simulated \({b} {\overline{{b}}} \) events, the maximum uncertainty on this variable is estimated to be \(\pm 0.015\) in the proximity of the thresholds, with a maximal effect on the efficiency of 1.7%.

Comparing the mass distributions of \({b} {\overline{{b}}} \) and \(Z\rightarrow {{b} {\overline{{b}}}} \) events, a maximum mass-scale discrepancy between data and simulated events of 10% is estimated in the proximity of the threshold, which translates into a 1.4% contribution to the detection efficiency uncertainty.

Finally, the total systematic uncertainty is obtained as the sum in quadrature of all contributions, where the different components of the detection efficiency are assumed to be fully correlated.
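One common way to implement such a combination is to add fully correlated components linearly and then combine the result in quadrature with the independent sources. The sketch below follows that convention, which is an assumption about how the statement above translates into arithmetic. The input numbers are upper values quoted in the text where available (the MVA-modelling term is omitted because no value is quoted), so the output is illustrative and is not a result of the analysis.

```python
import math


def combine_uncertainties(correlated, uncorrelated):
    """Linear sum of fully correlated terms, then quadrature sum with the rest."""
    correlated_total = sum(correlated)
    return math.sqrt(correlated_total ** 2 + sum(u * u for u in uncorrelated))


# Relative uncertainties in percent (illustrative upper values).
detection_components = [3.7, 1.0, 1.0, 2.0, 1.0, 1.7, 1.4]
other_sources = [6.0, 2.0, 4.0]  # partonic luminosity, luminosity, simulation statistics

total = combine_uncertainties(detection_components, other_sources)
print(f"illustrative total: {total:.1f}%")
```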

The choice of the signal and background invariant-mass templates can affect the results of the LLP mass fits. The uncertainty due to the signal model accounts for the mass scale and the mass resolution. The mass scale and resolution discrepancies between data and simulation are below 1% and 1.5% respectively, as obtained from \({b} {\overline{{b}}} \) and \(Z \rightarrow {{b} {\overline{{b}}}} \) events. Pseudoexperiments are used to estimate the effect on the cross-section calculation. For each theoretical model, ten simulated signal events are added to the selected data after a Gaussian smearing or after changing the mass scale. The average deviation of the observed upper limits with respect to the one obtained from the default signal and background distributions is below 2%.

The background shape is deduced from data selected in the poorly isolated region after reweighting, with weights inferred from the data distributions obtained with relaxed selection criteria. The overall uncertainty is estimated by halving the weights and running pseudoexperiments as before. The average deviation of the observed upper limits is below 14%.

6 Results

The 95% confidence level (CL) upper limits, expected and observed, on the production cross-sections times branching fraction are computed for each model using the \(\text {CL}_\mathrm{s}\) approach [42]. Statistical and systematic uncertainties on the signal efficiencies are included as nuisance parameters of the likelihood function, assuming Gaussian distributions. Finally, the upper limit values are corrected by the factors which account for the imperfect modelling of signal and background templates.
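The idea behind the method can be illustrated on a much simpler case than the likelihood used here: a single-bin Poisson counting experiment without nuisance parameters. The sketch below is only that simplified illustration; the function names are hypothetical and it does not reproduce the fit-based procedure of this analysis.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution of mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls_value(n_obs, s, b):
    """CLs = CL(s+b) / CL(b) for a counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def cls_upper_limit(n_obs, b, cl=0.95, tol=1e-6):
    """Signal yield s at which CLs drops to 1 - cl, found by bisection."""
    lo, hi = 0.0, 10.0
    while cls_value(n_obs, hi, b) > 1.0 - cl:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls_value(n_obs, mid, b) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(cls_upper_limit(n_obs=0, b=0.0))  # analytic value is -ln(0.05)
```

With zero observed events the limit is about three signal events regardless of the expected background, a well-known property of the method.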

The numerical results for all the models are given in Tables 4 and 5. Figures 4, 5, 6 and 7 show the measured cross-section times branching ratio upper limits, for different theoretical models. The decrease of sensitivity for relatively low LLP mass values is explained by the above-mentioned effects on the detection efficiency. The upper limits for the processes with \(m_{h^0} =125\text {\,GeV\!/}c^2 \) can be compared to the prediction of the Standard Model Higgs production cross-section from gluon fusion of about 46\(\text {\,pb}\) at \(\sqrt{s} =13\text {\,TeV} \) [43].
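To relate a cross-section limit to an event count, recall that the expected signal yield is N = σ × B × L × ε, with L the integrated luminosity and ε the detection efficiency. The sketch below uses the 5.4 fb⁻¹ and 46 pb figures from the text, while the 0.1 pb limit and the 1% efficiency are placeholder inputs chosen for illustration and are not results from the tables.

```python
PB_PER_FB = 1000.0  # 1 fb^-1 = 1000 pb^-1


def expected_signal_events(sigma_times_br_pb, lumi_fb, efficiency):
    """N = (sigma x B) x L x epsilon, with the luminosity converted to pb^-1."""
    return sigma_times_br_pb * lumi_fb * PB_PER_FB * efficiency


def limit_relative_to_sm_higgs(sigma_times_br_limit_pb, sigma_sm_pb=46.0):
    """Cross-section limit expressed as a fraction of the SM gluon-fusion cross-section."""
    return sigma_times_br_limit_pb / sigma_sm_pb


# Placeholder inputs: a 0.1 pb limit and a 1% detection efficiency.
print(expected_signal_events(0.1, 5.4, 0.01))
print(limit_relative_to_sm_higgs(0.1))
```

With these placeholder inputs a 0.1 pb limit corresponds to a handful of signal events, and to a fraction of a percent of the SM gluon-fusion rate.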

7 Conclusion

Long-lived massive particles decaying into a muon and two quarks have been searched for using proton-proton collision data collected by the LHCb experiment at \(\sqrt{s} =13\text {\,TeV} \), corresponding to an integrated luminosity of 5.4 \(\text {\,fb} ^{-1}\). The LLP lifetime range considered is from 5 to 200 ps. The background is dominated by \({b} {\overline{{b}}} \) events and is reduced by tight selection requirements, including a dedicated multivariate classifier. The signal yield is determined by a fit to the LLP reconstructed mass with a signal shape inferred from the theoretical models.

The forward acceptance of the LHCb experiment makes it complementary to other LHC experiments, while its low trigger \(p_{\mathrm {T}}\) threshold allows exploring relatively small LLP masses. Two types of LLP production have been assumed. In the first, a Higgs-like particle is produced by gluon fusion and decays into two LLPs. The analysis covers Higgs-like boson masses from 30 to 200\(\text {\,GeV\!/}c^2\), and an LLP mass range from 10\(\text {\,GeV\!/}c^2\) up to about one half of the mass of the parent boson. The second mode is a direct LLP production from quark interactions, covering the LLP mass range from 10 up to 90\(\text {\,GeV\!/}c^2\).

The results for all theoretical models considered are compatible with the background-only hypothesis. The upper limits at 95% CL set on the cross-section times branching fractions are mostly of O(0.1\(\text {\,pb}\)), but the sensitivity is limited to O(10\(\text {\,pb}\)) for the lowest LLP mass value considered of 10\(\text {\,GeV\!/}c^2\).