1 Introduction

Supersymmetry (SUSY) is one of the most popular extensions of the Standard Model, which solves the hierarchy problem, can unify the gauge couplings and could provide dark matter candidates. The minimal supersymmetric standard model (MSSM) is the simplest phenomenologically viable realization of SUSY [1, 2]. The present study focuses on a subset of models featuring massive long-lived particles (LLP) with a measurable flight distance [3, 4]. LLP searches have been performed by Tevatron and LHC experiments [5,6,7,8,9,10,11,12], often using the Hidden Valley framework [4] as a benchmark model (see also the study of Ref. [13]). The LHCb detector probes the forward rapidity region which is only partially covered by the other LHC experiments, and triggers on particles with low transverse momenta, which allows the experiment to explore relatively small LLP masses.

In this paper a search for massive long-lived particles is presented, using proton-proton collision data collected by the LHCb detector at \(\sqrt{s} =7\) and 8 TeV, corresponding to integrated luminosities of 1 and 2 \(\,\hbox {fb}^{-1}\), respectively. The event topology considered in this study is a displaced vertex with several tracks including a high \(p_{\mathrm { T}}\) muon. This topology is found in the context of the minimal super-gravity (mSUGRA) realisation of the MSSM, with R-parity violation [14], in which the neutralino can decay into a muon and two jets. Neutralinos can be produced by a variety of processes. In this paper four simple production mechanisms with representative topologies and kinematics are considered, with the assumed LLP mass in the range 20–80 \(\mathrm{{GeV}}/c^2\). The LLP lifetime range considered is 5–100\({\,\mathrm{{ps}}}\), i.e. larger than the typical b-hadron lifetime. It corresponds to an average flight distance of up to 30\(\mathrm { \,cm}\), well inside the LHCb vertex detector. One of the production mechanisms considered in detail is the decay into two LLPs of a Higgs-like particle with an assumed mass between 50 and 130 \(\mathrm{{GeV}}/c^2\), i.e. in a range which includes the mass of the scalar boson discovered by the ATLAS and CMS experiments [15, 16]. In addition, inclusive analyses are performed assuming the full set of neutralino production mechanisms available in Pythia  6 [17]. In this case the LLP mass explored is in the range 23–198 \(\mathrm{{GeV}}/c^2\), inspired by Ref. [13], and different combinations of gluino and squark masses are studied.
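The relation between the lifetime range and the quoted flight distance can be illustrated with the mean laboratory-frame decay length, \(L = \beta \gamma c\tau \), with \(\beta \gamma = p/m\). The Python sketch below is only indicative: the function name and the momentum value are hypothetical choices made for this illustration and are not taken from the analysis.

```python
# Speed of light expressed in cm per ps (299792458 m/s).
C_CM_PER_PS = 0.0299792458


def mean_flight_distance_cm(mass_gev, momentum_gev, lifetime_ps):
    """Mean lab-frame decay length L = beta*gamma*c*tau, with beta*gamma = p/m."""
    beta_gamma = momentum_gev / mass_gev
    return beta_gamma * C_CM_PER_PS * lifetime_ps


# Hypothetical kinematics: a 20 GeV/c^2 LLP carrying 200 GeV/c of momentum,
# i.e. beta*gamma = 10 (an assumption, not a value from the paper).
print(round(mean_flight_distance_cm(20.0, 200.0, 100.0), 1))  # 30.0 (cm)
print(round(mean_flight_distance_cm(20.0, 200.0, 5.0), 1))    # 1.5 (cm)
```

Under this assumed boost, the upper end of the lifetime range gives a mean decay length of the order of the 30 cm quoted in the text.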

2 Detector description

The LHCb detector [18, 19] is a single-arm forward spectrometer covering the pseudorapidity range \(2<\eta <5\), designed for the study of particles containing \(b \) or \(\mathrm {c} \) quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region (VELO), a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about \(4{\mathrm {\,Tm}}\), and three stations of silicon-strip detectors and straw drift tubes, placed downstream of the magnet. The tracking system provides a measurement of momentum, \({p}\), of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200\({\mathrm {\,GeV}/c}\). The minimum distance of a track to a primary vertex (PV), the impact parameter \(d_\mathrm{{IP}}\), is measured with a resolution of \((15+29/p_{\mathrm { T}}){\,\upmu \mathrm {m}} \), where \(p_{\mathrm { T}}\) is the component of the momentum transverse to the beam axis, in \({\mathrm {\,GeV}/c}\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The online event selection is performed by a trigger [20], which consists of a hardware stage based on information from the calorimeter and muon systems, followed by a software stage which runs a simplified version of the offline event reconstruction.
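As a numerical illustration of the impact parameter resolution quoted above, the following minimal Python sketch evaluates \((15+29/p_{\mathrm { T}})\) in micrometres for two transverse momenta. The function name and the chosen \(p_{\mathrm { T}}\) values are hypothetical.

```python
def ip_resolution_um(pt_gev):
    """Impact parameter resolution (15 + 29/pT) in micrometres, pT in GeV/c."""
    return 15.0 + 29.0 / pt_gev


for pt in (1.0, 12.0):
    print(pt, round(ip_resolution_um(pt), 1))
# 1.0 44.0
# 12.0 17.4
```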

3 Event generation and detector simulation

Several sets of simulated events are used to design and optimize the signal selection and to estimate the detection efficiency. Proton-proton collisions are generated in Pythia  6 with a specific LHCb configuration [21], and with parton density functions taken from CTEQ6L [22]. The LLP signal in this framework is represented by the lightest neutralino \(\tilde{\chi }^{0}_{1}\), with mass \(m_\mathrm{{LLP}}\) and lifetime \(\tau _\mathrm{{LLP}}\). It is allowed to decay into two quarks and a muon. Decays to all quark pairs are assumed to have identical branching fractions except for those involving a top quark, which are neglected.

Two separate detector simulations are used to produce signal models: a full simulation, where the interaction of the generated particles with the detector is based on Geant4 [23,24,25], and a fast simulation. In Geant4, the detector and its response are implemented as described in Ref. [26]. In the fast simulation, which is used to cover a broader parameter space of the theoretical models, the charged particles falling into the geometrical acceptance of the detector are processed by the vertex reconstruction algorithm. The simulation accounts for the effects of the material veto described in the next section. The program also provides parameterised particle momentum resolutions, but it is found that these resolutions have no significant impact on the LLP mass reconstruction, nor on the signal detection efficiency. The fast simulation is validated by comparison with the full simulation. The distributions of the mass, momentum and transverse momentum of the reconstructed LLP and of the reconstructed decay vertex position are in excellent agreement, as are those of the muon momentum and its impact parameter with respect to the PV. The detection efficiencies predicted by the full and the fast simulation differ by less than 5%.

Fig. 1

Four topologies considered as representative LLP production mechanisms: \(P\!A\) non-resonant direct double LLP production, \(P\!B\) single LLP production, \(P\!\,C\) double LLP production from the decay of a Higgs-like boson, \(P\!D\) double LLP indirect production via squarks

Two LLP production scenarios are considered. In the first, the signal samples are generated assuming the full set of neutralino production processes available in Pythia. In particular, nine models are fully simulated with the parameters given in the Appendix, Table 4. Other points in the parameter space of the theoretical models are studied with the fast simulation, covering the \(m_\mathrm{{LLP}}\) range 23–198 \(\mathrm{{GeV}}/c^2\). These models are referred to as “LV” (for lepton number violation) followed by the LLP mass in  \(\mathrm{{GeV}}/c^2\) and lifetime (e.g. LV98 10\({\,\mathrm{{ps}}}\)). For the second scenario, the four production mechanisms depicted in Fig. 1, labelled \(P\!A\), \(P\!B\), \(P\!\,C\), and \(P\!D\), are selected and studied independently with the fast simulation. The LLP, represented by the neutralino, subsequently decays into two quarks and a muon. The processes \(P\!A\), \(P\!\,C\), and \(P\!D\) have two LLPs in the final state. In processes \(P\!\,C\) and \(P\!D\) two LLPs are produced by the decay of a Higgs-like particle of mass \(m_\mathrm{h^0}\), and by the decay of squarks of mass \(m_{\tilde{\mathrm{q}}}\), respectively. In process \(P\!B\) a single LLP is produced recoiling against an object labelled as a “gluino”, of mass \(m_{\text {``}\tilde{\mathrm{g}}\text {''}}\). In order to control the kinematic conditions, the particles generated in these processes are constrained to be on-shell and the “gluino” of option \(P\!B\) is stable. Since LHCb is most sensitive to relatively low LLP masses, only \(m_\mathrm{{LLP}}\) values below 80 \(\mathrm{{GeV}}/c^2\) are considered.

The background from direct production of heavy quarks, as well as from \({\mathrm {W}} \) and \({\mathrm {Z}} \) boson decays, is studied using the full simulation. A sample of \(9 \times 10^6\) inclusive \({\mathrm {c}} {\overline{{\mathrm {c}}}} \) events with at least two \(\mathrm {c} \) hadrons in \(1.5<\eta <5.0\), and another sample of about \(5 \times 10^5\) \({\mathrm {t}} {\overline{{\mathrm {t}}}} \) events with at least one muon in \(1.5<\eta <5.0\) and \(p_{\mathrm { T}} >10{\mathrm {\,GeV}/c} \) were produced. Several million simulated events are available with production of \({\mathrm {W}} \) and \({\mathrm {Z}} \) bosons. The most relevant background in this analysis is from \({b} {\overline{{b}}} \) events. The available simulated inclusive \({b} {\overline{{b}}} \) events are not numerous enough to cover the high-\(p_{\mathrm { T}}\) muon kinematic region required in this analysis. To enhance the \({b} {\overline{{b}}} \) background statistics, a dedicated sample of \(2.14 \times 10^5\) simulated events has been produced with a minimum parton \(\hat{p}_\mathrm{T}\) of 20\({\mathrm {\,GeV}/c}\) and requiring a muon with \(p_{\mathrm { T}} >12{\mathrm {\,GeV}/c} \) in \(1.5<\eta <5.0\). As a consequence of limitations in the available computing power, only \({b} {\overline{{b}}} \) events with \(\sqrt{s} =7\,\mathrm{{TeV}}\) have been fully simulated. Despite the considerable increase of generation efficiency, all the simulated \({b} {\overline{{b}}} \) events are rejected by the multivariate analysis presented in the next section. Therefore a data-driven approach is employed for the final background estimation.

4 Event selection

Signal events are selected by requiring a displaced high-multiplicity vertex with one associated isolated high-\(p_{\mathrm { T}}\) muon, since, due to the larger particle mass, muons from LLP decays are expected to have larger transverse momenta and to be more isolated than muons from hadron decays.

The events from pp collisions are selected online by a trigger requiring muons with \(p_{\mathrm { T}} >10{\mathrm {\,GeV}/c} \). Primary vertices and displaced vertices are reconstructed offline from charged particle tracks [27] with a minimum reconstructed \(p_{\mathrm { T}}\) of 100\({\mathrm {\,MeV}/c}\). Genuine PVs are identified by a small radial distance from the beam axis, \(R_\mathrm{{xy}} <0.3\) mm. The offline analysis requires that the triggering muon has an impact parameter to all PVs of \(d_\mathrm{{IP}} >0.25\,{\mathrm {mm}} \) and \(p_{\mathrm { T}} >12\) \({\mathrm {\,GeV}/c}\). To suppress the background due to kaons or pions punching through the calorimeters and being misidentified as muons, the corresponding energy deposit in the calorimeters must be less than 4% of the muon energy. To preserve enough background events in the signal-free region for the signal determination algorithm described in Sect. 5, no isolation requirement is applied at this stage. Secondary vertices are selected by requiring \(R_\mathrm{{xy}} >0.55\) mm, at least four tracks in the forward direction (i.e. in the direction of the spectrometer) including the muon and no tracks in the backward direction. The total invariant mass of the tracks coming from a selected vertex must be larger than 4.5 \(\mathrm{{GeV}}/c^2\). Particles interacting with the detector material are an important source of background. A geometric veto is used to reject events with vertices in regions occupied by detector material [28].
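The offline requirements listed above can be collected into a single predicate. The Python sketch below is a hedged illustration only: the `Candidate` container, its field names and the boolean flag standing in for the geometric material veto are hypothetical, and only the numerical thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one displaced-vertex candidate."""
    muon_pt: float       # GeV/c
    muon_ip: float       # mm, smallest impact parameter to any PV
    muon_energy: float   # GeV
    calo_energy: float   # GeV, calorimeter deposit associated with the muon
    r_xy: float          # mm, radial distance of the vertex from the beam axis
    n_forward: int       # forward tracks in the vertex, muon included
    n_backward: int      # backward tracks in the vertex
    vertex_mass: float   # GeV/c^2, invariant mass of the vertex tracks
    in_material: bool    # True if the vertex lies in a vetoed material region


def passes_preselection(c):
    """Apply the muon and vertex requirements quoted in the text."""
    muon_ok = (
        c.muon_pt > 12.0
        and c.muon_ip > 0.25
        and c.calo_energy < 0.04 * c.muon_energy
    )
    vertex_ok = (
        c.r_xy > 0.55
        and c.n_forward >= 4
        and c.n_backward == 0
        and c.vertex_mass > 4.5
        and not c.in_material
    )
    return muon_ok and vertex_ok


good = Candidate(15.0, 0.4, 120.0, 2.0, 1.2, 6, 0, 9.0, False)
print(passes_preselection(good))  # True
```

No isolation requirement appears in the predicate, in line with the statement that isolation is not applied at this stage.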

The number of data events selected is 18 925 (53 331) in the \(7\,\mathrm{{TeV}}\) (\(\mathrm 8\,\mathrm{{TeV}}\)) datasets. Less than 1% of the events have more than one candidate vertex, in which case the candidate with the highest-\(p_{\mathrm { T}}\) muon is chosen. According to the simulation, the background is largely dominated by \({b} {\overline{{b}}} \) events, while the contribution from the decays of \({\mathrm {W}} \) and \({\mathrm {Z}} \) bosons is of the order of 10 events. All simulated \({\mathrm {c}} {\overline{{\mathrm {c}}}} \) and \({\mathrm {t}} {\overline{{\mathrm {t}}}} \) events are rejected. The \({b} {\overline{{b}}} \) cross-section value measured by LHCb, \( 288 \pm 4 \pm 48\) \(\mu \)b [29,30,31,32], predicts \((15 \pm 3)\times 10^3\) events for the \(7\,\mathrm{{TeV}}\) dataset, after selection. The value for the \(\mathrm 8\,\mathrm{{TeV}}\) dataset is \((52 \pm 10)\times 10^3\). The extrapolation of the cross-section from 7 to \(\mathrm 8\,\mathrm{{TeV}}\) is obtained from POWHEG [33,34,35], while Pythia is used to obtain the detection efficiency. The candidate yields for the two datasets are consistent with a dominant \({b} {\overline{{b}}} \) composition of the background. This is confirmed by the study of the shapes of the distributions of the relevant observables. Figure 2 compares the distributions for the \(7\,\mathrm{{TeV}}\) dataset and for the 135 simulated \({b} {\overline{{b}}} \) events surviving the selection. For illustration, the shapes of simulated LV38 10 ps signal events are superimposed on all the distributions, as well as the expected shape for LV38 50 ps on the \(R_{\rm xy}\) distribution. The muon isolation variable is defined as the sum of the energy of tracks surrounding the muon direction, including the muon itself, in a cone of radius \(R_{\eta \phi }= 0.3\) in the pseudorapidity-azimuthal angle \((\eta , \phi )\) space, divided by the energy of the muon track. 
The corresponding distribution is shown in Fig. 2b. A muon isolation value of unity denotes a fully isolated muon. As expected, the muon from the signal is found to be more isolated than the hadronic background. Figure 2e presents the radial distribution of the displaced vertices; the drop in the number of candidates with a vertex above \(R_\mathrm{{xy}} \sim 5\,{\mathrm {mm}} \) is due to the material veto. From simulation, the veto introduces a loss of efficiency of 13% (42%) for the detection of LLPs with a \(30\,\mathrm{{GeV}}/c^2 \) mass and a 10\({\,\mathrm{{ps}}}\) (100\({\,\mathrm{{ps}}}\)) lifetime. The radial (\(\sigma _\mathrm{R}\)) and longitudinal (\(\sigma _\mathrm{z}\), parallel to the beam) uncertainties provided by the LLP vertex fit are shown in Fig. 2f, g. Larger uncertainties are expected from the vertex fits of candidates from \({b} {\overline{{b}}} \) events compared to signal LLPs. The former are more boosted and produce more narrowly collimated tracks, while the relatively heavier signal LLPs decay into more divergent tracks. This effect decreases when \(m_\mathrm{{LLP}}\) approaches the mass of \(b \)-quark hadrons.
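The muon isolation variable defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions: tracks are represented by hypothetical dictionaries with `energy`, `eta` and `phi` keys, the track list excludes the muon (whose energy is added explicitly), and the azimuthal difference is wrapped into \([-\pi , \pi )\).

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in (eta, phi) space with the azimuthal difference wrapped."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def muon_isolation(muon, other_tracks, cone=0.3):
    """Energy in the cone around the muon (muon included) over the muon energy."""
    cone_energy = muon["energy"]
    for trk in other_tracks:
        if delta_r(muon["eta"], muon["phi"], trk["eta"], trk["phi"]) < cone:
            cone_energy += trk["energy"]
    return cone_energy / muon["energy"]


muon = {"energy": 50.0, "eta": 3.0, "phi": 0.1}
tracks = [
    {"energy": 10.0, "eta": 3.1, "phi": 0.2},  # inside the cone
    {"energy": 20.0, "eta": 4.0, "phi": 0.1},  # outside the cone
]
print(muon_isolation(muon, tracks))  # 1.2
print(muon_isolation(muon, []))      # 1.0, a fully isolated muon
```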

Fig. 2

Distributions for the 7 TeV dataset (black histogram) compared to simulated \({b} {\overline{{b}}} \) events (blue squares with error bars), showing (a) transverse momentum and (b) isolation of the muon, (c) number of tracks of the displaced vertex, (d) reconstructed mass, (e) radial position of the vertex, (f, g) vertex fit uncertainties in the radial and z directions. The fully simulated signal distributions for LV38 10 ps are shown (red dashed histograms), as well as LV38 50 ps (green dotted histogram) in (e). The distributions from simulation are normalised to the number of data entries

A multivariate analysis based on a multi-layer perceptron (MLP) [36, 37] is used to further purify the data sample. The MLP input variables are the muon \(p_{\mathrm { T}}\) and impact parameter, the number of charged particle tracks used to reconstruct the LLP, the vertex radial distance \(R_\mathrm{{xy}}\) from the beam line, and the uncertainties \(\sigma _\mathrm{R}\) and \(\sigma _\mathrm{z}\) provided by the LLP vertex fit. The muon isolation value and the reconstructed mass of the long-lived particles are not used in the MLP classifier; the discrimination power of these two variables is subsequently exploited for the signal determination. The signal training and test samples are obtained from simulated signal events selected under the same conditions as data. A data-driven approach is used to provide the background training samples, based on the hypothesis that the amount of signal in the data is small. For this, a number of candidates equal to the number of candidates of the signal training set, which is of the order of 1000, is randomly chosen in the data. The same procedure provides the background test samples. The MLP training is performed independently for each fully simulated model and dataset. The optimal MLP requirement is subsequently determined by maximizing a figure of merit defined by \(\epsilon / \sqrt{N_d + 1}\), where \(\epsilon \) is the signal efficiency from simulation for a given selection, and \(N_d\) the corresponding number of candidates found in the data.
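The optimisation of the MLP requirement can be illustrated by a simple threshold scan of the figure of merit \(\epsilon / \sqrt{N_d + 1}\). The Python sketch below assumes that the classifier outputs for simulated signal and for data are available as plain lists; the function name, the toy scores and the threshold grid are hypothetical.

```python
import math


def best_mlp_cut(signal_scores, data_scores, thresholds):
    """Return (cut, figure of merit) maximising eps / sqrt(N_d + 1)."""
    n_sig = len(signal_scores)
    best = None
    for cut in thresholds:
        eff = sum(s > cut for s in signal_scores) / n_sig  # signal efficiency
        n_d = sum(s > cut for s in data_scores)            # data candidates kept
        fom = eff / math.sqrt(n_d + 1)
        if best is None or fom > best[1]:
            best = (cut, fom)
    return best


# Toy classifier outputs, for illustration only.
signal = [0.9, 0.8, 0.7, 0.6]
data = [0.1, 0.2, 0.3, 0.65, 0.95]
cut, fom = best_mlp_cut(signal, data, [0.0, 0.5, 0.75])
print(cut, round(fom, 3))  # 0.5 0.577
```

The "+1" in the denominator keeps the figure of merit finite when no data candidate survives the requirement.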

The generalisation power of the MLP is assessed by verifying that the distributions of the classifier output for the training sample and the test sample agree. The uniformity over the dataset is controlled by the comparison of the MLP responses for several subsets of the data.

The MLP classifier can be biased by the presence of signal in the data events used as background training set. To quantify the potential bias, the MLP training is performed adding a fraction of simulated signal events (up to 5%) to the background set. This test, performed independently for all signal models, demonstrates a negligible variation of the performances quantified by the above figure of merit.

Fig. 3

Reconstructed mass of the LLP candidate from the 8 TeV dataset. The top plots correspond to events with candidates selected from the background region of the muon isolation variable. They are fitted with the sum of two exponential functions. In the bottom row the candidates from the signal region are fitted including a specific signal shape, added to the background component. Subfigures (a) and (c) correspond to the analysis which assumes the LV38 5\({\,\mathrm{{ps}}}\) signal model, (b) and (d) are for LV98 10\({\,\mathrm{{ps}}}\)

5 Determination of the signal yield

The signal yield is determined with an extended unbinned maximum likelihood fit to the distribution of the reconstructed LLP mass, with the shape of the signal component taken from the simulated models, plus the background component. After the MLP filter, no simulated background survives; therefore a data-driven method is adopted to determine the background template. The data candidates are separated into a signal region with muon isolation below 1.4 and a background region with isolation value from 1.4 to 2. The signal region contains more than 90% of the signal for all the models considered (see e.g. Fig. 2b). The reconstructed mass obtained from the background candidates is used to constrain an empirical probability density function (PDF) consisting of the sum of two negative slope exponential functions, for which the slope values and amplitudes are free parameters in the fit. The signal PDF is taken from the histogram of the mass distribution obtained from simulation. The fit is performed simultaneously on events from the background region and from the signal region. In the latter the numbers of signal and background events are left free in the fit, while the slope values and the relative strength of the two exponential functions are in common with the background region fit. Examples of fit results are given in Fig. 3, obtained from the \(\mathrm 8\,\mathrm{{TeV}}\) dataset for two signal hypotheses, LV38 5\({\,\mathrm{{ps}}}\) and LV98 10\({\,\mathrm{{ps}}}\). The fitted signal yields, given in Table 1, for both datasets are compatible with the background-only hypothesis.
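The structure of the simultaneous fit can be made explicit by writing down a negative log-likelihood with the slopes and the relative strength of the two exponentials shared between the two isolation regions. The Python sketch below is an assumption-laden illustration, not the fit code of the analysis: the upper edge of the mass range, the parameter ordering and all function names are hypothetical, and the minimisation step itself is left to an external minimiser.

```python
import math

# Fit range in GeV/c^2. The lower edge follows the vertex mass requirement;
# the upper edge is an assumption made for this sketch.
M_MIN, M_MAX = 4.5, 100.0


def exp_pdf(m, slope):
    """Falling exponential, normalised on [M_MIN, M_MAX]; slope > 0."""
    norm = (math.exp(-slope * M_MIN) - math.exp(-slope * M_MAX)) / slope
    return math.exp(-slope * m) / norm


def bkg_pdf(m, k1, k2, frac):
    """Sum of two negative-slope exponentials with relative strength frac."""
    return frac * exp_pdf(m, k1) + (1.0 - frac) * exp_pdf(m, k2)


def hist_pdf(m, edges, contents):
    """Signal template taken from a histogram, as a normalised density."""
    total = sum(contents)
    for lo, hi, n in zip(edges[:-1], edges[1:], contents):
        if lo <= m < hi:
            return n / (total * (hi - lo))
    return 0.0


def nll(params, sig_region, bkg_region, edges, contents):
    """Extended unbinned negative log-likelihood for both regions."""
    n_s, n_b, n_bkg, k1, k2, frac = params
    value = n_s + n_b + n_bkg  # Poisson terms of the extended likelihood
    for m in sig_region:       # signal region: signal plus background
        value -= math.log(n_s * hist_pdf(m, edges, contents)
                          + n_b * bkg_pdf(m, k1, k2, frac))
    for m in bkg_region:       # background region: background shape only
        value -= math.log(n_bkg * bkg_pdf(m, k1, k2, frac))
    return value


# Toy masses and template, for illustration only.
edges, contents = [4.5, 20.0, 40.0, 60.0], [1, 8, 1]
print(nll((1.0, 3.0, 4.0, 0.3, 0.05, 0.7),
          [6.0, 8.0, 30.0, 12.0], [5.0, 7.0, 9.0, 15.0], edges, contents))
```

Sharing `k1`, `k2` and `frac` between the two sums is what lets the background region constrain the background shape in the signal region.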

Table 1 Total signal detection efficiency \(\epsilon \), including the geometrical acceptance, and numbers of fitted signal and background events, \(N_\mathrm{s}\) and \(N_\mathrm{b}\), for the different signal hypotheses. The last column gives the value of \(\chi ^2/\mathrm {ndf}\) from the fit. The signal models are from the full simulation. Uncertainties are explained in Sect. 6

The validity of using events with isolation above 1.4 to model the background has been checked by comparing the relevant distributions from events in the background and in the signal regions, including the muon \(p_{\mathrm { T}}\) and impact parameter distributions, as well as the number of tracks, invariant mass, vertex \(R_\mathrm{{xy}}\) and vertex uncertainties of the LLP candidate. This test is performed with the nominal MLP selection, and also with loosened requirements that result in a threefold increase in the number of background candidates. In both cases all distributions agree within statistical uncertainties, with the \(\chi ^2/\mathrm {ndf}\) of the comparison in the range 0.6–1.5.

The sensitivity of the procedure is studied by adding a small number of signal events to the data according to a given signal model. The fitted yields are consistent with the numbers of added events on average, and the pull distributions are close to Gaussian functions with mean values between \(-0.1\) and 0.1 and standard deviations in the range 0.9–1.2.
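The pull used in such a check is conventionally defined as the difference between the fitted and the injected yield divided by the fitted uncertainty; this definition is an assumption here, since the text does not spell it out. A minimal Python sketch with toy numbers:

```python
import statistics


def pulls(fitted_yields, fitted_errors, n_injected):
    """Pull of each pseudoexperiment: (N_fit - N_injected) / sigma_fit."""
    return [(n - n_injected) / err
            for n, err in zip(fitted_yields, fitted_errors)]


# Toy fit results for three pseudoexperiments with 10 injected events.
p = pulls([12.0, 8.0, 10.0], [2.0, 2.0, 2.0], 10.0)
print(p)                                          # [1.0, -1.0, 0.0]
print(statistics.mean(p), statistics.stdev(p))    # 0.0 1.0
```

An unbiased fit with correctly estimated uncertainties gives a pull distribution with mean close to zero and standard deviation close to one.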

As a final check a two-dimensional sideband subtraction method (“ABCD method” [38]) has been considered. The LLP reconstructed mass and the muon isolation are used to separate the candidates in four regions. The results of this check are also consistent with zero signal for the two datasets.
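The arithmetic of a generic ABCD estimate can be sketched as follows. The isolation boundary of 1.4 is taken from the text, while the mass window, the region labels and the function names are hypothetical; the estimate also assumes that mass and isolation are uncorrelated for the background.

```python
def count_regions(candidates, mass_window, iso_cut=1.4):
    """Count (mass, isolation) candidates in the four regions A, B, C, D."""
    lo, hi = mass_window
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for mass, iso in candidates:
        in_window = lo <= mass < hi
        isolated = iso < iso_cut
        if isolated:
            counts["A" if in_window else "B"] += 1
        else:
            counts["C" if in_window else "D"] += 1
    return counts


def abcd_background(counts):
    """Background expected in A if the two variables are uncorrelated."""
    return counts["B"] * counts["C"] / counts["D"]


# Toy (mass in GeV/c^2, isolation) pairs and an assumed 20-50 GeV/c^2 window.
toy = [(30.0, 1.0), (10.0, 1.1), (12.0, 1.2), (35.0, 1.6), (8.0, 1.7), (9.0, 1.8)]
counts = count_regions(toy, (20.0, 50.0))
print(counts)                   # {'A': 1, 'B': 2, 'C': 1, 'D': 2}
print(abcd_background(counts))  # 1.0
```

The signal estimate is then the observed count in A minus this background prediction.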

Both the LLP mass fit and the ABCD methods are tested with W and \({\mathrm {Z}}/\gamma \) leptonic decays. Isolated high-\(p_{\mathrm { T}}\) muons are produced in such processes with kinematic properties similar to the signal. By removing the minimum \(R_\mathrm{{xy}}\) requirement the candidates can be formed by collecting tracks from the primary vertex. As before, the background is taken from a region of muon isolation above 1.4, which contains a negligible amount of signal. For both datasets the number of events obtained from this study is compatible with the cross-sections measured by LHCb [39,40,41].

6 Detection efficiency and systematic uncertainties

The total signal detection efficiency, estimated from fully simulated events, is shown in Table 1. It includes the geometrical acceptance, which for the detection of one \(\tilde{\chi }^{0}_{1}\) in LHCb is about 11% (12%) at \(\sqrt{s} =7\,\mathrm{{TeV}}\) (8 TeV). The efficiencies for the models where the fast simulation is used, including processes \(P\!A\), \(P\!B\), \(P\!\,C\), and \(P\!D\), vary from about 0.1% to about 2%. The efficiency increases with \(m_\mathrm{{LLP}}\) because more particles are produced in the decay of heavier LLPs. This effect is only partially counteracted by the loss of particles outside the spectrometer acceptance, which is especially likely when the LLPs are produced from the decay of heavier states, such as the Higgs-like particles of process \(P\!\,C\). Another competing phenomenon is that the lower boost of heavier LLPs results in a shorter average flight length, i.e. the requirement of a minimum \(R_\mathrm{{xy}}\) disfavours heavy LLPs. The cut on \(R_\mathrm{{xy}}\) is more efficient at selecting LLPs with large lifetimes, but for lifetimes larger than \({\sim }50{\,\mathrm{{ps}}} \) a considerable portion of the decays falls into the material region and is vetoed. Finally, a drop of sensitivity is expected for LLPs with a lifetime close to the \(b \) hadron lifetimes, where the contamination from \({b} {\overline{{b}}} \) events becomes important, especially for low mass LLPs.

A breakdown of the relative systematic uncertainties for the analysis of the \(\mathrm 8\,\mathrm{{TeV}}\) dataset is shown in Table 2. The table does not account for the uncertainties associated with the fit procedure, which, as described below, require a specific treatment. The uncertainties on the integrated luminosity are \(1.7\%\) for the \(7\,\mathrm{{TeV}}\) dataset and \(1.2\%\) for the \(\mathrm 8\,\mathrm{{TeV}}\) dataset [42]. Several sources of systematic uncertainty coming from discrepancies between data and simulation have been considered.

The muon detection efficiency, including trigger, tracking, and muon identification efficiencies, is studied by a tag-and-probe technique applied to muons from \(J/\psi \rightarrow {\upmu ^+\upmu ^-} \) [43] and from \({\mathrm {Z}} \rightarrow {\upmu ^+\upmu ^-} \) decays [39,40,41, 44]. The corresponding systematic effects due to differences between data and simulation are estimated to be between 2.1 and 4.5%, depending on the theoretical model considered.

A comparison of the simulated and observed \(p_{\mathrm { T}}\) distributions of muons from \({\mathrm {Z}} \rightarrow {\upmu ^+\upmu ^-} \) decays shows a maximum difference of 3% in the momentum scale; this difference is propagated to the LLP analysis by moving the muon \(p_{\mathrm { T}}\) threshold by the same amount. A corresponding systematic uncertainty of 1.5% is estimated for all models under consideration.

The \(d_\mathrm{{IP}}\) distribution shows a discrepancy between data and simulation of about 5\({\,\upmu \mathrm {m}}\) in the mean value for muons from \({\mathrm {Z}} \) decays, with a maximum deviation of about 20\({\,\upmu \mathrm {m}}\) close to the muon \(p_{\mathrm { T}}\) threshold. By changing the minimum \(d_\mathrm{{IP}}\) requirement by this amount, the change in the detection efficiency is in the range 0.4–1.2%, depending on the model.

Table 2 Summary of the contributions to the relative systematic uncertainties, corresponding to the 8 TeV dataset (the sub-total for the 7 TeV dataset is also given). The indicated ranges cover the fully simulated LV models. The detection efficiency is affected by the parton luminosity model and depends upon the production process, with a maximum uncertainty of 7% for the gluon-gluon fusion process \(P\!\,C\). For the fast simulation based analysis there is an additional contribution of 5%. The systematic effects associated with the signal and background models used in the LLP mass fit are not shown in the table

The vertex reconstruction efficiency is affected by the tracking efficiency and has a complicated spatial structure due to the geometry of the VELO and the material veto. In the material-free region, \(R_\mathrm{{xy}} <4.5\,{\mathrm {mm}} \), the efficiency to detect secondary vertices as a function of the flight distance has been studied in detail, in particular in the context of the \(b \)-hadron lifetime measurement [45]. The deviation of the efficiency in simulation with respect to the data is below 1%. For \(R_\mathrm{{xy}}\) from 4.5\(\,{\mathrm {mm}}\) to about 12\(\,{\mathrm {mm}}\) a study performed with inclusive \({b} {\overline{{b}}} \) events finds differences between data and simulation of less than 5%. The corresponding systematic uncertainties are determined by altering the efficiency in the simulation program as a function of the true vertex position. A maximum of 1% uncertainty is obtained for all the signal models. An alternative procedure to assess this uncertainty considers vertices from \(B^0 \rightarrow {{J/\psi }} {{K} ^{*0}} \) decays with \({{J/\psi }} \!\rightarrow {\upmu ^+\upmu ^-} \) and \({{K} ^{*0}} \rightarrow K^+ \pi ^-\). The detection efficiency in data and simulation is found to agree within 10%. This result, obtained from a four-particle final state, when propagated to LLP decays with on average more than 10 charged final-state particles for all modes, results in a discrepancy of at most 2% between the LLP efficiencies in data and simulation, which is the adopted value for the respective systematic uncertainty.

The uncertainty on the position of the beam line is less than 20\({\,\upmu \mathrm {m}}\) [46]. It can affect the secondary vertex selection, mainly via the requirement on \(R_\mathrm{{xy}}\). By altering the PV position in simulated signal events, the maximum effect on the LLP selection efficiency is in the range 0.2–1%.

The imprecision of the models used for training the MLP propagates into a systematic difference of the detection efficiency between data and simulation. The bias on each input variable is determined by comparing simulated and experimental distributions for muons and LLP candidates from \(\mathrm {Z} \) and \(\mathrm {W} \) events, and from \({b} {\overline{{b}}} \) events. The effect of the biases is subsequently estimated by testing the trained classifier on altered simulated signal events: each input variable is modified by a scale factor randomly drawn from a Gaussian distribution of width equal to the corresponding bias. The RMS variation of the signal efficiency distributions after the MLP ranges from 1.5 to 3.6% depending on the signal model. These values are taken as contributions to the systematic uncertainties.
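The propagation of the input-variable biases through the classifier can be sketched as a toy Monte Carlo. The Python example below rests on several assumptions that are not stated in the text: one scale factor per variable is drawn per trial (rather than per event), the Gaussian is centred at unity, and the classifier is any callable returning a score. All names are hypothetical.

```python
import random
import statistics


def efficiency_spread(events, biases, classifier, cut, n_trials=200, seed=1):
    """RMS spread of the signal efficiency under random input rescaling.

    events     : list of input-variable vectors for simulated signal
    biases     : relative bias per variable, used as the Gaussian width
    classifier : callable mapping one input vector to a score
    """
    rng = random.Random(seed)
    efficiencies = []
    for _ in range(n_trials):
        # One scale factor per input variable for this trial (assumption).
        scales = [rng.gauss(1.0, b) for b in biases]
        passed = sum(
            classifier([x * s for x, s in zip(ev, scales)]) > cut
            for ev in events
        )
        efficiencies.append(passed / len(events))
    return statistics.pstdev(efficiencies)


# Toy stand-in for the trained MLP: the score is the first input variable.
toy_events = [[1.0], [2.0], [3.0], [4.0]]
print(efficiency_spread(toy_events, [0.0], lambda v: v[0], 2.5))  # 0.0
print(efficiency_spread(toy_events, [0.3], lambda v: v[0], 2.5))
```

With zero bias the efficiency never changes and the spread vanishes, which provides a simple sanity check of the procedure.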

The signal region is selected by the requirement of a muon isolation value lower than 1.4. By a comparison of data and simulated muons from \(\mathrm {Z} \) decays, the uncertainty on this variable is estimated to be \(\pm 0.05\). This uncertainty is propagated to a maximum 2.2% effect on the detection efficiency.

Comparing the mass distributions of \({b} {\overline{{b}}} \) events selected with relaxed cuts, a maximum mass scale discrepancy between data and simulated events of 10% is estimated. The corresponding shift of the simulated signal mass distribution results in a variation of the detection efficiency between 0.8 and 1.5%.

The statistical precision of the efficiency value determined from the simulated events is in the range 1.7–2.5% for the different models.

The theoretical uncertainties are dominated by the uncertainty of the partonic luminosity. Their contribution to the detection efficiency uncertainty is estimated following the procedure explained in Ref. [44] and varies from 3% up to a maximum of 7%, which is found for the gluon-gluon fusion process \(P\!\,C\).

For the analysis based on the fast simulation, a 5% uncertainty is added to account for the difference between the fast and the full simulation, as explained in Sect. 3.

The choice of the background and signal templates can affect the results of the LLP mass fit. The uncertainty due to the signal model accounts for the mass scale, the mass resolution and the finite number of events available to construct the model. Pseudoexperiments in which 10 signal events are added to the data are analysed with a modified signal template, and the resulting number of fitted candidates is compared to the result from the nominal fit model. Assuming as before a 10% uncertainty on the signal mass scale, a maximum absolute variation of 0.6 fitted signal candidates is obtained. No significant effects are obtained by modifying the signal mass resolution with an additional smearing. Changing the statistical precision by reducing the initial number N of signal events used to build the histogram PDF by \(2 \sqrt{N}\) has no significant effect either.

The uncertainty induced by the choice of the background model is obtained by reweighting the candidates from the background region in such a way that the distribution of the number of tracks included in the LLP vertex fit exactly matches the distribution in the signal region. This test is motivated by the fact that the number of tracks has a significant correlation with the measured mass. The fits of the mass distribution of pseudoexperiments give absolute variations in the numbers of fitted signal events in the range 0.1–1.6, the largest value at low LLP mass. Reweighting the candidates in such a way as to match the \(p_{\mathrm { T}}\) distributions gives variations which are less than 0.5 events for all models. Moving the isolation threshold by \(\pm 0.1\) leads to variations of the order of 0.01 events. In conclusion, the variation in the number of fitted candidates associated with the choice of the PDF models is in the range of 1–2 events. The calculation of the cross-section upper limits takes this uncertainty into account as an additional nuisance parameter in the fit procedure.

7 Results

The LLP candidates collected at \(\sqrt{s} =7\) and 8 TeV are analysed independently. The fast simulation is used to extend the MSSM/mSUGRA theoretical parameter space of the LV models, and for the analysis of processes \(P\!A\), \(P\!B\), \(P\!\,C\), and \(P\!D\). The results obtained are found to be compatible with the absence of signal for all signal model hypotheses considered. The 95% confidence level (CL) upper limit on the production cross-section times branching fraction is computed for each model using the CLs approach [46]. The numerical results for the fully simulated LV models are given in Table 3. A graphical representation of selected results is given in Figs. 4, 5 and 6.
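The idea behind the CLs approach can be sketched for a much simpler case than the one used here: a single-bin counting experiment with a known background and no nuisance parameters. The analysis itself derives the limits from a fit to the LLP mass distribution, so the code below is only an illustration; all function names are hypothetical, and the efficiency passed to `cross_section_limit_pb` is a placeholder, not a value from this paper.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson-distributed N with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, signal, background):
    """CLs = CL(s+b) / CL(b) for a single-bin counting experiment."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)


def signal_upper_limit(n_obs, background, cl=0.95, hi=100.0, tol=1e-6):
    """Signal yield at which CLs falls to 1 - cl, found by bisection.

    CLs decreases monotonically with the signal yield, so bisection on
    the interval [0, hi] is sufficient for small observed counts.
    """
    alpha = 1.0 - cl
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, background) > alpha:
            lo = mid  # still allowed at this CL: move the lower edge up
        else:
            hi = mid  # excluded: move the upper edge down
    return 0.5 * (lo + hi)


def cross_section_limit_pb(signal_limit, lumi_fb, efficiency):
    """Convert a yield limit into a cross-section times branching fraction limit (pb).

    `efficiency` is the total detection efficiency (hypothetical input).
    """
    lumi_pb = lumi_fb * 1000.0  # 1 fb^-1 = 1000 pb^-1
    return signal_limit / (efficiency * lumi_pb)


s_up = signal_upper_limit(n_obs=0, background=0.0)
```

With zero observed events the yield limit is \(-\ln 0.05 \approx 3.0\) events regardless of the expected background, which is the property that protects the CLs method against excluding signals to which the analysis has no sensitivity.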

The MSSM/mSUGRA LV models are explored by changing the common squark mass and the gluino mass. Figure 4 gives examples of the cross-section times branching fraction upper limits as a function of \(m_\mathrm{{LLP}}\) for such models for two values of \(\tau _\mathrm{{LLP}}\), and two values of the squark mass. The gluino mass is set to 2000 \(\mathrm{{GeV}}/c^2\). Varying the gluino mass from 1500 to 2500 \(\mathrm{{GeV}}/c^2\) has almost no effect on the results. The decrease of sensitivity for decreasing \(m_\mathrm{{LLP}}\) is explained by the above-mentioned effects on the detection efficiency.

A representation of selected results from the processes \(P\!A\), \(P\!B\), \(P\!\,C\), and \(P\!D\) is given in Fig. 5. The single LLP production of \(P\!B\) has a lower detection probability compared to the double LLP production case, \(P\!A\), which explains the reduced sensitivity. The \(P\!B\) plots correspond to \(m_{\text {``}\tilde{\mathrm{g}}\text {''}} =100\,\mathrm{{GeV}}/c^2 \). Varying \(m_{\text {``}\tilde{\mathrm{g}}\text {''}}\) from 100 to 1000 \(\mathrm{{GeV}}/c^2\) decreases the detection efficiency by a factor of two, while an increase by a factor of two is obtained by reducing \(m_{\text {``}\tilde{\mathrm{g}}\text {''}} \) to 20 \(\mathrm{{GeV}}/c^2\). The results for process \(P\!\,C\) are given as a function of the Higgs-like boson mass, for three values of \(m_\mathrm{{LLP}}\). Again the sensitivity of the analysis drops with decreasing \(m_\mathrm{{LLP}}\). The results shown for \(P\!D\) are for \(m_{\tilde{\mathrm{q}}} =60\,\mathrm{{GeV}}/c^2 \), which limits the maximum \(m_\mathrm{{LLP}}\) value. In process \(P\!D\) some of the scattering energy is absorbed by an additional jet during the LLP production, reducing the detection efficiency by a factor of two with respect to \(P\!A\). Finally, Fig. 6 gives the upper limits on the cross-section times branching fraction as a function of \(m_\mathrm{{LLP}}\), for the process \(P\!\,C\) with a mass of 125 \(\mathrm{{GeV}}/c^2\) for the Higgs-like boson and LLP lifetimes from 5 to 100\({\,\mathrm{{ps}}}\). These results can be compared to the prediction of the Standard Model Higgs production cross-section of about 21\(\,{\mathrm {pb}}\) at \(\sqrt{s} =8\,\mathrm{{TeV}}\) [46].
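As a rough guide to such a comparison, one may assume that the Higgs-like boson is produced at the Standard Model rate; this is an assumption made here for illustration, not a result of the analysis. With the notation \(\sigma ^{\mathrm{UL}}\) for an upper limit on the cross-section times branching fraction, \(\sigma _{\mathrm{SM}}\) for the Standard Model production cross-section and \(\mathcal{B}\) for the overall branching fraction into the studied final state (symbols introduced only for this sketch), the limit translates into

\[ \mathcal{B} < \frac{\sigma ^{\mathrm{UL}}}{\sigma _{\mathrm{SM}}} \approx \frac{\sigma ^{\mathrm{UL}}}{21\,\mathrm{pb}} \quad \text{at } \sqrt{s} =8\,\mathrm{TeV}. \]

Under this assumption a cross-section limit at or above 21\(\,{\mathrm {pb}}\) does not constrain \(\mathcal{B}\), since \(\mathcal{B} \le 1\) by definition.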

Table 3 Upper limits (95% CL) on the production cross-section times branching fraction (pb) for the \(7\,\mathrm{{TeV}}\) and \(8\,\mathrm{{TeV}}\) datasets, based on the fully simulated LV signal samples
Fig. 4

Expected (open dots with 1\(\sigma \) and 2\(\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits at 95% confidence level, as a function of the LLP mass from the 8 TeV dataset. The theoretical models assume the full set of SUSY production processes available in Pythia  6 with default parameter settings, unless otherwise specified. The gluino mass is 2000 \(\mathrm{{GeV}}/c^2\)

Fig. 5

Expected (open dots with 1\(\sigma \) and 2\(\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits (95% CL) for the processes indicated in the bottom left corner of each plot; \(\tau _\mathrm{{LLP}}\) is always 10\({\,\mathrm{{ps}}}\). The results correspond to the 8 TeV dataset. a Upper limits as a function of the LLP mass for process \(P\!A\); b as a function of the LLP mass for process \(P\!B\), with \(m_{\text {``}\tilde{\mathrm{g}}\text {''}} =100\) \(\mathrm{{GeV}}/c^2\); c as a function of \(m_\mathrm{h^0}\) for process \(P\!\,C\) for \(m_\mathrm{{LLP}}\) of 20, 40, and 60 \(\mathrm{{GeV}}/c^2\), from top to bottom (the single point at 130 \(\mathrm{{GeV}}/c^2\) with \(m_\mathrm{{LLP}} =60\,\mathrm{{GeV}}/c^2 \) has been shifted to the right for visualisation); d upper limits as a function of the LLP mass for process \(P\!D\) with \(m_{\tilde{\mathrm{q}}} =60\) \(\mathrm{{GeV}}/c^2\)

Fig. 6

Expected (open dots with 1\(\sigma \) and 2\(\sigma \) bands) and observed (full dots) cross-section times branching fraction upper limits (95% CL) for the process \(P\!\,C\) as a function of the LLP mass; the LLP lifetime \(\tau _\mathrm{{LLP}}\) is indicated in each plot, and \(m_\mathrm{h^0} =125\,\mathrm{{GeV}}/c^2\). The results correspond to the 8 TeV dataset

8 Conclusion

Long-lived massive particles decaying into a muon and two quarks have been searched for using proton-proton collision data collected by LHCb at \(\sqrt{s} =7\) and 8 TeV, corresponding to integrated luminosities of 1 and 2 \(\,\hbox {fb}^{-1}\), respectively. The background is dominated by \({b} {\overline{{b}}} \) events and is reduced by tight selection requirements, including a dedicated multivariate classifier. The number of candidates is determined by a fit to the LLP reconstructed mass with a signal shape inferred from the theoretical models.

LHCb can study the forward region \(2<\eta <5\), and its low trigger \(p_{\mathrm { T}}\) threshold allows the experiment to explore relatively small LLP masses. The analysis has been performed assuming four LLP production mechanisms with the topologies shown in Fig. 1, covering LLP lifetimes from 5 to 100\({\,\mathrm{{ps}}}\) and masses in the range 20–80 \(\mathrm{{GeV}}/c^2\). One of the processes proceeds via the decay of a Higgs-like particle into two LLPs: the mass of the Higgs-like particle is varied between 50 and 130 \(\mathrm{{GeV}}/c^2\), a range which includes the mass of the scalar boson discovered by the ATLAS and CMS experiments. In addition, the full set of neutralino production mechanisms available in Pythia in the context of MSSM/mSUGRA has been considered, with an LLP mass range of 23–198 \(\mathrm{{GeV}}/c^2\). The results for all theoretical models considered are compatible with the background-only hypothesis. Upper limits at 95% CL are set on the cross-section times branching fraction.