Observation of the suppressed decay $\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-}$

The suppressed decay $\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-}$, excluding the $J/\psi$ and $\psi(2S)\rightarrow \mu^{+}\mu^{-}$ resonances, is observed for the first time with a significance of 5.5 standard deviations. The analysis is performed with proton-proton collision data corresponding to an integrated luminosity of $3\mathrm{fb}^{-1}$ collected with the LHCb experiment. The $\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-}$ branching fraction is measured relative to the $\Lambda^{0}_{b}\rightarrow J/\psi(\rightarrow \mu^{+}\mu^{-})p\pi^{-}$ branching fraction giving \begin{align} \nonumber \frac{\mathcal{B}(\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-})}{\mathcal{B}({\Lambda^{0}_{b}\rightarrow J/\psi(\rightarrow \mu^{+}\mu^{-})p\pi^{-}})}&= 0.044\pm0.012\pm0.007, \end{align} where the first uncertainty is statistical and the second is systematic. This is the first observation of a $b\rightarrow d$ transition in a baryonic decay.


Introduction
The decay of the Λ 0 b baryon into the pπ − µ + µ − final state, where the muons do not originate from a hadronic resonance, is mediated by a b → d transition. Such decays are highly suppressed in the Standard Model (SM), as the leading-order amplitudes are described by loop diagrams and are also suppressed by the relevant Cabibbo-Kobayashi-Maskawa (CKM) factors. This suppression is not necessarily present in extensions to the SM, and such decays are therefore sensitive to contributions from new particles. One of the lowest-order diagrams for the decay Λ 0 b → pπ − µ + µ − is shown in Fig. 1. The branching fraction of the decay Λ 0 b → pπ − µ + µ − is expected to be of O(10 −8 ). Together with the relevant form factors, a measurement of this branching fraction with respect to that of the analogous b → s transition, Λ 0 b → pK − µ + µ − , would allow the ratio of CKM elements |V td |/|V ts | to be determined. Comparing the value of |V td |/|V ts | from these processes with that measured via mixing processes would test the Minimal Flavour Violation hypothesis [1][2][3].
At present, no form-factor calculations have been made for the Λ 0 b → pπ − µ + µ − and Λ 0 b → pK − µ + µ − channels due to the complicated hadronic structure in the proton-meson systems. However, recent advances in lattice calculations [4] could make this possible in the future.
This paper describes a search for the decay Λ 0 b → pπ − µ + µ − , using proton-proton collision data corresponding to an integrated luminosity of 3 fb −1 . The data were collected with the LHCb experiment at centre-of-mass energies of 7 and 8 TeV. The branching fraction is determined relative to that of the tree-level decay, Λ 0 b →J/ψ (→µ + µ − )pπ − , denoted as Λ 0 b → J/ψ pπ − hereafter, which has been measured with a precision of 15% [5,6].

Figure 1: One of the lowest-order diagrams for the decay Λ 0 b → pπ − µ + µ − .

Detector and simulation
The LHCb detector [7,8] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary vertex (PV), the impact parameter, is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two Ring-Imaging Cherenkov (RICH) detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The online event selection is performed by a trigger [9], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. Simulated events are used to optimise selection criteria and calculate the relative efficiency between the signal and normalisation channels. In the simulation, pp collisions are generated using Pythia [10] with a specific LHCb configuration [11]. Decays of hadronic particles are described by EvtGen [12], in which final-state radiation is generated using Photos [13]. 
The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [14], as described in Ref. [15].
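The impact-parameter resolution quoted above, (15 + 29/p T ) µm with p T in GeV/c, can be evaluated directly. The following is a minimal sketch; the function name and the example p T values are illustrative choices, not part of the analysis.

```python
def ip_resolution_um(pt_gev: float) -> float:
    """Impact-parameter resolution in micrometres for a track with
    transverse momentum pt_gev (in GeV/c), using (15 + 29/pT) um."""
    if pt_gev <= 0:
        raise ValueError("pT must be positive")
    return 15.0 + 29.0 / pt_gev


# Illustrative pT values only; they are not taken from the paper.
for pt in (0.5, 1.0, 2.0, 5.0):
    print(f"pT = {pt:3.1f} GeV/c -> sigma_IP = {ip_resolution_um(pt):5.1f} um")
```

The resolution degrades at low p T , which is one reason a minimum p T is useful when requiring tracks to be displaced from the PV.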

Selection
The Λ 0 b → pπ − µ + µ − signal candidates are first required to pass the hardware trigger, which selects events containing at least one muon with p T greater than 1.48 GeV/c in the 7 TeV data or p T > 1.76 GeV/c in the 8 TeV data. In the subsequent software trigger, at least one of the final-state particles is required to have p T > 1.7 GeV/c in the 7 TeV data or p T > 1.6 GeV/c in the 8 TeV data. For muon candidates, a softer requirement of p T > 1.0 GeV/c is applied. The final-state particles that satisfy these transverse momentum criteria are also required to have an impact parameter larger than 100 µm with respect to all PVs in the event. Finally, the tracks of two or more of the final-state particles are required to form a vertex that is significantly displaced from all PVs.
Signal candidates are reconstructed by combining two oppositely charged muons with two additional tracks that are identified as a proton and a pion using particle identification (PID) information that comes primarily from the RICH detectors. All final-state particles are required to have a good-quality track fit and to be inconsistent with originating from a PV. The pion (proton) candidates are required to have p T > 0.4 GeV/c and momentum greater than 2.0 (7.5) GeV/c. The four final-state particles are required to form a good-quality vertex, where the resulting Λ 0 b candidate is consistent with originating from a PV. The vertex is also required to be significantly displaced from this PV.
In order to reject the background from Λ 0 b → J/ψ pπ − and Λ 0 b → ψ(2S)pπ − decays, the regions 8.0 < q 2 < 11.0 GeV 2 /c 4 and 12.5 < q 2 < 15.0 GeV 2 /c 4 are excluded from the signal search, where q 2 refers to the invariant mass squared of the two muons. In addition, contributions from Λ 0 b → Λ 0 (→ pπ − )µ + µ − decays are removed by requiring m pπ − > 1.12 GeV/c 2 .
Several fully reconstructed decays with at least one misidentified particle can form backgrounds that peak in the distribution of the pπ − µ + µ − mass, m pπ − µ + µ − . Specific vetoes are used to reject such backgrounds. The vetoes require that if the invariant mass of the candidate is consistent with a particular hypothesis, then a more restrictive PID requirement is applied. For example, if the proton candidate is assigned the kaon mass and the resulting candidate mass falls within the range 5246 < m K + π − µ + µ − < 5330 MeV/c 2 , the PID cut is significantly tightened to reduce K → p misidentification from B 0 → K + π − µ + µ − decays. Other possible sources of specific backgrounds are the decays [...]. After the vetoes have been applied, the only significant residual background contribution for the signal (normalisation) channel comes from the decay Λ 0 b → pK − µ + µ − (Λ 0 b → J/ψ pK − ).
This contamination is treated as a systematic uncertainty in the signal channel and is considered explicitly when extracting the yield of the normalisation channel.
Backgrounds from partially reconstructed decays, which populate the region below the Λ 0 b mass, are also explicitly considered when determining the signal yield.
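The kinematic vetoes quoted in this section can be sketched as simple predicates. The thresholds are the ones given in the text; the function and constant names are hypothetical, and the real selection additionally applies PID and vertex-quality requirements that are not modelled here.

```python
# Excluded dimuon regions in q^2 (GeV^2/c^4), as quoted in the text.
# They bracket the J/psi (m^2 ~ 9.6) and psi(2S) (m^2 ~ 13.6) resonances.
Q2_VETOES = ((8.0, 11.0), (12.5, 15.0))

# Minimum p pi- mass (GeV/c^2), removing Lambda0 -> p pi- candidates.
M_PPI_MIN = 1.12

# Window (MeV/c^2) under the K+ pi- mu+ mu- hypothesis around the B0 mass.
B0_WINDOW = (5246.0, 5330.0)


def passes_q2_veto(q2: float) -> bool:
    """True if q2 lies outside both charmonium veto regions."""
    return not any(lo < q2 < hi for lo, hi in Q2_VETOES)


def passes_lambda_veto(m_ppi: float) -> bool:
    """True if the p pi- mass is above the Lambda0 veto threshold."""
    return m_ppi > M_PPI_MIN


def needs_tight_pid(m_kpimumu: float) -> bool:
    """True if the candidate, with the proton assigned the kaon mass,
    falls in the B0 window and so needs the tighter PID requirement."""
    lo, hi = B0_WINDOW
    return lo < m_kpimumu < hi


print(passes_q2_veto(5.0), passes_q2_veto(9.6), needs_tight_pid(5280.0))
```

Keeping the windows in named constants makes it straightforward to reuse the same definitions when computing the efficiency of the q 2 selection on simulated events.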
A boosted decision tree (BDT) [16], with the AdaBoost algorithm [17] and a five-fold cross-validation method [18], is used to reduce combinatorial background. The BDT is trained and optimised on data. Candidates with m pπ − µ + µ − > 6000 MeV/c 2 are used as a sample representative of the background, and Λ 0 b → J/ψ (→ µ + µ − )pK − candidates selected from the data are used as a proxy for the signal sample. The BDT uses kinematic and geometric variables, together with PID variables associated with the proton, to discriminate between the signal and background candidates. The two most discriminating input variables are the vertex quality of the Λ 0 b candidate and its consistency with originating from a PV. In order to reject background containing additional tracks in close proximity to the Λ 0 b vertex, an isolation parameter [19] is also used as an input variable. As the presence of a proton from a displaced vertex is a distinctive signature, PID information on the proton candidate is used in the BDT in order to improve the rejection of background. Other, less discriminating variables used in the BDT include the minimum impact parameter with respect to any PV and the momenta of the final-state particles. The requirement on the BDT response is optimised by maximising the figure of merit [20] defined as
\begin{align} \nonumber \mathrm{FoM} = \frac{\varepsilon_{\mathrm{sel}}}{\sqrt{B} + 5/2}, \end{align}
where ε sel is the selection efficiency for the signal and B is the background expected within 40 MeV/c 2 of the Λ 0 b mass. After candidates have been reconstructed and the above selection criteria have been applied, the requirement on the BDT output retains 65% of signal events and rejects 99% of the background.
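The optimisation of the BDT requirement can be illustrated with a short scan. This sketch assumes a figure of merit of the Punzi form, ε sel /(a/2 + √B), with the target-significance parameter a left configurable (a = 5 is used as the default here). The scan points are invented for illustration and are not values from the analysis.

```python
import math


def punzi_fom(eff_sel: float, n_bkg: float, a: float = 5.0) -> float:
    """Punzi-style figure of merit: eff_sel / (a/2 + sqrt(B)).

    eff_sel: signal selection efficiency at a given BDT cut.
    n_bkg:   expected background in the signal window at that cut.
    a:       target significance in standard deviations (assumed default).
    """
    if n_bkg < 0:
        raise ValueError("background yield must be non-negative")
    return eff_sel / (a / 2.0 + math.sqrt(n_bkg))


# Hypothetical scan: (BDT cut, signal efficiency, expected background).
scan = [
    (0.0, 0.95, 400.0),
    (0.2, 0.80, 40.0),
    (0.4, 0.65, 4.0),
    (0.6, 0.40, 1.0),
]

best = max(scan, key=lambda row: punzi_fom(row[1], row[2]))
print("best cut:", best[0])
```

A figure of merit of this type does not depend on an assumed signal yield, which makes it suitable for a search in which the branching fraction is not known in advance.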

Normalisation
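The branching fraction of Λ 0 b → pπ − µ + µ − can be determined from
\begin{equation} \mathcal{B}(\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-}) = \frac{N(p\pi^{-}\mu^{+}\mu^{-})}{N(J/\psi p\pi^{-})}\,\frac{\varepsilon(J/\psi p\pi^{-})}{\varepsilon(p\pi^{-}\mu^{+}\mu^{-})}\,\mathcal{B}(\Lambda^{0}_{b}\rightarrow J/\psi p\pi^{-})\,\mathcal{B}(J/\psi\rightarrow\mu^{+}\mu^{-}), \tag{1} \end{equation}
where N (X) is the yield of the final state X and ε(X) is the efficiency to select that final state. The efficiencies are obtained from simulated events and specific control samples in the data. Since the normalisation channel Λ 0 b → J/ψ pπ − has the same final state and similar kinematics as the signal decay, many systematic uncertainties cancel in the efficiency ratio.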
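The normalisation procedure described above amounts to an efficiency-corrected ratio of yields. The sketch below assumes this standard form; the function names and the numerical inputs in the usage lines are placeholders chosen for easy arithmetic, not measured values.

```python
def relative_branching_fraction(n_sig: float, n_norm: float,
                                eff_sig: float, eff_norm: float) -> float:
    """Efficiency-corrected yield ratio:
    (N_sig / N_norm) * (eff_norm / eff_sig)."""
    if n_norm <= 0 or eff_sig <= 0:
        raise ValueError("normalisation yield and signal efficiency must be positive")
    return (n_sig / n_norm) * (eff_norm / eff_sig)


def absolute_branching_fraction(ratio: float, bf_norm: float,
                                bf_jpsi_mumu: float) -> float:
    """Scale the ratio by the normalisation-channel branching fractions."""
    return ratio * bf_norm * bf_jpsi_mumu


# Placeholder inputs, for illustration only.
ratio = relative_branching_fraction(n_sig=10.0, n_norm=1000.0,
                                    eff_sig=0.5, eff_norm=1.0)
print(ratio)
```

Because only the ratio of efficiencies enters, any effect that scales both efficiencies by the same factor drops out of the result.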
Control channels selected from the data are used to account for several effects that are mismodelled in the simulation. For example, the PID efficiencies are obtained from data samples with decays where the final-state particles can be identified by kinematic constraints alone [21]. Further corrections are derived by comparing the data and simulation distributions of the Λ 0 b momentum, transverse momentum, decay time and the track multiplicity for the normalisation channel. The relative efficiency of the BDT is calculated using both Λ 0 b → J/ψ pK − and Λ 0 b → pK − µ + µ − candidates selected from the data; the resulting efficiencies are consistent with each other. The most important difference in the efficiency between the signal and normalisation modes is due to the q 2 selection for the signal decay, which removes 30% of the signal candidates. For the full selection, including the dimuon mass vetoes, the total relative efficiency is found to be 022. For the normalisation channel, candidates are required to have a dimuon mass within 60 MeV/c 2 of the known J/ψ mass. The yield of the normalisation channel is obtained by performing an extended unbinned maximum likelihood fit to the Λ 0 b → J/ψ pπ − mass distribution, as shown in Fig. 2. The shape of the Λ 0 b → J/ψ pπ − mass distribution is described by the sum of two Gaussian functions with power law tails and a shared mean, where the Gaussian parameters are allowed to vary in the fit and the tail parameters are obtained from the simulation. Combinatorial background is parameterised with an exponential function with a decay constant that is allowed to vary in the fit. Finally, there is a small contribution from the decay Λ 0 b → J/ψ pK − , the shape of which is determined from the simulation and included in the fit to the data. In total, 1017 ± 41 Λ 0 b → J/ψ pπ − candidates are observed. 
This yield is significantly lower than in Refs. [6,22], owing to the tighter selection employed to search for the Λ 0 b → pπ − µ + µ − decay.
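The signal shape used in the mass fit, a sum of two Gaussian functions with power-law tails and a shared mean, can be sketched with the Crystal Ball parameterisation. This is an unnormalised illustration under stated assumptions: both tails are placed on the low-mass side and share the same tail parameters, and all parameter values in the usage lines are hypothetical. It is not the fit code used in the analysis.

```python
import math


def crystal_ball(x: float, mean: float, sigma: float,
                 alpha: float, n: float) -> float:
    """Unnormalised Crystal Ball function: a Gaussian core with a
    power-law tail on the low side (requires alpha > 0, n > 0)."""
    t = (x - mean) / sigma
    if t > -alpha:
        return math.exp(-0.5 * t * t)
    # Tail coefficients chosen so the function is continuous at t = -alpha.
    norm = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    shift = n / alpha - alpha
    return norm * (shift - t) ** (-n)


def signal_shape(x: float, mean: float, sigma1: float, sigma2: float,
                 frac: float, alpha: float, n: float) -> float:
    """Sum of two Crystal Ball functions with a shared mean."""
    return (frac * crystal_ball(x, mean, sigma1, alpha, n)
            + (1.0 - frac) * crystal_ball(x, mean, sigma2, alpha, n))


def combinatorial_shape(x: float, slope: float, x0: float) -> float:
    """Exponential background, exp(slope * (x - x0))."""
    return math.exp(slope * (x - x0))


# Hypothetical parameter values (masses in MeV/c^2).
for m in (5560.0, 5600.0, 5620.0, 5640.0):
    print(m, round(signal_shape(m, 5620.0, 7.0, 14.0, 0.6, 1.5, 3.0), 4))
```

In a fit of the kind described in the text, the Gaussian widths and mean would float while the tail parameters (here alpha and n) would be fixed from simulation.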

Results
The fit to the invariant mass distribution of Λ 0 b → pπ − µ + µ − candidates, excluding the J/ψ and ψ(2S) regions, is shown in Fig. 3. The signal shape is determined from the fit to the normalisation decay in data, with corrections for the differences between the signal and normalisation modes obtained from the simulation. The combinatorial background is parameterised as in the fit for the normalisation mode. The shape of the partially reconstructed background is obtained from a fit to the Λ 0 b → pK − µ + µ − mass spectrum and the yield is allowed to vary in the fit to the Λ 0 b → pπ − µ + µ − mass distribution.
A signal contribution is clearly visible and Wilks' theorem [23] gives a significance of 5.5 standard deviations. The systematic uncertainties described in Sec. 6 are mainly associated with the normalisation. Only the systematic uncertainty arising from the shape assumed for the partially reconstructed background has any appreciable impact on the significance. When the constraints on the relevant parameters are released, the significance increases to 5.7 standard deviations. Pseudoexperiments indicate that, on average, the significance would be expected to decrease by 0.3 standard deviations. Given the statistical variation, the observed increase is compatible with the expectation. This analysis therefore constitutes the first observation of the decay Λ 0 b → pπ − µ + µ − . The number of signal candidates is found to be 22 ± 6, which is converted to relative and absolute branching fractions of
\begin{align} \nonumber \frac{\mathcal{B}(\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-})}{\mathcal{B}(\Lambda^{0}_{b}\rightarrow J/\psi(\rightarrow \mu^{+}\mu^{-})p\pi^{-})} &= 0.044\pm0.012\pm0.007 \\ \nonumber \mathcal{B}(\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-}) &= (6.9\pm1.9\pm1.1\,^{+1.3}_{-1.0})\times10^{-8} \end{align}
using Eq. 1. In both cases, the first uncertainty given is statistical and the second is the systematic uncertainty, which is discussed in the next section. The third uncertainty on B(Λ 0 b → pπ − µ + µ − ) arises from the limited knowledge of the Λ 0 b → J/ψ pπ − [5, 6] and J/ψ →µ + µ − [24] branching fractions.
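The way Wilks' theorem turns a likelihood ratio into a significance can be sketched as follows. The sketch assumes a single parameter of interest (the signal yield), so that twice the difference in negative log-likelihood between the background-only and signal-plus-background fits follows a chi-squared distribution with one degree of freedom. The log-likelihood values in the usage lines are invented to illustrate the arithmetic and are not taken from the fit.

```python
import math


def wilks_significance(nll_bkg_only: float, nll_sig_plus_bkg: float) -> float:
    """Significance in standard deviations from two minimised negative
    log-likelihoods, assuming one parameter of interest:
    Z = sqrt(2 * (NLL_bkg_only - NLL_sig_plus_bkg))."""
    delta = nll_bkg_only - nll_sig_plus_bkg
    if delta < 0:
        raise ValueError("signal-plus-background fit cannot be worse than background-only")
    return math.sqrt(2.0 * delta)


# Invented NLL values: a difference of 15.125 corresponds to 5.5 sigma.
z = wilks_significance(nll_bkg_only=115.125, nll_sig_plus_bkg=100.0)
print(z)

# Relative statistical precision implied by a yield of 22 +/- 6.
print(round(6.0 / 22.0, 3))
```

The second printed number shows that the statistical uncertainty on the yield is about 27%, consistent with the ratio of the quoted statistical uncertainty to the central value of the branching fraction (0.012/0.044).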

Systematic uncertainties
The systematic uncertainties are summarised in Table 1. The total systematic uncertainty is 16.1%, which is comparable to but smaller than the statistical uncertainty. The largest systematic uncertainty originates from the decay model used to simulate the signal. There are two components to this uncertainty. The first originates from the unknown q 2 distribution for the signal decay. As no model for the Λ 0 b → pπ − µ + µ − decay currently exists, the model for the decay Λ 0 b → Λ 0 (→ pπ − )µ + µ − from Ref. [25] is used to derive the q 2 distribution. To assess the systematic uncertainty from this assumption, the decay Λ 0 b → pK − µ + µ − is instead assumed to describe the signal q 2 distribution and the difference in relative efficiency is assigned as a systematic uncertainty. The q 2 distribution for the Λ 0 b → pK − µ + µ − decay is obtained from data weighted using the sPlot technique [26]. An uncertainty of 7.9% is found. The second component of the systematic uncertainty due to the decay model is the distribution of the pπ − invariant mass. In this case, the distribution in the simulation is weighted to match the data for the Λ 0 b → J/ψ pπ − decay and the efficiency is reevaluated. The difference of 7.7% in relative efficiency between these two cases is taken as a systematic uncertainty.
Another important source of systematic uncertainty is related to the assumption that the partially reconstructed background for the signal has the same shape as the partially reconstructed background in Λ 0 b → pK − µ + µ − decays. The effect of this assumption is estimated by allowing the shape parameters for the partially reconstructed background component to vary in the fit, and then calculating the resulting bias in the background estimation using pseudoexperiments. This results in a 6.9% uncertainty on the signal yield. As noted above, this is the only systematic uncertainty that has an appreciable effect on the significance for the observation of the decay Λ 0 b → pπ − µ + µ − .
Other, smaller uncertainties are assigned to the calculation of the efficiency: the calibration of the BDT efficiency using data (5.6%); the finite size of the simulation samples used (4.4%) and possible mismodelling of the trigger (3.4%). The statistical uncertainty on the Λ 0 b → J/ψ pπ − yield gives rise to a systematic uncertainty of 4.0%. Due to the low number of signal candidates, a small bias in the signal yield is observed. The size of this bias is calculated using pseudoexperiments and results in a 2.2% systematic uncertainty. No Λ 0 b → pK − µ + µ − contribution is considered for the Λ 0 b → pπ − µ + µ − fit, due to the low expected yield. The Λ 0 b → J/ψ pK − decay is used to assess the resulting systematic uncertainty, which is 1.6%. The corrections applied to the simulation give rise to a small systematic uncertainty (1.3%), as does the calibration of the PID efficiency using data (1.0%).
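The total systematic uncertainty can be cross-checked by adding the individual contributions quoted in this section in quadrature. The sketch below assumes the sources are uncorrelated; the dictionary keys are shorthand labels introduced here.

```python
import math

# Relative systematic uncertainties in percent, as quoted in the text.
SYSTEMATICS_PERCENT = {
    "decay model: q2 distribution": 7.9,
    "decay model: p pi mass distribution": 7.7,
    "partially reconstructed background shape": 6.9,
    "BDT efficiency calibration": 5.6,
    "simulation sample size": 4.4,
    "normalisation yield": 4.0,
    "trigger modelling": 3.4,
    "fit bias": 2.2,
    "neglected pK mu mu contribution": 1.6,
    "simulation corrections": 1.3,
    "PID efficiency calibration": 1.0,
}

# Quadrature sum, assuming no correlations between the sources.
total = math.sqrt(sum(v * v for v in SYSTEMATICS_PERCENT.values()))
print(f"total systematic uncertainty: {total:.1f}%")

# Propagated to the measured ratio of branching fractions (0.044).
print(f"absolute uncertainty on the ratio: {0.044 * total / 100.0:.3f}")
```

The quadrature sum reproduces the quoted total of 16.1%, and scaling the central value of 0.044 by this fraction gives the 0.007 systematic uncertainty on the ratio of branching fractions.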

Conclusions
A search for the rare decay Λ 0 b → pπ − µ + µ − has been performed with proton-proton collision data collected with the LHCb experiment corresponding to 3 fb −1 of integrated luminosity. The search is made excluding the J/ψ and ψ(2S) →µ + µ − resonances. A signal is observed with a significance of 5.5 standard deviations, which constitutes the first observation of a b → d transition in a baryonic decay. The relative and absolute branching fractions are measured to be
\begin{align} \nonumber \frac{\mathcal{B}(\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-})}{\mathcal{B}(\Lambda^{0}_{b}\rightarrow J/\psi(\rightarrow \mu^{+}\mu^{-})p\pi^{-})} &= 0.044\pm0.012\pm0.007, \\ \nonumber \mathcal{B}(\Lambda^{0}_{b}\rightarrow p\pi^{-}\mu^{+}\mu^{-}) &= (6.9\pm1.9\pm1.1\,^{+1.3}_{-1.0})\times10^{-8}, \end{align}
where the first uncertainties are statistical and the second are systematic. The third uncertainty on B(Λ 0 b → pπ − µ + µ − ) arises from the limited knowledge of the Λ 0 b → J/ψ pπ − [6] and J/ψ →µ + µ − [24] branching fractions. With further advances in lattice QCD combined with a Λ 0 b → pK − µ + µ − branching fraction measurement, this result will allow |V td |/|V ts | to be measured, enabling a test of the Minimal Flavour Violation hypothesis.