Test of lepton universality with $\Lambda^{0}_{b} \to p K^- \ell^+ \ell^-$ decays

The ratio of branching fractions of the decays $\Lambda^{0}_{b}\to pK^{-}e^{+}e^{-}$ and $\Lambda^{0}_{b}\to pK^{-}\mu^{+}\mu^{-}$, $R^{-1}_{pK}$, is measured for the first time using proton-proton collision data corresponding to an integrated luminosity of 4.7 $fb^{-1}$ recorded with the LHCb experiment at center-of-mass energies of 7, 8 and 13 TeV. In the dilepton mass-squared range $0.1<q^{2}<6.0$ $GeV^{2}/c^{4}$ and the $pK^{-}$ mass range $m(pK^{-})<2600$ $MeV/c^{2}$, the ratio of branching fractions is measured to be $R^{-1}_{pK} = 1.17 ^{+0.18}_{-0.16} \pm 0.07$, where the first uncertainty is statistical and the second systematic. This is the first test of lepton universality with b baryons and the first observation of the decay $\Lambda^{0}_{b}\to pK^{-}e^{+}e^{-}$.


Introduction
Decays involving b → s + − transitions, where ± represents a lepton, are mediated by flavour-changing neutral currents (FCNC). Since FCNCs are forbidden at tree level in the Standard Model (SM) and can only proceed through amplitudes involving electroweak loop (penguin and box) Feynman diagrams, these transitions are an ideal place to search for effects beyond the SM. The potential contributions of new particles to these processes can be manifested as modifications in the rate of particular decay modes, or changes in the angular distribution of the final-state particles. Hints for possible disagreement with the SM have been reported, for example in several measurements of angular observables [1][2][3][4] of rare b → s + − decays. The SM predictions of these quantities are affected by hadronic uncertainties and more precisely predicted observables are desirable.
In the SM, the electroweak couplings of the charged leptons are independent of their flavour. The properties of decays to leptons of different flavours are expected to be the same up to corrections related to the lepton mass. This property, referred to as Lepton Universality (LU), has already been tested in B-meson decays by measuring the ratio where H represents a hadron containing an s quark, such as a K or a K * meson. The decay rate, Γ, is integrated over a range of the squared dilepton invariant masses, q 2 . The R H ratios allow for very precise tests of LU, as hadronic uncertainties cancel in their theoretical predictions. In the SM, they are expected to be close to unity with O(1%) precision [5]. At e + e − machines operating at the Υ (4S) resonance, the ratios R K ( * ) have been measured to be consistent with unity with a precision between 20 and 50% [6][7][8][9]. The most precise measurements of R K in the q 2 range between 1.1 and 6.0 GeV 2 /c 4 and R K * 0 in the regions 0.045 < q 2 < 1.1 GeV 2 /c 4 and 1.1 < q 2 < 6.0 GeV 2 /c 4 have been performed by the LHCb collaboration and, depending on the theoretical prediction used, are respectively 2.5 [10], 2.1-2.3 and 2.4-2.5 [11] standard deviations below their SM expectations [5,[12][13][14][15][16][17][18][19][20][21]. Further tests of LU in other b → s + − transitions are therefore critical to improve the statistical significance of the measurement and to understand the origin of any discrepancies. At the LHC, Λ 0 b baryons are produced abundantly and b → s + − transitions can also be studied in their decays. The full set of angular observables in Λ 0 b → Λµ + µ − decays has been measured in Ref. [22] and CP asymmetries have been determined using Λ 0 b → pK − µ + µ − decays [23]. This paper presents the first test of LU in the baryon sector, through the measurement of the ratio of branching fractions for Λ 0 b → pK − µ + µ − and Λ 0 b → pK − e + e − decays, 1 R pK . Both the experimental signature of the decays and the large data sample available motivate the choice of Λ 0 b → pK − + − decays for this study. Similarly to other R H ratios, R pK is expected to be close to unity in the SM [24].
The complementarity between R K and R K * 0 measurements in constraining different types of new physics scenarios is widely discussed in the literature, see for example Ref. [25]. The spin one-half of the Λ 0 b baryon and the rich resonant structure of the pK − hadronic system [23,26] indicate a similar situation in Λ 0 b → pK − + − decays, where complementary constraints could be derived once the pK − resonant structures are analysed. Following the observations of Ref. [23] on the hadronic system, this analysis is restricted to invariant masses m(pK − ) < 2.6 GeV/c 2 , where most of the signal occurs. The analysis is performed in a wide q 2 region between 0.1 GeV 2 /c 4 and 6.0 GeV 2 /c 4 . The lower boundary is chosen to be far enough from the dimuon kinematic threshold so that the effect of radiative corrections is negligible on the R pK ratio, using similar arguments to those discussed in Ref. [5]. The upper boundary is set to reduce contamination from the radiative tail of the J/ψ resonance. Contamination from Λ 0 b → pK − φ(→ + − ) decays is estimated to be negligible, therefore no veto around the φ mass is applied to the dilepton spectrum.
Relying on the well-tested LU in J/ψ → + − decays [27], the measurement is performed as a double ratio of the branching fractions of the Λ 0 b → pK − + − and Λ 0 b → pK − J/ψ(→ + − ) decays: where the two decay channels are also referred to as the "nonresonant" and the "resonant" modes, respectively. Due to the similarity between the experimental effects on the nonresonant and resonant decay modes, many sources of systematic uncertainty are substantially reduced in the double ratio. This approach helps to mitigate the significant differences in reconstruction between decays with muons or electrons in the final state, which are mostly due to bremsstrahlung emission and the trigger response. The experimental quantities relevant for the LU measurement are the yields and the reconstruction and selection efficiencies of the four decays entering the double ratio. The definition of R −1 pK ensures that the smaller electron yields are placed in the numerator, granting a likelihood function with a more symmetrical distribution. In order to avoid experimental biases, a blind analysis is performed. In addition to the determination of the R −1 pK ratio, this analysis provides the first measurement of the Λ 0 b → pK − µ + µ − branching fraction and the first observation of the Λ 0 b → pK − e + e − decay. Due to the lack of information on the exact resonant content in the pK − spectrum, it is challenging to compute the expected branching fraction of these decays in the SM, for which no prediction has been found in the literature. Predictions for specific excited Λ resonances, Λ * , in the decays Λ 0 b → Λ * + − with Λ * → pK − , have been computed [28,29] but cannot be directly compared to this result. This paper is organised as follows: Sec. 2 describes the LHCb detector, as well as the data and the simulation samples used in this analysis; the sources of background and selection procedure of the signal candidates are discussed in Sec. 3; Sec. 4 details how the simulation is corrected in order to improve the modelling of the signal and background distributions in data and the efficiency determination; the resonant mass fits and related cross-checks are outlined in Sec. 5; Sec. 6 summarises the fit procedure and the systematic uncertainties associated with the measurements are described in Sec. 7; the results are presented in Sec. 8; and Sec. 9 presents the conclusions of this paper.

Detector and data sets
The LHCb detector [30,31] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a siliconstrip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of the momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic (ECAL) and a hadronic (HCAL) calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The trigger system consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. The hardware muon trigger selects events containing at least one muon with significant p T (with thresholds ranging from ∼ 1.5 to ∼ 1.8 GeV/c, depending on the data-taking period). The hardware electron trigger requires the presence of a cluster in the ECAL with significant transverse energy, E T , (from ∼ 2.5 to ∼ 3.0 GeV, depending on the data-taking period). The software trigger requires a two-, three-or four-track secondary vertex, with a significant displacement from any primary pp interaction vertex. At least one charged particle must have significant p T and be inconsistent with originating from any PV. A multivariate algorithm [32] is used for the identification of secondary vertices consistent with the decay of a b hadron.
The analysis is performed using a data sample corresponding to 3 fb −1 of pp collision data collected with the LHCb detector at a centre-of-mass energy of 7 and 8 TeV (Run 1) and 1.7 fb −1 at a centre-of-mass energy of 13 TeV collected during 2016 (Run 2).
Samples of simulated Λ 0 b → pK − µ + µ − , Λ 0 b → pK − e + e − , Λ 0 b → pK − J/ψ(→ µ + µ − ) and Λ 0 b → pK − J/ψ(→ e + e − ) decays, generated according to the available phase space in the decays, are used to optimise the selection, determine the efficiency of triggers, reconstruction and signal event selection, as well as to model the shapes used in the fits to extract the signal yields. The simulation is corrected to match the distributions observed in data using the Λ 0 b → pK − J/ψ control modes, as detailed in Sec. 4. In addition, specific simulated samples are exploited to estimate the contribution from various background sources. The pp collisions are generated using Pythia [33] with a specific LHCb configuration [34]. Decays of hadronic particles are described by EvtGen [35], in which final-state radiation (FSR) is generated using Photos [36], which is observed to agree with a full QED calculation at the level of ∼ 1% for the R K and R K * 0 observables [5]. The interactions of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [37] as described in Ref. [38].

Selection and backgrounds
The Λ 0 b candidates are formed from a pair of well reconstructed oppositely charged particles identified as muons or electrons, combined with a pair of oppositely charged particles, which are identified as a proton and a kaon. The pK − invariant mass is required to be smaller than 2600 MeV/c 2 . Each particle is required to have a large momentum and p T , and to not originate from any PV. In particular, for muon and electron candidates the p T is required to be greater than 800 MeV/c and 500 MeV/c, respectively. Kaon candidates must have a p T larger than 250 MeV/c 2 and the proton p T is required to be larger than 400 MeV/c in Run 1, and 1000 MeV/c in Run 2. All the particles must originate from a good-quality common vertex, which is displaced significantly from all reconstructed PVs in the event. When more than one PV is reconstructed, that with the smallest χ 2 IP is selected (and referred to as the associated PV), where χ 2 IP is the difference in χ 2 of a given PV reconstructed with and without tracks associated to the considered Λ 0 b candidate. The momentum direction of the Λ 0 b is required to be consistent with its direction of flight. When interacting with the material of the detector, electrons radiate bremsstrahlung photons. If the photons are emitted upstream of the magnet, the photon and the electron deposit their energy in different ECAL cells, and the electron momentum measured by the tracking system is underestimated. A dedicated procedure, consisting in a search for neutral energy deposits in the ECAL compatible with being emitted by the electron, is applied to correct for this effect. The limitations of the recovery technique degrade the resolution of the reconstructed invariant masses of both the dielectron pair and the Λ 0 b candidate [11].
The distribution of q 2 as a function of the four-body invariant mass for Λ 0 b candidates is shown in Fig. 1 for both muon and electron final states. In each plot, the contributions due to the J/ψ and ψ(2S) resonances are visible. Despite the recovery of bremsstrahlung photons, the e + e − invariant-mass distribution has a long radiative tail towards low values. Due to the correlation in the measurement of the q 2 and the pK − + − invariant mass, the Λ 0 b → pK − J/ψ and Λ 0 b → pK − ψ(2S) contributions are visible as diagonal bands. Signal Λ 0 b → pK − + − candidates form a vertical band, which is less prominent for the electron mode due to worse mass resolution and lower yield. The effect of the resolution motivates the choice of invariant-mass ranges considered for the analysis, which is presented in Table 1. The Λ 0 b invariant-mass resolution and the signal and background contributions depend on the way in which the event was selected by the hardware trigger. The data sample of decay modes involving e + e − pairs is therefore divided into two mutually exclusive categories: candidates triggered by activity in the event which is not associated with any Table 1: Resonant and nonresonant mode q 2 and m(pK − + − ) ranges. The variables m(pK − + − ) and m J/ψ (pK − + − ), corresponding to the four-body invariant mass and the four-body invariant mass computed with the J/ψ mass constraint on the dilepton system, are used for nonresonant and resonant Λ 0 b decays, respectively.
0.1 -6.0 4.80 -6.32 resonant e + e − 6.0 -11.0 5.30 -6.20 nonresonant µ + µ − 0.1 -6.0 5.10 -6.10 resonant µ + µ − 8.41 -10. 24 5. 30 -5.95 of the signal decay particles (L0I), and candidates for which at least one of the electrons from the Λ 0 b decay satisfies the hardware electron trigger and that are not selected by the previous requirement (L0E). For the decay modes involving a pair of muons, at least one of the two leptons must satisfy the requirements of the hardware muon trigger.
An important source of background arises from the misidentification of one or both of the final-state hadrons, denoted as hadron misidentification, which is common to both the resonant and nonresonant decays. All eight possible combinations of hadrons that can be misidentified as signal, namely K + K − , π + K − , pπ − , pp, K + p, K + π − , π + p and π + π − , are investigated using Λ 0 b → pK − J/ψ(→ µ + µ − ) candidates in data. Contributions from misidentification of a single hadron are found to be dominant, namely where a pion or a kaon is misidentified as a proton. A veto is applied to candidates with m(K + K − ) in a ±12 MeV/c 2 mass window around the known φ mass in order to suppress the narrow φ contribution in misidentified B 0 s → K + K − J/ψ(→ + − ) and B 0 s → K + K − + − decays. Finally, a double misidentification of the K and p hadrons, referred to as pK-swap, can occur. The particle identification (PID) requirements are optimised to suppress these backgrounds. Residual background contributions passing the candidate selection, namely and pK-swap, are included in the invariant-mass fits to the data described in Sec. 5.
For both the electron and muon resonant modes, a kinematic fit that constrains the dilepton invariant mass to the known mass of the J/ψ meson is used to compute the four-body invariant mass, m J/ψ (pK − + − ). A requirement on the four-body invariant mass m J/ψ (pK − + − ) for the resonant and m(pK − µ + µ − ) for the nonresonant mode to be larger than 5100 MeV/c 2 excludes backgrounds due to partially reconstructed decays, of the type Λ 0 b → pK − + − X, where one or more of the products of the Λ 0 b decay, denoted X, are not reconstructed. These components can not be fully suppressed in the nonresonant electron mode and are taken into account in the fit. For the decay modes involving electrons, where a wider invariant-mass range is used, cascade backgrounds arising mainly where potential additional particles X, Y are not reconstructed, are suppressed by a dedicated veto requiring m(pK − + ) > 2320 MeV/c 2 . This requirement also allows the contamination from the hadronic decay Λ + c → pK − π + to be removed. Additional vetoes are applied to suppress backgrounds from D 0 mesons and Λ 0 b → pK − J/ψ(→ µ + µ − ) decays, where the identification of a muon and a kaon are swapped. Events in which the decay products of a B − → K − + − decay are combined with a random proton are suppressed by requiring m(K − + − ) < 5200 MeV/c 2 . A two-dimensional requirement based on the invariant mass of signal candidates calculated using the corrected dielectron momentum (m corr ) and the significance of the measured distance between the PV and the decay vertex is applied to reduce the partially reconstructed backgrounds. Following the procedure of Ref. [11], m corr is computed by correcting the momentum of the dielectron pair by the ratio of the pK − and the dielectron momentum components transverse to the Λ 0 b direction of flight. After all the selection procedures described above, the dominant remaining background is that originating from the combination of random tracks in the detector. This source is referred to as combinatorial background, and its properties vary between different q 2 regions. The separation between the signal and the combinatorial background is achieved using a Boosted Decision Tree (BDT) algorithm [39], which exploits the gradient boosting technique [40]. The classifier is constructed using variables such as transverse momenta, the quality of the vertex fit, the impact parameter χ 2 of the final-state particles, the angle between the direction of flight and the momentum of the Λ 0 b candidate, and the minimum p T of the hadron pair and of the lepton pair. For each run period, a single BDT classifier is trained for the resonant and nonresonant decays, where final states involving muons and electrons are treated separately. The classifiers are trained using simulated Λ 0 b → pK − + − decays, which are corrected for known differences between data and simulation (see Sec. 4), to represent the signal, and candidates in data with pK − + − invariant mass larger than 5825 MeV/c 2 are used to represent the background samples. To avoid potential biases and to fully exploit the size of the data sample for the training procedure, a k-folding technique [41] is adopted, with k = 10. For each decay mode and run period, the cut applied on the classifier is optimised using a figure of merit defined as N S / √ N S + N B , where N S is the expected signal yield and N B is the expected background yield, which is estimated by fitting the invariant mass sidebands in data. The BDT selection suppresses the combinatorial background by approximately 97% and retains 85% of the signal. The efficiency of each classifier is independent of m(pK − + − ) in the regions used to measure the signal yields. Once all the selection requirements are applied, less than 2.5% of the events contain multiple candidates. In these cases, one candidate per event is selected randomly and retained for further analysis. The effect of the multiple candidate removal cancels in the ratios measured in this analysis.

Corrections to the simulation and efficiencies
In order to optimise the selection criteria, model the invariant-mass shapes and accurately evaluate the efficiencies, a set of corrections to the simulation is determined from unbiased control samples selected from the data. These corrections are applied to the simulated samples of the nonresonant and resonant modes. The first correction accounts for the incorrect description of the hadronic structure of Λ 0 b → pK − + − and Λ 0 b → pK − J/ψ(→ + − ) decays. The simulation of these decays for both the resonant and nonresonant modes relies on a simple phase-space model, while it is known from Ref. [26] that several resonances populate the pK − invariant mass distribution of Λ 0 b → pK − J/ψ(→ µ + µ − ) decays. Corrections based on an amplitude analysis performed in Ref. [26] are applied to simulated Λ 0 b → pK − J/ψ(→ + − ) and Λ 0 b → pK − + − decays. Differences between data and simulation in the kinematics of Λ 0 b decays are accounted for using two-dimensional corrections derived from data as a function of the p T and pseudorapidity, η, of the Λ 0 b Efficiency ratios between the nonresonant and resonant modes, , for the muon final state and electron final state in the two trigger categories and data-taking periods. The uncertainties are statistical only.

Channel
Run 1 Run 2 µ + µ − 0.756 ± 0.010 0.796 ± 0.013 e + e − (L0I) 0.862 ± 0.017 0.859 ± 0.018 e + e − (L0E) 0.630 ± 0.013 0.631 ± 0.013 candidate. The simulation samples used in this analysis were generated with a value of the Λ 0 b lifetime that did not account for newer and more accurate measurements [27]; a correction is applied to account for this small discrepancy.
A correction is also applied to account for differences between the PID response in data and simulation [42]. Several high-purity control samples are employed to evaluate the PID efficiencies in data using a tag-and-probe technique. For kaons and protons, samples of D * + → D 0 (→ K − π + )π + and Λ 0 b → Λ + c (→ pK − π + )π − are used, respectively. Finally, the electron and muon identification efficiencies are obtained from B + → K + J/ψ(→ + − ) decays. For each type of particle, the corrections are evaluated as a function of track momentum and pseudorapidity. Corrections obtained from the distributions of the number of reconstructed tracks per event, compared between data and simulation, are used to account for the mismodelling in the average event multiplicity. The simulated response of both the hardware and software triggers is corrected for using a tag-and-probe technique on Λ 0 b → pK − J/ψ(→ + − ) candidates. The corrections for the response of the leptonic hardware triggers are parametrised as a function of the cluster E T or track p T . For the software trigger, the corrections are determined as a function of the minimum p T of the Λ 0 b decay products. Once all the corrections are applied to the simulation, very good agreement between data and simulation is found.
The efficiency for selecting each decay mode, which enters the computation of R −1 pK , is defined as the product of the geometrical acceptance of the detector, and the efficiency of the complete reconstruction of all tracks, the trigger requirements and the full set of kinematic, PID and background rejection requirements. It takes into account migration between bins of q 2 due to resolution, FSR and bremsstrahlung emission. The efficiency ratios between the nonresonant and the resonant modes, which directly enter the R −1 pK computation, are reported in Table 2.

Mass fit to the resonant modes
The resonant yields are determined from unbinned extended maximum-likelihood fits to the m J/ψ (pK − + − ) distributions separately for various data-taking periods. For the Λ 0 b → pK − J/ψ(→ µ + µ − ) decay, the probability density function (PDF) for the signal is modelled by a bifurcated Crystal Ball (CB) function [43], which consists of a Gaussian core with asymmetric power-law tails. The parameters describing the tails are fixed from a fit to simulated signal decays. However, in order to account for possible remaining discrepancies with data, the mean and the width of the function are allowed to vary freely in the fit. The invariant-mass distribution of Λ 0 b → pK − J/ψ(→ e + e − ) decays is fitted independently for the two trigger categories, since different relative amounts of Invariant-mass distribution, with the J/ψ mass constraint applied, of Λ 0 b → pK − J/ψ(→ µ + µ − ) (left) and Λ 0 b → pK − J/ψ(→ e + e − ) (right) candidates, summed over trigger and data-taking categories. The black points represent the data, while the solid blue curve shows the result of the fit. The signal component is represented by the red curve and the shaded shapes are the background components, as detailed in the legend.
background and signal are expected. In each category, a sum of two bifurcated CB functions is used to model the signal shape. Similarly to the approach adopted for the muon mode, the parameters describing the tails of the signal distributions are fixed from the fits to simulated signal. In addition, the difference of the means of the two functions, and the ratio of their widths are also fixed according to the simulation. The mean and the width of one CB function are allowed to vary. For both electron and muon modes, the combinatorial background is parametrised using an exponential function with a free slope. Contributions from misidentified B 0 → K * 0 J/ψ(→ + − ) and B 0 s → K + K − J/ψ(→ + − ) decays and from pK-swap are included in the fits. They are described separately for the electron and muon modes, using kernel estimation techniques [44] applied to simulated events. The signal yield, as well as the yields of the combinatorial background and B 0 components are free parameters of the fit. The yields of the pK-swap component are related to the signal yields by a factor estimated from the Λ 0 b → pK − J/ψ(→ µ + µ − ) fit and propagated to the electron mode. The ratios between the B 0 s and B 0 background components are fixed from dedicated fits to the data. The results of the invariant-mass fits, including data from all the trigger categories and data-taking periods, are shown in Fig. 2. A total of 40 980 ± 220 and 10 180 ± 140 decays are found for the muon and electron resonant modes, respectively.
An important cross-check of the efficiencies is done using the ratio of branching fractions of the muon and electron resonant channels which is expected to be equal to unity [27]. The measurement of r −1 J/ψ is a very stringent test since, contrary to the double ratio R −1 pK , it does not benefit from the cancellation of the experimental systematic uncertainties related to the differences in the treatment of muons and electrons. This quantity is found to be r −1 J/ψ = 0.96 ± 0.05, where the uncertainty combines both statistical and systematic effects. Similar sources of systematic uncertainties to the R −1 pK measurement are considered (see Sec. 7). The value of r −1 J/ψ is compatible with unity within one standard deviation. The r −1 J/ψ ratio is examined as a function of a number of kinematic variables such as p T and η of the Λ 0 b baryon, m(pK − ), the final-state particle p T and the BDT classifier response. In all of the cases the result is compatible with a flat distribution. The validity of the analysis is tested by measuring the double ratio ratio is found to be compatible with unity within statistical uncertainties. However its statistical power is limited by the reduced phase-space available in this high-q 2 region.

Mass fit to the nonresonant modes
An unbinned maximum-likelihood fit to the invariant-mass distribution of nonresonant pK − + − candidates is performed simultaneously to the muon and electron modes in all the trigger and data-taking categories to extract the observables of interest. For each category i, the nonresonant yields are expressed in terms of the parameters of interest , (5) where N i is the event yield for the given decay in category i, i the reconstruction and selection efficiency in that category, and r B ≡ B(Λ 0 b → pK − µ + µ − )/B(Λ 0 b → pK − J/ψ) and R −1 pK the observables. The yields of the resonant modes are obtained from the fits described in Sec. 5, and the ratios of efficiencies are extracted from calibrated simulated samples and reported in Table 2. The branching fraction of the leptonic decay of the J/ψ meson is assumed to be flavour universal [27]. For the nonresonant decays, no constraint can be imposed on the dilepton mass, and the pK − + − invariant-mass resolution is therefore worse than in the resonant case. For the electron final state, it is significantly degraded compared to the resolution in the muon case. The fit range is extended accordingly as summarised in Table 1. As a consequence, more sources of background have to be taken into account in the electron mode. Both models are described separately in the following.
The Λ 0 b → pK − µ + µ − signal contribution is modelled by a bifurcated CB function, with the tail parameters determined on simulated data. The mean and the width of the distribution are allowed to vary freely in the fit to data. The combinatorial background is described with an exponential PDF with free slope and yield. The contamination from misreconstructed B 0 → K * 0 µ + µ − and B 0 s → K + K − µ + µ − decays is modelled by kernel estimation techniques applied to simulation. The B 0 → K * 0 µ + µ − yield is constrained to the value expected from simulation and the measured branching fraction [27] and the relative contributions of B 0 s → K + K − µ + µ − and B 0 → K * 0 µ + µ − decays are constrained to the ratio observed in the corresponding J/ψ modes. An associated systematic uncertainty is added for this choice. The contamination from pK-swap candidates is found to be negligible for the nonresonant modes, so no component is added to the fit to account for it.
The Λ 0 b → pK − e + e − signal component is modelled by the sum of three distributions, describing candidates where the electron candidates have no associated bremsstrahlung photon, have only one, or more than one. In the first case, the distribution presents a tail at low mass, due to unrecovered losses, but no tail at high mass and is thus modelled by a single CB function. The other two present a smaller tail at low mass, since energy losses are partially recovered, but also a tail at high mass, due to wrongly associated photons, and are modelled by the sum of two bifurcated CB functions. The tail parameters of these functions are fixed from fits to simulated signal. The proportions between the three cases are also obtained from simulation. Combinatorial and misidentified backgrounds are modelled in an analogous way to the muon mode. However, partially reconstructed backgrounds of the type Λ 0 b → pK − e + e − π 0 , where the π 0 is not reconstructed, cannot be efficiently excluded in this case, due to the worse resolution and the wider invariant-mass range used in the electron mode fit. This background is modelled using kernel estimation techniques applied to simulated Λ 0 b → pK * − e + e − events, with K * − → K − π 0 , since this is the most realistic physical background contributing to this type of decay. The yield of this component is free to vary in the fit to data. Finally, Λ 0 b → pK − J/ψ(→ e + e − ) decays that lose energy by bremsstrahlung can also pollute the nonresonant Λ 0 b → pK − e + e − candidates in the low invariant-mass region. This contribution is modelled using simulated events. Its yield is constrained in the fit, based on the measured Λ 0 b → pK − J/ψ(→ e + e − ) yield and the probability of such q 2 migration determined using simulated samples. The stability of the fit is evaluated with a large number of pseudoexperiments before proceeding to the final fit to data. The moments of the pull distributions of the R −1 pK and r B parameters are examined and the estimators are observed to be unbiased.
The results of the fit to data, where candidates are accumulated over all the trigger and data-taking categories, are shown in Fig. 3. In total, 444 ± 23 Λ 0 b → pK − µ + µ − and 122 ± 17 Λ 0 b → pK − e + e − decays are observed.

Systematic uncertainties
Systematic uncertainties arise from the computation of efficiencies, the limited precision on the measurement of the resonant mode yields and the fit model. Uncertainties that are uncorrelated between different trigger and data-taking categories are taken into account as Gaussian constraints on the input parameters to the fit, so that they are accounted for by the uncertainty returned by the fit. Correlated uncertainties are accounted for by smearing the likelihood profile for the given parameter of interest.
The main systematic uncertainties on the ratio of branching fractions, r B , come from the procedure used to correct the simulation for the imperfect description of the Λ 0 b → pK − µ + µ − decay model and the detector response. The first one is evaluated  Figure 3: Invariant-mass distribution of (left) Λ 0 b → pK − µ + µ − and (right) Λ 0 b → pK − e + e − candidates summed over trigger and data-taking categories. The black points represent the data, while the solid blue curve shows the total PDF. The signal component is represented by the red curve and the combinatorial, B 0 → K * 0 + − and B 0 s → K + K − + − components by yellow, brown and green filled histograms. In the electron model, the grey and blue filled histograms represent the partially reconstructed and Λ 0 b → pK − J/ψ(→ e + e − ) backgrounds.
by reweighting the distributions of m(pK − ), q 2 and the helicity angles, cos θ K and cos θ , in the Λ 0 b → pK − µ + µ − simulation to match those observed in data, instead of the amplitude model of the Λ 0 b → pK − J/ψ(→ µ + µ − ) decay explained in Sec. 4. The distributions of m(pK − ), q 2 and the helicity angles are corrected separately and the systematic uncertainties are added in quadrature. Since this is a decay-model effect, it is correlated between different data-taking periods and trigger categories. For the other corrections applied to simulation, the systematic uncertainty is evaluated using an alternative parameterisation of the correction, as well as different control samples to determine the corrections. After all the corrections are applied, a small disagreement between data and simulation is seen in the proton momentum and impact parameter distributions. An associated systematic effect is estimated by correcting these distributions to match those observed in data.
A bootstrapping technique is used to evaluate the effect of the limited size of the simulated samples used to calculate the corrections. The systematic uncertainties accounting for data and simulation differences are computed separately for each data-taking period and trigger category and are thus uncorrelated. Systematic uncertainties associated with the fit model are estimated using pseudoexperiments and are fully correlated between data-taking periods. Different sets are generated with alternative B 0 → K * 0 µ + µ − and B 0 s → K + K − µ + µ − yields and different smearing parameters for the nonparametric shapes. Alternatively, possible contributions of partially reconstructed backgrounds with a missing π 0 meson or from cascade decays of the type H b → H c (→ K − µ + ν µ X)µ − ν µ Y , where H denotes hadrons and the potential additional particles X and Y are not always reconstructed, are also included in the generated sets. These generated samples are fit with the default model and the difference obtained on r B is assigned as a systematic uncertainty. Also, the uncertainties on the Λ 0 b → pK − J/ψ(→ µ + µ − ) yields are propagated to the systematic uncertainties of r B . The systematic uncertainties associated to the measurement of the ratio of branching fractions are summarised in Table 3.
The sources of systematic uncertainties described for r B also affect the double ratio R −1 pK , but their sizes are expected to be smaller due to cancellations in the ratios. However, some additional sources have to be considered, which are specific to the electron mode and are related to the worse resolution of the nonresonant decay compared to the resonant one. Signal decays that migrate in and out of the 0.1 < q 2 < 6 GeV 2 /c 4 window due to resolution effects are taken into account in the efficiency determination. However, potential mismodelling of the q 2 resolution or its distribution in the simulation can introduce a systematic bias. The first effect is estimated by smearing the q 2 distribution of Λ 0 b → pK − e + e − decays in simulation according to the differences observed between Λ 0 b → pK − J/ψ(→ e + e − ) data and simulated candidates. Similarly, the effect of an alternative q 2 model is estimated by weighting simulated Λ 0 b → pK − e + e − events to match the q 2 distribution of B 0 → K * 0 e + e − decays generated with the model described in Ref. [45]. This uncertainty is taken to be fully correlated between trigger categories and data-taking periods. Potential disagreement between the resolution in simulation and data for the m corr variable, which is only used in the selection of Λ 0 b → pK − e + e − candidates, is studied with Λ 0 b → pK − J/ψ(→ e + e − ) candidates. A correction is obtained by comparing the distribution of this quantity for Λ 0 b → pK − J/ψ(→ e + e − ) candidates in data and simulation and is applied to the Λ 0 b → pK − e + e − simulation. No significant variation on the efficiency is found but a systematic contribution corresponding to one half of its uncertainty is conservatively assigned and considered to be fully correlated between trigger categories and data-taking periods. Systematic uncertainties affecting the Λ 0 b → pK − e + e − fit model are evaluated using pseudoexperiments. The scale factor of the signal width is varied by ±5%, the kernel of the nonparametric models describing the B 0 → K * 0 e + e − , B 0 s → K + K − e + e − , Λ 0 b → pK − e + e − π 0 and Λ 0 b → pK − J/ψ(→ e + e − ) backgrounds is varied and a component describing cascade H b → H c (→ K − + ν e X) − ν e Y decays is added to the model. The largest effect comes from the limited knowledge of the Λ 0 b → pK − e + e − π 0 invariant-mass shape. It is alternatively obtained from simulated decays with an intermediate ∆ resonance decaying to pπ 0 , decays with an intermediate Λ(1810) resonance decaying to pK * − , followed by K * − → K − π 0 , and from decays with no resonant structure. The latter approach gives the largest variation in the signal yield with respect to the default fit model, which is assigned as systematic uncertainty. Ignoring this background in the fit model is also considered, but provides a smaller difference in the signal yield. These uncertainties are treated as fully correlated between trigger categories and data-taking periods. The systematic uncertainties associated to the measurement of R −1 pK are summarised in Table 4. As a cross-check, the effect of all the corrections applied to the simulation is evaluated by removing them and estimating the change in the R −1 pK value. A 8.5% effect is observed on the double ratio.

Results
The ratio of branching fractions r B = B(Λ 0 b → pK − µ + µ − )/B(Λ 0 b → pK − J/ψ(→ µ + µ − )) and the R −1 pK observable in the range 0.1 < q 2 < 6 GeV 2 /c 4 and m(pK − ) < 2600 MeV/c 2 are obtained directly from the fit to data candidates. The result for the ratio of branching fractions is where the first uncertainty is statistical and the second systematic. The absolute branching fraction for the decay Λ 0 b → pK − µ + µ − in the range 0.1 < q 2 < 6 GeV 2 /c 4 and m(pK − ) < 2600 MeV/c 2 is computed using the value of B(Λ 0 b → pK − J/ψ) measured by LHCb [46] where the first uncertainty is statistical, the second is systematic and the third and fourth are due to the precision of the normalisation mode Λ 0 b → pK − J/ψ, namely the knowledge of the B 0 → J/ψK * 0 branching fraction and the Λ 0 b hadronisation fraction. The result of the test of LU in Λ 0 b → pK − + − decays, R −1 pK , in the range 0.1 < q 2 < 6 GeV 2 /c 4 and m(pK − ) < 2600 MeV/c 2 is where the first uncertainty is statistical and the second systematic. The profile likelihood of the R −1 pK parameter, including the smearing accounting for correlated systematic uncertainties, is shown in Fig. 4. The result is compatible with unity at the level of one standard deviation. For comparison with other LU tests, R pK is computed from the R −1 pK results by inverting the minimum and one standard deviation lower and upper bounds of the likelihood profile R pK | 0.1<q 2 <6 GeV 2 /c 4 = 0.86 + 0.14 − 0.11 ± 0.05, with a more asymmetric likelihood distribution in this case. The first observation of the rare decay Λ 0 b → pK − e + e − is also reported, with a significance greater than 7σ, accounting for systematic uncertainties. Combining the results obtained for r B and R −1 pK , and taking into account the correlations, the ratio of branching fractions for the dielectron final states is obtained where the first uncertainty is statistical and the second systematic. Taking into account the measured value of B(Λ 0 b → pK − J/ψ) [46], the branching fraction of the nonresonant electron mode is found to be where the first uncertainty is statistical, the second systematic and the third and fourth are due to the uncertainties on B(Λ 0 b → pK − J/ψ).

Conclusions
A test of lepton universality is performed for the first time using rare b-baryon decays, namely Λ 0 b → pK − + − with = e, µ. The measurement is performed in the range 0.1 < q 2 < 6 GeV 2 /c 4 and m(pK − ) < 2600 MeV/c 2 and the result is found to be R −1 pK = 1.17 + 0.18 − 0.16 ± 0.07, compatible with unity within one standard deviation. This result is also in agreement with the deviations observed in lepton-universality tests with B mesons [10,11], denoted R K and R K * 0 . More data is needed to confirm or exclude the presence of New Physics contributions in these decays. It should be noted that the current analysis is affected by different experimental uncertainties than those of lepton-universality tests performed with B mesons, such as the backgrounds that affect the extraction of the signal yields from data, or the control modes which are used to calibrate the simulation and measure the double ratio. Consequently, it provides an independent test of the SM.
The first measurement of the branching fraction of the rare muonic decay mode Λ 0 b → pK − µ + µ − is also performed and its value is found to be B(Λ 0 b → pK − µ + µ − )| 0.1<q 2 <6 GeV 2 /c 4 = 2.65 ± 0.14 ± 0.12 ± 0.29 + 0.38 − 0.23 × 10 −7 , where the uncertainty is dominated by the limited knowledge of the Λ 0 b → pK − J/ψ normalisation mode. This result is obtained in the range m(pK − ) < 2600 MeV/c 2 , which includes several resonant structures, and thus cannot be directly compared to the recent predictions computed for the exclusive decay Λ 0 b → Λ(1520) + − [29]. Finally, the electron mode Λ 0 b → pK − e + e − is observed for the first time with a significance larger than 7σ including systematic uncertainties, and its branching fraction is determined by combining the results of R −1 pK and B(Λ 0 b → pK − µ + µ − )/B(Λ 0 b → pK − J/ψ), B(Λ 0 b → pK − e + e − )| 0.1<q 2 <6 GeV 2 /c 4 = 3.1 ± 0.4 ± 0.2 ± 0.3 + 0.4 − 0.3 × 10 −7 . This is the first observation of a rare b-baryon decay with electrons in the final state and it opens the door to further tests of lepton universality in baryon decays.