Angular moments of the decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ at low hadronic recoil

An analysis of the angular distribution of the decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ is presented, using data collected with the LHCb detector between 2011 and 2016 and corresponding to an integrated luminosity of approximately $5\,fb^{-1}$. Angular observables are determined using a moment analysis of the angular distribution at low hadronic recoil, corresponding to the dimuon invariant mass squared range $15<q^{2}<20\, GeV^2/c^4$. The full basis of observables is measured for the first time. The lepton-side, hadron-side and combined forward-backward asymmetries of the decay are determined to be \begin{align} A_{FB}^{l}&= -0.39 \pm 0.04\,\rm{stat} \pm 0.01\, \rm{syst}, \nonumber\\ A_{FB}^{h}&= -0.30 \pm 0.05\,\rm{stat} \pm 0.02\, \rm{syst}, \nonumber\\ A_{FB}^{lh}&= +0.25 \pm 0.04\,\rm{stat} \pm 0.01\, \rm{syst}. \nonumber \end{align} The measurements are consistent with Standard Model predictions.


Introduction
In the Standard Model of particle physics (SM), the decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ proceeds via a $b$ to $s$ quark flavour-changing neutral-current transition. The decay is consequently rare in the SM, with a branching fraction of order $10^{-6}$ [1]. In extensions of the SM the branching fraction and angular distribution of the decay can be modified significantly, with the latter providing a large number of particularly sensitive observables (see e.g. Ref. [2]). The rate and angular distribution of corresponding $B$ meson decays have been studied by the B-factory experiments, CDF at the Tevatron and the ATLAS, CMS and LHCb experiments at the LHC. Global analyses of the measurements favour a modification of the coupling strengths of the $b$ to $s$ transition from their SM values at the level of 4 to 5 standard deviations [3-7]. The decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ differs from the $B$ meson decays in several phenomenologically important ways: the $\Lambda_b^0$ baryon is a spin-half particle and could be produced polarised; the transition involves a diquark system as a spectator, rather than a single quark; and the $\Lambda$ baryon decays weakly, resulting in observables related to the hadronic part of the decay that are not present in the meson decays. The decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ therefore provides an important additional test of the SM predictions, which can be used to improve our understanding of the nature of the anomalies seen in the $B$ meson decays.
The decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ was first observed by the CDF collaboration [8]. The LHCb collaboration has subsequently studied the rate of the decay as a function of the dimuon invariant mass squared, $q^{2}$, in Refs. [9,10]. In the LHCb analysis, evidence for a signal was only found at low hadronic recoil (corresponding to the range $15<q^{2}<20\, GeV^2/c^4$). This is consistent with recent SM predictions based on lattice QCD calculations of the form factors of the decay [1]. The angular distribution of the decay was studied for the first time in Ref. [10], using two projections of the five-dimensional angular distribution of the decay and a data set corresponding to an integrated luminosity of $3\,fb^{-1}$. The analysis measured two angular asymmetries using the hadronic and leptonic parts of the decay in the range $15<q^{2}<20\, GeV^2/c^4$. This paper presents the first measurement of the full basis of angular observables for the decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ in the range $15<q^{2}<20\, GeV^2/c^4$. The measurement uses $pp$ collision data, corresponding to an integrated luminosity of approximately $5\,fb^{-1}$, collected between 2011 and 2016 at centre-of-mass energies of 7, 8 and 13 TeV. The paper is organised as follows: Sec. 2 introduces the moment analysis used to characterise the angular observables; Sec. 3 describes the LHCb detector; Sec. 4 outlines the selection of $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ candidates, where the $\Lambda$ is reconstructed in the $p\pi^{-}$ final state; Sec. 5 presents the fit to the invariant-mass distribution of $p\pi^{-}\mu^{+}\mu^{-}$ candidates, from which the yield of the $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ signal is obtained; Sec. 6 describes the angular efficiency; results are given in Sec. 7; Sec. 8 summarises potential sources of systematic uncertainty; and conclusions are presented in Sec. 9.

Moments of the angular distribution
The angular distribution of the $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decay can be described using a normal unit vector, $\hat{n}$, defined by the vector product of the beam direction and the $\Lambda_b^0$ momentum vector, and five angles [11]: the angle, $\theta$, between $\hat{n}$ and the $\Lambda$ baryon direction in the rest frame of the $\Lambda_b^0$ baryon; polar and azimuthal angles $\theta_l$ and $\phi_l$ describing the decay of the dimuon system; and polar and azimuthal angles $\theta_b$ and $\phi_b$ describing the decay of the $\Lambda$ baryon. An explicit definition of the angular basis is provided in Appendix A. The beam direction is assumed to be aligned with the positive $z$ direction in the LHCb coordinate system [12]. The small crossing angle of the colliding beams is neglected in the analysis but is considered as a source of systematic uncertainty. If the $\Lambda_b^0$ baryon is produced without any preferred polarisation, the angular distribution depends only on the angles $\theta_l$ and $\theta_b$ and on the angle between the decay planes of the $\Lambda$ baryon and the dimuon system ($\phi_l + \phi_b$). An illustration of this angular basis can be found in Ref. [11].
The full angular distribution, averaged over the range $15<q^{2}<20\, GeV^2/c^4$, can be described by the sum of 34 $q^{2}$-dependent angular terms [11],
\begin{align} \frac{d^{5}\Gamma}{d\vec{\Omega}} = \frac{3}{32\pi^{2}} \sum_{i=1}^{34} K_i f_i(\vec{\Omega}), \end{align}
where $\vec{\Omega} \equiv (\cos\theta, \cos\theta_l, \phi_l, \cos\theta_b, \phi_b)$ and the $f_i(\vec{\Omega})$ functions have different dependencies on the angles. The $K_i$ parameters depend on the underlying short-distance physics and on the form factors governing the $\Lambda_b^0 \rightarrow \Lambda$ transition. The full form of the distribution is given in Appendix B. Equation 1 is normalised such that $2K_1 + K_2 = 1$. Twenty-four of the observables, $K_{11}$ to $K_{34}$, are proportional to the $\Lambda_b^0$ production polarisation and are zero if the $\Lambda_b^0$ baryons are produced unpolarised. The reduced form of the angular distribution in the case of zero production polarisation can be found in Refs. [2,13].
The $K_i$ parameters can be determined from data by means of a maximum-likelihood fit or via a moment analysis [14,15]. The latter is preferred in this analysis due to the small size of the available data sample and the large number of unknown parameters.
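To determine the values of the $K_i$ parameters, weighting functions $g_i(\vec{\Omega})$ are chosen to project out individual angular observables. The $g_i(\vec{\Omega})$ functions, which are orthogonal to the $f_j(\vec{\Omega})$ functions, are normalised such that
\begin{align} \int \left( \frac{3}{32\pi^{2}} \sum_{i=1}^{34} K_i f_i(\vec{\Omega}) \right) g_j(\vec{\Omega})\, d\vec{\Omega} = K_j . \end{align}
The set of weighting functions used in this analysis can be found in Refs. [11,15] and is listed in Appendix B. For the case of ideal detector response and in the absence of background, the $K_i$ parameters can be estimated from data by summing over the $N$ observed candidates,
\begin{align} K_i = \frac{1}{N} \sum_{n=1}^{N} g_i(\vec{\Omega}_n) . \end{align}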
In realistic scenarios, per-candidate weights are necessary to compensate for nonuniform selection efficiency and background contamination. The $K_i$ parameters are then estimated as
\begin{align} K_i = \frac{\sum_{n} w_n\, g_i(\vec{\Omega}_n)}{\sum_{n} w_n} , \end{align}
where $w_n$ is the product of the two weights associated with candidate $n$: one correcting for the efficiency and one subtracting the background. The background is subtracted using weights based on the sPlot technique [16,17]. The efficiency to reconstruct and select the candidates is determined using samples of simulated events. The small effects of finite angular resolution are neglected in the analysis but are considered as a source of systematic uncertainty.
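As an illustration of the weighted moment estimate, the following sketch applies it to a one-dimensional toy in $\cos\theta_l$ only. It is not the analysis code. It assumes that the first three angular terms are $\sin^2\theta_l$, $\cos^2\theta_l$ and $\cos\theta_l$, so that the single-angle projection is $\frac{3}{2}(K_1\sin^2\theta_l + K_2\cos^2\theta_l + K_3\cos\theta_l)$. The weighting functions `g1`, `g2` and `g3`, the efficiency shape `toy_efficiency` and all numerical inputs are choices made for this example.

```python
import random

# Weighting functions for the assumed 1D projection
#   p(c) = 3/2 * (K1*(1 - c^2) + K2*c^2 + K3*c),  with c = cos(theta_l).
# They satisfy  integral of 3/2 * f_j(c) * g_i(c) dc = delta_ij  on [-1, 1].
def g1(c):
    return (3.0 - 5.0 * c * c) / 4.0

def g2(c):
    return (5.0 * c * c - 1.0) / 2.0

def g3(c):
    return c

def pdf(c, k1, k2, k3):
    return 1.5 * (k1 * (1.0 - c * c) + k2 * c * c + k3 * c)

def toy_efficiency(c):
    # Hypothetical efficiency shape, for illustration only.
    return 0.5 + 0.3 * c

def generate_selected(n_gen, k1, k2, k3, seed=1):
    """Accept-reject generation of the toy pdf followed by the toy selection."""
    rng = random.Random(seed)
    p_max = 1.5 * (max(k1, k2) + abs(k3))  # upper bound on the pdf
    selected = []
    for _ in range(n_gen):
        c = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, p_max) >= pdf(c, k1, k2, k3):
            continue  # rejected by the physics pdf
        if rng.random() < toy_efficiency(c):
            selected.append(c)
    return selected

def weighted_moment(g, values, weights):
    """K_i = sum_n w_n g_i(Omega_n) / sum_n w_n."""
    return sum(w * g(c) for c, w in zip(values, weights)) / sum(weights)

K_TRUE = (0.35, 0.30, -0.26)  # toy values obeying 2*K1 + K2 = 1
cands = generate_selected(200_000, *K_TRUE)
weights = [1.0 / toy_efficiency(c) for c in cands]  # efficiency correction
k_hat = [weighted_moment(g, cands, weights) for g in (g1, g2, g3)]
k3_uncorrected = weighted_moment(g3, cands, [1.0] * len(cands))
```

In this toy the efficiency-weighted moments recover the generated values within their statistical precision, whereas the unweighted estimate of $K_3$ is visibly biased by the selection. Because $2 g_1 + g_2 = 1$ for every candidate, the estimates satisfy $2K_1 + K_2 = 1$ by construction.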

Detector and simulation
The LHCb detector [12,18] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector (VELO) surrounding the $pp$ interaction region [19], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [20] placed downstream of the magnet. The tracking system provides a measurement of the momentum, $p$, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance between a track and a primary $pp$ interaction vertex (PV), the impact parameter (IP), is measured with a resolution of $(15 + 29/p_T)\,\mu$m, where $p_T$ is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors (RICH1 and RICH2) [21]. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified with a system composed of alternating layers of iron and multiwire proportional chambers [22]. The online event selection is performed by a trigger [23], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. The signal candidates are required to pass a hardware trigger that selects events containing at least one muon with large $p_T$ or a pair of muons with a large product of their transverse momenta. The $p_T$ threshold of the single-muon trigger varied between 1 and 2 GeV/c, depending on the data-taking conditions.
The subsequent software trigger requires a two-, three- or four-track secondary vertex with a significant displacement from any PV. At least one of the tracks must have a transverse momentum $p_T > 1$ GeV/c and be inconsistent with originating from a PV. A multivariate algorithm [24] is used to identify whether the secondary vertex is consistent with the decay of a $b$ hadron.
Samples of simulated $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ events are used to develop an offline event selection and to quantify the effects of detector response, candidate reconstruction and selection on the measured angular distribution. In the simulation, $pp$ collisions are generated using Pythia [25,26] with a specific LHCb configuration [27]. Decays of hadrons are described by EvtGen [28], in which final-state radiation is generated using Photos [29]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [30] as described in Ref. [31]. The samples of simulated data are corrected to account for observed differences relative to data in detector occupancy, vertex quality and the production kinematics of the $\Lambda_b^0$ baryon. The particle identification performance of the detector is measured using calibration samples of data.

Candidate selection
Signal candidates are formed by combining a $\Lambda$ baryon candidate with two oppositely charged particles that are identified as muons by the muon system and have track segments in the VELO. Only muon pairs with $q^{2}$ in the range $15<q^{2}<20\, GeV^2/c^4$, where the majority of the $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ signal is expected to be observed, are considered.
Candidates in the range $8<q^{2}<11\, GeV^2/c^4$, which predominantly consist of decays via an intermediate $J/\psi$ meson that subsequently decays to $\mu^{+}\mu^{-}$, are also retained and used to cross-check various aspects of the analysis. Candidate $\Lambda$ decays are reconstructed in the $\Lambda \rightarrow p\pi^{-}$ decay mode from two oppositely charged tracks. The tracks are reconstructed in one of two categories, depending on where the $\Lambda$ decayed in the detector: either both tracks include information from the VELO (long candidates) or neither does (downstream candidates). The $\Lambda$ candidates must also have: a vertex fit with a good $\chi^2$; a decay time of at least 2 ps; an invariant mass within 30 MeV/$c^2$ of the known $\Lambda$ mass [32]; and a decay vertex at $z < 2350$ mm. The requirement on the decay position removes background from hadronic interactions in the material at the exit of the RICH1 detector. The $\Lambda$ baryon and the dimuon pair are required to form a vertex with a good fit quality. The resulting $\Lambda_b^0$ candidate is required to be consistent with originating from one of the PVs in the event and to have a vertex position that is significantly displaced from that PV.
An artificial neural network is trained to further suppress combinatorial background, in which tracks from an event are mistakenly combined to form a candidate. The neural network uses simulated $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decays as a proxy for the signal and candidates from the upper mass sideband of the data, with a $\Lambda\mu^{+}\mu^{-}$ invariant mass greater than 5670 MeV/$c^2$, for the background. The inputs to the neural network are: the $\chi^2$ of the vertex fit to the $\Lambda_b^0$ candidate; the $\Lambda_b^0$ decay time and the angle between the $\Lambda_b^0$ momentum vector and the vector between the PV and the $\Lambda_b^0$ decay vertex; the $\Lambda$ flight distance from the PV, its $p_T$ and reconstructed mass; the IP of the muon with the highest $p_T$; the IP of either the pion or proton from the $\Lambda$, depending on which has the higher $p_T$; and a measure of the isolation of the $\Lambda_b^0$ baryon in the detector. The working point of the neural network is chosen to maximise the expected significance of the $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ signal in the $15<q^{2}<20\, GeV^2/c^4$ region, assuming the branching fraction measured in Ref. [10]. It is checked that selecting events based on their neural network response does not introduce any significant bias in the reconstructed $p\pi^{-}\mu^{+}\mu^{-}$ mass distribution, $m(p\pi^{-}\mu^{+}\mu^{-})$. Figure 1 shows the $p\pi^{-}\mu^{+}\mu^{-}$ mass distribution of the selected candidates in the Run 1 and Run 2 data sets, separated into the long-track and downstream-track $p\pi^{-}$ categories. The candidates comprise a mixture of $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decays, combinatorial background and a negligible contribution from other $b$-hadron decays. The largest single component of the latter arises from the decay $B^0 \rightarrow K_S^0 \mu^{+}\mu^{-}$, where the $K_S^0$ meson decays to $\pi^{+}\pi^{-}$ and is mis-reconstructed as a $\Lambda$ baryon.

Candidate yields
The yield of $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decays is determined by performing an unbinned extended maximum-likelihood fit to $m(p\pi^{-}\mu^{+}\mu^{-})$. In the fit, the signal is described by the sum of two modified Gaussian functions, one with a power-law tail on the low-mass side and the other with a power-law tail on the high-mass side of the distribution. The two Gaussian functions have a common peak position and width parameter but different tail parameters and relative fractions. The tail parameters and the relative fraction of the two functions are fixed from fits performed to simulated $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decays. The mean and width are determined from fits to $\Lambda_b^0 \rightarrow J/\psi \Lambda$ candidates in the data. A small correction is applied to the width parameter to account for a $q^{2}$ dependence of the resolution seen in the simulation. Combinatorial background is described by an exponential function, with a slope parameter that is determined from data. The parameters describing the signal and the background are determined separately for each data-taking period and for the long-track and downstream-track $p\pi^{-}$ categories.
The fits result in yields of $120 \pm 13$ ($175 \pm 15$) and $126 \pm 13$ ($189 \pm 16$) decays in the long (downstream) $p\pi^{-}$ category of the Run 1 and Run 2 data, respectively. These fits are used to determine the weights needed to subtract the background in the moment analysis. The yields are consistent with those expected based on the estimated signal efficiency, the recorded integrated luminosity and the scaling of the $\Lambda_b^0$ production cross-section with centre-of-mass energy.
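The way a mass fit yields background-subtraction weights can be sketched with a two-component toy, using the standard sPlot weight formula of Ref. [16]. This is a simplified illustration, not the fit used in the analysis: the signal is a plain Gaussian rather than the sum of two modified Gaussians, the shapes are held fixed, and the mass window, shape parameters and yields are hypothetical. The yields are obtained with a simple fixed-point iteration, which converges to the extended maximum-likelihood solution when the shapes are fixed.

```python
import math
import random

LO, HI = 5400.0, 5900.0                   # toy mass window in MeV/c^2 (assumption)
MU, SIGMA, SLOPE = 5620.0, 15.0, -0.003   # toy shape parameters (assumptions)

def signal_pdf(m):
    # Plain Gaussian; truncation by the mass window is negligible here.
    z = (m - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def background_pdf(m):
    # Exponential normalised on [LO, HI].
    return SLOPE * math.exp(SLOPE * (m - LO)) / (math.exp(SLOPE * (HI - LO)) - 1.0)

def generate(n_sig, n_bkg, seed=7):
    rng = random.Random(seed)
    sig = [rng.gauss(MU, SIGMA) for _ in range(n_sig)]
    span = math.exp(SLOPE * (HI - LO)) - 1.0
    # Inverse-CDF sampling of the truncated exponential.
    bkg = [LO + math.log(1.0 + rng.random() * span) / SLOPE for _ in range(n_bkg)]
    return sig + bkg

def fit_yields(fs, fb, n_iter=200):
    """Extended ML yields for fixed shapes via a fixed-point (EM) iteration."""
    ns = nb = 0.5 * len(fs)
    for _ in range(n_iter):
        den = [ns * s + nb * b for s, b in zip(fs, fb)]
        ns, nb = (sum(ns * s / d for s, d in zip(fs, den)),
                  sum(nb * b / d for b, d in zip(fb, den)))
    return ns, nb

def signal_sweights(fs, fb, ns, nb):
    """sPlot signal weights: (V_ss f_s + V_sb f_b) / (N_s f_s + N_b f_b)."""
    den = [ns * s + nb * b for s, b in zip(fs, fb)]
    # Elements of the inverse covariance matrix of the yields.
    a = sum(s * s / (d * d) for s, d in zip(fs, den))
    b = sum(s * bk / (d * d) for s, bk, d in zip(fs, fb, den))
    c = sum(bk * bk / (d * d) for bk, d in zip(fb, den))
    det = a * c - b * b
    v_ss, v_sb = c / det, -b / det
    return [(v_ss * s + v_sb * bk) / d for s, bk, d in zip(fs, fb, den)]

masses = generate(300, 700)
fs = [signal_pdf(m) for m in masses]
fb = [background_pdf(m) for m in masses]
n_sig, n_bkg = fit_yields(fs, fb)
sw = signal_sweights(fs, fb, n_sig, n_bkg)
```

At the fitted yields the signal weights sum to the signal yield, which is the property that allows weighted sums over all candidates to stand in for sums over signal decays. Weights are negative for candidates far from the peak, so the weighted sample size is smaller than the number of candidates.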

Angular efficiency
The trigger, reconstruction and selection process distort the measured angular distribution of the $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decays. The largest distortions are found to be the result of kinematic requirements in the reconstruction, most notably due to an implicit momentum threshold applied by requiring that the muons traverse the detector and reach the muon system. The angular efficiency is parameterised in six dimensions, taking into account the correlations between the different angles and the $q^{2}$-dependence of the angular efficiency, as
\begin{align} \varepsilon(\vec{\Omega}, q^{2}) = \sum_{ijmnrs} c_{ijmnrs}\, L_i(\cos\theta)\, L_j(\cos\theta_l)\, L_m(\phi_l/\pi)\, L_n(\cos\theta_b)\, L_r(\phi_b/\pi)\, L_s(q^{2}) , \end{align}
where $L_t(x)$ denotes a Legendre polynomial of order $t$ in variable $x$, and the $q^{2}$ range considered has been rescaled linearly between $-1$ and $+1$. The coefficients $c_{ijmnrs}$ are determined by performing a moment analysis of $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decays simulated according to a phase-space model. The simulated decays are weighted such that they are uniformly distributed in $q^{2}$ and in the five angles, after which the angular distribution of the selected decays is proportional to the efficiency.
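The determination of Legendre coefficients from uniformly generated simulation can be illustrated in one dimension. The sketch below is a toy under stated assumptions, not the six-dimensional model of the analysis: the variable `x` stands for any one of the rescaled variables, and the efficiency shape `toy_efficiency`, the sample size and the maximum order are hypothetical. It relies only on the orthogonality of the Legendre polynomials on $[-1, 1]$.

```python
import random

def legendre(t, x):
    """Legendre polynomial L_t(x), evaluated with the Bonnet recurrence."""
    if t == 0:
        return 1.0
    prev, curr = 1.0, x
    for k in range(1, t):
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
    return curr

def toy_efficiency(x):
    # Hypothetical efficiency shape, for illustration only.
    return 0.5 + 0.3 * x

def efficiency_coefficients(max_order, n_gen=200_000, seed=3):
    """Moments of the selected sample, for events generated flat in x."""
    rng = random.Random(seed)
    sums = [0.0] * (max_order + 1)
    for _ in range(n_gen):
        x = rng.uniform(-1.0, 1.0)            # flat generation
        if rng.random() < toy_efficiency(x):  # toy selection
            for t in range(max_order + 1):
                sums[t] += legendre(t, x)
    # c_t = (2t + 1)/2 * integral of eps(x) L_t(x) dx; the sum over selected
    # events divided by n_gen estimates 1/2 * integral of eps(x) L_t(x) dx.
    return [(2 * t + 1) * s / n_gen for t, s in enumerate(sums)]

def efficiency_model(x, coeffs):
    return sum(c * legendre(t, x) for t, c in enumerate(coeffs))

coeffs = efficiency_coefficients(max_order=3)
```

For this linear toy efficiency only the zeroth- and first-order coefficients are expected to be significantly non-zero. Higher-order coefficients fluctuate around zero with a size set by the number of generated events, which is the one-dimensional analogue of the overfitting concern when many terms are allowed.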
To achieve a good parameterisation of the efficiency, a large number of terms is required. The number of terms is reduced using an iterative approach. As a first step, the efficiency projection of each variable is parameterised independently using the sum of Legendre polynomials of up to eighth order. As a second step, correlations between pairs of angles and between individual angles and $q^{2}$ are accounted for in turn. These corrections are parameterised by sums involving pairs of polynomials that run up to sixth order in each variable. As a final step, a six-dimensional correction is applied allowing for polynomials of up to first order in the angles and $q^{2}$. Before each step, the simulated decays are corrected to remove the effects parameterised in the previous step. Small differences in the efficiency to reconstruct $p$/$\bar{p}$ and $\pi^{+}$/$\pi^{-}$ are neglected. The angular efficiency model is cross-checked in data using $\Lambda_b^0 \rightarrow J/\psi \Lambda$ and $B^0 \rightarrow J/\psi K_S^0$ decays, with $J/\psi \rightarrow \mu^{+}\mu^{-}$. These decays have a similar topology to the $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ decay and well-known angular distributions. For the $B^0 \rightarrow J/\psi K_S^0$ decay, where the $K_S^0$ decays to $\pi^{+}\pi^{-}$, the parameter $K_1$ is one-half and the remaining observables are equal to zero. The angular distribution of the $\Lambda_b^0 \rightarrow J/\psi \Lambda$ decay is compatible with the measurements in Refs. [33-35].

Results
The angular observables are obtained using a moment analysis of the angular distribution, weighting candidates as described in Sec. 6 to account for their detection efficiency. Background is subtracted using weights obtained with the sPlot technique from the fits described in Sec. 5. The weights used to correct for the efficiency and subtract the background are determined separately for each data-taking period and for the long-track and downstream-track $p\pi^{-}$ categories. The $K_i$ parameters are then determined from a data set that combines the two reconstruction categories. As the polarisation of the $\Lambda_b^0$ baryons at production may vary with centre-of-mass energy between the Run 1 data, collected at $\sqrt{s} = 7$ and 8 TeV, and the Run 2 data, collected at $\sqrt{s} = 13$ TeV, these two data sets are initially treated independently. The results for the two data-taking periods are given in Appendix C. The statistical uncertainties on the various $K_i$ parameters are determined using a bootstrapping technique [36]. In each step of the bootstrap, the process of subtracting the background and the weighting of the candidates is repeated.
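A bootstrap estimate of the statistical uncertainty on a weighted moment can be sketched as follows. This is a simplified illustration under stated assumptions: the per-candidate weights are resampled together with the candidates rather than being re-derived from a new mass fit in each step, as is done in the analysis, and the toy sample, the number of bootstrap replicas and the function names are hypothetical.

```python
import math
import random

def weighted_moment(values, weights):
    """sum_n w_n g(Omega_n) / sum_n w_n, with values[n] = g(Omega_n)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def bootstrap_uncertainty(values, weights, n_boot=300, seed=11):
    """Spread of the moment over samples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(weighted_moment([values[i] for i in idx],
                                         [weights[i] for i in idx]))
    mean = sum(estimates) / n_boot
    var = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return math.sqrt(var)

# Toy input: g(Omega_n) = cos(theta_n), flat in [-1, 1], with unit weights.
toy_rng = random.Random(5)
toy_values = [toy_rng.uniform(-1.0, 1.0) for _ in range(2000)]
toy_weights = [1.0] * len(toy_values)
sigma_boot = bootstrap_uncertainty(toy_values, toy_weights)
sigma_expected = math.sqrt((1.0 / 3.0) / len(toy_values))  # analytic, flat case
```

For this unweighted flat toy the bootstrap spread can be compared with the analytic expectation. The benefit of the bootstrap is in the realistic case, where the weights are correlated with the fit and no simple analytic expression is available.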
A $\chi^2$ comparison of the results from the two data-taking periods, taking into account the correlations between the observables, yields a $\chi^2$ of 35.0 with 33 degrees of freedom. This indicates good agreement between the two data sets and suggests that the production polarisation is consistent for the centre-of-mass energies studied. The Run 1 and Run 2 data samples are therefore combined and the observables are determined on the combined sample. The results are given in Table 1. The correlation between the angular observables is presented in Appendix D. Figure 2 shows the one-dimensional angular projections of $\cos\theta_l$, $\cos\theta_b$, $\cos\theta$, $\phi_l$ and $\phi_b$ for the background-subtracted candidates. The data are described well by the product of the angular distributions obtained from the moment analysis and the efficiency model. Figure 3 compares the measured observables with their corresponding SM predictions, obtained from the EOS software [37] using the values of the $\Lambda_b^0$ production polarisation measured in Ref. [33]. The values of the observables $K_{11}$ to $K_{34}$ are consistent with zero. This is expected from measurements of the angular distribution of the decay $\Lambda_b^0 \rightarrow J/\psi \Lambda$ by CMS [35] and LHCb [33], which indicate that the production polarisation of $\Lambda_b^0$ baryons is small in $pp$ collisions at 7 and 8 TeV. The measurements are consistent with the SM predictions for $K_1$ to $K_{10}$. The largest discrepancy is seen in $K_6$, which is 2.6 standard deviations from the SM prediction. The angular observables result in an angular distribution that is not positive for all values of the angles. To obtain a physical angular distribution, $K_6$ has to move closer to its SM value. The measured $K_i$ values are also consistent with the values predicted by new physics scenarios favoured by global fits to data from $b$ to $s$ quark transitions [3-7]. These new physics scenarios result in only a small change of $K_1$ to $K_{10}$ in the low-recoil region.
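The compatibility test between the two data-taking periods can be written out as follows. This is a sketch of the standard construction, assuming the two samples are statistically independent so that their covariance matrices add. The symbols $\Delta\vec{K}$, $C^{\rm Run\,1}$ and $C^{\rm Run\,2}$ are introduced here for illustration, and the reading of the 33 degrees of freedom as the 34 observables less the normalisation constraint $2K_1 + K_2 = 1$ is an interpretation rather than a statement from the text.

```latex
\begin{align}
\Delta\vec{K} &= \vec{K}^{\rm Run\,1} - \vec{K}^{\rm Run\,2}, \nonumber\\
\chi^{2} &= \Delta\vec{K}^{T} \left( C^{\rm Run\,1} + C^{\rm Run\,2} \right)^{-1} \Delta\vec{K}, \nonumber\\
n_{\rm dof} &= 34 - 1 = 33 . \nonumber
\end{align}
```

A $\chi^2$ distribution with 33 degrees of freedom has a mean of 33 and a standard deviation of $\sqrt{66} \approx 8.1$, so the observed value of 35.0 lies about a quarter of a standard deviation above the expectation.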
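The $K_i$ observables can be combined to determine the angular asymmetries
\begin{align} A_{FB}^{l} &= \tfrac{3}{2} K_3 = -0.39 \pm 0.04 \pm 0.01, \nonumber\\ A_{FB}^{h} &= K_4 + \tfrac{1}{2} K_5 = -0.30 \pm 0.05 \pm 0.02, \nonumber\\ A_{FB}^{lh} &= \tfrac{3}{4} K_6 = +0.25 \pm 0.04 \pm 0.01, \nonumber \end{align}
where the first uncertainties are statistical and the second are the systematic uncertainties that are discussed in the following section. The forward-backward asymmetries $A_{FB}^{l}$ and $A_{FB}^{h}$ are in good agreement with the SM predictions. The asymmetry $A_{FB}^{lh}$, which is proportional to $K_6$, is 2.6 standard deviations from its SM prediction. The value of $A_{FB}^{h}$ is consistent with that measured in Ref. [10]. The value of $A_{FB}^{l}$ is not comparable due to an inconsistency in the definition of $\theta_l$ in that reference (see footnote 4).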
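The link between the lepton-side asymmetry and a single angular observable can be made explicit with a short worked equation. It is a sketch under the assumption that the first three angular terms are $\sin^2\theta_l$, $\cos^2\theta_l$ and $\cos\theta_l$, so that integrating the full distribution over the other four angles leaves a single-angle projection in $\cos\theta_l$; the full set of angular terms is not reproduced here.

```latex
\begin{align}
\frac{d\Gamma}{d\cos\theta_l} &= \frac{3}{2}\left( K_1 \sin^{2}\theta_l + K_2 \cos^{2}\theta_l + K_3 \cos\theta_l \right), \nonumber\\
A_{FB}^{l} &= \int_{0}^{1} \frac{d\Gamma}{d\cos\theta_l}\, d\cos\theta_l - \int_{-1}^{0} \frac{d\Gamma}{d\cos\theta_l}\, d\cos\theta_l = \frac{3}{2} K_3 . \nonumber
\end{align}
```

The terms even in $\cos\theta_l$ cancel in the difference, and the odd term contributes $\frac{3}{2} K_3 \left( \frac{1}{2} + \frac{1}{2} \right)$. Under this assumption the quoted $A_{FB}^{l} = -0.39$ would correspond to $K_3 \approx -0.26$.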

Systematic uncertainties
The angular observables may be sensitive to systematic effects arising from imperfect modelling of either the angular efficiency or the $m(p\pi^{-}\mu^{+}\mu^{-})$ distribution. Where possible, systematic uncertainties are estimated using pseudoexperiments. These are generated from a systematically varied model and the observables are then estimated using the nominal analysis, neglecting the variation in the generation. The sources of systematic uncertainty considered are listed in Table 2. In general, systematic uncertainties are found to be small compared to the statistical uncertainties on the measurements.
The largest systematic uncertainties in modelling the angular efficiency are from the size of the simulated data samples and the order of the Legendre polynomials used to parameterise the efficiency. The former is determined by bootstrapping the simulated sample and re-evaluating the model. The latter is estimated by increasing the order of polynomials used in the efficiency parameterisation by up to two orders. By default, the efficiency model is chosen to have the minimum number of terms needed to get a good description of both the simulated and the data control samples. Increasing the number of terms further results in overfitting of statistical fluctuations in the simulated data used to determine the efficiency model (due to the limited size of the simulated data set). A systematic uncertainty due to the modelling of the data by the simulation is estimated by varying the tracking and muon identification efficiencies, and by applying an additional correction to the $p_T$ and $\eta$ spectra of the $\Lambda_b^0$ baryons. The impact of neglecting angular resolution when determining the angular observables is estimated by smearing pseudoexperiments according to the resolution determined using simulated data. The angular resolution is poorest for $\theta$, $\theta_b$ and $\phi_b$ in the downstream $p\pi^{-}$ category, at around 90 mrad for $\theta$ and $\theta_b$ and 150 mrad for $\phi_b$.
Footnote 4: Under the definition of $\theta_l$ used in Ref. [10], $A_{FB}^{l}$ measured the asymmetry difference between $\Lambda_b^0$ and $\bar{\Lambda}_b^0$ decays rather than the average of the asymmetries.
Table 1 (partial): $K_{17} = -0.000 \pm 0.120 \pm 0.022$; $K_{33} = 0.022 \pm 0.060 \pm 0.009$; $K_{34} = 0.060 \pm 0.058 \pm 0.009$.
In the calculation of the angular basis, the crossing angle of the LHC beams is neglected. The impact of this is estimated by generating pseudoexperiments with the correct crossing angle and neglecting this when the angular observables are determined.
The systematic uncertainty due to modelling the shape of the signal mass distribution is small. The main contribution to this uncertainty comes from the modelling of the tails of the signal mass distribution. The factorisation of the mass model and the angular distribution, which is a requirement of the sPlot technique, is also tested and results in a negligible systematic uncertainty.
Figure 2: The background is subtracted from the data but no efficiency correction is applied. The projection of each angular distribution obtained from the moment analysis, multiplied by the efficiency distribution, is superimposed. The large variation in $\phi_l$ is primarily due to the angular acceptance.

Summary
An analysis of the angular distribution of the decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ in the dimuon invariant mass squared range $15<q^{2}<20\, GeV^2/c^4$ is reported. Using data collected with the LHCb detector between 2011 and 2016, the full basis of angular observables is measured for the first time. From the measured observables, the lepton-side, hadron-side and combined forward-backward asymmetries of the decay are determined to be
\begin{align} A_{FB}^{l}&= -0.39 \pm 0.04\,\rm{stat} \pm 0.01\, \rm{syst}, \nonumber\\ A_{FB}^{h}&= -0.30 \pm 0.05\,\rm{stat} \pm 0.02\, \rm{syst}, \nonumber\\ A_{FB}^{lh}&= +0.25 \pm 0.04\,\rm{stat} \pm 0.01\, \rm{syst}. \nonumber \end{align}
The results presented here supersede the results for angular observables in Ref. [10] (see discussion in Sec. 7). The measured angular observables are compatible with the SM predictions obtained using the EOS software [37], where the $\Lambda_b^0$ production polarisation is set to the value obtained by the LHCb collaboration in $pp$ collisions at a centre-of-mass energy of 7 TeV [33].

Appendices
A Angular basis
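The angular distribution of the decay $\Lambda_b^0 \rightarrow \Lambda \mu^{+} \mu^{-}$ is described by five angles, $\theta$, $\theta_l$, $\phi_l$, $\theta_b$ and $\phi_b$, defined with respect to the normal vector
\begin{align} \hat{n} = \frac{\hat{p}_{\rm beam}^{\,(\rm lab)} \times \hat{p}_{\Lambda_b^0}^{\,(\rm lab)}}{\left| \hat{p}_{\rm beam}^{\,(\rm lab)} \times \hat{p}_{\Lambda_b^0}^{\,(\rm lab)} \right|} , \end{align}
where $\hat{p} = \vec{p}/|\vec{p}|$ and the parentheses refer to the rest frame the momentum is measured in. The angle $\theta$ is defined by the angle between $\hat{n}$ and the $\Lambda$ baryon momentum in the $\Lambda_b^0$ baryon rest frame, i.e.
\begin{align} \cos\theta = \hat{n} \cdot \hat{p}_{\Lambda}^{\,(\Lambda_b^0)} . \end{align}
The decay of the $\Lambda$ baryon and the dimuon system can be described by coordinate systems $\{\hat{x}_b, \hat{y}_b, \hat{z}_b\}$ and $\{\hat{x}_l, \hat{y}_l, \hat{z}_l\}$, respectively. The angles $\theta_b$ and $\phi_b$ ($\theta_l$ and $\phi_l$) are the polar and azimuthal angle of the proton ($\mu^{+}$) in the $\Lambda$ baryon (dimuon) rest frame. The angles are defined by
\begin{align} \cos\theta_b &= \hat{z}_b \cdot \hat{p}_b , & \cos\phi_b &= \hat{x}_b \cdot \hat{p}_{\perp b} , & \sin\phi_b &= \hat{y}_b \cdot \hat{p}_{\perp b} , \nonumber\\ \cos\theta_l &= \hat{z}_l \cdot \hat{p}_l , & \cos\phi_l &= \hat{x}_l \cdot \hat{p}_{\perp l} , & \sin\phi_l &= \hat{y}_l \cdot \hat{p}_{\perp l} , \nonumber \end{align}
where $\hat{p}_b$ ($\hat{p}_l$) is the direction of the proton ($\mu^{+}$) and $\hat{p}_{\perp b}$ ($\hat{p}_{\perp l}$) is a unit vector corresponding to the component perpendicular to the $\hat{z}_b$ ($\hat{z}_l$) axis. For the $\bar{\Lambda}_b^0$ decay, the angular variables are transformed such that $\theta_l \rightarrow \pi - \theta_l$, $\phi_l \rightarrow \pi - \phi_l$ and $\phi_b \rightarrow -\phi_b$. This ensures that, in the absence of CP-violating effects, the $K_i$ observables are the same for $\Lambda_b^0$ and $\bar{\Lambda}_b^0$ decays.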
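The construction of the normal vector and of $\cos\theta$ can be sketched numerically. The example below is an illustration under stated assumptions: the beam is taken along $+z$, as assumed in the analysis, and the four-momenta, the masses and the helper names are hypothetical toy inputs in GeV with $c = 1$. The $\Lambda$ momentum is boosted into the $\Lambda_b^0$ rest frame before the angle is evaluated.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def four_vector(mass, p):
    """(E, px, py, pz) for a particle of given mass and three-momentum."""
    return (math.sqrt(mass * mass + dot(p, p)),) + tuple(p)

def boost_to_rest_frame(p4, parent4):
    """Boost p4 = (E, px, py, pz) into the rest frame of parent4."""
    beta = tuple(x / parent4[0] for x in parent4[1:])
    beta2 = dot(beta, beta)
    if beta2 == 0.0:
        return tuple(p4)  # parent already at rest
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    e, p = p4[0], p4[1:]
    bp = dot(beta, p)
    coeff = (gamma - 1.0) * bp / beta2 - gamma * e
    return (gamma * (e - bp),) + tuple(pi + coeff * bi for pi, bi in zip(p, beta))

def cos_theta(lambda_b4, lambda4, beam=(0.0, 0.0, 1.0)):
    """cos(theta) = n_hat . p_hat(Lambda), with Lambda in the Lambda_b rest frame."""
    n_hat = unit(cross(beam, lambda_b4[1:]))
    lam_rest = boost_to_rest_frame(lambda4, lambda_b4)
    return dot(n_hat, unit(lam_rest[1:]))

# Toy kinematics (hypothetical values).
lb4 = four_vector(5.62, (3.0, 4.0, 60.0))
lam4 = four_vector(1.116, (1.5, 1.0, 20.0))
n_hat = unit(cross((0.0, 0.0, 1.0), lb4[1:]))
c_theta = cos_theta(lb4, lam4)
```

Since $\hat{n}$ is perpendicular to the $\Lambda_b^0$ momentum, the boost along that momentum leaves the component of the $\Lambda$ momentum along $\hat{n}$ unchanged; only its normalisation by the rest-frame momentum magnitude depends on the boost.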
C Results separated by data-taking period
Tables 3 and 4 show the values of the observables for each of the two data-taking periods. Table 3 shows the values of the observables combining the 2011 data, collected at $\sqrt{s} = 7$ TeV, and the 2012 data, collected at $\sqrt{s} = 8$ TeV. Table 4 shows the values of the observables in the Run 2 data, collected at $\sqrt{s} = 13$ TeV.
Table 4: Measured values for the angular observables from the Run 2 data, combining the results of the moments obtained from the candidates reconstructed in the long-track and downstream-track $p\pi^{-}$ categories. The first and second uncertainties are statistical and systematic, respectively.

D Correlation matrices
Figures 4 and 5 show the statistical correlations between the angular observables, determined using bootstrapped samples. The correlation coefficients are typically small but can be as large as 30-40% between pairs of observables. The observables $K_1$ and $K_2$ are fully anticorrelated due to the normalisation of the observables, which requires $2K_1 + K_2 = 1$. The correlation matrices in numerical form are attached as supplementary material to this article.