Angular analysis of the rare decay $B_s^0\to\phi\mu^+\mu^-$

An angular analysis of the rare decay $B^0_s\rightarrow\phi\mu^+\mu^-$ is presented, using proton-proton collision data collected by the LHCb experiment at centre-of-mass energies of $7$, $8$ and $13~\rm{TeV}$, corresponding to an integrated luminosity of $8.4~\rm{fb}^{-1}$. The observables describing the angular distributions of the decay $B^0_s\rightarrow\phi\mu^+\mu^-$ are determined in regions of $q^2$, the square of the dimuon invariant mass. The results are consistent with Standard Model predictions.


Introduction
Transitions of a b quark to an s quark and a pair of oppositely charged leptons are forbidden at tree level in the Standard Model (SM) and only proceed via higher-order electroweak (loop) diagrams. These transitions constitute powerful probes for New Physics (NP) contributions beyond the SM that can appear in competing diagrams and significantly affect branching fractions and angular distributions of b → s + − decays. Recent studies of b → s + − decays have observed tensions with SM predictions in measurements of branching fractions [1][2][3][4][5][6], angular observables [7-13] and tests of lepton universality [13][14][15][16][17][18][19][20][21]. One of the most significant deviations from SM expectations is observed in the determination of the branching fraction of B 0 s → φµ + µ − decays [6]. 1 The measured branching fraction is found to be 3.6 standard deviations (σ) below a precise SM prediction [22][23][24][25][26] in the squared dimuon mass (q 2 ) region 1.1 < q 2 < 6.0 GeV 2 /c 4 . Angular analyses of b → s + − decays provide information complementary to branching fraction measurements, allowing to probe the operator structure of potential NP contributions. An angular analysis of the decay B 0 s → φµ + µ − [4], using proton-proton (pp) collision data corresponding to 3 fb −1 recorded by the LHCb experiment during 2011-2012, found the angular distributions to be compatible with SM predictions.
This paper presents an updated angular analysis of B 0 s → φµ + µ − decays, where the φ meson is reconstructed in the K + K − final state, using pp collisions recorded by the LHCb experiment corresponding to a total integrated luminosity of 8.4 fb −1 . The data were collected at centre-of-mass energies of 7 TeV (2011), 8 TeV (2012) and 13 TeV (2016-2018) during the LHC Run 1 and Run 2, respectively. For the purposes of this analysis, the data are split according to the 2011-2012 (3 fb −1 ), 2016 (1.7 fb −1 ) and 2017-2018 (3.7 fb −1 ) data-taking periods. The higher bb production cross-section in Run 2 [27,28] yields an approximate four-fold increase in the total number of produced B 0 s mesons compared to the Run 1 data. The criteria used to select candidates in this analysis are identical to those of Ref. [6], with an adapted q 2 binning scheme.
Neglecting the natural width of the φ meson, the B 0 s → φ(→ K + K − )µ + µ − decay rate depends on q 2 , three decay angles, θ l , θ K , and φ, and the decay time of the B 0 s meson [29]. The angle θ l (θ K ) is defined as the angle of the µ − (K − ) with respect to the direction of flight of the B 0 s meson in the µ + µ − (K + K − ) centre-of-mass frame, and φ as the angle between the µ + µ − and the K + K − planes in the B 0 s meson centre-of-mass frame. As the decay flavour of the B 0 s meson cannot be determined from the flavour-symmetric final state, the same angular definition is used for both B 0 s and B 0 s decays. The untagged CP -averaged angular decay rate, Γ + Γ, is measured integrated over the B 0 s decay time and is given for a specific q 2 region by 1 d(Γ + Γ)/dq 2 d 3 (Γ + Γ) d cos θ l d cos θ K dφ = 9 32π 3 4 (1 − F L ) sin 2 θ K (1 + 1 3 cos 2θ l ) + F L cos 2 θ K (1 − cos 2θ l ) + S 3 sin 2 θ K sin 2 θ l cos 2φ + S 4 sin 2θ K sin 2θ l cos φ + A 5 sin 2θ K sin θ l cos φ + 4 3 A CP FB sin 2 θ K cos θ l + S 7 sin 2θ K sin θ l sin φ + A 8 sin 2θ K sin 2θ l sin φ + A 9 sin 2 θ K sin 2 θ l sin 2φ , (1) where the angular observables F L and S 3,4,7 are CP averages, and A CP FB and A 5,8,9 are CP asymmetries [30,31]. The presence of CP asymmetries in Eq. 1 is due to the need to use identical angular definitions for the B 0 s and B 0 s modes [29,31]. Of particular interest are the T -odd CP asymmetries A 8 and A 9 , which are predicted to be close to zero in the SM, but can be large in the presence of NP contributions [31]. As the decay flavour of the B 0 s meson is unknown, the CP -averaged observable S 5 (P 5 ), which has received a lot of attention in the study of B 0 → K * 0 µ + µ − decays [8], cannot be accessed by this analysis.

Detector and simulation
The LHCb detector [32,33] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a siliconstrip vertex detector surrounding the pp interaction region [34], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [35,36] placed downstream of the magnet. The tracking system provides a measurement of the momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary pp collision vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [37]. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [38].
The online event selection is performed by a trigger system [39]. In this analysis, an initial hardware stage uses information from the muon system to require at least one muon with significant p T in the event. Events passing the hardware trigger enter the software trigger, where a full event reconstruction is applied. At this stage, further requirements are placed on the kinematics of the muon candidates and on the topology of the signal candidate.
Simulated samples are used to determine the effect of reconstruction and selection on the angular distributions of the signal candidates, as well as to estimate expected signal yields and contamination from specific background processes. The pp collisions are simulated using Pythia [40] with a specific LHCb configuration [41]. Decays of unstable particles are described by EvtGen [42], in which final-state radiation is generated using Photos [43]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [44] as described in Ref. [45]. Residual mismodelling of the particle identification performance, the p T spectrum of the B 0 s mesons, the track multiplicity and the efficiency of the hardware trigger is corrected using high-yield control samples from data.

Selection of signal candidates
All tracks in the K + K − µ + µ − final-state are required to have significant χ 2 IP with respect to any PV, where χ 2 IP is defined as the difference in the vertex-fit χ 2 of a given PV reconstructed with and without the track being considered. The final-state particles are further required to be well identified using particle identification information. The B 0 s decay vertex, determined by fitting the four final-state tracks, is required to be of good quality and to be significantly displaced from any PV in the event. The angle between the vector connecting the associated PV with the B 0 s decay vertex and the momentum of the B 0 s candidate (θ DIRA ) is required to be small. The associated PV is defined as that which fits best to the flight direction of the B 0 s candidate. Candidates are accepted if the invariant reconstructed K + K − µ + µ − mass is in the range 5270 < m(K + K − µ + µ − ) < 5700 MeV/c 2 and the invariant mass of the K + K − system is within 12 MeV/c 2 of the known φ mass [46]. Candidates are further required to have a q 2 value in the range 0.1 < q 2 < 18.9 GeV 2 /c 4 . The φ decays dominate the experimental q 2 spectrum in the q 2 regions of 0.98 < q 2 < 1.1 GeV 2 /c 4 , 8 < q 2 < 11 GeV 2 /c 4 and 12.5 < q 2 < 15 GeV 2 /c 4 , respectively. These q 2 regions are therefore excluded from the signal selection but B 0 s → J/ψφ candidates in the 8 < q 2 < 11 GeV 2 /c 4 region are retained as a control mode to develop selection criteria, validate fit behaviour and derive corrections to the simulation.
Background originating from a random combination of tracks (combinatorial background) is reduced using a boosted decision tree (BDT) [47] classifier trained with the AdaBoost algorithm [48] as implemented in the TMVA package [49]. The BDT classifier is trained on data and its performance verified using standard cross-validation techniques [50]. The upper mass sideband, defined as m(K + K − µ + µ − ) > 5567 MeV/c 2 , is used as a proxy for the combinatorial background and enriched in background by relaxing the requirement on the invariant mass of the K + K − system from 12 MeV/c 2 to 50 MeV/c 2 around the known φ mass. As a signal proxy, a sample of B 0 s → J/ψφ candidates from data in a 50 MeV/c 2 mass range around the known B 0 s mass [46] is used, for which background contributions have been statistically subtracted [51] using the invariant reconstructed K + K − µ + µ − mass as the separating variable.
The classifier combines the transverse momentum of the B 0 s , the angle θ DIRA , the fit quality of the B 0 s vertex (vertex-fit χ 2 ), its displacement from the associated PV and the χ 2 IP and particle identification information of all final-state tracks. The selection criterion on the BDT output is chosen according to the figure of merit N sig / N sig + N bkg , where N sig (N bkg ) is the expected number of signal (background) events in the signal region. With respect to the previously described selection criteria, this requirement results in a signal efficiency of 96% and a background rejection of 96%, where the latter considers only contributions from combinatorial background.
Decays of b hadrons where one or more of the final-state particles have been misidentified constitute another important background source, referred to as peaking backgrounds. Contributions include decays of the form B 0 s → J/ψφ, B 0 s → ψ(2S)φ, B 0 → J/ψK * 0 and B 0 → ψ(2S)K * 0 , where a hadron is misidentified as a muon and vice versa. Once misidentified, these decays can contaminate the signal q 2 regions. To further suppress these contributions, more stringent particle identification requirements are placed on candidates where the invariant mass of the µ ± K ∓ system under the dimuon mass hypothesis is close to the known J/ψ or ψ(2S) mass [46].
Other sources of peaking background include Λ 0 b → pK − µ + µ − decays, where the proton is misidentified as a kaon, and B 0 → K * 0 (→ K + π − )µ + µ − decays, where the pion is misidentified as a kaon. The Λ 0 b → pK − µ + µ − decay is additionally suppressed by applying s → φµ + µ − candidates integrated over the 0.1 < q 2 < 0.98 GeV 2 /c 4 , 1.1 < q 2 < 8 GeV 2 /c 4 , 11.0 < q 2 < 12.5 GeV 2 /c 4 and 15.0 < q 2 < 18.9 GeV 2 /c 4 regions for the data-taking periods 2011-2012 (top left), 2016 (top right), and 2017-2018 (bottom). The data are overlaid with the PDF used to describe the m(K + K − µ + µ − ) spectrum, fitted separately for each data set. more stringent particle identification criteria if the invariant mass of a candidate under the relevant misidentification hypothesis is close to the known Λ 0 b mass [46]. No single source of peaking background is found to contribute more than 0.5% of the total signal yield after all selection criteria are applied. Peaking background contributions are therefore neglected in the fit model and a systematic uncertainty is assigned to account for potential residual background pollution. Figure 1 shows the m(K + K − µ + µ − ) distribution for all candidates passing the selection, integrated over the 0.1 < q 2 < 18.9 GeV 2 /c 4 region for the separate data sets, excluding the q 2 regions contaminated by the resonant B 0 The data are overlaid with the fitted probability density function (PDF) described in Sec. 4. Signal yields of 408 ± 23, 402 ± 23 and 1120 ± 40 are found for the 2011-2012, 2016 and 2017-2018 data sets, where the uncertainties are statistical only.

Angular analysis
The angular observables are determined using an unbinned maximum likelihood fit to the invariant K + K − µ + µ − mass distribution and the three decay angles, θ l , θ K , and φ. In the q 2 region below 12.5 GeV 2 /c 4 , the fit is performed separately in narrow q 2 regions of around 2 GeV 2 /c 4 width and in an additional wide q 2 region defined as [1.1, 6.0] GeV 2 /c 4 . Above 15 GeV 2 /c 4 , a single wide region is used, defined as [15.0, 18.9] GeV 2 /c 4 . A finer binning scheme compared to Ref.
[4] is chosen to maximise sensitivity to potential short-distance NP contributions whilst ensuring stable fit behaviour.
The m(K + K − µ + µ − ) distribution is included in the fit to improve the separation power between signal and background. The signal component is modelled in m(K + K − µ + µ − ) by a sum of two Gaussian functions with a common mean and power-law tails towards the upper or lower mass side [52]. The parameters describing the power-law tails are determined using simulated B 0 s → J/ψφ events. The parameters describing the widths and the mean of the Gaussian functions are fixed in the signal mode to the values from a fit to B 0 s → J/ψφ candidates in data. An additional q 2 -dependent scaling factor is determined from simulation and applied to the widths of the Gaussian distributions to account for the q 2 dependence of the m(K + K − µ + µ − ) invariant-mass resolution. The angular distribution for the signal candidates is parameterised using Eq. 1. The combinatorial background in the m(K + K − µ + µ − ) distribution is described using an exponential function and in the angular distributions using a product of first-order Chebyshev polynomials. The factorisation of the background angular distributions is validated using data candidates selected in the upper mass sideband. The fraction of B 0 s → K + K − µ + µ − decays with the K + K − system in an S-wave configuration, F S , is expected to be at the level of 1-2% [53-55]. These contributions are therefore not modelled in the fit and a systematic uncertainty is assigned to account for this choice.
The selection and reconstruction can distort the angular and q 2 distributions observed in data. These effects are described by an angular acceptance, (cos θ , cos θ K , φ, q 2 ). The acceptance is parameterised using a product of Legendre polynomials P i of order i according to (cos θ , cos θ K , φ, q 2 ) = k,l,m,n c klmn P k (cos θ )P l (cos θ K )P m (φ)P n (q 2 ) , where the coefficients c klmn are determined on a large sample of simulated B 0 s → φµ + µ − events by exploiting the orthogonality of the Legendre polynomials. Given that the acceptance is parameterised in terms of the key degrees of freedom used in the decay description (i.e. the three angles and q 2 ) there is minimal dependency on the model used to simulate the events. The orders used to model the efficiency are k ≤ 4, l ≤ 2 (l ≤ 4 in Run 2), m ≤ 6 and n ≤ 7 in cos θ , cos θ K , φ and q 2 , respectively. Where different sets of acceptance orders give a similar description of the acceptance function, the set of lowest orders is chosen. Given the flavour-symmetric final state, only even orders are considered in the description of the three decay angles. The choice of orders used to describe the angular acceptance is assessed as a source of systematic uncertainty.
In the narrow q 2 regions, the PDF describing the signal candidates is constructed from the product of the acceptance function evaluated at the median of the q 2 region and the signal fit model. For the wide q 2 regions, the acceptance is taken into account by weighting each event by the inverse efficiency. The shape of the angular acceptance is found to vary according to the data-taking conditions. The different trigger thresholds during the 2016 and 2017-2018 data-taking periods require the Run 2 data to be further separated. The angular acceptance is therefore derived separately for the 2011-2012, 2016 and 2017-2018 data sets. The data are split according to these periods and a simultaneous fit is performed. In the fit, the angular observables and angular background parameters are shared across the three data sets. The sharing of the angular background parameters improves the fit behaviour and the resulting small bias due to this choice is added as a systematic uncertainty. All other nuisance parameters are determined independently for each data set. In order to avoid experimenter's bias, all decisions regarding the fit strategy and candidate selection were made before the results were examined.
Pseudoexperiments, generated using the results of the best fit to data, are used to assess the bias and coverage of the simultaneous fit. The majority of observables have a bias of less than 10% of the statistical uncertainty. The observables S 3 and S 4 in the q 2 region [4.0, 6.0] GeV 2 /c 4 and A CP FB in the q 2 region [0.1, 0.98] GeV 2 /c 4 exhibit a fit bias at the level of 15% of the statistical uncertainty. An additional systematic uncertainty equal to the size of the fit bias is assigned for all observables and the statistical uncertainty is corrected to account for any under-or over-coverage, which is at the level of 14% or less.
The angular acceptance corrections for each data set are validated using both fits to B 0 s → J/ψφ candidates and fits to simulated B 0 s → φµ + µ − candidates, where the latter are generated according to a physics model using inputs from Ref. [56]. The angular observables extracted from fits to B 0 s → J/ψφ candidates are in good agreement with previous measurements [54,55]. In addition, the angular observables extracted from fits to simulated events are in good agreement with the values used in their generation.

Results
The angular distributions for the combined 2011-2012, 2016 and 2017-2018 data set are shown in Fig. 2 for all candidates in the [1.1, 6] GeV 2 /c 4 q 2 region and for candidates within ±50 MeV/c 2 of the known B 0 s mass. The data are overlaid with the projection of the fitted PDF, combined across the data sets. The fit projections for all q 2 regions and individual data sets are provided in Appendix A.
The numerical results for the angular observables are given in Table 1, including systematic uncertainties as discussed in Sec. 6. The linear-correlation matrices for the angular observables are provided in Tables 3, 4 and 5 in Appendix B. A graphical comparison of the results with the SM predictions [23-26] is shown in Fig. 3. Overall, the results are in good agreement with the SM predictions, with the CP asymmetries compatible with zero as expected in the SM. For the CP averages, a mild tension in F L is observed at low q 2 , where the data are found to lie below the SM prediction.
To determine the compatibility of the angular observables with the SM, the flavio software package [24] is used. The Wilson coefficient representing the real part of the bsµµ vector coupling, Re(C 9 ), is varied in a fit of the CP -averaged angular observables F L , . The asymmetries are excluded as they offer little sensitivity to Re(C 9 ). The best fit value is given by ∆Re(C 9 ) = −1.3 +0.7 −0.6 and is preferred over the SM hypothesis (∆Re(C 9 ) = 0) at the level of 1.9 σ.
Total PDF Background    obtained from the maximum likelihood fit. The first uncertainty is statistical and the second is the total systematic uncertainty, as described in Sec. 6.

Systematic uncertainties
Systematic effects may change the measured angular observables. The size of these effects is determined using high-yield pseudoexperiments, generated using an alternative PDF which encodes the systematic effect under study. The pseudoexperiments are fitted with both the default and alternative PDFs, and the resulting difference in angular observables is assigned as a systematic uncertainty. Each systematic effect is studied using approximately one hundred million generated events. The simulated samples used to derive the default acceptance are produced according to the phase space of the three-body B 0 s → φµ + µ − decay with the lifetime difference between the B 0 s and B 0 s system, ∆Γ s , set to zero. A systematic uncertainty is determined by weighting the angular, q 2 and B 0 s lifetime distributions of simulated events to an alternative model description, in which the value for ∆Γ s /Γ s is taken as 0.17 [57] and the B 0 s → φµ + µ − decay is described using a more realistic physics model with form factor calculations taken from Refs. [23-26].
As the angular observables are measured integrated over the B 0 s decay time, neglecting the B 0 s decay time dependence of the acceptance induces an additional source of bias. The size of this effect remains small compared to the statistical uncertainty and is accounted for with a systematic uncertainty.
To assess the systematic uncertainty associated with the description of the angular background distribution, pseudoexperiments are generated using second-order Chebyshev polynomials, the coefficients for which are obtained from a fit to candidates in the upper mass sideband. Similarly, the impact of sharing the angular background parameters across data sets is determined by generating pseudoexperiments using first order Chebyshev polynomials with coefficients derived separately for each data set. Table 2: Sources of the systematic uncertainties associated with the angular observables. As the size of a systematic effect can vary strongly depending on the observable and q 2 region in question, only the magnitude of the largest systematic uncertainty across all regions, rounded up to the next multiple of 0.005, is indicated.
To evaluate the impact of the corrections to the track multiplicity, B 0 s p T spectrum and hardware trigger response in simulated events on the angular observables, the angular acceptance is rederived, each time removing a correction. The largest resulting deviation in a given angular observable is assigned as a systematic uncertainty. For the particle identification response, the corrections are determined using an alternative model and a new angular acceptance is rederived.
To account for neglected B 0 s → K + K − µ + µ − decays, where the K + K − system is in an S-wave configuration, pseudoexperiments are generated according to the combined P-and S-wave decay rate, where F S is conservatively taken to be 2%. The pseudoexperiments are fitted with the default model and the resulting shift in the angular observables is assigned as a systematic uncertainty. The impact of peaking background contributions is assessed in a similar fashion by injecting additional events drawn from the reconstructed B 0 s mass and angular distributions of the background in question.
The influence of the choice for the maximum order of the Legendre polynomials used in the acceptance parameterisation is evaluated by rederiving the acceptance using a higher order. Further sources of systematic uncertainty include the size of the simulated samples used to derive the acceptance, the evaluation of the acceptance at a single point in q 2 for the narrow q 2 regions and the signal mass model, all of which yield negligible contributions to the overall systematic uncertainty.
The systematic uncertainties are summarised in Table 2. As the size of a systematic effect can vary strongly depending on the observable and q 2 region in question, only the magnitude of the largest systematic uncertainty across all regions, rounded up to the next multiple of 0.005, is indicated.

Conclusions
This paper presents an angular analysis of the B 0 s → φµ + µ − decay using pp collisions corresponding to 8.4 fb −1 of data recorded by the LHCb experiment during the Run 1 and Run 2 data-taking periods. The angular observables are extracted using an unbinned maximum likelihood fit to the angular distributions of untagged B 0 s → φµ + µ − decays in regions of the square of the dimuon mass, q 2 . The results in this paper constitute the most precise measurement of the B 0 s → φµ + µ − angular observables to date, with an approximate two-fold increase in sensitivity compared to the results of Ref.
[4], which are superseded by this paper. The results are found to be compatible with SM predictions.