Differential branching fractions and isospin asymmetries of $B \to K^{(*)}\mu^{+}\mu^{-}$ decays

The isospin asymmetries of $B \to K\mu^+\mu^-$ and $B \to K^{*}\mu^+\mu^-$ decays and the partial branching fractions of the $B^0 \to K^0\mu^+\mu^-$, $B^+ \to K^+\mu^+\mu^-$ and $B^+ \to K^{*+}\mu^+\mu^-$ decays are measured as functions of the dimuon mass squared, $q^2$. The data used correspond to an integrated luminosity of 3$~$fb$^{-1}$ from proton-proton collisions collected with the LHCb detector at centre-of-mass energies of 7$\,$TeV and 8$\,$TeV in 2011 and 2012, respectively. The isospin asymmetries are both consistent with the Standard Model expectations. The three measured branching fractions, while individually consistent, all favour lower values than their respective Standard Model predictions.


Introduction
The rare decay of a B meson into a strange meson and a µ + µ − pair is a b → s quark-level transition. In the Standard Model (SM), this can only proceed via loop diagrams. The loop-order suppression of the SM amplitudes increases the sensitivity to new virtual particles that can influence the decay amplitude at a similar level to the SM contribution. The branching fractions of B → K ( * ) µ + µ − decays are highly sensitive to contributions from vector or axial-vector like particles predicted in extensions of the SM. However, despite recent progress in lattice calculations [1,2], theoretical predictions of the decay rates suffer from relatively large uncertainties in the B → K ( * ) form factor calculations.
To maximise sensitivity, observables can be constructed from ratios or asymmetries where the leading form factor uncertainties cancel. The CP -averaged isospin asymmetry (A I ) is such an observable. It is defined as
$$A_I = \frac{\Gamma(B^0 \to K^{(*)0}\mu^+\mu^-) - \Gamma(B^+ \to K^{(*)+}\mu^+\mu^-)}{\Gamma(B^0 \to K^{(*)0}\mu^+\mu^-) + \Gamma(B^+ \to K^{(*)+}\mu^+\mu^-)} = \frac{\mathcal{B}(B^0 \to K^{(*)0}\mu^+\mu^-) - (\tau_0/\tau_+)\,\mathcal{B}(B^+ \to K^{(*)+}\mu^+\mu^-)}{\mathcal{B}(B^0 \to K^{(*)0}\mu^+\mu^-) + (\tau_0/\tau_+)\,\mathcal{B}(B^+ \to K^{(*)+}\mu^+\mu^-)},$$
where Γ(f ) and B(f ) are the partial width and branching fraction of the B → f decay and τ 0 /τ + is the ratio of the lifetimes of the B 0 and B + mesons. The decays in the isospin ratio differ only by the charge of the light (spectator) quark in the B meson. The SM prediction for A I is O(1%) in the dimuon mass squared, q 2 , region below the J/ψ resonance [3][4][5]. There is no precise prediction for A I for the q 2 region above the J/ψ resonance, but it is expected to be even smaller than at low q 2 [5]. As q 2 approaches zero, the isospin asymmetry of B → K * µ + µ − is expected to approach the same asymmetry as in B → K * γ decays, which is measured to be (5 ± 3)% [6]. Previously, A I has been measured by the BaBar [7], Belle [8] and LHCb [9] collaborations, where measurements for the B → Kµ + µ − decay have predominantly given negative values of A I . In particular, the B → Kµ + µ − isospin asymmetry measured by the LHCb experiment deviates from zero by over 4 standard deviations. For B → K * µ + µ − , measurements of A I are consistent with zero.
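As a minimal numerical sketch of the isospin asymmetry written in terms of branching fractions and the lifetime ratio τ 0 /τ + , the function below evaluates A I for one q 2 bin. The function name and the example inputs are illustrative assumptions and are not results of this analysis; the default lifetime ratio of 0.93 is the central value quoted in the discussion of systematic uncertainties.

```python
def isospin_asymmetry(bf_neutral, bf_charged, tau_ratio=0.93):
    """Return A_I = (B0 - r * B+) / (B0 + r * B+), with r = tau_0 / tau_+.

    bf_neutral and bf_charged are the branching fractions of the neutral
    and charged B decay in the same q^2 bin (any common units).
    """
    scaled_charged = tau_ratio * bf_charged
    return (bf_neutral - scaled_charged) / (bf_neutral + scaled_charged)


# Hypothetical inputs chosen so that the partial widths are equal
print(f"A_I = {isospin_asymmetry(bf_neutral=0.93, bf_charged=1.0):+.3f}")
```

Because only the ratio of lifetimes enters, the absolute lifetimes are not needed.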
This paper describes a measurement of the isospin asymmetry in B → Kµ + µ − and B → K * µ + µ − decays based on data collected with the LHCb detector, corresponding to an integrated luminosity of 1 fb −1 recorded in 2011 at a centre-of-mass energy √ s = 7 TeV, and 2 fb −1 recorded in 2012 at √ s = 8 TeV. The previous analysis [9] was carried out on the 1 fb −1 of data recorded in 2011. The analysis presented here includes, in addition to the data from 2012, a re-analysis of the full 1 fb −1 data sample with improved detector alignment parameters, reconstruction algorithms and event selection. Thus it supersedes the measurements in Ref. [9]. Moreover, the assumption that there is no isospin asymmetry in the B → J/ψ K ( * ) decays is now used for all the measurements.
The isospin asymmetries are determined by measuring the differential branching fractions of B + → K + µ + µ − , B 0 → K 0 µ + µ − , B 0 → K * 0 µ + µ − and B + → K * + µ + µ − decays. The K 0 meson is reconstructed through the decay K 0 S → π + π − ; the K * + as K * + → K 0 S (→ π + π − )π + and the K * 0 as K * 0 → K + π − . Modes involving a K 0 L or π 0 in the final state are not considered. The individual branching fractions of B + → K + µ + µ − , B 0 → K 0 µ + µ − and B + → K * + µ + µ − decays are also reported. The branching fraction of the decay B 0 → K * 0 µ + µ − has been previously reported in Ref. [10] and is not updated here.
The B 0 → K * 0 µ + µ − and B + → K * + µ + µ − branching fractions are influenced by the presence of B 0 → K + π − µ + µ − and B + → K 0 S π + µ + µ − decays with the K + π − or K 0 S π + system in an S-wave configuration. It is not possible to separate these candidates from the dominant K * 0 and K * + resonant components without performing an analysis of the K + π − or K 0 S π + invariant mass and the angular distribution of the final state particles. The S-wave component is expected to be at the level of a few percent [11] and to cancel when evaluating the isospin asymmetry of the B → K * µ + µ − decays.

Detector and dataset
The LHCb detector [12] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [13] placed downstream of the magnet. The combined tracking system provides a momentum measurement with relative uncertainty that varies from 0.4% at 5 GeV/c to 0.6% at 100 GeV/c, and an impact parameter resolution of 20 µm for tracks with high transverse momentum. Charged hadrons are identified using two ring-imaging Cherenkov (RICH) detectors [14]. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [15]. Decays of K 0 S → π + π − are reconstructed in two different categories: the first involving K 0 S mesons that decay early enough for the daughter pions to be reconstructed in the vertex detector; and the second containing K 0 S mesons that decay later such that track segments of the pions cannot be formed in the vertex detector. These categories are referred to as long and downstream, respectively. Candidates in the long category have better mass, momentum and vertex resolution than those in the downstream category.
Simulated events are used to estimate the efficiencies of the trigger, reconstruction and subsequent event selection of the different signal decays and to estimate the contribution from specific background sources. These samples are produced using the software described in Refs. [16][17][18][19][20][21][22].

Selection
The B → K ( * ) µ + µ − candidate events are required to pass a two-stage trigger system [23]. In the initial hardware stage, these events are selected with at least one muon with transverse momentum, p T > 1.48 (1.76) GeV/c in 2011 (2012). In the subsequent software stage, at least one of the final-state particles is required to have p T > 1.0 GeV/c and an impact parameter (IP) larger than 100 µm with respect to all of the primary pp interaction vertices (PVs) in the event. Finally, a multivariate algorithm [24] is used for the identification of secondary vertices consistent with the decay of a b hadron with muons in the final state.
For the B 0 → K 0 S µ + µ − and B + → K * + µ + µ − modes, K 0 S candidates are required to have a mass within 30 MeV/c 2 of the known K 0 S mass [25]. For the B 0 → K * 0 µ + µ − and B + → K * + µ + µ − modes, K * candidates are formed by combining kaons and pions and are required to have a mass within 100 MeV/c 2 of the known K * masses [25]. For all decay modes, B candidates are formed by subsequently combining the K ( * ) meson with two muons of opposite charge and requiring the mass to be between 5170 and 5700 MeV/c 2 .
The event selection is common to that described in Refs. [10,26,27]: the µ ± and the K + candidates are required to have χ 2 IP > 9, where χ 2 IP is defined as the minimum change in χ 2 of the vertex fit to any of the PVs in the event when the particle is added to that PV; the dimuon pair vertex fit has χ 2 < 9; the B candidate is required to have a vertex fit χ 2 < 8 per degree of freedom; the B momentum vector is aligned with respect to one of the PVs in the event within 14 mrad, the B candidate has χ 2 IP < 9 with respect to that PV and the vertex fit χ 2 of that PV increases by more than 121 when including the B decay products. In addition, the K 0 S candidate is required to have a decay time larger than 2 ps. Using particle identification information from the RICH detectors, calorimeters and muon system, multivariate discriminants (PID variables) are employed to reject background candidates, where pions are misidentified as kaons and vice-versa, and where a pion or kaon is incorrectly identified as a muon.
The initial selection is followed by a tighter multivariate selection, based on a boosted decision tree (BDT) [28] with the AdaBoost algorithm [29], which is designed to reject background of combinatorial nature. Separate BDTs are employed for each signal decay. For decays involving a K 0 S meson, two independent BDTs are trained for the long and downstream categories. This gives a total of six BDTs which all use data from the upper mass sideband (m(K ( * ) µ + µ − ) > 5350 MeV/c 2 ) of their corresponding decay to represent the background sample in the training. Simulated events are used as the signal sample in the training of the corresponding BDTs. In contrast, to stay consistent with the selection in Ref. [26], the signal for the training of the B 0 → K * 0 µ + µ − BDT is taken from reconstructed B 0 → (J/ψ → µ + µ − )K * 0 candidates from data. Events used in the training of the BDTs are not used in the subsequent classification of the data.
All six BDTs use predominantly geometric variables, including the variables used in the pre-selection described above. The B 0 → K * 0 µ + µ − BDT also makes use of PID variables to further suppress background where a K + is misidentified as a π + and vice-versa in the K * 0 decay. The multivariate selections for B + → K + µ + µ − , B + → K * + µ + µ − and B 0 → K * 0 µ + µ − candidates have an efficiency of 90% for the signal channels and remove 95% of the background that remains after the pre-selection. The long lifetime of the K 0 S meson makes it difficult to determine whether it originates from the same vertex as the dimuon system in B 0 → K 0 S µ + µ − decays. As such, the multivariate selection for B 0 → K 0 S µ + µ − candidates has a signal efficiency of 66% and 48% for the long and downstream categories, respectively, while removing 99% of the background surviving the pre-selection.
Combinatorial background, where the final-state particles attributed to the B candidate do not all come from the same b-hadron decay, is reduced to a small level by the multivariate selection. In addition, there are several sources of background that peak in the K ( * ) µ + µ − invariant mass. The largest of these are B → J/ψ K ( * ) and B → ψ(2S)K ( * ) decays, which are rejected by removing the regions of dimuon invariant mass around the charmonium resonances (2828 < m(µ + µ − ) < 3317 MeV/c 2 and 3536 < m(µ + µ − ) < 3873 MeV/c 2 ). A combination of mass and PID requirements removes additional peaking backgrounds. These include Λ 0 b → Λ ( * ) µ + µ − decays, where the proton from the Λ → pπ − decay is misidentified as a K + or the proton is misidentified as a π + in the Λ * → pK − decay, B 0 s → φµ + µ − decays where a kaon from φ → K + K − is misidentified as a pion, and B + → K + µ + µ − decays that combine with a random pion to fake a B 0 → K * 0 µ + µ − decay. After the application of all the selection criteria the exclusive backgrounds are reduced to less than 1% of the level of the signal.
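The charmonium veto can be illustrated with a short sketch that uses the two dimuon-mass windows quoted above. The function and constant names are hypothetical and chosen only for this example; the actual selection is implemented in the experiment's own software.

```python
# Vetoed dimuon invariant-mass windows in MeV/c^2, as quoted in the text
CHARMONIUM_VETOES = ((2828.0, 3317.0), (3536.0, 3873.0))


def passes_charmonium_veto(m_mumu):
    """Return True if the dimuon mass (MeV/c^2) lies outside both windows."""
    return not any(low < m_mumu < high for low, high in CHARMONIUM_VETOES)


def q2_from_mass(m_mumu):
    """Convert a dimuon mass in MeV/c^2 to q^2 in GeV^2/c^4."""
    return (m_mumu / 1000.0) ** 2


for mass in (2000.0, 3096.9, 3400.0, 3686.1):
    print(mass, round(q2_from_mass(mass), 2), passes_charmonium_veto(mass))
```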
To improve the resolution on the reconstructed mass of the B meson, a kinematic fit [30] is performed for candidates involving a K 0 S meson. In the fit, the mass of the π + π − system is constrained to the nominal K 0 S mass and the B candidate is required to originate from its associated PV.

Signal yield determination
Signal yields are determined using extended unbinned maximum likelihood fits to the K ( * ) µ + µ − mass in the range 5170-5700 MeV/c 2 . These fits are performed in nine bins of q 2 for B 0 → K 0 S µ + µ − , B + → K * + µ + µ − and B 0 → K * 0 µ + µ − decays, while for the B + → K + µ + µ − decay the larger number of signal events allows nineteen q 2 bins to be defined. The binning scheme is shown in Tables 4 to 6 of the appendix. It removes the region of q 2 around the charmonium resonances. For the B + → K + µ + µ − differential branching fraction, where the statistical uncertainty is the smallest, a narrow range in m(µ + µ − ) is also removed around the mass of the φ meson. The signal component in the fit is described by the sum of two Crystal Ball functions [31] with common peak values and tail parameters, but different widths. The signal shape parameters are taken from a fit to B → J/ψ K ( * ) channels in the data, with a correction that accounts for a small q 2 dependence of the peak value and width obtained from the simulation. The combinatorial background is parameterised by an exponential function, which is allowed to vary for each q 2 bin and K 0 S category independently. For decays involving K 0 S mesons, separate fits are made to the long and downstream categories. The mass fits for the four signal channels are shown in Fig. 1, where the long and downstream K 0 S categories are combined and the results of the fits, performed in separate q 2 bins, are merged for presentation purposes. The corresponding number of signal candidates for each channel is given in Table 1.
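The signal shape described above, two Crystal Ball functions sharing a peak position and tail parameters but with different widths, can be sketched as follows. This is an unnormalised illustration only: a power-law tail on the low-mass side is assumed, the parameter values in the example are hypothetical, and a real fit would normalise each component over the fit range and add the exponential background.

```python
import math


def crystal_ball(m, mean, sigma, alpha, n):
    """Unnormalised Crystal Ball shape; low-mass power-law tail, alpha > 0."""
    t = (m - mean) / sigma
    if t > -alpha:
        # Gaussian core
        return math.exp(-0.5 * t * t)
    # Power-law tail, matched to the core at t = -alpha
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)


def double_crystal_ball(m, mean, sigma1, sigma2, alpha, n, frac):
    """Sum of two Crystal Ball shapes with common mean, alpha and n."""
    return (frac * crystal_ball(m, mean, sigma1, alpha, n)
            + (1.0 - frac) * crystal_ball(m, mean, sigma2, alpha, n))


# Hypothetical shape parameters (MeV/c^2 for mean and widths)
params = dict(mean=5280.0, sigma1=15.0, sigma2=25.0, alpha=1.5, n=3.0, frac=0.6)
for mass in (5200.0, 5260.0, 5280.0, 5300.0):
    print(mass, round(double_crystal_ball(mass, **params), 4))
```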

Branching fraction normalisation
Each signal mode is normalised with respect to its corresponding B → J/ψ K ( * ) channel, where the J/ψ resonance decays into two muons. These normalisation channels have branching fractions that are approximately two orders of magnitude higher than those of the signal channels. Each normalisation channel has similar kinematic properties and the same final-state particles as the signal modes. This results in an almost complete cancellation of systematic uncertainties when measuring the ratio of branching fractions of the signal mode with the corresponding normalisation channel. Separate normalisations for the long and downstream K 0 S reconstruction categories are used to further cancel potential sources of systematic uncertainty.
Corrections to the IP resolution, PID variables and B candidate kinematic properties are applied to the simulated events, such that the distributions of simulated candidates from the normalisation channels agree with the data. The simulation samples are subsequently used to calculate the relative efficiencies as functions of q 2 . The q 2 dependence arises mainly from trigger effects, where the muons have increased (decreased) p T at high (low) q 2 and consequently have a higher (lower) trigger efficiency. Furthermore, at high q 2 , the hadrons are almost at rest in the B meson rest frame and, like the B meson, point back to the PV in the laboratory frame. The IP requirements applied on the hadron have a lower efficiency for this region of q 2 . The K 0 S channels have an additional effect due to the different acceptance of the two reconstruction categories; K 0 S mesons are more likely to be reconstructed in the long category if they have low momentum, which favours the high q 2 region. The momentum distributions of the K 0 S mesons in B 0 → J/ψ K 0 S and B + → J/ψ K * + decays in data and simulation for both K 0 S categories are in good agreement, indicating that the acceptance is well described in the simulation.
The measured differential branching fraction averaged over a q 2 bin of width q 2 max − q 2 min is given by
$$\frac{d\mathcal{B}}{dq^2} = \frac{N(B \to K^{(*)}\mu^+\mu^-)}{N(B \to J/\psi K^{(*)})} \cdot \frac{\varepsilon(B \to J/\psi K^{(*)})}{\varepsilon(B \to K^{(*)}\mu^+\mu^-)} \cdot \frac{\mathcal{B}(B \to J/\psi K^{(*)})\,\mathcal{B}(J/\psi \to \mu^+\mu^-)}{q^2_{\rm max} - q^2_{\rm min}},$$
where N (B → K ( * ) µ + µ − ) is the number of signal candidates in the bin, N (B → J/ψ K ( * ) ) is the number of normalisation candidates, the product of B(B → J/ψ K ( * ) ) and B(J/ψ → µ + µ − ) is the visible branching fraction of the normalisation channel, and ε(B → K ( * ) µ + µ − )/ε(B → J/ψ K ( * ) ) is the relative efficiency between the signal and normalisation channels in the bin, which enters the expression as its inverse.
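The normalisation procedure can be sketched numerically: the ratio of signal to normalisation yields is divided by the relative efficiency, scaled by the visible branching fraction of the normalisation channel, and divided by the bin width. The function name and all input numbers below are hypothetical and serve only to illustrate the arithmetic.

```python
def differential_branching_fraction(n_signal, n_norm, eff_signal, eff_norm,
                                    bf_norm_visible, q2_min, q2_max):
    """Bin-averaged dB/dq^2 from yields, efficiencies and normalisation.

    bf_norm_visible is B(B -> J/psi K(*)) * B(J/psi -> mu+ mu-).
    """
    yield_ratio = n_signal / n_norm
    relative_efficiency = eff_signal / eff_norm  # signal over normalisation
    return yield_ratio / relative_efficiency * bf_norm_visible / (q2_max - q2_min)


# Hypothetical inputs for a single q^2 bin of width 2 GeV^2/c^4
value = differential_branching_fraction(
    n_signal=100.0, n_norm=100000.0,
    eff_signal=0.010, eff_norm=0.010,
    bf_norm_visible=6.0e-5, q2_min=2.0, q2_max=4.0,
)
print(f"dB/dq2 = {value:.2e} per GeV^2/c^4")
```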

Systematic uncertainties
The branching fraction measurements of the normalisation modes from the B-factory experiments assume that the B + and B 0 mesons are produced with equal proportions at the Υ(4S) resonance [32][33][34]. In contrast, in this paper isospin symmetry is assumed for the B → J/ψ K ( * ) decays, implying that the B + → J/ψ K + (B + → J/ψ K * + ) and B 0 → J/ψ K 0 (B 0 → J/ψ K * 0 ) decays have the same partial width. The branching fractions used in the normalisation are obtained by: taking the most precise branching fraction results from Ref. [32] and translating them into partial widths; averaging the partial widths of the K + , K 0 and the K * + , K * 0 modes, respectively; and finally translating the widths back to branching fractions. The calculation only requires knowledge of the ratio of B 0 and B + lifetimes, for which we use 0.93 ± 0.01 [25]. Statistical uncertainties are treated as uncorrelated while systematic uncertainties are conservatively treated as fully correlated.
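The three steps of this procedure (branching fractions to partial widths, averaging, and widths back to branching fractions) can be sketched as below. The sketch works in units where τ + = 1, so that only the lifetime ratio is needed. The function name, the simple weighted mean and the input values are assumptions made for illustration; the text does not specify the weighting used in the analysis, and uncertainties are not propagated here.

```python
def isospin_averaged_branching_fractions(bf_charged, bf_neutral,
                                         tau_ratio=0.93, weight_charged=0.5):
    """Average the partial widths of a charged/neutral pair of modes.

    tau_ratio is tau_0 / tau_+. Returns (bf_charged, bf_neutral) after
    imposing equal partial widths.
    """
    # Step 1: branching fractions -> partial widths (units of 1 / tau_+)
    width_charged = bf_charged
    width_neutral = bf_neutral / tau_ratio
    # Step 2: average the two widths (weighting is an assumption)
    width_avg = (weight_charged * width_charged
                 + (1.0 - weight_charged) * width_neutral)
    # Step 3: common width -> branching fractions
    return width_avg, width_avg * tau_ratio


# Hypothetical input branching fractions
bf_plus, bf_zero = isospin_averaged_branching_fractions(1.00e-3, 0.90e-3)
print(f"B+ mode: {bf_plus:.3e}, B0 mode: {bf_zero:.3e}")
```

By construction the returned pair satisfies B 0 /B + = τ 0 /τ + , which is the statement that the two partial widths are equal.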
The resulting branching fractions of the normalisation channels are where the first uncertainty is statistical and the second systematic.
A systematic uncertainty is assigned to account for the imperfect knowledge of the q 2 spectrum in the simulation within each q 2 bin. For example, the recent observation of a resonance in the high q 2 region of B + → K + µ + µ − decays [26] alters the q 2 distribution and hence the selection efficiencies in that region. By reweighting simulated events to account for this resonance, and for variations of the B → K ( * ) form factor model as described in Ref. [35], a systematic uncertainty is determined at the level of (1 − 2)% depending on channel and q 2 bin.
Data-driven corrections of the long and downstream tracking efficiencies in the simulation are determined using tag-and-probe techniques in J/ψ → µ + µ − and D 0 → φK 0 S decays, respectively. For the J/ψ → µ + µ − decay, the tag is a fully reconstructed muon track. It is combined with another muon, referred to as the probe, reconstructed using the muon stations and the large-area silicon detector upstream of the magnet. The tracking efficiency is determined by reconstructing the probe using the full tracking system. The D 0 → φK 0 S decay is tagged via a partial reconstruction using only one of the K 0 S daughters. The downstream tracking efficiency is then evaluated by fully reconstructing the K 0 S candidate. The resulting systematic uncertainty on the efficiency ratio, due to finite precision of the measurement, is found to be negligible. The systematic uncertainty that arises from the corrections to the IP resolution, PID variables and B candidate kinematic properties in the simulation varies between 1% and 3% depending on channel and q 2 bin.
A summary of the systematic uncertainties can be found in Table 2. The uncertainties on the branching fractions of the normalisation modes constitute the dominant source of systematic uncertainty on the branching fraction measurements, while they cancel in the isospin measurements.

Table 2. Columns: Source, Branching fraction, Isospin asymmetry.
The values are given in Tables 4 to 6 in the appendix. In the low q 2 region, these predictions rely on the QCD factorisation approaches from Refs. [38,39] for B → K * µ + µ − and Ref. [40] for B → Kµ + µ − , and lose accuracy when approaching the J/ψ resonance. In the high q 2 region, an operator product expansion in the inverse b-quark mass, 1/m b , and in 1/ √ q 2 is used based on Ref. [41]. This expansion is only valid above the open charm threshold. A dimensional estimate of the uncertainty associated with this expansion is discussed in Ref. [42]. For light cone sum rule (LCSR) predictions, the B → K ( * ) form factor calculations are taken from Refs. [43] and [44]. Predictions based on form factors from lattice calculations are also overlaid [1,2,45,46]. Although all three differential branching fraction measurements are consistent with the SM, they all have values smaller than the theoretical prediction. The sample size for B + → K + µ + µ − is sufficient to show significant structures in the q 2 distribution. As an example, the peak at high q 2 is due to the ψ(4160) resonance, which is discussed in more detail in Ref. [26].
The presence of an S-wave contribution to the K + π − and K 0 S π + systems of B 0 → K * 0 µ + µ − and B + → K * + µ + µ − candidates, respectively, complicates the analysis of these channels. This effect is of the order of a few percent and can be neglected in B + → K * + µ + µ − decays with the current statistical precision. The larger signal yield of B 0 → K * 0 µ + µ − , however, merits a detailed analysis of the S-wave contribution and requires a dedicated study. For this reason the branching fraction of B 0 → K * 0 µ + µ − decays is not reported.

Isospin asymmetry results
The assumption of no isospin asymmetry in the B → J/ψ K ( * ) modes makes the isospin measurement equivalent to measuring the difference in isospin asymmetry between B → K ( * ) µ + µ − and B → J/ψ K ( * ) decays. Compared to using the values in Ref. [25] for the branching fractions of the B → J/ψ K ( * ) modes, this approach shifts A I in each bin by approximately 4%. The isospin asymmetries are shown in Fig. 3 for B → Kµ + µ − and B → K * µ + µ − and given in Tables 7 and 8 in the appendix. The asymmetric uncertainties are obtained from the profile likelihood.
Since the shape of A I in models that extend the SM is not known, apart from the large correlations expected between neighbouring bins, the A I = 0 hypothesis is tested against the simplest alternative, that is a constant value different from zero. The difference in χ 2 between the two hypotheses is used as a test statistic and is compared to the differences in an ensemble of pseudo-experiments which are generated with zero isospin asymmetry. Given the current statistical precision, the hypothesis of A I = 0 is a good approximation to the SM which predicts A I to be O(1%) [3][4][5]. The p-value for the B → Kµ + µ − isospin asymmetry under the A I = 0 hypothesis is 11%, corresponding to a significance of 1.5 σ. The B → K * µ + µ − isospin asymmetry has a p-value of 80% with respect to zero. Alternatively, a simple χ 2 test of the data with respect to a hypothesis of zero isospin asymmetry has a p-value of 54% (4%) for the B → Kµ + µ − (B → K * µ + µ − ) isospin asymmetry.
Although the isospin asymmetry for B → Kµ + µ − decays is negative in all but one q 2 bin, the results are more consistent with the SM than the previous measurement in Ref. [9], which quoted a significance of 4.4 σ for a deviation from zero, using a test statistic that explicitly tested for A I to be negative in all bins. The lower significance quoted here is due to four effects: the change of the test statistic in the calculation of the significance itself, which reduces the previous discrepancy to 3.5 σ; the assumption that the isospin asymmetry of B → J/ψ K ( * ) is zero, which reduces the significance further to 3.2 σ; a re-analysis of the 2011 data with the updated reconstruction and event selection, which reduces the significance to 2.5 σ; and finally the inclusion of the 2012 data set, which reduces the significance further to 1.5 σ.
The measurements of A I in the individual q 2 bins obtained from the re-analysis of the 2011 data set are compatible with those obtained in the previous analysis; a χ 2 test on the compatibility of the two results, taking the overlap of events into account, has a p-value of 93%. However, results from the 2012 data are more compatible with an A I value of zero than the re-analysed 2011 data, as shown in Fig. 4.
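The pseudo-experiment test described above can be sketched in a simplified form. The sketch assumes symmetric Gaussian uncertainties and uncorrelated q 2 bins, which is a simplification of the asymmetric profile-likelihood uncertainties used in the measurement; the function names and the example inputs are hypothetical. For Gaussian uncertainties the χ 2 difference between the A I = 0 hypothesis and the best-fit constant c reduces to c 2 Σ(1/σ i 2 ), which is what the code evaluates.

```python
import random


def delta_chi2(values, sigmas):
    """chi2(A_I = 0) - chi2(best-fit constant) for uncorrelated Gaussian bins."""
    weights = [1.0 / (s * s) for s in sigmas]
    best_const = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    # Closed form of the chi2 difference; non-negative by construction
    return best_const * best_const * sum(weights)


def toy_p_value(values, sigmas, n_toys=2000, seed=1):
    """Fraction of zero-asymmetry pseudo-experiments with a larger delta chi2."""
    rng = random.Random(seed)
    observed = delta_chi2(values, sigmas)
    n_exceed = 0
    for _ in range(n_toys):
        toy = [rng.gauss(0.0, s) for s in sigmas]  # generated with A_I = 0
        if delta_chi2(toy, sigmas) >= observed:
            n_exceed += 1
    return n_exceed / n_toys


# Hypothetical A_I values and uncertainties in four q^2 bins
example_values = [-0.10, -0.05, 0.02, -0.08]
example_sigmas = [0.10, 0.08, 0.09, 0.12]
print("p-value:", toy_p_value(example_values, example_sigmas))
```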

Conclusion
The most precise measurements of the differential branching fractions of B + → K + µ + µ − , B 0 → K 0 S µ + µ − and B + → K * + µ + µ − decays as well as the isospin asymmetries of B → Kµ + µ − and B → K * µ + µ − decays have been performed using a data set corresponding to 3 fb −1 of integrated luminosity collected by the LHCb detector.