Test of lepton flavor universality and search for lepton flavor violation in B → Kℓℓ decays

We present measurements of the branching fractions for the decays B → Kμ+μ− and B → Ke+e−, and their ratio (RK), using a data sample of 711 fb−1 that contains 772 × 106BB¯\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ B\overline{B} $$\end{document} events. The data were collected at the ϒ(4S) resonance with the Belle detector at the KEKB asymmetric-energy e+e− collider. The ratio RK is measured in five bins of dilepton invariant-mass-squared (q2): q2 ∈ (0.1, 4.0), (4.00, 8.12), (1.0, 6.0), (10.2, 12.8) and (> 14.18) GeV2/c4, along with the whole q2 region. The RK value for q2 ∈ (1.0, 6.0) GeV2/c4 is 1.03−0.24+0.28\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ {1.03}_{-0.24}^{+0.28} $$\end{document} ± 0.01. The first and second uncertainties listed are statistical and systematic, respectively. All results for RK are consistent with Standard Model predictions. We also measure CP-averaged isospin asymmetries in the same q2 bins. The results are consistent with a null asymmetry, with the largest difference of 2.6 standard deviations occurring for the q2 ∈ (1.0, 6.0) GeV2/c4 bin in the mode with muon final states. The measured differential branching fractions, dℬ\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ d\mathrm{\mathcal{B}} $$\end{document}/dq2, are consistent with theoretical predictions for charged B decays, while the corresponding values are below the expectations for neutral B decays. We have also searched for lepton-flavor-violating B → Kμ±e∓ decays and set 90% confidence-level upper limits on the branching fraction in the range of 10−8 for B+ → K+μ±e∓, and B0 → K0μ±e∓ modes.


Introduction
The decays B → K + − ( = e, µ), mediated by the b → s + − quark-level transition, constitute a flavor-changing neutral current process. Such processes are forbidden at tree level in the Standard Model (SM) but can proceed via suppressed loop-level diagrams, and they are therefore sensitive to particles predicted in a number of new physics models [1,2]. A robust observable [3] to test the SM prediction is the lepton-flavor-universality (LFU) ratio, where H is a K or K * meson and the decay rate Γ is integrated over a range of the dilepton invariant mass squared, q 2 ≡ M 2 ( + − ). For R K * , recently LHCb [4] reported hints of deviations from SM expectations, while Belle [5] results are consistent with the SM with relatively larger uncertainties. LHCb also measured R K [6], reporting a difference of about 2.5 standard deviations (σ) from the SM prediction in the q 2 ∈ (1.1, 6.0) GeV 2 /c 4 bin. A previous measurement of the same quantity was performed by Belle [7] in the whole q 2 range with a data sample of 657 × 10 6 BB events. The result presented here is obtained from a multidimensional fit performed on the full Belle data sample of 772×10 6 BB events, and supersedes our previous result [7].
Another theoretically robust observable [8], where the dominant form-factor-related uncertainties also cancel, is the CP -averaged isospin asymmetry, representing the difference in partial widths: where τ B + /τ B 0 = 1.078 ± 0.004 is the lifetime ratio of B + to B 0 [9]. The f +− /f 00 = 1.058 ± 0.024 [10] is the relative production fraction of charged (f +− ) and neutral B JHEP03(2021)105 mesons (f 00 ) at B factories. The A I value is expected to be close to zero in the SM [11]. Earlier, BaBar [12] and Belle [7] reported A I to be significantly below zero, especially in the q 2 region below the J/ψ resonance, while LHCb [13] reported results consistent with SM predictions. In many theoretical models, lepton flavor violation (LFV) accompanies LFU violation [14]. With neutrino mixing, LFV is only possible at rates far below the current experimental sensitivity. In case of signal, this will signify physics beyond SM [15]. The LFV in B decays can be studied via B → Kµ ± e ∓ . The most stringent upper limits on B + → K + µ + e − and B + → K + µ − e + set by LHCb [16] are 6.4 × 10 −9 and 7.0 × 10 −9 at 90% confidence level (CL). Prior to that, B 0 → K 0 µ ± e ∓ decays were searched for by BaBar [17], which set a 90% CL upper limit on the branching fraction of 2.7 × 10 −7 .
In this paper, we report a measurement of the decay branching fractions of B → K + − , R K and A I in the whole q 2 range as well as in five q 2 bins [(0.1, 4.0), (4.00, 8.12), (1.0, 6.0), (10.2, 12.8) and (> 14.18)] GeV 2 /c 4 . We also search for B → Kµ ± e ∓ decays using the full Belle data sample.

Data samples and Belle detector
This analysis uses a 711 fb −1 data sample containing (772 ± 11) × 10 6 BB events, collected at the Υ(4S) resonance by the Belle experiment at the KEKB e + e − collider [18,19]. An 89 fb −1 data sample recorded 60 MeV below the Υ(4S) peak (off-resonance) is used to estimate the background contribution from e + e − → qq (q = u, d, s, c) continuum events.
The Belle detector [20,21] is a large-solid-angle magnetic spectrometer composed of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter (ECL) comprising CsI(Tl) crystals. All these subdetectors are located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux-return yoke placed outside the coil is instrumented with resistive plate chambers (KLM) to detect K 0 L mesons and muons. Two inner detector configurations were used: a 2.0 cm radius beam pipe and a three-layer SVD for the first sample of 140 fb −1 ; and a 1.5 cm radius beam pipe, a four-layer SVD, and a small-cell inner CDC for the remaining 571 fb −1 [22].
To study properties of signal events and optimize selection criteria, we use samples of Monte Carlo (MC) simulated events. The B → K + − modes are generated with the EvtGen package [23] based on a model described in ref. [24], while LFV modes are generated with a phase-space model. The PHOTOS [25][26][27] package is used to incorporate final-state radiation. The detector response is simulated with GEANT3 [28].

Analysis overview
We reconstruct B → K + − (K = K + , K 0 S ) 1 decays by selecting charged particles that originate from the vicinity of the e + e − interaction point (IP), except for those originating JHEP03(2021)105 from K 0 S decays. We require impact parameters less than 1.0 cm in the transverse plane and less than 4.0 cm along the z axis (parallel to the e + beam). To reduce backgrounds from low-momentum particles, we require that tracks have a minimum transverse momentum of 100 MeV/c .
From the list of selected tracks, we identify K + candidates using a likelihood ratio R K/π = L K /(L K + L π ), where L K and L π are the likelihoods for charged kaons and pions, respectively, calculated based on the number of photoelectrons in the ACC, the specific ionization in the CDC, and the time of flight as determined from the TOF. We select kaons by requiring R K/π > 0.6, which has a kaon identification efficiency of 92% and a pion misidentification rate of 7%. For the neutral B decay, candidate K 0 S mesons are reconstructed by combining two oppositely charged tracks (assumed to be pions) with an invariant mass between 487 and 508 MeV/c 2 ; this corresponds to a 3σ window around the nominal K 0 S mass [9]. Such candidates are further identified with a neural network (NN). The variables used for this NN are: the K 0 S momentum; the distance along the z axis between the two track helices at their closest approach; the flight length in the transverse plane; the angle between the K 0 S momentum and the vector joining the IP with the K 0 S decay vertex; the angle between the pion momentum and the laboratory-frame direction in the K 0 S rest frame; the distances-of-closest-approach in the transverse plane between the IP and the two pion helices; and the number of hits in the CDC; and the presence or absence of hits in the SVD for each pion track.
Muon candidates are identified based on information from the KLM. We require that candidates have a momentum greater than 0.8 GeV/c (enabling them to reach the KLM subdetector), and a penetration depth and degree of transverse scattering consistent with those of a muon [29]. The latter information is used to calculate a normalized muon likelihood R µ , and we require R µ > 0.9. For this requirement, the average muon detection efficiency is 89%, with a pion misidentification rate of 1.5% [30].
Electron candidates are required to have a momentum greater than 0.5 GeV/c and are identified using the ratio of calorimetric cluster energy to the CDC track momentum; the shower shape in the ECL; the matching of the track with the ECL cluster; the specific ionization in the CDC; and the number of photoelectrons in the ACC. This information is used to calculate a normalized electron likelihood R e , and we require R e > 0.9. This requirement has an efficiency of 92% and a pion misidentification rate below 1% [31]. To recover energy loss due to possible bremsstrahlung, we search for photons inside a cone of radius 50 mrad centered around the electron direction. For each photon found within the cone, its four-momentum is added to that of the initial electron.
Charged (neutral) B candidates are reconstructed by combining K ± (K 0 S ) with suitable µ ± or e ± candidates. To distinguish signal from background events, two kinematic variables are used: the beam-energy-constrained mass M bc = (E beam /c 2 ) 2 − (p B /c) 2 , and the energy difference ∆E = E B − E beam , where E beam is the beam energy, and E B and p B are the energy and momentum, respectively, of the B candidate. All these quantities are calculated in the e + e − center-of-mass (CM) frame. For signal events, the ∆E distribution peaks at zero, and the M bc distribution peaks near the B mass. We retain candidates satisfying the requirements −0.10 < ∆E < 0.25 GeV and M bc > 5.2 GeV/c 2 .

JHEP03(2021)105
With the above selection criteria applied, about 2% of signal MC events are found to have more than one B candidate. For these events, we retain the candidate with smallest χ 2 value resulting from a vertex fit of the B daughter particles. From MC simulation, this criterion is found to select the correct signal candidate 78-85% of the time, depending on the decay mode. The decays B → J/ψ(→ + − )K and B → ψ(2S)(→ + − )K, used later as control samples, are suppressed in the signal selection via a set of vetoes 8.75 < q 2 < 10.2 GeV 2 /c 4 and 13.0 < q 2 < 14.0 GeV 2 /c 4 with the dimuon; 8.12 < q 2 < 10.2 GeV 2 /c 4 and 12.8 < q 2 < 14.0 GeV 2 /c 4 with the dielectron final states for B → J/ψK and B → ψ(2S)K, respectively. An additional veto of the low q 2 region (< 0.05 GeV 2 /c 4 ) is applied in the case of B → Ke + e − to suppress possible contaminations from γ * → e + e − and π 0 → γe + e − .
At this stage of the analysis, there is significant background from continuum processes and other B decays. As lighter quarks are produced with large kinetic energy, the former events tend to consist of two back-to-back jets of pions and kaons. In contrast, BB events are produced almost at rest in the CM frame, resulting in more spherically distributed daughter particles. We thus distinguish BB events from qq background based on event topology.
Background arising from B decays has typically two uncorrelated leptons in the final state. Such background falls into three categories: (a) both B andB decay semileptonically; (b) a B →D ( * ) X + ν decay is followed byD ( * ) → X −ν ; and (c) hadronic B decays where one or more daughter particles are misidentified as leptons. To suppress continuum as well as BB background, we use an NN trained with the following input variables: 1. A likelihood ratio constructed from a set of modified Fox-Wolfram moments [32,33]. 3. The angle θ T between the thrust axes calculated from final-state particles for the candidate B and for the rest of the event in the CM frame. (The thrust axis is the direction that maximizes the sum of the longitudinal momenta of the considered particles). For signal events, the cos θ T distribution is flat, whereas for continuum events it peaks near ±1.
4. Flavor-tagging information from the tag-side (recoiling) B decay. The flavor-tagging algorithm [34] outputs two variables: the flavor q of the tag-side B, and the tag quality r. The latter ranges from zero for no flavor information to one for an unambiguous flavor assignment.
5. The confidence level of the B vertex fitted from all daughter particles.
6. The separation in z between the signal B decay vertex and that of the other B in the event.
7. The separation between the two leptons along the z-axis divided by the quadratic sum of uncertainties in the z-intercepts of the tracks.

JHEP03(2021)105
8. The sum of the ECL energy of tracks and clusters not associated with the signal B decay. 9. A set of variables developed by CLEO [35] that characterize the momentum flow into concentric areas around the thrust axis of a reconstructed B candidate.
The NN outputs a single variable O, for which larger values correspond to more signallike events. To facilitate modeling of the distribution of O with an analytic function, we require O > −0.6 (= O min ) and transform O to a new variable: where O max is the upper boundary of O. The criterion on O min reduces the background events by more than 75%, with a signal loss of only 4-5%.
We study the remaining background events using MC simulation for individual modes, with special attention paid to those that can mimic signal decays. Candidates arising from B 0 → J/ψ(→ + − )K * 0 populate towards the negative side in ∆E and are suppressed with the requirement ∆E > −0.1 GeV. The decay B + →D 0 (→ K + π − )π + mimics B + → K + µ + µ − when both pions are mis-identified as muons; to suppress this background, we apply a veto on the invariant mass of the K + and µ − candidates: The contribution from other B → charm decays is negligible. Events originating from the decays B + → J/ψ(→ µ + µ − )K + , in which one of the muons is misidentified as a kaon and vice versa, contribute as a peaking background to B + → K + µ + µ − . Such events are suppressed by applying a veto on the invariant mass M [ For the LFV modes, the background coming from B + → J/ψ(→ e + e − )K + because of misidentification and swapping between particles is removed by invariant mass vetoes. For the B + → K + µ + e − mode, two vetoes are applied: (a) the electron is misidentified as kaon and kaon as muon, and thus the veto on the kaon-electron invariant mass is M [K + e − ] / ∈ (2.95, 3.11) GeV/c 2 ; and (b) the electron is misidentified as a muon, and thus the muon-electron mass veto is M [µ + e − ] / ∈ (3.02, 3.12) GeV/c 2 . For the B + → K + µ − e + channel, only the latter background is found and removed using M [µ + e − ] / ∈ (3.02, 3.12) GeV/c 2 . A small contribution from B + →D 0 (K + π − )π + for these LFV modes, due to misidentification of pions as leptons, is removed by requiring where an electron is misrecontructed as a muon, is suppressed by requiring M [µ + e − ] / ∈ (3.04, 3.12) GeV/c 2 . When calculating invariant masses for these vetoes, the mass hypothesis for the misidentified particle is used. There is a small background from B → Kπ + π − decays in the B resonances. To avoid biasing our results, all selection criteria are determined in a "blind" manner, i.e., they are finalized before looking at data events in the signal region.
We determine the signal yields by performing a three-dimensional unbinned extended maximum-likelihood fit to the M bc , ∆E, and O distributions in different q 2 bins. The fits are performed for each mode separately. The probability density functions (PDFs) used to model signal decays are as follows: for M bc we use a Gaussian, for ∆E the sum of a Gaussian and a Crystal Ball function [36], and for O the sum of a Gaussian and an asymmetric Gaussian with a common mean. All signal shape parameters are obtained from MC simulation. To account for small differences observed between data and MC simulations, we introduce an offset in the mean positions and scaling factors for the widths. The values of these parameters are obtained from fitting the control sample B → J/ψ(→ + − )K decays and kept fixed. The PDFs used for charmless B → Kπ + π − peaking background is the same as that of the signal PDFs, with the fixed number of peaking events. The shapes of the M bc , ∆E, and O distributions for background arising from B decays are parameterized with an ARGUS function [37], an exponential, and a Gaussian function, respectively. Similarly, the continuum background is modeled using an ARGUS, a firstorder polynomial, and a Gaussian function. The shapes of BB and continuum backgrounds are very similar in two of the fit variables, making it difficult to simultaneously float the yields of both backgrounds. Hence, the continuum yields are obtained for each mode from the off-resonance data sample and fixed in the fit. These yields are consistent with those of the high-statistics off-resonance MC sample. The BB yields are floated in the fit.

Results
The There are 137 ± 14 and 138 ± 15 signal events for the decays B + → K + µ + µ − and B + → K + e + e − , respectively, whereas the yields for the decays

JHEP03(2021)105
Mode B The signal yields for LFV decays are obtained by performing unbinned extended maximum-likelihood fits, similar to those for the B → K + − modes. The signal-enhanced projection plots with fit results for LFV decays are shown in figure 7. The fitted yields are 11.6 +6.1 −5.5 , 1.7 +3.6 −2.2 , and −3.
together, as we do not distinguish between B 0 andB 0 . The total branching fraction B(B 0 → K 0 µ ± e ∓ ) corresponds, via isospin invariance, to B(B + → K + µ + e − )+ B(B + → K + µ − e + ). The significance of the signal yield for B + → K + µ + e − channel is 2.8σ. To estimate the signal significance for this mode, we have generated a large sample of pseudoexperiments with N true sig = 0 and estimated the number of cases which have N fit sig > N sig (observed). The confidence level obtained is then translated to significance. We calculate the upper limit for these modes at 90% CL using a frequentist method. In this method, for different numbers of signal events N sig (gen), we generate 1000 Monte Carlo experiments with signal and background PDFs, with each set of events being statistically equivalent to our data sample of 711 fb −1 . We fit all these simulated data sets, and, for each value of N sig (gen), we calculate the fraction of MC experiments that have N sig ≤ N sig (data). The 90% CL upper limit is taken to be the value of N sig (gen) (called here N UL sig ) for which 10% of the experiments have N sig ≤ N sig (data). The upper limit on the branching fraction is then derived using the formula: where N BB is the number of BB pairs = (772 ± 11) × 10 6 , f +−(00) is the branching for charged (neutral) B decays, and ε is the signal reconstruction efficiency calculated from signal MC samples. The systematic uncertainty in B UL is included by smearing the N sig obtained from the MC fits with the fractional systematic uncertainty (discussed in section 5). The results are listed in table 3.

Systematic uncertainties
Systematic uncertainties arising due to lepton identification is 0.3% (0.4%) for each muon (electron) selection. This uncertainty is calculated using an inclusive J/ψ → + − , = e or µ sample. Uncertainty due to hadron identification is 0.8% for K + using D * + → D 0 (K − π + )π + sample and 1.6% for K 0 S [38]. The systematic uncertainty due to charged  Mode  track reconstruction is 0.35% per track estimated by using the partially reconstructed D * − →D 0 π − ,D 0 → π + π − K 0 S , and K 0 S → π + π − events. The uncertainty in efficiency due to limited MC statistics is about 0.2%, and the uncertainty in the number of BB events is 1.4%. The systematic uncertainty in the branching fraction [9]. We compare the efficiency of the O > O min criterion between data and MC samples with the control channel B → J/ψK, J/ψ → + − ; the differences between data and MC simulation (0.9-1.2%) are corrected and the corresponding uncertainty (0.2-0.3%) is assigned as a systematic uncertainty. The uncertainty due to PDF shapes is evaluated by varying the fixed shape parameters by ±1σ and repeating the fit; the change in the central value of N sig is taken as the systematic uncertainty, which ranges from 0.1 to 0.6%. The uncertainty due to the fixed yield of continuum events is JHEP03(2021)105  estimated by varying the yield by ±1σ in the fit; the resulting variation in N sig is less than 1%. The charmless B → Kπ + π − background fixed in the fit for the modes with muon final states is varied within ±1σ in the fit, and the change in N sig is assigned as systematic, which is 0.1-0.2%. The decay model systematic for B → K + − modes is evaluated by comparing reconstruction efficiencies calculated from MC samples generated with different models [39,40] and is 0.3 to 2.0% depending on the q 2 bin. For the B → J/ψK branching fraction, we have considered all the sources except for the contribution due to JHEP03(2021)105
The branching fractions for B + → J/ψK + , and B 0 → J/ψK 0 are (1.032 ± 0.007 ± 0.024) × 10 −3 , and (0.902 ± 0.010 ± 0.026) × 10 −3 , respectively. These are the single most precise measurements to date. The R K values for different q 2 bins are consistent with the SM predictions, and the value for the whole q 2 range is 1.10 +0.16 −0.15 ± 0.02. The results for five q 2 bins are

JHEP03(2021)105
Our result of R K + for the bin of interest, q 2 ∈ (1.0, 6.0) GeV 2 /c 4 , is higher than the LHCb result [6,43] by 1.6σ. The A I values for almost all the bins for different channels show a negative asymmetry. For the bin q 2 ∈ (1.0, 6.0) GeV 2 /c 4 , the obtained A I value deviates from zero by 2.6σ for the mode with muon final states. The A I value for the whole q 2 range is −0.19 +0.07 −0.06 ± 0.01. We see no deviation in differential branching fractions for the mode B + → K + µµ, where LHCb [44] observes lower values than the standard model predictions, though not inconsistent with our result. The values for this observable are lower than the theoretical prediction for neutral B decays, reflecting A I < 1. We have also searched for the lepton-flavor-violating B → Kµe decays and set upper limits on their branching fractions at 90% CL: We improve the existing limit on the neutral decay mode by an order of magnitude. More precisely, the limit of BaBar [17] is 2.7 × 10 −7 , i.e., the improvement is by a factor of 7.1.