Search for $B{}^0_s \rightarrow \ell^{\mp} \tau^{\pm}$ with the Semi-leptonic Tagging Method at Belle

We present a search for the lepton-flavor-violating decays $B{}^0_s \rightarrow \ell^{\mp}\tau^{\pm}$, where $\ell = e, \mu$, using the full data sample of $121~\mathrm{fb}^{-1}$ collected at the $\Upsilon(5S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider. We use $B{}^0_s \overline{B}{}^0_s$ events in which one $B{}^0_s$ meson is reconstructed in a semileptonic decay mode and the other in the signal mode. We find no evidence for $B{}^0_s \rightarrow \ell^{\mp}\tau^{\pm}$ decays and set upper limits on their branching fractions at $90\%$ confidence level as $\mathcal{B}(B{}^0_s \rightarrow e^{\mp}\tau^{\pm})<14 \times 10^{-4}$ and $\mathcal{B}(B{}^0_s \rightarrow \mu^{\mp}\tau^{\pm})<7.3 \times 10^{-4}$. Our result represents the first upper limit on the $B{}^0_s \rightarrow e^{\mp}\tau^{\pm}$ decay rate.


Introduction
The lepton-flavor-violating (LFV) decays $B{}^0_s \rightarrow \ell^{\mp}\tau^{\pm}$, where $\ell = e, \mu$, are forbidden in the standard model (SM). Such decays can occur via neutrino mixing in loop and box diagrams [1], but the predicted decay rates are far below current experimental capabilities. Thus, any observation at current experiments would constitute an unambiguous signature of new physics (NP). Recent results indicating possible lepton flavor universality violation in $B$ meson decays have been discussed in Refs. [2,3], where many NP models are proposed to explain them. Such models allow significantly enhanced LFV decay rates that may be detectable with current facilities. For example, models containing a heavy neutral gauge boson ($Z'$) could lead to an enhanced $B{}^0_s \rightarrow \mu^-\tau^+$ branching fraction, up to $10^{-8}$ when only left- or right-handed couplings to quarks are considered, or of order $10^{-6}$ if both are allowed [4]. In models with either scalar or vector leptoquarks, the predicted branching fraction of $B{}^0_s \rightarrow \ell^-\tau^+$ can be as large as $10^{-5}$ [5][6][7], depending on the assumed leptoquark mass. It is imperative to search for signals of physics beyond the SM in all possible avenues, and since the expected branching fraction of $B{}^0_s \rightarrow e^-\tau^+$ may differ from that of $B{}^0_s \rightarrow \mu^-\tau^+$ depending on the model, it is important to search for both decay modes to obtain additional information regarding the NP. To date, no experimental results for $B{}^0_s \rightarrow e^{\mp}\tau^{\pm}$ have been reported, while an upper limit $\mathcal{B}(B{}^0_s \rightarrow \mu^{\mp}\tau^{\pm}) < 3.4 \times 10^{-5}$ at 90% confidence level (CL) has been reported by LHCb [8].
In this paper, we report a search for $B{}^0_s \rightarrow \ell^{\mp}\tau^{\pm}$ decays using 121 fb$^{-1}$ of data collected by the Belle experiment at the KEKB asymmetric-energy $e^+e^-$ collider [9,10]. The data were collected at an $e^+e^-$ center-of-mass (c.m.) energy corresponding to the $\Upsilon(5S)$ resonance mass.

Data sample and Belle detector
The Belle detector is a large-solid-angle magnetic spectrometer comprising a silicon vertex detector, a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and a CsI(Tl) crystal electromagnetic calorimeter (ECL). All these components are located inside a superconducting solenoid providing a magnetic field of 1.5 T. An iron flux return located outside the solenoid coil is instrumented with resistive plate chambers to detect $K^0_L$ mesons and muons (KLM). A more detailed description of the detector, its layout, and its performance can be found in Refs. [11,12].
We study the properties of signal events, identify sources of background, and optimize selection criteria using Monte Carlo (MC) simulated events. These samples are generated using EvtGen [13]. The detector response is simulated using the GEANT3 framework [14]. We simulate 20 million $B{}^0_s \rightarrow \ell^{\mp}\tau^{\pm}$ MC events to study the detector response and to calculate signal reconstruction efficiencies. To estimate backgrounds, we use MC samples of B [15], and $e^+e^- \rightarrow q\overline{q}$ ($q = u, d, s, c$) events. These samples, referred to as generic MC, are equivalent to six times the data luminosity. The Belle data are converted into the Belle II format [16], and the particle and event reconstruction are performed within the basf2 framework [17,18] of the Belle II experiment.
The $B{}^0_s$ and $\overline{B}{}^0_s$ mesons are produced in the process $\Upsilon(5S) \rightarrow B{}^{(*)0}_s \overline{B}{}^{(*)0}_s$ and undergo $B_s$ mixing, such that half of the events contain same-flavor $B_s$ pairs. The $\Upsilon(5S)$ resonance production cross section is $340 \pm 16$ pb [19], and $f_s$, its total branching fraction for decays to $B{}^{(*)0}_s \overline{B}{}^{(*)0}_s$, is $0.201 \pm 0.031$ [15]. Therefore, the Belle data sample is estimated to contain $(16.6 \pm 2.7) \times 10^6$ $B_s$ mesons.
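The $B_s$ count follows from the quantities quoted above, and the arithmetic can be checked with a minimal sketch. It assumes that only the uncertainties on the cross section and on $f_s$ contribute and that they are uncorrelated; the variable names are illustrative. With the rounded luminosity of 121 fb$^{-1}$ it gives about $16.5 \times 10^6$, compatible with the quoted value, and a relative uncertainty of about 16%.

```python
import math

# Inputs quoted in the text
lumi_fb = 121.0                      # integrated luminosity in fb^-1
sigma_pb, sigma_err = 340.0, 16.0    # Upsilon(5S) production cross section in pb
f_s, f_s_err = 0.201, 0.031          # fraction of Upsilon(5S) decays to Bs(*) pairs

# 1 fb^-1 = 1000 pb^-1; each Bs(*) anti-Bs(*) event contains two Bs mesons
n_bs = 2.0 * lumi_fb * 1000.0 * sigma_pb * f_s

# Assumption: only sigma and f_s uncertainties, treated as uncorrelated
rel_err = math.hypot(sigma_err / sigma_pb, f_s_err / f_s)
n_bs_err = n_bs * rel_err

print(f"N_Bs = ({n_bs / 1e6:.1f} +/- {n_bs_err / 1e6:.1f}) x 10^6")
print(f"relative uncertainty = {100 * rel_err:.1f}%")
```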

Event Selection and Analysis Overview
Hereafter, $B_s$ refers to either $B{}^0_s$ or $\overline{B}{}^0_s$, and the inclusion of charge-conjugated modes is implied. In this analysis, one $B_s$ is reconstructed in a semileptonic decay mode $B{}^0_s \rightarrow D^+_s \ell^- (X) \nu_\ell$ and used as a tag, where $X$ stands for any additional particles, such as a pion or a combination of pions, and the signal $B_s \rightarrow \ell^-\tau^+$ is searched for in the mode $\tau^+ \rightarrow \ell^+ \nu_\tau \nu_\ell$. We label the primary lepton and the secondary lepton from the $\tau$ decay on the signal side $B_s$ as $\ell_1$ and $\ell_2$, respectively, and the lepton on the tag side as $\ell_3$. Figure 1 shows a schematic diagram of the process, separated into signal and tag sides. To avoid biasing the results, all selection criteria are determined in a "blind" manner, i.e., they are optimized using MC samples only, before the experimental data in the signal region are revealed.
For charged particles, aside from pions from $K^0_S$ decays, the distances of nearest approach of the track to the nominal interaction point, perpendicular to and along the beam direction, are required to be less than 0.5 cm and 2.0 cm, respectively. The $K^0_S$ candidates are reconstructed by combining two oppositely charged particles (assumed to be pions) with an invariant mass between 487 and 508 MeV/$c^2$; this range corresponds to approximately three standard deviations ($\pm 3\sigma$) in the invariant mass resolution around the nominal $K^0_S$ mass [15]. Such candidates are further subjected to a neural network-based identification [20].
The $\pi^0$ candidates are reconstructed from pairs of photons detected as ECL clusters without any associated charged tracks in the CDC. The energy of each photon is required to be greater than 50 MeV if the photon is detected in the barrel region ($32.2^\circ < \theta < 128.7^\circ$, where $\theta$ is its polar angle), greater than 100 MeV if the photon is in the forward endcap region ($12.4^\circ < \theta < 31.4^\circ$), and greater than 150 MeV if the photon is in the backward endcap region ($130.7^\circ < \theta < 155.1^\circ$) [11,12].
The invariant mass of each photon pair is required to be between 120 and 150 MeV/$c^2$; this range corresponds to a window of approximately $\pm 3\sigma$ in the invariant mass resolution around the nominal $\pi^0$ mass. The reconstructed $\pi^0$ momentum in the c.m. frame ($p^*_{\pi^0}$) must be greater than 0.2 GeV/$c$. A mass-constrained fit to the nominal $\pi^0$ mass is performed to improve the momentum resolution.
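The region-dependent photon energy thresholds described for the $\pi^0$ reconstruction can be sketched as a simple lookup. This is an illustration only, not the analysis code: the function names are hypothetical, and rejecting photons that fall in the gaps between the quoted angular regions is an assumption.

```python
from typing import Optional

# (region, theta_min_deg, theta_max_deg, min_energy_MeV) as quoted in the text
ECL_REGIONS = [
    ("forward endcap", 12.4, 31.4, 100.0),
    ("barrel", 32.2, 128.7, 50.0),
    ("backward endcap", 130.7, 155.1, 150.0),
]

def min_photon_energy_mev(theta_deg: float) -> Optional[float]:
    """Return the energy threshold for the region containing theta, or None."""
    for _name, theta_lo, theta_hi, e_min in ECL_REGIONS:
        if theta_lo < theta_deg < theta_hi:
            return e_min
    return None  # outside the quoted regions (assumption: such photons are rejected)

def passes_photon_selection(energy_mev: float, theta_deg: float) -> bool:
    """Apply the region-dependent minimum-energy requirement to one photon."""
    e_min = min_photon_energy_mev(theta_deg)
    return e_min is not None and energy_mev > e_min
```

For example, a 60 MeV photon passes in the barrel (`passes_photon_selection(60, 90)`) but fails in the forward endcap (`passes_photon_selection(60, 20)`).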
To identify charged hadrons, we use information on the light yield from the ACC, crossing time from the TOF, and specific ionization from the CDC. This information is combined into likelihoods $\mathcal{L}_K$ and $\mathcal{L}_\pi$ for a given track to be a $K^+$ or $\pi^+$, respectively. To identify $K^+$ or $\pi^+$ tracks, we require $\mathcal{L}_K/(\mathcal{L}_K + \mathcal{L}_\pi) > 0.6$ or $\mathcal{L}_\pi/(\mathcal{L}_K + \mathcal{L}_\pi) > 0.6$, respectively. This requirement is more than 93% efficient in identifying pions, with a $K^+$ misidentification rate below 5%. Muon candidates are selected based on information from the KLM [21]. We calculate a normalized muon likelihood ratio $\mathcal{R}_\mu = \mathcal{L}_\mu/(\mathcal{L}_\mu + \mathcal{L}_\pi + \mathcal{L}_K)$, where $\mathcal{L}_\mu$ is the likelihood for muons, and require $\mathcal{R}_\mu > 0.9$. This requirement has an efficiency of 85-92% and a probability of misidentifying a hadron as a muon below 7%. Electron candidates are identified using the ratio of calorimetric cluster energy to particle momentum, the shower shape in the ECL, the matching of the track with the ECL cluster, the specific ionization in the CDC, and the number of photoelectrons in the ACC [22]. This information is used to calculate a normalized electron likelihood ratio $\mathcal{R}_e = \mathcal{L}_e/(\mathcal{L}_e + \mathcal{L}_{\mathrm{had}})$, where $\mathcal{L}_e$ is the likelihood for electrons and $\mathcal{L}_{\mathrm{had}}$ is a product of hadron likelihoods. We require $\mathcal{R}_e > 0.9$. This requirement has an efficiency of 84-92% and a probability of misidentifying a hadron as an electron below 1%.
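The likelihood-ratio requirements above amount to simple normalized ratios with a threshold. A minimal sketch follows, assuming the per-track likelihoods are already available as plain numbers; the function names are hypothetical and do not correspond to the Belle software interface.

```python
from typing import Optional

def kaon_pion_id(l_k: float, l_pi: float, cut: float = 0.6) -> Optional[str]:
    """Classify a track as 'K' or 'pi' from binary likelihood ratios, else None."""
    total = l_k + l_pi
    if total <= 0.0:
        return None  # no usable particle-identification information
    if l_k / total > cut:
        return "K"
    if l_pi / total > cut:
        return "pi"
    return None  # ambiguous track: neither ratio exceeds the cut

def muon_ratio(l_mu: float, l_pi: float, l_k: float) -> float:
    """Normalized muon likelihood ratio R_mu = L_mu / (L_mu + L_pi + L_K)."""
    return l_mu / (l_mu + l_pi + l_k)

def is_muon(l_mu: float, l_pi: float, l_k: float, cut: float = 0.9) -> bool:
    """Apply the R_mu > 0.9 requirement quoted in the text."""
    return muon_ratio(l_mu, l_pi, l_k) > cut
```

Because the kaon and pion ratios sum to one and the cut is above 0.5, a track can satisfy at most one of the two hadron hypotheses.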
For the tag side $B{}^0_s \rightarrow D^+_s \ell^-_3 (X) \nu_{\ell_3}$, the charge of $\ell_3$ can be opposite to or the same as that of $\ell_1$, as $B_s$ mixing produces equal numbers of opposite- and same-charge combinations. However, we accept only combinations where the charges of $\ell_1$ and $\ell_3$ are the same; this significantly reduces combinatorial background. We reconstruct $D_s$ meson candidates with opposite charge to $\ell_3$ from the following five decay modes: $D^+_s \rightarrow \phi\pi^+$, $\overline{K}{}^{*0}K^+$, $\phi\rho^0\pi^+$, $K^0_S K^+$, and $\phi\rho^+$. Here, $\rho^0$, $\rho^+$, $\overline{K}{}^{*0}$, and $\phi$ are reconstructed through $\rho^0 \rightarrow \pi^+\pi^-$, $\rho^+ \rightarrow \pi^+\pi^0$, $\overline{K}{}^{*0} \rightarrow K^-\pi^+$, and $\phi \rightarrow K^+K^-$, and candidates are required to have a reconstructed invariant mass of 625 MeV/$c^2$ $< M_{\pi^+\pi^-(\pi^+\pi^0)} <$ 925 MeV/$c^2$ for $\rho^0$ ($\rho^+$), 845 MeV/$c^2$ $< M_{K^+\pi^-} <$ 945 MeV/$c^2$ for $K^{*0}$, and 1.01 GeV/$c^2$ $< M_{K^+K^-} <$ 1.03 GeV/$c^2$ for $\phi$. The $D_s$ candidate is then combined with an $e$ or $\mu$ to form a $B_s$ meson candidate. Figure 2 shows the mass distribution of $D^+_s$ meson candidates. The mass of the $D^+_s$ candidate is required to be between 1.96 and 1.98 GeV/$c^2$. These mass windows correspond to $\pm 3\sigma$ in the invariant mass resolution around the nominal masses [15]. Figure 3 shows the $p^*_1$ distribution for $B_s \rightarrow e^-\tau^+$ and $B_s \rightarrow \mu^-\tau^+$ after initial selections.
These criteria reject 98% of the background events with 40% signal loss. After applying all selection criteria, 8-9% of events have multiple signal candidates. For these events, the candidate with the highest FastBDT output is retained. This criterion is found to select the correct signal candidate 91% of the time for both decay modes. The reconstruction efficiencies from the signal simulations are 0.032% and 0.031% for $B_s \rightarrow e^-\tau^+$ and $B_s \rightarrow \mu^-\tau^+$, respectively. Figure 5 shows the $p^*_1$ distribution after applying the selection on $\mathcal{O}_{\mathrm{FastBDT}}$. We observe three events for $B_s \rightarrow e^-\tau^+$ and one event for $B_s \rightarrow \mu^-\tau^+$ in the signal region.
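The best-candidate selection described above (retain, per event, the candidate with the highest FastBDT output) can be sketched as follows. The dictionary keys `event_id` and `bdt_output` are hypothetical names chosen for illustration, and the numerical values are made up.

```python
def select_best_candidates(candidates):
    """Keep, for each event, the candidate with the highest classifier output."""
    best = {}
    for cand in candidates:
        event_id = cand["event_id"]
        if event_id not in best or cand["bdt_output"] > best[event_id]["bdt_output"]:
            best[event_id] = cand
    return best

# Illustrative input: event 1 has two candidates, event 2 has one
candidates = [
    {"event_id": 1, "bdt_output": 0.41},
    {"event_id": 1, "bdt_output": 0.87},
    {"event_id": 2, "bdt_output": 0.65},
]
selected = select_best_candidates(candidates)
```

After this step every event contributes at most one candidate, so event counts in the signal region are not inflated by multiple candidates.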

Systematic Uncertainties
A summary of systematic uncertainties is shown in Table 1. To estimate the systematic uncertainty of the $\mathcal{O}_{\mathrm{FastBDT}}$ selection, we use the 711 fb$^{-1}$ data sample taken at the $\Upsilon(4S)$ and reconstruct $B^- \rightarrow D^0\pi^-$ decays, tagging the other-side $B^+$ by $B^+ \rightarrow \overline{D}{}^0 \ell^+ \nu$. Here, the signal-side $D^0$ is reconstructed in the mode $K^-\pi^+$, while the tag-side $\overline{D}{}^0$ is reconstructed in the three decay modes $\overline{D}{}^0 \rightarrow K^+\pi^-$, $K^0_S\pi^+\pi^-$, and $K^+\pi^-\pi^+\pi^-$. In this study, the $\pi^-$ from the $B^-$ is treated as $\ell_1$ in the signal mode, the $K^-$ from the $D^0$ as $\ell_2$, and the $\pi^+$ from the $D^0$ is neglected. With these changes, the topology of these events becomes similar to that of our signal mode, and we use the same MVA as for the signal without retraining.
The semileptonic branching fraction of the $B_s$ is poorly known, so we estimate the systematic uncertainty of tagging from the data, using a control sample of $B{}^0_s \rightarrow D^+_s (X) \ell^- \nu_\ell$, i.e., the $B_s \rightarrow \ell^-\tau^+$ mode is replaced by $B{}^0_s \rightarrow D^-_s \ell^+ \nu_\ell$. In this control sample study, the signal side $B{}^0_s \rightarrow D^-_s \ell^+ \nu_\ell$ is reconstructed using three $D_s$ decay modes: $D^-_s \rightarrow \phi(\rightarrow K^+K^-)\pi^-$, $K^0_S K^-$, and $K^{*0}(\rightarrow K^+\pi^-)K^-$. The tag side $B{}^0_s \rightarrow D^+_s (X) \ell^- \nu_\ell$ is reconstructed in the same way as for the $B_s \rightarrow \ell^-\tau^+$ analysis. We require the mass of the tag-side $D_s$ meson candidate to satisfy $1.96 < M_{D_s} < 1.98$ GeV/$c^2$. We also require the momentum of the tag-side lepton to be greater than 1.0 GeV/$c$ and $\mathcal{O}_{\mathrm{FastBDT}}$ to be greater than 0.2. If there are multiple combinations in one event, the one with the highest FastBDT output is retained. We extract the signal by performing a one-dimensional unbinned fit to $M_{D_s}$ on the signal side. We find signal yields of $34.3 \pm 6.7$ and $37.0 \pm 6.8$ for MC and data events, respectively, which are consistent within the uncertainties.
These yields are approximately proportional to the square of the tagging efficiency including the branching fraction of semileptonic $B_s$ decay to $D_s$, so we take half the uncertainty on the yields to be the systematic uncertainty from the tag side reconstruction. Taking into account additional contributions due to different $D_s$ reconstruction and FastBDT selection in this control sample study, we assign 15.0% as the systematic uncertainty from the tag side reconstruction. This uncertainty includes the contribution of the uncertainty on the branching fraction of the semileptonic $B_s$ decay to $D_s$ as well as the effect of the reconstruction and selection on $D_s$ and $\ell$. Other systematic uncertainties arise from the signal-side leptons $\ell_1$ and $\ell_2$. The systematic uncertainty due to charged track reconstruction is estimated to be 0.35% per track by using partially reconstructed $D^{*-} \rightarrow \overline{D}{}^0\pi^-$, $\overline{D}{}^0 \rightarrow \pi^-\pi^+K^0_S$, and $K^0_S \rightarrow \pi^+\pi^-$ events [27]. The systematic uncertainties due to lepton identification are 4.3% and 3.5% for the $B_s \rightarrow e^-\tau^+$ and $B_s \rightarrow \mu^-\tau^+$ decay modes, respectively. The systematic uncertainty due to the $\tau$ decay branching fractions is 0.2% [15]. In addition, the systematic uncertainty due to $B_s$ meson counting is estimated as 16.1%. The total systematic uncertainty is taken as the sum in quadrature of all individual contributions.
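The quadrature sum can be illustrated with the contributions quoted in this section. This is a sketch under stated assumptions, not a reproduction of Table 1: the $\mathcal{O}_{\mathrm{FastBDT}}$ selection uncertainty is not quoted in the text above and is therefore omitted, and the tracking uncertainty is assumed to add linearly over the two signal-side tracks.

```python
import math

def total_systematic(sources):
    """Sum relative uncertainties (in percent) in quadrature."""
    return math.sqrt(sum(value * value for value in sources.values()))

# Contributions in percent, as quoted in the text
common = {
    "tag side": 15.0,
    "tracking": 2 * 0.35,   # assumption: two signal-side tracks, added linearly
    "tau branching fraction": 0.2,
    "Bs counting": 16.1,
}
# Assumption: the FastBDT selection uncertainty from Table 1 is not included here
total_e = total_systematic({**common, "lepton ID": 4.3})
total_mu = total_systematic({**common, "lepton ID": 3.5})

print(f"partial total (e mode):  {total_e:.1f}%")
print(f"partial total (mu mode): {total_mu:.1f}%")
```

The tag-side and $B_s$-counting terms dominate the partial sums, so the remaining contributions change the total only slightly.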

Results and Summary
In the signal region, we find three events for $B_s \rightarrow e^-\tau^+$ and one event for $B_s \rightarrow \mu^-\tau^+$, as shown in Figure 5. The expected number of background events in the signal region, $N^{\mathrm{exp}}_{\mathrm{bkg}}$, is estimated from the number of events in the sideband, scaled by the ratio of events in the signal region and sideband without the $\mathcal{O}_{\mathrm{FastBDT}}$ selection as determined from MC simulation. Here, the sidebands are defined as $p^*_1 \in [1.9, 2.1]$ GeV/$c$ and $p^*_1 \in [2.7, 3.0]$ GeV/$c$. We find $N^{\mathrm{exp}}_{\mathrm{bkg}} = 0.68 \pm 0.69$ for $B_s \rightarrow e^-\tau^+$ and $N^{\mathrm{exp}}_{\mathrm{bkg}} = 0.77 \pm 0.78$ for $B_s \rightarrow \mu^-\tau^+$. The number of observed events in the electron mode is larger than, but not inconsistent with, the expected number; the probability of obtaining three or more events with $N^{\mathrm{exp}}_{\mathrm{bkg}} = 0.68 \pm 0.69$ is 7.3%. The $p^*_1$ distribution of the three events in the electron mode is different from the expectation for the signal. Thus we calculate upper limits on the branching fractions.
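The sideband scaling and the consistency check can be sketched as below. The function names are hypothetical, and the sideband counts are not quoted in the text, so none are filled in. The Poisson tail is evaluated for a fixed mean of 0.68, i.e., ignoring the uncertainty on $N^{\mathrm{exp}}_{\mathrm{bkg}}$; it gives a probability of about 3%, smaller than the quoted 7.3%, which is stated for $0.68 \pm 0.69$ and so presumably folds in that uncertainty.

```python
import math

def expected_background(n_sideband_data, n_signal_mc, n_sideband_mc):
    """Scale the data sideband count by the MC signal-region/sideband ratio."""
    return n_sideband_data * n_signal_mc / n_sideband_mc

def poisson_tail(n_min, mean):
    """Probability of observing n_min or more events for a Poisson mean."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n_min))
    return 1.0 - below

# Fixed-mean check for the electron mode (background uncertainty neglected)
p_e = poisson_tail(3, 0.68)
print(f"P(N >= 3 | mean = 0.68) = {100 * p_e:.1f}%")
```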
To calculate this limit, we use the POLE program [28,29] with the relation $\mathcal{B} = (N_{\mathrm{obs}} - N^{\mathrm{exp}}_{\mathrm{bkg}})/(N_{B_s} \times \epsilon_{\mathrm{sig}})$, where $N_{\mathrm{obs}}$ is the number of observed events, $N_{B_s}$ is the number of $B_s$ mesons in the data, $(16.6 \pm 2.7) \times 10^6$, and $\epsilon_{\mathrm{sig}}$ is the signal efficiency including the branching fraction of the $\tau$. The uncertainties on $\epsilon_{\mathrm{sig}}$ and $N_{B_s}$ listed in Section 4, together with the uncertainty of $N^{\mathrm{exp}}_{\mathrm{bkg}}$, are taken into account in the upper limit estimation [28]. Since the uncertainty in $f_s$ is significant, we report the upper limit not only on the branching fraction but also on $f_s \times \mathcal{B}(B_s \rightarrow \ell^-\tau^+)$. Table 2 summarizes the results, including the upper limits.
To summarize, we have searched for the decays $B{}^0_s \rightarrow \ell^{\mp}\tau^{\pm}$ using the Belle data sample of 121 fb$^{-1}$ collected at the $\Upsilon(5S)$ resonance. From the observed signal yields, we set upper limits at 90% confidence level. Our limit on the $B{}^0_s \rightarrow e^{\mp}\tau^{\pm}$ decay rate is the first such limit reported. The sensitivity to these modes can be improved in the future with the Belle II experiment, which could collect a much larger data sample at the $\Upsilon(5S)$ resonance and apply enhanced analysis techniques such as full reconstruction of the tag $B{}^0_s$ [30].
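The relation used as input to the limit calculation can be evaluated directly. The sketch below assumes that the reconstruction efficiencies quoted earlier (0.032% and 0.031%) play the role of $\epsilon_{\mathrm{sig}}$, i.e., that they already include the $\tau$ branching fraction. It computes only the central value and the single-event sensitivity; the 90% CL upper limits themselves come from the POLE program, which is not reproduced here.

```python
N_BS = 16.6e6  # number of Bs mesons in the data sample, from the text

def branching_fraction(n_obs, n_bkg, eff, n_bs=N_BS):
    """Central value B = (N_obs - N_bkg) / (N_Bs * eff)."""
    return (n_obs - n_bkg) / (n_bs * eff)

def single_event_sensitivity(eff, n_bs=N_BS):
    """Branching fraction corresponding to one signal event."""
    return 1.0 / (n_bs * eff)

# Assumption: the quoted efficiencies are eps_sig (tau branching fraction included)
b_e = branching_fraction(n_obs=3, n_bkg=0.68, eff=0.032e-2)
b_mu = branching_fraction(n_obs=1, n_bkg=0.77, eff=0.031e-2)
ses_e = single_event_sensitivity(0.032e-2)

print(f"central value (e mode):  {b_e:.2e}")
print(f"central value (mu mode): {b_mu:.2e}")
print(f"single-event sensitivity (e mode): {ses_e:.2e}")
```

Under this assumption one signal event corresponds to a branching fraction of roughly $2 \times 10^{-4}$, the same order as the reported limits.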