Search for the lepton flavour violating decay B+ → K+μ−τ+ using $B_{s2}^{*0}$ decays

A search is presented for the lepton flavour violating decay B+ → K+μ−τ+ using a sample of proton-proton collisions at centre-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of 9 fb−1. The τ leptons are selected inclusively, primarily via decays with a single charged particle. The four-momentum of the τ lepton is determined by using B+ mesons from $B_{s2}^{*0} \to B^{+}K^{-}$ decays. No significant excess is observed, and an upper limit is set on the branching fraction ℬ(B+ → K+μ−τ+) < 3.9 × 10−5 at 90% confidence level. The obtained limit is comparable to the world-best limit.


JHEP06(2020)129
meson, it is possible to determine the momentum of the $B^+$ meson up to a quadratic ambiguity by imposing mass constraints on the $B_{s2}^{*0}$ and $B^+$ mesons [18]. This technique was first used to study relative branching fractions in $B^+ \to D^0 X \mu^+ \nu$ decays [19]. We then search for a peak in the missing-mass squared distribution corresponding to the τ mass squared, $m_\tau^2$. Even signal $B^+$ mesons not coming from a $B_{s2}^{*0}$ decay show a peak at $m_\tau^2$. We account for the contribution of these non-$B_{s2}^{*0}$ candidates in the analysis. The τ leptons are selected inclusively, as we only require one additional charged track near the $K^+\mu^-$ pair to help discriminate against background. To normalise the branching fraction, we use the decay $B^+ \to J/\psi K^+$, with $J/\psi \to \mu^+\mu^-$. The normalisation channel is also used to quantify the contributions from $B_{s2}^{*0}$ decays, as well as non-$B_{s2}^{*0}$ candidates with nearby kaons. In addition to providing the missing-mass discriminating variable, this method allows us to study the control sample composed of same-sign $B^+K^+$ decays, which does not include any $B_{s2}^{*0}$ component. We use this sample to optimise the signal selection, and to motivate our description of the background missing-mass shape.

Detector, data samples, and simulation
The LHCb detector [20,21] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet.
The tracking system provides a measurement of the momentum, $p$, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV. The minimum distance of a track to a primary pp interaction vertex (PV), the impact parameter, is measured with a resolution of $(15 + 29/p_T)$ µm, where $p_T$ is the component of the momentum transverse to the beam, in GeV. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. At the hardware trigger stage, events are required to have a muon with high $p_T$ or a hadron, photon or electron with high transverse energy deposited in the calorimeters. The software trigger requires a two-, three- or four-track secondary vertex with a significant displacement from any primary vertex.
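As a quick numerical illustration of the impact-parameter resolution quoted above, the following Python sketch evaluates (15 + 29/pT) µm for a few transverse momenta. The function name `ip_resolution_um` and the chosen pT values are illustrative assumptions and are not part of the analysis software.

```python
def ip_resolution_um(pt_gev: float) -> float:
    """Impact-parameter resolution in micrometres: 15 + 29/pT, with pT in GeV."""
    if pt_gev <= 0.0:
        raise ValueError("pT must be positive")
    return 15.0 + 29.0 / pt_gev


# Illustrative pT values only; the resolution improves towards 15 um at high pT.
for pt in (0.5, 1.0, 2.0, 10.0):
    print(f"pT = {pt:5.1f} GeV -> sigma_IP = {ip_resolution_um(pt):5.1f} um")
```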
We use data samples collected from 2011 to 2018, at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to an integrated luminosity of 9 fb−1. We model signal and normalisation decays using simulation. In the simulation, pp collisions are generated using Pythia [22,23] with a specific LHCb configuration [24]. Decays of hadrons and τ leptons are described by EvtGen [25]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [26,27] as described in ref. [28].

For the signal, we consider both a phase space model and variations of the decay kinematics with effective operators for the $b \to s\mu^+\tau^-$ interaction and their corresponding Wilson coefficients using the distributions from ref. [29] (see also ref. [30]) and the form factors from ref. [31]. The branching fraction limit is determined for various hypotheses: for the phase-space decay, for a decay via the vector or axial-vector operators O

Selection and missing mass calculation
The selection of $B^+$ candidates begins with a $K^+\mu^-$ pair with an invariant mass $m_{K^+\mu^-} > 1800$ MeV to reduce background from semileptonic charm decays. The $K^+$ and $\mu^-$ candidates are formed from high-quality tracks consistent with kaon and muon hypotheses and inconsistent with being produced at any PV in the event. The $K^+\mu^-$ vertex must be of high quality and well separated from any PV. The $K^+\mu^-$ pair is associated with a single PV by choosing the vertex that minimizes the quantity $\chi^2_{\mathrm{IP}}$, defined as the change in the $\chi^2$ of the vertex fit when including or excluding the $K^+\mu^-$ pair in the fit.
To better separate signal candidates with τ leptons from background, we require an additional track, labelled $t^+$, with charge opposite to that of the muon. This track must also be of high quality and inconsistent with being produced at any PV in the event. By adding this third track, we also fully reconstruct the normalisation mode $B^+ \to J/\psi K^+$, with $J/\psi \to \mu^+\mu^-$. Many background candidates are expected to come from B-meson decays of the form $B \to D(\to K^+ X \mu^-) K^+ Y$, where $X$ and $Y$ refer to any number of additional particles. In these cases the kaon originating from the D meson is assigned as the additional track. Since only approximately 2% of τ decays contain a charged kaon, we apply particle identification requirements so that the track is unlikely to be a charged kaon. Events in which multiple candidates are found, or events in which a candidate $\tau^+ \to \pi^+\pi^-\pi^+\nu_\tau$ decay is reconstructed, are not used in this search. This removes backgrounds and avoids overlap with ongoing searches at LHCb exclusively using this decay channel. The overall signal loss is less than 3%.
We split the data samples into signal and normalisation regions based on the invariant mass of the $K^+\mu^-t^+$ triple, using the muon hypothesis for the third track. Candidates with $m_{K\mu\mu} < 4800$ MeV fall into the signal region, while candidates with $5180 < m_{K\mu\mu} < 5380$ MeV and $|m_{\mu\mu} - m_{J/\psi}| < 40$ MeV fall into the normalisation region.
The $B^+$ candidate direction is estimated using the associated PV and $K^+\mu^-$ vertex positions. We next consider prompt tracks, i.e. those that are consistent with being produced at that PV. Those tracks identified as kaons, with a charge opposite to that of the kaon in the $K^+\mu^-$ pair and a small perpendicular momentum relative to the $B^+$ candidate direction, are combined with the $B^+$ candidates to form $B_{s2}^{*0}$ candidates. We refer to this sample as the opposite-sign kaon (OSK) sample. Additionally, we select a control sample, referred to as the same-sign kaon (SSK) sample, by adding prompt kaons of the same sign as the kaon in the $K^+\mu^-$ pair.
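The region definitions above amount to a simple classification on two invariant masses. A minimal Python sketch follows, assuming masses in MeV and an approximate known J/ψ mass of 3096.9 MeV; the function name `classify_region` and the label `"none"` for candidates outside both regions are hypothetical choices made for illustration.

```python
M_JPSI_MEV = 3096.9  # approximate known J/psi mass in MeV (assumed input)


def classify_region(m_kmumu: float, m_mumu: float) -> str:
    """Classify a K mu t candidate using the muon hypothesis for the third track.

    m_kmumu: invariant mass of the K mu t triple, in MeV.
    m_mumu:  invariant mass of the mu t pair, in MeV.
    """
    if m_kmumu < 4800.0:
        return "signal"
    if 5180.0 < m_kmumu < 5380.0 and abs(m_mumu - M_JPSI_MEV) < 40.0:
        return "normalisation"
    return "none"  # hypothetical label: candidate used in neither region


print(classify_region(4200.0, 2500.0))
print(classify_region(5279.0, 3097.0))
```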
From ref. [19], the two B-meson energy solutions are
$$ E_B = \frac{\Delta^2}{2E_K}\,\frac{1}{1 - (p_K/E_K)^2\cos^2\theta}\left[1 \pm \sqrt{d}\right], $$
$$ d = \frac{p_K^2}{E_K^2}\cos^2\theta - \frac{4 m_B^2 p_K^2\cos^2\theta}{\Delta^4}\left(1 - \frac{p_K^2}{E_K^2}\cos^2\theta\right), \qquad \Delta^2 = m_{BK}^2 - m_B^2 - m_K^2, $$
where $m_{BK} = m_{B_{s2}^{*0}}$ is the assumed $B^+K^-$ mass, $p_K$ and $E_K$ are the reconstructed prompt kaon momentum and energy, and $\theta$ is the laboratory frame angle between the prompt kaon and B-meson directions. The missing four-momentum of the τ lepton, $P_{\mathrm{miss}}$, is then reconstructed as $P_B - P_{K^+\mu^-}$, where $P_B$ and $P_{K^+\mu^-}$ are the four-momenta of the B meson and $K^+\mu^-$ pair. The missing mass squared is calculated using the lowest energy, real solution for which the resulting missing energy is greater than the reconstructed energy of the third track under a pion mass hypothesis. With this choice, we correctly reconstruct the energy of signal decays in simulation in more than 75% of cases. About 9% of all signal decays have no such solution and are lost. Both signal and normalisation candidates, as well as the SSK control-sample candidates, are required to pass this procedure. Candidates in the signal region are additionally required to have the residual missing mass squared, defined as the four-momentum difference of the B meson and $K^+\mu^-t^+$ triple, $(P_B - P_{K^+\mu^-} - P_t)^2$, greater than $-0.5$ GeV$^2$. This requirement removes background and only poorly reconstructed signal candidates which do not peak at the τ mass squared. The minimum mass difference, defined in ref. [19] as
$$ \Delta m_{\min} = \sqrt{m_B^2 + m_K^2 + 2 m_B\sqrt{m_K^2 + p_K^2\sin^2\theta}} - m_B - m_K, $$
is required to be greater than 30 MeV. This removes contributions from $B_{s1}^0$ and $B_{s2}^{*0} \to B^{*+}K^-$ decays, as well as background in which a kaon from the B decay is wrongly associated to the primary vertex.
Missing-mass distributions for the signal simulation and the full data sample after the above selection are shown in figure 1. All signal decays, whether they come from a $B_{s2}^{*0}$ meson or not, peak at the known $m_\tau^2$; however, the non-$B_{s2}^{*0}$ candidates have a much wider peak than the $B_{s2}^{*0}$ ones. The data distributions are shown for both the OSK and SSK samples. They have similar shapes with a broad hump centred near 5 GeV$^2$. We note that the OSK sample has a higher yield than the SSK; this excess has been observed in both fully and partially reconstructed decays [19,32].
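The mass-constraint reconstruction can be sketched numerically. The Python code below is a minimal illustration, not the analysis implementation: it solves the quadratic that follows from $m_{BK}^2 = m_B^2 + m_K^2 + 2(E_B E_K - p_B p_K\cos\theta)$, picks the lowest-energy solution whose missing energy exceeds the third-track energy, and evaluates the missing mass squared. The mass values are approximate known values supplied as assumptions, all function names are hypothetical, and energies and momenta are in MeV.

```python
import math

# Approximate known masses in MeV (assumed inputs, not taken from the text).
M_B = 5279.3     # B+
M_K = 493.68     # charged kaon
M_BS2 = 5839.9   # B_s2*0


def b_energy_solutions(p_k, cos_theta, m_bk=M_BS2, m_b=M_B, m_k=M_K):
    """Real B-energy solutions (ascending) from the B K mass constraint."""
    e_k = math.hypot(p_k, m_k)
    delta2 = m_bk**2 - m_b**2 - m_k**2
    a = (p_k / e_k) ** 2 * cos_theta**2
    d = a - 4.0 * m_b**2 * p_k**2 * cos_theta**2 / delta2**2 * (1.0 - a)
    if d < 0.0:
        return []  # no real solution: the candidate is lost
    prefactor = delta2 / (2.0 * e_k * (1.0 - a))
    root = math.sqrt(d)
    return [prefactor * (1.0 - root), prefactor * (1.0 + root)]


def select_solution(solutions, e_kmu, e_track):
    """Lowest-energy solution with missing energy above the third-track energy."""
    for e_b in sorted(solutions):
        if e_b - e_kmu > e_track:
            return e_b
    return None


def missing_mass_squared(e_b, b_dir, p4_kmu, m_b=M_B):
    """(P_B - P_Kmu)^2 for B energy e_b, unit flight direction b_dir,
    and K mu four-momentum p4_kmu = (E, px, py, pz)."""
    p_b = math.sqrt(max(e_b**2 - m_b**2, 0.0))
    e_miss = e_b - p4_kmu[0]
    p_miss = [p_b * n - q for n, q in zip(b_dir, p4_kmu[1:])]
    return e_miss**2 - sum(c * c for c in p_miss)


# Illustrative kaon kinematics (made-up numbers).
print(b_energy_solutions(p_k=13800.0, cos_theta=0.99985))
```

The flight direction `b_dir` would come from the PV and $K^+\mu^-$ vertex positions, as described above; here it is simply passed in as a unit vector.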

Normalisation
We determine the yield of the normalisation decay, as well as the relative efficiency of the signal modes with respect to the normalisation mode, separately for each data-taking year. For the normalisation mode, we determine the inclusive yield of $B^+ \to J/\psi K^+$ decays, whether or not they originate from a $B_{s2}^{*0}$ meson, by a binned maximum-likelihood fit to the $K^+\mu^-t^+$ mass distribution, where we assign the muon mass hypothesis to the third track. The signal is described with a Gaussian distribution, and the background with a linear model. We determine the fraction of the normalisation candidates coming from $B_{s2}^{*0}$ decays using a $K^+\mu^-t^+$ mass fit for the combined-years data sample using the same model as the separated-years samples, along with a binned maximum-likelihood fit to the measured mass-difference distribution $m_{B^+K^-} - m_{B^+} - m_{K^-}$ around the $B_{s2}^{*0}$ peak. For the latter fit, we describe the signal peak with a Gaussian core that transitions to an exponential tail on each side, and we model the background with a third-degree polynomial. The results of these fits are shown in figure 2. The total data sample contains $4240 \pm 70$ $B^+ \to J/\psi K^+$ decays; the fraction originating from $B_{s2}^{*0}$ decays is $f_{B_{s2}^{*0}} = (25.4 \pm 1.8)\%$, where the uncertainty combines the statistical uncertainty with the systematic uncertainty arising from the choice of fit function. The year-to-year variation is not found to be statistically significant, so we use the value obtained from the combined dataset for all years.
The relative efficiency of the signal and normalisation modes is determined using simulation with corrections from data. For $B_{s2}^{*0}$ decays the relative efficiencies in different years average around 30%, with an absolute year-to-year variation of less than 3%. Different signal decay models change the relative efficiency by approximately 10%, with the decays via scalar and pseudoscalar operators having a lower overall efficiency. Signal events in which the $B^+$ meson does not originate from a $B_{s2}^{*0}$ decay have a lower selection efficiency, primarily because fewer of these candidates pass the residual missing-mass requirement and fall into the missing-mass fit range. Using simulation, we derive an additional factor for this signal component of $r_{\text{non-}B_{s2}^{*0}} = 0.849 \pm 0.007$, which gives the relative efficiency with respect to the $B_{s2}^{*0}$ mode.
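The text does not give the parametrisation of the "Gaussian core that transitions to an exponential tail on each side". One common choice with this behaviour, assumed here purely for illustration, joins the exponentials to the Gaussian at $-k_L$ and $+k_R$ standard deviations so that the function and its first derivative are continuous. The function name and parameter names below are hypothetical, and the shape is left unnormalised.

```python
import math


def gauss_exp_tails(x, mean, sigma, k_left, k_right):
    """Unnormalised peak shape: Gaussian core, exponential tails.

    The tails start at mean - k_left*sigma and mean + k_right*sigma and are
    matched to the Gaussian in value and slope at those points.
    """
    t = (x - mean) / sigma
    if t < -k_left:
        return math.exp(0.5 * k_left**2 + k_left * t)
    if t > k_right:
        return math.exp(0.5 * k_right**2 - k_right * t)
    return math.exp(-0.5 * t * t)


# Far from the peak the exponential tail falls off more slowly than a Gaussian.
print(gauss_exp_tails(5.0, 0.0, 1.0, 1.0, 1.0), math.exp(-0.5 * 5.0**2))
```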

Multivariate signal selection
We further improve the signal selection using a Boosted Decision Tree (BDT) classification with the AdaBoost algorithm [33]. The BDT inputs are primarily chosen to distinguish additional tracks coming from signal τ lepton decays from various sources of background. Some examples are semileptonic b-hadron decays to charm where the charm hadron produces a kaon with charge opposite that of the muon, or b-hadron decays where the muon is produced in the semileptonic decay of a child charm hadron. The background training sample is taken from the SSK sample in the $m^2_{\mathrm{miss}}$ region around $m_\tau^2$. This focuses the training on the sources of background which fall near the signal peak. We describe the signal with simulation samples that include only $B_{s2}^{*0}$ decays; the effect of the BDT on non-$B_{s2}^{*0}$ signal simulation is then estimated separately. The training makes use of different topological reconstructions of the $K^+\mu^-t^+$ triple: in addition to the signal selection, we also first combine either the kaon and the track or the muon and the track into a pair before adding the third particle. The pair masses and the flight distance of the pair in each topology help to distinguish the signal from background, for instance when the pair comes from a charm hadron decay. We also include the flight distance of the τ, which we reconstruct as the distance along the τ trajectory found in the missing-mass calculation from the $K^+\mu^-$ vertex to the point of closest approach of the third track. The result of a separate isolation discriminant is included to reduce background with additional charged tracks; this discriminant is trained to distinguish additional tracks belonging to the same b-hadron decay from other tracks in the event. It uses the kinematics of the additional tracks, their distances to and angles with the signal candidate tracks, and topological information from vertices formed by the additional tracks and signal candidate tracks.
A loose requirement on the signal optimisation BDT output is applied, keeping about 70% of all simulated $B_{s2}^{*0}$ signal candidates and about 40% of non-$B_{s2}^{*0}$ signal candidates. We perform the final fit to the $m^2_{\mathrm{miss}}$ distribution in four bins of the BDT output. The bins are chosen by optimising the expected upper limit using a number of background events derived from the yields in the OSK and SSK $m^2_{\mathrm{miss}}$ sidebands from 1 to 2 GeV$^2$ and 4 to 6 GeV$^2$.

Background studies
The background in this analysis is composed of a large number of different partially reconstructed b-hadron decays. None of them, however, produce a narrow peak in $m^2_{\mathrm{miss}}$. Only $B^+$ mesons produced from $B_{s2}^{*0}$ decays have a resolution comparable to the signal. Furthermore, if there is more than one missing particle then the true missing-mass distribution will be much wider than the expected signal peak. Charm hadrons have masses close to the τ mass; however, there is no Standard Model decay $B^+ \to K^+\mu^-D^+$. We are not sensitive to decays such as $B^+ \to K^+\pi^-D^+$, where the pion is misidentified as a muon, because of their low branching fraction and the rate at which hadrons are misidentified as muons. We expect that the missing-mass distribution, summed over many different background components, is smooth, and we model it as a polynomial.
These assumptions are tested using simulation and data. We produce fast simulation samples with RapidSim [34] of a number of potential exclusive background sources from $B^+$, $B^0$, $B_s^0$, and $\Lambda_b^0$ hadrons; the true missing-mass distributions for these decays are smeared to estimate their shapes in data. No sign of any sharply peaking component is found. In data we consider a number of different control samples, namely all possible $K\mu t$ charge combinations in both OSK and SSK samples, excluding the signal selection of $K^+\mu^-t^+$ in the OSK sample. There is no sign of any narrow peak in any of the distributions, even after applying a tight requirement on the BDT output.
Maximum-likelihood fits to the SSK sample using polynomials of different degrees in the restricted $m^2_{\mathrm{miss}}$ range from 1 to 6 GeV$^2$ are used to study the background shape in more detail. The optimal number of free polynomial parameters in the most signal-like BDT output bin, based on the best-fit value of $-2\log L$, penalised by one for each additional parameter, is four. We further study the effect of background modelling by performing a large number of pseudoexperiments, both background-only and with injected signal at branching fractions of $1 \times 10^{-5}$ and $2 \times 10^{-5}$. In these studies, we first fit a background model of some polynomial degree to one of the control samples to determine a base background model. We generate many pseudodatasets from this background model, and then fit them with polynomials of the same degree as well as of different degrees. Based on these studies, we take into account the systematic uncertainty due to the background modelling by reporting the weakest limit using background descriptions of third, fourth, or fifth degree polynomials, all of which describe the background shapes in the pseudoexperiments well.
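The penalised choice of the number of polynomial parameters can be written compactly. The sketch below assumes the criterion is to minimise $-2\log L + n$, with $n$ the number of free parameters; the function name is hypothetical and the $-2\log L$ values are invented for illustration, not taken from the analysis.

```python
def best_parameter_count(nll2_by_nparams, penalty=1.0):
    """Return the parameter count that minimises -2 log L + penalty * n_params.

    nll2_by_nparams maps the number of free polynomial parameters to the
    best-fit value of -2 log L obtained with that many parameters.
    """
    return min(nll2_by_nparams, key=lambda n: nll2_by_nparams[n] + penalty * n)


# Invented -2 log L values: the improvement flattens out beyond four parameters.
example = {2: 1012.4, 3: 1006.1, 4: 1003.0, 5: 1002.6, 6: 1002.3}
print(best_parameter_count(example))
```

An extra parameter is accepted only if it lowers $-2\log L$ by more than the penalty, which is why the nearly flat values beyond four parameters in this toy input are rejected.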

Fit description
We search for the $K^+\mu^-\tau^+$ missing-mass peak with an unbinned maximum-likelihood fit simultaneously in four bins of BDT output in the OSK $K^+\mu^-t^+$ signal channel. The fit is performed in the missing-mass range $1 < m^2_{\mathrm{miss}} < 6$ GeV$^2$. The parameter of interest is the branching fraction $\mathcal{B}(B^+ \to K^+\mu^-\tau^+)$. We describe the $m^2_{\mathrm{miss}}$ shape for the signal component with a generalized hyperbolic distribution [35] with shape parameters obtained from simulation. Two signal shapes are used: one for $B_{s2}^{*0}$ decays, and one for the wider non-$B_{s2}^{*0}$ contribution. We determine the shapes separately in each bin of BDT response. The signal decay model does not significantly affect the signal missing-mass shape. The background is described by polynomial functions which vary independently in each BDT output bin.

We base the normalisation of the signal components on the yields of the $B^+ \to J/\psi K^+$ decays; the yield is determined in data year-by-year to account for different efficiencies between years. We combine this with the relative efficiencies, $\varepsilon_{\mathrm{rel}}$; the known $B^+ \to J/\psi K^+$ with $J/\psi \to \mu^+\mu^-$ combined branching fraction, abbreviated as $\mathcal{B}(J/\psi K^+)$; and the parameter of interest to derive a total number of $B^+ \to K^+\mu^-\tau^+$ signal decays. This total is divided between $B_{s2}^{*0}$ and non-$B_{s2}^{*0}$ decays based on the observed fraction in the normalization channel, and then distributed across the four BDT bins. This gives the yield of each signal component in each BDT bin $j$, where $\varepsilon_{B_{s2}^{*0},j}$ and $\varepsilon_{\text{non-}B_{s2}^{*0},j}$ are the separate efficiencies for each signal component to be found in BDT bin $j$. The main parameters of the fit are thus the $B^+ \to K^+\mu^-\tau^+$ branching fraction, four parameters for the background normalisation in each BDT bin, and up to five parameters describing the polynomial background shapes in each BDT bin.
The largest systematic uncertainty comes from the choice of background model. The fifth-degree background description yields the weakest limit among the tested background models. We include the effects of other systematic uncertainties using Gaussian-constrained nuisance parameters. These nuisance parameters modify the normalisation yield, the relative efficiency of the signal and normalisation channels, the signal yield in each BDT bin, and the signal shapes. The largest effects come from the modelling of the kinematics of $B_{s2}^{*0}$ decays in simulation, which results in 5% changes in the relative efficiency and in the signal fractions in each bin of BDT response. The relative statistical uncertainty of the $B_{s2}^{*0}$ fraction taken from the normalisation channel is approximately 7%. Altogether, the total effect of these systematic uncertainties on the final limit is small, at the $10^{-6}$ level.
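The bookkeeping described above can be illustrated with a short Python sketch. It encodes one plausible reading of the text, stated here as an assumption: the total signal yield is the branching-fraction ratio times the sum over years of normalisation yield times relative efficiency; the $B_{s2}^{*0}$ share is scaled by $f_{B_{s2}^{*0}}$, and the non-$B_{s2}^{*0}$ share by $(1 - f_{B_{s2}^{*0}})$ and the additional efficiency factor $r_{\text{non-}B_{s2}^{*0}}$. The function name, argument names, and all numbers in the usage example are hypothetical.

```python
def signal_yields(bf_signal, bf_norm, norm_yields, rel_effs,
                  f_bs2, r_non, eff_bs2_bins, eff_non_bins):
    """Expected signal yields per BDT bin for the two signal components.

    bf_signal:     assumed B+ -> K+ mu- tau+ branching fraction
    bf_norm:       combined B+ -> J/psi(-> mu mu) K+ branching fraction
    norm_yields:   normalisation yields, one per data-taking year
    rel_effs:      signal/normalisation relative efficiencies, one per year
    f_bs2:         fraction of B+ mesons coming from B_s2*0 decays
    r_non:         extra efficiency factor for the non-B_s2*0 component
    eff_*_bins:    efficiency for each component to fall in each BDT bin
    """
    n_total = bf_signal / bf_norm * sum(n * e for n, e in zip(norm_yields, rel_effs))
    n_bs2 = [n_total * f_bs2 * e for e in eff_bs2_bins]
    n_non = [n_total * (1.0 - f_bs2) * r_non * e for e in eff_non_bins]
    return n_bs2, n_non


# Toy inputs chosen for easy arithmetic, not analysis values.
bs2, non = signal_yields(2e-5, 1e-4, [1000.0, 3000.0], [0.3, 0.3],
                         0.25, 0.8, [0.1, 0.2, 0.3, 0.1], [0.1, 0.1, 0.1, 0.1])
print(bs2, non)
```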

Results and conclusion
The result at the best fit point is shown in figure 3. The obtained value for the signal branching fraction from the maximum-likelihood fit is $(1.9 \pm 1.5) \times 10^{-5}$. No significant excess is observed, and we set upper limits on the branching fraction using the CLs method [36]. We perform a scan in the signal branching fraction, obtaining the signal and background p-values from the distributions of a one-sided profile-likelihood-ratio test statistic obtained with pseudoexperiments in which we vary the constraints on the systematic uncertainties. The scan used to determine the observed limits, compared to the expected one, is shown in figure 4. The expected upper limit at 90% CL is $2.3 \times 10^{-5}$. The observed upper limit at 90% CL is $3.9 \times 10^{-5}$.
This is the first result from the LHCb experiment for the lepton-flavour violating decay $B^+ \to K^+\mu^-\tau^+$. By studying $B^+$ mesons from $B_{s2}^{*0}$ decays, we are able to make the first analysis at LHCb of a B hadron decay using inclusive τ decays. This provides complementary information to searches for lepton-flavour violation at LHCb with three-prong τ decays, for example $B^0_{(s)} \to \tau^\pm\mu^\mp$ decays [37]. We observe no significant signal, and set an upper limit slightly above that obtained by the BaBar collaboration [17].
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.