Search for the lepton flavour violating decay τ⁻ → µ⁻µ⁺µ⁻

A search for the lepton flavour violating decay τ⁻ → µ⁻µ⁺µ⁻ is performed with the LHCb experiment. The data sample corresponds to an integrated luminosity of 1.0 fb⁻¹ of proton-proton collisions at a centre-of-mass energy of 7 TeV and 2.0 fb⁻¹ at 8 TeV. No evidence is found for a signal, and a limit is set at 90% confidence level on the branching fraction, B(τ⁻ → µ⁻µ⁺µ⁻) < 4.6 × 10⁻⁸.


Introduction
Lepton flavour violating processes are allowed within the context of the Standard Model (SM) with massive neutrinos, but their branching fractions are of order 10⁻⁴⁰ [1,2] or smaller. Observation of charged lepton flavour violation (LFV) would therefore be an unambiguous signature of physics beyond the Standard Model (BSM), but no such process has been observed to date [3].
A number of BSM scenarios predict LFV at branching fractions approaching current experimental sensitivities [4], with LFV in τ⁻ decays often enhanced with respect to µ⁻ decays due to the large difference in mass between the two leptons (the inclusion of charge-conjugate processes is implied throughout). If charged LFV were to be discovered, measurements of the branching fractions for a number of channels would be required to determine the nature of the BSM physics. In the absence of such a discovery, improving the experimental constraints on the branching fractions for LFV decays would help to constrain the parameter spaces of BSM models.
The search for LFV in τ⁻ decays at LHCb takes advantage of the large inclusive τ⁻ production cross-section at the LHC, where τ⁻ leptons are produced almost entirely from the decays of b and c hadrons. Using the bb̄ and cc̄ cross-sections measured by LHCb [9,10] and the inclusive b → τ and c → τ branching fractions [3], the inclusive τ⁻ cross-section is estimated to be 85 µb at 7 TeV.
Selection criteria are implemented for the signal mode, τ⁻ → µ⁻µ⁺µ⁻, and for the calibration and normalisation channel, which is D_s⁻ → φπ⁻ with φ → µ⁺µ⁻, referred to in the following as D_s⁻ → φ(µ⁺µ⁻)π⁻. To avoid potential bias, µ⁻µ⁺µ⁻ candidates with mass within ±30 MeV/c² (approximately three times the expected mass resolution) of the known τ⁻ mass are initially excluded from the analysis. Discrimination between a potential signal and the background is performed using a three-dimensional binned distribution in two multivariate classifiers and the mass of the τ⁻ candidate. One classifier is based on the three-body decay topology and the other on muon identification.

Detector and triggers
The LHCb detector [5] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of momentum, p, with a relative uncertainty that varies from 0.4% at low momentum to 0.6% at 100 GeV/c. The minimum distance of a track to a primary vertex, the impact parameter (IP), is measured with a resolution of (15 + 29/p_T) µm, where p_T is the component of p transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors (RICH) [11]. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [12].
The trigger [13] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. Candidate events are first required to pass the hardware trigger, which selects muons with a transverse momentum p_T > 1.48 GeV/c in the 7 TeV data or p_T > 1.76 GeV/c in the 8 TeV data. In the software trigger, at least one of the final-state particles is required to have both p_T > 0.8 GeV/c and IP > 100 µm with respect to all of the primary pp interaction vertices (PVs) in the event. Finally, the tracks of two or more of the final-state particles are required to form a vertex that is significantly displaced from the PVs.

Monte Carlo simulation
In the simulation, pp collisions are generated using Pythia [14] with a specific LHCb configuration [15]. Decays of hadronic particles are described by EvtGen [16], in which final-state radiation is generated using Photos [17]. For the τ⁻ → µ⁻µ⁺µ⁻ signal channel, the final-state particles are distributed according to three-body phase-space. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [18] as described in Ref. [19].
As the τ⁻ leptons produced in the LHCb acceptance originate almost exclusively from heavy quark decays, they can be classified in one of five categories according to the parent particle. The parent particle can be the following: a b hadron; a D_s⁻ or D⁻ meson that is produced directly in a proton-proton collision or via the decay of an excited charm meson; or a D_s⁻ or D⁻ meson resulting from the decay of a b hadron. Events from each category are generated separately and are combined in accordance with the measured cross-sections and branching fractions. Variations of the cross-sections and branching fractions within their uncertainties are considered as sources of systematic uncertainty.

Event selection
Candidate τ⁻ → µ⁻µ⁺µ⁻ decays are selected by requiring three tracks that combine to give a mass close to that of the τ⁻ lepton, and that form a vertex that is displaced from the PV. The tracks are required to be well-reconstructed muon candidates with p_T > 300 MeV/c that have a significant separation from the PV. There must be a good fit to the three-track vertex, and the decay time of the candidate forming the vertex has to satisfy ct > 100 µm. As the τ⁻ leptons are produced predominantly in the decays of charm mesons, where the Q-values are relatively small (and so the charm meson and the τ⁻ are almost collinear in the laboratory frame), a requirement on the pointing angle, θ, between the momentum vector of the three-track system and the vector joining the primary and secondary vertices is used to remove poorly reconstructed candidates (cos θ > 0.99). Contamination from pairs of tracks originating from the same particle is reduced by removing same-sign muon pairs with mass lower than 250 MeV/c².
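The two topological requirements quoted above (cos θ > 0.99 and ct > 100 µm) can be illustrated with a minimal sketch. The function names, the choice of units (positions in µm, mass and momentum in MeV with c = 1) and the relation ct = m L / |p| for a flight distance L are assumptions made for this illustration; the actual analysis software is not described in the text.

```python
import math


def _norm(vector):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(component * component for component in vector))


def cos_pointing_angle(momentum, primary_vertex, secondary_vertex):
    """Cosine of the angle between the candidate momentum and the PV -> SV vector."""
    flight = [s - p for s, p in zip(secondary_vertex, primary_vertex)]
    dot = sum(a * b for a, b in zip(momentum, flight))
    return dot / (_norm(momentum) * _norm(flight))


def proper_decay_length(mass, momentum, primary_vertex, secondary_vertex):
    """ct = m * L / |p|, in the units of the vertex positions (here micrometres)."""
    flight = [s - p for s, p in zip(secondary_vertex, primary_vertex)]
    return mass * _norm(flight) / _norm(momentum)


def passes_topology(mass, momentum, primary_vertex, secondary_vertex,
                    min_cos_theta=0.99, min_ct_um=100.0):
    """Apply the cos(theta) and ct requirements quoted in the text."""
    cos_theta = cos_pointing_angle(momentum, primary_vertex, secondary_vertex)
    ct = proper_decay_length(mass, momentum, primary_vertex, secondary_vertex)
    return cos_theta > min_cos_theta and ct > min_ct_um
```

A candidate whose momentum is perpendicular to its flight direction has cos θ = 0 and is rejected, whereas a well-reconstructed candidate from a nearly collinear charm-meson decay has cos θ close to one.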
The decay D_s⁻ → η(µ⁺µ⁻γ)µ⁻ν̄_µ is a source of irreducible background near the signal region, and therefore candidates with a µ⁺µ⁻ invariant mass below 450 MeV/c² are removed. Signal candidates containing muons that result from the decay of the φ(1020) meson are removed by excluding µ⁺µ⁻ masses within ±20 MeV/c² of the known φ(1020) meson mass.
The signal region is defined by a ±20 MeV/c² window (approximately two times the expected mass resolution) around the known τ⁻ mass. Candidates with µ⁻µ⁺µ⁻ invariant mass between 1600 and 1950 MeV/c² are kept to allow evaluation of the background contributions in the signal region. In the following, the wide mass windows on either side of the signal region are referred to as the data sidebands. The signal region for the normalisation channel, D_s⁻ → φ(µ⁺µ⁻)π⁻, which has a similar topology to that of the τ⁻ → µ⁻µ⁺µ⁻ decay, is defined by a ±20 MeV/c² window around the D_s⁻ mass, with the µ⁺µ⁻ mass required to be within ±20 MeV/c² of the φ(1020) meson mass. Where appropriate, the rest of the selection criteria are identical to those for the signal channel, with one of the muon candidates replaced by a pion candidate.
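As an illustration of how the mass windows fit together, the sketch below assigns a three-muon mass to a region. The τ mass value of 1776.8 MeV/c² is an approximate world-average figure that is not quoted in the text, and the region labels, including "buffer" for the range between the ±20 MeV/c² signal window and the ±30 MeV/c² window excluded from the background fits, are hypothetical names chosen for this example.

```python
TAU_MASS = 1776.8  # MeV/c^2, approximate value (assumption, not quoted in the text)


def mass_region(mass, tau_mass=TAU_MASS):
    """Classify a three-muon invariant mass (MeV/c^2) into an analysis region."""
    if not 1600.0 <= mass <= 1950.0:
        return "rejected"      # outside the mass range kept for the analysis
    distance = abs(mass - tau_mass)
    if distance <= 20.0:
        return "signal"        # +-20 MeV/c^2 signal window
    if distance <= 30.0:
        return "buffer"        # inside the +-30 MeV/c^2 exclusion, outside the signal window
    return "sideband"          # used to estimate the background
```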

Signal and background discrimination
Three classifiers are used to discriminate between signal and background: an invariant mass classifier that uses the reconstructed mass of the τ⁻ candidate; a geometric classifier, M_3body; and a particle identification classifier, M_PID.
The multivariate classifier M_3body is based on the geometry and kinematic properties of the final-state tracks and the reconstructed τ⁻ candidate. It aims to reject backgrounds from combinations of tracks that do not share a common vertex and those from multi-body decays with more than three final-state particles. The variables used in the classifier include the vertex fit quality, the displacement of the vertex from the PV, the pointing angle θ, and the IP and fit χ² of the tracks. An ensemble-selected (blended) [20], custom boosted decision tree (BDT) classifier is used [21,22], as described in the following. In the blending method the input variables are combined [23] into one BDT, two Fisher discriminants [24], four neural networks [25], one function-discriminant analysis [26] and one linear discriminant [27]. Each classifier is trained using simulated signal and background samples, where the composition of the background is a mixture of bb̄ → µµX and cc̄ → µµX processes according to their relative abundances as measured in data. As each category of simulated signal events has different kinematic properties, a separate set of classifiers is trained for each. One third of the available signal sample is used at this stage, along with one half of the background sample. The classifier responses, along with the original input variables, are then used as input to the custom BDT classifier, which is trained on the remaining half of the background sample and a third of the signal sample, with the five categories combined, to give the final classifier response. As the responses of the individual classifiers are not fully correlated, blending the output of the classifiers improves the sensitivity of the analysis in our data sample by 6% with respect to that achievable by using the best single classifier. The M_3body classifier response is calibrated using the D_s⁻ → φ(µ⁺µ⁻)π⁻ control channel to correct for differences in response between data and simulation.
The multivariate classifier M_PID uses information from the RICH detectors, the calorimeters and the muon detectors to obtain the likelihood that each of the three final-state particles is compatible with the muon hypothesis. The value of the M_PID response is taken as the smallest likelihood of the three muon candidates. The M_PID classifier uses a neural network that is trained on simulated events to discriminate muons from other charged particles. The M_PID classifier response is calibrated using muons from J/ψ → µ⁺µ⁻ decays in data.
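The rule that the candidate-level M_PID response is the smallest of the three per-track muon likelihoods can be written as a one-line sketch. The function name and the example values are hypothetical; the per-track likelihoods would come from the neural network described above.

```python
def m_pid_response(muon_likelihoods):
    """Candidate-level response: the least muon-like of the three tracks."""
    if len(muon_likelihoods) != 3:
        raise ValueError("expected one likelihood per final-state track")
    return min(muon_likelihoods)
```

Taking the minimum means that a single poorly identified track is enough to give the whole candidate a low response, which is the behaviour wanted when rejecting decays with one or two misidentified hadrons.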
For the M_3body and M_PID responses, a binning is chosen such that the separation between the background-only and signal-plus-background hypotheses is maximised, whilst minimising the number of bins. The binning optimisation is performed separately for the 7 TeV and 8 TeV data sets, because there are small differences in event topology with changes of centre-of-mass energy. The optimisation does not depend on the signal branching fraction. The bins at lowest values of M_3body and M_PID response do not contribute to the sensitivity and are excluded from the analysis. The distributions of the responses of the two classifiers, along with their binning schemes, are shown in Fig. 1 for the 8 TeV data set.
The expected shapes of the invariant mass spectra for the τ⁻ → µ⁻µ⁺µ⁻ signal in the 7 TeV and 8 TeV data sets are taken from fits to the D_s⁻ → φ(µ⁺µ⁻)π⁻ control channel in data. Figure 2 shows the fit to the 8 TeV data. No particle identification requirements are applied to the pion. The signal distribution is modelled with the sum of two Gaussian functions with a common mean, where the narrower Gaussian contributes 70% of the total signal yield, while the combinatorial background is modelled with an exponential function. The expected width of the τ⁻ signal in data is taken from simulation, scaled by the ratio of the widths of the D_s⁻ peaks in data and simulation.
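The signal model described above, two Gaussians with a common mean and a 70% narrow component, together with the width scaling from simulation to data, can be sketched as follows. The function names are hypothetical and no widths are quoted in the text, so any numerical widths passed to these functions are placeholders.

```python
import math


def gaussian(x, mean, sigma):
    """Normalised Gaussian density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def double_gaussian(x, mean, sigma_narrow, sigma_wide, narrow_fraction=0.7):
    """Sum of two Gaussians sharing a mean; the narrow one carries 70% of the yield."""
    return (narrow_fraction * gaussian(x, mean, sigma_narrow)
            + (1.0 - narrow_fraction) * gaussian(x, mean, sigma_wide))


def scaled_tau_width(sigma_tau_simulation, sigma_ds_data, sigma_ds_simulation):
    """Simulated tau width scaled by the data/simulation ratio of the D_s widths."""
    return sigma_tau_simulation * sigma_ds_data / sigma_ds_simulation
```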

Backgrounds
The background processes for the τ⁻ → µ⁻µ⁺µ⁻ decay consist mainly of heavy meson decays yielding three muons in the final state, or one or two muons in combination with two or one misidentified particles. There are also a large number of events with one or two muons from heavy meson decays combined with two or one muons from elsewhere in the event. Decays containing undetected final-state particles, such as K_L⁰ mesons, neutrinos or photons, can give large backgrounds, which vary smoothly in the signal region. The most important background channel of this type is found to be D_s⁻ → η(µ⁺µ⁻γ)µ⁻ν̄_µ, about 90% of which is removed by the requirement on the dimuon mass. The small remaining contribution from this process has a mass distribution similar to that of the other backgrounds in the mass range considered in the fit. The dominant contributions to the background from misidentified particles are from D_(s)⁻ → K⁺π⁻π⁻ and D_(s)⁻ → π⁺π⁻π⁻ decays. However, these events populate mainly the region of low M_PID response and are reduced to a negligible level by the exclusion of the first bin.
The expected numbers of background events within the signal region, for each bin in M_3body and M_PID, are evaluated by fitting an exponential function to the candidate mass spectra outside of the signal windows using an extended, unbinned maximum likelihood fit. The parameters of the exponential function are allowed to vary independently in each bin. The small differences obtained if the exponential curves are replaced by straight lines are included as systematic uncertainties. The µ⁻µ⁺µ⁻ mass spectra are fitted over the mass range 1600-1950 MeV/c², excluding windows of width ±30 MeV/c² around the expected signal mass. The resulting fits to the data sidebands for the highest sensitivity bins are shown in Fig. 3 for the 7 TeV and 8 TeV data separately.
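The sideband extrapolation can be sketched for a single classifier bin as follows. This is a simplified illustration, not the analysis code: the τ mass value is an approximate figure not quoted in the text, the function names are hypothetical, and the slope is found with a simple ternary search. In an extended fit the fitted normalisation over the fitted range equals the number of sideband candidates, so the expected background is that count multiplied by the ratio of the exponential integrated over the signal window to its integral over the sidebands.

```python
import math

LO, HI = 1600.0, 1950.0   # fit range in MeV/c^2 (from the text)
TAU_MASS = 1776.8         # approximate tau mass in MeV/c^2 (assumption)
EXCLUDED = 30.0           # half-width excluded from the fit (from the text)
SIGNAL = 20.0             # half-width of the signal window (from the text)


def exp_integral(k, a, b):
    """Integral of exp(-k * (m - LO)) over a <= m <= b."""
    if abs(k) < 1e-12:
        return b - a
    return -math.exp(-k * (a - LO)) * math.expm1(-k * (b - a)) / k


def sideband_integral(k):
    """Integral of the exponential over both sidebands."""
    return (exp_integral(k, LO, TAU_MASS - EXCLUDED)
            + exp_integral(k, TAU_MASS + EXCLUDED, HI))


def fit_slope(masses, k_min=-0.05, k_max=0.05, iterations=100):
    """Maximum-likelihood slope; the negative log-likelihood is convex in k."""
    n = len(masses)
    shifted_sum = sum(m - LO for m in masses)

    def nll(k):
        return k * shifted_sum + n * math.log(sideband_integral(k))

    for _ in range(iterations):
        k1 = k_min + (k_max - k_min) / 3.0
        k2 = k_max - (k_max - k_min) / 3.0
        if nll(k1) < nll(k2):
            k_max = k2
        else:
            k_min = k1
    return 0.5 * (k_min + k_max)


def expected_background(masses):
    """Expected yield in the signal window, extrapolated from sideband masses."""
    k = fit_slope(masses)
    signal_integral = exp_integral(k, TAU_MASS - SIGNAL, TAU_MASS + SIGNAL)
    return len(masses) * signal_integral / sideband_integral(k)


# Toy input: 2900 evenly spaced sideband candidates, i.e. a flat spectrum.
lower = [LO + (i + 0.5) * 0.1 for i in range(1468)]
upper = [TAU_MASS + EXCLUDED + (i + 0.5) * 0.1 for i in range(1432)]
estimate = expected_background(lower + upper)
# For this flat toy spectrum the estimate is close to 2900 * 40 / 290 = 400.
```

Replacing the exponential by a straight line, as done for the systematic uncertainty, would change only the two integral functions.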

Normalisation
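To convert the observed number of τ⁻ → µ⁻µ⁺µ⁻ candidates into a branching fraction, it is normalised using the D_s⁻ → φ(µ⁺µ⁻)π⁻ calibration channel according to

$$\mathcal{B}(\tau^- \to \mu^-\mu^+\mu^-) = \frac{\mathcal{B}(D_s^- \to \phi(\mu^+\mu^-)\pi^-)}{\mathcal{B}(D_s^- \to \tau^-\bar{\nu}_\tau)} \times f^{D_s}_{\tau} \times \frac{\varepsilon^{R}_{\rm cal}\,\varepsilon^{T}_{\rm cal}}{\varepsilon^{R}_{\rm sig}\,\varepsilon^{T}_{\rm sig}} \times \frac{N_{\rm sig}}{N_{\rm cal}} \equiv \alpha N_{\rm sig}, \tag{1}$$

where α is the overall normalisation factor, N_sig is the number of observed signal events and all other terms are described below. Table 1 gives a summary of all contributions to the factor α; the uncertainties are taken to be uncorrelated. The branching fraction of the normalisation channel is determined from known branching fractions as

$$\mathcal{B}(D_s^- \to \phi(\mu^+\mu^-)\pi^-) = \frac{\mathcal{B}(D_s^- \to \phi(K^+K^-)\pi^-)}{\mathcal{B}(\phi \to K^+K^-)}\,\mathcal{B}(\phi \to \mu^+\mu^-),$$

where B(φ → K⁺K⁻) and B(φ → µ⁺µ⁻) are taken from Ref. [3] and B(D_s⁻ → φ(K⁺K⁻)π⁻) is taken from Ref. [28]. The branching fraction B(D_s⁻ → τ⁻ν̄_τ) is taken from Refs. [3,29]. The quantity f^{D_s}_τ is the fraction of τ⁻ leptons that originate from D_s⁻ decays. For the 8 TeV data, the bb̄ cross-section is taken from the LHCb measurement [30] and the cc̄ cross-section measured at 7 TeV is scaled by a factor of 8/7, consistent with Pythia simulations. The uncertainty on this scaling factor, which is negligible, is found by varying the parton distribution functions in Pythia.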
The reconstruction and selection efficiencies, ε^R, are products of the detector acceptances for the decay of interest, the muon identification efficiencies and the selection efficiencies. The combined muon identification and selection efficiencies are determined from the yield of simulated events after the full selections are applied. The ratio of efficiencies is corrected to account for the differences between data and simulation in track reconstruction, muon identification, the φ(1020) mass window requirement in the normalisation channel and the τ⁻ mass range. The removal of candidates in the least sensitive bins in the M_3body and M_PID classifier responses is also taken into account.
The trigger efficiencies, ε^T, are evaluated from simulation and their systematic uncertainties are determined from the differences between the trigger efficiencies of B⁻ → J/ψ(µ⁺µ⁻)K⁻ decays measured in data and in simulation, using muons with momentum values typical of τ⁻ → µ⁻µ⁺µ⁻ signal decays. The trigger efficiency for the 8 TeV data set is corrected to account for differences in trigger conditions across the data taking period, resulting in a relatively large systematic uncertainty.
The yields of D_s⁻ → φ(µ⁺µ⁻)π⁻ candidates in data, N_cal, are determined from the fits to reconstructed φ(µ⁺µ⁻)π⁻ mass distributions shown in Fig. 2. The variations in the yields when the relative contributions of the two Gaussian components are allowed to vary in the fits are considered as systematic uncertainties.
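The way the terms of this section combine can be sketched numerically. The sketch assumes that the normalisation factor has the form α = [B(D_s⁻ → φ(µ⁺µ⁻)π⁻) / B(D_s⁻ → τ⁻ν̄_τ)] × f × (ε^R_cal ε^T_cal) / (ε^R_sig ε^T_sig) / N_cal, with f the fraction of τ⁻ leptons coming from D_s⁻ decays, and that uncorrelated relative uncertainties add in quadrature. The function and argument names are hypothetical, and no values from Table 1 are used.

```python
import math


def normalisation_factor(bf_calibration, bf_ds_to_tau_nu, tau_fraction_from_ds,
                         reco_efficiency_ratio, trigger_efficiency_ratio, n_cal):
    """alpha such that B(tau -> 3 mu) = alpha * N_sig.

    Both efficiency ratios are calibration channel over signal channel.
    """
    return (bf_calibration / bf_ds_to_tau_nu * tau_fraction_from_ds
            * reco_efficiency_ratio * trigger_efficiency_ratio / n_cal)


def combined_relative_uncertainty(relative_uncertainties):
    """Quadrature sum, valid when the contributions are uncorrelated."""
    return math.sqrt(sum(r * r for r in relative_uncertainties))
```

Because α is a pure product and quotient of its inputs, the relative uncertainty on α is the quadrature sum of the relative uncertainties of the individual terms.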

Results
Tables 2 and 3 give the expected and observed numbers of candidates in the signal region, for each bin of the classifier responses. No significant excess of events over the expected background is observed. Using the CLs method [31] and Eq. 1, the observed CLs value and the expected CLs distribution are calculated as functions of the assumed branching fraction, as shown in Fig. 4. The systematic uncertainties on the signal and background estimates, which have a very small effect on the final limits, are included following Ref. [31]. The expected limit at 90% (95%) CL for the branching fraction is B(τ⁻ → µ⁻µ⁺µ⁻) < 5.0 (6.1) × 10⁻⁸, while the observed limit at 90% CL is B(τ⁻ → µ⁻µ⁺µ⁻) < 4.6 × 10⁻⁸.
Whilst the above limits are given for the phase-space model of τ⁻ decays, the kinematic properties of the decay would depend on the physical processes that introduce LFV. Reference [32] gives a model-independent analysis of the decay distributions in an effective field-theory approach including BSM operators with different chirality structures. Depending on the choice of operator, the observed limit varies within the range (4.1-6.8) × 10⁻⁸ at 90% CL. The weakest limit results from an operator that favours low µ⁺µ⁻ mass, since the requirement to remove the D_s⁻ → η(µ⁺µ⁻γ)µ⁻ν̄_µ background excludes a large fraction of the relevant phase-space.
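As a rough illustration of the CLs construction, the sketch below treats a single counting bin with a known background and no systematic uncertainties, using the observed count itself as the test statistic. This is a deliberate simplification and not the procedure of the analysis, which combines all bins of classifier response and mass. The function names are hypothetical.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for n in range(1, n_obs + 1):
        term *= mean / n
        total += term
    return total


def cls_value(n_obs, signal, background):
    """CLs = CL(s+b) / CL(b) for a single counting experiment."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)


def signal_upper_limit(n_obs, background, confidence_level=0.90, s_max=100.0):
    """Signal yield at which CLs falls to 1 - CL, found by bisection."""
    low, high = 0.0, s_max
    for _ in range(100):
        mid = 0.5 * (low + high)
        if cls_value(n_obs, mid, background) > 1.0 - confidence_level:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def branching_fraction_limit(n_obs, background, alpha, confidence_level=0.90):
    """Convert the yield limit into a branching fraction limit via B = alpha * N_sig."""
    return alpha * signal_upper_limit(n_obs, background, confidence_level)
```

With zero observed events the single-bin CLs reduces to exp(-s) whatever the background, so the 90% CL limit on the signal yield is ln 10 ≈ 2.30 events. Dividing by CL(b) is what prevents a downward background fluctuation from producing an artificially strong limit.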
In summary, the LHCb search for the LFV decay τ⁻ → µ⁻µ⁺µ⁻ is updated using all data collected during the first run of the LHC, corresponding to an integrated luminosity of 3.0 fb⁻¹. No evidence for any signal is found. The measured limits supersede those of Ref. [6] and, in combination with results from the B factories, improve the constraints placed on the parameters of a broad class of BSM models.

Figure 1: Distribution of (a) M_3body and (b) M_PID response for 8 TeV data. The binnings correspond to those used in the extraction of the final results. The short-dashed (red) lines show the response of the data sidebands, whilst the long-dashed (blue) and solid (black) lines show the response of simulated signal events before and after calibration. In both cases the first bin is excluded from the analysis.


Figure 2: Invariant mass distribution of φ(µ⁺µ⁻)π⁻ candidates in 8 TeV data. The solid (blue) line shows the overall fit, the long-dashed (green) and short-dashed (red) lines show the two Gaussian components of the D_s⁻ signal and the dot-dashed (black) line shows the combinatorial background contribution.

Figure 3: Invariant mass distributions and fits to the mass sidebands in (a) 7 TeV and (b) 8 TeV data for µ⁺µ⁻µ⁻ candidates in the bins of M_3body and M_PID response that contain the highest signal probabilities.

Figure 4: Distribution of CLs values as a function of the assumed branching fraction for τ⁻ → µ⁻µ⁺µ⁻, under the hypothesis to observe background events only. The dashed line indicates the expected limit and the solid line the observed one. The light (yellow) and dark (green) bands cover the regions of 68% and 95% confidence for the expected limit.

Table 1: Terms entering into the normalisation factors, α, and their combined statistical and systematic uncertainties.

Table 2: Expected background candidate yields in the 7 TeV data set, with their uncertainties, and observed candidate yields within the τ⁻ signal window in the different bins of classifier response. The classifier responses range from 0 (most background-like) to +1 (most signal-like). The first bin in each classifier response is excluded from the analysis.

Table 3: Expected background candidate yields in the 8 TeV data set, with their uncertainties, and observed candidate yields within the τ⁻ signal window in the different bins of classifier response. The classifier responses range from 0 (most background-like) to +1 (most signal-like). The first bin in each classifier response is excluded from the analysis.