Search for lepton-flavour-violating decays of the Higgs and Z bosons with the ATLAS detector

Direct searches for lepton flavour violation in decays of the Higgs and Z bosons with the ATLAS detector at the LHC are presented. The following three decays are considered: H → eτ, H → μτ, and Z → μτ. The searches are based on the data sample of proton–proton collisions collected by the ATLAS detector corresponding to an integrated luminosity of 20.3 fb⁻¹ at a centre-of-mass energy of √s = 8 TeV. No significant excess is observed, and upper limits on the lepton-flavour-violating branching ratios are set at the 95% confidence level: Br(H → eτ) < 1.04%, Br(H → μτ) < 1.43%, and Br(Z → μτ) < 1.69 × 10⁻⁵.


Introduction
One of the main goals of the Large Hadron Collider (LHC) physics programme at CERN is to discover physics beyond the Standard Model (SM). A possible sign would be the observation of lepton flavour violation (LFV) that could be realised in decays of the Higgs boson or of the Z boson to pairs of leptons with different flavours.
The most stringent bounds on the LFV decays of the Higgs and Z bosons other than H → μe are derived from direct searches [20]. The CMS Collaboration has performed the first direct search for LFV H → μτ decays [21] and reported a small excess (2.4 standard deviations) of data over the predicted background. Their results give a 1.51% upper limit on Br(H → μτ ) at the 95% confidence level (CL). The ATLAS Collaboration has also performed a search [22] for the LFV H → μτ decays in the final state with one muon and one hadronically decaying τ -lepton, τ had , and reported a 1.85% upper limit on Br(H → μτ ) at the 95% CL. The most stringent indirect constraint on H → eμ decays is derived from the results of searches for μ → eγ decays [23], and a bound of Br(H → eμ) < O(10⁻⁸) is obtained [24,25]. The bound on μ → eγ decays suggests that the presence of an H → μτ signal would exclude the presence of an H → eτ signal, and vice versa, at an experimentally observable level at the LHC [25]. It is also important to note that a relatively large Br(H → μτ ) can be achieved without any particular tuning of the effective couplings, while a large Br(H → eτ ) is possible only at the cost of some fine-tuning of the corresponding couplings [25]. Upper bounds on the LFV Z → eμ, Z → μτ and Z → eτ decays were set by the LEP experiments [26,27]: Br(Z → eμ) < 1.7 × 10⁻⁶, Br(Z → eτ ) < 9.8 × 10⁻⁶, and Br(Z → μτ ) < 1.2 × 10⁻⁵ at the 95% CL. The ATLAS experiment set the most stringent upper bound on the LFV Z → eμ decays [28]: Br(Z → eμ) < 7.5 × 10⁻⁷ at the 95% CL.
This paper describes three new searches for LFV decays of the Higgs and Z bosons. The first study is a search for H → eτ decays in the final state with one electron and one hadronically decaying τ -lepton, τ had . The second analysis is a simultaneous search for the LFV H → eτ and H → μτ decays in the final state with a leptonically decaying τ -lepton, τ lep .
A combination of results of the earlier ATLAS search for the LFV H → μτ had decays [22] and the two searches described in this paper is also presented. The third study constitutes the first ATLAS search for LFV decays of the Z boson with hadronic τ -lepton decays in the channel Z → μτ had . The search for LFV decays in the τ lep analysis is based on the novel method introduced in Ref. [29]; the searches in the τ had analyses are based on the techniques developed for the SM H → τ lep τ had search. All three searches are based on the data sample of pp collisions collected at a centre-of-mass energy of √s = 8 TeV and corresponding to an integrated luminosity of 20.3 fb⁻¹. Given the overlap between the analysis techniques used in the H → eτ had search and in the Z → μτ had search, from here on they are referred to as the τ had channels; the H → ℓτ lep search is referred to as the τ lep channel, where ℓ = e, μ.
The signatures of LFV searches reported here are characterised by the presence of an energetic lepton originating directly from the boson decay and carrying roughly half of its energy, and the hadronic or leptonic decay products of a τ -lepton. The data in the τ had channels were collected with single-lepton triggers: a single-muon trigger with the threshold of p T = 24 GeV and a single-electron trigger with the threshold E T = 24 GeV. The data in the τ lep channel were collected using asymmetric electron-muon triggers with ( p μ T , E e T ) > (18,8) GeV and (E e T , p μ T ) > (14,8) GeV thresholds. The p T and E T requirements on the objects in the presented analyses are at least 2 GeV higher than the trigger requirements.
A brief description of the object definitions is provided below. The primary vertex is chosen as the proton-proton collision vertex candidate with the highest sum of the squared transverse momenta of all associated tracks [31].
Muon candidates are reconstructed using an algorithm that combines information from the ID and the MS [32]. Muon quality criteria such as inner-detector hit requirements are applied to achieve a precise measurement of the muon momentum and to reduce the misidentification rate. Muons are required to have p T > 10 GeV and to be within |η| < 2.5. The distance between the z-position of the point of closest approach of the muon inner-detector track to the beamline and the z-coordinate of the primary vertex is required to be less than 1 cm. In the τ lep channel, there is an additional cut on the transverse impact parameter significance, defined as the transverse impact parameter divided by its uncertainty: |d 0 |/σ d 0 < 3. These requirements reduce the contamination due to cosmic-ray muons and beam-induced backgrounds. Typical reconstruction and identification efficiencies for muons meeting these selection criteria are above 95% [32].
1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upward. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the beam pipe. The pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2). The transverse momentum and the transverse energy are defined as p T = p × sin(θ) and E T = E × sin(θ), respectively. The distance ΔR in η-φ space is defined as ΔR = √((Δη)² + (Δφ)²).
Electron candidates are reconstructed from energy clusters in the electromagnetic calorimeters matched to tracks in the ID. They are required to have transverse energy E T > 15(12) GeV in the τ had (τ lep ) channel, to be within the pseudorapidity range |η| < 2.47, and to satisfy the medium shower shape and track selection criteria defined in Ref. [33]. Candidates found in the transition region between the barrel and end-cap calorimeters (1.37 < |η| < 1.52) are not considered in the τ had channel. Typical reconstruction and identification efficiencies for electrons satisfying these selection criteria range between 80 and 90%, depending on E T and η.
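The coordinate conventions quoted above translate directly into a few helper functions. The following Python sketch is illustrative only: the function names are hypothetical, and wrapping the azimuthal difference into (−π, π] is an assumption that the text does not spell out.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta/2) for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def transverse_momentum(p, theta):
    """p_T = p * sin(theta)."""
    return p * math.sin(theta)


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi] (assumed convention)."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space: sqrt((delta eta)^2 + (delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

A particle emitted perpendicular to the beam (θ = π/2) has η = 0 and p T equal to its full momentum.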
Exactly one lepton (electron or muon) satisfying the above identification requirements is allowed in the τ had channels. In the τ lep channel, only events with exactly one identified muon and one identified electron are retained. All lepton (electron or muon) candidates must be matched to the corresponding trigger objects and satisfy additional isolation criteria, based on tracking and calorimeter information, in order to suppress the background from misidentified jets or from semileptonic decays of charm and bottom hadrons. The calorimeter isolation variable I (E T , R) is defined as the sum of the total transverse energy in the calorimeter in a cone of size R around the electron cluster or the muon track, divided by the E T of the electron cluster or the p T of the muon, respectively. The track-based isolation I ( p T , R) is defined as the scalar sum of the transverse momenta of tracks within a cone of size R around the electron or muon track, divided by the E T of the electron cluster or the muon p T , respectively. The contribution due to the lepton itself is not included in either sum. The isolation requirements used in the τ had and τ lep channels, optimised to reduce the contamination from non-prompt leptons, are listed in Table 1.
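The two isolation variables share the same structure: a sum over nearby objects inside a cone, excluding the lepton itself, divided by the lepton E T or p T . A minimal Python sketch follows. The dictionary layout, the function names and the exclusion of the lepton by object identity are assumptions made for illustration; the actual cone sizes and thresholds are those of Table 1 and are not reproduced here.

```python
import math


def delta_r(a, b):
    """Distance in eta-phi space between two objects stored as dicts."""
    dphi = (a["phi"] - b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def relative_isolation(lepton, objects, cone_size):
    """Sum the transverse quantity of objects inside the cone, divided by the lepton's.

    `objects` holds tracks (track isolation) or calorimeter deposits
    (calorimeter isolation); the key "pt" stands for p_T or E_T as appropriate.
    """
    total = 0.0
    for obj in objects:
        if obj is lepton:
            continue  # the lepton's own contribution is not included
        if delta_r(lepton, obj) < cone_size:
            total += obj["pt"]
    return total / lepton["pt"]
```

A candidate would then be accepted if the returned value is below the threshold chosen for the channel.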
Hadronically decaying τ -leptons are identified by means of a multivariate analysis technique [34] based on boosted decision trees, which exploits information about ID tracks and clusters in the electromagnetic and hadronic calorimeters. The τ had candidates are required to have +1 or −1 net charge in units of electron charge, and must be 1- or 3-track (1- or 3-prong) candidates. Events with exactly one τ had candidate satisfying the medium identification criteria [34] with p T > 20 GeV and |η| < 2.47 are considered in the τ had channels. In the τ lep channel, events with identified τ had candidates are rejected to avoid overlap between H → ℓτ had and H → ℓτ lep . The identification efficiency for τ had candidates satisfying these requirements is (55-60)%. Dedicated criteria [34] to separate τ had candidates from misidentified electrons are also applied, with a selection efficiency for true τ had decays (that pass the τ had identification requirements described above) of 95%. To reduce the contamination due to backgrounds where a muon mimics a τ had signature, events in which an identified muon with p T (μ) > 4 GeV overlaps with an identified τ had are rejected [35]. The probability to misidentify a jet with p T > 20 GeV as a τ had candidate is
Jets are reconstructed using the anti-k t jet clustering algorithm [36] with a radius parameter R = 0.4, taking the deposited energy in clusters of calorimeter cells as inputs. Fully calibrated jets [37] are required to be reconstructed in the range |η| < 4.5 and to have p T > 30 GeV. To suppress jets from multiple proton-proton collisions in the same or nearby beam bunch crossings, tracking information is used for central jets with |η| < 2.4 and p T < 50 GeV. In the τ lep channel, these central jets are required to have at least one track originating from the primary vertex.
In the τ had channel, tracks originating from the primary vertex must contribute more than half of the jet p T when summing the scalar p T of all tracks in the jet; jets with no associated tracks are retained.
In the pseudorapidity range |η| < 2.5, jets containing b-hadrons (b-jets) are selected using a tagging algorithm [38]. These jets are required to have p T > 30 GeV in the τ had channel, and p T > 20 GeV in the τ lep channel. Two different working points with ∼70 and ∼80% b-tagging efficiencies for b-jets in simulated tt̄ events are used in the τ had and τ lep channels, respectively. The corresponding light-flavour jet misidentification probability is (0.1-1)%, depending on the p T and η of the jet. Only a very small fraction of signal events have b-jets; therefore, events with identified b-jets are vetoed in the selection of signal events.
Some objects might be reconstructed as more than one candidate. Overlapping candidates, separated by ΔR < 0.2, are resolved by discarding one object and selecting the other one in the following order of priority (from highest to lowest): muons, electrons, τ had , and jet candidates [35].
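The priority-based overlap removal can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ATLAS implementation: the candidate dictionaries and function names are hypothetical, and only candidates of different types are compared with each other.

```python
import math

# Priority from highest to lowest, as listed in the text.
PRIORITY = ("muon", "electron", "tau_had", "jet")


def delta_r(a, b):
    """Distance in eta-phi space between two candidates stored as dicts."""
    dphi = (a["phi"] - b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def remove_overlaps(candidates, max_dr=0.2):
    """Keep the higher-priority candidate of any pair closer than max_dr."""
    ordered = sorted(candidates, key=lambda c: PRIORITY.index(c["type"]))
    kept = []
    for cand in ordered:
        # Assumption: same-type candidates are never removed against each other.
        if all(k["type"] == cand["type"] or delta_r(k, cand) >= max_dr for k in kept):
            kept.append(cand)
    return kept
```

Processing candidates in priority order means a lower-priority object is only ever compared with objects that have already survived.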
The missing transverse momentum (with magnitude E miss T ) is reconstructed using the energy deposits in calorimeter cells calibrated according to the reconstructed physics objects (e, γ , τ had , jets and μ) with which they are associated [39]. In the τ had channels, the energy from calorimeter cells not associated with any physics object is included in the E miss T calculation. It is scaled by the scalar sum of p T of tracks which originate from the primary vertex but are not associated with any objects divided by the scalar sum of p T of all tracks in the event which are not associated with objects. The scaling procedure achieves a more accurate reconstruction of E miss T under high pile-up conditions.
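The scale factor applied to the unassociated calorimeter energy is a ratio of two scalar p T sums over tracks that are not matched to any physics object. A minimal Python sketch is shown below; the track representation and the fallback value of 1.0 for events without unassociated tracks are assumptions, not details given in the text.

```python
def soft_term_scale(tracks):
    """Fraction of unassociated track p_T that comes from the primary vertex.

    Each track is a dict with keys "pt", "from_pv" (bool) and
    "associated" (bool, True if matched to a physics object).
    """
    unassociated = [t for t in tracks if not t["associated"]]
    denominator = sum(t["pt"] for t in unassociated)
    if denominator == 0.0:
        return 1.0  # assumed fallback: leave the soft term unscaled
    numerator = sum(t["pt"] for t in unassociated if t["from_pv"])
    return numerator / denominator
```

The result lies between 0 and 1 and decreases as pile-up tracks carry a larger share of the unassociated momentum.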

Signal and background samples
The LFV signal is estimated from simulation. The major Higgs boson production processes (gluon fusion gg H , vector-boson fusion VBF, and associated production W H/Z H) are considered in the reported searches for LFV H → eτ and H → μτ decays. In the τ lep channel, all backgrounds are estimated from data. In the τ had channels, the Z /γ * → τ τ and multi-jet backgrounds are estimated from data, while the other remaining backgrounds are estimated from simulation, as described below.
The largely irreducible Z /γ * → τ τ background is modelled by Z /γ * → μμ data events, where the muon tracks and associated energy deposits in the calorimeters are replaced by the corresponding simulated signatures of the final-state particles of the τ -lepton decay. In this approach, essential features such as the modelling of the kinematics of the produced boson, the modelling of the hadronic activity of the event (jets and underlying event) as well as contributions from pile-up are taken from data. Therefore, the dependence on the simulation is minimised and only the τ -lepton decays and the detector response to the τ -lepton decay products are based on simulation. This hybrid sample is referred to as embedded data in the following. A detailed description of the embedding procedure can be found in Ref. [40].
The W +jets, Z /γ * → μμ and Z /γ * → ee backgrounds are modelled by the ALPGEN [41] event generator interfaced with PYTHIA8 [42] to provide the parton showering, hadronisation and the modelling of the underlying event. The backgrounds with top quarks are modelled by the POWHEG [43-45] (for tt̄, W t and s-channel single-top production) and AcerMC [46] (t-channel single-top production) event generators interfaced with PYTHIA8. The ALPGEN event generator interfaced with HERWIG [47] is used to model the W W process, and HERWIG is used for the Z Z and W Z processes.
The events with Higgs bosons produced via gg H or VBF processes are generated at next-to-leading-order (NLO) accuracy in QCD with the POWHEG [48] event generator interfaced with PYTHIA8 to provide the parton showering, hadronisation and the modelling of the underlying event. The associated production (Z H and W H) samples are simulated using PYTHIA8. All events with Higgs bosons are produced with a mass of m H = 125 GeV assuming the narrow width approximation and normalised to cross sections calculated at next-to-next-to-leading order (NNLO) in QCD [49-51]. The SM H → τ τ decays are simulated by PYTHIA8; the other SM decays of the Higgs boson are negligible. The LFV Higgs boson decays are modelled by the EvtGen [52] event generator according to the phase-space model. In the H → μτ and H → eτ decays, the τ -lepton decays are treated as unpolarised because the left- and right-handed τ -lepton polarisation states are produced at equal rates. Finally, the LFV Z boson decays are simulated with PYTHIA8 assuming an isotropic decay. The width of the Z boson is set to its measured value [20].
For all simulated samples, the decays of τ -leptons are modelled with TAUOLA [53] and the propagation of particles through the ATLAS detector is simulated with GEANT4 [54,55]. The effect of multiple proton-proton collisions in the same or nearby beam bunch crossings is accounted for by overlaying additional minimum-bias events. Simulated events are weighted so that the distribution of the average number of interactions per bunch crossing matches that observed in data.
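The pile-up reweighting described above can be illustrated with a per-bin ratio of normalised distributions. The Python sketch below assumes that the average number of interactions per bunch crossing has been binned to integers; the function name and input format are hypothetical.

```python
from collections import Counter


def pileup_weights(mu_data, mu_sim):
    """Per-bin weights that make the simulated mu distribution match data.

    mu_data and mu_sim are lists of binned values of the average number of
    interactions per bunch crossing, one entry per event.
    """
    data_counts = Counter(mu_data)
    sim_counts = Counter(mu_sim)
    n_data = len(mu_data)
    n_sim = len(mu_sim)
    return {
        mu: (data_counts.get(mu, 0) / n_data) / (count / n_sim)
        for mu, count in sim_counts.items()
    }
```

Each simulated event is then weighted by the entry for its own value, so the weighted simulated distribution has the same shape as the one observed in data.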
Background contributions due to non-prompt leptons in the τ lep channel and multi-jet events in the τ had channel are estimated using data-driven techniques described in Sects. 4.2 and 5.2.

Search for H → eτ decays in the τ had channel
The search for the LFV H → eτ decays in the τ had channel follows exactly the same analysis strategy and utilises the same background estimation techniques as those used in the ATLAS search for the LFV H → μτ decays in the τ had channel [22]. The only major difference is that a high-E T electron is required in the final state instead of a muon. A detailed description of the H → eτ had analysis is provided in the following sections.

Event selection and categorisation
Signal H → eτ events in the eτ had final state are characterised by the presence of exactly one energetic electron and one τ had of opposite-sign (OS) charge as well as moderate E miss T , which tends to be aligned with the τ had direction. Same-sign (SS) charge events are used to control the rates of background contributions. Events with identified muons are rejected. Backgrounds for this signature can be broadly classified into two major categories:
• Events with true electron and τ had signatures. These are dominated by the irreducible Z /γ * → τ τ production with some contributions from the V V → eτ + X (where V = W, Z ), tt̄, single-top and SM H → τ τ production processes. These events exhibit a very strong charge anticorrelation between the electron and the τ had . Therefore, the expected number of OS events (N OS ) is much larger than the number of SS events (N SS ).
• Events with a misidentified τ had signature. These are dominated by W +jets events with some contribution from multi-jet (many of which have genuine electrons from semileptonic decays of heavy-flavour hadrons), diboson (V V ), tt̄ and single-top events with N OS > N SS . Additional contributions to this category arise from Z (→ ee)+jets events, where a τ had signature can be mimicked by either a jet (no charge correlation) or an electron (strong charge anti-correlation).
Events with a misidentified τ had tend to have a much softer p T (τ had ) spectrum and a larger angular separation between the τ had and E miss T directions. These properties are exploited to suppress backgrounds and define signal and control regions. Events with exactly one electron and exactly one τ had with E T (e) > 26 GeV, p T (τ had ) > 45 GeV and |η(e) − η(τ had )| < 2 form a baseline sample, as it represents a common selection for both the signal and control regions. The |η(e) − η(τ had )| cut has ∼99% efficiency for signal and rejects a considerable fraction of multi-jet and W +jets events. As in Ref. [22], two signal regions are defined using the transverse mass, m T , of the e-E miss
Table 2 provides a summary of the event selection criteria used to define the signal and control regions.
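For reference, the transverse mass of a lepton and the missing transverse momentum is conventionally defined as shown below. The definition used by the authors is not part of the text above, so this standard form is an assumption; Δφ denotes the azimuthal angle between the lepton and the E miss T direction.

```latex
m_{\mathrm{T}} = \sqrt{2\, p_{\mathrm{T}}^{\ell}\, E_{\mathrm{T}}^{\mathrm{miss}} \left(1 - \cos\Delta\phi\right)}
```

With this form, m T is small when the lepton and E miss T are aligned and largest when they are back to back.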
The LFV signal is searched for by performing a fit to the mass distribution in data, m MMC eτ , reconstructed from the observed electron, τ had and E miss T objects by means of the Missing Mass Calculator [56] (MMC). Conceptually, the MMC is a more sophisticated version of the collinear approximation [57]. The main improvement comes from requiring that the relative orientations of the neutrino and other τ -lepton decay products are consistent with the mass and kinematics of a τ -lepton decay. This is achieved by maximising a probability defined in the kinematically allowed phase-space region. The MMC used in the H → τ τ analysis [35] is modified to take into account that there is only one neutrino from a hadronic τ -lepton decay in LFV H → eτ events. For a Higgs boson with m H = 125 GeV, the reconstructed m MMC eτ distribution has a roughly Gaussian shape with a full width at half maximum of ∼19 GeV. The analysis is performed "blinded" in the 110 GeV < m MMC eτ < 150 GeV regions of SR1 and SR2, which contain 93.5 and 95% of the expected signal events in SR1 and SR2, respectively. The event selection and the analysis strategy are defined without looking at the data in these blinded regions.
Table 2: Summary of the event selection criteria used to define the signal and control regions (see text)

Background estimation
The background estimation method takes into account the background properties and composition discussed in Sect. 4.1. It also relies on the observation that the shape of the m MMC eτ distribution for the multi-jet background is the same for OS and SS events. This observation was made using a dedicated control region, MJCR, with an enhanced contribution from the multi-jet background. Events in this control region are required to meet all criteria for SR1 and SR2 with the exception of the requirement on |η(e) − η(τ had )|, which is reversed: |η(e) − η(τ had )| > 2. Therefore, the total number of OS background events, N bkg OS , in each bin of the m MMC eτ (or any other) distribution in SR1 and SR2 can be obtained according to the following formula:
$$N_{\mathrm{OS}}^{\mathrm{bkg}} = r_{\mathrm{QCD}} \cdot N_{\mathrm{SS}}^{\mathrm{data}} + \sum_{\text{bkg-}i} N_{\mathrm{OS-SS}}^{\text{bkg-}i}, \qquad (1)$$
where the individual terms are described below. N data SS is the number of SS data events, which contains significant contributions from W +jets events, multi-jet and other backgrounds. The fractions of multi-jet background in SS data events inside the 110 GeV < m MMC eτ < 150 GeV mass window are ∼27 and ∼64% in SR1 and SR2, respectively. The contributions N bkg-i OS−SS = N bkg-i OS − r QCD · N bkg-i SS are add-on terms for the different background components (Z → τ τ , Z → ee, W +jets, V V , H → τ τ and events with t-quarks), which also account for components of these backgrounds already included in SS data events. 3 The factor r QCD = N multi-jet OS /N multi-jet SS accounts for potential differences in flavour composition (and, as a consequence, in jet → τ had misidentification rates) of final-state jets introduced by the same-sign or opposite-sign charge requirements. The value of r QCD = 1.0 ± 0.13 is obtained from a multi-jet enriched control region in data using a method discussed in Ref. [58]. This sample is obtained by selecting events with E miss T < 15 GeV and m T (e, E miss T ) < 30 GeV, removing the isolation criteria of the electron candidate and using the loose identification criteria for the τ had candidate [34]. The systematic uncertainty on r QCD is estimated by varying the selection cuts described above.
The obtained value of r QCD is also verified in the MJCR region, which has a smaller number of events but where the electron and τ had candidates pass the same identification requirements as events in SR1 and SR2.
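The bin-by-bin structure of this estimate, in which the SS data are scaled by r QCD and each simulated or embedded component contributes only its OS−SS excess, can be sketched in Python as follows. The function name, the dictionary layout and all yields in the usage note are invented for illustration; only the structure follows the description above.

```python
def os_background(n_ss_data, addon_os, addon_ss, r_qcd=1.0):
    """Per-bin OS background: r_QCD * N_SS(data) + sum_i [N_OS(i) - r_QCD * N_SS(i)].

    n_ss_data: list of SS data yields per bin.
    addon_os / addon_ss: dicts mapping a background name to its per-bin
    OS and SS predictions (same binning as n_ss_data).
    """
    total = [r_qcd * n for n in n_ss_data]
    for name, os_yields in addon_os.items():
        ss_yields = addon_ss[name]
        for b in range(len(total)):
            # Subtracting r_QCD * N_SS avoids double counting the part of
            # this background that is already present in the SS data.
            total[b] += os_yields[b] - r_qcd * ss_yields[b]
    return total
```

For example, with SS data yields of [100, 50] and two hypothetical components with OS yields [60, 20] and [30, 40] and SS yields [40, 10] and [1, 2], the estimate for r QCD = 1 is [149, 98].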
The data and simulation samples used for the modelling of background processes are described in Sect. 3. A discussion of each background source is provided below.
3 The r QCD · N bkg-i SS correction in the add-on term is needed because same-sign data events include multi-jet as well as electroweak events (Z → τ τ , Z → ee, W +jets, V V , H → τ τ and events with t-quarks) and their contributions cannot be separated.
The largely irreducible Z /γ * → τ τ background is modelled by the embedded data sample described in Sect. 3. The Z /γ * → τ τ normalisation is a free parameter in the final fit to data and it is mainly constrained by events with 60 GeV < m MMC eτ < 90 GeV in SR2.
Events due to the W +jets background are mostly selected when the τ had signature is mimicked by jets. This background is estimated from simulation, and the WCR region is used to check the modelling of the W +jets kinematics and to obtain separate normalisations for OS and SS W +jets events. The difference between these two normalisations is found to be statistically significant. An additional overall normalisation factor for the N W +jets OS−SS term in Eq. (1) is introduced as a free parameter in the final fit in SR1. By studying WCR events and SR1 events with m MMC eτ > 150 GeV (dominated by W +jets background), it is also found that an m MMC eτ shape correction, which depends on the number of jets, p T (τ had ) and |η(e) − η(τ had )|, needs to be applied in SR1. This correction is derived from SR1 events with m MMC eτ > 150 GeV and it is applied to events with any value of m MMC eτ . The corresponding modelling uncertainty is set to be 50% of the difference of the m MMC eτ shapes obtained after applying the SR1-based and WCR-based shape corrections. The size of this uncertainty depends on m MMC eτ and it is as large as ±10% for W +jets events with m MMC eτ < 150 GeV. In the case of SR2, good modelling of the N jet , p T (τ had ) and |η(e) − η(τ had )| distributions suggests that such a correction is not needed.
However, a modelling uncertainty in the m MMC eτ shape of the W +jets background in SR2 is set to be 50% of the difference between the m MMC eτ shape obtained without any correction and the one obtained after applying the correction derived for SR1 events. The size of this uncertainty is below 10% in the 110 GeV< m MMC eτ <150 GeV region, which contains most of the signal events. It was also checked that applying the same correction in SR2 as in SR1 would affect the final result by less than 4% (see Sect. 6). The modelling of jet fragmentation and the underlying event has a significant effect on the estimate of the jet → τ had misidentification rate in different regions of the phase space and has to be accounted for with a corresponding systematic uncertainty. To estimate this effect, the analysis was repeated using a sample of W +jets events modelled by ALPGEN interfaced with the HERWIG event generator. Differences in the W +jets predictions in SR1 and SR2 are found to be ±12 and ±15%, respectively, and are taken as corresponding systematic uncertainties.
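The "50% of the difference" prescription used for the shape uncertainties can be written as a per-bin relative uncertainty. The Python sketch below is a simplified illustration: the function name is hypothetical, and expressing the result relative to the nominal yield is an assumption about how the uncertainty is quoted.

```python
def half_difference_uncertainty(nominal, alternative):
    """Per-bin relative uncertainty: half the difference of two shape predictions.

    nominal and alternative are lists of per-bin yields obtained with two
    different shape corrections (or with and without a correction).
    """
    if len(nominal) != len(alternative):
        raise ValueError("both shapes must use the same binning")
    return [
        0.5 * abs(alt - nom) / nom if nom > 0 else 0.0
        for nom, alt in zip(nominal, alternative)
    ]
```

A bin where the two predictions differ by 20% would thus be assigned a ±10% uncertainty.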
In the case of the Z → ee background, there are two components: events in which an electron mimics a τ had (e → τ misid had ) and events in which a jet mimics a τ had (jet→ τ misid had ). In the first case, the shape of the Z → ee background is obtained from simulation. Corrections from data, derived from dedicated tag-and-probe studies [59], are also applied to account for the variation in the e → τ misid had misidentification rate as a function of η. The normalisation of this background component is a free parameter in the final fit to data and it is mainly constrained by events with For the Z → ee background where a jet is misidentified as a τ had candidate and one of the electrons does not pass the electron identification criteria described in Sect. 2, the normalisation factor and shape corrections, which depend on the number of jets, p T (τ had ) and |η(e) − η(τ had )|, are derived using events with two identified OS electrons with an invariant mass, m ee , in the range of 80-100 GeV. Since this background does not have an OS-SS charge asymmetry, a single correction factor is derived for OS and SS events. Half the difference between the m MMC eτ shape with and without this correction is taken as the corresponding systematic uncertainty.
The TCR is used to check the modelling and to obtain normalisations for OS and SS events with top quarks. The normalisation factors obtained in the TCR are extrapolated into SR1 and SR2, where tt̄ and single-top events may have different properties. To estimate the uncertainty associated with such an extrapolation, the analysis is repeated using the MC@NLO [60] event generator instead of POWHEG for tt̄ production. This uncertainty is found to be ±8% (±14%) for backgrounds with top quarks in SR1 (SR2).
The background due to diboson (W W , Z Z and W Z) production is estimated from simulation, normalised to the cross sections calculated at NLO in QCD [61]. Finally, the SM H → τ τ events also represent a small background in this search. This background is estimated from simulation and normalised to the cross sections calculated at NNLO in QCD [49-51]. All other SM Higgs boson decays constitute negligible backgrounds for the LFV signature. Figure 2 shows the m MMC eτ distributions for data and the predicted backgrounds in each of the signal regions. The backgrounds are estimated using the method described above and their normalisations are obtained in a global fit described in Sect. 4.4. The signal acceptance times efficiencies for passing the SR1 or SR2 selection requirements are 1.8 and 1.4%, respectively, and the combined efficiency is 3.2%. The numbers of observed events in the data as well as the signal and background predictions in the mass region 110 GeV< m MMC eτ <150 GeV can be found in Table 3.

Systematic uncertainties
The numbers of signal and background events and the shapes of corresponding m MMC eτ distributions are affected by systematic uncertainties. They are discussed below and changes in event yields are provided for major sources of uncertainties. For all uncertainties, the effects on both the total signal and background predictions and on the shape of the m MMC eτ distribution are evaluated. Unless otherwise mentioned, all sources of experimental uncertainties are treated as fully correlated across signal and control regions in the final fit which is discussed in Sect. 4.4.
The largest systematic uncertainties arise from the normalisation (±12% uncertainty) and modelling of the W +jets background. The uncertainties on the W +jets normalisation and shape corrections are treated as uncorrelated between SR1 and SR2. The uncertainties in r QCD (±13%) and in the normalisation (±13%) and modelling of Z → τ τ also play an important role. The normalisation uncertainty (±7%) for the Z → ee (with e → τ misid had ) background has a limited impact on the sensitivity because of a good separation of the signal and Z → ee peaks in the m MMC eτ distribution. The other major sources of experimental uncertainty, affecting both the shape and normalisation of signal and backgrounds, are the uncertainty in the τ had energy scale [34], which is measured with ±(2-4)% precision (depending on p T and decay mode of the τ had candidate), and uncertainties in the embedding method used to model the Z → τ τ background [35]. Less significant sources of experimental uncertainty, affecting the shape and normalisation of signal and backgrounds, are the uncertainty in the jet energy scale [37,62] and resolution [63]. The uncertainties in the τ had energy resolution, the energy scale and resolution of electrons, and the scale uncertainty in E miss T due to the energy in calorimeter cells not associated with physics objects are taken into account; however, they are found to be only ±(1-2)%. The following experimental uncertainties primarily affect the normalisation of signal and backgrounds: the ±2.8% uncertainty in the integrated luminosity [64], the uncertainty in the τ had identification efficiency [34], which is measured to be ±(2-3)% for 1-prong and ±(3-5)% for 3-prong decays (where the range reflects the dependence on p T of the τ had candidate), the ±2.1% uncertainty for triggering, reconstructing and identifying electrons [33], and the ±2% uncertainty in the b-jet tagging efficiency [38].
Theoretical uncertainties are estimated for the Higgs boson production and for the V V background, which are modelled with the simulation and are not normalised to data in dedicated control regions. Uncertainties due to missing higher-order QCD corrections in the production cross sections are found to be [65] ±10.1% (±7.8%) for the Higgs boson production via gg H in SR1 (SR2), ±1% for the Z → ee background and for VBF and V H Higgs boson production, and ±5% for the V V background. The systematic uncertainties due to the choice of parton distribution functions used in the simulation are evaluated based on the prescription described in Ref.

Table 6 provides a summary of all results, including the results of the ATLAS search for the LFV H → μτ decays [22].

Search for H → eτ/μτ decays in the τ lep channel
In the τ lep channel the background estimate is based on the data-driven method developed in Ref. [29]. This method is sensitive only to the difference between Br(H → μτ ) and Br(H → eτ ), and it is based on the premise that the kinematic properties of the SM background are to a good approximation symmetric under the exchange e ↔ μ.

Event selection and signal region definition
Events selected in the τ lep channel must contain exactly two opposite-sign leptons, one an electron and the other a muon. The lepton with the higher p T is indicated by ℓ1 and the other by ℓ2. Additional kinematic criteria, based on the p T difference between the two leptons and on the angular separations between the leptons and the missing transverse momentum, are applied to suppress the SM background events, which are mainly due to the production of Z /γ * → τ τ and of diboson (V V ) events. Table 4 also lists a veto on τ had candidates in both signal regions. Two mutually exclusive signal regions are defined: one with no central (|η| < 2.4) light-flavour jets, SR noJets , and the other with one or more central light-flavour jets, SR withJets . The kinematic criteria defining each signal region, summarised in Table 4, are optimised following two guidelines. The first one is to maximise the signal-to-background ratio. The second one is to have, in each signal region, enough events to perform the data-driven background estimation described in Sect. 5.2. The final discriminant used in the τ lep channel is the collinear mass m coll , defined as:

$$m_{\mathrm{coll}} = \sqrt{2\,p_T^{\ell_1}\left(p_T^{\ell_2} + E_T^{\mathrm{miss}}\right)\left(\cosh\Delta\eta - \cos\Delta\phi\right)}$$

where Δη and Δφ are the differences in pseudorapidity and azimuthal angle between ℓ1 and ℓ2. This quantity is the invariant mass of two massless particles, τ and ℓ1, computed with the approximation that the decay products of the τ lepton, ℓ2 and neutrinos, are collinear to the τ , and that the E miss T originates from the ν. In the H → μτ (H → eτ ) decay, ℓ1 is the muon (electron) and ℓ2 is the electron (muon).

The background estimation rests on two premises:
1. SM processes result in data that are symmetric under the exchange of prompt electrons with prompt muons to a good approximation. In other words, the kinematic distributions of prompt electrons and prompt muons are approximately the same;
2. flavour-violating decays of the Higgs boson break this symmetry.
Dilepton events in the dataset are divided into two mutually exclusive samples:
• μe sample: ℓ1 is the muon and ℓ2 is the electron ( p T μ ≥ p T e )
• eμ sample: ℓ1 is the electron and ℓ2 is the muon ( p T e > p T μ )
With these assumptions, the SM background is split equally between the two samples. The H → μτ signal, however, is present only in the μe sample because the p T spectrum of electrons from H → μτ decays is softer than the muon p T spectrum. The number of H → μτ events in the eμ sample is negligible with the selection criteria described in Sect. 5.1.
For SM events the distributions of kinematic variables in the two samples are the same to a good approximation. In particular, the collinear mass distributions of the two samples differ only by the narrow signal peak. The peak, present only in the distribution of the μe sample, sits on top of the SM background, which, to a good approximation, can be modelled from the eμ collinear mass distribution.
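The collinear-mass definition and the μe/eμ split described above can be illustrated with a short sketch. It assumes massless leptons and a τ that inherits the direction of the sub-leading lepton with transverse momentum p T (sub-leading) + E miss T ; the function names `collinear_mass` and `sample_label` are hypothetical and are not taken from the analysis software.

```python
import math


def collinear_mass(pt1, eta1, phi1, pt2, eta2, phi2, met):
    """Collinear mass of lepton 1 and a tau reconstructed from lepton 2.

    Assumption: the tau points along lepton 2 and carries
    pT = pT(lepton 2) + missing ET; both objects are massless.
    """
    d_eta = eta1 - eta2
    d_phi = phi1 - phi2
    pt_tau = pt2 + met
    return math.sqrt(2.0 * pt1 * pt_tau * (math.cosh(d_eta) - math.cos(d_phi)))


def sample_label(pt_mu, pt_e):
    """Return 'mu_e' if the muon is the leading lepton, otherwise 'e_mu'."""
    return "mu_e" if pt_mu >= pt_e else "e_mu"
```

For a back-to-back pair at equal pseudorapidity the expression reduces to twice the geometric mean of the two transverse momenta, which is a convenient sanity check.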

Asymmetries in the SM background
Although the eμ-μe symmetry hypothesis is a good starting assumption, there are effects that can invalidate it and that need to be accounted for. The first effect is due to events containing misidentified and non-prompt leptons, together referred to as non-prompt in the following. These leptons originate from misidentified jets or from hadronic decays within jets. They contribute differently to the μe and eμ samples because the origin of the non-prompt lepton is different for electrons and for muons. The second effect originates from the different dependencies on p T and |η| that the trigger efficiency and reconstruction efficiency can have for electrons and muons. The non-prompt effect is accounted for by estimating the non-prompt background separately from the other backgrounds. The efficiency effect is accounted for by scaling the m coll distribution of the eμ sample with a scale factor parameterised as a function of the sub-leading lepton p T , p T ℓ2 . As shown in Sect. 5.5, the eμ-μe symmetry is restored when these two effects are taken into account. Smaller effects, which might depend on other parameters such as η or p T ℓ1 , are found to be negligible.

Events containing non-prompt leptons
The background contribution due to non-prompt leptons is estimated with the matrix method described in Refs. [68,69], which relies on the difference in identification efficiency between prompt and non-prompt leptons. Two lepton categories are defined: tight leptons, which must satisfy all the lepton identification criteria described in Sect. 2, and loose leptons, which are not required to satisfy the primary vertex and isolation criteria. By measuring separately for prompt and non-prompt leptons the tight-to-loose lepton efficiencies, defined as the fraction of loose leptons that are also tight, one can determine the non-prompt background contribution from the number of data events that have two leptons that are either loose or tight.
The efficiencies for prompt and non-prompt leptons, parameterised as a function of p T and η, are derived from data with the tag-and-probe method. Prompt efficiencies are derived from an opposite-sign sample enriched in Z → e ± e ∓ and Z → μ ± μ ∓ . Non-prompt efficiencies are derived from a same-sign sample (μ ± e ± or μ ± μ ± ) where the muon is the tag lepton.
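The matrix is not written out in the text, so the following is only a sketch of a common two-lepton form of the matrix method. It assumes constant per-lepton efficiencies (r for prompt, f for non-prompt; in the analysis they depend on p T and η), and it takes "loose" in the event counts to mean loose-but-not-tight. The function names are hypothetical.

```python
def inverse_single_lepton(r, f):
    """Inverse of the single-lepton matrix [[r, f], [1 - r, 1 - f]].

    The matrix maps (prompt, non-prompt) counts to (tight, loose-not-tight).
    """
    det = r - f  # equals r * (1 - f) - f * (1 - r)
    return [[(1.0 - f) / det, -f / det],
            [-(1.0 - r) / det, r / det]]


def nonprompt_in_tight_tight(n_tt, n_tl, n_lt, n_ll, r1, f1, r2, f2):
    """Estimate the non-prompt yield among events with two tight leptons."""
    a = inverse_single_lepton(r1, f1)  # lepton 1
    b = inverse_single_lepton(r2, f2)  # lepton 2
    observed = {(0, 0): n_tt, (0, 1): n_tl, (1, 0): n_lt, (1, 1): n_ll}
    # truth[(p, q)]: index 0 = prompt, 1 = non-prompt, for leptons 1 and 2
    truth = {}
    for p in (0, 1):
        for q in (0, 1):
            truth[(p, q)] = sum(a[p][i] * b[q][j] * observed[(i, j)]
                                for i in (0, 1) for j in (0, 1))
    # Fold every category with at least one non-prompt lepton back to tight-tight
    return (r1 * f2 * truth[(0, 1)]
            + f1 * r2 * truth[(1, 0)]
            + f1 * f2 * truth[(1, 1)])
```

The inversion is only stable when r and f are well separated, which is why the method relies on the difference in identification efficiency between prompt and non-prompt leptons.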

Asymmetry induced by the different trigger and reconstruction efficiency of electrons and muons
The efficiency to trigger on and reconstruct an eμ event, ε eμ , is different from that of a μe event, ε μe . These two efficiencies can be expressed as functions of the p T of the two leptons. In this search, the leading lepton is required to have p T ℓ1 > 35 GeV, which is on the plateau region of the trigger and reconstruction efficiencies, so the leading-lepton terms in the ratio of the efficiencies are approximately constant. Therefore, the ratio of the eμ and μe event reconstruction efficiencies can be parameterised as a function of the sub-leading lepton p T , f(p T ℓ2). Using the fit described in Sect. 5.4, the parameter f(p T ℓ2) is determined in three p T ℓ2 bins: 12-20, 20-30, and > 30 GeV.
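One way to write this argument down is sketched here. It assumes that the event efficiency factorises into single-lepton efficiencies ε^e and ε^μ, which the text does not state explicitly, and the "plateau" labels are introduced only for illustration.

```latex
\begin{align}
\varepsilon^{e\mu} &\simeq \varepsilon^{e}(p_T^{\ell_1})\,\varepsilon^{\mu}(p_T^{\ell_2}), &
\varepsilon^{\mu e} &\simeq \varepsilon^{\mu}(p_T^{\ell_1})\,\varepsilon^{e}(p_T^{\ell_2}), \\
\frac{\varepsilon^{e\mu}}{\varepsilon^{\mu e}} &\simeq
\frac{\varepsilon^{e}_{\text{plateau}}}{\varepsilon^{\mu}_{\text{plateau}}}\,
\frac{\varepsilon^{\mu}(p_T^{\ell_2})}{\varepsilon^{e}(p_T^{\ell_2})}
\equiv f(p_T^{\ell_2}) && (p_T^{\ell_1} > 35~\text{GeV}).
\end{align}
```

Under this assumption the leading-lepton efficiencies enter only through a constant prefactor, so a single function of the sub-leading lepton p T is enough to correct the eμ sample.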

Systematic uncertainties
Using the eμ asymmetry technique, the only systematic uncertainty associated with the background prediction is due to the non-prompt background modelling. This uncertainty has two components: the first one is the limited number of tag-and-probe events used to extract the prompt and non-prompt efficiencies; the second one is the difference in kinematics, and therefore in sources of non-prompt leptons, between the events used to extract the non-prompt efficiency and the events in the signal regions. This second component is evaluated by measuring the non-prompt efficiencies in subsets of the nominal tag-and-probe sample. The subsets are obtained by applying, one at a time, the kinematic requirements of the signal regions. The ensuing uncertainties in the estimated number of non-prompt events can be as large as 10-50% for the non-prompt efficiency and 3% for the prompt efficiency, depending on the signal region. Uncertainties related to the signal prediction are the same ones described in Sect. 4.3 with one minor difference in the uncertainty in the signal cross section due to higher-order QCD corrections. This uncertainty is split into two anticorrelated components: ±12% in SR withJets and ±20% in SR noJets .

The statistical model
Assuming that the SM background is completely symmetric when exchanging e ↔ μ , the likelihood function for the collinear mass distribution of the eμ and μe samples can be written in terms of a background common to both samples. Once the non-prompt background, the efficiency correction, and the signal contribution (s i j ) are taken into account, the likelihood is written as Eq. (4).
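The ingredients named in this section suggest a likelihood of the following general shape. This is a sketch only: the exact form of Eq. (4), including any constraint terms for the non-prompt estimate, is not reproduced here. The symbols n (observed eμ events), m (observed μe events), N^np (non-prompt estimate) and μ (signal strength) are assumed names, with i taken to index the m coll bins and j the sub-leading-lepton p T bins.

```latex
\begin{align}
\mathcal{L}_{\text{sym}} &= \prod_{i}
  \mathrm{Pois}\!\left(n_i \mid b_i\right)\,
  \mathrm{Pois}\!\left(m_i \mid b_i + \mu\, s_i\right), \\
\mathcal{L} &= \prod_{i}\prod_{j}
  \mathrm{Pois}\!\left(n_{ij} \mid f_j\, b_{ij} + N^{\text{np},\,e\mu}_{ij} + \mu\, s^{e\mu}_{ij}\right)\,
  \mathrm{Pois}\!\left(m_{ij} \mid b_{ij} + N^{\text{np},\,\mu e}_{ij} + \mu\, s^{\mu e}_{ij}\right).
\end{align}
```

The first line is written for the H → μτ hypothesis, where the signal populates the μe sample. In the second line the scale factor f multiplies the symmetric background in the eμ sample, in line with the scaling of the eμ m coll distribution described in Sect. 5.2.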

Background model validation
The symmetry-based method is validated with simulation and with data. The validation with simulated samples is performed by comparing the signal strength measured in the SR with background samples, and with signal samples corresponding to several non-zero LFV branching ratios. The validation with data is performed in a validation region (VR) defined as SR noJets , but with at least one angular requirement reversed, Δφ(ℓ1, ℓ2) or Δφ(ℓ1, E miss T ). The validation procedure consists of comparing the data, or the sum of the simulated background samples, to the total background estimated from the statistical model. The comparison is done for both the eμ and the μe samples. With the simulated samples, it is also verified that the symmetric background and the f(p T ℓ2) scale factors do not depend on the presence of an LFV signal.
Generated pseudo-experiments are used to confirm that the statistical model is unbiased. No significant discrepancy was found between the injected signal strength and its fitted value up to LFV branching ratios of 10%.
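The idea of a bias test with pseudo-experiments can be illustrated with a toy that is far simpler than the binned fit used in the analysis. It assumes a single bin, a perfectly symmetric background and no non-prompt contribution, in which case the maximum-likelihood signal estimate is the difference of the μe and eμ counts. All names and numbers are illustrative.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's method (adequate for modest means)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def mean_fitted_signal(b, s_injected, n_toys=2000, seed=1):
    """Average single-bin signal estimate over many pseudo-experiments."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_toys):
        n_emu = poisson(rng, b)               # background-only sample
        n_mue = poisson(rng, b + s_injected)  # background plus signal
        total += n_mue - n_emu                # s_hat when b_hat = n_emu
    return total / n_toys
```

An unbiased model returns, on average, the injected signal; comparing `mean_fitted_signal(b, s)` with `s` for several values of `s` mirrors the check described above.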

Results of the search for LFV H → eτ/μτ decays in the τ lep channel
Figure 4 compares the observed data to the yields expected from the symmetry-based statistical model. The comparison, combining the different p T ℓ2 bins, shows the symmetric component of the background (b i j ) as a dashed line, and the total background estimation including the contribution from events containing misidentified and non-prompt leptons as a full line. As can be seen, the background estimation is in good agreement with the data over the full mass range. Table 5 summarises the fit results, including the values of f(p T ℓ2) in each p T ℓ2 bin and the non-prompt estimate in the μe and the eμ channels. The excellent level of agreement between the fitted number of events and the observed number is due to the many unconstrained parameters in the fit.
Table 5: A summary of the fit results in the τ lep channel. The values of the fit parameters f(p T ℓ2), which account for the ratio of the eμ and μe event reconstruction efficiencies described in Sect. 5.2, are obtained from a background-only fit, and reported for each signal region and for each p T ℓ2 bin. The expected and observed yields correspond to the number of events used in the fit, representing the 0-300 GeV m coll range shown in Fig. 4. The quoted uncertainties in the expected yields represent the statistical (first) and systematic (second) uncertainties, respectively. The post-fit values of systematic uncertainties are provided for the background predictions. The signal predictions are given for Br(H → eτ ) = 1% in the eμ sample and for Br(H → μτ ) = 1% in the μe sample.

The expected and observed 95% CL upper limits on branching ratios as well as their best fit values are calculated using the statistical model described in Sect. 5.4. Table 6 presents a summary of results for the individual categories and their combination, for both the H → eτ and H → μτ hypotheses.

Combined results of the search for LFV H → eτ/μτ decays
The results of the individual searches for the LFV H → eτ and H → μτ decays in the τ had (including the result from Ref. [22]) and τ lep channels presented in Sects. 4.4 and 5.6 are statistically combined. The two channels use different background estimation techniques, leading to uncorrelated systematic uncertainties in the background predictions. The systematic uncertainties for the LFV signal are treated as 100% correlated between the two channels. Table 6 presents a summary of results for the expected and observed 95% CL upper limits and the best fit values for the branching ratios for the individual categories and their combination. There is no indication of a signal in the search for the LFV H → eτ decays. The combined observed, and the median expected, 95% CL upper limits on Br(H → eτ ) for a Higgs boson with m H = 125 GeV are 1.04% and 1.21 +0.49 −0.34 %, respectively. A small ∼1σ excess of data over the predicted background is observed in the search for the LFV H → μτ decays. It is mostly driven by a 1.3σ excess in the earlier search in the μτ had channel [22]. This corresponds to a best fit value for the branching ratio of Br(H → μτ ) = (0.53 ± 0.51)%. In the absence of any significant signal, an upper limit on the LFV branching ratio Br(H → μτ ) for a Higgs boson with m H = 125 GeV is set. The corresponding observed, and the median expected, 95% CL upper limits are 1.43% and 1.01 +0.40 −0.29 %, respectively. The upper limits on the LFV decays of the Higgs boson are summarised in Fig. 5.

Search for Z → μτ using the τ had channel
The search for Z → μτ events is based on the μτ had final state and utilises the same strategy as the H → μτ analysis documented in Ref. [22], which is also applied to the H → eτ had search described above. The final state is characterised by the presence of an energetic muon and a τ had of opposite charge and the presence of moderate E miss T , aligned with the τ had direction. The typical transverse momenta of the muon and of the τ had are somewhat softer than those expected in Higgs boson LFV decay, due to the lower mass of the Z boson. The main backgrounds are the same as those observed in the H → μτ had analysis, namely: Z → τ τ , W +jets, multi-jet, H → τ τ , diboson and top backgrounds. The m MMC μτ variable is used to extract the signal using the same fit procedure and estimation of systematic uncertainties as for the H → μτ had search. The corresponding Higgs boson LFV contribution is assumed to be negligible.
The Z → μτ analysis differs from the H → μτ had one as follows:
• The signal and control regions are defined in the same way as in the H → μτ had analysis, but the cut values are lowered to match the kinematics of Z boson decay products. The exact definition is given in Table 7.
• The LFV H → μτ had signal sample is replaced with an LFV Z → μτ signal sample.
• The shape correction for W +jets in SR1 is obtained from the m MMC μτ > 110 GeV sideband in SR1.
• Due to the larger W +jets contribution in SR1 and SR2, the shape corrections for the W +jets samples are calculated using a three-dimensional binning scheme in p T (τ had ), |η(μ) − η(τ had )| and N jet .
• The W +jets extrapolation uncertainty, which accounts for the difference between the W +jets ALPGEN PYTHIA and HERWIG samples, is also included as a shape uncertainty.

Table 8 (recovered rows): Total background 6170 ± 100 ± 100 and 8880 ± 100 ± 140; Data 6134 and 8982.
The numbers of observed events and background in each of the regions are given in Table 8. The efficiencies for simulated Z → μτ signal events to pass the SR1 and SR2 selections are 1.2% and 0.8%, respectively. Figure 6 shows the m MMC μτ distribution for data and predicted background in each of the signal regions. The discrepancy observed in the m MMC μτ range 80-100 GeV of SR1 was studied carefully. All the other SR1 distributions, including lepton momenta, transverse masses, and missing transverse momentum, are in excellent agreement with the predictions, and the background shapes are constrained in the control regions as well as in SR2. This discrepancy is hence attributed to a statistical fluctuation.
Table 9: The expected and observed 95% CL exclusion limits as well as the best fit values for the branching ratio Br(Z → μτ ) [10 −5 ] are shown for SR1, SR2 and the combined fit. To calculate these quantities for SR1 and SR2, the signal strengths are decorrelated in the signal regions and set to zero in the control regions.

No excess of data is observed and the CL s limit-setting technique is used to calculate the observed and expected limits on the branching ratio for Z → μτ decays. The observed 95% CL limit on Br(Z → μτ ) is 1.7 × 10 −5 , which is lower than the expected upper limit of Br(Z → μτ ) = 2.6 × 10 −5 , but still within the 2σ band. This corresponds to a best fit value for the branching ratio of Br(Z → μτ ) = (−1.6 +1.3 −1.4 ) × 10 −5 . The results for the different signal regions are summarised in Table 9.
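The CL s construction can be illustrated with a single-bin counting experiment. This toy is not the binned fit with systematic uncertainties used in the search; it only shows how the ratio of tail probabilities defines an upper limit. The function names, the bisection range `s_max` and the restriction to modest background means are assumptions of the sketch.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(n_obs, b, s):
    """CLs = CL(s+b) / CL(b) for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, cl=0.95, s_max=200.0):
    """Signal yield at which CLs falls to 1 - cl, found by bisection."""
    lo, hi = 0.0, s_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cls(n_obs, b, mid) > 1.0 - cl:
            lo = mid  # not yet excluded: the limit lies above mid
        else:
            hi = mid  # excluded: the limit lies below mid
    return 0.5 * (lo + hi)
```

Dividing by CL(b) keeps the limit from becoming arbitrarily strong when the data fluctuate below the background expectation, which is the situation in this search where the observed limit is lower than the expected one.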

Summary
Searches for lepton-flavour-violating decays of the Z and Higgs bosons are performed using a data sample of proton-proton collisions recorded by the ATLAS detector at the LHC corresponding to an integrated luminosity of 20.