Higgs lepton flavour violation: UV completions and connection to neutrino masses

We study Higgs lepton flavour violating (HLFV) decays, first from the effective field theory (EFT) point of view, and then by analysing the different high-energy realizations of the operators of the EFT, highlighting the most promising models. We argue why two Higgs doublet models can have a $\mathrm{BR}(h\rightarrow \tau \mu)\sim 0.01$, and why this rate is suppressed in all other realizations including vector-like leptons. We further discuss HLFV in the context of neutrino mass models: in most cases it is generated at one loop, always giving $\mathrm{BR}(h\rightarrow \tau \mu)<10^{-4}$ and typically much less, which is beyond experimental reach. However, both the Zee model and extended left-right symmetric models contain extra ${\rm SU(2)}$ doublets coupled to leptons and could in principle account for the observed excess, with interesting connections between HLFV and neutrino parameters.


Introduction
In the Standard Model (SM) neutrinos are massless and lepton flavours are exactly conserved at all orders in any process; for instance, the Higgs boson cannot decay into two charged leptons of different flavour. On the other hand, we know from neutrino oscillation experiments that neutrinos are not massless and that lepton flavours are not conserved.
However, until now, no lepton flavour violation (LFV) has been observed in processes involving just charged leptons (for instance µ → eγ or µ → 3e), and only very strong upper limits have been set on their branching ratios. This is intriguing, because if lepton flavour is not conserved there should also be LFV in the charged lepton sector at some level. In fact, many low-scale neutrino mass models predict sizable amplitudes for non-oscillatory LFV processes. With the discovery of the Higgs boson the situation has changed dramatically, opening the possibility of testing LFV in Higgs decays. In fact, CMS 8 TeV data show a 2.4σ excess in the channel h → µτ [1], which translates into a branching fraction $\mathrm{BR}(h \to \mu\tau) = (0.84^{+0.39}_{-0.37})\,\%$, while ATLAS shows no significant deviation, BR(h → µτ) = (0.53 ± 0.51) % [3], with only a small excess in one of the signal regions that is not statistically significant. This hint at the percent level serves as a motivation and thus we will focus our discussion on the tau-muon channel, although our analysis can be easily extended to any HLFV decay ($h \to \ell\ell'$ with $\ell \neq \ell'$). Indeed, if confirmed, this would remarkably be the first signal of LFV apart from that seen in neutrino oscillation experiments and, therefore, the SM, and some of its simplest extensions devised to accommodate neutrino masses (plain Dirac neutrinos or see-saw type I), would have to be extended. In fact, even if a Higgs lepton flavour violating (HLFV) signal is predicted in many models, a ∼ 1% branching fraction is too large for most of them, once one takes into account that no other LFV processes have been observed.
The goal of this paper is to study, in full generality, which type of new physics could give sizable contributions to h → µτ (which in this paper means $h \to \mu^+\tau^-$ plus $h \to \mu^-\tau^+$) in the light of current (and future) LHC measurements. This process and the constraints imposed by other LFV processes, such as τ → µγ, have been widely studied in many papers. Some authors take a purely phenomenological approach, using an effective Lagrangian to explain the HLFV excess and to estimate the expected contributions to other LFV processes. Then, they employ the limits on these processes to set upper bounds on HLFV [7][8][9][10][11][12][13]. The connection of HLFV to CP-violating decays has also been studied in refs. [14,15].
On the other hand, most of the research has focused on analyzing different models which, a priori, could give large contributions to the HLFV process. In the following we briefly review the different options that have been studied in the literature.
Relevant to our work, as they are tree-level prototype examples, are models with vector-like leptons and two Higgs doublet models (2HDM) with generic structure (type III). Vector-like leptons motivated by composite Higgs models were studied in ref. [16], where too low rates were obtained, $\mathrm{BR}(h \to \mu\tau) \lesssim 10^{-5}$ (see also ref. [17]). Before the CMS hint, h → µτ in the type-III 2HDM was studied in detail in ref. [18], imposing the upper bounds from τ → µγ, and the sensitivity of the LHC to this channel was studied in ref. [19]. This model has been extensively used in order to explain the excess, see for […] theories. Then, in the renormalizable theories we will analyze the HLFV excess and the constraints from other LFV processes (particularly τ → µγ) to assess the viability of the different models to explain the HLFV excess.
We will also consider the possibility of having large HLFV generated at one loop by exchange of heavy particles. Finally, since one of the main motivations for LFV is neutrino masses, we will discuss, in view of our previous analysis, the implications of some of the most popular models of neutrino masses for HLFV, highlighting the most promising ones.
The paper is structured as follows. In section 2 we introduce HLFV from an EFT perspective. In section 3 we list all possible tree-level renormalizable realizations of the relevant operators, with at most two new particles, and estimate their contribution to HLFV once the constraints from other LFV processes are taken into account. In section 4 we discuss some ways of generating the HLFV operators at one loop and show that in general they will give too small contributions. In section 5 we study the implications of several of the most popular neutrino mass models for HLFV. Finally we summarize and give our conclusions in section 6.

An EFT approach to HLFV
The Lagrangian for the SM leptons reads

$$\mathcal{L}_{\rm SM}^{\rm leptons} = i\,\bar L \gamma^\mu D_\mu L + i\,\bar e_R \gamma^\mu D_\mu e_R - \left(\bar L\, Y_e\, e_R\, \Phi + \mathrm{h.c.}\right), \qquad (2.1)$$

where $L$ is the lepton SU(2)$_L$ doublet, $e_R$ the fermion singlet, $\Phi$ is the Higgs SU(2)$_L$ doublet and $Y_e$ a general complex Yukawa matrix in flavour space (flavour indices have been suppressed). The covariant derivative $D_\mu$ takes into account the transformation properties of the fields under the SM gauge group:

$$D_\mu = \partial_\mu - i g\, T_a W^a_\mu - i g'\, Y B_\mu \qquad (2.2)$$

for a general field that transforms non-trivially under SU(2)$_L$ (with coupling $g$), and has hypercharge $Y$ (with coupling $g'$). $T_a$ are the generators of weak isospin SU(2)$_L$, with $T_a = \frac{1}{2}\sigma_a$ for SU(2)$_L$ doublets, where $\sigma_a$ ($a = 1, 2, 3$) are the Pauli matrices. In the SM $Y_e$ can always be taken diagonal and, if there are no right-handed neutrinos, lepton flavours are exactly conserved. In the SM with right-handed neutrinos, although $Y_e$ can still be chosen diagonal, the Yukawa couplings of neutrinos cannot and, therefore, lepton flavours are not conserved. Thus, although the Higgs couplings to charged leptons are diagonal at tree level, there is a contribution to HLFV at the one-loop level, but the prediction is far from experimental sensitivities [43], similarly to the results for other processes with charged lepton flavour violation (CLFV) such as µ → eγ or µ → eee. However, in some extensions of the SM, HLFV could be generated at tree level and be relatively large. If all the new particles responsible for HLFV are much heavier than the electroweak scale and decouple, one can describe HLFV, with full generality, by using an effective Lagrangian including gauge invariant operators built with SM fields with dimension higher than four and whose effects are suppressed by powers of the scale of new physics [61][62][63] (see also refs. [64][65][66][67][68][69], and references therein). The lowest dimension effective operators that can give rise to HLFV have dimension six and, therefore, the effective low energy Lagrangian can be written as

$$\mathcal{L}_{\rm eff} = \mathcal{L}_{\rm SM} + \frac{1}{\Lambda^2}\sum_a \left(C_a\, \mathcal{O}_a + \mathrm{h.c.}\right), \qquad (2.3)$$

where family indices in $C_a$ have been suppressed.
There are also dimension-8 operators that can be constructed by adding the singlet $(\Phi^\dagger \Phi)$ to the previous ones. In this case, however, more new fields and couplings are needed to generate them, necessarily suppressing the contribution to HLFV operators (observables) by $\sim v^2/\Lambda^2$ ($\sim v^4/\Lambda^4$) with respect to the dimension-6 ones. Therefore, for scales above $\sim 4\pi v \sim 3$ TeV, dimension-6 operators dominate over the dimension-8 ones, and in general the EFT is justified whenever the new physics scale is $\Lambda \gg v$, so in the following we will restrict our discussion to dimension-6 operators. There are many dimension-six operators relevant for HLFV processes, for instance (see figure 1 for a diagrammatic representation of $\mathcal{O}_Y$)

$$\mathcal{O}_Y = \bar L\, e_R\, \Phi\, (\Phi^\dagger \Phi), \qquad (2.4)$$

together with the operators $\mathcal{O}_{1L,1R}$ and $\mathcal{O}_{2L,2R}$, which contain covariant derivatives, eqs. (2.5) and (2.6).
There could also be operators with two derivatives and one Higgs doublet or with three derivatives, which could induce HLFV after the use of equations of motion (EOM). However, those operators are more difficult to generate at tree level, they give, in general, more tau-lepton mass suppression factors and, in some cases, they could directly induce τ → µγ. Therefore, they will not be discussed in the following. The operators in eqs. (2.4-2.6) can be related to each other by field redefinitions, integrations by parts or SU(2) Fierz identities. For instance, by a suitable redefinition of $L$ at order $1/\Lambda^2$ (for simplicity we take the couplings $C_{1L}$ Hermitian in flavour) one can immediately remove the operator $\mathcal{O}_{1L}$ from the Lagrangian in favour of $\mathcal{O}_Y$ (which arises when the redefinition is used in the Yukawa Lagrangian, eq. (2.1)) and other operators that do not give HLFV [10] (see ref. [70] for the complete list of EFT operators with the Higgs boson). Notice that this redefinition of the fields is equivalent to having used the EOM to rewrite the operator with derivatives, $\mathcal{O}_{1L}$, in terms of the one without them, $\mathcal{O}_Y$. Therefore, in principle, one can keep only the effective operator $\mathcal{O}_Y$ in eq. (2.4), as it encodes all the relevant physics for HLFV. If one does so, charged leptons obtain their masses from two sources: the standard Yukawa couplings, eq. (2.1), and the operator $\mathcal{O}_Y$. The same two terms also give the couplings to the Higgs boson, $h$, but there is a mismatch between the two coefficients: $\mathcal{O}_Y$ contains three Higgs fields, so its contribution to the $h$ coupling carries a relative factor of 3 with respect to its contribution to the charged lepton mass matrix $M$, eq. (2.9). As a result, the diagonalization of the charged lepton mass matrix does not imply the diagonalization of the Higgs boson Yukawa couplings. In the mass basis the Higgs interactions, eq. (2.10), contain the usual term proportional to the diagonal mass matrix plus a second term, proportional to $C_Y v^2/\Lambda^2$, which is not diagonal in general and gives rise to HLFV processes. Notice that the second term in $M$ in eq. (2.9), for $\Lambda \gg v$, can be considered as a perturbation to the diagonal $Y_e$ term; thus the unitary matrices that diagonalize $M$ satisfy $V_{L,R} \approx 1 + \mathcal{O}(v^2/\Lambda^2)$ and then, for $i \neq j$, $|y_{ij}| \approx |(C_Y)_{ij}|\, v^2/(\sqrt{2}\Lambda^2)$, eq. (2.11). We also define $y_\tau \equiv y_{\tau\tau} \approx m_\tau/v$.
From the interaction in eq. (2.10) one obtains

$$\mathrm{BR}(h \to \tau\mu) = \frac{m_h}{8\pi\,\Gamma_h}\,\bar y^{\,2}, \qquad \bar y \equiv \sqrt{|y_{\tau\mu}|^2 + |y_{\mu\tau}|^2}. \qquad (2.12)$$

Since $\Gamma_h$ is not completely known (experimentally there are only lower and upper limits) there is some uncertainty in this estimate which, however, only slightly changes the upper limit [21]. Alternatively one can use the fact that the ratio of branching fractions $R_{\mu/\tau} = \mathrm{BR}(h \to \tau\mu)/\mathrm{BR}(h \to \tau\tau) = \bar y^{\,2}/|y_{\tau\tau}|^2$ is independent of the total Higgs decay width. Then, from the CMS result one finds that the excess can be explained for $\bar y$ at the per mille level, eq. (2.14).

Table 1. The scales of new physics necessary to explain the CMS result as a function of the size of the dimensionless coefficient $C_Y$. We highlight in red the case where the EFT approach does not make sense.
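The link between the coupling $\bar y$ and the branching ratio can be checked numerically. The sketch below assumes the standard tree-level two-body width $\Gamma(h \to \tau\mu) = m_h \bar y^{\,2}/(8\pi)$ with lepton masses neglected, and takes $m_h = 125$ GeV and the SM prediction $\Gamma_h \simeq 4.07$ MeV as inputs; these numerical inputs and all function names are illustrative assumptions, not values quoted in the text.

```python
import math

M_H = 125.0           # Higgs boson mass in GeV (assumed input)
GAMMA_H_SM = 4.07e-3  # SM prediction for the total Higgs width in GeV (assumed input)


def width_h_tau_mu(ybar):
    """Tree-level width in GeV for h -> tau mu (both charge states), lepton masses neglected."""
    return M_H * ybar**2 / (8.0 * math.pi)


def br_h_tau_mu(ybar):
    """Branching ratio when the LFV width is added to the SM total width."""
    gamma_lfv = width_h_tau_mu(ybar)
    return gamma_lfv / (GAMMA_H_SM + gamma_lfv)


def ybar_from_br(br):
    """Invert br_h_tau_mu: the coupling ybar that yields a given branching ratio."""
    gamma_lfv = br * GAMMA_H_SM / (1.0 - br)
    return math.sqrt(8.0 * math.pi * gamma_lfv / M_H)


if __name__ == "__main__":
    for br in (0.005, 0.01):
        print(f"BR = {br:.3f} -> ybar = {ybar_from_br(br):.2e}")
```

Under these assumptions a branching ratio between half a percent and one percent corresponds to $\bar y$ of roughly two to three per mille, in line with the per mille level mentioned above.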
To obtain such values of $\bar y$ with $C_Y \sim 1$ one will need, from eq. (2.13), $\Lambda \lesssim 5$ TeV, which is not so far from the electroweak scale. Moreover, in many models $C_Y \ll 1$; for instance, if the tau-lepton Yukawa coupling is the only source of chiral symmetry breaking one expects $C_Y \sim m_\tau/v$, or if the operator is generated at one loop one expects a $1/(4\pi)^2$ suppression. In table 1 we give the natural scale of new physics for the different cases. It is clear that if HLFV is generated at one loop and $m_\tau$ suppressed it will be very difficult to explain the CMS result without conflicting with other experiments, and the best chance is for HLFV to be generated at tree level and not suppressed by the tau-lepton mass.
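The natural scales for the different suppression patterns can be estimated with a few lines of code. This is a rough sketch that uses the relation $\bar y \sim C_Y v^2/(\sqrt{2}\Lambda^2)$ (quoted later in the discussion of topology C), the benchmark $\bar y = 0.003$ and $v = 246$ GeV; the case labels and the function name are hypothetical, and order-one factors are ignored.

```python
import math

V = 246.0      # electroweak VEV in GeV
M_TAU = 1.777  # tau-lepton mass in GeV (assumed input)
LOOP = 1.0 / (16.0 * math.pi**2)  # one-loop suppression factor 1/(4 pi)^2


def scale_for_ybar(c_y, ybar):
    """Scale Lambda in GeV from ybar ~ C_Y v^2 / (sqrt(2) Lambda^2)."""
    return V * math.sqrt(c_y / (math.sqrt(2.0) * ybar))


# Hypothetical labels for the four suppression patterns discussed in the text.
CASES = {
    "tree level, unsuppressed": 1.0,
    "tree level, m_tau/v": M_TAU / V,
    "one loop, unsuppressed": LOOP,
    "one loop, m_tau/v": LOOP * M_TAU / V,
}

if __name__ == "__main__":
    ybar = 0.003  # benchmark in the upper part of the range preferred by CMS
    for label, c_y in CASES.items():
        lam = scale_for_ybar(c_y, ybar)
        note = "  (below v: EFT not meaningful)" if lam < V else ""
        print(f"{label:26s} C_Y = {c_y:.2e}  Lambda ~ {lam:8.1f} GeV{note}")
```

With these inputs only the unsuppressed tree-level case gives a scale of a few TeV; a single suppression brings the scale down to a few hundred GeV, and the doubly suppressed case falls below $v$, where the EFT description is no longer meaningful.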
Before considering the phenomenological constraints on the HLFV couplings $y_{ij}$ it is interesting to estimate their size using naturality arguments. Let us consider the charged lepton mass matrix before diagonalization, $M$, in eq. (2.9). For simplicity we will take only two families, µ and τ. Then, the fact that $M$ is diagonalized by a biunitary transformation implies that $\det(M) = M_{\mu\mu} M_{\tau\tau} - M_{\tau\mu} M_{\mu\tau} = m_\mu m_\tau$. If there are no cancellations between the two terms of the determinant, each of them is at most of order $m_\mu m_\tau$, that is, $|M_{\tau\mu} M_{\mu\tau}| \lesssim m_\mu m_\tau$. It is also clear that the argument does not work if we allow for cancellations, that is, if $M_{\mu\mu} M_{\tau\tau} \approx M_{\tau\mu} M_{\mu\tau}$. By applying this argument to the explicit form of $M$ in eq. (2.9) and using that $Y_e$ can be taken diagonal we can set an upper bound on the off-diagonal elements of $C_Y$. Substituting in eq. (2.11) we immediately find that the product $|y_{\tau\mu}\, y_{\mu\tau}|$ is bounded by a quantity of order $m_\mu m_\tau/v^2$. This type of constraint was first proposed in the context of 2HDM and is known under the name of the Cheng-Sher ansatz [72] (see also ref. [73]).

Figure 2. Contribution to τ → µγ from the effective operator $\mathcal{O}_Y$. The symbol ⊗ represents effective operator couplings, the symbol • SM couplings, and × in external leptonic legs represents a helicity flip produced by $m_\tau$.

This implies that, for a symmetric HLFV coupling $y_{\tau\mu} = y_{\mu\tau}$, $\bar y$ is at most of order $\sqrt{m_\mu m_\tau}/v \sim 10^{-3}$, eq. (2.17), which is compatible with the CMS preferred range, eq. (2.14). However, the couplings do not need to be symmetric and there could be cancellations among the mass matrix elements. Therefore, we will not impose these constraints.
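As a quick numerical illustration of the naturality estimate, the sketch below evaluates the Cheng-Sher-like size $\sqrt{m_\mu m_\tau}/v$. The lepton masses are assumed inputs, the function name is hypothetical, and order-one factors that depend on the exact normalization of the couplings are deliberately left out.

```python
import math

V = 246.0        # electroweak VEV in GeV
M_MU = 0.10566   # muon mass in GeV (assumed input)
M_TAU = 1.777    # tau-lepton mass in GeV (assumed input)


def cheng_sher_size(m_i, m_j, v=V):
    """Natural size sqrt(m_i m_j)/v of an off-diagonal coupling, up to O(1) factors."""
    return math.sqrt(m_i * m_j) / v


if __name__ == "__main__":
    y_offdiag = cheng_sher_size(M_MU, M_TAU)
    # For a symmetric coupling, ybar = sqrt(|y_tau_mu|^2 + |y_mu_tau|^2) = sqrt(2) * y.
    ybar_symmetric = math.sqrt(2.0) * y_offdiag
    print(f"sqrt(m_mu m_tau)/v = {y_offdiag:.2e}")
    print(f"ybar (symmetric)   = {ybar_symmetric:.2e}")
```

The result is of order a few per mille, which is why the naturality argument does not exclude the values of $\bar y$ suggested by the CMS hint.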
We have seen that the scale of new physics responsible for HLFV is relatively low (see table 1). This suggests that there could be other LFV processes mediated by the new particles at observable rates. In fact, the same effective interaction that generates HLFV, eq. (2.10), gives τ → 3µ at tree level, although it is suppressed by the muon mass. There are also one- and two-loop contributions to τ → 3µ, similar to the ones appearing in τ → µγ but with a virtual photon "decaying" to muons, which are dominant because they are not suppressed by the muon mass. An estimate of these diagrams was performed in ref. [10] and compared with the experimental results. The authors conclude that the constraints from radiative decays, τ → µγ, are stronger than the ones obtained from τ → 3µ, mainly because BR(τ → 3µ) is suppressed by an additional factor α with respect to BR(τ → µγ). They get $\bar y < 0.016$ (at 90% C.L.), which still allows for HLFV, see eq. (2.14). In fact, if $\bar y$ is in the upper region required to explain CMS, $\bar y = 0.003$, one finds $\mathrm{BR}(\tau \to \mu\gamma) \sim 1.5 \times 10^{-9}$, to be compared with the present upper limit, $\mathrm{BR}(\tau \to \mu\gamma) < 4.4 \times 10^{-8}$ (90% C.L.) [71].
It is important to remark that the τ → µγ (and τ → 3µ) estimates above are calculated in the effective theory, including only the interaction in eq. (2.10) and only diagrams with SM particles running in loops, see figure 2. Although these calculations yield a finite result, in a complete theory there could be other contributions much larger than the ones obtained by computing loops in the effective theory. These contributions to τ → µγ appear in the EFT as bare effective operators, of dimension six or higher, obtained by matching to the full theory, and cannot be computed without knowing the details of the complete theory. It is therefore important to parametrize and, if possible, estimate the form and size of these contributions.
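The two numbers quoted above are mutually consistent, as a simple rescaling shows. The sketch assumes that the EFT contribution to the τ → µγ amplitude is linear in $\bar y$ with all other couplings fixed, so that the branching ratio scales as $\bar y^{\,2}$; the reference point is the one quoted in the text and the function names are hypothetical.

```python
import math

YBAR_REF = 0.003   # benchmark coupling quoted in the text
BR_REF = 1.5e-9    # BR(tau -> mu gamma) quoted for the benchmark coupling
BR_LIMIT = 4.4e-8  # experimental upper limit at 90% C.L. quoted in the text


def br_tau_mu_gamma(ybar):
    """EFT estimate rescaled from the reference point, assuming BR proportional to ybar^2."""
    return BR_REF * (ybar / YBAR_REF) ** 2


def ybar_upper_limit(br_limit=BR_LIMIT):
    """Largest ybar compatible with a given upper limit on BR(tau -> mu gamma)."""
    return YBAR_REF * math.sqrt(br_limit / BR_REF)


if __name__ == "__main__":
    print(f"ybar upper limit ~ {ybar_upper_limit():.4f}")
    print(f"BR at ybar = 0.002: {br_tau_mu_gamma(0.002):.2e}")
```

The rescaled limit reproduces the bound $\bar y < 0.016$ of ref. [10] to within the precision of the quoted numbers.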
The simplest operator that gives rise to τ → µγ is the electromagnetic dipole operator, which appears, after spontaneous symmetry breaking (SSB), from a combination of the gauge invariant operators $\bar L \sigma^{\mu\nu} e_R \Phi B_{\mu\nu}$ and $\bar L \sigma^{\mu\nu} \vec\sigma\, e_R \Phi\, \vec W_{\mu\nu}$. We have taken into account that this process is always generated at one loop by adding a factor $1/(16\pi^2)$ to its coefficient, $C^\gamma_{ij}/\Lambda^2$. Notice that, as pointed out in ref. [21], the Yukawa operator, $\mathcal{O}_Y$, and these dipole operators have the same transformation properties under flavour and chiral symmetries, so the former will typically give rise to the latter, although it could be suppressed by additional factors. With all this, the $C^\gamma_{ij}$ are expected to be, at most, of order one. Particularizing for the case τ → µγ and comparing the resulting branching ratio with the experimental limit, eq. (2.21), we obtain that $\Lambda > 30$ TeV if $C_\gamma \sim 1$. Comparing with table 1 we see that in this case there is no hope. However, $C_\gamma$ does not need to be of order one. In fact, in many models the operators generated are not $\bar L \sigma^{\mu\nu} e_R \Phi B_{\mu\nu}$ but of the type $\bar L \sigma^{\mu\nu}\, i\gamma^\alpha D_\alpha L\, B_{\mu\nu}$, which after the use of the equations of motion can be related to the former but with an additional factor $m_\tau/v$. Diagrammatically we say we need an $m_\tau$ factor in order to flip chirality. This is always the case for theories with only additional scalars like the 2HDM. Moreover, in some cases, one of the couplings in the loop has to be proportional also to the tau Yukawa coupling, which brings an additional $m_\tau/v$ factor. This is the case of the Higgs exchange diagram in the EFT, see figure 2. Thus, we will consider three classes of models depending on the expected size of the $C_\gamma$ coefficients (unsuppressed, or suppressed by one or two powers of $m_\tau/v$), and estimate (see table 2) an upper bound on BR(h → τµ) for the different possible values of $C_Y$, both at tree level and at one loop, and with/without tau-mass suppressions.
We conclude that if C γ is suppressed by a factor m τ /v, then C Y should be order one, while if C γ is suppressed by a factor m 2 τ /v 2 , C Y can be suppressed at most by m τ /v or by a one-loop factor, 1/16π 2 , but not by both at the same time. Of course, these are just general estimates that perhaps can be avoided in particular fine-tuned models, but at least they give us an idea of what one should expect.
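These conclusions can be reproduced with a back-of-the-envelope scan. The sketch below assumes that the bound $\Lambda > 30$ TeV for $C_\gamma \sim 1$ rescales as $\Lambda_{\min} \propto \sqrt{C_\gamma}$ (the dipole coefficient enters as $C_\gamma/\Lambda^2$), and combines it with $\bar y \sim C_Y v^2/(\sqrt{2}\Lambda^2)$. The target $\bar y \gtrsim 2 \times 10^{-3}$, the labels and the function names are illustrative assumptions, and order-one factors are ignored.

```python
import math

V = 246.0           # electroweak VEV in GeV
M_TAU = 1.777       # tau-lepton mass in GeV (assumed input)
LAMBDA_REF = 30e3   # GeV: lower bound on Lambda quoted in the text for C_gamma ~ 1
LOOP = 1.0 / (16.0 * math.pi**2)
YBAR_TARGET = 2e-3  # rough size of ybar needed for a percent-level BR (assumption)


def lambda_min(c_gamma):
    """Lower bound on Lambda, rescaled as sqrt(C_gamma) from the C_gamma ~ 1 case."""
    return LAMBDA_REF * math.sqrt(c_gamma)


def ybar_max(c_y, c_gamma):
    """Largest ybar ~ C_Y v^2 / (sqrt(2) Lambda^2) allowed by the tau -> mu gamma bound."""
    return c_y * V**2 / (math.sqrt(2.0) * lambda_min(c_gamma) ** 2)


if __name__ == "__main__":
    eps = M_TAU / V
    gammas = {"1": 1.0, "m_tau/v": eps, "(m_tau/v)^2": eps**2}
    ys = {"1": 1.0, "m_tau/v": eps, "loop": LOOP, "loop*m_tau/v": LOOP * eps}
    for g_label, c_gamma in gammas.items():
        for y_label, c_y in ys.items():
            y = ybar_max(c_y, c_gamma)
            verdict = "viable" if y >= YBAR_TARGET else "too small"
            print(f"C_gamma ~ {g_label:12s} C_Y ~ {y_label:13s} ybar_max ~ {y:.1e}  {verdict}")
```

Note that for $C_\gamma \sim m_\tau^2/v^2$ the rescaled bound on $\Lambda$ falls below $v$, so in that column the numbers should be read only as an indication that τ → µγ does not constrain HLFV, not as a consistent EFT estimate.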
On the other hand, as we have seen in table 1, in order to explain the CMS result the scale of new physics cannot be much larger than the electroweak scale and, if it is very close to it, the effective field theory treatment is not appropriate. It is, therefore, very important to scrutinize the complete renormalizable models that could give rise to large HLFV. For this purpose the effective field theory language is still useful since by classifying the different ways of generating the effective operators that give HLFV we are listing in a systematic way the different models. Thus, in the next sections we will classify different ways of "opening" the HLFV effective operators with renormalizable interactions.
The different models/topologies are shown in figures 3 and 6, and explicitly listed in tables 3 and 4, respectively. Constraints stemming from τ → µγ are present in all models because all of them contain new charged particles. In general, τ → µγ will receive several contributions (examples with explicit diagrams will be given when discussing the different topologies in the next section):
a) Those which can be calculated in the effective theory (with the Higgs boson and tau-leptons in the loop, see figure 2), which often are the only contributions considered. These contributions are obtained in the full theory from diagrams which reduce to figure 2 when all propagators with heavy particles shrink to a point (see left diagram of figure 4 for the case of topology C).
b) When integrating the new particles at tree level one could generate additional operators that do not contribute to h → τµ but could generate τ → µγ at one loop (this is for instance the case of operators which induce LFV interactions of the Z bosons in models with vector-like leptons, topologies C, D and E).
c) Direct contributions to τ → µγ which can only be obtained by matching the EFT to the full theory.
Contributions a) are universal and they are directly linked to h → τ µ, however, they are suppressed by, at least, two m τ factors (one factor from the Higgs-lepton coupling and one from the external line helicity flip). Therefore, they will provide the most conservative bound on h → τ µ.
Contributions b) are model-dependent, because the extra operators generated depend on the particle content. However, in some cases the parameters appearing in those contributions can be directly linked to the parameters appearing in h → τ µ.
Contributions c) are also model-dependent. There are two types of contributions of this class: c1) which can be obtained from the tree level topologies by closing two Higgs doublets and attaching a photon to one of the charged particles in the loop (in the Higgs or in the new scalars/vector-like leptons, depending on the topology). For an example of those contributions in the case of topology C see right diagram of figure 4. These contributions, by construction, depend on the same couplings that appear in h → τ µ and, therefore, will tightly constrain it. c2) Some of the particles entering in the tree level topologies can independently give contributions to τ → µγ (for instance in the 2HDM, diagrams with only the non-SM doublet; see also figure 5 for the case of topology C). These can generically be enhanced but contain some parameters which do not appear in h → τ µ. Therefore, they can be set to zero, suppressing τ → µγ without suppressing h → τ µ.
Upper bounds obtained from a), c1) and in some cases b) will be robust because in those cases the τ → µγ amplitude, C γ , will be proportional to the h → τ µ amplitude, C Y , and therefore they can not decouple, unless fine-tuned cancellations occur. In the other cases, there will be new couplings involved and, thus, one will be able to set upper bounds on h → τ µ only under certain extra assumptions. These last limits, which could be avoided in some particular set-ups, will be termed natural here.

Tree level UV completions of the HLFV effective operators
As we have discussed in the previous section, despite the fact that all dimension-six operators that give rise to HLFV can be reduced to the O Y operator, the kind of new particles that generate the O Y and O 1L,1R,2L,2R operators could be different and thus so is the phenomenology they induce. For instance, as we will see, operators of type O 2L,2R in eq. (2.6) will give rise to HLFV but with a contribution that is always suppressed by the tau-lepton mass. Moreover, they will also generate LFV interactions of the Z boson, which are strongly bounded. In addition, we will also see that, in some cases, the O Y will always appear together with some of the other operators. Therefore, we will study first the different ways of obtaining the operator O Y from tree-level exchange of new particles with renormalizable interactions. Then, we will also discuss the operators with one covariant derivative and two Higgs doublets (type O 2L,2R ). In both cases we will estimate the expected size of the HLFV decay, the relevant constraints from other processes and discuss some interesting models which realize those operators.

The Yukawa operator
We give in figure 3 all the possible tree-level topologies that generate the O Y HLFV operator, eq. (2.4), by using renormalizable interactions and including at most two new particles. In table 3 we list all the particles that can mediate the interactions in the different topologies, where S stands for a scalar and F for a vector-like fermion (see refs. [75,76]). Fermions must be vector-like because to generate the operator they must be massive before SSB and because they need to flip the chirality. We use the notation (SU(2) L , Y) and, generically, denote the Yukawa type couplings by Y , the scalar quartic couplings by λ and the (dimensionful) trilinear scalar couplings by µ.
When the scalar (3, 1) S (see-saw type II) or the fermion (1, 0) F (see-saw type I) (or (3, 0) F , see-saw type III) are present, (Majorana) neutrino masses could also be generated at tree level. Notice however that the representation (1, 0) F (see-saw type I) does not contribute to HLFV at tree level, because it does not contain charged components (shown crossed-out in table 3). In principle, the smallness of neutrino masses could impose strong constraints on some of the couplings, however since HLFV preserves total lepton number, it is always possible to avoid the constraints from neutrino masses, which violate lepton number and require extra parameters (either couplings or Majorana masses for the heavy neutral fermions). For instance, the scalar (3, 1) S is present in topologies B and C, and gets a VEV, but in none of them the coupling of the scalar to two lepton doublets is needed for HLFV. Therefore this coupling could be set to zero avoiding the neutrino mass constraints. Analogously, although fermions (3, 0) F are present in topologies C, D and E (section 3.2), HLFV relies on the vector-like mass of the fermion. Then, if lepton number is approximately conserved, we are in an inverse see-saw scenario, where neutrino masses can be made small at will, independently of the Yukawa couplings needed to obtain HLFV. In section 5.2 we will discuss the cases of having just the particle content of the see-saws (chiral (1, 0) F or (3, 0) F with Majorana mass terms, or only a (3, 1) S coupled to the lepton doublets), in which HLFV is generated at one loop.
Taking all mass scales equal for simplicity, i.e., $m_S \sim m_F \sim \mu \equiv m$ (we know that naturality and charge-breaking constraints impose $\mu \lesssim \mathcal{O}(m)$, so we are considering the most favorable scenario), the different topologies obey a hierarchy in terms of the corresponding Yukawa couplings. In all the topologies the new multiplets contain charged particles, therefore CLFV will constrain the Yukawas $Y$ (besides the perturbativity limit, of $\mathcal{O}(1)$, which also applies to λ). Moreover, we will see that all the topologies which contain vector-like fermions (C and D) also generate the derivative operator, as in table 4, and will be subject to further constraints. Therefore we expect that topologies A and B are the least suppressed. In fact, as already said, topology A (2HDM) is known to generate a sizable HLFV rate.
In the following we are going to discuss in more detail the different topologies.

Topology A
The only possibility for the intermediate particle in topology A is a scalar doublet. Then, this topology belongs to the class of two Higgs doublet models, in particular to type-III 2HDM. This is because in the effective theory we assume there is a standard Higgs doublet Φ = (2, 1/2) which gives masses to the fermions and a new one, $S$, which generates the HLFV effective operator. Therefore, the two doublets couple to ordinary leptons. From the topology A diagram one obtains $C_Y/\Lambda^2 \sim Y\lambda/m_S^2$, with λ the $(S^\dagger \Phi)(\Phi^\dagger \Phi)$ coupling in the potential and $Y$ the Yukawa coupling of the new doublet to the leptons. Then $\bar y \sim Y \lambda v^2/m_S^2$, and values $\bar y \sim 10^{-3}$ can easily be obtained with $m_S \sim$ few TeV and $\lambda, Y \lesssim 1$. Now, one has to take into account also phenomenological constraints from τ → µγ. We can estimate the robust upper bound by attaching one of the Higgs doublets to the external tau-lepton line in topology A (see figure 3) and a photon to the (now) internal tau-lepton line (contribution of type a). This will bring a $y_\tau$ factor. Moreover, this coupling will flip the tau-lepton chirality, and to flip back the chirality of the tau-lepton we need an extra $y_\tau$ factor. Finally, the behavior of the integrand at small momenta requires an $m_h^2$ factor in the denominator. Therefore, we find $C_\gamma \propto y_\tau^2\, C_Y$ (with a $1/m_h^2$ dependence), which relates directly the $C_\gamma$ to the $C_Y$ coefficient. It agrees with the one obtained in the effective theory by closing the Higgs in a loop and provides an example of a model in the first column and first row in table 2. The estimate is however too naive: first, because the complete calculation contains some $\log(m_h/m_\tau)$ enhancement factors, and second, because due to the $y_\tau^2$ suppression, there are two-loop contributions (Barr-Zee diagrams [74]) which are larger. Adding all these contributions and comparing with the experimental limit on τ → µγ, the EFT contribution gives $\bar y < 0.016$ (at 90% C.L.) [9,10], which still allows for large BR(h → τµ).

Table 3. Columns: Topology, Particles, Representations.
In addition, there are diagrams contributing to τ → µγ with only the new scalar doublet running in the loops. Denoting its Yukawa couplings to the leptons by $Y$, we obtain a contribution to $C_\gamma$ proportional to the product $Y_{\tau\mu} Y_{\tau\tau}$ (an example of the second column and first row in table 2). It is obvious that in this case h → τµ is not directly proportional to τ → µγ because $Y_{\tau\tau}$ does not appear in the former. However, under the natural assumption $Y_{\tau\mu} \lesssim Y_{\tau\tau}$, we can obtain an upper bound on the flavour-violating Yukawa and hence on $\bar y$, using in the last step the perturbativity upper bound on the quartic coupling, $\lambda \lesssim \mathcal{O}(1)$. Notice that $\mathcal{O}(1)$ values of this coupling would yield problems with perturbativity or stability close to the EW scale. We call this type of bound natural because it could be evaded in some scenarios (for instance if $Y_{\tau\tau} \sim 0$). These estimates agree with the detailed studies of 2HDM [18,20], which include all the scalar contributions (neutral and charged) and also some Barr-Zee two-loop contributions involving the top quark. In the notation of 2HDM, a large λ and small enough scale of new physics, i.e., a second Higgs light enough, is seen by having $\cos(\beta - \alpha)$ large enough (but close enough to the decoupling limit $\sin(\beta - \alpha) = 1$ in such a way that the light Higgs is SM-like), where $\tan\beta = v_2/v_1$ and α is the mixing angle of the two CP-even Higgs scalars, see ref. [18]. After including all known contributions and imposing all phenomenological constraints a 1% branching fraction can be achieved, and therefore one can explain the excess seen by CMS [20].
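The tree-level estimate for topology A, $\bar y \sim Y \lambda v^2/m_S^2$, is easy to explore numerically. The following sketch simply evaluates that relation for a few illustrative parameter choices; the benchmark values and function names are assumptions made for the example, and order-one factors are ignored.

```python
import math

V = 246.0  # electroweak VEV in GeV


def ybar_topology_a(yukawa, lam, m_s):
    """Estimate ybar ~ Y * lambda * v^2 / m_S^2 for topology A (m_S in GeV)."""
    return yukawa * lam * (V / m_s) ** 2


def max_scalar_mass(yukawa, lam, ybar_target):
    """Heaviest m_S in GeV that still gives ybar_target for the given couplings."""
    return V * math.sqrt(yukawa * lam / ybar_target)


if __name__ == "__main__":
    ybar_target = 0.003  # benchmark in the upper part of the CMS-preferred range
    for yukawa, lam in ((1.0, 1.0), (0.5, 0.5), (0.1, 1.0)):
        m_max = max_scalar_mass(yukawa, lam, ybar_target)
        print(f"Y = {yukawa:.1f}, lambda = {lam:.1f} -> m_S up to ~ {m_max:.0f} GeV")
```

For couplings of order one the new doublet can be as heavy as a few TeV, while smaller couplings push it towards the electroweak scale, where the full 2HDM analysis of refs. [18,20] is needed instead of the EFT estimate.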

Topology B
This topology is similar to topology A but with two new scalars, $S_1$ and $S_2$, with trilinear couplings $\mu_1 S_2 S_1^\dagger \Phi$ and $\mu_2 S_2 \Phi^\dagger \Phi$. In fact, the scalar $S_1$ is always a doublet and in the limit of a very large mass of the scalar $S_2$, which can be a singlet or a triplet, topology B reduces to topology A. From the diagram we obtain that the effective operator coefficient scales as $C_Y/\Lambda^2 \sim Y \mu_1 \mu_2/(m_{S_1}^2 m_{S_2}^2)$, which, even for $\mu_{1,2} < m_{S_{1,2}}$ and $Y < 1$, can easily yield BR(h → τµ) at the percent level. The trilinear couplings proportional to $\mu_2$ imply that, after electroweak symmetry breaking, the new scalars $S_2$ get an induced VEV. Thus, when $S_2$ belongs to a triplet there are additional constraints because the ρ parameter receives tree-level contributions from the triplet VEV $v_T$ and, therefore, it is bounded as in eq. (3.8), where in the last step we used $\mu_1 \lesssim m_{S_1}$ from naturalness. The discussion of possible constraints from τ → µγ is, in part, similar to topology A. In particular, the contributions obtained by closing the topology and by exchange of the new doublet are essentially the same. Moreover, the singlet $(1, 0)_S$ and triplet $(3, 0)_S$ do not couple to leptons and do not give extra contributions to τ → µγ. The triplet $(3, 1)_S$ could, in principle, couple to leptons, giving additional contributions to τ → µγ and generating tree-level neutrino masses. However, both neutrino masses and these contributions to CLFV depend on a Yukawa coupling, see eq. (5.29), which does not enter in HLFV, and thus these constraints are not robust and could be evaded. Therefore, we conclude that topologies B can give sizable contributions to HLFV if the new scalars $S_2$ are singlets, as they are not subject to the aforementioned constraints coming from the ρ parameter.
We want to emphasize that the Higgs doublet present in topology B has the same quantum numbers as the Higgs doublet of topology A, and thus, if the contribution from B is generated, so will in general be that of A, which involves fewer fields and couplings. Which one dominates will depend on the relative size of couplings and masses.

Topology C
Topology C contains one new scalar, S, and one new fermion, F . The scalar must have trilinear couplings with the SM Higgs doublet and the fermion must be vector-like to be able to flip the chirality of the SM fermions to generate the operator.
From the diagram we obtain $C_Y/\Lambda^2 \sim \mu\, Y_F\, Y^S_F/(m_S^2\, m_F)$, where $Y_F$ is the Yukawa coupling of the SM doublet to the SM fermions and the new fermion, while $Y^S_F$ is the Yukawa coupling of the new scalar to the SM fermions and the new fermion. Thus $\bar y \sim \mu\, Y_F\, Y^S_F\, v^2/(m_S^2\, m_F)$. Theories with vector-like fermions will necessarily generate derivative operators (topologies E), which give additional contributions to h → τµ, but these contributions are further suppressed by the tau Yukawa coupling. However, the derivative operators give rise to other processes (LFV mediated by Z exchange or violations of unitarity in the lepton mixing) which strongly constrain the coupling $Y_F$: $(Y_F v/m_F)^2 \lesssim 0.001$, see section 3.2. Moreover, from the stability of the minimum of the potential we expect $\mu \lesssim m_S$, so that $\bar y \lesssim (Y_F v/m_F)(Y^S_F v/m_S) \lesssim 0.03\, Y^S_F\, v/m_S$, which can be quite large if $m_S$ is not too heavy or $Y^S_F$ is not too small. On the other hand, if $S$ is a scalar triplet that gets a VEV, we obtain a much stronger bound from eq. (3.8), which makes $\bar y$ too small unless we push the Yukawa couplings close to their perturbative limit. In addition we have to impose the limits coming from τ → µγ. There are many diagrams that give contributions to this process. We will classify them as discussed at the end of section 2.
For instance, starting from topology C (see figure 3) we can close two of the SM Higgs doublets in a loop and attach a photon to the internal lepton line (see left diagram of figure  4). When shrinking the heavy particle propagators to a point this diagram reproduces the "effective field theory" result ( figure 2). This contribution is what we termed as type a in section 2 and, although robust in the sense discussed there, it is suppressed by two factors of m τ .
If instead we attach the photon to the heavy vector-like fermion, as in the right diagram of figure 4, we do not have the $m_\tau^2$ factor because the helicity flip is produced by the heavy vector-like fermion. This contribution, which is also robust and is clearly the dominant one, is completely missed in the "effective field theory" calculation (type a). In the EFT language it is obtained after matching and can only be estimated after specifying the details of the full theory. This contribution provides an example of what we denoted by type c1 in section 2. Since it is pure matching, the result is infrared finite and dominated by the heaviest mass in the loop, so we have: (3.14) To obtain an upper bound on the HLFV rate we use the lower bounds on the masses of vector-like leptons from direct searches and take $m_F \sim m_S$. Then, using the limit on $C_\gamma$ in eq. (2.21), we have that $\bar{y} \sim C_Y v^2/(\sqrt{2}\Lambda^2) \lesssim 0.00004$ and from eq. (2.12) we finally get: Notice that this limit can be somewhat relaxed if there is a large hierarchy between the fermion and scalar masses.
Figure 5. Natural contribution to τ → µγ in topology C, which cannot be estimated in the EFT and contains a different combination of couplings than that appearing in HLFV (contribution type c2). It is suppressed by $m_\tau$.
There are also contributions to τ → µγ from diagrams in which only the new scalar/new fermion pair is exchanged; see for instance the diagram in figure 5, which provides a contribution of type c2, as defined in section 2. These necessarily give left-left or right-right amplitudes in the external leptons 8 and, therefore, require a chirality flip and a factor $m_\tau$. Typically we obtain $C^\gamma_{\tau\mu} \propto Y^S_{\tau F} Y^S_{\mu F} y_\tau$. We can see that, in this case, $C_Y \propto \mu Y_{\tau F} Y^S_{\mu F}$ is not exactly related to $C_\gamma$, which does not contain the factor µ and contains other Yukawa couplings. One could, for instance, take $Y^S_{\tau F} = 0$, making this contribution to BR(τ → µγ) zero while keeping BR(h → τµ) different from zero. Therefore the limits obtained from these diagrams are not robust. Moreover, the fact that the amplitudes from these diagrams are proportional to $m_\tau/v$ makes the limits obtained from them a factor $10^4$ weaker, which implies that these contributions are irrelevant.
Finally, there are additional contributions to C γ from the derivative operators generated by the vector-like fermions, but as we will discuss in section 3.2 these are also m τ suppressed.

Topology D
These topologies contain two new vector-like fermions $F_1$ and $F_2$ and no new scalars. From the diagram we immediately obtain where $Y_{F_1,F_2}$ are the Yukawa couplings of the fermions $F_{1,2}$ to SM leptons and $Y_{12}$ is the Yukawa coupling among them. Then we have where we have used the limits $(Y_{F_{1,2}} v/m_{F_{1,2}})^2 \lesssim 0.001$ coming from the extra Z interactions generated by the new vector-like fermions (see section 3.2). Moreover $Y_{12}$ is constrained by present data on h → γγ and perturbativity to be $Y_{12} \lesssim O(1)$. Therefore, using only these arguments, it seems difficult to obtain BR(h → τµ) at the percent level.
Regarding the constraints from τ → µγ, we can close the two external SM doublets in a loop and attach a photon to the internal charged particles (contribution c1). We obtain C γ ∼ C Y , as in the case of topology C, and the reason is the same: vector-like fermions provide the required chirality flip. As a consequence, we get the same bound as in eq. (3.15) (example of third column first row in table 2), which is robust and renders this topology not useful to obtain an enhancement in h → τ µ.
As in the case of topology C there are additional contributions both to C Y and C γ , coming from diagrams with only one vector-like fermion, which are m τ suppressed and will be discussed in section 3.2.

The derivative operators (topologies E)
We show in figure 6 all the possible tree-level topologies giving rise to type O 1L,1R,2L,2R operators, see eqs. (2.5) and (2.6), by including only one new particle, which must always be a vector-like lepton. In table 4 we list all the possible high-energy realizations.
After expanding the operators in table 4 and using the SM equations of motion, they can be written as the operator $O_Y$, generating HLFV suppressed by an additional factor $y_\tau$. Therefore, typically they give and thus $y_{\tau\mu} \sim y_\tau \kappa_{\tau\mu}$, with $\kappa_{\tau\mu} \sim Y_{\tau F} Y_{\mu F} v^2/m_F^2$. But the same operator will also produce LFV Z-couplings, non-universal tree-level Z-decays and violations of unitarity in the leptonic mixing matrix. We show in table 4 the new non-diagonal couplings of the different operators (see appendix A for the derivation). Higgs couplings are in units of $y_{\tau\mu}$, Z couplings are in units of $\kappa_{\tau\mu}\, e/(2 c_W s_W)$ (with $s_W$ ($c_W$) being the sine (cosine) of the weak mixing angle) and W ones are in units of $\kappa_{\tau\mu}\, e/(2\sqrt{2} s_W)$. The charged-current interactions are already non-diagonal at the renormalizable level via the PMNS lepton mixing matrix, and therefore this is just a correction to it, i.e., $V \to V\,(1 + O(1)\,\kappa\, e/(2\sqrt{2} s_W))$. The most important constraints on $\kappa_{\tau\mu}$ therefore come from Z non-diagonal couplings.

Figure 6. Tree-level topologies giving rise to type $O_{1L,1R,2L,2R}$ operators, see eqs. (2.5) and (2.6). Those involving the lepton singlets $e_R$ give topologies $E_{1,2}$ (left diagram) and those involving the lepton doublets L generate topologies $E_{3,4}$ (right diagram). The Φ and the Φ can be interchanged among the different legs.

Table 4 (columns: Operator, Topology). Couplings are given with $\kappa_{\tau\mu} \sim Y_{\tau F} Y_{\mu F} v^2/(2 m_F^2)$ and $y_{\tau\mu} \sim y_\tau \kappa_{\tau\mu}$. The chirality of the charged lepton couplings to the Z is understood from the effective operator. Notice that the couplings of the W and Z to the leptons are related by $g_W = g_{ZL\nu} - g_{ZLe}$.
As we see from table 4, in all cases that give HLFV there are non-diagonal Z couplings, which make HLFV negligible if it is generated only by derivative operators. For instance, from τ → 3µ mediated by the Z at tree level, we find: (3.20) Using the experimental limit $\mathrm{BR}(\tau \to 3\mu)_{\rm exp} < 2.1 \times 10^{-8}$, we get: 9

$|\kappa_{\tau\mu}| \lesssim O(10^{-3})$. (3.21)

Since we have seen that $y_{\tau\mu} \sim y_\tau \kappa_{\tau\mu}$, we immediately obtain the bound $|y_{\tau\mu}| \lesssim 10^{-5}$ and therefore

$\mathrm{BR}(h \to \mu\tau) \sim 1200\, |y_{\tau\mu}|^2 \lesssim 10^{-7}$. (3.22)

Moreover, the topologies with vector-like leptons will also give contributions to radiative decays like τ → µγ, which could yield stronger limits. In particular, there are one-loop contributions of type a, b and c1, which, for only one vector-like multiplet, are all suppressed by $m_\tau$. Since, in this case, $C_Y$ is also proportional to $y_\tau$, it provides an example of the second column, second row in table 2, and there is no hope of having a large BR(h → τµ). In any case, the tree-level constraints from LFV in Z-couplings discussed above are strong enough to prevent sizable contributions to HLFV coming only from topologies of type E. Notice that although such contributions generated by just one vector-like fermion are also present in topologies C and D, in these models there are additional HLFV terms which are not suppressed by $y_\tau$ and, thus, the limit from eq. (3.22) does not apply.
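The numerical coefficient in eq. (3.22) can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard tree-level width $\Gamma(h \to \mu\tau) \simeq m_h |y_{\tau\mu}|^2/(8\pi)$ for a single non-zero coupling and divides by the SM total width; the input values ($m_h$, $\Gamma_h$, $m_\tau$, $v$) and the normalization $y_\tau = \sqrt{2} m_\tau/v$ are illustrative assumptions, not numbers taken from the text.

```python
import math

# Illustrative inputs (assumptions, not quoted in the text); all in GeV.
M_H = 125.0           # Higgs mass
GAMMA_H_SM = 4.07e-3  # SM total Higgs width
M_TAU = 1.777         # tau mass
V = 246.0             # electroweak VEV


def br_prefactor(m_h=M_H, gamma_h=GAMMA_H_SM):
    """Coefficient c in BR(h -> mu tau) ~ c |y|^2, using Gamma = m_h |y|^2 / (8 pi)."""
    return m_h / (8.0 * math.pi * gamma_h)


def br_h_mutau(kappa, y_tau=math.sqrt(2.0) * M_TAU / V):
    """BR(h -> mu tau) when y_taumu ~ y_tau * kappa (derivative-operator case)."""
    y_taumu = y_tau * kappa
    return br_prefactor() * y_taumu**2


if __name__ == "__main__":
    print(f"prefactor  ~ {br_prefactor():.0f}")
    print(f"BR(kappa=1e-3) ~ {br_h_mutau(1e-3):.1e}")
```

With these inputs the prefactor comes out close to the value 1200 used above, and saturating $|\kappa_{\tau\mu}| \sim 10^{-3}$ gives a branching ratio of order $10^{-7}$. The estimate neglects the (tiny) change in the total width from the new decay channel.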
In appendix C we give further details of models with just vector-like leptons giving rise to derivative operators, as well as to topology D of the Yukawa operator. We show there how, in complete generality, for all vector-like models, Higgs interactions in the mass basis can be unambiguously expressed in terms of Z interactions and are therefore suppressed.

Universality of Z decays
The leptonic Z-decay branching ratios are universal to a good degree of precision. In fact from ref. [79] we have  Thus, being very conservative, we can take for all leptons the 2σ upper limit obtained for the tau-lepton

HLFV at one loop
Possible topologies for HLFV at one loop are shown in figure 7. Contrary to the previous tree-level analysis, this is not an exhaustive list but only some representative examples, most of them appearing in well-known neutrino mass models that we will discuss in the next section. One remark about HLFV at one loop is in order: while in the EFT it is very easy to identify the diagrams which generate the relevant operator, most of the calculations have been performed in particular models, where one has to take proper care of one-loop (finite) field renormalizations to enforce flavour-diagonal kinetic terms at the loop level. As a consequence, one needs to check that the result has the correct decoupling behaviour in the limit $M \gg v$, where M stands for the scale of new physics in the model; that is, the amplitude should scale as $v^2/M^2$, since at the effective Lagrangian level HLFV is mediated by the operators in eqs. (2.4), (2.5) and (2.6).
Very roughly we can distinguish two general types of models, which lead to different expectations for HLFV at one loop. In models where HLFV is generated via the derivative operator (middle and right diagrams in figure 7), the use of the equations of motion leads to HLFV proportional to the SM lepton Yukawa couplings. The diagram on the left of figure 7 flips helicity in the Yukawa couplings of the new fermions, $F_{1,2}$, and gives rise to the Yukawa operator. Note, however, that in many neutrino mass models the fermions running in the loop are SM leptons, which also give an $m_\tau$ suppression from one of the SM Yukawa couplings. In those cases $C_Y$ is proportional to $y_\tau$, and we can estimate the h → µτ rate as where $\lambda_{iH}$ is the quartic coupling of the new scalar to the Higgs, and Y its coupling to leptons (SM and additional ones). Moreover, some of the new particles in the loop have to be electrically charged, and therefore τ → µγ will always be present. Then, bounds from the non-observation of τ → µγ, which is usually proportional to $y_\tau^2$, typically give the constraint which leads to $\mathrm{BR}(h \to \mu\tau) \lesssim 10^{-8}$ and provides an example of column two and row four in table 2.
Notice that in these scenarios there are also contributions to h → γγ and h → Zγ. As currently LHC experiments no longer see a significant excess on the γγ channel, this imposes further limits on the possibility of having sizable HLFV in these radiative models.
Even if the CLFV constraints are somehow avoided with a particular texture and/or some fine-tuned cancellations, taking $Y \sim 1$ and $M_i \sim v$, one still expects a fairly small HLFV. For $\lambda_{iH} \sim O(1)$, the general expectation is: allowing for some enhancement due to possible numerical factors and couplings of order one, at most. This is in agreement with the explicit calculations in the MSSM see-saw [43]. More promising scenarios are those in which the chirality flip in the Yukawa operator is provided by a new coupling, $Y_{LR}$, since then there is typically an enhancement factor $(Y_{LR}/y_\tau)^2 \sim 10^4$ for $Y_{LR} = O(1)$ (providing examples of row three in table 2). Still, to have sizable contributions one needs particular flavour structures that suppress CLFV and enhance HLFV, and large couplings, $\sim \sqrt{4\pi}$, which could lead to instabilities or non-perturbativity close to the EW scale. This mechanism is at work for instance in SM extensions with scalar leptoquarks coupled to the top quark, studied in refs. [21,40,41].
A similar enhancement can be obtained in models with both singlet and doublet extra scalar fields (other than type-III 2HDM), with slightly smaller dimensionless couplings but large trilinear ones, µΦ † φ φ e . In this case BR(h → µτ) can be further enhanced by a factor $(\mu/M)^2 \sim (5-10)^2$, M being the scale of the new particles in the loop [83]. However, this trilinear coupling cannot be very large because of naturalness (and also charge-breaking constraints).
The minimal supersymmetric standard model (MSSM) is a natural candidate to produce HLFV. In spite of containing two SM scalar doublets, LFV Higgs decays in the MSSM are generated at one loop, because the holomorphicity of the superpotential prevents the coupling of the two doublets to both charged leptons and neutrinos. LFV Higgs decays within the MSSM have been studied by several groups, finding generically a BR(h → µτ) several orders of magnitude below the present experimental sensitivity both in the R-parity conserving [47,49,52,84] and in the R-parity violating case [85], as expected from our estimates above. However, a detailed study shows that it is possible to find very fine-tuned regions of the supersymmetric parameter space with large tan β and µ-term close to its perturbative bound, in which BR(h → µτ) can reach the percent level while also satisfying present limits from τ → µγ and h → γγ [53]. Analogously, it is possible to obtain such a large ratio in some regions of the parameter space of the supersymmetric inverse see-saw model, for particular structures of (very large) Yukawa couplings, which induce LFV slepton masses [45].
In summary, we conclude that when HLFV occurs at one loop a very suppressed rate is expected generically, $\mathrm{BR}(h \to \mu\tau) \lesssim 10^{-4}$, but in most of the models it is much smaller. This is due to the loop suppression factor, in many cases a $y_\tau$ suppression, and finally the CLFV constraints. Although sometimes these constraints can be avoided, with a certain amount of fine tuning, HLFV at the percent level requires Yukawa couplings or trilinear couplings close to their perturbative limit. In the next section, we also illustrate these generic estimates within the context of several neutrino mass models.

HLFV and neutrino masses
As we have mentioned before, up to now, the only evidence for LFV are neutrino oscillations, which also imply that neutrinos are massive. Then, since new physics is needed to explain their mass, it is natural to wonder whether the excess observed by CMS in h → τ µ can be accommodated within any neutrino mass model. It is true that, in general, one expects the new physics scales responsible for neutrino masses to be much above the TeV scale, while for large HLFV one needs new physics at the TeV. However, this does not need to be true in all neutrino models, in particular radiative neutrino mass models and left-right symmetric models can provide neutrino masses with new physics at the TeV scale. Obviously, any model containing the type-III 2HDM can potentially explain HLFV, with possibly further restrictions due to the neutrino sector. Interestingly enough, there are two well-known models for neutrino masses which naturally include two SU(2) scalar doublets coupled to leptons: the Zee model and the aforementioned left-right (LR) symmetric models, that we will discuss in the following.
On the other hand, in most models of neutrino masses HLFV is not generated at tree level but at one loop; we will estimate the expected size of h → τ µ in some of the better motivated models and conclude on general grounds that whenever HLFV is generated at one loop the typical size is too small to account for the CMS excess.

The Zee model
One of the simplest models for neutrino masses is the Zee model [86]. It introduces an extra Higgs doublet $\Phi_2$ and a singly-charged singlet $s^+$. In order to explain neutrino mixings correctly it is necessary that both scalar doublets couple to the charged leptons and thus it is a type-III 2HDM, see for instance ref. [87]. In the Zee model, however, it is not clear whether a sizable HLFV can be obtained, as there are extra constraints coming from neutrino masses and mixings, as well as new contributions to CLFV from the extra singly-charged scalar. Therefore, it is very interesting to discuss it qualitatively, but a full detailed analysis of the model including HLFV and all CLFV constraints is beyond the scope of this work and is currently in preparation [88]. Notice that although the model only deals with the lepton sector, the light Higgs needs to couple to quarks in order for it to be SM-like, as observed. We will assume the simplest scenario in which there is no flavour violation in the quark sector and the interactions are SM-like, such that all production and decays of the lightest Higgs are SM-like. The most general Yukawa Lagrangian reads: where $\tilde{L} \equiv i\sigma_2 L^c = i\sigma_2 C \bar{L}^T$. Due to Fermi statistics, $f_{ab}$ is an antisymmetric matrix in flavour space, while $Y_1$, $Y_2$ are completely general. The scalar potential is the general 2HDM potential with extra terms coupling the scalar doublets Φ, $\Phi_2$ with the singly charged $s^+$, see e.g. ref. [89]. For our purposes, the relevant piece in the scalar potential is as it violates lepton number by 2 units. Both Higgs doublets take a VEV, and therefore the charged lepton masses are given by (using the short-hand notation $\sin\beta \equiv s_\beta$, $\cos\beta \equiv c_\beta$, $\tan\beta \equiv t_\beta = v_1/v_2$) 10 : We will work in the basis where $m_f$ is diagonal.

10 Notice that in this section we have changed the definition of tan β.
Regarding HLFV, h → µτ is given by: where α is the angle that rotates the neutral scalars, and similarly for BR(h → τ e) ∝ (|Y τ e 2 | 2 + |Y eτ 2 | 2 ). Let us now discuss how neutrino masses impose constraints on the size of the HLFV decays. The diagram shown in fig. 8 generates neutrino masses at one loop. The complete neutrino mass matrix reads (see for instance ref. [90]): with θ z being the mixing angle for the charged scalars with masses m s + 1 , m s + 2 , which is proportional to µ, the lepton-number-violating parameter.
The important point is that one needs $Y_2 \neq 0$ in order to account for neutrino mixing angles. If there are no big hierarchies between $Y_2$ and f, neglecting $m_e, m_\mu \ll m_\tau$, one can get the following Majorana mass matrix (which is of course symmetric): where the dashes refer to the elements $(M_\nu)_{ij} = (M_\nu)_{ji}$ for $i > j$ and all Yukawas can be taken to be real except for $Y_2^{\tau\mu}$. In order to have correct mixing angles we need $Y_2^{\tau\mu}$ and $Y_2^{\tau e}$, which enter in the 1−2 submatrix, to be non-zero. Therefore there are correlations between mixing angles and BR(h → µτ), BR(h → τe), and in fact a lower bound on the product of both rates is expected.
In order to have sizable h → τµ we need sizable $Y_2^{\tau\mu}$ and/or $Y_2^{\mu\tau}$. 11 This means that it is preferable to have small $f_{e\tau}$ and $f_{\mu\tau}$ to reproduce the correct mixing angles and have a large HLFV, while the overall scale can be adjusted conveniently with the LNV parameter µ and/or the scalar masses. For typical values of $A \sim 10^{-4}\ \mathrm{GeV}^{-1}$, the correct neutrino mass scale can be obtained with $Y_2^{\tau\tau}, Y_2^{\tau\mu}, Y_2^{\tau e} \sim O(0.01)$ and very small singly-charged couplings, $f_{e\tau}, f_{\mu\tau} \sim 10^{-8}$ (see for instance [90] for a numerical scan of the model). Moreover, small $f_{e\tau}$ and $f_{\mu\tau}$ also suppress the singly-charged contribution to CLFV, which will therefore be dominated, in the range of interest for HLFV, by the scalars coming from the doublets.

11 Notice that $Y_2^{\mu\tau}$ ($Y_2^{e\tau}$) only enters in the neutrino mass matrix at order $O(m_\mu)$ ($O(m_e)$), and is therefore subject to weaker constraints, at least as far as neutrino masses are concerned, than $Y_2^{\tau\mu}$.
In type-III 2HDM there is an upper bound on BR(h → µτ ) × BR(h → τ e) from combining the rates of µ → eγ and µe conversion (which currently saturates the bound).
The key point is that all combinations of couplings relevant to these HLFV processes enter in CLFV when the tau-leptons run in the loop. Neglecting the dipole contribution to µe conversion, it reads: 12 $\mathrm{BR}(h \to \mu\tau) \times \mathrm{BR}(h \to \tau e) \lesssim 10^{-8}$, with a strong dependence on $t_\beta$ and $m_A$, see figure 4 of ref. [21] 13 . This is clearly a very strong constraint, and the compatibility with correct mixing angles in the Zee model is under investigation [88]. Indeed, if compatible, future data confirming the BR(h → µτ) ∼ 0.01 excess would imply an upper bound on BR(h → τe) of order $10^{-6}$ [21]. Furthermore, in the simplest version neglecting the muon mass, one neutrino is massless and only inverted hierarchy is possible. Therefore $0.01\ \mathrm{eV} \lesssim |m_{ee}| \lesssim 0.05\ \mathrm{eV}$, which can be observed in planned 0ν2β experiments. This implies that there is a lower bound on BR(h → τe) from the lower bound $|m_{ee}| > 0.01$ eV. All these correlations make the model phenomenologically very interesting.
We want to conclude this section by emphasizing that whether satisfying all requirements (especially eq. (5.8)) and having correct mixing angles is possible or not requires a detailed study with a full parameter scan that is beyond the scope of this work [88].

Left-right symmetric models
Left-right (LR) symmetric models provide a natural realization of the see-saw mechanism of type I and II, together with an elegant explanation for the origin of parity violation in the electroweak interactions [91][92][93][94][95]. The see-saw scale in these models is tied to the scale of parity breaking, and, if sufficiently low, they can be tested at the LHC as well as in low energy experiments such as neutrinoless double beta decay and CLFV.
The minimal LR see-saw model is based on the gauge group $\mathrm{SU}(2)_L \times \mathrm{SU}(2)_R \times \mathrm{U}(1)_{B-L}$, with equal left and right gauge couplings, $g_L = g_R$. The fermions are completely left-right symmetric; in particular, in the lepton sector $L_L = (\nu_L\ e_L)^T$ ($L_R = (\nu_R\ e_R)^T$) is a doublet under the $\mathrm{SU}(2)_L$ ($\mathrm{SU}(2)_R$) group. The electric charge is obtained from the well-known relation

$Q = T_{3L} + T_{3R} + \frac{B-L}{2}$,

where $T_{3L,3R}$ refer to the third component of weak isospin of $\mathrm{SU}(2)_{L,R}$.

12 Notice that using EFT the upper bound is much weaker, around $\sim 10^{-4}$ [21].

13 The authors use an MSSM motivated scenario. It is possible that the constraint can be somewhat weakened by considering a type-III 2HDM with non-related parameters.
The scalar sector consists of a bi-doublet Σ (2, 2, 0), which is added in order to give masses to the quarks and leptons, and two triplets, $\Delta_R$ (1, 3, 2) and $\Delta_L$ (3, 1, 2), according to their $\mathrm{SU}(2)_L \times \mathrm{SU}(2)_R \times \mathrm{U}(1)_{B-L}$ quantum numbers. In a first stage the gauge symmetry $\mathrm{SU}(2)_R \times \mathrm{U}(1)_{B-L}$ is broken by the VEV $\langle \Delta_R^0 \rangle = v_R$ down to the $\mathrm{U}(1)_Y$ of the SM. In the second stage, the neutral components of Σ develop a VEV and break the SM symmetry down to $\mathrm{U}(1)_{\rm em}$, where $v_{1,2}$ are real and positive, and in the following we take α = 0 for simplicity, since it does not affect our discussion. The minima of the potential yield the relation , $v_L$ is the (tiny) VEV of $\Delta_L$ and γ is some combination of the scalar potential parameters and the bi-doublet VEVs, $v_1$, $v_2$ (see refs. [92,96] for more details).
The new gauge bosons $Z_R$ and $W_R$ obtain their masses primarily from $v_R$, while Z, $W_L$ obtain their masses from the VEVs $v_1$, $v_2$ of the bi-doublet (and from $v_L \ll v_1, v_2$). We neglect the mixing among left and right gauge bosons, which comes from the product of the VEVs $v_1$ and $v_2$. This is justified by the large $v_R$ scale implied by present bounds, $v_R > 3$ TeV [97], which no longer requires a small ratio $v_1/v_2$ to suppress $W_L - W_R$ mixing.
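The charge assignments implied by these quantum numbers can be verified mechanically. The sketch below assumes the standard left-right relation $Q = T_{3L} + T_{3R} + (B-L)/2$ and, for the lepton doublets, the usual value $B-L = -1$ (an assumption not stated in the text); the function names are purely illustrative.

```python
from fractions import Fraction
from itertools import product


def t3_values(dim):
    """Third-isospin components of an SU(2) multiplet of dimension `dim`."""
    j = Fraction(dim - 1, 2)
    return [j - k for k in range(dim)]


def charges(dim_l, dim_r, b_minus_l):
    """Electric charges Q = T3L + T3R + (B-L)/2 of all components, largest first."""
    return sorted(
        (t3l + t3r + Fraction(b_minus_l, 2)
         for t3l, t3r in product(t3_values(dim_l), t3_values(dim_r))),
        reverse=True,
    )


# (SU(2)_L dimension, SU(2)_R dimension, B-L) for each multiplet.
FIELDS = {
    "Sigma": (2, 2, 0),
    "Delta_R": (1, 3, 2),
    "Delta_L": (3, 1, 2),
    "L_L": (2, 1, -1),  # B-L = -1 for leptons is an assumption
    "L_R": (1, 2, -1),
}

if __name__ == "__main__":
    for name, quantum_numbers in FIELDS.items():
        print(name, [str(q) for q in charges(*quantum_numbers)])
```

Each triplet contains a doubly-charged, a singly-charged and a neutral component, the bi-doublet contains two neutral components that can take VEVs, and the lepton doublets reproduce the neutrino and charged-lepton charges.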
The Yukawa Lagrangian relevant for the leptons reads: where Y 1 , Y 2 are completely general. Similar interactions exist in the quark sector, with the appropriate Yukawa couplings. If one imposes the natural left-right symmetry L L → R R , ∆ L → ∆ R , Σ → Σ † , see for instance ref. [96], then Y 1,2 are Hermitian matrices, which further restricts the parameter space with respect to a general type-III 2HDM.
After the neutral components of Σ take a VEV, we get: with $t_\beta = v_2/v_1$. Thus, $Y_{1,2}$ can be completely expressed in terms of $m_E$, $m_D$. We can work in the basis where $m_E$ is diagonal. The Yukawa couplings of the charged leptons are those of a type-III 2HDM: where we have first rotated Φ 0 , and then used that H ≈ H + h, with the mixing being $\propto v^2/v_R^2$ [96]. Notice, thus, that the LFV Higgs decays will be controlled by $m_D$ and suppressed by the mixing squared, which scales roughly as $v^4/v_R^4$ and phenomenologically is $\lesssim 10^{-4}$. This suppression can be partially overcome by taking the quartic couplings of the scalar potential which enter in the mixing large enough.
For tan β ∼ O(1), one expects Dirac masses for neutrinos of the same order as the charged leptons. However, small (large) Majorana masses of the LH (RH) neutrinos are also present from the VEV of the triplet, $v_L$ ($v_R$). Therefore, the large Majorana mass of the RH neutrinos yields light neutrino masses via see-saw type I, plus a see-saw type II contribution [92]: where $m_D$ encodes the connection between neutrino masses and HLFV. Notice that for $v_R$ to be at the TeV scale, the Dirac neutrino masses cannot be larger than O(MeV), which may require some fine tuning, due to the relation with charged lepton masses. Furthermore, we need $\gamma \ll 1$ in order to have a small enough type-II see-saw contribution to neutrino masses.
Regarding the bi-doublet, one of the $\mathrm{SU}(2)_L$ doublets gets a large mass, proportional to $v_R$, while the other remains light, at the weak scale. Thus the phenomenology of this sector is similar to the type-III 2HDM close to the alignment limit. However, in this minimal model the strong bounds from flavour-changing neutral Higgs (FCNH) interactions in the quark sector imply that the second Higgs mass should be $m_H \gtrsim 15$ TeV [97][98][99][100], leading to a BR(h → τµ) that is too small. Therefore, in order to get HLFV observable at the LHC, the scalar sector has to be extended in such a way as to avoid the bounds from FCNH effects in the quark sector while keeping them within LHC reach in the lepton sector. One possibility is just to assume a different pattern of interactions with the scalars in each sector, for instance the minimal one in the lepton sector, which leads to the type-III 2HDM, but not in the quark sector. Alternatively, one can consider two (or more) bi-doublets and particular Yukawa structures which avoid the undesired quark flavour-changing effects. 14 Assuming that the scalar sector is modified so as to allow for a second SM doublet that is light enough, in order to get a sizable BR(h → τµ) we still need large $Y_2^{\tau\mu}$, which may be in conflict with CLFV constraints, since new contributions from $W_R$ and Z are present besides those of a generic type-III 2HDM. A more detailed study of HLFV in the context of a LR symmetric scenario with an extended scalar sector and its viability is beyond the scope of this work.
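To get a feeling for the size of the mixing-squared suppression, which scales as $v^4/v_R^4$, one can evaluate it for a few values of $v_R$ at and above the bound $v_R > 3$ TeV. This is a rough sketch only: the value $v = 246$ GeV is an assumed input and order-one factors from the quartic couplings are ignored.

```python
V_EW = 246.0  # GeV, electroweak VEV (assumed input)


def mixing_suppression(v_r, v=V_EW):
    """Naive suppression factor (v / v_R)^4 of LFV Higgs rates, dropping O(1) couplings."""
    return (v / v_r) ** 4


if __name__ == "__main__":
    for v_r in (3000.0, 5000.0, 10000.0):  # GeV
        print(f"v_R = {v_r / 1000:.0f} TeV -> (v/v_R)^4 ~ {mixing_suppression(v_r):.1e}")
```

At $v_R = 3$ TeV the factor is already below $10^{-4}$ and it drops quickly as $v_R$ grows, which is why large quartic couplings are needed to compensate even partially.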

Neutrino mass models with HLFV at one loop
We will now consider some well-known models of neutrino masses in which HLFV appears at one loop. The possible topologies within such neutrino mass models both for the Yukawa operator, eq.  Next, we estimate the HLFV rates of these neutrino-related scenarios.

Dirac neutrinos, see-saw type I and III, inverse see-saw
A thorough computation of HLFV in both, type-I see-saw (including the Dirac limit) and MSSM-see-saw, has been performed in ref. [43]. In the following we just estimate the expected ratios in the non-SUSY scenario (SUSY contributions, for the reasons explained in previous section, are also typically below present experimental sensitivity), and refer the reader to ref. [43] for details. If the SM is just extended to include Dirac neutrino masses, the effective Lagrangian approach is not appropriate because the new particles (right-handed neutrinos) are necessarily light. Still, diagrams like those in figure 7 with neutrinos and charged Goldstone bosons running in the loop can be used to estimate the h → τ µ amplitude in the gaugeless limit (g → 0): with Y the neutrino Yukawa coupling, m the neutrino mass matrix, m ν i the neutrino masses and U αi the neutrino mixing matrix. We also assumed the scalar quartic coupling λ ∼ 1. For Dirac neutrino masses m αi ∼ 5 × 10 −11 GeV ∼ 10 −13 v this yields negligible BR(h → µτ ) ∼ 10 −56 .
In a complete calculation in the unitary gauge, this result is obtained because the unitarity of the mixing matrix U produces an exact cancellation unless neutrinos are massive (GIM cancellation). One can be more general and assume that just light neutrinos run in the loop but that the mixing matrix is not unitary. This is the case in many neutrino mass models. In order to impose the bounds on a non-unitary lepton mixing matrix N, in particular on the µτ element, we can use the results of refs. [80][81][82] on $N \equiv (1 + \eta)U$: (5.18) In this case there is no GIM cancellation and the rate is just the one in ref. [43], substituting U by N. We estimate: (5.19) and so: (5.20) where in the last line we used eq. (5.18). Notice that this diagram is not suppressed by light neutrino masses, but instead by the loop factor and by the almost unitarity of the mixing matrix (or constraints on the off-diagonal elements). This is the same dominant contribution, independent of the light neutrino masses (having the mass insertion on an external leg, $m_\tau$), that gives CLFV [80]. CLFV constraints on non-unitarity are very strong, as the constant term (independent of the neutrino masses) in the loop function no longer cancels. The difference of course stems from experimental constraints: CLFV bounds are below $10^{-8}$, while HLFV experimental sensitivity is currently at the level of O(0.01).
For Majorana neutrinos, assuming three right-handed neutrinos of mass $m_{R_i} > m_h$, one can also estimate that in the gaugeless limit where in the last step we have assumed a common right-handed mass and a common Dirac mass, $m_R \gg m_{\alpha i} \equiv m_D$, so the light neutrino masses are given by the typical see-saw formula, $m_\nu \sim m_D^2/m_R \sim 10^{-11}$ GeV. In this case, $\mathrm{BR}(h \to \mu\tau) \lesssim 10^{-31}$. In the inverse see-saw scenario, the Yukawa couplings can be much larger than the naive see-saw scaling, $Y \sim \sqrt{m_\nu m_R}/v \sim O(10^{-6})$, because $m_\nu \sim \mu\, m_D^2/m_R^2$, where µ is the mass splitting between two pseudo-Dirac sterile neutrinos with masses $m_R \gg \mu$, i.e., LFV and LNV are decoupled. The estimate for HLFV is exactly the same as in the left part of eq. (5.21), before using the see-saw formula. In principle Y could be of order one, but there are strong constraints from CLFV: Therefore, we obtain $\mathrm{BR}(h \to \mu\tau) \lesssim 10^{-10}$ (5.23), in agreement with the full one-loop computation [44]. Assuming particular Yukawa textures, one could evade CLFV constraints, yielding at most $\mathrm{BR}(h \to \mu\tau) \lesssim 10^{-5}$ [44] for very large Yukawa couplings, $Y \sim 4$, close to the perturbative limit, which is still unobservable.
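The two scalings quoted above can be made concrete with a short numerical sketch. It assumes $m_D \simeq Y v$ (dropping factors of $\sqrt{2}$), a light neutrino mass scale of $10^{-11}$ GeV and $v = 246$ GeV; the function names and the chosen values of $m_R$ are illustrative only.

```python
import math

V = 246.0      # GeV, electroweak VEV (assumed input)
M_NU = 1e-11   # GeV, light neutrino mass scale used in the text


def seesaw_yukawa(m_r, m_nu=M_NU, v=V):
    """Naive type-I see-saw Yukawa, from m_nu ~ (Y v)^2 / m_R."""
    return math.sqrt(m_nu * m_r) / v


def inverse_seesaw_mu(y, m_r, m_nu=M_NU, v=V):
    """LNV parameter mu needed in the inverse see-saw, from m_nu ~ mu (Y v)^2 / m_R^2."""
    m_d = y * v
    return m_nu * m_r**2 / m_d**2


if __name__ == "__main__":
    for m_r in (200.0, 1000.0, 10000.0):  # GeV
        print(f"m_R = {m_r:7.0f} GeV: Y_seesaw ~ {seesaw_yukawa(m_r):.1e}, "
              f"mu(Y=1) ~ {inverse_seesaw_mu(1.0, m_r):.1e} GeV")
```

For TeV-scale $m_R$ the naive see-saw Yukawa is of order $10^{-7}$ to $10^{-6}$, whereas in the inverse see-saw an order-one Yukawa is compatible with the light neutrino masses provided µ is tiny, which illustrates how LFV and LNV decouple.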
So we conclude that the predicted HLFV rates are always well below experimental sensitivity for Dirac neutrinos and the different see-saw scenarios.

The Zee-Babu model
In the Zee-Babu model, there are two extra scalars, one singly-charged s and one doublycharged k, coupled to the SM leptons asLf Ls + + e c g e k ++ + H.c.. The k ++ (s + ) contributions arise from the derivative operators of type O 1R,2R (O 1L,2L ) through diagrams middle (right) in figure 7. As the new scalars couple to different chirality fields, both contributions do not interfere and, neglecting the light neutrino masses, we can estimate the contributions to h → µτ as where the Higgs couplings to both new scalars are λ sΦ |s| 2 Φ † Φ + λ kΦ |k| 2 Φ † Φ and the factor of 4 comes from the Feynman rules [101]. Then, for the branching ratio we find: (5.26) Again, we have to take into account all limits from CLFV and universality to constrain the couplings: • The doubly-charged scalar, k ++ , mediates tree-level LFV decays − a → + b − c − d , which provide the strongest limits on all the couplings g ab ; for instance, ). Notice that g τ τ can not be bounded by this type of processes, although it is expected to be of order ∼ g µµ m 2 µ /m 2 τ to reproduce neutrino mixings.
• The singly-charged scalar gives rise to violations of universality. In particular, from muon decay and using the limits of the unitarity of the CKM, one gets |f eµ | 2 < 0.007 m h TeV 2 [102]. Comparing decays into different charged lepton channels, one obtains the limit ||f eτ | 2 − |f eµ | 2 | < 0.035 m h TeV 2 [102].
Combining all the bounds, we obtain: which for $\lambda_{s\Phi} \lesssim O(1)$ (to have perturbativity and stability up to some scale not far from the EW scale) is at most $10^{-9}$.

See-saw type II
Similarly to the ZB model, both the singly and doubly charged scalars of the SU(2) triplet $\Delta$ with hypercharge $Y = 1$ contribute to CLFV and HLFV, although in this case they only couple to left-handed leptons, generating just the operator of type $O_{1L,2L}$. We therefore expect a similar result for BR(h → µτ). By writing the triplet as a 2 × 2 matrix: the relevant Yukawa interaction is: where $g$ is a symmetric matrix in flavour space. The potential terms that couple the triplet to the Higgs are: Thus the relevant terms for HLFV are: The main contribution comes from the diagram on the right in figure 7. For similar masses $m_{\Delta^{++}} \sim m_{\Delta^{+}} \equiv m_\Delta$, which is a reasonable assumption as they belong to the same multiplet, and neglecting factors of two, we find: We can use CLFV data to constrain the couplings. From [103], $\mathrm{BR}(\tau \to \mu\gamma) < 4.4 \times 10^{-8}$ implies $|g^*_{e\mu} g_{e\tau} + g^*_{\mu\mu} g_{\mu\tau} + g^*_{\mu\tau} g_{\tau\tau}|^2 / (m_\Delta/\mathrm{TeV})^4 < 0.04$ (5.34). Much as in the ZB model, tree-level CLFV mediated by $\Delta^{++}$ leads to stronger constraints: $\mathrm{BR}(\tau^- \to \mu^+\mu^-\mu^-) < 2.1 \times 10^{-8}$ implies $|g_{\mu\tau} g^*_{\mu\mu}|^2 / (m_\Delta/\mathrm{TeV})^4 < 10^{-5}$, and there is a similar bound for the combination $g^*_{e\mu} g_{e\tau}$. However, $g^*_{\mu\tau} g_{\tau\tau}$ is only constrained by BR(τ → µγ), so using the upper bound in eq. (5.34) we obtain The bound is similar to the one in the ZB model, due to the presence in both cases of doubly charged scalars.
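To make the hierarchy between the loop-level and tree-level CLFV bounds concrete, the sketch below converts each bound into a maximum coupling product as a function of the triplet mass. It reads both constraints as bounds on a squared coupling product divided by $(m_\Delta/\mathrm{TeV})^4$, and it assumes that a single product dominates the sum in eq. (5.34); the function name and the sample masses are illustrative only.

```python
import math


def max_coupling_product(bound_on_square, mass_tev):
    """Largest |g g'| allowed by |g g'|^2 / (m / TeV)^4 < bound_on_square."""
    return math.sqrt(bound_on_square) * mass_tev**2


TAU_MU_GAMMA_BOUND = 0.04  # right-hand side of eq. (5.34)
TAU_3MU_BOUND = 1e-5       # right-hand side of the tau -> 3 mu constraint

for m_delta in (0.5, 1.0, 2.0):  # TeV, illustrative masses
    loop = max_coupling_product(TAU_MU_GAMMA_BOUND, m_delta)
    tree = max_coupling_product(TAU_3MU_BOUND, m_delta)
    print(f"m_Delta = {m_delta} TeV: loop-level {loop:.2e}, tree-level {tree:.2e}")
```

Both limits scale as $m_\Delta^2$, so their ratio is independent of the mass: under this reading the tree-level bound on $|g_{\mu\tau} g^*_{\mu\mu}|$ is tighter than the loop-level one by roughly two orders of magnitude.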

The Scotogenic model
In the Scotogenic model of neutrino masses (sometimes also referred to as the Inert Doublet model, IDM), the SM is extended by three singlet fermions $N_i$ and one scalar doublet, $\eta$, which are odd under a $Z_2$ symmetry, while all SM particles are even [104]. The lightest of the scalars and the $N_i$ is a dark matter candidate. Under the assumption that the scalar doublet $\eta$ does not acquire a VEV, neutrino masses are generated at one loop, which allows for a new physics scale of order TeV with much larger Yukawa couplings than in the see-saw scenarios, due to the loop suppression. With respect to HLFV, there is only the contribution coming from the second diagram in figure 7, which generates the derivative operator of type $O_{1L,2L}$.
The Higgs coupling to the inert doublet scalar $\eta$ is given by Only the charged scalars contribute via the $\lambda_3$ coupling. Notice that HLFV is unconstrained by neutrino masses, which are proportional only to $\lambda_5$.$^{15}$ We can estimate the amplitude of h → µτ from the first diagram of figure 7 to be: where $m_0 = \max(m_{\eta_c}, m_{R_i})$, with $m_{\eta_c}$ the charged scalar mass and $m_{R_i}$ the mass of the heavy neutrino $N_i$. This leads to the branching ratio On the other hand, the branching ratio for τ → µγ is given by (we assume $m_{\eta_c} > m_{R_i}$, but a similar result holds in the opposite case): where $f(x) = \frac{1 - 6x + 3x^2 + 2x^3 - 6x^2 \ln x}{6(1-x)^4}$, which varies between 1/12 ($x = 1$) and 1/6 ($x = 0$). $\mathrm{BR}(\tau \to \mu\gamma) < 4.4 \times 10^{-8}$ [71] implies (we take the loop function $f \sim 0.1$, the most conservative case): and thus by substituting this bound in eq. (5.39) we find: In any case, this branching ratio is beyond any future experimental sensitivity.
15 Of course, $\lambda_3$ enters in the physical masses of the new neutral and charged scalars. The neutral scalars enter in the neutrino mass expression, while the charged ones give rise to CLFV. $\lambda_3$ is constrained by h → γγ, and also by dark matter phenomenology if the dark matter particle is the neutral component of the doublet.
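The limiting values quoted for the loop function can be checked numerically. The following Python sketch implements $f(x) = [1 - 6x + 3x^2 + 2x^3 - 6x^2 \ln x]/[6(1-x)^4]$ for a dimensionless argument in [0, 1]; the function name and the treatment of the two end points (returning the quoted limits directly, since the expression is singular or numerically unstable there) are choices of this sketch.

```python
import math


def loop_f(x):
    """f(x) = (1 - 6x + 3x^2 + 2x^3 - 6x^2 ln x) / (6 (1 - x)^4)."""
    if x == 0.0:
        return 1.0 / 6.0   # x^2 ln x -> 0 as x -> 0
    if abs(x - 1.0) < 1e-3:
        return 1.0 / 12.0  # removable singularity, approximated by its limit
    num = 1 - 6 * x + 3 * x**2 + 2 * x**3 - 6 * x**2 * math.log(x)
    return num / (6 * (1 - x) ** 4)


for x in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"f({x}) = {loop_f(x):.4f}")
```

The function decreases monotonically from 1/6 to 1/12 over this range, so $f \sim 0.1$ lies between the two extremes.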

Conclusions
We have studied Higgs lepton flavour violation in the light of recent CMS results on h → τµ. From the effective field theory point of view, we have discussed the effective operators that can give rise to HLFV, assuming that the only light degrees of freedom are those of the SM. Then, we have analyzed the different ways of obtaining these operators from a complete renormalizable theory at tree level and at one loop. At tree level, we have listed all the topologies, containing at most two heavy new multiplets, that generate the HLFV operators. In the models obtained we have estimated the h → µτ and τ → µγ rates and have seen that in most of them these two processes are tightly related. In the case of models containing vector-like fermions one can show that they always generate derivative operators (what we call topologies E), which lead to non-universal and lepton flavour changing Z couplings. We showed that models containing only one vector-like multiplet (topologies E) cannot give large HLFV.
On the other hand, models containing scalar triplets, which obtain a vacuum expectation value, are constrained by the ρ parameter. Models containing two vector-like leptons (topologies D) or one vector-like lepton and a scalar (topologies C) can have enhanced HLFV but then, τ → µγ is also enhanced and current bounds forbid large HLFV. Finally, models containing, at least, one new scalar doublet (topology A, and also a slight variation containing an extra scalar, topology B) are able to yield a sizable contribution, unless the new scalar in topology B is a triplet.
When HLFV is generated at one loop it is, in general, very small, $\mathrm{BR} < 10^{-4}$ and typically $< 10^{-7}$, as the rates are suppressed by a loop factor and constrained by CLFV, especially τ → µγ.
We have reviewed the most popular neutrino mass models as a possible explanation of the HLFV anomaly. In most of them (Dirac masses, see-saws type I, II and III, inverse see-saw, Zee-Babu and the Scotogenic model), HLFV appears only at one loop and is suppressed by the tau mass, and, therefore, it is too small. In the case of inverse see-saw type III the fermion triplets are vector-like, but they generate HLFV at tree level only via the derivative operator, so it cannot be large. However, there are two very well-motivated models, the (general) Zee model and left-right models (with an extended scalar sector), which include an extra doublet and can naturally explain neutrino masses and in principle give a large enough h → µτ rate.
At present, neutrino oscillations are the only evidence of non-conservation of flavour in the lepton sector. A confirmation of the h → µτ excess by both CMS and ATLAS in future LHC data would be the first indication of LFV in processes not involving neutrinos. Hopefully, this work, in which we have studied general HLFV and its connection with neutrinos, will help to pin down the preferred models.

A Non-diagonal Z couplings
To illustrate how non-diagonal Z couplings arise from the derivative operators in table 4, let us consider the topology $E_1$ in the first row. After SSB we will have a modification of the kinetic terms of the right-handed leptons $e_R$ where the first term is the SM kinetic term and the second one comes from the effective operator. $\kappa$ is a Hermitian matrix in flavour space, essentially the coupling of the effective operator.
Typically (see appendix B) and it is directly related to the $y_{\tau\mu}$ parametrizing HLFV (recall that for derivative operators $y_{\tau\mu}$ is suppressed by an additional τ Yukawa coupling factor $y_\tau = m_\tau/v$ coming from the use of the equations of motion): $y_{\tau\mu} \sim y_\tau \kappa_{\tau\mu}$ (A.3). Notice that the covariant derivative acts on objects with different quantum numbers. In the first term, $e_R$ is a singlet with hypercharge −1, while in the second term $e_R v$ is the lower component of a doublet with hypercharge −1/2 (the same quantum numbers as $e_L$). Then, for the first term we have while for the second one we obtain Here we used $g' = e/c_W$, $g = e/s_W$ and Adding the two terms we can write After renormalization of the first term, $e_R \to (1 + \kappa)^{-1/2} e_R$, and expanding for small $\kappa$, we recover the SM coupling plus the non-diagonal Z interactions Notice that the renormalization above only affects $e_R$ and, thus, it does not modify charged currents at all. As we have seen, the lepton flavour violating neutral current interaction appears because of the mismatch between the quantum numbers of the combination of fields appearing in the new operator and the quantum numbers of the SM fermion fields. Using this we can generalize the procedure to all the operators in table 4 and obtain the results presented there, where we have also included the contributions to charged-current interactions.
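The mechanism described in this appendix can be illustrated with a small numerical toy. The sketch below takes a two-flavour, real symmetric $\kappa$ (a special case of a Hermitian matrix), rescales the fields with $(1+\kappa)^{-1/2}$ expanded to second order, and checks that the kinetic term becomes canonical while a coupling of the form $g_1 \cdot 1 + g_2\,\kappa$ acquires an off-diagonal entry $\simeq (g_2 - g_1)\,\kappa_{\tau\mu}$. The numbers for $\kappa$, $g_1$ and $g_2$ are hypothetical placeholders, not the SM Z couplings, and the helper names are invented for this example.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b, sa=1.0, sb=1.0):
    """Linear combination sa * a + sb * b of two 2x2 matrices."""
    return [[sa * a[i][j] + sb * b[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def rescaling(kappa):
    """(1 + kappa)^(-1/2) to second order: 1 - kappa/2 + 3 kappa^2 / 8."""
    k2 = matmul(kappa, kappa)
    return add(add(IDENTITY, kappa, 1.0, -0.5), k2, 1.0, 0.375)


def rotated(Z, m):
    """Z m Z; for the real symmetric kappa used here, Z^dagger = Z."""
    return matmul(matmul(Z, m), Z)


eps = 1e-3
kappa = [[0.0, eps], [eps, 2 * eps]]  # hypothetical small flavour mixing
g_singlet, g_doublet = 0.3, -0.2      # placeholder couplings of the two terms

Z = rescaling(kappa)
kinetic = rotated(Z, add(IDENTITY, kappa))
z_coupling = rotated(Z, add(IDENTITY, kappa, g_singlet, g_doublet))
off_diag = z_coupling[0][1]

print("kinetic term after rescaling:", kinetic)
print("off-diagonal coupling:", off_diag)
print("first-order estimate:", (g_doublet - g_singlet) * eps)
```

If the two terms had the same quantum numbers ($g_1 = g_2$ in this toy), the rescaling would remove the flavour-violating coupling entirely, which is the statement made in the text about the mismatch of quantum numbers.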

B An example of a model generating derivative operators
For illustration, we will give here some details of a model giving rise to derivative operators. We add a vector-like lepton $E = (1, -1)_F$ (topology $E_{4a}$) of mass $M_E$ (for simplicity we just add one vector-like lepton, but the model can easily be enlarged to include several of them). The most general Yukawa Lagrangian reads where $Y_e$ is the SM Yukawa coupling. For just one vector-like lepton, $Y_E$ is a general 3 × 1 matrix (for $n$ vector-like leptons it would be a 3 × $n$ matrix and $M_E$ would be an $n$ × $n$ matrix). Notice that mixed bare terms such as $M\,\overline{e_R} E_L$ can always be reabsorbed in a redefinition of $e_R$. We assume that $M_E > v$, so we can integrate out the $E$. Using the EOM, and substituting back in the Lagrangian we get the effective Lagrangian: Expanding the propagator of eq. (B.2) up to dimension 6 we find: where the even powers of $\slashed{D}$ vanish due to the chirality of the (Hermitian) operator. We have also used an integration by parts to rewrite the Lagrangian in an explicitly Hermitian form and defined the 3 × 3 matrix $C_E/\Lambda^2 = \tfrac{1}{2}\, Y_E M_E^{-2} Y_E^\dagger$ (for $n$ $E$'s, $C_E$ would be given by the same expression but with $M_E$ an $n$ × $n$ matrix, and $\Lambda$ is the lightest of the $M_E$ eigenvalues). The factor of 2 comes from the fact that we have defined the operator plus its Hermitian conjugate in eq. (2.3). Now we expand the covariant derivatives of the product of fields and use SU(2) identities like $\Phi\Phi^\dagger = \tfrac{1}{2}(\Phi^\dagger\Phi) + \tfrac{1}{2}\,\vec{\sigma}\cdot(\Phi^\dagger\vec{\sigma}\,\Phi)$ to write The second line does not give rise to HLFV, as can be clearly seen by going to the unitary gauge, but it gives ZLFV involving only charged leptons. A similar procedure can be used for the rest of the topologies involving vector-like leptons. However, it is easier to use the mismatch of quantum numbers between the renormalizable and the non-renormalizable contributions to the kinetic terms, which immediately yields the new interactions, as sketched in appendix A.
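The SU(2) identity used in this appendix, $\Phi\Phi^\dagger = \tfrac{1}{2}(\Phi^\dagger\Phi)\,\mathbb{1} + \tfrac{1}{2}\sum_a \sigma_a\,(\Phi^\dagger\sigma_a\Phi)$, follows from the completeness of the Pauli matrices and can be verified numerically. The sketch below does so for an arbitrary complex doublet; the sample components of $\Phi$ and the helper names are arbitrary choices of this example.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
UNIT = [[1, 0], [0, 1]]


def bilinear(phi, m):
    """phi^dagger m phi for a two-component complex vector phi."""
    return sum(phi[i].conjugate() * m[i][j] * phi[j]
               for i in range(2) for j in range(2))


def outer(phi):
    """The 2x2 matrix phi phi^dagger."""
    return [[phi[i] * phi[j].conjugate() for j in range(2)] for i in range(2)]


def identity_rhs(phi):
    """(1/2)(phi^dag phi) 1 + (1/2) sum_a sigma_a (phi^dag sigma_a phi)."""
    norm = bilinear(phi, UNIT)
    vec = [bilinear(phi, s) for s in SIGMA]
    return [[0.5 * norm * UNIT[i][j]
             + 0.5 * sum(vec[a] * SIGMA[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]


phi = [0.3 + 0.7j, -1.1 + 0.2j]  # arbitrary sample doublet
lhs, rhs = outer(phi), identity_rhs(phi)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest deviation between the two sides:", max_diff)
```

The same check passes for a unitary-gauge configuration with only the lower component non-zero, where the triplet combination reduces to a term proportional to $\sigma_3$.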

C Higgs interactions in models with vector-like leptons
In models with vector-like leptons (singlets and doublets; a similar analysis can be done for vector-like triplets), Z-boson interactions with charged leptons can be written with complete generality in terms of mass eigenstates $E_{L(R)}$ as [107]: where we transformed from the weak basis to the mass basis, $E^{\rm weak}_{L(R)} = V_{L(R)} E_{L(R)}$. Notice that we have suppressed the flavour indices, $a$, in $E_{L,R}$, which run over all charged leptons, standard and heavy ones, and we have defined where run only on lepton doublets. The Higgs interactions read: It is easy to show that the Yukawa coupling can be written in the following way: This general expression simplifies in some cases. For instance, if we only add vector-like singlets there will be no right-handed doublets and therefore $X_R = 0$ and the coupling is $v y = X_L D_E$, while if we add only vector-like doublets, then all left-handed leptons are doublets, $X_L = I$, and the coupling is $y = D_E - D_E X_R$ [21]. Specifying for $y_{\mu\tau}$, we get: Similarly, (C.7) The first (second) terms in $y_{\mu\tau}$, $y_{\tau\mu}$ come from derivative operators obtained by exchange of a vector-like singlet (doublet), while the last term comes from topology D of the Yukawa operator, where both singlets and doublets are exchanged. This term does not involve charged lepton masses and thus will typically dominate unless $Y_{12}\, v < m_\tau$. One can also check that the contribution to τ → µγ is always proportional to $y_{\mu\tau}$, $y_{\tau\mu}$ and, therefore, will provide robust limits on HLFV, as discussed in secs. 3.1 and 3.2.