New Signatures of Flavor Violating Higgs Couplings

We explore several novel LHC signatures arising from quark or lepton flavor violating couplings in the Higgs sector, and we constrain such couplings using LHC data. Since the largest signals are possible in channels involving top quarks or tau leptons, we consider in particular the following flavor violating processes: (1) $pp \to thh$ (top plus di-Higgs final state) arising from a dimension six coupling of up-type quarks to three insertions of the Higgs field. We develop a search strategy for this final state and demonstrate that detection is possible at the high luminosity LHC if flavor violating top--up--Higgs couplings are not too far below the current limit. (2) $pp \to tH^0$, where $H^0$ is the heavy neutral CP-even Higgs boson in a two Higgs doublet model (2HDM). We consider the decay channels $H^0 \to tu, WW, ZZ, hh$ and use existing LHC data to constrain the first three of them. For the fourth, we adapt our search for the $thh$ final state, and we demonstrate that in large regions of the parameter space, it is superior to other searches, including searches for flavor violating top quark decays ($t \to hq$). (3) $H^0 \to \tau\mu$, again in the context of a 2HDM. This channel is particularly well motivated by the recent CMS excess in $h \to \tau\mu$, and we use the data from this search to constrain the properties of $H^0$.

We explore several novel LHC signatures arising from quark or lepton flavor violating couplings in the Higgs sector, and we constrain such couplings using LHC data. Since the largest signals are possible in channels involving top quarks or tau leptons, we consider in particular the following flavor violating processes: (1) pp → thh (top plus di-Higgs final state) arising from a dimension six coupling of up-type quarks to three insertions of the Higgs field. We develop a search strategy for this final state and demonstrate that detection is possible at the high luminosity LHC if flavor violating top-up-Higgs couplings are not too far below the current limit. (2) pp → tH 0 , where H 0 is the heavy neutral CP-even Higgs boson in a two Higgs doublet model (2HDM). We consider the decay channels H 0 → tu, W W, ZZ, hh and use existing LHC data to constrain the first three of them. For the fourth, we adapt our search for the thh final state, and we demonstrate that in large regions of the parameter space, it is superior to other searches, including searches for flavor violating top quark decays (t → hq). (3) H 0 → τ µ, again in the context of a 2HDM. This channel is particularly well motivated by the recent CMS excess in h → τ µ, and we use the data from this search to constrain the properties of H 0 .

I. INTRODUCTION
The discovery of the Higgs boson [1,2], while being a spectacular triumph for both theoretical and experimental particle physics, is hopefully only the first step in a new era of discoveries in the field. In particular, the hope is that precision studies of the Higgs boson's properties will open up a pathway to physics beyond the Standard Model (SM). Among the simplest and most promising ways in which Higgs physics could deviate from SM predictions are flavor violating couplings of the Higgs boson [3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]. Such couplings can arise when the vacuum expectation value (vev) of the Higgs field H is not the only source of electroweak symmetry breaking. For example, in a two Higgs doublet model (2HDM), the vevs of two Higgs fields contribute, and this decouples the flavor structure of the fermion mass matrices from the flavor structure of the Yukawa couplings of the physical Higgs bosons [3,6,7,12,13,[18][19][20][21][22][23][24][25][26][27][28][29]. As a result, flavor changing processes such as h → τ µ, h → τ e and t → hq (where q is an up quark or a charm quark) become possible. An alternative source of misalignment between mass and Yukawa matrices can be higher-dimensional couplings, for instance dimension six operators of the form Here, Q i L and L i L are as usual the left-handed quark and lepton doublets, u j R , d j R and e j R are the right-handed fermion singlets, i, j are flavor indices, H is the Higgs doublet, andH ≡ iσ 2 H † is its charge conjugate field.
In fact, a recent CMS analysis [57] has reported a 2σ excess in a search for h → τ µ decays. A subsequent ATLAS analysis is consistent with this hint, but also with the null hypothesis [58]. Of course, any hint for a flavor changing neutral current (FCNC) interaction like this first has to survive scrutiny in view of low energy precision constraints before being accepted as a potential hint for new physics. In the case of anomalous hµτ couplings, the most important constraints arise from the non-observation of flavor violating charged lepton decays like τ → µγ and τ → 3µ [11,14], and the CMS hint would in fact be consistent with all constraints. Low energy constraints are much tighter for FCNC Higgs couplings involving only the first two generations of leptons, thanks to the spectacular sensitivity of experiments searching for the decays µ → eγ and µ → 3e, as well as µ → e conversion in nuclei. Similarly, anomalous Higgs couplings not involving the top quark are tightly constrained by searches for anomalous rare meson decays and anomalous contributions to neutral meson mixing. Consequently, the only FCNC Higgs couplings that could in principle be observable at the LHC besides hµτ , are hτ e, htu, and htc. (Note that the simultaneous presence of sizeable hµτ and hµe couplings is also ruled out, as is the simultaneous presence of htu and htc couplings [14].) The goal of this paper is to advance collider searches for these potentially large FCNC Higgs couplings by proposing new search strategies and recasting existing searches. In particular, we point out that effective operators like Q ij u from (2) can lead to anomalous production channels for di-Higgs final states. The process we will focus on specifically is pp → thh, arising from the operators and followed by the decays t → b ν and h → bb. This process has an extremely rich signature, outside the scope of present LHC analyses and with a very low background expectation. Going beyond the contact operator approximation, we will discuss the thh final state also in the context of a type III two Higgs doublet model (2HDM) with quark flavor violating Yukawa couplings. Models of this type are emerging as one of the leading UV completions for the FCNC operators in eqs. (1) to (3) [3,6,7,12,13,[18][19][20][21][22][23][24][25][26][27][28][29]. In a type III 2HDM, small flavor violating couplings of the light SMlike Higgs boson h are complemented by large flavor violating couplings of the heavy Higgs bosons H 0 , A 0 , and H ± , opening up new production and decay channels for the latter. The thh signature arises for instance in the process pp → t + (H 0 → hh). We will discuss this process in a detailed Monte Carlo study, and we will also consider the related processes pp → t + (H 0 → tu, W W, ZZ).
Finally, we will also discuss leptonic Higgs-induced FCNC in the 2HDM, in particular the process pp → H 0 → τ µ. We will constrain this process, which could be directly connected to the tentative hint from CMS, using existing LHC data.
The structure of the paper is as follows: in section II, we briefly introduce the effective field theory for FCNC Higgs decays given by operators like eqs. (1) to (3). We then develop an LHC search for the thh final state, carefully dissecting the kinematic distributions of the signal and the various backgrounds, and we estimate the expected sensitivity at the 13 TeV LHC. In the second part of the paper, section III, we proceed to a discussion of the quark flavor violating type III 2HDM. We adapt the general search strategy for thh production, developed in section II, to the 2HDM, and we compare its sensitivity to other constraints on the model. Finally, in section IV, we discuss lepton flavor violation in the 2HDM. We constrain the process H 0 → τ µ using LHC data, and we discuss the implications of these constraints for the CMS hint in h → τ µ. We summarize our results and conclude in section V. In the SM, fermions couple to the Higgs doublet via renormalizable dimension four Yukawa couplings, which, after electroweak symmetry breaking, source both the fermion mass terms and the Yukawa couplings of the physical Higgs boson. Therefore, in the SM, no flavor violating couplings of the physical Higgs boson are possible. In extensions of the SM, however, the mass matrices and Yukawa couplings can be misaligned in flavor space. When written in terms of contact operators, the leading contributions to such misalignment come from the dimension six operators given in eqs. (1) to (3). In this section, we focus on up-type quarks only. FCNC couplings of the Higgs boson to down-type quarks are tightly constrained [11,14], and the possible LHC signatures of FCNC couplings to leptons will be addressed in section IV. The relevant dimension four and six couplings in the up-type sector are After electroweak symmetry breaking, this Lagrangian becomes where the mass and coupling matrices are given by In the above expressions, i, j = 1, 2, 3 are again flavor indices. To improve readability, we will also use the notation i, j = u, c, t, . . . and omit the subscript u where this is unambiguous, e.g. y tu ≡ y 31 u , f tu ≡ f 31 u , etc. In the up quark mass basis, where m ij u is diagonal, we see that the flavor violating couplings satisfy Currently, the strongest experimental limits on the off-diagonal elements of y u come from AT-LAS [59,60] and impose the 95% CL constraints  7). Red dots indicate the FCNC coupling of two Higgs boson to quarks (third term in eq. (7)), which is usually dominant, blue dots stand for the FCNC coupling of a single Higgs boson to quarks (second term in eq. (7)), the contributions of which are usually subdominant.
Here, we have used the leading order expression for the branching ratio [55], supplemented by a correction factor η QCD 1 + 0.97α s = 1.10 [55,61] accounting for NLO QCD contributions: Note that a refined analysis taking into account also the process pp → th, which is relevant in the case of tuh couplings (but not for tch couplings) could improve the bounds on BR(t → uh) and |y ut | 2 + |y tu | 2 by about a factor 1.5 [55]. Constraints from CMS are of the same order as those from ATLAS: CMS obtain the 95% CL bound BR(t → ch) < 0.0056 using searches for multi-lepton and lepton plus di-photon final states. This bound translates into |y ct | 2 + |y tc | 2 < 0.14 [62][63][64]. The same bound holds also for tuh couplings. A secondary process sensitive to anomalous tuh and tch couplings is same-sign top production through t-channel Higgs exchange. The author of ref. [65] has derived limits on |y tq | 2 + |y qt | 2 from this channel by recasting the CMS same-sign di-lepton plus b jet measurements [66]. However, the resulting bounds are weaker than those from t → qh decays. Finally, an effective tuh or tch coupling may lead to anomalous di-Higgs production, mediated by a top quark in the t-channel. Unfortunately, since the rate of this process is suppressed by four powers of the small coupling constants y tq and y qt (q = u, c), it is irrelevant in practice. For instance, for y ut = y tu = 0.08 at the upper limit from eq. (13), the cross section for pp → hh is only ∼ 4 fb at √ s = 8 TeV and ∼ 7.4 fb at √ s = 13 TeV. In view of the above constraints, we will in the following use benchmark values of y tq = y qt = 0.08, leading to the expectation of potentially measurable rates for the process pp → thh, on which we will focus in the first part of this paper. The corresponding Feynman diagrams are given in Fig. 1.

II.2. LHC Search Strategy for thh Production in EFT
In the following we investigate the process pp → thh in more detail using Monte Carlo simulations, with the aim of developing a search strategy for future LHC analyses at 13 TeV and estimating its sensitivity. To maximize the number of signal events, it is desirable to focus on Higgs decay channels with large branching ratio, in particular h → bb. Since the final state offers many kinematic handles (in particular for invariant mass cuts), we expect to be able to suppress backgrounds efficiently even for hadronic Higgs decays. We will, however, require the top quark to decay semileptonically to avoid the QCD multi-jet background. Hence, our final state consists of five b jets, one hard lepton, and missing energy. The dominant background for the + 5b + / E T final state is inclusive tt + jets production. Other potential backgrounds such as W + jets or QCD multi-jet production can be efficiently suppressed by requiring b tags and by requiring two jet pairs to have invariant masses close to the Higgs mass m h . We have also considered other decay modes of the Higgs bosons and the top quark, but found them to be less favorable due to small branching ratios and due to the relative softness of leptons from Higgs decays.
For the simulation of the signal, we use MadGraph 5 v2.3 [67] to compute leading order cross sections and to generate events, which are then passed to Pythia 6.4 [68] for parton showering, MLM jet matching [69] and hadronization. We use a jet matching scale q = 30 GeV. As a detector simulation, we employ Delphes 3.1.2 [70]. Background events from tt + jets production are generated by Sherpa+OpenLoops [71][72][73][74]. We simulate events with zero or one jet at next-toleading order (NLO) level, and we allow for up to three jets simulated at leading order (LO) using the MC@NLO multi-jet merging algorithm. We treat c and b quarks as heavy in the parton shower. To confirm that background other than tt+jets are negligible, we also simulated V +jets, V V +jets, tt + V , tt + h and single-top production, with V, V = W, Z, using Sherpa + BlackHat [75]. The first of these backgrounds is treated at NLO accuracy, while the others are computed at LO. As expected, all of them are tiny after cuts, mainly due to a lack of b-tagged jets.
Our analysis pipeline starts with a set of preselection cuts: we require exactly one reconstructed isolated and positively charged lepton with transverse momentum p T ≥ 10 GeV and pseudorapidity |η| ≤ 2.4, and at least five jets with p T ≥ 20 GeV and |η| ≤ 2.5. Jets are reconstructed with the anti-k T algorithm with cone radius R = 0.5. At least four jets need to be b-tagged, assuming a tagging-efficiency of 70% and a misidentification rate of 1% for jets initiated by a light quark or a gluon [76,77]. In principle, the desired t + (h → bb)(h → bb) final state leads to 5 b jets, but due to the limited b-tagging efficiency, requiring only 4 b jets improves the sensitivity.
The longitudinal momentum component p z ν of the neutrino is reconstructed by using the on-shell condition for the W -boson, m 2 ν = m 2 W : Here, m ν is the invariant mass of the charged lepton and the neutrino, E and p T are the energy and transverse momentum of the charged lepton, and / p T is the missing transverse momentum. We assume here that the neutrino is the only source of missing energy. Note that eq. (15) has two solutions, which may be complex. To break the ambiguity, and to associate each jet with a particular parent particle (one of the two Higgs bosons or the top quark), we minimize the quantity over all possible associations between jets and parent particles and over the two possible values of p z ν . We use only the five leading jets in this procedure, and we do not distinguish between b tagged and untagged jets here. In the above expression, m (1) jj and m (2) jj are the invariant masses of jet pairs and m j ν is the invariant mass of a jet, the lepton and the neutrino. For the uncertainties in the denominators we take ∆m h = 12 GeV (the mass resolution for h → bb in CMS [1,78]) and ∆m t = 1.35 GeV (the width of top quark [79]). We have checked that varying ∆m h and ∆m t by O(1) factor relative to each other does not alter our results.
To sharpen the signal, and to reduce the background, we impose further cuts. The thre leading jets p T , denoted as j 1 , j 2 and j 3 , have to fulfill p T,j 1 > 140 GeV, p T,j 2 > 100 GeV, and p T,j 3 > 60 GeV. For the reconstructed invariant masses m j ν of the top quark and m (1) jj , m (2) jj of the Higgs bosons, we require 150 GeV < m j ν < 200 GeV and 100 GeV < m (1,2) jj < 150 GeV. The Higgs boson with the larger p T , which we will call h 1 , is required to have p T,h 1 > 300 GeV, and the Higgs boson with the smaller p T , called h 2 , has to satisfy p T,h 2 > 150 GeV. We finally require the angular separation ∆R h 1 ,h 2 bb between the two jets associated with the same Higgs boson decay to be not too large: we impose the cut ∆R max bb = max(∆R h 1 bb , ∆R h 2 bb ) < 1.5. In figs. 2 and 3, we show the kinematic distributions on which we cut, illustrating how our cuts help to separate the pp → thh signal from the pp → tt+jets background. From the p T distributions of the three leading jets ( fig. 2 (a), (b), (c)), we note that the signal (black solid) is in general harder than the background (red dashed), motivating our cuts on the jet transverse momenta. The reason background jets have on average smaller p T than signal jets is that pair produced top quarks are predominantly forward, while the signal is more central. The same behavior is also reflected in the distribution of the reconstructed transverse momenta of the two Higgs bosons ( fig. 2 (d) and (e)). Regarding the angular separation ∆R max bb , we find it to be slightly smaller for the signal than for the background, see fig. 2 (f). The reason is that for the signal, pairs of b jets originate from the same parent particle, while for the background no such correlation needs to exist. We have also considered the distribution of the number of jets, the pseudorapidity distributions of the reconstructed top quark and Higgs bosons, the angular separations ∆R between them, and the invariant mass of the hh system, but have found that these distribution do not offer additional handles to separate signal from background. Similarly, also attempts to identify background events based on the presence of two jets without b tags and with an invariant mass ∼ m W have not lead to an improvement of the sensitivity.
The most critical cuts in suppressing backgrounds in our search are those on the invariant masses m  jj . We see that, for signal events a pronounced peak is visible around the true Higgs mass m h ∼ 125 GeV. The background distributions reveal no such peak, but rather a broad bump around m (1,2) jj ∼ 125 GeV. This broad bump can be understood if we consider that the minimization procedure used to break combinatorial ambiguities (see eq. (16)) favors combinations in which pairs of jets have invariant masses close to m h by chance.
The cut flow of our analysis is summarized in table I. We find that the total leading order cross section for pp → thh is 56.2 fb at our benchmark point with y ut = y tu = 0.08. Including the branching ratios for t → b ν and h → bb, this decreases to 6.1 fb. Our cuts, especially the b tagging requirements and the mass window cuts, lead to a signal cross section after cuts of only 0.022 fb. The reason the b-tagging and mass window cuts reduce not only the background, but also the signal substantially is that due to combinatorial uncertainties it is likely that one of the two Higgs bosons is not properly reconstructed, for instance because one of the jets from its decay is very soft. Therefore, the final signal cross section is an order of magnitude smaller than the background cross section. Nevertheless, with sufficient luminosity, for instance at the high luminosity LHC,   (see table I), and in all subsequent panels, this cut is applied. All distributions are normalized to unity. a discrimination between signal and background may be possible. Using the CL s method [80] and assuming a systematic uncertainty of 30% on the signal and the background, we find that a luminosity of about 870 fb −1 is needed to exclude our benchmark point at the 95% CL.
This shows that, in the effective field theory framework, a search for the thh final state is inferior to the traditional searches for flavor violating top decays. In the next section, we will show that this conclusion changes when considering specific renormalizable models-in particular two Higgs doublet models-instead of the effective theory. Among the most studied extensions of the SM leading to flavor changing Higgs couplings are Two Higgs Doublet Models (2HDMs). As their name suggests, 2HDMs augment the SM with an additional scalar SU (2) L doublet [81] (see ref. [82,83] for recent reviews, refs. [84,85] for work  in the context of the Higgs discovery, and refs. [3,6,7,12,13,[18][19][20][21][22][23][24][25][26][27][28][29] for studies related to flavor violating Higgs couplings). In a 2HDM, we are free to change basis in the Higgs sector by forming linear combinations of the two doublets, and we will choose a basis (often called the Georgi basis) where only one of the Higgs doublets has a vev, v = 246 GeV. In this basis, the two Higgs doublets Φ 1 and Φ 2 can be decomposed into their component fields as [84], .
Here, the parameters µ 2 1 , µ 2 2 , λ 1 , λ 2 , λ 3 and λ 4 must be real, while µ 2 3 , λ 5 , λ 6 , and λ 7 can be complex. If V contains complex parameters, the quantity Im[Φ † 1 Φ 2 ] appears in the potential, violating the CP symmetry. The three neutral Higgs fields h 1 , h 2 and h 3 mix to form three physical states. In the absence of complex parameters in the scalar potential, i.e. in the CP conserving case, these physical states can be assigned definite CP parity: there is one CP odd Higgs boson A 0 = h 3 and two CP even Higgs bosons, the lighter of which is h h 1 and the heavier of which is H 0 h 2 . The condition that V is at its minimum value when Φ 1 = (0, v/ √ 2), Φ 2 = (0, 0), and that the derivatives of V in any direction in field space must vanish at this point lead to the relations From the second derivatives of V , the masses of the charged Higgs field H ± and the CP odd neutral Higgs A 0 follow as The two CP even Higgs fields h 1 and h 2 mix to form the mass eigenstates h and H 0 according to with the mixing angle α given by The masses of the physical CP even Higgs bosons are then From the requirement that the potential must be bounded from below, one can derive several further constraints on its parameters [86]: If λ 6 = λ 7 = 0, λ 5 should be replaced by |λ 5 | in the last inequality. Perturbativity moreover requires that |λ j | 4π for j = 1 . . . 7, and finally tree level unitarity imposes constraints, which we handle using 2HDMC [87].
It is often convenient to express the free parameters of the 2HDM in terms of physical observables to the extent possible. First, µ 2 2 can be expressed in terms of m H ± and λ 3 by virtue of eq. (20). The quartic coupling λ 2 is irrelevant to us because it only affects four-scalar interactions, which we are not interested in in this paper. λ 6 can be expressed in terms of λ 1 , λ 5 and the mixing angle α using eq. (22). λ 5 can be eliminated if we assume m A 0 = m H ± , which is preferred by custodial symmetry. In this case, we have λ 5 = λ 4 /2, see eq. (20). λ 1 and λ 4 can then be expressed in terms of m h and m H 0 according to eq. (23). This leads to the relations Note that there are two solutions for λ 1 and λ 4 . The + (−) solution is valid for cos 2α < 0 (> 0). Since the SM-like nature of the Higgs boson observed at the LHC requires α to be small, we use the second solution, with the minus sign on the right hand side of the expression for λ 1 and the plus sign on the right hand side of the expression for λ 4 . We illustrate the dependence of λ 1 and λ 4 on the physical mass variables in fig. 4. From the distributions in panel (a), we read off that λ 1 , which is only a function of m h , m H 0 and sin α, is always well within the perturbative regime if sin α 0.5 (as required by LHC constraints) and m H 0 ≤ 1.5 TeV. From the contour plot of λ 4 as a function of m H 0 and m H ± , fig. 4 (b), we can see that when m H ± ∼ m H 0 , then λ 4 is small as well. In our analysis, we choose m H ± = m H 0 to ensure λ 2 4 4π and to eliminate one additional parameter.
We briefly summarize how we have reduced the parameter space of the scalar potential. We started with 10 parameters, namely λ 1 , . . . λ 7 and µ 1 , . . . µ 3 . Of these, µ 1 and µ 3 are eliminated by the conditions minimizing the potential, eq. (19). λ 2 is important only for Higgs boson selfinteractions, which are irrelevant for our purposes. We have seen that the seven remaining parameters can be reexpressed in terms of sin α, m h , m H 0 , m A 0 , m H ± , λ 3 and λ 7 . Knowing that the SM Higgs mass is about 125 GeV, and assuming m H 0 = m H ± = m A 0 we have reduced the number of free parameters in the Higgs potential to four.
Most important to us in this paper is the Yukawa sector of the 2HDM. In the most general case (type III 2HDM), both Higgs doublets can couple to all fermions via arbitrary complex Yukawa matrices. While in the Georgi basis only the couplings of Φ 1 contribute to fermion masses, both Higgs doublets contribute to the Yukawa couplings of the physical Higgs bosons. Thus, the mass matrices and the Yukawa coupling matrices can be easily misaligned in flavor space, inducing flavor violation. Specifically, for the up-type quarks, the Yukawa couplings are withΦ k ≡ iσ 2 Φ † k . As before, we will use the notations i, j = 1, 2, 3 and i, j = u, c, t, . . . interchangeably, and we will omit the subscript u on η ij u,1 and η ij u,2 where doing so is unambiguous, e.g. η ut 2 ≡ η 13 u,2 . We will assume for simplicity that all flavor violating Yukawa couplings are real. After electroweak symmetry breaking, and working in the mass basis where η ij u,1 ∝ δ ij , this becomes with In the alignment limit sin α 0, h has SM-like couplings while large flavor violation can arise for H 0 . The decay rates of h and H 0 into the flavor violating final statestu andūt are given by Other important decay rates of the heavy Higgs boson H 0 are where again x a ≡ 4m 2 a /m 2 H 0 with a = t, W, Z, h. In the last expression, g H 0 hh is the coupling constant of the term It is given by g H 0 hh = 3 sin α cos α λ 7 2 sin α − λ 1 cos α + 1 2 λ 3 + λ 4 + 2λ 5 sin α 3 cos 2 α − 1 In the second equality, we have used the relations in eqs. (25) to (28) and the assumption m H 0 = m H ± = m A 0 , to express g H 0 hh in terms of sin α, m H 0 , λ 3 , and λ 7 . We have also expanded in sin α, and we see that in this case, g H 0 hh is dominantly determined by m H 0 and λ 3 . Note that, since Φ 2 does not acquire a vev, the decay rates for H 0 → W W, ZZ are just the corresponding decay rates in the SM, with an additional suppression factor sin 2 α arising from H 0 -h mixing. In the following, we will for simplicity assume that the diagonal entries of η u,2 , i.e. the diagonal couplings of Φ 2 to up-type quarks, vanish. In this case, also the rate for H 0 → tt is given by the SM rate, suppressed by sin 2 α.

III.2. Constraints on the Quark Flavor Violating 2HDM
Low energy flavor experiments impose the strongest limits on flavor violating couplings not involving the top quark (or the τ lepton). The only flavor constraints relevant to us are those from neutral meson mixing and radiative b quark decays. In particular B d -B d receives contributions from a t-H ± -W loop, which leads to the constraints |η tu 2 | O(1) and |η ut 2 | O(0.01) for m H + = 500 GeV. The radiative decay b → dγ further constrains |η ut 2 | few × 10 −3 for m H + = 500 GeV [88,89]. In the following, we choose η ut 2 = 0 to avoid this flavor constraints. With this choice, the most important bounds on the flavor violating Yukawa coupling η tu 2 come from LHC searches. As we have seen in section II, direct searches for t → qh decays require |y tu h | < 0.12. This can be translated into a limit on η tu 2 and the neutral CP-even Higgs mixing sin α using eq. (31). This limit is shown as an orange exclusion region in fig. 5. Panel (a) in this figure shows constraints as a function of the heavy Higgs mass m H 0 and of η tu 2 , panel (b) as a function of m H 0 and sin α, and panel (c) as a function of sin α and η tu 2 . In all three panels, we have assumed η tc 2 = η ct 2 = 0. Current CMS and ATLAS analyses do not directly search for flavor violating couplings of heavy Higgs bosons. However, CMS [66] and ATLAS [90] have searched for final states with same-sign di-leptons and b jets at 8 TeV. In the type III 2HDM, this final state could arise due to the top-up-H 0 interaction, for instance in the process g + u → (t → b ν) + (H 0 → W W, ZZ,tu,ūt). We have recast the CMS and ATLAS searches from refs. [66] and [90], generating pp → tt and pp → tH 0 events with the same simulation tools as in section II.2. In the presence of top flavor violating Higgs couplings, the first of these processes (same-sign top production) receives contributions from t-channel exchange of h, H 0 , while in the second process, the dominant parton level interaction is gu → tH 0 , with an up quark in the s-channel. Note also that a pp → tt signal would dominate by far over the corresponding pp →tt signal because the former is initiated by valence quarks. For the same reason, also pp → tH 0 is much stronger than pp →tH 0 . Finally, note that the process pp → th is irrelevant because its cross section is suppressed by sin 4 α compared to the cross section for pp → tH 0 , and because h can only decay to W W * , ZZ * , but not to on-shell gauge bosons. The results of our recasting are shown as red and blue exclusion regions in fig. 5. From the figure, we see that the CMS limit using only 10.5 fb −1 of integrated luminosity is somewhat stronger than the ATLAS limit using 14.3 fb −1 in the same channel. The reason is that CMS separates the data into events with two positive leptons (the dominant final state for our signal) and events with two negative leptons, while ATLAS only shows them combined, thus diluting the sensitivity.  Additional constraints on 2HDMs arise from direct searches for heavy Higgs bosons decaying to W W or ZZ [91,92,94,95]. We have recast the ATLAS searches [91,92] (green and gray regions in fig. 5), considering H 0 production both through gluon fusion and through the flavor violating processes pp → tH 0 ,tH 0 . Since gluon fusion is possible even without flavor violating couplings, non-trivial constraints can be expected even for η tu 2 = 0. In their H 0 → ZZ search [92], ATLAS have separated their event sample according to the H 0 production mode (gluon fusion vs. vector boson fusion (VBF)) and according to the decay channel (4 , 2 2ν and 2q2 + 2q2ν). We find that in our model, where production is dominated by pp → tH 0 (except at very small η tu 2 ), the strongest constraint is achieved for VBF events in the 4 category. The reason is that our signal is often vetoed in the gluon fusion channels which require low jet multiplicity, while cuts are rather loose in the 4 category. The ATLAS H 0 → W W search [91] also separates the data into gluon fusion and VBF categories, with the former required to have a jet multiplicity N j = 0 or 1, and the latter to have N j ≥ 2. To purify the VBF sample, a strong cut m jj > 500 GeV is imposed on the invariant mass of the two leading jets in the N j ≥ 2 sample, and their rapidity difference is required to be be |∆η jj | > 2.8. These strong cuts make the VBF event sample rather insensitive to our signal, and we therefore include only the N j = 0 and N j = 1 event categories in our recasting. The CMS search for heavy Higgs bosons in ref. [95] includes both the H 0 → W W and H 0 → ZZ channels, but is based on less data (up to 5.1 fb 1 at √ s = 7 TeV and up to 5.3 fb −1 at √ s = 8 TeV) than the ATLAS searches [91,92] which employ about 20 fb −1 of 8 TeV data each. Therefore, we do not include CMS results here.
One may wonder whether relevant constraints may be obtained from non-resonant di-Higgs production in the process uū → hh, mediated by a t-channel top quark. We have computed the cross section for this process, but find it to be negligibly small. In fact, ref. [96] gives a 95% CL upper limit of 0.69 pb on the cross section of non-resonant di-Higgs production. This translates into the limit η tu 2 sin α < 1.09, which is much weaker than the constraint from the exotic top decay t → hq.
The neutral Higgs boson mixing angle α is also constrained because of the cos 2 α suppression of the hW + W − and hZZ couplings. A global analysis of Higgs couplings at the LHC suggests sin 2 α < 0.34 at 95% CL [93]. Finally, we have checked the electroweak precision constraints using the oblique parameters S, T and U [79,97,98] (black dot-dashed curves in fig. 5). We employed the program 2HDMC [87] for this comparison. Since the second Higgs doublet Φ 2 does not violate custodial symmetry and since we have assumed m H 0 = m A 0 = m H ± , the resulting constraints are rather weak.
In summary, fig. 5 shows that LHC searches for t → hu impose the strongest constraints on the quark flavor violating 2HDM at m H 0 few 100 GeV. At smaller masses same-sign di-lepton searches and direct searches for heavy Higgs bosons become more important. Overall, values of η tu 2 few × 10 −1 are still allowed.

III.3.1. Production Cross Sections and Decay Rates
Let us consider again the thh final state arising from quark flavor violating couplings in the Higgs sector. We will demonstrate in the following that, in the 2HDM, this final state may offer superior sensitivity to quark flavor violating Higgs couplings already in Run II of the LHC.
If the heavy neutral CP even Higgs boson H 0 is significantly heavier than the light one, h, it can be integrated out, reducing the 2HDM to the effective Lagrangian eq. (7). However, it follows from relations (25)- (26) and from the requirement λ j < 4π that going to the limit m H 0 , m H ± h requires fine-tuning. If H 0 is not too heavy, it can be produced on-shell and decay to two h particles, so that the process pp → t + (H 0 → hh), which does not exist in the effective theory from section II, contributes to the thh final state as well. The corresponding Feynman diagrams are given in fig. 6.
In order to study thh production in the 2HDM in detail, we will consider two benchmark points within the parameter space of the model. (We will also discuss how our results are affected when departing from these benchmark points.) To find suitable benchmark points, we first use the fact that the SM-like nature of the light Higgs boson h indicates small mixing sin α. Based on fig. 5, we use the benchmark value sin α = 0.2. To further simplify the high dimensional parameter space, note that λ 7 affects the coupling g H 0 hh in eq. (40) only at order sin 2 α. Therefore, we always take λ 7 = 0 in the following for simplicity. For λ 3 , which enters g H 0 hh at O(sin α), we use the benchmark values 0 and −3. For positive values of λ 3 , partial cancellation would occur in the coefficient of sin α in eq. (40), reducing the H 0 → hh branching ratio compared to the λ 3 ≤ 0 case. For negative λ 3 , this branching ratio is enhanced. Note also that both λ 7 and λ 3 affect only the H 0 → hh rate, but not their decay rates into other final states or the H 0 production cross section. Finally, to best illustrate the impact of the flavor violating decay channel of H 0 , we always choose η tu 2 close to the current upper limit from fig. 5. For sin α = 0.2, we take η tu 2 = 0.6 as our benchmark value. As  fig. 4 (b))   we can see from fig. 5, these values are somewhat above the upper limit for m H 0 500 GeV, and below the upper limit for m H 0 500 GeV. We summarize our benchmark assumptions in table II. The t + H 0 production cross section and the cross section for the process pp → t + (H 0 → hh) are plotted in fig. 7 as a function of m H 0 . The dependence of the branching ratios of H 0 on its mass is illustrated in fig. 8. We see that the H 0 → tū,tu channels dominate for not too large m H 0 since the branching ratios for all other channels are proportional to sin 2 α. At very large m H 0 , the decay channels to bosonic final states, H 0 → W W, ZZ, hh become dominant since their rates are proportional to m 3 H 0 /v 2 , i.e. they grow with the third power of m H 0 .

III.3.2. Search Strategy and Sensitivity Estimates for the 13 TeV LHC
We now adapt our single top plus di-Higgs search from section II.2 to the 2HDM, including in particular on-shell t + H 0 production through flavor violating couplings, followed by the decays t → b ν, H 0 → hh, and h → bb. In principle, we should also consider other H 0 decay channels. In fact, as we saw in fig. 8, the dominant decay channel is the flavor violating one, H 0 → tq, in vast regions of parameter space. However, if at least one of the top quarks in the final state decays hadronically, the background from tt production is huge. If both top quarks decay semileptonically, pp → t + (H 0 → tq) contributes to the same-sign top sample. Unfortunately, the small branching ratio into the semileptonic top decay channel renders the total event number tiny. Similarly, t + (H 0 → W W, ZZ) events are difficult to extract because of small event numbers and/or large backgrounds (see our recasting of the corresponding ATLAS and CMS searches [91,92,94,95] in fig. 5).
For the thh final state, we apply similar cuts as in section II.2. However, since the signal is dominated by events with an on-shell H 0 , we add a cut on the invariant mass m hh of the two reconstructed Higgs bosons. More precisely, we require |m hh − m test H 0 | < 0.1m test H 0 , where m test H 0 is the heavy Higgs mass being tested. We also modify the ∆R max bb cut slightly, and we soften the cut on p h 1 T . This softening helps to keep more signal events, while background events are still suppressed efficiently by the m H 0 cut which could not be used in the EFT case. The cut flow for this adapted analysis is shown in table III.
We display the expected 95% CL sensitivity of our search in fig. 9 as a function of m H 0 for the two parameter points from table II. We assume signal and background uncertainties of 30%. We see that discovery prospects are best at m H 0 ∼ 400-800 GeV. In particular, a search for the t + (H 0 → hh) final state would be superior to the traditional searches for t → hu in this mass range. At smaller m H 0 , the branching ratio for H 0 → hh limits the sensitivity (see fig. 8), while at larger m H 0 , the production cross section peters out (see fig. 7).

IV. H 0 → τ µ DECAY IN THE 2HDM
We now move from Higgs couplings violating quark flavor to couplings violating lepton flavor. This is motivated for instance by the recent CMS excess in the h → τ µ channel [57]. Since in the 2HDM, any such flavor violation would be related to a misalignment in flavor space between the Yukawa couplings of Φ 1 and Φ 2 , we expect flavor violating effects also for the heavy neutral Higgs boson H 0 . Moreover, in view of the required smallness of sin α, an observable τ µh coupling requires a much larger τ µH 0 coupling and thus a large branching ratio for H 0 → τ µ. Our goal in  Table III. Cut flow table for the thh signal in the 2HDM for m H 0 = m H ± = m A = 500 GeV, sin α = 0.2, λ 3 = −3, λ 7 = 0, η ut 2 = 0 and η tu 2 = 0.6. If we use λ 3 = 0 instead, we find a signal cross section before cuts of σ prod = 192.93 fb, and a signal cross section after cuts of σ final = 0.508 fb. The cut efficiencies remain unchanged.
the following is to constrain this decay channel of the heavy Higgs boson using LHC data.

IV.1. Production Cross Sections, Decay Rates and Indirect Constraints
We consider the lepton Yukawa couplings Here, L i L are the left-handed lepton doublets and e j R are the right handed charged lepton fields. Since we will see that the coupling of H 0 to top quarks is crucial for the phenomenological of lepton flavor violating processes, we will also allow η tt u,2 to be nonzero (see section III.1). After electroweak symmetry breaking, the lepton Yukawa couplings turn into where i, j = 1, 2, 3 or e, µ, τ . We will again omit the subscript to unclutter the notation where this is unambiguous. We will also assume for simplicity that the only non-zero elements of η ,2 are η µτ ,2 and η τ µ ,2 , and that both are identical and real. The decay rates for H 0 → τ µ and h → τ µ are The CMS search for h → τ ± µ ∓ [57] constrains the combined branching ratio for these channel to be < 1.51% at 95% CL. This translates into a bound sin α (|η µτ 2 | 2 + |η τ µ 2 | 2 ) 1/2 ≤ 0.0050. The best fit value for the branching ratio is BR(h → τ µ) = (0.84 +0. 39 −0.37 )%. The strongest indirect constraint on η µτ 2 and η τ µ 2 comes from searches for the rare decay τ → µγ [11,14,99]. To quantify this constraint, we follow refs. [14,18] and work with the effective operators where c L , c R are Wilson coefficients, and the dimension-5 operators Q Lγ , Q Rγ are given by c L and c R receive contributions from 1-and 2-loop diagrams involving h, H 0 , A 0 , and H ± . The 2-loop contributions are comparable to the 1-loop terms because the latter are suppressed by the small τ Yukawa coupling [11,14]. The loop diagrams involving only h or only H 0 are given by the expressions in the appendix of ref. [14] (adopted from [100]), with the Yukawa matrices in these formulas identified with y ,h and y ,H from eq. (43) for leptons and with y tt u,h and y tt u,H from eqs. (31) and (32) for top quarks. Note that diagrams containing h and H 0 tend to cancel each other. The reason is that each diagram contains one flavor violating Yukawa coupling and one flavor conserving one. According to eq. (42), this implies that most of the diagrams involving h differ from their counterparts involving H 0 by a minus sign. (The only exception can be diagrams with h or H 0 coupled to a top quark loop if (m t /v) sin α < η tt 2 / √ 2.) The diagrams involving only A 0 are obtained in a similar way [22], by replacing the up quark Yukawa couplings in the expressions from ref. [14] by −iη ij u,2 / √ 2 and the lepton Yukawa couplings by iη ij ,2 / √ 2. Since we assume that the only nonzero Yukawa couplings of Φ 2 are η µτ 2 , η τ µ 2 and η tt 2 , and that A 0 does not mix with the other neutral Higgs bosons, the only relevant diagrams containing A 0 are the 2-loop Barr-Zee diagrams with a top quark loop. Regarding the neglected diagrams involving H ± , we estimate that their contribution is of the same order as that of diagrams involving only H 0 or only A 0 , and we will include this uncertainty in our plots.

IV.2. LHC Constraints on H 0 → τ µ in the 2HDM
If we assume the direct Yukawa couplings of the second Higgs doublet Φ 2 to light quarks to be small like for the SM-like Higgs, the dominant H 0 production channel in the lepton flavor violating 2HDM is gluon fusion, with the cross section Here, σ(gg → h) SM m h =m H 0 is the gluon fusion cross section in the SM if the Higgs mass is set to m H 0 [101]. The first term in parentheses arises from mixing between the two CP even neutral Higgs bosons h 1 and h 2 (see eq. (17)), while the second one is related to the direct coupling of a top quark loop to h 2 . If other diagonal Yukawa couplings of Φ 2 to quarks besides η tt 2 are sin α, they would contribute as well. For instance, if Φ 2 has a large coupling η bb 2 to bottom quarks, the factor in parentheses in eq. (48) would receive an extra contribution ] is a loop function [102][103][104]. The factor F (x) is given by We will, however, not consider this possibility here and assume η tt 2 to be the largest diagonal element of η 2 in the following. With the additional simplifying assumptions η µτ 2 = η τ µ 2 (see section IV.1), the overall process pp → H 0 → µτ depends on four parameters: m H 0 , sin α, η tt 2 , and η µτ 2 = η τ µ 2 We show the production cross section σ(pp → H 0 ) in fig. 10, and the branching ratios into the most relevant decay channels for two benchmark points in fig. 11.
In the following, we use the data from the CMS search for h → τ µ [57] to constrain also the decay H 0 → τ µ. In particular, we compare CMS' measured distributions of the reconstructed collinear mass m coll µτ of the µ-τ system to the predicted signals at various values of m H 0 . This comparison is shown graphically in fig. 12, which leads us to expect that constraints will be competitive for m H 0 250 GeV, while at larger masses the signal will be too small compared to the background. To produce fig. 12 and for the subsequent statistical analysis, we have simulated inclusive samples of pp → H 0 → τ µ events and applied the cuts from [57]. We have used the same simulation tools as in sections II.2 and III.2. The predicted SM background distributions are taken from [57]. For search channels with hadronic τ decays (denoted here as τ h ), the background is dominated by events with fake leptons, while for events with leptonic τ decays (denotes as τ e ), the dominant backgrounds are Z → τ τ , di-boson production, fake leptons, and, in event categories that allow for extra jets, tt. We have checked that, when setting m H 0 = 125 GeV and sin α = 1, we obtain a limit that is about 15% weaker than the CMS limit BR(h → µτ ) < 0.015, and we also confirm that the null hypothesis of the branching ratio being zero is disfavored at ∼ 2σ. We surmise that the reason why our limit is slightly weaker than the CMS limit is the omission of the event categories with ≥ 2 jets in our analysis.   Figure 11. Branching ratios of the heavy neutral CP-even Higgs boson H 0 as a function of its mass m H 0 for two different parameter points of the lepton flavor violating 2HDM. We assume here a scenario with large lepton flavor violation in the µ-τ sector, as expressed by the Yukawa couplings η µτ 2 , η τ µ 2 of the second Higgs doublet.
The main results of this section are presented in figs. 13 and 14. In the former figure, we show our limits on the cross section σ(pp → H 0 → τ µ), while in the latter we interpret these limits in the context of the lepton flavor violating 2HDM, translating them into constraints on the parameters sin α, η τ µ 2 = η µτ 2 , and m H 0 of the model. We also compare to the limits from τ → µγ and from flavor violating decays of the light Higgs boson via h → µτ (see section IV.1). At low m H 0 , where  (c) (d) Figure 12. Distribution of the collinear mass m coll µτ after cuts for the SM background from [57] and for a hypothetical H 0 → τ µ signal at various values of m H 0 . We work in the context of the lepton flavor violating 2HDM, with the parameters indicated in the plots. The panels on the left are for events with leptonic τ decays τ → eνν, denoted here as τ e , while the panels on the right include only events with hadronic τ decays, denoted as τ h . In the upper row we show events with no extra jets with p T > 30 GeV, |η| < 4.7, while in the bottom row we require exactly one such jet. We do not include event categories with ≥ 2 jets in our analysis.
BR(H 0 → τ µ) is large, our limits are in general stronger than these indirect constraints. When m H 0 > 2m W , the branching ratio to the µ-τ final state drops rapidly, and consequently also the sensitivity of our search is diminished. Nevertheless, it can still provide the strongest constraints if sin α is small and η tt 2 is sizeable and negative. In this case, H 0 production through gluon fusion is significant, while BR(h → τ µ) and BR(τ → µγ) are suppressed by sin α. Note that our limits are in general strongest for negative η tt 2 because in this case, the two terms on the right hand side of eq. (48) interfere constructively.  Figure 13. 95% CL limit on the cross section for a pp → H 0 → τ µ signal as a function of m H 0 , obtained by recasting the results from [57].

V. SUMMARY AND CONCLUSION
In summary, we have discussed several so far unexplored signatures related to flavor violating Higgs couplings. For the case of flavor violation in the quark sector, we have studied the t + hh (single top plus di-Higgs) final state, working first in an effective field theory (EFT) framework with operators of the form Q i LH u j R (H † H) and then in a Two Higgs Doublet Model (2HDM). In the EFT case, we find that only the high-luminosity LHC may be sensitive to t + hh production, while in the 2HDM, discovery prospects are excellent already in Run II, with O(300 fb −1 ) of 13 TeV data, see fig. 9. In particular, the expected limits from our proposed search can surpass those from the traditional search for t → hq decays by almost an order of magnitude. The reason for the enhanced discovery reach for t + hh events in the 2HDM is the contribution from the process pp → t + (H 0 → hh), where H 0 is the heavy CP even neutral Higgs boson.
We have considered also flavor violation in the lepton sector, as motivated in particular by the recent CMS excess in the h → τ µ channel. Perhaps the simplest explanations of this excess is provided by the 2HDM, where it is related to the possibility of large flavor violating couplings of the second (heavy) Higgs doublet, which mixes with h. Consequently, we have studied direct production of heavy Higgs bosons and their flavor violating decay in the process pp → H 0 → τ µ. We have used existing CMS data to search for this process and to set new limits on the lepton flavor violating 2HDM. Our limits are summarized in fig. 13.
To conclude, our study opens up new avenues to search for flavor changing processes in the Higgs sector. It shows that searches in specific models-in this case the 2HDM-can be orders of magnitude more sensitive than searches based only on higher dimensional operators. To illustrate this point, we have derived limits on flavor violating couplings of heavy Higgs bosons by recasting the CMS search for h → τ µ.   Figure 14. 95% CL constraints on the parameter space of the lepton flavor violating 2HDM. We show results from our search for H 0 → τ µ based on recasting [57] (green contours), from the CMS search for h → τ µ (red exclusion regions and blue 1σ preferred regions) [57], and from τ → µγ limits (brown/orange). For the τ → µγ amplitude, we estimate that the contribution A H ± of diagrams involving H ± is of the same order as the contribution A H 0 ,A 0 of diagrams involving H 0 or A 0 (solid curves, shaded regions). The uncertainty of this rough approximation is estimated by also showing the constraint in case the H ± contribution is twice as large (dot-dashed curves) or cancels A H 0 ,A 0 exactly (dashed curves). Note that typically not all of these curves are visible within the chosen plot ranges. The panels on the left show constraints on the heavy Higgs mass m H 0 and the flavor violating Yukawa coupling η µτ 2 = η τ µ 2 , the panels in the middle column show m H 0 vs. the neutral Higgs mixing sin α, and the panels on the right display sin α vs. η µτ 2 = η τ µ 2 . The three rows of plots correspond to different values of the top quark Yukawa coupling to the second Higgs doublet, η tt 2 . This coupling affects H 0 production through gluon fusion and the two-loop contributions to τ → µγ.