1 Introduction

The discovery of the Higgs boson [1, 2] not only completes the Standard Model (SM) of particle physics, but also opens new windows for searches for new physics through the Higgs portal. Although current experimental results [3,4,5,6,7,8] indicate a preference for a SM-like Higgs boson, more precise measurements are required in order to determine its true nature and whether or not it has new physics properties. Three electron-positron colliders, the Circular Electron-Positron Collider (CEPC) [9], the FCC-ee, formerly known as TLEP [10], and the International Linear Collider (ILC) [11], have been proposed by different high-energy communities, aiming to precisely study the Higgs boson properties. They are designed to operate at 240–250 GeV and to collect a large sample of Higgs bosons, mainly through the \(e^+e^-\rightarrow ZH\) process. The large number of Higgs bosons produced in a clean environment will allow measurements of the cross section of the Higgs production [12] as well as its mass [13,14,15], decay width [16] and branching ratios [13, 17,18,19] with precision far beyond that of the Large Hadron Collider (LHC). Such machines will also provide opportunities to search for new particles such as new multi-quark states [20], dark photons [21], dark matter particles [22,23,24,25,26], heavy neutrinos [27,28,29,30] and supersymmetric particles [31], and also to probe new physics scales via Higgs and electroweak observables [32,33,34,35,36,37,38,39,40]. In this work, we focus on the charged lepton flavor violating (CLFV) Higgs decays \(H\rightarrow e^\pm \mu ^\mp \), \(e^\pm \tau ^\mp \) and \(\mu ^\pm \tau ^\mp \).

The CLFV Higgs decays are interesting, because their observation may provide insight into some fundamental questions in nature, e.g., whether there is a secondary mechanism for the electroweak symmetry breaking [41], why the neutrino masses are tiny [42], and whether there is an extra dimension responsible for the gauge hierarchy generation [43]. They have thus attracted a lot of attention from both theorists and experimentalists [44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64]. The CMS collaboration reported the first hint of charged lepton flavor violation in the \(H\rightarrow \mu ^\pm \tau ^\mp \) channel with a significance of 2.4 standard deviations [63, 64]. Although this signal disappeared later [63, 64], the CLFV Higgs decays are still worth studying with higher precision. On the one hand, the so-called flavor anomalies, which indicate lepton flavor non-universality, reported by the B factories and the LHCb collaboration [65,66,67,68], in some sense also imply lepton flavor violation (when the lepton mass matrices are diagonalised to obtain the physical states, unequal diagonal couplings with different leptons will lead to off-diagonal couplings). On the other hand, the B meson decay channels in which the flavor anomalies are observed are always polluted by complicated strong dynamics, while the much cleaner CLFV Higgs decay channels will provide a better chance to study the mechanism generating the lepton flavor violation or non-universality once they are discovered. The potential to search for such decay channels at the High Luminosity LHC (HL-LHC) has been estimated in [69]. In this paper, we study the sensitivity of the three lepton colliders in measuring the CLFV Higgs decays based on the detector simulation of the signal events of the decay channels and the corresponding background.
There are already some studies on the ILC measurement of \(H\rightarrow \mu ^\pm \tau ^\mp \) [69,70,71,72], and the difference between our paper and theirs will be discussed.

The paper is organized as follows: in Sect. 2, we perform the detector simulation of the signal and background events for the three CLFV Higgs decay channels at the CEPC and obtain the upper bounds on the three decay rates, based on which we estimate the corresponding upper bounds expected to be given by the FCC-ee and the ILC; in Sect. 3, we derive the constraints on theory parameters including the CLFV Higgs couplings, the relevant parameters in the type-III two-Higgs-doublet-model (2HDM) and the new physics cut-off scales in the SM effective field theory (SMEFT), in Randall-Sundrum (RS) models and in models with heavy neutrinos; we summarize in Sect. 4.

2 Simulation and analysis

In this section, we evaluate the possible reach of the three colliders with \(\sqrt{s}\) = 240–250 GeV, the CEPC, the FCC-ee and the ILC, in measuring the branching ratios of the three CLFV Higgs decays. Investigating the dominant Higgs production process \(e^+e^-\rightarrow ZH\), and considering only hadronic decays of the Z boson into a quark pair, we expect that the signal events each contain two charged leptons of different flavors and two jets. Therefore, the four-fermion processes will form the major SM background. The signal processes \(e^+e^-\rightarrow Z(\rightarrow q{\bar{q}})H(\rightarrow e^\pm \mu ^\mp ,e^\pm \tau ^\mp ,\mu ^\pm \tau ^\mp )\) are simulated via MadGraph v2.5.2 [73], with the corresponding three CLFV Higgs vertices implemented. The background events are generated via WHIZARD 2.5.0 [74, 75]. PYTHIA 6.4 [76] is then used to manage hadronization and parton showers for both the signal and background events. Finally, Delphes 3.4.1 [77, 78] is adopted for detector simulation. Note that we only carry out the above simulation procedure for the CEPC with \(\sqrt{s}\) = 240 GeV and an integrated luminosity of 5 \(\hbox {ab}^{-1}\). We assume that the FCC-ee, also running at \(\sqrt{s}\) = 240 GeV, has the same integrated luminosity and a similar detector performance, and hence the CEPC results will apply to the FCC-ee. Unlike the CEPC and the FCC-ee, the ILC is planned to run at \(\sqrt{s}\) = 250 GeV with four polarization options: P1 (\(-0.8\), 0.3), P2 (0.8, \(-0.3\)), P3 (\(-0.8, -0.3\)) and P4 (0.8, 0.3). The numbers give the degree of polarization of the beams. For example, the first option (P1) means that the \(e^-\) beam is 80% left-handed polarized and the \(e^+\) beam is 30% right-handed polarized. The integrated luminosities for the four polarization options are (in \(\hbox {fb}^{-1}\)) 1350, 450, 100 and 100, respectively.
We find that the beam polarization will considerably change neither the angular distribution of the signal final states nor the statistical uncertainties owing to the background, which makes it possible to estimate the ILC sensitivity to the CLFV Higgs decays based on the CEPC simulation, as discussed in detail for each channel below in this section.

Table 1 The cross sections and event numbers of the four fermion processes belonging to different categories at the 240 GeV CEPC with an integrated luminosity of 5 \(\hbox {ab}^{-1}\). See text for details

For each CLFV Higgs decay channel, the production of 10000 signal events at the CEPC is simulated. We simulate the four fermion background events by WHIZARD with an integrated luminosity of 5 \(\hbox {ab}^{-1}\), giving the cross sections and the numbers of events for different categories in Table 1. The four fermion processes are classified into different categories according to their final states [79] as follows. We divide the processes into two groups. The first group contains the processes without (anti-) electron or (anti-) electron neutrino in their final states, and the processes in the second group have at least one (anti-) electron or (anti-) electron neutrino in their final states. We start with a process whose final state consists of two pairs of mutually charge conjugate fermions, like \(u{\bar{u}}\mu ^+\mu ^-\) or \(u{\bar{u}}e^+e^-\), and cannot arise from decays of two W bosons. If this process belongs to the first group, we classify it into the “ZZ” category; if it belongs to the second group, it is classified into “Single Z” (SZ). A process belonging to the first or the second group with two pairs of mutually charge conjugate fermions in the final state is classified into the “ZZ or WW” (ZZWW) category or the “Single Z or Single W” (SZSW) category, respectively, if its final state can arise from decays of two W bosons. The remaining processes in the first and second group are classified into “WW” and “Single W” (SW), respectively. More details can be found in [12]. We further divide each category of processes into different sets, according to whether the final states contain only quarks (“qq”), only leptons (“ll”) or both of them (“ql”). For example, \(u{\bar{u}}\mu ^+\mu ^-\) is classified into “ZZ\(\_\)ql”, while \(\nu _\mu {\bar{\nu }}_\mu \mu ^+\mu ^-\) is classified into “ZZWW\(\_\)ll”. Note that we only consider the four fermion processes listed in the footnote of Table 1.
The neglected processes do not contribute to the background after the chosen cuts are made.
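The classification rules above can be sketched in a few lines of Python. This is only an illustration, not the code used for Table 1: the function names are hypothetical, final states are encoded with PDG particle codes, and CKM mixing is neglected when deciding whether a final state can arise from two W bosons (both are assumptions made here).

```python
from itertools import combinations

QUARKS = {1, 2, 3, 4, 5}      # |PDG code| of d, u, s, c, b
ELECTRON_FAMILY = {11, 12}    # |PDG code| of e and nu_e
# (fermion, antifermion) pairs from W+ decay; CKM mixing neglected (assumption)
W_PLUS = {(2, -1), (4, -3), (12, -11), (14, -13), (16, -15)}

def _is_w_pair(a, b, sign):
    """True if particles a and b can come from a W boson of charge `sign`."""
    pair = (a, b) if sign > 0 else (-a, -b)  # charge-conjugate for W-
    return pair in W_PLUS or (pair[1], pair[0]) in W_PLUS

def can_be_ww(state):
    """True if the four fermions can be split into W+ and W- decay products."""
    idx = range(4)
    for i, j in combinations(idx, 2):
        k, l = [n for n in idx if n not in (i, j)]
        if _is_w_pair(state[i], state[j], +1) and _is_w_pair(state[k], state[l], -1):
            return True
    return False

def is_two_conjugate_pairs(state):
    """True if the final state consists of two fermion-antifermion pairs."""
    return sorted(state) == sorted(-p for p in state)

def classify(state):
    """Return the category label, e.g. 'ZZ_ql', for a four-fermion final state."""
    has_e = any(abs(p) in ELECTRON_FAMILY for p in state)  # second group
    if is_two_conjugate_pairs(state):
        if can_be_ww(state):
            cat = "SZSW" if has_e else "ZZWW"
        else:
            cat = "SZ" if has_e else "ZZ"
    else:
        cat = "SW" if has_e else "WW"
    n_q = sum(abs(p) in QUARKS for p in state)
    suffix = {4: "qq", 0: "ll"}.get(n_q, "ql")
    return f"{cat}_{suffix}"

print(classify([2, -2, 13, -13]))    # u ubar mu- mu+
print(classify([14, -14, 13, -13]))  # nu_mu nubar_mu mu- mu+
```

The two calls correspond to the two examples given in the text, \(u{\bar{u}}\mu ^+\mu ^-\) and \(\nu _\mu {\bar{\nu }}_\mu \mu ^+\mu ^-\).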

In the following, for each CLFV Higgs decay channel, we set appropriate cuts on both the generated signal and background events. Owing to the fact that the signal events of different channels need to be reconstructed differently and thus suffer from different backgrounds, different event selection cuts are determined through the significance optimization for different channels. Based on the signal detection efficiencies and the background event numbers, the CEPC bounds on the decay branching ratios are evaluated first, which also apply to the FCC-ee. After that, we estimate the corresponding ILC bounds by scaling the luminosity times the signal and background cross sections, i.e., scaling the event numbers, for each polarization option.

2.1 \(H\rightarrow e^\pm \mu ^\mp \)

To reconstruct the signal events of \(e^+e^-\rightarrow HZ \rightarrow e^\pm \mu ^\mp Z\), we select the events with final states containing one electron, one muon, and two jets to reconstruct the Z boson, with the di-lepton (electron and muon) invariant mass \(m_{e\mu }\) close to the Higgs boson mass and the di-jet invariant mass \(m_{jj}\) close to the Z boson mass. This event selection condition requires the cuts 70 GeV \(< m_{jj}<\) 100 GeV and 117 GeV \(< m_{e\mu }<\) 127 GeV, which are displayed in Table 2. We use the di-jet to reconstruct the Z boson because the large hadronic decay rate of the Z boson \({\mathcal {B}}(Z\rightarrow q{\bar{q}})\approx \) 70% [80] ensures a high reconstruction efficiency. Besides, if we chose either an electron pair or a muon pair to reconstruct the Z boson, the leptons from the Higgs boson decay and those from the Z boson decay could get mixed up. Of course, one can always combine the analyses based on all the possible methods of reconstructing the Z boson to improve the statistics, but we will not do this here, as we only aim at an order-of-magnitude estimate of the upper bounds on the CLFV decay rates in this work.
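As an illustration of how these cuts act on a reconstructed event, the following minimal sketch computes the two invariant masses from four-vectors and applies the mass windows quoted above. The event format (tuples `(E, px, py, pz)` in GeV), the function names and the requirement of exactly one electron, one muon and two jets are assumptions made for this example; the actual analysis runs on Delphes output.

```python
import math

def invariant_mass(*four_vectors):
    """Invariant mass (GeV) of the sum of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(c) for c in zip(*four_vectors))
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding

def passes_emu_selection(electrons, muons, jets):
    """Apply the H -> e mu cuts: object counts and the two mass windows."""
    if len(electrons) != 1 or len(muons) != 1 or len(jets) != 2:
        return False
    m_jj = invariant_mass(*jets)
    m_emu = invariant_mass(electrons[0], muons[0])
    return 70.0 < m_jj < 100.0 and 117.0 < m_emu < 127.0

# Toy event (hypothetical numbers): m_emu = 125 GeV, m_jj = 91.2 GeV
electron = (62.5, 0.0, 0.0, 62.5)
muon = (62.5, 0.0, 0.0, -62.5)
jets = [(45.6, 45.6, 0.0, 0.0), (45.6, -45.6, 0.0, 0.0)]
print(passes_emu_selection([electron], [muon], jets))
```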

Table 2 The numbers of background events in different categories and signal events surviving the cuts in the analysis of \(H\rightarrow e^\pm \mu ^\mp \) at the CEPC. \(N_{e(\mu ,j)}\) represents the number of electrons (muons, jets) in the final state of an event. See text for details

In Table 2 we list how many background events in different categories and how many signal events are left after the cuts based on the CEPC simulation. We find that 4115 out of the 10000 generated signal events survive the event selection, and thus the signal detection efficiency is \(\epsilon \) = 41.15%. As for the background, only one event is expected after the event selection. Assuming that one event is observed, as expected, the upper limit on the number of signal events (a Poisson variable) is \(N_{95}\) = 3.74 at 95% confidence level (CL). Then, employing

$$\begin{aligned} {\mathcal {B}} < {N_{95}\over \epsilon N_{H}{\mathcal {B}}(Z\rightarrow q{\bar{q}})}, \end{aligned}$$
(1)

with \({\mathcal {B}}(Z\rightarrow q{\bar{q}})\approx \) 70% and \(N_{H}\) = 1.05 \(\times 10^6\) the number of the Higgs bosons to be produced by the CEPC, we evaluate the upper bound on the \(H\rightarrow e^\pm \mu ^\mp \) branching ratio,

$$\begin{aligned} {\mathcal {B}}(H\rightarrow e^\pm \mu ^\mp ) < 1.2\times 10^{-5}\;\text {at 95}\%~\text {CL.} \end{aligned}$$
(2)

This bound also applies to the FCC-ee.
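The arithmetic behind (1) and (2) can be reproduced with a short script. The statistical procedure is not spelled out in the text, so the sketch below assumes the classical Poisson upper limit on the total mean for the observed count, minus the expected background; this assumption reproduces \(N_{95}\) = 3.74 for one expected and one observed event. Function names are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(k <= n) for a Poisson variable with mean mu."""
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(n + 1))

def n95(n_obs, background, cl=0.95):
    """Upper limit on the signal mean: classical limit on the total mean
    (solved by bisection) minus the expected background (assumed procedure)."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > 1.0 - cl:
            lo = mid   # mean still too small to be excluded
        else:
            hi = mid
    return 0.5 * (lo + hi) - background

def br_upper_bound(n95_value, efficiency, n_higgs, br_z_qq=0.70):
    """Eq. (1): bound on the CLFV branching ratio."""
    return n95_value / (efficiency * n_higgs * br_z_qq)

limit = n95(n_obs=1, background=1.0)
print(round(limit, 2))
print(br_upper_bound(limit, efficiency=0.4115, n_higgs=1.05e6))
```

With the CEPC inputs quoted above, the script gives a bound of about \(1.2\times 10^{-5}\), in agreement with (2).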

As for the ILC, the ratios of the ZH production cross sections with the four polarization options to that of the CEPC are

$$\begin{aligned}&\sigma (ZH):~~\text {P1/CEPC = 1.48, P2/CEPC = 1.0,} \nonumber \\&\quad \text { P3/CEPC = 0.87, P4/CEPC = 0.65.} \end{aligned}$$
(3)

Recalling the integrated luminosities for the four options, we find that about \(5.21\times 10^{5}\) Higgs bosons will be produced at the ILC. Through a tree-amplitude analysis of the \(e^+e^-\rightarrow Z(\rightarrow jj)H(\rightarrow \ell ^\pm \ell '^{\mp })\) process, we find that the beam polarization will not change the angular distribution of the final state, and thus the signal detection efficiency \(\epsilon \) = 41.15% is also valid under the same event selection condition. The single background event listed in Table 2 arises from the process \(e^+e^-\tau ^+\tau ^-\). We calculate its production cross section with each polarization option, as we did for the signal, and find that the expected number of background events after 2 \(\hbox {ab}^{-1}\) of data is 0.38. Then, for the ILC, \(N_{95}\) = 3.3 and the upper bound on \({\mathcal {B}}(H\rightarrow e^\pm \mu ^\mp )\) is about 2.1\(\times 10^{-5}\) at 95% CL. The upper bound is improved by the introduction of beam polarization, mainly owing to the large ZH production cross section of P1, but the improvement is not large enough to compensate for the difference in integrated luminosity between the ILC and the CEPC (FCC-ee). One might object to estimating the background by scaling the total event numbers, since the beam polarization might change the \(e^+e^-\tau ^+\tau ^-\) angular distribution. We therefore test the impact of the polarization by assuming an extreme situation in which its role is so important that the background event number at the ILC is 0. Even in such a case, the upper bound is only reduced to 1.9\(\times 10^{-5}\), i.e., by less than 10%. The key point is that the CEPC background is already very small, so a further reduction of the background will not improve the upper bound significantly. Therefore, we conclude that, regarded as estimates, our results for the ILC upper bounds (also for the other two channels) are acceptable.
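The size of the zero-background effect follows directly from (1), which is linear in \(N_{95}\). As a sketch, assuming that \(\epsilon \) and the number of Higgs bosons are unchanged and that a background-free search with no observed event has \(N_{95} = -\ln 0.05 \approx 3.0\), one has

$$\begin{aligned} {{\mathcal {B}}_{b=0}\over {\mathcal {B}}_{b=0.38}} = {N_{95}(b=0)\over N_{95}(b=0.38)} \approx {3.0\over 3.3}\approx 0.91, \qquad 0.91\times 2.1\times 10^{-5}\approx 1.9\times 10^{-5}, \end{aligned}$$

where b denotes the expected number of background events.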

The upper bounds on \({\mathcal {B}}(H\rightarrow e^\pm \mu ^\mp )\) expected to be given by the CEPC, the FCC-ee and the ILC are displayed in Fig. 1, together with the present upper bound \({\mathcal {B}}(H\rightarrow e^\pm \mu ^\mp ) < 0.035\%\) at 95% CL reported by the CMS collaboration [81]. We find that the three future lepton colliders are expected to improve the precision by about 30 times compared to the present CMS measurement, and also by one order of magnitude compared to the expected HL-LHC upper bound \({\mathcal {B}}(H\rightarrow e^\pm \mu ^\mp ) < {\mathcal {O}}(0.02)\%\) [69].

Fig. 1

The upper bounds at 95% CL on the three CLFV Higgs decay rates (left) and the corresponding CLFV couplings (right) given by the LHC [64, 81] (red), compared to the corresponding bounds expected to be given by the CEPC, the FCC-ee (green) and the ILC (blue). See text for details

2.2 \(H\rightarrow e^\pm \tau ^\mp \)

To reconstruct the signal events of \(e^+e^-\rightarrow HZ\rightarrow e^\pm \tau ^\mp Z\), we again first select the events with two jets, one electron and one muon in their final states. In each event, the Z boson is reconstructed using the two jets as in the \(H\rightarrow e^\pm \mu ^\mp \) case, the tau lepton is reconstructed from the muon and the missing energy, and the Higgs boson is finally reconstructed from the electron and the tau lepton. Our tau reconstruction method has an efficiency that will allow us to estimate the CEPC sensitivity to \({\mathcal {B}}(H\rightarrow e^\pm \tau ^\mp )\), since the branching ratio of a tau lepton decaying into a muon and two neutrinos \({\mathcal {B}}(\tau \rightarrow \mu \nu {\bar{\nu }})\) is nearly 20%. Our method also greatly suppresses the background, e.g., if we choose to reconstruct the tau lepton from electron and missing energy, then the processes \(e^+e^-\rightarrow e^+e^-q{\bar{q}}\) would give a large background. The above event selection method requires the following cuts: 66 GeV \(< m_{jj}<\) 94 GeV, \(m_{\mu E_M} < 4\) GeV and 121 GeV \(< m_{e\tau } < 130\) GeV, where \(m_{\mu E_M}\) is the invariant mass of the muon and missing energy, and \(m_{e\tau }\) is the invariant mass of the electron, muon and missing energy. We further set a cut on the electron pseudorapidity of \(|\eta _e|<2\) to suppress the background arising from the SZ processes. The cuts are summarized in Table 3, where we give the numbers of background events in different categories and the numbers of signal events at the CEPC after the cuts. We find that 456 out of the 10000 generated signal events survive the event selection, and thus the signal detection efficiency is \(\epsilon \) = 4.56%. As for the background, five events are expected after the event selection, so the upper limit on the number of signal events is \(N_{95}\) = 5.51 at 95% CL. 
Then, employing (1) and recalling that the number of the Higgs bosons to be produced by the CEPC is \(N_{H}\) = 1.05 \(\times 10^6\), we find the upper bound on the \(H\rightarrow e^\pm \tau ^\mp \) branching ratio as

$$\begin{aligned} {\mathcal {B}}(H\rightarrow e^\pm \tau ^\mp ) < 1.6\times 10^{-4}~\text {at 95}\%~\text {CL}, \end{aligned}$$
(4)

which also applies to the FCC-ee. For the ILC, we find about 1.9 background events by scaling the SZ\(\_\)ll (\(e^+e^-\tau ^+\tau ^-\)) event numbers, and hence \(N_{95}\) = 4.2 and the upper bound on the branching ratio is \(2.4\times 10^{-4}\) at 95% CL. These results and the present upper bound \({\mathcal {B}}(H\rightarrow e^\pm \tau ^\mp ) < 0.69\%\) at 95% CL reported by the CMS collaboration [81] (the ATLAS bound is 1.04% [63]) are displayed together in Fig. 1. We find that the CEPC or the FCC-ee (the ILC) is expected to improve the sensitivity to \({\mathcal {B}}(H\rightarrow e^\pm \tau ^\mp )\) by a factor of about 40 (30) compared to the present CMS measurement, and also by one to two orders of magnitude compared to the expected HL-LHC upper bound \({\mathcal {B}}(H\rightarrow e^\pm \tau ^\mp ) < {\mathcal {O}}(0.5)\%\) [69].

Table 3 The numbers of background events in different categories and signal events surviving the cuts in the analysis of \(H\rightarrow e^\pm \tau ^\mp \) at the CEPC. See text for details

2.3 \(H\rightarrow \mu ^\pm \tau ^\mp \)

To reconstruct the signal events of \(e^+e^-\rightarrow HZ\rightarrow \mu ^\pm \tau ^\mp Z\), we again select the events containing one electron, one muon and two jets in their final states. In each event, the Z boson is reconstructed from the two jets, the tau lepton is reconstructed from the electron and the missing energy, and the Higgs boson is reconstructed from the muon and the reconstructed tau lepton. As in the \(H\rightarrow e^\pm \tau ^\mp \) case, we do not consider other ways to reconstruct the tau lepton. We employ the following cuts: 60 GeV \(< m_{jj}<\) 100 GeV, \(m_{e E_M} < 5\) GeV and 120 GeV \(< m_{\mu \tau } < 130\) GeV, where \(m_{e E_M}\) is the invariant mass of the electron and missing energy, and \(m_{\mu \tau }\) is the invariant mass of the electron, muon and missing energy. The cuts are listed in Table 4, where we also show the numbers of background events in different categories and the numbers of signal events after the cuts. We find that 522 out of the 10000 generated signal events survive the event selection, and thus the signal detection efficiency is \(\epsilon \) = 5.22%. As for the background, five events are expected after the event selection, so the upper limit on the number of signal events is \(N_{95}\) = 5.51 at 95% CL. Finally, we obtain the upper bound on the \(H\rightarrow \mu ^\pm \tau ^\mp \) branching ratio given by the CEPC and also the FCC-ee,

$$\begin{aligned} {\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp ) < 1.4\times 10^{-4}~\text {at 95}\%~\text {CL.} \end{aligned}$$
(5)

For the ILC, we find about 2.6 background events by scaling the ZZ\(\_\)ll (\(\mu ^+\mu ^-\tau ^+\tau ^-\) and \(\tau ^+\tau ^-\tau ^+\tau ^-\)) and ZZ\(\_\)ql (\(jj\tau ^+\tau ^-\)) event numbers, and hence \(N_{95}\) = 4.6 and the upper bound on the branching ratio is \(2.3\times 10^{-4}\) at 95% CL. These results and the present upper bound \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp ) < 1.20\%\) at 95% CL reported by the CMS collaboration [64] (the ATLAS bound is 1.43% [63]) are displayed together in Fig. 1. We find that any of the future lepton colliders is expected to improve the sensitivity to \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp )\) by nearly two orders of magnitude compared to the present CMS measurement, and also by one to two orders of magnitude compared to the expected HL-LHC upper bound \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp ) < {\mathcal {O}}(0.5)\%\) [69].

Table 4 The numbers of background events in different categories and signal events surviving the cuts in the analysis of \(H\rightarrow \mu ^\pm \tau ^\mp \) at the CEPC. See text for details

According to another study of \(H\rightarrow \mu ^\pm \tau ^\mp \) at the ILC [71], the upper bound on the branching ratio is \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp ) < 2.9\times 10^{-5}\) at 95% CL if 90% of the signal events survive the event selection. This bound is more stringent than that obtained in this work. A possible reason is that we only considered the cleanest method to reconstruct the tau lepton, as described previously, which makes our evaluated upper bound very conservative. It is also necessary to point out that [71] has assumed a tau lepton reconstruction efficiency as high as 70%, and muon and jet detection efficiencies as high as 100% [82]. We also suspect that considering only the \(e^+e^-\rightarrow q{\bar{q}}\mu ^\pm \tau ^\mp {\bar{\nu }}\nu \) processes as possible background sources, as done in [71], is too optimistic. The potential of the ILC to search for the \(H\rightarrow \mu ^\pm \tau ^\mp \) channel has also been studied in [72], where the Z bosons are reconstructed using lepton pairs. There it is found that a signal with 3\(\sigma \) statistical significance at the ILC with an integrated luminosity of 1 \(\hbox {ab}^{-1}\) requires \(H\rightarrow \mu ^\pm \tau ^\mp \) to have a branching ratio larger than \(4.09\times 10^{-3}\).

3 Constraints on theory parameters

The Lagrangian for a CLFV Higgs decay is given by

$$\begin{aligned} {\mathcal {L}}^{H\rightarrow \ell \ell '} \ni -Y_{\ell \ell '}{\bar{\ell }}_LH\ell '_R - Y_{\ell '\ell }{\bar{\ell }}^\prime _LH\ell _R + \textit{h.c.}, \end{aligned}$$
(6)

with \(\ell \ne \ell '\). The decay width of \(H\rightarrow \ell ^\pm \ell ^{\prime \mp }\) is then calculated to be

$$\begin{aligned} \begin{aligned} \Gamma (H\rightarrow \ell ^\pm \ell ^{\prime \mp }) = {m_H\over 8\pi }\left| y_{\ell \ell '}\right| ^2 \end{aligned} \end{aligned}$$
(7)

in the zero lepton mass limit, where \(y_{\ell \ell '}\) is defined by \(y_{\ell \ell '}\equiv \sqrt{|Y_{\ell \ell '}|^2+|Y_{\ell '\ell }|^2}\). Assuming that new physics only enters via the \(H\ell \ell '\) coupling, the \(H\rightarrow \ell ^\pm \ell ^{\prime \mp }\) branching ratio is given by

$$\begin{aligned} {\mathcal {B}}(H\rightarrow \ell ^\pm \ell ^{\prime \mp }) = {\Gamma (H\rightarrow \ell ^\pm \ell ^{\prime \mp }) \over \Gamma (H\rightarrow \ell ^\pm \ell ^{\prime \mp }) + \Gamma _\text {SM}}, \end{aligned}$$
(8)

where the SM Higgs boson decay width is \(\Gamma _\text {SM}\) = 4.1 MeV [83]. This leads to the upper bounds on the CLFV Higgs couplings expected to be given by the three lepton colliders,

$$\begin{aligned}&\text {CEPC(FCC-ee):}\;y_{e\mu }< 1.0\times 10^{-4},~y_{e\tau }< 3.6\times 10^{-4},\nonumber \\&\quad y_{\mu \tau }< 3.4\times 10^{-4}~\text {at 95}\%~\text {CL,}\nonumber \\&\text {ILC:}\; y_{e\mu }< 1.3\times 10^{-4},~y_{e\tau }< 4.5\times 10^{-4}, \nonumber \\&\quad y_{\mu \tau } < 4.3\times 10^{-4} ~\text {at 95}\%~\text {CL.} \end{aligned}$$
(9)
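The step from the branching ratio bounds to the coupling bounds in (9) amounts to inverting (8) and then (7). A minimal sketch is given below; the Higgs mass \(m_H\) = 125 GeV is an assumption made here (the text does not quote the value used), and the function name is hypothetical.

```python
import math

M_H = 125.0        # GeV; assumed value, not quoted in the text
GAMMA_SM = 4.1e-3  # GeV; SM Higgs width used in Eq. (8)

def coupling_from_br(br, m_h=M_H, gamma_sm=GAMMA_SM):
    """Bound on y_ll' implied by a bound on B(H -> l l')."""
    gamma_clfv = br * gamma_sm / (1.0 - br)             # invert Eq. (8)
    return math.sqrt(8.0 * math.pi * gamma_clfv / m_h)  # invert Eq. (7)

# CEPC (FCC-ee) branching ratio bounds from Sect. 2
for channel, br in [("e mu", 1.2e-5), ("e tau", 1.6e-4), ("mu tau", 1.4e-4)]:
    print(channel, f"{coupling_from_br(br):.1e}")
```

Under these assumptions the CEPC inputs give couplings close to the values listed in (9); small differences in the last digit can arise from rounding of the branching ratio bounds.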

As a comparison, we also list the current experimental bounds on the CLFV Higgs couplings,

$$\begin{aligned}&y_{e\mu }< 0.5\times 10^{-3},\qquad y_{e\tau }< 2.4\times 10^{-3},\nonumber \\&\quad y_{\mu \tau } < 3.2\times 10^{-3} ~\text {at 95}\%~\text {CL,} \end{aligned}$$
(10)

which are obtained from the LHC bounds on the corresponding branching ratios [64, 81]. All these upper bounds on the \(H\ell \ell '\) couplings are displayed in Fig. 1.

3.1 Constraints on the SMEFT

We also consider the constraints on the new physics cut-off scale \(\Lambda \) implied by the improved bounds on the CLFV Higgs decay rates in the SMEFT [84, 85], which contains higher-dimension operators invariant under the SM gauge transformations. After spontaneous symmetry breaking, the dimension-six operators \(H^\dagger H{\bar{f}}'_iHf''_j\) make the fermions couple to the Higgs vacuum expectation value v differently from how they couple to the Higgs boson, and the off-diagonal entries of the \(Hf_if_j\) coupling matrices are proportional to \({v^2\over \sqrt{2}\Lambda ^2}\) [86,87,88,89], namely

$$\begin{aligned} Y_{ij} = {v^2\over \sqrt{2}\Lambda ^2}C_{ij}, \end{aligned}$$
(11)

with \(f_{i,j}\) the mass eigenstates and \(i\ne j\). Assuming \(C_{ij}\sim 1\), the expected CEPC (ILC) constraint on the \(H\rightarrow e^\pm \mu ^\mp \) branching ratio will give the most stringent lower bound on \(\Lambda \), \(\Lambda \gtrsim 25\) (22) TeV. However, the order of \(C_{ij}\) depends crucially on flavor structures beyond the SM. If we adopt the Cheng-Sher ansatz \(C_{ij}\sim \sqrt{m_{i}m_{j}}/v\) [90], the \(H\rightarrow \mu ^\pm \tau ^\mp \) channel will set the most stringent lower bound on \(\Lambda \), which reads \(\Lambda \gtrsim 0.6\) (0.5) TeV.
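The quoted scales can be cross-checked numerically from (11). The text does not state how the two off-diagonal entries enter \(y_{\ell \ell '}\); the sketch below assumes \(|Y_{ij}| = |Y_{ji}|\), so that \(y_{ij}=\sqrt{2}\,|Y_{ij}|\) and \(\Lambda = v\sqrt{C_{ij}/y_{ij}}\), with v = 246 GeV and standard lepton masses. These inputs and the function name are assumptions made here; under them the numbers above are reproduced.

```python
import math

V = 246.0                      # GeV, Higgs vacuum expectation value (assumed)
M_MU, M_TAU = 0.10566, 1.77686  # GeV, lepton masses (assumed inputs)

def lambda_bound(y_max, c=1.0, v=V):
    """Lower bound on Lambda (GeV) from y < y_max, using Eq. (11)
    and assuming |Y_ij| = |Y_ji| = y / sqrt(2)."""
    y_single = y_max / math.sqrt(2.0)
    return v * math.sqrt(c / (math.sqrt(2.0) * y_single))

# C_ij ~ 1 with the e-mu coupling bounds (CEPC, ILC)
print(lambda_bound(1.0e-4) / 1e3, lambda_bound(1.3e-4) / 1e3)  # in TeV

# Cheng-Sher ansatz with the mu-tau coupling bounds (CEPC, ILC)
c_cs = math.sqrt(M_MU * M_TAU) / V
print(lambda_bound(3.4e-4, c_cs) / 1e3, lambda_bound(4.3e-4, c_cs) / 1e3)
```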

3.2 Constraints on the type III 2HDM

Although one Higgs field is enough for electroweak symmetry breaking and for giving masses to gauge bosons and fermions as in the SM, there are several motivations for introducing two Higgs doublets [91], including the requirements of supersymmetry [92], axion models [93] and baryogenesis [94] (see e.g. [95]). In a two-Higgs-doublet-model (2HDM), couplings of the Higgs boson with other particles are modified compared to the SM. In particular, the type III 2HDM naturally introduces tree-level CLFV Higgs couplings. In the type III 2HDM, two doublets \(\Phi _1 = 1/\sqrt{2}( ..., v_1+\rho _1 + ...)^\text {T}\) and \(\Phi _2 = 1/\sqrt{2}( ..., v_2+\rho _2 + ...)^\text {T}\) with hypercharge +1 couple to fermions freely. We can rotate the scalar doublets such that the vacuum expectation value is entirely in the first doublet,

$$\begin{aligned}&\begin{array}{c} H_1=\Phi _1\cos \beta + \Phi _2\sin \beta ,\\ H_2=\Phi _1\sin \beta - \Phi _2\cos \beta , \end{array}\nonumber \\&\langle H_1\rangle = \left( \begin{array}{c} 0\\ v/\sqrt{2} \\ \end{array} \right) , \quad \langle H_2\rangle = \left( \begin{array}{c} 0\\ 0 \\ \end{array} \right) , \end{aligned}$$
(12)

with \(v=\sqrt{v_1^2+v_2^2}\) and \(\tan \beta = v_2/v_1\). The mass eigenstates are given by another rotation,

$$\begin{aligned} H = \rho _1\sin \alpha - \rho _2\cos \alpha ,\quad H' = - \rho _1\cos \alpha - \rho _2\sin \alpha , \end{aligned}$$
(13)

and equivalently

$$\begin{aligned} \begin{aligned} H=&~H_1^0\sin (\alpha -\beta ) + H_2^0\cos (\alpha -\beta ),\\ H'=&-H_1^0\cos (\alpha -\beta ) - H_2^0\sin (\alpha - \beta ). \end{aligned} \end{aligned}$$
(14)

Diagonalizing the mass matrix automatically diagonalizes the \(H_1^0\) coupling matrix, so the CLFV vertices only come from \(H_2^0\). Therefore, the CLFV couplings with H are given by

$$\begin{aligned}&{\mathcal {L}}^{H\rightarrow \ell \ell '} \ni -\cos (\alpha -\beta )\xi _{\ell \ell '}{\bar{\ell }}_LH\ell '_R \nonumber \\&\quad -\cos (\alpha -\beta )\xi _{\ell '\ell }{\bar{\ell }}^\prime _LH\ell _R + \textit{h.c.}. \end{aligned}$$
(15)

Under the Cheng-Sher ansatz [90], we define \(\xi _{ij} = \lambda _{ij}\sqrt{2m_im_j}/v\), where \(\lambda _{ij}\) are of order one. On the other hand, the flavor conserving H couplings receive contributions from both \(H_1^0\) and \(H_2^0\). For example, the expression for the \(Hb{\bar{b}}\) coupling is given by \(y_{b}\sin (\alpha -\beta )+\xi _{b}\cos (\alpha -\beta )\) with \(y_b=m_b/v\), and we further write \(\xi _b=\lambda _b m_b/v\) and expect \(\lambda _b\) to be of order one.

Fig. 2

The red (solid), blue (dashed) and green (dashed) curves represent the constraints at 95% CL on the \(\cos (\alpha -\beta )\)-\(\lambda _{\mu \tau }\) plane (the regions above the curves are excluded) set by the LHC [64], the ILC and the CEPC (FCC-ee) upper bounds on \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp )\), respectively. The orange (light orange) contours are the corresponding 1(2)-sigma allowed ranges of \(\cos (\alpha -\beta )\) by the ATLAS measurement of the \(H\rightarrow b{\bar{b}}\) signal strength [96] with \(\lambda _b\) = 1 (left) and 0.5 (right). See text for details

According to the current measurements, no obvious deviation of Higgs couplings from the SM has been found [97]. This is further confirmed by the recent observation of the \(H\rightarrow b{\bar{b}}\) channel by the CMS [98] and the ATLAS [96] collaborations. It indicates that the parameters \(\lambda _{i(j)}\) and \(\alpha -\beta \) are strictly constrained, which also makes (8) approximately valid. Here we study how the bounds on the CLFV Higgs decay rates together with the \(Hb{\bar{b}}\) coupling set constraints on the relevant parameters \(\lambda _{\ell \ell '}\) and \(\cos (\alpha -\beta )\) in the 2HDM. The \(H\rightarrow \mu ^\pm \tau ^\mp \) channel is taken as an example. From the ATLAS measurement of the \(H\rightarrow b{\bar{b}}\) signal strength [96], it can be extracted that the ratio of the \(Hb{\bar{b}}\) coupling to the SM expectation is \(1.005\pm 0.10\). We choose the order-one \(\lambda _b\) to be 1 and 0.5 as two benchmarks, and display the corresponding 1- and 2-sigma allowed ranges of \(\cos (\alpha -\beta )\) in the two panels of Fig. 2 by the orange and light orange contours. In Fig. 2, we also show on the \(\cos (\alpha -\beta )\)-\(\lambda _{\mu \tau }\) plane the constraints (the regions above the curves are excluded) set by the upper bounds on \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp )\). It is observed that \(\cos (\alpha -\beta )\) still has a large allowed range, especially in the \(\lambda _b\) = 0.5 case, and that in the region where \(|\cos (\alpha -\beta )|\) is large, the upper bounds on \({\mathcal {B}}(H\rightarrow \mu ^\pm \tau ^\mp )\) given by the future lepton colliders restrict \(\lambda _{\mu \tau }\) to be smaller than \({\mathcal {O}}(0.1)\).
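The shape of the exclusion curves can be illustrated from (15) and the Cheng-Sher parametrization. The sketch below assumes a symmetric coupling, \(\lambda _{\mu \tau }=\lambda _{\tau \mu }\), so that \(y_{\mu \tau } = 2|\cos (\alpha -\beta )|\lambda _{\mu \tau }\sqrt{m_\mu m_\tau }/v\); this symmetry, the numerical inputs and the function name are assumptions made here and are not taken from the analysis behind Fig. 2.

```python
import math

V = 246.0                      # GeV, assumed
M_MU, M_TAU = 0.10566, 1.77686  # GeV, assumed inputs

def lambda_mutau_max(y_max, cos_ab, v=V):
    """Upper bound on lambda_mutau for a given cos(alpha - beta),
    assuming lambda_mutau = lambda_taumu (symmetric couplings)."""
    return y_max * v / (2.0 * abs(cos_ab) * math.sqrt(M_MU * M_TAU))

# CEPC (FCC-ee) bound y_mutau < 3.4e-4 at a few values of cos(alpha - beta)
for cos_ab in (1.0, 0.5, 0.1):
    print(cos_ab, round(lambda_mutau_max(3.4e-4, cos_ab), 3))
```

For \(|\cos (\alpha -\beta )|\) close to one, the bound is about 0.1, consistent with the \({\mathcal {O}}(0.1)\) statement above, and it weakens as \(1/|\cos (\alpha -\beta )|\).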

3.3 Constraints on RS models

In RS models [99, 100] in which the fermions are allowed to propagate in the extra dimension, the large fermion mass hierarchies and the tiny neutrino masses can be explained [101, 102]. In order to generate the observed structure in the lepton sector, namely the hierarchies between the charged lepton masses, the similar sizes of the neutrino masses and the large neutrino mixing angles, one can, under the assumption of Dirac neutrinos, set the same or similar profiles in the fifth dimension for the lepton doublets and neutrino singlets and set different profiles for the charged lepton singlets (see e.g. [103]). In such a case, we find that the coefficient in (11) is \(C_{ij}\sim m_\tau /v\) (see Section 4.1 of [104]). Therefore, in such models, the most stringent lower bound on \(\Lambda \), or the Kaluza-Klein scale \(M_\text {KK}\) [105, 106], is set by the expected CEPC (ILC) constraint on the \(H\rightarrow \mu ^\pm \tau ^\mp \) branching ratio as \(\Lambda \gtrsim 2.5\) (2.2) TeV. Since the masses of the lightest Kaluza-Klein particles are approximately 2.45\(M_\text {KK}\), nondiscovery of \(H\rightarrow \mu ^\pm \tau ^\mp \) at the CEPC (ILC) will exclude Kaluza-Klein particles with masses smaller than 6.1 (5.4) TeV.

3.4 Constraints on models with heavy neutrinos

Lepton flavor violation may originate from heavy neutrinos at one-loop level (see e.g. [107]). Taking the Inverse Seesaw Model as an example with the right-handed neutrino masses \(M_R\) close to the TeV scale, we have approximately the off-diagonal charged lepton Yukawa couplings

$$\begin{aligned} Y_{ij}\approx & {} {g\over 64\pi ^2}{m_i\over m_W}\left[ {m_H^2\over M_R^2} \left( r\left( {m_W^2\over m_H^2}\right) + \log \left( {m_W^2\over m_H^2}\right) \right) (Y_\nu Y_\nu ^\dagger )_{ij} \right. \nonumber \\&\left. -{3v^2\over 2M_R^2}(Y_\nu Y_\nu ^\dagger Y_\nu Y_\nu ^\dagger )_{ij} \right] \end{aligned}$$
(16)

according to (25) of [108], with \(Y_\nu \) the neutrino Yukawa coupling matrix, g the SU(2) gauge coupling constant, \(m_i\) the mass of the ith generation charged lepton, \(m_W\) the W boson mass and \(m_H\) the Higgs boson mass. The function \(r(\lambda )\) is given by (26) of [108]. If we assume a benchmark neutrino Yukawa coupling matrix following [108], \(Y_\nu = \{ (0.1, 0, 0), (0, 1, 0), (0, 1, 0.014)\}\), a rough calculation indicates that the lower bound on \(M_R\) set by the expected measurements of the CLFV Higgs decay channels at the three future lepton colliders will be \(M_R \gtrsim \) 0.3 GeV. Since such a small right-handed neutrino mass would not satisfy the perturbation condition, we conclude that the expected improved bounds on the CLFV Higgs decay rates put no constraint on the right-handed neutrino masses. The complete expression for the off-diagonal charged lepton Yukawa couplings, which does not rely on the expansion in inverse powers of \(M_R\), can be found in Appendix C of [109]. As discussed in [108], using (16) always overestimates values of \(Y_{ij}\), so using the complete formula will give an even looser constraint on \(M_R\), which will not change our main conclusion.

4 Summary

The future \(e^+e^-\) colliders, the CEPC, the FCC-ee and the ILC, as Higgs factories, are ideal machines for precise studies of Higgs properties. In this paper, we evaluate the potential of the three lepton colliders for searching for the CLFV Higgs decays. We find that the expected upper bounds given by the CEPC or the FCC-ee (the ILC) on the branching ratios of \(H\rightarrow e^\pm \mu ^\mp \), \(e^\pm \tau ^\mp \) and \(\mu ^\pm \tau ^\mp \) are \(1.2~(2.1)\times 10^{-5}\), \(1.6\ (2.4)\times 10^{-4}\) and \(1.4~(2.3)\times 10^{-4}\) at 95% CL, respectively. The resulting constraints on certain theory parameters are also given, including the CLFV Higgs couplings, the relevant parameters in the type-III 2HDM, and the cut-off scales in the SMEFT and in RS models.