1 Introduction

A 125 GeV scalar, h, was discovered in 2012 by ATLAS and CMS [1, 2]. It resembles the Higgs boson of the Standard Model (SM), marking the success of the SM up to the electroweak scale. Despite the remarkable resemblance of h to the SM Higgs boson, the nature of electroweak symmetry breaking (EWSB) is still unclear, and whether the SM Higgs sector is complete remains a mystery. Many extensions of the SM enlarge the Higgs sector by adding extra doublets [3], such as two Higgs doublet models (2HDM), minimal SUSY (MSSM), and three Higgs doublet models (3HDM) [4].

In this article, we study one of the simpler extensions of the SM, the 2HDM, where the SM Higgs sector is extended by an additional Higgs doublet, with both doublets coupling to fermions. As a result, there are two Yukawa matrices that cannot be simultaneously diagonalized, which leaves off-diagonal, flavor violating terms. The off-diagonal Yukawa terms can generate tree level flavor-changing neutral Higgs (FCNH) interactions; this version is referred to as 2HDM-III [5] or general 2HDM (g2HDM). The FCNH interactions are usually avoided by introducing some ad hoc \(Z_2\) symmetries to enforce the Glashow-Weinberg natural flavor conservation (NFC) condition [6], giving rise to 2HDM Type I, II, X, and Y versions [3]. However, we do not enforce \(Z_2\) symmetries but let nature provide us with its true flavor.

Recently, the Fermilab Muon g-2 experiment [7] confirmed the previous result of the BNL Muon g-2 experiment. Their combined result is,

$$\begin{aligned} a_{\mu }(\textrm{Exp}) = 116592061(41) \times 10^{-11} (0.35 \, \textrm{ppm}), \end{aligned}$$
(1)

which deviates from the community consensus SM expectation [8], \(a_{\mu }(\textrm{SM}) = 116591810 (43) \times 10^{-11} (0.37 \, \textrm{ppm})\), by 4.2\(\sigma \). The muon \(g-2\) anomaly can be explained in g2HDM by flavor violating \(\rho _{\tau \mu }\) couplings, as discussed in Refs. [9,10,11]. The lepton flavor violating (LFV) \(\rho _{\tau \mu }\) and \(\rho _{\mu \tau }\) couplings can drive \(h \rightarrow \tau \mu \), which can be of concern as the limit [12]

$$\begin{aligned} \mathcal {B}(h \rightarrow \tau \mu ) < 0.15\, \%, \end{aligned}$$
(2)

is rather stringent. However, one can evade this strong bound with the help of alignment [13]. Under alignment the properties of h closely resemble those of the SM Higgs boson, which requires the mixing between the two CP-even scalars h and H to vanish, \(\cos \gamma \rightarrow 0\), where \(\gamma \) is the h-H mixing angle. The \(h\tau \mu \) coupling is \(\propto \rho _{\tau \mu }\cos \gamma \) and is therefore suppressed.
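As a quick cross-check of the 4.2\(\sigma \) quoted above for the muon \(g-2\) discrepancy, the sketch below combines the experimental and SM uncertainties in quadrature. Treating the two uncertainties as independent Gaussian errors is an assumption of this illustration; the function name `pull_sigma` is ours.

```python
import math

def pull_sigma(exp, exp_err, sm, sm_err):
    """Difference of two values in units of their quadrature-combined uncertainty."""
    return (exp - sm) / math.hypot(exp_err, sm_err)

# Central values and uncertainties in units of 1e-11, as quoted in the text.
A_MU_EXP, A_MU_EXP_ERR = 116592061, 41
A_MU_SM, A_MU_SM_ERR = 116591810, 43

delta = A_MU_EXP - A_MU_SM
pull = pull_sigma(A_MU_EXP, A_MU_EXP_ERR, A_MU_SM, A_MU_SM_ERR)
print(f"Delta a_mu = {delta} x 1e-11, pull = {pull:.1f} sigma")
```

The difference is \(251 \times 10^{-11}\) against a combined uncertainty of about \(59 \times 10^{-11}\), consistent with the quoted significance.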

In g2HDM, the exotic scalar H benefits from alignment with \(\sin \gamma \rightarrow 1\), and there is no suppression for \(H \rightarrow \tau \mu \) or \(A \rightarrow \tau \mu \) LFV processes. This property was exploited in Refs. [14, 15], where a detailed collider study was performed. Subsequently, CMS published [16] a detailed search for \(H \rightarrow \tau \mu \) for \(m_H \in [200,\,900]\) GeV with 35.9 fb\(^{-1}\) of data. No excess was found, placing strong limits on the \(gg \rightarrow H \rightarrow \tau \mu \) cross section, but CMS has yet to update with full Run 2 data. The \(\rho _{\tau \mu }\), \(\rho _{\mu \tau }\) couplings along with \(\rho _{tt}\) also contribute to \(\tau \rightarrow \mu \gamma \) via the two-loop Bjorken-Weinberg (or Barr-Zee) mechanism, which dominates over the one-loop mechanism, provided that \(\rho _{tt} \sim \mathcal {O}(\lambda _t)\) [17, 18]. The one-loop effect would be suppressed if one takes [19] \(\rho _{\tau \tau } = \rho _{\tau \mu } = \rho _{\mu \tau } = \lambda _{\tau } \sim \mathcal {O}(0.01)\). We further extend our work from Ref. [14] by respecting the current limits on the \(gg \rightarrow H\rightarrow \tau \mu \) cross section and \(\mathcal {B}(\tau \rightarrow \mu \gamma )\).

The extra \(\tau \) Yukawa coupling \(\rho _{\tau \tau }\) with alignment is the main driver for the \(gg \rightarrow H,A \rightarrow \tau \tau \) process. In addition, \(\rho _{\tau \tau }\) can carry a complex phase, which can contribute to the \(\tau \) electric dipole moment [20], or reveal itself in the CP structure of the \(h\tau \tau \) coupling. The complex phase of the \(h\tau \tau \) coupling has been searched for extensively by ATLAS [21] and CMS [22]. In addition, CMS [23] and ATLAS [24] have also searched for heavy exotic scalars decaying to \(\tau \tau \) in the 200–2500 GeV mass range. This motivates us to study the collider prospects for \(H,A \rightarrow \tau \tau \) and provide predictions for HL-LHC.

This article is organized as follows. We first briefly review g2HDM in Sect. 2, and derive in Sect. 3 the constraints from the experiments on important couplings relevant to our collider study. Section 4 discusses the prospects of \(p p\rightarrow H,A \rightarrow \tau \mu +X\), while Sect. 5 is reserved for \(p p \rightarrow H,A \rightarrow \tau \tau +X\). We present the discovery potential of both \(\tau \mu \) and \(\tau \tau \) in Sect. 6 and conclude in Sect. 7.

2 The general two Higgs doublet model

In g2HDM, one can write the Higgs potential in the Higgs basis, namely [13, 25]

$$\begin{aligned} V(\Phi ,\Phi ^{'})= & {} \mu _{11}^2 |\Phi |^2 + \mu _{22}^2 |\Phi '|^{2} - (\mu _{12}^2 \Phi ^{\dagger } \Phi ' + h.c)\nonumber \\{} & {} + \frac{1}{2}\eta _1 |\Phi |^4 + \frac{1}{2}\eta _2 |\Phi '|^4 + \eta _3 |\Phi |^2|\Phi '|^2 \nonumber \\{} & {} + \eta _4 |\Phi ^{\dagger }\Phi '|^2 + \left[ \left( \frac{1}{2}\eta _5 \Phi ^{\dagger }\Phi ' \nonumber \right. \right. \\{} & {} \left. + \eta _6 |\Phi |^2 + \eta _7 |\Phi '|^2\right) \left. \Phi ^{\dagger }\Phi ' + h.c.\right] , \end{aligned}$$
(3)

where EWSB arises from \(\Phi \) while \(\langle \Phi '\rangle = 0\) (hence \(\mu _{22}^2 > 0\)). In Eq. (3), \(\eta _i\)s are the quartic couplings and taken as real, as we assume the Higgs potential is CP-invariant. After EWSB, one can find [13] from Eq. (3) the mass eigenstates h, H, A and \(H^+\), as well as h-H mixing, where we define the mixing angle as \(\gamma \). In the alignment limit, \(\cos \gamma \rightarrow 0\).

The Yukawa couplings of the Higgs bosons to quarks are given as [25, 26],

$$\begin{aligned}{} & {} -\frac{1}{\sqrt{2}}\sum _{f = u,d} \bar{f}_i\left[ (\lambda _{i}^f\delta _{ij}s_{\gamma } + \rho ^{f}_{ij}c_{\gamma })h \nonumber \right. \\{} & {} \quad \left. - (\lambda _{i}^f\delta _{ij}c_{\gamma } - \rho ^{f}_{ij}s_{\gamma })H - i\,\textrm{sgn}(\mathcal {Q}_f)\rho ^{f}_{ij}A \right] R f_{j} \nonumber \\{} & {} \quad -\bar{u}_{i}\bigl [(V\rho ^d)_{ij} R - (\rho ^{u\dagger }V)_{ij} L \bigr ] d_{j}H^{+} + \mathrm{h.c.}, \end{aligned}$$
(4)

where \(\lambda _i^f = \sqrt{2} m_f/v\) is the SM Yukawa coupling, \(\rho ^f\) is the extra Yukawa matrix and \(c_{\gamma }(s_{\gamma }) \equiv \cos \gamma (\sin \gamma )\). An analogous equation holds for charged leptons, but with V set to unity because of the rather degenerate neutrinos. As discussed in the Introduction, \(\rho ^f\) can carry nonzero off-diagonal flavor violating terms. From Eq. (4), the extra Yukawa matrix \(\rho ^f\) is combined with \(c_{\gamma }\) for h, hence the \(\rho ^f\) contribution to the \(hf\bar{f}\) coupling vanishes in the alignment limit of \(c_\gamma \rightarrow 0\). As a result, all LFV processes such as \(h \rightarrow \tau \mu \), as well as \(t \rightarrow c h\), are highly suppressed in g2HDM. On the upside, \(c_\gamma \rightarrow 0\) implies \(s_\gamma \rightarrow 1\), and nonzero flavor violating couplings like \(\rho _{tc}\) and \(\rho _{\tau \mu }\) can drive our signal \(gg \rightarrow H,A \rightarrow \tau \mu \) processes, or \(gg \rightarrow H,A \rightarrow tc\) [27], and even processes like \(cg \rightarrow tH, tA \rightarrow tt\bar{c}\) [28, 29], \(cg \rightarrow tH,tA \rightarrow t\tau \mu \) and \(cg \rightarrow bH^+\rightarrow bt\bar{b}\) [30], \(bHW^+ (\rightarrow b\tau \mu W^+, btcW^+\) [10]). We hence see that, even in the alignment limit, g2HDM can provide a rich phenomenology at the LHC.

The \(\rho _{\tau \tau }\) coupling in the alignment limit is one of the drivers for our second signal process, \(pp \rightarrow H,A \rightarrow \tau \tau \). In addition, with complex \(\rho _{\tau \tau }\), the \(h\tau \tau \) coupling can become complex hence CP violating, with the phase [20],

$$\begin{aligned} {\tan \phi _{h\tau \tau } = \frac{c_{\gamma }\,\textrm{Im}\,\rho _{\tau \tau }}{c_\gamma \,\textrm{Re}\,\rho _{\tau \tau } + s_{\gamma }\,\lambda _{\tau }}.} \end{aligned}$$
(5)
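To illustrate how quickly the CP phase of Eq. (5) is suppressed by alignment, the following sketch evaluates \(\phi _{h\tau \tau }\) for a given complex \(\rho _{\tau \tau }\). It assumes \(s_\gamma = +\sqrt{1 - c_\gamma ^2}\), and takes \(m_\tau \simeq 1.777\) GeV and \(v \simeq 246\) GeV to fix \(\lambda _\tau \); the function name is hypothetical.

```python
import math

# Assumed inputs: m_tau ~ 1.777 GeV, v ~ 246 GeV, so lambda_tau ~ 0.01.
LAMBDA_TAU = math.sqrt(2.0) * 1.777 / 246.0

def h_tautau_phase(rho_re, rho_im, c_gamma, lambda_tau=LAMBDA_TAU):
    """Phase of the h-tau-tau coupling from Eq. (5), in radians."""
    s_gamma = math.sqrt(1.0 - c_gamma**2)  # assumption: s_gamma > 0
    # atan2 keeps track of the quadrant, which tan(phi) alone does not.
    return math.atan2(c_gamma * rho_im, c_gamma * rho_re + s_gamma * lambda_tau)

for c_gamma in (0.1, 0.01):
    phi = h_tautau_phase(0.0, LAMBDA_TAU, c_gamma)
    print(f"c_gamma = {c_gamma}: phi = {math.degrees(phi):.2f} deg")
```

For purely imaginary \(\rho _{\tau \tau }\) of size \(\lambda _\tau \), the phase scales roughly as \(c_\gamma /s_\gamma \), so it vanishes in the alignment limit.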

In this article, we do not explore the complexity of \(\rho _{\tau \tau }\) and \(\rho _{\tau \mu }\) but keep them real for simplicity. Furthermore, unlike Ref. [10], we follow the “normal” or conservative guesstimate [19] of the associated extra Yukawa couplings,

$$\begin{aligned} \rho _{\tau \tau } = \rho _{\tau \mu } = \mathcal {O}(\lambda _{\tau }), \quad \ \rho _{tt} = \mathcal {O}(\lambda _t). \end{aligned}$$
(6)
Fig. 1 Limits from CMS and ATLAS on \(\mathcal {B}(H\rightarrow \tau \mu , \tau \tau )\) under the assumption of SM-like (exotic) H production

3 Constraints on relevant parameters

The important parameters governing \(pp \rightarrow H,A \rightarrow \tau \mu + X \) and \(pp \rightarrow H,A \rightarrow \tau \tau + X \) are \(\rho _{\tau \mu }\), \(\rho _{\tau \tau }\), \(\rho _{tt}\) and \(c_{\gamma }\). Finite \(\rho _{tc}\) can drive \(H,A \rightarrow t\bar{c} + \bar{t}c\) [27] and dilute \(H,A \rightarrow \tau \mu , \tau \tau \). Note that \(\rho _{ct}\) suffers strict constraints from \(B_s\)-\(\bar{B}_s\) mixing as it enters the process via the top loop [31], while \(\rho _{tc}c_\gamma \) is bound by direct searches by CMS and ATLAS for \(t \rightarrow c h\) decay. Recently, CMS [32] has put the most stringent bound, \(\mathcal {B} (t \rightarrow ch) < 7.3 \times 10^{-4}\) at 95% C.L., using \(h\rightarrow \gamma \gamma \) (this has been recently surpassed by ATLAS [33], though at comparable sensitivity). Note that the resulting bound on \(\rho _{tc}\) depends on \(c_{\gamma }\), and in the alignment limit (\(c_{\gamma } \rightarrow 0\)) the bound vanishes. An interesting point about the \(t \rightarrow c h \rightarrow c \gamma \gamma \) channel is that both \(\rho _{tc}\), through tch, and \(\rho _{tt}\), via \(h\gamma \gamma \), enter the decay chain. However, in this article we do not explore the implications of the CMS \(t\rightarrow ch\) study on the interplay between \(\rho _{tt}\) and \(\rho _{tc}\) couplings, since both effects vanish with alignment. Following a simple scaling of Refs. [34,35,36],

$$\begin{aligned} \lambda _{tch} \equiv \rho _{tc} \textrm{c}_{\gamma } = 1.92\times \sqrt{\mathcal {B}(t\rightarrow ch)}, \end{aligned}$$
(7)

we get \(\rho _{tc} < \) 0.52 and 5.2 for \(c_\gamma \) = 0.1 and 0.01, respectively.
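The numbers above follow directly from Eq. (7). A minimal sketch of the arithmetic, with the function name `rho_tc_upper_bound` chosen here for illustration only:

```python
import math

def rho_tc_upper_bound(br_tch_limit, c_gamma, scale=1.92):
    """Upper bound on rho_tc from Eq. (7): rho_tc * c_gamma = 1.92 * sqrt(B(t -> c h))."""
    lambda_tch = scale * math.sqrt(br_tch_limit)
    return lambda_tch / c_gamma

BR_TCH_LIMIT = 7.3e-4  # CMS 95% C.L. limit quoted in the text

for c_gamma in (0.1, 0.01):
    bound = rho_tc_upper_bound(BR_TCH_LIMIT, c_gamma)
    print(f"c_gamma = {c_gamma}: rho_tc < {bound:.2f}")
```

The bound scales as \(1/c_\gamma \), which makes explicit why it disappears as \(c_\gamma \rightarrow 0\).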

The \(\rho _{tt}\) coupling is responsible for the production of the exotic scalars by gluon–gluon fusion via the top loop, and it is constrained by B physics, especially \(B_{q}\)-\(\bar{B}_{q}\) mixing and \(b\rightarrow s\gamma \), as well as direct searches for \(gg \rightarrow \bar{t}H^{+} b \rightarrow \bar{t} t \bar{b} b + \mathrm{h.c.}\) [37, 38]. We found in Ref. [14] that B physics puts stronger bounds than direct searches, so we fix (Eq. (7) of Ref. [14])

$$\begin{aligned} \rho _{tt} = 0.2 \times \Biggl ( \frac{m_{H^+}}{150 \; \textrm{GeV}} \Biggr ) \,. \end{aligned}$$
(8)

In Fig. 1 we present the limits on \(\mathcal {B} (H \rightarrow \tau \mu )\) and \(\mathcal {B}(H \rightarrow \tau \tau )\) assuming SM-like production for simplicity; we do not enforce this assumption in the rest of the article. An interesting result emerges: the CMS \(H\rightarrow \tau \mu \) limit is much more stringent than the ATLAS \(H\rightarrow \tau \tau \) limit. For the case of \(pp \rightarrow H \rightarrow \tau \mu +X\), the exotic Higgs mass is reconstructed with the invariant mass, \(M_{\tau \mu }\), using the collinear approximation in tau decays, which allows CMS to put a more stringent limit than for \(pp \rightarrow \phi \rightarrow \tau \tau +X\), where the less precise cluster transverse mass is used. For \(pp \rightarrow H \rightarrow \tau \tau +X\) with 330 GeV \( \lesssim m_H \lesssim \) 470 GeV, the CMS limit appears mildly better than ATLAS, but the difference is probably within uncertainties.

Fig. 2 Scan of allowed points in \(m_A\)-\(m_H\) plane for Cases A and B (see text for more details)

In this article, for simplicity we set all off-diagonal \(\rho _{ij} = 0\) except \(\rho _{\tau \mu }\), and all diagonal \(\rho _{ii} \sim \lambda _i\) except \(\rho _{tt}\) and \(\rho _{\tau \tau }\). The limits on the extra \(\tau \) Yukawa couplings \(\rho _{\tau \mu }\) and \(\rho _{\tau \tau }\) depend on the choice of mass and mass differences for A, H and \(H^+\). We shall consider four different scenarios,

$$\begin{aligned} \mathrm{Case\ A1}:{} & {} \ m_{H}< m_A = m_{H^+},\; \textrm{c}_{\gamma } = 0.01, \nonumber \\ \mathrm{Case\ A2}:{} & {} \ m_{H}< m_A = m_{H^+},\; \textrm{c}_{\gamma } = 0.1, \nonumber \\ \mathrm{Case\ B1}:{} & {} \ m_{A}< m_H = m_{H^+}, \; \textrm{c}_{\gamma } = 0.01, \nonumber \\ \mathrm{Case\ B2}:{} & {} \ m_{A} < m_H = m_{H^+},\; \textrm{c}_{\gamma } = 0.1. \end{aligned}$$
(9)

ATLAS [39, 40] and CMS [41] have placed limits on \(c_\gamma = \cos (\beta -\alpha )\) for four types of Yukawa interactions in two Higgs doublet models where extra Yukawa couplings are absent: Type I (\(|c_\gamma | < 0.3\)), Type II, Lepton-Specific, and Flipped (\(|c_\gamma | < 0.1\)). With extra Yukawa coupling matrices, however, it would be harder to constrain \(c_\gamma \). Given that h(125) resembles the SM Higgs boson, which points to alignment, we choose for simplicity \(c_\gamma = 0.1\) and 0.01 in Eq. (9) as benchmarks for the alignment limit.

We consider the mass difference \(|m_H - m_A| =\) 150 GeV. A mass difference much larger than about 150 GeV may run into various constraints. To show the allowed parameter space, we perform a random scan by setting \(m_h = 125.1\) GeV, \(\eta _{2} \in [0, 5]\) and \(\eta _{7} \in [-5, 5]\), and for the lighter H case scan \(m_{H} \in [200, 500]\) GeV and \(m_{A} \in [200, 700]\) GeV. The scan result is given in Fig. 2a; the analogous scan for the lighter A case, with the roles of H and A interchanged, is given in Fig. 2b. All the points in Fig. 2 satisfy (see e.g. Ref. [10]) vacuum stability, perturbativity, unitarity and T-parameter constraints.

We see from Fig. 2 that the allowed \(|m_H - m_A|\) difference decreases as we increase the mass of H or A. We select \(m_H \in [200, 500]\) GeV for Cases A1 and A2, and \(m_A \in [200, 500]\) GeV for Cases B1 and B2, using some random points from the scan that satisfy the mass difference to get an estimate of \(\lambda _{Hhh}\) [42], i.e. the trilinear Higgs coupling for \(H\rightarrow hh\). In Fig. 3 we present the branching fractions for different decay modes of H and A for all four cases of Eq. (9). Note that \(\textrm{c}_{\gamma }\) does not affect any fermionic decay width of A, although \(\Gamma (A \rightarrow Z h) \propto |\mathrm c_{\gamma }|^2\).

Fig. 3 Branching fractions for a, b H decays, and c, d A decays

Fig. 4 Limits on \(\rho _{\tau \mu }\) from \(H\rightarrow \tau \mu \) (gray-shaded), \(h \rightarrow \tau \mu \) (blue-solid) and \(\mathcal {B}(\tau \rightarrow \mu \gamma )\) (red-dashed)

Using the branching fractions from Fig. 3, we estimate the limits on \(\rho _{\tau \mu }\) (taking \(\rho _{\tau \mu } = \rho _{\mu \tau }\)) from the CMS search, and find them to be more stringent than those from Belle. The \(\rho _{\tau \mu }\) and \(\rho _{\mu \tau }\) couplings also receive constraints from flavor physics, the most relevant being \(\tau \rightarrow \mu \gamma \). Belle recently set [43] the limit \(\mathcal {B}(\tau \rightarrow \mu \gamma ) < 4.2 \times 10^{-8}\) at 90% C.L., improving slightly over the BaBar limit of \(\mathcal {B}(\tau \rightarrow \mu \gamma ) < 4.4 \times 10^{-8}\) [44] at 90% C.L. The branching fraction of \(\tau \rightarrow \mu \gamma \) is [45],

$$\begin{aligned} \mathcal {B}(\tau \rightarrow \mu \gamma ) = \frac{48 \pi ^3 \alpha }{G_F^2} (|A_L|^2 + |A_R|^2)\mathcal {B}(\tau \rightarrow \mu \nu _{\tau }\bar{\nu }_{\mu }),\nonumber \\ \end{aligned}$$
(10)

where \(\mathcal {B}(\tau \rightarrow \mu \nu _{\tau }\bar{\nu }_{\mu }) = 17.39\%\) [46], and \(A_{L(R)}\) are the amplitudes based on different chiral structures coming from one- and two-loop diagrams. We include one-loop effects from all of A, H and \(H^+\), and likewise for Barr-Zee type two-loop contributions. Bounds on \(\rho _{\tau \mu }\) from Belle are given in Fig. 4 as red (dashed) lines for all four cases. We find that the Belle bound from \(\tau \rightarrow \mu \gamma \) is weaker than the CMS bound from \(H \rightarrow \tau \mu \), and for all cases the bound on \(\rho _{\tau \mu }\) can be lower than \(\lambda _{\tau }\) near the 2\(m_t\) threshold; the bounds for lighter A are even more stringent than for lighter H due to the higher production cross section at the same mass. The \(\mathcal {B}(\tau \rightarrow \mu \gamma )\) bounds show little mass dependence, largely because of our \(\rho _{tt}\) ansatz of Eq. (8): the increase of \(\rho _{tt}\) with mass compensates the suppression of the two-loop diagram from the heavier scalar mass. We keep \(\rho _{\tau \tau } = \lambda _{\tau } \sim 0.01\).
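Equation (10) can be turned around to see how large the loop amplitudes may be for a given experimental limit. The sketch below is an illustration only: it assumes \(A_{L,R}\) are expressed in GeV\(^{-2}\) (so that the branching fraction is dimensionless), uses standard values of \(\alpha \) and \(G_F\), and does not compute the one- and two-loop amplitudes themselves. The function names are ours.

```python
import math

ALPHA_EM = 1.0 / 137.036     # fine-structure constant
G_FERMI = 1.1663787e-5       # Fermi constant in GeV^-2
BR_TAU_MU_NU_NU = 0.1739     # B(tau -> mu nu nu) quoted in the text

# Common prefactor of Eq. (10), in GeV^4.
PREFACTOR = 48.0 * math.pi**3 * ALPHA_EM / G_FERMI**2

def br_tau_to_mu_gamma(a_left, a_right):
    """Eq. (10); a_left and a_right are amplitudes in GeV^-2 (may be complex)."""
    return PREFACTOR * (abs(a_left)**2 + abs(a_right)**2) * BR_TAU_MU_NU_NU

def max_amplitude(br_limit):
    """Largest sqrt(|A_L|^2 + |A_R|^2) in GeV^-2 allowed by an upper limit on the branching fraction."""
    return math.sqrt(br_limit / (PREFACTOR * BR_TAU_MU_NU_NU))
```

Since the branching fraction is quadratic in the amplitudes, and the amplitudes are linear in \(\rho _{\tau \mu }\) for fixed \(\rho _{tt}\), a tenfold improvement in the experimental limit tightens the coupling bound by only about a factor of three.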

For \(\textrm{c}_{\gamma } \ne 0\), slightly away from the alignment limit, \(pp \rightarrow h \rightarrow \tau \tau \) also puts constraints on \(\rho _{\tau \tau }\) along with \(p p \rightarrow H \rightarrow \tau \tau \) direct search by ATLAS and CMS. From Fig. 1, we select min(CMS, ATLAS) and use the \(\rho _{tt}\) ansatz of Eq. (8) to estimate the bounds on \(\rho _{\tau \tau }\) for the four cases of Eq. (9). Our results are presented in Fig. 5, where we keep \(\rho _{\tau \mu } = \lambda _{\tau }\).

Fig. 5 Limits on \(\rho _{\tau \tau }\) from CMS [57] (salmon region) and ATLAS [58] (pink region) measurements of \(\mathcal {B} (h \rightarrow \tau \tau )\) to within one standard deviation. Also shown are the bounds from CMS [23] and ATLAS [24] searches for \(p p \rightarrow H \rightarrow \tau \tau +X\) (blue lines)

The \(\rho _{\tau \mu }\) and \(\rho _{\tau \tau }\) bounds are correlated. However, neither \(\tau \mu \) nor \(\tau \tau \) is the dominant decay mode, and a closer look at the limits of Figs. 4 and 5 shows that they do not rise much above \(\mathcal {O}(\lambda _{\tau })\). We have checked that increasing one coupling does not significantly affect the limits on the other from CMS and ATLAS searches. For \(\tau \rightarrow \mu \gamma \), \(\rho _{\tau \tau }\) enters only at the one-loop level, which is chirally suppressed, hence it has no appreciable effect on the Belle limits on \(\rho _{\tau \mu }\). We present the production cross sections for \(pp\rightarrow H,A \rightarrow \tau \mu \) and \(pp \rightarrow H,A \rightarrow \tau \tau \) in Fig. 6a, b, respectively, using the \(\rho _{tt}\) ansatz of Eq. (8) and branching fractions from Fig. 3.

4 Collider prospects for \(H,A \rightarrow \tau \mu \)

In this section we demonstrate our approach towards searching for the \(H,A \rightarrow \tau \mu \) channel at the LHC. For \(\tau \) decay, we include \(\tau \rightarrow e \nu _{e} \nu _{\tau }\) and \(\tau \rightarrow j_{\tau } \nu _{\tau }\) decay modes, where \(j_{\tau } = \pi , \rho , a_1\). We divide our collider study into two parts: (a) the fully leptonic channel, with an \(e\mu \) pair plus missing transverse energy in the final state, and (b) the semileptonic channel, with a \(j_{\tau }\mu \) pair plus missing transverse energy. We consider all four cases of Eq. (9). For simplicity we keep \(\rho _{\tau \mu } = \rho _{\tau \tau } = \lambda _{\tau }\), but for estimating statistical significance, we follow the limits derived in the previous section.

Analysis procedure and event generation. Two types of collider studies are performed: (a) parton level (PL) without hadronization or detector effects; (b) event level (EL) with hadronization using PYTHIA 8.2 [47] and detector effects simulated by DELPHES 3.5 [48]. For the parton level analysis, we use our own code for phase-space integration with the VEGAS algorithm [49]. For the signal, we use analytic expressions from Ref. [50] to calculate \(gg \rightarrow H,A\) production at leading order, then use HIGLU [51] to estimate higher-order corrections. We use CT14LO [52] parton distribution functions to calculate leading order (LO) processes.

For the backgrounds, we use MadGraph5 [53] and HELAS [54] libraries to extract matrix elements for all possible Feynman diagrams at LO and scale them using K-factors. We apply minimal smearing of lepton and jet momenta following ATLAS [55] and CMS [56] specifications. For simplicity, we keep the smearing for e and \(\mu \) the same:

$$\begin{aligned}{} & {} \frac{\Delta E}{E} = \frac{0.6}{\sqrt{E(\textrm{GeV})}} \oplus 0.03 \ (\textrm{jets}), \quad \nonumber \\{} & {} \frac{\Delta E}{E} = \frac{0.25}{\sqrt{E(\textrm{GeV})}} \oplus 0.01 \ (\textrm{leptons}). \end{aligned}$$
(11)

We use the collinear approximation [59] for \(\tau \) decay at the parton level.
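The smearing of Eq. (11) can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code: \(\oplus \) is read as a sum in quadrature, the smearing is taken to be Gaussian, and the names `relative_resolution` and `smear_energy` are hypothetical.

```python
import math
import random

# (stochastic term, constant term) of Eq. (11) for each object type.
RESOLUTION = {"jet": (0.6, 0.03), "lepton": (0.25, 0.01)}

def relative_resolution(energy_gev, stochastic, constant):
    """Delta E / E = stochastic / sqrt(E) (+) constant, combined in quadrature."""
    return math.hypot(stochastic / math.sqrt(energy_gev), constant)

def smear_energy(energy_gev, kind, rng=random):
    """Return a Gaussian-smeared energy for a 'jet' or a 'lepton'."""
    stochastic, constant = RESOLUTION[kind]
    sigma = energy_gev * relative_resolution(energy_gev, stochastic, constant)
    return rng.gauss(energy_gev, sigma)

rng = random.Random(12345)  # fixed seed for reproducibility
print(smear_energy(100.0, "jet", rng), smear_energy(100.0, "lepton", rng))
```

For a 100 GeV object this gives a relative resolution of about 6.7% for jets and 2.7% for leptons.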

Fig. 6 Production cross sections for a \(pp \rightarrow H,A \rightarrow \tau \mu + X\), and b \(\tau \tau + X\) for \(\sqrt{s}\) = 13 TeV. The CMS2019 limits on \( \tau \mu \) cross section and min(CMS,ATLAS) limits on \(\tau \tau \) are also shown

For the event level analysis, we first generate parton level events with MadGraph5, then pass them to PYTHIA 8.2 and then DELPHES 3.5. We use the anti-\(k_T\) algorithm [60] for jet clustering and keep all parameters at default values, as described in the DELPHES card for the CMS detector. The decays of \(\tau \) leptons are modeled using TAUOLA [61].

Fully leptonic channel and backgrounds. With \(\tau \rightarrow e \nu _e \nu _{\tau }\), the signal final state contains two opposite-sign, different-flavor leptons (e and \(\mu \)) along with missing transverse energy. Important backgrounds come from \(W^+W^-\), \(t\bar{t}\), \(tW^{\pm }\) and \(Z,\gamma ^* \rightarrow \tau \tau \). We use TOP++ to estimate the K-factor for the \(t\bar{t}\) background [62], and MCFM 8.0 [63] to estimate the NLO corrections to the remaining backgrounds. Since the \(\mu \) comes directly from Higgs decay, it is quite energetic, so we select events with \(p_{T}(\mu ) > \) 60 GeV and \(|\eta (\mu )|<\) 2.4. For the electron, we select events with \(p_{T}(e) > \) 10 GeV and \(|\eta (e)| < \) 2.4. In addition, we veto all events with extra jets and require missing transverse energy \(E_T^{\textrm{miss}} >\) 20 GeV.

Table 1 Cuts applied for leptonic and semileptonic channels in \(H, A\rightarrow \tau \mu \) study

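For selected events, we reconstruct the transverse mass [64] of a lepton (\(\ell \)) and the missing transverse energy (\(E_T^{\textrm{miss}}\)), with \(\ell = e\) or \(\mu \),

$$\begin{aligned} M_T(\ell , E_T^{\textrm{miss}}) = \sqrt{2\, p_{T}(\ell )\, E_T^{\textrm{miss}} \left[ 1 - \cos \Delta \phi (\ell , E_T^{\textrm{miss}})\right] }, \end{aligned}$$
(12)

where \(p_{T}(\ell )\) is the transverse momentum of the electron or muon and \(\Delta \phi \) is the azimuthal angle between the lepton and the missing transverse momentum. We also reconstruct \(M_{\tau \mu }\), the invariant mass of \(\tau \) and \(\mu \), which has a pronounced peak near \(m_\phi \) (\(\phi = H\) or A), using the collinear approximation [65]. In the collinear approximation, the \(\tau \) coming from H, A decay is highly boosted, hence we assume that its decay products travel in the same direction. Under the collinear approximation, we can write,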

$$\begin{aligned} p_\textrm{vis} = x\, p_{\tau }\,, \quad \textrm{and} \quad p_\mathrm{\nu } = (1 - x)\, p_{\tau }, \end{aligned}$$
(13)

where \(p_\textrm{vis}\) is the four-momentum of the visible particle(s) from \(\tau \) decay, \(p_\mathrm{\nu }\) is the total four-momentum of all neutrinos from \(\tau \) decay, and x is the fraction of \(\tau \) momentum carried by \(p_\textrm{vis}\). We know the four-momentum of the visible particles and the missing transverse energy \(E_T^{\textrm{miss}}\) coming from the \(\nu \)’s. After some algebra, we find

$$\begin{aligned} x = \frac{p_T(\textrm{vis})}{p_T(\textrm{vis}) + E_T^{\textrm{miss}}}. \end{aligned}$$
(14)

Note that this assumption will only give reasonable results if the \(\tau \) lepton is the only source of missing transverse energy, and if it is highly boosted. We select events whose reconstructed transverse masses pass the 100 GeV thresholds adopted from CMS [16] (see Table 1).
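The collinear reconstruction described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code: the neutrino transverse momentum is estimated as the projection of the missing transverse momentum onto the visible \(\tau \)-decay direction, four-momenta are plain tuples \((E, p_x, p_y, p_z)\) in GeV, and the function name `collinear_mass` is hypothetical.

```python
import math

def collinear_mass(vis, mu, met_x, met_y):
    """Collinear tau-mu mass from the visible tau decay products, the muon and the missing transverse momentum."""
    pt_vis = math.hypot(vis[1], vis[2])
    if pt_vis == 0.0:
        return None
    # Assumption: neutrino pT = projection of the missing pT on the visible direction.
    pt_nu = (met_x * vis[1] + met_y * vis[2]) / pt_vis
    if pt_vis + pt_nu <= 0.0:
        return None  # unphysical: collinear approximation fails
    x = pt_vis / (pt_vis + pt_nu)
    if x > 1.0:
        return None  # missing pT points away from the tau candidate
    tau = tuple(component / x for component in vis)  # Eq. (13): p_tau = p_vis / x
    e, px, py, pz = (t + m for t, m in zip(tau, mu))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# Toy event: a 300 GeV scalar at rest, tau along +x with x = 0.4.
vis = (60.0, 60.0, 0.0, 0.0)
muon = (150.0, -150.0, 0.0, 0.0)
print(collinear_mass(vis, muon, 90.0, 0.0))
```

In the toy event the neutrinos carry the remaining 90 GeV along the \(\tau \) direction, so the full \(\tau \) momentum, and hence the scalar mass, is recovered. Returning `None` for unphysical solutions mirrors the caveat that the approximation is only reliable when the \(\tau \) is the sole source of missing transverse energy.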

Table 2 Background cross sections for \(e \mu \) final state at parton and event levels
Fig. 7 Cross sections after all cuts at parton (a, c), and event (b, d) levels for \(p p \rightarrow H,A \rightarrow \tau \mu \). Upper (lower) panels are for the (semi-)leptonic channel

We define a moving mass window of \(|M_{\tau \mu } - m_{H,A}| < 0.2 \times m_{H,A}\) to estimate the irreducible background from SM processes for a particular \(m_H\). All the cuts discussed here are summarized in Table 1. Cross sections of all backgrounds for PL and EL are presented in Table 2. The signal cross sections are given in Fig. 7a, b. It is important to note that the b-jet veto plays a vital role in suppressing the \(t\bar{t}\) background. In the CMS [16] study, the \(t\bar{t}\) enriched control region requires at least one jet to be tagged as a b-jet. We have checked that when we select events that contain at least one b-jet, \(t\bar{t}\) becomes the dominant background, at almost 20 times the contribution from \(W^+W^-\).

Semileptonic channel and backgrounds. When \(\tau \) decays hadronically to \(j_{\tau }\), the signal gives a final state with one jet tagged as a \(\tau \)-jet, one \(\mu \) and missing transverse energy. Important backgrounds come from \(W^{\pm }j\), \(t\bar{t}\), \(W^+W^-\), \(Z,\gamma ^* \rightarrow \tau \tau \) and \(tW^{\pm }\). Event selection is similar to the leptonic channel. Following CMS [16], we require \(p_{T}(j_{\tau }) > \) 30 GeV and \(|\eta (j_{\tau })| < 2.5\), with muon selection the same as before. We again reconstruct the transverse masses of Eq. (12), then the collinear mass \(M_{\tau \mu }\), using Eqs. (13) and (14) to reconstruct the \(\tau \) four-momentum. Cuts on the reconstructed transverse masses are taken from CMS [16] and summarized in Table 1. Background cross sections for different Higgs masses, after all cuts in Table 1 are applied, are given in Table 3. The signal cross sections as a function of Higgs mass are given in Fig. 7c, d. We scale the parton level cross sections for the signal and for \(Z,\,\gamma ^* \rightarrow \tau \tau \) by the \(\tau \)-tagging efficiency \(\epsilon _{j_{\tau }}\) = 0.7 [66], and take the mistag rates to be 1/35 for 1-prong and 1/240 for 3-prong \(\tau \) decays [67].

Table 3 Background cross sections for \(j_{\tau } \mu \) final state after cuts at PL
Table 4 Cuts for \(H,A \rightarrow \tau \tau \)

5 Collider prospects for \(H,A \rightarrow \tau \tau \)

The \(H,A \rightarrow \tau \tau \) process is more challenging than \(H,A \rightarrow \tau \mu \) to probe at the LHC because mass reconstruction is poorer. The tau pairs from \(gg \rightarrow H,A \rightarrow \tau \tau \) are back to back, which makes it difficult to determine the invariant mass of the tau pair (\(M_{\tau \tau }\)). Thus we rely on the transverse mass built from visible particles and missing transverse energy from tau decays. Furthermore, for \(\rho _{\tau \tau } =\rho _{\tau \mu } = \lambda _{\tau }\), the \(H,A \rightarrow \tau \tau \) final state has half the event rate of \(H,A \rightarrow \tau ^+\mu ^- +\tau ^-\mu ^+\). Although there is no flavor violation, we consider the same final states as in the \(\tau \mu \) study.

Contribution of \(\tau \mu \) in \(\tau \tau \). With the same final state, \(H, A\rightarrow \tau \mu \) can obviously contribute to \(H, A \rightarrow \tau \tau \). However, we find that the transverse mass of the muon and the missing transverse energy, Eq. (12), is a powerful variable in separating the \(\tau \tau \) signal from \(\tau \mu \). This is because it exceeds 100 GeV for \(\tau \mu \) but is smaller for \(\tau \tau \), as the \(\mu \) in the former comes directly from Higgs decay whereas for \(\tau \tau \) it comes from \(\tau \) decay.

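For \(H, A \rightarrow \tau \tau \), both leptons (fully leptonic) and \(j_{\tau }\) + lepton (semileptonic) come from \(\tau \) decay, which give 4 and 3 neutrinos, respectively, in the final state. This makes the collinear approximation weaker for mass reconstruction. So we rely on the cluster transverse mass of the two visibly decaying \(\tau \)’s and the missing transverse energy (\(E_T^{\textrm{miss}}\)), which is given as [64],

$$\begin{aligned} M_{T}^2 = \left[ \sqrt{p_{T}^2(\tau _\textrm{vis1},\,\tau _\textrm{vis2}) + M^2(\tau _\textrm{vis1},\,\tau _\textrm{vis2})} + E_T^{\textrm{miss}} \right] ^2 - \left[ \vec {p}_{T}(\tau _\textrm{vis1},\,\tau _\textrm{vis2}) + \vec {E}_T^{\textrm{miss}} \right] ^2, \end{aligned}$$
(15)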

with \(M(\tau _\textrm{vis1},\,\tau _\textrm{vis2})\) and \({p}_{T}(\tau _\textrm{vis1},\,\tau _\textrm{vis2})\) the invariant mass and net transverse momentum of the two visible \(\tau \) decays, respectively.
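A minimal numerical sketch of the cluster transverse mass is given below. It assumes the standard definition, in which the transverse energy of the visible cluster, \(\sqrt{p_T^2 + M^2}\), is added to the missing transverse energy and the vector sum of transverse momenta is subtracted in quadrature. Four-momenta are tuples \((E, p_x, p_y, p_z)\) in GeV and the function name is ours.

```python
import math

def cluster_transverse_mass(vis1, vis2, met_x, met_y):
    """Cluster transverse mass of two visible tau decay products and the missing transverse momentum."""
    e, px, py, pz = (a + b for a, b in zip(vis1, vis2))
    m2_vis = max(e * e - px * px - py * py - pz * pz, 0.0)  # M^2(vis1, vis2)
    pt2_vis = px * px + py * py                              # pT^2(vis1, vis2)
    met = math.hypot(met_x, met_y)
    et_total = math.sqrt(pt2_vis + m2_vis) + met
    pt2_total = (px + met_x) ** 2 + (py + met_y) ** 2
    return math.sqrt(max(et_total**2 - pt2_total, 0.0))

# Toy event: two back-to-back 50 GeV visible objects plus 30 GeV of missing pT.
print(cluster_transverse_mass((50.0, 50.0, 0.0, 0.0), (50.0, -50.0, 0.0, 0.0), 30.0, 0.0))
```

With no missing transverse momentum and no net transverse momentum, the cluster transverse mass reduces to the visible invariant mass, which is a useful sanity check.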

Fig. 8 Cross sections for \(pp \rightarrow H,A \rightarrow \tau \tau \) for both leptonic (a, b) and semileptonic (c, d) final states after applying all cuts at parton (a, c) and event (b, d) levels

Following ATLAS [24], our selection rules for \(e\mu \) and \(j_{\tau }\mu \) final states are given in Table 4. The cross section for the signal at both EL and PL for leptonic and semileptonic channels is presented in Fig. 8. The background cross sections are given in Table 5 after all cuts at both EL and PL for leptonic channel, and in Table 6 only at PL for semileptonic channel.

Table 5 Background cross sections for \(e \mu \) final state after cuts at PL and EL in \(\tau \tau \) channel
Table 6 Background cross sections for \(j_{\tau } \mu \) final state after cuts at PL in \(\tau \tau \) channel

6 Statistical significance of the signal

We now estimate the discovery potential of all channels we have discussed. We have kept \(\rho _{\tau \mu } = \rho _{\tau \tau } = \lambda _{\tau }\) in Figs. 7 and 8 for simplicity. In this section, however, we consider the constraints on \(\rho _{\tau \mu }\) and \(\rho _{\tau \tau }\) as discussed in Figs. 4 and 5. We scale our signal cross section for the \(\tau \mu \) channel using the strictest limit for each case. For the \(\tau \tau \) channel, especially with the \(h\rightarrow \tau \tau \) constraint coming into play, we use the \(\rho _{\tau \tau } < 0\) limits to enhance the signal estimates for Cases A1 and A2, as \(\lambda _{H\tau \tau } \simeq \lambda _{\tau } \textrm{c}_{\gamma } - \rho _{\tau \tau } \textrm{s}_{\gamma }\). For Cases B1 and B2, the \(A \rightarrow \tau \tau \) limits are more stringent for \(m_A < 2 m_t\); beyond that, we choose the magenta dashed line (CMS +1\(\sigma \)) of Fig. 5 to stay within experimental constraints.

Fig. 9 Statistical significance \(N_{SS}\) for \(p p \rightarrow H,A \rightarrow \tau \mu \) at PL, where a, b are for the fully leptonic channel, and c, d for the semileptonic channel. Both \(\sqrt{s}\) = 13 (blue) and 14 TeV (magenta) are given, where solid (dashed) lines are for \(\cos \gamma \) = 0.01 (0.1). The left panels a, c are for \(m_H < m_A = m_{H^+}\), and b, d for \(m_A < m_H = m_{H^+}\). We have set \(\rho _{\tau \mu } = \rho _{\mu \tau }\) to the limits derived in Sect. 3 (Fig. 4)

Fig. 10 Same as Fig. 9 for \(p p \rightarrow H,A \rightarrow \tau \tau \) at PL

Fig. 11 Statistical significance \(N_{SS}\) at the event level for the fully leptonic \(\tau \mu \) (top row) and \(\tau \tau \) (bottom row) channels. Both \(\sqrt{s}\) = 13 (blue) and 14 TeV (magenta) are presented, where solid (dashed) lines are for \(\cos \gamma \) = 0.01 (0.1). The left panels (a, c) are for \(m_H < m_A = m_{H^+}\), and (b, d) for \(m_A < m_H = m_{H^+}\). We have set \(\rho _{\tau \mu } = \rho _{\mu \tau }\) to the limits derived in Sect. 3 (Fig. 4)

To estimate significance, we assume Gaussian distribution, and denote \(N_S\) as the number of signal events, and \(N_B\) for background events (combining all background processes). The statistical significance \(N_{SS}\) is evaluated with [68]

$$\begin{aligned} N_{SS} = \sqrt{2 (N_B + N_S) \mathrm \ln \Bigg (1 + \frac{N_S}{N_B}\Bigg ) - 2 N_S} \,. \end{aligned}$$
(16)

For a large number of background events (\(N_B \gg N_S\)), it simplifies to become the well known discovery significance

$$\begin{aligned} N_{SS} = \frac{N_S}{\sqrt{N_B}} \,. \end{aligned}$$
(17)
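The two expressions can be compared numerically with the short sketch below. Event counts are obtained from cross sections assuming \(\mathcal {L} = 3\, \textrm{ab}^{-1} = 3000\, \textrm{fb}^{-1}\); the cross section values in the usage example are placeholders, not results of this study, and the function names are ours.

```python
import math

def significance(n_s, n_b):
    """Eq. (16): N_SS = sqrt(2 (N_B + N_S) ln(1 + N_S/N_B) - 2 N_S)."""
    value = 2.0 * (n_b + n_s) * math.log1p(n_s / n_b) - 2.0 * n_s
    return math.sqrt(max(value, 0.0))  # guard against tiny negative rounding errors

def significance_simple(n_s, n_b):
    """Eq. (17): large-background limit N_S / sqrt(N_B)."""
    return n_s / math.sqrt(n_b)

def event_count(cross_section_fb, luminosity_fb=3000.0):
    """Expected number of events for a cross section in fb and a luminosity in fb^-1."""
    return cross_section_fb * luminosity_fb

# Placeholder cross sections (fb), for illustration only.
n_s = event_count(0.05)
n_b = event_count(2.0)
print(significance(n_s, n_b), significance_simple(n_s, n_b))
```

Eq. (16) always gives a slightly smaller value than Eq. (17), and the two agree when \(N_B \gg N_S\), so Eq. (17) overstates the significance when the background is small.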

We estimate the statistical significance for each signal point using Eq. (16) at the parton level for \(H,A\rightarrow \tau \mu \) and \(H,A \rightarrow \tau \tau \), for both fully leptonic and semileptonic channels. We present our estimates for \(N_{SS}\) at PL for \(H,A \rightarrow \tau \mu \) and \(H,A \rightarrow \tau \tau \), respectively, in Figs. 9 and 10 for \(\sqrt{s} =\) 13 and 14 TeV at \(\mathcal {L} = 3\, \textrm{ab}^{-1}\). In both figures, we give \(N_{SS}\) for the purely leptonic channel in (a) and (b), and for semileptonic channels in (c) and (d).

We only present \(N_{SS}\) for purely leptonic channel at EL for simplicity, which is given in Fig. 11 for \(\tau \mu \) in (a) and (b), and for \(\tau \tau \) in (c) and (d).

7 Discussion and conclusion

Extra \(\tau \) couplings \(\rho _{\tau \tau }\) and \(\rho _{\tau \mu }\) can act as good probes for exotic H, A scalars below the \(t\bar{t}\) threshold at the HL-LHC. We have illustrated the prospects of discovering either H or A in \(\tau \mu \) and \(\tau \tau \) final states. We studied the constraints on the relevant couplings from various searches and estimated the statistical significance at the HL-LHC with \(3\, \textrm{ab}^{-1}\) for \(\sqrt{s} = 13\) and 14 TeV.

From our study, we offer the following comments.

  • CMS \(H \rightarrow \tau \mu \) with \(35.9 \, \textrm{fb}^{-1}\) of data puts a stronger bound on \(\rho _{\tau \mu }\) than the latest Belle limit on \(\tau \rightarrow \mu \gamma \), although this holds only within the mass range considered in this study. Intuitively, if H or A approaches \(\mathcal {O}(\textrm{TeV})\), we expect Belle to do better. Limits from \(h \rightarrow \tau \mu \) depend on \(c_{\gamma }\), strengthening as \(c_{\gamma }\) increases from 0.01 to 0.1.

  • Constraints on \(pp \rightarrow H \rightarrow \tau \tau \) from ATLAS and CMS follow an interesting trend: ATLAS is better for \(m_H < 330\) GeV, above which CMS and ATLAS are comparable. This again shows the remarkable sensitivity of CMS, which used only a quarter of the data used by ATLAS. Constraints from \(pp \rightarrow h \rightarrow \tau \tau \) become important for \(c_{\gamma } = 0.1\).

  • If A is lighter, as in Cases B1 and B2, we find that the limits on \(\rho _{\tau \mu }\) are the most stringent below \(2m_t\), but become even weaker than in Cases A1 and A2 (H lighter) beyond \(2m_t\). This is mainly due to \(\Gamma (A\rightarrow t\bar{t}) > \Gamma (H \rightarrow t \bar{t})\) at the same mass. For all cases, the limits from CMS are better than \(\rho _{\tau \mu } = \lambda _{\tau }\) at and below the \(2m_t\) threshold.

  • From our PL study of \(pp \rightarrow H,A \rightarrow \tau \mu \), we offer some insight:

    • Once we apply the \(\rho _{\tau \mu }\) constraints, Cases B1 and B2 become almost identical.

    • For Cases A1 and A2, a lower value of \(c_{\gamma }\) does provide better significance, but above \(2m_t\) the two become very close to each other.

    • We see a slight bump for Case A2 around \(2m_t\), which reflects both the limits shown in Fig. 4 and the rise in the \(pp \rightarrow H\) cross section at the \(t\bar{t}\) threshold.

    • We find that below 400 GeV, H or A can be discovered at the HL-LHC with just a single channel. Above 400 GeV, the significance can be improved by combining the leptonic and semileptonic channels.

  • The \(pp \rightarrow H,A \rightarrow \tau \tau \) channel is much more challenging, owing to its poor mass reconstruction and a lower branching fraction than \(\tau \mu \) for \(\rho _{\tau \tau } \simeq \rho _{\tau \mu }\). We draw the following remarks at the parton level:

    • As for \(\tau \mu \), the statistical significances for Cases B1 and B2 overlap. We see an upward bump at the \(t\bar{t}\) threshold, mainly due to the rise in the production cross section. Beyond \(2m_t\), the sharp rise of the \(t\bar{t}\) decay mode kills the \(\tau \tau \) channel.

    • For Cases A1 and A2, the \(c_{\gamma }\) dependence is clearly visible, and lower \(c_{\gamma }\) gives better significance.

    • The semileptonic channel has a lower significance because of its high QCD background, compared to the much cleaner leptonic channel.

    • The HL-LHC can still discover this channel up to the \(t\bar{t}\) threshold; beyond that, much smarter classification techniques are needed.

    • Unlike \(\tau \mu \), where the statistical significance falls less steeply, \(\tau \tau \) shows a sharp drop in significance after \(2m_t\). This is mainly due to the limits on \(\rho _{\tau \tau }\), which stay almost the same over nearly the entire mass range for all four cases.

  • At the event level, the statistical significance follows a trend similar to that at PL for the \(e\mu \) channel, as discussed. However, the significance drops due to detector resolution and hadronization effects. We can still discover the \(\tau \mu \) channel below \(2m_t\) and, hopefully, combining with the semileptonic channel can extend the discovery range further. The \(\tau \tau \) channel, however, remains challenging.
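
The remark in the list above that combining the leptonic and semileptonic channels can improve the significance can be illustrated with a simple counting-experiment sketch. The combination method used here is an assumption, since the text does not specify one: for statistically independent channels with a common signal strength and no systematic uncertainties, the Eq. (16) significances add in quadrature. The helper names and the event counts are hypothetical and serve only as an example.

```python
import math


def significance_eq16(n_sig, n_bkg):
    """Statistical significance of Eq. (16) for one counting channel."""
    if n_bkg <= 0:
        raise ValueError("Eq. (16) requires a positive background count")
    if n_sig <= 0:
        return 0.0
    return math.sqrt(2.0 * ((n_sig + n_bkg) * math.log1p(n_sig / n_bkg) - n_sig))


def combine_in_quadrature(significances):
    """Combine independent channels: Z_comb = sqrt(sum_i Z_i^2).

    Assumption: channels are statistically independent and systematic
    uncertainties and their correlations are neglected.
    """
    return math.sqrt(sum(z * z for z in significances))


# Hypothetical (N_S, N_B) per channel, for illustration only.
channels = {
    "leptonic": (40.0, 200.0),
    "semileptonic": (90.0, 1500.0),
}

per_channel = {name: significance_eq16(s, b) for name, (s, b) in channels.items()}
combined = combine_in_quadrature(per_channel.values())

for name, z in per_channel.items():
    print(f"{name}: N_SS = {z:.2f}")
print(f"combined: N_SS = {combined:.2f}")
```

By construction the combined value is never smaller than the best single channel, but in a realistic analysis correlated systematic uncertainties would reduce the gain.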

As discussed in Sect. 3, we have set \(\rho _{tc} = 0\). A nonzero \(\rho _{tc}\) can further dilute both signals. In Ref. [14], we showed that \(\rho _{tc} \sim 0.5\) can dominate the branching fraction even beyond the \(t\bar{t}\) threshold. However, a very detailed study of the overall impact of \(\rho _{tc} \ne 0\) is needed, because it would relax the constraints from \(H \rightarrow \tau \mu \) on \(\rho _{\tau \mu }\) and from \(H,A\rightarrow \tau \tau \) on \(\rho _{\tau \tau }\). As discussed in Ref. [69], for the mass range considered here, \(\rho _{tc} \sim 0.5\) might be too high and a more reasonable value would be \(\rho _{tc} \sim 0.1\). But in that study, all \(\rho _{ij}\) except \(\rho _{tc}\) were set to zero, and \(c_{\gamma } = 0\) was taken.

Another strong motivation to study \(\rho _{\tau \tau }\) and \(\rho _{\tau \mu }\) comes from electroweak baryogenesis (EWBG). As discussed in Refs. [70, 71], complex extra \(\tau \) couplings can also drive EWBG and explain the matter–antimatter asymmetry of the Universe, although it has been questioned [72] whether light fermions can actually achieve this. The \(h, H, A \rightarrow \tau \tau \) processes are also considered good channels to study CP violation [22, 73], a necessary condition for EWBG. On the other hand, EWBG can be driven by \(\rho _{tt}\) and \(\rho _{tc}\) [74], and it is unclear which mechanism is realized. Nonzero \(\rho _{tc}\) and \(\rho _{\tau \mu }\) open up some exciting new channels, such as \(cg\rightarrow tH, tA \rightarrow t\tau \mu \) and \(cg\rightarrow bH^+ \rightarrow bHW^+ \rightarrow bW^+\tau \mu \) [14], hence providing a rich phenomenology for the LHC.

In this article, we find that beyond \(2m_t\) it is quite challenging to probe either \(H,A \rightarrow \tau \mu \) or \(H,A \rightarrow \tau \tau \). However, we have not considered all \(\tau \)-lepton decay modes. By combining all \(\tau \) decay channels and performing more sophisticated machine learning classification, the combined channels may hold a promising future for discovering H and A at the HL-LHC, or even at the future FCC-hh or SppC colliders.

Although we have not focused on this case in this article, let us end on a positive note by connecting the \(pp \rightarrow H,\,A \rightarrow \tau \mu \) search with the recent confirmation of the muon \(g-2\) anomaly. In g2HDM, the muon \(g-2\) anomaly can be accounted for by sizable \(\rho _{\tau \mu }\rho _{\mu \tau }\) and a relatively low-mass H or A. If one takes the muon \(g-2\) anomaly seriously, the CMS \(pp \rightarrow H,\,A \rightarrow \tau \mu \) search may well reveal a hint of a signal below the \(2m_t\) threshold with full Run 2 data.